Mirror of https://github.com/pacnpal/markov-discord.git, synced 2025-12-20 11:01:04 -05:00.
- Added `optimization-plan.md` detailing strategies to reduce response latency and improve training throughput.
- Enhanced performance analysis in `performance-analysis.md` with identified bottlenecks and completed optimizations.
- Created `productContext.md` summarizing project goals, user scenarios, and implementation priorities.
- Developed `markov-store.ts` for high-performance serialized chain storage with alias-method sampling.
- Implemented database performance indexes in `1704067200000-AddPerformanceIndexes.ts`.
- Introduced `markov-worker.ts` for handling CPU-intensive operations in separate threads.
- Established a worker pool in `worker-pool.ts` to manage multiple worker threads efficiently.
4.4 KiB
[MEMORY BANK: ACTIVE] Optimization Plan - Further Performance Work
Date: 2025-09-25
Purpose: Reduce response latency and improve training throughput beyond existing optimizations.
Context: builds on `memory-bank/performance-analysis.md` and on changes already implemented in `src/train.ts` and `src/index.ts`.
Goals:
- Target: end-to-end response generation < 500ms for typical queries.
- Training throughput: process 1M messages/hour on dev hardware.
- Memory: keep max heap < 2GB during training on 16GB host.
Measurement & Profiling (first actions)
- Capture baseline metrics:
  - Run workload A (100k messages) and record CPU, memory, and latency histograms.
  - Tools: Clinic.js (flame graphs), `node --prof`, and pprof.
- Add short-term tracing: export traces for the top code paths in `src/index.ts` and `src/train.ts`.
- Create benchmark scripts: `bench/trace.sh` and `bench/load_test.ts` (synthetic).
High Priority (implement immediately)
- Persist precomputed Markov chains per channel/guild:
  - Add a serialized chain store: `src/markov-store.ts` (new).
  - On training, update the chain incrementally instead of rebuilding it.
  - Benefit: chain lookup during response generation becomes O(1).
- Use optimized sampling structures (alias method):
  - Replace repeated weighted selection with alias tables built per prefix.
  - File changes: `src/index.ts`, `src/markov-store.ts`.
- Offload CPU-bound work to worker threads:
  - Move chain building and heavy sampling into Node `worker_threads`.
  - Add a worker pool (4 threads by default) with backpressure.
  - Files: `src/train.ts`, `src/workers/markov-worker.ts`.
- Use an in-memory LRU cache for active chains:
  - Keep hot channels' chains in RAM; evict the least recently used.
  - Implement a TTL and a memory cap.
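The two central High Priority items above, incremental chain updates and alias-method sampling, could be combined in `src/markov-store.ts` roughly as sketched below. The class and method names are illustrative, not the actual implementation; the alias-table construction follows Vose's method (O(n) build, O(1) sample).

```typescript
// Sketch: incremental bigram counts per prefix, plus Vose alias tables
// for O(1) weighted sampling of the next word. Illustrative only.

type AliasTable = { words: string[]; prob: number[]; alias: number[] };

class MarkovStore {
  private counts = new Map<string, Map<string, number>>();
  private tables = new Map<string, AliasTable>(); // rebuilt lazily

  /** Incremental training: bump one prefix -> suffix count. */
  addTransition(prefix: string, suffix: string): void {
    let row = this.counts.get(prefix);
    if (!row) this.counts.set(prefix, (row = new Map()));
    row.set(suffix, (row.get(suffix) ?? 0) + 1);
    this.tables.delete(prefix); // stale alias table; rebuild on next sample
  }

  /** Vose's alias method: O(n) build per prefix. */
  private buildTable(prefix: string): AliasTable | undefined {
    const row = this.counts.get(prefix);
    if (!row || row.size === 0) return undefined;
    const words = [...row.keys()];
    const n = words.length;
    const total = [...row.values()].reduce((a, b) => a + b, 0);
    const scaled = words.map((w) => (row.get(w)! * n) / total);
    const prob = new Array<number>(n).fill(0);
    const alias = new Array<number>(n).fill(0);
    const small: number[] = [];
    const large: number[] = [];
    scaled.forEach((p, i) => (p < 1 ? small : large).push(i));
    while (small.length && large.length) {
      const s = small.pop()!;
      const l = large.pop()!;
      prob[s] = scaled[s];
      alias[s] = l;
      scaled[l] = scaled[l] + scaled[s] - 1; // donate probability mass
      (scaled[l] < 1 ? small : large).push(l);
    }
    for (const i of [...small, ...large]) prob[i] = 1;
    const table = { words, prob, alias };
    this.tables.set(prefix, table);
    return table;
  }

  /** O(1) weighted sample of the next word for a prefix. */
  sample(prefix: string): string | undefined {
    const t = this.tables.get(prefix) ?? this.buildTable(prefix);
    if (!t) return undefined;
    const i = Math.floor(Math.random() * t.words.length);
    return Math.random() < t.prob[i] ? t.words[i] : t.words[t.alias[i]];
  }
}
```

Because `addTransition` only invalidates the one prefix it touches, training stays incremental while hot prefixes keep their prebuilt tables.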
Medium Priority
- Optimize SQLite for runtime:
  - Enable WAL mode (`PRAGMA journal_mode = WAL`) and set `PRAGMA synchronous = NORMAL`.
  - Use prepared statements and transactions for bulk writes.
  - Temporarily disable non-essential indexes during major bulk imports.
  - File: `src/migration/1704067200000-AddPerformanceIndexes.ts`.
- Move heavy random-access data into a K/V store:
  - Consider LevelDB/LMDB or RocksDB for prefix -> suffix lists for faster reads.
- Incremental training API:
  - Add an HTTP or IPC endpoint to submit new messages and update the chain incrementally.
Low Priority / Long term
- Reimplement core hot loops in Rust via Neon or FFI for maximum throughput.
- Shard storage by guild and run independent workers per shard.
- Replace SQLite with a server DB (Postgres) only if concurrency demands it.
Implementation steps (concrete)
- Add profiling scripts + run baseline (1-2 days).
- Implement `src/markov-store.ts` with serialization and an alias-table builder (1-2 days).
- Wire up the worker pool and move chain building into workers (1-2 days).
- Add LRU cache around store and integrate with response path (0.5-1 day).
- Apply SQLite runtime tuning and test bulk import patterns (0.5 day).
- Add metrics & dashboards (Prometheus + Grafana or simple histograms) (1 day).
- Run load tests and iterate on bottlenecks (1-3 days).
Benchmarks to run
- Baseline: 100k messages, measure 95th percentile response latency.
- After chain-store: expect >5x faster generation.
- After workers + alias: expect ~10x faster generation in CPU-heavy scenarios.
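The 95th-percentile measurement used in these benchmarks could live in `bench/load_test.ts` along these lines (function names are hypothetical; `generate` stands in for the real response path):

```typescript
// Sketch: time an async operation N times and report the p95 latency.

export function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[idx];
}

export async function benchP95(
  generate: () => Promise<void>,
  iterations = 1000
): Promise<number> {
  const latencies: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    await generate();
    latencies.push(Number(process.hrtime.bigint() - start) / 1e6); // ns -> ms
  }
  return percentile(latencies, 95);
}
```

`process.hrtime.bigint()` is monotonic, so the measurement is not affected by wall-clock adjustments during a long run.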
Rollout & Validation
- Feature-flag the new chain store and worker pool behind config toggles in `config/config.json`.
- Canary rollout to a single guild for 24h with load-test traffic.
- Compare metrics and only enable globally after verifying thresholds.
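The config toggles mentioned above could take a shape like this in `config/config.json` (key names are hypothetical; the actual config schema may differ):

```json
{
  "experiments": {
    "chainStore": false,
    "workerPool": false,
    "workerPoolSize": 4
  }
}
```

Defaulting every experiment to `false` means a bad deploy degrades to the current behavior rather than the untested path.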
Observability & Metrics
- Instrument: response latency histogram, chain-build time, cache hit ratio, DB query durations.
- Log queries slower than 50ms with context.
- Add alerts for cache thrashing and worker queue saturation.
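A minimal in-process version of the instrumentation above, a fixed-bucket latency histogram plus cache hit/miss counters, might look like this (bucket bounds and class names are illustrative; a real deployment would likely use a Prometheus client instead):

```typescript
// Sketch: latency histogram with a slow-operation log line, plus a
// cache hit-ratio counter. Not a real metrics library.

export class LatencyHistogram {
  private readonly bounds = [5, 10, 25, 50, 100, 250, 500, 1000]; // ms
  private readonly buckets = new Array<number>(this.bounds.length + 1).fill(0);

  record(ms: number): void {
    let i = this.bounds.findIndex((b) => ms <= b);
    if (i === -1) i = this.bounds.length; // overflow bucket
    this.buckets[i]++;
    if (ms > 50) console.warn(`slow operation: ${ms.toFixed(1)}ms`);
  }

  snapshot(): Record<string, number> {
    const out: Record<string, number> = {};
    this.bounds.forEach((b, i) => (out[`le_${b}`] = this.buckets[i]));
    out["le_inf"] = this.buckets[this.bounds.length];
    return out;
  }
}

export class CacheStats {
  hits = 0;
  misses = 0;
  get hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

A sustained drop in `hitRatio` on a hot channel is the cache-thrashing signal the alerts should key on.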
Risks & Mitigations
- Serialization format changes: include versioning and migration utilities.
- Worker crashes: add supervisor and restart/backoff.
- Memory blowup from caching: enforce strict memory caps and stats.
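The serialization-versioning mitigation above can be sketched as a version field checked on load, with a migration hook per old version (the format, field names, and v1 shape here are all hypothetical):

```typescript
// Sketch: version-tagged chain snapshots so old on-disk formats can be
// migrated instead of rejected when the format changes.

const FORMAT_VERSION = 2;

interface ChainSnapshotV2 {
  version: number;
  chains: Record<string, Record<string, number>>; // prefix -> suffix -> count
}

export function serialize(chains: ChainSnapshotV2["chains"]): string {
  return JSON.stringify({ version: FORMAT_VERSION, chains });
}

export function deserialize(raw: string): ChainSnapshotV2 {
  const parsed = JSON.parse(raw);
  switch (parsed.version) {
    case FORMAT_VERSION:
      return parsed as ChainSnapshotV2;
    case 1:
      // Hypothetical v1 -> v2 migration hook; real migrations would
      // transform whatever the old layout actually was.
      return { version: FORMAT_VERSION, chains: parsed.chains ?? {} };
    default:
      throw new Error(`unknown snapshot version: ${parsed.version}`);
  }
}
```

Failing loudly on an unknown version is deliberate: silently reinterpreting a future format is how chains get corrupted.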
Next actions for Code mode
- Create `src/markov-store.ts` and `src/workers/markov-worker.ts`, add bench scripts, and update `config/config.json` toggles.
- I will implement the highest-priority changes in Code mode when you approve.
End.