Update file permissions and add documentation for LLM integration

Author: pacnpal
Date: 2025-09-25 10:04:25 -04:00
Parent: 7dbabb2810
Commit: 1fdd8005f8
7 changed files with 540 additions and 0 deletions

cline_docs/activeContext.md (executable file, +116 lines)

@@ -0,0 +1,116 @@
# Active Context
Last Updated: 2024-12-27
## Current Focus
Integrating LLM capabilities into the existing Discord bot while maintaining the unique "personality" of each server's Markov-based responses.
### Active Issues
1. Response Generation
- Need to implement hybrid Markov-LLM response system
- Must maintain response speed within acceptable limits
- Need to handle API rate limiting gracefully (see the backoff sketch after this list)
2. Data Management
- Implement efficient storage for embeddings
- Design context window management
- Handle conversation threading
3. Integration Points
- Modify generateResponse function to support LLM
- Add embedding generation pipeline
- Implement context tracking
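
One way to satisfy the rate-limiting requirement above is exponential backoff around each LLM call, falling through to the Markov generator once retries are exhausted. A minimal sketch, assuming a 429-style `status` field on API errors (the `withBackoff` helper is illustrative, not existing code):

```typescript
// Illustrative retry helper for LLM API calls; retry counts and delays
// are tuning knobs, not values from the codebase.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up after the last retry, or immediately on non-rate-limit errors.
      if (attempt >= maxRetries || (err as { status?: number }).status !== 429) {
        throw err;
      }
      // Exponential backoff with jitter: ~500ms, ~1s, ~2s, ...
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller would wrap the provider call, e.g. `await withBackoff(() => llm.generateResponse(opts))`, and catch the final error to fall back to the Markov path.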
## Recent Changes
- Analyzed current codebase structure
- Identified integration points for LLM
- Documented system architecture
- Created implementation plan
## Active Files
### Core Implementation
- src/index.ts
- Main bot logic
- Message handling
- Command processing
- src/entity/
- Database schema
- Need to add embedding and context tables
- src/train.ts
- Training pipeline
- Need to add embedding generation
### New Files Needed
- src/llm/
- provider.ts (LLM service integration)
- embedding.ts (Embedding generation)
- context.ts (Context management)
- src/entity/
- MessageEmbedding.ts
- ConversationContext.ts
## Next Steps
### Immediate Tasks
1. Create database migrations (migration sketched below)
- Add embedding table
- Add context table
- Update existing message schema
2. Implement LLM integration
- Set up OpenAI client
- Create response generation service
- Add fallback mechanisms
3. Add embedding pipeline
- Implement background processing
- Set up batch operations
- Add storage management
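
As a sketch of the migration task above, a TypeORM migration for the embedding table might look like this (table and column names follow the schema drafted in techContext.md; the class name and timestamp are illustrative):

```typescript
import { MigrationInterface, QueryRunner } from 'typeorm';

// Illustrative migration; assumes the existing messages table.
export class AddMessageEmbeddings1735300000000 implements MigrationInterface {
  public async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`
      CREATE TABLE message_embeddings (
        id TEXT PRIMARY KEY,
        message_id TEXT NOT NULL,
        embedding BLOB NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (message_id) REFERENCES messages(id)
      )
    `);
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`DROP TABLE message_embeddings`);
  }
}
```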
### Short-term Goals
1. Test hybrid response system
- Benchmark response times
- Measure coherence
- Validate context usage
2. Optimize performance
- Implement caching
- Add rate limiting
- Tune batch sizes
3. Update documentation
- Add LLM configuration guide
- Update deployment instructions
- Document new commands
### Dependencies
- OpenAI API access
- Additional storage capacity
- Updated environment configuration
## Implementation Strategy
### Phase 1: Foundation
1. Database schema updates
2. Basic LLM integration
3. Simple context tracking
### Phase 2: Enhancement
1. Hybrid response system
2. Advanced context management
3. Performance optimization
### Phase 3: Refinement
1. User feedback integration
2. Response quality metrics
3. Fine-tuning capabilities
## Notes
- Keep existing Markov system as fallback
- Monitor API usage and costs
- Consider implementing local LLM option
- Need to update help documentation
- Consider adding configuration commands

cline_docs/productContext.md (executable file, +50 lines)

@@ -0,0 +1,50 @@
# Product Context
Last Updated: 2024-12-27
## Why we're building this
- To create an engaging Discord bot that learns from and interacts with server conversations
- To provide natural, contextually relevant responses using both Markov chains and LLM capabilities
- To maintain conversation history and generate responses that feel authentic to each server's culture
## Core user problems/solutions
Problems:
- Current Markov responses can be incoherent or lack context
- No semantic understanding of conversation context
- Limited ability to generate coherent long-form responses
Solutions:
- Integrate LLM to enhance response quality while maintaining server-specific voice
- Use existing message database for both Markov and LLM training
- Combine Markov's randomness with LLM's coherence
## Key workflows
1. Message Collection
- Listen to channels
- Store messages in SQLite
- Track message context and metadata
2. Response Generation
- Current: Markov chain generation
- Proposed: Hybrid Markov-LLM generation
- Context-aware responses
3. Training
- Batch processing of channel history
- JSON import support
- Continuous learning from new messages
## Product direction and priorities
1. Short term
- Implement LLM integration for response generation
- Maintain existing Markov functionality as fallback
- Add context window for more relevant responses
2. Medium term
- Fine-tune LLM on server-specific data
- Implement response quality metrics
- Add conversation memory
3. Long term
- Advanced context understanding
- Personality adaptation per server
- Multi-modal response capabilities

cline_docs/systemPatterns.md (executable file, +130 lines)

@@ -0,0 +1,130 @@
# System Patterns
Last Updated: 2024-12-27
## High-level Architecture
### Current System
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Markov Generation
```
### Proposed LLM Integration
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Response Generator
                                          ├─ Markov Chain
                                          ├─ LLM
                                          └─ Response Selector
```
## Core Technical Patterns
### Data Storage
- SQLite database using TypeORM
- Entity structure:
- Guild (server)
- Channel (per-server channels)
- Messages (training data)
### Message Processing
1. Current Flow:
- Message received
- Filtered for human authorship
- Stored in database with metadata
- Used for Markov chain training
2. Enhanced Flow:
- Add message embedding generation
- Store context window
- Track conversation threads
### Response Generation
#### Current (Markov)
```typescript
interface MarkovGenerateOptions {
  filter: (result: MarkovResult) => boolean;
  maxTries: number;
  startSeed?: string;
}
```
#### Proposed (Hybrid)
```typescript
interface ResponseGenerateOptions {
  contextWindow: Message[];
  temperature: number;
  maxTokens: number;
  startSeed?: string;
  forceProvider?: 'markov' | 'llm' | 'hybrid';
}
```
## Data Flow
### Training Pipeline
1. Message Collection
- Discord channel history
- JSON imports
- Real-time messages
2. Processing
- Text cleaning
- Metadata extraction
- Embedding generation
3. Storage
- Raw messages
- Processed embeddings
- Context relationships
### Response Pipeline
1. Context Gathering
- Recent messages
- Channel history
- User interaction history
2. Generation Strategy (selector sketched below)
- Short responses: Markov chain
- Complex responses: LLM
- Hybrid: LLM-guided Markov chain
3. Post-processing
- Response filtering
- Token limit enforcement
- Attachment handling
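
The strategy split in step 2 could be a small selector function; the thresholds below are illustrative tuning knobs, not decided values:

```typescript
type Strategy = 'markov' | 'llm' | 'hybrid';

// Pick a generation strategy from the prompt and the gathered context.
function chooseStrategy(prompt: string, context: Message[]): Strategy {
  if (prompt.length < 120 && context.length === 0) return 'markov'; // short and contextless
  if (context.length > 5) return 'llm'; // longer threads need coherence
  return 'hybrid'; // default: LLM-guided Markov chain
}
```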
## Key Technical Decisions
### LLM Integration
1. Local Embedding Model
- Use sentence-transformer models for message embeddings
- Store embeddings in SQLite
- Enable semantic search (see the similarity sketch below)
2. Response Generation
- Primary: Use OpenAI API
- Fallback: Use local LLM
- Hybrid: Combine with Markov output
3. Context Management
- Rolling window of recent messages
- Semantic clustering of related content
- Thread-aware context tracking
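
If the SQLite vector extension proves unworkable, the semantic search in decision 1 can be approximated in application code with cosine similarity over stored embeddings. A sketch (the row shape is an assumption about how embeddings would be loaded):

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored embeddings against a query embedding; k is illustrative.
function topK(
  query: Float32Array,
  rows: { messageId: string; embedding: Float32Array }[],
  k = 5,
): { messageId: string; score: number }[] {
  return rows
    .map((row) => ({ messageId: row.messageId, score: cosineSimilarity(query, row.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```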
### Performance Requirements
1. Response Time (see the timeout sketch below)
- Markov: < 100ms
- LLM: < 2000ms
- Hybrid: < 2500ms
2. Memory Usage
- Max 1GB per guild
- Batch processing for large imports
- Regular cleanup of old contexts
3. Rate Limiting
- Discord API compliance
- LLM API quota management
- Fallback mechanisms
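
The LLM budget above suggests racing each call against a timer and falling back to Markov, so the bot always answers within bounds. A minimal sketch (helper and parameter names are assumptions):

```typescript
// Race a generation call against its latency budget; on timeout or error,
// fall back to the Markov generator.
async function generateWithBudget(
  llmCall: () => Promise<string>,
  markovCall: () => Promise<string>,
  budgetMs = 2000,
): Promise<string> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('LLM budget exceeded')), budgetMs),
  );
  try {
    return await Promise.race([llmCall(), timeout]);
  } catch {
    return markovCall(); // Markov path targets < 100ms
  }
}
```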

cline_docs/techContext.md (executable file, +189 lines)

@@ -0,0 +1,189 @@
# Technical Context
Last Updated: 2024-12-27
## Core Technologies
### Current Stack
- Node.js/TypeScript
- Discord.js for Discord API
- TypeORM for database management
- SQLite for data storage
- markov-strings-db for response generation
### LLM Integration Stack
- OpenAI API for primary LLM capabilities
- Sentence-transformer models for embeddings (run in Node via Transformers.js/ONNX; the sentence-transformers library itself is Python-only)
- Vector extensions for SQLite
- Redis (optional) for context caching
## Integration Patterns
### Database Schema Extensions
```sql
-- New tables for LLM integration

-- Store message embeddings
CREATE TABLE message_embeddings (
  id TEXT PRIMARY KEY,
  message_id TEXT NOT NULL,
  embedding BLOB NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (message_id) REFERENCES messages(id)
);

-- Store conversation contexts
CREATE TABLE conversation_contexts (
  id TEXT PRIMARY KEY,
  channel_id TEXT NOT NULL,
  context_window TEXT NOT NULL,
  last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (channel_id) REFERENCES channels(id)
);
```
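
The message_embeddings table maps onto a TypeORM entity along these lines (a sketch mirroring the SQL above; the Message import assumes the existing entity file name):

```typescript
import { Column, CreateDateColumn, Entity, JoinColumn, ManyToOne, PrimaryColumn } from 'typeorm';
import { Message } from './Message';

@Entity('message_embeddings')
export class MessageEmbedding {
  @PrimaryColumn('text')
  id!: string;

  @Column('text', { name: 'message_id' })
  messageId!: string;

  // SQLite BLOB column; Float32Array embeddings are stored as raw bytes.
  @Column('blob')
  embedding!: Buffer;

  @CreateDateColumn({ name: 'created_at' })
  createdAt!: Date;

  @ManyToOne(() => Message)
  @JoinColumn({ name: 'message_id' })
  message!: Message;
}
```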
### API Integration
```typescript
interface LLMConfig {
  provider: 'openai' | 'local';
  model: string;
  apiKey?: string;
  maxTokens: number;
  temperature: number;
  contextWindow: number;
}

interface ResponseGenerator {
  generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string>;
}
```
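
An OpenAI-backed implementation of ResponseGenerator could look like the following with the v4 `openai` SDK. The system-prompt wording and the assumption that Message exposes a `content` field are illustrative:

```typescript
import OpenAI from 'openai';

// Sketch only: error handling and the Markov fallback are elided.
class OpenAIResponseGenerator implements ResponseGenerator {
  private client: OpenAI;

  constructor(private config: LLMConfig) {
    this.client = new OpenAI({ apiKey: config.apiKey });
  }

  async generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string> {
    const completion = await this.client.chat.completions.create({
      model: this.config.model,
      max_tokens: this.config.maxTokens,
      temperature: this.config.temperature,
      messages: [
        { role: 'system', content: 'Reply in the voice of this server.' },
        ...options.context.map((m) => ({ role: 'user' as const, content: m.content })),
        { role: 'user', content: options.prompt },
      ],
    });
    return completion.choices[0]?.message?.content ?? '';
  }
}
```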
### Message Processing Pipeline
```typescript
interface MessageProcessor {
  processMessage(message: Discord.Message): Promise<void>;
  generateEmbedding(text: string): Promise<Float32Array>;
  updateContext(channelId: string, message: Discord.Message): Promise<void>;
}
```
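
generateEmbedding could be backed by a sentence-transformer model through Transformers.js; a sketch assuming the Xenova/all-MiniLM-L6-v2 ONNX export (384-dimensional output):

```typescript
import { pipeline } from '@xenova/transformers';

// Lazily load the model once, then mean-pool and normalize each text
// into a fixed-size vector.
let extractor: any = null; // FeatureExtractionPipeline once loaded

async function generateEmbedding(text: string): Promise<Float32Array> {
  if (!extractor) {
    extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  }
  const output = await extractor(text, { pooling: 'mean', normalize: true });
  return output.data as Float32Array;
}
```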
## Key Libraries/Frameworks
### Current Dependencies
- discord.js: ^14.x
- typeorm: ^0.x
- markov-strings-db: Custom fork
- sqlite3: ^5.x
### New Dependencies
```json
{
  "dependencies": {
    "openai": "^4.x",
    "@xenova/transformers": "^2.x",
    "onnxruntime-node": "^1.x",
    "sqlite-vss": "^0.1.x",
    "redis": "^4.x"
  }
}
```
Note: the OpenAI Node SDK is published on npm as `openai` (not `@openai/api`), and there is no npm `sentence-transformers` package; `@xenova/transformers` fills that role by running the same models over ONNX.
## Infrastructure Choices
### Deployment
- Continue with current deployment pattern
- Add environment variables for LLM configuration
- Optional Redis for high-traffic servers
### Scaling Considerations
1. Message Processing
- Batch embedding generation
- Background processing queue
- Rate limiting for API calls
2. Response Generation
- Caching frequent responses (Redis sketch below)
- Fallback to Markov when rate limited
- Load balancing between providers
3. Storage
- Regular embedding pruning
- Context window management
- Backup strategy for embeddings
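
For the response-caching item above, a thin wrapper over the redis v4 client might look like this; the key scheme and one-hour TTL are illustrative:

```typescript
import { createClient } from 'redis';

// redis.connect() must be awaited once during bot startup.
const redis = createClient({ url: process.env.REDIS_URL });

// Serve a cached response when one exists; otherwise generate and cache it.
async function cachedResponse(
  channelId: string,
  prompt: string,
  generate: () => Promise<string>,
): Promise<string> {
  const key = `resp:${channelId}:${prompt}`;
  const hit = await redis.get(key);
  if (hit !== null) return hit;
  const fresh = await generate();
  await redis.set(key, fresh, { EX: 3600 }); // expire after one hour
  return fresh;
}
```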
## Technical Constraints
### API Limitations
1. OpenAI
- Rate limits
- Token quotas
- Cost considerations
2. Discord
- Message rate limits
- Response time requirements
- Attachment handling
### Resource Usage
1. Memory
- Embedding model size
- Context window storage
- Cache management
2. Storage
- Embedding data size
- Context history retention
- Backup requirements
3. Processing
- Embedding generation load
- Response generation time
- Background task management
## Development Environment
### Setup Requirements
```bash
# Core dependencies
npm install
# LLM integration
npm install openai @xenova/transformers onnxruntime-node sqlite-vss
# Optional caching
npm install redis
```
### Environment Variables
```env
# LLM Configuration
OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-3.5-turbo
LLM_MAX_TOKENS=150
LLM_TEMPERATURE=0.7
CONTEXT_WINDOW_SIZE=10
# Optional Redis
REDIS_URL=redis://localhost:6379
```
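
These variables would be parsed once at startup into the LLMConfig shape from the API integration section; a sketch with defaults mirroring the sample values above:

```typescript
// Parse LLM settings from the environment; assumes the LLMConfig
// interface defined in the API integration section.
function loadLLMConfig(): LLMConfig {
  return {
    provider: (process.env.LLM_PROVIDER as 'openai' | 'local') ?? 'openai',
    model: process.env.LLM_MODEL ?? 'gpt-3.5-turbo',
    apiKey: process.env.OPENAI_API_KEY,
    maxTokens: Number(process.env.LLM_MAX_TOKENS ?? 150),
    temperature: Number(process.env.LLM_TEMPERATURE ?? 0.7),
    contextWindow: Number(process.env.CONTEXT_WINDOW_SIZE ?? 10),
  };
}
```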
### Testing Strategy
1. Unit Tests
- Message processing
- Embedding generation
- Context management
2. Integration Tests
- LLM API interaction
- Database operations
- Discord event handling
3. Performance Tests
- Response time benchmarks
- Memory usage monitoring
- Rate limit compliance