# System Patterns

Last Updated: 2024-12-27
## High-level Architecture

### Current System

```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Markov Generation
```

### Proposed LLM Integration

```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Response Generator
                                        ├─ Markov Chain
                                        ├─ LLM
                                        └─ Response Selector
```
## Core Technical Patterns

### Data Storage

- SQLite database using TypeORM
- Entity structure (sketched below):
  - Guild (server)
  - Channel (per-server channels)
  - Messages (training data)
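
As a rough illustration of the entity structure, a Message entity might look like the following with TypeORM. The column names and the relation to Channel are assumptions for illustration, not the project's actual schema.

```typescript
import { Entity, PrimaryGeneratedColumn, Column, ManyToOne } from 'typeorm';
import { Channel } from './Channel'; // hypothetical sibling entity

// Hypothetical shape of the Messages entity; actual columns may differ.
@Entity()
export class Message {
  @PrimaryGeneratedColumn()
  id: number;

  @Column('text')
  content: string;

  @Column()
  authorId: string;

  @Column()
  createdAt: Date;

  // Each stored message belongs to the channel it was collected from.
  @ManyToOne(() => Channel, (channel) => channel.messages)
  channel: Channel;
}
```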
### Message Processing

- Current Flow (see the handler sketch below):
  - Message received
  - Filtered for human authorship
  - Stored in database with metadata
  - Used for Markov chain training
- Enhanced Flow:
  - Add message embedding generation
  - Store context window
  - Track conversation threads
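
A minimal sketch of the current flow, assuming discord.js for the event and a TypeORM `DataSource`; the handler and entity names are illustrative, not the bot's actual code.

```typescript
import { Message as DiscordMessage } from 'discord.js';
import { DataSource } from 'typeorm';
import { Message } from './entity/Message'; // hypothetical entity (see Data Storage)

// Illustrative handler: keep only human-authored messages, then store them
// with enough metadata for later Markov training.
export async function onMessageCreate(db: DataSource, msg: DiscordMessage): Promise<void> {
  // Filter for human authorship: skip bots and system messages.
  if (msg.author.bot || msg.system) return;

  const repo = db.getRepository(Message);
  await repo.save(
    repo.create({
      content: msg.content,
      authorId: msg.author.id,
      createdAt: msg.createdAt,
    }),
  );
}
```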
### Response Generation

#### Current (Markov)

```typescript
interface MarkovGenerateOptions {
  filter: (result) => boolean;
  maxTries: number;
  startSeed?: string;
}
```

#### Proposed (Hybrid)

```typescript
interface ResponseGenerateOptions {
  contextWindow: Message[];
  temperature: number;
  maxTokens: number;
  startSeed?: string;
  forceProvider?: 'markov' | 'llm' | 'hybrid';
}
```
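
A sketch of how the proposed options could drive provider selection. The interface is restated for self-containment, and `generateMarkov` / `generateLlm` are hypothetical helpers, not existing functions in the project.

```typescript
import type { Message } from 'discord.js';

interface ResponseGenerateOptions {
  contextWindow: Message[];
  temperature: number;
  maxTokens: number;
  startSeed?: string;
  forceProvider?: 'markov' | 'llm' | 'hybrid';
}

// Hypothetical backends; signatures are assumptions.
declare function generateMarkov(seed?: string): Promise<string>;
declare function generateLlm(opts: ResponseGenerateOptions): Promise<string>;

// Route a request to one backend, defaulting to the hybrid path.
export async function generateResponse(opts: ResponseGenerateOptions): Promise<string> {
  switch (opts.forceProvider ?? 'hybrid') {
    case 'markov':
      return generateMarkov(opts.startSeed);
    case 'llm':
      return generateLlm(opts);
    case 'hybrid': {
      // The LLM proposes a short seed, which the Markov chain then completes.
      const seed = await generateLlm({ ...opts, maxTokens: 16 });
      return generateMarkov(seed);
    }
  }
}
```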
## Data Flow

### Training Pipeline

- Message Collection
  - Discord channel history
  - JSON imports
  - Real-time messages
- Processing (sketched below)
  - Text cleaning
  - Metadata extraction
  - Embedding generation
- Storage
  - Raw messages
  - Processed embeddings
  - Context relationships
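
The processing stage could look roughly like this; the cleaning rules and the `embedText` helper (a wrapper around the local embedding model) are assumptions.

```typescript
// Hypothetical wrapper around the local embedding model (see Key Technical Decisions).
declare function embedText(text: string): Promise<number[]>;

export interface ProcessedMessage {
  cleaned: string;
  embedding: number[];
  metadata: { authorId: string; channelId: string; timestamp: number };
}

// Illustrative processing step: clean the text, then attach an embedding
// and metadata before storage.
export async function processMessage(raw: {
  content: string;
  authorId: string;
  channelId: string;
  timestamp: number;
}): Promise<ProcessedMessage> {
  const cleaned = raw.content
    .replace(/<@!?\d+>/g, '') // user mentions
    .replace(/<a?:\w+:\d+>/g, '') // custom emoji
    .replace(/\s+/g, ' ')
    .trim();

  return {
    cleaned,
    embedding: await embedText(cleaned),
    metadata: {
      authorId: raw.authorId,
      channelId: raw.channelId,
      timestamp: raw.timestamp,
    },
  };
}
```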
### Response Pipeline

- Context Gathering
  - Recent messages
  - Channel history
  - User interaction history
- Generation Strategy (see the sketch below)
  - Short responses: Markov chain
  - Complex responses: LLM
  - Hybrid: LLM-guided Markov chain
- Post-processing
  - Response filtering
  - Token limit enforcement
  - Attachment handling
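
A sketch of how the generation strategy could be chosen per message; the length thresholds are illustrative assumptions, not tuned values.

```typescript
type Provider = 'markov' | 'llm' | 'hybrid';

// Illustrative heuristic: cheap Markov path for short prompts, LLM for longer
// or question-like prompts, hybrid otherwise. Thresholds are assumptions.
export function chooseProvider(prompt: string, forced?: Provider): Provider {
  if (forced) return forced;
  if (prompt.length < 40) return 'markov';
  if (prompt.includes('?') || prompt.length > 200) return 'llm';
  return 'hybrid';
}
```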
## Key Technical Decisions

### LLM Integration

- Local Embedding Model
  - Use sentence-transformers for message embedding
  - Store embeddings in SQLite
  - Enable semantic search
- Response Generation
  - Primary: Use OpenAI API
  - Fallback: Use local LLM
  - Hybrid: Combine with Markov output
- Context Management (see the sketch below)
  - Rolling window of recent messages
  - Semantic clustering of related content
  - Thread-aware context tracking
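
A minimal sketch of the rolling, thread-aware context window; the 20-entry cap and the entry shape are assumptions.

```typescript
export interface ContextEntry {
  authorId: string;
  text: string;
  threadId?: string; // set when the message belongs to a thread
}

// Rolling window of recent messages; oldest entries are evicted first.
export class ContextWindow {
  private entries: ContextEntry[] = [];

  constructor(private readonly maxEntries = 20) {}

  push(entry: ContextEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.maxEntries) {
      this.entries.splice(0, this.entries.length - this.maxEntries);
    }
  }

  // Most recent entries, optionally scoped to a single thread.
  recent(threadId?: string): ContextEntry[] {
    return threadId
      ? this.entries.filter((e) => e.threadId === threadId)
      : [...this.entries];
  }
}
```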
## Performance Requirements

- Response Time
  - Markov: < 100ms
  - LLM: < 2000ms
  - Hybrid: < 2500ms
- Memory Usage
  - Max 1GB per guild
  - Batch processing for large imports
  - Regular cleanup of old contexts
- Rate Limiting (see the fallback sketch below)
  - Discord API compliance
  - LLM API quota management
  - Fallback mechanisms
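
One way to enforce the LLM response budget and fall back to the Markov chain; `generateLlm` and `generateMarkov` are hypothetical helpers, and the 2000ms budget comes from the response-time targets above.

```typescript
declare function generateLlm(prompt: string): Promise<string>;
declare function generateMarkov(seed?: string): Promise<string>;

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('timed out')), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Fallback mechanism: try the LLM within its 2000ms budget; on timeout,
// quota exhaustion, or API error, fall back to the fast Markov path.
export async function respond(prompt: string): Promise<string> {
  try {
    return await withTimeout(generateLlm(prompt), 2000);
  } catch {
    return generateMarkov(prompt);
  }
}
```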