Update file permissions and add documentation for LLM integration

Author: pacnpal
Date: 2025-09-25 10:04:25 -04:00
Parent: 7dbabb2810
Commit: 1fdd8005f8
7 changed files with 540 additions and 0 deletions

cline_docs/activeContext.md (executable file, +116 lines)

@@ -0,0 +1,116 @@
# Active Context
Last Updated: 2024-12-27
## Current Focus
Integrating LLM capabilities into the existing Discord bot while maintaining the unique "personality" of each server's Markov-based responses.
### Active Issues
1. Response Generation
- Need to implement hybrid Markov-LLM response system
- Must maintain response speed within acceptable limits
- Need to handle API rate limiting gracefully (see the backoff sketch after this list)
2. Data Management
- Implement efficient storage for embeddings
- Design context window management
- Handle conversation threading
3. Integration Points
- Modify generateResponse function to support LLM
- Add embedding generation pipeline
- Implement context tracking
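
One way to satisfy the rate-limiting requirement above is exponential backoff around each LLM call, falling through to the Markov generator once retries are exhausted. A minimal sketch, assuming a 429-style `status` field on API errors (the `withBackoff` helper is illustrative, not existing code):

```typescript
// Illustrative retry helper for LLM API calls; retry counts and delays
// are tuning knobs, not values from the codebase.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up after the last retry, or immediately on non-rate-limit errors.
      if (attempt >= maxRetries || (err as { status?: number }).status !== 429) {
        throw err;
      }
      // Exponential backoff with jitter: ~500ms, ~1s, ~2s, ...
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller would wrap the provider call, e.g. `await withBackoff(() => llm.generateResponse(opts))`, and catch the final error to fall back to the Markov path.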
## Recent Changes
- Analyzed current codebase structure
- Identified integration points for LLM
- Documented system architecture
- Created implementation plan
## Active Files
### Core Implementation
- src/index.ts
- Main bot logic
- Message handling
- Command processing
- src/entity/
- Database schema
- Need to add embedding and context tables
- src/train.ts
- Training pipeline
- Need to add embedding generation
### New Files Needed
- src/llm/
- provider.ts (LLM service integration)
- embedding.ts (Embedding generation)
- context.ts (Context management)
- src/entity/
- MessageEmbedding.ts
- ConversationContext.ts
## Next Steps
### Immediate Tasks
1. Create database migrations (migration sketched below)
- Add embedding table
- Add context table
- Update existing message schema
2. Implement LLM integration
- Set up OpenAI client
- Create response generation service
- Add fallback mechanisms
3. Add embedding pipeline
- Implement background processing
- Set up batch operations
- Add storage management
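
As a sketch of the migration task above, a TypeORM migration for the embedding table might look like this (table and column names follow the schema drafted in techContext.md; the class name and timestamp are illustrative):

```typescript
import { MigrationInterface, QueryRunner } from 'typeorm';

// Illustrative migration; assumes the existing messages table.
export class AddMessageEmbeddings1735300000000 implements MigrationInterface {
  public async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`
      CREATE TABLE message_embeddings (
        id TEXT PRIMARY KEY,
        message_id TEXT NOT NULL,
        embedding BLOB NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (message_id) REFERENCES messages(id)
      )
    `);
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`DROP TABLE message_embeddings`);
  }
}
```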
### Short-term Goals
1. Test hybrid response system
- Benchmark response times
- Measure coherence
- Validate context usage
2. Optimize performance
- Implement caching
- Add rate limiting
- Tune batch sizes
3. Update documentation
- Add LLM configuration guide
- Update deployment instructions
- Document new commands
### Dependencies
- OpenAI API access
- Additional storage capacity
- Updated environment configuration
## Implementation Strategy
### Phase 1: Foundation
1. Database schema updates
2. Basic LLM integration
3. Simple context tracking
### Phase 2: Enhancement
1. Hybrid response system
2. Advanced context management
3. Performance optimization
### Phase 3: Refinement
1. User feedback integration
2. Response quality metrics
3. Fine-tuning capabilities
## Notes
- Keep existing Markov system as fallback
- Monitor API usage and costs
- Consider implementing local LLM option
- Need to update help documentation
- Consider adding configuration commands

cline_docs/productContext.md (executable file, +50 lines)

@@ -0,0 +1,50 @@
# Product Context
Last Updated: 2024-12-27
## Why we're building this
- To create an engaging Discord bot that learns from and interacts with server conversations
- To provide natural, contextually relevant responses using both Markov chains and LLM capabilities
- To maintain conversation history and generate responses that feel authentic to each server's culture
## Core user problems/solutions
Problems:
- Current Markov responses can be incoherent or lack context
- No semantic understanding of conversation context
- Limited ability to generate coherent long-form responses
Solutions:
- Integrate LLM to enhance response quality while maintaining server-specific voice
- Use existing message database for both Markov and LLM training
- Combine Markov's randomness with LLM's coherence
## Key workflows
1. Message Collection
- Listen to channels
- Store messages in SQLite
- Track message context and metadata
2. Response Generation
- Current: Markov chain generation
- Proposed: Hybrid Markov-LLM generation
- Context-aware responses
3. Training
- Batch processing of channel history
- JSON import support
- Continuous learning from new messages
## Product direction and priorities
1. Short term
- Implement LLM integration for response generation
- Maintain existing Markov functionality as fallback
- Add context window for more relevant responses
2. Medium term
- Fine-tune LLM on server-specific data
- Implement response quality metrics
- Add conversation memory
3. Long term
- Advanced context understanding
- Personality adaptation per server
- Multi-modal response capabilities

cline_docs/systemPatterns.md (executable file, +130 lines)

@@ -0,0 +1,130 @@
# System Patterns
Last Updated: 2024-12-27
## High-level Architecture
### Current System
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Markov Generation
```
### Proposed LLM Integration
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Response Generator
                                          ├─ Markov Chain
                                          ├─ LLM
                                          └─ Response Selector
```
## Core Technical Patterns
### Data Storage
- SQLite database using TypeORM
- Entity structure:
- Guild (server)
- Channel (per-server channels)
- Messages (training data)
### Message Processing
1. Current Flow:
- Message received
- Filtered for human authorship
- Stored in database with metadata
- Used for Markov chain training
2. Enhanced Flow:
- Add message embedding generation
- Store context window
- Track conversation threads
### Response Generation
#### Current (Markov)
```typescript
interface MarkovGenerateOptions {
  filter: (result: MarkovResult) => boolean;
  maxTries: number;
  startSeed?: string;
}
```
#### Proposed (Hybrid)
```typescript
interface ResponseGenerateOptions {
  contextWindow: Message[];
  temperature: number;
  maxTokens: number;
  startSeed?: string;
  forceProvider?: 'markov' | 'llm' | 'hybrid';
}
```
## Data Flow
### Training Pipeline
1. Message Collection
- Discord channel history
- JSON imports
- Real-time messages
2. Processing
- Text cleaning
- Metadata extraction
- Embedding generation
3. Storage
- Raw messages
- Processed embeddings
- Context relationships
### Response Pipeline
1. Context Gathering
- Recent messages
- Channel history
- User interaction history
2. Generation Strategy (selector sketched below)
- Short responses: Markov chain
- Complex responses: LLM
- Hybrid: LLM-guided Markov chain
3. Post-processing
- Response filtering
- Token limit enforcement
- Attachment handling
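
The strategy split in step 2 could be a small selector function; the thresholds below are illustrative tuning knobs, not decided values:

```typescript
type Strategy = 'markov' | 'llm' | 'hybrid';

// Pick a generation strategy from the prompt and the gathered context.
function chooseStrategy(prompt: string, context: Message[]): Strategy {
  if (prompt.length < 120 && context.length === 0) return 'markov'; // short and contextless
  if (context.length > 5) return 'llm'; // longer threads need coherence
  return 'hybrid'; // default: LLM-guided Markov chain
}
```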
## Key Technical Decisions
### LLM Integration
1. Local Embedding Model
- Use sentence-transformer models for message embeddings
- Store embeddings in SQLite
- Enable semantic search (see the similarity sketch below)
2. Response Generation
- Primary: Use OpenAI API
- Fallback: Use local LLM
- Hybrid: Combine with Markov output
3. Context Management
- Rolling window of recent messages
- Semantic clustering of related content
- Thread-aware context tracking
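
If the SQLite vector extension proves unworkable, the semantic search in decision 1 can be approximated in application code with cosine similarity over stored embeddings. A sketch (the row shape is an assumption about how embeddings would be loaded):

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored embeddings against a query embedding; k is illustrative.
function topK(
  query: Float32Array,
  rows: { messageId: string; embedding: Float32Array }[],
  k = 5,
): { messageId: string; score: number }[] {
  return rows
    .map((row) => ({ messageId: row.messageId, score: cosineSimilarity(query, row.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```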
### Performance Requirements
1. Response Time (see the timeout sketch below)
- Markov: < 100ms
- LLM: < 2000ms
- Hybrid: < 2500ms
2. Memory Usage
- Max 1GB per guild
- Batch processing for large imports
- Regular cleanup of old contexts
3. Rate Limiting
- Discord API compliance
- LLM API quota management
- Fallback mechanisms
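
The LLM budget above suggests racing each call against a timer and falling back to Markov, so the bot always answers within bounds. A minimal sketch (helper and parameter names are assumptions):

```typescript
// Race a generation call against its latency budget; on timeout or error,
// fall back to the Markov generator.
async function generateWithBudget(
  llmCall: () => Promise<string>,
  markovCall: () => Promise<string>,
  budgetMs = 2000,
): Promise<string> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('LLM budget exceeded')), budgetMs),
  );
  try {
    return await Promise.race([llmCall(), timeout]);
  } catch {
    return markovCall(); // Markov path targets < 100ms
  }
}
```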

cline_docs/techContext.md (executable file, +189 lines)

@@ -0,0 +1,189 @@
# Technical Context
Last Updated: 2024-12-27
## Core Technologies
### Current Stack
- Node.js/TypeScript
- Discord.js for Discord API
- TypeORM for database management
- SQLite for data storage
- markov-strings-db for response generation
### LLM Integration Stack
- OpenAI API for primary LLM capabilities
- Sentence-transformer models for embeddings (run in Node via Transformers.js/ONNX; the sentence-transformers library itself is Python-only)
- Vector extensions for SQLite
- Redis (optional) for context caching
## Integration Patterns
### Database Schema Extensions
```sql
-- New tables for LLM integration

-- Store message embeddings
CREATE TABLE message_embeddings (
  id TEXT PRIMARY KEY,
  message_id TEXT NOT NULL,
  embedding BLOB NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (message_id) REFERENCES messages(id)
);

-- Store conversation contexts
CREATE TABLE conversation_contexts (
  id TEXT PRIMARY KEY,
  channel_id TEXT NOT NULL,
  context_window TEXT NOT NULL,
  last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (channel_id) REFERENCES channels(id)
);
```
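
The message_embeddings table maps onto a TypeORM entity along these lines (a sketch mirroring the SQL above; the Message import assumes the existing entity file name):

```typescript
import { Column, CreateDateColumn, Entity, JoinColumn, ManyToOne, PrimaryColumn } from 'typeorm';
import { Message } from './Message';

@Entity('message_embeddings')
export class MessageEmbedding {
  @PrimaryColumn('text')
  id!: string;

  @Column('text', { name: 'message_id' })
  messageId!: string;

  // SQLite BLOB column; Float32Array embeddings are stored as raw bytes.
  @Column('blob')
  embedding!: Buffer;

  @CreateDateColumn({ name: 'created_at' })
  createdAt!: Date;

  @ManyToOne(() => Message)
  @JoinColumn({ name: 'message_id' })
  message!: Message;
}
```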
### API Integration
```typescript
interface LLMConfig {
  provider: 'openai' | 'local';
  model: string;
  apiKey?: string;
  maxTokens: number;
  temperature: number;
  contextWindow: number;
}

interface ResponseGenerator {
  generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string>;
}
```
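
An OpenAI-backed implementation of ResponseGenerator could look like the following with the v4 `openai` SDK. The system-prompt wording and the assumption that Message exposes a `content` field are illustrative:

```typescript
import OpenAI from 'openai';

// Sketch only: error handling and the Markov fallback are elided.
class OpenAIResponseGenerator implements ResponseGenerator {
  private client: OpenAI;

  constructor(private config: LLMConfig) {
    this.client = new OpenAI({ apiKey: config.apiKey });
  }

  async generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string> {
    const completion = await this.client.chat.completions.create({
      model: this.config.model,
      max_tokens: this.config.maxTokens,
      temperature: this.config.temperature,
      messages: [
        { role: 'system', content: 'Reply in the voice of this server.' },
        ...options.context.map((m) => ({ role: 'user' as const, content: m.content })),
        { role: 'user', content: options.prompt },
      ],
    });
    return completion.choices[0]?.message?.content ?? '';
  }
}
```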
### Message Processing Pipeline
```typescript
interface MessageProcessor {
  processMessage(message: Discord.Message): Promise<void>;
  generateEmbedding(text: string): Promise<Float32Array>;
  updateContext(channelId: string, message: Discord.Message): Promise<void>;
}
```
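
generateEmbedding could be backed by a sentence-transformer model through Transformers.js; a sketch assuming the Xenova/all-MiniLM-L6-v2 ONNX export (384-dimensional output):

```typescript
import { pipeline } from '@xenova/transformers';

// Lazily load the model once, then mean-pool and normalize each text
// into a fixed-size vector.
let extractor: any = null; // FeatureExtractionPipeline once loaded

async function generateEmbedding(text: string): Promise<Float32Array> {
  if (!extractor) {
    extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  }
  const output = await extractor(text, { pooling: 'mean', normalize: true });
  return output.data as Float32Array;
}
```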
## Key Libraries/Frameworks
### Current Dependencies
- discord.js: ^14.x
- typeorm: ^0.x
- markov-strings-db: Custom fork
- sqlite3: ^5.x
### New Dependencies
```json
{
  "dependencies": {
    "openai": "^4.x",
    "@xenova/transformers": "^2.x",
    "onnxruntime-node": "^1.x",
    "sqlite-vss": "^0.1.x",
    "redis": "^4.x"
  }
}
```
Note: the OpenAI Node SDK is published on npm as `openai` (not `@openai/api`), and there is no npm `sentence-transformers` package; `@xenova/transformers` fills that role by running the same models over ONNX.
## Infrastructure Choices
### Deployment
- Continue with current deployment pattern
- Add environment variables for LLM configuration
- Optional Redis for high-traffic servers
### Scaling Considerations
1. Message Processing
- Batch embedding generation
- Background processing queue
- Rate limiting for API calls
2. Response Generation
- Caching frequent responses (Redis sketch below)
- Fallback to Markov when rate limited
- Load balancing between providers
3. Storage
- Regular embedding pruning
- Context window management
- Backup strategy for embeddings
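
For the response-caching item above, a thin wrapper over the redis v4 client might look like this; the key scheme and one-hour TTL are illustrative:

```typescript
import { createClient } from 'redis';

// redis.connect() must be awaited once during bot startup.
const redis = createClient({ url: process.env.REDIS_URL });

// Serve a cached response when one exists; otherwise generate and cache it.
async function cachedResponse(
  channelId: string,
  prompt: string,
  generate: () => Promise<string>,
): Promise<string> {
  const key = `resp:${channelId}:${prompt}`;
  const hit = await redis.get(key);
  if (hit !== null) return hit;
  const fresh = await generate();
  await redis.set(key, fresh, { EX: 3600 }); // expire after one hour
  return fresh;
}
```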
## Technical Constraints
### API Limitations
1. OpenAI
- Rate limits
- Token quotas
- Cost considerations
2. Discord
- Message rate limits
- Response time requirements
- Attachment handling
### Resource Usage
1. Memory
- Embedding model size
- Context window storage
- Cache management
2. Storage
- Embedding data size
- Context history retention
- Backup requirements
3. Processing
- Embedding generation load
- Response generation time
- Background task management
## Development Environment
### Setup Requirements
```bash
# Core dependencies
npm install
# LLM integration
npm install openai @xenova/transformers onnxruntime-node sqlite-vss
# Optional caching
npm install redis
```
### Environment Variables
```env
# LLM Configuration
OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-3.5-turbo
LLM_MAX_TOKENS=150
LLM_TEMPERATURE=0.7
CONTEXT_WINDOW_SIZE=10
# Optional Redis
REDIS_URL=redis://localhost:6379
```
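
These variables would be parsed once at startup into the LLMConfig shape from the API integration section; a sketch with defaults mirroring the sample values above:

```typescript
// Parse LLM settings from the environment; assumes the LLMConfig
// interface defined in the API integration section.
function loadLLMConfig(): LLMConfig {
  return {
    provider: (process.env.LLM_PROVIDER as 'openai' | 'local') ?? 'openai',
    model: process.env.LLM_MODEL ?? 'gpt-3.5-turbo',
    apiKey: process.env.OPENAI_API_KEY,
    maxTokens: Number(process.env.LLM_MAX_TOKENS ?? 150),
    temperature: Number(process.env.LLM_TEMPERATURE ?? 0.7),
    contextWindow: Number(process.env.CONTEXT_WINDOW_SIZE ?? 10),
  };
}
```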
### Testing Strategy
1. Unit Tests
- Message processing
- Embedding generation
- Context management
2. Integration Tests
- LLM API interaction
- Database operations
- Discord event handling
3. Performance Tests
- Response time benchmarks
- Memory usage monitoring
- Rate limit compliance