Mirror of https://github.com/pacnpal/markov-discord.git, synced 2025-12-20 03:01:04 -05:00
Update file permissions and add documentation for LLM integration
cline_docs/activeContext.md (executable file, 116 lines added)
@@ -0,0 +1,116 @@
# Active Context
Last Updated: 2024-12-27

## Current Focus
Integrating LLM capabilities into the existing Discord bot while maintaining the unique "personality" of each server's Markov-based responses.

### Active Issues
1. Response Generation
   - Implement a hybrid Markov-LLM response system
   - Keep response latency within acceptable limits
   - Handle API rate limiting gracefully

2. Data Management
   - Implement efficient storage for embeddings
   - Design context window management
   - Handle conversation threading

3. Integration Points
   - Modify the generateResponse function to support the LLM
   - Add an embedding generation pipeline
   - Implement context tracking

## Recent Changes
- Analyzed current codebase structure
- Identified integration points for the LLM
- Documented system architecture
- Created implementation plan

## Active Files

### Core Implementation
- src/index.ts
  - Main bot logic
  - Message handling
  - Command processing

- src/entity/
  - Database schema
  - Needs new embedding and context tables

- src/train.ts
  - Training pipeline
  - Needs embedding generation added

### New Files Needed
- src/llm/
  - provider.ts (LLM service integration)
  - embedding.ts (embedding generation)
  - context.ts (context management)

- src/entity/
  - MessageEmbedding.ts
  - ConversationContext.ts

## Next Steps

### Immediate Tasks
1. Create database migrations (see the migration sketch after this list)
   - Add embedding table
   - Add context table
   - Update existing message schema

2. Implement LLM integration
   - Set up OpenAI client
   - Create response generation service
   - Add fallback mechanisms

3. Add embedding pipeline
   - Implement background processing
   - Set up batch operations
   - Add storage management
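
A minimal sketch of the embedding-table migration, assuming TypeORM's standard `MigrationInterface`; the DDL mirrors the `message_embeddings` schema proposed in techContext.md.

```typescript
import { MigrationInterface, QueryRunner } from 'typeorm';

// Sketch only; a real TypeORM migration class name carries a timestamp suffix.
export class AddMessageEmbeddings implements MigrationInterface {
  public async up(queryRunner: QueryRunner): Promise<void> {
    // DDL copied from the schema sketch in techContext.md.
    await queryRunner.query(`
      CREATE TABLE message_embeddings (
        id TEXT PRIMARY KEY,
        message_id TEXT NOT NULL,
        embedding BLOB NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (message_id) REFERENCES messages(id)
      )
    `);
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`DROP TABLE message_embeddings`);
  }
}
```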

### Short-term Goals
1. Test hybrid response system
   - Benchmark response times
   - Measure coherence
   - Validate context usage

2. Optimize performance
   - Implement caching
   - Add rate limiting
   - Tune batch sizes

3. Update documentation
   - Add LLM configuration guide
   - Update deployment instructions
   - Document new commands

### Dependencies
- OpenAI API access
- Additional storage capacity
- Updated environment configuration

## Implementation Strategy

### Phase 1: Foundation
1. Database schema updates
2. Basic LLM integration
3. Simple context tracking

### Phase 2: Enhancement
1. Hybrid response system
2. Advanced context management
3. Performance optimization

### Phase 3: Refinement
1. User feedback integration
2. Response quality metrics
3. Fine-tuning capabilities

## Notes
- Keep existing Markov system as fallback
- Monitor API usage and costs
- Consider implementing a local LLM option
- Update the help documentation
- Consider adding configuration commands
cline_docs/productContext.md (executable file, 50 lines added)
@@ -0,0 +1,50 @@
# Product Context
Last Updated: 2024-12-27

## Why we're building this
- To create an engaging Discord bot that learns from and interacts with server conversations
- To provide natural, contextually relevant responses using both Markov chains and LLM capabilities
- To maintain conversation history and generate responses that feel authentic to each server's culture

## Core user problems/solutions
Problems:
- Current Markov responses can be incoherent or lack context
- No semantic understanding of conversation context
- Limited ability to generate coherent long-form responses

Solutions:
- Integrate an LLM to enhance response quality while maintaining each server's voice
- Use the existing message database for both Markov and LLM training
- Combine Markov's randomness with the LLM's coherence

## Key workflows
1. Message Collection
   - Listen to channels
   - Store messages in SQLite
   - Track message context and metadata

2. Response Generation
   - Current: Markov chain generation
   - Proposed: hybrid Markov-LLM generation
   - Context-aware responses

3. Training
   - Batch processing of channel history
   - JSON import support
   - Continuous learning from new messages

## Product direction and priorities
1. Short term
   - Implement LLM integration for response generation
   - Maintain existing Markov functionality as a fallback
   - Add a context window for more relevant responses

2. Medium term
   - Fine-tune the LLM on server-specific data
   - Implement response quality metrics
   - Add conversation memory

3. Long term
   - Advanced context understanding
   - Per-server personality adaptation
   - Multi-modal response capabilities
cline_docs/systemPatterns.md (executable file, 130 lines added)
@@ -0,0 +1,130 @@
# System Patterns
Last Updated: 2024-12-27

## High-level Architecture

### Current System
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Markov Generation
```

### Proposed LLM Integration
```
Discord Events -> Message Processing -> SQLite Storage
                                     -> Response Generator
                                        ├─ Markov Chain
                                        ├─ LLM
                                        └─ Response Selector
```
## Core Technical Patterns
|
||||
|
||||
### Data Storage
|
||||
- SQLite database using TypeORM
|
||||
- Entity structure:
|
||||
- Guild (server)
|
||||
- Channel (per-server channels)
|
||||
- Messages (training data)
|
||||
|
||||
### Message Processing
|
||||
1. Current Flow:
|
||||
- Message received
|
||||
- Filtered for human authorship
|
||||
- Stored in database with metadata
|
||||
- Used for Markov chain training
|
||||
|
||||
2. Enhanced Flow:
|
||||
- Add message embedding generation
|
||||
- Store context window
|
||||
- Track conversation threads
|
||||
|
||||
### Response Generation
|
||||
|
||||
#### Current (Markov)
|
||||
```typescript
|
||||
interface MarkovGenerateOptions {
|
||||
filter: (result) => boolean;
|
||||
maxTries: number;
|
||||
startSeed?: string;
|
||||
}
|
||||
```

#### Proposed (Hybrid)
```typescript
interface ResponseGenerateOptions {
  contextWindow: Message[];
  temperature: number;
  maxTokens: number;
  startSeed?: string;
  forceProvider?: 'markov' | 'llm' | 'hybrid';
}
```
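
A possible shape for the dispatcher behind this interface. This is a sketch only: `markovGenerate` and `llmGenerate` are hypothetical stand-ins for the existing Markov call and the new LLM service, and `Message` is as in the interface above.

```typescript
// Hypothetical stand-ins for the two generators.
declare function markovGenerate(seed?: string): Promise<string>;
declare function llmGenerate(
  context: Message[],
  opts: ResponseGenerateOptions,
): Promise<string>;

async function generateResponse(opts: ResponseGenerateOptions): Promise<string> {
  const provider = opts.forceProvider ?? 'hybrid';
  if (provider === 'markov') return markovGenerate(opts.startSeed);
  if (provider === 'llm') return llmGenerate(opts.contextWindow, opts);

  // Hybrid: seed the LLM with a Markov draft so each server's "voice" survives.
  const draft = await markovGenerate(opts.startSeed);
  return llmGenerate(opts.contextWindow, { ...opts, startSeed: draft });
}
```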

## Data Flow

### Training Pipeline
1. Message Collection
   - Discord channel history
   - JSON imports
   - Real-time messages

2. Processing
   - Text cleaning
   - Metadata extraction
   - Embedding generation

3. Storage
   - Raw messages
   - Processed embeddings
   - Context relationships

### Response Pipeline
1. Context Gathering
   - Recent messages
   - Channel history
   - User interaction history

2. Generation Strategy
   - Short responses: Markov chain
   - Complex responses: LLM
   - Hybrid: LLM-guided Markov chain

3. Post-processing
   - Response filtering
   - Token limit enforcement
   - Attachment handling

## Key Technical Decisions

### LLM Integration
1. Local Embedding Model (see the embedding sketch after this list)
   - Use sentence-transformers models for message embedding
   - Store embeddings in SQLite
   - Enable semantic search

2. Response Generation
   - Primary: OpenAI API
   - Fallback: local LLM
   - Hybrid: combine with Markov output

3. Context Management
   - Rolling window of recent messages
   - Semantic clustering of related content
   - Thread-aware context tracking
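
A sketch of the embedding path, with one assumption worth flagging: `sentence-transformers` itself is a Python library, so running its models from Node would go through something like Transformers.js (`@xenova/transformers`). The model name below is an illustrative default, not a decision.

```typescript
import { pipeline } from '@xenova/transformers';

let embedder: any; // feature-extraction pipeline, loaded once

export async function embed(text: string): Promise<Float32Array> {
  // Lazily load a sentence-transformers model (assumed default shown).
  embedder ??= await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  // Mean-pool and normalize so cosine similarity becomes a plain dot product.
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return output.data as Float32Array;
}

export function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i += 1) dot += a[i] * b[i];
  return dot; // inputs are already normalized
}
```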

### Performance Requirements
1. Response Time
   - Markov: < 100 ms
   - LLM: < 2000 ms
   - Hybrid: < 2500 ms

2. Memory Usage
   - Max 1 GB per guild
   - Batch processing for large imports
   - Regular cleanup of old contexts

3. Rate Limiting
   - Discord API compliance
   - LLM API quota management
   - Fallback mechanisms (sketched below)
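
One way to enforce the LLM budget above: race the LLM call against a timer and fall back to Markov output when it runs over. The helper names are hypothetical; the 2000 ms figure comes from the table.

```typescript
// Hypothetical stand-ins for the two generators.
declare function llmGenerate(prompt: string): Promise<string>;
declare function markovGenerate(seed?: string): Promise<string>;

async function respondWithinBudget(prompt: string): Promise<string> {
  const budget = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('LLM over 2000 ms budget')), 2000),
  );
  try {
    return await Promise.race([llmGenerate(prompt), budget]);
  } catch {
    // Markov stays well under its 100 ms target, so it is a safe fallback.
    return markovGenerate();
  }
}
```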

cline_docs/techContext.md (executable file, 189 lines added)
@@ -0,0 +1,189 @@
# Technical Context
Last Updated: 2024-12-27

## Core Technologies

### Current Stack
- Node.js/TypeScript
- Discord.js for the Discord API
- TypeORM for database management
- SQLite for data storage
- markov-strings-db for response generation

### LLM Integration Stack
- OpenAI API for primary LLM capabilities
- sentence-transformers models for embeddings
- Vector extensions for SQLite
- Redis (optional) for context caching

## Integration Patterns

### Database Schema Extensions
```sql
-- New tables for LLM integration

-- Store message embeddings
CREATE TABLE message_embeddings (
    id TEXT PRIMARY KEY,
    message_id TEXT NOT NULL,
    embedding BLOB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (message_id) REFERENCES messages(id)
);

-- Store conversation contexts
CREATE TABLE conversation_contexts (
    id TEXT PRIMARY KEY,
    channel_id TEXT NOT NULL,
    context_window TEXT NOT NULL,
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (channel_id) REFERENCES channels(id)
);
```
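
One way the planned MessageEmbedding.ts entity could map the first table above, assuming standard TypeORM decorators; the foreign key is kept as a plain column for brevity.

```typescript
import { Entity, PrimaryColumn, Column, CreateDateColumn } from 'typeorm';

@Entity({ name: 'message_embeddings' })
export class MessageEmbedding {
  @PrimaryColumn('text')
  id: string;

  // Foreign key to messages(id), modeled as a plain column in this sketch.
  @Column('text', { name: 'message_id' })
  messageId: string;

  // Serialized Float32Array stored as a BLOB.
  @Column('blob')
  embedding: Buffer;

  @CreateDateColumn({ name: 'created_at' })
  createdAt: Date;
}
```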

### API Integration
```typescript
interface LLMConfig {
  provider: 'openai' | 'local';
  model: string;
  apiKey?: string;
  maxTokens: number;
  temperature: number;
  contextWindow: number;
}

interface ResponseGenerator {
  generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string>;
}
```
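
A sketch of an OpenAI-backed implementation of this interface, assuming the v4 `openai` npm client; `LLMConfig`, `ResponseGenerator`, and discord.js `Message` are as declared above, and error handling is omitted.

```typescript
import OpenAI from 'openai';
import type { Message } from 'discord.js';

export class OpenAIResponseGenerator implements ResponseGenerator {
  private client: OpenAI;

  constructor(private config: LLMConfig) {
    this.client = new OpenAI({ apiKey: config.apiKey });
  }

  async generateResponse(options: {
    prompt: string;
    context: Message[];
    guildId: string;
    channelId: string;
  }): Promise<string> {
    const completion = await this.client.chat.completions.create({
      model: this.config.model,
      max_tokens: this.config.maxTokens,
      temperature: this.config.temperature,
      messages: [
        // Recent channel messages become conversational context.
        ...options.context.map((m) => ({
          role: 'user' as const,
          content: m.content,
        })),
        { role: 'user' as const, content: options.prompt },
      ],
    });
    return completion.choices[0]?.message?.content ?? '';
  }
}
```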

### Message Processing Pipeline
```typescript
interface MessageProcessor {
  processMessage(message: Discord.Message): Promise<void>;
  generateEmbedding(text: string): Promise<Float32Array>;
  updateContext(channelId: string, message: Discord.Message): Promise<void>;
}
```
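
A minimal take on `updateContext`, assuming a simple in-memory rolling window sized by `CONTEXT_WINDOW_SIZE`; persisting to `conversation_contexts` is left out of the sketch.

```typescript
import type { Message } from 'discord.js';

const WINDOW_SIZE = Number(process.env.CONTEXT_WINDOW_SIZE ?? 10);
const contexts = new Map<string, Message[]>();

// Keep only the most recent WINDOW_SIZE messages per channel.
export async function updateContext(
  channelId: string,
  message: Message,
): Promise<void> {
  const window = contexts.get(channelId) ?? [];
  window.push(message);
  if (window.length > WINDOW_SIZE) window.shift();
  contexts.set(channelId, window);
}
```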

## Key Libraries/Frameworks

### Current Dependencies
- discord.js: ^14.x
- typeorm: ^0.x
- markov-strings-db: custom fork
- sqlite3: ^5.x

### New Dependencies
Note: the OpenAI client ships on npm as `openai` (not `@openai/api`), and `sentence-transformers` is Python-only, so the closest Node equivalent is `@xenova/transformers` (Transformers.js), which runs the same models via ONNX.
```json
{
  "dependencies": {
    "openai": "^4.x",
    "onnxruntime-node": "^1.x",
    "@xenova/transformers": "^2.x",
    "sqlite-vss": "^0.1.x",
    "redis": "^4.x"
  }
}
```

## Infrastructure Choices

### Deployment
- Continue with the current deployment pattern
- Add environment variables for LLM configuration
- Optional Redis for high-traffic servers

### Scaling Considerations
1. Message Processing
   - Batch embedding generation
   - Background processing queue
   - Rate limiting for API calls

2. Response Generation
   - Caching frequent responses (see the Redis sketch after this list)
   - Fallback to Markov when rate limited
   - Load balancing between providers

3. Storage
   - Regular embedding pruning
   - Context window management
   - Backup strategy for embeddings
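
A sketch of the optional response cache, assuming the v4 `redis` client from the dependency list; the key scheme and one-hour TTL are illustrative.

```typescript
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect(); // ESM top-level await

export async function cachedResponse(
  key: string, // e.g. a hash of guildId + prompt (illustrative)
  generate: () => Promise<string>,
): Promise<string> {
  const hit = await redis.get(key);
  if (hit) return hit;

  const response = await generate();
  await redis.set(key, response, { EX: 3600 }); // cache for one hour
  return response;
}
```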

## Technical Constraints

### API Limitations
1. OpenAI
   - Rate limits
   - Token quotas
   - Cost considerations

2. Discord
   - Message rate limits
   - Response time requirements
   - Attachment handling

### Resource Usage
1. Memory
   - Embedding model size
   - Context window storage
   - Cache management

2. Storage
   - Embedding data size
   - Context history retention
   - Backup requirements

3. Processing
   - Embedding generation load
   - Response generation time
   - Background task management

## Development Environment

### Setup Requirements
```bash
# Core dependencies
npm install

# LLM integration
npm install openai onnxruntime-node @xenova/transformers sqlite-vss

# Optional caching
npm install redis
```

### Environment Variables
```env
# LLM Configuration
OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-3.5-turbo
LLM_MAX_TOKENS=150
LLM_TEMPERATURE=0.7
CONTEXT_WINDOW_SIZE=10

# Optional Redis
REDIS_URL=redis://localhost:6379
```
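
A small sketch of loading these variables into the `LLMConfig` shape defined above; the defaults mirror the sample values, and the function name is hypothetical.

```typescript
export function loadLLMConfig(): LLMConfig {
  return {
    provider: (process.env.LLM_PROVIDER ?? 'openai') as 'openai' | 'local',
    model: process.env.LLM_MODEL ?? 'gpt-3.5-turbo',
    apiKey: process.env.OPENAI_API_KEY,
    maxTokens: Number(process.env.LLM_MAX_TOKENS ?? 150),
    temperature: Number(process.env.LLM_TEMPERATURE ?? 0.7),
    contextWindow: Number(process.env.CONTEXT_WINDOW_SIZE ?? 10),
  };
}
```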

### Testing Strategy
1. Unit Tests
   - Message processing
   - Embedding generation
   - Context management

2. Integration Tests
   - LLM API interaction
   - Database operations
   - Discord event handling

3. Performance Tests
   - Response time benchmarks
   - Memory usage monitoring
   - Rate limit compliance