# Rate Limiting Policy
**Last Updated**: November 3, 2025
**Status**: ACTIVE
**Coverage**: All public edge functions
---
## Overview
ThrillWiki enforces rate limiting on all public edge functions to prevent abuse, ensure fair usage, and protect against denial-of-service (DoS) attacks.
---
## Rate Limit Tiers
### Strict (5 requests/minute per IP)
**Use Case**: Expensive operations that consume significant resources
**Protected Endpoints**:
- `/upload-image` - File upload operations
- Future: Data exports, account deletion
**Reasoning**: File uploads are resource-intensive and should be limited to prevent storage abuse and bandwidth exhaustion.
---
### Standard (10 requests/minute per IP)
**Use Case**: Most API endpoints with moderate resource usage
**Protected Endpoints**:
- `/detect-location` - IP geolocation service
- Future: Public search/filter endpoints
**Reasoning**: Standard protection for endpoints that query external APIs or perform moderate processing.
---
### Lenient (30 requests/minute per IP)
**Use Case**: Read-only, cached endpoints with minimal resource usage
**Protected Endpoints**:
- Future: Cached entity data queries
- Future: Static content endpoints
**Reasoning**: Allow higher throughput for lightweight operations that don't strain resources.
---
### Per-User (Configurable, default 20 requests/minute)
**Use Case**: Authenticated endpoints where rate limiting by user ID provides better protection
**Protected Endpoints**:
- `/process-selective-approval` - 10 requests/minute per moderator
- Future: User-specific API endpoints
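**Reasoning**: Moderators have different usage patterns than public users. Keying the limit on user ID rather than IP caps what any single account can do, which limits the impact of shared or compromised credentials while still allowing legitimate high-volume usage.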
**Implementation**:
```typescript
const approvalRateLimiter = rateLimiters.perUser(10); // Custom limit
```
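
The tiers above could be captured in a single configuration table. The following is a minimal sketch, not the actual contents of `_shared/rateLimiter.ts`: the names `RateLimitConfig`, `TIERS`, and `perUserTier` are hypothetical, and the 60-second window is inferred from the "requests/minute" wording.

```typescript
// Hypothetical tier table; names and shape are assumptions for illustration.
interface RateLimitConfig {
  maxRequests: number; // requests allowed per window
  windowMs: number;    // window length in milliseconds
  keyBy: 'ip' | 'user';
}

const ONE_MINUTE_MS = 60_000;

const TIERS: Record<'strict' | 'standard' | 'lenient', RateLimitConfig> = {
  strict: { maxRequests: 5, windowMs: ONE_MINUTE_MS, keyBy: 'ip' },
  standard: { maxRequests: 10, windowMs: ONE_MINUTE_MS, keyBy: 'ip' },
  lenient: { maxRequests: 30, windowMs: ONE_MINUTE_MS, keyBy: 'ip' },
};

// Per-user tier with the documented default of 20 requests/minute.
function perUserTier(maxRequests = 20): RateLimitConfig {
  return { maxRequests, windowMs: ONE_MINUTE_MS, keyBy: 'user' };
}
```

Under this sketch, `perUserTier(10)` would correspond to the moderator limit on `/process-selective-approval`.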
---
## Rate Limit Headers
All responses include rate limit information:
```http
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
```
**On Rate Limit Exceeded** (HTTP 429):
```http
Retry-After: 45
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
```
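
A client can read these headers before deciding whether to send another request. The helper below is a sketch under assumptions: `HeaderReader`, `RateLimitInfo`, `parseCount`, and `readRateLimitInfo` are hypothetical names, and `Retry-After` is treated as a number of seconds, as in the example above.

```typescript
// Minimal structural type so the helper accepts fetch's Headers or a test stub.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  retryAfterSeconds: number | null; // only expected on 429 responses
}

// Parse a header value as a non-negative integer, or null if absent or invalid.
function parseCount(value: string | null): number | null {
  if (value === null) return null;
  const n = Number.parseInt(value, 10);
  return Number.isNaN(n) || n < 0 ? null : n;
}

function readRateLimitInfo(headers: HeaderReader): RateLimitInfo {
  return {
    limit: parseCount(headers.get('X-RateLimit-Limit')),
    remaining: parseCount(headers.get('X-RateLimit-Remaining')),
    retryAfterSeconds: parseCount(headers.get('Retry-After')),
  };
}
```

With a `fetch` response, this would be called as `readRateLimitInfo(response.headers)`.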
---
## Error Response Format
When the rate limit is exceeded, the response body is:
```json
{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later.",
  "retryAfter": 45
}
```
**HTTP Status Code**: 429 Too Many Requests
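
On the client, the body can be validated before `retryAfter` is used. This is a sketch only: the type `RateLimitErrorBody` and the guard `isRateLimitErrorBody` are hypothetical names, with fields taken from the JSON shown above.

```typescript
// Shape of the 429 body as shown in this document.
interface RateLimitErrorBody {
  error: string;
  message: string;
  retryAfter: number; // seconds
}

// Type guard: true only if the value has the documented fields and types.
function isRateLimitErrorBody(value: unknown): value is RateLimitErrorBody {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.error === 'string' &&
    typeof v.message === 'string' &&
    typeof v.retryAfter === 'number'
  );
}
```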
---
## Client Implementation
### Handling Rate Limits
```typescript
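async function uploadImage(file: File, retriesLeft = 3): Promise<unknown> {
  // Build the multipart body; the 'file' field name is an assumption.
  const formData = new FormData();
  formData.append('file', file);

  try {
    const response = await fetch('/upload-image', {
      method: 'POST',
      body: formData,
    });
    if (response.status === 429 && retriesLeft > 0) {
      const data = await response.json();
      const retryAfter = data.retryAfter || 60;
      console.warn(`Rate limited. Retry in ${retryAfter} seconds`);
      // Wait for the server-specified delay, then retry a bounded number of times
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
      return uploadImage(file, retriesLeft - 1);
    }
    if (!response.ok) {
      throw new Error(`Upload failed with status ${response.status}`);
    }
    return response.json();
  } catch (error) {
    console.error('Upload failed:', error);
    throw error;
  }
}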
```
### Exponential Backoff
For production clients, implement exponential backoff:
```typescript
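async function uploadWithBackoff(file: File, maxRetries = 3) {
  // Build the multipart body; the 'file' field name is an assumption.
  const formData = new FormData();
  formData.append('file', file);

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await fetch('/upload-image', {
        method: 'POST',
        body: formData,
      });
      if (response.status !== 429) {
        return response.json();
      }
    } catch (error) {
      // Network error: give up on the last attempt, otherwise back off and retry
      if (attempt === maxRetries - 1) throw error;
    }
    // Exponential backoff: 1s, 2s, 4s (no wait after the final attempt)
    if (attempt < maxRetries - 1) {
      const backoffDelay = Math.pow(2, attempt) * 1000;
      await new Promise(resolve => setTimeout(resolve, backoffDelay));
    }
  }
  throw new Error('Max retries exceeded');
}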
```
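
The fixed 1s/2s/4s schedule does not take the server's `Retry-After` value into account, and many clients retrying on the same schedule can arrive in bursts. One possible refinement, sketched here under assumptions, adds random jitter and never waits less than the server asked for. The function name `backoffDelayMs` and its default values are hypothetical.

```typescript
// Hypothetical helper: pick a wait time for retry number `attempt` (0-based).
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds: number | null,
  baseMs = 1000,
  maxMs = 60_000,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  // Exponential growth: base, 2*base, 4*base, ... capped at maxMs.
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  // Full jitter: a random delay in [0, exponential) spreads out retrying clients.
  const jittered = Math.floor(random() * exponential);
  // Never retry sooner than the server asked for.
  const serverHintMs = retryAfterSeconds !== null ? retryAfterSeconds * 1000 : 0;
  return Math.max(jittered, serverHintMs);
}
```

The result would replace `backoffDelay` in the loop, with `retryAfterSeconds` taken from the 429 response.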
---
## Monitoring & Metrics
### Key Metrics to Track
1. **Rate Limit Hit Rate**: Percentage of requests hitting limits
2. **429 Response Count**: Total rate limit errors by endpoint
3. **Top Rate Limited IPs**: Identify potential abuse patterns
4. **False Positive Rate**: Legitimate users hitting limits
### Alerting Thresholds
**Warning Alerts**:
- Rate limit hit rate > 5% on any endpoint
- Single IP hits rate limit > 10 times in 1 hour
**Critical Alerts**:
- Rate limit hit rate > 20% (may indicate DDoS)
- Multiple IPs hitting limits simultaneously (coordinated attack)
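
The hit-rate thresholds can be expressed as a small classification function. This is an illustrative sketch, not part of any existing monitoring code; `AlertLevel` and `classifyHitRate` are hypothetical names.

```typescript
type AlertLevel = 'ok' | 'warning' | 'critical';

// Thresholds from this document: warning above 5%, critical above 20%.
const WARNING_HIT_RATE = 0.05;
const CRITICAL_HIT_RATE = 0.2;

function classifyHitRate(limitedRequests: number, totalRequests: number): AlertLevel {
  if (totalRequests <= 0) return 'ok'; // no traffic, nothing to alert on
  const hitRate = limitedRequests / totalRequests;
  if (hitRate > CRITICAL_HIT_RATE) return 'critical';
  if (hitRate > WARNING_HIT_RATE) return 'warning';
  return 'ok';
}
```

For example, 60 rate-limited responses out of 1,000 requests on an endpoint is a 6% hit rate, which falls in the warning band.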
---
## Rate Limit Adjustments
### Increasing Limits for Legitimate Use
If you have a legitimate use case requiring higher limits:
1. **Contact Support**: Describe your use case and expected volume
2. **Verification**: We'll verify your account and usage patterns
3. **Temporary Increase**: We may grant a temporary limit increase
4. **Custom Tier**: High-volume verified accounts may get custom limits
**Examples of Valid Requests**:
- Bulk data migration project
- Integration with external service
- High-traffic public API client
---
## Technical Implementation
### Architecture
Rate limiting is implemented in memory, with:
- **Storage**: Map-based storage (IP → {count, resetAt})
- **Cleanup**: Periodic cleanup of expired entries (every 30 seconds)
- **Capacity Management**: LRU eviction when map exceeds 10,000 entries
- **Emergency Handling**: Automatic cleanup if memory pressure detected
### Memory Management
**Map Capacity**: 10,000 unique IPs tracked simultaneously
**Cleanup Interval**: Every 30 seconds or half the rate limit window
**LRU Eviction**: Removes 30% oldest entries when at capacity
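
To make the storage and eviction behavior concrete, here is a minimal sketch of a Map-based limiter. It is not the code in `_shared/rateLimiter.ts`: the class name `FixedWindowLimiter`, its method names, and the choice of fixed-window counting are assumptions; only the `{count, resetAt}` entry shape, the 10,000-entry capacity, and the 30% eviction come from this document.

```typescript
interface WindowEntry {
  count: number;
  resetAt: number; // epoch ms when the current window ends
}

class FixedWindowLimiter {
  private readonly entries = new Map<string, WindowEntry>();
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly maxEntries: number;

  constructor(maxRequests: number, windowMs: number, maxEntries = 10_000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.maxEntries = maxEntries;
  }

  // Count one request for `key` and report whether it is allowed.
  check(key: string, now = Date.now()): { allowed: boolean; remaining: number; retryAfter: number } {
    let entry = this.entries.get(key);
    if (!entry || entry.resetAt <= now) {
      // New key at capacity: make room before tracking it.
      if (!entry && this.entries.size >= this.maxEntries) this.evictOldest();
      entry = { count: 0, resetAt: now + this.windowMs };
    }
    // Re-insert so Map iteration order tracks recency of use.
    this.entries.delete(key);
    this.entries.set(key, entry);

    if (entry.count >= this.maxRequests) {
      return { allowed: false, remaining: 0, retryAfter: Math.ceil((entry.resetAt - now) / 1000) };
    }
    entry.count += 1;
    return { allowed: true, remaining: this.maxRequests - entry.count, retryAfter: 0 };
  }

  // Drop expired entries; intended to run on a timer (e.g. every 30 seconds).
  cleanup(now = Date.now()): void {
    this.entries.forEach((entry, key) => {
      if (entry.resetAt <= now) this.entries.delete(key);
    });
  }

  // Remove roughly the 30% least recently used entries.
  private evictOldest(): void {
    const toRemove = Math.max(1, Math.round(this.entries.size * 0.3));
    const oldestKeys = Array.from(this.entries.keys()).slice(0, toRemove);
    oldestKeys.forEach((key) => this.entries.delete(key));
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A JavaScript `Map` iterates in insertion order, so deleting and re-inserting a key on each access keeps the least recently used keys at the front, which is what makes the simple eviction loop approximate LRU.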
### Shared Middleware
All edge functions use the shared rate limiter:
```typescript
import { rateLimiters, withRateLimit } from '../_shared/rateLimiter.ts';

const limiter = rateLimiters.strict; // or .standard, .lenient, .perUser(n)

serve(withRateLimit(async (req) => {
  // Your edge function logic
}, limiter, corsHeaders));
```
---
## Security Considerations
### IP Spoofing Protection
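Rate limiting keys on the first IP in the `X-Forwarded-For` header:
- Trusts proxy headers in production (Cloudflare, Supabase)
- Uses only the first IP in the chain; this resists spoofing only when the upstream proxy overwrites or sanitizes client-supplied `X-Forwarded-For` values, since a client can otherwise set the header itself
- Falls back to `X-Real-IP` if `X-Forwarded-For` is unavailable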
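
The lookup order described above can be sketched as follows. `getClientIp` and `HeaderReader` are hypothetical names, not the identifiers used in the shared middleware, and the sketch assumes it runs behind a proxy that controls these headers.

```typescript
// Minimal structural type so the helper accepts a Request's headers or a test stub.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical helper: first X-Forwarded-For entry, then X-Real-IP, else null.
function getClientIp(headers: HeaderReader): string | null {
  const forwarded = headers.get('X-Forwarded-For');
  if (forwarded) {
    // The header is a comma-separated chain; take the first entry.
    const first = (forwarded.split(',')[0] ?? '').trim();
    if (first) return first;
  }
  const realIp = headers.get('X-Real-IP');
  return realIp ? realIp.trim() || null : null;
}
```

A `null` result means no client IP could be determined; how such requests are keyed is a policy decision this sketch leaves open.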
### Distributed Attacks
**Current Limitation**: In-memory rate limiting is per-edge-function instance
- Distributed attacks across multiple instances may bypass limits
- Future: Consider distributed rate limiting (Redis, Supabase table)
**Mitigation**:
- Monitor aggregate request rates across all instances
- Use Cloudflare rate limiting as first line of defense
- Alert on unusual traffic patterns
---
## Bypassing Rate Limits
**Important**: Rate limits CANNOT be bypassed, even for authenticated users.
**Why No Bypass?**:
- Prevents credential compromise from affecting system stability
- Ensures fair usage across all users
- Protects backend infrastructure
**Moderator/Admin Considerations**:
- Per-user rate limiting allows higher individual limits
- Moderators have different tiers for moderation actions
- No complete bypass to prevent abuse of compromised accounts
---
## Testing Rate Limits
### Manual Testing
```bash
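# Test upload-image rate limit (5 req/min)
for i in {1..6}; do
  # curl exits 0 even on HTTP 429, so print the status code instead of relying on &&
  status=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
    https://api.thrillwiki.com/functions/v1/upload-image \
    -H "Authorization: Bearer $TOKEN" \
    -d '{}')
  echo "Request $i: HTTP $status"
done
# Expected: requests 1-5 are not rate limited, request 6 returns 429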
```
### Automated Testing
```typescript
describe('Rate Limiting', () => {
  test('enforces strict limits on upload-image', async () => {
    const requests = [];
    // Make 6 requests (limit is 5)
    for (let i = 0; i < 6; i++) {
      requests.push(fetch('/upload-image', { method: 'POST' }));
    }
    const responses = await Promise.all(requests);
    const statuses = responses.map(r => r.status);
    expect(statuses.filter(s => s === 200).length).toBe(5);
    expect(statuses.filter(s => s === 429).length).toBe(1);
  });
});
```
---
## Future Enhancements
### Planned Improvements
1. **Database-Backed Rate Limiting**: Persistent rate limiting across edge function instances
2. **Dynamic Rate Limits**: Adjust limits based on system load
3. **User Reputation System**: Higher limits for trusted users
4. **API Keys**: Rate limiting by API key for integrations
5. **Cost-Based Limiting**: Different limits for different operation costs
---
## Related Documentation
- [Security Fixes (P0)](./SECURITY_FIXES_P0.md)
- [Edge Function Development](./EDGE_FUNCTIONS.md)
- [Error Tracking](./ERROR_TRACKING.md)
---
## Troubleshooting
### "Rate limit exceeded" when I haven't made many requests
**Possible Causes**:
1. **Shared IP**: You're behind a NAT/VPN sharing an IP with others
2. **Recent Requests**: Rate limit window hasn't reset yet
3. **Multiple Tabs**: Multiple browser tabs making requests
**Solutions**:
- Wait for the rate limit window to reset (the `Retry-After` header gives the wait in seconds)
- Check browser dev tools for unexpected background requests
- Disable browser extensions that might be making requests
### Rate limit seems inconsistent
**Explanation**: Rate limiting is per-edge-function instance
- Multiple instances may have separate rate limit counters
- Distributed traffic may see different limits
- This is expected behavior for in-memory rate limiting
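
As a rough worked example of why the observed limit varies: if each instance keeps its own counter, a client whose requests are spread across instances can get through up to the per-instance limit on each of them. The function name `worstCaseEffectiveLimit` is hypothetical, and the instance count is an assumption, since it depends on how the platform scales.

```typescript
// Upper bound on requests one client can get through per window when
// every instance keeps an independent counter for that client.
function worstCaseEffectiveLimit(perInstanceLimit: number, instanceCount: number): number {
  return perInstanceLimit * Math.max(1, instanceCount);
}
```

With the strict tier (5 requests/minute) and three running instances, the bound would be 15 requests/minute, while a client that always lands on the same instance still sees 5.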
---
## Contact
For rate limit issues or increase requests:
- **Support**: [Contact form on ThrillWiki]
- **Documentation**: https://docs.thrillwiki.com
- **Status**: https://status.thrillwiki.com