# Rate Limit Monitoring Setup

This document explains how to set up automated rate limit monitoring with alerts.

## Overview

The rate limit monitoring system consists of:

1. **Metrics Collection** - Tracks all rate limit checks in-memory
2. **Alert Configuration** - Database table with configurable thresholds
3. **Monitor Function** - Edge function that checks metrics and triggers alerts
4. **Cron Job** - Scheduled job that runs the monitor function periodically

## Setup Instructions

### Step 1: Enable Required Extensions

Run this SQL in your Supabase SQL Editor:

```sql
-- Enable pg_cron for scheduling
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Enable pg_net for HTTP requests
CREATE EXTENSION IF NOT EXISTS pg_net;
```

### Step 2: Create the Cron Job

Run this SQL to schedule the monitor to run every 5 minutes:

```sql
SELECT cron.schedule(
  'monitor-rate-limits',
  '*/5 * * * *', -- Every 5 minutes
  $$
  SELECT
    net.http_post(
        url:='https://api.thrillwiki.com/functions/v1/monitor-rate-limits',
        headers:='{"Content-Type": "application/json", "Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6InlkdnRtbnJzenlicW5iY3FiZGN5Iiwicm9sZSI6ImFub24iLCJpYXQiOjE3NTgzMjYzNTYsImV4cCI6MjA3MzkwMjM1Nn0.DM3oyapd_omP5ZzIlrT0H9qBsiQBxBRgw2tYuqgXKX4"}'::jsonb,
        body:='{}'::jsonb
    ) as request_id;
  $$
);
```

### Step 3: Verify the Cron Job

Check that the cron job was created:

```sql
SELECT * FROM cron.job WHERE jobname = 'monitor-rate-limits';
```

### Step 4: Configure Alert Thresholds

Visit the admin dashboard at `/admin/rate-limit-metrics` and navigate to the "Configuration" tab to:

- Enable/disable specific alerts
- Adjust threshold values
- Modify time windows

Default configurations are automatically created:

- **Block Rate Alert**: Triggers when >50% of requests are blocked in 5 minutes
- **Total Requests Alert**: Triggers when >1000 requests/minute
- **Unique IPs Alert**: Triggers when >100 unique IPs in 5 minutes (disabled by default)

## How It Works

### 1. Metrics Collection

Every rate limit check (both allowed and blocked) is recorded with:

- Timestamp
- Function name
- Client IP
- User ID (if authenticated)
- Result (allowed/blocked)
- Remaining quota
- Rate limit tier

Metrics are stored in-memory for the last 10,000 checks.

### 2. Monitoring Process

Every 5 minutes, the monitor function:

1. Fetches enabled alert configurations from the database
2. Analyzes current metrics for each configuration's time window
3. Compares metrics against configured thresholds
4. For exceeded thresholds:
   - Records the alert in the `rate_limit_alerts` table
   - Sends a notification to moderators via Novu
   - Skips both steps if a recent unresolved alert already exists (prevents spam)

### 3. Alert Deduplication

Alerts are deduplicated using a 15-minute window. If an alert for the same configuration was triggered in the last 15 minutes and hasn't been resolved, no new alert is sent.

### 4. Notifications

Alerts are sent to all moderators via the "moderators" topic in Novu, including:

- Email notifications
- In-app notifications (if configured)
- Custom notification channels (if configured)

## Monitoring the Monitor

### Check Cron Job Status

```sql
-- View recent cron job runs
SELECT * FROM cron.job_run_details
WHERE jobid = (SELECT jobid FROM cron.job WHERE jobname = 'monitor-rate-limits')
ORDER BY start_time DESC
LIMIT 10;
```

### View Function Logs

Check the edge function logs in Supabase Dashboard:
`https://supabase.com/dashboard/project/ydvtmnrszybqnbcqbdcy/functions/monitor-rate-limits/logs`

### Test Manually

You can test the monitor function manually by calling it via HTTP:

```bash
curl -X POST https://api.thrillwiki.com/functions/v1/monitor-rate-limits \
  -H "Content-Type: application/json"
```

## Adjusting the Schedule

To change how often the monitor runs, update the cron schedule:

```sql
-- cron.alter_job expects the numeric job ID, so look it up by job name.

-- Update to run every 10 minutes instead
SELECT cron.alter_job((SELECT jobid FROM cron.job WHERE jobname = 'monitor-rate-limits'), schedule := '*/10 * * * *');

-- Update to run every hour
SELECT cron.alter_job((SELECT jobid FROM cron.job WHERE jobname = 'monitor-rate-limits'), schedule := '0 * * * *');

-- Update to run every minute (not recommended - may generate too many alerts)
SELECT cron.alter_job((SELECT jobid FROM cron.job WHERE jobname = 'monitor-rate-limits'), schedule := '* * * * *');
```

## Removing the Cron Job

If you need to disable monitoring:

```sql
SELECT cron.unschedule('monitor-rate-limits');
```

## Troubleshooting

### No Alerts Being Triggered

1. Check if any alert configurations are enabled:
   ```sql
   SELECT * FROM rate_limit_alert_config WHERE enabled = true;
   ```

2. Check if metrics are being collected:
   - Visit `/admin/rate-limit-metrics` and check the "Recent Activity" tab
   - If there is no activity, the rate limiter might not be in use

3. Check the monitor function logs for errors

### Too Many Alerts

- Increase threshold values in the configuration
- Increase time windows for less sensitive detection
- Disable specific alert types that are too noisy

### Monitor Not Running

1. Verify the cron job exists and is active
2. Check `cron.job_run_details` for error messages
3. Verify the edge function deployed successfully
4. Check network connectivity between the cron scheduler and the edge function

## Database Tables

### `rate_limit_alert_config`

Stores alert threshold configurations. Only admins can modify.

### `rate_limit_alerts`

Stores history of all triggered alerts. Moderators can view and resolve.

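To illustrate how the two tables interact, the following is a minimal TypeScript sketch of threshold evaluation combined with the 15-minute deduplication rule described under How It Works. It is not the deployed monitor function. The type and field names (`AlertConfig`, `thresholdValue`, `timeWindowMs`, `AlertRecord`), the helpers `measure` and `shouldAlert`, and the per-minute normalization of the total requests metric are all assumptions made for illustration; the real column names are not given in this document.

```typescript
// Hypothetical shapes: the real columns of rate_limit_alert_config and
// rate_limit_alerts are not documented here.
type MetricType = "block_rate" | "total_requests" | "unique_ips";

interface AlertConfig {
  id: string;
  metricType: MetricType;
  thresholdValue: number; // block_rate is a fraction, e.g. 0.5 for 50%
  timeWindowMs: number;
  enabled: boolean;
}

interface RateLimitCheck {
  timestamp: number; // epoch milliseconds
  clientIp: string;
  allowed: boolean;
}

interface AlertRecord {
  configId: string;
  createdAt: number; // epoch milliseconds
  resolved: boolean;
}

// 15-minute deduplication window, as described under How It Works.
const DEDUP_WINDOW_MS = 15 * 60 * 1000;

// Compute the current value of a config's metric over its time window.
function measure(config: AlertConfig, checks: RateLimitCheck[], now: number): number {
  const recent = checks.filter((c) => c.timestamp > now - config.timeWindowMs);
  if (config.metricType === "block_rate") {
    const blocked = recent.filter((c) => !c.allowed).length;
    return recent.length === 0 ? 0 : blocked / recent.length;
  }
  if (config.metricType === "total_requests") {
    // Assumption: the threshold is expressed in requests per minute.
    return recent.length / (config.timeWindowMs / (60 * 1000));
  }
  return new Set(recent.map((c) => c.clientIp)).size;
}

// True only when the threshold is exceeded and no recent unresolved alert
// exists for the same configuration.
function shouldAlert(
  config: AlertConfig,
  checks: RateLimitCheck[],
  history: AlertRecord[],
  now: number,
): boolean {
  if (!config.enabled) return false;
  if (measure(config, checks, now) <= config.thresholdValue) return false;
  const duplicate = history.some(
    (a) => a.configId === config.id && !a.resolved && now - a.createdAt < DEDUP_WINDOW_MS,
  );
  return !duplicate;
}
```

In this sketch, resolving an alert in `rate_limit_alerts` re-arms the corresponding configuration immediately, while an unresolved alert suppresses repeats only until the 15-minute window has passed.
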
## Security

- Alert configurations can only be modified by admin/superuser roles
- Alert history is only accessible to moderators and above
- The monitor function runs without JWT verification (as a cron job)
- All database operations respect Row Level Security policies

## Performance Considerations

- In-memory metrics store max 10,000 entries (auto-trimmed)
- Metrics older than the longest configured time window are not useful
- Monitor function typically runs in <500ms
- No significant database load (simple queries on small tables)

## Future Enhancements

Possible improvements:

- Function-specific alert thresholds
- Alert aggregation (daily/weekly summaries)
- Custom notification channels per alert type
- Machine learning-based anomaly detection
- Integration with external monitoring tools (Datadog, New Relic, etc.)
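
As a closing illustration of the auto-trimmed in-memory store mentioned under Performance Considerations, here is a minimal TypeScript sketch of a buffer capped at 10,000 entries. The class name `MetricsBuffer`, its methods, and the field names on `RateLimitMetric` are hypothetical; they mirror the fields listed under Metrics Collection but are not taken from the actual implementation.

```typescript
// Hypothetical record shape, based on the fields listed under Metrics Collection.
interface RateLimitMetric {
  timestamp: number; // epoch milliseconds
  functionName: string;
  clientIp: string;
  userId?: string; // present only for authenticated requests
  allowed: boolean;
  remaining: number;
  tier: string;
}

// Keeps only the most recent `maxEntries` checks in memory.
class MetricsBuffer {
  private entries: RateLimitMetric[] = [];
  private readonly maxEntries: number;

  constructor(maxEntries: number = 10000) {
    this.maxEntries = maxEntries;
  }

  // Append a check and drop the oldest entries once the cap is exceeded.
  record(metric: RateLimitMetric): void {
    this.entries.push(metric);
    const overflow = this.entries.length - this.maxEntries;
    if (overflow > 0) {
      this.entries.splice(0, overflow);
    }
  }

  // Return the checks recorded at or after `cutoff` (epoch milliseconds).
  since(cutoff: number): RateLimitMetric[] {
    return this.entries.filter((m) => m.timestamp >= cutoff);
  }

  get size(): number {
    return this.entries.length;
  }
}
```

Because the buffer lives in process memory, a sketch like this loses its contents whenever the process restarts, which is one reason only the most recent window of metrics matters to the monitor.
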