# Phase 2: Automated Cleanup Jobs - COMPLETE ✅

## Overview

Implemented a comprehensive automated cleanup system to prevent database bloat and maintain Sacred Pipeline health. All cleanup tasks run via a master function with detailed logging and error handling.

---

## 🎯 Implemented Cleanup Functions

### 1. **cleanup_expired_idempotency_keys()**

- **Purpose**: Remove idempotency keys that expired over 1 hour ago
- **Retention**: Keys expire after 24 hours and are deleted after 25 hours
- **Returns**: Count of deleted keys

**Example**:
```sql
SELECT cleanup_expired_idempotency_keys();
-- Returns: 42 (keys deleted)
```

---

### 2. **cleanup_stale_temp_refs(p_age_days INTEGER DEFAULT 30)**

- **Purpose**: Remove temporary submission references older than the specified number of days
- **Retention**: 30 days by default (configurable)
- **Returns**: Deleted count and oldest deletion date

**Example**:
```sql
SELECT * FROM cleanup_stale_temp_refs(30);
-- Returns: (deleted_count: 15, oldest_deleted_date: '2024-10-08')
```

---

### 3. **cleanup_abandoned_locks()** ⭐ NEW

- **Purpose**: Release locks held by deleted users, banned users, and expired locks
- **Returns**: Released count and a breakdown by reason

**Handles**:
- Locks from deleted users (no longer in auth.users)
- Locks from banned users (profiles.banned = true)
- Expired locks (locked_until < NOW())

**Example**:
```sql
SELECT * FROM cleanup_abandoned_locks();
-- Returns:
-- {
--   released_count: 8,
--   lock_details: {
--     deleted_user_locks: 2,
--     banned_user_locks: 3,
--     expired_locks: 3
--   }
-- }
```

---
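The three release criteria above can be sketched as a small predicate. This is an illustrative TypeScript mirror of the SQL logic, not the real implementation; the `Lock` shape and field names are assumptions.

```typescript
// Hypothetical mirror of the lock-release criteria in cleanup_abandoned_locks().
// The Lock shape and field names are illustrative, not the real schema.
type Lock = {
  userExists: boolean;  // still present in auth.users?
  userBanned: boolean;  // profiles.banned = true?
  lockedUntil: Date;    // lock expiry timestamp
};

type ReleaseReason = "deleted_user" | "banned_user" | "expired" | null;

// Returns why a lock should be released, or null if it is still valid.
// Checked in the same order the function reports its breakdown.
function releaseReason(lock: Lock, now: Date): ReleaseReason {
  if (!lock.userExists) return "deleted_user";
  if (lock.userBanned) return "banned_user";
  if (lock.lockedUntil.getTime() < now.getTime()) return "expired";
  return null; // lock is still held legitimately
}
```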
### 4. **cleanup_old_submissions(p_retention_days INTEGER DEFAULT 90)** ⭐ NEW

- **Purpose**: Delete old approved/rejected submissions to reduce database size
- **Retention**: 90 days by default (configurable)
- **Preserves**: Pending submissions and test data
- **Returns**: Deleted count, status breakdown, and oldest deletion date

**Example**:
```sql
SELECT * FROM cleanup_old_submissions(90);
-- Returns:
-- {
--   deleted_count: 156,
--   deleted_by_status: { "approved": 120, "rejected": 36 },
--   oldest_deleted_date: '2024-08-10'
-- }
```

---

## 🎛️ Master Cleanup Function

### **run_all_cleanup_jobs()** ⭐ NEW

**Purpose**: Execute all 4 cleanup tasks in one call with comprehensive error handling

**Features**:
- Individual task exception handling (one failure doesn't stop the others)
- Detailed execution results with success/error per task
- Performance timing and logging

**Example**:
```sql
SELECT * FROM run_all_cleanup_jobs();
```

**Returns**:
```json
{
  "idempotency_keys": { "deleted": 42, "success": true },
  "temp_refs": { "deleted": 15, "oldest_date": "2024-10-08T14:32:00Z", "success": true },
  "locks": {
    "released": 8,
    "details": { "deleted_user_locks": 2, "banned_user_locks": 3, "expired_locks": 3 },
    "success": true
  },
  "old_submissions": {
    "deleted": 156,
    "by_status": { "approved": 120, "rejected": 36 },
    "oldest_date": "2024-08-10T09:15:00Z",
    "success": true
  },
  "execution": {
    "started_at": "2024-11-08T03:00:00Z",
    "completed_at": "2024-11-08T03:00:02.345Z",
    "duration_ms": 2345
  }
}
```

---

## 🚀 Edge Function

### **run-cleanup-jobs**

- **URL**: `https://api.thrillwiki.com/functions/v1/run-cleanup-jobs`
- **Auth**: No JWT required (called by pg_cron)
- **Method**: POST
- **Purpose**: Wrapper edge function for pg_cron scheduling

**Features**:
- Calls `run_all_cleanup_jobs()` via the service role
- Structured JSON logging
- Individual task failure warnings
- CORS enabled for manual testing

**Manual Test**:
```bash
curl -X POST https://api.thrillwiki.com/functions/v1/run-cleanup-jobs \
  -H "Content-Type: application/json"
```

---

## ⏰ Scheduling with pg_cron

### ✅ Prerequisites (ALREADY MET)

1. ✅ `pg_cron` extension enabled (v1.6.4)
2. ✅ `pg_net` extension enabled (for HTTP requests)
3. ✅ Edge function deployed: `run-cleanup-jobs`

### 📋 Schedule Daily Cleanup (3 AM UTC)

**IMPORTANT**: Run this SQL directly in your [Supabase SQL Editor](https://supabase.com/dashboard/project/ydvtmnrszybqnbcqbdcy/sql/new):

```sql
-- Schedule cleanup jobs to run daily at 3 AM UTC
SELECT cron.schedule(
  'daily-pipeline-cleanup',  -- Job name
  '0 3 * * *',               -- Cron expression (3 AM daily)
  $$
  SELECT net.http_post(
    url := 'https://api.thrillwiki.com/functions/v1/run-cleanup-jobs',
    headers := '{"Content-Type": "application/json", "Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6InlkdnRtbnJzenlicW5iY3FiZGN5Iiwicm9sZSI6ImFub24iLCJpYXQiOjE3NTgzMjYzNTYsImV4cCI6MjA3MzkwMjM1Nn0.DM3oyapd_omP5ZzIlrT0H9qBsiQBxBRgw2tYuqgXKX4"}'::jsonb,
    body := '{"scheduled": true}'::jsonb
  ) as request_id;
  $$
);
```

**Alternative Schedules**:
```sql
-- Every 6 hours: '0 */6 * * *'
-- Every hour:    '0 * * * *'
-- Every Sunday:  '0 3 * * 0'
-- Twice daily:   '0 3,15 * * *'  (3 AM and 3 PM)
```

### Verify Scheduled Job
```sql
-- Check active cron jobs
SELECT * FROM cron.job WHERE jobname = 'daily-pipeline-cleanup';

-- View cron job history
SELECT * FROM cron.job_run_details
WHERE jobid = (SELECT jobid FROM cron.job WHERE jobname = 'daily-pipeline-cleanup')
ORDER BY start_time DESC
LIMIT 10;
```

### Unschedule (if needed)
```sql
SELECT cron.unschedule('daily-pipeline-cleanup');
```

---

## 📊 Monitoring & Alerts

### Check Last Cleanup Execution
```sql
-- View most recent cleanup results (check edge function logs)
-- Or query cron.job_run_details for execution status
SELECT
  start_time,
  end_time,
  status,
  return_message
FROM cron.job_run_details
WHERE jobid = (SELECT jobid FROM cron.job WHERE jobname = 'daily-pipeline-cleanup')
ORDER BY start_time DESC
LIMIT 1;
```
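If you monitor these runs programmatically, the "Cleanup Success Rate" metric tracked later in this document reduces to a one-line calculation over recent `cron.job_run_details` rows. A minimal TypeScript sketch, assuming a hypothetical client-side row shape:

```typescript
// Hypothetical projection of a cron.job_run_details row fetched by a client.
type JobRun = { status: "succeeded" | "failed"; start_time: string };

// Fraction of recent runs that succeeded; compare against the >99% target.
function successRate(runs: JobRun[]): number {
  if (runs.length === 0) return 1; // no runs yet: treat as healthy
  const ok = runs.filter((r) => r.status === "succeeded").length;
  return ok / runs.length;
}
```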
### Database Size Monitoring
```sql
-- Check table sizes to verify cleanup is working
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
  AND tablename IN (
    'submission_idempotency_keys',
    'submission_item_temp_refs',
    'content_submissions'
  )
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
```

---

## 🧪 Manual Testing

### Test Individual Functions
```sql
-- Test each cleanup function independently
SELECT cleanup_expired_idempotency_keys();
SELECT * FROM cleanup_stale_temp_refs(30);
SELECT * FROM cleanup_abandoned_locks();
SELECT * FROM cleanup_old_submissions(90);
```

### Test Master Function
```sql
-- Run all cleanup jobs manually
SELECT * FROM run_all_cleanup_jobs();
```

### Test Edge Function
```bash
# Manual HTTP test
curl -X POST https://api.thrillwiki.com/functions/v1/run-cleanup-jobs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ANON_KEY"
```

---

## 📈 Expected Cleanup Rates

Based on typical usage patterns:

| Task | Frequency | Expected Volume |
|------|-----------|-----------------|
| Idempotency Keys | Daily | 50-200 keys/day |
| Temp Refs | Daily | 10-50 refs/day |
| Abandoned Locks | Daily | 0-10 locks/day |
| Old Submissions | Daily | 50-200 submissions/day (after 90 days) |

---

## 🔒 Security

- All cleanup functions use `SECURITY DEFINER` with `SET search_path = public`
- RLS policies verified for all affected tables
- Edge function uses the service role key (not exposed to the client)
- No user data exposure in logs (only counts and IDs)

---

## 🚨 Troubleshooting

### Cleanup Job Fails Silently

**Check**:
1. pg_cron extension enabled: `SELECT * FROM pg_available_extensions WHERE name = 'pg_cron' AND installed_version IS NOT NULL;`
2. pg_net extension enabled: `SELECT * FROM pg_available_extensions WHERE name = 'pg_net' AND installed_version IS NOT NULL;`
3. Edge function deployed: check the Supabase Functions dashboard
4. Cron job scheduled: `SELECT * FROM cron.job WHERE jobname = 'daily-pipeline-cleanup';`

### Individual Task Failures

**Solution**: Check the edge function logs for specific error messages
- Navigate to: https://supabase.com/dashboard/project/ydvtmnrszybqnbcqbdcy/functions/run-cleanup-jobs/logs

### High Database Size After Cleanup

**Check**:
- Vacuum the table: `VACUUM FULL content_submissions;` (requires downtime)
- Confirm the retention periods are appropriate
- Verify that CASCADE DELETE constraints are working

---

## ✅ Success Metrics

After implementing Phase 2, monitor these metrics:

1. **Database Size Reduction**: 10-30% decrease in `content_submissions` table size after 90 days
2. **Lock Availability**: <1% of locks abandoned/stuck
3. **Idempotency Key Volume**: Stable count (not growing unbounded)
4. **Cleanup Success Rate**: >99% of scheduled jobs complete successfully

---

## 🎯 Next Steps

With Phase 2 complete, the Sacred Pipeline now has:
- ✅ Pre-approval validation (Phase 1)
- ✅ Enhanced error logging (Phase 1)
- ✅ CHECK constraints (Phase 1)
- ✅ Automated cleanup jobs (Phase 2)

**Recommended Next Phase**: Phase 3 - Enhanced Error Handling
- Transaction status polling endpoint
- Expanded error sanitizer patterns
- Rate limiting for submission creation
- Form state persistence

---

## 📁 Related Files

### Database Functions
- `supabase/migrations/[timestamp]_phase2_cleanup_jobs.sql`

### Edge Functions
- `supabase/functions/run-cleanup-jobs/index.ts`

### Configuration
- `supabase/config.toml` (function config)

---

## 🫀 The Sacred Pipeline Pumps Stronger

With automated maintenance, the pipeline is now self-cleaning and optimized for long-term operation. Database bloat is prevented, locks are released automatically, and old data is purged on schedule.

**STATUS**: Phase 2 BULLETPROOF ✅
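As a closing illustration, the per-task error isolation that `run_all_cleanup_jobs()` relies on — one failing task must not stop the others — can be sketched in TypeScript. This is a minimal sketch of the pattern only; task names and the result shape are illustrative, not the real SQL implementation.

```typescript
// Each task runs inside its own try/catch, so a failure is recorded
// in the results object and the remaining tasks still execute.
type TaskResult = { success: boolean; deleted?: number; error?: string };

function runAll(tasks: Record<string, () => number>): Record<string, TaskResult> {
  const results: Record<string, TaskResult> = {};
  for (const [name, task] of Object.entries(tasks)) {
    try {
      results[name] = { success: true, deleted: task() };
    } catch (err) {
      // Capture the failure and keep going, mirroring the SQL EXCEPTION blocks.
      results[name] = { success: false, error: String(err) };
    }
  }
  return results;
}
```

The same structure explains the per-task `success` flags in the JSON returned by the master function: a failed task yields `{ success: false, error: ... }` while its siblings still report their deleted counts.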