# JSONB Elimination Plan **PROJECT RULE**: NEVER STORE JSON OR JSONB IN SQL COLUMNS *"If your data is relational, model it relationally. JSON blobs destroy queryability, performance, data integrity, and your coworkers' sanity. Just make the damn tables. NO JSON OR JSONB INSIDE DATABASE CELLS!!!"* --- ## 📊 Current JSONB Violations ### Critical Violations (Must Fix) - ❌ `rides.coaster_stats` - JSONB column storing coaster-specific statistics - ❌ `rides.technical_specs` - JSONB column storing technical specifications - ❌ `ride_models.technical_specs` - JSONB column storing model specifications - ❌ `user_top_lists.items` - JSONB array storing list items ### Migration Status - ⏳ **Phase 1**: Creating relational tables (IN PROGRESS) - ✅ `coaster_stats` table created - ✅ `technical_specifications` table created - ⏳ `list_items` table (pending schema verification) - ⏳ **Phase 2**: Data migration scripts (PENDING) - ⏳ **Phase 3**: Drop JSONB columns (PENDING) - ⏳ **Phase 4**: Update application code (PENDING) --- ## ✅ Acceptable JSONB Usage These are the ONLY approved JSONB columns (configuration objects, no relational structure): ### User Preferences (Configuration) - ✅ `user_preferences.unit_preferences` - User measurement preferences - ✅ `user_preferences.privacy_settings` - Privacy configuration - ✅ `user_preferences.notification_preferences` - Notification settings ### System Configuration - ✅ `admin_settings.setting_value` - System configuration values - ✅ `notification_channels.configuration` - Channel config objects - ✅ `admin_audit_log.details` - Audit metadata (non-queryable) ### Legacy Support (To Be Eliminated) - ⚠️ `content_submissions.content` - Has strict validation, but should migrate to `submission_metadata` table - ⚠️ `rides.former_names` - Array field, should migrate to `entity_former_names` table --- ## 🎯 Refactoring Plan ### 1. Coaster Stats → Relational Table (2 hours) **Current**: `rides.coaster_stats JSONB` **New Structure**: ```sql CREATE TABLE public.coaster_stats ( id UUID PRIMARY KEY, ride_id UUID REFERENCES rides(id), stat_type TEXT CHECK (stat_type IN ( 'vertical_angle', 'airtime_seconds', 'track_material', 'train_type', 'seats_per_train', 'number_of_trains' )), stat_value TEXT NOT NULL, stat_unit TEXT, display_order INTEGER, UNIQUE(ride_id, stat_type) ); ``` **Benefits**: - ✅ Queryable: `SELECT * FROM coaster_stats WHERE stat_type = 'vertical_angle' AND stat_value > 90` - ✅ Indexed: Fast lookups by ride_id or stat_type - ✅ Type safe: No JSON parsing errors - ✅ Referential integrity: Cascade deletes when ride is deleted --- ### 2. Technical Specs → Relational Table (2 hours) **Current**: - `rides.technical_specs JSONB` - `ride_models.technical_specs JSONB` **New Structure**: ```sql CREATE TABLE public.technical_specifications ( id UUID PRIMARY KEY, entity_type TEXT CHECK (entity_type IN ('ride', 'ride_model')), entity_id UUID NOT NULL, spec_name TEXT NOT NULL, spec_value TEXT NOT NULL, spec_unit TEXT, display_order INTEGER, UNIQUE(entity_type, entity_id, spec_name) ); ``` **Benefits**: - ✅ Unified specs table for both rides and models - ✅ Easy filtering: `WHERE spec_name = 'track_gauge'` - ✅ Easy sorting by display_order - ✅ No JSON parsing in queries --- ### 3. User Top Lists → Relational Table (1.5 hours) **Current**: `user_top_lists.items JSONB` (array of `{id, position, notes}`) **New Structure**: ```sql CREATE TABLE public.list_items ( id UUID PRIMARY KEY, list_id UUID REFERENCES user_top_lists(id), entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'coaster')), entity_id UUID NOT NULL, position INTEGER NOT NULL, notes TEXT, UNIQUE(list_id, position), UNIQUE(list_id, entity_id) ); ``` **Benefits**: - ✅ Proper foreign key constraints to entities - ✅ Easy reordering with position updates - ✅ Can join to get entity details directly - ✅ No array manipulation in application code --- ### 4. Former Names → Relational Table (1 hour) **Current**: `rides.former_names TEXT[]` **New Structure**: ```sql CREATE TABLE public.entity_former_names ( id UUID PRIMARY KEY, entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'company')), entity_id UUID NOT NULL, former_name TEXT NOT NULL, used_from DATE, used_until DATE, change_reason TEXT, display_order INTEGER, created_at TIMESTAMPTZ DEFAULT NOW() ); ``` **Benefits**: - ✅ Track date ranges for name changes - ✅ Add reasons for name changes - ✅ Query by date range - ✅ Unified table for all entities --- ## 📈 Performance Benefits ### Before (JSONB) ```sql -- Slow, requires full table scan + JSON parsing SELECT * FROM rides WHERE coaster_stats->>'vertical_angle' > '90'; -- Cannot index JSON keys efficiently -- Cannot enforce referential integrity -- Type errors only caught at runtime ``` ### After (Relational) ```sql -- Fast, uses indexes SELECT r.* FROM rides r JOIN coaster_stats cs ON cs.ride_id = r.id WHERE cs.stat_type = 'vertical_angle' AND cs.stat_value::numeric > 90; -- Proper indexes on ride_id and stat_type -- Database enforces constraints -- Type errors caught at migration time ``` --- ## 🚀 Implementation Priority 1. **HIGH**: `coaster_stats` - Most frequently queried 2. **HIGH**: `technical_specs` - Used across rides and models 3. **MEDIUM**: `list_items` - User-facing feature 4. **MEDIUM**: `former_names` - Historical data tracking 5. **LOW**: `content_submissions.content` - Has validation, migrate when capacity allows --- ## ✅ Completed Migrations - ✅ `reviews.photos` → `review_photos` table (migration 20251001231631) --- ## 📝 Migration Checklist (Per Table) For each JSONB elimination: - [ ] Create new relational table with proper schema - [ ] Add RLS policies matching parent table - [ ] Create indexes for performance - [ ] Write data migration script to copy existing data - [ ] Update all application queries to use new table - [ ] Update all forms/components to use new structure - [ ] Test thoroughly in staging - [ ] Deploy migration to production - [ ] Drop JSONB column after verification - [ ] Update documentation --- ## 🎯 Success Metrics When complete, the codebase will have: ✅ **Zero JSONB columns** (except approved configuration) ✅ **100% queryable data** using standard SQL ✅ **Proper foreign key constraints** throughout ✅ **Type-safe queries** with compile-time validation ✅ **Better performance** through proper indexing ✅ **Easier maintenance** with clear relational structure --- ## 📚 References - [PostgreSQL Best Practices: Avoid JSONB for Relational Data](https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_jsonb_for_relational_data) - Project Custom Knowledge: "NEVER STORE JSON OR JSONB IN SQL COLUMNS"