6.7 KiB
JSONB Elimination Plan
PROJECT RULE: NEVER STORE JSON OR JSONB IN SQL COLUMNS
"If your data is relational, model it relationally. JSON blobs destroy queryability, performance, data integrity, and your coworkers' sanity. Just make the damn tables. NO JSON OR JSONB INSIDE DATABASE CELLS!!!"
📊 Current JSONB Violations
Critical Violations (Must Fix)
- ❌
rides.coaster_stats- JSONB column storing coaster-specific statistics - ❌
rides.technical_specs- JSONB column storing technical specifications - ❌
ride_models.technical_specs- JSONB column storing model specifications - ❌
user_top_lists.items- JSONB array storing list items
Migration Status
- ⏳ Phase 1: Creating relational tables (IN PROGRESS)
- ✅
coaster_statstable created - ✅
technical_specificationstable created - ⏳
list_itemstable (pending schema verification)
- ✅
- ⏳ Phase 2: Data migration scripts (PENDING)
- ⏳ Phase 3: Drop JSONB columns (PENDING)
- ⏳ Phase 4: Update application code (PENDING)
✅ Acceptable JSONB Usage
These are the ONLY approved JSONB columns (configuration objects, no relational structure):
User Preferences (Configuration)
- ✅
user_preferences.unit_preferences- User measurement preferences - ✅
user_preferences.privacy_settings- Privacy configuration - ✅
user_preferences.notification_preferences- Notification settings
System Configuration
- ✅
admin_settings.setting_value- System configuration values - ✅
notification_channels.configuration- Channel config objects - ✅
admin_audit_log.details- Audit metadata (non-queryable)
Legacy Support (To Be Eliminated)
- ⚠️
content_submissions.content- Has strict validation, but should migrate tosubmission_metadatatable - ⚠️
rides.former_names- Array field, should migrate toentity_former_namestable
🎯 Refactoring Plan
1. Coaster Stats → Relational Table (2 hours)
Current: rides.coaster_stats JSONB
New Structure:
CREATE TABLE public.coaster_stats (
id UUID PRIMARY KEY,
ride_id UUID REFERENCES rides(id),
stat_type TEXT CHECK (stat_type IN (
'vertical_angle', 'airtime_seconds', 'track_material',
'train_type', 'seats_per_train', 'number_of_trains'
)),
stat_value TEXT NOT NULL,
stat_unit TEXT,
display_order INTEGER,
UNIQUE(ride_id, stat_type)
);
Benefits:
- ✅ Queryable:
SELECT * FROM coaster_stats WHERE stat_type = 'vertical_angle' AND stat_value > 90 - ✅ Indexed: Fast lookups by ride_id or stat_type
- ✅ Type safe: No JSON parsing errors
- ✅ Referential integrity: Cascade deletes when ride is deleted
2. Technical Specs → Relational Table (2 hours)
Current:
rides.technical_specs JSONBride_models.technical_specs JSONB
New Structure:
CREATE TABLE public.technical_specifications (
id UUID PRIMARY KEY,
entity_type TEXT CHECK (entity_type IN ('ride', 'ride_model')),
entity_id UUID NOT NULL,
spec_name TEXT NOT NULL,
spec_value TEXT NOT NULL,
spec_unit TEXT,
display_order INTEGER,
UNIQUE(entity_type, entity_id, spec_name)
);
Benefits:
- ✅ Unified specs table for both rides and models
- ✅ Easy filtering:
WHERE spec_name = 'track_gauge' - ✅ Easy sorting by display_order
- ✅ No JSON parsing in queries
3. User Top Lists → Relational Table (1.5 hours)
Current: user_top_lists.items JSONB (array of {id, position, notes})
New Structure:
CREATE TABLE public.list_items (
id UUID PRIMARY KEY,
list_id UUID REFERENCES user_top_lists(id),
entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'coaster')),
entity_id UUID NOT NULL,
position INTEGER NOT NULL,
notes TEXT,
UNIQUE(list_id, position),
UNIQUE(list_id, entity_id)
);
Benefits:
- ✅ Proper foreign key constraints to entities
- ✅ Easy reordering with position updates
- ✅ Can join to get entity details directly
- ✅ No array manipulation in application code
4. Former Names → Relational Table (1 hour)
Current: rides.former_names TEXT[]
New Structure:
CREATE TABLE public.entity_former_names (
id UUID PRIMARY KEY,
entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'company')),
entity_id UUID NOT NULL,
former_name TEXT NOT NULL,
used_from DATE,
used_until DATE,
change_reason TEXT,
display_order INTEGER,
created_at TIMESTAMPTZ DEFAULT NOW()
);
Benefits:
- ✅ Track date ranges for name changes
- ✅ Add reasons for name changes
- ✅ Query by date range
- ✅ Unified table for all entities
📈 Performance Benefits
Before (JSONB)
-- Slow, requires full table scan + JSON parsing
SELECT * FROM rides
WHERE coaster_stats->>'vertical_angle' > '90';
-- Cannot index JSON keys efficiently
-- Cannot enforce referential integrity
-- Type errors only caught at runtime
After (Relational)
-- Fast, uses indexes
SELECT r.* FROM rides r
JOIN coaster_stats cs ON cs.ride_id = r.id
WHERE cs.stat_type = 'vertical_angle'
AND cs.stat_value::numeric > 90;
-- Proper indexes on ride_id and stat_type
-- Database enforces constraints
-- Type errors caught at migration time
🚀 Implementation Priority
- HIGH:
coaster_stats- Most frequently queried - HIGH:
technical_specs- Used across rides and models - MEDIUM:
list_items- User-facing feature - MEDIUM:
former_names- Historical data tracking - LOW:
content_submissions.content- Has validation, migrate when capacity allows
✅ Completed Migrations
- ✅
reviews.photos→review_photostable (migration 20251001231631)
📝 Migration Checklist (Per Table)
For each JSONB elimination:
- Create new relational table with proper schema
- Add RLS policies matching parent table
- Create indexes for performance
- Write data migration script to copy existing data
- Update all application queries to use new table
- Update all forms/components to use new structure
- Test thoroughly in staging
- Deploy migration to production
- Drop JSONB column after verification
- Update documentation
🎯 Success Metrics
When complete, the codebase will have:
✅ Zero JSONB columns (except approved configuration)
✅ 100% queryable data using standard SQL
✅ Proper foreign key constraints throughout
✅ Type-safe queries with compile-time validation
✅ Better performance through proper indexing
✅ Easier maintenance with clear relational structure
📚 References
- PostgreSQL Best Practices: Avoid JSONB for Relational Data
- Project Custom Knowledge: "NEVER STORE JSON OR JSONB IN SQL COLUMNS"