Files
thrilltrack-explorer/docs/JSONB_ELIMINATION.md
2025-10-17 13:31:24 +00:00

236 lines
6.7 KiB
Markdown

# JSONB Elimination Plan
**PROJECT RULE**: NEVER STORE JSON OR JSONB IN SQL COLUMNS
*"If your data is relational, model it relationally. JSON blobs destroy queryability, performance, data integrity, and your coworkers' sanity. Just make the damn tables. NO JSON OR JSONB INSIDE DATABASE CELLS!!!"*
---
## 📊 Current JSONB Violations
### Critical Violations (Must Fix)
-`rides.coaster_stats` - JSONB column storing coaster-specific statistics
-`rides.technical_specs` - JSONB column storing technical specifications
-`ride_models.technical_specs` - JSONB column storing model specifications
-`user_top_lists.items` - JSONB array storing list items
### Migration Status
-**Phase 1**: Creating relational tables (IN PROGRESS)
-`coaster_stats` table created
-`technical_specifications` table created
-`list_items` table (pending schema verification)
-**Phase 2**: Data migration scripts (PENDING)
-**Phase 3**: Drop JSONB columns (PENDING)
-**Phase 4**: Update application code (PENDING)
---
## ✅ Acceptable JSONB Usage
These are the ONLY approved JSONB columns (configuration objects, no relational structure):
### User Preferences (Configuration)
-`user_preferences.unit_preferences` - User measurement preferences
-`user_preferences.privacy_settings` - Privacy configuration
-`user_preferences.notification_preferences` - Notification settings
### System Configuration
-`admin_settings.setting_value` - System configuration values
-`notification_channels.configuration` - Channel config objects
-`admin_audit_log.details` - Audit metadata (non-queryable)
### Legacy Support (To Be Eliminated)
- ⚠️ `content_submissions.content` - Has strict validation, but should migrate to `submission_metadata` table
- ⚠️ `rides.former_names` - Array field, should migrate to `entity_former_names` table
---
## 🎯 Refactoring Plan
### 1. Coaster Stats → Relational Table (2 hours)
**Current**: `rides.coaster_stats JSONB`
**New Structure**:
```sql
CREATE TABLE public.coaster_stats (
id UUID PRIMARY KEY,
ride_id UUID REFERENCES rides(id),
stat_type TEXT CHECK (stat_type IN (
'vertical_angle', 'airtime_seconds', 'track_material',
'train_type', 'seats_per_train', 'number_of_trains'
)),
stat_value TEXT NOT NULL,
stat_unit TEXT,
display_order INTEGER,
UNIQUE(ride_id, stat_type)
);
```
**Benefits**:
- ✅ Queryable: `SELECT * FROM coaster_stats WHERE stat_type = 'vertical_angle' AND stat_value > 90`
- ✅ Indexed: Fast lookups by ride_id or stat_type
- ✅ Type safe: No JSON parsing errors
- ✅ Referential integrity: Cascade deletes when ride is deleted
---
### 2. Technical Specs → Relational Table (2 hours)
**Current**:
- `rides.technical_specs JSONB`
- `ride_models.technical_specs JSONB`
**New Structure**:
```sql
CREATE TABLE public.technical_specifications (
id UUID PRIMARY KEY,
entity_type TEXT CHECK (entity_type IN ('ride', 'ride_model')),
entity_id UUID NOT NULL,
spec_name TEXT NOT NULL,
spec_value TEXT NOT NULL,
spec_unit TEXT,
display_order INTEGER,
UNIQUE(entity_type, entity_id, spec_name)
);
```
**Benefits**:
- ✅ Unified specs table for both rides and models
- ✅ Easy filtering: `WHERE spec_name = 'track_gauge'`
- ✅ Easy sorting by display_order
- ✅ No JSON parsing in queries
---
### 3. User Top Lists → Relational Table (1.5 hours)
**Current**: `user_top_lists.items JSONB` (array of `{id, position, notes}`)
**New Structure**:
```sql
CREATE TABLE public.list_items (
id UUID PRIMARY KEY,
list_id UUID REFERENCES user_top_lists(id),
entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'coaster')),
entity_id UUID NOT NULL,
position INTEGER NOT NULL,
notes TEXT,
UNIQUE(list_id, position),
UNIQUE(list_id, entity_id)
);
```
**Benefits**:
- ✅ Proper foreign key constraints to entities
- ✅ Easy reordering with position updates
- ✅ Can join to get entity details directly
- ✅ No array manipulation in application code
---
### 4. Former Names → Relational Table (1 hour)
**Current**: `rides.former_names TEXT[]`
**New Structure**:
```sql
CREATE TABLE public.entity_former_names (
id UUID PRIMARY KEY,
entity_type TEXT CHECK (entity_type IN ('park', 'ride', 'company')),
entity_id UUID NOT NULL,
former_name TEXT NOT NULL,
used_from DATE,
used_until DATE,
change_reason TEXT,
display_order INTEGER,
created_at TIMESTAMPTZ DEFAULT NOW()
);
```
**Benefits**:
- ✅ Track date ranges for name changes
- ✅ Add reasons for name changes
- ✅ Query by date range
- ✅ Unified table for all entities
---
## 📈 Performance Benefits
### Before (JSONB)
```sql
-- Slow, requires full table scan + JSON parsing
SELECT * FROM rides
WHERE coaster_stats->>'vertical_angle' > '90';
-- Cannot index JSON keys efficiently
-- Cannot enforce referential integrity
-- Type errors only caught at runtime
```
### After (Relational)
```sql
-- Fast, uses indexes
SELECT r.* FROM rides r
JOIN coaster_stats cs ON cs.ride_id = r.id
WHERE cs.stat_type = 'vertical_angle'
AND cs.stat_value::numeric > 90;
-- Proper indexes on ride_id and stat_type
-- Database enforces constraints
-- Type errors caught at migration time
```
---
## 🚀 Implementation Priority
1. **HIGH**: `coaster_stats` - Most frequently queried
2. **HIGH**: `technical_specs` - Used across rides and models
3. **MEDIUM**: `list_items` - User-facing feature
4. **MEDIUM**: `former_names` - Historical data tracking
5. **LOW**: `content_submissions.content` - Has validation, migrate when capacity allows
---
## ✅ Completed Migrations
-`reviews.photos``review_photos` table (migration 20251001231631)
---
## 📝 Migration Checklist (Per Table)
For each JSONB elimination:
- [ ] Create new relational table with proper schema
- [ ] Add RLS policies matching parent table
- [ ] Create indexes for performance
- [ ] Write data migration script to copy existing data
- [ ] Update all application queries to use new table
- [ ] Update all forms/components to use new structure
- [ ] Test thoroughly in staging
- [ ] Deploy migration to production
- [ ] Drop JSONB column after verification
- [ ] Update documentation
---
## 🎯 Success Metrics
When complete, the codebase will have:
**Zero JSONB columns** (except approved configuration)
**100% queryable data** using standard SQL
**Proper foreign key constraints** throughout
**Type-safe queries** with compile-time validation
**Better performance** through proper indexing
**Easier maintenance** with clear relational structure
---
## 📚 References
- [PostgreSQL Best Practices: Avoid JSONB for Relational Data](https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_jsonb_for_relational_data)
- Project Custom Knowledge: "NEVER STORE JSON OR JSONB IN SQL COLUMNS"