- Added migration to convert unique_together constraints to UniqueConstraint for RideModel. - Introduced RideFormMixin for handling entity suggestions in ride forms. - Created comprehensive code standards documentation outlining formatting, docstring requirements, complexity guidelines, and testing requirements. - Established error handling guidelines with a structured exception hierarchy and best practices for API and view error handling. - Documented view pattern guidelines, emphasizing the use of CBVs, FBVs, and ViewSets with examples. - Implemented a benchmarking script for query performance analysis and optimization. - Developed security documentation detailing measures, configurations, and a security checklist. - Compiled a database optimization guide covering indexing strategies, query optimization patterns, and computed fields.
7.7 KiB
Database Optimization Guide
This document describes the database optimization strategies implemented in ThrillWiki.
Overview
The application uses several optimization techniques to ensure fast query performance:
- Indexing Strategy - Strategic use of B-tree, GIN, and composite indexes
- Query Optimization - Proper use of
select_relatedandprefetch_related - Computed Fields - Pre-computed values for common aggregations
- Manager Methods - Optimized query patterns encapsulated in managers
Indexing Strategy
B-tree Indexes (Standard)
Standard B-tree indexes are used for fields that are frequently filtered or sorted:
| Model | Field | Index Type | Purpose |
|---|---|---|---|
| User | is_banned |
B-tree | Fast filtering of banned users |
| User | role |
B-tree | Fast filtering by user role |
| User | (is_banned, role) |
Composite | Common query pattern |
| Park | status |
B-tree | Filter by park status |
| Park | search_text |
GIN trigram | Full-text search |
| Ride | status |
B-tree | Filter by ride status |
| Ride | search_text |
GIN trigram | Full-text search |
GIN Indexes
GIN (Generalized Inverted Index) indexes are used for array fields and full-text search:
| Model | Field | Purpose |
|---|---|---|
| Company | roles |
Fast array containment queries (roles__contains=["MANUFACTURER"]) |
| Park | search_text |
Full-text search with trigram similarity |
| Ride | search_text |
Full-text search with trigram similarity |
Creating GIN Indexes
-- Array containment index
CREATE INDEX IF NOT EXISTS parks_company_roles_gin_idx
ON parks_company USING gin(roles);
-- Full-text search index (if using tsvector)
CREATE INDEX IF NOT EXISTS parks_park_search_idx
ON parks_park USING gin(search_text gin_trgm_ops);
Query Optimization Patterns
Manager Methods
The application uses custom managers with optimized query methods:
Park Queries
# List view - includes prefetched relations and stats
parks = Park.objects.optimized_for_list()
# Detail view - deep prefetching for all related data
park = Park.objects.optimized_for_detail().get(slug='magic-kingdom')
# Map display - minimal fields for markers
parks = Park.objects.for_map_display()
# Autocomplete - limited fields, fast lookup
results = Park.objects.get_queryset().search_autocomplete(query='disney', limit=10)
Ride Queries
# List view with related objects
rides = Ride.objects.optimized_for_list()
# Detail view with stats
ride = Ride.objects.optimized_for_detail().get(slug='space-mountain')
# With coaster statistics
rides = Ride.objects.with_coaster_stats().filter(category='RC')
Company Queries
# Manufacturers with ride counts
manufacturers = Company.objects.manufacturers_with_ride_count()
# Designers with ride counts
designers = Company.objects.designers_with_ride_count()
# Operators with park counts
operators = Company.objects.operators_with_park_count()
Avoiding N+1 Queries
Always use the optimized manager methods instead of raw queries:
# BAD - causes N+1 queries
for park in Park.objects.all():
print(park.operator.name) # Each access hits DB
# GOOD - single query with prefetch
for park in Park.objects.optimized_for_list():
print(park.operator.name) # Already loaded
Using only() for Minimal Data
When you only need specific fields, use only():
# Only fetch necessary fields
companies = Company.objects.filter(roles__contains=["MANUFACTURER"]).only(
'id', 'name', 'slug', 'roles'
)
Computed Fields
Park Computed Fields
| Field | Description | Updated When |
|---|---|---|
ride_count |
Number of operating rides | Ride created/deleted/status changed |
coaster_count |
Number of operating coasters | Ride created/deleted/status changed |
opening_year |
Year extracted from opening_date | Park saved with opening_date |
search_text |
Combined searchable text | Park/Location/Company name changes |
Ride Computed Fields
| Field | Description | Updated When |
|---|---|---|
opening_year |
Year extracted from opening_date | Ride saved with opening_date |
search_text |
Combined searchable text | Ride/Park/Company/RideModel changes |
Signal Handlers
Signals automatically update computed fields:
# When a park location changes, update search_text
@receiver(post_save, sender='parks.ParkLocation')
def update_park_search_text_on_location_change(sender, instance, **kwargs):
if hasattr(instance, 'park') and instance.park:
instance.park._populate_computed_fields()
instance.park.save(update_fields=['search_text'])
CheckConstraints
Database-level constraints ensure data integrity:
User Constraints
# Banned users must have a ban_date
models.CheckConstraint(
name='user_ban_consistency',
check=models.Q(is_banned=False) | models.Q(ban_date__isnull=False),
)
RideModel Constraints
# Unique name per manufacturer
models.UniqueConstraint(
fields=['manufacturer', 'name'],
name='ridemodel_manufacturer_name_unique',
)
# Installation year range must be valid
models.CheckConstraint(
name="ride_model_installation_years_logical",
condition=models.Q(first_installation_year__isnull=True) |
models.Q(last_installation_year__isnull=True) |
models.Q(first_installation_year__lte=models.F("last_installation_year")),
)
Performance Benchmarking
Use the benchmark script to measure query performance:
# Run benchmarks
python manage.py shell < scripts/benchmark_queries.py
Key metrics to monitor:
- Average query time (< 100ms for list views, < 50ms for detail views)
- Number of queries per operation (avoid N+1 patterns)
- Index usage (check query plans)
Migration Best Practices
Adding Indexes
# Use RunSQL for GIN indexes (not natively supported by Django)
migrations.RunSQL(
sql="CREATE INDEX IF NOT EXISTS ... USING gin(...)",
reverse_sql="DROP INDEX IF EXISTS ..."
)
Adding Constraints
# Use AddConstraint for proper dependency handling
migrations.AddConstraint(
model_name='user',
constraint=models.CheckConstraint(...)
)
Rollback Procedures
Each migration should be reversible:
# Rollback specific migration
python manage.py migrate accounts 0012
# Check migration plan before applying
python manage.py migrate --plan
Monitoring
Query Analysis
Enable query logging in development:
LOGGING = {
'handlers': {
'console': {'class': 'logging.StreamHandler'},
},
'loggers': {
'django.db.backends': {
'level': 'DEBUG',
'handlers': ['console'],
}
}
}
Index Usage
Check if indexes are being used:
EXPLAIN ANALYZE SELECT * FROM parks_park WHERE status = 'OPERATING';
Quick Reference
Common Query Patterns
| Operation | Method |
|---|---|
| Park list page | Park.objects.optimized_for_list() |
| Park detail page | Park.objects.optimized_for_detail() |
| Map markers | Park.objects.for_map_display() |
| Search autocomplete | Park.objects.get_queryset().search_autocomplete() |
| Ride list page | Ride.objects.optimized_for_list() |
| Ride detail page | Ride.objects.optimized_for_detail() |
| Manufacturer list | Company.objects.manufacturers_with_ride_count() |
| Operator list | Company.objects.operators_with_park_count() |
Index Commands
-- List all indexes for a table
\di+ parks_park*
-- Check index usage statistics
SELECT * FROM pg_stat_user_indexes WHERE relname = 'parks_park';
-- Rebuild an index
REINDEX INDEX parks_company_roles_gin_idx;