feat: Refactor rides app with unique constraints, mixins, and enhanced documentation

- Added migration to convert unique_together constraints to UniqueConstraint for RideModel.
- Introduced RideFormMixin for handling entity suggestions in ride forms.
- Created comprehensive code standards documentation outlining formatting, docstring requirements, complexity guidelines, and testing requirements.
- Established error handling guidelines with a structured exception hierarchy and best practices for API and view error handling.
- Documented view pattern guidelines, emphasizing the use of CBVs, FBVs, and ViewSets with examples.
- Implemented a benchmarking script for query performance analysis and optimization.
- Developed security documentation detailing measures, configurations, and a security checklist.
- Compiled a database optimization guide covering indexing strategies, query optimization patterns, and computed fields.
2025-12-23 01:11:09 -05:00 · 2025-12-22 11:17:31 -05:00
parent 45d97b6e68
commit 2e35f8c5d9
71 changed files with 8036 additions and 1462 deletions
--- a/docs/SECURITY.md
+++ b/docs/SECURITY.md
@@ -0,0 +1,193 @@
+# ThrillWiki Security Documentation
+
+This document describes the security measures implemented in ThrillWiki and provides guidelines for maintaining security.
+
+## Security Architecture Overview
+
+ThrillWiki implements defense-in-depth security with multiple layers of protection:
+
+1. **Network Layer**: HTTPS, HSTS, security headers
+2. **Application Layer**: Input validation, output encoding, CSRF protection
+3. **Authentication Layer**: JWT tokens, rate limiting, session management
+4. **Data Layer**: SQL injection prevention, data sanitization
+
+## Security Features
+
+### HTTP Security Headers
+
+The following security headers are configured:
+
+| Header | Value | Purpose |
+|--------|-------|---------|
+| X-Frame-Options | DENY | Prevents clickjacking attacks |
+| X-Content-Type-Options | nosniff | Prevents MIME sniffing |
+| Referrer-Policy | strict-origin-when-cross-origin | Controls referrer information |
+| Content-Security-Policy | Configured | Controls resource loading |
+| Permissions-Policy | Configured | Restricts browser features |
+
+### XSS Prevention
+
+- Template variables are HTML-escaped by Django's template engine by default (autoescaping)
+- Custom `|sanitize` filter for user-generated HTML content
+- SVG sanitization for icon rendering
+- JavaScript data is serialized using Django's `json_script` tag
+
+### CSRF Protection
+
+- Django's CSRF middleware is enabled
+- CSRF tokens required for all state-changing requests
+- HTMX requests automatically include CSRF tokens via `htmx:configRequest` event
+- SameSite cookie attribute set to prevent CSRF attacks
+
+### SQL Injection Prevention
+
+- All database queries use Django ORM
+- No raw SQL with user input
+- `.extra()` calls replaced with Django ORM functions
+- Management commands use parameterized queries
+
+### File Upload Security
+
+- File type validation (extension, MIME type, magic number)
+- File size limits (10MB max)
+- Image integrity validation using PIL
+- Rate limiting (10 uploads per minute)
+- Secure filename generation
+
+### Authentication & Authorization
+
+- JWT-based authentication with short-lived access tokens (15 minutes)
+- Refresh token rotation with blacklisting
+- Rate limiting on authentication endpoints:
+  - Login: 5 per minute, 30 per hour
+  - Signup: 3 per minute, 10 per hour
+  - Password reset: 2 per minute, 5 per hour
+- Permission-based access control
+
+### Session Security
+
+- Redis-backed session storage
+- 1-hour session timeout with sliding expiry
+- HTTPOnly cookies prevent JavaScript access
+- Secure cookies in production (HTTPS only)
+- SameSite attribute set to prevent CSRF
+
+### Sensitive Data Protection
+
+- Passwords hashed with Django's PBKDF2
+- Sensitive fields masked in logs
+- Email addresses partially masked in logs
+- Error messages don't expose internal details in production
+- DEBUG mode disabled in production
+
+## Security Configuration
+
+### Environment Variables
+
+The following environment variables should be set for production:
+
+```bash
+DEBUG=False
+SECRET_KEY=<strong-random-key>
+ALLOWED_HOSTS=yourdomain.com,www.yourdomain.com
+CSRF_TRUSTED_ORIGINS=https://yourdomain.com
+DATABASE_URL=<secure-database-url>
+```
+
+### Production Checklist
+
+Before deploying to production:
+
+- [ ] DEBUG is False
+- [ ] SECRET_KEY is a strong, random value
+- [ ] ALLOWED_HOSTS is configured
+- [ ] HTTPS is enabled (SECURE_SSL_REDIRECT=True)
+- [ ] HSTS is enabled (SECURE_HSTS_SECONDS >= 31536000)
+- [ ] Secure cookies enabled (SESSION_COOKIE_SECURE=True)
+- [ ] Database uses SSL connection
+- [ ] Error emails configured (ADMINS setting)
+
+## Security Audit
+
+Run the security audit command:
+
+```bash
+python manage.py security_audit --verbose
+```
+
+This checks:
+- Django security settings
+- Configuration analysis
+- Middleware configuration
+
+## Vulnerability Reporting
+
+To report a security vulnerability:
+
+1. **Do not** open a public issue
+2. Email security concerns to: [security contact]
+3. Include:
+   - Description of the vulnerability
+   - Steps to reproduce
+   - Potential impact
+   - Any suggested fixes
+
+## Security Updates
+
+- Keep Django and dependencies updated
+- Monitor security advisories
+- Review OWASP Top 10 periodically
+- Run security scans (OWASP ZAP, etc.)
+
+## Code Security Guidelines
+
+### Input Validation
+
+```python
+# Sanitize user-supplied HTML before storing or rendering it
+from apps.core.utils.html_sanitizer import sanitize_html
+
+user_input = request.data.get('content')
+safe_content = sanitize_html(user_input)
+```
+
+### Template Security
+
+```html
+<!-- Use sanitize filter for user content -->
+{% load safe_html %}
+{{ user_content|sanitize }}
+
+<!-- Use json_script for JavaScript data -->
+{{ data|json_script:"data-id" }}
+```
+
+### File Uploads
+
+```python
+from apps.core.utils.file_scanner import validate_image_upload
+
+try:
+    validate_image_upload(uploaded_file)
+except FileValidationError as e:
+    return error_response(str(e))
+```
+
+### Logging
+
+```python
+# Don't log sensitive data
+logger.info(f"User {user.id} logged in")  # OK
+logger.info(f"Password: {password}")  # BAD
+```
+
+## Dependencies
+
+Security-relevant dependencies:
+
+- `bleach`: HTML sanitization
+- `Pillow`: Image validation
+- `djangorestframework-simplejwt`: JWT authentication
+- `django-cors-headers`: CORS configuration
+
+Keep these updated to patch security vulnerabilities.
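
The login limits listed under "Authentication & Authorization" in docs/SECURITY.md (5 per minute, 30 per hour) combine two time windows. A minimal sketch of how such a two-window check could work, in plain Python with an in-memory store, follows. `MultiWindowRateLimiter` and its methods are hypothetical names used for illustration only; the diff does not show ThrillWiki's actual limiter.

```python
from collections import deque


class MultiWindowRateLimiter:
    """Sliding-window limiter enforcing several (limit, window_seconds) rules at once.

    Hypothetical helper for illustration; not part of the ThrillWiki codebase.
    """

    def __init__(self, rules):
        # rules: iterable of (max_requests, window_seconds), e.g. [(5, 60), (30, 3600)]
        self.rules = list(rules)
        self.max_window = max(window for _, window in self.rules)
        self.hits = {}  # key -> deque of timestamps of accepted requests

    def allow(self, key, now):
        """Return True and record the hit if `key` satisfies every rule at time `now`."""
        timestamps = self.hits.setdefault(key, deque())
        # Hits older than the longest window can no longer affect any rule.
        while timestamps and now - timestamps[0] >= self.max_window:
            timestamps.popleft()
        for limit, window in self.rules:
            recent = sum(1 for t in timestamps if now - t < window)
            if recent >= limit:
                return False  # rejected requests are not recorded
        timestamps.append(now)
        return True


# Login limits from the document: 5 per minute, 30 per hour.
login_limiter = MultiWindowRateLimiter([(5, 60), (30, 3600)])
```

Passing `now` explicitly keeps the sketch deterministic. A real deployment would more likely keep the counters in a shared store, such as the Redis instance the document already uses for sessions, so that all application processes see the same counts.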
--- a/docs/SECURITY_CHECKLIST.md
+++ b/docs/SECURITY_CHECKLIST.md
@@ -0,0 +1,155 @@
+# ThrillWiki Security Checklist
+
+Use this checklist for code reviews and pre-deployment verification.
+
+## Pre-Deployment Checklist
+
+### Django Settings
+
+- [ ] `DEBUG = False`
+- [ ] `SECRET_KEY` is unique and strong (50+ characters)
+- [ ] `ALLOWED_HOSTS` is configured (no wildcards)
+- [ ] `CSRF_TRUSTED_ORIGINS` is configured
+- [ ] `SECURE_SSL_REDIRECT = True`
+- [ ] `SECURE_HSTS_SECONDS >= 31536000` (1 year)
+- [ ] `SECURE_HSTS_INCLUDE_SUBDOMAINS = True`
+- [ ] `SECURE_HSTS_PRELOAD = True`
+
+### Cookie Security
+
+- [ ] `SESSION_COOKIE_SECURE = True`
+- [ ] `SESSION_COOKIE_HTTPONLY = True`
+- [ ] `SESSION_COOKIE_SAMESITE = 'Strict'`
+- [ ] `CSRF_COOKIE_SECURE = True`
+- [ ] `CSRF_COOKIE_SAMESITE = 'Strict'`
+
+### Database
+
+- [ ] Database password is strong
+- [ ] Database connection uses SSL
+- [ ] Database user has minimal required permissions
+- [ ] No raw SQL with user input
+
+### Environment
+
+- [ ] Environment variables are used for secrets
+- [ ] No secrets in version control
+- [ ] `.env` file is in `.gitignore`
+- [ ] Production logs don't contain sensitive data
+
+## Code Review Checklist
+
+### Input Validation
+
+- [ ] All user input is validated
+- [ ] File uploads use `validate_image_upload()`
+- [ ] User-generated HTML uses `|sanitize` filter
+- [ ] URLs are validated with `sanitize_url()`
+- [ ] Form data uses Django forms/serializers
+
+### Output Encoding
+
+- [ ] No `|safe` filter on user-controlled content
+- [ ] JSON data uses `json_script` tag
+- [ ] JavaScript strings use `escapejs` filter
+- [ ] SVG icons use `|sanitize_svg` filter
+
+### Authentication
+
+- [ ] Sensitive views require `@login_required`
+- [ ] API views have appropriate `permission_classes`
+- [ ] Password changes invalidate sessions
+- [ ] Rate limiting on auth endpoints
+
+### Authorization
+
+- [ ] Object-level permissions checked
+- [ ] Users can only access their own data
+- [ ] Admin actions require proper permissions
+- [ ] No privilege escalation paths
+
+### Data Protection
+
+- [ ] Sensitive data not logged
+- [ ] PII masked in logs
+- [ ] Error messages don't expose internals
+- [ ] Secure deletion of sensitive data
+
+### CSRF
+
+- [ ] All forms include `{% csrf_token %}`
+- [ ] AJAX requests include CSRF header
+- [ ] CSRF exemptions are documented and justified
+
+### SQL Injection
+
+- [ ] No raw SQL with user input
+- [ ] No `.extra()` with user input
+- [ ] Parameterized queries for raw SQL
+- [ ] Django ORM used for queries
+
+## Incident Response
+
+### If a Vulnerability is Found
+
+1. [ ] Document the vulnerability
+2. [ ] Assess impact and affected users
+3. [ ] Develop and test a fix
+4. [ ] Deploy fix to production
+5. [ ] Notify affected users if needed
+6. [ ] Post-mortem analysis
+
+### If a Breach is Suspected
+
+1. [ ] Isolate affected systems
+2. [ ] Preserve logs and evidence
+3. [ ] Notify relevant parties
+4. [ ] Investigate scope
+5. [ ] Remediate and restore
+6. [ ] Document lessons learned
+
+## Regular Security Tasks
+
+### Weekly
+
+- [ ] Review error logs for anomalies
+- [ ] Check rate limiting effectiveness
+- [ ] Monitor failed login attempts
+
+### Monthly
+
+- [ ] Run `python manage.py security_audit`
+- [ ] Review and update dependencies
+- [ ] Check for security advisories
+
+### Quarterly
+
+- [ ] Full security review
+- [ ] Penetration testing
+- [ ] Update security documentation
+- [ ] Review and rotate secrets
+
+## Security Tools
+
+### Recommended Tools
+
+- **OWASP ZAP**: Web application scanner
+- **bandit**: Python security linter
+- **safety**: Python dependency checker
+- **pip-audit**: Vulnerability scanner for Python packages
+
+### Running Security Scans
+
+```bash
+# Run Django security check
+python manage.py check --tag=security
+
+# Run security audit
+python manage.py security_audit --verbose
+
+# Check for vulnerable dependencies
+pip-audit
+
+# Run Python security linter
+bandit -r backend/
+```
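
The "Django Settings" and "Cookie Security" items in docs/SECURITY_CHECKLIST.md are mechanical enough to check in code. The sketch below does that against a plain dictionary of settings. `check_deployment_settings` is a hypothetical helper written for illustration; it is not the project's `security_audit` command, whose implementation the diff does not show.

```python
def check_deployment_settings(settings):
    """Return the checklist items that the given settings mapping fails.

    Hypothetical helper: `settings` is a plain dict, not django.conf.settings.
    """
    failures = []

    def require(ok, message):
        if not ok:
            failures.append(message)

    require(settings.get("DEBUG") is False, "DEBUG must be False")
    require(len(settings.get("SECRET_KEY", "")) >= 50,
            "SECRET_KEY should be 50+ characters")
    hosts = settings.get("ALLOWED_HOSTS", [])
    # Only the catch-all "*" is rejected here; leading-dot subdomain patterns are not inspected.
    require(bool(hosts) and "*" not in hosts,
            "ALLOWED_HOSTS must be set, without wildcards")
    require(settings.get("SECURE_SSL_REDIRECT") is True,
            "SECURE_SSL_REDIRECT must be True")
    require(settings.get("SECURE_HSTS_SECONDS", 0) >= 31536000,
            "SECURE_HSTS_SECONDS must be >= 31536000")
    for name in ("SESSION_COOKIE_SECURE", "SESSION_COOKIE_HTTPONLY", "CSRF_COOKIE_SECURE"):
        require(settings.get(name) is True, f"{name} must be True")
    for name in ("SESSION_COOKIE_SAMESITE", "CSRF_COOKIE_SAMESITE"):
        require(settings.get(name) == "Strict", f"{name} must be 'Strict'")
    return failures
```

An empty result means every covered item passed. Items that need judgment, such as database permissions or log contents, still need a manual review.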
--- a/docs/database-optimization.md
+++ b/docs/database-optimization.md
@@ -0,0 +1,290 @@
+# Database Optimization Guide
+
+This document describes the database optimization strategies implemented in ThrillWiki.
+
+## Overview
+
+The application uses several optimization techniques to ensure fast query performance:
+
+1. **Indexing Strategy** - Strategic use of B-tree, GIN, and composite indexes
+2. **Query Optimization** - Proper use of `select_related` and `prefetch_related`
+3. **Computed Fields** - Pre-computed values for common aggregations
+4. **Manager Methods** - Optimized query patterns encapsulated in managers
+
+## Indexing Strategy
+
+### B-tree Indexes (Standard)
+
+Standard B-tree indexes are used for fields that are frequently filtered or sorted. The `search_text` rows use GIN trigram indexes and are listed here for completeness:
+
+| Model | Field | Index Type | Purpose |
+|-------|-------|------------|---------|
+| User | `is_banned` | B-tree | Fast filtering of banned users |
+| User | `role` | B-tree | Fast filtering by user role |
+| User | `(is_banned, role)` | Composite | Common query pattern |
+| Park | `status` | B-tree | Filter by park status |
+| Park | `search_text` | GIN trigram | Full-text search |
+| Ride | `status` | B-tree | Filter by ride status |
+| Ride | `search_text` | GIN trigram | Full-text search |
+
+### GIN Indexes
+
+GIN (Generalized Inverted Index) indexes are used for array fields and full-text search:
+
+| Model | Field | Purpose |
+|-------|-------|---------|
+| Company | `roles` | Fast array containment queries (`roles__contains=["MANUFACTURER"]`) |
+| Park | `search_text` | Full-text search with trigram similarity |
+| Ride | `search_text` | Full-text search with trigram similarity |
+
+### Creating GIN Indexes
+
+```sql
+-- Array containment index
+CREATE INDEX IF NOT EXISTS parks_company_roles_gin_idx
+    ON parks_company USING gin(roles);
+
+-- Trigram search index (requires the pg_trgm extension)
+CREATE INDEX IF NOT EXISTS parks_park_search_idx
+    ON parks_park USING gin(search_text gin_trgm_ops);
+```
+
+## Query Optimization Patterns
+
+### Manager Methods
+
+The application uses custom managers with optimized query methods:
+
+#### Park Queries
+
+```python
+# List view - includes prefetched relations and stats
+parks = Park.objects.optimized_for_list()
+
+# Detail view - deep prefetching for all related data
+park = Park.objects.optimized_for_detail().get(slug='magic-kingdom')
+
+# Map display - minimal fields for markers
+parks = Park.objects.for_map_display()
+
+# Autocomplete - limited fields, fast lookup
+results = Park.objects.get_queryset().search_autocomplete(query='disney', limit=10)
+```
+
+#### Ride Queries
+
+```python
+# List view with related objects
+rides = Ride.objects.optimized_for_list()
+
+# Detail view with stats
+ride = Ride.objects.optimized_for_detail().get(slug='space-mountain')
+
+# With coaster statistics
+rides = Ride.objects.with_coaster_stats().filter(category='RC')
+```
+
+#### Company Queries
+
+```python
+# Manufacturers with ride counts
+manufacturers = Company.objects.manufacturers_with_ride_count()
+
+# Designers with ride counts
+designers = Company.objects.designers_with_ride_count()
+
+# Operators with park counts
+operators = Company.objects.operators_with_park_count()
+```
+
+### Avoiding N+1 Queries
+
+Prefer the optimized manager methods over unoptimized querysets such as `Park.objects.all()`:
+
+```python
+# BAD - causes N+1 queries
+for park in Park.objects.all():
+    print(park.operator.name)  # Each access hits DB
+
+# GOOD - related objects are loaded up front
+for park in Park.objects.optimized_for_list():
+    print(park.operator.name)  # Already loaded
+```
+
+### Using only() for Minimal Data
+
+When you only need specific fields, use `only()`:
+
+```python
+# Only fetch necessary fields
+companies = Company.objects.filter(roles__contains=["MANUFACTURER"]).only(
+    'id', 'name', 'slug', 'roles'
+)
+```
+
+## Computed Fields
+
+### Park Computed Fields
+
+| Field | Description | Updated When |
+|-------|-------------|--------------|
+| `ride_count` | Number of operating rides | Ride created/deleted/status changed |
+| `coaster_count` | Number of operating coasters | Ride created/deleted/status changed |
+| `opening_year` | Year extracted from opening_date | Park saved with opening_date |
+| `search_text` | Combined searchable text | Park/Location/Company name changes |
+
+### Ride Computed Fields
+
+| Field | Description | Updated When |
+|-------|-------------|--------------|
+| `opening_year` | Year extracted from opening_date | Ride saved with opening_date |
+| `search_text` | Combined searchable text | Ride/Park/Company/RideModel changes |
+
+### Signal Handlers
+
+Signals automatically update computed fields:
+
+```python
+# When a park location changes, update search_text
+@receiver(post_save, sender='parks.ParkLocation')
+def update_park_search_text_on_location_change(sender, instance, **kwargs):
+    if hasattr(instance, 'park') and instance.park:
+        instance.park._populate_computed_fields()
+        instance.park.save(update_fields=['search_text'])
+```
+
+## CheckConstraints
+
+Database-level constraints ensure data integrity:
+
+### User Constraints
+
+```python
+# Banned users must have a ban_date
+models.CheckConstraint(
+    name='user_ban_consistency',
+    condition=models.Q(is_banned=False) | models.Q(ban_date__isnull=False),
+)
+```
+
+### RideModel Constraints
+
+```python
+# Unique name per manufacturer
+models.UniqueConstraint(
+    fields=['manufacturer', 'name'],
+    name='ridemodel_manufacturer_name_unique',
+)
+
+# Installation year range must be valid
+models.CheckConstraint(
+    name="ride_model_installation_years_logical",
+    condition=models.Q(first_installation_year__isnull=True) |
+              models.Q(last_installation_year__isnull=True) |
+              models.Q(first_installation_year__lte=models.F("last_installation_year")),
+)
+```
+
+## Performance Benchmarking
+
+Use the benchmark script to measure query performance:
+
+```bash
+# Run benchmarks
+python manage.py shell < scripts/benchmark_queries.py
+```
+
+Key metrics to monitor:
+- Average query time (< 100ms for list views, < 50ms for detail views)
+- Number of queries per operation (avoid N+1 patterns)
+- Index usage (check query plans)
+
+## Migration Best Practices
+
+### Adding Indexes
+
+```python
+# RunSQL suits hand-written GIN indexes; django.contrib.postgres.indexes.GinIndex is the ORM-level alternative
+migrations.RunSQL(
+    sql="CREATE INDEX IF NOT EXISTS ... USING gin(...)",
+    reverse_sql="DROP INDEX IF EXISTS ..."
+)
+```
+
+### Adding Constraints
+
+```python
+# Use AddConstraint for proper dependency handling
+migrations.AddConstraint(
+    model_name='user',
+    constraint=models.CheckConstraint(...)
+)
+```
+
+### Rollback Procedures
+
+Each migration should be reversible:
+
+```bash
+# Roll back the accounts app to migration 0012 (unapplies every later migration)
+python manage.py migrate accounts 0012
+
+# Check migration plan before applying
+python manage.py migrate --plan
+```
+
+## Monitoring
+
+### Query Analysis
+
+Enable query logging in development:
+
+```python
+LOGGING = {
+    'version': 1,  # required by logging.config.dictConfig
+    'disable_existing_loggers': False,
+    'handlers': {'console': {'class': 'logging.StreamHandler'}},
+    'loggers': {
+        'django.db.backends': {
+            'level': 'DEBUG',
+            'handlers': ['console'],
+        }
+    }
+}
+```
+
+### Index Usage
+
+Check if indexes are being used:
+
+```sql
+EXPLAIN ANALYZE SELECT * FROM parks_park WHERE status = 'OPERATING';
+```
+
+## Quick Reference
+
+### Common Query Patterns
+
+| Operation | Method |
+|-----------|--------|
+| Park list page | `Park.objects.optimized_for_list()` |
+| Park detail page | `Park.objects.optimized_for_detail()` |
+| Map markers | `Park.objects.for_map_display()` |
+| Search autocomplete | `Park.objects.get_queryset().search_autocomplete()` |
+| Ride list page | `Ride.objects.optimized_for_list()` |
+| Ride detail page | `Ride.objects.optimized_for_detail()` |
+| Manufacturer list | `Company.objects.manufacturers_with_ride_count()` |
+| Operator list | `Company.objects.operators_with_park_count()` |
+
+### Index Commands
+
+```sql
+-- List all indexes for a table (psql meta-command, not SQL)
+\di+ parks_park*
+
+-- Check index usage statistics
+SELECT * FROM pg_stat_user_indexes WHERE relname = 'parks_park';
+
+-- Rebuild an index
+REINDEX INDEX parks_company_roles_gin_idx;
+```
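
The N+1 pattern described under "Avoiding N+1 Queries" in docs/database-optimization.md can be reproduced without Django. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL and counts statements through a small wrapper. The `park` and `company` tables and the `CountingConnection` helper are hypothetical, and the JOIN only approximates what a `select_related`-style manager method would issue.

```python
import sqlite3


class CountingConnection:
    """Wrap a sqlite3 connection and count every statement passed to execute()."""

    def __init__(self, conn):
        self.conn = conn
        self.query_count = 0

    def execute(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params)


def make_db():
    """Build a small in-memory database with three parks and two operators."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE park (
            id INTEGER PRIMARY KEY,
            name TEXT,
            operator_id INTEGER REFERENCES company(id)
        );
        INSERT INTO company VALUES (1, 'Operator A'), (2, 'Operator B');
        INSERT INTO park VALUES (1, 'Park 1', 1), (2, 'Park 2', 1), (3, 'Park 3', 2);
        """
    )
    return CountingConnection(conn)


def operator_names_n_plus_one(db):
    # One query for the parks, then one more per park for its operator.
    parks = db.execute("SELECT id, operator_id FROM park ORDER BY id").fetchall()
    return [
        db.execute("SELECT name FROM company WHERE id = ?", (operator_id,)).fetchone()[0]
        for _, operator_id in parks
    ]


def operator_names_joined(db):
    # A single JOIN returns each park's operator in the same round trip.
    rows = db.execute(
        "SELECT company.name FROM park "
        "JOIN company ON company.id = park.operator_id ORDER BY park.id"
    ).fetchall()
    return [name for (name,) in rows]
```

With three parks, the first function issues four statements and the second issues one. The gap grows linearly with the number of rows, which is why the guide tracks the number of queries per operation as a key metric.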