feat: Refactor rides app with unique constraints, mixins, and enhanced documentation

- Added migration to convert unique_together constraints to UniqueConstraint for RideModel.
- Introduced RideFormMixin for handling entity suggestions in ride forms.
- Created comprehensive code standards documentation outlining formatting, docstring requirements, complexity guidelines, and testing requirements.
- Established error handling guidelines with a structured exception hierarchy and best practices for API and view error handling.
- Documented view pattern guidelines, emphasizing the use of CBVs, FBVs, and ViewSets with examples.
- Implemented a benchmarking script for query performance analysis and optimization.
- Developed security documentation detailing measures, configurations, and a security checklist.
- Compiled a database optimization guide covering indexing strategies, query optimization patterns, and computed fields.
pacnpal
2025-12-22 11:17:31 -05:00
parent 45d97b6e68
commit 2e35f8c5d9
71 changed files with 8036 additions and 1462 deletions

193
docs/SECURITY.md Normal file

@@ -0,0 +1,193 @@
# ThrillWiki Security Documentation
This document describes the security measures implemented in ThrillWiki and provides guidelines for maintaining security.
## Security Architecture Overview
ThrillWiki implements defense-in-depth security with multiple layers of protection:
1. **Network Layer**: HTTPS, HSTS, security headers
2. **Application Layer**: Input validation, output encoding, CSRF protection
3. **Authentication Layer**: JWT tokens, rate limiting, session management
4. **Data Layer**: SQL injection prevention, data sanitization
## Security Features
### HTTP Security Headers
The following security headers are configured:
| Header | Value | Purpose |
|--------|-------|---------|
| X-Frame-Options | DENY | Prevents clickjacking attacks |
| X-Content-Type-Options | nosniff | Prevents MIME sniffing |
| Referrer-Policy | strict-origin-when-cross-origin | Controls referrer information |
| Content-Security-Policy | Configured | Controls resource loading |
| Permissions-Policy | Configured | Restricts browser features |
### XSS Prevention
- Template variables are HTML-escaped by Django's template engine by default
- Custom `|sanitize` filter for user-generated HTML content
- SVG sanitization for icon rendering
- JavaScript data is serialized using Django's `json_script` tag
### CSRF Protection
- Django's CSRF middleware is enabled
- CSRF tokens required for all state-changing requests
- HTMX requests automatically include CSRF tokens via `htmx:configRequest` event
- SameSite cookie attribute set to prevent CSRF attacks
### SQL Injection Prevention
- All database queries use Django ORM
- No raw SQL with user input
- `.extra()` calls replaced with Django ORM functions
- Management commands use parameterized queries
### File Upload Security
- File type validation (extension, MIME type, magic number)
- File size limits (10MB max)
- Image integrity validation using PIL
- Rate limiting (10 uploads per minute)
- Secure filename generation
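
The size and magic-number checks listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's `validate_image_upload()`; the names `sniff_image_type` and `check_upload` are hypothetical, and only three image formats are covered.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10MB limit from the list above

# Leading "magic number" bytes for a few common image formats.
_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}

# File extensions accepted for each detected type.
_EXTENSIONS = {"png": {"png"}, "jpeg": {"jpg", "jpeg"}, "gif": {"gif"}}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return the detected image type, or None if no signature matches."""
    for name, signatures in _SIGNATURES.items():
        if data.startswith(signatures):
            return name
    return None


def check_upload(filename: str, data: bytes) -> None:
    """Hypothetical helper: raise ValueError if the upload fails a check."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 10MB limit")
    detected = sniff_image_type(data)
    if detected is None:
        raise ValueError("content is not a recognized image")
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension not in _EXTENSIONS[detected]:
        raise ValueError("extension does not match file content")
```

Checking the content signature as well as the extension catches files that were renamed to look like images. A full validator would also decode the image, which is what the PIL integrity check above is for.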
### Authentication & Authorization
- JWT-based authentication with short-lived access tokens (15 minutes)
- Refresh token rotation with blacklisting
- Rate limiting on authentication endpoints:
  - Login: 5 per minute, 30 per hour
  - Signup: 3 per minute, 10 per hour
  - Password reset: 2 per minute, 5 per hour
- Permission-based access control
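
To illustrate how a pair of limits such as "5 per minute, 30 per hour" can be enforced together, here is a small in-memory sliding-window sketch. It is an assumption-laden example, not ThrillWiki's actual throttling code: the class name `SlidingWindowLimiter` is hypothetical, and a real deployment would keep the counters in a shared store rather than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowLimiter:
    """Allow a request only if every (limit, window_seconds) rule has room."""

    def __init__(
        self,
        rules: Tuple[Tuple[int, int], ...],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rules = rules
        self.clock = clock
        self.longest = max(window for _, window in rules)
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self.hits[key]
        # Drop timestamps older than the longest window; no rule needs them.
        while hits and now - hits[0] >= self.longest:
            hits.popleft()
        for limit, window in self.rules:
            recent = sum(1 for stamp in hits if now - stamp < window)
            if recent >= limit:
                return False  # rejected requests are not recorded
        hits.append(now)
        return True


# Login limits from the list above: 5 per minute, 30 per hour.
login_limiter = SlidingWindowLimiter(rules=((5, 60), (30, 3600)))
```

A caller would key the limiter by client IP or account name, for example `login_limiter.allow(client_ip)`, and reject the request when it returns `False`.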
### Session Security
- Redis-backed session storage
- 1-hour session timeout with sliding expiry
- HTTPOnly cookies prevent JavaScript access
- Secure cookies in production (HTTPS only)
- SameSite attribute set to prevent CSRF
### Sensitive Data Protection
- Passwords hashed with Django's PBKDF2
- Sensitive fields masked in logs
- Email addresses partially masked in logs
- Error messages don't expose internal details in production
- DEBUG mode disabled in production
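
Partial masking of email addresses in logs could look roughly like the sketch below. The helper `mask_email` and the filter `MaskEmailFilter` are hypothetical names for illustration; the masking format (first character plus domain) is an assumption, not the project's documented format.

```python
import logging
import re

# First character of the local part, the rest of the local part, then the domain.
_EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+)")


def mask_email(text: str) -> str:
    """Keep the first character and the domain of each email address in text."""
    return _EMAIL_RE.sub(r"\1***@\2", text)


class MaskEmailFilter(logging.Filter):
    """Logging filter that masks email addresses in the formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask_email(record.getMessage())
        record.args = ()  # the message has already been formatted
        return True
```

Attaching the filter to a handler with `handler.addFilter(MaskEmailFilter())` applies the masking to every record that handler emits, so individual call sites do not need to remember it.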
## Security Configuration
### Environment Variables
The following environment variables should be set for production:
```bash
DEBUG=False
SECRET_KEY=<strong-random-key>
ALLOWED_HOSTS=yourdomain.com,www.yourdomain.com
CSRF_TRUSTED_ORIGINS=https://yourdomain.com
DATABASE_URL=<secure-database-url>
```
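
One way to produce a value for `SECRET_KEY` is Python's `secrets` module. This is a sketch of one option, not a project requirement; the function name `generate_secret_key` is hypothetical.

```python
import secrets


def generate_secret_key(nbytes: int = 50) -> str:
    """Return a URL-safe random string built from `nbytes` random bytes."""
    return secrets.token_urlsafe(nbytes)
```

Run it once, store the result in the environment or a secrets manager, and never commit it to version control.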
### Production Checklist
Before deploying to production:
- [ ] DEBUG is False
- [ ] SECRET_KEY is a strong, random value
- [ ] ALLOWED_HOSTS is configured
- [ ] HTTPS is enabled (SECURE_SSL_REDIRECT=True)
- [ ] HSTS is enabled (SECURE_HSTS_SECONDS >= 31536000)
- [ ] Secure cookies enabled (SESSION_COOKIE_SECURE=True)
- [ ] Database uses SSL connection
- [ ] Error emails configured (ADMINS setting)
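
Most of the checklist above can be verified mechanically. The sketch below checks a plain dictionary of settings; the function name `production_problems` is hypothetical, and it does not cover the database SSL item, which depends on the database configuration.

```python
from typing import Any, Dict, List

ONE_YEAR_SECONDS = 31536000


def production_problems(settings: Dict[str, Any]) -> List[str]:
    """Return the checklist items that the given settings fail."""
    problems = []
    if settings.get("DEBUG", True):
        problems.append("DEBUG is not False")
    if not settings.get("SECRET_KEY"):
        problems.append("SECRET_KEY is missing")
    if not settings.get("ALLOWED_HOSTS"):
        problems.append("ALLOWED_HOSTS is empty")
    if not settings.get("SECURE_SSL_REDIRECT", False):
        problems.append("SECURE_SSL_REDIRECT is not True")
    if settings.get("SECURE_HSTS_SECONDS", 0) < ONE_YEAR_SECONDS:
        problems.append("SECURE_HSTS_SECONDS is below one year")
    if not settings.get("SESSION_COOKIE_SECURE", False):
        problems.append("SESSION_COOKIE_SECURE is not True")
    if not settings.get("ADMINS"):
        problems.append("ADMINS is empty")
    return problems
```

Django's built-in `python manage.py check --deploy` reports on several of the same settings and is worth running alongside any custom check.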
## Security Audit
Run the security audit command:
```bash
python manage.py security_audit --verbose
```
This checks:
- Django security settings
- Configuration analysis
- Middleware configuration
## Vulnerability Reporting
To report a security vulnerability:
1. **Do not** open a public issue
2. Email security concerns to: [security contact]
3. Include:
   - Description of the vulnerability
   - Steps to reproduce
   - Potential impact
   - Any suggested fixes
## Security Updates
- Keep Django and dependencies updated
- Monitor security advisories
- Review OWASP Top 10 periodically
- Run security scans (OWASP ZAP, etc.)
## Code Security Guidelines
### Input Validation
```python
# Sanitize user-supplied HTML before storing or rendering it
from apps.core.utils.html_sanitizer import sanitize_html
user_input = request.data.get('content', '')
safe_content = sanitize_html(user_input)
```
### Template Security
```html
<!-- Use sanitize filter for user content -->
{% load safe_html %}
{{ user_content|sanitize }}
<!-- Use json_script for JavaScript data -->
{{ data|json_script:"data-id" }}
```
### File Uploads
```python
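# Assumption: FileValidationError is defined alongside validate_image_upload
from apps.core.utils.file_scanner import FileValidationError, validate_image_upload

# Inside a view or API handler:
try:
    validate_image_upload(uploaded_file)
except FileValidationError as e:
    return error_response(str(e))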
```
### Logging
```python
# Don't log sensitive data
logger.info(f"User {user.id} logged in") # OK
logger.info(f"Password: {password}") # BAD
```
## Dependencies
Security-relevant dependencies:
- `bleach`: HTML sanitization
- `Pillow`: Image validation
- `djangorestframework-simplejwt`: JWT authentication
- `django-cors-headers`: CORS configuration
Keep these updated to patch security vulnerabilities.

155
docs/SECURITY_CHECKLIST.md Normal file

@@ -0,0 +1,155 @@
# ThrillWiki Security Checklist
Use this checklist for code reviews and pre-deployment verification.
## Pre-Deployment Checklist
### Django Settings
- [ ] `DEBUG = False`
- [ ] `SECRET_KEY` is unique and strong (50+ characters)
- [ ] `ALLOWED_HOSTS` is configured (no wildcards)
- [ ] `CSRF_TRUSTED_ORIGINS` is configured
- [ ] `SECURE_SSL_REDIRECT = True`
- [ ] `SECURE_HSTS_SECONDS >= 31536000` (1 year)
- [ ] `SECURE_HSTS_INCLUDE_SUBDOMAINS = True`
- [ ] `SECURE_HSTS_PRELOAD = True`
### Cookie Security
- [ ] `SESSION_COOKIE_SECURE = True`
- [ ] `SESSION_COOKIE_HTTPONLY = True`
- [ ] `SESSION_COOKIE_SAMESITE = 'Strict'`
- [ ] `CSRF_COOKIE_SECURE = True`
- [ ] `CSRF_COOKIE_SAMESITE = 'Strict'`
### Database
- [ ] Database password is strong
- [ ] Database connection uses SSL
- [ ] Database user has minimal required permissions
- [ ] No raw SQL with user input
### Environment
- [ ] Environment variables are used for secrets
- [ ] No secrets in version control
- [ ] `.env` file is in `.gitignore`
- [ ] Production logs don't contain sensitive data
## Code Review Checklist
### Input Validation
- [ ] All user input is validated
- [ ] File uploads use `validate_image_upload()`
- [ ] User-generated HTML uses `|sanitize` filter
- [ ] URLs are validated with `sanitize_url()`
- [ ] Form data uses Django forms/serializers
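
As an illustration of what URL validation typically involves, the sketch below accepts only absolute `http` and `https` URLs. It is not the project's `sanitize_url()`; the name `safe_url_or_none` and the scheme allowlist are assumptions made for the example.

```python
from typing import Optional
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def safe_url_or_none(url: str) -> Optional[str]:
    """Return the stripped URL if it has an allowed scheme and a host, else None."""
    candidate = url.strip()
    try:
        parts = urlsplit(candidate)
    except ValueError:
        return None  # malformed URL, for example a broken IPv6 literal
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.netloc:
        return None
    return candidate
```

An allowlist of schemes rejects `javascript:` and `data:` URLs without having to enumerate every dangerous scheme.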
### Output Encoding
- [ ] No `|safe` filter on user-controlled content
- [ ] JSON data uses `json_script` tag
- [ ] JavaScript strings use `escapejs` filter
- [ ] SVG icons use `|sanitize_svg` filter
### Authentication
- [ ] Sensitive views require `@login_required`
- [ ] API views have appropriate `permission_classes`
- [ ] Password changes invalidate sessions
- [ ] Rate limiting on auth endpoints
### Authorization
- [ ] Object-level permissions checked
- [ ] Users can only access their own data
- [ ] Admin actions require proper permissions
- [ ] No privilege escalation paths
### Data Protection
- [ ] Sensitive data not logged
- [ ] PII masked in logs
- [ ] Error messages don't expose internals
- [ ] Secure deletion of sensitive data
### CSRF
- [ ] All forms include `{% csrf_token %}`
- [ ] AJAX requests include CSRF header
- [ ] CSRF exemptions are documented and justified
### SQL Injection
- [ ] No raw SQL with user input
- [ ] No `.extra()` with user input
- [ ] Parameterized queries for raw SQL
- [ ] Django ORM used for queries
## Incident Response
### If a Vulnerability is Found
1. [ ] Document the vulnerability
2. [ ] Assess impact and affected users
3. [ ] Develop and test a fix
4. [ ] Deploy fix to production
5. [ ] Notify affected users if needed
6. [ ] Post-mortem analysis
### If a Breach is Suspected
1. [ ] Isolate affected systems
2. [ ] Preserve logs and evidence
3. [ ] Notify relevant parties
4. [ ] Investigate scope
5. [ ] Remediate and restore
6. [ ] Document lessons learned
## Regular Security Tasks
### Weekly
- [ ] Review error logs for anomalies
- [ ] Check rate limiting effectiveness
- [ ] Monitor failed login attempts
### Monthly
- [ ] Run `python manage.py security_audit`
- [ ] Review and update dependencies
- [ ] Check for security advisories
### Quarterly
- [ ] Full security review
- [ ] Penetration testing
- [ ] Update security documentation
- [ ] Review and rotate secrets
## Security Tools
### Recommended Tools
- **OWASP ZAP**: Web application scanner
- **bandit**: Python security linter
- **safety**: Python dependency checker
- **pip-audit**: Vulnerability scanner for Python packages
### Running Security Scans
```bash
# Run Django security check
python manage.py check --tag=security
# Run security audit
python manage.py security_audit --verbose
# Check for vulnerable dependencies
pip-audit
# Run Python security linter
bandit -r backend/
```


@@ -0,0 +1,290 @@
# Database Optimization Guide
This document describes the database optimization strategies implemented in ThrillWiki.
## Overview
The application uses several optimization techniques to ensure fast query performance:
1. **Indexing Strategy** - Strategic use of B-tree, GIN, and composite indexes
2. **Query Optimization** - Proper use of `select_related` and `prefetch_related`
3. **Computed Fields** - Pre-computed values for common aggregations
4. **Manager Methods** - Optimized query patterns encapsulated in managers
## Indexing Strategy
### B-tree Indexes (Standard)
Standard B-tree indexes are used for fields that are frequently filtered or sorted:
| Model | Field | Index Type | Purpose |
|-------|-------|------------|---------|
| User | `is_banned` | B-tree | Fast filtering of banned users |
| User | `role` | B-tree | Fast filtering by user role |
| User | `(is_banned, role)` | Composite | Common query pattern |
| Park | `status` | B-tree | Filter by park status |
| Park | `search_text` | GIN trigram | Trigram similarity search (see GIN Indexes) |
| Ride | `status` | B-tree | Filter by ride status |
| Ride | `search_text` | GIN trigram | Trigram similarity search (see GIN Indexes) |
### GIN Indexes
GIN (Generalized Inverted Index) indexes are used for array fields and trigram-based text search:
| Model | Field | Purpose |
|-------|-------|---------|
| Company | `roles` | Fast array containment queries (`roles__contains=["MANUFACTURER"]`) |
| Park | `search_text` | Text search using trigram similarity |
| Ride | `search_text` | Text search using trigram similarity |
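
To make trigram similarity concrete, the following pure-Python sketch approximates what PostgreSQL's `pg_trgm` computes: each word is lowercased, padded with two leading spaces and one trailing space, and split into three-character windows; similarity is the size of the shared trigram set divided by the size of the combined set. The function names are hypothetical and the sketch handles ASCII letters and digits only.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Extract padded three-character windows from each word in text."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Because a misspelling such as "disny" still shares most trigrams with "disney", this kind of matching tolerates typos, and a GIN index over the trigrams lets the database find candidates without scanning every row.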
### Creating GIN Indexes
```sql
-- Array containment index
CREATE INDEX IF NOT EXISTS parks_company_roles_gin_idx
    ON parks_company USING gin(roles);
-- Trigram search index (gin_trgm_ops requires the pg_trgm extension)
CREATE INDEX IF NOT EXISTS parks_park_search_idx
    ON parks_park USING gin(search_text gin_trgm_ops);
```
## Query Optimization Patterns
### Manager Methods
The application uses custom managers with optimized query methods:
#### Park Queries
```python
# List view - includes prefetched relations and stats
parks = Park.objects.optimized_for_list()
# Detail view - deep prefetching for all related data
park = Park.objects.optimized_for_detail().get(slug='magic-kingdom')
# Map display - minimal fields for markers
parks = Park.objects.for_map_display()
# Autocomplete - limited fields, fast lookup
results = Park.objects.get_queryset().search_autocomplete(query='disney', limit=10)
```
#### Ride Queries
```python
# List view with related objects
rides = Ride.objects.optimized_for_list()
# Detail view with stats
ride = Ride.objects.optimized_for_detail().get(slug='space-mountain')
# With coaster statistics
rides = Ride.objects.with_coaster_stats().filter(category='RC')
```
#### Company Queries
```python
# Manufacturers with ride counts
manufacturers = Company.objects.manufacturers_with_ride_count()
# Designers with ride counts
designers = Company.objects.designers_with_ride_count()
# Operators with park counts
operators = Company.objects.operators_with_park_count()
```
### Avoiding N+1 Queries
Always use the optimized manager methods instead of unoptimized querysets such as `Park.objects.all()`:
```python
# BAD - causes N+1 queries
for park in Park.objects.all():
    print(park.operator.name)  # Each access hits the DB
# GOOD - related objects are loaded up front
for park in Park.objects.optimized_for_list():
    print(park.operator.name)  # Already loaded
```
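
The effect of the N+1 pattern can be observed outside Django by counting SQL statements. The sketch below uses an in-memory SQLite database as a stand-in for the real schema; the table and column names are simplified assumptions, not ThrillWiki's actual tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE park (id INTEGER PRIMARY KEY, name TEXT,
                       operator_id INTEGER REFERENCES company(id));
    INSERT INTO company VALUES (1, 'Operator A'), (2, 'Operator B');
    INSERT INTO park VALUES (1, 'Park 1', 1), (2, 'Park 2', 2), (3, 'Park 3', 1);
    """
)

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement run

# N+1 pattern: one query for the parks, then one more per park.
for _park_id, _name, operator_id in conn.execute("SELECT * FROM park").fetchall():
    conn.execute("SELECT name FROM company WHERE id = ?", (operator_id,)).fetchone()
n_plus_one = len(statements)

statements.clear()

# Joined pattern: the operator name arrives with each park row.
rows = conn.execute(
    "SELECT park.name, company.name FROM park "
    "JOIN company ON company.id = park.operator_id"
).fetchall()
joined = len(statements)
```

With three parks, the first pattern issues four statements and the second issues one. The first count grows with the number of parks; the second does not.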
### Using only() for Minimal Data
When you only need specific fields, use `only()`:
```python
# Only fetch necessary fields
companies = Company.objects.filter(roles__contains=["MANUFACTURER"]).only(
    'id', 'name', 'slug', 'roles'
)
```
## Computed Fields
### Park Computed Fields
| Field | Description | Updated When |
|-------|-------------|--------------|
| `ride_count` | Number of operating rides | Ride created/deleted/status changed |
| `coaster_count` | Number of operating coasters | Ride created/deleted/status changed |
| `opening_year` | Year extracted from opening_date | Park saved with opening_date |
| `search_text` | Combined searchable text | Park/Location/Company name changes |
### Ride Computed Fields
| Field | Description | Updated When |
|-------|-------------|--------------|
| `opening_year` | Year extracted from opening_date | Ride saved with opening_date |
| `search_text` | Combined searchable text | Ride/Park/Company/RideModel changes |
### Signal Handlers
Signals automatically update computed fields:
```python
from django.db.models.signals import post_save
from django.dispatch import receiver

# When a park location changes, update search_text
@receiver(post_save, sender='parks.ParkLocation')
def update_park_search_text_on_location_change(sender, instance, **kwargs):
    if hasattr(instance, 'park') and instance.park:
        instance.park._populate_computed_fields()
        instance.park.save(update_fields=['search_text'])
```
## CheckConstraints
Database-level constraints ensure data integrity:
### User Constraints
```python
# Banned users must have a ban_date
# (`condition` replaces the `check` argument, which is deprecated since Django 5.1)
models.CheckConstraint(
    name='user_ban_consistency',
    condition=models.Q(is_banned=False) | models.Q(ban_date__isnull=False),
)
```
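
The same rule can be tried out at the SQL level. The sketch below uses SQLite in place of PostgreSQL and a simplified, hypothetical `account` table, purely to show how the database itself rejects an inconsistent row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        is_banned INTEGER NOT NULL DEFAULT 0,
        ban_date TEXT,
        CONSTRAINT user_ban_consistency
            CHECK (is_banned = 0 OR ban_date IS NOT NULL)
    )
    """
)

# Both of these rows satisfy the constraint.
conn.execute("INSERT INTO account (is_banned, ban_date) VALUES (0, NULL)")
conn.execute("INSERT INTO account (is_banned, ban_date) VALUES (1, '2025-01-01')")


def try_insert_banned_without_date() -> bool:
    """Return True if the database accepted the inconsistent row."""
    try:
        conn.execute("INSERT INTO account (is_banned, ban_date) VALUES (1, NULL)")
    except sqlite3.IntegrityError:
        return False
    return True
```

Because the check lives in the database, it also applies to bulk updates, raw SQL, and any other write path that bypasses model validation.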
### RideModel Constraints
```python
# Unique name per manufacturer
models.UniqueConstraint(
    fields=['manufacturer', 'name'],
    name='ridemodel_manufacturer_name_unique',
)
# Installation year range must be valid
models.CheckConstraint(
    name="ride_model_installation_years_logical",
    condition=models.Q(first_installation_year__isnull=True) |
    models.Q(last_installation_year__isnull=True) |
    models.Q(first_installation_year__lte=models.F("last_installation_year")),
)
```
## Performance Benchmarking
Use the benchmark script to measure query performance:
```bash
# Run benchmarks
python manage.py shell < scripts/benchmark_queries.py
```
Key metrics to monitor:
- Average query time (< 100ms for list views, < 50ms for detail views)
- Number of queries per operation (avoid N+1 patterns)
- Index usage (check query plans)
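
For ad hoc measurements against the timing targets above, a small helper along these lines may be enough. It is a generic sketch, not the contents of `scripts/benchmark_queries.py`; the function name `benchmark` is hypothetical.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(operation: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Run `operation` repeatedly and report timings in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "avg_ms": statistics.mean(samples),
        "max_ms": max(samples),
        "runs": float(runs),
    }
```

In a Django shell, pass a callable that forces evaluation, for example `benchmark(lambda: list(Park.objects.optimized_for_list()))`, since querysets are lazy. To count queries per operation, Django's `django.test.utils.CaptureQueriesContext` can wrap the same call.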
## Migration Best Practices
### Adding Indexes
```python
# RunSQL gives full control over GIN index creation; Django also provides
# django.contrib.postgres.indexes.GinIndex for use with migrations.AddIndex
migrations.RunSQL(
    sql="CREATE INDEX IF NOT EXISTS ... USING gin(...)",
    reverse_sql="DROP INDEX IF EXISTS ..."
)
```
### Adding Constraints
```python
# Use AddConstraint for proper dependency handling
migrations.AddConstraint(
    model_name='user',
    constraint=models.CheckConstraint(...)
)
```
### Rollback Procedures
Each migration should be reversible:
```bash
# Rollback specific migration
python manage.py migrate accounts 0012
# Check migration plan before applying
python manage.py migrate --plan
```
## Monitoring
### Query Analysis
Enable query logging in development (Django logs SQL statements only when `DEBUG = True`):
```python
LOGGING = {
    'version': 1,  # required by logging.config.dictConfig
    'disable_existing_loggers': False,
    'handlers': {
        'console': {'class': 'logging.StreamHandler'},
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        }
    }
}
```
### Index Usage
Check if indexes are being used:
```sql
EXPLAIN ANALYZE SELECT * FROM parks_park WHERE status = 'OPERATING';
```
## Quick Reference
### Common Query Patterns
| Operation | Method |
|-----------|--------|
| Park list page | `Park.objects.optimized_for_list()` |
| Park detail page | `Park.objects.optimized_for_detail()` |
| Map markers | `Park.objects.for_map_display()` |
| Search autocomplete | `Park.objects.get_queryset().search_autocomplete()` |
| Ride list page | `Ride.objects.optimized_for_list()` |
| Ride detail page | `Ride.objects.optimized_for_detail()` |
| Manufacturer list | `Company.objects.manufacturers_with_ride_count()` |
| Operator list | `Company.objects.operators_with_park_count()` |
### Index Commands
```sql
-- List all indexes for a table (psql meta-command, not SQL)
\di+ parks_park*
-- Check index usage statistics
SELECT * FROM pg_stat_user_indexes WHERE relname = 'parks_park';
-- Rebuild an index
REINDEX INDEX parks_company_roles_gin_idx;
```