# Future Work & Deferred Features

This document tracks features that have been deferred for future implementation. Each item includes context, implementation guidance, and priority.

## Priority Levels

- **P0 (Critical)**: Blocks major functionality or has security implications
- **P1 (High)**: Significantly improves user experience or performance
- **P2 (Medium)**: Nice-to-have features that add value
- **P3 (Low)**: Optional enhancements

## Feature Tracking

### Map Service Enhancements

#### THRILLWIKI-106: Map Clustering Algorithm

**Priority**: P1 (High)
**Estimated Effort**: 3-5 days
**Dependencies**: None

**Context**:
Currently, the map API returns all locations within bounds without clustering. At low zoom levels (zoomed out), this can result in hundreds of overlapping markers, degrading performance and UX.

**Proposed Solution**:
Implement a server-side clustering algorithm using one of these approaches:

1. **Grid-based clustering** (Recommended for simplicity):
   - Divide the map into a grid based on zoom level
   - Group locations within each grid cell
   - Return cluster center and count for cells with multiple locations

2. **DBSCAN clustering** (Better quality, more complex):
   - Use scikit-learn's DBSCAN algorithm
   - Cluster based on geographic distance
   - Adjust epsilon parameter based on zoom level

**Implementation Steps**:

1. Create `backend/apps/core/services/map_clustering.py` with clustering logic
2. Add `cluster_locations()` method that accepts:
   - List of `UnifiedLocation` objects
   - Zoom level (1-20)
   - Clustering strategy ('grid' or 'dbscan')
3. Update `MapLocationsAPIView._build_response()` to call clustering service when `params["cluster"]` is True
4. Update `MapClusterSerializer` to include cluster metadata
5. Add tests in `backend/tests/services/test_map_clustering.py`

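A minimal sketch of the grid strategy behind `cluster_locations()` (step 2 above); the `UnifiedLocation` attribute names (`latitude`, `longitude`) and the cluster dictionary shape are assumptions here, not the final API:

```python
from collections import defaultdict


def cluster_locations(locations, zoom, strategy="grid"):
    """Group locations into grid cells whose size is derived from the zoom level."""
    if strategy != "grid":
        raise NotImplementedError("Only the grid strategy is sketched here")

    # Coarser cells when zoomed out: cell size halves with each zoom step.
    cell_size = 360.0 / (2 ** zoom)

    cells = defaultdict(list)
    for loc in locations:
        key = (int(loc.latitude // cell_size), int(loc.longitude // cell_size))
        cells[key].append(loc)

    clusters = []
    for key, members in cells.items():
        clusters.append({
            "count": len(members),
            # Cluster center = mean of member coordinates.
            "coordinates": [
                sum(m.latitude for m in members) / len(members),
                sum(m.longitude for m in members) / len(members),
            ],
            "representative_location": members[0],
            # `id` and `bounds` (see API Changes below) would be derived from
            # the cell key and the cell's extent.
        })
    return clusters
```
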
**API Changes**:

- Response includes `clusters` array with cluster objects
- Each cluster has: `id`, `coordinates`, `count`, `bounds`, `representative_location`

**Performance Considerations**:

- Cache clustered results separately from unclustered
- Use spatial indexes on location tables
- Limit clustering to zoom levels 1-12 (zoomed out views)

**References**:

- [Supercluster.js](https://github.com/mapbox/supercluster) - JavaScript implementation for reference
- [PostGIS ST_ClusterKMeans](https://postgis.net/docs/ST_ClusterKMeans.html) - Database-level clustering

---

#### THRILLWIKI-107: Nearby Locations

**Priority**: P2 (Medium)
**Estimated Effort**: 2-3 days
**Dependencies**: None

**Context**:
Location detail views currently don't show nearby parks or rides. This would help users discover attractions in the same area.

**Proposed Solution**:
Use PostGIS spatial queries to find locations within a radius:

```python
from django.contrib.gis.measure import D  # Distance
from django.contrib.gis.db.models.functions import Distance

# Park is assumed to be imported from the parks app's models.


def get_nearby_locations(location_obj, radius_miles=25, limit=10):
    """Get nearby locations using spatial query."""
    point = location_obj.point

    # Query parks within radius
    nearby_parks = Park.objects.filter(
        location__point__distance_lte=(point, D(mi=radius_miles))
    ).annotate(
        distance=Distance('location__point', point)
    ).exclude(
        id=location_obj.park.id  # Exclude self
    ).order_by('distance')[:limit]

    return nearby_parks
```

**Implementation Steps**:

1. Add `get_nearby_locations()` method to `backend/apps/core/services/location_service.py`
2. Update `MapLocationDetailAPIView.get()` to call this method
3. Update `MapLocationDetailSerializer.get_nearby_locations()` to return actual data
4. Add distance field to nearby location objects
5. Add tests for spatial queries

**API Response Example**:

```json
{
  "nearby_locations": [
    {
      "id": "park_123",
      "name": "Cedar Point",
      "type": "park",
      "distance_miles": 5.2,
      "coordinates": [41.4793, -82.6833]
    }
  ]
}
```

**Performance Considerations**:

- Use spatial indexes (already present on `location__point` fields)
- Cache nearby locations for 1 hour
- Limit radius to 50 miles maximum

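One possible shape for the one-hour caching noted above, using Django's cache framework; the cache-key scheme and the wrapper function name are illustrative assumptions:

```python
from django.core.cache import cache


def get_nearby_locations_cached(location_obj, radius_miles=25, limit=10):
    """Cache nearby-location lookups for one hour (illustrative key scheme)."""
    key = f"nearby:{location_obj.pk}:{radius_miles}:{limit}"
    results = cache.get(key)
    if results is None:
        results = list(get_nearby_locations(location_obj, radius_miles, limit))
        cache.set(key, results, timeout=60 * 60)  # 1 hour
    return results
```
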
---

#### THRILLWIKI-108: Search Relevance Scoring

**Priority**: P2 (Medium)
**Estimated Effort**: 2-3 days
**Dependencies**: None

**Context**:
Search results currently return a hardcoded relevance score of 1.0. Implementing proper relevance scoring would improve search result quality.

**Proposed Solution**:
Implement a weighted scoring algorithm based on:

1. **Text Match Quality** (40% weight):
   - Exact name match: 1.0
   - Name starts with query: 0.8
   - Name contains query: 0.6
   - City/state match: 0.4

2. **Popularity** (30% weight):
   - Based on `average_rating` and `ride_count`/`coaster_count`
   - Normalize to 0-1 scale

3. **Recency** (15% weight):
   - Recently opened attractions score higher
   - Normalize based on `opening_date`

4. **Status** (15% weight):
   - Operating: 1.0
   - Seasonal: 0.8
   - Closed temporarily: 0.5
   - Closed permanently: 0.2

**Implementation Steps**:

1. Create `backend/apps/core/services/search_scoring.py` with scoring logic
2. Add `calculate_relevance_score()` method
3. Update `MapSearchAPIView.get()` to calculate scores
4. Sort results by relevance score (descending)
5. Add tests for scoring algorithm

**Example Implementation**:

```python
def calculate_relevance_score(location, query):
    score = 0.0

    # Text match (40%)
    name_lower = location.name.lower()
    query_lower = query.lower()
    if name_lower == query_lower:
        score += 0.40
    elif name_lower.startswith(query_lower):
        score += 0.32
    elif query_lower in name_lower:
        score += 0.24

    # Popularity (30%)
    if location.average_rating:
        score += (location.average_rating / 5.0) * 0.30

    # Status (15%)
    status_weights = {
        'OPERATING': 1.0,
        'SEASONAL': 0.8,
        'CLOSED_TEMP': 0.5,
        'CLOSED_PERM': 0.2
    }
    score += status_weights.get(location.status, 0.5) * 0.15

    # Recency (15%) would be added here by normalizing opening_date,
    # per the weighting scheme above.

    return min(score, 1.0)
```

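For step 4, the view can simply order candidates by the computed score; a small illustrative helper (the name `rank_search_results` is not from the codebase):

```python
def rank_search_results(locations, query):
    """Return candidate locations ordered by relevance, highest score first."""
    return sorted(
        locations,
        key=lambda location: calculate_relevance_score(location, query),
        reverse=True,
    )
```
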
**Performance Considerations**:

- Calculate scores in Python (not in the database) for flexibility
- Cache search results with scores for 5 minutes
- Consider using PostgreSQL full-text search for better performance

---

#### THRILLWIKI-109: Cache Statistics Tracking

**Priority**: P2 (Medium)
**Estimated Effort**: 1-2 hours
**Dependencies**: None
**Status**: IMPLEMENTED

**Context**:
The `MapStatsAPIView` previously returned hardcoded cache statistics (0 hits, 0 misses). Implementing real cache statistics provides visibility into caching effectiveness.

**Implementation**:
Added a `get_cache_statistics()` method to `EnhancedCacheService` that retrieves Redis INFO statistics when available. The `MapStatsAPIView` now returns real cache hit/miss data.

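For reference, a minimal sketch of what such a method can look like, assuming django-redis as the cache backend; this is an illustration, not the actual `EnhancedCacheService` code:

```python
from django.core.cache import cache


def get_cache_statistics():
    """Read hit/miss counters from Redis INFO, if the backend exposes them."""
    try:
        # django-redis exposes the underlying Redis client via get_client().
        client = cache.client.get_client()
        info = client.info()
        return {
            "hits": info.get("keyspace_hits", 0),
            "misses": info.get("keyspace_misses", 0),
        }
    except AttributeError:
        # Non-Redis backends (e.g. LocMemCache) don't expose these counters.
        return {"hits": 0, "misses": 0}
```
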
---

### User Features

#### THRILLWIKI-104: Full User Statistics Tracking

**Priority**: P2 (Medium)
**Estimated Effort**: 3-4 days
**Dependencies**: THRILLWIKI-105 (Photo counting)

**Context**:
Current user statistics are calculated on-demand by querying multiple tables. This is inefficient and doesn't track all desired metrics.

**Proposed Solution**:
Implement a `UserStatistics` model with periodic updates:

```python
from django.db import models

# `User` is the project's user model (imported from the accounts app or via
# django.contrib.auth.get_user_model()).


class UserStatistics(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)

    # Content statistics
    parks_visited = models.IntegerField(default=0)
    rides_ridden = models.IntegerField(default=0)
    reviews_written = models.IntegerField(default=0)
    photos_uploaded = models.IntegerField(default=0)
    top_lists_created = models.IntegerField(default=0)

    # Engagement statistics
    helpful_votes_received = models.IntegerField(default=0)
    comments_made = models.IntegerField(default=0)
    badges_earned = models.IntegerField(default=0)

    # Activity tracking
    last_review_date = models.DateTimeField(null=True, blank=True)
    last_photo_upload_date = models.DateTimeField(null=True, blank=True)
    streak_days = models.IntegerField(default=0)

    # Timestamps
    last_calculated = models.DateTimeField(auto_now=True)

    class Meta:
        verbose_name_plural = "User statistics"
```

**Implementation Steps**:

1. Create migration for `UserStatistics` model in `backend/apps/accounts/models.py`
2. Create Celery task `update_user_statistics` in `backend/apps/accounts/tasks.py`
3. Update statistics on user actions using Django signals:
   - `post_save` signal on `ParkReview`, `RideReview` -> increment `reviews_written`
   - `post_save` signal on `ParkPhoto`, `RidePhoto` -> increment `photos_uploaded`
4. Add management command `python manage.py recalculate_user_stats` for bulk updates
5. Update `get_user_statistics` view to read from `UserStatistics` model
6. Add periodic Celery task to recalculate statistics daily

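A hedged sketch of the signal wiring in step 3; the `parks.ParkReview` app label, the review's `user` field name, and the `F()`-based update are assumptions about the existing schema:

```python
from django.db.models import F
from django.db.models.signals import post_save
from django.dispatch import receiver


@receiver(post_save, sender="parks.ParkReview")  # assumed app label / model name
def increment_reviews_written(sender, instance, created, **kwargs):
    """Bump the author's counter when a new review is saved (step 3)."""
    if not created:
        return
    # Atomic increment avoids read-modify-write races; assumes the review's
    # author field is named `user` and a UserStatistics row already exists.
    UserStatistics.objects.filter(user=instance.user).update(
        reviews_written=F("reviews_written") + 1
    )
```

The photo models would be wired the same way, incrementing `photos_uploaded`.
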
**Performance Benefits**:

- Reduces database queries from 5+ to 1
- Enables leaderboards and ranking features
- Supports gamification (badges, achievements)

**Migration Strategy**:

1. Create model and migration
2. Run `recalculate_user_stats` for existing users
3. Enable signal handlers for new activity
4. Monitor for 1 week before removing old calculation logic

---

#### THRILLWIKI-105: Photo Upload Counting

**Priority**: P2 (Medium)
**Estimated Effort**: 30 minutes
**Dependencies**: None
**Status**: IMPLEMENTED

**Context**:
The user statistics endpoint previously returned `photos_uploaded: 0` for all users. Photo uploads should be counted from the `ParkPhoto` and `RidePhoto` models.

**Implementation**:
Updated `get_user_statistics()` in `backend/apps/api/v1/accounts/views.py` to query the `ParkPhoto` and `RidePhoto` models where `uploaded_by=user`.

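Roughly, the count is just two queries, along the lines of the following (the helper name is illustrative and the model import paths are omitted):

```python
def count_photos_uploaded(user):
    """Total photo uploads across parks and rides, per the fix above."""
    # ParkPhoto and RidePhoto are the existing photo models referenced above.
    return (
        ParkPhoto.objects.filter(uploaded_by=user).count()
        + RidePhoto.objects.filter(uploaded_by=user).count()
    )
```
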
---

### Infrastructure

#### THRILLWIKI-101: Geocoding Service Integration

**Priority**: P3 (Low)
**Estimated Effort**: 2-3 days
**Dependencies**: None

**Context**:
The `CompanyHeadquarters` model has address fields but no coordinates. This prevents companies from appearing on the map.

**Proposed Solution**:
Integrate a geocoding service to convert addresses to coordinates.

**Recommended Services**:

1. **Google Maps Geocoding API** (Paid, high quality)
2. **Nominatim (OpenStreetMap)** (Free, rate-limited)
3. **Mapbox Geocoding API** (Paid, good quality)

**Implementation Steps**:

1. Create `backend/apps/core/services/geocoding_service.py`:

```python
class GeocodingService:
    def geocode_address(self, address: str) -> tuple[float, float] | None:
        """Convert address to (latitude, longitude)."""
        # Implementation using chosen service
```

2. Add geocoding to `CompanyHeadquarters` model:
   - Add `latitude` and `longitude` fields
   - Add `geocoded_at` timestamp field
   - Create migration

3. Update `CompanyLocationAdapter.to_unified_location()` to use coordinates if available

4. Add management command `python manage.py geocode_companies` for bulk geocoding

5. Add Celery task for automatic geocoding on company creation/update

**Configuration**:

Add to `backend/config/settings/base.py`:

```python
GEOCODING_SERVICE = env('GEOCODING_SERVICE', default='nominatim')
GEOCODING_API_KEY = env('GEOCODING_API_KEY', default='')
GEOCODING_RATE_LIMIT = env.int('GEOCODING_RATE_LIMIT', default=1)  # requests per second
```

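As an illustration of the Nominatim default above, a minimal implementation could call the public Nominatim search endpoint with `requests`; the class name, error handling, and User-Agent value here are a sketch, not the committed design:

```python
import requests


class NominatimGeocodingService:
    """Sketch of a Nominatim-backed geocoder (subject to the 1 req/s policy)."""

    def geocode_address(self, address: str) -> tuple[float, float] | None:
        response = requests.get(
            "https://nominatim.openstreetmap.org/search",
            params={"q": address, "format": "json", "limit": 1},
            headers={"User-Agent": "thrillwiki-geocoder"},  # required by Nominatim policy
            timeout=10,
        )
        response.raise_for_status()
        results = response.json()
        if not results:
            return None
        return float(results[0]["lat"]), float(results[0]["lon"])
```
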
**Rate Limiting**:

- Implement exponential backoff for API errors
- Cache geocoding results to avoid redundant API calls
- Use Celery for async geocoding to avoid blocking requests

**Cost Considerations**:

- Nominatim: Free but limited to 1 request/second
- Google Maps: $5 per 1000 requests (first $200/month free)
- Mapbox: $0.50 per 1000 requests (first 100k free)

**Alternative Approach**:

Store coordinates manually in the admin interface for the ~50-100 companies in the database.

---

#### THRILLWIKI-110: ClamAV Malware Scanning Integration

**Priority**: P1 (High) - Security feature
**Estimated Effort**: 2-3 days
**Dependencies**: ClamAV daemon installation

**Context**:
File uploads currently use magic number validation and PIL integrity checks, but don't scan for malware. This is a security gap for user-generated content.

**Proposed Solution**:
Integrate ClamAV antivirus scanning for all file uploads.

**Implementation Steps**:

1. **Install ClamAV**:

```bash
# Docker
docker run -d -p 3310:3310 clamav/clamav:latest

# Ubuntu/Debian
sudo apt-get install clamav clamav-daemon
sudo freshclam  # Update virus definitions
sudo systemctl start clamav-daemon
```

2. **Install Python client**:

```bash
uv add clamd
```

3. **Update `backend/apps/core/utils/file_scanner.py`**:

```python
import logging
from typing import Tuple

import clamd
from django.core.files.uploadedfile import UploadedFile

logger = logging.getLogger(__name__)


def scan_file_for_malware(file: UploadedFile) -> Tuple[bool, str]:
    """Scan file for malware using ClamAV."""
    try:
        # Connect to ClamAV daemon
        cd = clamd.ClamdUnixSocket()  # or ClamdNetworkSocket for remote

        # Scan file
        file.seek(0)
        scan_result = cd.instream(file)
        file.seek(0)

        # Check result
        if scan_result['stream'][0] == 'OK':
            return True, ""
        else:
            virus_name = scan_result['stream'][1]
            return False, f"Malware detected: {virus_name}"

    except clamd.ConnectionError:
        # ClamAV not available - log warning and allow upload
        logger.warning("ClamAV daemon not available, skipping malware scan")
        return True, ""
    except Exception as e:
        logger.error(f"Malware scan error: {e}")
        return False, "Malware scan failed"
```

4. **Configuration**:

Add to `backend/config/settings/base.py`:

```python
CLAMAV_ENABLED = env.bool('CLAMAV_ENABLED', default=False)
CLAMAV_SOCKET = env('CLAMAV_SOCKET', default='/var/run/clamav/clamd.ctl')
CLAMAV_HOST = env('CLAMAV_HOST', default='localhost')
CLAMAV_PORT = env.int('CLAMAV_PORT', default=3310)
```

5. **Update file upload views**:
   - Call `scan_file_for_malware()` in avatar upload view
   - Call in media upload views
   - Log all malware detections for security monitoring

6. **Testing**:
   - Use EICAR test file for testing: `X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*`
   - Add unit tests with mocked ClamAV responses

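A rough idea of the wiring in step 5, assuming a DRF-style upload handler; the function name, field name, and response shapes are illustrative, not the project's actual views (`scan_file_for_malware` and `logger` come from the scanner module above):

```python
from rest_framework import status
from rest_framework.response import Response


def handle_uploaded_photo(request):
    """Illustrative upload handler that rejects files flagged by ClamAV."""
    uploaded = request.FILES["photo"]

    is_clean, reason = scan_file_for_malware(uploaded)
    if not is_clean:
        # Log detections for security monitoring (step 5, third bullet).
        logger.warning("Rejected upload from %s: %s", request.user, reason)
        return Response({"detail": reason}, status=status.HTTP_400_BAD_REQUEST)

    # ... continue with normal validation and saving ...
    return Response(status=status.HTTP_201_CREATED)
```
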
**Deployment Considerations**:

- ClamAV requires ~1GB RAM for virus definitions
- Update virus definitions daily via `freshclam`
- Monitor ClamAV daemon health in production
- Consider using a cloud-based scanning service (AWS GuardDuty, VirusTotal) for serverless deployments

**Fallback Strategy**:

If ClamAV is unavailable, log a warning and allow the upload (fail open). This prevents blocking legitimate uploads if the ClamAV daemon crashes.

---

### Management Commands

#### THRILLWIKI-111: Sample Data Creation Command

**Priority**: P3 (Low) - Development utility
**Estimated Effort**: 1-2 days
**Dependencies**: None

**Context**:
The `create_sample_data` management command is incomplete. This command is useful for:

- Local development with realistic data
- Demo environments
- Testing with diverse data sets

**Proposed Solution**:
Complete the implementation with comprehensive sample data:

**Sample Data to Create**:

1. **Parks** (10-15):
   - Major theme parks (Disney, Universal, Cedar Point)
   - Regional parks
   - Water parks
   - Mix of operating/closed/seasonal statuses

2. **Rides** (50-100):
   - Roller coasters (various types)
   - Flat rides
   - Water rides
   - Dark rides
   - Mix of statuses and manufacturers

3. **Companies** (20-30):
   - Operators (Disney, Six Flags, Cedar Fair)
   - Manufacturers (Intamin, B&M, RMC)
   - Mix of active/inactive

4. **Users** (10):
   - Admin user
   - Regular users with various activity levels
   - Test user for authentication testing

5. **Reviews** (100-200):
   - Park reviews with ratings
   - Ride reviews with ratings
   - Mix of helpful/unhelpful votes

6. **Media** (50):
   - Park photos
   - Ride photos
   - Mix of approved/pending/rejected

**Implementation Steps**:

1. Create fixtures in `backend/fixtures/sample_data.json`
2. Update `create_sample_data.py` to load fixtures
3. Add `--clear` flag to delete existing data before creating
4. Add `--minimal` flag for quick setup (10 parks, 20 rides)
5. Document usage in `backend/README.md`

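A possible skeleton for the command with the two flags from steps 3-4; loading via `call_command('loaddata', ...)` and the minimal-fixture name are assumptions, not the committed design:

```python
from django.core.management import call_command
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Create sample parks, rides, companies, users, reviews, and media."

    def add_arguments(self, parser):
        parser.add_argument("--clear", action="store_true",
                            help="Delete existing sample data before creating")
        parser.add_argument("--minimal", action="store_true",
                            help="Create a small data set for quick testing")

    def handle(self, *args, **options):
        if options["clear"]:
            self.stdout.write("Clearing existing sample data...")
            # Deletion of prior sample records would go here.

        # Hypothetical fixture names; only sample_data.json is specified above.
        fixture = "sample_data_minimal" if options["minimal"] else "sample_data"
        call_command("loaddata", fixture)
        self.stdout.write(self.style.SUCCESS(f"Loaded {fixture} fixtures"))
```
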
**Usage**:

```bash
# Full sample data
python manage.py create_sample_data

# Minimal data for quick testing
python manage.py create_sample_data --minimal

# Clear existing data first
python manage.py create_sample_data --clear
```

**Alternative Approach**:

Use Django fixtures with the `loaddata` command:

```bash
python manage.py loaddata sample_parks sample_rides sample_users
```

---

## Completed Items

### THRILLWIKI-103: Admin Permission Checks

**Status**: COMPLETED (Already Implemented)

**Context**:
The `MapCacheView` `delete` and `post` methods had TODO comments for adding admin permission checks. Upon review, these checks were already implemented using `request.user.is_authenticated and request.user.is_staff`.

**Resolution**: Removed outdated TODO comments.

---

## Implementation Notes

### Creating GitHub Issues

Each item in this document can be converted to a GitHub issue using this template:

```markdown
## Description
[Copy from Context section]

## Implementation
[Copy from Implementation Steps section]

## Acceptance Criteria
- [ ] Feature implemented as specified
- [ ] Unit tests added with >80% coverage
- [ ] Integration tests pass
- [ ] Documentation updated
- [ ] Code reviewed and approved

## Priority
[Copy Priority value]

## Related
- THRILLWIKI issue number
- Related features or dependencies
```

### Priority Order for Implementation

Based on business value and effort, recommended implementation order:

1. **THRILLWIKI-110**: ClamAV Malware Scanning (P1, security)
2. **THRILLWIKI-106**: Map Clustering (P1, performance)
3. **THRILLWIKI-107**: Nearby Locations (P2, UX)
4. **THRILLWIKI-108**: Search Relevance Scoring (P2, UX)
5. **THRILLWIKI-104**: Full User Statistics (P2, engagement)
6. **THRILLWIKI-101**: Geocoding Service (P3, completeness)
7. **THRILLWIKI-111**: Sample Data Command (P3, development)