mirror of
https://github.com/pacnpal/thrillwiki_django_no_react.git
synced 2025-12-20 14:11:09 -05:00
@@ -1,158 +0,0 @@
|
||||
# ThrillWiki Data Seeding Script
|
||||
|
||||
## Overview
|
||||
|
||||
The `seed_data.py` management command provides comprehensive test data seeding for the ThrillWiki application. It creates realistic data across all models in the system for testing and development purposes.
|
||||
|
||||
## Usage
|
||||
|
||||
### Basic Usage
|
||||
```bash
|
||||
# Seed with default counts
|
||||
uv run manage.py seed_data
|
||||
|
||||
# Clear existing data and seed fresh
|
||||
uv run manage.py seed_data --clear
|
||||
|
||||
# Custom counts
|
||||
uv run manage.py seed_data --users 50 --parks 20 --rides 100 --reviews 200
|
||||
```
|
||||
|
||||
### Command Options
|
||||
|
||||
- `--clear`: Clear existing data before seeding
|
||||
- `--users N`: Number of users to create (default: 25)
|
||||
- `--companies N`: Number of companies to create (default: 15)
|
||||
- `--parks N`: Number of parks to create (default: 10)
|
||||
- `--rides N`: Number of rides to create (default: 50)
|
||||
- `--ride-models N`: Number of ride models to create (default: 20)
|
||||
- `--reviews N`: Number of reviews to create (default: 100)
|
||||
|
||||
## What Gets Created
|
||||
|
||||
### Users & Accounts
|
||||
- **Admin User**: `admin` / `admin123` (superuser)
|
||||
- **Moderator User**: `moderator` / `mod123` (staff)
|
||||
- **Regular Users**: Random realistic users with profiles
|
||||
- **User Profiles**: Complete with ride credits, social links, preferences
|
||||
- **Notifications**: Sample notifications for users
|
||||
- **Top Lists**: User-created top lists for parks and rides
|
||||
|
||||
### Companies
|
||||
- **Park Operators**: Disney, Universal, Six Flags, Cedar Fair, etc.
|
||||
- **Ride Manufacturers**: B&M, Intamin, Vekoma, RMC, etc.
|
||||
- **Ride Designers**: Werner Stengel, Alan Schilke, John Wardley
|
||||
- **Company Headquarters**: Realistic address data
|
||||
|
||||
### Parks & Locations
|
||||
- **Famous Parks**: Magic Kingdom, Disneyland, Cedar Point, etc.
|
||||
- **Park Locations**: Geographic coordinates and addresses
|
||||
- **Park Areas**: Themed areas within parks
|
||||
- **Park Photos**: Sample photo records
|
||||
|
||||
### Rides & Models
|
||||
- **Famous Coasters**: Steel Vengeance, Millennium Force, etc.
|
||||
- **Ride Models**: B&M Dive Coaster, Intamin Accelerator, etc.
|
||||
- **Roller Coaster Stats**: Height, speed, inversions, etc.
|
||||
- **Ride Photos**: Sample photo records
|
||||
- **Technical Specs**: Detailed specifications for ride models
|
||||
|
||||
### Content & Reviews
|
||||
- **Park Reviews**: User reviews with ratings and visit dates
|
||||
- **Ride Reviews**: Detailed ride experiences
|
||||
- **Review Content**: Realistic review text and ratings
|
||||
|
||||
## Data Quality Features
|
||||
|
||||
### Realistic Data
|
||||
- **Names**: Diverse, realistic user names
|
||||
- **Locations**: Accurate geographic coordinates
|
||||
- **Relationships**: Proper company-park-ride relationships
|
||||
- **Statistics**: Realistic ride statistics and ratings
|
||||
|
||||
### Comprehensive Coverage
|
||||
- **All Models**: Seeds data for every model in the system
|
||||
- **Relationships**: Maintains proper foreign key relationships
|
||||
- **Optional Models**: Handles models that may not exist gracefully
|
||||
|
||||
### Data Integrity
|
||||
- **Unique Constraints**: Uses `get_or_create` to avoid duplicates
|
||||
- **Validation**: Respects model constraints and validation rules
|
||||
- **Dependencies**: Creates data in proper dependency order
|
||||
|
||||
## Technical Implementation
|
||||
|
||||
### Architecture
|
||||
- **Modular Design**: Separate methods for each model type
|
||||
- **Transaction Safety**: All operations wrapped in database transaction
|
||||
- **Error Handling**: Graceful handling of missing optional models
|
||||
- **Progress Reporting**: Clear console output with emojis and counts
|
||||
|
||||
### Model Handling
|
||||
- **Dual Company Models**: Properly handles separate Park and Ride company models
|
||||
- **Optional Models**: Checks for existence before using optional models
|
||||
- **Type Safety**: Proper type hints and error handling
|
||||
|
||||
### Data Generation
|
||||
- **Random but Realistic**: Uses curated lists for realistic data
|
||||
- **Configurable Counts**: All counts are configurable via command line
|
||||
- **Relationship Integrity**: Maintains proper relationships between models
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
1. **Database Schema Mismatch**: If you see timezone constraint errors, run migrations first:
|
||||
```bash
|
||||
uv run manage.py migrate
|
||||
```
|
||||
|
||||
2. **Permission Errors**: Ensure database user has proper permissions for all operations
|
||||
|
||||
3. **Memory Issues**: For large datasets, consider running with smaller batches
|
||||
|
||||
### Known Limitations
|
||||
|
||||
- **Database Schema Compatibility**: May encounter issues with database schemas that have additional required fields not present in the current models (e.g., timezone field)
|
||||
- **pghistory Compatibility**: May have issues with some pghistory configurations
|
||||
- **Cloudflare Images**: Creates placeholder records without actual images
|
||||
- **Geographic Data**: Requires PostGIS for location features
|
||||
- **Transaction Management**: Uses atomic transactions which may fail completely if any model creation fails
|
||||
|
||||
## Development Notes
|
||||
|
||||
### Adding New Models
|
||||
1. Import the model at the top of the file
|
||||
2. Add to `models_to_clear` list if needed
|
||||
3. Create a new `create_*` method
|
||||
4. Call the method in `handle()` in proper dependency order
|
||||
5. Add count to `print_summary()`
|
||||
|
||||
### Customizing Data
|
||||
- Modify the data lists (e.g., `first_names`, `famous_parks`) to customize generated data
|
||||
- Adjust probability weights for different scenarios
|
||||
- Add new relationship patterns as needed
|
||||
|
||||
## Performance
|
||||
|
||||
### Optimization Tips
|
||||
- Use `--clear` sparingly in production-like environments
|
||||
- Consider smaller batch sizes for very large datasets
|
||||
- Monitor database performance during seeding
|
||||
|
||||
### Typical Performance
|
||||
- 25 users, 15 companies, 10 parks, 50 rides: ~30 seconds
|
||||
- 100 users, 50 companies, 25 parks, 200 rides: ~2-3 minutes
|
||||
|
||||
## Security Notes
|
||||
|
||||
- **Default Passwords**: All seeded users have simple passwords for development only
|
||||
- **Admin Access**: Creates admin user with known credentials
|
||||
- **Production Warning**: Never run with `--clear` in production environments
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
- **Bulk Operations**: Use bulk_create for better performance
|
||||
- **Custom Scenarios**: Add preset scenarios (small, medium, large)
|
||||
- **Data Export**: Add ability to export seeded data
|
||||
- **Incremental Updates**: Support for updating existing data
|
||||
Reference in New Issue
Block a user