ML Training Example Projects
Purpose: Provide complete, runnable example projects demonstrating ML training workflows from data preparation through deployment.
Activation Triggers:
- User requests example projects or starter templates
- User wants to see working sentiment classification code
- User needs text generation training examples
- User mentions RedAI trade classifier
- User wants reference implementations
- User needs to understand complete training workflows
Key Resources:
- scripts/setup-example.sh - Initialize and set up any example project
- scripts/run-training.sh - Execute training for any example
- scripts/test-inference.sh - Test trained models
- examples/sentiment-classification/ - Binary sentiment classification (IMDB-style)
- examples/text-generation/ - GPT-style text generation with LoRA
- examples/redai-trade-classifier/ - Financial trade classification with Modal deployment
- templates/ - Scaffolding for new projects
Available Example Projects
- Sentiment Classification
Use Case: Binary sentiment analysis (positive/negative reviews)
Features:
- DistilBERT fine-tuning for text classification
- Custom dataset loading from JSON
- Training with validation metrics
- Model saving and inference
- Production-ready inference API
Files:
- train.py - Complete training script
- data.json - Sample training data (50 examples)
- inference.py - Inference server
- README.md - Setup and usage guide
Dataset Format:
{"text": "This movie was amazing!", "label": 1}
{"text": "Terrible waste of time", "label": 0}
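The two records above can be read either as one JSON object per line or as elements of a JSON array; the source does not say which layout data.json uses. The following is a minimal sketch that accepts both and checks the text and label fields. `parse_records` is a hypothetical helper for illustration, not necessarily what train.py does.

```python
import json


def parse_records(raw):
    """Parse sentiment records from a JSON array or from JSON Lines text.

    Hypothetical helper: the on-disk layout of data.json is an assumption.
    """
    raw = raw.strip()
    if raw.startswith("["):
        records = json.loads(raw)  # a single JSON array of objects
    else:
        # one JSON object per non-empty line
        records = [json.loads(line) for line in raw.splitlines() if line.strip()]

    texts, labels = [], []
    for i, rec in enumerate(records):
        text, label = rec.get("text"), rec.get("label")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"record {i}: 'text' must be a non-empty string")
        if label not in (0, 1):
            raise ValueError(f"record {i}: 'label' must be 0 or 1, got {label!r}")
        texts.append(text)
        labels.append(label)
    return texts, labels


sample = '''
{"text": "This movie was amazing!", "label": 1}
{"text": "Terrible waste of time", "label": 0}
'''
texts, labels = parse_records(sample)
print(labels)  # [1, 0]
```

Validating records up front tends to surface malformed rows before a training run starts rather than partway through an epoch.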
- Text Generation
Use Case: Fine-tune GPT-2 for custom text generation
Features:
- GPT-2 small model fine-tuning
- LoRA (Low-Rank Adaptation) for efficient training
- Custom tokenization
- Generation with temperature/top-p sampling
- Modal deployment configuration
Files:
- train.py - LoRA training script
- config.yaml - Hyperparameters and model config
- generate.py - Text generation script
- modal_deploy.py - Modal deployment
- README.md - Complete guide
Config Structure:
model:
  name: gpt2
  max_length: 512
training:
  epochs: 3
  batch_size: 4
  learning_rate: 2e-4
lora:
  r: 8
  alpha: 16
  dropout: 0.1
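To make the lora block concrete: LoRA adds two small matrices, A of shape (r, d_in) and B of shape (d_out, r), to each adapted weight and scales their product by alpha / r. The sketch below works through that arithmetic for the values in the config. It assumes the adapters sit on GPT-2 small's fused attention projection (768 to 2304) in all 12 blocks; the source does not state the target modules, and `LoraSettings` and `lora_param_count` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    r: int = 8            # rank of the low-rank update (config: lora.r)
    alpha: int = 16       # scaling numerator (config: lora.alpha)
    dropout: float = 0.1  # dropout on the adapter input (config: lora.dropout)

    @property
    def scaling(self):
        # Standard LoRA scales the update B @ A by alpha / r.
        return self.alpha / self.r


def lora_param_count(d_in, d_out, r, n_layers=1):
    """Trainable parameters LoRA adds to n_layers weights of shape d_out x d_in."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return n_layers * r * (d_in + d_out)


cfg = LoraSettings()
# Assumption: adapters on the fused QKV projection (768 -> 2304) in 12 blocks.
added = lora_param_count(d_in=768, d_out=2304, r=cfg.r, n_layers=12)
print(cfg.scaling, added)  # 2.0 294912
```

Under these assumptions the adapters add roughly 0.3M trainable parameters, a small fraction of the base model. Adapting other modules would change the count.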
- RedAI Trade Classifier
Use Case: Financial trade classification (buy/sell/hold)
Features:
- Multi-class classification for trading signals
- Feature engineering from market data
- Class imbalance handling
- Modal deployment for production inference
- Real-time prediction API
Files:
- train.py - Training with class weighting
- modal_deploy.py - Complete Modal deployment
- data_preprocessing.py - Feature engineering
- README.md - Trading strategy guide
Model Input:
- Price features (open, high, low, close)
- Volume indicators
- Technical indicators (RSI, MACD, moving averages)
- Sentiment scores
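The class weighting mentioned for train.py is not spelled out here. One common approach, shown as a sketch rather than as the project's actual method, is inverse-frequency weighting: each class gets the weight n_samples / (num_classes * count), so rare buy and sell labels count more in the loss than frequent hold labels. The label distribution below is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes=3):
    """Inverse-frequency weights: n_samples / (num_classes * count_c)."""
    counts = Counter(labels)
    n = len(labels)
    weights = []
    for c in range(num_classes):
        if counts[c] == 0:
            raise ValueError(f"class {c} has no samples; cannot weight it")
        weights.append(n / (num_classes * counts[c]))
    return weights


# Hypothetical distribution: mostly hold (1), few sell (0) and buy (2).
labels = [1] * 80 + [0] * 10 + [2] * 10
weights = balanced_class_weights(labels)
print([round(w, 3) for w in weights])
```

In PyTorch, a list like this can be converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.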
Quick Start
Set Up Any Example
Initialize an example project:
./scripts/setup-example.sh <project-name>
Options: sentiment-classification, text-generation, redai-trade-classifier. For example:
./scripts/setup-example.sh sentiment-classification
What it does:
- Creates project directory
- Copies example files
- Installs dependencies
- Downloads/prepares sample data
- Validates environment
Run Training
Train the model for any example:
./scripts/run-training.sh <project-name>
Examples:
./scripts/run-training.sh sentiment-classification
./scripts/run-training.sh text-generation
./scripts/run-training.sh redai-trade-classifier
Monitors:
- Training progress
- Loss curves
- Validation metrics
- GPU utilization
- Checkpoint saving
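The list above does not say which validation metrics the training script reports. For the binary sentiment example, the usual candidates are accuracy, precision, recall, and F1 for the positive class. A dependency-free sketch follows; `binary_metrics` is a hypothetical helper, and scikit-learn offers equivalent functions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print({k: round(v, 3) for k, v in metrics.items()})
```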
Test Inference
Test a trained model:
./scripts/test-inference.sh <project-name> <input>
Examples:
./scripts/test-inference.sh sentiment-classification "This product is great!"
./scripts/test-inference.sh text-generation "Once upon a time"
./scripts/test-inference.sh redai-trade-classifier market_data.json
Common Workflows
Start From Example Template
1. Choose an example based on your use case:
   - Classification → sentiment-classification
   - Generation → text-generation
   - Financial ML → redai-trade-classifier
2. Set up the project:
   ./scripts/setup-example.sh <example-name>
3. Customize for your data:
   - Update data loading in train.py
   - Modify model architecture if needed
   - Adjust hyperparameters in config
4. Run training:
   ./scripts/run-training.sh <example-name>
5. Deploy:
   - Local: Use inference.py
   - Production: Use modal_deploy.py
Extend Example with Custom Data
- Prepare data in example format
- Replace data files (data.json, config.yaml)
- Update preprocessing if needed
- Train with same script
- Test inference with new data
Deploy Example to Production
All examples include Modal deployment:
Deploy to Modal:
cd examples/<project-name>
modal deploy modal_deploy.py
Get the endpoint URL (modal deploy prints the URL of any web endpoint it creates; to list deployed apps afterwards, run):
modal app list
Example Comparison
| Feature | Sentiment | Text Gen | Trade Classifier |
| --- | --- | --- | --- |
| Task Type | Binary Classification | Generation | Multi-class |
| Model | DistilBERT | GPT-2 + LoRA | Custom Transformer |
| Training Time | 5-10 min | 15-30 min | 10-20 min |
| GPU Required | Optional | Recommended | Required |
| Modal Deploy | ✅ | ✅ | ✅ |
| Custom Data | Easy | Moderate | Advanced |
Customization Guide
Sentiment Classification
Change dataset:
In train.py, update load_data():
def load_data(path):
    # Your custom loading logic
    return texts, labels
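As an illustration of custom loading logic, the sketch below swaps the JSON input for a CSV file while keeping the same (texts, labels) return shape. The column names review and sentiment and the string labels are hypothetical; adapt them to your data.

```python
import csv
import io

# Assumed mapping from string labels to the 0/1 labels used in data.json.
LABEL_MAP = {"negative": 0, "positive": 1}


def load_data(path_or_file, text_col="review", label_col="sentiment"):
    """Read texts and integer labels from a CSV path or an open text file.

    Column names and label strings are hypothetical placeholders.
    """
    opened = isinstance(path_or_file, str)
    f = open(path_or_file, newline="", encoding="utf-8") if opened else path_or_file
    try:
        texts, labels = [], []
        for row in csv.DictReader(f):
            texts.append(row[text_col].strip())
            # Unknown label strings raise KeyError instead of being guessed.
            labels.append(LABEL_MAP[row[label_col].strip().lower()])
        return texts, labels
    finally:
        if opened:
            f.close()  # only close files this function opened itself


demo = io.StringIO("review,sentiment\nGreat film,positive\nToo long,negative\n")
print(load_data(demo))  # (['Great film', 'Too long'], [1, 0])
```

Keeping the return value as two parallel lists means the rest of train.py should not need to change, assuming it only consumes texts and labels.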
Change model:
Replace DistilBERT with other models
model_name = "bert-base-uncased" # or roberta-base, etc.
Text Generation
Change generation style:
In config.yaml:
generation:
  temperature: 0.8  # Higher = more creative
  top_p: 0.9        # Nucleus sampling
  max_length: 200   # Output length
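To show what temperature and top_p control, here is a dependency-free sketch of the sampling step on a toy four-token vocabulary. Temperature divides the logits before the softmax, and top-p (nucleus) sampling keeps only the most likely tokens whose cumulative probability reaches top_p. The function names are hypothetical, and generate.py may well delegate this to the model library instead.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.8, top_p=0.9, rng=random):
    """Draw one token id using temperature scaling followed by nucleus filtering."""
    candidates = top_p_filter(apply_temperature(logits, temperature), top_p)
    ids = list(candidates)
    return rng.choices(ids, weights=[candidates[i] for i in ids], k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # toy scores for a 4-token vocabulary
print(sorted(top_p_filter(apply_temperature(logits, 0.8), 0.9)))  # [0, 1, 2]
```

With these toy logits the least likely token is cut off by the nucleus filter, so it can never be sampled. In the Transformers library, the corresponding knobs are the `temperature` and `top_p` arguments of `generate` with `do_sample=True`.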
Add custom prompts:
In generate.py:
prompts = [
    "Your custom prompt here",
    "Another prompt",
]
Trade Classifier
Add features:
In data_preprocessing.py:
def engineer_features(df):
    # calculate_rsi and calculate_macd are helpers you define or import
    df['rsi'] = calculate_rsi(df['close'])
    df['macd'] = calculate_macd(df['close'])
    # Add your custom indicators
    return df
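The body of calculate_rsi is not shown in this guide. As a sketch of what such a helper might look like, the version below computes RSI from simple averages of gains and losses over a rolling window. It works on a plain list of closing prices (for a DataFrame column, `df['close'].tolist()` would do) and is an assumption for illustration; libraries that use Wilder's smoothing will produce somewhat different values.

```python
def calculate_rsi(closes, period=14):
    """Relative Strength Index using simple averages of gains and losses.

    Returns a list aligned with closes; the first `period` entries are None
    because the indicator needs `period` price changes to warm up.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    rsi = [None] * len(closes)
    for end in range(period, len(closes)):
        window = changes[end - period:end]  # the `period` most recent changes
        avg_gain = sum(c for c in window if c > 0) / period
        avg_loss = sum(-c for c in window if c < 0) / period
        if avg_loss == 0:
            # Choice made in this sketch: 100 with no losses, 50 if prices were flat.
            rsi[end] = 100.0 if avg_gain > 0 else 50.0
        else:
            rs = avg_gain / avg_loss
            rsi[end] = 100.0 - 100.0 / (1.0 + rs)
    return rsi


closes = [10, 11, 12, 11, 12, 13]
values = calculate_rsi(closes, period=3)
print([None if v is None else round(v, 2) for v in values])
```

The leading None entries need to be dropped or imputed before the features reach the model.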
Change strategy:
Update labels in train.py (0 = sell, 1 = hold, 2 = buy):
labels = your_strategy(prices, indicators)
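What your_strategy does is left to the reader. One simple possibility, offered only as a sketch, is to label each step by the return over the next few steps: buy if it exceeds a threshold, sell if it falls below the negative threshold, hold otherwise. The function name, horizon, and threshold below are hypothetical, and the indicators argument is omitted for brevity.

```python
SELL, HOLD, BUY = 0, 1, 2  # label encoding used by the trade classifier


def label_by_forward_return(prices, horizon=1, threshold=0.01):
    """Label each step by the return over the next `horizon` steps.

    BUY if the forward return exceeds +threshold, SELL if it is below
    -threshold, HOLD otherwise. The last `horizon` steps have no future
    price and are dropped, so the result is shorter than `prices`.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    labels = []
    for i in range(len(prices) - horizon):
        forward_return = (prices[i + horizon] - prices[i]) / prices[i]
        if forward_return > threshold:
            labels.append(BUY)
        elif forward_return < -threshold:
            labels.append(SELL)
        else:
            labels.append(HOLD)
    return labels


prices = [100.0, 102.0, 101.5, 99.0, 99.2]
print(label_by_forward_return(prices))  # [2, 1, 0, 1]
```

Because the labels look ahead, the feature rows need to be trimmed by the same horizon, and the features themselves should only use data available at or before each step.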
Dependencies
Each example includes its own requirements.txt:
Sentiment Classification:
- transformers
- torch
- datasets
- scikit-learn
Text Generation:
- transformers
- peft (LoRA)
- torch
- modal (deployment)
Trade Classifier:
- transformers
- pandas
- numpy
- modal
- ta (technical analysis)
Troubleshooting
Training Fails
Issue: Out of memory
Fix: Reduce batch size in config
Issue: CUDA not available
Fix: Run on CPU, or install a CUDA-enabled PyTorch build with a compatible NVIDIA driver
Inference Errors
Issue: Model not found
Fix: Check checkpoint path in inference script
Issue: Wrong input format
Fix: Validate input matches training data format
Deployment Issues
Issue: Modal authentication
Fix: Run modal token new to authenticate
Issue: Dependency conflicts
Fix: Use exact versions from requirements.txt
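To check whether the active environment actually matches the pinned versions, a small script along these lines can help. It is a sketch that only understands plain name==version lines (no extras, markers, or ranges), and the package names in the demo are placeholders, not the real pins from any example's requirements.txt.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Compare pinned versions with what is installed in this environment."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


# Placeholder pins for illustration only.
example = "example-pkg==1.2.3  # pinned\nother-pkg==0.4.0\n"
print(parse_pins(example))  # {'example-pkg': '1.2.3', 'other-pkg': '0.4.0'}
```

Calling `find_mismatches(parse_pins(open("requirements.txt").read()))` from an example directory would then list each package whose installed version differs from its pin, as (wanted, installed) pairs.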
Resources
Scripts: All scripts are in scripts/ and are executable
Examples: Complete projects in examples/ directory
Templates: Scaffolding in templates/ for creating new projects
Documentation: Each example has detailed README.md
Supported Frameworks: PyTorch, Transformers, PEFT
Deployment Platforms: Modal, Local, FastAPI
Version: 1.0.0