ML Training Cost Calculator

Purpose: Provide production-ready cost estimation tools for ML training and inference across cloud GPU platforms (Modal, Lambda Labs, RunPod).

Activation Triggers:

Estimating training costs for ML models
Comparing GPU platform pricing
Calculating GPU hours for training jobs
Budgeting for ML projects
Optimizing inference costs
Evaluating cost-effectiveness of different GPU types
Planning resource allocation

Key Resources:

scripts/estimate-training-cost.sh - Calculate training costs based on model size, data, GPU type
scripts/estimate-inference-cost.sh - Estimate inference costs for production workloads
scripts/calculate-gpu-hours.sh - Convert training parameters to GPU hours
scripts/compare-platforms.sh - Compare costs across Modal, Lambda, RunPod
templates/cost-breakdown.json - Structured cost breakdown template
templates/platform-pricing.yaml - Up-to-date platform pricing data
examples/training-cost-estimate.md - Example training cost calculation
examples/inference-cost-estimate.md - Example inference cost analysis

Platform Pricing Overview

Modal (Serverless - Pay Per Second)

GPU Options:

T4: $0.000164/sec ($0.59/hr) - Development, small models
L4: $0.000222/sec ($0.80/hr) - Cost-effective training
A10: $0.000306/sec ($1.10/hr) - Mid-range training
A100 40GB: $0.000583/sec ($2.10/hr) - Large model training
A100 80GB: $0.000694/sec ($2.50/hr) - Very large models
H100: $0.001097/sec ($3.95/hr) - Cutting-edge training
H200: $0.001261/sec ($4.54/hr) - Latest generation
B200: $0.001736/sec ($6.25/hr) - Maximum performance
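
The per-second and hourly figures above differ by a factor of 3,600. The following is a minimal shell sketch of that conversion and of pricing a job billed by the second. The function names `hourly_rate` and `per_second_cost` are illustrative and are not part of the bundled scripts.

```shell
# Illustrative helpers (hypothetical names, not part of the bundled scripts).

# Convert a per-second price to a per-hour price, rounded to cents.
hourly_rate() {
  awk -v per_sec="$1" 'BEGIN { printf "%.2f\n", per_sec * 3600 }'
}

# Cost of a job billed per second: seconds used times the per-second price.
per_second_cost() {
  awk -v secs="$1" -v per_sec="$2" 'BEGIN { printf "%.2f\n", secs * per_sec }'
}

hourly_rate 0.000164           # T4 per-second price from the list above
per_second_cost 900 0.001097   # 15 minutes on an H100
```

Using awk keeps the arithmetic in floating point without requiring bc.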

Free Credits:

Starter: $30/month free
Startup credits: Up to $50,000 for eligible startups

Lambda Labs (On-Demand Hourly)

Single GPU:

1x A10: $0.31/hr - Cheapest single GPU option
1x V100 16GB: $0.55/hr - Most affordable multi-GPU base

8x GPU Clusters:

8x V100: $4.40/hr ($0.55/GPU) - Most affordable multi-GPU
8x A100 40GB: $10.32/hr ($1.29/GPU)
8x A100 80GB: $14.32/hr ($1.79/GPU)
8x H100: $23.92/hr ($2.99/GPU)

RunPod (Serverless - Pay Per Minute)

Key Features:

Pay-per-minute billing
FlashBoot <200ms cold-starts
Zero egress fees on storage
30+ GPU SKUs available

Cost Estimation Scripts

Estimate Training Cost

Script: scripts/estimate-training-cost.sh

Usage:

bash scripts/estimate-training-cost.sh \
  --model-size 7B \
  --dataset-size 10000 \
  --epochs 3 \
  --gpu t4 \
  --platform modal

Parameters:

--model-size : Model size (125M, 350M, 1B, 3B, 7B, 13B, 70B)
--dataset-size : Number of training samples
--epochs : Number of training epochs
--batch-size : Training batch size (default: auto-calculated)
--gpu : GPU type (t4, a10, a100-40gb, a100-80gb, h100)
--platform : Cloud platform (modal, lambda, runpod)
--peft : Use PEFT/LoRA (yes/no, default: no)
--mixed-precision : Use FP16/BF16 (yes/no, default: yes)

Output:

{
  "model": "7B",
  "dataset_size": 10000,
  "epochs": 3,
  "gpu": "T4",
  "platform": "Modal",
  "estimated_hours": 4.2,
  "cost_breakdown": {
    "compute_cost": 2.48,
    "storage_cost": 0.05,
    "total_cost": 2.53
  },
  "cost_optimizations": {
    "with_peft": 1.26,
    "savings_percentage": 50
  },
  "alternative_platforms": {
    "lambda_a10": 1.30,
    "runpod_t4": 2.40
  }
}

Calculation Methodology:

Estimates tokens per sample (avg 500 tokens)
Calculates total training tokens
Applies throughput rates per GPU type
Accounts for PEFT (90% memory reduction)
Accounts for mixed precision (2x speedup)
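
The steps above can be sketched as a single calculation: samples times tokens per sample times epochs gives total tokens, dividing by throughput gives hours, and multiplying by the hourly price gives cost. This is a simplified sketch, not the script itself. The function name `estimate_training` is hypothetical, and the PEFT and mixed-precision adjustments are assumed to be already folded into the throughput figure passed in, so the numbers it prints are illustrative and need not match the sample output.

```shell
# Illustrative sketch (hypothetical function, not the bundled script).
# Arguments: samples, tokens_per_sample, epochs, throughput_tokens_per_sec, price_per_hour
estimate_training() {
  awk -v n="$1" -v tps="$2" -v e="$3" -v rate="$4" -v price="$5" 'BEGIN {
    tokens = n * tps * e             # total training tokens
    hours  = tokens / rate / 3600    # hours at the given throughput
    printf "tokens=%d hours=%.2f cost=%.2f\n", tokens, hours, hours * price
  }'
}

# 10,000 samples x 500 tokens x 3 epochs, assuming 600 tokens/sec at $0.59/hr
estimate_training 10000 500 3 600 0.59
```

Because throughput is an input, the same function covers any GPU or training method for which a tokens-per-second figure is available.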

Estimate Inference Cost

Script: scripts/estimate-inference-cost.sh

Usage:

bash scripts/estimate-inference-cost.sh \
  --requests-per-day 1000 \
  --avg-latency 2 \
  --gpu t4 \
  --platform modal \
  --deployment serverless

Parameters:

--requests-per-day : Expected daily requests
--avg-latency : Average inference time (seconds)
--gpu : GPU type
--platform : Cloud platform
--deployment : Deployment type (serverless, dedicated)
--batch-inference : Batch requests (yes/no, default: no)

Output:

{
  "requests_per_day": 1000,
  "requests_per_month": 30000,
  "avg_latency_sec": 2,
  "gpu": "T4",
  "platform": "Modal Serverless",
  "cost_breakdown": {
    "daily_compute_seconds": 2000,
    "daily_cost": 0.33,
    "monthly_cost": 9.90,
    "cost_per_request": 0.00033
  },
  "scaling_analysis": {
    "requests_10k_day": 99.00,
    "requests_100k_day": 990.00
  },
  "dedicated_alternative": {
    "monthly_cost": 442.50,
    "break_even_requests_day": 45000
  }
}
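
The serverless figures in this output follow from requests per day, average latency, and the per-second GPU price. A minimal sketch of that arithmetic is given here; the function name `inference_cost` is hypothetical and a 30-day month is assumed. The sample output appears to multiply the rounded daily cost ($0.33) by 30, so its monthly figure is a few cents higher than the unrounded product computed here.

```shell
# Illustrative sketch (hypothetical function, not the bundled script).
# Arguments: requests_per_day, avg_latency_sec, price_per_sec
inference_cost() {
  awk -v r="$1" -v l="$2" -v p="$3" 'BEGIN {
    secs = r * l    # GPU-seconds consumed per day
    printf "daily_seconds=%d daily=%.2f monthly=%.2f per_request=%.5f\n", secs, secs * p, secs * p * 30, l * p
  }'
}

# 1,000 requests/day at 2 seconds each on a Modal T4 ($0.000164/sec)
inference_cost 1000 2 0.000164
```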

Calculate GPU Hours

Script: scripts/calculate-gpu-hours.sh

Usage:

bash scripts/calculate-gpu-hours.sh \
  --model-params 7B \
  --tokens-total 30M \
  --gpu a100-40gb

Parameters:

--model-params : Model parameters (125M, 350M, 1B, 3B, 7B, 13B, 70B)
--tokens-total : Total training tokens
--gpu : GPU type
--peft : Use PEFT (yes/no)
--multi-gpu : Number of GPUs (default: 1)

GPU Throughput Benchmarks:

T4 (16GB):

7B full fine-tune: 150 tokens/sec
7B with PEFT: 600 tokens/sec

A100 40GB:

7B full fine-tune: 800 tokens/sec
7B with PEFT: 3200 tokens/sec
13B with PEFT: 1600 tokens/sec

A100 80GB:

13B full fine-tune: 600 tokens/sec
70B with PEFT: 400 tokens/sec

H100:

70B with PEFT: 1200 tokens/sec
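
Given these benchmarks, training time is total tokens divided by throughput. The sketch below encodes the table as a lookup and converts tokens to hours. The function names are hypothetical, and the optional GPU count assumes throughput scales linearly with the number of GPUs, which real multi-GPU runs rarely achieve exactly.

```shell
# Illustrative sketch (hypothetical functions, not the bundled script).

# Look up tokens/sec from the benchmark table: gpu, model size, method (full|peft).
throughput() {
  case "$1:$2:$3" in
    t4:7B:full)         echo 150 ;;
    t4:7B:peft)         echo 600 ;;
    a100-40gb:7B:full)  echo 800 ;;
    a100-40gb:7B:peft)  echo 3200 ;;
    a100-40gb:13B:peft) echo 1600 ;;
    a100-80gb:13B:full) echo 600 ;;
    a100-80gb:70B:peft) echo 400 ;;
    h100:70B:peft)      echo 1200 ;;
    *) echo "no benchmark for $1 $2 $3" >&2; return 1 ;;
  esac
}

# Wall-clock hours: tokens, tokens_per_sec, [num_gpus, default 1].
# Assumes linear scaling across GPUs; billed GPU-hours = wall-clock hours x num_gpus.
training_hours() {
  awk -v t="$1" -v r="$2" -v g="${3:-1}" 'BEGIN { printf "%.2f\n", t / (r * g) / 3600 }'
}

# 30M tokens on one A100 40GB: full fine-tune vs PEFT
training_hours 30000000 "$(throughput a100-40gb 7B full)"
training_hours 30000000 "$(throughput a100-40gb 7B peft)"
```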

Compare Platforms

Script: scripts/compare-platforms.sh

Usage:

bash scripts/compare-platforms.sh \
  --training-hours 4 \
  --gpu-type a100-40gb

Output:

Platform Cost Comparison

Training Job: 4 hours on A100 40GB

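| Platform | GPU Cost | Egress Fees | Total | Notes |
|----------|----------|-------------|-------|-------|
| Modal | $8.40 | $0.00 | $8.40 | Serverless, pay-per-second |
| Lambda | $5.16 | $0.00 | $5.16 | Cheapest for dedicated |
| RunPod | $8.00 | $0.00 | $8.00 | Pay-per-minute |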

Winner: Lambda Labs ($5.16)

Savings: $3.24 (38.6% vs Modal)

Recommendation: Use Lambda for long-running dedicated training, Modal for serverless/bursty workloads.
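
The comparison reduces to multiplying the job length by each platform's hourly rate and taking the minimum. A minimal sketch follows; the function names are hypothetical, and the RunPod rate of $2.00/hr is inferred from the $8.00 total for 4 hours in the sample output rather than taken from a price list.

```shell
# Illustrative sketch (hypothetical functions, not the bundled script).

# Print the cheapest platform and its cost: hours, then name=hourly_rate pairs.
cheapest_platform() {
  hours="$1"; shift
  for pair in "$@"; do
    printf '%s %s\n' "${pair%%=*}" "${pair#*=}"
  done | awk -v h="$hours" '
    { cost = h * $2; if (NR == 1 || cost < best) { best = cost; name = $1 } }
    END { printf "%s %.2f\n", name, best }'
}

# Percentage saved by paying `cheaper` instead of `baseline`.
savings_pct() {
  awk -v baseline="$1" -v cheaper="$2" 'BEGIN { printf "%.1f\n", (baseline - cheaper) / baseline * 100 }'
}

cheapest_platform 4 modal=2.10 lambda=1.29 runpod=2.00
savings_pct 8.40 5.16
```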

Cost Templates

Cost Breakdown Template

Template: templates/cost-breakdown.json

{
  "project_name": "ML Training Project",
  "cost_estimate": {
    "training": {
      "model_size": "7B",
      "training_runs": 4,
      "hours_per_run": 4.2,
      "gpu_type": "T4",
      "platform": "Modal",
      "cost_per_run": 2.48,
      "total_training_cost": 9.92
    },
    "inference": {
      "deployment_type": "serverless",
      "expected_requests_month": 30000,
      "gpu_type": "T4",
      "platform": "Modal",
      "monthly_cost": 9.90
    },
    "storage": {
      "model_artifacts_gb": 14,
      "dataset_storage_gb": 5,
      "monthly_storage_cost": 0.50
    },
    "total_monthly_cost": 20.32,
    "breakdown_percentage": {
      "training": 48.8,
      "inference": 48.7,
      "storage": 2.5
    }
  },
  "cost_optimizations_applied": {
    "peft_lora": "50% training cost reduction",
    "mixed_precision": "2x faster training",
    "serverless_inference": "Pay only for actual usage",
    "batch_inference": "Up to 10x reduction in inference cost"
  },
  "potential_savings": {
    "without_optimizations": 45.00,
    "with_optimizations": 20.32,
    "total_savings": 24.68,
    "savings_percentage": 54.8
  }
}
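
The total_monthly_cost and breakdown_percentage fields in this template are derived from the three component costs. A short sketch of that derivation is shown here; the function name `cost_shares` is hypothetical.

```shell
# Illustrative sketch (hypothetical function, not part of the template tooling).
# Arguments: training, inference, storage costs in USD per month
cost_shares() {
  awk -v t="$1" -v i="$2" -v s="$3" 'BEGIN {
    total = t + i + s
    printf "total=%.2f training=%.1f%% inference=%.1f%% storage=%.1f%%\n", total, t / total * 100, i / total * 100, s / total * 100
  }'
}

cost_shares 9.92 9.90 0.50
```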

Platform Pricing Data

Template: templates/platform-pricing.yaml

platforms:
  modal:
    billing: per-second
    free_credits: 30  # USD per month
    startup_credits: 50000  # USD for eligible startups
    gpus:
      t4:
        price_per_sec: 0.000164
        price_per_hour: 0.59
        vram_gb: 16
      a100_40gb:
        price_per_sec: 0.000583
        price_per_hour: 2.10
        vram_gb: 40
      h100:
        price_per_sec: 0.001097
        price_per_hour: 3.95
        vram_gb: 80

  lambda:
    billing: per-hour
    free_credits: 0
    minimum_billing: 1-hour
    gpus:
      a10_1x:
        price_per_hour: 0.31
        vram_gb: 24
      a100_40gb_1x:
        price_per_hour: 1.29
        vram_gb: 40
      a100_40gb_8x:
        price_per_hour: 10.32
        total_vram_gb: 320

  runpod:
    billing: per-minute
    free_credits: 0
    features:
      - zero_egress_fees
      - flashboot_200ms
    gpus:
      t4:
        price_per_hour: 0.60  # Approximate
        vram_gb: 16

Cost Estimation Examples

Example 1: Training 7B Model

File: examples/training-cost-estimate.md

Scenario:

Model: Llama 2 7B fine-tuning
Dataset: 10,000 samples (5M tokens)
Epochs: 3
Total tokens: 15M
Method: LoRA/PEFT

Cost Calculation:

bash scripts/estimate-training-cost.sh \
  --model-size 7B \
  --dataset-size 10000 \
  --epochs 3 \
  --gpu t4 \
  --platform modal \
  --peft yes

Results:

Training Time: 4.2 hours
Modal T4 Cost: $2.48
Alternative (Lambda A10): $1.30 (47% cheaper)

Optimization Impact:

Without PEFT: $12.40 (5x more expensive)
With PEFT: $2.48
Savings: $9.92 (80%)

Recommendation: Use Lambda A10 for cheapest option, or Modal T4 for serverless convenience.

Example 2: Production Inference

File: examples/inference-cost-estimate.md

Scenario:

Model: Custom 7B classifier
Expected traffic: 1,000 requests/day
Avg latency: 2 seconds per request
Growth: 10x in 6 months

Cost Calculation:

bash scripts/estimate-inference-cost.sh \
  --requests-per-day 1000 \
  --avg-latency 2 \
  --gpu t4 \
  --platform modal \
  --deployment serverless

Current (1K requests/day):

Serverless Modal T4:

Daily cost: $0.33
Monthly cost: $9.90
Cost per request: $0.00033

Dedicated Lambda A10:

Monthly cost: $223 (24/7 instance)
Break-even: about 22,500 requests/day ($223 / ($0.00033 x 30 days))
Not recommended for current traffic

After Growth (10K requests/day):

Serverless Modal T4:

Monthly cost: $99.00
Still cheaper than a dedicated instance

Dedicated Lambda A10:

Monthly cost: $223
Break-even not yet reached (about 22,500 requests/day)
Recommendation: Stay serverless until traffic approaches about 22,500 requests/day
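
The break-even point is the daily request volume at which the serverless bill equals the cost of a dedicated instance: dedicated monthly cost divided by the serverless cost of one request per day over a month. A minimal sketch, assuming a 30-day month, a 24/7 dedicated instance, and hypothetical function names:

```shell
# Illustrative sketch (hypothetical functions, not the bundled script).

# Monthly cost of a dedicated instance running 24/7 for 30 days.
dedicated_monthly() {
  awk -v hourly="$1" 'BEGIN { printf "%.2f\n", hourly * 24 * 30 }'
}

# Requests/day at which serverless spend matches the dedicated monthly cost.
# Arguments: dedicated_monthly_cost, avg_latency_sec, serverless_price_per_sec
break_even_requests() {
  awk -v m="$1" -v l="$2" -v p="$3" 'BEGIN { printf "%.0f\n", m / (l * p * 30) }'
}

dedicated_monthly 0.31               # Lambda A10
break_even_requests 223 2 0.000164   # versus Modal T4 at 2 seconds per request
```

Below the break-even volume serverless is cheaper; above it, the dedicated instance is, provided one instance can absorb the load.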

Cost Optimization Strategies

Use PEFT/LoRA

Savings: 50-90% training cost reduction

# Calculate savings
bash scripts/estimate-training-cost.sh --model-size 7B --peft no
# Cost: $12.40

bash scripts/estimate-training-cost.sh --model-size 7B --peft yes
# Cost: $2.48

# Savings: $9.92 (80%)

Mixed Precision Training

Savings: About 2x faster training, which roughly halves compute cost

Enabled by default in cost estimates (--mixed-precision yes)

Platform Selection

Use Case Guidelines:

# Short jobs (<1 hour): Modal serverless
bash scripts/compare-platforms.sh --training-hours 0.5 --gpu-type t4
# Winner: Modal ($0.30 vs Lambda $0.31 minimum)

# Long jobs (4+ hours): Lambda dedicated
bash scripts/compare-platforms.sh --training-hours 4 --gpu-type a100-40gb
# Winner: Lambda ($5.16 vs Modal $8.40)

# Variable workloads: Modal serverless
# Pay only for actual usage, no idle cost
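
Billing granularity is what makes short jobs favor per-second pricing: usage is rounded up to the next billing unit. The sketch below models this. The function name `billed_cost` is hypothetical, and it assumes Lambda bills in whole hours and RunPod in whole minutes, with the RunPod price per minute derived from the approximate $0.60/hr T4 rate.

```shell
# Illustrative sketch (hypothetical function, not the bundled script).
# Arguments: seconds_used, price_per_billing_unit, billing_unit_seconds
billed_cost() {
  awk -v s="$1" -v price="$2" -v unit="$3" 'BEGIN {
    units = int(s / unit)
    if (units * unit < s) units++    # round up to the next billing unit
    printf "%.2f\n", units * price
  }'
}

# A 30-minute job (1,800 seconds)
billed_cost 1800 0.000164 1    # Modal T4, billed per second
billed_cost 1800 0.31 3600     # Lambda A10, assumed billed per whole hour
billed_cost 1800 0.01 60       # RunPod T4, billed per minute
```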

Batch Inference

Savings: Up to 10x reduction in inference cost

# Single inference
bash scripts/estimate-inference-cost.sh \
  --requests-per-day 1000 \
  --avg-latency 2 \
  --batch-inference no
# Cost: $9.90/month

# Batch inference (10 requests per batch)
bash scripts/estimate-inference-cost.sh \
  --requests-per-day 1000 \
  --avg-latency 0.3 \
  --batch-inference yes
# Cost: $1.49/month

# Savings: $8.41 (85%)

Quick Reference: Cost Per Use Case

Small Model Training (< 1B params)

Best GPU: T4
Best Platform: Modal (serverless)
Typical Cost: $0.50-$2.00 per run
Time: 30 min - 2 hours

Medium Model Training (1B-7B params)

Best GPU: T4 (with PEFT) or A100 40GB
Best Platform: Lambda A10 (cheapest) or Modal T4 (convenience)
Typical Cost: $1.00-$8.00 per run
Time: 2-8 hours

Large Model Training (7B-70B params)

Best GPU: A100 80GB or H100 (with PEFT)
Best Platform: Lambda (dedicated) or Modal (serverless)
Typical Cost: $10-$100 per run
Time: 8-48 hours

Low-Traffic Inference (<1K requests/day)

Best Deployment: Modal serverless
Best GPU: T4
Typical Cost: $5-$15/month

High-Traffic Inference (>10K requests/day)

Best Deployment: Dedicated or batch serverless
Best GPU: A10 or A100
Typical Cost: $100-$500/month

Dependencies

Required for scripts:

# Bash 4.0+ (for associative arrays)
bash --version

# jq (for JSON processing)
sudo apt-get install jq

# bc (for floating-point calculations)
sudo apt-get install bc

# yq (for YAML processing)
pip install yq

Best Practices Summary

Always estimate before training - Use cost scripts to avoid surprises
Use PEFT for large models - 50-90% cost savings
Enable mixed precision - About 2x speedup, typically with negligible quality loss
Choose platform based on workload:
  Modal: Serverless, short jobs, variable workloads
  Lambda: Long-running, dedicated, multi-GPU
  RunPod: Per-minute billing flexibility
Batch inference when possible - Up to 10x cost reduction
Apply for startup credits - Modal offers up to $50K for eligible startups
Monitor actual costs - Compare estimates to actuals and adjust
Use smallest viable GPU - T4 often sufficient with PEFT

Supported Platforms: Modal, Lambda Labs, RunPod
GPU Types: T4, L4, A10, A100 (40GB/80GB), H100, H200, B200
Output Format: JSON cost breakdowns and markdown reports
Version: 1.0.0
