AI Fine-Tuning Cost Calculator
Estimate fine-tuning costs across OpenAI, Google, and open-source models. Compare training costs, LoRA vs full fine-tuning, and total cost of ownership.
Training Data
Examples (slider range): 10 to 100K
Tokens per example (slider range): 50 to 4,000
Epochs (slider range): 1 to 10
Total training tokens
1.5M tokens
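The total can be reproduced with a short calculation. This is a minimal sketch, assuming the calculator multiplies the number of examples by the average tokens per example and by the number of epochs. The function name is hypothetical, and the 500 tokens-per-example value is inferred from the 1.5M total for 1,000 examples and 3 epochs rather than stated on the page.

```python
def total_training_tokens(examples: int, tokens_per_example: int, epochs: int) -> int:
    """Tokens processed during training: each example is seen once per epoch."""
    return examples * tokens_per_example * epochs


# 500 tokens per example is an inferred value, not one shown on the page.
tokens = total_training_tokens(examples=1_000, tokens_per_example=500, epochs=3)
print(f"{tokens / 1e6:.1f}M tokens")
```

Doubling the epochs or the average example length doubles the training tokens, and with them the training cost.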
Post-Training Inference
Requests (slider range): 1 to 10,000
Lowest Total Cost of Ownership (6mo)
Llama 3.1 8B by Together AI
Total (6mo): $2.16
Training (one-time): $0.7200
Inference/month: $0.2400
Per request: $0.000080
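The figures in this summary fit a simple model: a one-time training cost plus six months of inference. The sketch below assumes that model; the function names are hypothetical, and the 3,000 requests per month is inferred by dividing the monthly inference cost by the per-request cost, not read from the page.

```python
def monthly_inference_cost(cost_per_request: float, requests_per_month: int) -> float:
    """Recurring inference spend for one month."""
    return cost_per_request * requests_per_month


def total_cost_of_ownership(training_cost: float, monthly_inference: float, months: int = 6) -> float:
    """One-time training cost plus inference over the ownership window."""
    return training_cost + monthly_inference * months


# Inferred from $0.2400 per month / $0.000080 per request; an assumption.
REQUESTS_PER_MONTH = 3_000

monthly = monthly_inference_cost(0.000080, REQUESTS_PER_MONTH)
tco = total_cost_of_ownership(training_cost=0.72, monthly_inference=monthly)
print(f"monthly: ${monthly:.4f}, 6-month total: ${tco:.2f}")
```

Because training is paid once and inference recurs, the inference term dominates the total as the ownership window or request volume grows.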
Cheapest Training: Llama 3.1 8B ($0.7200)
Cheapest Inference: Llama 3.1 8B ($0.2400/mo)
Training Tokens: 1.5M (1,000 examples × 3 epochs)
Cost Comparison
| Model | Provider | Training | Inference/mo | Total (6mo) |
|---|---|---|---|---|
| Llama 3.1 8B (Best Value) | Together AI | $0.7200 | $0.2400 | $2.16 |
| Mistral 7B | Together AI | $0.7200 | $0.2400 | $2.16 |
| Llama 3.1 8B | Fireworks AI | $0.9000 | $0.2400 | $2.34 |
| Gemini 2.0 Flash | Google | $3.00 | $0.7650 | $7.59 |
| GPT-4o mini | OpenAI | $4.50 | $1.53 | $13.68 |
| Llama 3.3 70B | Together AI | $7.50 | $1.60 | $17.08 |
| Llama 3.3 70B | Fireworks AI | $9.00 | $1.60 | $18.58 |
| GPT-4.1 mini | OpenAI | $6.00 | $4.08 | $30.48 |
| GPT-4o | OpenAI | $37.50 | $19.13 | $152.25 |
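The ranking in the table can be recomputed from its own columns. The sketch below copies the training and monthly inference figures from the table, sorts by six-month total, and derives an implied training price per million tokens by dividing each training cost by the 1.5M training tokens. The implied rates are back-calculated from this page, not official provider prices, and all names in the code are hypothetical. Totals recomputed from the rounded monthly figures can differ from the table by a few cents (for example, Llama 3.3 70B on Together AI comes out at $17.10 rather than $17.08).

```python
# (model, provider, training cost in $, inference cost in $ per month)
ROWS = [
    ("Llama 3.1 8B", "Together AI", 0.72, 0.24),
    ("Mistral 7B", "Together AI", 0.72, 0.24),
    ("Llama 3.1 8B", "Fireworks AI", 0.90, 0.24),
    ("Gemini 2.0 Flash", "Google", 3.00, 0.765),
    ("GPT-4o mini", "OpenAI", 4.50, 1.53),
    ("Llama 3.3 70B", "Together AI", 7.50, 1.60),
    ("Llama 3.3 70B", "Fireworks AI", 9.00, 1.60),
    ("GPT-4.1 mini", "OpenAI", 6.00, 4.08),
    ("GPT-4o", "OpenAI", 37.50, 19.13),
]
TRAINING_TOKENS_MILLIONS = 1.5  # total training tokens shown on the page
MONTHS = 6


def six_month_total(training: float, monthly: float) -> float:
    """One-time training cost plus six months of inference."""
    return training + monthly * MONTHS


# sorted() is stable, so rows with equal totals keep their table order.
ranked = sorted(ROWS, key=lambda row: six_month_total(row[2], row[3]))

for model, provider, training, monthly in ranked:
    total = six_month_total(training, monthly)
    implied_rate = training / TRAINING_TOKENS_MILLIONS
    print(f"{model:<18}{provider:<14}${total:>8.2f}  ~${implied_rate:.2f}/M training tokens")
```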
Total Cost Breakdown (6-Month Ownership)
[Bar chart: 6-month total cost per model, from $2.16 (Llama 3.1 8B) to $152.25 (GPT-4o), each bar split into Training (one-time) and Inference (6 months).]
Model Details
Llama 3.1 8B (Together AI, LoRA)
Cheapest fine-tuning option. Ideal for simple classification tasks.
Min examples: 1 | Context: 131K
Mistral 7B (Together AI, LoRA)
Fast, efficient model. Good for multilingual tasks.
Min examples: 1 | Context: 32K
Llama 3.1 8B (Fireworks AI, LoRA)
Fireworks serverless deployment. Fast inference after fine-tuning.
Min examples: 1 | Context: 131K
Gemini 2.0 Flash (Google, LoRA)
Affordable fine-tuning via Vertex AI. LoRA supported.
Min examples: 100 | Context: 1M
GPT-4o mini
Best value for OpenAI fine-tuning. Great for most use cases.
Min examples: 10 | Context: 128K
Llama 3.3 70B (Together AI, LoRA)
Full fine-tuning or LoRA. Great open-source option for production.
Min examples: 1 | Context: 131K
Llama 3.3 70B (Fireworks AI, LoRA)
Fireworks optimized serving with LoRA adapters.
Min examples: 1 | Context: 131K
GPT-4.1 mini
Latest mini model with 1M context. Better instruction following.
Min examples: 10 | Context: 1M
GPT-4o
Most capable OpenAI model for fine-tuning. Higher cost, best quality.
Min examples: 10 | Context: 128K
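The minimum-example and context figures above can be used to screen models before comparing cost. The sketch below is illustrative only: it assumes the listed context size is also the upper bound on a single training example's length, which the page does not state, and the class and function names are hypothetical. Context sizes are taken as listed (131K is treated as 131,000 tokens).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    name: str
    min_examples: int
    context_tokens: int


# Values copied from the Model Details entries; one row per distinct model.
LIMITS = [
    ModelLimits("Llama 3.1 8B", 1, 131_000),
    ModelLimits("Mistral 7B", 1, 32_000),
    ModelLimits("Gemini 2.0 Flash", 100, 1_000_000),
    ModelLimits("GPT-4o mini", 10, 128_000),
    ModelLimits("Llama 3.3 70B", 1, 131_000),
    ModelLimits("GPT-4.1 mini", 10, 1_000_000),
    ModelLimits("GPT-4o", 10, 128_000),
]


def eligible_models(n_examples: int, longest_example_tokens: int) -> list[str]:
    """Models whose minimum dataset size and context window both fit the dataset."""
    return [
        m.name
        for m in LIMITS
        if n_examples >= m.min_examples and longest_example_tokens <= m.context_tokens
    ]


# A small dataset with one long example: too few examples for Gemini 2.0 Flash,
# and the longest example exceeds the Mistral 7B context.
candidates = eligible_models(n_examples=50, longest_example_tokens=40_000)
print(candidates)
```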