Google Vertex AI Pricing 2026: Gemini, PaLM & All Model Costs
Complete Google Vertex AI pricing guide for 2026: Gemini 2.0 Flash, Gemini Pro, Gemma, Imagen, and embedding costs, plus how Vertex AI compares to using the Gemini API directly.
10 min read·Updated March 2026
Vertex AI Key Pricing
- Gemini 2.0 Flash: free (within limits)
- Gemini Flash input: $0.075 per 1M tokens
- Gemini Pro input: $1.25 per 1M tokens
- Context window (Flash): 1M tokens
Google Vertex AI — Gemini Model Pricing 2026
| Model | Input (USD per 1M tokens) | Output (USD per 1M tokens) | Context window (tokens) |
|---|---|---|---|
| Gemini 2.0 Pro | $1.25 | $5.00 | 1M |
| Gemini 2.0 Flash | $0.075 | $0.30 | 1M |
| Gemini 2.0 Flash-Lite | $0.01 | $0.04 | 1M |
| Gemini 1.5 Pro | $1.25 | $5.00 | 2M |
| Gemini 1.5 Flash | $0.075 | $0.30 | 1M |
| text-embedding-004 | $0.025 | N/A | 2K |
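
To turn the table into an estimate for a specific workload, multiply each token count by its per-million rate. The following is a minimal sketch that assumes the rates above are current; `RATES` and `estimate_cost` are illustrative names, not part of any Google SDK.

```python
# Rates copied from the table above, in USD per 1M tokens: (input, output).
# Output is None for the embedding model, which bills input only.
RATES = {
    "Gemini 2.0 Pro": (1.25, 5.00),
    "Gemini 2.0 Flash": (0.075, 0.30),
    "Gemini 2.0 Flash-Lite": (0.01, 0.04),
    "Gemini 1.5 Pro": (1.25, 5.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
    "text-embedding-004": (0.025, None),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Return the estimated cost in USD for a workload on one model."""
    input_rate, output_rate = RATES[model]
    cost = input_tokens / 1_000_000 * input_rate
    if output_rate is not None:
        cost += output_tokens / 1_000_000 * output_rate
    return cost


if __name__ == "__main__":
    # Example: 2M input tokens and 500K output tokens on Gemini 2.0 Flash.
    print(f"${estimate_cost('Gemini 2.0 Flash', 2_000_000, 500_000):.2f}")
```

Keeping the rates in one dictionary means a price change only needs to be edited in one place.
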
Vertex AI vs Gemini API Direct: Key Differences
| Feature | Vertex AI | Gemini API (AI Studio) |
|---|---|---|
| Pricing | Same token rates | Same token rates |
| Free tier | $300 Google Cloud credits | Generous free tier (Flash) |
| Enterprise compliance | Full (GDPR, HIPAA, SOC 2) | Limited |
| Data residency | EU, US, APAC regions | US primarily |
| Model tuning (fine-tuning) | Full support | Limited |
| Batch predictions | 50% discount | 50% discount |
| RAG/grounding | Vertex AI Search integration | Basic |
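
The 50% batch discount only applies to the share of traffic that can tolerate batch processing. A rough sketch of the blended cost follows; `blended_cost` and `batch_fraction` are hypothetical names introduced here, and the sketch assumes the discount applies uniformly to both input and output tokens.

```python
BATCH_DISCOUNT = 0.50  # 50% off, per the table above


def blended_cost(on_demand_cost: float, batch_fraction: float) -> float:
    """Cost in USD after moving `batch_fraction` (0.0 to 1.0) of a workload to batch."""
    if not 0.0 <= batch_fraction <= 1.0:
        raise ValueError("batch_fraction must be between 0 and 1")
    batch_part = on_demand_cost * batch_fraction * (1 - BATCH_DISCOUNT)
    realtime_part = on_demand_cost * (1 - batch_fraction)
    return batch_part + realtime_part


if __name__ == "__main__":
    # A $100/month workload with 60% of requests moved to batch.
    print(f"${blended_cost(100.0, 0.6):.2f}")
```
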
Gemini Flash vs GPT-4o mini — Budget Model Comparison
| Metric | Gemini 2.0 Flash | GPT-4o mini |
|---|---|---|
| Input cost | $0.075/M | $0.15/M |
| Output cost | $0.30/M | $0.60/M |
| Context window | 1M tokens | 128K tokens |
| Multimodal (vision) | Yes (native) | Yes |
| Speed | Very fast | Very fast |
Gemini 2.0 Flash costs half as much as GPT-4o mini on both input and output, and its context window is roughly 8× larger (1M vs 128K tokens). For high-volume, long-context workloads, Gemini Flash is the clear winner on price.
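
The ratios can be checked directly from the table. This is a small sketch that assumes "1M" and "128K" mean 1,000,000 and 128,000 tokens; the dictionary and function names are illustrative.

```python
# Figures from the comparison table above (prices in USD per 1M tokens).
# Assumption: "1M" and "128K" are treated as 1,000,000 and 128,000 tokens.
FLASH = {"input": 0.075, "output": 0.30, "context": 1_000_000}
GPT_4O_MINI = {"input": 0.15, "output": 0.60, "context": 128_000}


def ratios(cheap: dict, other: dict) -> dict:
    """Return how many times pricier `other` is, and how much larger `cheap`'s context is."""
    return {
        "input_price": other["input"] / cheap["input"],
        "output_price": other["output"] / cheap["output"],
        "context": cheap["context"] / other["context"],
    }


if __name__ == "__main__":
    for name, value in ratios(FLASH, GPT_4O_MINI).items():
        print(f"{name}: {value:.2f}x")
```

Under that assumption the context ratio comes out near 7.8, which rounds to the 8× figure quoted above.
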
Google Vertex AI Free Tier
Google offers the following free options:
- $300 free credits for new Google Cloud accounts (90-day expiry)
- Free quota on Gemini Flash-Lite: 1,500 requests/day
- Gemini API (AI Studio) free tier: 15 RPM and 1M tokens/day on Flash, which is ample for development
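
A quick way to see whether a development workload stays inside the AI Studio free tier is to compare it against the two limits quoted above. This is a minimal sketch; `fits_free_tier` is a hypothetical helper, and actual quotas may differ by model or account.

```python
# Free-tier limits quoted above for Flash on the Gemini API (AI Studio).
FREE_RPM = 15
FREE_TOKENS_PER_DAY = 1_000_000


def fits_free_tier(peak_requests_per_minute: int, tokens_per_day: int) -> bool:
    """True if the workload stays within both quoted free-tier limits."""
    return (
        peak_requests_per_minute <= FREE_RPM
        and tokens_per_day <= FREE_TOKENS_PER_DAY
    )


if __name__ == "__main__":
    # A prototype peaking at 10 requests/minute and using 400K tokens/day.
    print(fits_free_tier(10, 400_000))
```
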
Vertex AI Fine-Tuning Costs
Fine-tuning Gemini models on Vertex AI is priced per training token:
- Gemini 1.5 Flash fine-tuning: $8.00 per 1M training tokens
- Serving the fine-tuned model: $3.00 per 1M tokens (vs. the $0.075 base Flash input rate)
- Minimum training dataset: 100 examples
- Typical fine-tune: 10,000–100,000 examples at roughly 1,000 tokens each = $80–800 one-time cost
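
The $80–800 range is consistent with about 1,000 training tokens per example; that per-example figure is inferred from the arithmetic rather than stated by the pricing itself. A sketch of the estimate, with `fine_tune_cost` as an illustrative name:

```python
TRAINING_RATE_PER_MILLION = 8.00  # USD, Gemini 1.5 Flash rate quoted above


def fine_tune_cost(examples: int, tokens_per_example: int = 1_000) -> float:
    """Estimated one-time training cost in USD.

    Assumption: `tokens_per_example` defaults to 1,000, the value implied
    by the $80-800 range for 10,000-100,000 examples.
    """
    total_tokens = examples * tokens_per_example
    return total_tokens / 1_000_000 * TRAINING_RATE_PER_MILLION


if __name__ == "__main__":
    for n in (10_000, 100_000):
        print(f"{n:,} examples: ${fine_tune_cost(n):,.2f}")
```

Longer examples scale the cost linearly, so measuring the real average token count of a dataset is worth doing before budgeting.
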
Real-World Vertex AI Cost Example
Document Processing Pipeline (1M pages/month)
- Average page: 500 tokens input + 200 tokens output
- Total: 500M input + 200M output tokens
- Gemini Flash ($0.075 input / $0.30 output per 1M): $37.50 + $60 = $97.50/month
- GPT-4o ($2.50 input / $10.00 output per 1M): $1,250 + $2,000 = $3,250/month
- Gemini Flash saves $3,152.50/month (97% cheaper)
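
The same arithmetic can be reproduced in a few lines, which makes it easy to substitute a different page volume or token profile. This is a sketch only; the GPT-4o rates of $2.50 and $10.00 per 1M tokens are inferred from the totals above, and all names are illustrative.

```python
PAGES_PER_MONTH = 1_000_000
INPUT_TOKENS_PER_PAGE = 500
OUTPUT_TOKENS_PER_PAGE = 200

# USD per 1M tokens: (input, output).
# Assumption: the GPT-4o rates are inferred from the totals quoted above.
PRICES = {
    "Gemini Flash": (0.075, 0.30),
    "GPT-4o": (2.50, 10.00),
}


def monthly_cost(input_rate: float, output_rate: float,
                 pages: int = PAGES_PER_MONTH) -> float:
    """Monthly cost in USD for processing `pages` pages."""
    input_millions = pages * INPUT_TOKENS_PER_PAGE / 1_000_000
    output_millions = pages * OUTPUT_TOKENS_PER_PAGE / 1_000_000
    return input_millions * input_rate + output_millions * output_rate


if __name__ == "__main__":
    flash = monthly_cost(*PRICES["Gemini Flash"])
    gpt4o = monthly_cost(*PRICES["GPT-4o"])
    savings = gpt4o - flash
    print(f"Gemini Flash: ${flash:,.2f}")
    print(f"GPT-4o: ${gpt4o:,.2f}")
    print(f"Savings: ${savings:,.2f} ({savings / gpt4o:.0%})")
```
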