
Google Vertex AI Pricing 2026: Gemini 2.5 Flash, Pro & Enterprise Costs

A complete Google Vertex AI pricing guide for 2026, covering Gemini 2.5 Flash-Lite, Flash, and Pro: how Vertex AI compares with the direct Gemini API, which compliance features it adds, and when to choose each path. Last verified: 2026-04-01.

10 min read · Updated April 2026

Gemini 2.0 shutdown notice: Gemini 2.0 Flash and Gemini 2.0 Flash-Lite are scheduled for shutdown on 2026-06-01. The current production models on Vertex AI are the Gemini 2.5 family. This page reflects Gemini 2.5 pricing.

Vertex AI Gemini 2.5 Pricing at a Glance

  • Flash-Lite input (cheapest): $0.10 per 1M tokens
  • Gemini 2.5 Flash input: $0.30 per 1M tokens
  • Gemini 2.5 Pro input: $1.25 per 1M tokens
  • Context window (all Gemini 2.5 tiers): 1M tokens

Gemini 2.5 on Vertex AI — Current Model Pricing

| Model | Input / 1M tokens | Output / 1M tokens | Context window | Best for |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1M tokens | High-volume classification, chatbots, simple tasks |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M tokens | Mid-range reasoning, long documents, coding |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M tokens | Complex reasoning, full codebase analysis, research |
| text-embedding-005 | $0.025 | N/A | 2K tokens | Semantic search, RAG ingestion |

A key Gemini 2.5 advantage: the 1M-token context window is available at every tier, including the cheapest, Flash-Lite at $0.10 per 1M input tokens. By comparison, GPT-5.4 nano and mini are capped at 128K tokens.

Vertex AI vs Gemini API Direct: Key Differences

| Feature | Vertex AI | Gemini API (AI Studio) |
| --- | --- | --- |
| Pricing | Same token rates | Same token rates |
| Free tier | $300 Google Cloud credits | Generous free tier (Flash-Lite) |
| Enterprise compliance (GDPR, HIPAA, SOC 2) | Full support | Limited |
| Data residency | EU, US, APAC regions | US primarily |
| Fine-tuning (supervised) | Full support | Limited |
| Batch predictions (50% off) | Yes | Yes |
| Google Cloud integration (BigQuery, GCS) | Native | Not available |
| Private networking (VPC) | VPC Service Controls | Not available |

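Token rates are identical on both paths, so the choice comes down to features. Choose the direct Gemini API for development and for cost-sensitive production that benefits from its free tier. Choose Vertex AI when you need enterprise compliance, data residency, or deep GCP integration.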

Gemini 2.5 vs OpenAI GPT-5.4 on Price

| Tier | Google model | Google input / 1M | OpenAI model | OpenAI input / 1M | Price gap |
| --- | --- | --- | --- | --- | --- |
| Budget | Gemini 2.5 Flash-Lite | $0.10 | GPT-5.4 nano | $0.20 | Google 2× cheaper |
| Mid-range | Gemini 2.5 Flash | $0.30 | GPT-5.4 mini | $0.75 | Google 2.5× cheaper |
| Premium | Gemini 2.5 Pro | $1.25 | GPT-5.4 | $2.50 | Google 2× cheaper |
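
The "Price gap" column is simply the OpenAI input price divided by the Google input price. A minimal sketch of that arithmetic follows; the dictionary layout and the `price_gap` helper are illustrative names, not part of any SDK, and the comparison covers input tokens only.

```python
# Input prices in USD per 1M tokens, copied from the comparison table above.
INPUT_PRICES = {
    "Budget": {"google": 0.10, "openai": 0.20},
    "Mid-range": {"google": 0.30, "openai": 0.75},
    "Premium": {"google": 1.25, "openai": 2.50},
}


def price_gap(tier: str) -> float:
    """Return how many times cheaper the Google input price is for a tier."""
    prices = INPUT_PRICES[tier]
    return round(prices["openai"] / prices["google"], 2)


for tier in INPUT_PRICES:
    print(f"{tier}: Google is {price_gap(tier)}x cheaper on input")
```

Output tokens are priced separately, so the gap on a full workload depends on its input-to-output ratio.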

Real-World Vertex AI Cost Example

Document Processing Pipeline (1M pages/month)

  • Average page: 500 tokens input + 200 tokens output
  • Total: 500M input + 200M output tokens
  • Gemini 2.5 Flash-Lite: $50 input + $80 output = $130/month
  • Gemini 2.5 Flash: $150 input + $500 output = $650/month
  • GPT-5.4 nano (OpenAI): $100 input + $250 output = $350/month
  • GPT-5.4 (OpenAI): $1,250 input + $3,000 output = $4,250/month
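
The figures above can be reproduced with a few lines of Python. This is a rough sketch, not a billing tool: the dictionary keys are informal labels rather than official model IDs, and the two OpenAI output rates are an assumption inferred from the example totals ($250 and $3,000 for 200M output tokens), since this page only lists OpenAI input prices.

```python
# (input, output) rates in USD per 1M tokens.
# Gemini rates come from the pricing table on this page.
# OpenAI output rates are inferred from the example totals (assumption).
RATES = {
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.5-flash": (0.30, 2.50),
    "gpt-5.4-nano": (0.20, 1.25),
    "gpt-5.4": (2.50, 15.00),
}


def monthly_cost(model: str, pages: int,
                 in_tokens_per_page: int = 500,
                 out_tokens_per_page: int = 200) -> float:
    """Estimate monthly spend in USD for a page-processing workload."""
    in_rate, out_rate = RATES[model]
    input_cost = pages * in_tokens_per_page / 1_000_000 * in_rate
    output_cost = pages * out_tokens_per_page / 1_000_000 * out_rate
    return round(input_cost + output_cost, 2)


for name in RATES:
    print(f"{name}: ${monthly_cost(name, pages=1_000_000):,.2f}/month")
```

Changing `pages` or the per-page token counts shows how quickly output-heavy workloads shift the total, because output tokens cost several times more than input tokens on every model listed.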

Vertex AI Fine-Tuning Costs

  • Gemini 2.5 Flash fine-tuning: $8.00 per 1M training tokens
  • Fine-tuned model inference: standard Gemini 2.5 Flash pricing applies
  • Minimum training dataset: 100 examples
  • Typical fine-tune: 10,000 examples at roughly 500 tokens each ≈ 5M training tokens ≈ $40 one-time cost
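
The one-time training cost scales linearly with training tokens. The sketch below assumes about 500 tokens per example (the average implied by 10,000 examples totalling roughly 5M tokens) and a single pass over the data; `fine_tune_cost` is a hypothetical helper, and how actual billing counts training tokens should be checked against Google's pricing page.

```python
def fine_tune_cost(examples: int,
                   tokens_per_example: int = 500,
                   rate_per_million: float = 8.00) -> float:
    """Estimate the one-time fine-tuning cost in USD.

    tokens_per_example is an assumed average; rate_per_million is the
    Gemini 2.5 Flash training rate quoted on this page.
    """
    training_tokens = examples * tokens_per_example
    return round(training_tokens / 1_000_000 * rate_per_million, 2)


print(fine_tune_cost(10_000))  # the "typical" case from the list above
print(fine_tune_cost(100))     # the minimum dataset size
```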

When to Choose Vertex AI

  • You're already on Google Cloud — consolidated billing, existing credits, no new vendor
  • HIPAA/GDPR compliance required — Vertex AI is a Google Cloud HIPAA-eligible service
  • Data needs to stay in EU or APAC — Vertex supports regional data residency
  • You need 1M context at the lowest cost — Flash-Lite at $0.10/M with 1M context; no OpenAI equivalent
  • You want batch processing discounts — 50% off for batch predictions via Vertex
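
To see what the batch discount means in practice, the sketch below applies it to a standard-rate monthly bill. It assumes the 50% discount applies uniformly to both input and output tokens, which this page does not spell out; `batch_cost` is an illustrative helper, not an API call.

```python
def batch_cost(standard_cost: float, discount: float = 0.50) -> float:
    """Apply a flat batch discount to a standard-rate cost in USD.

    Assumes the discount covers the whole bill (input and output tokens).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(standard_cost * (1 - discount), 2)


# Standard-rate totals from the document processing example on this page.
print(batch_cost(130.0))  # Gemini 2.5 Flash-Lite
print(batch_cost(650.0))  # Gemini 2.5 Flash
```

Batch pricing only helps when the workload can tolerate asynchronous processing, so it suits pipelines like bulk document processing better than interactive chat.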
