LLM Cost Per Query 2026: What Does One AI Request Actually Cost?

The true cost of a single AI API request across all major production models in 2026. See how per-query costs scale from 1 request to 1 million, and find the break-even point for your product. Last verified: 2026-04-01.

Updated April 2026

Cost Per Single Query (500 input + 500 output tokens)

  • Mistral Small 3.2: $0.00020
  • GPT-5.4 nano: $0.000725
  • GPT-5.4: $0.00875
  • Claude Opus 4.6: $0.015

Cost Per Query: Full Table — Production Models Only

Assuming a typical conversational query: 500 input tokens (your message + system prompt) and 500 output tokens (the AI's response). This equals ~375 words in + 375 words out.

| Model | Cost / Query | 1K queries/day | 10K queries/day | 100K queries/day |
|---|---|---|---|---|
| Mistral Small 3.2 | $0.000200 | $6/mo | $60/mo | $600/mo |
| Gemini 2.5 Flash-Lite | $0.000250 | $7.50/mo | $75/mo | $750/mo |
| GPT-5.4 nano | $0.000725 | $21.75/mo | $217.50/mo | $2,175/mo |
| Gemini 2.5 Flash | $0.00140 | $42/mo | $420/mo | $4,200/mo |
| Claude Haiku 4.5 | $0.00300 | $90/mo | $900/mo | $9,000/mo |
| GPT-5.4 | $0.00875 | $262.50/mo | $2,625/mo | $26,250/mo |
| Claude Sonnet 4.6 | $0.00900 | $270/mo | $2,700/mo | $27,000/mo |
| Claude Opus 4.6 | $0.01500 | $450/mo | $4,500/mo | $45,000/mo |

1K queries/day = ~30K/month. Assumes 500 input + 500 output tokens per query.
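
The monthly columns follow directly from the per-query cost and the 30-day month stated in the note above. A minimal sketch of that arithmetic, using three per-query figures copied from the table (the function and constant names are illustrative, not part of any provider's tooling):

```python
from decimal import Decimal

DAYS_PER_MONTH = 30  # the table's convention: 1K queries/day = ~30K/month

# Per-query costs copied from the table (500 input + 500 output tokens).
COST_PER_QUERY = {
    "Mistral Small 3.2": Decimal("0.000200"),
    "GPT-5.4 nano": Decimal("0.000725"),
    "Claude Opus 4.6": Decimal("0.01500"),
}


def monthly_cost(cost_per_query: Decimal, queries_per_day: int) -> Decimal:
    """Monthly spend in USD for a fixed per-query cost and daily volume."""
    return cost_per_query * queries_per_day * DAYS_PER_MONTH


for model, cost in COST_PER_QUERY.items():
    row = [monthly_cost(cost, volume) for volume in (1_000, 10_000, 100_000)]
    print(model, [f"${amount:,.2f}" for amount in row])
```

Decimal is used here only so that the printed dollar amounts match the table without floating-point rounding noise.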

How Query Size Affects Cost

The 500+500 token example is for a simple chat message. Real-world query costs vary dramatically based on use case:

| Use Case | Typical Tokens | Gemini 2.5 Flash-Lite cost | Claude Sonnet 4.6 cost |
|---|---|---|---|
| Simple FAQ answer | 200 in / 150 out | $0.000080 | $0.00285 |
| Customer support chat turn | 800 in / 300 out | $0.000200 | $0.00690 |
| Code explanation (1 file) | 3,000 in / 800 out | $0.000620 | $0.02100 |
| Document summarization (10 pages) | 8,000 in / 500 out | $0.001000 | $0.03150 |
| Long document analysis (50 pages) | 40,000 in / 2,000 out | $0.004800 | $0.15000 |
| Full codebase review (1K files) | 100,000 in / 5,000 out | $0.012000 | $0.37500 |
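
The figures in this table can be reproduced from per-token rates. The sketch below assumes $0.10 / $0.40 per 1M input/output tokens for Gemini 2.5 Flash-Lite and $3.00 / $15.00 for Claude Sonnet 4.6. These rates are not stated in the article; they are assumptions chosen because they match every row above, so check current provider pricing before relying on them.

```python
# USD per 1M tokens as (input, output). ASSUMED rates: they reproduce the
# table's figures but are not quoted in the article itself.
PRICES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Claude Sonnet 4.6": (3.00, 15.00),
}

# (input tokens, output tokens) per use case, copied from the table.
USE_CASES = {
    "Simple FAQ answer": (200, 150),
    "Customer support chat turn": (800, 300),
    "Document summarization (10 pages)": (8_000, 500),
    "Long document analysis (50 pages)": (40_000, 2_000),
}


def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: tokens times the per-million rate, each side."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for name, (tokens_in, tokens_out) in USE_CASES.items():
    flash_lite = query_cost("Gemini 2.5 Flash-Lite", tokens_in, tokens_out)
    sonnet = query_cost("Claude Sonnet 4.6", tokens_in, tokens_out)
    print(f"{name}: ${flash_lite:.6f} vs ${sonnet:.5f}")
```

Because output tokens are priced several times higher than input tokens under these assumed rates, a short prompt with a long answer can cost more than a long prompt with a short answer.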

The AI Cost Ceiling: When Does AI Become Unsustainable?

AI becomes expensive when your cost-per-query exceeds the revenue or value generated per query. Key benchmarks:

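  • SaaS product ($10/month subscription): Spending 5–10% of revenue on inference leaves $0.50–1.00 per user per month, which works out to ~$0.05–0.10 per query only if each user sends about 10 queries a month; heavier usage shrinks the per-query budget proportionally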
  • Free-tier product (ad-supported): Must keep cost below ~$0.001 per query to be viable — use Gemini 2.5 Flash-Lite or Mistral Small 3.2
  • Enterprise tool ($500/month): Can afford $2–5 per complex query — Claude Sonnet 4.6 is reasonable

If your total inference cost per user (cost per query × queries per user) exceeds 10–15% of your revenue per user, you need to optimize model selection or reprice.
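
One way to turn this rule of thumb into a check is to derive a per-query ceiling from revenue per user, the share of revenue you are willing to spend on inference, and how many queries a user sends. The sketch below is illustrative only: the function names and the 300 queries/month usage figure are assumptions, not values from this article.

```python
def max_cost_per_query(revenue_per_user: float,
                       inference_share: float,
                       queries_per_user: int) -> float:
    """Largest per-query cost that keeps inference within the given revenue share."""
    if queries_per_user <= 0:
        raise ValueError("queries_per_user must be positive")
    return revenue_per_user * inference_share / queries_per_user


def is_sustainable(cost_per_query: float,
                   revenue_per_user: float,
                   queries_per_user: int,
                   inference_share: float = 0.10) -> bool:
    """True if the per-query cost fits under the ceiling for this user profile."""
    ceiling = max_cost_per_query(revenue_per_user, inference_share, queries_per_user)
    return cost_per_query <= ceiling


# Hypothetical profile: $10/month subscriber sending 300 queries a month,
# with 10% of revenue budgeted for inference.
ceiling = max_cost_per_query(10.0, 0.10, 300)
print(f"ceiling: ${ceiling:.5f} per query")
print("Claude Haiku 4.5 fits:", is_sustainable(0.00300, 10.0, 300))
print("Claude Sonnet 4.6 fits:", is_sustainable(0.00900, 10.0, 300))
```

The per-query costs passed in are the 500 + 500 token figures from the full table; substitute your own measured averages for a real decision.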

How to Get Your Real Per-Query Cost

Log token counts for a representative sample of real requests. Calculate:

avg_input_tokens = sum(input_tokens) / total_requests
avg_output_tokens = sum(output_tokens) / total_requests
cost_per_query = (avg_input_tokens / 1_000_000) × input_price
               + (avg_output_tokens / 1_000_000) × output_price

Here input_price and output_price are the model's rates in USD per 1M tokens.

Observability tools like LangSmith, Helicone, and Braintrust calculate this automatically from your API logs. Set up monitoring before you scale — surprises at 10M queries/month are expensive.
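
If you want to run the calculation yourself before adopting a tool, a minimal sketch might look like the following. The `RequestLog` record is a hypothetical structure standing in for whatever your logging produces, and the $3.00 / $15.00 per-1M-token rates in the example are assumed values for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    """Token counts for one logged request (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int


def cost_per_query(logs: list[RequestLog],
                   input_price: float,
                   output_price: float) -> float:
    """Average USD cost per request; prices are USD per 1M tokens."""
    if not logs:
        raise ValueError("need at least one logged request")
    avg_input_tokens = sum(r.input_tokens for r in logs) / len(logs)
    avg_output_tokens = sum(r.output_tokens for r in logs) / len(logs)
    return ((avg_input_tokens / 1_000_000) * input_price
            + (avg_output_tokens / 1_000_000) * output_price)


# Example sample: three requests of very different sizes.
sample = [RequestLog(800, 300), RequestLog(200, 150), RequestLog(3_000, 800)]
print(f"${cost_per_query(sample, input_price=3.00, output_price=15.00):.5f} per query")
```

An average hides the tail, so it may also be worth logging the largest requests separately: a handful of long-context calls can dominate the bill.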
