LLM Cost Per Query 2026: What Does One AI Request Actually Cost?
The true cost of a single AI API request across all major production models in 2026. See how per-query costs scale from 1 request to 1 million, and find the break-even point for your product. Last verified: 2026-04-01.
Cost Per Query: Full Table — Production Models Only
The table assumes a typical conversational query of 500 input tokens (your message plus the system prompt) and 500 output tokens (the model's response), which is roughly 375 words in and 375 words out.
| Model | Cost / Query | 1K queries/day | 10K queries/day | 100K queries/day |
|---|---|---|---|---|
| Mistral Small 3.2 | $0.000200 | $6/mo | $60/mo | $600/mo |
| Gemini 2.5 Flash-Lite | $0.000250 | $7.50/mo | $75/mo | $750/mo |
| GPT-5.4 nano | $0.000725 | $21.75/mo | $217.50/mo | $2,175/mo |
| Gemini 2.5 Flash | $0.00140 | $42/mo | $420/mo | $4,200/mo |
| Claude Haiku 4.5 | $0.00300 | $90/mo | $900/mo | $9,000/mo |
| GPT-5.4 | $0.00875 | $262.50/mo | $2,625/mo | $26,250/mo |
| Claude Sonnet 4.6 | $0.00900 | $270/mo | $2,700/mo | $27,000/mo |
| Claude Opus 4.6 | $0.01500 | $450/mo | $4,500/mo | $45,000/mo |
1K queries/day = ~30K/month. Assumes 500 input + 500 output tokens per query.
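The monthly columns follow from the per-query cost by simple scaling. Here is a minimal sketch of that arithmetic, using the 30-day month the table assumes. The function name `monthly_cost` is illustrative, and the per-query figures are copied from three rows of the table.

```python
DAYS_PER_MONTH = 30  # the table's convention: 1K queries/day = ~30K/month


def monthly_cost(cost_per_query: float, queries_per_day: int) -> float:
    """Scale a per-query cost (USD) to a 30-day monthly bill."""
    return cost_per_query * queries_per_day * DAYS_PER_MONTH


# Per-query costs copied from the table above (500 in + 500 out tokens).
per_query = {
    "Mistral Small 3.2": 0.000200,
    "Claude Haiku 4.5": 0.00300,
    "Claude Opus 4.6": 0.01500,
}

for model, cost in per_query.items():
    row = [monthly_cost(cost, volume) for volume in (1_000, 10_000, 100_000)]
    print(model, [f"${m:,.2f}/mo" for m in row])
```

Because the relationship is linear, a 10x increase in traffic is a 10x increase in spend. Any volume discounts or caching would need to be modeled separately.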
How Query Size Affects Cost
The 500+500 token example is for a simple chat message. Real-world query costs vary dramatically based on use case:
| Use Case | Typical Tokens (in / out) | Gemini 2.5 Flash-Lite cost | Claude Sonnet 4.6 cost |
|---|---|---|---|
| Simple FAQ answer | 200 in / 150 out | $0.000080 | $0.00285 |
| Customer support chat turn | 800 in / 300 out | $0.000200 | $0.00690 |
| Code explanation (1 file) | 3,000 in / 800 out | $0.000620 | $0.02100 |
| Document summarization (10 pages) | 8,000 in / 500 out | $0.001000 | $0.03150 |
| Long document analysis (50 pages) | 40,000 in / 2,000 out | $0.004800 | $0.15000 |
| Full codebase review (1K files) | 100,000 in / 5,000 out | $0.012000 | $0.37500 |
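The rows above can be reproduced from input and output token counts with a short function. This is a sketch under one stated assumption: the per-million-token rates in `RATES` are not quoted in this article but are back-solved from the table's own rows, so check them against the providers' current price lists before relying on them.

```python
# USD per 1M tokens as (input, output). Assumption: rates are inferred from
# the table rows above, not taken from an official price list.
RATES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request for the given token counts."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# A few use cases from the table, as (input_tokens, output_tokens).
use_cases = {
    "Simple FAQ answer": (200, 150),
    "Document summarization (10 pages)": (8_000, 500),
    "Long document analysis (50 pages)": (40_000, 2_000),
}

for name, (tokens_in, tokens_out) in use_cases.items():
    costs = {m: query_cost(m, tokens_in, tokens_out) for m in RATES}
    print(name, {m: f"${c:.6f}" for m, c in costs.items()})
```

Output tokens are priced several times higher than input tokens at these rates, so a response-heavy workload costs more than its total token count alone suggests.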
The AI Cost Ceiling: When Does AI Become Unsustainable?
AI becomes expensive when your cost-per-query exceeds the revenue or value generated per query. Key benchmarks:
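- SaaS product ($10/month subscription): Spending 5–10% of revenue on inference leaves $0.50–1.00 per user per month, so ~$0.05–0.10 per query is affordable only at roughly 10 queries per user per month; heavier usage lowers the ceiling proportionally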
- Free-tier product (ad-supported): Must keep cost below ~$0.001 per query to be viable — use Gemini 2.5 Flash-Lite or Mistral Small 3.2
- Enterprise tool ($500/month): Can afford $2–5 per complex query — Claude Sonnet 4.6 is reasonable
If your total inference cost per user exceeds 10–15% of your revenue per user, you need to optimize model selection or reprice.
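The benchmarks above reduce to one calculation: the inference budget per user divided by how many queries that user sends. The sketch below makes that explicit. The function names and the figure of 200 queries per user per month are hypothetical, chosen only to illustrate the check.

```python
def max_cost_per_query(revenue_per_user: float,
                       inference_share: float,
                       queries_per_user: int) -> float:
    """Highest per-query cost that keeps inference within its revenue share."""
    budget = revenue_per_user * inference_share  # USD per user per month
    return budget / queries_per_user


def is_sustainable(cost_per_query: float,
                   revenue_per_user: float,
                   queries_per_user: int,
                   inference_share: float = 0.10) -> bool:
    """True if the model's per-query cost fits under the ceiling."""
    ceiling = max_cost_per_query(revenue_per_user, inference_share,
                                 queries_per_user)
    return cost_per_query <= ceiling


# Hypothetical product: $10/month plan, 200 queries per user per month,
# 10% of revenue allotted to inference.
ceiling = max_cost_per_query(10.0, 0.10, 200)
print(f"ceiling: ${ceiling:.4f} per query")
print("Claude Haiku 4.5 fits:", is_sustainable(0.00300, 10.0, 200))
print("Claude Sonnet 4.6 fits:", is_sustainable(0.00900, 10.0, 200))
```

The ceiling depends as much on usage per user as on price, so it is worth measuring queries per user before choosing a model.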
How to Get Your Real Per-Query Cost
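Log token counts for a representative sample of real requests. For each request, calculate: cost = (input tokens × input price per 1M tokens + output tokens × output price per 1M tokens) ÷ 1,000,000. Then average the results to get your real per-query cost.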
Observability tools like LangSmith, Helicone, and Braintrust calculate this automatically from your API logs. Set up monitoring before you scale — surprises at 10M queries/month are expensive.
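If you want to run the calculation by hand before adopting a tool, a minimal sketch looks like this. The logged sample is hypothetical, and the rates (USD per 1M tokens) are assumed values for illustration; substitute your own model's prices and your own logs.

```python
from statistics import mean


def query_cost(input_tokens: int, output_tokens: int,
               in_rate: float, out_rate: float) -> float:
    """USD cost of one request; rates are USD per 1M tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical sample of logged (input_tokens, output_tokens) pairs.
sample = [(820, 310), (450, 120), (3_100, 790), (640, 260)]

# Assumed rates for illustration only; use your provider's current prices.
IN_RATE, OUT_RATE = 3.00, 15.00

costs = [query_cost(i, o, IN_RATE, OUT_RATE) for i, o in sample]
avg_cost = mean(costs)

print(f"average: ${avg_cost:.5f} per query")
print(f"most expensive in sample: ${max(costs):.5f}")
```

Looking at the most expensive request alongside the average is useful because a few long-context requests can dominate the bill.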