AI Chatbot Cost Per Message 2026: What Each Conversation Actually Costs

The Hidden Cost of Chat History

Every chatbot message costs more than just the current exchange — you pay for the entire conversation history resent with each turn. Costs grow with each turn:

Turn 1: 350 input (system + user) + 200 output = 550 tokens
Turn 2: 700 input (history added) + 200 output = 900 tokens
Turn 5: 1,750 input + 200 output = 1,950 tokens
Turn 10: 3,500 input + 200 output = 3,700 tokens

A 10-turn conversation = ~21,250 total tokens — not the ~5,500 you'd expect from a single-turn estimate.

Full Conversation Cost by Model — 2026

Assuming: 200-token system prompt, 150-token user messages, 200-token AI responses.

Model	Turn 1	Turn 5	Full 10-turn	1,000 convos/mo
Gemini 2.5 Flash-Lite	$0.000115	$0.000255	$0.0027	$2.70
Mistral Small 3.2	$0.000095	$0.000205	$0.0022	$2.20
GPT-5.4 nano	$0.000320	$0.000600	$0.0064	$6.40
Claude Haiku 4.5	$0.001350	$0.002750	$0.0293	$29.30
Claude Sonnet 4.6	$0.004050	$0.008250	$0.0878	$87.80

10-turn calculation includes cumulative history: 19,250 total input + 2,000 total output tokens per conversation.

Customer Support Cost at Scale

For a company handling 50,000 support conversations per month (5-turn average):

Model	Cost/conversation	50K convos/mo	vs Human agents
Gemini 2.5 Flash-Lite	$0.000925	$46/mo	99.95% cheaper than humans
GPT-5.4 nano	$0.0023	$115/mo	99.9% cheaper than humans
Claude Haiku 4.5	$0.01025	$512/mo	99.5% cheaper than humans
Claude Sonnet 4.6	$0.03075	$1,538/mo	98.5% cheaper than humans
Human agents (for reference)	$2–5	$100K–$250K/mo	Baseline

3 Ways to Reduce Chatbot Costs

1. Truncate Conversation History

Instead of sending the full conversation history, send only the last 3–5 turns. For most chatbots, this is sufficient context. This caps costs regardless of conversation length and prevents unbounded cost growth.

2. Use Prompt Caching (Claude)

If your system prompt is large (500+ tokens), Anthropic's prompt caching reduces that cost by 90%. Claude Haiku 4.5 cache reads cost $0.10/M — cheaper than GPT-5.4 nano's $0.20/M standard input price. For high-volume chatbots, caching the system prompt is the single biggest cost lever on Anthropic.

3. Route by Complexity

Use Gemini 2.5 Flash-Lite ($0.10/M) or GPT-5.4 nano ($0.20/M) for simple questions (typically 70–80% of support traffic). Escalate to Claude Haiku 4.5 or Sonnet 4.6 only for complex cases. This reduces average cost by 60–80% while maintaining quality for hard cases.

Chatbot Cost Formula

Monthly chatbot cost = Conversations × Average turns × (Input tokens/turn × Input price + Output tokens/turn × Output price) / 1,000,000

Example: 10,000 conversations, 5 turns, 600 input + 200 output per turn, GPT-5.4 nano ($0.20/$1.25):

= 10,000 × 5 × (600 × $0.20 + 200 × $1.25) / 1,000,000 = 10,000 × 5 × $0.00037 = $18.50/month

AI Chatbot Cost Per Message 2026:What Each Conversation Actually Costs