
Token Counter

Paste any text to instantly count tokens for GPT-4o, Claude, Gemini, and other AI models, and see cost estimates based on current API pricing.


Token Count by Model

The counter reports a separate token count for each model family and its tokenizer:

  • GPT-4o / GPT-5.4 · cl100k_base (tiktoken)
  • Claude (all models) · Anthropic BPE
  • Gemini 2.5 · SentencePiece
  • Mistral models · SentencePiece
  • Llama 3 / open source · tiktoken / custom
Note: Token counts are estimates using standard BPE rules. Actual counts may vary ±5% depending on special characters and language.

What is a Token?

AI language models don't process text character by character or word by word — they work with tokens. A token is a chunk of text that the model treats as a single unit.

In English, 1 token ≈ 4 characters or ¾ of a word. Common short words like "the", "is", "of" are each 1 token. Longer or less common words are split into 2–3 tokens.

  • "Hello" → 1 token
  • "calculator" → 2 tokens (cal-culator)
  • "ChatGPT" → 2 tokens
  • "1000" → 1 token
  • A 750-word essay → ~1,000 tokens
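These rules of thumb are enough for a quick estimator. A minimal sketch in Python, using the ~4 characters per token heuristic above (the function name is ours, not a real API):

```python
def estimate_tokens(text: str) -> int:
    # Rule of thumb from above: ~1 token per 4 characters of English text.
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello"))      # 1, matching the example above
print(estimate_tokens("a" * 4000))   # 1000
```

An estimator like this is fine for budgeting; for billing-accurate counts you need the model's actual tokenizer.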

Code and non-English text tokenize differently — Python code is roughly 1 token per 3–4 characters, while Chinese/Japanese text can be 1 token per 1–2 characters.

How many tokens is 1,000 words?

About 1,333 tokens. The standard approximation is 1 token ≈ 0.75 words, so 1,000 words ≈ 1,333 tokens and 1,000 tokens ≈ 750 words. A typical A4 page of English text (~500 words) is around 667 tokens.
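The arithmetic above can be checked with a tiny conversion helper (a sketch; the 0.75 ratio is the approximation from the text, not an exact constant):

```python
WORDS_PER_TOKEN = 0.75  # standard English approximation

def words_to_tokens(words: int) -> int:
    return round(words / WORDS_PER_TOKEN)

def tokens_to_words(tokens: int) -> int:
    return round(tokens * WORDS_PER_TOKEN)

print(words_to_tokens(1000))  # 1333
print(words_to_tokens(500))   # 667
print(tokens_to_words(1000))  # 750
```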

Why do different AI models give different token counts?

Each model uses a different tokenizer (vocabulary). GPT models use tiktoken (cl100k_base), Gemini uses SentencePiece, and Llama models use their own BPE. For English text, counts are usually within 5–10% of each other. Non-English text and code can diverge more.
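Why vocabularies change the count can be seen with a toy greedy segmenter. This is a simplification of real BPE, and the two vocabularies are made up for illustration:

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    # Greedy longest-match segmentation: a toy stand-in for BPE merges.
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to 1 char
            i += 1
    return tokens

print(greedy_tokenize("calculator", {"calc", "ulator"}))  # ['calc', 'ulator'] → 2 tokens
print(greedy_tokenize("calculator", {"calculator"}))      # ['calculator'] → 1 token
```

The same string costs two tokens under one vocabulary and one under the other, which is exactly why the counters above disagree.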

How do I reduce token usage to save money?

Key strategies:

  • Remove unnecessary whitespace and repetition from prompts.
  • Use shorter system prompts.
  • Summarize long conversation history instead of sending the full context.
  • For structured data, use concise JSON keys.
  • Use prompt caching for repeated context sections.
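The whitespace and JSON strategies are easy to automate. A minimal sketch (function names are ours; compacting JSON separators is a companion to using concise keys):

```python
import json
import re

def compact_prompt(prompt: str) -> str:
    # Collapse runs of whitespace into single spaces to save tokens.
    return re.sub(r"\s+", " ", prompt).strip()

def compact_json(data) -> str:
    # Drop the spaces json.dumps inserts after separators by default.
    return json.dumps(data, separators=(",", ":"))

print(compact_prompt("Summarize:\n\n    the   text below"))  # Summarize: the text below
print(compact_json({"name": "Ada", "age": 36}))              # {"name":"Ada","age":36}
```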

What is context window and why does it matter?

The context window is the maximum number of tokens a model can process in a single request (including both your input and its output). GPT-4o has 128K, Claude Opus 4.6 has 1M, Gemini 2.5 Pro has 2M. If your text exceeds the limit, you'll need to chunk it. The context bar above shows how much of the window your text uses.
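Chunking to fit a context window can be sketched with the chars-per-token heuristic. This is a naive splitter; real chunkers split on sentence or paragraph boundaries and reserve room for the model's output:

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4) -> list[str]:
    # In practice, subtract the expected completion length from
    # max_tokens before chunking, since the window covers input + output.
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# ~250K tokens of text against a 128K-token window needs 2 chunks:
chunks = chunk_text("x" * 1_000_000, max_tokens=128_000)
print(len(chunks))  # 2
```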

Cost Estimate

Cost estimates use GPT-4o API pricing: $2.50 per 1M input tokens and $10 per 1M output tokens. The at-scale figures project your text's cost as input across 100, 1,000, and 10,000 requests per day, billed monthly.
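The estimate follows directly from per-million-token pricing. A sketch using the GPT-4o rates shown above ($2.50/1M in, $10/1M out):

```python
def cost_usd(tokens: int, price_per_million: float) -> float:
    return tokens / 1_000_000 * price_per_million

prompt_tokens = 1_000
print(cost_usd(prompt_tokens, 2.50))   # 0.0025  → as input, per request
print(cost_usd(prompt_tokens, 10.0))   # 0.01    → as output, per request

# At scale: 1,000 requests/day for 30 days, as input:
print(cost_usd(prompt_tokens, 2.50) * 1_000 * 30)  # 75.0 ($/month)
```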