AI pricing glossary
Every term you need to understand AI pricing. Tokens, context windows, batch discounts, input vs output. Plain definitions, no filler.
Token
The basic unit of AI pricing. Roughly 0.75 words or 4 characters. AI APIs charge separately for input tokens (what you send) and output tokens (what the model generates back).
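The 4-characters-per-token rule of thumb can be sketched as a one-line estimator (the function name is illustrative; real tokenizers vary by model and language, so treat this as a ballpark only):

```python
def estimate_tokens(text: str) -> int:
    """Rough token count, assuming ~4 characters per token."""
    return max(1, round(len(text) / 4))

estimate_tokens("Hello, how are you today?")  # ~6 tokens
```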
Input tokens
Tokens in your prompt, system message, conversation history and any documents you attach. Input tokens are cheaper than output tokens across all major APIs, typically 3 to 5 times cheaper.
Output tokens
Tokens the model generates in its response. Output tokens cost 3 to 5 times more than input tokens. A model generating a long essay costs significantly more than one giving a short answer to the same prompt.
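Because input and output are billed at different rates, a request's cost depends on both counts. A minimal sketch, using placeholder rates in USD per million tokens (not any provider's real prices):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in USD for one request; rates are USD per 1M tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Same 1,000-token prompt, short answer vs long essay:
short = request_cost(1_000, 200, in_rate=1.00, out_rate=4.00)    # $0.0018
essay = request_cost(1_000, 2_000, in_rate=1.00, out_rate=4.00)  # $0.0090
```

With a 4x output premium, the long answer costs five times as much even though the prompt is identical.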
Context window
The maximum number of tokens a model can process in one request, both input and output combined. GPT-4.1 has a 1M token context window. Claude Haiku 4.5 has 200k. A larger context window lets you pass more documents but every token costs money.
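Since input and output share the same window, a request only fits if the prompt plus the output budget stays under the limit. A simple check (the 200k figure matches the Claude Haiku 4.5 window mentioned above; the token counts are illustrative):

```python
def fits_context(input_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """Input and reserved output must fit in the window together."""
    return input_tokens + max_output_tokens <= context_window

fits_context(150_000, 8_000, 200_000)  # True: room to spare
fits_context(195_000, 8_000, 200_000)  # False: prompt crowds out the reply
```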
Batch API
An async processing mode offered by OpenAI and Anthropic that delivers 50% off standard API pricing. You send requests and receive responses within 24 hours rather than instantly. Document processing, classification and content generation all qualify.
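The batch discount is a flat 50% off the standard bill, so the arithmetic is trivial (the $12.40 standard cost below is just an example figure):

```python
BATCH_DISCOUNT = 0.50  # 50% off standard pricing on batch endpoints

def batch_cost(standard_cost_usd: float) -> float:
    """Cost of the same workload run through the Batch API."""
    return standard_cost_usd * (1 - BATCH_DISCOUNT)

batch_cost(12.40)  # $6.20
```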
Per million tokens (1M tokens)
The standard unit for quoting AI API pricing. $1.00 per million input tokens means processing one million words of input costs roughly $1.33. At 20 messages per day with 2,000 tokens each across 22 working days, you use roughly 880,000 tokens per month.
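The words-to-dollars conversion above works out as follows (assuming the ~0.75 words-per-token ratio from the Token entry):

```python
WORDS_PER_TOKEN = 0.75   # 1 token ~ 0.75 words, so 1 word ~ 1.33 tokens
rate_per_m = 1.00        # USD per 1M input tokens

tokens = 1_000_000 / WORDS_PER_TOKEN          # 1M words ~ 1.33M tokens
cost = tokens / 1_000_000 * rate_per_m        # ~ $1.33
```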
Break-even point
The usage volume at which a flat subscription becomes cheaper than pay-per-token API pricing. ChatGPT Plus ($20/month) breaks even against GPT-4.1 mini API at approximately 130 million tokens per month. Most solo users never reach it.
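The break-even volume is just the flat fee divided by a blended per-token rate. A sketch, where the $0.15-per-million blended rate is an illustrative assumption (real blends depend on your input/output mix), chosen to land near the ~130M figure above:

```python
def break_even_tokens(monthly_fee: float, blended_rate_per_m: float) -> float:
    """Monthly token volume where a flat fee matches API pricing."""
    return monthly_fee / blended_rate_per_m * 1_000_000

break_even_tokens(20.00, 0.15)  # ~133M tokens/month
```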
Rate limit
The maximum number of requests or tokens you can send per minute or per day. Hitting a rate limit costs nothing extra: over-limit requests are queued or rejected, not billed.
Prompt caching
A feature on Anthropic and some other APIs that stores repeated prompt prefixes and charges significantly less for cached tokens on subsequent requests. Useful for applications that reuse the same system prompt across many calls.
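The saving comes from billing the cached prefix at a fraction of the normal input rate. A sketch with an illustrative 10x cache-read discount (actual multipliers and cache-write surcharges vary by provider):

```python
def cached_request_cost(prefix_tokens: int, new_tokens: int,
                        in_rate: float, cache_multiplier: float = 0.1) -> float:
    """Input cost in USD when the prompt prefix is served from cache."""
    billed = prefix_tokens * cache_multiplier + new_tokens
    return billed / 1_000_000 * in_rate

# 5,000-token system prompt reused, 500 fresh tokens per call:
cached_request_cost(5_000, 500, in_rate=1.00)  # $0.0010 vs $0.0055 uncached
```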
Common questions
How many tokens is a typical ChatGPT conversation?
A typical message and response exchange is 500 to 2,000 tokens. At 20 messages per day and 2,000 tokens each across 22 working days, that is roughly 880,000 tokens per month, well under 1 million.
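The monthly estimate above is straightforward multiplication:

```python
msgs_per_day = 20
tokens_per_msg = 2_000
working_days = 22

monthly_tokens = msgs_per_day * tokens_per_msg * working_days  # 880,000
```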
Do subscriptions like ChatGPT Plus use tokens?
Yes, but you are not billed per token. You pay a flat rate regardless of how many tokens you use. The subscription only makes financial sense if you consistently use enough tokens to justify the flat fee.
Prices monitored continuously. All figures in USD.