LLM API cost planning
Turn a token count into estimated LLM API cost before choosing a model or shipping a prompt-heavy workflow.
A token cost calculator converts token counts into estimated API spend. After counting tokens from your prompt, use model input and output rates to estimate how much a request may cost. This helps teams compare providers, set product limits, size a batch job, or decide whether a prompt should be shortened before it reaches a paid model.
Request cost = (input tokens / 1,000,000 × input price) + (output tokens / 1,000,000 × output price), where each price is the provider's rate per 1M tokens. Cached-input rates apply only when the provider discounts reused prompt content.
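The formula above can be sketched in a few lines of Python. This is a minimal illustration, not any provider's billing logic: the function name `request_cost` and the rates used (3.00 per 1M input tokens, 15.00 per 1M output tokens) are hypothetical placeholders, so substitute the numbers from the provider's official pricing page.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Estimate the cost of one request, in the same currency as the prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_1m
    output_cost = output_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost


# Hypothetical rates, for illustration only.
estimate = request_cost(input_tokens=12_000, output_tokens=800,
                        input_price_per_1m=3.00, output_price_per_1m=15.00)
print(f"{estimate:.4f}")  # prints 0.0480
```

In this example the 800 output tokens account for a quarter of the total even though they are under 7% of the tokens, which is why output is worth estimating separately.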
Input cost is based on the text you send to the model, including system prompts, user messages, retrieved context, examples, code, and structured data.
Output cost depends on how long the model response is. If you expect long answers, budget output tokens separately instead of assuming the prompt count is the whole cost.
Some providers discount repeated prompt prefixes. Use cached-input pricing only when your request pattern actually reuses stable content in the way the provider supports.
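As a rough sketch of how a cached-input discount changes the estimate, the Python below splits input tokens into a cached part and an uncached part. It assumes a simple model in which cached prefix tokens are billed at a lower per-1M rate and everything else at the standard rate; it ignores any cache-write or storage charges a provider may add. The function name and all rates are hypothetical.

```python
def request_cost_with_cache(cached_input_tokens: int, uncached_input_tokens: int,
                            output_tokens: int, input_price_per_1m: float,
                            cached_input_price_per_1m: float,
                            output_price_per_1m: float) -> float:
    """Estimate one request's cost when part of the prompt hits the cache."""
    cached_cost = cached_input_tokens / 1_000_000 * cached_input_price_per_1m
    uncached_cost = uncached_input_tokens / 1_000_000 * input_price_per_1m
    output_cost = output_tokens / 1_000_000 * output_price_per_1m
    return cached_cost + uncached_cost + output_cost


# Hypothetical rates: 3.00 standard input, 0.30 cached input, 15.00 output (per 1M).
with_cache = request_cost_with_cache(10_000, 2_000, 800, 3.00, 0.30, 15.00)
# Same request with no cache hit: all 12,000 input tokens at the standard rate.
without_cache = request_cost_with_cache(0, 12_000, 800, 3.00, 0.30, 15.00)
print(f"with cache: {with_cache:.4f}, without cache: {without_cache:.4f}")
```

Setting `cached_input_tokens` to zero gives the undiscounted estimate, which is the safer default when you are unsure whether requests will actually hit the cache.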
Divide tokens by 1,000,000 and multiply by the provider's price per 1M tokens. Add input and output estimates for a full request.
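The same arithmetic scales to a batch job or a model comparison by multiplying average token counts by the number of requests. The sketch below assumes average counts are a good enough proxy for the real distribution; the `PriceSheet` class, the model names, and every rate are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    """Hypothetical per-1M-token rates for one model."""
    name: str
    input_per_1m: float
    output_per_1m: float


def batch_cost(sheet: PriceSheet, requests: int,
               avg_input_tokens: int, avg_output_tokens: int) -> float:
    """Estimate total spend for a batch from average tokens per request."""
    total_input = requests * avg_input_tokens
    total_output = requests * avg_output_tokens
    return (total_input / 1_000_000 * sheet.input_per_1m
            + total_output / 1_000_000 * sheet.output_per_1m)


# Placeholder price sheets, not real provider rates.
sheets = [PriceSheet("model-a", 3.00, 15.00), PriceSheet("model-b", 0.50, 1.50)]
for sheet in sheets:
    cost = batch_cost(sheet, requests=50_000,
                      avg_input_tokens=2_000, avg_output_tokens=300)
    print(f"{sheet.name}: {cost:.2f}")
# prints:
# model-a: 525.00
# model-b: 72.50
```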
Many LLM providers price generated output tokens higher than input tokens. Output is produced one token at a time, which typically uses more compute per token than reading the prompt.
Cached input is not always cheaper. Cached-input discounts depend on provider rules and request patterns. Always check the official pricing page for production billing.
The main calculator at https://token-calculator.net/ is the fastest way to measure your exact text, compare model cost, and visualize token-sized text pieces in the browser.