Question 1

How do you estimate the cost of an LLM API feature?

Accepted Answer

LLM APIs bill per token, with prices usually quoted per million tokens and set separately for input (your prompt and context) and output (the model's response). Monthly cost = (monthly requests × average input tokens ÷ 1,000,000 × input price) + (monthly requests × average output tokens ÷ 1,000,000 × output price), where both prices are per million tokens. Prompt caching lowers the price of the portion of your input that repeats across requests. This calculator runs that math from the numbers you enter.
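The formula above can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual code: the function name `monthly_cost`, its parameter names, and the prices in the usage example are hypothetical placeholders, not any provider's published rates.

```python
def monthly_cost(requests, avg_input_tokens, avg_output_tokens,
                 input_price_per_m, output_price_per_m):
    """Estimate monthly cost, in the same currency as the prices.

    Both prices are per million tokens, as providers usually quote them.
    """
    input_cost = requests * avg_input_tokens / 1_000_000 * input_price_per_m
    output_cost = requests * avg_output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical usage: 100,000 requests per month, 2,000 input tokens and
# 500 output tokens per request, at placeholder prices of 3.00 (input)
# and 15.00 (output) per million tokens.
estimate = monthly_cost(100_000, 2_000, 500, 3.00, 15.00)
print(f"{estimate:.2f}")  # 1350.00
```

With these placeholder numbers the input side comes to 600.00 and the output side to 750.00, so output accounts for more than half the bill even though each request carries four times as many input tokens.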

Question 2

What drives LLM API cost the most?

Accepted Answer

Two things: request volume and tokens per request. Output tokens are typically priced several times higher than input tokens, so a long response costs more than a prompt of the same length. A less obvious driver is a large fixed input, such as a big system prompt plus retrieved grounding context sent on every request. That repeated input is exactly what prompt caching is designed to discount.
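To see how the input and output price gap plays out, here is a small Python sketch comparing a prompt-heavy request with a response-heavy one. The function name `request_cost` and the prices (3.00 input, 15.00 output per million tokens) are assumptions chosen for illustration, not real rates.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Cost of a single request; prices are per million tokens."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)


# Placeholder prices, assumed for this example only.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

# Same total token count, opposite shapes.
prompt_heavy = request_cost(4_000, 200, INPUT_PRICE, OUTPUT_PRICE)
response_heavy = request_cost(200, 4_000, INPUT_PRICE, OUTPUT_PRICE)

print(f"prompt-heavy:   {prompt_heavy:.4f}")
print(f"response-heavy: {response_heavy:.4f}")
```

Both requests move 4,200 tokens, but under these assumed prices the response-heavy one costs roughly four times as much, because most of its tokens are billed at the output rate.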

Question 3

How much does prompt caching save?

Accepted Answer

Prompt caching reuses a stable prompt prefix (system prompt, instructions, grounding context) instead of re-processing it on every request, and bills those cached input tokens at a large discount. The saving depends on how much of your input is the fixed prefix and how often requests hit the cache. For a feature with a big fixed prefix, like a grounded support assistant, the input portion of the bill can drop substantially. Set a cache hit rate and a cached-input price in the calculator to model your own savings.
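One way to model the effect of a cache hit rate and a cached-input price is sketched below in Python. It rests on simplifying assumptions: prefix tokens are billed at the cached price on a hit and at the regular input price on a miss, and any surcharge a provider may add for writing to the cache is ignored. The function name `monthly_input_cost` and all numbers are hypothetical.

```python
def monthly_input_cost(requests, prefix_tokens, variable_tokens,
                       input_price_per_m, cached_price_per_m,
                       cache_hit_rate):
    """Estimate the input side of the bill with prompt caching.

    Assumes a hit bills the prefix at the cached price and a miss bills
    it at the regular input price; cache-write surcharges are ignored.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    # Blended per-million price for the stable prefix.
    prefix_price = (cache_hit_rate * cached_price_per_m
                    + (1.0 - cache_hit_rate) * input_price_per_m)
    prefix_cost = requests * prefix_tokens / 1_000_000 * prefix_price
    # The per-request variable part is never cached in this model.
    variable_cost = requests * variable_tokens / 1_000_000 * input_price_per_m
    return prefix_cost + variable_cost


# Hypothetical usage: 1,800-token fixed prefix, 200 variable tokens,
# placeholder prices of 3.00 (input) and 0.30 (cached) per million tokens.
without_cache = monthly_input_cost(100_000, 1_800, 200, 3.00, 0.30, 0.0)
with_cache = monthly_input_cost(100_000, 1_800, 200, 3.00, 0.30, 0.9)
print(f"no caching:   {without_cache:.2f}")
print(f"90% hit rate: {with_cache:.2f}")
```

Under these assumed numbers the input cost falls from about 600 to about 163 per month. The output side of the bill is unaffected, since caching only applies to input tokens.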

Question 4

Where do I find my model's per-token price?

Accepted Answer

Model providers (such as Anthropic, OpenAI, and Google) publish per-million-token input and output prices on their pricing pages, with cached-input and batch tiers listed separately. The price fields in this calculator are editable example values — replace them with your model's actual published rates for an accurate estimate.

Question 5

Does this calculator send my data anywhere?

Accepted Answer

No. Every calculation runs entirely in your browser with JavaScript. Nothing you type is sent to a server or stored.

What will your AI feature cost to run?

I ship production AI features end-to-end — keys server-side, grounded, streamed, and cached.

How LLM API cost is calculated

Why prompt caching is the biggest lever

Frequently asked questions