AI Agent API Cost Calculator
Upload your agent code for auto-detection, or fill in manually — context accumulation, tool calls and prompt caching modeled in real time.
Frequently asked questions
Why is my AI agent API cost so much higher than expected?
Context accumulation is the main culprit. Every agent step re-sends the full conversation history (system prompt + all tool definitions + all previous messages + tool results) as input tokens. By step 4, you might be paying for 2,000+ tokens of input when the original user message was only 50 words. Our calculator models this exactly and shows you the step-by-step growth.
What is context accumulation and why does it matter?
Context accumulation is the pattern where each LLM call in an agent loop re-sends the growing conversation history. The formula is: input_tokens(step_n) = base_context + (n-1) × (output_per_step + tool_result_tokens). Per-step input therefore grows linearly with the step number, and the total input cost summed over all steps grows roughly quadratically with step count. When the tokens added per step are comparable to the base context, step 6 costs about 3× what step 2 costs.
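As a rough illustration of the formula above, the following Python sketch computes per-step and cumulative input tokens. The function names and the token counts (500 base, 200 output, 300 tool result) are hypothetical values chosen for the example, not defaults of the calculator.

```python
def input_tokens(step, base_context, output_per_step, tool_result_tokens):
    """Input tokens sent at a 1-indexed step, following the formula in the text."""
    return base_context + (step - 1) * (output_per_step + tool_result_tokens)


def total_input_tokens(steps, base_context, output_per_step, tool_result_tokens):
    """Sum of input tokens over every step of the run."""
    return sum(
        input_tokens(n, base_context, output_per_step, tool_result_tokens)
        for n in range(1, steps + 1)
    )


# Hypothetical agent: 500-token base context, 200 output tokens and
# 300 tool-result tokens added to the history at each step.
BASE, OUT, TOOL = 500, 200, 300

per_step = [input_tokens(n, BASE, OUT, TOOL) for n in range(1, 7)]
total = total_input_tokens(6, BASE, OUT, TOOL)

print(per_step)  # [500, 1000, 1500, 2000, 2500, 3000]
print(total)     # 10500
```

With these numbers the per-step input rises by a constant 500 tokens, step 6 sends 3× the input of step 2, and the six steps together bill 10,500 input tokens.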
How much does prompt caching save for AI agents?
With Anthropic's Claude, tokens read from the cache are billed at 0.1× the normal input price (a 90% discount), while the first write to the cache carries a premium (1.25× for the default 5-minute cache). For agents with a 200-word system prompt and 5 tool definitions, caching that static portion across all steps typically cuts total agent cost by 40–70%. Enable the caching toggle to see your exact savings.
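To make the caching arithmetic concrete, here is a minimal Python sketch that compares input cost with and without caching of the static prefix. It is a simplified model, not the calculator's implementation: it ignores output cost, assumes only the static portion is cached, and assumes a 1.25× cache-write multiplier on the first step. The price of $3 per million input tokens and all token counts are hypothetical.

```python
CACHE_READ_MULT = 0.1    # cache reads at 0.1x input price, as stated in the text
CACHE_WRITE_MULT = 1.25  # assumption: premium paid when the prefix is first cached


def input_cost(steps, static_tokens, growth_per_step, price_per_mtok, cached):
    """Input cost in dollars for a run, with or without caching the static prefix."""
    total = 0.0
    for n in range(1, steps + 1):
        dynamic = (n - 1) * growth_per_step  # accumulated history, billed normally
        if not cached:
            static_mult = 1.0
        elif n == 1:
            static_mult = CACHE_WRITE_MULT   # first step writes the cache
        else:
            static_mult = CACHE_READ_MULT    # later steps read from it
        billed_tokens = static_tokens * static_mult + dynamic
        total += billed_tokens * price_per_mtok / 1_000_000
    return total


# Hypothetical run: 6 steps, 1,000 static tokens (system prompt + tools),
# 300 tokens of new history per step, $3 per million input tokens.
uncached = input_cost(6, 1000, 300, 3.0, cached=False)
cached = input_cost(6, 1000, 300, 3.0, cached=True)
savings = 1 - cached / uncached

print(f"uncached ${uncached:.5f}, cached ${cached:.5f}, saved {savings:.1%}")
```

In this example caching saves roughly 40% of the input cost. The saving shrinks as the accumulated history grows relative to the static prefix, since only the prefix is discounted.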
How many tokens do tool schemas use?
A typical tool definition (name + description + 3–5 parameters) takes roughly 80–150 tokens in JSON schema format. An agent with 5 tools adds around 400–750 tokens of static context overhead to every single step. With prompt caching, this overhead is billed at about a tenth of the normal input price from step 2 onwards.
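The overhead range above is simple multiplication, sketched below in Python. The 80–150 tokens-per-tool range comes from the text; the function name and the 6-step run are hypothetical, and real schemas should be measured with the provider's tokenizer rather than estimated.

```python
TOKENS_PER_TOOL = (80, 150)  # low/high estimate per tool, as quoted in the text


def tool_overhead(num_tools, steps):
    """Return (per-step, per-run) low/high token overhead from tool schemas, uncached."""
    low, high = TOKENS_PER_TOOL
    per_step = (num_tools * low, num_tools * high)
    per_run = (per_step[0] * steps, per_step[1] * steps)
    return per_step, per_run


per_step, per_run = tool_overhead(num_tools=5, steps=6)
print(per_step)  # (400, 750)
print(per_run)   # (2400, 4500)
```

Without caching, a 5-tool agent running 6 steps pays for 2,400–4,500 input tokens of tool definitions alone.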