Batch API Cost Calculator
Compare standard API cost vs Batch API pricing and see your savings.
OpenAI's Batch API processes requests asynchronously within a 24-hour window in exchange for a 50% discount on both input and output tokens — the same pattern is now offered by Anthropic, Google, and others. If your workload tolerates latency, batching can roughly halve your API bill. Use this calculator to quantify the savings on your real request volume.
Standard API cost
$2,000.00
$5.00/M input + $30.00/M output
Batch API cost
$1,000.00
$2.50/M input + $15.00/M output
You save with Batch API
$1,000.00 (50.0%)
On 100,000 requests (100,000,000 input + 50,000,000 output tokens) using GPT-5.5.
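The figures above reduce to a few lines of arithmetic. The sketch below recomputes the example; the per-million-token rates and the 100M-input/50M-output token split are the illustrative values shown above, not quotes of any provider's current price list.

```python
# Recompute the example above. Rates are illustrative; substitute the
# current per-million-token prices for the model you actually use.
STANDARD_INPUT_PER_M = 5.00    # $ per 1M input tokens (example rate)
STANDARD_OUTPUT_PER_M = 30.00  # $ per 1M output tokens (example rate)
BATCH_DISCOUNT = 0.50          # Batch API bills 50% of standard rates

def api_cost(input_tokens: int, output_tokens: int, discount: float = 0.0) -> float:
    """Total cost in dollars for a given token volume."""
    rate = 1.0 - discount
    return rate * (
        input_tokens / 1_000_000 * STANDARD_INPUT_PER_M
        + output_tokens / 1_000_000 * STANDARD_OUTPUT_PER_M
    )

input_tokens = 100_000_000   # 100,000 requests x ~1,000 input tokens each
output_tokens = 50_000_000   # 100,000 requests x ~500 output tokens each

standard = api_cost(input_tokens, output_tokens)
batch = api_cost(input_tokens, output_tokens, discount=BATCH_DISCOUNT)
print(f"Standard: ${standard:,.2f}")   # Standard: $2,000.00
print(f"Batch:    ${batch:,.2f}")      # Batch:    $1,000.00
print(f"Savings:  ${standard - batch:,.2f} ({1 - batch / standard:.1%})")
```

Swap in your own rates and volumes to reproduce the calculator's output for any model.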
What Batch API pricing is
The Batch API on OpenAI, Anthropic, and Google accepts a set of requests submitted together (as a file or an API payload, depending on the provider) and returns results within 24 hours. In exchange for that latency, you pay 50% of standard input and output rates for most models. It is not a separate model; the underlying inference is identical. Batch is ideal for offline jobs: nightly summarisation, large-scale evaluation, embedding generation, content moderation backfills, structured data extraction across an archive, and similar.
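For concreteness, the OpenAI flow looks roughly like the sketch below: write the requests to a JSONL file, upload it, create a batch job with a 24-hour completion window, and fetch the output file once the job completes. This is a sketch against the openai Python SDK (v1.x); the model name, prompts, and file paths are placeholders, and field names should be checked against the current Batch API docs before use.

```python
# Minimal sketch of an OpenAI batch submission (openai Python SDK, v1.x).
# Each JSONL line is a normal chat completion request plus a custom_id.
import json
from openai import OpenAI

client = OpenAI()

# 1. Write the requests to a JSONL file.
with open("requests.jsonl", "w") as f:
    for i, doc in enumerate(["first document ...", "second document ..."]):
        f.write(json.dumps({
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",  # any batch-eligible model
                "messages": [{"role": "user", "content": f"Summarise: {doc}"}],
            },
        }) + "\n")

# 2. Upload the file and create the batch job.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",  # the 24-hour window that earns the discount
)

# 3. Poll later; results come back as another JSONL file.
status = client.batches.retrieve(batch.id)
if status.status == "completed":
    results = client.files.content(status.output_file_id).text
```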
When batch is the right choice
Batch wins for any non-interactive workload above ~10,000 requests per day. Below that the absolute savings are usually small. Batch is wrong for user-facing chat, real-time agents, or anything that needs responses in seconds. A common pattern: send latency-sensitive requests through the standard API and route everything else (analytics, indexing, evals) through batch. Many teams cut their inference bill by 30–45% just by routing the right traffic through the batch endpoint.
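One way to implement that split is a thin router that sends latency-sensitive requests straight to the synchronous API and queues everything else for the next batch submission. The sketch below is purely illustrative: the Request and Router names, the latency_sensitive flag, and the in-memory queue are assumptions, not any library's API.

```python
# Illustrative routing sketch: interactive traffic goes to the synchronous
# API immediately; everything else is queued and flushed to the Batch API.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    latency_sensitive: bool  # user-facing chat vs. offline analytics, evals, indexing

@dataclass
class Router:
    batch_queue: list[Request] = field(default_factory=list)

    def handle(self, req: Request) -> str | None:
        if req.latency_sensitive:
            # Placeholder for a synchronous completion call at standard rates.
            return f"sync-response({req.prompt!r})"
        # Deferred: written out as JSONL and submitted as a batch job later.
        self.batch_queue.append(req)
        return None

router = Router()
router.handle(Request("Hello, can you help me?", latency_sensitive=True))
router.handle(Request("Summarise yesterday's tickets", latency_sensitive=False))
print(len(router.batch_queue))  # 1 request deferred to the batch path
```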
Frequently Asked Questions
How much cheaper is the OpenAI Batch API?
Batch requests are billed at 50% of the standard per-token rates, on both input and output tokens, for most models.
What is the catch with batch API pricing?
Requests are processed asynchronously: results arrive within a 24-hour window rather than in seconds, so batch is unsuitable for user-facing or real-time workloads.
Do all providers offer batch pricing?
Not all, but OpenAI, Anthropic, and Google each offer a batch tier following the same pattern of roughly 50% off standard rates; check each provider's pricing page for model coverage.
How do I estimate my batch savings before switching?
Multiply your monthly input and output token volumes by the standard and batch per-token rates and compare the totals; the calculator above does this from your request count and average tokens per request.
Batch pricing applies to asynchronous requests completed within 24 hours. Rates are sourced from official provider pricing pages.