Gemini 2.5 Flash Lite vs Gemini 3.1 Flash Lite: Pricing Comparison

Compare pricing, capabilities, and costs for your LLM workloads.

Google

Gemini 2.5 Flash Lite

Pricing (per 1M tokens)

Input: $0.1000
Output: $0.4000
Cached Input: $0.0100
Batch Input: $0.0500
Batch Output: $0.2000

Context & Output

Context Window: 1M tokens
Max Output: 65.5K tokens

Capabilities

Category: budget
Multimodal: text + image + audio
Fine-tuning: No
Streaming: Yes

Google

Gemini 3.1 Flash Lite

Pricing (per 1M tokens)

Input: $0.2500
Output: $1.5000
Cached Input: $0.0250
Batch Input: $0.1250
Batch Output: $0.7500

Context & Output

Context Window: 1M tokens
Max Output: 65.5K tokens

Capabilities

Category: budget
Multimodal: text + image + audio
Fine-tuning: No
Streaming: Yes

Quick Verdict

Cheaper Input Price

Gemini 2.5 Flash Lite

60.0% cheaper

Cheaper Output Price

Gemini 2.5 Flash Lite

73.3% cheaper

Larger Context Window

Tie

Both models offer a 1M-token context window
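The "% cheaper" figures above come from a simple relative-savings calculation against the pricier rate. A minimal sketch (the `pct_cheaper` helper is illustrative; the rates are the per-1M-token prices listed on this page):

```python
def pct_cheaper(cheap: float, expensive: float) -> float:
    """Percent saved by choosing the cheaper per-1M-token rate."""
    return (expensive - cheap) / expensive * 100

# Gemini 2.5 Flash Lite vs Gemini 3.1 Flash Lite
input_saving = pct_cheaper(0.10, 0.25)   # input rates
output_saving = pct_cheaper(0.40, 1.50)  # output rates

print(f"Input: {input_saving:.1f}% cheaper")   # 60.0% cheaper
print(f"Output: {output_saving:.1f}% cheaper") # 73.3% cheaper
```

Note that savings are expressed relative to the more expensive model's rate, which is why the output figure (73.3%) is larger than the raw price ratio might suggest.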

Cost Comparison

Sample workload: 1,000,000 input tokens + 1,000,000 output tokens

Gemini 2.5 Flash Lite

$0.5000

$0.1000/1M input + $0.4000/1M output

Gemini 3.1 Flash Lite

$1.7500

$0.2500/1M input + $1.5000/1M output

Gemini 2.5 Flash Lite is 71.4% cheaper for this workload.
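The sample-workload totals above can be reproduced with a short cost function. A minimal sketch, assuming the standard (non-batch, non-cached) rates from the tables; the `workload_cost` name is illustrative:

```python
def workload_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Cost in USD; rates are dollars per 1M tokens."""
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

# 1,000,000 input + 1,000,000 output tokens
flash_25 = workload_cost(1_000_000, 1_000_000, 0.10, 0.40)  # $0.50
flash_31 = workload_cost(1_000_000, 1_000_000, 0.25, 1.50)  # $1.75

saving = (flash_31 - flash_25) / flash_31 * 100
print(f"Gemini 2.5 Flash Lite is {saving:.1f}% cheaper")  # 71.4% cheaper
```

Swapping in your own token counts shows how the gap shifts: the more output-heavy the workload, the larger the savings, since the output-price gap ($0.40 vs $1.50) is wider than the input-price gap.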

Frequently Asked Questions

Which is cheaper, Gemini 2.5 Flash Lite or Gemini 3.1 Flash Lite?
For input tokens, Gemini 2.5 Flash Lite is cheaper at $0.1000 per 1M tokens (vs $0.2500 for Gemini 3.1 Flash Lite). For output tokens, Gemini 2.5 Flash Lite is also cheaper at $0.4000 per 1M tokens (vs $1.5000). The overall cost depends on your workload's input/output ratio, but here Gemini 2.5 Flash Lite wins on both.
What is the context window size of Gemini 2.5 Flash Lite vs Gemini 3.1 Flash Lite?
Both Gemini 2.5 Flash Lite and Gemini 3.1 Flash Lite have a 1M-token context window, so neither offers an advantage for processing longer documents; the choice comes down to other factors such as pricing.
How do Gemini 2.5 Flash Lite and Gemini 3.1 Flash Lite compare for batch processing?
Both models support batch processing at half their standard rates. Gemini 2.5 Flash Lite has the lower batch rate at $0.0500 per 1M input tokens (vs $0.1250 for Gemini 3.1 Flash Lite). Batch processing is ideal for non-time-sensitive workloads that can tolerate delayed results.
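For a concrete sense of the batch discount, here is a minimal sketch comparing standard and batch pricing for the same sample workload as above (1M input + 1M output tokens on Gemini 2.5 Flash Lite); the `cost` helper is illustrative:

```python
def cost(input_tokens: int, output_tokens: int,
         in_rate: float, out_rate: float) -> float:
    """Cost in USD; rates are dollars per 1M tokens."""
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# Gemini 2.5 Flash Lite rates from the table above
standard = cost(1_000_000, 1_000_000, 0.10, 0.40)  # standard: $0.50
batch = cost(1_000_000, 1_000_000, 0.05, 0.20)     # batch: $0.25

discount = (standard - batch) / standard * 100
print(f"Batch processing saves {discount:.0f}%")  # saves 50%
```

Since batch rates for both models are exactly half the standard rates, the 50% figure holds regardless of the input/output mix.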

Need more tools?

Explore our complete suite of LLM calculators and comparison tools.