Question 1

How much does OpenAI text-embedding-3-small cost per million tokens?

Accepted Answer

OpenAI text-embedding-3-small costs $0.020 per million tokens. This is the most cost-efficient OpenAI embedding model. By comparison, text-embedding-3-large costs $0.130 per million tokens — 6.5× more expensive — and the older ada-002 model costs $0.100 per million tokens. For most RAG pipelines, text-embedding-3-small delivers sufficient retrieval quality at a fraction of the cost of the larger model.

Question 2

What is the cheapest embedding model?

Accepted Answer

Voyage AI voyage-3-lite is the lowest-cost mainstream embedding model at $0.010 per million tokens — half the price of OpenAI text-embedding-3-small ($0.020/M). For high-volume embedding pipelines where cost is the primary constraint, voyage-3-lite is the best starting point. OpenAI text-embedding-3-small is the cheapest option if you want to stay within the OpenAI ecosystem.

Question 3

How many tokens are in 1MB of plain text?

Accepted Answer

One megabyte of plain English text contains approximately 250,000 tokens. This is based on the standard approximation of 1 token per 4 characters of English. For reference: a 500-word document is approximately 3,750 characters, or about 938 tokens. A corpus of 100,000 such documents totals roughly 94 million tokens. Code and CJK languages tokenize differently — code is typically denser (1 token per 3–3.5 characters), while Chinese, Japanese, and Korean text is sparser (1 token per 2–2.5 characters).

Question 4

What is the difference between text-embedding-3-small and text-embedding-3-large?

Accepted Answer

text-embedding-3-small and text-embedding-3-large are both OpenAI's current embedding models, but they differ in output dimension, cost, and retrieval accuracy. text-embedding-3-small produces 1536-dimensional vectors and costs $0.020 per million tokens. text-embedding-3-large produces 3072-dimensional vectors and costs $0.130 per million tokens. The large model scores higher on MTEB benchmarks, particularly for multilingual retrieval and domain-specific tasks. For most general-purpose RAG applications, text-embedding-3-small achieves 85–90% of the large model's retrieval quality at 15% of the cost.

Question 5

How much does it cost to embed 1 million documents?

Accepted Answer

The cost depends on average document length and the model chosen. For a corpus of 1 million documents averaging 500 words each (approximately 938 tokens per document), the total token volume is roughly 938 million tokens. At current pricing: text-embedding-3-small costs $18.76, text-embedding-3-large costs $121.94, voyage-3-lite costs $9.38, and voyage-3 costs $56.28. Use the calculator above to enter your specific document length and corpus size for an exact estimate.

Question 6

Do I pay again if I re-embed my corpus?

Accepted Answer

Yes. Every embedding run — initial or re-run — charges the full per-token rate. There is no caching or re-use discount for previously embedded content. If you re-embed your entire corpus to switch models, update stale content, or add new documents, you pay for every token processed in that run. This makes model selection important upfront: switching embedding models at scale is not free. A 1-billion-token corpus re-embedded with text-embedding-3-small costs $20 per full re-run.

Question 7

How does Voyage AI's pricing compare to OpenAI for embeddings?

Accepted Answer

Voyage AI's cheapest model (voyage-3-lite at $0.010/M tokens) is half the price of OpenAI text-embedding-3-small ($0.020/M tokens). Voyage's mid-tier model (voyage-3 at $0.060/M tokens) is between text-embedding-3-small and text-embedding-3-large on cost, but consistently outperforms text-embedding-3-small on domain-specific retrieval benchmarks including code, legal, and scientific text. For teams not locked into OpenAI infrastructure, Voyage AI typically offers better quality-per-dollar at the mid tier.

Model	Provider	Price/1M	One-Time Cost	Dimensions
text-embedding-3-small	OpenAI	$0.0200	$0.1000	1,536
text-embedding-3-large	OpenAI	$0.1300	$0.6500	3,072
Cohere embed-v3	Cohere	$0.1000	$0.5000	1,024
Voyage 3	Voyage AI	$0.0600	$0.3000	1,024

Embeddings Cost Calculator

How embedding costs are calculated

Choosing the right embedding model

Frequently Asked Questions

Explore Related Tools

Fine-Tuning Cost

Batch API Cost

Cost Estimator