LLM cost archetype

Document processor — LLM cost calculator & pricing model

Your app processes documents in bulk — summarizing, extracting data, classifying, or translating them. This runs in the background, not in real time. Because it's async, you can use cheaper batch pricing.

Does this sound like your app?

Real-world example

A legal tech tool that summarizes contracts overnight. 500 contracts are uploaded, and each goes through one LLM call to extract key clauses. No user is waiting on the result, so the job can run through a batch API at roughly 50% off.

Default cost profile

Calls per request: 1
Batch-eligible: yes
Avg input tokens: 8,000
Avg output tokens: 800

Assumes 1 LLM call per document with large input (~8,000 tokens of document content) and moderate output (~800 tokens of extracted or summarized data). This is the primary batch-eligible archetype: documents can be queued and processed asynchronously at ~50% off both input and output pricing. Models with large context windows (200K+ tokens) are preferred, since individual documents can run well past the 8,000-token average.
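The per-document arithmetic behind this profile can be sketched as follows. The names CostProfile and cost_per_document are hypothetical, and the prices (USD per million tokens) are placeholders, not quotes from any provider. The flat 50% batch discount is the approximate figure given above. Substitute your model's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostProfile:
    """Default profile for the document processor archetype."""
    calls_per_request: int = 1
    avg_input_tokens: int = 8_000
    avg_output_tokens: int = 800
    batch_eligible: bool = True


def cost_per_document(
    profile: CostProfile,
    input_price_per_mtok: float,   # placeholder: USD per 1M input tokens
    output_price_per_mtok: float,  # placeholder: USD per 1M output tokens
    batch_discount: float = 0.5,   # assumed flat ~50% off input and output
) -> float:
    """Return the estimated USD cost of processing one document."""
    input_cost = profile.avg_input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = profile.avg_output_tokens / 1_000_000 * output_price_per_mtok
    cost = profile.calls_per_request * (input_cost + output_cost)
    if profile.batch_eligible:
        cost *= 1 - batch_discount
    return cost


if __name__ == "__main__":
    # Placeholder prices chosen only to make the arithmetic easy to follow.
    per_doc = cost_per_document(CostProfile(), 1.00, 4.00)
    print(f"${per_doc:.4f} per document")
```

Because input is ten times larger than output in this profile, the input price tends to dominate unless the output price is much higher per token.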

Rough cost

$10–200/mo at 1,000 documents/day. Batch API is your biggest lever — always enable it.
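To see how a monthly figure like this comes together, here is a minimal sketch that scales the default profile (8,000 input tokens, 800 output tokens, 1 call per document) to 1,000 documents/day. The function name monthly_cost, the 30-day month, and both price tiers are assumptions for illustration only, not real provider rates.

```python
def monthly_cost(
    docs_per_day: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,   # placeholder: USD per 1M input tokens
    output_price_per_mtok: float,  # placeholder: USD per 1M output tokens
    batch: bool = True,
    batch_discount: float = 0.5,   # assumed flat ~50% batch discount
    days_per_month: int = 30,      # assumed month length
) -> float:
    """Estimate monthly USD spend for one LLM call per document."""
    docs = docs_per_day * days_per_month
    input_cost = docs * input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = docs * output_tokens / 1_000_000 * output_price_per_mtok
    total = input_cost + output_cost
    return total * (1 - batch_discount) if batch else total


if __name__ == "__main__":
    # Two hypothetical price tiers (input, output) in USD per 1M tokens.
    tiers = [("cheaper tier", 0.10, 0.40), ("pricier tier", 1.00, 4.00)]
    for label, in_price, out_price in tiers:
        batched = monthly_cost(1_000, 8_000, 800, in_price, out_price)
        realtime = monthly_cost(1_000, 8_000, 800, in_price, out_price, batch=False)
        print(f"{label}: ${batched:.2f}/mo batched, ${realtime:.2f}/mo real time")
```

With these placeholder prices the batched totals are about $16.80 and $168.00 per month, and the same workload without the batch discount costs twice as much.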

Red flag

If users are waiting in real time for the result, this isn't batch-eligible and the cost model changes significantly.

Model costs for Document processor