LLM cost archetype

Coding assistant — LLM cost calculator & pricing model

Your app helps developers write, review, explain, or debug code. Prompts are large (code files, diffs, instructions) and outputs are large (generated code, explanations), so per-request token costs run higher than for a typical chatbot.

Does this sound like your app?

Real-world example

A PR review tool: when a developer opens a pull request, the app sends the entire diff (3,000 tokens) to an LLM and gets back a detailed code review (1,200 tokens). Output pricing matters a lot here, because output tokens are usually priced higher than input tokens.
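To see why output pricing weighs so heavily in this example, here is a minimal sketch of the per-review cost. The two prices are hypothetical placeholders (not any provider's actual rates), chosen only to reflect the common pattern of output tokens costing more than input tokens.

```python
# Hypothetical prices in USD per million tokens (placeholders, not real rates).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

# Token counts from the PR review example above.
diff_tokens = 3_000
review_tokens = 1_200

input_cost = diff_tokens * INPUT_PRICE_PER_M / 1_000_000
output_cost = review_tokens * OUTPUT_PRICE_PER_M / 1_000_000
total_cost = input_cost + output_cost
output_share = output_cost / total_cost

print(f"input ${input_cost:.4f}, output ${output_cost:.4f}, total ${total_cost:.4f}")
print(f"output share of cost: {output_share:.0%}")
```

Under these assumed prices, the 1,200 output tokens account for about two thirds of the cost even though the diff is 2.5 times larger.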

Default cost profile

Calls per request: 1.5
Batch-eligible: no
Avg input tokens: 4,000
Avg output tokens: 1,500

Assumes 1–2 LLM calls per request (default 1.5): sometimes a single generation, sometimes a generate-then-review pattern. Input tokens are high (~4,000) because prompts include code context, file contents, and instructions. Output tokens are also high (~1,500) because responses contain generated code. Prompt caching helps when the same project context is resent across requests. Not batch-eligible, because developers expect real-time responses.
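As a rough illustration, the default profile can be turned into monthly token volumes. This is a sketch under stated assumptions: the `CostProfile` class and `monthly_tokens` function are hypothetical names, a month is taken as 30 days, and caching savings are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostProfile:
    calls_per_request: float
    avg_input_tokens: int
    avg_output_tokens: int


# Default values from the coding assistant profile.
CODING_ASSISTANT = CostProfile(
    calls_per_request=1.5,
    avg_input_tokens=4_000,
    avg_output_tokens=1_500,
)


def monthly_tokens(profile, requests_per_day, days_per_month=30):
    """Return (input_tokens, output_tokens) per month, ignoring caching."""
    calls = requests_per_day * days_per_month * profile.calls_per_request
    return calls * profile.avg_input_tokens, calls * profile.avg_output_tokens


input_tokens, output_tokens = monthly_tokens(CODING_ASSISTANT, requests_per_day=100)
print(f"{input_tokens:,.0f} input / {output_tokens:,.0f} output tokens per month")
```

At 100 requests/day this works out to 4,500 LLM calls per month, or 18 million input tokens and 6.75 million output tokens.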

Rough cost

Roughly $20–300/month at 100–1,000 requests/day, depending on the model. Costs skew toward output, so choose models with competitive output pricing.
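The arithmetic behind an estimate like this can be sketched as follows. The per-token prices are hypothetical placeholders, not the rates this page's calculator uses, and a 30-day month is assumed; substitute your chosen model's actual prices.

```python
# Default profile values from this page.
CALLS_PER_REQUEST = 1.5
AVG_INPUT_TOKENS = 4_000
AVG_OUTPUT_TOKENS = 1_500

# Hypothetical prices in USD per million tokens (placeholders, not real rates).
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 1.50


def monthly_cost(requests_per_day, days_per_month=30):
    """Estimated monthly spend in USD, ignoring caching discounts."""
    calls = requests_per_day * days_per_month * CALLS_PER_REQUEST
    input_cost = calls * AVG_INPUT_TOKENS / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = calls * AVG_OUTPUT_TOKENS / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


for requests_per_day in (100, 1_000):
    print(f"{requests_per_day:>5} requests/day: ${monthly_cost(requests_per_day):,.2f}/month")
```

With these placeholder prices the estimate comes to roughly $19 and $191 per month at the two volumes. A model with higher per-token prices scales both figures proportionally.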

Red flag

If users are asking general programming questions without pasting code, your app may fit the Simple chatbot archetype instead.

Model costs for Coding assistant