Usage-Based Pricing for AI SaaS 2026: Credits, Seats & Per-Call Models

The Core Challenge: AI COGS Are Variable

Traditional SaaS has near-zero marginal COGS: serving the same software to 1 user or 1,000 users costs nearly the same. AI SaaS is different, because every AI-powered user action triggers an API call that costs real money. A heavy user might incur 100× the API cost of a light user, which makes flat pricing dangerous for margins.

The goal of AI SaaS pricing is to:

Decouple your revenue growth from linear AI cost growth
Protect gross margins at scale (target: 65–80%)
Limit exposure to power users who would otherwise use unlimited AI for $20/month
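
To make the exposure concrete, here is a minimal sketch of one user's margin on a flat $20/month plan. The per-call cost and the call volumes are hypothetical figures chosen for illustration; only the $20 price and the 100× ratio come from the text.

```python
def flat_plan_margin(price: float, calls: int, cost_per_call: float) -> float:
    """Gross margin on AI cost alone for one user on a flat monthly plan."""
    ai_cost = calls * cost_per_call
    return (price - ai_cost) / price


PRICE = 20.00                      # flat monthly price from the text
COST_PER_CALL = 0.01               # hypothetical blended API cost per call
LIGHT_CALLS = 50                   # hypothetical light user
HEAVY_CALLS = LIGHT_CALLS * 100    # heavy user at 100x the light user's volume

light = flat_plan_margin(PRICE, LIGHT_CALLS, COST_PER_CALL)
heavy = flat_plan_margin(PRICE, HEAVY_CALLS, COST_PER_CALL)

print(f"light user margin: {light:.1%}")   # 97.5%
print(f"heavy user margin: {heavy:.1%}")   # -150.0%
```

Under these assumed numbers the light user is highly profitable while the heavy user costs more than twice what they pay, which is the asymmetry the pricing models below try to contain.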

Pricing Model Comparison

Model	Structure	Pros	Cons	Best for
Credit-based	$X/month = N credits; 1 API call = Y credits	Predictable COGS; heavy users self-limit or buy more	Friction in onboarding; users hate "running out"	Variable AI workloads (content gen, document processing)
Per-seat flat	$X/user/month; unlimited AI usage	Simple to sell; no usage tracking needed	Power users destroy margins; subsidized by light users	Tools where usage is naturally capped (meetings/day, emails/day)
Per-output	$X per document/article/report generated	Revenue scales with value delivered; COGS scales proportionally	Harder to predict revenue; users price-shop	Content generators, document tools, design tools
Tiered flat + overage	$20 for 100 calls/day; $0.05/call after	Familiar SaaS model; clear upgrade path	Overage billing creates bad UX surprise	API-first products, developer tools
Hybrid (seat + credits)	$X/seat/month includes Y credits; extra credits purchasable	Protects margins, predictable base revenue, upsell path	More complex to explain and support	Team productivity tools with high usage variance
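
As a rough illustration of how two of these structures turn usage into an invoice, the sketch below computes a monthly bill for the tiered flat + overage model and for the hybrid model. The function names, the hybrid seat price, the credit allowance, and the top-up price are all hypothetical; the $20 base, the 100 calls/day allowance, and the $0.05 overage rate come from the table.

```python
from typing import Sequence


def tiered_overage_bill(base_price: float, included_per_day: int,
                        overage_rate: float, daily_calls: Sequence[int]) -> float:
    """Flat base price plus a per-call charge for calls above each day's allowance."""
    extra_calls = sum(max(0, calls - included_per_day) for calls in daily_calls)
    return base_price + extra_calls * overage_rate


def hybrid_bill(seats: int, seat_price: float, credits_per_seat: int,
                credits_used: int, topup_price: float) -> float:
    """Per-seat base price plus purchased credits beyond the pooled allowance."""
    included = seats * credits_per_seat
    extra_credits = max(0, credits_used - included)
    return seats * seat_price + extra_credits * topup_price


# Three days of usage: only the third day exceeds the 100-call allowance.
overage_total = tiered_overage_bill(20.00, 100, 0.05, [80, 100, 140])

# Hypothetical team: 5 seats at $30 with 200 credits each, 1,300 credits used.
hybrid_total = hybrid_bill(5, 30.00, 200, 1300, 0.05)

print(f"tiered + overage: ${overage_total:.2f}")
print(f"hybrid:           ${hybrid_total:.2f}")
```

The hybrid function pools credits across seats, which is one possible design choice; a per-seat allowance would bill extra credits sooner for teams with uneven usage.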

Credit Model Design: Key Ratios

When designing a credit model, your markup on API cost determines your gross margin:

Scenario	API cost/credit	You charge/credit	AI Gross Margin	Notes
2× markup	$0.005	$0.010	50%	Unsustainable — after infra and support, COGS > revenue
5× markup	$0.005	$0.025	80%	Minimum viable; leaves room for infra, support, ops
10× markup	$0.005	$0.050	90%	SaaS-level margins; competitive if product adds clear value
20× markup	$0.005	$0.100	95%	Premium positioning; users pay for UX, not raw tokens

Target a 10× markup, and treat 5× as the floor. Where customers can see the underlying API costs (developer tools), expect to accept a lower markup. Consumer products with high switching cost can sustain 20×.
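
The relationship behind the table is margin = 1 - 1/markup. A small sketch, assuming the same $0.005 API cost per credit used above (the function names are illustrative):

```python
def margin_from_markup(markup: float) -> float:
    """AI gross margin when price = markup * API cost."""
    return 1 - 1 / markup


def price_for_margin(api_cost: float, target_margin: float) -> float:
    """Price per credit needed to hit a target AI gross margin."""
    return api_cost / (1 - target_margin)


API_COST_PER_CREDIT = 0.005  # from the table above

for markup in (2, 5, 10, 20):
    charge = API_COST_PER_CREDIT * markup
    print(f"{markup:>2}x  charge ${charge:.3f}  margin {margin_from_markup(markup):.0%}")
```

Running the loop reproduces the four rows of the table. The inverse function is useful when you start from a margin target instead: a 90% target on a $0.005 credit gives a price of $0.05.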

Real Examples: How Top AI SaaS Products Price

Product type	Pricing model	Their charge/output	Estimated API cost	Approx. margin
AI writing tool (1,500 words)	Credits: $49/mo = 150 articles	$0.33/article	$0.015 (Haiku)	95%+
AI meeting assistant	Seat: $19/user/mo, 10 meetings/mo	$1.90/meeting	$0.27 (Deepgram+Haiku)	86%
AI coding copilot (1 dev/mo)	Seat: $19/dev/mo (Copilot Business)	$19/dev	$8 (custom stack)	58%
AI customer support (per ticket)	Per outcome: $1/deflected ticket	$1.00/ticket	$0.010 (Haiku)	99%
AI document processor	Per-doc: $0.10/page	$0.10/page	$0.0015 (Haiku batch)	98.5%

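AI costs are often well under 10% of revenue in well-designed products: three of the five examples above spend less than 5% of revenue on the API. The meeting assistant sits at about 14%, and the clear exception is coding copilots (about 42%), where the API is near-commoditized.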
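
The margin column can be checked directly from the charge and cost columns. This sketch uses only the figures in the table; the variable and function names are illustrative.

```python
# (product, charge per output in $, estimated API cost per output in $)
EXAMPLES = [
    ("AI writing tool", 49 / 150, 0.015),    # $49/mo spread over 150 articles
    ("AI meeting assistant", 19 / 10, 0.27), # $19/mo spread over 10 meetings
    ("AI coding copilot", 19.00, 8.00),
    ("AI customer support", 1.00, 0.010),
    ("AI document processor", 0.10, 0.0015),
]


def margin(charge: float, cost: float) -> float:
    """Share of the charge left after paying the API cost."""
    return (charge - cost) / charge


for name, charge, cost in EXAMPLES:
    cost_share = cost / charge
    print(f"{name:<24} margin {margin(charge, cost):6.1%}  AI cost share {cost_share:5.1%}")
```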

Setting Usage Limits by Tier

Tiered limits prevent power users from destroying your unit economics:

Tier	Price	Usage limit	Max AI cost	Margin floor
Free	$0	50 calls/mo	$0.05 (nano)	Acquisition cost
Starter	$19/mo	500 calls/mo	$0.36 (nano)	98%
Pro	$49/mo	2,000 calls/mo	$1.45 (nano)	97%
Business	$149/mo	10,000 calls/mo	$72.50 (Sonnet)	51% (switch to Haiku)

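The Business tier's margin collapses to about 51% if every call runs on Sonnet ($3/M tokens). Switching to Haiku ($1/M tokens) at scale cuts the maximum AI cost to roughly a third at the same token volume (about $24), and margins recover to 80%+. In most use cases the output quality is comparable.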
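
A short sketch of the margin-floor arithmetic, using the prices and maximum AI costs from the table. It assumes the Haiku cost scales linearly with the per-million-token price (same token volume per call), which is a simplification.

```python
# tier -> (monthly price in $, maximum AI cost in $ if the user hits the limit)
TIERS = {
    "Starter": (19.00, 0.36),
    "Pro": (49.00, 1.45),
    "Business": (149.00, 72.50),  # priced on Sonnet
}

SONNET_PER_M = 3.00  # $/M tokens, from the text
HAIKU_PER_M = 1.00   # $/M tokens, from the text


def margin_floor(price: float, max_ai_cost: float) -> float:
    """Worst-case margin: the user consumes the tier's full allowance."""
    return (price - max_ai_cost) / price


for tier, (price, max_cost) in TIERS.items():
    print(f"{tier:<9} margin floor {margin_floor(price, max_cost):.0%}")

# Assumption: same tokens per call, so cost scales with the per-token price.
business_price, business_sonnet_cost = TIERS["Business"]
business_haiku_cost = business_sonnet_cost * HAIKU_PER_M / SONNET_PER_M
print(f"Business on Haiku: cost ${business_haiku_cost:.2f}, "
      f"margin floor {margin_floor(business_price, business_haiku_cost):.1%}")
```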

Protecting Margins: 5 Structural Approaches

Hard limits, not soft warnings: If a user hits their monthly limit, block rather than warn. Unlimited overage is an open-ended liability.
Model downgrade at volume: Use Sonnet for new users (quality-first conversion), then switch to Haiku once a user reaches 80% of their usage limit. Most users won't notice the difference.
Rate limits, not just monthly limits: A cap such as 10 calls/minute blocks gaming and scraping without hurting real usage, and it is cheap to enforce.
Output caching: If multiple users ask the same question (FAQ-type queries), cache the LLM response and serve repeats without a new API call. Common in document Q&A products.
Priced add-ons for heavy use: Rather than unlimited plans, sell "credit top-ups" that scale your revenue with your API costs.
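
The first four approaches can be sketched as a single admission check in front of the LLM call. This is a minimal in-memory illustration, not a production design: the class and function names are hypothetical, the model labels are plain strings, and a real service would keep counters in a shared store and reset them each billing period.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Optional


class LimitReached(Exception):
    """Monthly allowance exhausted: block the call instead of warning."""


class RateLimited(Exception):
    """Too many calls inside the rolling 60-second window."""


@dataclass
class UsageGate:
    monthly_limit: int
    downgrade_at: float = 0.80       # switch models at 80% of the limit
    max_calls_per_minute: int = 10   # rate limit from the text
    used: int = 0
    recent: Deque[float] = field(default_factory=deque)

    def admit(self, now: Optional[float] = None) -> str:
        """Count one call and return the model to use, or raise if blocked."""
        now = time.monotonic() if now is None else now
        # Forget calls that have left the 60-second window.
        while self.recent and now - self.recent[0] >= 60:
            self.recent.popleft()
        if self.used >= self.monthly_limit:
            raise LimitReached("monthly limit reached; offer a credit top-up")
        if len(self.recent) >= self.max_calls_per_minute:
            raise RateLimited("too many calls this minute")
        downgraded = self.used >= self.downgrade_at * self.monthly_limit
        self.recent.append(now)
        self.used += 1
        return "haiku" if downgraded else "sonnet"


_cache: Dict[str, str] = {}


def cached_answer(question: str, generate: Callable[[str], str]) -> str:
    """Serve repeated questions from the cache; call the model only on a miss."""
    key = " ".join(question.lower().split())  # normalise case and whitespace
    if key not in _cache:
        _cache[key] = generate(question)
    return _cache[key]


gate = UsageGate(monthly_limit=10, max_calls_per_minute=100)
models = [gate.admit(now=float(i)) for i in range(10)]
print(models)  # eight "sonnet" entries, then two "haiku"
```

Blocked calls are not counted against either limit here, and the `LimitReached` handler is the natural place to offer a credit top-up. The exact-match cache key is a deliberate simplification; matching paraphrased questions would need a different technique.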
