Product

Cost intelligence for AI products.

Start with workflow attribution, routing intelligence, and pre-launch cost modeling. Keep budgets, policy, and reporting underneath as the control layer.

Active development: Cost by workflow, routing recommendations, and the simulator are in progress for Stipend Cloud. The control layer below is live today.

01 — Cost by Workflow

See which feature is driving the bill.

Tag requests with labels like feature, agent, or environment. Break down spend by any dimension you define, drill into model mix inside a workflow, and watch the trend before and after a release.

Active development: Built on label-aware usage events and a dedicated cost-by-workflow dashboard.
Stipend — Cost by Workflow · April 2026
Current billing period, grouped by feature with drilldown by model
feature · agent · environment

Labeled requests: 182,440 (94% of enforced traffic)
Top workflow: Content generation, $4,776
Largest jump: +38% after v2.3 release
Unlabeled spend: $341 (kept separate)

Workflow | Labels | Requests | Top model | Spend | Trend
Content generation | feature=content-gen · env=prod | 54.2k | gpt-5.4 | $4,776 | +38%
Support copilot | agent=support-bot · env=prod | 72.1k | gpt-5.4 | $2,801 | +4%
Search reranker | feature=search-rerank · env=prod | 31.6k | claude-opus-4-6 | $657 | -12%

Label the request
Add labels like feature, agent, and environment so spend can be joined back to the work that caused it.
Drill into model mix
See which models are actually powering a workflow instead of guessing from a provider total at the account level.
Watch the trend
Compare releases and billing periods so cost spikes show up against the workflow that changed, not just the monthly invoice.
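
A minimal sketch of the grouping described above, assuming usage events arrive as records with a `labels` dictionary and a `cost_cents` field. The field names, the `spend_by` helper, and the sample events are hypothetical and are not Stipend's actual schema.

```python
from collections import defaultdict

UNLABELED = "(unlabeled)"

def spend_by(events, dimension):
    """Sum cost in cents per value of one label dimension.

    Events missing the label are grouped under UNLABELED so that
    unlabeled spend stays visible instead of being dropped.
    """
    totals = defaultdict(int)
    for event in events:
        key = event.get("labels", {}).get(dimension, UNLABELED)
        totals[key] += event["cost_cents"]
    return dict(totals)

# Hypothetical events; the label names mirror the examples above.
events = [
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 120},
    {"labels": {"feature": "search-rerank", "env": "prod"}, "model": "claude-opus-4-6", "cost_cents": 45},
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 80},
    {"labels": {}, "model": "gpt-5.4", "cost_cents": 30},
]

by_feature = spend_by(events, "feature")
by_env = spend_by(events, "env")
```

Calling `spend_by` with a different dimension (for example `"env"`) regroups the same events without changing how they were recorded, which is the point of tagging at request time.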
02 — Routing Recommendations

See where you're overpaying.

Stipend analyzes model mix and points out workloads that look overpriced for the output they need. The goal is simple: route cheap work cheaply and keep premium models where they actually matter.

Active development: Recommendations use enforced traffic history and become workflow-aware as label coverage improves.
Stipend — Routing Recommendations
Recommended move: 84% of your gpt-5.4 support traffic could run on gpt-5.4-mini.
Estimated savings: $1,420 / month with policy guardrails preserved.

Workflow | Requests this period | Current model | Recommended model | Estimated savings
Support copilot | 72.1k | gpt-5.4 | gpt-5.4-mini | $1,420 / mo
Search reranker | 31.6k | claude-opus-4-6 | claude-haiku-4-5 | $388 / mo
Draft summaries | 18.9k | gpt-5.4 | gemini-3.1-pro | $271 / mo

Workflow-aware suggestions
Recommendations get more useful when spend is already labeled by feature, agent, or environment instead of only at the account level.
Savings you can act on
See the estimated monthly impact before you touch production routing, not after another invoice lands.
Control stays underneath
Recommendations do not bypass policy. The control layer still decides which providers and models a team can use.
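
As a rough illustration of how a savings estimate like the ones above could be computed, the sketch below multiplies the share of traffic that could move by the per-request price gap. The function name, its parameters, and every number are assumptions for illustration. The source does not describe how Stipend derives its estimates.

```python
def estimated_savings_cents(requests, eligible_share, current_cents, candidate_cents):
    """Estimate savings if `eligible_share` of requests move to a cheaper model.

    `current_cents` and `candidate_cents` are average cost per request.
    """
    moved_requests = requests * eligible_share
    return moved_requests * (current_cents - candidate_cents)

# Hypothetical workload: 10,000 requests, half of them eligible to move,
# 4 cents per request today versus 1 cent on the candidate model.
savings_cents = estimated_savings_cents(
    requests=10_000, eligible_share=0.5, current_cents=4, candidate_cents=1
)
```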
03 — Cost Simulator

Model the cost before you ship.

Estimate daily, weekly, and monthly cost for a workflow before traffic goes live. Compare models side by side for the same workload instead of discovering the answer on an invoice.

Active development: Built on dry-run pricing estimates and side-by-side model comparison for the same request profile.
Stipend — Cost Simulator
Workload profile
Workflow: content-gen
Avg input tokens: 3,400
Avg output tokens: 920
Calls / day: 2,600
Environment: production

Compare models
Model | Daily | Monthly
gpt-5.4 | $182 | $5,460
gpt-5.4-mini | $41 | $1,230
claude-opus-4-6 | $136 | $4,080
gemini-3.1-pro | $29 | $870

Estimate before release
Project daily, weekly, and monthly AI cost for a workflow before it becomes a production surprise.
Compare side by side
Run the same request profile across multiple models so cost becomes part of the shipping decision, not only the invoice review.
Margin next
For teams reselling AI-powered features, margin visibility follows once labeled workflow data is live and trustworthy.
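
The projection itself is simple arithmetic. The sketch below reuses the workload profile from the mockup (3,400 input tokens, 920 output tokens, 2,600 calls per day) and assumes a 30-day month, which matches the daily-to-monthly ratio in the table. The per-token prices and the model names `model-a` and `model-b` are made up for illustration and are not real price lists. The `Workload`, `Price`, and `project` names are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Workload:
    avg_input_tokens: int
    avg_output_tokens: int
    calls_per_day: int

@dataclass(frozen=True)
class Price:
    # Dollars per one million tokens; hypothetical values only.
    input_per_mtok: float
    output_per_mtok: float

def project(workload, price, days_per_month=30):
    """Project daily, weekly, and monthly cost in dollars for one model."""
    per_call = (
        workload.avg_input_tokens * price.input_per_mtok
        + workload.avg_output_tokens * price.output_per_mtok
    ) / 1_000_000
    daily = per_call * workload.calls_per_day
    return {"daily": daily, "weekly": daily * 7, "monthly": daily * days_per_month}

profile = Workload(avg_input_tokens=3_400, avg_output_tokens=920, calls_per_day=2_600)
prices = {"model-a": Price(2.00, 8.00), "model-b": Price(0.40, 1.60)}

# Same request profile, different models: the side-by-side comparison.
comparison = {name: project(profile, price) for name, price in prices.items()}
```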
04 — Control Layer

The intelligence layer is new.
The controls still do the hard part.

Workflow visibility only matters if the underlying traffic is enforced and trustworthy. Budgets, provider policy, finance exports, audit logs, and access revocation are still the production foundation.

Budget & access (live today)
Set budget tiers, define provider access, and manage credentials from one control plane.
Real-time enforcement (live today)
Reserve budget before the request lands and reject overages before they hit the provider invoice.
Finance reporting (live today)
Export finance-ready reporting with attribution by team, role, cost center, provider, and model.
Control — Budget & Access

One policy.
Every team's compute.

Define compute budgets by job level, team, or individual. Admins can update policy directly today, assign the right providers to each group, and keep access controls in one place.

Stipend — Allocation Policy · Acme Corp
Budget Policy — 6 role tiers configured
Live today
Role | Tier | Monthly budget | Provider access
Engineering · Principal | Level 6+ | $3,000 / mo | OpenAI, Anthropic, Google (all models)
Engineering · Senior | Level 4–5 | $2,000 / mo | OpenAI, Anthropic
Engineering · Mid | Level 2–3 | $1,200 / mo | OpenAI, Anthropic
Product | Any level | $1,500 / mo | OpenAI, Anthropic
Design | Any level | $800 / mo | OpenAI (gpt-5.4, gpt-5.4-mini)
Support | Any level | $200 / mo | gpt-5.4-mini only

Budget by role tier
Define reusable budget tiers by job level, team, or employee. Apply them from the admin console without custom setup work.
Provider model allowlists
Define which providers and models each group can call. Glob patterns like claude-sonnet-* are supported. Requests for models outside the allowlist are blocked at the gateway before they leave.
Passthrough or resale
Bring your own API contracts or route through Stipend's resale layer. Either way: one credential per employee, all providers.
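
To illustrate how a glob-based allowlist check might behave, here is a small sketch using Python's standard `fnmatch` module. The `is_allowed` helper, the `example_allowlist` patterns, and the model name `claude-sonnet-4` are assumptions for illustration. The source does not say which glob dialect the gateway implements.

```python
from fnmatch import fnmatchcase

def is_allowed(model, allowlist):
    """Return True if the model name matches any glob pattern in the allowlist."""
    return any(fnmatchcase(model, pattern) for pattern in allowlist)

# Exact names, as in the Design tier of the policy table above.
design_allowlist = ["gpt-5.4", "gpt-5.4-mini"]

# Hypothetical allowlist using glob patterns.
example_allowlist = ["gpt-5.4*", "claude-sonnet-*"]

design_mini = is_allowed("gpt-5.4-mini", design_allowlist)
design_opus = is_allowed("claude-opus-4-6", design_allowlist)
glob_sonnet = is_allowed("claude-sonnet-4", example_allowlist)
glob_opus = is_allowed("claude-opus-4-6", example_allowlist)
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.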
Control — Real-Time Enforcement

Hard limits before the request lands.

The gateway isn't a monitor — it's a gatekeeper. Every AI call is checked against the employee's remaining balance synchronously, before it reaches the provider. Over budget means a clean rejection. No overages, ever.

Stipend Gateway — Request Lifecycle
Employee tool (Cursor, Claude, VS Code, API)
→ HTTPS → Stipend Gateway (auth, allowlist, budget reserve)
→ approved → OpenAI (api.openai.com)
→ response → Reconcile (actual cost charged, unused reserve returned to wallet)
01 Key auth (< 1ms)
Employee's Stipend key resolved to user, account, and wallet. Revoked keys rejected instantly with no round-trip.
02 Model allowlist
Requested model checked against account policy. Unapproved models are blocked before a reservation is even attempted.
03 Atomic reservation (~3ms avg)
Worst-case cost estimated and atomically decremented from wallet in one SQL transaction. Concurrent callers cannot race past the limit.
04 Forward to provider
Request proxied with the resolved API key. Works for both standard and streaming responses, transparent to the calling tool.
05 Reconcile cost
Actual tokens parsed from the provider response. Over-estimated reserve returned to wallet. Usage event written once, immutably.
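
Steps 03 and 05 can be sketched with a conditional `UPDATE`, shown here against an in-memory SQLite database. The `wallets` table, its columns, and the `reserve` and `reconcile` helpers are assumptions for illustration, not Stipend's actual schema or code. The sketch also does not handle the case where actual cost exceeds the reservation.

```python
import sqlite3

def reserve(conn, wallet_id, worst_case_cents):
    """Atomically hold worst_case_cents; return False if the balance is too low."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE wallets SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (worst_case_cents, wallet_id, worst_case_cents),
        )
    # The WHERE clause only matches when the balance covers the hold,
    # so zero updated rows means the reservation was refused.
    return cur.rowcount == 1

def reconcile(conn, wallet_id, reserved_cents, actual_cents):
    """Return the unused part of a reservation to the wallet."""
    surplus = max(reserved_cents - actual_cents, 0)
    with conn:
        conn.execute(
            "UPDATE wallets SET balance_cents = balance_cents + ? WHERE id = ?",
            (surplus, wallet_id),
        )
    return surplus

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wallets (id TEXT PRIMARY KEY, balance_cents INTEGER NOT NULL)")
conn.execute("INSERT INTO wallets VALUES ('w1', 100)")
conn.commit()

first = reserve(conn, "w1", 42)   # fits within the 100-cent balance
second = reserve(conn, "w1", 70)  # would overdraw the remaining 58 cents
surplus = reconcile(conn, "w1", reserved_cents=42, actual_cents=31)
balance = conn.execute("SELECT balance_cents FROM wallets WHERE id = 'w1'").fetchone()[0]
```

Because the balance check and the decrement happen in a single statement, two concurrent callers cannot both pass the check against the same funds.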
When budget is exhausted
POST /v1/chat/completions
HTTP 402 Payment Required

{
  "error": {
    "type": "budget_exhausted",
    "remaining_cents": 0,
    "budget_cents": 150000,
    "request_more_url": "https://app.stipend.dev/..."
  }
}
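
A caller might translate that rejection into a typed error, as in the sketch below. The `BudgetExhausted` class and `raise_for_budget` helper are hypothetical names. Only the status code and the JSON field names come from the example above, and the truncated URL is kept as shown.

```python
import json

class BudgetExhausted(Exception):
    """Raised when the gateway rejects a call because the budget is used up."""

    def __init__(self, remaining_cents, budget_cents, request_more_url):
        super().__init__(
            f"budget exhausted: {remaining_cents} of {budget_cents} cents remaining"
        )
        self.remaining_cents = remaining_cents
        self.budget_cents = budget_cents
        self.request_more_url = request_more_url

def raise_for_budget(status_code, body_text):
    """Raise BudgetExhausted for a 402 budget_exhausted body; otherwise do nothing."""
    if status_code != 402:
        return
    error = json.loads(body_text).get("error", {})
    if error.get("type") == "budget_exhausted":
        raise BudgetExhausted(
            error.get("remaining_cents"),
            error.get("budget_cents"),
            error.get("request_more_url"),
        )

# The body from the example above, serialized as the gateway would send it.
sample_body = json.dumps({
    "error": {
        "type": "budget_exhausted",
        "remaining_cents": 0,
        "budget_cents": 150000,
        "request_more_url": "https://app.stipend.dev/...",
    }
})
```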
Normal successful request
POST /v1/chat/completions
HTTP 200 OK

# Response is the unmodified provider
# payload - no wrapping, no schema
# changes. Your existing SDK works.

# Wallet updated asynchronously:
# reserved: $0.42 -> actual: $0.31
# surplus $0.11 returned to balance
No overages, guaranteed
The atomic SQL reserve means concurrent requests cannot exceed the wallet balance. The database is the source of truth, not a cache or a flag.
Stream-aware
Works with standard and streaming requests. Token counts parsed from SSE chunks as they arrive. Wallet reconciled on stream close.
Drop-in compatible
The gateway speaks the same request and response shape as OpenAI and Anthropic. Employees swap one base URL. No code changes required in their tools.
Control — Finance Reporting

Reports your CFO can actually use.

Every request carries full attribution: employee, team, cost center, provider, and model. Finance gets a clean breakdown it can export as a CSV and import into existing month-end workflows.

Stipend — AI Spend Report · March 2026
Total spent: $11,760 (+14% vs February)
Remaining budget: $1,940 of $13,700 total
Overages: $0 (100% enforced)
Active employees: 28 across 4 teams

Team | Cost center | Spent | Budget usage | Status
Engineering | CC-ENG-001 | $6,840 | 85% | On track
Product | CC-PROD-001 | $3,200 | 91% | Near limit
Design | CC-DES-001 | $1,240 | 77% | On track
Support | CC-SUP-001 | $480 | 80% | On track

Spend by provider: OpenAI $6,703 · Anthropic $4,117 · Google $940
Finance-ready CSV for AP and ERP import
Cost center attribution
Every token billed to the right team and role automatically. Allocations match your existing org chart, not a custom taxonomy you have to maintain.
Immutable audit trail
Every request logged once and never updated: who, what model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.
Monthly reports, automatic
Finance receives a structured summary on the first of each month. One-click export for AP. No dashboard to check, no manual pull required.
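
A cost-center export of this kind could look roughly like the sketch below, which rolls usage events up by cost center and writes CSV with Python's standard `csv` module. The column names, the event fields, and the `cost_center_csv` helper are assumptions for illustration. The source does not specify the layout of Stipend's export.

```python
import csv
import io
from collections import defaultdict

def cost_center_csv(events):
    """Aggregate spend per cost center and return it as CSV text."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["cost_center"], event["team"])] += event["cost_cents"]

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["cost_center", "team", "spent_usd"])
    for (cost_center, team), cents in sorted(totals.items()):
        writer.writerow([cost_center, team, f"{cents / 100:.2f}"])
    return buffer.getvalue()

# Hypothetical usage events; cost center codes follow the report above.
events = [
    {"cost_center": "CC-ENG-001", "team": "Engineering", "cost_cents": 1234},
    {"cost_center": "CC-SUP-001", "team": "Support", "cost_cents": 480},
    {"cost_center": "CC-ENG-001", "team": "Engineering", "cost_cents": 766},
]

report = cost_center_csv(events)
```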
<1 day: from signup to first enforced request
100%: of requests budget-checked before reaching the provider
CSV: monthly finance report, available today
1 click: admin action to revoke access today
Control Details

The production controls underneath the pull product.

Workflow attribution and optimization create the pull. These controls are what make the numbers enforceable, finance-ready, and safe to run in production.

Budget by Role
Define budget tiers by employee, team, or role and manage them directly from the admin console.
Provider Policy
Define which providers and models each team can access. Glob patterns supported. Requests to unapproved endpoints blocked before they leave the gateway.
Real-Time Enforcement
Every request checked against remaining balance synchronously. Hard limits, not soft alerts. No overages, no bill surprises at month end.
Audit Trail
Complete, immutable logs of every request: who, which model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.
Access Lifecycle
Invite employees, issue managed credentials, and revoke access immediately from the admin dashboard when usage should stop.
Cost Center Reporting
Every dollar attributed to a team, role, or cost center. Export finance-ready CSVs for AP or import them into your existing ERP workflow.
For AI-native product teams

See what each workflow costs before the invoice does.

We're onboarding founders, CTOs, and engineering leads who need cost-by-workflow visibility first. Routing recommendations and release modeling are in active development, with the control layer live today.

Request Access →