Start with workflow attribution, routing intelligence, and pre-launch cost modeling. Keep budgets, policy, and reporting underneath as the control layer.
Active development: Cost by workflow, routing recommendations, and the simulator are in progress for Stipend Cloud. The control layer below is live today.
01 — Cost by Workflow
See which feature is driving the bill.
Tag requests with labels like feature, agent, or environment. Break down spend by any dimension you define, drill into model mix inside a workflow, and watch the trend before and after a release.
Active development: Built on label-aware usage events and a dedicated cost-by-workflow dashboard.
Stipend — Cost by Workflow · April 2026
Current billing period
Grouped by feature with drilldown by model
feature · agent · environment
Labeled requests: 182,440 (94% of enforced traffic)
Top workflow: Content generation ($4,776)
Largest jump: +38% after v2.3 release
Unlabeled spend: $341 (kept separate)
Workflow | Requests | Top model | Spend | Trend
Content generation (feature=content-gen · env=prod) | 54.2k | gpt-5.4 | $4,776 | +38%
Support copilot (agent=support-bot · env=prod) | 72.1k | gpt-5.4 | $2,801 | +4%
Search reranker (feature=search-rerank · env=prod) | 31.6k | claude-opus-4-6 | $657 | -12%
Label the request
Add labels like feature, agent, and environment so spend can be joined back to the work that caused it.
Drill into model mix
See which models are actually powering a workflow instead of guessing from a provider total at the account level.
Watch the trend
Compare releases and billing periods so cost spikes show up against the workflow that changed, not just the monthly invoice.
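As a rough illustration of the breakdown described above, here is a minimal Python sketch that groups usage events by one label dimension and drills into model mix. The event shape, the field names (`labels`, `model`, `cost_cents`), and the sample values are assumptions made for this example, not Stipend's actual schema.

```python
from collections import defaultdict

# Hypothetical usage events; the field names are assumptions for illustration.
EVENTS = [
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 12},
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4-mini", "cost_cents": 2},
    {"labels": {"agent": "support-bot", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 5},
    {"labels": {}, "model": "gpt-5.4", "cost_cents": 3},
]

UNLABELED = "(unlabeled)"


def spend_by(events, dimension):
    """Sum cost per value of one label dimension.

    Events that do not carry the dimension are kept in a separate bucket
    instead of being dropped or folded into another workflow.
    """
    totals = defaultdict(int)
    for event in events:
        key = event["labels"].get(dimension, UNLABELED)
        totals[key] += event["cost_cents"]
    return dict(totals)


def model_mix(events, dimension, value):
    """Drill down: cost per model inside one labeled workflow."""
    mix = defaultdict(int)
    for event in events:
        if event["labels"].get(dimension) == value:
            mix[event["model"]] += event["cost_cents"]
    return dict(mix)
```

Calling `spend_by(EVENTS, "feature")` groups by feature, and `model_mix(EVENTS, "feature", "content-gen")` shows which models sit behind that one workflow. Costs are held as integer cents here only to keep the arithmetic exact.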
02 — Routing Recommendations
See where you're overpaying.
Stipend analyzes model mix and points out workloads that look overpriced for the output they need. The goal is simple: route cheap work cheaply and keep premium models where they actually matter.
Active development: Recommendations use enforced traffic history and become workflow-aware as label coverage improves.
Stipend — Routing Recommendations
Recommended move
84% of your gpt-5.4 support traffic could run on gpt-5.4-mini.
Estimated savings: $1,420 / month with policy guardrails preserved.
Recommendations get more useful when spend is already labeled by feature, agent, or environment instead of only at the account level.
Savings you can act on
See the estimated monthly impact before you touch production routing, not after another invoice lands.
Control stays underneath
Recommendations do not bypass policy. The control layer still decides which providers and models a team can use.
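One way to think about a savings estimate is as simple arithmetic: current spend, times the share of that spend that could move, times the price difference. The Python sketch below shows that arithmetic under stated assumptions. The function names, the `price_ratio` input, and the policy check are hypothetical, and this is not a description of how Stipend computes its recommendations.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Recommendation:
    workflow: str
    from_model: str
    to_model: str
    monthly_savings_usd: float


def estimate_monthly_savings(monthly_spend_usd, movable_share, price_ratio):
    """Savings if `movable_share` of the spend moves to a cheaper model.

    price_ratio is cheaper-model cost divided by current-model cost for the
    same work (an assumed input; 0.25 means the target costs a quarter).
    """
    if not 0 <= movable_share <= 1:
        raise ValueError("movable_share must be between 0 and 1")
    if price_ratio < 0:
        raise ValueError("price_ratio must not be negative")
    return monthly_spend_usd * movable_share * (1 - price_ratio)


def recommend(workflow, from_model, to_model, monthly_spend_usd,
              movable_share, price_ratio,
              allowed_models: Set[str]) -> Optional[Recommendation]:
    """Suggest a move only if policy already allows the target model."""
    if to_model not in allowed_models:
        return None  # recommendations never bypass the allowlist
    savings = estimate_monthly_savings(monthly_spend_usd, movable_share, price_ratio)
    if savings <= 0:
        return None  # nothing to gain, so nothing to recommend
    return Recommendation(workflow, from_model, to_model, savings)
```

The policy check comes first on purpose: a cheaper model that the team is not allowed to call is not a usable recommendation.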
03 — Cost Simulator
Model the cost before you ship.
Estimate daily, weekly, and monthly cost for a workflow before traffic goes live. Compare models side by side for the same workload instead of discovering the answer on an invoice.
Active development: Built on dry-run pricing estimates and side-by-side model comparison for the same request profile.
Stipend — Cost Simulator
Workload profile
Workflow: content-gen
Avg input tokens: 3,400
Avg output tokens: 920
Calls / day: 2,600
Environment: production
Compare models
Model | Daily | Monthly
gpt-5.4 | $182 | $5,460
gpt-5.4-mini | $41 | $1,230
claude-opus-4-6 | $136 | $4,080
gemini-3.1-pro | $29 | $870
Estimate before release
Project daily, weekly, and monthly AI cost for a workflow before it becomes a production surprise.
Compare side by side
Run the same request profile across multiple models so cost becomes part of the shipping decision, not only the invoice review.
Margin next
For teams reselling AI-powered features, margin visibility follows once labeled workflow data is live and trustworthy.
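The projection itself can be sketched in a few lines of Python. The workload numbers below come from the sample profile above; the model names and per-million-token prices are placeholders invented for this example, not real list prices. The 30-day month mirrors the sample table, where each monthly figure is 30 times the daily one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    avg_input_tokens: int
    avg_output_tokens: int
    calls_per_day: int


# Hypothetical prices in USD per million tokens: (input, output).
# These are placeholders, not real provider pricing.
PRICES = {
    "model-large": (10.0, 30.0),
    "model-small": (1.0, 4.0),
}


def project_cost(profile, input_price, output_price):
    """Project daily, weekly, and monthly cost for one model."""
    per_call = (
        profile.avg_input_tokens * input_price
        + profile.avg_output_tokens * output_price
    ) / 1_000_000
    daily = per_call * profile.calls_per_day
    return {
        "daily": round(daily, 2),
        "weekly": round(daily * 7, 2),
        "monthly": round(daily * 30, 2),  # 30-day month, as in the sample table
    }


def compare(profile, prices=PRICES):
    """Run the same request profile across every model in the price table."""
    return {model: project_cost(profile, *price) for model, price in prices.items()}


# The sample workload profile shown above.
CONTENT_GEN = WorkloadProfile(avg_input_tokens=3400, avg_output_tokens=920, calls_per_day=2600)
```

`compare(CONTENT_GEN)` returns one projection per model, so the same profile can be read side by side before any traffic goes live.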
04 — Control Layer
The intelligence layer is new. The controls still do the hard part.
Workflow visibility only matters if the underlying traffic is enforced and trustworthy. Budgets, provider policy, finance exports, audit logs, and access revocation are still the production foundation.
Define compute budgets by job level, team, or individual. Admins can update policy directly today, assign the right providers to each group, and keep access controls in one place.
Stipend — Allocation Policy · Acme Corp
Budget Policy — 6 role tiers configured
Live today
Role Tier | Monthly Budget | Provider Access
Engineering · Principal (Level 6+) | $3,000 / mo | OpenAI, Anthropic, Google · All models
Engineering · Senior (Level 4–5) | $2,000 / mo | OpenAI, Anthropic
Engineering · Mid (Level 2–3) | $1,200 / mo | OpenAI, Anthropic
Product (Any level) | $1,500 / mo | OpenAI, Anthropic
Design (Any level) | $800 / mo | OpenAI · gpt-5.4, gpt-5.4-mini
Support (Any level) | $200 / mo | gpt-5.4-mini only
Budget by role tier
Define reusable budget tiers by job level, team, or employee. Apply them from the admin console without custom setup work.
Provider model allowlists
Define which providers and models each group can call. Glob patterns like claude-sonnet-* are supported. Disallowed requests are blocked at the gateway before they leave.
Passthrough or resale
Bring your own API contracts or route through Stipend's resale layer. Either way: one credential per employee, all providers.
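To make the glob allowlist idea concrete, here is a minimal Python sketch using the standard library's `fnmatch` module. The group names and the policy table are hypothetical, and Stipend's exact pattern semantics may differ from `fnmatch`.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: group name -> allowed model patterns.
POLICY = {
    "design": ["gpt-5.4", "gpt-5.4-mini"],
    "support": ["gpt-5.4-mini"],
    "engineering-senior": ["gpt-*", "claude-sonnet-*"],
}


def is_allowed(group, model, policy=POLICY):
    """Return True if any of the group's patterns matches the model name.

    Unknown groups have no patterns, so they are denied by default.
    """
    patterns = policy.get(group, [])
    return any(fnmatchcase(model, pattern) for pattern in patterns)
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform. Denying unknown groups by default keeps a missing policy from turning into open access.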
Control — Real-Time Enforcement
Hard limits before the request lands.
The gateway isn't a monitor — it's a gatekeeper. Every AI call is checked against the employee's remaining balance synchronously, before it reaches the provider. Over budget means a clean rejection. No overages, ever.
Stipend Gateway — Request Lifecycle
Employee tool (Cursor, Claude, VS Code, API)
→ HTTPS → Stipend Gateway (Auth · Allowlist · Budget reserve)
→ Approved → OpenAI (api.openai.com)
→ Response → Reconcile (actual cost charged, over-estimated reserve returned to wallet)
01 · Key auth (< 1ms)
Employee's Stipend key resolved to user, account, and wallet. Revoked keys rejected instantly with no round-trip.
02 · Model allowlist
Requested model checked against account policy. Unapproved models are blocked before a reservation is even attempted.
03 · Atomic reservation (~3ms avg)
Worst-case cost estimated and atomically decremented from wallet in one SQL transaction. Concurrent callers cannot race past the limit.
04 · Forward to provider
Request proxied with the resolved API key. Works for both standard and streaming responses, transparent to the calling tool.
05 · Reconcile cost
Actual tokens parsed from the provider response. Over-estimated reserve returned to wallet. Usage event written once, immutably.
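The reserve-then-reconcile pattern in steps 03 and 05 can be sketched with a single conditional UPDATE. The Python example below uses an in-memory SQLite database as a stand-in; the table name, column names, and helper functions are assumptions for illustration, since the page only states that the reservation happens in one SQL transaction.

```python
import sqlite3


def open_db():
    """Create an in-memory wallet table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE wallets (id TEXT PRIMARY KEY, balance_cents INTEGER NOT NULL)"
    )
    return db


def reserve(db, wallet_id, worst_case_cents):
    """Decrement the wallet only if the balance covers the worst case.

    The balance check and the decrement are one statement, so two callers
    cannot both pass the check against the same remaining balance.
    """
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "UPDATE wallets SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (worst_case_cents, wallet_id, worst_case_cents),
        )
        return cur.rowcount == 1


def reconcile(db, wallet_id, reserved_cents, actual_cents):
    """Return the over-estimated part of the reservation to the wallet."""
    refund = max(reserved_cents - actual_cents, 0)
    with db:
        db.execute(
            "UPDATE wallets SET balance_cents = balance_cents + ? WHERE id = ?",
            (refund, wallet_id),
        )
    return refund


def handle_request(db, wallet_id, worst_case_cents, actual_cents):
    """Return an HTTP-style status: 402 if the reservation fails, else 200."""
    if not reserve(db, wallet_id, worst_case_cents):
        return 402  # budget exhausted, request never reaches the provider
    # The call to the provider would happen here; actual_cents stands in
    # for the cost parsed from its response.
    reconcile(db, wallet_id, worst_case_cents, actual_cents)
    return 200
```

Because the check lives in the `WHERE` clause, a zero `rowcount` means the wallet could not cover the request, and nothing was deducted.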
When budget is exhausted
POST /v1/chat/completions → HTTP 402 Payment Required
The atomic SQL reserve means concurrent requests cannot exceed the wallet balance. The database is the source of truth, not a cache or a flag.
Stream-aware
Works with standard and streaming requests. Token counts parsed from SSE chunks as they arrive. Wallet reconciled on stream close.
Drop-in compatible
The gateway speaks the same request and response shape as OpenAI and Anthropic. Employees swap one base URL. No code changes required in their tools.
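For the streaming case, token counts have to be read out of the event stream rather than a single response body. The Python sketch below assumes an OpenAI-style stream in which each event is a `data:` line holding JSON, the stream ends with `data: [DONE]`, and a late chunk carries a `usage` object. That chunk shape and the function name are assumptions for this example; other providers report usage differently.

```python
import json


def usage_from_sse(lines):
    """Return (prompt_tokens, completion_tokens) from the last chunk that
    carries a usage object, or None if no chunk reports usage."""
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    if usage is None:
        return None
    return usage["prompt_tokens"], usage["completion_tokens"]


# A small hand-written stream in the assumed shape.
SAMPLE_STREAM = [
    'data: {"choices": [{"delta": {"content": "Hi"}}], "usage": null}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 3}}',
    "",
    "data: [DONE]",
]
```

Once the stream closes, the parsed counts would feed the same reconcile step used for non-streaming responses.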
Control — Finance Reporting
Reports your CFO can actually use.
Every request carries full attribution: employee, team, cost center, provider, and model. Finance gets a clean breakdown they can export as a CSV and import into existing month-end workflows.
Stipend — AI Spend Report · March 2026
Total Spent: $11,760 (+14% vs February)
Remaining Budget: $1,940 of $13,700 total
Overages: $0 (100% enforced)
Active Employees: 28 across 4 teams
Team | Spent | Budget usage | Status
Engineering (CC-ENG-001) | $6,840 | 85% | On track
Product (CC-PROD-001) | $3,200 | 91% | Near limit
Design (CC-DES-001) | $1,240 | 77% | On track
Support (CC-SUP-001) | $480 | 80% | On track
OpenAI: $6,703
Anthropic: $4,117
Google: $940
Finance-ready CSV for AP and ERP import
Cost center attribution
Every token billed to the right team and role automatically. Allocations match your existing org chart, not a custom taxonomy you have to maintain.
Immutable audit trail
Every request logged once and never updated: who, what model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.
Monthly reports, automatic
Finance receives a structured summary on the first of each month. One-click export for AP. No dashboard to check, no manual pull required.
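A finance export along these lines can be produced with Python's standard `csv` module. The sketch below uses the team figures from the sample March 2026 report above; the column names and the trailing total row are assumptions, not Stipend's actual export format.

```python
import csv
import io

# Team figures from the sample report; column names are hypothetical.
ROWS = [
    {"team": "Engineering", "cost_center": "CC-ENG-001", "spent_usd": 6840},
    {"team": "Product", "cost_center": "CC-PROD-001", "spent_usd": 3200},
    {"team": "Design", "cost_center": "CC-DES-001", "spent_usd": 1240},
    {"team": "Support", "cost_center": "CC-SUP-001", "spent_usd": 480},
]


def to_csv(rows):
    """Render one row per cost center plus a total row, as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["team", "cost_center", "spent_usd"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    writer.writerow(
        {
            "team": "TOTAL",
            "cost_center": "",
            "spent_usd": sum(row["spent_usd"] for row in rows),
        }
    )
    return buf.getvalue()
```

The total row is a quick consistency check: the four team figures should add up to the report's total spend.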
<1 day from signup to first enforced request
100% of requests budget-checked before reaching the provider
CSV: monthly finance report available today
1 click: admin action to revoke access today
Control Details
The production controls underneath the pull product.
Workflow attribution and optimization create the pull. These controls are what make the numbers enforceable, finance-ready, and safe to run in production.
Budget by Role
Define budget tiers by employee, team, or role and manage them directly from the admin console.
Provider Policy
Define which providers and models each team can access. Glob patterns supported. Requests to unapproved endpoints blocked before they leave the gateway.
Real-Time Enforcement
Every request checked against remaining balance synchronously. Hard limits, not soft alerts. No overages, no bill surprises at month end.
Audit Trail
Complete, immutable logs of every request: who, which model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.
Access Lifecycle
Invite employees, issue managed credentials, and revoke access immediately from the admin dashboard when usage should stop.
Cost Center Reporting
Every dollar attributed to a team, role, or cost center. Export finance-ready CSVs for AP or import them into your existing ERP workflow.
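One possible way to get the write-once behavior described under Audit Trail is to enforce it in the storage layer. The Python sketch below uses SQLite triggers that abort any UPDATE or DELETE on the log table. The schema, trigger names, and helper functions are hypothetical, and the page does not say how Stipend implements immutability.

```python
import sqlite3

# Hypothetical schema: an append-only usage log guarded by triggers.
SCHEMA = """
CREATE TABLE usage_events (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    model TEXT NOT NULL,
    tokens INTEGER NOT NULL,
    cost_cents INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER usage_events_no_update BEFORE UPDATE ON usage_events
BEGIN
    SELECT RAISE(ABORT, 'usage_events is write-once');
END;
CREATE TRIGGER usage_events_no_delete BEFORE DELETE ON usage_events
BEGIN
    SELECT RAISE(ABORT, 'usage_events is write-once');
END;
"""


def open_audit_db():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def record(db, user_id, model, tokens, cost_cents):
    """Append one usage event; inserts are the only write allowed."""
    with db:
        db.execute(
            "INSERT INTO usage_events (user_id, model, tokens, cost_cents) "
            "VALUES (?, ?, ?, ?)",
            (user_id, model, tokens, cost_cents),
        )


def try_tamper(db):
    """Attempt to rewrite history; returns True only if the update went through."""
    try:
        with db:
            db.execute("UPDATE usage_events SET cost_cents = 0")
    except sqlite3.DatabaseError:
        return False  # the trigger aborted the statement
    return True
```

With the triggers in place, corrections have to be appended as new events rather than edits, which keeps the original record intact for a compliance review.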
For AI-native product teams
See what each workflow costs before the invoice does.
Join the alpha for cost-by-workflow visibility. We're onboarding founders, CTOs, and engineering leads who need feature-level attribution first. Routing recommendations and release modeling are in active development, with the control layer live today.