Features — Cost by Workflow, Routing Intelligence, and Control

01 — Cost by Workflow

See which feature is
driving the bill.

Tag requests with labels like feature, agent, or environment. Break down spend by any dimension you define, drill into model mix inside a workflow, and watch the trend before and after a release.

Active development: built on label-aware usage events and a dedicated cost-by-workflow dashboard.

Stipend — Cost by Workflow · April 2026

Current billing period, grouped by feature with drilldown by model

Label dimensions: feature, agent, environment

Labeled requests: 182,440 (94% of enforced traffic)

Top workflow: Content generation ($4,776)

Largest jump: +38% after v2.3 release

Unlabeled spend: $341 (kept separate)

Workflow | Labels | Requests | Top model | Spend | Trend
Content generation | feature=content-gen · env=prod | 54.2k | gpt-5.4 | $4,776 | +38%
Support copilot | agent=support-bot · env=prod | 72.1k | gpt-5.4 | $2,801 | +4%
Search reranker | feature=search-rerank · env=prod | 31.6k | claude-opus-4-6 | $657 | -12%

Label the request

Add labels like feature, agent, and environment so spend can be joined back to the work that caused it.

Drill into model mix

See which models are actually powering a workflow instead of guessing from a provider total at the account level.

Watch the trend

Compare releases and billing periods so cost spikes show up against the workflow that changed, not just the monthly invoice.
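As a rough illustration of the breakdown described above, the sketch below groups usage events by one label and then drills into the model mix for a single workflow. The event shape (`labels`, `model`, `cost_cents`) and the sample values are assumptions made for this example, not Stipend's actual schema.

```python
from collections import defaultdict

# Hypothetical usage events; field names and amounts are illustrative only.
events = [
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 120},
    {"labels": {"feature": "content-gen", "env": "prod"}, "model": "gpt-5.4-mini", "cost_cents": 15},
    {"labels": {"agent": "support-bot", "env": "prod"}, "model": "gpt-5.4", "cost_cents": 80},
    {"labels": {}, "model": "gpt-5.4", "cost_cents": 30},
]


def spend_by(events, dimension):
    """Sum cost per value of one label; events without that label count as 'unlabeled'."""
    totals = defaultdict(int)
    for event in events:
        key = event["labels"].get(dimension, "unlabeled")
        totals[key] += event["cost_cents"]
    return dict(totals)


def model_mix(events, dimension, value):
    """Drill into one workflow: cost per model for events labeled dimension=value."""
    mix = defaultdict(int)
    for event in events:
        if event["labels"].get(dimension) == value:
            mix[event["model"]] += event["cost_cents"]
    return dict(mix)
```

Keeping unlabeled spend in its own bucket mirrors the dashboard, where unlabeled spend is reported separately rather than spread across workflows.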

02 — Routing Recommendations

See where you're
overpaying.

Stipend analyzes model mix and points out workloads that look overpriced for the output they need. The goal is simple: route cheap work cheaply and keep premium models where they actually matter.

Active development: recommendations use enforced traffic history and become workflow-aware as label coverage improves.

Stipend — Routing Recommendations

Recommended move

84% of your gpt-5.4 support traffic could run on gpt-5.4-mini.

Estimated savings: $1,420 / month with policy guardrails preserved.

Workflow | Requests this period | Current model | Recommended model | Estimated savings
Support copilot | 72.1k | gpt-5.4 | gpt-5.4-mini | $1,420 / mo
Search reranker | 31.6k | claude-opus-4-6 | claude-haiku-4-5 | $388 / mo
Draft summaries | 18.9k | gpt-5.4 | gemini-3.1-pro | $271 / mo

Workflow-aware suggestions

Recommendations get more useful when spend is already labeled by feature, agent, or environment instead of only at the account level.

Savings you can act on

See the estimated monthly impact before you touch production routing, not after another invoice lands.

Control stays underneath

Recommendations do not bypass policy. The control layer still decides which providers and models a team can use.
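One way to picture a recommendation that stays under policy is the sketch below. It is not Stipend's recommendation logic: the function name, the `eligible_share` and `price_ratio` inputs, and the workflow fields are all assumptions chosen to show the arithmetic and the allowlist check together.

```python
def recommend(workflow, candidate_model, allowed_models, eligible_share, price_ratio):
    """Suggest moving part of a workflow to a cheaper model, or None if policy forbids it.

    eligible_share: fraction of the workflow's spend judged safe to move (0..1).
    price_ratio: candidate cost divided by current cost for the same request (0..1).
    """
    if candidate_model not in allowed_models:
        return None  # a recommendation never bypasses the allowlist
    if not (0 <= eligible_share <= 1 and 0 <= price_ratio <= 1):
        raise ValueError("eligible_share and price_ratio must be between 0 and 1")
    savings = workflow["monthly_spend_usd"] * eligible_share * (1 - price_ratio)
    return {
        "workflow": workflow["name"],
        "from": workflow["model"],
        "to": candidate_model,
        "estimated_monthly_savings_usd": round(savings, 2),
    }
```

The inputs here are placeholders. Real estimates would depend on measured traffic and current provider prices.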

03 — Cost Simulator

Model the cost
before you ship.

Estimate daily, weekly, and monthly cost for a workflow before traffic goes live. Compare models side by side for the same workload instead of discovering the answer on an invoice.

Active development: built on dry-run pricing estimates and side-by-side model comparison for the same request profile.

Stipend — Cost Simulator

Workload profile

Workflow: content-gen
Avg input tokens: 3,400
Avg output tokens: 920
Calls / day: 2,600
Environment: production

Compare models

Model | Daily | Monthly
gpt-5.4 | $182 | $5,460
gpt-5.4-mini | $41 | $1,230
claude-opus-4-6 | $136 | $4,080
gemini-3.1-pro | $29 | $870

Estimate before release

Project daily, weekly, and monthly AI cost for a workflow before it becomes a production surprise.

Compare side by side

Run the same request profile across multiple models so cost becomes part of the shipping decision, not only the invoice review.

Margin next

For teams reselling AI-powered features, margin visibility follows once labeled workflow data is live and trustworthy.
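The simulator's projection can be approximated with a few lines of arithmetic. The sketch below assumes per-million-token pricing and a 30-day month (the monthly figures in the comparison above are 30 times the daily ones). The function and parameter names are illustrative, and any prices passed in are placeholders rather than real provider rates.

```python
def estimate_cost(avg_input_tokens, avg_output_tokens, calls_per_day,
                  input_usd_per_mtok, output_usd_per_mtok, days_per_month=30):
    """Project daily, weekly, and monthly cost in USD for one request profile."""
    per_call = (
        avg_input_tokens * input_usd_per_mtok
        + avg_output_tokens * output_usd_per_mtok
    ) / 1_000_000
    daily = per_call * calls_per_day
    return {
        "daily": round(daily, 2),
        "weekly": round(daily * 7, 2),
        "monthly": round(daily * days_per_month, 2),
    }


def compare_models(profile, prices):
    """Run the same profile against several (input, output) price pairs."""
    return {
        model: estimate_cost(**profile,
                             input_usd_per_mtok=in_price,
                             output_usd_per_mtok=out_price)
        for model, (in_price, out_price) in prices.items()
    }
```

Calling `compare_models` with the workload profile above and a dictionary of hypothetical prices returns one projection per model, which is the side-by-side view the simulator presents.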

04 — Control Layer

The intelligence layer is new.
The controls still do the hard part.

Workflow visibility only matters if the underlying traffic is enforced and trustworthy. Budgets, provider policy, finance exports, audit logs, and access revocation are still the production foundation.

Budget & access (live today)

Set budget tiers, define provider access, and manage credentials from one control plane.

Real-time enforcement (live today)

Reserve budget before the request lands and reject overages before they hit the provider invoice.

Finance reporting (live today)

Export finance-ready reporting with attribution by team, role, cost center, provider, and model.

Control — Budget & Access

One policy.
Every team's compute.

Define compute budgets by job level, team, or individual. Admins can update policy directly today, assign the right providers to each group, and keep access controls in one place.

Stipend — Allocation Policy · Acme Corp

Budget Policy — 6 role tiers configured

Live today

Role | Tier | Monthly budget | Provider access
Engineering · Principal | Level 6+ | $3,000 / mo | OpenAI, Anthropic, Google (all models)
Engineering · Senior | Level 4–5 | $2,000 / mo | OpenAI, Anthropic
Engineering · Mid | Level 2–3 | $1,200 / mo | OpenAI, Anthropic
Product | Any level | $1,500 / mo | OpenAI, Anthropic
Design | Any level | $800 / mo | OpenAI: gpt-5.4, gpt-5.4-mini
Support | Any level | $200 / mo | gpt-5.4-mini only
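To make the tier table concrete, here is a minimal sketch of resolving an employee's monthly budget from team and job level. The `Tier` structure and `resolve_tier` helper are hypothetical. Only the budgets and level ranges come from the example policy above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    team: str
    min_level: int
    max_level: Optional[int]  # None means no upper bound
    monthly_budget_usd: int


# Budgets and level ranges copied from the example policy; "Any level" is modeled as 0+.
TIERS = [
    Tier("Engineering", 6, None, 3000),
    Tier("Engineering", 4, 5, 2000),
    Tier("Engineering", 2, 3, 1200),
    Tier("Product", 0, None, 1500),
    Tier("Design", 0, None, 800),
    Tier("Support", 0, None, 200),
]


def resolve_tier(team: str, level: int) -> Optional[Tier]:
    """Return the first tier matching the team and level, or None if nothing applies."""
    for tier in TIERS:
        in_range = tier.min_level <= level and (tier.max_level is None or level <= tier.max_level)
        if tier.team == team and in_range:
            return tier
    return None
```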

Budget by role tier

Define reusable budget tiers by job level, team, or employee. Apply them from the admin console without custom setup work.

Provider model allowlists

Define which providers and models each group can call. Glob patterns like claude-sonnet-* are supported. Blocked at the gateway before the request leaves.

Passthrough or resale

Bring your own API contracts or route through Stipend's resale layer. Either way: one credential per employee, all providers.
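The glob-based allowlist described above can be sketched with Python's standard `fnmatch` module. This only illustrates the matching idea: Stipend's exact glob semantics are not specified here, and the model name `claude-sonnet-example` is a made-up placeholder.

```python
from fnmatch import fnmatchcase


def is_allowed(model: str, patterns: list[str]) -> bool:
    """True if the requested model matches any allowlist pattern (case-sensitive)."""
    return any(fnmatchcase(model, pattern) for pattern in patterns)


# Example policy: one glob pattern plus one exact model name.
design_policy = ["claude-sonnet-*", "gpt-5.4-mini"]
```

A gateway would run a check like this before reserving budget, so a blocked model never touches the wallet.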

Control — Real-Time Enforcement

Hard limits before
the request lands.

The gateway is a gatekeeper, not a monitor. Every AI call is checked synchronously against the employee's remaining balance before it reaches the provider. A request that would exceed the budget gets a clean rejection, so overages cannot occur.

Stipend Gateway — Request Lifecycle

Employee tool (Cursor, Claude, VS Code, API) -> HTTPS -> Stipend Gateway (Auth · Allowlist · Budget reserve) -> Approved -> OpenAI (api.openai.com) -> Response -> Reconcile (actual cost returned to wallet)

Key auth (< 1ms)

The employee's Stipend key is resolved to a user, account, and wallet. Revoked keys are rejected instantly with no round-trip.

Model allowlist

The requested model is checked against account policy. Unapproved models are blocked before a reservation is even attempted.

Atomic reservation (~3ms avg)

The worst-case cost is estimated and atomically decremented from the wallet in one SQL transaction. Concurrent callers cannot race past the limit.

Forward to provider

The request is proxied with the resolved API key. This works for both standard and streaming responses and is transparent to the calling tool.

Reconcile cost

Actual tokens are parsed from the provider response. The over-estimated part of the reserve is returned to the wallet. The usage event is written once, immutably.
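The reserve and reconcile steps above can be sketched with a conditional `UPDATE`, shown here against an in-memory SQLite database. The table and column names are assumptions for the example, and a production gateway would presumably use its own schema and database. The point is that the balance check and the decrement happen in a single statement.

```python
import sqlite3


def open_wallet_db(balance_cents: int) -> sqlite3.Connection:
    """Create an in-memory database with one wallet (id 1) for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wallets ("
        " id INTEGER PRIMARY KEY,"
        " balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
    )
    conn.execute("INSERT INTO wallets (id, balance_cents) VALUES (1, ?)", (balance_cents,))
    conn.commit()
    return conn


def reserve(conn: sqlite3.Connection, wallet_id: int, worst_case_cents: int) -> bool:
    """Decrement the wallet only if the balance covers the estimate; True on success."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE wallets SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (worst_case_cents, wallet_id, worst_case_cents),
        )
    return cur.rowcount == 1


def reconcile(conn: sqlite3.Connection, wallet_id: int,
              reserved_cents: int, actual_cents: int) -> int:
    """Return the over-estimated part of a reservation to the wallet."""
    surplus = max(reserved_cents - actual_cents, 0)
    with conn:
        conn.execute(
            "UPDATE wallets SET balance_cents = balance_cents + ? WHERE id = ?",
            (surplus, wallet_id),
        )
    return surplus


def balance(conn: sqlite3.Connection, wallet_id: int) -> int:
    row = conn.execute("SELECT balance_cents FROM wallets WHERE id = ?", (wallet_id,)).fetchone()
    return row[0]
```

Because the `WHERE` clause carries the balance check, a second caller that arrives after the balance has dropped updates zero rows and is rejected, rather than reading a stale balance and overspending.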

When budget is exhausted

POST /v1/chat/completions
HTTP 402 Payment Required

{
  "error": {
    "type": "budget_exhausted",
    "remaining_cents": 0,
    "budget_cents": 150000,
    "request_more_url": "https://app.stipend.dev/..."
  }
}
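On the calling side, a client can turn the rejection body shown above into a typed error. The helper below is a hypothetical sketch: the exception class and function name are not part of any Stipend SDK, and it relies only on the JSON fields visible in the example response.

```python
import json


class BudgetExhausted(Exception):
    """Raised when the gateway answers 402 with type 'budget_exhausted'."""

    def __init__(self, remaining_cents, budget_cents, request_more_url):
        super().__init__(
            f"budget exhausted: {remaining_cents} of {budget_cents} cents remaining"
        )
        self.remaining_cents = remaining_cents
        self.budget_cents = budget_cents
        self.request_more_url = request_more_url


def check_gateway_response(status_code: int, body_text: str) -> dict:
    """Return the parsed body, or raise BudgetExhausted for the 402 shape above."""
    payload = json.loads(body_text)
    if status_code == 402:
        error = payload.get("error", {})
        if error.get("type") == "budget_exhausted":
            raise BudgetExhausted(
                error.get("remaining_cents"),
                error.get("budget_cents"),
                error.get("request_more_url"),
            )
    return payload
```

A tool could catch `BudgetExhausted` and show the `request_more_url` to the employee instead of surfacing a generic HTTP failure.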

Normal successful request

POST /v1/chat/completions
HTTP 200 OK

# Response is the unmodified provider
# payload - no wrapping, no schema
# changes. Your existing SDK works.

# Wallet updated asynchronously:
# reserved: $0.42 -> actual: $0.31
# surplus $0.11 returned to balance

No overages, guaranteed

The atomic SQL reserve means concurrent requests cannot exceed the wallet balance. The database is the source of truth, not a cache or a flag.

Stream-aware

Works with standard and streaming requests. Token counts parsed from SSE chunks as they arrive. Wallet reconciled on stream close.

Drop-in compatible

The gateway speaks the same request and response shape as OpenAI and Anthropic. Employees swap one base URL. No code changes required in their tools.
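For the stream-aware reconciliation mentioned above, one possible approach is to scan the server-sent event lines and keep the last usage object seen. The sketch assumes OpenAI-style chunks (`data: {...}` lines ending with `data: [DONE]`) in which a `usage` object appears on a chunk. Other providers format their streams differently, and this is not a description of Stipend's parser.

```python
import json
from typing import Iterable, Optional


def usage_from_sse(lines: Iterable[str]) -> Optional[dict]:
    """Return the last non-empty usage object found in an SSE stream, or None."""
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

Once the stream closes, the token counts from `usage` would feed the same reconcile step used for non-streaming responses.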

Control — Finance Reporting

Reports your CFO
can actually use.

Every request carries full attribution: employee, team, cost center, provider, and model. Finance gets a clean breakdown that exports as CSV and imports into existing month-end workflows.

Stipend — AI Spend Report · March 2026

Total Spent: $11,760 (+14% vs February)

Remaining Budget: $1,940 of $13,700 total

Overages: 100% enforced

Active Employees: across 4 teams

Team | Cost center | Spent | Budget usage | Status
Engineering | CC-ENG-001 | $6,840 | 85% | On track
Product | CC-PROD-001 | $3,200 | 91% | Near limit
Design | CC-DES-001 | $1,240 | 77% | On track
Support | CC-SUP-001 | $480 | 80% | On track

Spend by provider: OpenAI $6,703 · Anthropic $4,117 · Google $940

Finance-ready CSV for AP and ERP import
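A finance export of this kind might look like the sketch below, which writes the team totals from the March report to CSV with Python's standard `csv` module. The column names are assumptions for illustration, and the real export format may differ.

```python
import csv
import io

# Team totals taken from the March 2026 example report above.
REPORT_ROWS = [
    {"team": "Engineering", "cost_center": "CC-ENG-001", "spent_usd": 6840},
    {"team": "Product", "cost_center": "CC-PROD-001", "spent_usd": 3200},
    {"team": "Design", "cost_center": "CC-DES-001", "spent_usd": 1240},
    {"team": "Support", "cost_center": "CC-SUP-001", "spent_usd": 480},
]


def to_csv(rows) -> str:
    """Serialize report rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["team", "cost_center", "spent_usd"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```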

Cost center attribution

Every token billed to the right team and role automatically. Allocations match your existing org chart, not a custom taxonomy you have to maintain.

Immutable audit trail

Every request logged once and never updated: who, what model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.

Monthly reports, automatic

Finance receives a structured summary on the first of each month. One-click export for AP. No dashboard to check, no manual pull required.
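The write-once audit trail described above can be illustrated with database triggers that reject updates and deletes. This is a sketch on in-memory SQLite with a made-up `usage_events` table, not Stipend's storage design.

```python
import sqlite3


def open_audit_db() -> sqlite3.Connection:
    """Create an append-only usage_events table guarded by triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE usage_events (
            id INTEGER PRIMARY KEY,
            employee TEXT NOT NULL,
            model TEXT NOT NULL,
            tokens INTEGER NOT NULL,
            cost_cents INTEGER NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TRIGGER usage_events_no_update BEFORE UPDATE ON usage_events
        BEGIN SELECT RAISE(ABORT, 'usage_events is write-once'); END;
        CREATE TRIGGER usage_events_no_delete BEFORE DELETE ON usage_events
        BEGIN SELECT RAISE(ABORT, 'usage_events is write-once'); END;
        """
    )
    return conn


def log_event(conn: sqlite3.Connection, employee: str, model: str,
              tokens: int, cost_cents: int) -> None:
    """Append one usage event; rows cannot be changed afterwards."""
    with conn:
        conn.execute(
            "INSERT INTO usage_events (employee, model, tokens, cost_cents) "
            "VALUES (?, ?, ?, ?)",
            (employee, model, tokens, cost_cents),
        )
```

Enforcing immutability in the database rather than in application code means a buggy or compromised service path still cannot rewrite history.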

<1 day: from signup to first enforced request

100%: of requests budget-checked before reaching the provider

CSV: monthly finance report available today

1 click: admin action to revoke access today

Control Details

The production controls
underneath the pull product.

Workflow attribution and optimization create the pull. These controls are what make the numbers enforceable, finance-ready, and safe to run in production.

Budget by Role

Define budget tiers by employee, team, or role and manage them directly from the admin console.

Provider Policy

Define which providers and models each team can access. Glob patterns supported. Requests to unapproved endpoints blocked before they leave the gateway.

Real-Time Enforcement

Every request checked against remaining balance synchronously. Hard limits, not soft alerts. No overages, no bill surprises at month end.

Audit Trail

Complete, immutable logs of every request: who, which model, how many tokens, what cost, at what time. Write-once by design, built for compliance reviews.

Access Lifecycle

Invite employees, issue managed credentials, and revoke access immediately from the admin dashboard when usage should stop.

Cost Center Reporting

Every dollar attributed to a team, role, or cost center. Export finance-ready CSVs for AP or import them into your existing ERP workflow.

Cost intelligence for
AI products.

See what each workflow costs
before the invoice
does.
