AI Cost Control
Stop bad AI spend before the provider call with permit-based budget checks, hard limits, and block-before-bill enforcement.
Most teams only discover an AI cost problem after the invoice lands. The missing control is not another dashboard. It is a decision boundary that can deny, constrain, or reroute a request before tokens are consumed.
Keel turns spend control into an authorization decision. The permit carries the budget and routing context up front, then the closeout record reconciles the estimate with what actually ran.
What Brings Teams Here
An invoice spike lands after the damage is done
A retry loop, model swap, or tenant burst runs all weekend because nothing in the request path can stop it before tokens are consumed.
Provider budgets are scoped to the wrong unit
Key-level limits cannot express tenant, workload, or product rules cleanly when multiple teams and customers share the same provider account.
Finance wants hard limits, engineering has dashboards
Alerts, exports, and post-hoc reports explain where money went. They do not decide whether the next expensive request should run.
Why Current Cost Controls Fail
Dashboards are downstream
Provider billing data refreshes on the provider's schedule. By the time a graph moves, the expensive requests have already run.
Alerts create cleanup workflows
The path from bill export to pager to code-side kill switch is measured in minutes or hours, not in the milliseconds where spend could have been prevented.
Gateway budgets miss workload ownership
API-key caps can throttle traffic, but they rarely model tenant-by-workload approval cleanly once one key serves many healthy and unhealthy paths.
Cost and routing drift separately
Fallback chains, model substitutions, and silent retries change the cost envelope unless they are part of the same decision.
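One way to picture the cost envelope of a routing plan is to price its worst case before anything runs. The sketch below is illustrative only: the function names and the (cost per attempt, max attempts) shape are assumptions, not Keel's API.

```python
def worst_case_cost_usd(routes):
    """Upper bound on spend if every route uses every allowed attempt.

    routes: iterable of (cost_per_attempt_usd, max_attempts) pairs, one
    pair per model in the fallback chain (hypothetical shape).
    """
    return sum(cost * attempts for cost, attempts in routes)


def fits_envelope(routes, headroom_usd):
    """True when the whole chain, retries included, fits the remaining budget."""
    return worst_case_cost_usd(routes) <= headroom_usd


# Primary model: 2 attempts at $0.50. Fallback model: 4 attempts at $0.25.
chain = [(0.5, 2), (0.25, 4)]
```

A request that looks like a $0.50 call can carry a $2.00 worst case once retries and fallbacks are counted, which is why the chain belongs in the same decision as the budget check.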
Alerts answer "what happened?" A permit answers "should this run?" That is the difference between reporting on spend and controlling it.
What Changes With Keel
Block before bill
Keel evaluates budget state, estimated cost, workload, tenant, and model before the provider call. The request is allowed, denied, or constrained before spend lands.
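A minimal sketch of what a pre-call decision could look like, assuming a simple rule: allow if the estimate fits, constrain if only a cheaper approved route fits, deny otherwise. The `Request`, `Decision`, and `decide` names are hypothetical and do not represent Keel's actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    CONSTRAIN = "constrain"
    DENY = "deny"


@dataclass(frozen=True)
class Request:
    tenant: str
    workload: str
    model: str
    estimated_cost_usd: float


def decide(
    request: Request,
    remaining_usd: float,
    cheaper_route_cost_usd: Optional[float] = None,
) -> Decision:
    """Decide before the provider call, using only state known up front."""
    if request.estimated_cost_usd <= remaining_usd:
        return Decision.ALLOW
    # The requested model does not fit; a cheaper approved route might.
    if cheaper_route_cost_usd is not None and cheaper_route_cost_usd <= remaining_usd:
        return Decision.CONSTRAIN
    return Decision.DENY


req = Request(tenant="acme", workload="summarize", model="large-model",
              estimated_cost_usd=0.5)
```

The point of the sketch is the ordering: the function returns before any provider client is touched, so a `DENY` costs nothing.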
Hard caps with explicit exception paths
Monthly, daily, or hourly envelopes can block outright or route through a named escalation path instead of relying on tribal knowledge during an incident.
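As a rough illustration, envelopes with optional escalation paths might be checked like this. The `Envelope` type, its fields, and the rule that a hard block outranks an escalation are assumptions made for the example, not documented Keel behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Envelope:
    window: str                            # "hourly", "daily", or "monthly"
    cap_usd: float
    spent_usd: float
    escalation_path: Optional[str] = None  # named path, e.g. an approver queue


def check_envelopes(envelopes, estimated_cost_usd):
    """Return ("allow", None), ("escalate", path), or ("block", window)."""
    escalation = None
    for env in envelopes:
        if env.spent_usd + estimated_cost_usd <= env.cap_usd:
            continue  # this envelope still has room
        if env.escalation_path is None:
            return ("block", env.window)  # hard cap with no exception path
        if escalation is None:
            escalation = env.escalation_path
    if escalation is not None:
        return ("escalate", escalation)
    return ("allow", None)


envelopes = [
    Envelope("hourly", cap_usd=10.0, spent_usd=9.0, escalation_path="oncall-finance"),
    Envelope("monthly", cap_usd=1000.0, spent_usd=400.0),
]
```

With these values a $2.00 request breaches only the hourly envelope, so it is routed to the named path rather than blocked outright.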
One decision layer across services
The same permit model applies whether the request starts in one service or ten, so cost control stops being a pile of bespoke middleware checks.
Reconciled actuals after execution
Estimated cost drives the pre-execution decision; actual usage closes the loop later so the budget record matches what the provider really billed.
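The estimate-then-reconcile loop can be sketched as a small ledger: reserve the estimate when the permit is issued, then replace it with the actual cost at closeout. `BudgetLedger` and its method names are hypothetical, and a real system would also need persistence and concurrency control.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetLedger:
    cap_usd: float
    settled_usd: float = 0.0
    reserved: dict = field(default_factory=dict)  # permit_id -> estimated cost

    def headroom(self):
        """Budget left after settled spend and outstanding reservations."""
        return self.cap_usd - self.settled_usd - sum(self.reserved.values())

    def reserve(self, permit_id, estimated_usd):
        """Hold the estimate before the provider call; False means blocked."""
        if estimated_usd > self.headroom():
            return False
        self.reserved[permit_id] = estimated_usd
        return True

    def close_out(self, permit_id, actual_usd):
        """Swap the reservation for the actual cost; return actual minus estimate."""
        estimated = self.reserved.pop(permit_id)
        self.settled_usd += actual_usd
        return actual_usd - estimated


ledger = BudgetLedger(cap_usd=10.0)
```

Reserving first means concurrent requests cannot all spend the same headroom, and the closeout delta shows how far estimates drift from billed usage.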
What The Permit Carries
- Workload and owner context so cost rules follow the request that is actually creating spend
- Model and provider constraints so cheaper or approved routes can be enforced instead of merely suggested
- Budget scope and remaining headroom so the decision reflects current policy state, not a stale export
- Exception path so hard caps and approved overrides are both explicit instead of improvised
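The four items above could map onto a record like the following. This is a sketch under assumptions: the field names and the `allows` helper are invented for illustration and are not Keel's permit schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Permit:
    workload: str                       # what is creating the spend
    owner: str                          # who answers for it
    allowed_models: Tuple[str, ...]     # enforced routes, not suggestions
    budget_scope: str                   # e.g. "tenant:acme/workload:summarize"
    remaining_usd: float                # headroom at decision time
    exception_path: Optional[str] = None

    def allows(self, model, estimated_cost_usd):
        """True when the model is approved and the estimate fits the headroom."""
        return model in self.allowed_models and estimated_cost_usd <= self.remaining_usd


permit = Permit(
    workload="summarize",
    owner="search-team",
    allowed_models=("small-model", "medium-model"),
    budget_scope="tenant:acme/workload:summarize",
    remaining_usd=5.0,
)
```

Making the record immutable (`frozen=True`) reflects the idea that a permit captures policy state at one moment rather than tracking it over time.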
Proof Teams Can Use Later
Structured block reasons
Blocked requests can point to the policy version, matched rule, budget scope, and exception path that shaped the decision.
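A structured block reason might be serialized like this so it can be logged and queried later. The `BlockReason` fields mirror the four items named above; the type, the example values, and the JSON layout are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockReason:
    policy_version: str
    matched_rule: str
    budget_scope: str
    exception_path: Optional[str]


def to_log_line(reason):
    """Serialize with sorted keys so identical reasons produce identical lines."""
    return json.dumps(asdict(reason), sort_keys=True)


reason = BlockReason(
    policy_version="2024-06-01",          # hypothetical version label
    matched_rule="daily-cap-exceeded",    # hypothetical rule name
    budget_scope="tenant:acme/workload:summarize",
    exception_path="oncall-finance",
)
line = to_log_line(reason)
```

A caller that receives this record can show the user which rule fired and where to request an exception, instead of surfacing a generic error.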
Budget evidence tied to the request
Teams can show what budget state applied at the moment of approval instead of reconstructing the answer from logs, invoices, and screenshots.
Shared story for finance and engineering
The same permit and closeout record explains why a request ran, what it was expected to cost, and what it actually cost after execution.
Cost is the opening wedge because it is easy to quantify. The same record becomes more valuable later when security, procurement, or finance asks what decision was made and why.