Inference is infrastructure. Treat it like infrastructure.
The customer is an agent.
For most of computing, the customer was a person sitting at a keyboard. The next decade is different. The buyer of compute is increasingly a process — an autonomous agent reasoning, planning, spending. Our product is built for them.
That sounds like marketing. It isn't. It changes everything: hard budget caps, machine-readable everything, retry semantics that don't require human supervision, an audit trail your customer's CFO can verify without trusting us.
One model is the wrong unit.
An agent doesn't have a model. It has a workflow — classify, plan, retrieve, summarize, compose, verify. Each step has different needs. Picking one frontier model for every step is wasteful at best, dangerous at worst.
Templates and model classes — not specific models — are how we encode this. You say reasoning-frontier; we pick the right specific model per call. When the frontier shifts, your code doesn't.
Money is the most honest signal.
Agents spend money. Real money, unattended, while you sleep. Money flows are the cleanest place to enforce policy, because they're unambiguous: either you went over budget or you didn't. There's no grey.
Every hard cap we ship — budget, latency, quality, reliability — is enforced at settlement, not at intent. If a call doesn't fit, we don't run it. If a workflow is on track to exceed cap, we pause and ask. We never quietly downgrade. Agents that quietly downgrade are agents you can't trust.
Proof, not promises.
"Trust us" doesn't scale to billions of agent-initiated transactions. So we built the product around verifiable evidence. Every routing decision, every prompt hash, every dollar charged — on a public chain.
We can't rewrite history. Neither can your competitor, your customer's auditor, or a court. The audit isn't a feature; it's the foundation. Routing happens on top of it.
Speak two languages.
Humans read this site. So do the agents we serve. Both deserve a first-class experience. Everything we publish is simultaneously human-readable and agent-readable — copy that scans for a developer at 11pm, structured data underneath for the agent crawling at 3am.
Read /llms.txt. That's the version of this site an agent sees. It is not a degraded copy. It is the same product, expressed in the shape the agent needs.
Margins beat platform fees.
The temptation in infrastructure is to charge for everything — seats, agents, throughput, "enterprise tier." It feels safer. It's also incompatible with the customer being an agent. Agents don't have seats. Agents don't sign annual contracts. They route a call, settle, and disappear.
So we charge one thing: a thin margin on inference. 8%, flat, every call. No seats, no SKUs, no volume tier behind a sales call. If your spend goes to zero, our revenue does too — and that's how it should be.
Boring is a feature.
This is plumbing. Plumbing is not where you express creativity at the customer's expense. The mark is a sharp-cornered square — not a wave, not a glyph, not a metaphor. The wordmark is Plex Sans at default-set. The accent is one blue. We will not be tempted into "evolving" any of this until the product warrants it.
What we will spend creativity on: routing policy, audit primitives, fallback behavior, template economics. The visual system stays out of the way so the product can move.
Ship what we'd want.
The only honest test for whether a feature should ship: would I, as an agent developer at 11pm on a Tuesday, want this exactly? If we have to argue ourselves into it, it doesn't ship.
This makes the roadmap shorter than most. That's fine.