60-second quickstart

First audited call,
in under a minute.

install

pip install ainfera.

Python 3.10+. Ships the SDK and the ainfera CLI.

your shellcopy

# 30 seconds. No card, no waitlist.
$ pip install ainfera

Free during beta. No card. You get $10 of inference credit when payment turns on at GA, no expiry. read pricing →

ainfera install

Verify GitHub identity. Mint agent. Get key.

One CLI command. The server confirms api.github.com/user matches your handle, mints an Ed25519 AgentCard + free-tier Wallet, and registers the agent on the public chain. Your agent appears on your dashboard at app.ainfera.ai/<handle> the moment the call returns.

ADR-011 · POST /v1/agents/install-from-localcopy

# writes key to ~/.ainfera/keys.json (chmod 0600)
$ ainfera install ./ainfera-agent.yaml

  → verifying gh CLI identity ........ ok (@your-handle)
  → resolving owner tenant ........... ok
  → minting AgentCard (Ed25519) ...... ok
  → opening free-tier Wallet ......... ok
  → audit event agent.installed ........ ok

  agent live at app.ainfera.ai/@your-handle

Every key is scoped to one agent. Caps, budget, and pause toggle live on the agent — not the workspace. Re-run ainfera install to onboard more agents; the CLI reuses the key you already have on disk.

first call

Change one line. Make a call.

Drop in your existing OpenAI client. Point it at us. Set model: "ainfera-inference" to enable routing — or pass a specific model to skip it.

python · first_call.pycopy

# pip install openai
from openai import OpenAI
import os

client = OpenAI(
  base_url="https://api.ainfera.ai/v1",
  api_key=os.environ["AINFERA_KEY"],
)

res = client.chat.completions.create(
  model="ainfera-inference",
  messages=[{"role": "user", "content": "Summarize federated learning in 3 lines."}],
  extra_body={"caps": {"budget": 0.012, "latency_ms": 1500}},
)

print(res.choices[0].message.content)
print("audit:", res.audit.id, "|", res.audit.block)

typescript · first_call.tscopy

// npm i openai
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ainfera.ai/v1",
  apiKey: process.env.AINFERA_KEY!,
});

const res = await client.chat.completions.create({
  model: "ainfera-inference",
  messages: [{ role: "user", content: "Summarize federated learning in 3 lines." }],
  caps: { budget: 0.012, latency_ms: 1500 },
} as any);

console.log(res.choices[0].message.content);
console.log("audit:", res.audit.id, res.audit.block);

curl · first_call.shcopy

curl https://api.ainfera.ai/v1/chat/completions \
  -H "Authorization: Bearer $AINFERA_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ainfera-inference",
    "messages": [{"role":"user","content":"Summarize federated learning in 3 lines."}],
    "caps": {"budget": 0.012, "latency_ms": 1500}
  }'

anthropic SDK · first_call_anthropic.pycopy

# pip install anthropic
from anthropic import Anthropic
import os

client = Anthropic(
  base_url="https://api.ainfera.ai/v1",
  api_key=os.environ["AINFERA_KEY"],
)

msg = client.messages.create(
  model="ainfera-inference",
  max_tokens=512,
  messages=[{"role": "user", "content": "Summarize federated learning in 3 lines."}],
  extra_body={"caps": {"budget": 0.012, "latency_ms": 1500}},
)

print(msg.content[0].text)
print("audit:", msg.audit.id)

$ python first_call.py

Federated learning trains a shared model across many clients
without moving the raw data — only model updates leave the
device. It protects privacy at a 6–12% accuracy cost on
medical-imaging benchmarks.

audit: inf_... | block <block_height>
routed: claude-opus-4-7  (0.942 quality · 940ms · $0.0061)
✓ within caps · ✓ audited

You can also pin a specific model (model: "claude-opus-4-7") — we'll skip routing, but still settle and audit. Caps still apply.

verify

Verify the call landed on chain.

Take the audit.id from your response. Anyone — including you, your customer, your auditor — can pull the record without an API key.

public audit · no key requiredcopy

curl https://audit.ainfera.ai/v1/inf_...

{
  "inference_id":  "inf_...",
  "block":         "<block_height>",
  "decision_hash": "0x4a7261c81f3c0e9b...",
  "response_hash": "0x2d3491bef0a87716...",
  "verified":      true,
  "chain":         "ainfera-mainnet"
}

The chain stores hashes only. Your prompt and the model's response stay in your workspace. read how audit works →

● done · welcome to ainfera

That was one call. Now build the agent.

You now have a working pipeline: every call routed within caps, every decision auditable by anyone. Here's where to go next, ranked by impact for an agent dev.

→ next · most popular

Use a template

"research-deep-dive" is a 6-step workflow used by many agents. Drop it into your code with one call.

→ inspect

Read a workflow trace

See the Gantt for a real multi-step run — every model choice, every cost, every fallback shown explicitly.

→ tune

Configure caps

Budget, latency, quality and reliability — set them per agent or per task type.

questions: hello@ainfera.ai · discord ainfera.ai/discordor read the machine-readable index — same content, agent-shaped