Docs — Outlay


Quickstart

Route your Claude traffic through Outlay's optimization engine in about five minutes. Your API key and prompts are never sent to Outlay: the proxy runs locally and classifies requests on your machine.

You'll need a deployment from your pilot (a deployment id and an Outlay API key) plus your own Anthropic API key. Not in a pilot yet? Book one →

Get your deployment id and an API key
Open Connect in the console. Copy your deployment id, then click Create API key (the key is shown only once, so save it).

Install the proxy

# Python 3.10+
pip install modelpilot-client

Configure & run it

export ANTHROPIC_API_KEY=sk-ant-…        # stays on your machine
export MODELPILOT_API_KEY=mp_live_…       # from the Connect page
export MODELPILOT_DEPLOYMENT_ID=dep_…
export MODELPILOT_BRAIN_URL=https://modelpilot-brain-prod.fly.dev
export MODELPILOT_CONSOLE_URL=https://modelpilot-console-prod.fly.dev

modelpilot-client            # listens on http://127.0.0.1:8400

Full env-var list under Configuration.
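
Before launching the proxy, it can help to confirm that the required variables are actually set. A minimal sketch, assuming only the three names the Configuration table marks as required; the helper name `missing_required` is hypothetical and not part of modelpilot-client:

```python
import os

# The three variables the Configuration section marks as required.
REQUIRED = ("ANTHROPIC_API_KEY", "MODELPILOT_API_KEY", "MODELPILOT_DEPLOYMENT_ID")


def missing_required(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]
```

Calling `missing_required()` with no arguments inspects the current process environment; an empty list means the proxy has what it needs to start.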

Point your app at it
The proxy speaks the Claude Messages API — just change the base URL. Nothing else in your code changes.
```
# Python
from anthropic import Anthropic
client = Anthropic(base_url="http://127.0.0.1:8400")   # your key, local
```
One-line variants for TypeScript and cURL are under SDKs.

Watch the savings
Send some traffic, then open your dashboard — realized savings by task type, baseline-vs-actual, and a non-inferiority proof rate, recomputed from real tokens.
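
The baseline-vs-actual figure is, at heart, a per-request cost difference. A rough sketch of that arithmetic, assuming cost is linear in input and output tokens and that both models see the same token counts (a simplification). The function names are hypothetical, prices are passed in by the caller, and this is not Outlay's actual formula:

```python
def request_cost(input_tokens, output_tokens, in_price_per_mtok, out_price_per_mtok):
    """Cost of one request, with prices quoted per million tokens."""
    total = input_tokens * in_price_per_mtok + output_tokens * out_price_per_mtok
    return total / 1_000_000


def realized_savings(input_tokens, output_tokens, baseline_prices, actual_prices):
    """Baseline cost minus actual cost for the same token counts.

    Each *_prices argument is an (input_price, output_price) pair.
    """
    baseline = request_cost(input_tokens, output_tokens, *baseline_prices)
    actual = request_cost(input_tokens, output_tokens, *actual_prices)
    return baseline - actual
```

With placeholder prices of (3.0, 15.0) for the baseline model and (1.0, 5.0) for the routed model, a request with 2,000 input and 500 output tokens costs 0.0135 at baseline and 0.0045 routed, a saving of about 0.009 in whatever currency the prices use. Those prices are made up for the example.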

Choosing a routing mode

Set your mode in the console (it takes effect server-side within seconds, no redeploy):

Mode	What it does	When to use
shadow	Scores a cheaper candidate per request but leaves traffic untouched; builds the cost + quality evidence.	First — to see the routing before trusting it.
advise	Recommends a cheaper model per request; traffic still flows to your chosen model.	To review concrete suggestions.
autopilot	Auto-routes to the cheapest model proven non-inferior on your own work.	Once the numbers convince you. This is where you save.

It always fails open. If Outlay is ever unreachable, your request is forwarded straight to the Claude API, unrouted. We can degrade your savings, never your uptime.
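
The three modes and the fail-open rule can be pictured as one small decision. This is a sketch of the behavior described above, not the proxy's source: `model_to_call` and the `decide` callable (a stand-in for the call to the routing service) are hypothetical names.

```python
def model_to_call(mode, requested_model, decide):
    """Pick the model a request is actually sent to.

    `decide` is a hypothetical callable that asks the routing service for a
    cheaper candidate; it may raise if the service is unreachable.
    """
    try:
        candidate = decide(requested_model)
    except Exception:
        # Fail open: no routing decision, so forward the request unchanged.
        return requested_model
    if mode == "autopilot" and candidate:
        return candidate
    # shadow and advise score or recommend, but never change where traffic goes.
    return requested_model
```

Only `autopilot` ever returns something other than the requested model, and any error from the routing call falls back to the requested model.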

Configuration

The proxy is configured entirely through environment variables. Only the first three are required.

Variable	Required	Description
`ANTHROPIC_API_KEY`	yes	Your Anthropic key. Stays on your machine; used to call Anthropic directly.
`MODELPILOT_API_KEY`	yes	Your Outlay key from the Connect page (`mp_live_…`).
`MODELPILOT_DEPLOYMENT_ID`	yes	The deployment the requests are billed/attributed to (`dep_…`).
`MODELPILOT_BRAIN_URL`	no	Routing-decision endpoint. Defaults to the hosted brain.
`MODELPILOT_CONSOLE_URL`	no	Control-plane endpoint for entitlement + telemetry. Defaults to the hosted console.
`MODELPILOT_BIND`	no	Local listen address. Defaults to `127.0.0.1:8400`.
`MODELPILOT_MODE`	no	Override the console mode (`shadow` / `advise` / `autopilot`) locally.
`MODELPILOT_REQUEST_OBSERVER`	no	`module:factory` hook that receives per-request metadata — the seam Outlay's attribution uses. Stays out of the data path.
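
`MODELPILOT_REQUEST_OBSERVER` takes a `module:factory` string. A sketch of how such a string is conventionally resolved in Python, assuming the usual import-then-getattr pattern. The function name `load_factory` is hypothetical, and the proxy's real loader and the arguments it passes to the observer are not documented here:

```python
import importlib


def load_factory(spec):
    """Resolve a 'module:factory' string to the callable it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:factory', got {spec!r}")
    module = importlib.import_module(module_name)
    factory = getattr(module, attr)
    if not callable(factory):
        raise TypeError(f"{spec!r} does not name a callable")
    return factory
```

Resolving your own value this way before starting the proxy is a quick check that the module is importable from the proxy's environment.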

To attribute a request to a ticket, pass a work tag with the call: either an x-modelpilot-work-ticket header (e.g. PROJ-123) on the request, or the SCOPEPILOT_TICKET environment variable for the whole process. When no explicit tag is present, Outlay also tries to recover the ticket from the branch name, a PR→issue link, or a commit trailer.
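
The lookup order can be sketched as follows. The explicit-tag-first rule comes from the text above; the header-before-env ordering, the ticket regex, and the function name `resolve_ticket` are assumptions for illustration, and the PR link and commit trailer fallbacks are left out:

```python
import os
import re

# Assumed ticket shape: uppercase project key, dash, number (e.g. PROJ-123).
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def resolve_ticket(headers, env=None, branch=None):
    """Return the first ticket found: header, then env var, then branch name."""
    env = os.environ if env is None else env
    # Plain-dict lookup, so the header name must be lowercase here.
    explicit = headers.get("x-modelpilot-work-ticket") or env.get("SCOPEPILOT_TICKET")
    if explicit:
        return explicit
    match = TICKET_RE.search(branch or "")
    return match.group(0) if match else None
```

For example, a branch named `feature/PROJ-42-fix-login` with no header and no env var resolves to `PROJ-42`.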

SDKs

The proxy is a drop-in Messages API endpoint, so every Anthropic SDK works by pointing base_url at it. No new client library to learn.

Python

# pip install anthropic
from anthropic import Anthropic
client = Anthropic(base_url="http://127.0.0.1:8400")

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role":"user","content":"Summarize this ticket."}],
    extra_headers={"x-modelpilot-work-ticket":"PROJ-123"},  # optional
)

TypeScript

// npm i @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ baseURL: "http://127.0.0.1:8400" });

await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 512,
  messages: [{ role: "user", content: "Summarize this ticket." }],
}, { headers: { "x-modelpilot-work-ticket": "PROJ-123" } });

cURL

curl http://127.0.0.1:8400/v1/messages \
  -H "content-type: application/json" \
  -H "x-modelpilot-work-ticket: PROJ-123" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":512,
       "messages":[{"role":"user","content":"Summarize this ticket."}]}'

Your Anthropic key is read from the proxy's environment, so it never appears in client code.
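
If you would rather not install an SDK, the cURL request above can be reproduced with the Python standard library. A minimal sketch that only builds the request; the helper name `build_messages_request` is hypothetical, and the model name and headers simply mirror the cURL example:

```python
import json
import urllib.request


def build_messages_request(prompt, ticket=None, base_url="http://127.0.0.1:8400"):
    """Build (but do not send) a Messages API request aimed at the local proxy."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"content-type": "application/json"}
    if ticket:
        headers["x-modelpilot-work-ticket"] = ticket  # optional work tag
    return urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

With the proxy running, passing the result to `urllib.request.urlopen` sends it and returns the response.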

Architecture & privacy

Outlay is split in two so that your sensitive data never reaches us. The piece in your environment is a thin, inspectable client that does local classification and tagging and carries no proprietary routing logic. The decision engine and the attribution model run on our side and only ever see metadata.

Your app (prompt + API key) → Local proxy (classifies on your box) → Outlay (category + token counts + ticket id) → Anthropic (your key, your prompt)

✓ Prompts and outputs go only to Anthropic, never to Outlay. Classification and routing run locally; we never receive request or response bodies.
✓ Your API key stays on your machine and is used only to call Anthropic directly.
✓ We see metadata only: a task category, numeric features (token counts, flags), the ticket id, and per-request cost/savings figures.
✓ The engine fails open. If our service is unreachable, traffic passes straight through to Anthropic, unrouted. A downgrade only happens after it's proven non-inferior on your own work (shadow → quality canary).

Our ingestion endpoints reject any payload containing prompt text, outputs, or secret-looking keys (HTTP 422). The boundary is enforced, not just promised — full detail on Security & privacy.
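
One way to picture that boundary is an allow-list applied to a payload before it is accepted. A sketch only: the field names, the `sk-` prefix heuristic, and the function name `check_metadata` are assumptions for illustration, not Outlay's actual schema or server-side check:

```python
# Hypothetical metadata fields, loosely based on the list above.
ALLOWED_FIELDS = {"category", "input_tokens", "output_tokens", "ticket_id", "cost", "savings"}


def check_metadata(payload):
    """Return a list of problems; an empty list means the payload is metadata-only."""
    problems = [f"unexpected field: {key}" for key in payload if key not in ALLOWED_FIELDS]
    for key, value in payload.items():
        # Crude stand-in for "secret-looking": string values that start like an API key.
        if isinstance(value, str) and value.startswith("sk-"):
            problems.append(f"secret-looking value in: {key}")
    return problems
```

A server following this pattern would answer a non-empty problem list with a rejection such as the HTTP 422 mentioned above.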

One compliance note before you go live

You're using your own Anthropic key, so Anthropic's Usage Policy still applies to your traffic when it routes through Outlay. High-risk domains (legal, healthcare, insurance, finance, employment or housing, academic testing, journalism) need a qualified human reviewing outputs and an "AI was involved" disclosure; any user-facing chatbot or agent should tell users they're talking to AI. These are Anthropic's requirements — full detail in Terms §10.
