Documentation

Outlay docs.

Outlay maps your AI spend to your roadmap and, optionally, routes each request down to the cheapest model that's provably good enough. The attribution platform onboards through a design-partner pilot; the pages below cover the optional routing proxy you run yourself — quickstart, configuration, SDKs, and the privacy architecture.

Quickstart

Route your Claude traffic through Outlay's optimization engine in about five minutes. Your API key and prompts never reach Outlay: the proxy runs locally, classifies each request on your machine, and sends only metadata to our side.

You'll need a deployment from your pilot (a deployment id and an Outlay API key) plus your own Anthropic API key. Not in a pilot yet? Book one →
  1. Get your deployment id and an API key

    Open Connect in the console. Copy your deployment id and click Create API key (shown once).
  2. Install the proxy

    # Python 3.10+
    pip install modelpilot-client
  3. Configure & run it

    export ANTHROPIC_API_KEY=sk-ant-…        # stays on your machine
    export MODELPILOT_API_KEY=mp_live_…       # from the Connect page
    export MODELPILOT_DEPLOYMENT_ID=dep_…
    export MODELPILOT_BRAIN_URL=https://modelpilot-brain-prod.fly.dev
    export MODELPILOT_CONSOLE_URL=https://modelpilot-console-prod.fly.dev
    
    modelpilot-client            # listens on http://127.0.0.1:8400

    Full env-var list under Configuration.

  4. Point your app at it

    The proxy speaks the Claude Messages API, so you only change the base URL. Nothing else in your code changes.

    # Python
    from anthropic import Anthropic
    client = Anthropic(base_url="http://127.0.0.1:8400")   # your key, local

    One-line variants for TypeScript and cURL are under SDKs.

  5. Watch the savings

    Send some traffic, then open your dashboard — realized savings by task type, baseline-vs-actual, and a non-inferiority proof rate, recomputed from real tokens.
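The baseline-vs-actual figure in step 5 is, at heart, a cost difference summed per task type. A minimal sketch of that arithmetic follows, assuming per-request records with token counts and per-million-token prices. The record fields, model names, and prices are placeholders for illustration, not Outlay's schema or real Anthropic pricing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    task_type: str
    requested_model: str  # model the app asked for (the baseline)
    served_model: str     # model that actually handled the request
    input_tokens: int
    output_tokens: int


# Placeholder prices in USD per million tokens: (input, output). Not real pricing.
PRICES = {
    "big-model": (10.0, 50.0),
    "small-model": (1.0, 5.0),
}


def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the given model's per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def savings_by_task(records: list[RequestRecord]) -> dict[str, float]:
    """Sum (baseline cost - actual cost) per task type."""
    totals: dict[str, float] = defaultdict(float)
    for r in records:
        baseline = cost(r.requested_model, r.input_tokens, r.output_tokens)
        actual = cost(r.served_model, r.input_tokens, r.output_tokens)
        totals[r.task_type] += baseline - actual
    return dict(totals)
```

A request that was not rerouted contributes zero, so the totals reflect only traffic that was actually served by a cheaper model.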

Choosing a routing mode

Set your mode in the console (it takes effect server-side within seconds, no redeploy):

| Mode | What it does | When to use |
|------|--------------|-------------|
| shadow | Scores a cheaper candidate per request but leaves traffic untouched; builds the cost + quality evidence. | First, to see the routing before trusting it. |
| advise | Recommends a cheaper model per request; traffic still flows to your chosen model. | To review concrete suggestions. |
| autopilot | Auto-routes to the cheapest model proven non-inferior on your own work. | Once the numbers convince you. This is where you save. |

It always fails open. If Outlay is ever unreachable, your request is forwarded straight to the Claude API, unrouted. We can degrade your savings, never your uptime.
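The three modes and the fail-open rule can be read as one small decision function. The sketch below is an illustration of that behavior, not the proxy's actual code: `pick_model` and `fetch_candidate` are hypothetical names, and the real client may structure this differently.

```python
from typing import Callable, Optional


def pick_model(
    mode: str,
    requested_model: str,
    fetch_candidate: Callable[[], Optional[str]],
) -> str:
    """Return the model a request should be sent to.

    fetch_candidate stands in for a call to the routing service. It returns
    a cheaper model proven non-inferior for this kind of request, or None.
    """
    try:
        candidate = fetch_candidate()
    except Exception:
        # Fail open: any routing failure leaves the request unrouted.
        return requested_model

    if mode == "autopilot" and candidate is not None:
        return candidate
    # shadow and advise never change where traffic goes.
    return requested_model
```

Only autopilot with a proven candidate changes the destination; every other path, including an unreachable routing service, returns the model your app asked for.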

Configuration

The proxy is configured entirely through environment variables. Only the first three are required.

| Variable | Required | Description |
|----------|----------|-------------|
| ANTHROPIC_API_KEY | yes | Your Anthropic key. Stays on your machine; used to call Anthropic directly. |
| MODELPILOT_API_KEY | yes | Your Outlay key from the Connect page (mp_live_…). |
| MODELPILOT_DEPLOYMENT_ID | yes | The deployment the requests are billed/attributed to (dep_…). |
| MODELPILOT_BRAIN_URL | no | Routing-decision endpoint. Defaults to the hosted brain. |
| MODELPILOT_CONSOLE_URL | no | Control-plane endpoint for entitlement + telemetry. Defaults to the hosted console. |
| MODELPILOT_BIND | no | Local listen address. Defaults to 127.0.0.1:8400. |
| MODELPILOT_MODE | no | Override the console mode (shadow / advise / autopilot) locally. |
| MODELPILOT_REQUEST_OBSERVER | no | module:factory hook that receives per-request metadata; this is the seam Outlay's attribution uses. Stays out of the data path. |
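Before starting the proxy, it can help to confirm that the required variables are present. The following is a minimal preflight sketch based only on the table above; `check_proxy_env` is a hypothetical helper, not part of modelpilot-client, and the proxy may perform its own validation.

```python
import os
from typing import Mapping

REQUIRED = ("ANTHROPIC_API_KEY", "MODELPILOT_API_KEY", "MODELPILOT_DEPLOYMENT_ID")
VALID_MODES = {"shadow", "advise", "autopilot"}


def check_proxy_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of problems found; an empty list means the env looks usable."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    mode = env.get("MODELPILOT_MODE")
    if mode is not None and mode not in VALID_MODES:
        problems.append(f"MODELPILOT_MODE={mode!r} is not one of {sorted(VALID_MODES)}")
    return problems
```

Calling `check_proxy_env()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to unit-test.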

To attribute a request to a ticket, pass a work tag on the call: either an x-modelpilot-work-ticket header (e.g. PROJ-123) on the request, or the SCOPEPILOT_TICKET environment variable for the whole process. When no explicit tag is present, Outlay also recovers the ticket from the branch name, a PR→issue link, or a commit trailer.
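The fallback order described above can be sketched as a small resolver. This is an illustration under assumptions: the `PROJECT-123` ticket pattern, the header-before-env precedence, and the `resolve_ticket` function itself are hypothetical, and the PR→issue lookup is left out because it needs a call to your code host.

```python
import re
from typing import Mapping, Optional

# Assumed ticket shape: PROJECT-123. Adjust to your tracker's key format.
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def resolve_ticket(
    headers: Mapping[str, str],
    env: Mapping[str, str],
    branch: str = "",
    commit_message: str = "",
) -> Optional[str]:
    """Pick a ticket id: explicit tags first, then inferred sources."""
    explicit = headers.get("x-modelpilot-work-ticket") or env.get("SCOPEPILOT_TICKET")
    if explicit:
        return explicit
    # No explicit tag: look for a ticket key in the branch, then the commit text.
    for text in (branch, commit_message):
        match = TICKET_RE.search(text)
        if match:
            return match.group(0)
    return None
```

For example, with no header or env tag, a branch named `feature/PROJ-123-add-login` resolves to `PROJ-123`.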

SDKs

The proxy is a drop-in Messages API endpoint, so every Anthropic SDK works by pointing base_url at it. No new client library to learn.

Python

# pip install anthropic
from anthropic import Anthropic
client = Anthropic(base_url="http://127.0.0.1:8400")

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role":"user","content":"Summarize this ticket."}],
    extra_headers={"x-modelpilot-work-ticket":"PROJ-123"},  # optional
)

TypeScript

// npm i @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ baseURL: "http://127.0.0.1:8400" });

await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 512,
  messages: [{ role: "user", content: "Summarize this ticket." }],
}, { headers: { "x-modelpilot-work-ticket": "PROJ-123" } });

cURL

curl http://127.0.0.1:8400/v1/messages \
  -H "content-type: application/json" \
  -H "x-modelpilot-work-ticket: PROJ-123" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":512,
       "messages":[{"role":"user","content":"Summarize this ticket."}]}'

Your Anthropic key is read from the proxy's environment, so it never appears in client code.
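If you would rather not install an SDK, the cURL call above translates directly to the Python standard library. The sketch below mirrors that request; `build_request` and `send` are hypothetical helper names, and `send` only works while the proxy is running on the default address.

```python
import json
import urllib.request
from typing import Optional

PROXY_URL = "http://127.0.0.1:8400/v1/messages"


def build_request(prompt: str, ticket: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a Messages API request aimed at the local proxy."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"content-type": "application/json"}
    if ticket:
        headers["x-modelpilot-work-ticket"] = ticket  # optional work tag
    return urllib.request.Request(
        PROXY_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request; requires the proxy to be running locally."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Usage would look like `send(build_request("Summarize this ticket.", ticket="PROJ-123"))`. Keeping the build step separate makes the request easy to inspect without any network traffic.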

Architecture & privacy

Outlay splits cleanly so your sensitive data physically can't reach us. The piece in your environment is a thin, inspectable client that does local classification and tagging and carries no proprietary routing logic. The decision engine and the attribution model run on our side and only ever see metadata.

Your app (prompt + API key) → Local proxy (classifies on your box)
Local proxy → Outlay: category + token counts + ticket id
Local proxy → Anthropic: your key, your prompt
  • Prompts and outputs never reach Outlay. Classification runs locally and the proxy forwards each request itself; we never receive request or response bodies.
  • Your API key stays on your machine and is used only to call Anthropic directly.
  • We see metadata only: a task category, numeric features (token counts, flags), the ticket id, and per-request cost/savings figures.
  • The engine fails open. If our service is unreachable, traffic passes straight through to Anthropic, unrouted. A downgrade only happens after the cheaper model is proven non-inferior on your own work (shadow → quality canary).

Our ingestion endpoints reject any payload containing prompt text, outputs, or secret-looking keys (HTTP 422). The boundary is enforced, not just promised; full detail is on the Security & privacy page.
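To make the metadata-only boundary concrete, here is a minimal sketch of a check in the spirit of the 422 rule above. It is an assumption-heavy illustration, not Outlay's ingestion code: the field names, the allowlist approach, and the key patterns (taken from the prefixes shown in the Quickstart) are all hypothetical.

```python
import re
from typing import Any

# Hypothetical field names for the metadata the text describes.
ALLOWED_FIELDS = {
    "category", "input_tokens", "output_tokens", "flags",
    "ticket_id", "cost_usd", "savings_usd",
}
# Hypothetical "secret-looking" patterns, based on the key prefixes in the Quickstart.
SECRET_RE = re.compile(r"(sk-ant-|mp_live_)\S+")


def validate_payload(payload: dict[str, Any]) -> int:
    """Return 200 if the payload is metadata-only, else 422."""
    if set(payload) - ALLOWED_FIELDS:
        return 422  # unexpected field, e.g. a prompt or output body
    for value in payload.values():
        if isinstance(value, str) and SECRET_RE.search(value):
            return 422  # value looks like an API key
    return 200
```

An allowlist rejects anything not explicitly expected, which is a stricter default than scanning for known-bad content.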

One compliance note before you go live

You're using your own Anthropic key, so Anthropic's Usage Policy still applies to your traffic when it routes through Outlay. High-risk domains (legal, healthcare, insurance, finance, employment or housing, academic testing, journalism) need a qualified human reviewing outputs and an "AI was involved" disclosure; any user-facing chatbot or agent should tell users they're talking to AI. These are Anthropic's requirements — full detail in Terms §10.