API Documentation

Parity Layer API

Parity Layer is a drop-in AI gateway. It proves a cheaper model produces output that matches or beats your expensive model on your actual prompts, then routes to it automatically. You cut AI API costs by 30-60% with output quality preserved and instant fallback. Integration is a two-line change to any OpenAI or Anthropic SDK.

Base URL & authentication

Send requests to https://api.paritylayer.com. OpenAI-style clients use https://api.paritylayer.com/v1. Authenticate with your Parity key (sk-pl-...), which you generate in the dashboard. Your provider key is stored in Parity, which uses it to call OpenAI or Anthropic on your behalf.

Authorization: Bearer sk-pl-your-parity-key
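
If you call the API without an SDK, the header above can be built from a key kept in the environment. This is a minimal sketch: the variable name PARITY_API_KEY and the helper parity_auth_headers are assumptions made for illustration, not names defined by Parity Layer. Only the sk-pl- prefix and the Bearer scheme come from this page.

```python
import os


def parity_auth_headers(env_var: str = "PARITY_API_KEY") -> dict[str, str]:
    """Build the Authorization header from a key stored in the environment.

    The environment variable name is a hypothetical choice for this example.
    """
    key = os.environ.get(env_var, "")
    # Parity keys are documented as starting with "sk-pl-"; fail early otherwise.
    if not key.startswith("sk-pl-"):
        raise ValueError(f"{env_var} does not look like a Parity key (expected sk-pl-...)")
    return {"Authorization": f"Bearer {key}"}
```

Keeping the key out of source code also means the same script can run against different keys without edits.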

Quickstart

Keep your existing code. Change the base URL and the key, nothing else. Prompts, tools, streaming, parameters, and response shapes are identical to calling the provider directly.

Python, OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="https://api.paritylayer.com/v1",
    api_key="sk-pl-...",  # your Parity key
)

resp = client.chat.completions.create(
    model="gpt-4o",  # your current baseline model
    messages=[{"role": "user", "content": "Summarize this ticket..."}],
)

Python, Anthropic SDK

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.paritylayer.com",
    api_key="sk-pl-...",  # your Parity key
)

msg = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract the fields as JSON..."}],
)

TypeScript, OpenAI SDK

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.paritylayer.com/v1",
  apiKey: "sk-pl-...",
});

const resp = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Classify this message..." }],
});

cURL

curl https://api.paritylayer.com/v1/chat/completions \
  -H "Authorization: Bearer sk-pl-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Endpoints

Parity Layer mirrors the upstream provider APIs, so you use the endpoint your SDK already calls:

POST /v1/chat/completions: OpenAI Chat Completions compatible.
POST /v1/messages: Anthropic Messages compatible.

Streaming (stream: true), tool and function calling, system prompts, and every parameter you already pass are supported and behave exactly as they do upstream. Each endpoint returns the same response shape as the provider, so your parsing code does not change.
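
Because the streamed response is described as matching the provider's shape, a parser written for OpenAI-style streaming should work unchanged. The sketch below assumes the standard OpenAI Chat Completions stream format: server-sent event lines of the form data: {json}, ending with data: [DONE]. The helper name iter_stream_text and the sample lines are illustrative only, not captured Parity output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel used by OpenAI-style streams
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample chunks in the OpenAI streaming shape.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

With the official SDKs you would not parse these lines yourself; the sketch only shows what "same response shape" means for a raw HTTP client.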

How routing works

You keep calling your baseline model. In the background, Parity tests cheaper candidate models on your actual prompts and statistically proves whether a candidate matches, or beats, your baseline for that prompt type. Only once a candidate is proven does Parity route that prompt type to it. If quality ever drifts, it falls back to your baseline model instantly, so your users never see a worse answer. You stay in control of when routing activates.
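
This page does not say which statistical test Parity runs. Purely to illustrate what "proven before routing" can look like, the sketch below uses a swapped-in technique: it counts how often a candidate's output was judged to match or beat the baseline, then requires the lower end of a Wilson score interval to clear a threshold. The function names, the 0.85 threshold, and z = 1.96 are all assumptions for this example, not Parity's actual method or settings.

```python
import math


def wilson_lower_bound(wins: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for an observed win rate."""
    if trials <= 0:
        return 0.0
    p = wins / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    return (centre - margin) / (1 + z2 / trials)


def should_route(wins: int, trials: int, required_rate: float = 0.85) -> bool:
    """Route only when even the pessimistic estimate clears the bar (hypothetical rule)."""
    return wilson_lower_bound(wins, trials) >= required_rate


print(should_route(95, 100))  # True: plenty of evidence, lower bound is about 0.89
print(should_route(9, 10))    # False: same ballpark rate, but too few samples
```

The second call shows why a raw win rate is not enough: 9 out of 10 looks good, yet the sample is too small to rule out a much lower true rate.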

Offline proof (no code change)

Not ready to route live traffic? Upload a sample of past requests as a JSONL export and Parity proves the achievable savings offline, before you change a single line of code. You see the number for your own workload first, then decide.
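
JSONL means one JSON object per line. The exact fields Parity expects in the export are not specified on this page, so the sketch below assumes each line is simply the request body you previously sent to the provider. The helper name write_jsonl and the sample requests are illustrative.

```python
import io
import json
from typing import Iterable, TextIO


def write_jsonl(requests: Iterable[dict], out: TextIO) -> int:
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    for req in requests:
        # json.dumps never emits raw newlines, so each request stays on one line.
        out.write(json.dumps(req, ensure_ascii=False) + "\n")
        count += 1
    return count


# Assumed shape: past Chat Completions request bodies.
past_requests = [
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Summarize this ticket..."}]},
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Classify this message..."}]},
]

buffer = io.StringIO()
written = write_jsonl(past_requests, buffer)
print(written)  # prints 2
```

To produce a file instead of an in-memory buffer, pass an open file handle, for example open("sample.jsonl", "w", encoding="utf-8").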

Pricing

You pay per request and per token. Once a prompt type is proven and routed, it is billed at the cheaper model's rate, 30-60% less than your baseline. Up to 10 prompts are free, with no credit card required. See the full breakdown on the pricing page.
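
As a worked example of the 30-60% range, the sketch below applies a savings rate to a monthly bill. The $1,000 figure and the function name projected_bill are made up for illustration, and the calculation assumes the savings rate applies to the whole bill.

```python
def projected_bill(current_bill: float, savings_rate: float) -> float:
    """Return the bill after applying a fractional savings rate (0.30 means 30%)."""
    if not 0.0 <= savings_rate <= 1.0:
        raise ValueError("savings_rate must be between 0 and 1")
    return current_bill * (1.0 - savings_rate)


# Hypothetical $1,000 monthly spend at both ends of the quoted range.
low_end = projected_bill(1000.0, 0.30)   # about 700: you pay 70% of the original
high_end = projected_bill(1000.0, 0.60)  # about 400: you pay 40% of the original
print(round(low_end, 2), round(high_end, 2))
```

If only part of your traffic is routed, the overall saving is smaller, so the offline proof on your own workload is the better guide.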

FAQ

What is the Parity Layer API base URL?

https://api.paritylayer.com. Point your existing OpenAI or Anthropic SDK at it and authenticate with your Parity key (sk-pl-...). For OpenAI-style clients, use https://api.paritylayer.com/v1.

Do I have to change my code to use Parity Layer?

Only two lines: set the base URL to https://api.paritylayer.com (or https://api.paritylayer.com/v1 for OpenAI-style clients) and use your Parity key. Your prompts, tools, streaming, parameters, and response shapes stay identical to calling the provider directly.

How much does it save and does quality drop?

Typical savings are 30-60% of API spend depending on prompt type, so you pay 40-70% of your current bill. Quality is preserved: Parity only routes to a cheaper model after statistically proving that it matches or beats your baseline on your prompts, and it falls back to your baseline instantly if quality ever drifts.

Can I prove the savings before changing any code?

Yes. Export a sample of past requests as a JSONL file and Parity proves the savings offline, before you change your integration or send live traffic. Up to 10 prompts are free, no credit card.

Start free

Get a Parity key and prove the savings on your own prompts. Up to 10 prompts free, no credit card.
