LLM routing
Persist State Across 17,000+ Models
Backboard gives you a single, portable API to 17,000+ LLMs across providers. Bring your own keys from OpenAI, Anthropic, Google Gemini, Cohere, xAI, OpenRouter, and more. Route by cost, speed, quality, or capability—with built‑in state management and Adaptive Context Management, no token markup, and access to many free models.
LLM routing, without the glue code
What is LLM routing on Backboard?
Backboard lets you call 17,000+ models from a single endpoint and change which model you use at any time, without rewriting your app. Instead of hard‑coding every provider's SDK and payload quirks, you:
Integrate one unified API
Replace every provider SDK with a single OpenAI-compatible endpoint. One integration unlocks OpenAI, Anthropic, Google Gemini, Cohere, xAI, and thousands more — no rewrites needed when you switch models.
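One integration in practice means one request shape. The sketch below builds an OpenAI-compatible chat-completions payload; only the model string changes per provider. The model identifiers are illustrative examples, not an exhaustive or official list.

```python
# Minimal sketch: one OpenAI-compatible payload shape for every provider.
def chat_payload(model: str, user_message: str, **options) -> dict:
    """Build a standard chat-completions request body.

    The same shape works whether `model` points at OpenAI, Anthropic,
    Gemini, or any other provider behind the unified endpoint.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        **options,
    }

# Swapping providers is just a different model string:
openai_req = chat_payload("openai/gpt-4o", "Summarize this ticket.")
claude_req = chat_payload("anthropic/claude-3-7-sonnet", "Summarize this ticket.")
```

Because both requests share one schema, switching models never touches parsing or serialization code.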
Choose models with a simple string or routing rule
Pass a model name like "openai/gpt-4o" or a routing rule like "fastest" or "cheapest". Swap models in a config change without touching your app logic.
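A "config change, not a code change" can be sketched like this: the model string (or routing rule) is loaded from configuration, so redeploying with a different value swaps the model. The rule names mirror the examples above; the config source is illustrative.

```python
import json

# e.g. loaded from a config file or environment; the value is the only
# thing that changes when you swap models.
config = json.loads('{"model": "fastest"}')

request = {
    "model": config["model"],  # "fastest", "cheapest", or "openai/gpt-4o"
    "messages": [{"role": "user", "content": "Draft a reply."}],
}
```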
Bring your own keys (BYOK) for providers you already use
Connect your existing API keys from OpenAI, Anthropic, Google, and other providers. Pay those providers directly at their listed rates — Backboard adds zero markup on tokens.
Let Backboard handle state, context, tools, and memory consistently
Backboard automatically persists conversation state, manages context windows, retrieves memory, and runs tools — consistently, regardless of which model is handling the request.
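With server-side state, each turn carries only the new message plus a session identifier. The field names below (`session_id` in particular) are assumptions for illustration, not Backboard's documented schema.

```python
# Sketch: a stateful turn sends the new message and a session ID;
# the server loads history, memory, and tool state for that session.
def stateful_request(session_id: str, model: str, user_message: str) -> dict:
    return {
        "model": model,
        "session_id": session_id,  # server-side: history + memory lookup
        "messages": [{"role": "user", "content": user_message}],
    }

turn = stateful_request("sess_123", "openai/gpt-4o", "And what about Q3?")
```

Note the payload holds a single message: prior turns live on the server, not in your request.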
LLM routing
Why engineers route through Backboard
From benchmark-leading memory to BYOK with no token markup — everything built into one stateful API.
One API, 17,000+ models, many free
Call 17,000+ models from OpenAI, Anthropic, Google, Mistral, Cohere, xAI, and more through one unified endpoint. Hundreds of free models available for experimentation and background tasks.
BYOK with no token markup
Connect your own API keys from any major provider and pay them directly at their published rates. Backboard adds zero token markup — ever.
Stateful by default
Every request is context-aware out of the box. Conversation state and session memory are handled automatically so you don't have to pass history manually on every call.
Adaptive context management built in
Backboard intelligently trims, summarizes, and prioritizes context to fit within any model's token window — preserving the most relevant information without manual tuning.
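The core idea behind fitting context to a token window can be sketched in a few lines: keep the system prompt, then keep the newest turns until the budget runs out. This is a toy version; it approximates token counts by word count, where a real implementation would use the model's tokenizer and add summarization of dropped turns.

```python
# Toy sketch of context fitting: preserve the system prompt plus the most
# recent messages that fit within a token budget.
def fit_context(messages: list[dict], budget: int) -> list[dict]:
    def cost(m: dict) -> int:
        return len(m["content"].split())  # word count as a token proxy

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    remaining = budget - sum(cost(m) for m in system)

    kept = []
    for m in reversed(rest):          # walk newest-first
        if cost(m) <= remaining:
            kept.append(m)
            remaining -= cost(m)
        else:
            break                     # stop at the first overflow
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "one two three four five"},
    {"role": "assistant", "content": "six seven"},
    {"role": "user", "content": "eight nine ten"},
]
trimmed = fit_context(history, budget=9)
```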
Configurable memory, RAG, and tools on every route
Attach memory (lite or pro), RAG retrieval, and tool integrations to any route. Configure them once and they follow your requests across every model you route to.
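"Configure once, follows every request" can be illustrated as a route-level config merged into each call. The keys and values here (`memory.tier`, `rag.collection`, the tool name) are assumptions for the sketch, not Backboard's documented schema.

```python
# Illustrative route config: memory, RAG, and tools attached once.
ROUTE_CONFIG = {
    "model": "cheapest",
    "memory": {"tier": "pro"},
    "rag": {"collection": "product-docs", "top_k": 5},
    "tools": ["web_search"],
}

def with_route_config(config: dict, prompt: str) -> dict:
    """Every request on this route inherits the same capabilities,
    no matter which model the routing rule resolves to."""
    return {**config, "messages": [{"role": "user", "content": prompt}]}

req = with_route_config(ROUTE_CONFIG, "What changed in the last release?")
```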
Model-independent web search
Built-in real-time web search works across all 17,000+ models — no extra integration required. Available on every plan at no additional cost.
how it works
How LLM routing works
You call a single messages-style endpoint and pass the model (or routing rule), the state or conversation ID, and optional tools: memory, RAG, web search, and custom tools.
1. Resolve model
Backboard maps your model string or routing rule to the right provider and endpoint — whether that's a named model like "anthropic/claude-3-7-sonnet" or a policy like "cheapest with vision".
2. Apply state
Your session ID is used to load the relevant conversation history, memory entries, and tool state — so the chosen model receives full context without you passing it manually.
3. Fit context
Adaptive Context Management trims, summarizes, and prioritizes your loaded context to fit within the selected model's token window before the request is sent.
4. Run and return
The request is forwarded to the provider using your own API key (if BYOK), streamed back through Backboard, and the updated state is persisted for the next turn.
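The four steps above can be sketched end to end. Everything here is a stand-in: the resolver table, the "last 10 messages" context fit, and the transport function are toys that show the shape of the pipeline, not Backboard's internals.

```python
# Toy end-to-end turn: resolve -> apply state -> fit context -> run + persist.
def resolve(rule_or_model: str) -> str:
    table = {"cheapest": "google/gemini-flash",
             "fastest": "anthropic/claude-3-haiku"}
    return table.get(rule_or_model, rule_or_model)  # named models pass through

def run_turn(store: dict, session_id: str, rule: str, user_msg: str, send) -> str:
    model = resolve(rule)                               # 1. resolve model
    history = store.setdefault(session_id, [])          # 2. apply state
    history.append({"role": "user", "content": user_msg})
    context = history[-10:]                             # 3. fit context (toy)
    reply = send(model, context)                        # 4. run...
    history.append({"role": "assistant", "content": reply})
    return reply                                        # ...and persist

store = {}
reply = run_turn(store, "sess_1", "cheapest", "hello",
                 lambda m, ctx: f"{m} saw {len(ctx)} msgs")
```

Each call leaves the session store updated, so the next turn starts from the full conversation.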
routing patterns
Routing patterns you can implement
Same state, same memory, same tools—different models for different jobs.
Cost‑aware routing
Route simple or repetitive queries to cheaper, faster models and reserve expensive reasoning models for tasks that genuinely need them — automatically reducing cost without sacrificing quality.
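One simple way to implement this is a heuristic classifier: short, routine queries go to a cheap model, and anything that looks like multi-step reasoning goes to a stronger one. Model names, the keyword list, and the length threshold are all illustrative.

```python
# Hedged sketch of cost-aware routing via a cheap heuristic.
REASONING_HINTS = ("prove", "analyze", "step by step", "debug")

def pick_model(query: str) -> str:
    q = query.lower()
    if len(q.split()) > 50 or any(hint in q for hint in REASONING_HINTS):
        return "openai/gpt-4o"       # reserve the expensive reasoning model
    return "google/gemini-flash"     # default to the cheap, fast one
```

Production routers often replace the keyword heuristic with a small classifier model, but the routing decision stays a one-string swap.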
Latency‑sensitive routing
Direct time-critical requests to the fastest available model for a given capability. Ideal for real-time chat, autocomplete, or user-facing features where response speed matters.
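At its simplest, latency-sensitive routing is a lookup over observed latencies. The p50 values below are made up for illustration; a real router would refresh them from live measurements.

```python
# Sketch: pick the fastest model for a capability from observed p50 latency.
P50_MS = {
    "openai/gpt-4o-mini": 420,
    "anthropic/claude-3-haiku": 380,
    "google/gemini-flash": 350,
}

def fastest(latencies: dict[str, float]) -> str:
    return min(latencies, key=latencies.get)
```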
Capability‑based routing
Route by what a model is best at — vision, code generation, long context, multilingual, or function calling. Match task type to the model most likely to get it right.
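Capability-based routing reduces to tagging models and filtering at request time. The catalog below is a stand-in for real model metadata, with invented capability tags.

```python
# Sketch: filter a model catalog by capability tags.
CATALOG = {
    "openai/gpt-4o": {"vision", "function_calling", "code"},
    "anthropic/claude-3-7-sonnet": {"long_context", "code"},
    "google/gemini-1.5-pro": {"vision", "long_context"},
}

def models_with(capability: str) -> list[str]:
    return sorted(m for m, caps in CATALOG.items() if capability in caps)
```

A task router then picks from `models_with("vision")` for image inputs, `models_with("long_context")` for large documents, and so on.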
Provider redundancy
Automatically failover to an alternate provider if your primary model is rate-limited or unavailable. Keep your app running without manual intervention or downtime.
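Failover is a preference-ordered retry loop: try the primary, and on a rate-limit or availability error fall through to the next model. The error type, model names, and stub transport are illustrative.

```python
# Sketch of provider redundancy: first healthy model in the list wins.
class ProviderError(Exception):
    pass

def complete_with_failover(models: list[str], prompt: str, send) -> str:
    last_err = None
    for model in models:
        try:
            return send(model, prompt)   # first success wins
        except ProviderError as err:
            last_err = err               # rate-limited or down: try the next
    raise last_err

# Example with a stub transport whose primary is rate-limited:
def stub_send(model, prompt):
    if model == "openai/gpt-4o":
        raise ProviderError("rate limited")
    return f"{model}: ok"

result = complete_with_failover(
    ["openai/gpt-4o", "anthropic/claude-3-7-sonnet"], "hi", stub_send
)
```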
Why not just build your own router?
Wiring a couple of models is easy. The hard parts are what Backboard solves.
The hard parts:
Keeping state and memory consistent across models and providers
Handling different context windows without losing important info
Tracking cost, latency, and usage when logic is scattered
Making RAG, web search, and tools behave the same for every model
Managing multiple keys and pricing models without accidentally overpaying
Backboard gives you:
A unified API for 17,000+ models
BYOK support for major providers with no token markup
Access to many free models for experimentation and background work
Free state management and Adaptive Context Management baked in
Best‑in‑class configurable memory (lite and pro), plus RAG and web search
You integrate once and get world‑leading routing everywhere.
Get started with Backboard
Wire Backboard into one service today and unlock 17,000+ models, BYOK, stateful behavior, adaptive context, and many free models across your stack.
Built for focused work
Everything you need to build production-grade agent systems on a single, coherent API.
© 2026 Backboard.io