Just open-sourced Squirrel — an LLM API Gateway built to solve the nightmare of managing multiple models, providers, and prompts across different projects.
If you are building AI apps, managing agents, or running backend services, you have probably hit these walls:
Upgrading models is a grind. Updating hardcoded strings across 10+ repositories takes too much time.
Bleeding money blindly. Provider prices fluctuate, and tracking costs across multiple vendors is nearly impossible to do by hand.
Debugging is pure guesswork. Without full request/response logs, fixing broken prompts is a shot in the dark.
I built Squirrel to fix exactly this. Here is what it does out of the box:
Model Mapping (Change once, apply everywhere)
Stop hardcoding specific models like gpt-4o or claude-3.5-sonnet. Map a virtual name (like my-smart) to a provider in the gateway. Want to upgrade all your apps at once? Just update the mapping in Squirrel. It takes effect instantly across all projects with zero code changes required.
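Conceptually, the mapping is just a lookup the gateway performs before forwarding a request. Here is an illustrative sketch (the virtual names and provider models are made up for the example, not Squirrel's actual internals):

```python
# Illustrative sketch of virtual-model mapping; not Squirrel's real code.
# The gateway resolves a virtual name to a concrete provider/model pair,
# so client code never hardcodes "gpt-4o" or "claude-3.5-sonnet".

MODEL_MAP = {
    "my-smart": {"provider": "openai", "model": "gpt-4o"},
    "my-cheap": {"provider": "anthropic", "model": "claude-3-5-haiku"},
}

def resolve(virtual_name: str) -> dict:
    """Return the concrete provider/model behind a virtual model name."""
    try:
        return MODEL_MAP[virtual_name]
    except KeyError:
        raise ValueError(f"unknown virtual model: {virtual_name}")

# Upgrading every app at once is a one-line change on the gateway side:
MODEL_MAP["my-smart"] = {"provider": "openai", "model": "gpt-4.1"}
```

Clients keep sending `model="my-smart"` and never notice the switch.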
Cost-Based Auto-Routing
Set your provider pricing, and Squirrel automatically routes requests to the cheapest available option. It also supports priority, weight-based, and round-robin strategies.
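The cheapest-option strategy boils down to estimating per-request cost from each provider's token prices and picking the minimum. A minimal sketch, with made-up provider names and prices:

```python
# Illustrative cost-based routing sketch; provider names and per-million-token
# prices here are invented for the example.

PROVIDERS = [
    {"name": "provider-a", "input_per_1m": 2.50, "output_per_1m": 10.00},
    {"name": "provider-b", "input_per_1m": 3.00, "output_per_1m": 15.00},
    {"name": "provider-c", "input_per_1m": 0.80, "output_per_1m": 4.00},
]

def cheapest(providers, est_input_tokens=1000, est_output_tokens=500):
    """Pick the provider with the lowest estimated cost for a request."""
    def cost(p):
        return (est_input_tokens * p["input_per_1m"]
                + est_output_tokens * p["output_per_1m"]) / 1_000_000
    return min(providers, key=cost)
```

Priority, weight-based, and round-robin strategies just swap out the selection function while the rest of the request path stays the same.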
Complete Observability
Squirrel logs every single API call, including streaming responses. Check the admin dashboard to see the exact prompt sent, the model's output, token usage, and Time to First Byte (TTFB). This is an absolute lifesaver for debugging and fine-tuning.
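For streaming responses, TTFB is just the gap between sending the request and receiving the first chunk. A toy sketch of how a gateway can measure it while passing the stream through (the fake upstream is purely for illustration):

```python
import time

def stream_with_ttfb(chunks):
    """Consume a streaming response, recording time to first byte.

    Illustrative sketch only: a real gateway would wrap the upstream
    HTTP stream; here `chunks` is any iterable of response chunks.
    """
    start = time.monotonic()
    ttfb = None
    collected = []
    for chunk in chunks:
        if ttfb is None:
            ttfb = time.monotonic() - start  # first chunk just arrived
        collected.append(chunk)
    return "".join(collected), ttfb

# A fake upstream that waits ~50 ms before emitting its first token:
def fake_upstream():
    time.sleep(0.05)
    yield "Hello"
    yield ", world"

text, ttfb = stream_with_ttfb(fake_upstream())
```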
Auto-Retry & Failover
If a provider throws a 500 error or times out, Squirrel seamlessly switches to a backup provider. Your client-side code does not need to handle a thing.
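The failover logic itself is simple: try providers in order and fall through to the next one on failure. A generic sketch (the provider callables stand in for real API clients):

```python
# Illustrative failover sketch: try each provider in turn, moving on to
# the backup when one raises. Not Squirrel's actual implementation.

def call_with_failover(providers, request):
    """Return the first successful provider response, or raise if all fail."""
    last_error = None
    for provider in providers:
        try:
            return provider(request)
        except Exception as err:   # 500s, timeouts, connection errors, ...
            last_error = err       # remember it and try the next provider
    raise RuntimeError("all providers failed") from last_error

def flaky(request):
    raise TimeoutError("upstream timed out")

def healthy(request):
    return f"ok: {request}"

result = call_with_failover([flaky, healthy], "hello")
```

Because this runs inside the gateway, the client only ever sees the successful response.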
Protocol Compatibility
It works natively with OpenAI and Anthropic SDKs and auto-translates between the two protocols under the hood.
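One concrete difference the translation has to bridge: OpenAI's chat format carries the system prompt inside the messages list, while Anthropic's Messages API expects a top-level system field and a required max_tokens. A simplified sketch of that direction of the conversion (not Squirrel's actual translation layer):

```python
# Illustrative OpenAI -> Anthropic request translation sketch.

def openai_to_anthropic(payload: dict) -> dict:
    """Move system messages to Anthropic's top-level field; keep the rest."""
    system_parts = [m["content"] for m in payload["messages"]
                    if m["role"] == "system"]
    out = {
        "model": payload["model"],
        "max_tokens": payload.get("max_tokens", 1024),  # required by Anthropic
        "messages": [m for m in payload["messages"] if m["role"] != "system"],
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out

request = {
    "model": "my-smart",
    "messages": [
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "Hi"},
    ],
}
translated = openai_to_anthropic(request)
```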
Tech Stack: Python (FastAPI) + Next.js + PostgreSQL/SQLite. One-click deployment via Docker Compose.
Fully open-source under the MIT license. It is still under active development, so feedback, issues, and PRs are incredibly welcome.
Check it out here: https://github.com/mylxsw/llm-gateway