I got tired of paying for Claude Max and then paying again for API access in my dev tools. OCP is a single-file Node proxy that sits between your tools and the Claude CLI. It translates OpenAI chat/completions calls into claude -p invocations, so your existing subscription covers everything.
Works with anything that takes an OPENAI_BASE_URL — I use it with OpenClaw, OpenCode, and Cline running concurrently. Added a small CLI (ocp usage) that reads the rate-limit headers so you can check session/weekly utilization without opening the browser.
Setup is one command: node setup.mjs. Runs as a launchd/systemd daemon.
Interesting approach. The challenge I keep running into is that even with access to multiple models, there's still no native signal for when to trust the output vs when to flag it for review.
Good point. OCP currently focuses on the cost/routing layer, not output quality. But you're touching on something interesting — since OCP sits between the tool and the model, it could theoretically add a lightweight confidence signal (e.g., flag responses that took unusually long, or where the model hedged heavily).
Not on the roadmap yet, but it's the kind of thing that becomes possible once you have a control plane between your agents and the models.
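To make the idea concrete, a proxy-side signal could be as dumb as "took too long, or hedged a lot." A sketch of what that might look like (thresholds and hedge list are made up for illustration, not an OCP feature):

```javascript
// Hypothetical confidence heuristic a proxy could attach to responses:
// flag unusually slow generations and heavy hedging in the output text.
const HEDGES = ["might", "possibly", "i'm not sure", "it depends", "probably"];

function confidenceFlags(text, elapsedMs, { maxMs = 30000, maxHedges = 3 } = {}) {
  const lower = text.toLowerCase();
  // Count how many distinct hedge phrases appear in the response.
  const hedgeCount = HEDGES.filter((h) => lower.includes(h)).length;
  return {
    slow: elapsedMs > maxMs,         // generation took unusually long
    hedged: hedgeCount >= maxHedges, // model hedged heavily
  };
}
```

Crude, but since the proxy already sees timing and the full response body, it gets this metadata for free.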
Nice, using Claude to build a tool to fool Claude :)
How does it decide which model to use per invocation?
The caller specifies the model in the request body (just like a normal OpenAI API call). OCP maps it to the corresponding Claude CLI flag:
- claude-sonnet-4-6 → claude -p --model sonnet
- claude-opus-4-6 → claude -p --model opus
If you don't specify, it defaults to Sonnet. There's no automatic model selection yet — that's coming in v4 with agent-aware routing (different agents get different models based on their role).
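The mapping above is basically a lookup table with a fallback. A minimal sketch (illustrative names; the real OCP mapping may differ):

```javascript
// Hypothetical sketch of the model-name mapping described above:
// OpenAI-style model IDs resolve to a Claude CLI --model value,
// with unknown or missing names falling back to Sonnet.
const MODEL_MAP = {
  "claude-sonnet-4-6": "sonnet",
  "claude-opus-4-6": "opus",
};

function resolveModel(requested) {
  return MODEL_MAP[requested] || "sonnet";
}
```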