A local proxy that enforces token budgets on every OpenAI-compatible API call. No SDK changes. No code changes. One command.
The proxy sits between your application and the real API. Your code doesn't need to change at all.
Every request is logged with token counts. Blocked requests show exactly why they were stopped.
Use one or combine all three. The strictest matching limit wins.
Just change the --upstream flag.