We built Eval Studio, a CLI tool for testing AI agents locally.
Most agent workflows I’ve seen don’t have any real evaluation layer. People test manually or rely on prompt tweaks.
I wanted something closer to how we treat backend systems, where you can run tests before shipping.
Eval Studio:
* scans your repo and detects likely agents
* generates eval datasets based on your agent
* runs tests locally against your implementation
* surfaces failures and behavioral gaps
It doesn’t require deploying anything — it runs directly on your local setup.
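To make the "run tests before shipping" idea concrete, here's a minimal sketch of what an agent eval case can look like in plain Python. The dataset format, the `toy_agent` function, and the substring check are illustrative assumptions for this post, not Eval Studio's actual API.

```python
# Illustrative sketch only -- not Eval Studio's API.
# A minimal eval loop: run each dataset case through an agent
# function and flag outputs that miss the expected behavior.

def toy_agent(prompt: str) -> str:
    # Stand-in for a real LLM agent call.
    return "refund approved" if "refund" in prompt.lower() else "escalate to human"

# Hypothetical eval dataset: (input, expected substring) pairs.
DATASET = [
    ("Customer asks for a refund on a damaged item", "refund"),
    ("Customer threatens legal action", "escalate"),
]

def run_evals(agent, dataset):
    """Return the list of failing cases: (prompt, expected, actual)."""
    failures = []
    for prompt, expected in dataset:
        output = agent(prompt)
        if expected not in output:
            failures.append((prompt, expected, output))
    return failures

if __name__ == "__main__":
    failures = run_evals(toy_agent, DATASET)
    print(f"{len(DATASET) - len(failures)}/{len(DATASET)} cases passed")
```

The point is that agent behavior becomes a checked artifact in your repo rather than something you eyeball in a chat window; a real harness would add LLM-graded checks on top of literal assertions like these.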
Get your API key and try it: dutchmanlabs.com
Would really appreciate feedback, especially from people building LLM apps or agent workflows.