3 points | by chrisdudek 8 hours ago ago
5 comments
> What works... External mechanical checks
This is what I expected verification to mean going in.
For some reason verification in this article means "try to convince the AI even harder than before"
[dead]
I suspected that given some responses the agent was giving during the experimentation. Looked sometimes tendentious when measuring.
I tried to also avoid that, and the main orchestrating agent was running subagents running experiments, and others were checking the results. Somewhat it worked as it was not imposing only successes, there were more failures.
> What works... External mechanical checks
This is what I expected verification to mean going in.
For some reason verification in this article means "try to convince the AI even harder than before"
[dead]
I suspected that given some responses the agent was giving during the experimentation. Looked sometimes tendentious when measuring.
[dead]
I tried to also avoid that, and the main orchestrating agent was running subagents running experiments, and others were checking the results. Somewhat it worked as it was not imposing only successes, there were more failures.