Show HN: RewardGuard – detect reward hacking in RL training loops

(github.com)

1 point | by Giovan321 13 hours ago

3 comments