“Opus 4.6 found 22 security bugs, Mythos found 271 on an initial evaluation” sure seems to refute the grumbling I’ve seen from a couple OAI people on Twitter that Mythos isn’t actually anything special and everything it finds could be found by earlier models too.