4 points | by AkshatVirmani 9 hours ago
3 comments
Nice to see a benchmark in this space, especially one with black-box constraints.
I like that the scoring is biased toward bug detection and not just test generation. Generating lots of tests with AI is easy, but that doesn't necessarily mean they're good.
Curious... let me see if this works for our internal setup.