SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Tasks
(arxiv.org)
1 point | by FiberBundle | 14 hours ago
1 comment
cestivan
13 hours ago
[dead]