Code as Agent Harness

(arxiv.org)

5 points | by matt_d 13 hours ago

1 comment

  • promptsaredead 6 hours ago

    I think a lot of SOTA models are already going this way. Long autonomous tasks in Claude Code / Codex already do this to stay on track and avoid compounding errors over many steps.