4 comments

  • anup_sia 13 hours ago

    LLMs are great at creating plans but terrible at following them. I've seen agents claim to create 5 files but only make 2, repeat API calls 3x, skip error handling, then report success anyway. The fix is to treat execution like todo management: track every step, block the agent if it tries to use tools not in the current step, and verify completion (don't trust its word; actually check that the file exists). This, plus guardrails and git-like versioning, improved reliability significantly.
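
A minimal sketch of that loop in Python, to make the idea concrete. Every name here (`Step`, `PlanExecutor`, `write_file`) is hypothetical and not taken from any agent framework; it assumes each step can declare which tools it may use and supply its own independent check.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Step:
    name: str
    allowed_tools: Set[str]        # tools the agent may call during this step
    verify: Callable[[], bool]     # independent check; never the agent's own report
    done: bool = False


class PlanExecutor:
    """Hypothetical todo-style tracker: one active step at a time."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = list(steps)
        self.index = 0

    @property
    def current(self) -> Optional[Step]:
        return self.steps[self.index] if self.index < len(self.steps) else None

    def call_tool(self, tool_name: str, tool_fn: Callable, *args, **kwargs):
        step = self.current
        if step is None:
            raise RuntimeError("plan already finished")
        # Block any tool that is not part of the current step.
        if tool_name not in step.allowed_tools:
            raise PermissionError(f"{tool_name!r} not allowed in step {step.name!r}")
        return tool_fn(*args, **kwargs)

    def complete_step(self) -> bool:
        step = self.current
        if step is None:
            raise RuntimeError("plan already finished")
        # Advance only if the check passes, whatever the agent claims.
        if not step.verify():
            return False
        step.done = True
        self.index += 1
        return True


def write_file(path: str, text: str) -> None:
    # Stand-in for a real tool (assumption for this sketch).
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


target = os.path.join(tempfile.mkdtemp(), "report.txt")
plan = PlanExecutor([
    Step("write report", {"write_file"}, lambda: os.path.exists(target)),
])

claimed_done = plan.complete_step()   # agent says "done" before writing anything
plan.call_tool("write_file", write_file, target, "hello")
actually_done = plan.complete_step()  # now the file exists, so the step advances
```

The point of `verify` being a plain callable is that the check lives outside the model: a premature "done" is rejected until the file is really on disk. Versioning and other guardrails are left out of this sketch.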

    • verdverm 13 hours ago

      seems reasonable and resonates with the approach I plan to take when I start building my agent

  • mayankd 8 hours ago

    For sure. Agents tend to bang their heads against a wall and can deviate in surprising ways trying to get around it. Scoping a plan well and making agents stick to it is a tricky balance to strike.

  • sunir 12 hours ago

    If the plan is too big to fit into context, or requires too much attention, it overwhelms the LLM. You need to decompose it into tasks and todos aggressively.
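
One rough way to picture that decomposition is to pack todos into batches that each fit a fixed budget and hand the model one batch at a time. This is only a sketch under loud assumptions: `estimate_tokens` and `batch_todos` are hypothetical helpers, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def batch_todos(todos: List[str], budget: int) -> List[List[str]]:
    """Greedily group todos, in order, so each batch stays within `budget`."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for todo in todos:
        cost = estimate_tokens(todo)
        if cost > budget:
            # A single todo this large should be split further by hand.
            raise ValueError(f"todo exceeds budget: {todo!r}")
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(todo)
        used += cost
    if current:
        batches.append(current)
    return batches


todos = [
    "create schema file",
    "write migration script",
    "add error handling to loader",
    "run tests",
]
batches = batch_todos(todos, budget=6)
```

A todo that exceeds the budget on its own raises instead of being silently truncated, which matches the advice to keep decomposing until each piece is small.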