LLM coding workflow going into 2026

(medium.com)

6 points | by lobo_tuerto a day ago

2 comments

  • dchuk 21 hours ago

    I’ve been using Agent OS (https://buildermethods.com/agent-os) with Claude Code’s $100 plan and it is pretty darn great. I usually ideate back and forth with Claude and ChatGPT to get to a PRD for the whole product/project, with a high-level roadmap, then go through Agent OS to bootstrap the core artifacts. From there it’s just a loop: shape specs, break them into tasks, implement, then I manually test it out. Using the “standards” concept to produce skills in Claude Code seems to help a lot.

    I’m currently working on a Mac SwiftUI app, in a language I’ve never built anything in, and it’s progressing nicely, has good test coverage, and I haven’t looked at a line of code. I found a couple of SwiftUI skills repos online, had Claude adapt them to the Agent OS approach, and then hit the ground running.

    It also basically functions like the really popular Ralph Wiggum concept everyone is raving about lately: implementation just runs on its own with a bunch of parallel agents (a rough sketch of that loop follows below), and sometimes I have to nudge the model after my smoke testing to clean up some stuff. But overall it just works, and is immensely productive. Agent OS adds just enough structure to tame complexity and variability. I highly recommend it and have no other connection to it. I have some thoughts on enhancements that I’ll probably submit a few PRs for, or I’ll fork the repo and implement them.
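    A minimal Python sketch of that kind of loop. This is an illustration, not Agent OS’s actual implementation: the file names (PROMPT.md, tasks.md), the checkbox convention, and the test command are assumptions; `claude -p` is Claude Code’s non-interactive print mode.

      # Hypothetical Ralph-Wiggum-style loop: re-run the agent on the same
      # instructions until the task list has no unchecked items left.
      import subprocess
      from pathlib import Path

      PROMPT = Path("PROMPT.md").read_text()  # spec + "pick the next unchecked task"

      def tasks_remaining() -> bool:
          # assumed convention: tasks.md tracks work as "- [ ]" checkboxes
          return "- [ ]" in Path("tasks.md").read_text()

      while tasks_remaining():
          # each pass starts a fresh, stateless agent run on the same prompt
          subprocess.run(["claude", "-p", PROMPT], check=True)
          # smoke-check with the project's test command (assumed: SwiftPM)
          subprocess.run(["swift", "test"], check=False)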

  • sciences44 a day ago

    Really interesting workflow, Addy! Two questions about your approach:

    1. *Spec → Implementation process:* How do you ensure Claude actually completes everything in the spec and finds the optimal path? Do you:

       - Document every step in extreme detail upfront (spec as single source of truth)?
       - Use an agentic framework that lets Claude self-guide through implementation?
       - Iteratively validate each step with human checkpoints? (a minimal sketch of this option follows below)
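    For the checkpoint option, one possible shape in Python. The step strings and the `claude -p` call are hypothetical, not anyone's confirmed process:

      # Hypothetical checkpoint loop: one spec step per agent run, with an
      # explicit human approval gate before the next step starts.
      import subprocess

      steps = [
          "Implement the data model from spec section 2",
          "Wire up the persistence layer from spec section 3",
      ]

      for step in steps:
          subprocess.run(["claude", "-p", step], check=True)
          if input(f"Finished {step!r}. Continue? [y/N] ").strip().lower() != "y":
              break  # stop here and inspect before the agent goes further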

    2. *Tool comparison:* Have you experimented with GitHub Copilot vs Claude Code vs Cursor? What made you settle on your current stack?

    I'm working on multi-step AI pipelines (3D mesh generation with validation stages), and I find that LLMs often skip edge cases or take suboptimal paths when given too much autonomy.

    Curious if you've built any scaffolding/guardrails to keep the LLM on track, or if your spec writing has evolved to be more "agent-friendly"?
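    One common shape for that kind of guardrail, sketched in Python: every stage's output passes a validation gate, and failures are fed back to the model rather than letting it run open-loop. The stage and validator names are placeholders, not a real mesh-generation API.

      # Hypothetical staged pipeline with validation gates, in the spirit of
      # the mesh-generation example: validate each stage and feed failures
      # back to the model instead of trusting it end to end.
      from typing import Callable, Optional

      def run_stage(generate: Callable[[str], str],
                    validate: Callable[[str], Optional[str]],
                    prompt: str, max_retries: int = 3) -> str:
          feedback = ""
          for _ in range(max_retries):
              output = generate(prompt + feedback)
              error = validate(output)  # None means the gate passed
              if error is None:
                  return output
              feedback = f"\n\nPrevious attempt failed validation: {error}. Fix and retry."
          raise RuntimeError("stage failed validation after retries")

      # usage (names are placeholders):
      #   mesh = run_stage(llm_generate_mesh, check_watertight, spec_text)
      #   mesh = run_stage(llm_refine_mesh, check_polycount, mesh)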

    The balance between human specification and agent autonomy seems like the key challenge going into 2026, especially if agent-written code is going to make it into production.