I always get Claude Code to create a plan unless the task is trivial. It will describe all the changes it's going to make and to which files, then I let it rip in a new context.
the hiding stuff is weird because the whole reason you'd want to see what Claude is doing isn't just curiosity - it's about catching when it goes off the rails before it makes a mess. like when it starts reading through your entire codebase because it misunderstood what you asked for, or when it's about to modify files you didn't want touched. the verbose mode fix is good but honestly this should've been obvious from the start - if you're letting an AI touch your files, you want to know exactly which files. not because you don't trust the tool in theory but because you need to verify it's doing what you actually meant, not what it thinks you meant. abstractions are great until they hide the thing that's about to break your build
> it's about catching when it goes off the rails before it makes a mess
The latest "meta" in AI programming appears to be agent teams (or swarms or clusters or whatever) that are designed to run for long periods of time autonomously.
Through that lens, these changes make more sense. They're not designing UX for a human sitting there watching the agent work. They're designing for horizontally scaling agents that work in uninterrupted stretches where the only thing that matters is the final output, not the steps it took to get there.
That said, I agree with you in the sense that the "going off the rails" problem is very much not solved even on the latest models. It's not clear to me how we can trust a team of AI agents working autonomously to actually build the right thing.
None of those wild experiments are running on a "real", existing codebase that is more than 6 months old. The thing they don't talk about is that nobody outside these AI companies wants to vibe code with a 10-year-old codebase with 2000 enterprise customers.
As soon as you start to work with a codebase that you care about and need to seriously maintain, you'll see what a mess these agents make.
Looking at it from afar, it's simply making something large from a smaller input, so it's kind of like nondeterministic decompression.
What fills the holes are best practices; what can ruin the result are wrong assumptions.
I don't see how full autonomy can work without checkpoints along the way, either.
Yes, this is why I generally still use "ask for permission" prompts.
As tedious as it is a lot of the time (and I wish there were an in-between "allow this session", not just allow once or "allow all"), it's invaluable for catching when the model has tried to fix the problem in entirely the wrong project.
Working on a monolithic codebase with several hundred library projects, it's essential that it doesn't start digging in the wrong place.
It's better than it used to be, but the failure mode can be extreme: I've come back to 20+ minutes of it going around in circles, frustrating itself because of a wrong meaning ascribed to an instruction.
The other side of catching it going off the rails is when it wants to make edits without reading the context I know would've been necessary for a high-quality change.
> Cherny responded to the feedback by making changes. "We have repurposed the existing verbose mode setting for this," he said, so that it "shows file paths for read/searches. Does not show full thinking, hook output, or subagent output (coming in tomorrow's release)."
How to comply with a demand to show more information by showing less information.
Words have lost all meaning. "Verbose" no longer means "containing more words than necessary" but instead "a bit more than usual". "Fast" no longer means "characterized by quick motion, operation, or effect"; instead it depends on the company: some use a slightly different approach at the same speed, but call it "fast mode".
It's just a whole new world where words suddenly mean something completely different. You can no longer understand programs just by reading the labels they use for various things; you also have to look up whether what they think "verbose" means matches the meaning you've built up first.
This is really the kind of thing Claude sometimes does: "Actually, wait... let's repurpose the existing verbose mode for this. Simpler, and it fits the user's request to limit bloat."
Claude Code logs conversations to ~/.claude/projects, so you can write a tool to view them. I made a quick tool that has been valuable over the last few weeks: https://github.com/panozzaj/cc-tail
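If you want to roll your own, a minimal sketch in Python might look like the following. It tails the newest session log; the exact JSONL schema (including the "type" field) is an assumption based on inspecting my own logs, so adjust the field names to whatever you find in yours.

    #!/usr/bin/env python3
    # Print a one-line summary of each event in the newest Claude Code
    # session log. Assumes logs live at ~/.claude/projects/*/*.jsonl,
    # one JSON object per line (field names may vary by version).
    import json
    from pathlib import Path

    logs = sorted(Path.home().glob(".claude/projects/*/*.jsonl"),
                  key=lambda p: p.stat().st_mtime)
    if not logs:
        raise SystemExit("no Claude Code logs found")

    with logs[-1].open() as f:
        for line in f:
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or corrupt lines
            # Compact summary instead of the full record.
            print(event.get("type", "?"), str(event)[:120])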
It's probably in their interest to have as many vibed codebases out there as possible, ones that no human would ever want to look at. Incentivising never-look-at-the-code is effectively workflow lock-in.
I always review every single change / file in full, and spend around 40% of the time it takes to produce something doing so. I assume it's the same for a lot of people who used to develop code and swapped to mostly code generation (since it's just faster). The time I spend looking at it depends on how much I care about it - something you don't really get when writing things manually.
Well, there is OpenCode [1] as an alternative, among many others. I have found OpenCode to be the closest to the Claude Code experience, and I find it quite good. Having said that, I still prefer Claude Code for the moment.
[1] https://opencode.ai/
What does Claude Code do differently that you still prefer it? I'm so in love with OpenCode, I just can't go back. It's such a nicer way of working. I even love the more advanced TUI.
I haven't tried it myself, but there were plenty of people in the other thread complaining that even on the Max subscription they couldn't use OpenCode.
OpenCode would be nicer if they used normal terminal scrolling and not their own thing :(
oh-my-pi plug: https://github.com/can1357/oh-my-pi
Not trying to tell anyone else how to live, just want to make sure the other side of this argument is visible. I run 5+ agents all day, every day. I measure, test, and validate outputs exhaustively. I value the decrease in output noise here because I am very much not looking to micromanage the process; I am simply too slow to keep up. When I want logging I can follow to understand the "thought process", I ask for it in a specific format in my prompt, something like: "talk through the problem and your exploration of the data step by step as you go, before you make any changes or do any work, and use that plan as the basis of your actions".
I still think it’d be nice to allow an output mode for you folks who are married to the previous approach since it clearly means a lot to you.
> I run 5+ agents all day every day
Curious what plans you're using? Running 24/7 x 5 agents would eat up several $200 subscriptions pretty fast.
My primary plan is the $200 Claude max. They only operate during my working hours and there is significant downtime as they deliver results and await my review.
When their questionnaire asked me for feedback, I specifically mentioned that I hoped they would not reduce visibility to the level of GitHub Actions.
I guess that fell on deaf ears.
Can anybody shatter my pessimism and offer an anecdote of a high-employee-count firm actually having humans read the feedback? I suspect it's just kept around for "later" but never actually looked at by anyone...
Perhaps they can just make it an option??
Why not run Claude on a FUSE-based filesystem, and make a script that shows the user which files are being accessed?
"You can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem" — https://news.ycombinator.com/item?id=9224
If you read their immediate reply you’ll see it invalidates this tired point: https://news.ycombinator.com/item?id=9479
A better link, which shows the exchange:
https://news.ycombinator.com/item?id=9224
Or this one, which pulls up the exchange under the famous HN post itself:
https://news.ycombinator.com/item?id=8863
You can basically ask Claude to build it for you :)
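For what it's worth, here's roughly what that could look like: a minimal read-only passthrough filesystem that logs every file open. This is a sketch, not a vetted tool; it assumes the fusepy package (pip install fusepy) and a host with FUSE installed.

    #!/usr/bin/env python3
    # Read-only passthrough FUSE filesystem that logs file accesses.
    # Usage: python access_logger.py /real/project /mnt/watched
    # then point Claude at /mnt/watched and watch stderr.
    import errno, os, sys
    from fuse import FUSE, FuseOSError, Operations

    class AccessLogger(Operations):
        def __init__(self, root):
            self.root = root

        def _full(self, path):
            return os.path.join(self.root, path.lstrip("/"))

        def getattr(self, path, fh=None):
            try:
                st = os.lstat(self._full(path))
            except OSError:
                raise FuseOSError(errno.ENOENT)
            return {k: getattr(st, k) for k in (
                "st_mode", "st_size", "st_uid", "st_gid",
                "st_atime", "st_mtime", "st_ctime", "st_nlink")}

        def readdir(self, path, fh):
            return [".", ".."] + os.listdir(self._full(path))

        def open(self, path, flags):
            print(f"OPEN {path}", file=sys.stderr)  # the part we care about
            return os.open(self._full(path), flags)

        def read(self, path, size, offset, fh):
            os.lseek(fh, offset, os.SEEK_SET)
            return os.read(fh, size)

        def release(self, path, fh):
            os.close(fh)
            return 0

    if __name__ == "__main__":
        real, mountpoint = sys.argv[1], sys.argv[2]
        FUSE(AccessLogger(real), mountpoint, foreground=True)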
That is such a silly framing. They are not trying to hide anything. They are trying to create a better product -- and might be making unpopular or simply bad choices along the way -- but the objective here is not to obfuscate which files are edited. It's a side effect.
Instead of adding a settings option to hide the filenames, they hide them for everyone AND rewrite verbose mode, which is no longer a verbose mode but the way to see filenames, thus breaking the workflow of everyone who depended on them, for... what, exactly?
If they were trying to create a better product, I'd expect them to just add the option, not hide something that can save you thousands of tokens of context when the model heads the wrong way.
How can you combat one unprovable framing by insisting on another unprovable framing?