Actual Changelog[1]
* New native VS Code extension
* Fresh coat of paint throughout the whole app
* /rewind a conversation to undo code changes
* /usage command to see plan limits
* Tab to toggle thinking (sticky across sessions)
* Ctrl-R to search history
* Unshipped claude config command
* Hooks: Reduced "PostToolUse 'tool_use' ids were found without 'tool_result' blocks" errors
* SDK: The Claude Code SDK is now the Claude Agent SDK
* Add subagents dynamically with --agents flag
[1] https://github.com/anthropics/claude-code/blob/main/CHANGELO...
You can find the revamped prompt on github[1], or on twitter summarized by my bot[2].
[1] https://github.com/marckrenn/cc-mvp-prompts/compare/v1.0.128...
[2] https://x.com/CCpromptChanges/status/1972709093874757976
Can anyone find the prompts for the new "Output style"s, ie Explanatory and Learning?
> IMPORTANT: DO NOT ADD *ANY** COMMENTS unless asked*
Interesting. This was in the old 1.x prompt, removed for 2.0. But CC would pretty much always add comments in 1.x, something I would never request, and would often have to tell it to stop doing (and it would still do it sometimes even after being told to stop).
I can't decide if I like this change or not, tbh. I almost always delete the comments Claude adds, to be sure - but at the same time they seem to provide a sort of utility for me as I read through the generated code. They also act, in a funny way, as a kind of checklist as I review changes - I want them all cleaned up (or maybe edited and left in place) before I PR.
I am guessing this is an attempt to save computing resources/tokens?
This is excellent. Thanks for sharing this.
You're very welcome – that really means a lot coming from you, Simon.
Are you running the bot with the free tier api?
I'm using Anthropic's pay-as-you-go API, since it was easier to set up on the server than CC's CLI/web login method. Running the bot costs me ~$1.8 per month.
The bot is based on Mario Zechner's excellent work[1] - so all credit goes to him!
[1] https://mariozechner.at/posts/2025-08-03-cchistory/
I think amrrs is referring to the x.com API.
Oh, I'm sorry. Yes, I'm using x's free tier.
FINALLY checkpoints! All around good changes, Claude Code is IMHO the best of the LLM CLI tools.
How do checkpoints work ?
You can rewind your context back to the checkpoint
No, that's not the point of this new checkpoints feature. It's already been possible for a while to rewind context in Claude Code by pressing <ESC><ESC>. This feature rewinds code state alongside context:
> Our new checkpoint system automatically saves your code state before each change, and you can instantly rewind to previous versions by tapping Esc twice or using the /rewind command.
https://www.anthropic.com/news/enabling-claude-code-to-work-...
Lots of us were doing something like this already with a combination of WIP git commits and rewinding context. This feature just links the two together and eliminates the manual git stuff.
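For anyone who never built the habit, the manual version was plain git plus <ESC><ESC> for the conversation side (the commit message here is illustrative):

```
# checkpoint the working tree before letting the agent loose
git add -A && git commit -m "WIP: pre-agent checkpoint"

# ...agent makes its changes...

# didn't like the attempt? throw the code changes away
git reset --hard HEAD~1
# ...then press <ESC><ESC> in Claude Code to rewind the conversation separately
```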
That is nice, but it makes me wonder how little people actually know and use git nowadays. This is, after all, something git really shines at. Still good to see! (It's not like I can't still just use git for that, which I fully intend to do.)
That was my first thought too - but this is subtly different, and rewinds the context too. Actually highly useful, because I have often felt like a bad first pass at a solution poisoned my context with Claude.
Ah, thank you, that's a great subtlety that I missed before!
There's value in rewinding both the code and the prompt to the same point in time.
For the first few hours of using claude code, I was really excited about finally not being too lazy to commit often, because cc would do it for me. But then I hit my pro account limit and realized that I'd rather spend my tokens writing features instead of commits... I should probably upgrade my account.
> New native VS Code extension
This is pretty funny while Cursor shipped their own CLI.
And GitHub Copilot shipped a CLI too.
> New native VS Code extension
Looks great, but it's kind of buggy:
- I can't figure out how to toggle thinking
- Have to click in the text box to write, not just anywhere in the Claude panel
- Have to click to reject edits
It seems they also removed the bypass permission setting...
Tab-completion of filenames in the directory tree is now unavailable. You'll need to use the Codex-style @file to bring up an fzf-style list.
I think they had the @file thing before codex existed
This is a tangent, but why is there a Jupyter notebook cell editor function and tool usage direction built into the standard context?
Editing Jupyter notebooks in VSCode side by side with the Claude Code extension is a pretty good workflow.
I'm disappointed that they haven't done more to make the /resume command more usable. It's still useless for all intents and purposes.
Resume is now a drop-down menu at the top of the new VS Code plugin, and it's much easier to read.
ooh I like my ctrl-R in gemini cli. Good that it lands here too.
Now if only `/rewind` could undo the `rm -rf ~/*` commands and other bone-headed things it tries to do on the filesystem when you're not watching!
I really like these tools. Yesterday I gave it a filename for a video of my infant daughter eating which I took while I had my phone on the charger. The top of the charger slightly obscured the video.
I told it to crop the video to just her and remove the obscured portion and that I had ffmpeg and imagemagick installed and it looked at the video, found the crop dimensions, then ran ffmpeg and I had a video of her all cleaned up! Marvelous experience.
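For the curious, the commands it ran were presumably something along these lines (crop numbers are illustrative, not what it actually computed; ffmpeg's crop filter takes out_w:out_h:x:y):

```
# inspect the frame size first
ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height -of csv=p=0 input.mp4

# drop the obscured top 200px of a 1080x1920 portrait video, keep audio as-is
ffmpeg -i input.mp4 -vf "crop=1080:1720:0:200" -c:a copy output.mp4
```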
My only complaint is that sometimes I want high speed. Unfortunately Cerebras and Groq don't seem to have APIs that are compatible enough for someone to have put them into Charm Crush or anything. But I can't wait for that.
You could try to use a router. I'm currently building this:
https://github.com/grafbase/nexus/
If Groq talks the OpenAI API, you enable the Anthropic protocol and an OpenAI provider with a base URL pointing at Groq. Set ANTHROPIC_BASE_URL to the router's endpoint and start claude.
I haven't tested Groq yet, but this could be an interesting use case...
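Concretely, the wiring would look something like this (the localhost URL is an illustrative stand-in for wherever the router runs):

```
# Sketch: point Claude Code at an Anthropic-compatible router/proxy
export ANTHROPIC_BASE_URL="http://localhost:8000"     # the router's endpoint (illustrative)
export ANTHROPIC_AUTH_TOKEN="<upstream-provider-key>"
claude
```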
I assumed that OpenRouter wouldn't deliver the same tokens/second, which seems to have been a complete mistake; I should have tried it to see. I currently use `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` with z.ai and it works well, but CC 2.0 now displays a warning:
> Auth conflict: Both a token (ANTHROPIC_AUTH_TOKEN) and an API key (/login managed key) are set. This may lead to unexpected behavior.
> • Trying to use ANTHROPIC_AUTH_TOKEN? claude /logout
> • Trying to use /login managed key? Unset the ANTHROPIC_AUTH_TOKEN environment variable.
Probably just another flag to find.
EDIT: For anyone coming here from elsewhere, Crush from Charm supports Cerebras/Groq natively!
Cerebras has OpenAI-compatible "Qwen Code" support at ~4000 tokens/s, running Qwen Code's 480B-param MoE model, which is quite good. Not quite Sonnet-good, but the speed is amazing.
https://www.cerebras.ai/blog/introducing-cerebras-code
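A hedged sketch of that OpenAI-compatible surface (the model id is illustrative; check the Cerebras docs for current names):

```
curl https://api.cerebras.ai/v1/chat/completions \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen-3-coder-480b",
       "messages": [{"role": "user", "content": "Write hello world in C"}]}'
```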
When they announced this, I went to try it, and it really only works with Cline (which is what they promote there), but Cline has this VSCode dependency as far as I know, and I don't really like that. I have my IDE flow and my CLI flow, and I don't want to mix them.
But you're right, they have an OpenAI compatible API https://inference-docs.cerebras.ai/resources/openai so perhaps I can actually use this in the CLI! Thanks for making me take another look.
EDIT: Woah, Charm supports this natively. This is great. I am going to try this now.
Cerebras is super cool. I wish OpenAI and Anthropic would have their models hosted there. But I guess supporting yet another platform is hard.
Something I realized about this category of tool (I call them "terminal agents", but that already doesn't work now that there's an official VS Code extension for this; maybe just "coding agents" instead) is that they're actually an interesting form of general agent.
Claude Code, Codex CLI etc can effectively do anything that a human could do by typing commands into a computer.
They're incredibly dangerous to use if you don't know how to isolate them in a safe container but wow the stuff you can do with them is fascinating.
They're only as dangerous as the capabilities you give them. I just created a `codex` and `claude` user on my Linux box and practically always run in yolo mode. I've not had a problem so far.
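A sketch of that setup on a stock Linux box (the bypass flag is Claude Code's real one; adapt the rest to taste):

```
sudo useradd -m claude                    # unprivileged account, no sudo rights
sudo -iu claude                           # log in as that user
claude --dangerously-skip-permissions     # "yolo mode", blast radius limited to this user's files
```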
Also, I think shellagent sounds cooler.
That's a great way to run this stuff.
I expect the portion of Claude Code users who have a dedicated user setup like this is pretty tiny!
One nice thing is that Anthropic provides a sample DevContainer: https://github.com/anthropics/claude-code/tree/main/.devcont...
Not the exact setup, but also pretty solid.
One thing I really like using them for is refactoring/reorganizing. The tedium of renaming things, renaming all implementations, moving files around, creating/deleting folders, and updating imports/exports all melts away when you task an agent with it. Of course, this assumes they're good enough to do it with quality, which is true maybe 75% of the time for me so far.
I've found that it can be hard or expensive for the agent to do "big but simple" refactors in some cases. For example, I recently tasked one with updating all our old APIs to accept a more strongly typed user ID instead of a generic UUID type. No business logic changes, just change the type of the parameter, and in some cases be wary of misleading argument names by lazy devs copy pasting code. This ended up burning through the entire context window of GPT-5-codex and cost the most $ of anything I've ever tasked an agent with.
does it use the smart refactoring hooks of the IDEs or does it do blunt text replacement
Blunt text replacement so far. There are third-party VSCode MCP and LSP MCP servers out there that DO expose those higher-level operations. I haven't tried them out myself -- but it's on my list, because I expect they'd cut down on token use and improve latency substantially. I expect Anthropic to eventually build that into their IDE integration.
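If you want to experiment, wiring one in would look roughly like this (`some-lsp-mcp` is an illustrative placeholder, not a real package; `claude mcp add` is the real registration command):

```
claude mcp add lsp -- npx -y some-lsp-mcp --workspace .
```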
Currently it's very slow because it does text replace. It would be way faster if it could use the IDE functions via an MCP.
The latter
[flagged]
Please don't cross into personal attack no matter how wrong someone is or you feel they are.
https://news.ycombinator.com/newsguidelines.html
Edit: We've had to ask you this more than once before, and you've continued to do it repeatedly (e.g. https://news.ycombinator.com/item?id=45389115, https://news.ycombinator.com/item?id=45282435). If you don't fix this, we're going to end up banning you, so it would be good if you'd please review the site guidelines and stick to them from now on.
Amazing work, dang. Is there a way to report a comment to the mods, or does the flag feature already do that?
So tell us how to safely run this stuff then.
I was under the impression that Docker container escapes are actually very rare. How high do you rate the chance of a prompt injection attack against Claude running in a docker container on macOS managing to break out of that container?
(Actually hang on, you called me out for suggesting containers like Docker are safe but that's not what I said - I said "a safe container" - which is a perfectly responsible statement to make: if you know how to run them in a "safe container" you should do so. Firecracker or any container not running on your own hardware would count there.)
I'll also point out that I've been writing about security topics for 22 years. https://simonwillison.net/tags/security/
> So tell us how to safely run this stuff then.
That's the secret, cap... you can't. And it's due to in-band signalling, something I've mentioned on numerous occasions. People should entertain the idea that we're going to have to re-educate people about what is and isn't possible, because the AI world has been playing make-believe so much that they can't see the fundamental problems to which there is no solution.
https://en.m.wikipedia.org/wiki/In-band_signaling
> They're incredibly dangerous to use if you don't know how to isolate them in a safe container but wow the stuff you can do with them is fascinating.
True, but all it will take is one report of something bad/dangerous actually happening, and everyone will suddenly get extremely paranoid and start using correct security practices. Most of the "evidence" of AI misalignment seems more like bad prompt design or misunderstanding of how to use tools correctly.
This seems unlikely. We've had decades of horrible security issues, and most people have not gotten paranoid. In fact, after countless data leaks, crypto miner schemes, ransomware, and massive global outages, now people are running LLM bots with the full permission of their user and no guardrails and bragging about it on social media.
"When you use Claude Code, we collect feedback, which includes usage data (such as code acceptance or rejections), associated conversation data, and user feedback submitted via the /bug command."
So I can opt out of training, but they still save the conversation? Why can't they just not use my data when I pay for things? I am tired of paying, and then them stealing my information. Tell you what: create a free tier that harvests data as the cost of the service. If you pay, no data harvesting.
> So I can opt out of training
Even that is debatable. There are a lot of weasel words in their text. At most they're saying "we're not training foundation models on your data", which is not to say "we're not training reward models on your data" or "we're not testing our other models on your data" and so on.
I guess the safest way to view this is to consider anything you send them as potentially ending up in the next LLM, for better or worse.
> When you use Claude Code, we collect feedback
When they ask "How is Claude doing this session?", that appears to be a sneaky way for them to harvest the current conversation based on the terms-of-service clause you pointed out.
This enables the /resume command that lets you start mid-conversation again.
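As far as I can tell, those stored transcripts are what back the session flags:

```
claude --continue    # reopen the most recent conversation in this project
claude --resume      # pick an earlier session from an interactive list
```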
Storing the data is not the same as stealing. It's helpful for many use cases.
I suppose they should have a way to delete conversations though.
That's not just them saving it locally to like `~/.claude/conversations`? Feels weird if all conversations are uploaded to the cloud + retained forever.
Ooo - good question. I'm unsure on this one.
I have been using Claude Code + VS Code extensively for coding, but in the last few months it has been a frustrating downgrade compared to pasting the same prompts and code into ChatGPT.
Is this going to be the way forward? Switching to whichever is better at a task, code base or context?
The vscode integration does feel far tighter now. The one killer feature that Cursor has over it is the ability to track changes across multiple edits. With Claude you have to either accept or reject the changes after every prompt. With Cursor you can accumulate changes until you're ready to accept. You can use git of course but it isn't anywhere near as ergonomic.
I'm currently using Goose[1]. My brother in law uses Claude Code and he likes it. It makes me wonder if I'm missing anything. Can anyone tell me if there's any reason I should switch to Claude Code, or comparisons between the two?
1: https://block.github.io/goose/
I tried goose and it seems like there's a lot of nice defaults that Claude Code provides that Goose does not. How did you do your initial configuration?
What I've been trying to use it for is to solve a number of long-standing bugs that I've frankly given up on in various Linux tools.
I think I lack the social skills to community-drive a fix, probably through some undiagnosed disorder or something, so I've been trying to soldier on alone with some issues I've had for years.
The issues are things like focus-jacking in a window manager I'm using on Xorg, where the keyboard and the mouse get separate focuses.
Goose has been somewhat promising, but still not great.
I mean, overall, I don't think any of these coding agents have given me useful insight into my long-vexing problems.
I think there has to be some type of perception gap or knowledge asymmetry to be really useful - for instance, with foreign languages.
I've studied a few but just in the "taking classes at the local JC" way. These LLMs are absolutely fantastic aids there because I know enough to frame the question but not enough to get the answer.
There's some model for dealing with this I don't have yet.
Essentially I can ask the right question about a variety of things but arguably I'm not doing it right with the software.
I've been writing software for decades, is it really that I'm not competent enough to ask the right question? That's certainly the simplest model but it doesn't check out.
Maybe in some fields I've surpassed a point where llms are useful?
It all circles back to an existential fear of delusional competency.
> Maybe in some fields I've surpassed a point where llms are useful?
I've hit this point while designing developer UX for a library I'm working on. LLMs can nail boilerplate, but when it comes to dev UX they seem to not be very good. Maybe that's because I have a specific vision and some pretty tight requirements? Dunno. I'm in the same spot as you for some stuff.
For throwaway code they're pretty great.
https://github.com/block/goose/discussions/3133#discussionco...
The only real reason to use Claude Code is the inference plan. The agent itself isn't anything special.
This, plus the usability of the CLI, is a step above the others to me, i.e., switching between modes on the fly and having plan mode easily accessible via shift+tab.
Never used goose, but looked at it way back when-- Claude Code feels more native IMO. Especially if you're already using Anthropic API/Plans anyways, I'd say give it a try.
To those lamenting that the Plan with Opus/Code with Sonnet feature is not available, check the charts.
Sonnet 4.5 is beating Opus 4.1 on many benchmarks. Feels like it's a change they made not to 'remove options', but because it's currently universally better to just let Sonnet rip.
Sure but I want to review the ripping plan so it tears along the correct lines.
Shift+Tab still brings up the planning mode.
What's the difference between thinking and planning? Does planning lead to the use of the TodoWrite tool?
plan mode has a special prompt they use
You have to specify `/model sonnet[1m]` to get the 1 million context version
Prompt: https://raw.githubusercontent.com/marckrenn/cc-mvp-prompts/r...
I've always been curious. Are tags like that one: "<system-reminder>" useful at all? Is the LLM training altered to give a special meaning to specific tags when they are found?
Can a user just write those magic tags (if they knew what they are) and alter the behavior of the LLM in a similar manner?
Claude tends to work well with such semi-xml tags in practice (probably trained for it?).
You can just make them up, and ask it to respond with specific tags, too.
Like “Please respond with the name in <name>…</name> tags and the <surname>.”
It’s one of the approaches to forcing structured responses, or making it role-play multiple actors in one response (having each role in its tags), or asking it to do a round of self-critique in <critique> tags before the final response, etc.
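A minimal sketch of the tag trick against the raw Messages API (endpoint and headers as documented; your model alias may differ):

```
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 128,
    "messages": [{"role": "user", "content":
      "Reply only with <name>...</name> and <surname>...</surname> tags: Ada Lovelace"}]
  }'
```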
A user can append similar system reminders in their own prompt. It's one of the things that the Claude Code team discovered worked, and it's now included in other CLIs like Factory, which the cofounder of Factory talked about today: https://www.youtube.com/live/o4FuKJ_7Ds4?si=py2QC_UWcuDe7vPN
> If you do not use this tool when planning, you may forget to do important tasks - and that is _unacceptable_.
Okay, I know I shouldn't anthropomorphize, but I couldn't prevent myself from thinking that this was a bit of a harsh way of saying things :(
I think they specifically trained claude on any kind of xml tags (see their docs https://docs.claude.com/en/docs/build-with-claude/prompt-eng...)
As a burnt-out, laid-off aging developer, I want to thank Anthropic for helping me get in love with programming again. Claude Code on terminal with all my beloved *nix tools and vim rocks.
100%. As a burnt-out manager, who doesn't get a lot of spare time to actually code. It's nice to have a tool like CC where I can make actual incremental changes in the spare 15 minutes I get here and there.
I spend most of my time making version files with the prompt, but I'm pretty impressed by how far I've gotten on an idea that would never have seen the light of day otherwise....
The thoughts of having to write input validation, database persistence, and all the other boring things I've had to write a dozen times in the past....
As an architect, I feel like a large part of my job is to help my team be their best, but I'm also focused on the delivery of a few key solutions. I'm used to writing tasks and helping assign them to members of the team, while occasionally picking up the odd piece of work myself, focusing more on architecture and helping individual members when they get stuck or when problems come up. But with the latest coding agents, I'm always thinking in the back of my head: I can get the AI to finish this task 3x quicker, and probably at better quality, if I just do it myself with the AI. We sit in Scrum meetings sizing tasks, and I'm thinking "bro, you're just going to paste my task description into AI and be done in half an hour", but we size it at a day or two.
Agreed, it's actually fun again. The evening hours I used to burn with video games and weed are now spent with claude code, rewriting and finishing up all my custom desktop tools and utilities I started years ago.
I had a lot of fun making 'tools' like this, but once I settled into a complicated problem (networking in a multiplayer game), it became frustrating to watch Claude give control back to me without accomplishing anything, over and over again. I think I need to start using the SDK in order to force it to do its job.
It’s still better letting Claude slog through all that boilerplate and skeletal code for you so that you can take the wheel when things start getting interesting. I’ve avoided working on stuff in the past just because I knew I wouldn’t be motivated enough to write the foundation and all the uninteresting stuff that has to come first.
This kind of stuff is where my anxiety rises a bit. Another example like this is audio code - it compiles and “works” but there could be subtle timing bugs or things that cause pops and clicks that are very hard to track down without tracing through the code and building that whole mental model yourself.
There’s a great sweet spot though around stuff like “make me this CRUD endpoint and a migration and a model with these fields and an admin dashboard”.
I've found that in those cases I'm likely better off doing it myself. The LLMs I've used will frequently overfit the code when it gets complicated. I'm working on a language-learning app, and it will often add special-casing for words occurring in the tests. In general, as soon as you leave boilerplate territory, I've found it starts writing dirtier and dirtier code.
Try giving the Codex IDE extension a go, now included with ChatGPT. I had equal frustrations with Claude making bad decisions; in contrast, GPT-5 Codex on high is extremely good!
I mean, yes. This is what Claude is good for: helping solve problems that aren't difficult or complex, just time-consuming.
The thing is, a lot of software jobs boil down to not-difficult-but-time-consuming.
I've got it using dbus, doing funky stuff with Xlib, working with pipewire using the pulseaudio protocol which it implemented itself (after I told it to quit using libraries for it.) You can't one-shot complicated problems, at least not without extensive careful prompting, but at this point I believe I can walk it through pretty much anything I can imagine wanting done.
I still spend my evening hours like that and do AI-assisted coding in the background.
Depends on the game tbh, having claude ping me for attention every few minutes disrupts most games too much, but with turn-based or casual games it works out well. OpenTTD and Endless Sky are fun to play while claude churns.
I really didn't need the cognitohazard thought that I could play Factorio and still somehow get things done
I'm 10 months clean on factorio, been doing the 12 step program for it. Feeling real tempted to relapse now...
It's the first time I've started getting hit by "ERROR Out of memory" in CC, after about an hour of use. I'm on an M4 Max with 128 GB RAM...
It's a Node app; it won't use all your memory by default, only a couple of gigs.
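If that heap cap is what's biting, raising Node's default might be worth a try, assuming the CLI doesn't override NODE_OPTIONS:

```
NODE_OPTIONS="--max-old-space-size=8192" claude   # allow up to ~8 GB of V8 heap
```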
That's a BIG workstation.
For folks who use neovim, there's always https://github.com/dlants/magenta.nvim , which is just as good as claude code in my (very biased) opinion.
/rewind is a super nice addition. That was annoying the hell out of me.
Anthropic announcement: https://www.anthropic.com/news/enabling-claude-code-to-work-...
Wait, still no support for the new MCP features? How come Claude Code, from the creators of MCP, is still lacking elicitation, server logging, and progress reporting?!
How do you check the version? claude once told me that it updated to version two, but I don't know if it's true.
cl --version 1.0.44 (Claude Code)
as expected … liar! ;)
cl update
Wasn't that hard; sorry for bothering.
I opened the CLI for about 10 seconds, the "Auto Update" status flashed. Then I restarted and it was version 2.0
I would really like for them to add the option to constantly display how much context is left before compression or a new session.
I haven't tried this, but looks like it might be possible to display it in the status line.
https://www.reddit.com/r/ClaudeAI/comments/1mlhx2j/comment/n...
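The linked thread boils down to a command-type status line; a sketch, assuming the settings shape from the status-line docs (the script body is a placeholder):

```
# In ~/.claude/settings.json:
#   { "statusLine": { "type": "command", "command": "~/.claude/statusline.sh" } }
cat > ~/.claude/statusline.sh <<'EOF'
#!/bin/sh
input=$(cat)                  # Claude Code pipes session info as JSON on stdin
echo "$input" | head -c 80    # placeholder: render context usage from the JSON here
EOF
chmod +x ~/.claude/statusline.sh
```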
Looks like it now shows when the remaining context is < 50%, which is a welcome change from the 15% threshold at which it previously appeared.
VS Code plugin seems to be missing quite a number of the CLI features.
I notice that thinking triggers like "Think harder" are not highlighted in the prompt anymore. Could that mean that thinking is now only a single toggle with tab (no gradation)?
Ultrathink still works
Has anyone figured out how to do claude sub agents without using claude? some sort of opensource cli with openrouter or something? I want to use subagents on differnt LLMs ( copilot,selfhost ).
Opencode
I wish there were an option to cancel a currently running prompt midway. Right now, pressing Ctrl+C twice ends up terminating the entire session instead.
Wait, doesn't hitting Escape do this already?
Adding to the press Esc comments, if you press it twice, you can revert to previous messages in the current conversation.
I'm always watching Claude Code as it runs, ready to hit the Escape key as soon as it goes off the rails. Sometimes it gets stuck in a cul de sac, or misunderstands something basic about the project or its structure and gets off on a bad tangent. But overall I love CC.
press escape
Just use `claude update` if you already have it. Unfortunately, they removed Plan mode, where I could use Opus for planning and Sonnet for coding.
Though I will see how this pans out.
I ended up not using this option anyway. I'm using B-MAD agents for planning, and it gets into a long-running planning stream where it needs permission to execute steps. So you end up running the planning in "accept edits" mode.
I use Opus to write the planning docs for 30 min, then use Sonnet to execute them for another 30 min.
> they removed Plan mode
This isn't true, you just need to use the usual shortcut twice: shift+tab
They removed the /model option where you can select Opus to plan and Sonnet to execute. But you can still Shift + Tab to cycle between auto-accept and plan mode.
Oh thank God. The parent comment made me think Plan mode was gone entirely, by incorrectly stating that... Plan mode is gone entirely...
Is Plan mode any different from telling Claude "this is what I'd like to do, please describe an implementation plan"?
That's generally my workflow, and I have the results saved into a CLAUDE-X-plan.md. Then I review the plan and incrementally change it if the initial plan isn't right.
It also limits the tools available, reducing context usage and leaving more room to actually plan.
There's a bit of UI around it where you can accept the plan. I personally stopped using it and instead moved to a workflow where I simply ask it to write the plan in a file. It's much easier to edit and improve this way.
Yeah, I just have it generate PRDs/high-level plans, then break it down into cards in "Kanban.md" (a bunch of headers like "Backlog," "In-Progress", etc).
To be honest, Claude is not great about moving cards when it's done with a task, but this workflow is very helpful for getting it back on track if I need to exit a session for any reason.
I've experienced the same thing. Usually I try to set up (or have it set up) a milestone/phase approach to an implementation with checklists (markdown style), but it's 50/50 whether it marks them automatically upon completion.
> Unfortunately, they removed Plan mode
If I hit shift-Tab twice I can still get to plan mode
I think they meant the 'Plan with Opus' model. shift+tab still works for me, and the VS Code extension lets you plan too, but the UI is _so_ slow with updates.
> Unfortunately, they removed Plan mode
WTF. Terrible decision if true. I don't see that in the changelog though
No. Plan mode still works fine.
They just changed it so you can't set it to use Opus in planning mode... it uses Sonnet 4.5 for both.
Which makes sense, if it really is a stronger and cheaper model.
I'm concerned that I don't see the "Plan with Opus, implement with Sonnet" feature in Claude Code 2.0.
If Sonnet 4.5 is always better than Opus 4.1, then it doesn't make sense to plan with Opus.
I hope this is the case.
Tangential, but did anybody get FOMO about Aider and then find a much better tool?
I was using aider quite a lot from ~7 months ago to ~3 months ago. I had to stop because they refuse to implement MCP support, and the Claude/Codex-style agentic workflow just yields better results.
I still use aider, because often I know better what to do.
How do I revert to the previous version? I find that the "claude" command in the terminal still works great, but the new native VSC extension is missing all these things (before, it would launch a terminal and run "claude").
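If you installed via npm, pinning the last 1.x release should do it (version number illustrative; use whichever 1.x you were on):

```
npm install -g @anthropic-ai/claude-code@1.0.128
```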
I feel like there are so many bugs. The / commands for add-dir and others I used often are gone.
I logged in, it still says "Login"
I really hate the fact that every single model has its own CLI tool. The UX for Claude Code is really great, but being stuck using only Anthropic models makes me not want to use it, no matter how good it is.
Seems like a closed-source, obfuscated blob distributed on npm to save bandwidth costs.
wow, it's way uglier lol, and why does it default to full screen?
> why does it default to full screen?
Pardon my ignorance, but what does this mean? It's a terminal app that has always expanded to the full terminal, no? I've not noticed any difference in how it renders in the terminal.
What am I misunderstanding in your comment?
A TUI does not have to start full screen. v1 of claude did not take over the entire terminal; it would only use a bit at the bottom and scroll up until it filled the screen.
I just downgraded to v1 to confirm this.
Weird, I use it exclusively in the terminal (raw term, tmux, and zellij) and I've not noticed any difference in behavior, on both macOS and Linux.
I wonder what's changed that I'm not seeing? Do you think it's a regression or intentional?
sorry, but insert spacebar heater xkcd here :D https://xkcd.com/1172/
Pretty sure your old behavior was the broken one tho; I vaguely remember futzing with this to "fullscreen correctly" for a claude-in-docker-in-cygwin-via-MSYS2 setup a while ago.
Well I guess I'll be sticking with opencode.
Do you mind telling us a bit more? I never used OpenCode, what makes it better in your opinion?
I'm consistently hitting weird bugs with opencode, like escape codes not being handled correctly so the tui output looks awful, or it hanging on the first startup. Maybe after they migrate to opentui it'll be better
I do like the model selection with opencode though
What are they doing about the supply chain attacks on npm?
That's your concern as a dev sending a patch to your repo; your IDE doesn't "address" attacks.
Curious: what would, should, or could they be doing?
The same as everyone else: ignoring it and hoping it goes away.