I think CLI is a good idea for now. The next abstraction seems to be GitHub PRs, where someone (likely me) files an issue/feature, then I click a button and the agent fixes it. GitHub has talked about something similar, but it was a pain to figure out whether it was GA and whether I had access to it, given how many different variations they have called gh copilot. (PS: it exists, but not as smooth as I described: https://docs.github.com/en/copilot/how-tos/use-copilot-agent... )
Are those for anonymous access to the AI prompts?
If they're for authenticated AI prompts, how do I create a "non-anonymous" account with a noscript/basic (x)html browser (not to mention I am self-hosted without paying the DNS mafia, i.e. my emails use IP literals; of course I prefer IPv6)?
What differentiates the CLI tools at this point and makes you prefer one over the other?
opencode and Crush can use any model, so apart from a nicer visual experience, are there any aspects that actually make you more productive in one vs the other?
Is there a way to get it to display more information? It's stuck not doing anything and I can't tell if that's because it timed out, or it's running a script, or it's thinking, or what is even happening. Sometimes it just does things without giving any feedback at all. I don't know what it's thinking or what it's trying to do, and I can't really see the output of the terminal commands it's running. It just pauses every once in a while and asks to run a command.
My first thought was, "meh, I already have Claude Code". But then I remembered my primary frustration with Claude Code: I need other LLMs to be able to validate Claude Code's assumptions and work, and I need to do this in an automated way. Before Cursor CLI, I did not have a way to programmatically ask Cursor to do this. It was very manual, very painful. But now I can create a Claude Code agent that is a "cursor-specialist" that uses the Cursor CLI to do all of that in an automated way.
Interesting, are you saying you would setup a Stop Hook in Claude Code that calls the Cursor CLI to have it validate and prompt Claude Code with further instructions?
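A rough sketch of what that could look like, assuming Claude Code's Stop hooks; the `cursor-agent -p` (non-interactive print mode) invocation here is illustrative rather than verified:

```sh
# Hypothetical Stop-hook command: have Cursor's CLI review the work before
# the Claude Code session ends. Verify flag names against the Cursor CLI docs,
# and the hook wiring/feedback semantics against the Claude Code hooks docs.
git diff HEAD | cursor-agent -p "Review this diff and briefly list any bugs or wrong assumptions."
```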
Can you pick thinking models with this or is that implied?
GPT-5 seems a bit slow so far (in terms of deciding and awareness). I’ve gone from waiting for a compiler, to waiting for assets to build to now waiting for an agent to decide what to do - progress I guess :)
Boris Cherny was a (main?) creator of Claude Code at Anthropic. He moved over to Cursor about a month ago. I hope Cursor CLI is a Claude Code agent port to Cursor. Hopefully the code quality will be comparable, modulo Cursor's abridged model access. We will know shortly.
Only if it would work. I think they miss a big opportunity here by (1) not caring about security at all, (2) trying to develop their own model and only make it available in the cloud.
Seriously Cursor. You can’t just write wrappers all your life. VSCode wrapper and now Gemini CLI wrapper. Can you make something from scratch for once? It’s as if they want an exit and they’re putting in minimum effort until that materializes.
When I saw this, the question which immediately came to mind was: who would turn loose arbitrary commands (content) generated by an LLM onto their filesystem?
Then I saw the installation instructions, which are:
curl https://cursor.com/install -fsS | bash
And it made sense.
Only those comfortable with installing software by downloading shell commands from an arbitrary remote web site and immediately executing them would use it.
So what then is the risk of running arbitrary file system modifications generated from a program installed via arbitrary shell commands? None more than what was accepted in order to install it.
Both are opaque, unreviewed, and susceptible to various well known attacks (such as a supply chain attack[0]).
I wonder when all of them will adopt AGENT.md and stop using gemini.md/claude.md/crush.md/summary.md/qwen.md
https://agent.md [redirect -> https://ampcode.com/AGENT.md] https://agent-rules.org
I asked claude code for a guidelines file so it would collaborate with windsurf. This is what it proposed:
---
This project uses shared planning documents for collaboration with Claude Code. Please:
1. First read and understand these files:
- PLAN.md - current project roadmap and objectives
- ARCHITECTURE.md - technical decisions and system design
- TODO.md - current tasks and their status
- DECISIONS.md - decision history with rationale
- COLLABORATION.md - handoff notes from other tools
2. Before making any significant changes, check these documents for:
- Existing architectural decisions
- Current sprint priorities
- Tasks already in progress
- Previous context from Claude Code
3. After completing work, update the relevant planning documents with:
- Task completion status
- New decisions made
- Any changes to architecture or approach
- Notes for future collaboration
Always treat these files as the single source of truth for project state.
---
You could create a CLAUDE.md that just contains:
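A minimal sketch of such a file (assumed content), using Claude Code's @-import syntax to pull the shared docs into context:

```sh
# Hypothetical CLAUDE.md that only points at the shared planning docs
cat > CLAUDE.md <<'EOF'
Read and follow the shared planning documents before doing anything else:
@PLAN.md @ARCHITECTURE.md @TODO.md @DECISIONS.md @COLLABORATION.md
EOF
```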
Still messy, but at least it means it's using the same content.
Problem is that claude doesn't actually read those or keep them in context unless you prompt it to. It has to be in CLAUDE.md or it'll quickly forget about the contents.
I've added these instructions in CLAUDE.md and .windsurfrules, and yes sometimes you have to remind it, but overall it works quite well.
Sounds like a job for a symlink
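A minimal sketch of the symlink approach (one canonical file, per-tool names as links; filenames vary by tool):

```sh
# One source of truth; each tool reads its own filename via a symlink
ln -s AGENT.md CLAUDE.md
ln -s AGENT.md GEMINI.md
ln -s AGENT.md .windsurfrules
```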
Have you thought about adding a session-start hook that reads this file and adds it to the context?
Not yet, but that sounds like a good suggestion.
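A sketch of that hook, assuming Claude Code's SessionStart hooks (whose stdout is added to the session context); check the schema against the hooks docs before relying on it:

```sh
# Write a .claude/settings.json that cats the shared file at session start
cat > .claude/settings.json <<'EOF'
{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "cat AGENT.md" }] }
    ]
  }
}
EOF
```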
> it has to be in CLAUDE.md or it'll quickly forget about the contents
And then it will promptly forget about CLAUDE.md as well (happened to me on several occasions)
The font on that RFC is aggressively squashed and aliased, why?
Every time I’ve ever read a {CLAUDE|GEMINI|QWEN}.md I’ve thought all this information could just be in CONTRIBUTING.md instead.
Yes! I want an option to always add README.md to the context; it would force me to have a useful, up-to-date document about how to build, run, and edit my projects.
You can include in your prompt for it to read the README!
Ultimately, if this stuff is actually intelligent, it should be using the same sources of information that we intelligent beings use. Feels silly to have to jump through all these hoops to make it work today.
> It would force me to have a useful, up to date document about how to build, run, and edit my projects.
Not really: our AI agents are probably smart enough to even make sense of somewhat bad instructions.
Not the case at all. AI agents will happily turn your bad ideas into code.
Non sequitur?
I am talking about LLMs figuring out how to build your project with some bad and incomplete instructions plus educated guessing.
They’re definitely not: Claude and all other agents frequently forget the build and test commands present in CLAUDE/etc.md for my various repos (even though most of them were initialized by the AI).
Whether Claude and co understand is probably not a great proxy for whether your docs are good for humans.
Never. It's a marketing strategy. Some percentage of users will check these files into their repos, and some percentage of repo browsers will think "what is this X.md?" Given how much money people are spending on these things the value of having a unique filename must be enormous.
It’s a marketing strategy that works here and now, but “never” is a very long time. What could be seen as pioneers claiming names today could be also seen as retrogressive stubbornness tomorrow and lose its marketing value.
There's a reason we still call the file robots.txt by that name, and not web-scraping.txt or search-engines.txt.
Brand asset
That sounds nice and I have the same pain, but not sure AGENT.md is the right abstraction either. After all, these models are indeed different and will respond differently even given the same prompting. Not to mention that different wrappers around those models have different capabilities.
e.g. maybe for CURSOR.md you just want to provide context and best practices without any tool-calling context (because you've found it doesn't do a great job of tool-calling), while for CLAUDE.md (for use with Claude Code) you might want to specify tools that are available to it (because it does a great job with tool calling).
Probably best if you have an AGENT.md that applies to all, and then the tools can also ingest their particular flavor in addition, which (if anything is in conflict) would trump the baseline AGENT file.
I really like the idea of standardizing on AGENT.md, although it's too bad it doesn't really work with the .cursor/rules/ approach of having several rules files that get included based on matching the descriptions or file globs in frontmatter. Then again, I'm not sure if any other agents support an approach like that, and in my experience Cursor isn't entirely predictable about which rules files it ends up including in the context.
I guess having links to supplementary rules files is an option, but I'm not sure which agents (if any) would work well with that.
Yep, that's a peeve of mine. I've resorted to using AGENT.md, and aliasing Claude, Gemini, etc to a command that calls them with an initial instruction to read that file. But of course they will forget after some time.
The whole agentic coding via CLI experience could be much improved by:
- Making it easy to see what command I last issued, without having to scroll up through reams of output hunting for context
- Making it easy to spin up a proper sandbox to run sessions unattended
- Etc.
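The aliasing trick above can be as small as a one-liner, since `claude "<prompt>"` starts a session with that initial prompt (other CLIs have their own equivalents):

```sh
# Front-load the read-the-rules instruction on every launch
alias claude='command claude "First read AGENT.md and follow it for this whole session."'
```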
Maybe for code generation, what we actually need is a code generator that is itself deterministic but uses AI, instead of AI that does code generation.
I think most of them provide an option to change the default file, but it would be really good if they all switched to AGENT.md by default.
Till then you can also use symlinks
There are issues open in some repos for this:
- Support "AGENT.md" spec + filename · Issue #4970 · google-gemini/gemini-cli
https://github.com/google-gemini/gemini-cli/issues/4970#issu...
- And the same for Claude Code:
https://github.com/anthropics/claude-code/issues/1091
And then they should standardize on usage rules (an idea in Elixir space: https://hexdocs.pm/usage_rules/readme.html )
In case you haven’t seen this, you can just pipe contents into Claude. Eg
cat AGENT.md | claude
IIRC this saves some tokens.
maybe symlinking will work
That's a more obvious (but less fun) name than what I've been using: ROBOTS.md with symlinks.
https://ampcode.com/AGENT.md#migration
they also suggest using symlinks for now
The deeper problem is the custom commands, hooks, and subagents. The time has come when you need to make a strategic choice. Once you have heavily invested in CC, it is not easy to turn to an alternative.
Side remark: CC is very expensive when using API billing (compared to e.g. GPT-5). Once a company adopts CC and all developers start to adapt to it at full scale, the bill will go through the roof.
FWIW at least with Claude and Jules on a project I have a decent setup where I put all of the real content in an agents.md and then use “@agents.md” in CLAUDE.md. If all of the tools supported these kinds of context references in markdown it wouldn’t be that hard to have a single source of truth for memory files.
Same here each specific instruction file (vs code, cursor, etc.) just says read the AGENTS.md for instructions
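A sketch of that setup (the filenames here are the common ones; adjust per tool):

```sh
# Every per-tool instruction file is just a one-line pointer to AGENTS.md
for f in CLAUDE.md GEMINI.md .cursorrules .windsurfrules; do
  printf 'Read AGENTS.md for all project instructions.\n' > "$f"
done
```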
I just wish the AGENTS.md standard wasn't a single file. I have a lot of smaller context documents that aren't applicable to every task, so I like to throw them into a folder (.ai/ or .agents/) and then selectively cat them together or tell the agent to read them.
You could have a python script that generates the MD file on the fly, based on how you want to prompt the model. I think it's kind of funny, how deep we are getting with tools instructing tools instructing tools.
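In its simplest form that generator is just concatenation; a sketch in shell (topic file names are illustrative):

```sh
# Rebuild AGENT.md for the task at hand from per-topic docs in .agents/
cat .agents/base.md .agents/frontend.md .agents/testing.md > AGENT.md
```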
You can put them in subdirectories like CODEOWNERS files. https://ampcode.com/AGENT.md#multiple-agent-files
>When multiple files exist, tools SHOULD merge the configurations with more specific files taking precedence over general ones.
How is the tool supposed to merge multiple md files?
Most AI solutions make it a bit difficult to trace the final assembled prompt.
However, if you look through the source code or network requests, you’ll see that merging just means naive “concatenation”.
The same way it synthesizes anything else. We're talking about LLMs here; this is their bread and butter.
Elixir has this idea with usage rules: https://hexdocs.pm/usage_rules/readme.html
Symlink?
Yeah, I suspect some of these providers will become Microsoft-in-the-'90s-style bully holdouts on implementing the emerging conventions. But with a CLI interface you ultimately have workarounds to make all the major providers read in your system guidelines. In an IDE - e.g. like MS had with Visual Studio - there is more lock-in potential for your config files.
Yesterday, I was writing about a way I found to pass the same guideline documents into Claude, Gemini, and Aider CLI-coders: https://github.com/sutt/agro/blob/master/docs/case-studies/a...
Isn't this just a symlink?
I'm at a point where I symlink different sets of docs to focus context, so much so that I feel like maybe I need a git submodule with different branches of the context I want. I went from managing people to managing AI.
Agree. It's all English. That's the whole point of these tools.
Why are we purposely creating CLI dialects?
Symlink $AGENT.md to AGENT.md in your repo.
When they stop getting desperate for differentiation by spamming their brand advertising in your repo against your will.
Claude Code likes to add "attribution" in commit messages, which is just pure spam.
You can turn it off: https://docs.anthropic.com/en/docs/claude-code/settings#avai... `includeCoAuthoredBy: false`
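For example, flipping it without clobbering the rest of your settings (assumes jq is installed):

```sh
# Turn off Claude Code's co-author trailer in commit messages
tmp=$(mktemp)
jq '.includeCoAuthoredBy = false' ~/.claude/settings.json > "$tmp" && mv "$tmp" ~/.claude/settings.json
```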
5 years from now: Subscribe to the Ultra plan for the ability to prevent the LLM from putting ads for Coca-Cola into your code comments!
It’s not spam, it lets people know it was written by an LLM and maybe you should look closer at it.
Thank you so much for this!
I went from years of vscode to "Cursor is the future" to never using Cursor at all. Claude Code, even with new limits, is just too good. If I were to switch to gpt-5, why wouldn't I just use Codex? I'm struggling to understand the value of what they're presenting.
The value is that, to use Cursor, you no longer need to switch your IDE (unless you were already using VSCode). You can keep your preferred IDE and run the agent in the terminal. The IDE is for humans; agents only need a terminal to run.
I find the Codex CLI to be the worst of the CLI tools I’ve used (including, but not limited to, Claude Code, Gemini, Aider). There’s something about it that makes it clunky. Haven’t tried Cursor CLI yet though.
We (Codex) shipped a pretty large CLI update today and have many more improvements coming. Give it a try if you haven't.
https://x.com/OpenAIDevs/status/1953559797883891735 (0.19 now)
Thanks for the heads up. I’ll check it out!
I don’t visit Twitter links. Why not a link to the GitHub changelog?
Also, as an aside since you are on the team - the organization verification is frustrating in that the docs indicate:
>You must not have recently verified another organization, as each ID can only verify one organization every 90 days.
I champion OpenAI at my work, so naturally I’d be the one to verify there. But I apparently can’t, because I verify for my personal-led org. That gets in the way of me proselytizing gpt-5 based coding tools (such as, possibly, Codex CLI).
Another +1 to this. Some of us are unwilling to click on a Twitter link. Link to changelog would be more appropriate.
Huge +1 to this. We have two orgs at work (for separate budget/rate limit blast radii) and had to get two people to verify this morning…
Tried your latest version - thanks for posting about it.
Codex needs plan mode (shift-tab in Claude Code)
And Codex needs the prompt to always be available. So you can type to the model while it’s working & have it eventually receive the message and act on it, instead of having to Ctrl-C to interrupt it before typing. Claude Code’s prompt is always ready to type at - you can type while it is working. That goes a long way towards it feeling like it cares about the user.
Thanks for mentioning this. These are the kinds of features that are 100% required for me to even consider Codex.
> Use with your ChatGPT plan
It's asking me to buy credits but I'm already on Plus?
nice! auto-updates like opencode has could help, so you don't have to remember to update
loving the animations and todos so far
also gpt-5 is just great at agentic stuff
Why is Claude Code better than Cursor?
My company has a huge codebase, for me cursor would freeze up / not find relevant files. Claude code seems able to find the right files by itself.
I seem to always have better outcomes with Claude code.
Cursor and these tools evolve quite fast.
The Cursor you used a month ago is not the one you get now.
Just saying that because in this space you should always compare latest X with latest Y.
I too switched weeks ago to Claude Code. Then, between the times I am out of tokens, I launch Cursor and actually find it... better than I remembered, if not on par with Claude Code (the model and the quality of prompts/context matter more than the IDE/CLI tool used, too).
Because iterating multiple sessions through multiple terminals is obviously more efficient and seamless than interacting through a scuffed IDE side-panel UI.
Claude Code has some non-LLM magic in it that just makes it better for code in general, despite (or because of) having minimal IDE integration.
In my experience, it is much better at tool-calling, which is huge when we're talking about agentic coding. It also seems to do a better job of keeping things clean and not going off on tangents for anything that isn't accomplished in one shot.
I have had the exact opposite experience. Claude Code in any meaningful codebase for me gets stuck in loops of doing the wrong thing. Then when that doesn't work it deletes files and makes its own that don't have the problem it's encountering.
Cursor on the other hand, especially with GPT-5 today but typically with Sonnet 4.1, has been a workhorse at my company for months. I have never had Claude Code complete a meaningful ticket once. Even a small thing like fixing a small bug or updating the documentation on the site.
Would love any tips on how to make Claude Code not a complete waste of electricity.
If you don’t know how to divide a problem up given a toolset you won’t be able to solve it regardless of what those tools are. Maybe Cursor’s interface is more intuitive for you.
The problems I’ve given CC are things that are incredibly simple and basic. Things I knew how to fix immediately. I would tell it the file to change and how to change it. And it will get lost when the types are incorrect, or when it causes a test to fail. It will just delete the test.
I don’t doubt I could improve my prompts but I don’t have those same prompting problems with cursor.
> Cursor on the other hand, especially with GPT-5 today but typically with Sonnet 4.1
You probably mean Opus 4.1; there's no Sonnet 4.1 yet.
Yes that’s correct.
Better prompts?
> Better prompts?
I think you're right.
People getting really poor results probably don't recognize that their prompts aren't very good.
I think some users make assumptions about what the model can't do before they even try, so their prompts don't take advantage of all the capabilities the model provides.
I don’t really have a problem prompting cursor with the same models. But I have no doubt my prompts could be improved
Opposite experience. I worked with Claude code a lot, then switched to Cursor and then tried to switch back and discovered that CC often gets stuck in loops. Cursor just works. It definitely helps that I can switch the foundational models in Cursor when it gets stuck.
CC just feeds the whole codebase and entire files into the model, no RAG, nothing in the way. It works substantially better because of that, but it's $expensive$.
The more stuff you put in the context the worse models perform. All of them.
Larger context is a bonus sometimes, but in general you're degrading the quality of the output by a lot.
Precise prompting and context management is still very important.
That's not true. It uses CLI tools (e.g. find, grep) to find the relevant code from the codebase.
What I have found Claude Code is extremely good at is that it makes one change at a time, gives you a chance to read the code it's changing, and lets you give feedback in real time and steer it properly. I find the mental load with this method to be MUCH lower than Cursor or any of the other tools, which give you two very different options: "Ask" mode, which dumps a ton of suggestions on you and then requires semi-manual implementation, or "Agent" mode, which dumps a ton of actual changes on you and requires your inspection and feedback and roll-backs, etc.
This may not work for everyone, but as a solo dev who wants to keep a real mental model of my work (and not let it get polluted with AI slop), the Claude Code approach just works really well for me. It's like having a coding partner who can iterate and change direction as you talk, not a junior dev who dumps a pile of code on your plate without discussion.
+1 to this. Cursor's Agent feels too difficult to wrangle. CC is easier to monitor.
At this point, there are more AI coding agents announced every week than Javascript frameworks, but to be honest, I'm here for it.
Think how much training has been done on those JavaScript frameworks... no one stops to wonder what the outcome would be. The mere fact that, when I ask it to create an app without any further detail about what to use, it defaults to React is imo a total failure, whatever the agent.
Think how many JavaScript frameworks can be vibe coded now!
(This is an exaggeration:)
Sure, you can have your LLM code with any JavaScript framework you want, as long as you don't mind it randomly dropping React code and React-isms in the middle of your app.
It’s not a real JS framework without JSX support and Typescript types that generate page long errors.
To be honest, I am being positive, and hopefully we'll see an explosion of AI agents that will help iron out all the bugs in FOSS hosted on the different source-code hosting platforms. Renovate on steroids. I would work on that if my daytime job weren't my main and only source of revenue.
Ask a FOSS maintainer and they will not be nearly as optimistic about AI reducing the amount of bugs. A lot of AI-generated pull requests are broken or useless and end up wasting a lot of the maintainers' time.
Ironically, LLMs might make it very hard for new frameworks to gain popularity since they are trained on the popular ones.
If we're not there already, it's just a matter of time before LLMs will be able to read and understand a framework they haven't seen before and be able to use it anyway.
LLMs are already trained on JavaScript at a deep level; as LLM reasoning and RAG techniques improve, there will be a time in the not-too-distant future when an LLM can be pointed to the website of a new framework and be able to use it.
Why would we create a framework to make coding easier when nobody writes code by hand any more?
Make one that's optimal for AI somehow
Like convex.dev
The concept of a JS framework that lets you rapidly develop an app has the same underlying vibe as a coding agent.
Holy moly. I did not see that coming, but it makes sense. I’m enjoying the terminal-based coding agents way more than I ever would have expected. I can keep one spinning in the background while I do #dayjob, and as a bonus I feel like a haX0r.
2025 is the year of the terminal, apparently?
For my prototype purposes it’s great, and Claude Code is the most fun I’ve had with tech in a jillion years.
Fascinating to see how agents are redefining what IDEs are. This was not really the case in the chat-AI era, but as autonomy increases, the traditional IDE UI becomes a less important form of interaction. I think those CLI tools have a pretty good chance of creating a new dev-tools ecosystem. Creating a full-featured language plugin (let alone a full IDE) for VSCode or IntelliJ is not for the faint-hearted, and cross-IDE portability is limited. CLI tools + MCP can be a lot simpler, more composable, and more portable.
IDE UI should shift to focusing on catching agentic problems early and obviously, and providing drop dead simple rollback strategies, parallel survival-of-the-fittest solution generation, etc
With all the frontier labs competing in this space now, and them letting you use your consumer subscription through the CLI, I don’t understand how the Cursor products will survive. Why pay an extra $X/mo when I can get this functionality included in the $Y/mo I’m already paying OAI/Anthropic/GOOG?
I think the complete opposite. I love the UX of Claude Code, but it would be better if it wasn't locked to a single vendor's model. It seems pretty clear to me that a vendor-neutral product with a UX as good as Claude Code's would be the clear winner.
Have you tried opencode? I haven't really, but it can use your Anthropic subscription and also switch to most other models. It also looks quite nice IMO.
I'm actually starting to think the opposite.
If Cursor can build the better UX for all the use cases (mobile/desktop chatbot, assistant, in-IDE coding agent, CLI coding agent, web-based container coding agent, etc.), then in theory they can spend all their resources on this, so you could assume those would be more polished.
If they win the market share here, then the models are just a commodity, and Cursor lets you pick whichever is best at any given time.
In a sense, "users" are going to get locked in on the tooling. They learn the commands, configuration, and so on of Cursor; it's a higher cost for them to re-learn a different UX. Uninstalling and re-installing another app, plugin, etc. is annoying.
No, model providers are not going to let Cursor eat their pie. The biggest cost in AI is in developing LLM models and inference. Players incurring those costs will basically control this market.
I don't think we'll have more than 2 players. I think it's like AMD and Intel, the LLM is almost like providing hardware. The software that exposes the LLM capabilities to the user is the layer that will be able to differentiate.
The models are just going to be fighting performance/cost. And people will choose the best performance for their budget.
And that's ignoring how good local models are getting as well.
It's not that they'll have their launch eaten by Cursor, it's just that they can't be as focused on user experience when they're also laser focused on improving the models to stay competitive.
I agree that cursor has to take an aggressive and differentiated approach to succeed, but they have the benefit of pushing each lab into a commodity.
I pay for Cursor and ChatGPT. I can imagine I’d pay for Gemini if I used an android. The chat bots (1) won’t keep the subscription competitive with APIs because the cost and usage models are different and (2) most chat bots today are more of a UX competition than model quality. And the only winners are ChatGPT and whatever integrated options the user has by default (Gemini, MSFT Copilot, etc).
Because you can always use the best model. Yesterday it was Claude Opus 4.1; today it's GPT-5. If you were just paying Anthropic, you would be stuck with Claude.
Yeah but I still want a general purpose chatbot subscription also. So I’d have to buy Cursor + something else.
I guess Cursor makes sense for people who only use LLMs for coding.
I'm having trouble finding a use for this outside of virtualized unused environments. Why not instead give me a virtual machine that runs this in a confined storage space?
I would _never_ give an LLM access to any disk I own or control if it had anything more than read permissions
For example, Gemini CLI [1] can use native sandboxing on macOS. It's just a matter of time before every major coding agent will run inside of an operating system's native sandbox/container/jail/VM.
[1]: https://github.com/google-gemini/gemini-cli/blob/main/docs/c...
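Per the docs linked above, enabling it is just a flag:

```sh
# Run the session inside the OS-native sandbox (Seatbelt on macOS) or a container
gemini --sandbox
```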
The permissions are quite well defined; by default it will ask for your approval before every CLI command it runs.
Why not? Have you ever actually used these things? The risk is incredibly low. I run claude code with zero permissions every day for hours. Never a problem.
I have (not an exhaustive list) SSH keys and sensitive repositories hanging out on my filesystem. I don't trust _myself_ with that, let alone an LLM, unless I'm running ollama or similar local nonsense with no net connectivity.
I'm a few degrees removed from an air gapped environment so obviously YMMV. Frankly I find the idea of an LLM writing files or being allowed to access databases or similar cases directly distasteful; I have to review the output anyway and I'll decide what goes to the relevant disk locations / gets run.
They don't have arbitrary access over your file system. They ask permission for doing most everything. Even reading files, they can't do that outside of the current working directory without permission.
I'm pretty comfortable with the agent scaffolding just restricting directory access but I can see places it might not be enough...
If you were being really paranoid then I guess they could write a script in the local directory that then runs and accesses other parts of the filesystem.
I've not seen any evidence an agent would just do that randomly (though I suppose they are nondeterministic). In principle maybe a malicious or unlucky prompt found somewhere in the permitted directory could trigger it?
Comments like this just show how bad the average dev is at security. Ever heard of the principle of least privilege? It's crazy that anyone who has written at least one piece of software would think "nah, it's fine because the software is meant to ask before doing".
You're obviously skilled; spending the money on a Claude-only machine would pay for itself in less than three weeks. If I were your employer, it would be a no-brainer.
Make me that offer :D
That's funny. I was really hoping that Anthropic would make a "Claude GUI".
In one of their Claude Code talks they said it didn’t seem worth it, given their expectation that all IDEs will become obsolete by next year.
Xcode pretty much hung up their hat this year, and threw in with Claude.
If I'm not mistaken, it may be feasible to build one with the Claude Code sdk
Isn't that Claude Desktop?
I really like the IDE. It makes enough mistakes that I need to be constantly testing and catching little errors. I’ll interrupt the flow often when it’s going down a path I don’t want it to. When using Codex, for example, it’s doing too much in the background that is harder to correct afterwards. Am I doing this wrong?
People have preferred either the terminal or chunky IDEs for decades. Neither are wrong.
What's the benefit of this compared to the IDE? To be more like Claude Code?
Actually, I think where Claude Code shines, is with the VSCode Extension. It's a great mix between a CLI that could be used in a bash script for automation, as well as a coding assistant.
I haven't found out, however, whether the Cursor CLI provides this kind of extension.
Flip your thinking around for a second and consider: why is an IDE required for an agent that codes for you?
The IDE/editor is for me; the agent doesn't need it. That also means I am not forced to use whatever imperfect forked IDE the agent is implemented against.
> why an IDE is required for an agent that codes for you
Because the agents aren't yet good enough for a hands off experience. You have to continuously monitor what it does if you want a passable code base.
Sure, but monitoring, reviewing and steering do not really require modern IDEs in their current form. Also, I'm sure agents can benefit from parts of IDE functionality (navigation, static analysis, integration with build tools, codebase indexing, ...), but they sure don't need the UI. And without the UI, those parts can become simpler, more composable, and more portable (compatible with multiple agent tools). IMO, another way to think about CLI agentic coding tools is as a new form of IDE.
As was already mentioned elsewhere, Emacs + Magit to monitor incoming changes is a great combo.
Whenever I have to take the wheel myself the AI tab completion makes it much smoother so I am kind of addicted to that. Semi-automatic mode.
I would much rather use IntelliJ so perhaps my habits will change at some point, but right now I am stuck with Cursor/vscode for the tab completion.
I don't really need an IDE, but I do need a great code review interface.
I use lazygit for that. But any diff tool you like will work.
As someone who hasn’t used Claude Code yet: can’t you configure it somehow to use a different tool of your liking, or does it have to be in the CLI?
I end up using the VCS tooling (lazygit for me), but coding agents really need to be integrated with this review environment. We need an extra step where the agent will group its changes into logical units (database models in one commit, types in another, business logic in another, tests in another), rather than having to review per-file.
Programming has changed from writing code to reviewing/QAing and reprompting, but the tooling hasn't yet caught up with that workflow. We need Gerrit for coding agents, basically.
I just merge the change and review the diff. If it’s wrong I either revert or ask Claude to fix it.
Many of these companies are realizing that mainline VSCode is a moat of sorts. I and many people I know won't use any of these that require forking VSCode.
With the benefit that you can also pull in people who don't like using VSCode, such as people who use JetBrains or terminal-based code editors.
So you can use an IDE other than VS Code.
I am so curious to know. Why is Cursor not just putting whatever this supposedly does better into... Cursor?
I don't think it actually does anything better than the chat window in the editor; it's strictly worse, tbh. It just lets you not be tied to a VSCode interface for editing. I'm sure JetBrains diehards would very much appreciate this, but honestly I will find it hard to utilize given that Cursor's tab auto-complete is so amazing.
To compete with Claude Code.
They are already competing with Claude Code. The competition is not over who can build the nicest CLI.
You can spin up the Cursor CLI inside the terminal of your IDE of choice and not be tethered to Claude's models.
Is there a better agent than the Anthropic one?
Depends how you define "better". Quality/breadth of tasks/capabilities? Probably not (TBD how GPT-5 will fare; colleagues were saying it was better at some frontend tasks than Claude 4 in the alpha/beta horizon tests).
But if you take speed/availability/cost into account, there might be "better" offers out there. I did some tests w/ windsurf when they announced their swe1 and swe1-lite models, and the -lite could handle easy tasks pretty well. I also tested 4.1-mini and 4.1-nano. There are tasks that I could see them handle reliably enough to make sense (and they're fast, cheap and don't throttle you).
You can already use non-Anthropic models with Claude Code with tools like Claude Code Router [1].
[1]: https://github.com/musistudio/claude-code-router
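Roughly, going by the router's README (a hedged sketch; the config schema may have changed since):

    # Install the router, configure a provider, then launch Claude Code through it
    npm install -g @musistudio/claude-code-router
    # Point it at a non-Anthropic provider in ~/.claude-code-router/config.json,
    # then start Claude Code via the router:
    ccr code

The router sits between Claude Code and the model API and rewrites requests for whichever provider you configured.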
No
i'm betting on cursor being the long-term best toolset.
1. with tight integration between cli, background agent, ide, github apps (e.g. bugbot), cursor will accommodate the end-to-end developer experience.
2. as frontier models internalize task routing, there won't be much that feels special about claude code anymore.
3. we should always promote low switching costs between model providers (by supporting independent companies), keeping incentives toward improving the models not ui/data/network lock-in.
i’d respectfully bet against this.
Cursor and third-party tools, unless they make their own superior foundation model, will always have to fight the higher-marginal-cost battle. This is particularly bad insofar as they offer fixed-price subscriptions. That means they're going to have to employ more context-saving tricks, which are at odds with better performance.
If the cost economics result in Cursor holding, say, 20% fewer tokens in context versus model-provider coding agents, they will necessarily get worse performance, all things equal.
Unless Cursor offers something dramatically different outside of the basic agentic coding stack it’s hard to see why the market will converge to cursor.
> we should always promote low switching costs between model providers (by supporting independent companies), keeping incentives toward improving the models not ui/data/network lock-in
You’re underestimating the dollars at play here. With Cursor routing all your tokens, they will become a foundation-model play sooner than you may think.
You're allowing them to train on your code?
The code isn’t the valuable part. They know all the most common workflows and failure modes, allowing them to create better environments for training agentic models
Happy to short that bet, as I think agentic harnesses will be molded alongside the RL training of the actual model: Tony Stark and the suit, created together. That's why Claude in Claude Code became existential for Cursor, and why Cursor moved quickly to go agentic and team up with OpenAI in a big-headline way here.
Unless they pair up with OpenAI or Meta.
I think the CLI is a good idea for now. The next abstraction seems to be GitHub PRs, where someone (likely me) files an issue/feature, then I click a button and the agent fixes it. GitHub has talked about something similar, but it sure was a pain to figure out whether it was GA and whether I had access to it, given how many different variations they have called gh copilot. (PS: it exists, but not as smooth as I described: https://docs.github.com/en/copilot/how-tos/use-copilot-agent... )
You can already have that with Jules. It's quite impressive.
https://jules.google/
How do you make those terminal AI prompts work?
Are those for anonymous access to the AI prompts?
If those are for authenticated AI prompts, how do you create a "non-anonymous" account with a noscript/basic (X)HTML browser (not to mention that I am self-hosted without paying the DNS mafia; namely, my emails use IP literals, and of course I prefer IPv6)?
What differentiates the CLI tools at this point and makes you prefer one over the other?
opencode and Crush can use any model, so apart from a nicer visual experience, are there any aspects that actually make you more productive in one vs the other?
Is there a way to get it to display more information? It's stuck not doing anything, and I can't tell whether it timed out, is running a script, is thinking, or what is even happening. Sometimes it just does things without giving any feedback at all. I don't know what it is thinking or trying to do, and I can't really see the output of the terminal commands it runs. It just pauses every once in a while and asks to run a command.
Is there a way to make it more verbose?
I noticed it was taking a while on the first large-ish task I gave it. I'm assuming it was just a bit overloaded at the moment.
My first thought was, "meh, I already have Claude Code". But then I remembered my primary frustration with Claude Code: I need other LLMs to be able to validate Claude Code's assumptions and work, and I need to do this in an automated way. Before Cursor CLI, I did not have a way to programmatically ask Cursor to do this; it was very manual, very painful. But now I can create a Claude Code agent that is a "cursor-specialist" and uses the Cursor CLI to do all of that in an automated way.
Interesting. Are you saying you would set up a Stop hook in Claude Code that calls the Cursor CLI to have it validate and prompt Claude Code with further instructions?
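I'm imagining something like this (a minimal sketch: the hooks schema follows Claude Code's docs as I understand them, and cursor-agent's -p flag for one-shot prompts is an assumption):

    # Project-level Stop hook that asks the Cursor CLI to double-check the work
    cat > .claude/settings.json <<'EOF'
    {
      "hooks": {
        "Stop": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "cursor-agent -p 'Review the latest changes in this repo and list any incorrect assumptions'"
              }
            ]
          }
        ]
      }
    }
    EOF

Whether the reviewer's output actually gets fed back to Claude Code as further instructions depends on how the hook surfaces its stdout/exit code, so treat this as a starting point, not a recipe.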
Can you pick thinking models with this or is that implied?
GPT-5 seems a bit slow so far (in terms of deciding and awareness). I’ve gone from waiting for a compiler, to waiting for assets to build to now waiting for an agent to decide what to do - progress I guess :)
Boris Cherny was a (the main?) creator of Claude Code at Anthropic. He moved over to Cursor about a month ago. I hope Cursor CLI is a Claude Code agent port to Cursor. Hopefully the code quality will be comparable, modulo Cursor's abridged model access. We will know shortly.
He actually returned to Anthropic shortly after joining Cursor
Could anyone compare this with Claude Code and aider?
Claude Code, but with GPT-5 built in. Not a bad selling point.
Claude Code can use GPT-5 via LiteLLM
https://docs.anthropic.com/en/docs/claude-code/llm-gateway#l...
https://docs.litellm.ai/docs/tutorials/claude_responses_api
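The shape of it, going by those docs (a sketch; the model name, port, and key below are placeholders):

    # Run a local LiteLLM gateway in front of OpenAI...
    pip install 'litellm[proxy]'
    litellm --model openai/gpt-5 --port 4000
    # ...then, in another shell, point Claude Code at the gateway:
    export ANTHROPIC_BASE_URL=http://localhost:4000
    export ANTHROPIC_AUTH_TOKEN=sk-anything    # whatever key your proxy expects
    claude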
And access to Cursor's background agent on the web as well, like ChatGPT Codex. So at this point, I already regret cancelling my Cursor subscription.
I wonder if this will support directly interfacing with OpenAI's APIs vs. going through Cursor's APIs (and billing).
I would highly doubt it. Even when you BYOK inside of Cursor they still say it's routed through their servers.
Hopefully this one is as good as Claude Code. None of the ones I've tried have come close yet.
Have you tried opencode?
Yeah, opencode and Crush. I'm gonna give Claude Code Router a good try soon.
Has to be, given the hype surrounding Claude Code; a few of them are using Claude Code just because it's terminal-based.
They realized that CLI is the much better interface for these kinds of tasks.
It seems they haven’t implemented MCP client features in Cursor CLI yet
Does it work with local LLMs like through Ollama or llama.cpp?
Is the pricing any good?
seems pretty basic. I don't see anything unique here. I am happy with my Gemini CLI.
Pivot to CLI
There are certainly some lessons here that go beyond coding agents (when it comes to shipping products).
I’m mostly going to use this as a convenient way to run ffmpeg. Previously I’d need to open Cursor and ask for commands in the terminal there.
So we’re all just waiting for AGENT.md to become the new README, huh? I’m ready when the agents are.
Wouldn't it be better to just use the Warp AI solution at this point?
Only if it worked. I think they're missing a big opportunity here by (1) not caring about security at all, and (2) trying to develop their own model and only making it available in the cloud.
What's the difference between Warp and just opening multiple tabs in my terminal?
They are all clones of Gemini CLI at this point?
Since Gemini CLI was released under the Apache license, a clone is easy to make.
Claude Code finally has a serious competitor.
Not sure. So far Reddit seems largely negative on Cursor CLI + GPT-5
https://www.reddit.com/r/cursor/comments/1mk8ks5/discussion_...
They are all using mid-tier gpt-5 variants (not the "-high" one that's hidden by default, not gpt-5-thinking) and don't realize it
Seriously, Cursor. You can't just write wrappers all your life: a VSCode wrapper and now a Gemini CLI wrapper. Can you make something from scratch for once? It's as if they want an exit and they're putting in minimum effort until that materializes.
When I saw this, the question which immediately came to mind was: who would actually use it?
Then I saw the installation instructions, and it made sense. Only those comfortable with installing software by downloading shell commands from an arbitrary remote web site and immediately executing them would use it.
So what then is the risk of running arbitrary file system modifications generated from a program installed via arbitrary shell commands? None more than what was accepted in order to install it.
Both are opaque, unreviewed, and susceptible to various well known attacks (such as a supply chain attack[0]).
0 - https://en.wikipedia.org/wiki/Supply_chain_attack
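For what it's worth, the marginally less blind variant (the URL below is a placeholder, not the actual installer address):

    # Fetch the installer, read it, then decide; this reduces the risk, not removes it
    curl -fsSL https://example.com/install.sh -o install.sh
    less install.sh
    sh install.sh

It won't save you from a compromised server, but at least nothing executes before you've looked at it.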
I couldn't even install Cursor on Ubuntu. The issue still exists. Why didn't they ask the AI to fix it?