I have about two weeks of using Claude Code under my belt and, to be honest, as a vibe coding skeptic, I was amazed. It has a learning curve. You need to learn how to give it proper context, how to chunk up the work, etc. And you need to know how to program, obviously. Asking it to do something you don't know how to do is just asking for a disaster. I have more than 25 years of experience, so I'm confident with anything Claude Code will try to do and can review it, or stop and redirect it. About 10-15 years ago, I was dreaming about some kind of neural interface, where I could program without writing any code. And I realized that with Claude Code, it's kind of here.
A couple of times I hit the daily limits and decided to try Gemini CLI with the 2.5 pro model as a replacement. That's not even comparable to Claude Code. The frustration with Gemini is just not worth it.
I couldn't imagine paying >$100/month for a dev tool in the past, but I'm seriously considering upgrading to the Max plans.
If you are a Senior Developer who is comfortable giving a Junior tips and then guiding them to a fix (or just stepping in for a brief moment to write the bit they missed), this is for you. I'm hearing from Senior devs all over, though, that Junior developers are just garbage at it. They produce slow, insecure, or just outright awful code with it, and then they PR code they don't even understand.
For me the sweet spot is boilerplate (give me a blueprint of a class based on a description), or translating a JSON for me into a class or into some other format. Questions like "what's wrong with this code? How would a Staff Level Engineer write it?" are also useful. I've found bugs before hitting debug by asking what's wrong with the code I just pounded out on my keyboard by hand.
A couple of weeks ago, I had a little down time and thought about a new algorithm I wanted to implement. In my head it seemed simple enough that 1) I thought the solution was already known, and 2) it would be fairly easy to write. So I asked Claude to "write me a python function that does Foo". I spent a whole morning going back and forth getting crap and nothing at all like what I wanted.
I don't know what inspired me, but I just started to pretend that I was talking to one of my junior engineers. I first asked for a much simpler function that was on the way to what I wanted (well, technically, it was the mathematical inverse of what I wanted), then I asked it to modify it to add one different transform, and then another, and then another. And then finally, once the function was doing what I wanted, I asked it to write me the inverse function. And it got it right.
What was cool about it is that it turned out to involve more complex linear algebra and more edge cases than I originally thought, and it would have taken me weeks to figure all of that out. But using it as a research tool and junior engineer in one was the key.
I think if we go down the "vibe coding" route, we will end up with hordes of juniors who don't understand anything, and the stuff they produce with AI will be garbage and brittle. But using AI as a tool is starting to feel more compelling to me.
Yes, can confirm that as a senior developer who has needed to spend huge amounts of time reviewing junior code from off-shore contractors with very detailed and explicit instructions, dabbling in agentic LLM coding tools like Claude Code has felt like a gift from heaven.
I also have concerns about said junior developers wielding such tools, because yes, without being able to supply the right kind of context and being able to understand the difference between a good solution and a bad solution, they will produce tons of awful, but technically working code.
Totally agree with the off-shore component of this. I'm already going to have to break a task down into clear detail and resolve any anticipated blocker myself upfront to avoid multi-timezone multi-day back and forth.
Now that I'm practiced at that, the off-shored part is no longer valuable.
Many companies that see themselves as non-technical at the core prefer building solutions with an army of intermediate developers that are hot swappable. Having highly skilled developers is a risk for them.
Unlikely. Microsoft had layoffs everywhere except India. There they keep hiring more. As long as they can keep upskilling themselves while still being much cheaper than US workers, they won't fear unemployment.
Just yesterday I saw on X a video of a Miami hotel where the check-in procedure was via a video call to a receptionist in India.
Blowing away the junior -> senior pipeline would, on average, hit every country the same.
Though it raises an interesting point: if a country like India or China did make the investment in hiring, paying, and mentoring junior people but e.g. the US didn't, then you could see a massive shift in the global center of gravity around software expertise in 10 years (plus or minus).
Someone is going to be the best at planning for and investing in the future on this, and someone is going to maximally wishful thinking / short-term thinking this, and seductive-but-not-really-there vibe coding is probably going to be a major pivot point there.
This is such an important point. Not sure about India, which is still very market-forces driven, but China can just force its employers to do whatever is of strategic importance. That's long gone in the US. Market forces here will only ever optimize for short-term gain, shooting ourselves in the chest.
Ah mate I can't relate more to the offshore component. I had a very sad experience where I recently had to let go of an offshore team due to them providing devs that were essentially 'junior with copilot' but labelled as 'senior'.
Time and time again I would find telltale signs of LLM output dumped into PRs and then claimed as their own. Not a problem in itself, but the code didn't do what the detailed ticket asked and introduced other bugs as a result.
It ultimately became a choice of ‘go through the hassle of making a detailed brief for it to just be put in copilot verbatim and then go through the hassle of reviewing it and explaining the issues back to the offshore dev’ or ‘brief Claude directly’
I hate to say it but from a business perspective the latter won outright. It tears me up as it goes against my morality.
I know what you mean it just feels a bit inhumane to me. Sort of like defining a value for a living being and then determining that they fell beneath said value.
I've got myself in a PILE of trouble when trying to use LLMs with languages/technologies I am unfamiliar with (React, don't judge me).
But with something that I am familiar with (say Go, or Python) LLMs have improved my velocity massively, with the caveat that I have had to explicitly tell the LLM when it is producing something that I know that I don't want (me arguing with an LLM was an experience too!)
> I'm hearing from Senior devs all over, though, that Junior developers are just garbage at it. They produce slow, insecure, or just outright awful code with it, and then they PR code they don't even understand.
If this is the case then we had better have fully AI-generated code within the next 10 years, since those "juniors" will remain atrophied juniors forever and the old timers will be checking in with the big clock in the sky. If we, as a field, believe that this cannot possibly happen, then we are making a huge mistake leaning on a tool that requires "deep [orthogonal] experience" to operate properly.
IT education and computer science (at least part of it) will need a stronger focus on software engineering and software architecture skills to teach developers how to be in control of an AI dev tool.
The fastest way is via struggle. Learn to do it yourself first. Understand WHY it does not work. What's good code? What's bad code? What are conventions?
There are no shortcuts - you are not an accountant just because you have a calculator.
Brains are not computers and we don't learn by being given abstract rules. We also don't learn nearly as well from classroom teaching as we do from doing things IRL for a real purpose - the brain always knows the difference and that the (real, non-artificially created) stakes are low in a teaching environment.
That's also the huge difference between AI and brains: AI does not work on the real world but on our communication (and even that is limited to text, missing all the nuance that face-to-face communication includes). The brain works based on sensor data from the real world. The communication method, language, is a very limited add-on on top of how the brain really works. We don't think in language; doing even some abstract language-based thinking, e.g. formal math, requires a lot of concentration and effort and still uses a lot of "under the hood" intuition.
That is why even with years of learning the same curriculum we still need to make a significant effort for every single concrete example to "get everyone on the same page", creating compatible internal models under the hood. Everybody's internal model of even simple things is slightly different, depending on what brain they brought to learning and what exactly they learned, where even things like social classroom interactions went into how the connections were formed. Only with a huge amount of effort can we then use language to communicate in the abstract, and even then, when we leave the central corridor of ideas, people will start arguing forever about definitions. Even when the written text is the same, the internal model is different for every person.
As someone who took neuroscience, I found this surprisingly well written:
"The brain doesn't like to abstract unless you make it"
> This resource, prepared by members of the University of London Centre for Educational Neuroscience (CEN), gives a brief overview of how the brain works for a general audience. It is based on the most recent research. It aims to give a gist of the brain’s principles of function, covering the brain’s evolutionary origin, how it develops, and how it copes in the modern world.
The best way to learn is to do things IRL that matter. School is a compromise and not really all that great. People motivated by actual need often can learn things that take years in school with middling results significantly faster and with better and deeper results.
Yeah. The only, and I mean only, non-social/networking advantage of universities stems from forced learning/reasoning about complex theoretical concepts that form the requisite base knowledge for picking up the practical requirements of your field on the job.
Trade schools and certificate programs are designed to churn out people with journeyman-level skills in some field. They repeatedly drill you on the practical day-in-day-out requirements, tasks, troubleshooting tools and techniques, etc. that you need to walk up to a job site and be useful. The fields generally have a predictable enough set of technical problems to deal with that a deep theoretical exploration is unnecessary. This is just as true for electricians and auto mechanics as it is for people doing limited but logistically complex technical work, like orchestrating a big fleet of windows workstations with all the Microsoft enterprise tools.
In software development and lots of other fields that require grappling with complex theoretical stuff, you really need both the practical and the theoretical background to be productive. That would be a ridiculous undertaking for a school, and it’s why we have internships/externships/jr positions.
Between these tools letting the seniors in a department do all of the work, so companies don't have to invest in interns/juniors and there's no reliable entry point into the field, and the even bigger disconnect between what schools offer and the skills graduates need to compete, the industry has some rough days ahead, and a whole lot of people trying to get a foothold right now are screwed. I'm kind of surprised how little so many people in tech seem to care about the impending rough road for entry-level folks. I guess it's a combination of how little most higher-level developers have to interact with them, and the fact that everybody was tripping over themselves to hire developers when a lot of today's seniors joined the industry.
And that is the best thing about AI, it allows you to do and try so much more in the limited time you have. If you have an idea, build it with AI, test it, see where it breaks. AI is going to be a big boost for education, because it allows for so much more experimentation and hands-on.
By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software, so being able to do much more in a limited amount of time will not necessarily make you a more knowledgeable programmer, or at least that knowledge will most likely only be surface-level pattern recognition. It still needs to be combined with hands-on building your own thing, to truly understand the nuts and bolts of such projects.
If you end up with a working project where you understand all the moving parts, I think AI is great for learning, and the ultimate proof of whether the learning was successful is whether you can actually build (and ship) things.
So human teachers are good to have as well, but I remember they were of limited use for me when I was learning programming without AI. So many concepts they tried to teach me without having understood them themselves first. AI would likely have helped me get better answers instead of "because that is how you do it" when asking why to do something in a certain way.
So obviously I would have preferred competent teachers all the time, and also now competent teachers with unlimited time instead of faulty AIs for the students, but in reality human time is limited and humans are flawed as well. So I don't see the doomsday expectations for the new generation of programmers. The ultimate goal, building something that works to the spec, did not change, and horrible unmaintainable code was also shipped 20 years ago.
I don't agree, to me switching from hand coded source code to ai coded source code is like going from a hand-saw to an electric-saw for your woodworking projects. In the end you still have to know woodworking, but you experiment much more, so you learn more.
Or maybe it's more like going from analog photography to digital photography. Whatever it is, you get more programming done.
Just like when you go from assembly to C to a memory-managed language like Java. I did some 6502 and 68000 assembly over 35 years ago; now nobody knows assembly.
Key words there. To you, it's an electric saw because you already know how to program, and that's the other person's point; it doesn't necessarily empower people to build software. You? Yes. Generally though, when you hand the public an electric saw and say "have at it, build stuff", you end up with a lot of lost appendages.
Sadly, in this case the "lost appendages" are going to be man-decades of time spent undoing all the landmines vibecoders are going to plant around the digital commons. Which means AI even fails as a metaphorical "electric saw", because a good electric saw should strike fear into the user by promising mortal damage through misuse. AI has no such misuse deterrent, so people will freely misuse it until consequences swing back wildly, and the blast radius is community-scale.
> more like going from analog photography to digital photography. Whatever it is, you get more programming done.
By volume, the primary outcome of digital photography has been a deluge of pointless photographs to the extent we've had to invent new words to categorize them. "selfies". "sexts". "foodstagramming". Sure, AI will increase the actual programming being done, the same way digital photography gave us more photography art. But much more than that, AI will bring the equivalent of "foodstagramming" but for programs. Kind of like how the Apple App Store brought us some good apps, but at the same time 9 bajillion travel guides and flashlight apps. When you lower the bar you also open the flood gates.
> By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software
> will not necessarily make you a more knowledgeable programmer
I think we'd better start separating "building software" from programming, because the act of programming is going to continue to get less and less valuable.
I would argue that programming has been very overvalued for a while, even before AI. And the industry believes its own hype, with a healthy dose of elitism mixed in.
But now AI is removing the facade and it's showing that the idea and the architecture is actually the important part, not the coding of it.
Ok. But most developers aren't building AI tech. Instead, they're coding a SPA or CRUD app or something else that's been done 10000 times before, but just doing it slightly differently. That's exactly why LLMs are so good at this kind of (programming) work.
I would say most people are dealing with tickets and meetings about the tickets more than they are actually spending time with their editor. It may be similar, but that 1 percent difference needs to be nailed down right, as that's where the business lifeline lies.
Unfortunately education everywhere is getting really hurt by access to AI, both from students who are now able to not do their homework, and from teacher review/feedback being replaced by chatbots.
You can't atrophy if you never grew in the first place. The juniors will be stunted. It's the seniors who will become atrophied.
As for whether it's a mistake, isn't that just the way of things these days? The current world is about extracting as much as you can while you're still here. Look around. Nobody is building for the future. There are a few niche groups that talk about it, but nobody is really doing it. It's just take, take, take.
This just seems more of the same, but we're speeding up. We started by extracting fossil fuels deposited over millions of years, then extracting resources and technology from civilisations deposited over millennia, then from the Victorians deposited only a century or two ago, and now it's software deposited over only mere decades. Someone is going to be left holding the bag, we just hope it's not us. Meanwhile most of the population aren't even thinking about it, and most of the fraction that do think are dreaming that technology is going to save us before it's payback time.
Yeah I noticed the issue with more Junior developers right away. Some developers, Junior or not, have yet to be exposed to environments where their PRs are put under HEAVY scrutiny. They are used to loosey-goosey and unfortunately they are not prepared to put LLM changes under the level of scrutiny they require.
The worst is getting, even smallish, PRs with a bunch of changes that look extraneous or otherwise off. After asking questions the code changes without the questions being answered and likely with a new set of problems. I swear I've been prompting an LLM through an engineer/PR middleman :(
That is how you get Oracle source code. It broke my illusions after entering real life big company coding after university, many years ago. It also led to this gem of an HN comment: https://news.ycombinator.com/item?id=18442637
An observation. If we stipulate that it is true that a 'senior developer' benefits from Claude Code but a junior developer does not, then I'm wondering if that creates this gap where you have a bunch of newly minted '10x' engineers doing the work that a bunch of junior devs used to help with, and now you're not training any new junior devs because they are unemployable. Is that correct?
It already was the case wasn't it, that you could either get one senior dev to build your thing in a week, or give them a team of juniors and it would take the whole team 4 weeks and be worse.
Yet somehow companies continued to opt for the second approach. Something to do with status from headcount?
Yes, there are companies that opt for broken organizations for a variety of reasons. The observation though is this: does this lead to a world where the 'minimum' programmer is what we consider today to be a 'Senior Dev'? It echoes the transition of machinists into operators of CAD/CAM workstations driving machining centers, rather than hands on the dials of a mill or lathe. It certainly seems like entering the field through a "coder camp" might no longer be practical.
It'll be interesting to see if in a decade when a whole cohort of juniors didn't get trained whether LLMs will be able to do the whole job. I'm guessing a lot of companies are willing to bet on yes.
“Wasting” effort on juniors is where seniors come from. So that first approach is only valid at a sole proprietorship, at an early stage startup, or in an emergency.
I'm getting my money's worth having Claude write tools. We've reached the dream where I can vibe out some one-off software and it's great; today I made two different (shitty but usable!) GUI programs in seconds that let me visually describe some test data. The alternative was probably half an hour of putting something together if my first idea was good. Then I deleted them and moved on.
It still writes insane things all the time but I find it really helpful to spit out single use stuff and to brainstorm with. I try to get it to perform tasks I don't know how to accomplish (eg. computer vision experiments) and it never really works out in the end but I often learn something and I'm still very happy with my subscription.
I've also found it good at catching mistakes and helping write commit messages.
"Review the top-most commit. Did I make any mistakes? Did I leave anything out of the commit message?"
Sometimes I let it write the message for me:
"Write a new commit message for the current commit."
I've had to tell it how to write commit messages though. It likes to offer subjective opinions, use superlatives and guess at why something was done. I've had to tell it to cut that out: "Summarize what has changed. Be concise but thorough. Avoid adjectives and superlatives. Use imperative mood."
What I can recommend is to tell it that for all documentation, readmes and PR descriptions to keep it "tight, no purple-prose, no emojis". That cuts everything down nicely to to-the-point docs without GPTisms and without the emoji storm that makes it look like yet another frontend framework Readme.
Review your own code. Understand why you made the changes. And then clearly describe why you made them. If you can't do that yourself, I think that's a huge gap in your own skills.
Making something else do it means you don't internalize the changes that you made.
Your comment is not a fair interpretation of what I wrote.
For the record, I write better and more detailed commit messages than almost anyone I know across a decades[^0]-long career[^1,^2,^3,^4,^5]. But I'm not immune to making mistakes, everyone can use an editor, and everyone runs out of mental energy sometimes. Unfortunately, I find it hard to get decent PR reviews from my colleagues at work.
So yeah, I've started using Claude Code to help review my own commits. That doesn't mean I don't understand my changes or that I don't know why I made them. And CC is good at banging out a first draft of a commit message. It's also good at catching tiny logic errors that slip through tests and human review. Surprisingly good. You should try it.
I have plenty of criticisms for CC too. I'm not sure it's actually saving me any time. I've spent the last two weeks working 10 hour days with it. For some things it shines. For other things, I would've been better off writing the code from scratch myself, something I've had to do maybe 40% of the time now.
[^5]: None of these are my best examples, just the ones I found quickly. Most of my commit messages are obviously locked away by my employer. Somewhere in the git history is a paragraphs-long commit message from Jeff King (peff) explaining a one line diff. That's probably my favorite commit message of all time. But I also know that at work I've got a message somewhere explaining a single character diff.
My commits' description part, if warranted, is about the reason for the changes, not the specifics of the solution. It's a little memo to the person reading the diff, not a long monograph. And the diff is usually small.
Can also confirm. Almost any output from claude code needs my careful input for corrections, which you could only spot and provide if you have experience. There is no way a junior is able to command these tools because the main competency to use them correctly is your ability to guide and teach others in software development, which by definition is only possible if you have senior experience in this field. The sycophancy provided by these models will outright damage the skill progression for juniors, but on the other hand there is no way to not use them. So we are in a state where the future seems really uncertain for most of us.
I find the "killer app" right now is anything where you need to integrate information you don't already have in your brain. A new language or framework, a third-party API, etc. Something straightforward but foreign, and well-documented. You'll save so much time because Claude has already read the docs.
The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low. They’re convinced everything the tools produce will be junk or that the worst case examples people provide are representative of the average.
Then they finally go out and use the tools and realize that they exceed their (extremely low) expectations, and are amazed.
Yeah we all know Claude Code isn’t going to generate a $10 billion SaaS with a team of 10 people or whatever the social media engagement bait VCs are pushing this week. However, the tools are more powerful than a lot of people give them credit for.
In case some people haven't realized it by now: it's not just the code, it's also/mostly the marketing. Unless you make something useful that's hard to replicate..
I have recently found something that’s needed but very niche and the sort of problem that Claude can only give tips on how to go about it.
Hmm not my experience. I've been aggressively trying to use both Cursor and Claude Code. I've done maybe 20-30 attempts with Code at different projects, a couple of them personal small projects. All of them resulted in sub-par results, essentially unusable.
I tried to use it for Python, Rust and Bash. I also tried to use it for crawling and organizing information. I also tried to use it as a debugging buddy. All of the attempts failed.
I simply don't understand how people are using it in a way that improves productivity. For me, all of this is so far a huge timesink with essentially nothing to show for it.
The single positive result was when I asked it to optimize a specific SQL query, and it managed to do it.
Anyway I will keep trying to use it, maybe something needs to click first and it just hasn't yet.
I asked it to implement a C++ backend for an audio plug-in API (CLAP) for the DAW I'm developing and it got it right in maybe less than ten interactions. Implementing other plug-in APIs such as VST3 took me weeks to get to the same level of support.
You're probably in an obscure niche domain, or asking it to do something creative.
Try things like upgrading JS package dependencies, or translating between languages, limited tedious things, and you will be surprised how much better it does.
People are using different definitions of "vibe coding". If you expect to just prompt without even looking at the code and being involved in the process the result will be crap. This doesn't preclude the usefulness of models as tools, and maybe in the future vibe coding will actually work. Essentially every coder I respect has an opinion that is some shade of this.
There are the social media types you mention and their polar opposites, the "LLMs have no possible use" crowd. These people are mostly delusional. At the grown-ups table, there is a spectrum of opinions about the relative usefulness.
It's not contradictory to believe that the average programmer right now has his head buried in the sand and should at least take time to explore what value LLMs can provide, while at the same time taking a more conservative approach when using them to do actual work.
>maybe in the future vibe coding will actually work
Vibe coding works today at small enough of scale.
I'm building a personal app to help me track nutrition and I only needed to get involved in the code when Claude would hit its limits for a single file and produced a broken program (and this was via the UI, not Claude Code.) Now at ~3000 lines of python.
After I told it to split it into a few files I don't think I've had to talk about anything at the code level. Note that I eventually did switch to using Claude Code which might have helped (gets annoying copy/pasting multiple files and then my prompts hit max limits).
I just prompt it like an experienced QA/product person to tell it how to build it, point out bugs (as experienced as a user), point out bad data, etc.
A few of my recent prompts (each is a separate prompt):
>for foods found but not in database, list the number of times each shows up
>sort the list by count descending
>Period surplus/deficit seems too low. looking at 2025/07/24 to 2025/07/31
>do not require beige color (but still track it). combine blue/purple as one in stats (but keep separate colors). data has both white and White; should use standard case and not show as two colors
I think it's some variation of the efficient markets hypothesis. There are no problems that are both that lucrative and that easy to solve; if they existed, they would get dogpiled and stop being lucrative. Even in this day and age, $10B of revenue is an incredibly high bar.
On the other hand, $10B as valuation (not revenue) just requires a greater fool. Maybe it's possible, but I doubt there are too many of those fools available.
The question is not whether you can or can't, but whether it is still worth it long term:
- Is there a moat in doing so (i.e. will people actually pay for your SaaS knowing that they could do it too via AI)? And..
- How many large scale ideas do you need post AI? Many SaaS products are subscription based and loaded with features you don't need. Most people would prefer a simple product that just does what they need without the ongoing costs.
There will be more software. The question is who accrues the economic value of this additional software - the SWE/tech industry (incumbent), the AI industry (disruptor?) and/or the consumer. For the SWE's/tech workers it probably isn't what they envisioned when they started/studied for this industry.
It seems obvious to me it is the consumer who will benefit most.
I had been thinking of buying an $80 license for a piece of software but ended up knocking off a version in Claude Code over a weekend.
It is not even close to something commercial grade that I could sell as a competitor, but it is good enough for me to not spend $80 on the license. The huge upside is that I can customize the software in any way I like. I don't care that it isn't maintainable either. Making a new version in ChatGPT 5 is going to be my first project.
Just like a few hours ago I was thinking how I would like to customize the fitness/calorie tracking app I use. There are so many features I'd like that would be tightly coupled to my own situation and not a mass market product.
This to me seems obvious of what the future of software looks like for everything but mission critical software.
> The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low.
Or they have actually used all these tools, know how they work, and don't buy into hype and marketing.
It doesn't help that a lot of skeptics are also dishonest. A few days ago someone here tried to claim that inserting verbose debug logging, something Claude Code would be very good at, is "actually programming" and it's important work for humans to do.
No, Claude can create logs all across my codebase with much better formatting far faster than I can, so I can focus on actual problem solving. It's frustrating, but par for the course for this forum.
Edit: Dishonest isn't correct, I should have said I just disagree with their statements. I do apologize.
No, some skeptics are actually dishonest. It's part of trolling, and trolling is in fashion right now. Granted, some skeptics are fair, but many do it strictly for the views, without any due diligence.
That's not pedantry, pedantry would be if it were a very minor or technical detail, but being dishonest doesn't have anything to do with having a different opinion.
No, coding can be done by machines. But if you're telling a machine what to program, you're not coding. The machine is. You're no longer a programmer, you're just a user.
I suppose a more rigorous definition would be useful. We can probably make it more narrow as time goes on
To me, the essence of coding is about using formal languages and definable state machines (i.e, your toolchain) to manipulate the state of a machine in a predictable way.
C, C++, even with their litany of undefined behavior, are still formal languages, and their compilers can still be predicted and understood (no matter how difficult that is). If the compiler does something unexpected, it's because you, the programmer, lacked knowledge of either the language or the compiler's state.
Vibe coding uses natural languages, and interacts with programs whose state is not only unknown, but unknowable. The machine, for the same input, may produce wildly different output. If the machine produces unexpected code, it's not because of a lack of knowledge on the part of its programmer - it's because the machine is inherently unpredictable and requires more prodding in soft, fuzzy, natural language.
Telling something what outcomes you want, even if described in technical terms only a programmer would understand, is not coding. It's essentially just being a project manager.
Now you may ask - who cares about this no true Scotsman fallacy? Whether it's coding or not coding, we are still producing a program which serves the product needs of the customer.
Personally, I did not learn to code because I give a shit about the product needs of the customer, or the financial wellbeing of the business. I enjoy coding for its own sake - because it is fun to use systems of well defined rules to solve problems. Learning and using C++ is fun, for me; it seems every day I learn something new about the language and how the compiler behaves, and I've been using C++ for several years (and I started learning it when I was 12!)
Describing the outcome or goal of a project in natural human language sounds like a nightmare, to be honest. I became a software engineer so I could minimize the amount of natural language required to succeed in life. Natural language has gotten me (and, I suspect, people like me) in trouble over and over again throughout adolescence, but I've never written a piece of code that was misunderstood or ambiguous enough for people to become threatened by or outraged by it.
I think the disconnect is that some people care about products, and some people care about code.
It's qualitatively different to go through source code and specifications to understand how something works than to look at a database with all the weights of an LLM and pretend like you could predict the output.
Ummm, my entire career I have been telling machines what to program, the machines are taking my garbage C/Go/Python/Perl/whatever prompts and translating them to ASM/machine code that other machines will use to do... stuff
They're substantively different. Using a compiler requires you to have an internalized model of a state machine and, importantly, a formal language. C, assembler, Java, etc. are all essentially different from using the softness of the English language to coerce results out of a black box.
In both cases, all you need is the ability to communicate with the machine in a way that lets it convert your ideas into actions.
The restricted language of a compiler is a handicap, not evidence of a skill - we've been saying forever that "Natural Language" compilers would be a game changer, and that's all that an AI really is
Edit: It appears that this discussion is going to end up with a definition of "coding"
Is it coding if you tell the computer to perform some action, or is it coding if you tell it how to do that in some highly optimised way (for varying definitions of optimised, eg. Memory efficient, CPU efficient, Dev time efficient... etc)
No one is skeptical of compilers?! I guess you haven’t met many old fashioned C systems programmers, who go out of their way to disable compiler optimisations as much as they can because “it just produces garbage”.
Every generation, we seem to add a level of abstraction because, for most of us, it enhances productivity. And every generation, there is a crowd who rails against the new abstraction, mostly unaware of all of the levels of abstraction they already use in their coding.
Luxury! When I were a lad we didn't have them new fangled compilers, we wrote ASM by hand, because compilers cannot (and still to this day I think) optimise ASM as well as a human
Abstractions and compilers are deterministic, no matter if a neckbeard is cranky about the results. LLMs are not deterministic, they are a guessing game. An LLM is not an abstraction, it's a distraction. If you can't tell the difference, then maybe you should lay off the "AI" slop.
I think after all the goalpost moving, we have to ask - why the bitflip does it matter what we call it?
Some people are getting a lot of work done using LLMs. Some of us are using it on occasion to handle things we don't understand deeply but can trivially verify. Some of us are using it out of laziness because it helps with boilerplate. Everyone who is using it outside of occasional tests is doing it because they find it useful to write code. If it's not coding, then I personally couldn't care less. Only a True Scotsman should care.
If my boss came to me and said "hey we're going to start vibe coding everything at work from now on. You can manually edit code but claude code needs to be your primary driver from now on" I would quit and find a new career. I enjoy coding. I like solving puzzles using the specifics of a language's syntax. I write libraries and APIs and I put a great deal of effort into making sure the interface is usable by a human being.
If we get to the point where we are no longer coding, we are just describing things in product language to a computer and letting it do all the real work, then I will find a more fulfilling career because this ain't it
By the time it works flawlessly, it won't be your career anymore, it'll be the product manager's. They will describe what they want and the AI will produce it. You won't be told to "use Claude all the time".
I personally hate coding, but it's a means to an end, and I care about the end. I'm also paranoid about code I don't understand, so I only rarely use AI and even then it's either for things I understand 100% or things that don't matter. But it would be silly to claim they don't produce working code, no matter what we want to call it.
Second Edit: Adding the following paragraph from the Wikipedia page for emphasis:
Researchers have started to experiment with natural language programming environments that use plain language prompts and then use AI (specifically large language models) to turn natural language into formal code. For example Spatial Pixel created a natural language programming environment to turn natural language into P5.js code through OpenAI's API. In 2021 OpenAI developed a natural language programming environment for their programming large language model called Codex.
Technically you’re not vibe coding. You’re using AI to do software engineering. Vibe coding is specifically the process of having AI produce code and plowing ahead without understanding it.
I know I’m being pedantic, but people mean very different things when they talk about this stuff, and I don’t think any credence should be given to vibe coding.
To some extent, OP is still vibe coding, because one has to trust Claude's every single decision, which can't be easily verified at first glance anyway. Agreed that we need a new word for heavily AI-assisted software development though; I once used the word "vivid coding" for this kind of process.
I vibe code quite a bit and will plow through a lot of front end code despite being a backend engineer. In my case, it's on personal projects where I'm ambitious and asking the LLM to "replace an entire SaaS" sort of thing. At work most of the code is a couple lines here or there and trivial to review.
When I try the more complex things I will do multiple passes with AI, have 2-3 LLMs review it and delete deprecated code, refactor, interrogate it and ask it to fix bad patterns, etc. In an evening I can refactor a large code base this way. For example Gemini is meh compared to Claude Opus at new code, but somewhat decent for reviewing code that's already there, since the 1M context window allows it to tie things together Claude wouldn't be able to fit in 256k. I might then bounce a suggestion back from Gemini -> Claude -> Grok to fix something. It's kind of like managing a team of interns with different specialties and personalities.
"A key part of the definition of vibe coding is that the user accepts code without full understanding.[1] Programmer Simon Willison said: 'If an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding in my book—that's using an LLM as a typing assistant.'"
I wasn't familiar with his full message, so I didn't realize that the current definition of vibe coding was so cynical. Many of us don't see it that way.
1. Not looking at the code
2. YOLO everything
3. Paste errors back into the model verbatim
That said, I describe what I do as vibe coding, but I introduce code review bots into the mix. I also roadmap a plan with deep research before hand and require comprehensive unit and behavioural tests from the model.
Just a few months ago I couldn't imagine paying more than $20/mo for any kind of subscription, but here I am paying $200/mo for the Max 20 plan!
Similarly amazed as an experienced dev with 20 YoE (and a fellow Slovak, although US based). The other tools, while helpful, were just not "there" and they were often simply more trouble than they were worth, producing a lot of useless garbage. Claude Code is clearly on another level. Yes, it needs A LOT of handholding; my MO is to do Plan Mode until I'm 100% sure it understands the reqs and the planned code changes are reasonable, then let it work, and finally code review what it did (after it auto-fixes things like compiler errors, unit test failures and linting issues). It's kind of like a junior engineer that is a little bit daft but very knowledgeable, works super, super fast, and doesn't talk back :)
It is definitely the future, what can I say? This is a clear direction where software development is heading.
When I first tried letting Cursor loose on a relatively small code base (1500 lines, 2 files), I had it fix a bug (or more than one) with a clear testcase and a rough description of the problem, and it was a disaster.
The first commit towards the fix was plausible, though still not fully correct, but in the end not only was it unable to fix it, each commit also became more and more baroque. I cut it off when it wrote almost 100 lines of code to compare version numbers (which already existed in the source). The problem with discussing the plan is that, while debugging, you don't yourself have a full idea of the plan.
I don't call it a total failure because I asked the AI to improve some error messages to help it debug, and I will keep that code. It's pretty good at writing new code, very good at reviewing it, but for me it was completely incapable of performing maintenance.
These tools and LLMs differ in quality; for me Claude Code with Claude 4 was the first tool that worked well enough. I tried Cursor before, though that was 6+ months ago, and I wasn't very impressed.
Same for me. Cursor was a mess for me. I don't know why and how it works for other people. Claude code on the other hand was a success from day one and I'm using it happily for months now.
I used Cursor for about 5 months before switching to Claude Code. I was only productive with Cursor when I used it in a very specific way, which was basically me doing by hand what Claude Code does internally. I maintained planning documents, todo lists, used test driven development and linting tools, etc. My .cursorrules file looks like what I imagine the Claude system prompt to be.
Claude Code took the burden of maintaining that off my shoulders.
Also Cursor was/is utterly useless with any and all non-Anthropic models, which are the default.
Fwiw, I dipped my toes into AI-assisted coding a few weeks ago and started with Cursor. Was very unimpressed (spent more time prompting and fighting the tool than making forward progress) until I tried Claude Code. Happily dropped Cursor immediately (cancelled my sub) and am now having a great time using CC productively (just the basic $20/mo plan). Still needs hand-holding but it's a net productivity boost.
This was a problem I regularly had using Copilot w/ GPT4o or Sonnet 3.5/3.7... sometimes I would end up down a rabbit hole and blow multiple days of work, but more typically I'd be out an hour or two and toss everything to start again.
Don't have this w/ Claude Code working over multiple code bases of 10-30k LOC. Part of the reason is the type of guidance I give in the memory files helps keep this at bay, as does linting (ie. class/file length), but I also chunk things up into features that I PR review and have it refactor to keep things super tidy.
Yeah, Github Copilot just didn't work for me at all. The completions are OK and I actually still use it for that but the agent part is completely useless. Claude Code is in another league.
May I ask what you are using it for? I have been using it for fun mostly: side projects, learning, experimenting. I would never use it on a work codebase unless, well, the company ordered or at least permitted it. And even then, I'm not really sure I would feel comfortable with the level of liberty CC takes. So I'm curious about others.
Of course you need explicit permission from the company to use (non-local) AI tools.
Before that was given, I used AI as a fancier search engine, and for coming up with solutions to problems I explained in abstract (without copy-pasting actual code in or out).
I have a similar amount of engineering experience, was highly skeptical, and I've come to similar conclusions with Claude Code after spending two weeks on a greenfield project (TS api, react-native client, TS/React admin panel).
As I've improved planning and context management, the results have been fairly consistent. As long as I can keep a task within the context window, it does a decent job almost every time. And occasionally I have to have it brute-force its way to green lint/typecheck/tests. That's been one of the biggest speed bumps.
I've found that gemini is great at the occasional detailed code-review to help find glaring issues or things that were missed, but having it implement anything has been severely lacking. I have to literally tell it not to do anything because it will gladly just start writing files on a whim. I generally use the opus model to write detailed plans, sonnet to implement, and then opus and gemini to review and plan refactors.
I'm impressed. The progress is SLOW. I'd have gotten to the stage I'm at in 1/3 to 1/2 the time, likely with fewer tests and significantly less process documentation. But the results are otherwise fairly great. And the learning process has kept me motivated to keep this old side-project moving.
I was switching between two accounts for a week while testing, but in the end upgraded to the $100/month plan and I think I've been rate-limited once since. I don't know if I'll be using this for every-day professional work, but I think it's a great tool for a few categories of work.
Feels like the most valuable skill to have as a programmer in times of Claude Code is that of carefully reading spec documentation and having an acute sense of critical thinking when reviewing code.
The critical skill is spotting potential bugs before they happen, but in order to do that you need an extremely acute understanding of, or a lot of experience in, the stack, libs and programming language of choice. Something that, ironically, you will not get by "vibe coding".
Fascinating, since I found the recent Claude models untrustworthy for writing and editing SQL. E.g. it'd write conditions correctly, but not add parens around ANDs and ORs (which Gemini Pro then highlighted as a bug, correctly).
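To make that concrete, here's a minimal sketch of the kind of precedence bug I mean (table and column names made up for illustration). AND binds tighter than OR, so without the parens the filter quietly matches every archived row, not just the ones for that user:

    -- illustrative only: AND binds tighter than OR, so this parses as
    -- (user_id = 42 AND status = 'open') OR status = 'archived'
    SELECT * FROM orders
    WHERE user_id = 42 AND status = 'open' OR status = 'archived';

    -- what was almost certainly intended
    SELECT * FROM orders
    WHERE user_id = 42 AND (status = 'open' OR status = 'archived');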
If you aren't already (1) telling Claude Code which flavor of SQL you want (there are several major dialects and many more minor ones) and (2) giving it access to up-to-date documentation via MCP (e.g. https://github.com/arabold/docs-mcp-server) so it has direct access to canonical docs for authoritative grounding and syntax references, you'll find that you get much better results by doing one or both of those things.
Documentation on features your SQL dialect supports and key requirements for your query are very important for incentivizing it to generate the output you want.
As a recent example, I am working on a Rust app with integrated DuckDB, and asked it to implement a scoring algorithm query (after chatting with it to generate a Markdown file "RFC" describing how the algorithm works.) It started the implementation with an absolute minimal SQL query that pulled all metrics for a given time window.
I questioned this rather than accepting the change, and it said its plan was to implement the more complex aggregation logic in Rust because 1) it's easier to interpret Rust branching logic than SQL statements (true) and 2) because not all SQL dialects include EXP(), STDDEV(), VAR() support which would be necessary to compute the metrics.
The former point actually seems like quite a reasonable bias to me, personally I find it harder to review complex aggregations in SQL than mentally traversing the path of data through a bunch of branches. But if you are familiar with DuckDB you know that 1) it does support these features and 2) the OLAP efficiency of DuckDB makes it a better choice for doing these aggregations in a performant way than iterating through the results in Rust, so the initial generated output is suboptimal.
I informed it of DuckDB's support for these operations and pointed out the performance consideration and it gladly generated the (long and certainly harder to interpret) SQL query, so it is clearly quite capable, just needs some prodding to go in the right direction.
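For a sense of what "doing the aggregation in the database" looks like, here's a rough sketch of the shape of such a query (hypothetical table and column names, not my actual scoring algorithm); DuckDB evaluates these aggregates natively, so there's no need to stream raw rows back into Rust:

    -- hypothetical metrics table: (metric_name, value, age_days, observed_at)
    SELECT
        metric_name,
        AVG(value)                         AS mean_value,
        STDDEV_SAMP(value)                 AS stddev_value,
        VAR_SAMP(value)                    AS variance_value,
        -- exponential recency weighting, the kind of thing that needs EXP()
        SUM(value * EXP(-0.05 * age_days)) AS decayed_score
    FROM metrics
    WHERE observed_at BETWEEN ? AND ?
    GROUP BY metric_name;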
I found Claude Sonnet 4 really good at writing SQL if you give it a feedback loop with real data. It will research the problem, research the data, and improve queries until it finds a solution. And then it will optimize it, even optimize performance if you ask it to run explain plan or look at pg_stat_statements (postgres).
It's outrageously good at performance optimization. There's been multiple really complex queries I've optimized with it that I'd been putting off for a long time. Claude code figured the exact indexes to add within seconds (not ones I would have got easily manually).
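The loop is roughly: hand it the EXPLAIN output, let it point at the missing index. A hypothetical Postgres example of the shape (query, table and index names invented):

    -- 1. capture the plan for the slow query
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT account_id, SUM(amount)
    FROM transactions
    WHERE created_at >= now() - interval '30 days'
    GROUP BY account_id;

    -- 2. if the plan shows a sequential scan filtered on created_at,
    --    a typical suggestion is a covering index like
    CREATE INDEX CONCURRENTLY transactions_created_at_idx
        ON transactions (created_at) INCLUDE (account_id, amount);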
This kind of thing is a key point. Tell Claude Code to build the project, run linters, run the tests, and fix the errors. This (in my experience) has a good chance of filtering out mistakes. Claude is fully capable of running all of the tools, reading the output, and iterating. Higher level mistakes will need code written in a way that is testable with tests that can catch them, although you probably want that anyway.
Even when I hand roll certain things, it still nice to have Claude Code take over any other grunt work that might come my way. And there are always yaks to shave, always.
I completed my degree over 20 years ago and, due to the dot-com bust and the path I took, never coded in a full-time role: some small bits of dev and scripting, but nothing where I would call myself a developer. I've had loads of ideas down through the years but never had the time to complete them or learn the language/stack to complete them. Over the last 3 weeks I've been working on something small that should be ready for a beta release by the end of August. The ability to sit down and work on a feature or bug when I only have a spare 30 mins and be immediately productive, without having to get in the zone, is a game changer for me. Also, while I can read and understand the code, writing it would be at least 10 times slower for me. This is a small codebase that will have less than 5k lines and is not complicated, so GitHub Copilot is working well for me in this case.
I could see me paying for higher tiers given the productivity gains.
The only issue I can see is that we might end up with a society where those that can afford the best subscriptions have more free time, get more done, make more money and are more successful in general. Even current base level subscriptions are too expensive for huge percentage of the global population.
Gemini is not that good right now. GLM-4.5, which just came out, is pretty decent and very cheap. I use these with the RooCode plugin for VSCode that connects to it via OpenRouter. $10 of credits lasts a day of coding for me where as Claude would run that out in an hour.
I found Gemini CLI to be totally useless too. Last week I tried Claude Code with GLM4.5 (via z.ai API), though, and it was genuinely on par with Sonnet.
Thank you for the recommendation. I've been testing this on an open source project and it's indeed good. Not as good as Sonnet 4, but good enough. And the pricing is very reasonable. Don't know if I'd trust it to work on private code, but for public code it's a great option.
> I couldn't imagine paying >$100/month for a dev tool in the past, but I'm seriously considering upgrading to the Max plans.
Sadly, my experience with the Max plan has been extremely poor. It's not even comparable: I've been experimenting extensively with Claude Code over the last few weeks, spending more than $80 per day, and it's amazing. The problem is that on the Max plan you're not the one managing the context length, and this ruins the model's ability to keep things in memory. Of course this is expected, the longer the context the more expensive to run, but it's so frustrating to fail at a coding task because it's so obvious the model lost a crucial part of the context.
My experience has been similar, over perhaps 4-6 weeks of Claude Code. My first few days were a bit rough, and I was tempted to give up and proclaim that all my skeptic's opinions were correct and that it was useless. But there is indeed a learning curve to using it. After a month I'm still learning, but I can get it to give me useful output that I'm happy committing to my projects, after reviewing it line by line.
Agreed that context and chunking are the key to making it productive. The times when I've tried to tell it (in a single prompt) everything I want it to do, were not successful. The code was garbage, and a lot of it just didn't do what I wanted it to do. And when there are a lot of things that need to be fixed, CC has trouble making targeted changes to fix issues one by one. Much better is to build each small chunk, and verify that it fully works, before moving on to the next.
You also have to call its bullshit: sometimes it will try to solve a problem in a way you know is wrong, so you have to stop it and tell it to do it in another way. I suppose I shouldn't call it "bullshit"; if we're going to use the analogy of CC being like an inexperienced junior engineer, then that's just the kind of thing that happens when you pair with a junior.
I still often do find that I give it a task, and when it's done, realize that I could have finished it much faster. But sometimes the task is tedious, and I'm fine with it taking a little longer if I don't have to do it myself. And sometimes it truly does take care of it faster than I would have been able to. In the case of tech that I'm learning myself (React, Tailwindcss, the former of which I dabbled with 10 years ago, but my knowledge is completely out of date), CC has been incredibly useful when I don't really know how to do something. I'm fine letting CC do it, and then I read the code and learn something myself, instead of having to pore over various tutorials of varying quality in order to figure it out on my own.
So I think I'm convinced, and I'll continue to make CC more and more a part of my workflow. I'm currently on the Pro plan, and have hit the usage limits a couple times. I'm still a little shy about upgrading to Max and spending $100/mo on a dev tool... not sure if I'll get over that or not.
One thing I’ve started doing is using Gemini CLI as a sidecar for Claude Code, to load in a huge amount of context around a set of changes and get a second opinion. It’s been pretty handy for that particular use case due to its context-size advantage.
Completely agree. You really have to learn how to use it.
For example, I've heard many say that doing big refactorings causes problems. I found a way that works for SwiftUI projects: I did a refactoring that moved files, restructured large files into smaller components, and standardized the component setup across different views.
The pattern that works for me: 1) ask it to document the architecture and coding standards, 2) ask it to create a plan for refactoring, 3) ask it to do a low-risk refactoring first, 4) ask it to update the refactoring plan, and then 5) go through all the remaining refactorings.
The refactoring plan comes with timeline estimates in days, but those are complete rubbish with Claude Code. Instead I asked it to estimate in 1) number of chat messages, 2) number of tokens, 3) cost based on number of tokens, and 4) number of files impacted.
Another approach that works well is to first generate a throwaway application. Then ask it to create documentation on how to do it right, incorporating all the learnings and where it got stuck. Finally, redo the application with these guidelines and rules.
Another tip: sometimes when it gets stuck, I open the project in Windsurf and ask another LLM (e.g., Gemini 2.5 Pro or Qwen Coder) to review the project and the problem, and then I ask Windsurf to give me a prompt instructing Claude Code how to fix it. Works well in some cases.
Also, biggest insight so far: don't expect it to be perfect the first time. It needs a feedback loop: generate code, test the code, inspect the results, and then improve the code.
This works well for SQL, especially if it can access real data: inspect the database, try some queries, try to understand the schema from your data, and then work towards a SQL query that works. Often, as a final step, it will simplify the working query.
I use an MCP tool with full access to a test database, so you can tell it to run explain plans and look at the statistics (pg_stat_statements). It will draw a mermaid diagram of your query with performance numbers included (number of records retrieved, cache hits, etc.) and come back with an optimized query and index suggestions.
I tried it also on CSV and Parquet files with DuckDB: it will run the explain plan, compare both queries, explain why Parquet is better, see that the query is doing predicate pushdown, etc.
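To make the loop concrete, here's a minimal sketch of the kind of thing it runs, assuming the duckdb Python package; the file and column names are made up:

```python
# Minimal sketch of the explain-then-run feedback loop described above.
# File and column names are hypothetical; assumes `pip install duckdb`.
import duckdb

con = duckdb.connect()

query = """
    SELECT customer_id, count(*) AS orders
    FROM 'orders.parquet'                 -- DuckDB reads Parquet directly
    WHERE order_date >= DATE '2024-01-01'
    GROUP BY customer_id
"""

# First ask the engine how it plans to execute the query
# (this is where things like predicate pushdown show up in the plan).
print(con.sql("EXPLAIN " + query))

# Then actually run it and inspect the result before iterating further.
print(con.sql(query).fetchall())
```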
Also, when it gets things wrong, instead of inspecting the code I ask it to create a design document with mermaid diagrams describing what it has built. Quite often that quickly shows some design mistake that you can ask it to fix.
Also, with multiple tools on the same project, you have the problem of each using its own way of keeping track of the plan. I asked Claude Code to come up with rules for itself and Windsurf to collaborate on a project. It came back with a set of rules for CLAUDE.md and .windsurfrules covering which files to have and how to use them (PLAN.md, TODO.md, ARCHITECTURE.md, DECISION.md, COLLABORATION.md).
Claude code is great until it isn’t. You’re going to get to a point where you need to modify something or add something… a small feature that would have been easy if you wrote everything, and now it’s impossible because the architecture is just a mishmash of vibe coded stuff you don’t understand.
So far I'm bullish on subagents to help with that: validating completion status, bullshit detection, catching over-engineering, etc. I can load them with extra context, like conventions and specific prompts, to clamp down on the Claude-isms during development.
I understand completely what you're saying. But with the delusions that management is under right now, you're just going to seem like someone that's resisting the flow of code and becoming a bottleneck.
This. It helps to tell it to plan and to then interrogate it about that plan, change it to specification etc. Think of it as a refinement session before a pairing session. The results are considerably better if you do it this way. I've written kubernetes operators, flask applications, Kivy applications, and a transparent ssh proxy with Claude in the last two months, all outside of work.
It also helps to tell it to write tests first: I lean towards integration tests for most things but it is decent at writing good unit tests etc too. Obviously, review is paramount if TDD is going to work.
As a hobbyist coder, the more time I spend brainstorming with all the platforms about specs and tests and architecture, the better the ultimate results.
Having used Claude Code extensively for the last few months, I still haven't reached this "until it isn't" point. Review the code that comes out. It goes a long way.
Yes, my point is that you don't even have "it compiles" as a way to measure a code review. Maybe you did a great job, maybe you did a terrible job, how do you tell?
How can you end up with code you don't understand, if you review anything it writes? I wouldn't let it deviate from the architecture I want to have for the project. I had problems with junior devs in the past, too eager to change a project, and I couldn't really tell them to stop (need to work on my communication skills). No such problem with Claude Code.
I don’t remember what architecture was used by PRs I reviewed a month ago. I remember what architecture I designed 15 years ago for projects I was part of.
I've only used the agentic tools a bit, but I've found that they're able to generate code at a velocity that I struggle to keep in my head. The development loop also doesn't require me to interact with the code as much, so I have worse retention of things like which functions are in which file, what helper functions already exist, etc.
It's less that I can't understand, and more that my context on the code is very weak.
I might have to try this. Without having tried it, it feels like the context I think I lack is more nitty gritty than would be exposed like this. It's not like I'm unsure of how a request ends up in a database transaction, but more "do we need or already have an abstraction over paging in database queries?". It doesn't feel like mermaid diagrams or design documents would include that, but I'm open to being wrong there.
> a mishmash of vibe coded stuff you don’t understand.
No, there is a difference between "I wrote this code" and "I understand this code". You don't need to write all the code in a project to understand it. Otherwise writing software in a team would not be a viable undertaking.
Yes, the default when it does anything is to try and create. It will read my CLAUDE.md file, it will read the code that is already there, and then it will try to write it again. I have had this happen many times (today, I had to prompt 5-6 times to get it to read the file, as the feature had already been implemented).
...and if something is genuinely complex, it will (imo) generally do a bad job. It will produce something that looks like it works superficially, but as you examine it will either not work in a non-obvious way or be poorly designed.
Still very useful but to really improve your productivity you have to understand when not to use it.
English is much less expressive compared to code. Typing the keys was never the slow part for senior developers.
It does work with an LLM, but you’re reinventing the wheel with these crazy markup files. We created a family of languages to express how to move bits around, and replacing that with English is silly.
Vibe coding is fast because you’re ok with not thinking about the code. Anytime you have to do that, an LLM is not going to be much faster.
In theory, there is no reason why this should be the case. By the same logic, there is no reason why juniors can't create perfect code the first time... it's just that the tickets are never detailed enough?
But in reality, it doesn't work like that. The code is just bad.
You are responsible for the layers. You should either do the design on your own, or let the tool ask you questions and guide you. But you should have it write down the plan, and only then you let it code. If it messes up the code, you /clear, load the plan again and tell it to do the code differently.
It's really the same with junior devs. I wouldn't tell a junior dev to implement a CRM app, but I can tell the junior dev to add a second email field to the customer management page.
You're not setting good enough boundaries or reviewing what it's doing closely enough.
Police it, and give it explicit instructions.
Then after it's done its work, prompt it with something like: "You're the staff engineer or team lead on this project, and I want you to go over your own git diff like it's a contribution from a junior team member. Think critically and apply judgement based on the architecture of the project described in @HERE.md and @THERE.md."
Ah yes…the old “you’re holding it wrong”. The problem is these goddamn things don’t learn, so you put in the effort to police it…and you have to keep doing that until the end of time. Better off training someone off the street to be a software engineer.
Yes, sometimes you are actually indeed holding it wrong. Sometimes a product has to be used in a certain way to get good results. You're not going to blame the shampoo when someone uses only a tiny drop of it, and the hair remains dirty.
This is still early days with LLMs and coding assistants. You do have to hold them in the right way sometimes. If you're not willing to do that, or think that provides less value than doing it another way... great, good for you, do it the way you think is best for you.
I've been a coding assistant skeptic for a long time. I just started playing with Claude Code a month or so ago. I was frustrated for a bit until I learned how to hold it the right way. It is a long, long way from being a substitute for a real human programmer, but it's helpful to me. I certainly prefer it to pair programming with a human (I hate pair programming), so this provides value.
If you don't care to figure out for yourself if it can provide you value, that's your choice. But this technology is going to get better, and you might later find yourself wishing you'd looked into it earlier. Just like any new tool that starts out rough but eventually turns out to be very useful.
Your claude.md (or equivalent) is the best way to teach them. At the end of any non-trivial coding session, I'll ask for it to propose edits/additions to that file based on both the functional changes and the process we followed to get there.
That's not the end of the story, though. LLMs don't learn, but you can provide them with a "handbook" that they read in every time you start a new conversation with them. While it might take a human months or years to learn what's in that handbook, the LLM digests it in seconds. Yes, you have to keep feeding it the handbook every time you start from a clean slate, and it might have taken you months to get that handbook into the complete state it's in. But maybe that's not so bad.
The good thing about this process is that such a handbook functions as documentation for humans too, if properly written.
Claude is actually quite good at reading project documentation and code comments and acting on them. So it's also useful for encouraging project authors to write such documentation.
I'm now old enough that I need such breadcrumbs around the code to get context anyways. I won't remember why I did things without them.
It's just a tool, not an intelligence or a person.
You use it to make your job easier. If it doesn't make your job easier, you don't use it.
Anybody trying to sell you on a bill of goods that this is somehow "automating away engineers" and "replacing expensive software developers" is either stupid or lying (or both).
I find it incredibly useful, but it's garbage-in, garbage-out just like anything else with computers. If your code base is well commented and documented and laid out in a consistent pattern, it will tend to follow that pattern, especially if it follows standards. And it does better in languages (like Rust) that have strict type systems and coding standards.
Even better if you have rigorous tests for it to check its own work against.
They don't learn by themselves, but you can add instructions as they make mistakes, which is effectively them learning. You have to write code review feedback for juniors too, so that's not an appreciable difference.
> Better off training someone off the street to be a software engineer.
And that person is going to quit and you have to start all over again. They also cost at least 100x the price.
I've been telling people: this is Uber in 2014. You're getting a benefit and it's being paid for with venture capital money; this is about as good as it's going to get.
Not so. Adding to context files helps enormously. Having touchstone files (ARCHITECTURE.md) you can reference helps enormously. The trick is to steer, and create the guardrails.
Honestly, it feels like DevOps had a kid with Product.
> Honestly, it feels like DevOps had a kid with Product.
You've just described a match made in hell. DevOps - let's overcomplicate things (I'm looking at you, K8s) - and Product - they create pretty screenshots and flows but don't actually think about the product as a system (or set of systems).
Gemini is shockingly, embarrassingly, shamefully bad (for something out of a company like Google). Even the open models like Qwen and Kimi are better on opencode.
In my experience, Gemini is pretty good at multi-shot prompting: just give it a system prompt and some example user/assistant pairs, and it can produce great results!
And this is its biggest weakness for coding. As soon as it makes a single mistake, it's over. It somehow has learned that during this "conversation" it's having, it should make that mistake over and over again. And then it starts saying things like "Wow, I'm really messing up the diff format!"
Ah, I was thinking the Gemini CLI agent itself might be the cause of the problems, so maybe try the opencode/Gemini combo instead.
I'd like to mess around with "opencode + Copilot free-tier auth" or "{opencode|crush} + some model via Groq (still free?)" to see what kind of mileage I can get and whether it's halfway decent.
> I have about two weeks of using Claude Code and to be honest, as a vibe coding skeptic, I was amazed.
And, yet, when I asked it to correct a CMake error in a fully open source codebase (broken dependency declaration), it couldn't work it out. It even started hallucinating version numbers and dependencies that were so obviously broken that at least it was obvious to me that it wasn't helping.
This has been, and continues to be, my experience with AI coding. Every time I hit something that I really, really want the AI to do and get right (like correcting my build system errors), it fails and fails miserably.
It seems like everybody who sings the praises of AI coding has one thing in common: Javascript. Make of that what you will.
This is typically the outcome, when you have it look at a generic problem and fix it, especially if the problem depends on external information (like specific version numbers, etc). You have to either tell it where to look it up, or ask it to ask you questions how things need to be resolved. I personally use it to work on native code, C++ (with CMake), Zig, some Python. Works fine.
I have not tried it, for a variety of reasons, but my (quite limited, anecdotal, and gratis) experience with other such tools is that I can get them to write something I could perhaps get as an answer on StackOverflow: limited scope, limited length, addressing at most one significant issue; perhaps that has to do with what they are trained on. But once things get complicated, it's hopeless.
You said Claude Code was significantly better than some alternatives, so better than what I describe, but - we need to know _on what_.
Not with Claude Code, but with Cursor using Claude Sonnet 4, I coded an entire tower defense game: title, tutorial, gameplay with several waves of enemies, and a “rewind time” mechanic. The whole thing was basically vibe coded; I touched maybe a couple dozen lines of code. Apparently it wasn’t terrible [0]
I've been working on the design of a fairly complicated system using the daffy robots to iterate over a bunch of different ideas. Trying things out (conceptually) to explore the pros and cons of each decision before even writing a single line of code. The code is really a formality at this point as each and every piece is laid out and documented.
Contrast this with the peg parser VM it basically one-shotted but needed a bunch of debug work. A fuzzy spec (basically just the lpeg paper) and a few iterations and it produced a fully tested VM. After that the AST -> Opcode compiler was super easy as it just had to do some simple (fully defined by this point) transforms and Bob's your uncle. Not the best code ever but a working and tested system.
Then my predilection for yak shaving took over as the AST needed to be rewritten to make integration as a python C extension module viable (and generated). And why have separate AST and opcode optimization passes when they can be integrated? Oh, and why even have opcodes in the first place when you can rewrite the VM to use Continuation Passing Style and make the entire machine AST-> CPS Transform -> Optimizer -> Execute with a minimum of fuss?
So, yeah, I think it's fair to say the daffy robots are a little more than a StackOverflow chatbot. Plus, what I'm really working on is a lot more complicated than this, needing to redo the AST was just the gateway drug.
I haven't used Claude Code - but have been using Amp a lot recently. Amp always hits on target. They created something really special.
Has anyone here used both Claude Code and Amp and can compare the two's effectiveness? I know one is a CLI and the other an editor extension. I'm looking for comparisons beyond that. Thanks!
It burns through credit too quickly. As a previous Sourcegraph Cody user, I tried Amp first, but I spent tens of dollars every day during the trial, and that was with an eye on the usage. It felt horrible seeing how I was mostly paying for its mistakes and the time it takes debugging. With CC, I can let go of the anxiety. I get several hours a day out of the Claude Pro plan and that's mostly good enough for now. If it's not, I'll upgrade to Max; at $100 that's still less than what I'd have spent on Amp.
That's the thing for me too: I don't want to pay for the agent's mistakes, even if those mistakes are in part the fault of my prompt. I'm fine with having usage limits if it means I pay a fixed cost per month. Not sure how long this will last, considering how expensive all this is for the companies to run, though.
I feel like Amp's costs are actually in line with Sourcegraph's costs, and eventually Anthropic, OpenAI, et al. will all be charging a lot more than they are now.
It's the classic play to entice people to something for low cost, and then later ramp it up once they're hooked. Right now they can afford to burn VC money, but that won't last forever.
Check out openrouter.ai. You can pay for credits that get used per prompt instead of forking out a fixed lump sum, it rotates keys so you can avoid being throttled, and you can use the same credits on any model in their index.
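As a rough sketch of what that looks like: the API is OpenAI-compatible, so something like the following works with the openai Python client. The model id here is only an example; check their index for current names.

```python
# Rough sketch of pay-per-prompt usage through OpenRouter.
# Assumes the `openai` Python package; the model id is only an example --
# check the OpenRouter index for current model names.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter API key
)

resp = client.chat.completions.create(
    model="z-ai/glm-4.5",  # any model from the index can go here
    messages=[{"role": "user", "content": "Review this function for bugs: ..."}],
)
print(resp.choices[0].message.content)
```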
This is not actually such a big change for me. I've been doing mostly architecture for several years now. Thinking about the big picture, how things fit together, and how to make complex things simple is what I care about. I've been jokingly calling what I do "programming without coding" even before the current AIs existed. It's just that now I have an extra tool I can use for writing the code.
The more I use it the more I realise my first two weeks and the amazement I felt were an illusion.
I’m not going to tell you it’s not useful; it is. But the shine wears off pretty fast, and when it does, you’re basically left with a faster way to type. At least in my experience.
I really don't know what it is, but Claude Code just seems like an extremely well-tuned package. You can have the same core models, but the internal prompts matter, how they look up extra context matters, how easy it is to add external context matters, how it applies changes matters, how eager it is to actually use an external tool to help you matters. With Claude Code, it just feels right. When I say I want a review, I get a review; when I want code, I get code; when I want just some git housekeeping, I get that.
That has not been my experience: Copilot using Claude is way different from Claude Code for me. Anecdotal and "vibes" based, but it's what I've been experiencing.
I use vim for most of my development, so I'm always in a terminal anyway. I like my editor setup, and getting the benefits of a coding assistant without having to drastically change my editor has huge value to me.
The native tool use is a game changer. When I ask it to debug something it can independently add debug logging to a method, run the tests, collect the output, and code based off that until the tests are fixed.
Having spent a couple of weeks putting both AIDE-centric (Cursor, Windsurf) and CLI-centric (Claude Code, OpenAI Codex, Gemini CLI) options through real-world tasks, Cursor was one of the least effective tools for me. I ultimately settled on Claude Code and am very happy with it.
I realized Claude Code is the abstraction level I want to work at. Cursor et al. still stick me way down in the code muck, when really I only want to see the code during review. It's an implementation detail that I still have to review because it makes mistakes, even when guided perfectly, but otherwise I want to think in interfaces, architecture, components. The low-level code, I don't care about. Is it up to spec and conventions, does it work? Good enough for me.
Claude Code is ahead of anything else, in a very noticeable way. (I've been writing my own CLI tooling for AI codegen since 2023, and in that journey I've tried most of the options out there. It has been a big part of my work, so that's how I know.)
I agree with many things that the author is doing:
1. Monorepos can save time
2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.
3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.
4. Types help (a lot!). Linters help as well. These are guard rails.
5. Put external documentation inside project docs, for example in docs/external-deps.
6. And finally, like every tool it takes time to figure out a technique that works best for you. It's arguably easier than it was (especially with Claude Code), but there's still stuff to learn. Everyone I know has a slightly different workflow - so it's a bit like coding.
I vibe coded quite a lot this week. Among other things, Permiso [1] - a super simple GraphQL RBAC server. It's nowhere close to well tested and reviewed, but it can already be quite useful if you want something simple (and can wait until it's reviewed).
> 2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.
Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.
> 3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.
Ironically, I've been struggling with this. For best results I've found Claude does best with a test hook, but then it loses the ability to write tests before the code works to validate bugs/assumptions; it just starts auto-fixing things and can get a bit wonky.
It helps immensely to ensure it doesn't forget or abandon anything, but it's equally harmful at certain design/prototype stages. I've taken to having a flag where I can enable/disable the test behavior lol.
I will start with a basic markdown outline and then use a prompt describing more of the system in just flowing (yet coherent) thought and, crucially, I'll ask the model to "Organize the spec in such a way that an LLM can best understand it and make use of it." The result is a much more succinct document with all the important pieces.
(or -- you can write a spec that is still more fleshed out for humans, if you need to present this to managers. Then ask the LLM to write a separate spec document that is tailored for LLMs)
> Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.
Yes. I write the outline in markdown and then get AI to flesh it out. Then I generate a project structure with stubbed API signatures. Then I keep refining until I've achieved a good level of detail - including full API signatures and database schemas.
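To give a sense of the level of detail, a stub at that stage might look roughly like this (all the names are invented for illustration):

```python
# Roughly the level of stubbing before letting the AI fill in the bodies.
# All names here are invented for illustration.
from dataclasses import dataclass
from uuid import UUID

@dataclass
class Customer:
    id: UUID
    email: str

class CustomerRepo:
    """Persistence boundary; the implementation comes later."""

    def get(self, customer_id: UUID) -> Customer | None:
        """Fetch a customer by id, or None if it doesn't exist."""
        raise NotImplementedError

    def upsert(self, customer: Customer) -> Customer:
        """Insert or update, and return the stored record."""
        raise NotImplementedError
```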
> Ironically i've been struggling with this. For best results i've found claude to do best with a test hook, but then claude loses the ability to write tests before code works to validate bugs/assumptions, it just starts auto fixing things and can get a bit wonky.
I generate a somewhat basic prototype first. At which point I have a good spec, and a good project structure, API and db schemas. Then continuously refine the tests and code. Like I was saying, types and linting are also very helpful.
What kind of projects are more suitable for this approach? My workflow, sans LLM agents, has been to rely on frameworks to provide a base abstraction for me to build upon. The hardest part is nailing down the business domain, done over rounds of discussions with stakeholders. Coding is pretty breezy in comparison.
That's why you see such a difference in time saved using LLMs for programming across the population. If you have all the domain knowledge and the problem is generic enough it's a 100x multiplier. Otherwise your experience can easily range from 0.1x to 10x.
I don't even write the outline myself. I tell CC to come up with a plan, and then we iterate on that together with CC and I might also give it to Gemini for review and tell CC to apply Gemini's suggestions.
Playwright is such a chore with Claude, but I'm afraid to live without it. Every feature seems to involve about 70% of the time spent fixing its Playwright mess. It struggles with running the tests, basic data setup and cleanup, auth, and just basic best practices. I have a testing guide that outlines all this, but it half-asses everything...
Yes, they can save you some time, but at the cost of Claude's time and lots of tokens spent on tool calls trying to find what it needs. Aider is much nicer, from the standpoint that you can add the files you need it to know about and send it off to do its thing.
I still don't understand why Claude is more popular than Aider, which is by nearly every measure a better tool, and can use whatever LLM is more appropriate for the task at hand.
> Aider is much nicer, from the standpoint that you can add the files you need it to know about, and send it off to do its thing.
As a user, I don't want to sit there specifying about 15-30 files, then realize that I've missed some and that it ruins everything. I want to just point the tool at the codebase and tell it: "Go do X. Look at the current implementation and patterns, as well as the tests, alongside the docs. Update everything as needed along the way, here's how you run the tests..."
Indexing the whole codebase into Qdrant might also help a little.
I think it makes sense to want that, but at least for me personally I’ve had dramatically better overall results when manually managing the context in Aider than letting Claude Code try to figure out for itself what it needs.
It can be annoying, but I think it both helps me be more aware of what’s being changed (vs just seeing a big diff after a while), and lends itself to working on smaller subtasks that are more likely to work on the first try.
You get much better results in CC as well if you're able to give the relevant files as a starting point. In that regard these two tools are not all that different.
Aider does know the whole repository tree (it scans the git index). It just doesn't read the files until you tell it to. If it thinks it needs access to a file, it will prompt you to add it. I find this to be a fairly good model. Obviously it doesn't work off line though.
Honestly, it's just this. "Claude the bar button on foo modal is broken with a failed splork". And CC hunts down foo.ts, traces that it's an API call to query.ts, pulls in the associated linked model, traces the api/slork.go and will as often as not end up with "I've found the issue!" and fix it. On a one sentence prompt. I think it's called an "Oh fuck" moment the first time you see this work. And it works remarkably reliably. [handwave caveats, stupid llms, etc]
As an alternative to monorepos, you can add another repo to your workspace by informing it relevant code is located at XXX path on your machine. Claude will add that code to your workspace for the session.
Agreed, for CC to work well, it needs quite a bit of structure
I’ve been working on a Django project with good tests, types and documentation. CC mostly does great, even if it needs guidance from time to time
Recently also started a side project to try to run CC offline with local models. Got a decent first version running with the help of ChatGPT, then decided to switch to CC. CC has been constantly trying to avoid solving the most important issues, sidestepping errors and for almost everything just creating a new file/script with a different approach (instead of fixing or refactoring the current code)
I've also found that structure is key instead of trusting its streams of consciousness.
For unit testing, I actually pre-write some tests so it can learn what structure I'm looking for. I go as far as to write mocks and test classes that *constrain* what it can do.
With constraints, it does a much better job than if it were just starting from scratch and improvising.
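A pre-written constraining test might look roughly like this; all the names are made up, and the `payments` module is the part I'd then ask it to write:

```python
# Example of a pre-written, constraining test (all names are made up).
# The `payments` module is the part the model is asked to implement; the
# fake gateway pins down exactly which calls it is allowed to make.
import unittest

class FakeGateway:
    def __init__(self):
        self.charges = []

    def charge(self, amount_cents, token):
        self.charges.append((amount_cents, token))
        return {"status": "ok", "id": "ch_123"}

class TestPaymentService(unittest.TestCase):
    def test_charge_goes_through_gateway_exactly_once(self):
        from payments import PaymentService  # hypothetical module under test
        gateway = FakeGateway()
        service = PaymentService(gateway=gateway)

        result = service.charge(amount_cents=500, token="tok_abc")

        # The constraint: exactly one gateway call, and the gateway's
        # response is passed back unchanged.
        self.assertEqual(gateway.charges, [(500, "tok_abc")])
        self.assertEqual(result["id"], "ch_123")

if __name__ == "__main__":
    unittest.main()
```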
There's a numerical optimization analogy to this: if you just ask a solver to optimize a complicated nonlinear (nonconvex) function, you will likely get stuck or hit a local optimum. But if you carefully constrain its search space, and guide it, you increase your chances of getting to the optimum.
LLMs are essentially large function evaluators with a huge search space. The more you can herd it (like herding a flock into the right pen), the better it will converge.
The real power of Claude Code comes when you realise it can do far more than just write code.
It can, in fact, control your entire computer. If there's a CLI tool, Claude can run it. If there's not a CLI tool... ask Claude anyway, you might be surprised.
E.g. I've used Claude to crop and resize images, rip MP3s from YouTube videos, trim silence from audio files, the list goes on. It saves me incredible amounts of time.
I don't remember life before it. Never going back.
You probably want to give Claude a computer. I'm not sure you always want to give it your computer unless you're in the loop.
We have Linux instances running an IDE in cloud VMs that we can access through the browser at https://brilliant.mplode.dev. Personally I think this is closer to the ideal UX for operating an agent (our environment doesn't install agents by default yet, but you should be able to just install them manually). You don't have to do anything to set up terminal access or ssh except sign in and wait for your initial instance to start, and once you have any instance provisioned it automatically pauses and resumes based on whether your browser has it open. It's literally Claude + a personal Linux instance + an IDE that you can just open from a link.
Pretty soon I should be able to run as many of these at a time as I can afford, and control all of their permissions/filesystems/whatever with JWTs and containers. If it gets messed up or needs my attention I open it with the IDE as my UI and can just dive in and fix it. I don't need a regular Linux desktop environment or UI or anything. Just render things in panes of the IDE or launch a container serving a webapp doing what I want and open it instead of the IDE. Haven't ever felt this excited about tech progress.
I got it to diagnose why my Linux PC was crashing. It did a lot of journalctl grepping on my behalf, and I was glad for its help. I think it may have helped fix it, but we'll see.
I was having a kernel panic on boot, which I would work around by loading the previous kernel. It turns out I had just run out of space on my boot partition, but in my initial attempts to debug and fix it I had gotten into a broken package state.
I handed it the reins just out of morbid curiosity, and because I couldn't be bothered continuing for the night, but to my surprise (and with my step-by-step guidance) it did figure it all out. It found unused kernels, and when uninstalling them didn't remove them, it deleted them with rm. It then helped resolve the broken package state, and eventually I was back in a clean working state.
Importantly though, it did not initially know it hadn't actually cleaned up the boot partition. I had to insist that it had not, in fact, just freed up space, and that it would still need to remove the old kernels.
Completely agree. Another use case is a static site generator. I just write posts with whatever syntax I want and tell Claude Code to make it into a blog post in the same format. For example, I can just write in the post “add image image.jpeg here” and it will add it - much easier than messing around with Markdown or Hugo.
Beyond just running CLI commands, you can have CC interact with them. E.g., I built this little tool that gives CC a Tmux-cli command (a convenience wrapper around Tmux) that lets it interact with CLI applications, monitor them, etc.:
For example, this lets CC spawn another CC instance and give it a task (way better than the built-in spawn-and-let-go black box), interact with CLI scripts that expect user input, or use debuggers like Pdb for token-efficient debugging and code understanding, etc.
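The underlying mechanism is just driving tmux from a script. A simplified sketch of the idea (this is not the actual tool; the session and file names are made up):

```python
# Simplified sketch of driving an interactive program through tmux.
# Not the actual Tmux-cli tool, just the underlying idea; session and
# file names are made up.
import subprocess, time

def tmux(*args):
    return subprocess.run(["tmux", *args], capture_output=True, text=True)

# Start a detached session running a program that expects user input.
tmux("new-session", "-d", "-s", "debug", "python -m pdb my_script.py")

# Type a debugger command into it, wait a moment, then read the pane back.
tmux("send-keys", "-t", "debug", "break my_script.py:42", "Enter")
time.sleep(1)
print(tmux("capture-pane", "-t", "debug", "-p").stdout)

tmux("kill-session", "-t", "debug")
```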
It's the automator's dream come true. Anything can be automated, anything scripted, anything documented. Even if we're going to use other (possibly local) models in the future, this will be my interface of choice. It's so powerful.
Automation is now trivially easy. I think of another new way to speed up my workflow — e.g. a shell script for some annoying repetitive task — and Claude oneshots it. Productivity gains built from productivity gains.
I don’t feel Claude Code helps one iota with the issue in xkcd 1319. If anything, it has increased the prevalence of “ongoing development” as I automate more things and create more problems to solve.
However, I have fixed up and added features to 10 year old scripts that I never considered worth the trade off to work on. It makes the cost of automation cheaper.
It's not a dream come true to have a bunch of GPUs crunching at full power to achieve your minor automation, with the company making them available losing massive amounts of money on it.
I'd like to say I'm praising the paradigm shift more than anything else (and this is to some degree achievable with smaller, open, and sometimes local agentic models), but yes, there are definitely nasty externalities (though burning VC cash is not high on that list for me). I hope some externalities can be optimized away.
The point is that it costs more than $1200, you're just not the one paying all the costs. It seems like there are a ton of people on HN who are absolutely pumped to be totally dependent on a tool that must rugpull them eventually to continue existing as a business. It feels like an incredible shame that the craft is now starting to become totally dependent on tools like this, where you're calling out to the cloud to do even the most basic programming task.
>> I thought I would see a pretty drastic change in terms of Pull Requests, Commits and Line of Code merged in the last 6 weeks. I don’t think that holds water though
The chart basically shows the same output with Claude as before.
Which kinda represents what I felt when using LLMs.
You "feel" more productive and you definitely feel "better" because you don't do the work now, you babysit the model and feel productive.
But at the end of the day the output is the same, because all the advantages of LLMs are nerfed by the time you have to spend reviewing all of it, fixing it, re-prompting it, etc.
And because you offload the "hard" part - and don't flex that thinking muscle - your skills decline pretty fast.
Try using Claude or another LLM for a month and then try doing a tiny little app without it. It's not only the code part that will seem hard, but the general architecture/structuring too.
And in the end the whole code base slowly (but not that slowly) degrades, and in the longer term it's a net negative. At least with current LLMs.
I've been exploring vibe coding lately and by far the biggest benefit is the lack of mental strain.
You don't have to try to remember your code as a conceptual whole, what your technical implementation of the next hour of code was going to be like at the same time as a stubborn bug is taunting you.
You just ask Mr smartybots and it delivers anything between proofreading and documentation and whatnot, with some minor fuckups occasionally.
But the mental strain is how you build skills and get better at your job over time. If it's too much mental strain, maybe your code's architecture or implementation can be improved.
A lot of this sounds like "this bot does my homework for me, and now I get good grades and don't have to study so hard!"
Perhaps you set a very high quality bar, but I don't see the LLMs creating messy code. If anything, they are far more diligent in structuring it well and making it logically sequenced and clear than I would be. For example, very often I name a variable slightly incorrectly at the start and realise it should be just slightly different at the end and only occasionally do I bother to go rename it everywhere. Even with automated refactoring tools to do it, it's just more work than I have time for. I might just add a comment above it somewhere explaining the meaning is slightly different to how it is named. This sort of thing x 100 though.
> they are far more diligent in structuring it well and making it logically sequenced and clear than I would be
Yes, with the caveat: only on the first/zeroth shot. But even when they keep most/all of the code in context if you vibe code without incredibly strict structuring/guardrails, by the time you are 3-4 shots in, the model has "forgotten" the original arch, is duplicating data structures for what it needs _this_ shot and will gleefully end up with amnesiac-level repetitions, duplicate code that does "mostly the same" thing, all of which acts as further poison for progress. The deeper you go without human intervention the worse this gets.
You can go the other way, and it really does work. Set up strict types, clear patterns, clear structures. And intervene to explain + direct. The type of things senior engineers push back on in junior PRs: "Why didn't you just extend this existing data structure and factor that call into the trivially obvious extension of XYZ??"
I haven't found such a bug yet. If it fails to debug on its second attempt I usually switch to a different model or tell it to carpet bomb the code with console logs, write test scripts and do a web search, etc.
The strength (and weakness) of these models is their patience is infinite.
My friend, there’s no solid evidence that this is the case. So far, there are a bunch of studies, mostly preprints, that make vague implications, but none that can show clear causal links between a lack of mental strain and atrophying brain function from LLMs.
You're right, we only have centuries of humans doing hard things that require ongoing practice to stay sharp. Ask anyone who does something you can't fake, like playing the piano, what taking months off does to their abilities. To be fair, you can get the skills back much faster than someone who never had them to begin with, but skills absolutely atrophy if you are not actively engaged with them.
I wish, but as it stands right now, LLMs have to be driven and caged ruthlessly. Conventions, architecture, interfaces, testing, integration. Yes, you can YOLO it and just let it cook up _something_, but that something will be an unmaintainable mess. So I'm removing my brain from the abstraction level of code (as much as I dare), but most definitely not from everything else.
We know that learning and building mental capabilities require effort over time. We know that when people have not been applying or practicing programming for years, their skills have atrophied. I think a good default expectation is that unused skills will fade over time. Of course the questions are: is the engagement we have with LLMs enough to sustain the majority of the skills? Are there new skills one builds that can compensate for those lost (even when the LLM is no longer used)? How quickly do the changes happen? Are there wider effects, positive and/or negative?
I know what you‘re writing is the whole point of vibe coding, but I‘d strongly urge you to not do this. If you don’t review the code an LLM is producing, you‘re taking on technical debt. That’s fine for small projects and scripts, but not for things you want to maintain for longer. Code you don’t understand is essentially legacy code. LLM output should be bent to our style and taste, and ideally look like our own code.
If that helps, call it agentic engineering instead of vibe coding, to switch to a more involved mindset.
Not for me. I just reverse engineered a Bluetooth protocol for a device, which would have taken me at least a few days of capturing streams of data in Wireshark. Instead I dumped entire captures into an LLM and it gave me much more control in finding the right offsets, etc. It took me only a day.
Maybe you do today. Will that always be the case? People running Windows do not have as much control over their systems as they should. What does enshittification look like for AI?
> If there's a CLI tool, Claude can run it. If there's not a CLI tool... ask Claude anyway, you might be surprised.
No Claude Code needed for that! Just hang around r/unixporn and you'll collect enough scripts and tips to realize that mainstream OSes have pushed computers from being a useful tool to being a consumerist toy.
That's like saying "you don't need a car, just hang around this bicycle shop long enough and you'll realize you can exercise your way around the town!"
The simple task of unzipping something with tar is cryptic enough that collecting Unix scripts from random people is definitely something people don't want to do in 2025.
Remembering one thing is easy, remembering all the things is not. With an agentic CLI I don't need to remember anything, other than if it looks safe or not.
The point is not that a tool maybe exists. The point is: You don't have to care if the tool exists and you don't have to collect anything. Just ask Claude code and it does what you want.
I've been using Claude code 12-16 hours a day since I first got it running two weeks ago.
Here are the tips I've discovered:
1. Immediately change to sonnet (the cli defaults to opus for max users). I tested coding with opus extensively and it never matches the quality of sonnet.
2. Compacting often ends progress - it's difficult to get back to the same quality of code after compacting.
3. First prompt is very important and sets the vibe. If your instance of Claude seems hesitant, doubtful, sometimes even rude, it's always better to end the session and start again.
4. There are phrases that make it more effective. Try, "I'm so sorry if this is a bad suggestion, but I want to implement x and y." For whatever reason it makes Claude more eager to help.
5. Monolithic with docker orchestration: I essentially 10x'd when I started letting Claude itself manage docker containers, check their logs for errors, rm them, rebuild them, etc. Now I can get an entirely new service online in a docker container, from zero to operational, in one Claude prompt.
5. It's not just docker: give it the Playwright MCP server so it can see what it is implementing in the UI and the requests.
6. start in plan mode and iterate on the plan until you're happy
7. use slash commands, they are mini prompts you can keep refining over time, including providing starting context and reminding it that it can use tools like gh to interact with Github
not sure I agree on 1.
2. compact when you are at a good stop, not when you are forced to because you are at 0%
Use agents to validate the code: Is it over-engineered? Does it conform to conventions and spec? Is it actually implemented or half bullshit? I run three of these at the end of a feature or task and they almost always send Opus back to the workbench to fix a bunch of stuff. And since they have their own context, you don't blow up the main context and can go for longer.
Sometimes it's a bit too eager to mess around inside the containers; when I ask it to understand some code, it sometimes won't stop trying to run it inside the container in a myriad of ways that won't work.
It once did a container exec that piped the target file into the project's CLI command runner, which did nothing, but that gives you an example of the string of wacky ways it will insist on running code instead of just reading it.
Where are you hosting those containers? Our serverless/linux cli/browser IDE at https://brilliant.mplode.dev runs on containers in our nascent cloud platform and we’re almost ready to start serving arbitrary containers on it deployed directly from the IDE. I’m curious if there are any latency/data/auth/etc pain points you’ve been running into
I had success with creating a VERY detailed plan.md file - down to how all systems connect together, letting claude-loop[1] run while I sleep and coming back in the morning manually patching it up.
```
## IMPORTANT
- Use thiserror, create the primary struct ServiceError in error.rs which has all #[from], do not use custom result types or have errors for different modules, all errors should fall under this struct
- The error from above should implement IntoResponse to translate it to client error without leaking any sensitive information and so that ServiceError can be used as error type for axum
## PLAN
### Project Setup
(Completed) Set up Rust workspace with server and jwt-generator crates
(Completed) Create Cargo.toml workspace configuration with required dependencies (axum, sqlx, jsonwebtoken, serde, tokio, uuid, thiserror)
(Completed) Create compose.yaml for PostgreSQL test database with environment variables
(Completed) Design database schema in tables.sql (data table with key UUID, data JSONB, created_at, updated_at; locks table with lock_id UUID, locked_at, expires_at)
### Database Layer
(Completed) Implement database connection module with PostgreSQL connection pool
(Completed) Create database migration system to auto-deploy tables.sql if tables don't exist
(Completed) Implement data model structs for database entities (DataRecord, Lock)
### JWT System
(Completed) Create jwt-generator utility that takes secret key, permissions (read/write), and expiration time
(Completed) Implement JWT authentication middleware for server with permission validation
(Completed) Add JWT token validation and permission checking for endpoints
### Core API Endpoints
(Completed) Implement POST /set endpoint for storing/updating JSONB data with partial update support using jsonb_set
(Completed) Implement GET /get/<key> endpoint with optional sub-key filtering for partial data retrieval
(Completed) Add automatic created_at and updated_at timestamp handling in database operations
### Streaming & Binary Support
(Completed) Implement streaming bytes endpoint with compact binary format (not base64) for efficient data transfer
(Completed) Add support for returning all data if no specific format specified in GET requests
### Lock System
(Completed) Implement database-backed lock system with locks table
(Completed) Create POST /lock endpoint that tries to obtain lock for 5 seconds with UUID parameter
(Completed) Create DELETE /unlock endpoint to release locks by UUID
(Completed) Add lock timeout and cleanup mechanism for expired locks
### Error Handling & Final Polish
(Completed) Implement comprehensive error handling with proper HTTP status codes
(Completed) Add input validation for all endpoints (UUID format, JSON structure, etc.)
(Completed) Test all endpoints with various scenarios (valid/invalid data, concurrent access, lock timeouts)
```
It took 4 iterations (>30 minutes!), and everything works as expected. The plan itself was partially generated with ccl, since I told it to break down tasks into smaller steps; with some manual edits I got it down to that final product. I later swapped the locks to be built on a lease system and it handled that quite nicely as well.
> 5. Monolithic with docker orchestration: I essentially 10x'd when I started letting Claude itself manage docker containers, check their logs for errors, rm them, rebuild them, etc. Now I can get an entirely new service online in a docker container, from zero to operational, in one Claude prompt.
This is very interesting. What's your setup, and what kind of prompt might you use to get Claude to work well with Docker? Do you do anything to try and isolate the Claude instance from the rest of your machine (i.e. run these Docker instances inside of a VM) or just YOLO?
Not the parent, but I've totally been doing this too. I've been using docker compose, and Claude seems to understand that fine in terms of scoping everything - it'll run "docker compose logs foo", "docker compose restart bar", etc. I've never tried to isolate it, though I rarely yolo; I keep an eye on what it's doing and approve (I also look at the code diffs as it goes). It's allowed read-only access to stuff without asking, but everything else I look at.
I'm fascinated by #5. As someone who goes out of my way to avoid Docker while realizing its importance, I would love to know the general format of your prompt.
It's the difference between Claude making code that "looks good" and code that actually runs.
You don't have to be stuck anymore saying, "hey help me fix this code."
Say, "Use tmux to create a persistent session, then run this python program there and debug it until its working perfectly"
Irrespective of how good Claude Code actually is (I haven’t used it, but I think this article makes a really cogent case), here’s something that bothers me: I’m very junior, and I have a big, slow, ugly codebase of GDScript (basically Python) that I’m going to convert to C# to both clean it up and speed it up.
This is for a personal project, I haven’t written a ton of C# or done this amount of refactoring before, so this could be educational in multiple ways.
If I were to use Claude for this, I’d feel like I was robbing myself of something that could teach me a lot (and maybe motivate me to structure my code better in the future). If I don’t use Claude, I feel like I’m wasting my (very sparse) free time on a pretty uninspiring task that may very well be automated away in most future jobs, mostly out of some (misplaced? masochistic?) belief about programming craft.
This sort of back and forth happens a lot in my head now with projects.
I'm on the tail end of my 35+ year developer career, but one thing I always do with any LLM stuff is this: I'll ask it to solve something I know I COULD solve myself, I just don't feel like it.
Example: Yesterday I was working with an Open API 3.0 schema. I know I could "fix" the schema to conform to a sample input, I just didn't feel like it because it's dull, I've done it before, and I'd learn nothing. So I asked Claude to do it, and it was fine. Then the "Example" section no longer matched the schema, so Claude wrote me a fitting example.
But the key here is I would have learned nothing by doing this.
There are, however, times where I WOULD have learned something. So whenever I find the LLM has shown me something new, I put that knowledge in my "knowledge bank". I use the Anki SRS flashcard app for that, but there are other ways, like adding to your "TIL blog" (which I also do), or taking that new thing and writing it out from scratch, without looking at the solution, a few times and compiling/running it. Then trying to come up with ways this knowledge can be used in different ways; changing the requirements and writing that.
Basically getting my brain to interact with this new thing in at least 2 ways so it can synthesize with other things in your brain. This is important.
Learning a new (spoken) language uses this a lot. Learn a new word? Put it in 3 different sentences. Learn a new phrase? Create at least 2-3 new phrases based on that.
I'm hoping this will keep my grey matter exercised enough to keep going.
In my experience, if you don't review the generated code, and thus become proficient in C# enough to do that, the codebase will become trash very quickly.
Errors compound with LLM coding, and, unless you correct them, you end up with a codebase too brittle to actually be worth anything.
Friends of mine apparently don't have that problem, and they say they have the LLM write enough tests that they catch the brittleness early on, but I haven't tried that approach. Unfortunately, my code tends to not be very algorithmic, so it's hard to test.
After 16 years of coding professionally, I can say Claude Code has made me considerably better at the things that I had to bang my head against the wall to learn. For things I need to learn that are novel to me, for productivity sake, it’s been “easy come; easy go” like any other learning experience.
My two cents are:
If your goal is learning fully, I would prioritize the slow & patient route (no matter how fast “things” are moving.)
If your goal is to learn quickly, Claude Code and other AI tooling can be helpful in that regard. I have found using “ask” modes more than “agent” modes (where available) can go a long way with that. I like to generate analogies, scenarios, and mnemonic devices to help grasp new concepts.
If you’re just interested in getting stuff done, get good at writing specs and letting the agents run with it, ensuring to add many tests along the way, of course.
I perceive there’s at least some value in all approaches, as long as we are building stuff.
Yes!
Valuable, fundamental, etc. - do it yourself, the slow path.
Boring, uninspiring, commodity - and most of all - easily reversible and not critical - to the LLM it goes!
When learning things, intrinsic motivation makes one unreasonably effective. So if there is a field you like, just focus on it. This will let you proceed much faster at _valuable_ things, which all in all is the best use of one's time in any case.
Software crafting when you are not at a job should be fun. If it’s not fun, just do the least effort that suits your purpose. And be super diligent only about the parts _you_ care about.
IMHO people who think everyone should do everything from first principles with the diligence of a Swiss clockmaker are just being difficult. It’s _one_ way of doing it, but it’s not the _only right way_.
Care about important things. If a thing is not important and not interesting, just deal with it in the least painful way and focus on something value-adding.
A few years ago there was a blog post trend going around about "write your own x" instead of using a library or something. You learn a lot about how software works by writing your own version of a thing. Want to learn how client-side routing works? Write a client-side router. I think LLMs have basically made it so anything can be "library" code. So really it comes down to what you want to get out of the project. Do you want to get better at C#? Then you should probably do the port yourself. If you just want to have the ported code and focus on some other aspect, then have Claude do it for you.
Really if your goal is to learn something, then no matter what you do there has to be some kind of struggle. I’ve noticed whenever something feels easy, I’m usually not really learning much.
Before AI, there was copy paste. People who copied code from Stackoverflow without understanding it learned nothing, and I saw it up close many times. I don't see a problem with you asking for advice or concepts. But if you have it write everything for you, you definitely won't learn
That being said, you have to protect your time as a developer. There are a million things to learn, and if making games is your goal as a junior, porting GDscript code doesn't sound like an amazing use of your time. Even though you will definitely learn from it.
The difference now is that LLMs propose to provide copy+paste for everything, and for your exact scenario. At least with Stack Overflow, you usually had to adapt the answers to your specific scenario, and there often weren’t answers for more esoteric things.
I think this is a really interesting point. I have a few thoughts as I read it (as a bit of a grey-beard).
Things are moving fast at the moment, but I think it feels even faster because of how slowly things have been moving for the last decade. I was getting into web development in the mid-to-late-90s, and I think the landscape felt similar then. Plugged-in people kinda knew the web was going to be huge, but on some level we also knew that things were going to change fast. Whatever we learnt would soon fall by the wayside and become compost for the next new thing we had to learn.
It certainly feels to me like things have really been much more stable for the last 10-15 years (YMMV).
So I guess what I'm saying is: yeah, this is actually kinda getting back to normal. At least that is how I see it, if I'm in an excitable optimistic mood.
I'd say pick something and do it. It may become brain-compost, but I think a good deep layer of compost is what will turn you into a senior developer. Hopefully that metaphor isn't too stretched!
I’ve also felt what GP expresses earlier this year. I am a grey-beard now. When I was starting my career in the early 2000’s a grey-beard told me, “The tech is entirely replaced every 10 years.” This was accompanied by an admonition to evolve or die in each cycle.
This has largely been true outside of some outlier fundamentals, like TCP.
I have tried Claude code extensively and I feel it’s largely the same. To GP’s point, my suggestion would be to dive into the project using Claude Code and also work to learn how to structure the code better. Do both. Don’t do nothing.
It definitely does seem to be the case, and stands to reason, that better developers get better results out of Claude et al.
You're on the right track in noticing you'll be missing valuable lessons, and this might rob you of better outcomes even with AI in the future. As it is a side project though keeping motivation is important too.
As well, you'll eventually learn those lessons through future work if you keep coding yourself. But if instead you lean more toward assistance, it is hard to say whether you would become as skilled in the raw skill of coding, and that might affect your ability to wield AI to full effect.
Having done a lot of work across many languages, including gdscript and C# for various games, I do think you'll learn a huge amount from doing the work yourself and such an opportunity is a bit more rare to come by in paid work.
How much do you care about experience with C# and porting software? If that's an area you're interested in pursuing maybe do it by hand I guess. Otherwise I'd just use claude.
Disagree entirely, and would suggest the parent intentionally dive in on things like this.
The best way to skill up over the course of one's career is to expose yourself to as broad an array of languages, techniques, paradigms, concepts, etc. as possible. So sure, you may never touch C# again. But by spending time to dig in a bit you'll pick up some new ideas that you can bring forward with you to other things you *do* care about later.
I agree here. GP should take time to learn the thing and use AI to assist in learning not direct implementation.
If there is going to be room for junior folks in SWE, it will probably be afforded to those who understand some language’s behaviors at a fairly decent level.
I’d presume they will also be far better at system design, TDD and architecture than yesterday’s juniors, (because they will have to be to drive AI better than other hopeful candidates).
But there will be plenty of grey beards around who expect syntactical competence, and fwiw, if you can’t read your own code, even slowly, you fail at the most crucial aspect of AI accelerated development: validation.
Well I think you've identified a task that should be yours. If the writing of the code itself is going to help you, then don't let AI take that help from you because of a vague need for "productivity". We all need to take time to make ourselves better at our craft, and at some point AI can't do that for you.
But I do think it could help, for example by showing you a better pattern or language or library feature after you get stuck or finish a first draft. That's not cheating; that's asking a friend.
Yup, I absolutely agree with you. I've been coding professionally for around 25 years now, 10-ish before that as a hobby as a child and teenager. There's lots of stuff I know, but still lots of stuff I don't know. If my goal is to learn a new language, I'm going to build the entire thing without using a coding assistant. At most I might use Claude (not Code) to ask pointed questions, and then use those answers to write my own code (and not copy/paste anything from Claude).
Often I'll use Claude Code to write something that I know how to write, but don't feel like writing, either because it's tedious, or because it's a little bit fiddly (which I know from past experience), and I don't feel like dealing with the details until CC gives me some output that I can test and review and modify.
But sometimes, I'll admit, I just don't really care to learn that deeply. I started a project that is using Rust for the backend, but I need a frontend too. I did some React around 10 years ago, but my knowledge there (what I remember, anyway) is out of date. So sometimes I'll just ask Claude to build an entire section of a page. I'll have Claude do it incrementally, and read the code after each step so I understand what's going on. And sometimes I do tell Claude I'm not happy with the approach, and to do something differently. But in a way I kinda do not care so much about this code, aside from it being functional and maintainable-looking.
And I think that's fine! We don't have to learn everything, even if it's something we need to accomplish whatever it is we've set out to accomplish. I think the problem that you'll run into is that you might be too junior to recognize what are the things you really need to learn, and what are the things you can let something else "learn" for you.
One of the things I really worry about this current time we're in is that companies will start firing their junior engineers, with a belief (however misguided) that their senior engineers, armed with coding assistants, can be just as productive. So junior engineers will lose their normal path to gaining experience, and young adults entering college will shy away from programming, since it's hard to get a job as a junior engineer. Then when those senior engineers start to retire, there will be no one to take their places. Of course, the current crop of company management won't care; they'll have made their millions already and be retired. So... push to get as much experience as you can, and get over the hump into senior engineer territory.
Have it generate the code. Then have another instance criticize the code and say how it could be improved and why. Then ask questions to this instance about things you don't know or understand. Ask for links. Read the links. Take notes. Internalize.
One day I was fighting Claude on some core Ruby method and it was not agreeing with me about it, so I went to check the actual docs. It was right. I have been using Ruby since 2009.
As someone who has been programming computers for almost 30 years, and professionally for about 20: by all means do some of it manually, but leverage LLMs in tutor/coach mode, with "explain this but don’t solve it for me" prompts when stuck. Let the tool convert the boring parts once you’re confident they’re truly boring.
Programming takes experience to acquire taste for what’s right, what’s not, and what smells bad and will bite you but you can temporarily (yeah) not care. If you let the tool do everything for you you won’t ever acquire that skill, and it’s critical to judge and review your work and work of others, including LLM slop.
I agree it’s hard and I feel lucky for never having to make the LLM vs manual labor choice. Nowadays it’s yet another step in learning the craft, but the timing is wrong for juniors - you are now expected to do senior level work (code reviews) from day 1. Tough!
What’s wrong with using Claude Code to write a possible initial iteration and then going back and reviewing the code for understanding? Various languages and frameworks have their own footguns, but those usually are not unfixable later on.
AFAICT you learn significantly more in building something from the ground-up than you do when code-reviewing someone else's code. In my experience you really don't build the mental model of how the code is working unless you either build it yourself, refactor it yourself, or you have to heavily debug it to fix a bug or something.
I actually think this helps in that learning - it's sitting alongside a more experienced expert doing it and seeing what they came up with.
In the same sense that the best way to learn to write is often to read a lot, whether English or code. Of course, you also have to do it, but having lots of examples to go on helps.
Based on my usage of Claude Code, I would not trust it with anything so major.
My problem with it is that it produces _good looking_ code that, at a glance, looks 'correct', and occasionally even works. But then I look at it closely, and it's actually bad code, or has written unnecessary additional code that isn't doing anything, or has broken some other section of the app, etc.
So if you don't know enough C# to tell whether the C# it's spitting out is good or not, you're going to have a bad time.
Cursor has made writing C++ like a scripting language for me. I no longer wrestle with arcane error messages, they go straight into Cursor and I ask it to resolve and then from its solution I learn what my error was.
Open your C++ project in Cursor. Before anything else ask it to review the codebase and tell you what the codebase does so you can understand its power. Play around asking it to find sections of the code that handle functionality for certain features. It should really impress you.
Continue to work on it in your preferred IDE, let’s say Visual Studio. When you hit your first compile error, just for fun, even if you understand the error, copy and paste it into Cursor and ask it to help you understand the cause and propose a solution. Ask it to implement it, attempt to compile, give it back any further errors that its solution may have to review and fix. You will eventually compile.
Then before you go back to work writing your next task, ask Cursor to propose how it might complete the task. After the proposal review and either tell it to proceed to implement or suggest tweaks or better alternatives. For complex tasks try setting the model manually to o3 and rerunning the same prompt and you can see how it thinks much better and can one shot solutions to complex errors. I try to use auto and if it fails on more complex tasks I resubmit the original query with o3. If o3 fails then you may have to gather more context by hand and really hold its hand through the chain of reasoning. That’s for a future post.
More advanced: create a build.bat script that Cursor can run after it has implemented code, so it can see its own errors and you can avoid the copy-paste round trip. (Look into Cursor rules for this, but a rule prompt that says 'after implementing any significant code changes please run .\build.bat and review and fix any further errors' works.) This simple efficiency should let you experience the real productivity behind Cursor: you’re no longer dying the death of 1000 cuts, losing a minute here or a minute there on rote, time-consuming steps, and you can start to operate from a higher, natural-language level and really feel the ‘flow’.
Typing out the code is just an annoying implementation detail. You may feel ‘competency leaving your fingers’ as DHH might say but I’d argue you can feel your ass filling up with rocket fuel.
It really depends on how much actual logic you implement in GDScript. It is really slow though, even slower than Python as far as I know. So if you’re doing anything beyond gluing engine calls together (e.g. writing complicated enemy logic) it’s easy to run into performance issues. The “official” way to deal with that is to create GDExtensions for the slow stuff, but at that point you might as well do everything in C# (imo).
It’s easy to convince yourself that code is going to be fast enough, but games run into bottlenecks really quickly, and it also makes your stuff inaccessible to people who don’t have great hardware.
It depends how you use it. You can ask Claude Code for instructions to migrate the code yourself, and it will be a teacher. Or you can ask it to create a migration plan and then execute it, in which case learning will of course be very limited. I recommend doing the conversion in smaller steps if possible. We tried to migrate a project just for fun in one single step and Claude Code failed miserably (it itself thought it had done a terrific job), but doing it in smaller chunks worked out quite well.
Is it really that much better than cursor’s agent? I’m hesitant to try because it would be out of pocket and I get cursor for free (work). It’s hard to understand how it could be that different if both are using sonnet under the hood.
I see a lot of comments here gushing about CC but I've used it and I really don't get it. I find that it takes me just as long to explain to it what I need done as it takes to just do the work myself.
What's happening is that we are being bombarded by marketing on all fronts. These gushing statements are no different from the testimonials and advertorials from the days of yore.
It’s absolute lunacy to think everyone lauding Claude code is paid marketing/shilling.
What’s actually happening is there’s a divide being created between engineers that know how to use it, and engineers that don’t or want to convince themselves that it’s useless or whatever.
I don't believe anything I see online anymore. Payola is everywhere and people are happy to sell their professional souls for little more than likes and free LLM credits.
Deeper still is this: The group that openly relies on LLMs to code is likely the first group to be replaced by LLMs, since they've already identified the recipe that the organization needs to replace them.
More broadly, we live in an age where marketing is crucial to getting noticed, where good work alone is not sufficient and you have the perfect scenario for people to market themselves out of the workforce.
There is also a group of engineers who like to... engineer stuff? I really do enjoy writing code by myself; it gives me dopamine. The reason I learnt to talk to machines is that I don't like talking to people, so I don't fancy talking to machines as if they were human beings.
You can still do both. It's just that all of the grunt work is no longer grunt work, and all the tech debt you've been putting off is no longer an issue, and all of the ideas you've been meaning to try out but don't have the time can suddenly be explored in an afternoon, and so on and so forth.
For new features, by all means code it by hand. Maybe that is best! But a codebase is much more than new features. Invaluable tool.
If you don't see how that fits into "group 2" in GP's comment even though it wasn't explicitly called out, then we may have identified why you don't find agentic coding to be enjoyable.
There are also those who code in languages that are not the most popular, on operating systems that are not the most popular, or frameworks that are not the most popular.
I have been coding with Claude Code for about 3 weeks and I love it. I have about 10 YoE and mostly do Python ML / Data Eng. Here are a few reasons:
1. It takes away the pain of starting. I have no barrier to writing text, but there is a barrier to writing the first line of code, to a large extent coming from just remembering the context: where to import what from, setting up boilerplate, etc.
2. While it works I can use my brain capacity to think about what I'm doing.
3. I can now do multiple things in parallel.
4. It makes it so much easier to "go the extra mile" (I don't add "TODOs" anymore in the code I just spin up a new Claude for it)
5. I can do much more analysis (like spinning up detailed plotting / analysis scripts).
6. It fixes most simple linting/typing/simple test bugs for me automatically.
Overall I feel like this kind of coding allows me to focus on the essence: What should I be doing? Is the output correct? What can we do to make it better?
Taking away the pain of starting is a big one. It lets me do things I would never have done because they'd just go on the “if only I had time” wish list.
Now literally between prompts, I had a silly idea to write a NYT Connections game in the terminal and three prompts later it was done:
https://github.com/jleclanche/connections-tui
> 4. It makes it so much easier to "go the extra mile" (I don't add "TODOs" anymore in the code I just spin up a new Claude for it)
This especially. I've never worked at a place that didn't skimp on tests or tech debt due to limited resources. Now you can get a decent test suite just from saying you want it.
Will it satisfy purists? No, but lots of mid hanging fruit long left unpicked can now be automatically picked.
I've actually gone through and tried to refactor the tests Claude writes (when I ask it to only touch test files). I can't improve them, generally speaking. Often they're limited by architectural or code style choices in the main code. And there are minor stylistic things here or there.
But the bulk of it is that you get absolutely top-tier tests for the same level of effort as a half-assed attempt.
If you set it up with good test quality tools (mutation testing is my favorite) it goes even further - beyond what I think is actually reasonable to ask a human to test unless you're e.g. writing life and safety critical systems.
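For anyone who hasn't tried mutation testing, the idea in a nutshell (a toy Python sketch I made up for illustration, not output from any tool; real tools like mutmut generate and run the mutants for you): the tool makes small edits to your code and checks that at least one test fails. A mutant that survives points at behaviour your suite never pins down.

    # Toy illustration of mutation testing (invented example, not a tool run).
    def free_shipping(total):
        """Code under test: orders of 50 or more ship free."""
        return total >= 50

    # One mutant a tool might generate: ">=" weakened to ">".
    def free_shipping_mutant(total):
        return total > 50

    def weak_suite():
        # Passes against the original AND the mutant, so the mutant "survives":
        # the suite never checks the boundary.
        assert free_shipping(100) and not free_shipping(10)
        assert free_shipping_mutant(100) and not free_shipping_mutant(10)

    def strong_suite():
        # Checks the boundary; run against the mutant this assertion would fail,
        # i.e. the mutant is "killed".
        assert free_shipping(50)

    if __name__ == "__main__":
        weak_suite()
        strong_suite()
        print("ok")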
As one of the curious minority who keeps trying agentic coding but not liking it, I've been looking for explanations why my experience differs from the mainstream. I think it might lie in this nugget:
> I believe with Claude Code, we are at the “introduction of photography” period of programming. Painting by hand just doesn’t have the same appeal anymore when a single concept can just appear and you shape it into the thing you want with your code review and editing skills.
The comparison seems apt and yet, still people paint, still people pay for paintings, still people paint for fun.
I like coding by hand. I dislike reviewing code (although I do it, of course). Given the choice, I'll opt for the former (and perhaps that's why I'm still an IC).
When people talk about coding agents as very enthusiastic but very junior engineering interns, it fills me with dread rather than joy.
> still people paint, still people pay for paintings
But in what environment? It seems to me that most of the crafts that have been replaced by the assembly line are practiced not so much for the product itself, but for an experience both the creator and the consumer can participate in, at least in their imagination.
You don't just order such artifacts on Amazon anonymously; you establish some sort of relationship with the artisan and his creative process. You become part of a narrative. Coding is going to need something similar if it wants to live in that niche.
I don't think it's an entirely apt comparison. In the past painting was the only way to depict real world events, but painting is also art, and it often doesn't necessarily depict reality, but the artist's interpretation of it. That is why people still paint.
So yeah if you like coding as an art form, you can still keep doing that. It's probably just a bit harder to make lots of money with it. But most people code to make a product (which in itself could be a form of art). And yeah if it's faster to reach your goals of making a product with the help of AI, then the choice is simple of course.
But yeah in a way I'm also sad that the code monkey will disappear, and we all become more like the lead developer who doesn't really program anymore but only guides the project, reviews code and makes technical decisions. I liked being the code monkey, not having to deal a lot with all the business stuff. But yeah, things change you know.
I totally get this side of things. I see the benefits of Agentic coding for small tasks, minor fixes, or first drafts. That said, I don't understand the pseudo-tribalism around specific interfaces to what amounts to only a few models under the hood and worry about what its doing for (or not doing for) junior devs.
Also, if we could get AI tooling to do the reviews for us reliably, I'd be a much happier developer.
A more apt metaphor is moving from hand-tools to power-tools.
The painting/photography metaphor stretches way too far imo - photography was fundamentally a new output format, a new medium, an entirely new process. Agentic coding isn't that.
I'm a heavy user of Claude Code and I use it like a coding assistant.
How well you can manage a development team in real life correlates strongly with how much value you get out of an LLM based coding assistant.
If you can't describe what success looks like, expect people to read your mind, and get angry at validating questions, then you will have problems both with coding assistants and leading teams of developers.
Why call that vibe coding, though? If you're reviewing, understanding, and testing everything that the coding assistant outputs, then you aren't vibe coding.
If you're just letting the coding assistant do its thing, uncritically, and committing whatever results, then you're vibe coding.
It sounds like you're not vibe coding. That's good. No need to throw away a useful term (even if it's a weird, gen-Z sounding term) that describes a particular (poor) way to use an LLM.
The point that I'm probably missing (and others are too) is that we associate the phrase 'Vibe Coding' with 'using an LLM to help with coding', and they're not the same.
Maybe the critics of Vibe Coding need to remember that all users of LLMs for coding support aren't about to regret their life choices.
Well said. The skills involved are actually quite a bit different than coding. It's about how clearly and accurately you can describe things, and how good you are at understanding what tooling you need to build to improve your experience. It's a different skillset.
I don't think that's true with current-gen models. You can even go so far as to write pseudocode for the LLM to translate to a real language, and for anything out-of-the-box my experience is that it will blatantly ignore your instructions. A baseline-competent junior at least has the context to know that if there are 5 steps listed and they only did 3 then there's probably a problem.
Prompting an LLM is definitely a different skillset from actually coding, but just "describing it better" isn't good enough.
I don't believe it is good enough; it's also not as relevant.
My prompts have gotten shorter and shorter. It's the hooks and subagents, and using the tools, that matter far more.
This is a thread about Claude Code; the other LLMs don't matter. Nothing ever blatantly ignores my instructions in Claude Code; that's a thing of the past.
Of course, if you're not using Claude Code, sure. But all my instructions are followed with my setup. That really isn't an issue for me personally anymore.
Your experience echoes my own for sufficiently trivial tasks, but I haven't gotten any of this to work for the actual time-consuming parts of my job. It's so reliably bad for some tasks that I've reworked them into screening questions for candidates trying to skate by with AI without knowing the fundamentals. Is that really not your experience, even with claude code?
Right, and I wasn't able to get this to work for any actual time-consuming parts of my job either, until last weekend, with sub-agents: running head-to-head battles between sub-agents, selecting the best one, and repeating.
Last weekend I did nothing but have different ideas battle it out against each other, with me selecting the most successful one, and repeating.
And now, my experience is no longer the same. Before last weekend, i had the same experience you are describing.
The suggested ones are terrible, and its guidance is terrible.
Last weekend I ran head to head tests of agents against each other with a variety of ideas, and selected the best one, and did it again. It has caused me to have a very specific subagent system, and I have an agent who creates those.
I try to use Claude Code a lot, but I keep getting very frustrated with how slow it is and how it always does things wrong. It does not feel like it's saving me any mental energy on most tasks. I do gravitate towards it for some things, but then I am sometimes burned by doing that and it's not pleasant.
For example, last week I decided to play with nushell. I have a somewhat simple .zshrc, so I just gave it to Claude and asked it to convert it to nushell. The nu it generated was, for the most part, not even valid; I spent 30 minutes with it and it never worked. It took me about 10 minutes in the docs to convert it myself.
So it's miserable experiences like that that make me want to never touch it, because I might get burned again. There are certainly things that I have found value in, but it's so hit or miss that I just find myself not wanting to bother.
Have you tried the context7 MCP? For things that are not mainstream (nowhere near JavaScript/TypeScript levels of popularity), the LLM might struggle. I usually have better results using something like context7, where it can pull up more relevant, up to date examples.
I only use 2 MCP servers, and those are context7 and perplexity. For things like updated docs, I have it ask context7. For the more difficult technical tasks where I think it's going to stumble, I'll instruct Claude Code to ask perplexity and that usually resolves it. Or at least it'll surface up to me in our conversation so that we both are learning something new at that point.
For some new stuff I'm working on, I use Rails 8. I also use Railway for my host, which isn't as widely-used as a service like Heroku, for example. Rails 8 was just released in November, so there's very little training data available. And it takes time for people to upgrade, gems to catch up, conversations to bubble up, etc. Operating without these two MCP servers usually caused Claude Code to repeatedly stumble over itself on more complex or nuanced tasks. It was good at setting up the initial app, but when I started getting into things like Turbo/Stimulus, and especially for parts of the UI that conditionally show, it really struggled.
It's a lot better now - it's not perfect, but it's significantly better than relying solely on its training data or searching the web.
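If it helps anyone wire this up: Claude Code can pick up MCP servers from a project-level .mcp.json (or via `claude mcp add`). Something shaped like the sketch below is what I mean; treat the package names as placeholders to verify against each server's own docs rather than as gospel.

    {
      "mcpServers": {
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        },
        "perplexity": {
          "command": "npx",
          "args": ["-y", "<perplexity-mcp-package>"],
          "env": { "PERPLEXITY_API_KEY": "<your key>" }
        }
      }
    }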
I've only used Claude Code for like 4 weeks, but I'm learning a lot. It feels less like I'm an IC doing this work, and my new job is (1) product manager that writes out clear PRDs and works with Claude Code to build it, (2) PR reviewer that looks at the results and provides a lot of guidance, (3) tester. I allocate my time 50%/20%/30% respectively.
Thanks, I’ll check out Perplexity. We seem to be using a similar stack. I’m also on Rails 8 with Stimulus, Hotwire, esbuild, and Tailwind.
Playwright MCP has been a big help for frontend work. It gives the agent faster feedback when debugging UI issues. It handles responsive design too, so you can test both desktop and mobile views. Not sure if you know this, but Claude Code also works with screenshots. In some cases, I provide a few screenshots and the agent uses Playwright to verify that the output is nearly pixel perfect. It has been invaluable for me and is definitely worth a try if you have not already.
This is basically my experience with it. I thought it'd be great for writing tests, but every single time, no matter how much coaxing, I end up rewriting the whole thing myself anyway. Asking it for help debugging has not yet yielded good results for me.
For extremely simple stuff, it can be useful. I'll have it parse a command's output into JSON or CSV when I'm too lazy to do it myself, or scaffold an empty new project (but like, how often am I doing that?). I've also found it good at porting simple code from, like, Python to JavaScript or TypeScript to Go.
But the negative experiences really far outweigh the good, for me.
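To be concrete about the "extremely simple stuff" that does pay off for me: throwaway parsers, like turning `df -h` output into CSV. Roughly this kind of thing (my own sketch, not verbatim model output; field handling is simplified and platform-dependent):

    # Sketch of a throwaway parser: `df -h` output -> CSV on stdout.
    # Assumes the common 6-column Linux layout; adjust for your platform.
    import csv
    import subprocess
    import sys

    def df_to_csv(out=sys.stdout):
        text = subprocess.run(["df", "-h"], capture_output=True, text=True,
                              check=True).stdout
        writer = csv.writer(out)
        writer.writerow(["filesystem", "size", "used", "avail", "use_pct", "mount"])
        for line in text.strip().splitlines()[1:]:   # skip the header row
            parts = line.split(None, 5)              # mount point may contain spaces
            if len(parts) == 6:
                writer.writerow(parts)

    if __name__ == "__main__":
        df_to_csv()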
Really agree with the author's thoughts on maintenance here. I've run into a ton of cases where I would have written a TODO or made a ticket to capture some refactoring and instead just knocked it out right then with Claude. I've also used Claude to quickly try out a refactoring idea and then abandoned it because I didn't like how it came out. It really lowers the activation energy for these kinds of maintenance things.
Letting Claude rest was a great point in the article, too. I easily get manifold value compared to what I pay, so I haven't got it grinding on its own on a bunch of things in parallel and offline. I think it could quickly be an accelerator for burnout and cruft if you aren't careful, so I keep to a supervised-by-human mode.
Has anyone had their own experience of how Claude or similar AI agents perform in large (1M+ lines) legacy code bases? To give a bit more context, I work on a Java code base that is 20+ years old. It was continuously updated and expanded but contains mostly spaghetti code. Would Claude add any value here?
Agreed. CC lets you attempt things that you wouldn’t have dared to try. For example here are two things I recently added to the Langroid LLM agent framework with CC help:
Nice collapsible HTML logs of agent conversations (inspired by Mario Zechner’s Claude-trace), which took a couple hours of iterations, involving HTML/js/CSS:
A migration from Pydantic-v1 to v2, which took around 7 hours of iterations (would have taken a week at least if I even tried it manually and still probably wouldn’t have been as bullet-proof):
I appreciate that Orta linked to my "Full-breadth Developers" post here, for two reasons:
1. I am vain and having people link to my stuff fills the void in my broken soul
2. He REALLY put in the legwork to document in a concrete way what it looks like for these tools to enable someone to move up a level of abstraction. The iron triangle has always been Quality, Scope, Time. This innovation is such an accelerant that ambitious programmers can now imagine game-changing increases in scope without sacrificing quality and in the same amount of time.
For this particular moment we're in, I think this post will serve as a great artifact of what it felt like.
So far what I've noticed with Claude Code is not _productivity gains_ but _gains in my thoughtfulness_.
As in, the former is hyped, but the latter (stopping to ask questions, reflect, consider what we should do) is really powerful. I find I'm more thoughtful, doing deeper research, and asking deeper questions than if I was just hacking something together on the weekend that I regretted later.
Agreed. The most unique thing I find with vibecoding is not that it presses all the keyboard buttons. That’s a big timesaver, but it’s not going to make your code “better” as it has no taste. But what it can do is think of far more possibilities than you can far quicker. I love saying “this is what I need to do, show me three to five ways of doing it as snippets, weigh the pros and cons”. Then you pick one and let it go. No more trying the first thing you think of, realizing it sucks after you wrote it, then back to square one.
I use this with legacy code too. “Lines n—n+10 smell wrong to me, but I don’t know why and I don’t know what to do to fix it.” Gemini has done well for me at guessing what my gut was upset about and coming up with the solution. And then it just presses all the buttons. Job done.
It's less that I'm a skeptic, but more that I'm finding I intensely abhor the world we're building for ourselves with these tools (which I admittedly use a lot).
The answer can and certainly will fill many books, dissertations, PhD theses, etc.
Without going too philosophical, although one is not unjustified in going there, and just focusing on my own small professional corner (software engineering): these llm developments mostly kill an important part of thinking and might ultimately make me dumber. For example, I know what a B tree is and can (could) painstakingly implement one when and if I needed to, the process of which would be long, full of mistakes and learning. Now, just having a rough idea will be enough, and most people will never get the chance to do it themselves.
Now B-tree is an intentionally artificial example, but you can extrapolate that to more practical or realistic examples.
On a more immediate front, there's also the matter of the threat to my livelihood. I have significant expenses for the foreseeable future, and if my line of work gets a 100x or even 10x average productivity boost, there just might be fewer jobs going around. Farm ox watching the first internal combustion tractors.
I can think of many other reasons, but those are the most pressing and personal to me.
Not the GP but we're descending into a world where we just recycle the same "content" over and over. Nothing will be special, there'll be nothing to be proud of. Just constant dopamine hits administered by our overlords. Read Brave New World if you haven't.
I have and I don't see the connection with AI-assisted coding.
If your comment was about "generative AI in general" then I think this is the problem with trying to discuss AI on the internet at the moment. It quickly turns into "defend all aspects of AI or else you've lost". I can't predict all aspects of AI. I don't like all aspects of AI and I can't weigh up the pros and cons of a vast number of distinct topics all at once. (and neither, I suspect, can anyone else)
I think it's possible Claude Code might be the most transformative piece of software since ChatGPT. It's a step towards an AI agent that can actually _act_ at a fundamental level - with any command that can be found on a computer - in a way that's beyond the sandboxed ChatGPT or even just driving a browser.
I'm most interested in how well these tools can tackle complex legacy systems.
We have tonnes of code that's been built over a decade with all kinds of idioms and stylistic conventions that are enforced primarily through manual review. This relates in part to working in a regulated environment where we know certain types of things need radical transparency and auditability, so writing code the "normal" way a developer would is problematic.
So I am curious how well it can see the existing code style and then implicitly emulate that? My current testing of other tools seems to suggest they don't handle it very well; typically I am getting code that looks very foreign to the existing code. It exhibits the true "regression to the mean" spirit of LLMs where it's providing me with "how would the average competent engineer write this", which is not at all how we need the code written.
Currently, this is the main barrier to us using these tools in our codebase.
You need to provide agentic tools with enough context about the project so they can find their way around. In Claude Code this is typically done via a CLAUDE.md document at the root of the codebase.
I work on Chromium and my experience improved immensely by using a detailed context document (~3,000 words) with all sorts of relevant information, from the software architecture and folder organisation to the C++ coding style.
(The first draft of that document was created by Claude itself from the project documentation.)
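For anyone starting from zero on a smaller project, even a one-page document helps. A generic skeleton might look like the following (section and folder names are placeholders; adapt them to your codebase):

    # CLAUDE.md  (project root)

    ## What this project is
    One paragraph on purpose and the main user-facing features.

    ## Architecture and folder layout
    - src/    core logic, organised by feature
    - ui/     presentation layer, follows our component conventions
    - test/   mirrors the src/ structure

    ## Coding style
    Language and framework versions, house idioms (error handling, naming,
    logging), and things the model must never do.

    ## How to build and test
    The exact commands to run, and what "done" means for a change.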
I've had a lot of luck with Claude on my 8 year old, multi-language codebase. But I do have to babysit it and provide a lot of context.
I created some tutorial files which contain ways to do a lot of standard things. Turns out humans found these useful too. With the examples, I've found Opus generally does a good job following existing idioms, while Sonnet struggles.
Ultimately it depends on how many examples in that language showed up on Stack Overflow or in public GitHub repos. Otherwise, YMMV if it's not Python, C++, Rust or JavaScript.
I wish I got this level of productivity. I think every article should list exactly what they asked the LLM to do, because I'm not getting as much use from it and I don't know if it's because what I work on is rare compared to, say, website front and backend code, and/or if I just suck at prompts/context, or I'm using the wrong services, or don't have the correct MCPs, etc.
Is it possible to view the prompt history? I’ve had extreme levels of productivity and would love to list out how I’ve been using it generally for an article like this but it would be incredibly impractical to log it on the side.
With Claude Code at least, all of the chats you've had are stored in jsonl files on your computer in ~/.claude - I made a little TUI for exploring these in https://github.com/orta/claude-code-to-adium
Personally, I'm less sold on tracking prompts as being valuable, both for production cases (imo "if a human should read it, a human should have written/fully edited it" applies to commits/PRs/docs etc.) and for vibe cases, where the prompts are more transitory.
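That said, if you just want to skim the prompts in those jsonl files without a TUI, a few lines of Python gets you there (the on-disk schema isn't documented, so the key names here are best-effort guesses; adjust as needed):

    # Skim user prompts from Claude Code transcripts under ~/.claude.
    # The jsonl schema isn't documented; "type"/"message"/"content" are guesses.
    import json
    from pathlib import Path

    def iter_user_prompts(root=Path.home() / ".claude"):
        for path in sorted(root.rglob("*.jsonl")):
            for line in path.read_text(errors="ignore").splitlines():
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue
                if entry.get("type") == "user" and isinstance(entry.get("message"), dict):
                    content = entry["message"].get("content")
                    if isinstance(content, str):
                        yield path.name, content

    if __name__ == "__main__":
        for session, prompt in iter_user_prompts():
            print(f"[{session}] {prompt[:120]}")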
I really want to record a live commentary of me working with claude. Maybe that's something you could think about.
I feel like the results I get are qualitatively superior to anything I've seen anyone I've worked with produce. The fact that it's a lot faster is just gravy on top.
This is the saddest part of this whole thing for me. You consider the prompts and config to be the real source code, but those are just completely lost into the ether. Even if you saved the prompts you can't reproduce their effects.
Then there's the question of how do other developers contribute to the code. They don't have your prompts, they just have the code.
So, no, prompts are not source code; that's why I ask people to just show the code they are producing, and nobody ever does.
I also make my design documents (roughly, the prompts generated by my prompts) into committed markdown documents. So I show the second-tier prompts at least; you could consider those an intermediate language representation if you like.
> Then there's the question of how do other developers contribute to the code. They don't have your prompts, they just have the code.
I usually try to commit the initial prompts and adjustments. I don't commit trivial things like "That's not quite right, try doing X again" or "Just run the entire test suite"
I've found that if you're working on something where it hasn't seen a billion examples, you have to give it additional information, like an academic paper or similar. And get it to summarize its understanding every once in a while, so you can pick up the idea in another chat (once the context gets too long) without having to explain it all again, but also to ensure it's not going off the rails... as they tend to do.
They know a lot about a lot of things but the details get all jumbled up in their stupid robot brains so you have to help them out a bunch.
I have nearly 20 years of experience in technology, and have been writing toy scripts or baby automations for most of my career. I started out in a managed services help desk and took that route many folks take across and around the different IT disciplines.
I mostly spend my days administering SaaS tools, and one of my largest frustrations has always been that I didn’t know enough to really build a good plugin or add-on for whatever tool I was struggling with, and I’d find a limited set of good documentation or open source examples to help me out. With my limited time (full time job) and attendant challenges (ADHD & autism + all the fun trauma that comes from that along with being Black, fat and queer), I struggled to ever start anything out of fear of failure or I’d begin a course and get bored because I wasn’t doing anything that captured my imagination & motivation.
Tools like Claude Code, Cursor, and even the Claude app have absolutely changed the game for me. I’m learning more than ever, because even the shitty code that these tools can write is an opportunity for debugging and exploration, but I have something tangible to iterate on. Additionally, I’ve found that Claude is really good at giving me lessons and learning based on an idea I have, and then I have targeted learning I can go do using source docs and tutorials that are immediately relevant to what I’m doing instead of being faced with choice paralysis. Being able to build broken stuff in seconds that I want to get working (a present problem is so much more satisfying than a future one) and having a tool that knows more than I do about code most of the time but never gets bored of my silly questions or weird metaphors has been so helpful in helping me build my own tools. Now I think about building my own stuff first before I think about buying something!
ADHD here, and Claude Code has been a game changer for me as well. I don't get sidetracked getting lost in documentation loops, suffer decision paralysis, or forget what I'm currently doing or what I need to do next. It's almost like I'm body doubling with Claude Code.
Over the holidays I built a plan for an app that would be worthwhile to my children, oldest son first. That plan developed to several thousand words of planning documents (MVP, technical stack, layout). That was just me lying in the sun with Claude on mobile.
Today I (not a programmer, although programming for 20+ years, but mostly statistics) started building with Claude Code via Pro. Burned through my credits in about 3 hours. Got to MVP (happy tear in my eye). Actually one of the best looks I've ever gotten from my son. A look like, wow, dad, that's more than I'd ever think you could manage.
Tips:
- Plan ahead! I've had Claude tell me that a request would fit better way back on the roadmap. My roadmap manages me.
- Force Claude to build a test suite and give debugging info everywhere (backend, frontend).
- Claude and I work together on a clear TODO. It needs guidance as much as I do. It forgot a very central feature of my MVP; I do not yet know why. I asked kindly and it was built.
Questions (not specifically to you kind HN-folks, although tips are welcome):
- Why did I burn through my credits in 3 hours?
- How can I force Claude to keep committed to my plans, my CLAUDE.md, etc.
- Is there a way to ask Claude to check the entire project for consistency? And/or should I accept that vibing will leave cruft spread around?
I'm on a pro plan, also run into limits within 2 hours, then have to wait until the limits of the 5 hour window reset (next reset is in 1 hour 40 minutes at 2am)...
You can just ask Claude to review your code, write down standards, and verify that the code is produced according to those standards and guidelines. And if it finds that the project is not consistent, ask it to make a plan and execute on the plan.
Yeah, not going to lie, working at Google and having unlimited access to Gemini sure is nice (even if it has performance issues vs Claude Code… I can’t say as I can’t use it at work)
To me it feels like we're in the VC subsidized days for tools like Claude Code. Given how expensive we know GPU usage is and that it's not likely to come down, and these companies will need to eventually be profitable, I wonder if we're all heading for a point where ultimately Claude Code and the like will be like $2K per month instead of $200 on the high end.
Good article, but fwiw, I think GraphQL is a bane for web dev for 90% of projects. It overcomplicates, bloats, and doesn't add anything over regular OpenAPI specs for what is usually just CRUD resource operations.
It seems to be great at writing tests, spitting out UI code, and many other things where there are many examples around.
Among other things I work on database optimizers, and there Claude fails spectacularly. It produces wrong code, fails to find the right places to hook up an abstraction, overlooks effects on other parts of the code, and generally confidently proposes changes that simply do not work at all (to put it mildly).
Your mileage may vary... It seems to depend heavily on the amount of existing (open) code around.
Being able to do big refactors quickly in the moment really helps in a solo dev environment, but in a team it puts a lot of review (and QA) burden on everyone else. It makes me wonder if we're moving towards a team model where individuals own different parts of the system, rather than everyone reviewing each other's code and working together.
Anybody had similarly good experience with Gemini CLI? I'm only a hobbyist coder, so paying for Claude feels silly when Gemini is free (at least for now), but so far I've only used it inside Cline-like extensions
I’ve used both. Claude more extensively. I’ve had good results with Gemini too, however it seems easier to get stuck in a loop. Happens with Claude too but not quite as frequent.
By loop I mean you tell it no don’t implement this service, look at this file instead and mimic that and instead it does what it did before.
Using two different VS code profiles is an interesting idea beyond AI. I get confused every time I have different projects open because code always looks the same. Maybe it would make sense to have a different theme for every project.
Every time you use these tools irresponsibly, for instance for what I like to call headless programming (vibe coding), understand that you are incurring tech debt. Not just in terms of your project but personal debt regarding what you SHOULD have learned in order to implement the solution.
It’s like using ChatGPT in high school: it can be a phenomenal tutor, or it can do everything for you and leave you worse off.
The general lesson from this is that Results ARE NOT everything.
Another really nice use case is building very sophisticated test tooling. Normally a company might not allocate enough resources to a task like that, but with Claude Code it's a no brainer. It can also create very sophisticated mocks, like, say, a db mock that can parse all queries in the codebase and apply them to in-memory fake tables. That would be a total pain to build and maintain by hand, but with Claude Code it takes literally minutes.
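To make the db-mock idea concrete, here is a stripped-down illustration (Python just for illustration, names invented; it only understands two trivial statement shapes, whereas the real thing you'd have Claude generate parses the queries your codebase actually issues):

    # Toy in-memory db fake: intercept a tiny subset of SQL and apply it to
    # plain dicts so unit tests never touch a real database.
    import re

    class FakeDb:
        def __init__(self):
            self.tables = {}  # table name -> list of row tuples

        def execute(self, sql, params=()):
            m = re.match(r"INSERT INTO (\w+) VALUES", sql, re.I)
            if m:
                self.tables.setdefault(m.group(1), []).append(tuple(params))
                return []
            m = re.match(r"SELECT \* FROM (\w+)", sql, re.I)
            if m:
                return list(self.tables.get(m.group(1), []))
            raise NotImplementedError(f"unsupported query: {sql}")

    db = FakeDb()
    db.execute("INSERT INTO users VALUES", ("ada", 1815))
    assert db.execute("SELECT * FROM users") == [("ada", 1815)]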
Sure, but that's pretty orthogonal to the above. Say I had to create a parser+mapper for a domain-specific query language that then gets mapped to a number of SQL backends. I used CC to create a custom test harness that can use SQLC-style files to drive tests. They are way more readable and much easier to maintain vs plain Rust tests for this task. It took half a day to create with CC; it would probably have taken a week without it.
In my experience they are great for test tooling. For actual tests, after I have covered a number of cases it's very workable to tell it to identify gaps and edge cases and propose tests; then I'd say I accept about 70% of its suggestions.
While people's experience with LLMs is pretty varied and subjective, saying they're bad at writing tests just isn't true. Claude Code is incredible at writing tests and testing infrastructure.
It's worth mentioning that one should tell CC not to overmock, and to produce only truly relevant tests. I use an agent that I invoke to spot this stuff, because I've run into some truly awful overmocked non-tests before.
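For anyone who hasn't seen one, an "overmocked non-test" is the pattern where the thing under test is itself mocked away, so the assertion can never fail. A contrived Python example (names made up):

    from unittest import mock

    def apply_discount(price, pct):
        return round(price * (1 - pct / 100), 2)

    def test_discount_overmocked():
        # Mocks the function under test, so this passes no matter what the
        # real code does -- exactly the kind of non-test to tell CC to avoid.
        with mock.patch(f"{__name__}.apply_discount", return_value=90.0):
            assert apply_discount(100, 10) == 90.0

    def test_discount_real():
        # Exercises the actual logic, including a rounding case.
        assert apply_discount(100, 10) == 90.0
        assert apply_discount(19.99, 15) == 16.99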
> Painting by hand just doesn’t have the same appeal anymore when a single concept can just appear and you shape it into the thing you want with your code review and editing skills.
Meanwhile, one of the most anticipated games in the industry, the second chapter of an already acclaimed product, has its art totally hand painted.
I think it's two schools of thought, end product vs process. It seems a lot of people who like AI only care about getting the end product, and don't care how it was made. On the other hand, some people are invested in how something is made, and see the process of creation and refinement as a part of the end product itself.
People are not invested in how something is made, imo; it's just better and more beautiful and connects better with them. I agree though that AI enthusiasts only care about getting stuff done, but I wouldn't call it an "end-product".
People really aren't going to like this, but OP is directionally correct.
At 100 dev shop size you're likely to have plenty of junior and middling devs, for whom tools like CC will act as a net negative in the short-mid term (mostly by slowing down your top devs who have to shovel the shit that CC pushes out at pace and that junior/mids can't or don't catch). Your top devs (likely somewhere around 1/5 of your workforce) will deliver 80% of the benefit of something like CC.
We haven't been hiring junior or even early-mid devs since around Mar/Apr. These days they cost $200/mo + $X in API spend. There's a shift in the mind-work of how "dev" is being approached. It's... alarming, but it's happening.
Is this the one that goes 'Oh no, I accidentally your whole codebase, I suck, I accept my punishment and you should never trust me again' or is that a different one?
I seem to remember the 'oh no I suck' one comes out of Microsoft's programmer world? It seems like that must be a tough environment for coders if such feelings run so close to the surface that the LLMs default to it.
I'm NOT saying it is, but without regulatory agencies having a look, or it being open source, this might well be working as intended, since Anthropic makes more money out of it.
I think the most interesting change Claude enables is letting AI try stuff. We do this all the time.
I have this sense this works best in small teams right now, because Claude wants to produce code changes and PRs. Puzzmo, where OP works, is <5 engineers.
In larger codebases, PRs don't feel like the right medium in every case for provocative AI explorations. If you're going to kick something off before a meeting and see what it might look like to solve it, it might be better to get back a plan, or a pile of regexps, or a list of teams that will care.
Having an AI produce a detailed plan for larger efforts, based on an idea, seems amazing.
Coding agents are empowering, but it is not well appreciated that they are setting a new baseline. It will soon not be considered impressive to do all the things that the author did, but expected. And you will not work less but the same hours -- or more, if you don't use agents.
Despite this, I think agents are a very welcome new weapon.
Does Claude Code use a different model than Claude.ai? Because Sonnet 4 and Opus 4 routinely get things wrong for me. Both of them have sent me on wild goose chases, where they confidently claimed "X is happening" about my code but were 100% wrong. They also hallucinated APIs, and just got a lot of details wrong in general.
The problem-space I was exploring was libusb and Python, and I used ChatGPT and also Claude.ai to help debug some issues and flesh out some skeleton code. Claude's output was almost universally wrong. ChatGPT got a few things wrong, but was in general a lot closer to the truth.
AI might be coming for our jobs eventually, but it won't be Claude.ai.
The reason that claude code is “good” is because it can run tests, compile the code, run a linter, etc. If you actually pay attention to what it’s doing, at least in my experience, it constantly fucks up, but can sort of correct itself by taking feedback from outside tools. Eventually it proclaims “Perfect!” (which annoys me to no end), and spits out code that at least looks like it satisfies what you asked for. Then if you just ignore the tests that mock all the useful behaviors out, the amateur hour mistakes in data access patterns, and the security vulnerabilities, it’s amazing!
You're right, but you can actually improve it pretty dramatically with sub agents. Once you get into a groove with sub agents, it really makes a big difference.
I stopped writing as much code because of RSI and carpal tunnel but Claude has given me a way to program without pain (perhaps an order of magnitude less pain). As much as I was wanting to reject it, I literally am going to need it to continue my career.
You aren't the first person I have heard say this. It's an under-appreciated way in which these tools are a game-changer. They are a wonderful gift to those of us prone to RSI, because they're most good at precisely the boilerplate repetitive stuff that tends to cause the most discomfort. I used to feel slightly sick every time I was faced with some big piece of boilerplate I had to hammer out, because of my RSI and also because it just makes me bored. No longer. People worry that these tools will end careers, but (for now at least) I think they can save the careers of more than a few people. A side-effect is I now enjoy programming much more, because I can operate at a level of abstraction where I am actually dealing with novel problems rather than sending my brain to sleep and my wrists to pain hell hammering out curly braces or yaml boilerplate.
Now that you point this out, since I started using Claude my RSI pain is virtually non-existent. There is so much boilerplate and repetitive work taken out when Claude can hit 90% of the mark.
Especially with very precise language. I've heard of people using speech to text to use it which opens up all sorts of accessibility windows.
I find it very effective to use a good STT/dictation app since giving sufficient detailed context to CC is very important, and it becomes tedious to type all of that.
I’ve experimented with several dictation apps, including super whisper, etc., and I’ve settled on Wispr Flow. I’m very picky about having good keyboard shortcuts for hands-free dictation mode (meaning having a good keyboard shortcut to toggle recording on and off), and of course, accuracy and speed. Wispr Flow seems to fit all my needs for now but I’d love to switch to a local-only app and ditch the $15/mo sub :)
Sorry to hear that, and whilst it wasn't my original goal to serve such a use case, I wonder if being able to interact with Claude Code via voice would help you? On macOS it uses free defaults for TTS and ASR, but you can BYOK to other providers. https://github.com/robdmac/talkito
Superwhisper is great. It's closed source, however. There may be other comparable open source options available now. I'd suggest trying Superwhisper so you know what's possible, and maybe compare to open source options after. Superwhisper runs locally and has a one time purchase option, which makes it acceptable to me.
Talkito (I posted the link further up) is open source and unlike Superwhisper it makes Claude Code talk back to you as well - which was the original aim to be able to multitask.
Genuinely - what's the problem with this? It seems to be someone documenting a big increase in their productivity in a way that might be actually useful to others.
They don't write like the kind of person you can dismiss out of hand and there's no obvious red flags.
Other than "I don't like AI" - what is so insufferable here?
I was very, very skeptical. Then a couple of weeks ago I started with Atlassian's Rovo Dev CLI (Sonnet 4) and immediately managed to build and finish a couple of projects. I learned a lot, and for sure having the experience to decide on stack and architecture is a huge benefit for "guiding" an agentic coding app. I'm not sure if anyone can build and maintain a project without having at least some skills, but if you are an experienced developer, this is kind of magic.
I also appreciate the common best practice to write a requirements document in Markdown before letting the agent start. AWS' kiro.dev is really nice in separating the planning stage from the execution stage but you can use almost any "chatbot" even ChatGPT for that stage. If you suffer from ADHD or lose focus easily, this is key. Even if you decide to finish some steps manually.
It doesn't really matter if you use Claude Code (with Claude LLM), Rovo Dev CLI, Kiro, Opencode, gemini-cli, whatever. Pick the ones that offer daily free tokens and try it out. And no, they will almost never complete without any error. But just copy+paste the error to the prompt or ask some nasty questions ("Did you really implement deduplication and caching?") and usually the agent magically sees the issues and starts to fix it.
I don't know if it's something only I "perceive," but as a 50-year-old who started learning to use computers from the command line, using Claude Code's CLI mode gives me a unique sense of satisfaction.
I've switched to opencode. I use it with Sonnet for targeted refactoring tasks and Gemini to do things that touch a lot of files, which otherwise can get expensive quickly.
For me, the most compelling use of LLMs is to one-shot scripts, small functions, unit tests, etc.
I don’t understand how people have the patience to do an entire application just vibe coding the whole time. As the article suggests, it doesn’t even save that much time.
If it can’t be done in one shot with simple context I don’t want it.
I did a company hackathon recently where we were encouraged to explore vibe coding more and this was essentially my take-away too. It's kinda perfect for a hackathon, but it was insanely mind-numbing to relegate the problem solving and mental model-building to the LLM and sit there writing prompts all day. If that has to become my career, I genuinely might have to change career paths, but it doesn't seem like that's likely -- using it as a tool to help here and there and sometimes provide suggestions definitely feels like the way to actually use it to get better results.
A lot of what the author achieved with Claude Code is migrating or refactoring code. To me, who started using Claude Code just two weeks ago, this seems to be one of its real strengths at the moment. We have a large business app that uses an abandoned component library and contains a lot of cruft. Migrating to another component library seemed next to impossible, but with Claude Code the whole process took me just about one week. It makes mistakes (non-matching tags, for example), but with some human oversight we reached the first goal. The next goal is removing as much cruft as possible, so working on the app becomes possible or even fun again.
I remember when JetBrains made programming so much easier with their refactoring tools in IntelliJ IDEA. To me (with very limited AI experience) this seems to be a similar step, but bigger.
On the other hand though, automated refactorings like those in IntelliJ scale practically infinitely, are extremely low cost, and are guaranteed to never make any mistakes.
Not saying this is more useful per se, just saying that different approaches have their pros and cons.
I tried out Claude for the first time today. I have a giant powershell script that has evolved over the years, doing a bunch of different stuff. I've been meaning to refactor it for a long time, but it's such a tangled mess that every time I try, I give up fairly quickly. GPT has not been able to split it into separate modules successfully. Today I tried Claude and it refactored it into a beautifully separated collections of modules in about 30 minutes. I am extremely impressed.
My mom was an English teacher and an electronics tech later in her career. I explained to her how LLMs require a lot of technical writing and documentation up front, and how devs who haven't tried that approach or hate it can be quick to dismiss the tools when the output is bad. Her reply: "Oh, so the humanities do matter!" Hire juniors with a left-brain/right-brain mix.
We have to be careful not to anthropomorphize them but LLMs absolutely respond to nuanced word choice and definition of behavior that align with psychology (humanities). How to judge that in an interview? Maybe a “Write instructions for a robot to make a peanut butter and jelly sandwich” exercise. Make them type it. Prospects who did robotics club have an edge?
Can they touch type? I’ve seen experienced devs who chicken-peck; it’s painful. What happens when they have to write a stream of prompts, abort, and rephrase rapidly? Schools aren’t mandating typing, and I see an increase (in my own home! I tried…) of feral, child-invented systems like caps lock on/off instead of shift, with weird cross-keyboard overhand reaches.
I've thought about this lately. In order to do that, you need to know where people typically stumble, and then create a rubric around that. Here are some things I'd look for:
- Ability to clearly define requirements up front (the equivalent mistake in coding interviews is to start by coding, rather than asking questions and understanding the problem + solution 100% before writing a single line of code). This might be the majority of the interview.
- Ability to anticipate where the LLM will make mistakes. See if they use perplexity/context7 for example. Relying solely on the LLM's training data is a mistake.
- A familiarity with how to parallelize work and when that's useful vs not. Do they understand how to use something like worktrees, multiple repos, or docker to split up the work?
- Uses tests (including end-to-end and visual testing)
- Can they actually deliver a working feature/product within a reasonable amount of time?
- Is the final result looking like AI slop, or is it actually performant, maintainable (by both humans and new context windows), well-designed, and follows best practices?
- Are they able to work effectively within a large codebase? (this depends on what stage you're in; if you're a larger company, this is important, but if you're a startup, you probably want the 0->1 type of interview)
- What sort of tools are they using? I'd give more weight if someone was using Claude Code, because that's just the best tool for the job. And if they're just doing the trendy thing like using Claude Agents, I'd subtract points.
- How efficiently did they use the AI? Did they just churn through tokens? Did they use the right model given the task complexity?
First you have to decide if you want juniors that are able to push tasks through and be guided by a senior, as the juniors won't understand what they are doing or why the AI is telling them to do it "wrong".
Senior developers already know how to use AI tools effectively, and are often just as fast as AI, so they only get the benefits out of scaffolding.
Really, everything comes down to planning, and your success isn't going to come down to people using AI tools; it will come down to the people guiding the process, namely project managers, designers, and the architects and senior developers that will help realize the vision.
Juniors that can push tasks to completion can only be valuable if they have proper guidance, otherwise you'll just be making spaghetti.
I've used it a bit. I've done some very useful stuff, and I've given up with other stuff and just done it manually.
What it excels at is translation. This is what LLMs were originally designed for after all.
It could be between programming languages, like "translate this helm chart into a controller in Go". It will happily spit out all the structs and basic reconciliation logic. Gets some wrong but even after correcting those bits still saves so much time.
And of course writing precise specs in English, it will translate them to code. Whether this really saves time I'm not so convinced. I still have to type those specs in English, but now what I'm typing is lost and what I get is not my own words.
Of course it's good at generating boilerplate, but I never wrote much boilerplate by hand anyway.
I've found it's quite over eager to generate swathes of code when you wanted to go step by step and write tests for each new bit. It doesn't really "get" test-driven development and just wants to write untested code.
Overall I think it's without doubt amazing. But then so is a clown at a children's birthday party. Have you seen those balloon animals?! I think it's useful to remain sceptical and not be amazed by something just because you can't do it. Amazing doesn't mean useful.
I worry a lot about what's happening in our industry. Already developers get away with incredibly shoddy practices. In other industries such practices would get you struck off, licences stripped, or even sent to prison. Now we have to contend with juniors and people who don't even understand programming generating software that runs.
I can really see LLMs becoming outlawed in software development for software that matters, like medical equipment or anything that puts the public in danger. But maybe I'm being overly optimistic. I think generally people understand the dangers of an electrician mislabelling a fusebox or something, but don't understand the dangers of shoddy software.
And there is indeed software that is not covered by those tags, plenty of it in fact. It just so happens that it's a few orders of magnitude more expensive, and so you never hear about it until you're actually designing e.g. provably safe firmware for pacemakers and the like.
The problem for me is to predict what AI might get wrong. Claude can solve hard coding problems one day just to fail with basic stuff like thread safety the next. But overall I think it is clear that we reached the point where AI, if used correctly, saves a lot of development time.
I think Claude Code is great, but I really grew accustomed to the "Cursor-tab tab tab" autocomplete style. A little perplexed why the Claude Code integration into VS Code doesn't add something like this? It would make it the perfect product to me. Surprised more people do not talk about this/it isn't a more commonly requested feature.
"Cursor tab tab tab" is just nuts. I'm also getting accustomed to type carelessly, making syntax mistakes who cares if a tab can fix that. I fly with it. As per why more people dont talk about this, I have a the strong opinion that tools find success in the median of the market, not in the excellence. I think coders with a great autocomplete are a real deal. I find so boring and courtproductive chatting about a problem
With these agentic coders you can have better conversations about the code. My favorite use case with CC is that after a day of coding I can ask it for a thorough review of the changes, a file, or even the whole project, setting it to work when I go off to bed and having it rank issues and even propose a fix for the most important ones. If you get the prompt right and enable permissions, it can work for quite a long time independently.
I just use Claude Code in Cursor's terminal (both a hotkey away, very convenient). I haven't used Cursor chat in two months, but tab autocomplete is too good; definitely worth $20.
I get downvoted every time I praise Claude. But everyone in this thread is getting upvoted for saying the same things. Can someone explain to me the difference?
At this point I am 99% convinced that AI coding skeptics are nothing short of Luddites.
They would be like "but a robot will never ever clean a house as well as I would", well, no shit, but they can still do the overwhelming majority of the work very well (or at least as well as you instruct them to) and leave you with details and orchestration.
Genuinely not yet convinced. The CEO of Windsurf said the goal for the next year was to get agentic coding to the reliability and maturity that it can be used in production for mature codebases. That rings true.
I use autocomplete and chat with LLMs as a substitute for stack overflow. They’re great for that. Beyond that, myself and colleagues have found AI agents are not yet ready for serious work. I want them to be, I really do. But more than that I want our software to be reliable, our code to be robust and understandable, and I don’t want to worry about whether we are painting ourselves into a corner.
We build serious software infrastructure that supports other companies’ software and our biggest headache is supporting code that we built earlier this year using AI. It’s poorly written, full of bugs including problems very hard to spot from the code, and is just incomprehensible for the most part.
Other companies I know are vibe coding, and making great progress, but their software is CRUD SaaS and worst case they could start over. We do not have that luxury.
If you get as much enjoyment out of problem-solving and programming as you do doing household chores, I'm not sure why you went into this career. I agree that the folks who are 100% anti-using-it-ever are overreacting, but IME replacing the overwhelming majority of the work with vibe-coding is both mind-numbing from a developer perspective and doesn't actually get you better results (or faster results, unless you're basically cowboy coding with no regard for thoroughly ensuring correctness)
Instructing Claude Code to handle a circular dependency edge case in your code and write tests for it, while reviewing the work, definitely does not qualify as vibe coding.
I recently tried a 7-day trial version of Claude Code. I had 3 distinct experiences with it: one obviously positive, one bad, and one neutral-but-trending-positive.
The bad experience was asking it to produce a relatively non-trivial feature in an existing Python module.
I have a bunch of classes for writing PDF files. Each class corresponds to a page template in a document (TitlePage, StatisticsPage, etc). Under the hood these classes use functions like `draw_title(x, y, title)` or `draw_table(x, y, data)`. One of these tables needed to be split across multiple pages if the number of rows exceeded the page space. So I needed Claude Code to do some sort of recursive top-level driver that would add new pages to a document until it exhausted the input data.
I spent about an hour coaching Claude through the feature, and in the end it produced something that looked superficially correct, but didn't compile. After spending some time debugging, I moved on and wrote the thing by hand. This feature was not trivial even for me to implement, and it took about 2 days. It broke the existing pattern in the module. The module was designed with the idea that `one data container = one page`, so splitting data across multiple pages was a new pattern the rest of the module needed to be adapted to. I think that's why Claude did not do well.
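To make the shape of that feature concrete, here is a minimal, self-contained sketch of the kind of multi-page driver being described. All the names (Document, StatisticsPage, add_page, ROWS_PER_PAGE) are invented stand-ins, not the commenter's real module, which is certainly more involved.

```python
# Hypothetical sketch only: Document, StatisticsPage, add_page and
# ROWS_PER_PAGE are invented stand-ins, not the commenter's real module.
ROWS_PER_PAGE = 40  # assumed capacity of a single page


class StatisticsPage:
    """Stand-in for one page template; the real class would emit PDF drawing calls."""

    def __init__(self):
        self.rows = []

    def draw_table(self, x, y, data):
        # Real code would draw a table at (x, y); here we just record the rows.
        self.rows = list(data)


class Document:
    def __init__(self):
        self.pages = []

    def add_page(self, page_cls):
        page = page_cls()
        self.pages.append(page)
        return page


def render_statistics(document, rows, per_page=ROWS_PER_PAGE):
    """Keep appending pages until the input data is exhausted.

    This is the 'one data container -> many pages' behaviour that broke the
    module's original 'one data container = one page' assumption.
    """
    for start in range(0, len(rows), per_page):
        chunk = rows[start:start + per_page]
        page = document.add_page(StatisticsPage)
        page.draw_table(x=50, y=700, data=chunk)


doc = Document()
render_statistics(doc, [{"value": i} for i in range(95)])
assert len(doc.pages) == 3  # 40 + 40 + 15 rows
```

Even in this toy form you can see why it was awkward: the driver has to own page creation, something the rest of the module previously never had to think about.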
+++
The obviously good experience with Claude was getting it to add new tests to a well-structured suite of integration tests. Adding tests to this module is a boring chore, because most of the effort goes into setting up the input data. The pattern in the test suite is something like this: IntegrationTestParent class that contains all the test logic, and a bunch of IntegrationTestA/B/C/D that do data set up, and then call the parent's test method.
Claude knocked this one out of the park. There was a clear pattern to follow, and it produced code that was perfect. It saved me 1 or 2 hours, but the cool part was that it was doing this in its own terminal window, while I worked on something else. This is a type of simple task I'd give to new engineers to expose them to existing patterns.
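For readers who haven't seen this kind of suite, here is a rough sketch of the pattern being described, using `unittest`. The names (IntegrationTestParent, run_pipeline) and the record layout are made up for illustration.

```python
# Illustrative sketch only: run_pipeline and the record layout are invented;
# the point is the parent-holds-logic / children-hold-data pattern.
import unittest


def run_pipeline(records):
    """Stand-in for the system under test."""
    return [r for r in records if r.get("valid")]


class IntegrationTestParent(unittest.TestCase):
    """Holds all the test logic; subclasses only set up input data."""

    records = []

    def check_pipeline(self):
        result = run_pipeline(self.records)
        self.assertTrue(all(r["valid"] for r in result))
        self.assertEqual(len(result), sum(1 for r in self.records if r["valid"]))


class IntegrationTestA(IntegrationTestParent):
    def test_pipeline(self):
        # The boring part: setting up input data, then calling the parent's logic.
        self.records = [{"valid": True}, {"valid": False}]
        self.check_pipeline()


class IntegrationTestB(IntegrationTestParent):
    def test_pipeline(self):
        self.records = [{"valid": True}] * 5
        self.check_pipeline()


if __name__ == "__main__":
    unittest.main()
```

With a pattern this regular, "add IntegrationTestE with such-and-such data" is exactly the kind of mechanical extension an agent can grind through in another terminal.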
+++
The last experience was asking it to write a small CLI tool from scratch in a language I don't know. The tool worked like this: you point it at a directory, and it then checks that there are 5 or 6 files in that directory, and that the files are named a certain way, and are formatted a certain way. If the files are missing or not formatted correctly, throw an error.
The tool was for another team to use, so they could check these files, before they tried forwarding these files to me. So I needed an executable binary that I could throw up onto Dropbox or something, that the other team could just download and use. I primarily code in Python/JavaScript, and making a shareable tool like that with an interpreted language is a pain.
So I had Claude whip something up in Golang. It took about 2 hours, and the tool worked as advertised. Claude was very helpful.
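As a rough illustration of the checks being described: the commenter's actual tool was written in Go precisely so it could ship as a single binary, but the logic looks something like the Python sketch below. The file names and the "formatted a certain way" rule (a non-empty CSV header) are invented for illustration.

```python
# Rough sketch; REQUIRED_FILES and the header check are invented examples of
# "named a certain way" and "formatted a certain way".
import csv
import sys
from pathlib import Path

REQUIRED_FILES = ["orders.csv", "customers.csv", "inventory.csv",
                  "returns.csv", "shipments.csv"]


def validate(directory: Path) -> list:
    errors = []
    for name in REQUIRED_FILES:
        path = directory / name
        if not path.exists():
            errors.append(f"missing file: {name}")
            continue
        with path.open(newline="") as fh:
            header = next(csv.reader(fh), None)
            if not header:
                errors.append(f"{name}: empty or missing header row")
    return errors


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_files.py <directory>")
    problems = validate(Path(sys.argv[1]))
    if problems:
        sys.exit("\n".join(problems))
    print("all files present and well formed")
```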
On the one hand, this was a clear win for Claude. On the other hand, I didn't learn anything. I want to learn Go, and I can't say that I learned any Go from the experience. Next time I have to code a tool like that, I think I'll just write it from scratch myself, so I learn something.
+++
Eh. I've been using "AI" tools since they came out. I was the first at my company to get the pre-LLM Copilot autocomplete, and when ChatGPT became available I became a heavy user overnight. I have tried out Cursor (hate the VSCode nature of it), and I tried out the re-branded Copilot. Now I have tried Claude Code.
I am not an "AI" skeptic, but I still don't get the foaming hype. I feel like these tools at best make me 1.5X -- which is a lot, so I will always stay on top of new tooling -- but I don't feel like I am about to be replaced.
Your bad experience is because AI can’t really reason in general. It gets some kinda reasoning via a transformer, but that’s nothing like the reasoning that goes into the problem you described.
LLMs are great at translation. Turn this English into code, essentially. But ask it to solve a novel problem like that without a description of the solution, how will it approach it? If there’s an example in its training set maybe it can recall it. Otherwise it has no capability to derive a solution.
However, most problems (novel problems, problems not in the training set) can be decomposed into simpler, known problems. At the moment the AI isn't great at driving this decomposition, so that has to be driven by a meat bag.
Yes, this!
“ we will end up with hoards of juniors who don't understand anything and the stuff they produce”
I’ve spent a lot of my career cleaning up this kind of nonsense so the future looks bright
Totally agree with the off-shore component of this. I'm already going to have to break a task down into clear detail and resolve any anticipated blocker myself upfront to avoid multi-timezone multi-day back and forth.
Now that I'm practiced at that, the off-shored part is no longer valuable
The unemployment in India is going to be catastrophic. Geopolitical.
Many companies that see themselves as non-technical at the core prefer building solutions with an army of intermediate developers that are hot swappable. Having highly skilled developers is a risk for them.
Unlikely. Microsoft had layoffs everywhere except India. There they keep hiring more. As long as they can keep upskilling themselves while still being much cheaper than US workers, they won't fear unemployment.
Just yesterday I saw on X a video of a Miami hotel where the check-in procedure was via a video call to a receptionist in India.
You know senior developers can also be off-shored, right?
Blowing away the junior -> senior pipeline would, on average, hit every country the same.
Though it raises an interesting point: if a country like India or China did make the investment in hiring, paying, and mentoring junior people but e.g. the US didn't, then you could see a massive shift in the global center of gravity around software expertise in 10 years (plus or minus).
Someone is going to be the best at planning for and investing in the future on this, and someone is going to maximally wishful thinking / short-term thinking this, and seductive-but-not-really-there vibe coding is probably going to be a major pivot point there.
This is such an important point. Not sure about India, which is still very market-forces driven, but China can just force its employers to do whatever is of strategic importance. That’s long gone in the US. Market forces here will only ever optimize for short-term gain, shooting ourselves in the chest.
Ah mate, I can’t relate more to the offshore component. I had a very sad experience where I recently had to let go of an offshore team because they were providing devs that were essentially ‘junior with Copilot’ but labelled as ‘senior’.
Time and time again I would find telltale signs of dumping LLM output into PRs and then claiming it as their own. Not a problem in itself, but the code didn’t do what the detailed ticket asked and introduced other bugs as a result.
It ultimately became a choice of ‘go through the hassle of making a detailed brief for it to just be put in copilot verbatim and then go through the hassle of reviewing it and explaining the issues back to the offshore dev’ or ‘brief Claude directly’
I hate to say it but from a business perspective the latter won outright. It tears me up as it goes against my morality.
Why does it go against your morality? Sounds like a totally rational business decision, only affecting a sub-par partner
I know what you mean it just feels a bit inhumane to me. Sort of like defining a value for a living being and then determining that they fell beneath said value.
This.
I've got myself in a PILE of trouble when trying to use LLMs with languages/technologies I am unfamiliar with (React, don't judge me).
But with something that I am familiar with (say Go, or Python) LLMs have improved my velocity massively, with the caveat that I have had to explicitly tell the LLM when it is producing something that I know that I don't want (me arguing with an LLM was an experience too!)
> I'm hearing from Senior devs all over thought, that Junior developers are just garbage at it. They product slow, insecure, or just outright awful code with it, and then they PR the code they don't even understand.
If this is the case then we better have full AI generated code within the next 10 years since those "juniors" will remain atrophied juniors forever and the old timers will be checking in with the big clock in the sky. IF we, as a field, believe that this can not possibly happen, then we are making a huge mistake leaning on a tool that requires "deep [orthogonal] experience" to operate properly.
IT education and computer science (at least part of it) will need a stronger focus on software engineering and software architecture skills to teach developers how to be in control of an AI dev tool.
The fastest way is via struggle. Learn to do it yourself first. Understand WHY it does not work. What's good code? What's bad code? What are conventions?
There are no shortcuts - you are not an accountant just because you have a calculator.
Brains are not computers and we don't learn by being given abstract rules. We also don't learn nearly as well from classroom teaching as we do from doing things IRL for a real purpose - the brain always knows the difference and that the (real, non-artificially created) stakes are low in a teaching environment.
That's also the huge difference between AI and brains: AI does not work on the real world but on our communication (and even that is limited to text, missing all the nuance that face-to-face communication includes). The brain works based on sensor data from the real world. The communication method, language, is a very limited add-on on top of how the brain really works. We don't think in language; doing even some abstract language-based thinking, e.g. formal math, requires a lot of concentration and effort and still uses a lot of "under the hood" intuition.
That is why, even after years of learning the same curriculum, we still need to make a significant effort for every single concrete example to "get everyone on the same page", creating compatible internal models under the hood. Everybody's internal model of even simple things is slightly different, depending on what brain they brought to learning and what exactly they learned, where even things like social classroom interactions went into how the connections were formed. Only on the basis of a huge amount of effort can we then use language to communicate in the abstract, and even then, when we leave the central corridor of ideas, people will start arguing forever about definitions. Even when the written text is the same, the internal model is different for every person.
As someone who took neuroscience, I found this surprisingly well written:
"The brain doesn't like to abstract unless you make it"
http://howthebrainworks.science/how_the_brain_works_/the_bra...
> This resource, prepared by members of the University of London Centre for Educational Neuroscience (CEN), gives a brief overview of how the brain works for a general audience. It is based on the most recent research. It aims to give a gist of the brain’s principles of function, covering the brain’s evolutionary origin, how it develops, and how it copes in the modern world.
The best way to learn is to do things IRL that matter. School is a compromise and not really all that great. People motivated by actual need often can learn things that take years in school with middling results significantly faster and with better and deeper results.
Yeah. The only, and I mean only, non-social/networking advantages to universities stem from forced learning/reasoning about complex theoretical concepts that form the requisite base knowledge to learn the practical requirements of your field while on the job.
Trade schools and certificate programs are designed to churn out people with journeyman-level skills in some field. They repeatedly drill you on the practical day-in-day-out requirements, tasks, troubleshooting tools and techniques, etc. that you need to walk up to a job site and be useful. The fields generally have a predictable enough set of technical problems to deal with that a deep theoretical exploration is unnecessary. This is just as true for electricians and auto mechanics as it is for people doing limited but logistically complex technical work, like orchestrating a big fleet of windows workstations with all the Microsoft enterprise tools.
In software development and lots of other fields that require grappling with complex theoretical stuff, you really need both the practical and the theoretical background to be productive. That would be a ridiculous undertaking for a school, and it’s why we have internships/externships/jr positions.
Between these tools letting the seniors in a department do all of the work (so companies don’t have to invest in interns/juniors, leaving no reliable entry point into the field) and an even bigger disconnect between what schools offer and the skills graduates need to compete, the industry has some rough days ahead, and a whole lot of people trying to get a foothold right now are screwed. I’m kind of surprised how little so many people in tech seem to care about the impending rough road for entry-level folks in the industry. I guess it’s a combination of how little most higher-level developers have to interact with them, and the fact that everybody was tripping over themselves to hire developers when a lot of today's seniors joined the industry.
And that is the best thing about AI, it allows you to do and try so much more in the limited time you have. If you have an idea, build it with AI, test it, see where it breaks. AI is going to be a big boost for education, because it allows for so much more experimentation and hands-on.
By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software, so being able to do much more in a limited amount of time will not necessarily make you a more knowledgeable programmer, or at least that knowledge will most likely only be surface-level pattern recognition. It still needs to be combined with hands-on building your own thing, to truly understand the nuts and bolts of such projects.
If you end up with a working project where you understand all the moving parts, I think AI is great for learning, and the ultimate proof of whether the learning was successful is whether you can actually build (and ship) things.
So human teachers are good to have as well, but I remember they were of limited use for me when I was learning programming without AI. So many concepts they tried to teach me without having understood them themselves first. AI would likely have helped me get better answers instead of "because that is how you do it" when I asked why something should be done a certain way.
So obviously I would have preferred competent teachers all the time, and now competent teachers with unlimited time instead of faulty AIs for the students, but in reality human time is limited and humans are flawed as well. So I don't see the doomsday expectations for the new generation of programmers. The ultimate goal, building something that works to the spec, did not change, and horrible unmaintainable code was also shipped 20 years ago.
I don't agree, to me switching from hand coded source code to ai coded source code is like going from a hand-saw to an electric-saw for your woodworking projects. In the end you still have to know woodworking, but you experiment much more, so you learn more.
Or maybe it's more like going from analog photography to digital photography. Whatever it is, you get more programming done.
Just like when you go from assembly to C to a memory-managed language like Java. I did some 6502 and 68000 assembly over 35 years ago; now nobody knows assembly.
> to me
Key words there. To you, it's a electric saw because you already know how to program, and that's the other person's point; it doesn't necessarily empower people to build software. You? Yes. Generally though when you hand the public an electric saw and say "have at it, build stuff" you end up with a lot of lost appendages.
Sadly, in this case the "lost appendages" are going to be man-decades of time spent undoing all the landmines vibecoders are going to plant around the digital commons. Which means AI even fails as a metaphorical "electric saw", because a good electric saw should strike fear into the user by promising mortal damage through misuse. AI has no such misuse deterrent, so people will freely misuse it until consequences swing back wildly, and the blast radius is community-scale.
> more like going from analog photography to digital photography. Whatever it is, you get more programming done.
By volume, the primary outcome of digital photography has been a deluge of pointless photographs to the extent we've had to invent new words to categorize them. "selfies". "sexts". "foodstagramming". Sure, AI will increase the actual programming being done, the same way digital photography gave us more photography art. But much more than that, AI will bring the equivalent of "foodstagramming" but for programs. Kind of like how the Apple App Store brought us some good apps, but at the same time 9 bajillion travel guides and flashlight apps. When you lower the bar you also open the flood gates.
> By using AI, you learn how to use AI, not necessarily how to build architecturally sound and maintainable software
> will not necessarily make you a more knowledgeable programmer
I think we'd better start separating "building software" from programming, because the act of programming is going to continue to get less and less valuable.
I would argue that programming has been very overvalued for a while, even before AI. And the industry believes its own hype, with a healthy dose of elitism mixed in.
But now AI is removing the facade and showing that the idea and the architecture are actually the important part, not the coding of it.
I find it super ironic that you talk about "the industry believing its own hype" and then continue with a love letter for AI.
Ok. But most developers aren't building AI tech. Instead, they're coding a SPA or CRUD app or something else that's been done 10000 times before, but just doing it slightly differently. That's exactly why LLMs are so good at this kind of (programming) work.
I would say most people are dealing with tickets and meetings about the tickets more than they are actually spending time with their editor. It may be similar, but that 1 percent difference needs to be nailed down right, as that's where the business lifeline lies.
Also, not all dev jobs are web tech or AI tech.
Unfortunately, education everywhere is getting really hurt by access to AI, both from students who are enabled to not do their homework, and by teacher review/feedback being replaced by chatbots.
You can't atrophy if you never grew in the first place. The juniors will be stunted. It's the seniors who will become atrophied.
As for whether it's a mistake, isn't that just the way of things these days? The current world is about extracting as much as you can while you're still here. Look around. Nobody is building for the future. There are a few niche groups that talk about it, but nobody is really doing it. It's just take, take, take.
This just seems more of the same, but we're speeding up. We started by extracting fossil fuels deposited over millions of years, then extracting resources and technology from civilisations deposited over millennia, then from the Victorians deposited only a century or two ago, and now it's software deposited over only mere decades. Someone is going to be left holding the bag, we just hope it's not us. Meanwhile most of the population aren't even thinking about it, and most of the fraction that do think are dreaming that technology is going to save us before it's payback time.
Yeah I noticed the issue with more Junior developers right away. Some developers, Junior or not, have yet to be exposed to environments where their PRs are put under HEAVY scrutiny. They are used to loosey-goosey and unfortunately they are not prepared to put LLM changes under the level of scrutiny they require.
The worst is getting, even smallish, PRs with a bunch of changes that look extraneous or otherwise off. After asking questions the code changes without the questions being answered and likely with a new set of problems. I swear I've been prompting an LLM through an engineer/PR middleman :(
Our offshore devs keep doing this and it drives me nuts. No answers to my question, completely different code gets pushed.
It does, but most execs don't care about the long term; they want that perf bonus before they leave in 3-4 years.
That is how you get Oracle source code. It broke my illusions after entering real life big company coding after university, many years ago. It also led to this gem of an HN comment: https://news.ycombinator.com/item?id=18442637
An observation. If we stipulate that it is true that a 'senior developer' benefits from Claude Code but a junior developer does not, then I'm wondering if that creates a gap where you have a bunch of newly minted '10x' engineers doing the work that a bunch of junior devs used to help with, and now you're not training any new junior devs because they are unemployable. Is that correct?
It already was the case wasn't it, that you could either get one senior dev to build your thing in a week, or give them a team of juniors and it would take the whole team 4 weeks and be worse.
Yet somehow companies continued to opt for the second approach. Something to do with status from headcount?
Yes, there are companies that opt for broken organizations for a variety of reasons. The observation though is this; Does this lead to a world where the 'minimum' programmer is what we consider today to be a 'Senior Dev' ? It echoes the transition of machinists to operators of CAD/CAM workstations to operate machining centers, rather than hands on the dials of a mill or lathe. It certainly seems like it might make entering the field through a "coder camp" would no longer be practical.
It'll be interesting to see if in a decade when a whole cohort of juniors didn't get trained whether LLMs will be able to do the whole job. I'm guessing a lot of companies are willing to bet on yes.
As long as LLMs aren't the 'Cold Fusion' of this cycle, sure. :-)
>> Something to do with status from headcount?
And usually projected as ensuring bus factor > 1
“Wasting” effort on juniors is where seniors come from. So that first approach is only valid at a sole proprietorship, at an early stage startup, or in an emergency.
n=1 but my experience is the ratio of what'd I'd class "senior" devs (per the example given) to everyone else is comfortably 10:1.
Do you mean that for every 11 devs, 10 of them are "senior" as per the example? Or that only 1 is?
1 senior to 10 everyone else. Sorry that wasn't super clear.
I'm getting my money's worth having Claude write tools. We've reached the dream where I can vibe out some one-off software and it's great; today I made two different (shitty but usable!) GUI programs in seconds that let me visually describe some test data. The alternative was probably half an hour of putting something together if my first idea was good. Then I deleted them and moved on.
It still writes insane things all the time but I find it really helpful to spit out single use stuff and to brainstorm with. I try to get it to perform tasks I don't know how to accomplish (eg. computer vision experiments) and it never really works out in the end but I often learn something and I'm still very happy with my subscription.
I've also found it good at catching mistakes and helping write commit messages.
"Review the top-most commit. Did I make any mistakes? Did I leave anything out of the commit message?"
Sometimes I let it write the message for me:
"Write a new commit message for the current commit."
I've had to tell it how to write commit messages though. It likes to offer subjective opinions, use superlatives and guess at why something was done. I've had to tell it to cut that out: "Summarize what has changed. Be concise but thorough. Avoid adjectives and superlatives. Use imperative mood."
What I can recommend is to tell it that for all documentation, readmes and PR descriptions to keep it "tight, no purple-prose, no emojis". That cuts everything down nicely to to-the-point docs without GPTisms and without the emoji storm that makes it look like yet another frontend framework Readme.
This is insane to me.
Review your own code. Understand why you made the changes. And then clearly describe why you made them. If you can't do that yourself, I think that's a huge gap in your own skills.
Making something else do it means you don't internalize the changes that you made.
Your comment is not a fair interpretation of what I wrote.
For the record, I write better and more detailed commit messages than almost anyone I know across a decades[^0] long career[^1,^2,^3,^4,^5]. But I'm not immune from making mistakes, and everyone can use an editor, or just runs out of mental energy. Unfortunately, I find it hard to get decent PR reviews from my colleagues at work.
So yeah, I've started using Claude Code to help review my own commits. That doesn't mean I don't understand my changes or that I don't know why I made them. And CC is good at banging out a first draft of a commit message. It's also good at catching tiny logic errors that slip through tests and human review. Surprisingly good. You should try it.
I have plenty of criticisms for CC too. I'm not sure it's actually saving me any time. I've spent the last two weeks working 10 hour days with it. For some things it shines. For other things, I would've been better off writing the code from scratch myself, something I've had to do maybe 40% of the time now.
[^0]: https://seclists.org/bugtraq/1998/Jul/172
[^1]: https://github.com/git/git/commit/441adf0ccf571a9fe15658fdfc...
[^2]: https://github.com/git/git/commit/cacfc09ba82bfc6b0e1c047247...
[^3]: https://github.com/fastlane/fastlane/pull/21644
[^4]: https://github.com/CocoaPods/Core/pull/741
[^5]: None of these are my best examples, just the ones I found quickly. Most of my commit messages are obviously locked away by my employer. Somewhere in the git history is a paragraphs-long commit message from Jeff King (peff) explaining a one line diff. That's probably my favorite commit message of all time. But I also know that at work I've got a message somewhere explaining a single character diff.
How long are your commit messages if you still come out ahead after typing all this prompt?
My commits’ description part, if warranted, is about the reason for the changes, not the specificity of the solution. It’s a little memo to the person reading the diff, not a long monograph. And the diff is usually small.
This is a good call! Do you have claude use atomic commits or do you manually copy/paste the output?
Saving your summary instructions in a CLAUDE.md helps with that.
Can also confirm. Almost any output from claude code needs my careful input for corrections, which you could only spot and provide if you have experience. There is no way a junior is able to command these tools because the main competency to use them correctly is your ability to guide and teach others in software development, which by definition is only possible if you have senior experience in this field. The sycophancy provided by these models will outright damage the skill progression for juniors, but on the other hand there is no way to not use them. So we are in a state where the future seems really uncertain for most of us.
I find the "killer app" right now is anything where you need to integrate information you don't already have in your brain. A new language or framework, a third-party API, etc. Something straightforward but foreign, and well-documented. You'll save so much time because Claude has already read the docs
> as a vibe coding skeptic, I was amazed.
The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low. They’re convinced everything the tools produce will be junk or that the worst case examples people provide are representative of the average.
Then they finally go out and use the tools and realize that they exceed their (extremely low) expectations, and are amazed.
Yeah we all know Claude Code isn’t going to generate a $10 billion SaaS with a team of 10 people or whatever the social media engagement bait VCs are pushing this week. However, the tools are more powerful than a lot of people give them credit for.
In case some people haven't realized it by now: it’s not just the code, it’s also/mostly the marketing. Unless you make something useful that’s hard to replicate..
I have recently found something that’s needed but very niche and the sort of problem that Claude can only give tips on how to go about it.
I guess I'm not sure why people think AI is going to be very useful on a niche problem without spending a massive computation budget.
Any niche problem requires original work or deep searching for something that can complete it. There is no free lunch.
What if you could get something as good as Claude on an 8-13B model? If you could quantize it, you could run it even on a 4090 easily.
Hmm not my experience. I've been aggressively trying to use both Cursor and Claude Code. I've done maybe 20-30 attempts with Code at different projects, a couple of them personal small projects. All of them resulted in sub-par results, essentially unusable.
I tried to use it for Python, Rust and Bash. I also tried to use it for crawling and organizing information. I also tried to use it as a debugging buddy. All of the attempts failed.
I simply don't understand how people are using it in a way that improves productivity. For me, all of this is so far a huge timesink with essentially nothing to show for it.
The single positive result was when I asked it to optimize a specific SQL query, and it managed to do it.
Anyway I will keep trying to use it, maybe something needs to click first and it just hasn't yet.
I asked it to implement a C++ backend for an audio plug-in API (CLAP) for the DAW I'm developing and it got it right in maybe less than ten interactions. Implementing other plug-in APIs such as VST3 took me weeks to get to the same level of support.
You're probably in an obscure niche domain, or asking it to do something creative.
Try like upgrading JS package dependencies, or translating between languages, limited tedious things, and you will be surprised how much better it does.
Initially I loved it; over time I agree with the people who call it a slot machine. Unless you’re very deliberate with it, it’s just that. Gambling.
People are using different definitions of "vibe coding". If you expect to just prompt without even looking at the code and being involved in the process the result will be crap. This doesn't preclude the usefulness of models as tools, and maybe in the future vibe coding will actually work. Essentially every coder I respect has an opinion that is some shade of this.
There are the social media types you mention and their polar opposites, the "LLMs have no possible use" crowd. These people are mostly delusional. At the grown-ups table, there is a spectrum of opinions about the relative usefulness.
It's not contradictory to believe that the average programmer right now has his head buried in the sand and should at least take time to explore what value LLMs can provide, while at the same time taking a more conservative approach when using them to do actual work.
>maybe in the future vibe coding will actually work
Vibe coding works today at a small enough scale.
I'm building a personal app to help me track nutrition and I only needed to get involved in the code when Claude would hit its limits for a single file and produced a broken program (and this was via the UI, not Claude Code.) Now at ~3000 lines of python.
After I told it to split it into a few files I don't think I've had to talk about anything at the code level. Note that I eventually did switch to using Claude Code which might have helped (gets annoying copy/pasting multiple files and then my prompts hit max limits).
I just prompt it like an experienced QA/product person to tell it how to build it, point out bugs (as experienced as a user), point out bad data, etc.
A few of my recent prompts (each is a separate prompt):
>for foods found but not in database, list the number of times each shows up
>sort the list by count descending
>Period surplus/deficit seems too low. looking at 2025/07/24 to 2025/07/31
>do not require beige color (but still track it). combine blue/purple as one in stats (but keep separate colors). data has both white and White; should use standard case and not show as two colors
> Yeah we all know Claude Code isn’t going to generate a $10 billion SaaS with a team of 10 people…
Not trying to argue, since I don’t have counter evidence, but how can you be so sure?
I think it's some variation of the efficient markets hypothesis. There are no problems that are both that lucrative and that easy to solve; if they existed, they would get dogpiled and stop being lucrative. Even in this day and age, $10B of revenue is an incredibly high bar.
On the other hand, $10B as valuation (not revenue) just requires a greater fool. Maybe it's possible, but I doubt there are too many of those fools available.
Yeah, $10B revenue is insane.
I interpreted the parent as saying $10B valuation.
The question is not whether you can or can't, but whether it is still worth it long term:
- Whether there is a moat in doing so (i.e. will people actually pay for your SaaS knowing that they could do it too via AI), and..
- How many large scale ideas do you need post AI? Many SaaS products are subscription based and loaded with features you don't need. Most people would prefer a simple product that just does what they need without the ongoing costs.
There will be more software. The question is who accrues the economic value of this additional software - the SWE/tech industry (incumbent), the AI industry (disruptor?) and/or the consumer. For the SWE's/tech workers it probably isn't what they envisioned when they started/studied for this industry.
It seems obvious to me it is the consumer who will benefit most.
I had been thinking of buying an $80 license for a piece of software but ended up knocking off a version in Claude Code over a weekend.
It is not even close to something commercial grade that I could sell as a competitor, but it is good enough for me to not spend $80 on the license. The huge upside is that I can customize the software in any way I like. I don't care that it isn't maintainable either. Making a new version in ChatGPT-5 is going to be my first project.
Just like a few hours ago I was thinking how I would like to customize the fitness/calorie tracking app I use. There are so many features I like that would be tightly coupled to my own situation and not a mass market product.
This to me seems obvious of what the future of software looks like for everything but mission critical software.
Because if it's generated by Claude Code, then basically any team of 10 people can make a competing service.
> The interesting thing about all of this vibe coding skepticism, cynicism, and backlash is that many people have their expectations set extremely low.
Or they have actually used all these tools, know how they work, and don't buy into hype and marketing.
It doesn't help that a lot of skeptics are also dishonest. A few days ago someone here tried to claim that inserting verbose debug logging, something Claude Code would be very good at, is "actually programming" and it's important work for humans to do.
No, Claude can create logs all across my codebase with much better formatting far faster than I can, so I can focus on actual problem solving. It's frustrating, but par for the course for this forum.
Edit: Dishonest isn't correct, I should have said I just disagree with their statements. I do apologize.
No, some skeptics are actually dishonest. It's part of trolling, and trolling is in fashion right now. Granted, some skeptics are fair, but many do it strictly for the views, without any due diligence.
That's not what dishonesty means. That's just someone who disagrees with you
Thank you for calling out my inaccuracy. One thing I'm always appreciative for in HN is the level of pedantry day after day.
That's not what pedantry means.
Thank you for calling out my inaccuracy.
lol. That made me chuckle.
That's not pedantry, pedantry would be if it were a very minor or technical detail, but being dishonest doesn't have anything to do with having a different opinion.
But this comment might step a bit into pedantry.
> But this comment might step a bit into pedantry.
Especially if I point out that they said "are also", not "are".
I'm a vibe code skeptic because I don't consider it coding. I assume it can write some decent code, but that's not coding.
Coding can only be done by replicators born out of carbon, I imagine?
No, coding can be done by machines. But if you're telling a machine what to program, you're not coding. The machine is. Youre no longer a programmer, you're just a user.
Unless you’re writing pure assembly, aren’t we all using machines to generate code?
I suppose a more rigorous definition would be useful. We can probably make it more narrow as time goes on
To me, the essence of coding is about using formal languages and definable state machines (i.e, your toolchain) to manipulate the state of a machine in a predictable way.
C, C++, even with their litany of undefined behavior, are still formal languages, and their compilers can still be predicted and understood (no matter how difficult that is). If the compiler does something unexpected, it's because you, the programmer, lacked knowledge of either the language or the compiler's state.
Vibe coding uses natural languages, and interacts with programs whose state is not only unknown, but unknowable. The machine, for the same input, may produce wildly different output. If the machine produces unexpected code, it's not because of a lack of knowledge on the part of its programmer - it's because the machine is inherently unpredictable and requires more prodding in soft, fuzzy, natural language.
Telling something what outcomes you want, even if described in technical terms only a programmer would understand, is not coding. It's essentially just being a project manager.
Now you may ask - who cares about this no true Scotsman fallacy? If its coding or not coding, we are still producing a program which serves the product needs of the customer.
Personally, I did not learn to code because I give a shit about the product needs of the customer, or the financial wellbeing of the business. I enjoy coding for its own sake - because it is fun to use systems of well defined rules to solve problems. Learning and using C++ is fun, for me; it seems every day i learn something new about the language and how the compiler behaves and I've been using C++ for several years (and I started learning it when I was 12!)
Describing the outcome or goal of a project in natural human language sounds like a nightmare, to be honest. I became a software engineer so I could minimize the amount of natural language required to succeed in life. Natural language has gotten me (and, I suspect, people like me) in trouble over and over again throughout adolescence, but I've never written a piece of code that was misunderstood or ambiguous enough for people to become threatened by or outraged by it.
I think the disconnect is that some people care about products, and some people care about code.
> can still be predicted and understood (no matter how difficult that is).
If we're making up hypotheticals, then LLMs can be predicted and understood (no matter how difficult that is). They don't run on pixie dust.
It's qualitatively different to go through source code and specifications to understand how something works than to look at a database with all the weights of an LLM and pretend like you could predict the output.
you don't have to - just run the matrix multiplication. it's no different from using a pocket calculator.
Ummm, my entire career I have been telling machines what to program; the machines take my garbage C/Go/Python/Perl/whatever prompts and translate them to ASM/machine code that other machines will use to do... stuff
They're substantively different. Using a compiler requires you to have an internalized model of a state machine and, importantly, a formal language. C, assembler, Java, etc. are all essentially different from using the softness of the English language to coerce results out of a black box.
No, not at all.
In both cases, all you need is the ability to communicate with the machine in a way that lets it convert your ideas into actions.
The restricted language of a compiler is a handicap, not evidence of a skill - we've been saying forever that "Natural Language" compilers would be a game changer, and that's all that an AI really is
Edit: It appears that this discussion is going to end up with a definition of "coding"
Is it coding if you tell the computer to perform some action, or is it coding if you tell it how to do that in some highly optimised way (for varying definitions of optimised, e.g. memory efficient, CPU efficient, dev-time efficient, etc.)?
Writing code that gets compiled is deterministic, but asking an LLM to produce code is a non-deterministic guessing game.
Notice how nobody is skeptical of compilers?
No one is skeptical of compilers?! I guess you haven’t met many old fashioned C systems programmers, who go out of their way to disable compiler optimisations as much as they can because “it just produces garbage”.
Every generation, we seem to add a level of abstraction to coding because, for most of us, it enhances productivity. And every generation, there is a crowd who rails against the new abstraction, mostly unaware of all of the levels of abstraction they already use in their coding.
Luxury! When I were a lad we didn't have them new fangled compilers, we wrote ASM by hand, because compilers cannot (and still to this day I think) optimise ASM as well as a human
Abstractions and compilers are deterministic, no matter if a neckbeard is cranky about the results. LLMs are not deterministic, they are a guessing game. An LLM is not an abstraction, it's a distraction. If you can't tell the difference, then maybe you should lay off the "AI" slop.
It is not coding if you use natural human language
I think after all the goalpost moving, we have to ask - why the bitflip does it matter what we call it?
Some people are getting a lot of work done using LLMs. Some of us are using them on occasion to handle things we don't understand deeply but can trivially verify. Some of us are using them out of laziness because they help with boilerplate. Everyone who is using them outside of occasional tests is doing it because they find it useful to write code. If it's not coding, then I personally couldn't care less. Only a True Scotsman should care.
If my boss came to me and said "hey, we're going to start vibe coding everything at work from now on. You can manually edit code but Claude Code needs to be your primary driver from now on", I would quit and find a new career. I enjoy coding. I like solving puzzles using the specifics of a language's syntax. I write libraries and APIs and I put a great deal of effort into making sure the interface is usable by a human being.
If we get to the point where we are no longer coding, we are just describing things in product language to a computer and letting it do all the real work, then I will find a more fulfilling career because this ain't it
By the time it works flawlessly, it won't be your career anymore, it'll be the product manager's. They will describe what they want and the AI will produce it. You won't be told to "use Claude all the time".
I personally hate coding, but it's a means to an end, and I care about the end. I'm also paranoid about code I don't understand, so I only rarely use AI and even then it's either for things I understand 100% or things that don't matter. But it would be silly to claim they don't produce working code, no matter what we want to call it.
How so?
BASIC was supposed to be a step toward using a natural human language, and a number of Microsoft products were too.
As I said before, we've been wanting to use a Natural Human Language for programming for a long time.
Edit: adding a wikipedia reference because apparently I am the only person on the planet that has been looking for a way to use a Natural Human Language for programming https://en.wikipedia.org/wiki/Natural_language_programming
Second Edit: Adding the following paragraph from the wikipedia page for emphasis
Researchers have started to experiment with natural language programming environments that use plain language prompts and then use AI (specifically large language models) to turn natural language into formal code. For example Spatial Pixel created a natural language programming environment to turn natural language into P5.js code through OpenAI's API. In 2021 OpenAI developed a natural language programming environment for their programming large language model called Codex.
Maybe you have. I would like no such thing.
Oh, and.. who are you?
Technically you’re not vibe coding. You’re using AI to do software engineering. Vibe coding is specifically the process of having AI produce code and plowing ahead without understanding it.
I know I’m being pedantic, but people mean very different things when they talk about this stuff, and I don’t think any credence should be given to vibe coding.
To some extent, OP is still vibe coding, because one has to trust Claude's every single decision, which can't be easily verified at first glance anyway. Agreed that we need a new word for heavily AI-assisted software development though; I once used the word "vivid coding" for this kind of process.
I vibe code quite a bit and will plow through a lot of front end code despite being a backend engineer. In my case, it's on personal projects where I'm ambitious and asking the LLM to "replace an entire SaaS" sort of thing. At work most of the code is a couple lines here or there and trivial to review.
When I try the more complex things I will do multiple passes with AI, have 2-3 LLMs review it and delete deprecated code, refactor, interrogate it and ask it to fix bad patterns, etc. In an evening I can refactor a large code base this way. For example Gemini is meh compared to Claude Opus at new code, but somewhat decent for reviewing code that's already there, since the 1M context window allows it to tie things together Claude wouldn't be able to fit in 256k. I might then bounce a suggestion back from Gemini -> Claude -> Grok to fix something. It's kind of like managing a team of interns with different specialties and personalities.
Both are vibe coding. The term was coined by Andrej Karpathy, a computer scientist who served as the director of artificial intelligence at Tesla.
Maybe you're thinking of slop coding?
No, they're not.
"A key part of the definition of vibe coding is that the user accepts code without full understanding.[1] Programmer Simon Willison said: 'If an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding in my book—that's using an LLM as a typing assistant.'"
(https://en.m.wikipedia.org/wiki/Vibe_coding)
> The term was coined by Andrej Karpathy, a computer scientist who served as the director of artificial intelligence at Tesla.
And...?
I wasn't familiar with his full message, so I didn't realize that the current definition of vibe coding was so cynical. Many of us don't see it that way.
Karpathy's definition definitely requires:
1. Not looking at the code
2. YOLO everything
3. Paste errors back into the model verbatim
That said, I describe what I do as vibe coding, but I introduce code review bots into the mix. I also roadmap a plan with deep research beforehand and require comprehensive unit and behavioural tests from the model.
Just a few months ago I couldn't imagine paying more than $20/mo for any kind of subscription, but here I am paying $200/mo for the Max 20 plan!
Similarly amazed as an experienced dev with 20 YoE (and a fellow Slovak, although US based). The other tools, while helpful, were just not "there"; they were often simply more trouble than they were worth, producing a lot of useless garbage. Claude Code is clearly on another level. Yes, it needs A LOT of handholding; my MO is to do Plan Mode until I'm 100% sure it understands the reqs and the planned code changes are reasonable, then let it work, and finally code review what it did (after it auto-fixes things like compiler errors, unit test failures and linting issues). It's kind of like a junior engineer that is a little bit daft but very knowledgeable, works super, super fast, and doesn't talk back :)
It is definitely the future, what can I say? This is a clear direction where software development is heading.
When I first tried letting Cursor loose on a relatively small code base (1500 lines, 2 files), I had it fix a bug (or more than one) with a clear testcase and a rough description of the problem, and it was a disaster.
The first commit towards the fix was plausible, though still not fully correct, but in the end not only was it unable to fix the bug, each commit was also becoming more and more baroque. I cut it off when it wrote almost 100 lines of code to compare version numbers (which already existed in the source). The problem with discussing the plan is that, while debugging, you don't yourself have a full idea of the plan.
I don't call it a total failure because I asked the AI to improve some error messages to help it debug, and I will keep that code. It's pretty good at writing new code, very good at reviewing it, but for me it was completely incapable of performing maintenance.
These tools and LLMs differ in quality; for me, Claude Code with Claude 4 was the first tool that worked well enough. I tried Cursor before, though that was 6+ months ago, and I wasn't very impressed.
Same for me. Cursor was a mess for me. I don't know why and how it works for other people. Claude Code, on the other hand, was a success from day one, and I've been using it happily for months now.
I used Cursor for about 5 months before switching to Claude Code. I was only productive with Cursor when I used it in a very specific way, which was basically me doing by hand what Claude Code does internally. I maintained planning documents, todo lists, used test driven development and linting tools, etc. My .cursorrules file looks like what I imagine the Claude system prompt to be.
Claude Code took the burden of maintaining that off my shoulders.
Also, Cursor was/is utterly useless with all non-Anthropic models, which are the default.
Fwiw, I dipped my toes into AI-assisted coding a few weeks ago and started with Cursor. Was very unimpressed (spent more time prompting and fighting the tool than making forward progress) until I tried Claude Code. Happily dropped Cursor immediately (cancelled my sub) and am now having a great time using CC productively (just the basic $20/mo plan). Still needs hand-holding, but it's a net productivity boost.
This was a problem I regularly had using Copilot w/ GPT4o or Sonnet 3.5/3.7... sometimes I would end up down a rabbit hole and blow multiple days of work, but more typically I'd be out an hour or two and toss everything to start again.
Don't have this w/ Claude Code working over multiple code bases of 10-30k LOC. Part of the reason is the type of guidance I give in the memory files helps keep this at bay, as does linting (ie. class/file length), but I also chunk things up into features that I PR review and have it refactor to keep things super tidy.
Yeah, Github Copilot just didn't work for me at all. The completions are OK and I actually still use it for that but the agent part is completely useless. Claude Code is in another league.
> wrote almost 100 lines of code to compare version numbers
Oh boy, it was almost sentient. That is a deep mathematical proof!
> Just a few months ago I couldn't imagine paying more than $20/mo for any kind of subscription, but here I am paying $200/mo for the Max 20 plan!
I wonder, are most devs with a job paying for it themselves, rather than the company they work for?
I'm recently unemployed after 20 continuous years in the industry, trying to launch a SaaS, so yes, paying for it myself.
May I ask what you are using it for? I have been using it for fun mostly: side projects, learning, experimenting. I would never use it for a work codebase unless, well, the company ordered or at least permitted it. And even then, I'm not really sure I would feel comfortable with the level of liberty CC takes. So I'm curious about others.
Of course you need explicit permission from the company to use (non-local) AI tools.
Before that was given, I used AI as a fancier search engine, and for coming up with solutions to problems I explained in abstract (without copy-pasting actual code in or out).
It's our own SaaS we're trying to launch with my partner. So no work-related issues.
I have a similar amount of engineering experience, was highly skeptical, and I've come to similar conclusions with Claude Code after spending two weeks on a greenfield project (TS api, react-native client, TS/React admin panel).
As I've improved planning and context management, the results have been fairly consistent. As long as I can keep a task within the context window, it does a decent job almost every time. And occasionally I have to have it brute-force its way to green lint/typecheck/tests. That's been one of the biggest speed bumps.
I've found that gemini is great at the occasional detailed code-review to help find glaring issues or things that were missed, but having it implement anything has been severely lacking. I have to literally tell it not to do anything because it will gladly just start writing files on a whim. I generally use the opus model to write detailed plans, sonnet to implement, and then opus and gemini to review and plan refactors.
I'm impressed. The progress is SLOW. I'd have gotten to the stage I'm at in 1/3 to 1/2 the time, likely with fewer tests and significantly less process documentation. But the results are otherwise fairly great. And the learning process has kept me motivated to keep this old side-project moving.
I was switching between two accounts for a week while testing, but in the end upgraded to the $100/month plan and I think I've been rate-limited once since. I don't know if I'll be using this for every-day professional work, but I think it's a great tool for a few categories of work.
Feels like the most valuable skill to have as a programmer in times of Claude Code is that of carefully reading spec documentation and having an acute sense of critical thinking when reviewing code.
The critical skill is spotting potential bugs before they happen, but in order to do that you need an extremely acute understanding of, or a lot of experience in, the stack, libs, and programming language of choice. Something that, ironically, you will not get by "vibe coding".
For seniors, these skills were very valuable before LLMs anyway.
Fascinating since I found the recent Claude models untrustworthy for writing and editing SQL. E.g. it'd write conditions correctly, but not add parens around ANDs and ORs (which gemini pro then highlighted as a bug, correctly.)
If you aren't already (1) telling Claude Code which flavor of SQL you want (there are several major dialects and many more minor ones) and (2) giving it access to up-to-date documentation via MCP (e.g. https://github.com/arabold/docs-mcp-server) so it has direct access to canonical docs for authoritative grounding and syntax references, you'll find that you get much better results by doing one or both of those things.
Documentation on features your SQL dialect supports and key requirements for your query are very important for incentivizing it to generate the output you want.
As a recent example, I am working on a Rust app with integrated DuckDB, and asked it to implement a scoring algorithm query (after chatting with it to generate a Markdown file "RFC" describing how the algorithm works.) It started the implementation with an absolute minimal SQL query that pulled all metrics for a given time window.
I questioned this rather than accepting the change, and it said its plan was to implement the more complex aggregation logic in Rust because 1) it's easier to interpret Rust branching logic than SQL statements (true) and 2) because not all SQL dialects include EXP(), STDDEV(), VAR() support which would be necessary to compute the metrics.
The former point actually seems like quite a reasonable bias to me; personally, I find it harder to review complex aggregations in SQL than mentally traversing the path of data through a bunch of branches. But if you are familiar with DuckDB, you know that 1) it does support these features and 2) the OLAP efficiency of DuckDB makes it a better choice for doing these aggregations in a performant way than iterating through the results in Rust, so the initial generated output is suboptimal.
I informed it of DuckDB's support for these operations and pointed out the performance consideration and it gladly generated the (long and certainly harder to interpret) SQL query, so it is clearly quite capable, just needs some prodding to go in the right direction.
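To make the trade-off concrete, here's a minimal sketch of the kind of aggregation I mean, using DuckDB's Python bindings rather than the actual Rust integration, and a made-up metrics table; the column names and the 7-day window are placeholders, not the real schema:

    # Minimal sketch: push the aggregation into DuckDB instead of iterating
    # over rows in the host language. Table and column names are hypothetical.
    import duckdb

    con = duckdb.connect()  # in-memory database, just for the example
    con.execute("CREATE TABLE metrics(ts TIMESTAMP, name TEXT, value DOUBLE)")

    query = """
    SELECT
        name,
        avg(value)          AS mean_value,
        stddev(value)       AS sd_value,        -- sample standard deviation
        var_samp(value)     AS var_value,       -- sample variance
        exp(avg(ln(value))) AS geometric_mean   -- assumes value > 0
    FROM metrics
    WHERE ts >= now() - INTERVAL 7 DAY
    GROUP BY name
    ORDER BY name
    """
    print(con.execute(query).fetchall())

Pulling the raw rows out and looping over them in the application code is exactly the slower path the initial generated version took.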
Haven't heard of docs-mcp-server, but there is the very popular Context7 with 23k Github stars and more active development:
https://github.com/upstash/context7
Great suggestion, thank you!
I found Claude Sonnet 4 really good at writing SQL if you give it a feedback loop with real data. It will research the problem, research the data, and improve queries until it finds a solution. And then it will optimize it, even optimize performance if you ask it to run explain plan or look at pg_stat_statements (Postgres).
It's outrageously good at performance optimization. There have been multiple really complex queries I've optimized with it that I'd been putting off for a long time. Claude Code figured out the exact indexes to add within seconds (not ones I would have gotten easily manually).
Try getting Claude to write a style guideline based on some of your existing manually coded work and then see if it improves using that in context.
The trick is to have it run it through sqlglot and correct the errors.
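sqlglot is a Python SQL parser/transpiler, so the loop can be as simple as parsing whatever the model emitted and handing any ParseError back to it. A rough sketch, with the dialect and the broken query only as placeholders:

    # Rough sketch: validate model-generated SQL with sqlglot before running
    # it, and feed any parse error back to the model as correction feedback.
    import sqlglot
    from sqlglot.errors import ParseError

    def check_sql(sql: str, dialect: str = "duckdb") -> str | None:
        """Return None if the SQL parses cleanly, else an error message."""
        try:
            sqlglot.parse_one(sql, read=dialect)
            return None
        except ParseError as e:
            return str(e)

    # Deliberately malformed query to show the feedback path.
    error = check_sql("SELECT name, avg(value FROM metrics GROUP BY name")
    if error:
        print("Feed this back to the model:", error)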
This kind of thing is a key point. Tell Claude Code to build the project, run linters, run the tests, and fix the errors. This (in my experience) has a good chance of filtering out mistakes. Claude is fully capable of running all of the tools, reading the output, and iterating. Higher level mistakes will need code written in a way that is testable with tests that can catch them, although you probably want that anyway.
Claude Sonnet 4 is very good at generating Cypher queries for Neo4j
I bet it highly depends on the work you do.
It is very useful for simpler tasks like writing tests, converting code bases etc where the hard part is already done.
When it comes to actually doing something hard - it is not very useful at least in my experience.
And if you do something even a bit niche, it is mostly useless, and it's faster to dig into the topic on your own than to try to have Claude implement it.
Even when I hand-roll certain things, it's still nice to have Claude Code take over any other grunt work that might come my way. And there are always yaks to shave, always.
I completed my degree over 20 years ago and, due to the dot-com bust and the path I took, never coded in a full-time role; some small bits of dev and scripting, but nothing where I would call myself a developer. I've had loads of ideas down through the years but never had the time to work on them or to learn the language/stack to complete them. Over the last 3 weeks I've been working on something small that should be ready for a beta release by the end of August. The ability to sit down and work on a feature or bug when I only have a spare 30 mins and be immediately productive without having to get in the zone is a game changer for me. Also, while I can read and understand the code, writing it would be at least 10 times slower for me. This is a small codebase that will have less than 5k lines and is not complicated, so GitHub Copilot is working well for me in this case.
I could see me paying for higher tiers given the productivity gains.
The only issue I can see is that we might end up with a society where those that can afford the best subscriptions have more free time, get more done, make more money and are more successful in general. Even current base level subscriptions are too expensive for huge percentage of the global population.
Gemini is not that good right now. GLM-4.5, which just came out, is pretty decent and very cheap. I use these with the RooCode plugin for VSCode that connects to it via OpenRouter. $10 of credits lasts a day of coding for me, whereas Claude would run that out in an hour.
I found Gemini CLI to be totally useless too. Last week I tried Claude Code with GLM4.5 (via z.ai API), though, and it was genuinely on par with Sonnet.
Thank you for the recommendation. I've been testing this on an open source project and it's indeed good. Not as good as Sonnet 4, but good enough. And the pricing is very reasonable. Don't know if I'd trust it to work on private code, but for public code it's a great option.
> I couldn't imagine paying >100$/month for a dev tool in the past, but I'm seriously considering upgrading to the Max plans.
Sadly, my experience with the Max plan has been extremely poor. It's not even comparable. I've been experimenting heavily with Claude Code over the last few weeks, spending more than $80 per day, and it's amazing. The problem is that on the Max plan you're not the one managing the context length, and this ruins the model's ability to keep things in memory. Of course this is expected, the longer the context the more expensive to run, but it's so frustrating to fail at a coding task because it's so obvious the model lost a crucial part of the context.
What do you mean, you are not managing context length? Context length is the same, 200K tokens.
My experience has been similar, over perhaps 4-6 weeks of Claude Code. My first few days were a bit rough, and I was tempted to give up and proclaim that all my skeptic's opinions were correct and that it was useless. But there is indeed a learning curve to using it. After a month I'm still learning, but I can get it to give me useful output that I'm happy committing to my projects, after reviewing it line by line.
Agreed that context and chunking are the key to making it productive. The times when I've tried to tell it (in a single prompt) everything I want it to do, were not successful. The code was garbage, and a lot of it just didn't do what I wanted it to do. And when there are a lot of things that need to be fixed, CC has trouble making targeted changes to fix issues one by one. Much better is to build each small chunk, and verify that it fully works, before moving on to the next.
You also have to call its bullshit: sometimes it will try to solve a problem in a way you know is wrong, so you have to stop it and tell it to do it in another way. I suppose I shouldn't call it "bullshit"; if we're going to use the analogy of CC being like an inexperienced junior engineer, then that's just the kind of thing that happens when you pair with a junior.
I still often do find that I give it a task, and when it's done, realize that I could have finished it much faster. But sometimes the task is tedious, and I'm fine with it taking a little longer if I don't have to do it myself. And sometimes it truly does take care of it faster than I would have been able to. In the case of tech that I'm learning myself (React, Tailwindcss, the former of which I dabbled with 10 years ago, but my knowledge is completely out of date), CC has been incredibly useful when I don't really know how to do something. I'm fine letting CC do it, and then I read the code and learn something myself, instead of having to pore over various tutorials of varying quality in order to figure it out on my own.
So I think I'm convinced, and I'll continue to make CC more and more a part of my workflow. I'm currently on the Pro plan, and have hit the usage limits a couple times. I'm still a little shy about upgrading to Max and spending $100/mo on a dev tool... not sure if I'll get over that or not.
One thing I’ve started doing is using Gemini cli as a sidecar for Claude Code to load in a huge amount of context around a set of changes to get a second opinion - it’s been pretty handy for that particular use case due to its context size advantage
Completely agree. You really have to learn how to use it.
For example, I've heard many say that doing big refactorings causes problems. I found a way that works for SwiftUI projects. I did a refactoring: moving files, restructuring large files into smaller components, and standardizing the component setup of different views.
The pattern that works for me: 1) ask it to document the architecture and coding standards, 2) ask it to create a plan for refactoring, 3) ask it to do a low-risk refactoring first, 4) ask it to update the refactoring plan, and then 5) go through all the remaining refactorings.
The refactoring plan comes with timeline estimates in days, but that is complete rubbish with Claude Code. Instead I asked it to estimate in 1) number of chat messages, 2) number of tokens, 3) cost based on number of tokens, 4) number of files impacted.
Another approach that works well is to first generate a throwaway application. Then ask it to create documentation on how to do it right, incorporating all the learnings and where it got stuck. Finally, redo the application with these guidelines and rules.
Another tip: sometimes when it gets stuck, I open the project in Windsurf and ask another LLM (e.g., Gemini 2.5 Pro, or Qwen Coder) to review the project and the problem, and then I ask Windsurf to provide me with a prompt to instruct Claude Code to fix it. Works well in some cases.
Also, biggest insight so far: don't expect it to be perfect first time. It needs a feedback loop: generate code, test the code, inspect the results and then improve the code.
Works well for SQL, especially if it can access real data: inspect the database, try some queries, try to understand the schema from your data and then work towards a SQL query that works. And then often as a final step it will simplify the working query.
I use an MCP tool with full access to a test database, so you can tell it to run explain plan and look at the statistics (pg_stat_statements). It will draw a mermaid diagram of your query, with performance numbers included (nr records retrieved, cache hit, etc), and will come back with optimized query and index suggestions.
Tried it also on CSV and Parquet files with DuckDB; it will run the explain plan, compare both queries, explain why Parquet is better, see that the query is doing predicate pushdown, etc.
Also, when it gets things wrong, instead of inspecting the code, I ask it to create a design document with mermaid diagrams describing what it has built. Quite often that quickly shows some design mistake that you can ask it to fix.
Also, with multiple tools on the same project, you have the problem of each using its own way of keeping track of the plan. I asked Claude Code to come up with rules for itself and Windsurf to collaborate on a project. It came back with a set of rules for CLAUDE.md and .windsurfrules on which files to have, and how to use them (PLAN.md, TODO.md, ARCHITECTURE.md, DECISION.md, COLLABORATION.md).
Claude code is great until it isn’t. You’re going to get to a point where you need to modify something or add something… a small feature that would have been easy if you wrote everything, and now it’s impossible because the architecture is just a mishmash of vibe coded stuff you don’t understand.
The people successfully using Claude Code for big projects aren’t letting it get to the point where they don’t understand what it wrote.
The best results come from working iteratively with it. I reject about 1/3 of edits to request some changes or a change of direction.
If you just try to have it jam on code until the end result appears to work then you will be disappointed. But that’s operator error.
So far I'm bullish on subagents to help with that: validating completion status, bullshit detection, catching over-engineering, etc. I can load them with extra context like conventions and specific prompts to clamp down on the Claude-isms during development.
Remind me in two years...
I understand completely what you're saying. But with the delusions that management is under right now, you're just going to seem like someone that's resisting the flow of code and becoming a bottleneck.
The trick is to ask it to do more narrow tasks and design the structure of the code base yourself.
This. It helps to tell it to plan and to then interrogate it about that plan, change it to specification etc. Think of it as a refinement session before a pairing session. The results are considerably better if you do it this way. I've written kubernetes operators, flask applications, Kivy applications, and a transparent ssh proxy with Claude in the last two months, all outside of work.
It also helps to tell it to write tests first: I lean towards integration tests for most things but it is decent at writing good unit tests etc too. Obviously, review is paramount if TDD is going to work.
As a hobbyist coder, the more time I spend brainstorming with all the platforms about specs and tests and architecture, the better the ultimate results.
Having used Claude Code extensively for the last few months, I still haven't reached this "until it isn't" point. Review the code that comes out. It goes a long way.
>> Review the code that comes out. It goes a long way.
Sure, but if I do 5 reviews for a task - in 99% of cases it is net negative as it is faster to DIY it at that point. Harder for sure, but faster.
Maybe our brains are wired different but reading and reviewing code is way faster for me than writing it.
There are very few objective ways to measure review 'performance'.
Coding is easy, it works or doesn't.
This ignores a bunch of higher level and long-term concepts like maintenance, complexity and extensibility.
With only "it works" you end up like this (iconic HN comment on Oracle codebase): https://news.ycombinator.com/item?id=18442637
Yes, my point is that you don't even have "it compiles" as a way to measure a code review. Maybe you did a great job, maybe you did a terrible job, how do you tell?
Or it works till it doesn't. There is a lot of code that will work until some condition is met.
You're misusing the tool starting from not giving clear instructions.
How can you end up with code you don't understand, if you review anything it writes? I wouldn't let it deviate from the architecture I want to have for the project. I had problems with junior devs in the past, too eager to change a project, and I couldn't really tell them to stop (need to work on my communication skills). No such problem with Claude Code.
I don’t remember what architecture was used by PRs I reviewed a month ago. I remember what architecture I designed 15 years ago for projects I was part of.
I've only used the agentic tools a bit, but I've found that they're able to generate code at a velocity that I struggle to keep in my head. The development loop also doesn't require me to interact with the code as much, so I have worse retention of things like which functions are in which file, what helper functions already exist, etc.
It's less that I can't understand, and more that my context on the code is very weak.
Ask it to document the code with design documents and mermaid diagrams. Much faster to review.
I might have to try this. Without having tried it, it feels like the context I think I lack is more nitty gritty than would be exposed like this. It's not like I'm unsure of how a request ends up in a database transaction, but more "do we need or already have an abstraction over paging in database queries?". It doesn't feel like mermaid diagrams or design documents would include that, but I'm open to being wrong there.
If you have a question like that, just ask.
So you're telling me that reading is the same as writing? In terms of the brain actually consuming and processing the info you gave it and storing it?
> a mishmash of vibe coded stuff you don’t understand.
No, there is a difference between "I wrote this code" and "I understand this code". You don't need to write all the code in a project to understand it. Otherwise writing software in a team would not be a viable undertaking.
Yes, the default when it does anything is to try and create. It will read my CLAUDE.md file, it will read the code that is already there, and then it will try to write it again. I have had this happen many times (today, I had to prompt 5/6 times to read the file as a feature had already been implemented).
...and if something is genuinely complex, it will (imo) generally do a bad job. It will produce something that looks like it works superficially, but as you examine it will either not work in a non-obvious way or be poorly designed.
Still very useful but to really improve your productivity you have to understand when not to use it.
How do you write complex code as a human? You create abstraction layers, right?
Why wouldn't that work with an llm? It takes effort, sure, but it certainly also takes effort if you have to do it "by hand"?
English is much less expressive compared to code. Typing the keys was never the slow part for senior developers.
It does work with an LLM, but you're reinventing the wheel with these crazy markup files. We created a family of languages to express how to move bits around, and replacing that with English is silly.
Vibe coding is fast because you’re ok with not thinking about the code. Anytime you have to do that, an LLM is not going to be much faster.
Because it creates the wrong layers.
In theory, there is no reason why this is the case. For the same reason, there is no reason why juniors can't create perfect code first time...it is just the tickets are never detailed enough?
But in reality, it doesn't work like that. The code is just bad.
You are responsible for the layers. You should either do the design on your own, or let the tool ask you questions and guide you. But you should have it write down the plan, and only then you let it code. If it messes up the code, you /clear, load the plan again and tell it to do the code differently.
It's really the same with junior devs. I wouldn't tell a junior dev to implement a CRM app, but I can tell the junior dev to add a second email field to the customer management page.
You're not setting good enough boundaries or reviewing what it's doing closely enough.
Police it, and give it explicit instructions.
Then after it's done its work, prompt it with something like "You're the staff engineer or team lead on this project, and I want you to go over your own git diff like it's a contribution from a junior team member. Think critically and apply judgement based on the architecture of the project described in @HERE.md and @THERE.md."
Ah yes…the old “you’re holding it wrong”. The problem is these goddamn things don’t learn, so you put in the effort to police it…and you have to keep doing that until the end of time. Better off training someone off the street to be a software engineer.
Yes, sometimes you are actually indeed holding it wrong. Sometimes a product has to be used in a certain way to get good results. You're not going to blame the shampoo when someone uses only a tiny drop of it, and the hair remains dirty.
This is still early days with LLMs and coding assistants. You do have to hold them in the right way sometimes. If you're not willing to do that, or think that provides less value than doing it another way... great, good for you, do it the way you think is best for you.
I've been a coding assistant skeptic for a long time. I just started playing with Claude Code a month or so ago. I was frustrated for a bit until I learned how to hold it the right way. It is a long, long way from being a substitute for a real human programmer, but it's helpful to me. I certainly prefer it to pair programming with a human (I hate pair programming), so this provides value.
If you don't care to figure out for yourself if it can provide you value, that's your choice. But this technology is going to get better, and you might later find yourself wishing you'd looked into it earlier. Just like any new tool that starts out rough but eventually turns out to be very useful.
Your claude.md (or equivalent) is the best way to teach them. At the end of any non-trivial coding session, I'll ask for it to propose edits/additions to that file based on both the functional changes and the process we followed to get there.
How do I distill 30 years of experience/knowledge into a Claude.md file? People learn, LLMs don't - end of story.
> People learn, LLMs don't - end of story.
That's not the end of the story, though. LLMs don't learn, but you can provide them with a "handbook" that they read in every time you start a new conversation with them. While it might take a human months or years to learn what's in that handbook, the LLM digests it in seconds. Yes, you have to keep feeding it the handbook every time you start from a clean slate, and it might have taken you months to get that handbook into the complete state it's in. But maybe that's not so bad.
The good thing about this process its it means such a handbook functions as documentation for humans too, if properly written.
Claude is actually quite good at reading project documentation and code comments and acting on them. So it's also useful for encouraging project authors to write such documentation.
I'm now old enough that I need such breadcrumbs around the code to get context anyways. I won't remember why I did things without them.
The same way you program...
Break apart your knowledge into relevant chunks for Claude so that you only have what's useful in its context window.
It's just a tool, not an intelligence or a person.
You use it to make your job easier. If it doesn't make your job easier, you don't use it.
Anybody trying to sell you on a bill of goods that this is somehow "automating away engineers" and "replacing expensive software developers" is either stupid or lying (or both).
I find it incredibly useful, but it's garbage-in, garbage-out just like anything else with computers. If your code base is well commented and documented and laid out in a consistent pattern, it will tend to follow that pattern, especially if it follows standards. And it does better in languages (like Rust) that have strict type systems and coding standards.
Even better if you have rigorous tests for it to check its own work against.
They don't learn by themselves, but you can add instructions as they make mistakes, which is effectively them learning. You have to write code review feedback for juniors, so that's not an appreciable difference.
> Better off training someone off the street to be a software engineer.
And that person is going to quit and you have to start all over again. They also cost at least 100x the price.
> They also cost at least 100x the price.
Because right now AI companies are losing their asses - it costs significantly more than what they are charging.
I've been telling people: this is Uber in 2014. You're getting a benefit that's being paid for with venture capital money, and this is about as good as it's going to get.
True but the tech is improving so fast that in a year we can probably get equivalent performance for 10-100x cheaper
Incorrect, the hardware is not improving so fast that it's getting 10-100x cheaper.
The world is not a zero-sum game.
Not so. Adding to context files helps enormously. Having touchstone files (ARCHITECTURE.md) you can reference helps enormously. The trick is to steer, and create the guardrails.
Honestly, it feels like DevOps had a kid with Product.
> Honestly, it feels like DevOps had a kid with Product.
You've just described a match made in hell. DevOps - let's overcomplicate things (I'm looking at you K8s) and Product - they create pretty screenshots and flows but not actually think about the product as a system (or set of systems.)
Maybe instead try opencode or crush with Gemini/Google auth when your Claude Code hits the limit.
Gemini is shockingly, embarrassingly, shamefully bad (for something out of a company like Google). Even the open models like Qwen and Kimi are better on opencode.
In my experience, Gemini is pretty good in multishotting. So just give it a system prompt, some example user/assistant pairs, and it can produce great results!
And this is its biggest weakness for coding. As soon as it makes a single mistake, it's over. It somehow has learned that during this "conversation" it's having, it should make that mistake over and over again. And then it starts saying things like "Wow, I'm really messing up the diff format!"
Ah, I was thinking the Gemini CLI agent itself might be responsible for the problems, hence the suggestion to try the opencode/Gemini combo instead.
I'd like to mess around with "opencode + Copilot free-tier auth" or "{opencode|crush} + some model via Groq (still free?)" to see what kind of mileage I can get and whether it's halfway decent.
> I have about two weeks of using Claude Code and to be honest, as a vibe coding skeptic, I was amazed.
And, yet, when I asked it to correct a CMake error in a fully open source codebase (broken dependency declaration), it couldn't work it out. It even started hallucinating version numbers and dependencies that were so obviously broken that at least it was obvious to me that it wasn't helping.
This has been, and continues to be, my experience with AI coding. Every time I hit something that I really, really want the AI to do and get right (like correcting my build system errors), it fails and fails miserably.
It seems like everybody who sings the praises of AI coding all have one thing in common--Javascript. Make of that what you will.
This is typically the outcome, when you have it look at a generic problem and fix it, especially if the problem depends on external information (like specific version numbers, etc). You have to either tell it where to look it up, or ask it to ask you questions how things need to be resolved. I personally use it to work on native code, C++ (with CMake), Zig, some Python. Works fine.
Hook up Zen MCP to OpenRouter and use Cerebras inference with Kimi K2 and Qwen3 480B at 2k tok/sec.
What exactly have you written with Claude Code?
I have not tried it, for a variety of reasons, but my (quite limited, anecdotal, and gratis) experience with other such tools is, that I can get them to write something I could perhaps get as an answer on StackOverflow: Limited scope, limited length, address at most one significant issue; and perhaps that has to do with what they are trained on. But that once things get complicated, it's hopeless.
You said Claude Code was significantly better than some alternatives, so better than what I describe, but - we need to know _on what_.
Not with Claude Code but with Cursor using Claude Sonnet 4 I coded an entire tower defense game, title, tutorial, gameplay with several waves of enemies, and a “rewind time” mechanic. The whole thing was basically vibe coded, I touched maybe a couple dozen lines of code. Apparently it wasn’t terrible [0]
[0] https://news.ycombinator.com/item?id=44463967
I've been working on the design of a fairly complicated system using the daffy robots to iterate over a bunch of different ideas. Trying things out (conceptually) to explore the pros and cons of each decision before even writing a single line of code. The code is really a formality at this point as each and every piece is laid out and documented.
Contrast this with the peg parser VM it basically one-shotted but needed a bunch of debug work. A fuzzy spec (basically just the lpeg paper) and a few iterations and it produced a fully tested VM. After that the AST -> Opcode compiler was super easy as it just had to do some simple (fully defined by this point) transforms and Bob's your uncle. Not the best code ever but a working and tested system.
Then my predilection for yak shaving took over as the AST needed to be rewritten to make integration as a python C extension module viable (and generated). And why have separate AST and opcode optimization passes when they can be integrated? Oh, and why even have opcodes in the first place when you can rewrite the VM to use Continuation Passing Style and make the entire machine AST-> CPS Transform -> Optimizer -> Execute with a minimum of fuss?
So, yeah, I think it's fair to say the daffy robots are a little more than a StackOverflow chatbot. Plus, what I'm really working on is a lot more complicated than this, needing to redo the AST was just the gateway drug.
I was the same and switched to the first Max plan. It's very efficient with token usage for what I've been trying so far.
I haven't used Claude Code - but have been using Amp a lot recently. Amp always hits on target. They created something really special.
Has anyone here used both Claude Code and Amp and can compare the two's effectiveness? I know one is CLI and the other an editor extensions. I'm looking for comparisons beyond that. Thanks!
I don’t know why amp isn’t talked about more. It’s better than Claude code.
It burns through credits too quickly. As a previous Sourcegraph Cody user, I tried Amp first, but I spent tens of dollars every day during the trial, and that was with an eye on the usage. It felt horrible seeing how I pay mostly for its mistakes and the time it takes debugging. With CC, I can let go of the anxiety. I get several hours a day out of the Claude Pro plan and that's mostly good enough for now. If it's not, I'll upgrade to Max, as at $100 that's still less than what I'd have spent on Amp.
That's the thing for me too: I don't want to pay for the agent's mistakes, even if those mistakes are in part the fault of my prompt. I'm fine with having usage limits if it means I pay a fixed cost per month. Not sure how long this will last, considering how expensive all this is for the companies to run, though.
I feel like Amp's costs are actually in line with Sourcegraph's costs, and eventually Anthropic, OpenAI, et al. will all be charging a lot more than they are now.
It's the classic play to entice people to something for low cost, and then later ramp it up once they're hooked. Right now they can afford to burn VC money, but that won't last forever.
My AMP bill is less than Claude Code but I’m getting more work done.
Check out openrouter.ai. You can pay for credits that get used per prompt instead of forking out a fixed lump sum, and it rotates keys so you can avoid being throttled. You can even use the same credits on any model in their index.
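OpenRouter exposes an OpenAI-compatible endpoint, so a minimal sketch of spending those credits from Python looks something like this (the model slug here is just an example; any model in their index should work):

    # Minimal sketch: OpenRouter is OpenAI-API-compatible, so the standard
    # openai client works by pointing base_url at openrouter.ai.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="anthropic/claude-sonnet-4",  # example slug; swap in any model
        messages=[{"role": "user", "content": "Summarize this diff in one line."}],
    )
    print(resp.choices[0].message.content)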
After 25 years, do you think this is the climax of your career? Where do you go from here? Just Claude code until the end of your days?
This is not actually such a big change for me. I've been doing mostly architecture for several years now. Thinking about the big picture, how things fit together, and how to make complex things simple is what I care about. I've been jokingly calling what I do "programming without coding" even before the current AIs existed. It's just that now I have an extra tool I can use for writing the code.
The more I use it the more I realise my first two weeks and the amazement I felt were an illusion.
I'm not going to tell you it's not useful, it is. But the shine wears off pretty fast, and when it does, you're basically left with a faster way to type. At least in my experience.
It's amazing, but it's dumb.
I feel like Cursor gives the same experience without having to be in the terminal. I don't see how Claude Code is so much better
I really don't know what it is, but Claude Code just seems like an extremely well tuned package. You can have the same core models, but the internal prompts matter, how they look up extra context matters, how easy it is to add external context matters, how it applies changes matters, how eager it is to actually use an external tool to help you matters. With Claude Code, it just feels right. When I say I want a review, I get a review; when I want code, I get code; when I want just some git housekeeping, I get that.
I haven't found massive performance differences between tools that use the same underlying LLM.
The benefit of Claude Code is that you can pay a fixed monthly fee and get a lot more than you would with API requests alone.
That has not been my experience; Copilot using Claude is way different than Claude Code for me. Anecdotal, and "vibes" based, but it's what I've been experiencing.
I use neovim so claude code makes more sense to me. I think having the code agent independent from the code editor is a plus.
I use vim for most of my development, so I'm always in a terminal anyway. I like my editor setup, and getting the benefits of a coding assistant without having to drastically change my editor has huge value to me.
The native tool use is a game changer. When I ask it to debug something it can independently add debug logging to a method, run the tests, collect the output, and code based off that until the tests are fixed.
I went on a bit of a YouTube frenzy last weekend to get an overview of agentic tools.
A lot of people who have used both are saying that Cursor is much worse than Claude Code.
Having spent a couple of weeks putting both AIDE-centric (Cursor, Windsurf) and CLI-centric (Claude Code, OpenAI Codex, Gemini CLI) options through real-world tasks, Cursor was one of the least effective tools for me. I ultimately settled on Claude Code and am very happy with it.
I realized Claude Code is the abstraction level I want to work at. Cursor et al still stick me way down into the code muck, when really I only want to see the code during review. It's an implementation detail that I still have to review because it makes mistakes, even when guided perfectly, but otherwise I want to think in interfaces, architecture, components. The low-level code, don't care. Is it up to spec and conventions, does it work? Good enough for me.
I put Cursor as 4th of the tools I have tried. Claude Code, Junie, and Copilot all do work that I find much more acceptable.
Claude Code is ahead of anything else, in a very noticeable way. (I've been writing my own cli tooling for AI codegen from 2023 - and in that journey I've tried most of the options out there. It has been a big part of my work - so that's how I know.)
I agree with many things that the author is doing:
1. Monorepos can save time
2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.
3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.
4. Types help (a lot!). Linters help as well. These are guard rails.
5. Put external documentation inside project docs, for example in docs/external-deps.
6. And finally, like every tool it takes time to figure out a technique that works best for you. It's arguably easier than it was (especially with Claude Code), but there's still stuff to learn. Everyone I know has a slightly different workflow - so it's a bit like coding.
I vibe coded quite a lot this week. Among the results is Permiso [1], a super simple GraphQL RBAC server. It's nowhere close to being well tested and reviewed, but it can already be quite useful if you want something simple (and can wait until it's reviewed.)
[1]: https://github.com/codespin-ai/permiso
> 2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.
Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.
> 3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.
Ironically, I've been struggling with this. For best results I've found Claude does best with a test hook, but then Claude loses the ability to write tests before the code works to validate bugs/assumptions; it just starts auto-fixing things and can get a bit wonky.
It helps immensely to ensure it doesn't forget anything or abandon anything, but it's equally harmful at certain design/prototype stages. I've taken to having a flag where I can enable/disable the test behavior lol.
I will start with a basic markdown outline and then use a prompt describing more of the system in just flowing (yet coherent) thought and, crucially, I'll ask the model to "Organize the spec in such a way that an LLM can best understand it and make use of it." The result is a much more succinct document with all the important pieces.
(or -- you can write a spec that is still more fleshed out for humans, if you need to present this to managers. Then ask the LLM to write a separate spec document that is tailored for LLMs)
> Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.
Yes. I write the outline in markdown. And then get AI to flesh it out. Then I generate a project structure, with stubbed API signatures. Then I keep refining until I've achieved a good level of detail, including full API signatures and database schemas.
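To give a rough idea of what the stubbed signatures look like at that stage, here's a sketch; these particular types and functions are made up for illustration, not taken from Permiso or any real project:

    # Hypothetical stubs generated from the spec: signatures, types and
    # docstrings only, so the agent has a fixed surface to implement against.
    from dataclasses import dataclass

    @dataclass
    class Role:
        id: str
        name: str
        permissions: list[str]

    def create_role(name: str, permissions: list[str]) -> Role:
        """Create a role; must reject duplicate names per the spec."""
        raise NotImplementedError

    def assign_role(user_id: str, role_id: str) -> None:
        """Attach a role to a user; idempotent per the spec."""
        raise NotImplementedError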
> Ironically i've been struggling with this. For best results i've found claude to do best with a test hook, but then claude loses the ability to write tests before code works to validate bugs/assumptions, it just starts auto fixing things and can get a bit wonky.
I generate a somewhat basic prototype first. At which point I have a good spec, and a good project structure, API and db schemas. Then continuously refine the tests and code. Like I was saying, types and linting are also very helpful.
What kind of projects are more suitable for this approach? Because my workflow, sans LLM agents, has been to rely on frameworks to provide a base abstraction for me to build upon. The hardest part is to nail down the business domain, done over rounds of discussions with stakeholders. Coding is pretty breezy in comparison.
That's why you see such a difference in time saved using LLMs for programming across the population. If you have all the domain knowledge and the problem is generic enough it's a 100x multiplier. Otherwise your experience can easily range from 0.1x to 10x.
I don't even write the outline myself. I tell CC to come up with a plan, and then we iterate on that together with CC and I might also give it to Gemini for review and tell CC to apply Gemini's suggestions.
Playwright is such a chore with Claude, but I'm afraid to live without it. Every feature seems to involve about 70% of the time spent fixing its Playwright mess. It struggles with running the tests, basic data setup and cleanup, auth, and just basic best practices. I have a testing guide that outlines all this, but it half-asses everything.
> 1. Monorepos can save time
Yes they can save you some time, but at the cost of Claude's time and lots of tokens making tool calls attempting to find what it needs to find. Aider is much nicer, from the standpoint that you can add the files you need it to know about, and send it off to do its thing.
I still don't understand why Claude is more popular than Aider, which is by nearly every measure a better tool, and can use whatever LLM is more appropriate for the task at hand.
> Aider is much nicer, from the standpoint that you can add the files you need it to know about, and send it off to do its thing.
As a user, I don't want to sit there specifying about 15-30 files, then realize that I've missed some and that it ruins everything. I want to just point the tool at the codebase and tell it: "Go do X. Look at the current implementation and patterns, as well as the tests, alongside the docs. Update everything as needed along the way, here's how you run the tests..."
Indexing the whole codebase into Qdrant might also help a little.
I think it makes sense to want that, but at least for me personally I’ve had dramatically better overall results when manually managing the context in Aider than letting Claude Code try to figure out for itself what it needs.
It can be annoying, but I think it both helps me be more aware of what’s being changed (vs just seeing a big diff after a while), and lends itself to working on smaller subtasks that are more likely to work on the first try.
You get much better results in CC as well if you're able to give the relevant files as a starting point. In that regard these two tools are not all that different.
Aider does know the whole repository tree (it scans the git index). It just doesn't read the files until you tell it to. If it thinks it needs access to a file, it will prompt you to add it. I find this to be a fairly good model. Obviously it doesn't work off line though.
Because it works.
Honestly, it's just this. "Claude the bar button on foo modal is broken with a failed splork". And CC hunts down foo.ts, traces that it's an API call to query.ts, pulls in the associated linked model, traces the api/slork.go and will as often as not end up with "I've found the issue!" and fix it. On a one sentence prompt. I think it's called an "Oh fuck" moment the first time you see this work. And it works remarkably reliably. [handwave caveats, stupid llms, etc]
As an alternative to monorepos, you can add another repo to your workspace by informing it relevant code is located at XXX path on your machine. Claude will add that code to your workspace for the session.
> Aider is much nicer, from the standpoint that you can add the files you need it to know about, and send it off to do its thing.
Use /add-dir in Claude
Agreed, for CC to work well, it needs quite a bit of structure
I’ve been working on a Django project with good tests, types and documentation. CC mostly does great, even if it needs guidance from time to time
I also recently started a side project to try to run CC offline with local models. Got a decent first version running with the help of ChatGPT, then decided to switch to CC. CC has been constantly trying to avoid solving the most important issues, sidestepping errors, and for almost everything just creating a new file/script with a different approach (instead of fixing or refactoring the current code).
I've also found that structure is key instead of trusting its streams of consciousness.
For unit testing, I actually pre-write some tests so it can learn what structure I'm looking for. I go as far as to write mocks and test classes that *constrain* what it can do.
With constraints, it does a much better job than if it were just starting from scratch and improvising.
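As a rough illustration of what I mean by a constraining test (the `RateLimiter` class and its constructor signature here are hypothetical, just stand-ins for whatever interface you want to pin down):

```python
# Hypothetical sketch of a pre-written, constraining test: the fixture, the fake
# clock, and the assertions pin down the interface and behaviour I want, so the
# model fills in the implementation rather than inventing its own design.
import pytest

from ratelimiter import RateLimiter  # hypothetical module the model must implement


class FakeClock:
    """Deterministic clock so the test never sleeps."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


@pytest.fixture
def limiter():
    clock = FakeClock()
    # The constructor signature is part of the constraint.
    return RateLimiter(max_calls=2, per_seconds=1.0, clock=clock), clock


def test_allows_up_to_limit_then_blocks(limiter):
    rl, clock = limiter
    assert rl.allow() is True
    assert rl.allow() is True
    assert rl.allow() is False   # third call inside the window is rejected
    clock.now += 1.1             # advance past the window
    assert rl.allow() is True
```

With the fixture, fake clock, and assertions already written, the model's job shrinks to "make this pass" rather than improvising its own design.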
There's a numerical optimization analogy to this: if you just ask a solver to optimize a complicated nonlinear (nonconvex) function, you will likely get stuck or hit a local optimum. But if you carefully constrain its search space, and guide it, you increase your chances of getting to the optimum.
LLMs are essentially large function evaluators with a huge search space. The more you can herd it (like herding a flock into the right pen), the better it will converge.
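A toy version of that analogy in code (the function, starting points, and bounds are invented purely to show how constraining the search region changes where the solver lands):

```python
# Toy illustration: an unconstrained solver falls into the nearest local minimum,
# while bounding the search region (plus a sensible starting point) lets it reach
# the global one.
from scipy.optimize import minimize

def f(x):
    # Double-well function: local minimum near x = +1, global minimum near x = -1.
    return (x[0] ** 2 - 1) ** 2 + 0.3 * x[0]

stuck = minimize(f, x0=[2.0])                                             # lands near +1
guided = minimize(f, x0=[-0.5], bounds=[(-2.0, 0.0)], method="L-BFGS-B")  # lands near -1
print(stuck.x, guided.x)
```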
> Put external documentation inside project docs
Most projects have their documentation on their website. Do you spend time formatting it into a clean Markdown file?
The real power of Claude Code comes when you realise it can do far more than just write code.
It can, in fact, control your entire computer. If there's a CLI tool, Claude can run it. If there's not a CLI tool... ask Claude anyway, you might be surprised.
E.g. I've used Claude to crop and resize images, rip MP3s from YouTube videos, trim silence from audio files, the list goes on. It saves me incredible amounts of time.
I don't remember life before it. Never going back.
You probably want to give Claude a computer. I'm not sure you always want to give it your computer unless you're in the loop.
We have Linux instances in cloud VMs, each running an IDE, that we can access through the browser at https://brilliant.mplode.dev. Personally I think this is closer to the ideal UX for operating an agent (our environment doesn't install agents by default yet, but you should be able to just install them manually). You don't have to do anything to set up terminal access or SSH except sign in and wait for your initial instance to start, and once you have any instance provisioned it automatically pauses and resumes based on whether your browser has it open. It's literally Claude + a personal Linux instance + an IDE that you can just open from a link.
Pretty soon I should be able to run as many of these at a time as I can afford, and control all of their permissions/filesystems/whatever with JWTs and containers. If it gets messed up or needs my attention, I open it with the IDE as my UI and can just dive in and fix it. I don't need a regular Linux desktop environment or UI or anything. Just render things in panes of the IDE, or launch a container serving a webapp doing what I want and open it instead of the IDE. I haven't ever felt this excited about tech progress.
I got it to diagnose why my Linux PC was crashing. It did a lot of journalctl grepping on my behalf and I was glad for its help. I think it may have helped fix it, but we'll see.
I was having a kernel panic on boot, and I would work around it by loading the previous kernel. It turns out I had just run out of space on my boot partition, but in my initial attempts to debug and fix it I had gotten into a broken package state.
I handed it the reins just out of morbid curiosity, and because I couldn't be bothered continuing for the night, but to my surprise (and with my guidance step by step) it did figure it all out. It found unused kernels; when uninstalling them didn't remove them, it deleted them with rm. It then helped resolve the broken package state, and eventually I was back in a clean working state.
Importantly though, it did not know it hadn't actually cleaned up the boot partition initially. I had to insist that it had not in fact just freed up space, and that it would need to remove them.
I learned the hard, old-fashioned way how to build an imagemagick/mogrify command. Having AI tools assist saves a crazy amount of time.
Completely agree. Another use case is a static site generator. I just write posts with whatever syntax I want and tell Claude Code to make it into a blog post in the same format. For example, I can just write in the post “add image image.jpeg here” and it will add it - much easier than messing around with Markdown or Hugo.
Beyond just running CLI commands, you can have CC interact with them. E.g. I built this little tool that gives CC a tmux-cli command (a convenience wrapper around tmux) that lets it interact with CLI applications, monitor them, etc.:
https://github.com/pchalasani/claude-code-tools
For example this lets CC spawn another CC instance and give it a task (way better than the built-in spawn-and-let-go black box), or interact with CLI scripts that expect user input, or use debuggers like Pdb for token-efficient debugging and code-understanding, etc.
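If you want a feel for the underlying trick without the tool (this is not the linked tool's actual API, just a bare-bones sketch; the script name is hypothetical):

```python
# Minimal sketch: drive an interactive CLI program through a detached tmux
# session, so an agent can send keystrokes and read the screen instead of
# being stuck waiting on stdin. Assumes tmux is installed.
import subprocess
import time

def tmux(*args):
    return subprocess.run(["tmux", *args], capture_output=True, text=True, check=True)

SESSION = "agent-work"

# Start pdb on a (hypothetical) script inside a detached session.
tmux("new-session", "-d", "-s", SESSION, "python3 -m pdb buggy_script.py")
time.sleep(1)

tmux("send-keys", "-t", SESSION, "break 42", "Enter")   # set a breakpoint
tmux("send-keys", "-t", SESSION, "continue", "Enter")
time.sleep(1)

screen = tmux("capture-pane", "-t", SESSION, "-p").stdout  # read what pdb printed
print(screen)

tmux("kill-session", "-t", SESSION)
```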
It's the automator's dream come true. Anything can be automated, anything scripted, anything documented. Even if we're going to use other (possibly local) models in the future, this will be my interface of choice. It's so powerful.
Yes, Claude has killed XKCD 1319:
https://xkcd.com/1319/
Automation is now trivially easy. I think of another new way to speed up my workflow — e.g. a shell script for some annoying repetitive task — and Claude oneshots it. Productivity gains built from productivity gains.
This is not the xkcd I thought it would be. This is the xkcd that Claude Code makes me think of:
https://xkcd.com/1205/
I don't feel Claude Code helps one iota with the issue in 1319. If anything, it has increased the prevalence of "ongoing development" as I automate more things and create more problems to solve.
However, I have fixed up and added features to 10-year-old scripts that I never considered worth the trade-off to work on. It makes the cost of automation cheaper.
Combine with pywin32 to open up windows.
It hasn't killed anything. Might have reduced the time for some tasks. Try something not trivial and you still spend more than you save.
It's not a dream come true to have a bunch of GPUs crunching at full power to achieve your minor automation, with the company making them available losing massive amounts of money on it:
https://www.wheresyoured.at/the-haters-gui/
... while also exposing the contents of your computer to surveillance.
Well, yes there is that.
I'd like to say I'm praising the paradigm shift more than anything else (and this is to some degree achievable with smaller, open, and sometimes local agentic models), but yes, there are definitely nasty externalities (though burning VC cash is not high up that list for me). I hope some externalities can be optimized away.
But a fair comment.
Well me plus a $1200 / year subscription is still much cheaper than 2 of me.
The point is that it costs more than $1200, you're just not the one paying all the costs. It seems like there are a ton of people on HN who are absolutely pumped to be totally dependent on a tool that must rugpull them eventually to continue existing as a business. It feels like an incredible shame that the craft is now starting to become totally dependent on tools like this, where you're calling out to the cloud to do even the most basic programming task.
It's all great until:
>> I thought I would see a pretty drastic change in terms of Pull Requests, Commits and Line of Code merged in the last 6 weeks. I don’t think that holds water though
The chart basically shows the same output with Claude as before, which kinda matches what I felt when using LLMs.
You "feel" more productive and you definitely feel "better" because you don't do the work now, you babysit the model and feel productive.
But at the end of the day the output is the same, because all the advantages of LLMs are nerfed by the time you have to spend reviewing it all, fixing it, re-prompting it, etc.
And because you offload the "hard" part - and don't flex that thinking muscle - your skills decline pretty fast.
Try using Claude or another LLM for a month and then try doing a tiny little app without it. It's not only the code part that will seem hard - the general architecture/structuring will too.
And in the end the whole codebase slowly (but not that slowly) degrades, and in the longer term it results in a net negative. At least with current LLMs.
I've been exploring vibe coding lately and by far the biggest benefit is the lack of mental strain.
You don't have to try to remember your code as a conceptual whole, what your technical implementation of the next hour of code was going to be like at the same time as a stubborn bug is taunting you.
You just ask Mr. Smartybots and it delivers anything from proofreading to documentation and whatnot, with some minor fuckups occasionally.
But the mental strain is how you build skills and get better at your job over time. If it's too much mental strain, maybe your code's architecture or implementation can be improved.
A lot of this sounds like "this bot does my homework for me, and now I get good grades and don't have to study so hard!"
It's alright until you have a bug the LLM can't solve, then you have to go in the code yourself and you realize what a mess it has made.
Perhaps you set a very high quality bar, but I don't see the LLMs creating messy code. If anything, they are far more diligent in structuring it well and making it logically sequenced and clear than I would be. For example, very often I name a variable slightly incorrectly at the start and realise it should be just slightly different at the end and only occasionally do I bother to go rename it everywhere. Even with automated refactoring tools to do it, it's just more work than I have time for. I might just add a comment above it somewhere explaining the meaning is slightly different to how it is named. This sort of thing x 100 though.
> they are far more diligent in structuring it well and making it logically sequenced and clear than I would be
Yes, with the caveat: only on the first/zeroth shot. But even when they keep most/all of the code in context if you vibe code without incredibly strict structuring/guardrails, by the time you are 3-4 shots in, the model has "forgotten" the original arch, is duplicating data structures for what it needs _this_ shot and will gleefully end up with amnesiac-level repetitions, duplicate code that does "mostly the same" thing, all of which acts as further poison for progress. The deeper you go without human intervention the worse this gets.
You can go the other way, and it really does work. Set up strict types, clear patterns, clear structures. And intervene to explain + direct. The type of things senior engineers push back on in junior PRs: "Why didn't you just extend this existing data structure and factor that call into the trivially obvious extension of XYZ??"
"You're absolutely right!" etc.
> Perhaps you set a very high quality bar
Yes, of course. Do you not? Aren't you embarrassed to admit that you use AI because you don't care about quality?
I would be ashamed to think "you set a high quality bar" is some kind of critique
I haven't found such a bug yet. If it fails to debug on its second attempt I usually switch to a different model or tell it to carpet bomb the code with console logs, write test scripts and do a web search, etc.
The strength (and weakness) of these models is their patience is infinite.
Well, I don't have the patience to wait for it to find the right solution, ahah.
"mental strain" in a way of remembering/thinking hard is like muscle strain. You need it to be in shape otherwise it starts atrophying.
My friend, there’s no solid evidence that this is the case. So far, there are a bunch of studies, mostly preprints, that make vague implications, but none that can show clear causal links between a lack of mental strain and atrophying brain function from LLMs.
You're right, we only have centuries of humans doing hard things that require ongoing practice to stay sharp. Ask anyone who does something you can't fake, like playing the piano, what taking months off does to their abilities. To be fair, you can get them back much faster than someone who never had the skills to begin with, but skills absolutely atrophy if you are not actively engaged with them.
My assembly skills have atrophied terribly, and that's ok.
Using LLMs is not moving up a level of abstraction, it is removing your own brain from the abstraction altogether
I wish, but as it stands right now LLMs have to be driven and caged ruthlessly. Conventions, architecture, interfaces, testing, integration. Yes, you can YOLO it and just let it cook up _something_, but that something will be an unmaintainable mess. So I'm removing my brain from the abstraction level of code (as much as I dare), but most definitely not from everything else.
We know that learning and building mental capabilities require effort over time. We know that when people have not been applying/practicing programming for years, their skills have atrophied. I think a good default expectation is that unused skills will go away over time. Of course the questions are: is the engagement we have with LLMs enough to sustain the majority of the skills? Or are there new skills one builds that can compensate for those lost (even when the LLM is no longer used)? How quickly do the changes happen? Are there wider effects, positive and/or negative?
I mostly referred to skills, not brain function itself.
I know what you're writing is the whole point of vibe coding, but I'd strongly urge you not to do this. If you don't review the code an LLM is producing, you're taking on technical debt. That's fine for small projects and scripts, but not for things you want to maintain for longer. Code you don't understand is essentially legacy code. LLM output should be bent to our style and taste, and ideally look like our own code.
If that helps, call it agentic engineering instead of vibe coding, to switch to a more involved mindset.
Not for me. I just reverse engineered a Bluetooth protocol for a device, which would have taken me at least a few days capturing streams of data in Wireshark. Instead I dumped entire captures into an LLM and it gave me much more control finding the right offsets etc. It took me only a day.
That's not really coding though is it.
No point comparing apples with oranges, most of us don't program by reverse engineering using wireshark.
Do you have a decent backup system? Or do you make use of sandboxes?
> It can, in fact, control your entire computer.
Honestly, that sounds like malware.
I also don’t understand how everybody is installing these packages with npm -g and not worrying about anything
No because _you_ are in control. Not the AI.
Maybe you do today. Will that always be the case? People running Windows do not have as much control over their systems as they should. What does enshittification look like for AI?
Why should I trust any of these AI vendors?
> If there's a CLI tool, Claude can run it. If there's not a CLI tool... ask Claude anyway, you might be surprised.
No Claude Code needed for that! Just hang around r/unixporn and you'll collect enough scripts and tips to realize that mainstream OSes have pushed computers from being a useful tool to a consumerist toy.
That's like saying "you don't need a car, just hang around this bicycle shop long enough and you'll realize you can exercise your way around the town!"
The simple task of unzipping with tar is cryptic enough that collecting Unix scripts from random people is definitely something people don't want to do in 2025.
It's cryptic until you sit down to learn it
tar xzvf <filename> -> "tar eXtract Zipped Verbose File <filename>"
tar czvf <filename> <files> -> "tar Compress Zipped Verbose File <filename> <files>"
So it's either x or c if you want to unzip a tar file or create one
Add z if the thing you're uncompressing ends in .gz, or if you want the thing you're creating to be compressed.
v is just if you want the filenames to be printed to stdout as they're extracted or imported
f tells it to read from or output to a file, as opposed to reading from stdin or output to stdout
Then you provide the filename to read from/export to, and if you're exporting you then provide the file list.
Not that hard.
Script? I haven't used anything more complex than "tar xzf file" in a decade.
Remembering one thing is easy, remembering all the things is not. With an agentic CLI I don't need to remember anything, other than if it looks safe or not.
But how will you remember to use the agentic CLI?
No one remembers all the things. It's usually ctrl+r (history search) or writing it down in some script or alias.
Ah yes, the "I've never needed X, so clearly no one else in the world will ever need X" rationale. So bulletproof.
The point is not that a tool maybe exists. The point is: You don't have to care if the tool exists and you don't have to collect anything. Just ask Claude code and it does what you want.
At least that's how I read the comment.
I've been using Claude Code 12-16 hours a day since I first got it running two weeks ago. Here are the tips I've discovered:
1. Immediately change to sonnet (the cli defaults to opus for max users). I tested coding with opus extensively and it never matches the quality of sonnet.
2. Compacting often ends progress - it's difficult to get back to the same quality of code after compacting.
3. First prompt is very important and sets the vibe. If your instance of Claude seems hesitant, doubtful, sometimes even rude, it's always better to end the session and start again.
4. There are phrases that make it more effective. Try, "I'm so sorry if this is a bad suggestion, but I want to implement x and y." For whatever reason it makes Claude more eager to help.
5. Monolithic with docker orchestration: I essentially 10x'd when I started letting Claude itself manage docker containers, check their logs for errors, rm them, rebuild them, etc. Now I can get an entirely new service online in a docker container, from zero to operational, in one Claude prompt.
5. it's not just Docker: give it the Playwright MCP server so it can see what it is implementing in the UI and inspect requests
6. start in plan mode and iterate on the plan until you're happy
7. use slash commands; they are mini prompts you can keep refining over time, including providing starting context and reminding it that it can use tools like gh to interact with GitHub
not sure I agree on 1.
2. compact when you are at a good stop, not when you are forced to because you are at 0%
> give it playwright MCP server
Or just `$ playwright`. Skip the MCP ceremony (and wasted tokens) and just have CC use CLI tools. Works remarkably well.
Use agents to validate the code: is it over-engineered, does it conform to conventions and spec, is it actually implemented or half bullshit? I run three of these at the end of a feature or task and it almost always sends Opus back to the workbench to fix a bunch of stuff. And since they have their own context, you don't blow up the main context and can go for longer.
Sometimes it's a bit too eager to mess around inside the containers, like when I ask it to understand some code sometimes it won't stop trying to run it inside the container in a myriad of ways that won't work.
It once did a container exec that piped the target file into the project's CLI command runner, which did nothing, but it gives you an example of the string of wacky ways it will insist on running code instead of just reading it.
Where are you hosting those containers? Our serverless/linux cli/browser IDE at https://brilliant.mplode.dev runs on containers in our nascent cloud platform and we’re almost ready to start serving arbitrary containers on it deployed directly from the IDE. I’m curious if there are any latency/data/auth/etc pain points you’ve been running into
I had success with creating a VERY detailed plan.md file - down to how all systems connect together, letting claude-loop[1] run while I sleep and coming back in the morning manually patching it up.
[1]: https://github.com/DeprecatedLuke/claude-loop
What are some examples of what you've got it to do?
```
# PostgreSQL Web API Project Plan

## IMPORTANT
- Use thiserror, create the primary struct ServiceError in error.rs which has all #[from], do not use custom result types or have errors for different modules, all errors should fall under this struct
- The error from above should implement IntoResponse to translate it to client error without leaking any sensitive information and so that ServiceError can be used as error type for axum

## PLAN

### Project Setup
(Completed) Set up Rust workspace with server and jwt-generator crates
(Completed) Create Cargo.toml workspace configuration with required dependencies (axum, sqlx, jsonwebtoken, serde, tokio, uuid, thiserror)
(Completed) Create compose.yaml for PostgreSQL test database with environment variables
(Completed) Design database schema in tables.sql (data table with key UUID, data JSONB, created_at, updated_at; locks table with lock_id UUID, locked_at, expires_at)

### Database Layer
(Completed) Implement database connection module with PostgreSQL connection pool
(Completed) Create database migration system to auto-deploy tables.sql if tables don't exist
(Completed) Implement data model structs for database entities (DataRecord, Lock)

### JWT System
(Completed) Create jwt-generator utility that takes secret key, permissions (read/write), and expiration time
(Completed) Implement JWT authentication middleware for server with permission validation
(Completed) Add JWT token validation and permission checking for endpoints

### Core API Endpoints
(Completed) Implement POST /set endpoint for storing/updating JSONB data with partial update support using jsonb_set
(Completed) Implement GET /get/<key> endpoint with optional sub-key filtering for partial data retrieval
(Completed) Add automatic created_at and updated_at timestamp handling in database operations

### Streaming & Binary Support
(Completed) Implement streaming bytes endpoint with compact binary format (not base64) for efficient data transfer
(Completed) Add support for returning all data if no specific format specified in GET requests

### Lock System
(Completed) Implement database-backed lock system with locks table
(Completed) Create POST /lock endpoint that tries to obtain lock for 5 seconds with UUID parameter
(Completed) Create DELETE /unlock endpoint to release locks by UUID
(Completed) Add lock timeout and cleanup mechanism for expired locks

### Error Handling & Final Polish
(Completed) Implement comprehensive error handling with proper HTTP status codes
(Completed) Add input validation for all endpoints (UUID format, JSON structure, etc.)
(Completed) Test all endpoints with various scenarios (valid/invalid data, concurrent access, lock timeouts)
```
Took 4 iterations (>30 minutes!); everything works as expected. The plan itself was partially generated with ccl, since I told it to break down tasks into smaller steps, and then with some manual edits I got it down to that final product. I later swapped locks to be built on a lease system and it handled that quite nicely as well.
You are the 5% they were talking about lol.
Surprisingly - only averaging $70 a day.
I have the main agent use Opus, and have it always call sub-agents running Sonnet. That's the best setup I've found.
I turn off auto-compacting so it's manual; that makes it easy to find a stopping point and write all context out to an md file before compacting.
First prompt isn't very important to me.
I haven't found I need special phrases. What matters is how context-heavy I can make my subagents.
> 5. Monolithic with docker orchestration: I essentially 10x'd when I started letting Claude itself manage docker containers, check their logs for errors, rm them, rebuild them, etc. Now I can get an entirely new service online in a docker container, from zero to operational, in one Claude prompt.
This is very interesting. What's your setup, and what kind of prompt might you use to get Claude to work well with Docker? Do you do anything to try and isolate the Claude instance from the rest of your machine (i.e. run these Docker instances inside of a VM) or just YOLO?
Not the parent but I've totally been doing this, too. I've been using docker compose and Claude seems to understand that fine in terms of scoping everything - it'll run "docker compose logs foo" "docker compose restart bar" etc. I've never tried to isolate it, though I tend to rarely yolo and keep an eye on what it's doing and approve (I also look at the code diffs as it goes). It's allowed to read-only access stuff without asking but everything else I look at.
I YOLO and there isn't much guidance needed, it knows how docker and compose etc. works, how to get logs, exec in etc.
"bring this online in my local docker" will get you a running service, specify further as much as you like.
I'm fascinated by #5. As someone who goes out of my way to avoid Docker while realizing its importance, I would love to know the general format of your prompt.
It's the difference between Claude making code that "looks good" and code that actually runs. You don't have to be stuck anymore saying, "hey help me fix this code." Say, "Use tmux to create a persistent session, then run this python program there and debug it until its working perfectly"
Letting Claude manage docker has been really good.
I'm working my way through building a guide for my future self on packaging up existing products, in case I forget in 6 months.
At the same time, frontier models may improve it, make it worse, or leave it the same, and what I'm after is consistency.
Irrespective of how good Claude code actually is (I haven’t used it, but I think this article makes a really cogent case), here’s something that bothers me: I’m very junior, I have a big slow ugly codebase of gdscript (basically python) that I’m going to convert to C# to both clean it up and speed it up.
This is for a personal project, I haven’t written a ton of C# or done this amount of refactoring before, so this could be educational in multiple ways.
If I were to use Claude for this, I'd feel like I was robbing myself of something that could teach me a lot (and maybe motivate me to start out with structuring my code better in the future). If I don't use Claude, I feel like I'm wasting my (very sparse) free time on a pretty uninspiring task that may very well be automated away in most future jobs, mostly out of some (misplaced? masochistic?) belief about programming craft.
This sort of back and forth happens a lot in my head now with projects.
I'm on the tail end of my 35+ year developer career, but one thing I always do with any LLM stuff is this: I'll ask it to solve something generally I know I COULD solve, I just don't feel like it.
Example: Yesterday I was working with an Open API 3.0 schema. I know I could "fix" the schema to conform to a sample input, I just didn't feel like it because it's dull, I've done it before, and I'd learn nothing. So I asked Claude to do it, and it was fine. Then the "Example" section no longer matched the schema, so Claude wrote me a fitting example.
But the key here is I would have learned nothing by doing this.
There are, however, times where I WOULD have learned something. So whenever I find the LLM has shown me something new, I put that knowledge in my "knowledge bank". I use the Anki SRS flashcard app for that, but there are other ways, like adding to your "TIL blog" (which I also do), or taking that new thing and writing it out from scratch, without looking at the solution, a few times and compiling/running it. Then trying to come up with ways this knowledge can be used in different ways; changing the requirements and writing that.
Basically getting my brain to interact with this new thing in at least 2 ways so it can synthesize with other things in your brain. This is important.
Learning a new (spoken) language uses this a lot. Learn a new word? Put it in 3 different sentences. Learn a new phrase? Create at least 2-3 new phrases based on that.
I'm hoping this will keep my grey matter exercised enough to keep going.
I'm looking forward to the day that LLMs automatically put knowledge like this into Anki or something like it.
I found that a big part of learning with Anki is creating the cards myself. Using cards from others / cards generated via LLM is not the same.
I made tools for Open-webui to add to anki. They work well I think. There are probably MCP tools too.
In my experience, if you don't review the generated code, and thus become proficient in C# enough to do that, the codebase will become trash very quickly.
Errors compound with LLM coding, and, unless you correct them, you end up with a codebase too brittle to actually be worth anything.
Friends of mine apparently don't have that problem, and they say they have the LLM write enough tests that they catch the brittleness early on, but I haven't tried that approach. Unfortunately, my code tends to not be very algorithmic, so it's hard to test.
After 16 years of coding professionally, I can say Claude Code has made me considerably better at the things that I had to bang my head against the wall to learn. For things I need to learn that are novel to me, for productivity sake, it’s been “easy come; easy go” like any other learning experience.
My two cents are:
If your goal is learning fully, I would prioritize the slow & patient route (no matter how fast “things” are moving.)
If your goal is to learn quickly, Claude Code and other AI tooling can be helpful in that regard. I have found using “ask” modes more than “agent” modes (where available) can go a long way with that. I like to generate analogies, scenarios, and mnemonic devices to help grasp new concepts.
If you’re just interested in getting stuff done, get good at writing specs and letting the agents run with it, ensuring to add many tests along the way, of course.
I perceive there’s at least some value in all approaches, as long as we are building stuff.
Yes! Valuable, fundamental, etc. - do it yourself, the slow path.
Boring, uninspiring, commodity - and most of all - easily reversible and not critical - to the LLM it goes!
When learning things intrinsic motivation makes one unreasonably effective. So if there is a field you like - just focus on it. This will let you proceed much faster at _valuable_ things which all in all is the best use of ones time in any case.
Software crafting when you are not at a job should be fun. If it’s not fun, just do the least effort that suits your purpose. And be super diligent only about the parts _you_ care about.
IMHO people who think everyone should do everything from first principles with the diligence of a swiss clocksmith are just being difficult. It’s _one_ way of doing it but it’s not the _only right way_.
Care about the important things. If a thing is not important and not interesting, just deal with it in the least painful way and focus on something value-adding.
A few years ago there was a blog post trend going around about "write your own x" instead of using a library or something. You learn a lot about how software works by writing your own version of a thing. Want to learn how client-side routing works? Write a client-side router. I think LLMs have basically made it so anything can be "library" code. So really it comes down to what you want to get out of the project. Do you want to get better at C#? Then you should probably do the port yourself. If you just want to have the ported code and focus on some other aspect, then have Claude do it for you.
Really if your goal is to learn something, then no matter what you do there has to be some kind of struggle. I’ve noticed whenever something feels easy, I’m usually not really learning much.
Before AI, there was copy paste. People who copied code from Stackoverflow without understanding it learned nothing, and I saw it up close many times. I don't see a problem with you asking for advice or concepts. But if you have it write everything for you, you definitely won't learn
That being said, you have to protect your time as a developer. There are a million things to learn, and if making games is your goal as a junior, porting GDscript code doesn't sound like an amazing use of your time. Even though you will definitely learn from it.
The difference now is that LLMs propose to provide copy+paste for everything, and for your exact scenario. At least with Stack Overflow, you usually had to adapt the answers to your specific scenario, and there often weren’t answers for more esoteric things.
I think this is a really interesting point. I have a few thoughts as a read it (as a bit of a grey-beard).
Things are moving fast at the moment, but I think it feels even faster because of how slowly things have been moving for the last decade. I was getting into web development in the mid-to-late-90s, and I think the landscape felt similar then. Plugged-in people kinda knew the web was going to be huge, but on some level we also knew that things were going to change fast. Whatever we learnt would soon fall by the wayside and become compost for the next new thing we had to learn.
It certainly feels to me like things have really been much more stable for the last 10-15 years (YMMV).
So I guess what I'm saying is: yeah, this is actually kinda getting back to normal. At least that is how I see it, if I'm in an excitable optimistic mood.
I'd say pick something and do it. It may become brain-compost, but I think a good deep layer of compost is what will turn you into a senior developer. Hopefully that metaphor isn't too stretched!
I’ve also felt what GP expresses earlier this year. I am a grey-beard now. When I was starting my career in the early 2000’s a grey-beard told me, “The tech is entirely replaced every 10 years.” This was accompanied by an admonition to evolve or die in each cycle.
This has largely been true outside of some outlier fundamentals, like TCP.
I have tried Claude code extensively and I feel it’s largely the same. To GP’s point, my suggestion would be to dive into the project using Claude Code and also work to learn how to structure the code better. Do both. Don’t do nothing.
Thx to both of you, I think these replies helped me a bit.
It does definitely seem to be, and stands to reason, that better developers get better results out of Claude et al.
You're on the right track in noticing you'll be missing valuable lessons, and this might rob you of better outcomes even with AI in the future. As it is a side project though keeping motivation is important too.
As well, you'll eventually learn those lessons through future work if you keep coding yourself. But if instead you lean more toward assistance, it is hard to say whether you would become as skilled in the raw skill of coding, and that might affect your ability to wield AI to full effect.
Having done a lot of work across many languages, including gdscript and C# for various games, I do think you'll learn a huge amount from doing the work yourself and such an opportunity is a bit more rare to come by in paid work.
How much do you care about experience with C# and porting software? If that's an area you're interested in pursuing maybe do it by hand I guess. Otherwise I'd just use claude.
Disagree entirely, and would suggest the parent intentionally dive in on things like this.
The best way to skill up over the course of one's career is to expose yourself to as broad an array of languages, techniques, paradigms, concepts, etc. as possible. So sure, you may never touch C# again. But by spending time to dig in a bit you'll pick up some new ideas that you can bring forward with you to other things you *do* care about later.
I agree here. GP should take time to learn the thing and use AI to assist in learning not direct implementation.
If there is going to be room for junior folks in SWE, it will probably be afforded to those who understand some language’s behaviors at a fairly decent level.
I’d presume they will also be far better at system design, TDD and architecture than yesterday’s juniors, (because they will have to be to drive AI better than other hopeful candidates).
But there will be plenty of what will be grey beards around who expect syntactical competence, and fwiw, if you can't read your own code, even slowly, you fail at the most crucial aspect of AI-accelerated development: validation.
Well I think you've identified a task that should be yours. If the writing of the code itself is going to help you, then don't let AI take that help from you because of a vague need for "productivity". We all need to take time to make ourselves better at our craft, and at some point AI can't do that for you.
But I do think it could help, for example by showing you a better pattern or language or library feature after you get stuck or finish a first draft. That's not cheating that's asking a friend.
For something like this, i'd ask claude code to review the project and create design and architecture documents for it.
Then i'd ask it to create a plan to recreate it in c#.
Next i'd ask claude code to generate a new project in c#, following the small steps it defined in the planning document.
Then i'd ask claude code to review its experience building the app and update the original plan document with these insights.
Then throw away the first c# project, and have another go at it. Make sure the plan includes starting with tests.
Yup, I absolutely agree with you. I've been coding professionally for around 25 years now, 10-ish before that as a hobby as a child and teenager. There's lots of stuff I know, but still lots of stuff I don't know. If my goal is to learn a new language, I'm going to build the entire thing without using a coding assistant. At most I might use Claude (not Code) to ask pointed questions, and then use those answers to write my own code (and not copy/paste anything from Claude).
Often I'll use Claude Code to write something that I know how to write, but don't feel like writing, either because it's tedious, or because it's a little bit fiddly (which I know from past experience), and I don't feel like dealing with the details until CC gives me some output that I can test and review and modify.
But sometimes, I'll admit, I just don't really care to learn that deeply. I started a project that is using Rust for the backend, but I need a frontend too. I did some React around 10 years ago, but my knowledge there (what I remember, anyway) is out of date. So sometimes I'll just ask Claude to build an entire section of a page. I'll have Claude do it incrementally, and read the code after each step so I understand what's going on. And sometimes I do tell Claude I'm not happy with the approach, and to do something differently. But in a way I kinda do not care so much about this code, aside from it being functional and maintainable-looking.
And I think that's fine! We don't have to learn everything, even if it's something we need to accomplish whatever it is we've set out to accomplish. I think the problem that you'll run into is that you might be too junior to recognize what are the things you really need to learn, and what are the things you can let something else "learn" for you.
One of the things I really worry about this current time we're in is that companies will start firing their junior engineers, with a belief (however misguided) that their senior engineers, armed with coding assistants, can be just as productive. So junior engineers will lose their normal path to gaining experience, and young adults entering college will shy away from programming, since it's hard to get a job as a junior engineer. Then when those senior engineers start to retire, there will be no one to take their places. Of course, the current crop of company management won't care; they'll have made their millions already and be retired. So... push to get as much experience as you can, and get over the hump into senior engineer territory.
Have it generate the code. Then have another instance criticize the code and say how it could be improved and why. Then ask questions to this instance about things you don't know or understand. Ask for links. Read the links. Take notes. Internalize.
One day I was fighting Claude on some core Ruby method and it was not agreeing with me about it, so I went to check the actual docs. It was right. I have been using Ruby since 2009.
As someone who is programming computers for almost 30 years and professionally for about 20 by all means do some of it manually, but leverage LLMs in tutor/coach mode, with „explain this but don’t solve it for me” prompts when stuck. Let the tool convert the boring parts once you’re confident they’re truly boring.
Programming takes experience to acquire taste for what’s right, what’s not, and what smells bad and will bite you but you can temporarily (yeah) not care. If you let the tool do everything for you you won’t ever acquire that skill, and it’s critical to judge and review your work and work of others, including LLM slop.
I agree it’s hard and I feel lucky for never having to make the LLM vs manual labor choice. Nowadays it’s yet another step in learning the craft, but the timing is wrong for juniors - you are now expected to do senior level work (code reviews) from day 1. Tough!
What's wrong with using Claude Code to write a possible initial iteration and then going back and reviewing the code for understanding? Various languages and frameworks have their own footguns, but those usually are not unfixable later on.
AFAICT you learn significantly more in building something from the ground-up than you do when code-reviewing someone else's code. In my experience you really don't build the mental model of how the code is working unless you either build it yourself, refactor it yourself, or you have to heavily debug it to fix a bug or something.
In my experience, determining what to write is harder than how to write it, so you deprive yourself of that learning if you start from generated code
I actually think this helps in that learning - it's sitting alongside a more experienced expert doing it and seeing what they came up with.
In the same sense that the best way to learn to write is often to read a lot, whether English or code. Of course, you also have to do it, but having lots of examples to go on helps.
Based on my usage of Claude Code, i would not trust it with anything so major.
My problem with it is that it produces _good looking_ code that, at a glance, looks 'correct', and occasionally even works. But then i look at it closely, and it's actually bad code, or has written unnecessary additional code that isn't doing anything, or has broken some other section of the app, etc.
So if you don't know enough C# to tell whether the C# it's spitting out is good or not, you're going to have a bad time
Cursor has made writing C++ like a scripting language for me. I no longer wrestle with arcane error messages, they go straight into Cursor and I ask it to resolve and then from its solution I learn what my error was.
Can you really use Cursor for C++? Would you mind describing your setup? How much better is it than Copilot or Windsurf?
Open your C++ project in Cursor. Before anything else ask it to review the codebase and tell you what the codebase does so you can understand its power. Play around asking it to find sections of the code that handle functionality for certain features. It should really impress you.
Continue to work on it in your preferred IDE let’s say Visual Studio. When you hit your first compile error, just for fun even if you understand the error, copy and paste it into Cursor and ask it to help you understand the cause and propose a solution. Ask it to implement it, attempt to compile, give it back any further errors that its solution may have to review and fix. You will eventually compile.
Then before you go back to work writing your next task, ask Cursor to propose how it might complete the task. After the proposal review and either tell it to proceed to implement or suggest tweaks or better alternatives. For complex tasks try setting the model manually to o3 and rerunning the same prompt and you can see how it thinks much better and can one shot solutions to complex errors. I try to use auto and if it fails on more complex tasks I resubmit the original query with o3. If o3 fails then you may have to gather more context by hand and really hold its hand through the chain of reasoning. That’s for a future post.
More advanced: Create a build.bat script that Cursor can run after it has implemented code to run and see its own errors so you can avoid the copy paste round trip. (Look into Cursor rules for this but a rule prompt that says 'after implementing any significant code changes please run .\build.bat and review and fix any further errors') This simple efficiency should allow you to experience the real productivity behind Cursor where you’re no longer dying the death of 1000 cuts losing a minute here or a minute there on rote time consuming steps and you can start to operate from a higher natural language level and really feel the ‘flow’.
Typing out the code is just an annoying implementation detail. You may feel ‘competency leaving your fingers’ as DHH might say but I’d argue you can feel your ass filling up with rocket fuel.
Doing the easy stuff is what gives you the skills to do the harder stuff that a LLM can’t do, which arguably makes this hard indeed
Is GDScript really less efficient than C# in Godot?
What bottlenecks are you experiencing?
I'm a developer experienced with Python (GDScript-like) and C#, but am new to Godot and started with GDScript.
It really depends on how much actual logic you implement in GDScript. It is really slow though, even slower than Python as far as I know. So if you're doing anything beyond gluing engine calls together (e.g. writing complicated enemy logic), it's easy to run into performance issues. The "official" way to deal with that is to create GDExtensions for the slow stuff, but at that point you might as well do everything in C# (imo).
It's easy to convince yourself that code is going to be fast enough, but games run into bottlenecks really quickly, and it also makes your stuff inaccessible to people who don't have great hardware.
It depends how you use it. You can ask Claude Code for instructions to migrate the code yourself, and it will be a teacher. Or you can ask it to create a migration plan and then execute it, in which case learning will of course be very limited. I recommend doing the conversion in smaller steps if possible. We tried to migrate a project just for fun in one single step and Claude Code failed miserably (it thought it had done a terrific job), but doing it in smaller chunks worked out quite well.
Is it really that much better than cursor’s agent? I’m hesitant to try because it would be out of pocket and I get cursor for free (work). It’s hard to understand how it could be that different if both are using sonnet under the hood.
I see a lot of comments here gushing about CC, but I've used it and I really don't get it. I find that it takes me just as long to explain to it what I need done as it takes to just do the work myself.
What's happening is that we are being bombarded by marketing on all fronts. These gushing statements are no different from the testimonials and advertorials from the days of yore.
It’s absolute lunacy to think everyone lauding Claude code is paid marketing/shilling.
What’s actually happening is there’s a divide being created between engineers that know how to use it, and engineers that don’t or want to convince themselves that it’s useless or whatever.
Group 2 will not fare well in the coming months.
I don't believe anything I see online anymore. Payola is everywhere and people are happy to sell their professional souls for little more than likes and free LLM credits.
Deeper still is this: The group that openly relies on LLMs to code is likely the first group to be replaced by LLMs, since they've already identified the recipe that the organization needs to replace them.
More broadly, we live in an age where marketing is crucial to getting noticed, where good work alone is not sufficient and you have the perfect scenario for people to market themselves out of the workforce.
There is also a group of engineers who like to... engineer stuff? I really do enjoy writing code by myself; it gives me dopamine. The reason I've learnt to talk to machines is that I don't like talking to people, so I don't fancy talking to machines as if they were human beings.
You can still do both. It's just that all of the grunt work is no longer grunt work, all the tech debt you've been putting off is no longer an issue, and all of the ideas you've been meaning to try out but haven't had the time for can suddenly be explored in an afternoon, and so on and so forth.
For new features, by all means code it by hand. Maybe that is best! But a codebase is much more than new features. Invaluable tool.
If you don't see how that fits into "group 2" in GP's comment even though it wasn't explicitly called out, then we may have identified why you don't find agentic coding to be enjoyable.
There are also those who code in languages that are not the most popular, on operating systems that are not the most popular, or frameworks that are not the most popular.
> Group 2 will not fare well in the coming months.
If it's a matter of months then latecomers will be up to speed in months as well, which isn't really that long a time.
You need to give it something to do that you don't think it will be able to do. That is where I think it starts to click.
This is a good idea. OTOH I have mixed results; Claude Code easily convinces itself to retry failed ideas in a loop.
I have been coding with Claude Code for about 3 weeks and I love it. I have about 10 YoE and mostly do Python ML / Data Eng. Here are a few reasons:
1. It takes away the pain of starting. I have no barrier to writing text, but there is a barrier to writing the first line of code, to a large extent coming from just remembering the context, where to import what from, setting up boilerplate, etc.
2. While it works I can use my brain capacity to think about what I'm doing.
3. I can now do multiple things in parallel.
4. It makes it so much easier to "go the extra mile" (I don't add "TODOs" anymore in the code I just spin up a new Claude for it)
5. I can do much more analysis (like spinning up detailed plotting / analysis scripts).
6. It fixes most simple linting/typing/simple test bugs for me automatically.
Overall I feel like this kind of coding allows me to focus about the essence: What should I be doing? Is the output correct? What can we do to make it better?
Taking away the pain of starting is a big one. It lets me do things I would never have done because they would just go on the "if only I had time" wish list.
Now literally between prompts, I had a silly idea to write a NYT Connections game in the terminal and three prompts later it was done: https://github.com/jleclanche/connections-tui
> 4. It makes it so much easier to "go the extra mile" (I don't add "TODOs" anymore in the code I just spin up a new Claude for it)
This especially. I've never worked at a place that didn't skimp on tests or tech debt due to limited resources. Now you can get a decent test suite just from saying you want it.
Will it satisfy purists? No, but lots of mid hanging fruit long left unpicked can now be automatically picked.
I've actually gone through and tried to refactor the tests Claude writes (when I ask it to only touch test files). I can't improve them, generally speaking. Often they're limited by architectural or code style choices in the main code. And there are minor stylistic things here or there.
But the bulk of it is that you get absolutely top-tier tests for the same level of effort as a half-assed attempt.
If you set it up with good test quality tools (mutation testing is my favorite) it goes even further - beyond what I think is actually reasonable to ask a human to test unless you're e.g. writing life and safety critical systems.
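For anyone who hasn't used it, the idea behind mutation testing in a nutshell (this toy example hand-injects the mutant; tools like mutmut or cosmic-ray generate mutants automatically and run your real suite against them):

```python
# Toy illustration of mutation testing: a mutant is a small deliberate change
# to the code; a good test suite "kills" it by failing, a weak one lets it survive.

def clamp(x, lo, hi):
    return max(lo, min(x, hi))

def clamp_mutant(x, lo, hi):
    return max(lo, min(x, hi)) if x != hi else lo   # injected bug at the boundary

def weak_test(f):
    assert f(5, 0, 10) == 5          # passes for both versions - mutant survives

def strong_test(f):
    assert f(5, 0, 10) == 5
    assert f(10, 0, 10) == 10        # boundary case - kills the mutant
    assert f(-3, 0, 10) == 0

weak_test(clamp)
weak_test(clamp_mutant)              # mutant survives the weak suite
strong_test(clamp)
try:
    strong_test(clamp_mutant)
except AssertionError:
    print("mutant killed - the stronger tests caught the injected bug")
```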
As one of the curious minority who keeps trying agentic coding but not liking it, I've been looking for explanations why my experience differs from the mainstream. I think it might lie in this nugget:
The comparison seems apt, and yet still people paint, still people pay for paintings, still people paint for fun. I like coding by hand. I dislike reviewing code (although I do it, of course). Given the choice, I'll opt for the former (and perhaps that's why I'm still an IC).
When people talk about coding agents as very enthusiastic but very junior engineering interns, it fills me with dread rather than joy.
> still people paint, still people pay for paintings
But in what environment? It seems to me that most of the crafts that have been replaced by the assembly line are practiced not so much for the product itself, but for an experience both the creator and the consumer can participate in, at least in their imagination.
You don't just order such artifacts on Amazon anonymously; you establish some sort of relationship with the artisan and his creative process. You become part of a narrative. Coding is going to need something similar if it wants to live in that niche.
I don't disagree with any of that. But as long as there are companies willing to pay me to write code the old-fashioned way, I'll keep doing it.
I don't think it's a completely apt comparison. In the past, painting was the only way to depict real-world events, but painting is also art, and it often doesn't necessarily depict reality but the artist's interpretation of it. That is why people still paint.
So yeah if you like coding as an art form, you can still keep doing that. It's probably just a bit harder to make lots of money with it. But most people code to make a product (which in itself could be a form of art). And yeah if it's faster to reach your goals of making a product with the help of AI, then the choice is simple of course.
But yeah in a way I'm also sad that the code monkey will disappear, and we all become more like the lead developer who doesn't really program anymore but only guides the project, reviews code and makes technical decisions. I liked being the code monkey, not having to deal a lot with all the business stuff. But yeah, things change you know.
I totally get this side of things. I see the benefits of Agentic coding for small tasks, minor fixes, or first drafts. That said, I don't understand the pseudo-tribalism around specific interfaces to what amounts to only a few models under the hood and worry about what its doing for (or not doing for) junior devs.
Also, if we could get AI tooling to do the reviews for us reliably, I'd be a much happier developer.
A more apt metaphor is moving from hand-tools to power-tools.
The painting/photography metaphor stretches way too far imo - photography was fundamentally a new output format, a new medium, an entirely new process. Agentic coding isn't that.
Let's stop calling it Vibe Coding.
I'm a heavy user of Claude Code and I use it like a coding assistant.
How well you can manage a development team in real life correlates strongly with how much value you get out of an LLM-based coding assistant.
If you can't describe what success looks like, expect people to read your mind, and get angry at validating questions, then you will have problems both with coding assistants and leading teams of developers.
Calling what vibe coding, though? If you're reviewing, understanding, and testing everything that the coding assistant outputs, then you aren't vibe coding.
If you're just letting the coding assistant do its thing, uncritically, and committing whatever results, then you're vibe coding.
It sounds like you're not vibe coding. That's good. No need to throw away a useful term (even if it's a weird, gen-Z sounding term) that describes a particular (poor) way to use a LLM.
Yeah, maybe you're right.
The point that I'm (and probably others are) missing is that we associate the phrase 'Vibe Coding' with 'using an LLM to help with coding', and they're not the same.
Maybe the critics of Vibe Coding need to remember that all users of LLMs for coding support aren't about to regret their life choices.
Well said. The skills involved are actually quite a bit different from coding. It's about how clearly and accurately you can describe things, and how good you are at understanding what tooling you need to build to improve your experience. It's a different skillset.
I don't think that's true with current-gen models. You can even go so far as to write pseudocode for the LLM to translate to a real language, and for anything out-of-the-box my experience is that it will blatantly ignore your instructions. A baseline-competent junior at least has the context to know that if there are 5 steps listed and they only did 3 then there's probably a problem.
Prompting an LLM is definitely a different skillset from actually coding, but just "describing it better" isn't good enough.
I don't believe it is good enough; it's also not as relevant.
My prompts matter less and less. It's the hooks and subagents, and using the tools, that matter far more.
This is a thread about Claude Code; the other LLMs don't matter. Nothing ever blatantly ignores my instructions in Claude Code. That's a thing of the past.
Of course, if you're not using Claude Code, sure. But all my instructions are followed with my setup. That really isn't an issue for me personally anymore.
Your experience echoes my own for sufficiently trivial tasks, but I haven't gotten any of this to work for the actual time-consuming parts of my job. It's so reliably bad for some tasks that I've reworked them into screening questions for candidates trying to skate by with AI without knowing the fundamentals. Is that really not your experience, even with Claude Code?
Right, and I wasn't able to get this to work for any actually time-consuming parts of my job until last weekend, with sub-agents: running head-to-head battles between sub-agents, selecting the best one, and repeating.
Last weekend I did nothing but have different ideas battle it out against each other, with me selecting the most successful one and repeating.
And now my experience is no longer the same. Before last weekend, I had the same experience you are describing.
What's your experience with sub-agents? Is it really improving the output?
Did you use the one suggested here https://docs.anthropic.com/en/docs/claude-code/sub-agents or have you created custom ones?
The suggested ones are terrible, and its guidance is terrible.
Last weekend I ran head-to-head tests of agents against each other with a variety of ideas, selected the best one, and did it again. It has led me to a very specific sub-agent system, and I have an agent that creates those.
I try to use Claude Code a lot, but I keep getting very frustrated with how slow it is and how often it does things wrong. It does not feel like it's saving me any mental energy on most tasks. I do gravitate towards it for some things, but then I sometimes get burned doing that, and it's not pleasant.
For example, last week I decided to play with nushell. I have a somewhat simple .zshrc, so I just gave it to Claude and asked it to convert it to nushell. The nu it generated was, for the most part, not even valid; I spent 30 minutes with it and it never worked. It took me about 10 minutes in the docs to convert it myself.
So it's miserable experiences like that that make me never want to touch it, because I might get burned again. There are certainly things I have found value in, but it's so hit-or-miss that I just find myself not wanting to bother.
Have you tried the context7 MCP? For things that are not as mainstream (in the way JavaScript and TypeScript are popular), the LLM might struggle. I usually have better results using something like context7, where it can pull up more relevant, up-to-date examples.
I only use 2 MCP servers, and those are context7 and perplexity. For things like updated docs, I have it ask context7. For the more difficult technical tasks where I think it's going to stumble, I'll instruct Claude Code to ask perplexity and that usually resolves it. Or at least it'll surface up to me in our conversation so that we both are learning something new at that point.
For some new stuff I'm working on, I use Rails 8. I also use Railway for my host, which isn't as widely-used as a service like Heroku, for example. Rails 8 was just released in November, so there's very little training data available. And it takes time for people to upgrade, gems to catch up, conversations to bubble up, etc. Operating without these two MCP servers usually caused Claude Code to repeatedly stumble over itself on more complex or nuanced tasks. It was good at setting up the initial app, but when I started getting into things like Turbo/Stimulus, and especially for parts of the UI that conditionally show, it really struggled.
It's a lot better now - it's not perfect, but it's significantly better than relying solely on its training data or searching the web.
I've only used Claude Code for like 4 weeks, but I'm learning a lot. It feels less like I'm an IC doing this work, and my new job is (1) product manager that writes out clear PRDs and works with Claude Code to build it, (2) PR reviewer that looks at the results and provides a lot of guidance, (3) tester. I allocate my time 50%/20%/30% respectively.
Thanks, I’ll check out Perplexity. We seem to be using a similar stack. I’m also on Rails 8 with Stimulus, Hotwire, esbuild, and Tailwind.
Playwright MCP has been a big help for frontend work. It gives the agent faster feedback when debugging UI issues. It handles responsive design too, so you can test both desktop and mobile views. Not sure if you know this, but Claude Code also works with screenshots. In some cases, I provide a few screenshots and the agent uses Playwright to verify that the output is nearly pixel perfect. It has been invaluable for me and is definitely worth a try if you have not already.
This is basically my experience with it. I thought it'd be great for writing tests, but every single time, no matter how much coaxing, I end up rewriting the whole thing myself anyway. Asking it for help debugging has not yet yielded good results for me.
For extremely simple stuff, it can be useful. I'll have it parse a command's output into JSON or CSV when I'm too lazy to do it myself, or scaffold an empty new project (but like, how often am I doing that?). I've also found it good at porting simple code from, say, Python to JavaScript or TypeScript to Go.
But the negative experiences really far outweigh the good, for me.
Really agree with the author's thoughts on maintenance here. I've run into a ton of cases where I would have written a TODO or made a ticket to capture some refactoring and instead just knocked it out right then with Claude. I've also used Claude to quickly try out a refactoring idea and then abandoned it because I didn't like how it came out. It really lowers the activation energy for these kinds of maintenance things.
Letting Claude rest was a great point in the article, too. I easily get manifold value compared to what I pay, so I haven't got it grinding on its own on a bunch of things in parallel and offline. I think it could quickly be an accelerator for burnout and cruft if you aren't careful, so I keep to a supervised-by-human mode.
Wrote up some more thoughts a few weeks ago at https://www.modulecollective.com/posts/agent-assisted-coding....
Does anyone have experience with how Claude or similar AI agents perform in large (1M+ line) legacy code bases? To give a bit more context, I work on a Java code base that is 20+ years old. It has been continuously updated and expanded but contains mostly spaghetti code. Would Claude add any value here?
Agreed. CC lets you attempt things that you wouldn’t have dared to try. For example here are two things I recently added to the Langroid LLM agent framework with CC help:
Nice collapsible HTML logs of agent conversations (inspired by Mario Zechner’s Claude-trace), which took a couple hours of iterations, involving HTML/js/CSS:
https://langroid.github.io/langroid/notes/html-logger/
A migration from Pydantic-v1 to v2, which took around 7 hours of iterations (would have taken a week at least if I even tried it manually and still probably wouldn’t have been as bullet-proof):
https://github.com/langroid/langroid/releases/tag/0.59.0-b3
I appreciate that Orta linked to my "Full-breadth Developers" post here, for two reasons:
1. I am vain and having people link to my stuff fills the void in my broken soul
2. He REALLY put in the legwork to document in a concrete way what it looks like for these tools to enable someone to move up a level of abstraction. The iron triangle has always been Quality, Scope, Time. This innovation is such an accelerant that ambitious programmers can now imagine game-changing increases in scope without sacrificing quality and in the same amount of time.
For this particular moment we're in, I think this post will serve as a great artifact of what it felt like.
Thanks yeah, your post hit the nail on the head so well I got to delete maybe a third of my notes for this!
So far what I've noticed with Claude Code is not _productivity gains_ but _gains in my thoughtfulness_
As in the former is hyped, but the latter - stopping to ask questions, reflect, what should we do - is really powerful. I find I'm more thoughtful, doing deeper research, and asking deeper questions than if I was just hacking something together on the weekend that I regretted later.
Agreed. The most unique thing I find with vibecoding is not that it presses all the keyboard buttons. That’s a big timesaver, but it’s not going to make your code “better” as it has no taste. But what it can do is think of far more possibilities than you can far quicker. I love saying “this is what I need to do, show me three to five ways of doing it as snippets, weigh the pros and cons”. Then you pick one and let it go. No more trying the first thing you think of, realizing it sucks after you wrote it, then back to square one.
I use this with legacy code too. “Lines n—n+10 smell wrong to me, but I don’t know why and I don’t know what to do to fix it.” Gemini has done well for me at guessing what my gut was upset about and coming up with the solution. And then it just presses all the buttons. Job done.
It's less that I'm a skeptic, but more that I'm finding I intensely abhor the world we're building for ourselves with these tools (which I admittedly use a lot).
Why abhor specifically?
The answer can and certainly will fill many books, dissertations, PhD theses, etc.
Without going too philosophical, although one is not unjustified in going there, and just focusing on my own small professional corner (software engineering): these LLM developments mostly kill an important part of thinking and might ultimately make me dumber. For example, I know what a B-tree is and can (could) painstakingly implement one when and if I needed to, a process which would be long, full of mistakes and learning. Now, just having a rough idea will be enough, and most people will never get the chance to do it themselves. The B-tree is an intentionally artificial example, but you can extrapolate to more practical or realistic examples. On a more immediate front, there's also the matter of the threat to my livelihood. I have significant expenses for the foreseeable future, and if my line of work gets a 100x or even 10x average productivity boost, there just might be fewer jobs going around. Farm ox watching the first internal-combustion tractors.
I can think of many other reasons, but those are the most pressing and personal to me.
Not the GP but we're descending into a world where we just recycle the same "content" over and over. Nothing will be special, there'll be nothing to be proud of. Just constant dopamine hits administered by our overlords. Read Brave New World if you haven't.
> Read Brave New World if you haven't.
I have and I don't see the connection with AI-assisted coding.
If your comment was about "generative AI in general" then I think this is the problem with trying to discuss AI on the internet at the moment. It quickly turns into "defend all aspects of AI or else you've lost". I can't predict all aspects of AI. I don't like all aspects of AI and I can't weigh up the pros and cons of a vast number of distinct topics all at once. (and neither, I suspect, can anyone else)
I think it's possible Claude Code might be the most transformative piece of software since ChatGPT. It's a step towards an AI agent that can actually _act_ at a fundamental level - with any command that can be found on a computer - in a way that's beyond the sandboxed ChatGPT or even just driving a browser.
I'm most interested in how well these tools can tackle complex legacy systems.
We have tonnes of code that's been built over a decade with all kinds of idioms and stylistic conventions that are enforced primarily through manual review. This relates in part to working in a regulated environment where we know certain types of things need radical transparency and auditability, so writing code the "normal" way a developer would is problematic.
So I am curious how well it can see the existing code style and then implicitly emulate that? My current testing of other tools seems to suggest they don't handle it very well; typically I am getting code that looks very foreign to the existing code. It exhibits the true "regression to the mean" spirit of LLMs where it's providing me with "how would the average competent engineer write this", which is not at all how we need the code written.
Currently, this is the main barrier to us using these tools in our codebase.
You need to provide agentic tools with enough context about the project so they can find their way around. In Claude Code this is typically done via a CLAUDE.md document at the root of the codebase.
I work on Chromium and my experience improved immensely by using a detailed context document (~3,000 words) with all sorts of relevant information, from the software architecture and folder organisation to the C++ coding style.
(The first draft of that document was created by Claude itself from the project documentation.)
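(For readers who haven't set one up, here is a minimal sketch of what such a context file can look like. The sections and commands below are hypothetical placeholders, not taken from the Chromium document described above:)

```markdown
# CLAUDE.md — project context for the agent

## Architecture
Short description of the main modules/processes and how they talk to each other.

## Folder layout
- `src/core/`  — domain logic, no UI dependencies
- `src/ui/`    — presentation layer
- `tests/`     — integration tests; unit tests live next to the code they cover

## Coding style
- Match the existing style of the file you are editing.
- No new dependencies without asking first.

## Commands
- Build: `make build`
- Test:  `make test` (run this before declaring a task done)
```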
In my experience, doing it via CLAUDE.md is significantly worse than doing it in the sub-agents' context.
I've had a lot of luck with Claude on my 8 year old, multi-language codebase. But I do have to babysit it and provide a lot of context.
I created some tutorial files which contain ways to do a lot of standard things. Turns out humans found these useful too. With the examples, I've found Opus generally does a good job following existing idioms, while Sonnet struggles.
Ultimately it depends on how many examples in that language showed up on Stack Overflow or in public GitHub repos. Otherwise, YMMV if it's not Python, C++, Rust, or JavaScript.
It's still easier to handle greenfield projects, but everything is improving, and the gap is decreasing.
I wish I got this level of productivity. I think every article should list exactly what they asked the LLM to do, because I'm not getting as much use from it and I don't know why: is it because what I work on is rare compared to, say, website front-end and back-end code, or because I just suck at prompts/context, or because I'm using the wrong services or don't have the correct MCPs, etc.?
Is it possible to view the prompt history? I’ve had extreme levels of productivity and would love to list out how I’ve been using it generally for an article like this but it would be incredibly impractical to log it on the side.
With Claude Code at least, all of the chats you've had are stored in jsonl files on your computer in ~/.claude - I made a little TUI for exploring these in https://github.com/orta/claude-code-to-adium
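(If you just want to poke at the raw history without a TUI, a small script like this sketch is enough. The exact layout under ~/.claude and the record fields are assumptions and may differ between Claude Code versions:)

```python
import json
from pathlib import Path

# Claude Code keeps chat transcripts as JSONL files under ~/.claude
# (the exact subdirectory layout is an assumption here).
claude_dir = Path.home() / ".claude"

if claude_dir.exists():
    for path in sorted(claude_dir.rglob("*.jsonl")):
        print(f"--- {path}")
        for line in path.read_text().splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or corrupt lines
            if not isinstance(record, dict):
                continue
            # Record shapes vary; only print plain-text message content.
            message = record.get("message")
            content = message.get("content") if isinstance(message, dict) else None
            if isinstance(content, str):
                print(content[:200])
```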
Personally, I'm less sold on tracking prompts as being valuable, both for production cases (imo "if a human should read it, a human should have written/fully edited it" applies to commits/PRs/docs etc.) and for vibe cases where the prompts are more transitory.
I really want to record a live commentary of me working with claude. Maybe that's something you could think about.
I feel like the results I get are qualitatively superior to anything I've seen anyone I've worked with produce. The fact that it's a lot faster is just gravy on top.
Just show the code you've produced.
Right, but that's like showing the compiled binary but not the source. With Claude Code, prompt and config are everything.
This is the saddest part of this whole thing for me. You consider the prompts and config to be the real source code, but those are just completely lost into the ether. Even if you saved the prompts you can't reproduce their effects.
Then there's the question of how do other developers contribute to the code. They don't have your prompts, they just have the code.
So, no, prompts are not source code, that's why I ask for people to just show the code they are producing and nobody ever does.
I have repos on github. I show the code.
I also make my design documents (roughly the prompts generated by the prompts) into committed markdown documents. So I show the second-tier prompts at least, you could consider those an intermediate language representation if you like.
> Then there's the question of how do other developers contribute to the code. They don't have your prompts, they just have the code.
I usually try to commit the initial prompts and adjustments. I don't commit trivial things like "That's not quite right, try doing X again" or "Just run the entire test suite"
> So, no, prompts are not source code
Hard disagree, but that's fine.
I've found if you're working on something where it hasn't seen a billion examples you have to give it additional information like an academic paper or similar. And get it to summarize its understanding every once in a while so you can pick up the idea in another chat (once the context gets too long) without having to explain it all again but to also ensure it's not going off the rails ...as they tend to do.
They know a lot about a lot of things but the details get all jumbled up in their stupid robot brains so you have to help them out a bunch.
I have nearly 20 years of experience in technology, and have been writing toy scripts or baby automations for most of my career. I started out in a managed services help desk and took that route many folks take across and around the different IT disciplines.
I mostly spend my days administering SaaS tools, and one of my largest frustrations has always been that I didn’t know enough to really build a good plugin or add-on for whatever tool I was struggling with, and I’d find a limited set of good documentation or open source examples to help me out. With my limited time (full time job) and attendant challenges (ADHD & autism + all the fun trauma that comes from that along with being Black, fat and queer), I struggled to ever start anything out of fear of failure or I’d begin a course and get bored because I wasn’t doing anything that captured my imagination & motivation.
Tools like Claude Code, Cursor, and even the Claude app have absolutely changed the game for me. I’m learning more than ever, because even the shitty code that these tools can write is an opportunity for debugging and exploration, but I have something tangible to iterate on. Additionally, I’ve found that Claude is really good at giving me lessons and learning based on an idea I have, and then I have targeted learning I can go do using source docs and tutorials that are immediately relevant to what I’m doing instead of being faced with choice paralysis. Being able to build broken stuff in seconds that I want to get working (a present problem is so much more satisfying than a future one) and having a tool that knows more than I do about code most of the time but never gets bored of my silly questions or weird metaphors has been so helpful in helping me build my own tools. Now I think about building my own stuff first before I think about buying something!
ADHD here, and Claude Code has been a game changer for me as well. I don't get sidetracked getting lost in documentation loops, suffer decision paralysis, or forget what I'm currently doing or what I need to do next. It's almost like I'm body doubling with Claude Code.
Well said. I'm actually getting better at coding, oddly enough, because I'm reading so much more code.
Not using Claude Code, but I'm using LLMs for patching C# functions in already-compiled vendor code using things like the Harmony lib.
Being able to override some hard coded vendor crap has been useful.
Over the holidays I built a plan for an app that would be worthwhile to my children, oldest son first. That plan developed to several thousand words of planning documents (MVP, technical stack, layout). That was just me lying in the sun with Claude on mobile.
Today I (not a programmer, although programming for 20+ years, but mostly statistics) started building with Claude Code via Pro. Burned through my credits in about 3 hours. Got to MVP (happy tear in my eye). Actually one of the best looks I've ever gotten from my son. A look like, wow, dad, that's more than I'd ever think you could manage.
Tips:
- Plan ahead! I've had Claude tell me that a request would fit better way back on the roadmap. My roadmap manages me.
- Force Claude to build a test suite and give debugging info everywhere (backend, frontend).
- Claude and I work together on a clear TODO. He needs guidance as well as I do. It forgot a very central feature of my MVP. I do not yet know why. I asked kindly and it was built.
Questions (not specifically to you kind HN-folks, although tips are welcome):
- Why did I burn through my credits in 3 hours?
- How can I force Claude to stay committed to my plans, my CLAUDE.md, etc.?
- Is there a way to ask Claude to check the entire project for consistency? And/or should I accept that vibing will leave cruft spread around?
I'm on a pro plan, also run into limits within 2 hours, then have to wait until the limits of the 5 hour window reset (next reset is in 1 hour 40 minutes at 2am)...
You can just ask Claude to review your code, write down standards, and verify that code is produced according to those standards and guidelines. And if it finds that the project is not consistent, ask it to make a plan and execute on the plan.
Ask, ask, ask.
Burning through your credits is normal. We're in the "lmao free money"/"corner the market" phase, where Anthropic offers Claude Code at a loss.
Recently they had to lower token allowances because they're haemorrhaging money.
You can run "ccusage" in the background to keep tabs, so you're less surprised, is all I can say.
Enjoy the cheap inference while you can; unless someone cracks the efficiency puzzle, the frontier models might get a lot more expensive at some point.
Yeah, not going to lie, working at Google and having unlimited access to Gemini sure is nice (even if it has performance issues vs Claude Code… I can’t say as I can’t use it at work)
To me it feels like we're in the VC subsidized days for tools like Claude Code. Given how expensive we know GPU usage is and that it's not likely to come down, and these companies will need to eventually be profitable, I wonder if we're all heading for a point where ultimately Claude Code and the like will be like $2K per month instead of $200 on the high end.
Good article, but fwiw, I think GraphQL is a bane for web dev for 90% of projects. It overcomplicates, bloats, and doesn't add anything over regular OpenAPI specs for what is usually just CRUD resource operations.
You either get it, or you don't! But if you do...
It seems to be great at writing tests, spitting out UI code, and many other things where there are many examples around.
Among other things I work on database optimizers, and there Claude fails spectacularly. It produces wrong code, fails to find the right places to hook up an abstraction, overlooks effects on other parts of the code, and generally confidently proposes changes that simply do not work at all (to put it mildly).
Your mileage may vary... It seems to depend heavily on the amount of existing (open) code around.
Being able to do big refactors quickly in the moment really helps in a solo dev environment, but in a team it puts a lot of review (and QA) burden on everyone else. It makes me wonder if we're moving towards a team model where individuals own different parts of the system, rather than everyone reviewing each other's code and working together.
Anybody had similarly good experience with Gemini CLI? I'm only a hobbyist coder, so paying for Claude feels silly when Gemini is free (at least for now), but so far I've only used it inside Cline-like extensions
I’ve used both. Claude more extensively. I’ve had good results with Gemini too, however it seems easier to get stuck in a loop. Happens with Claude too but not quite as frequent.
By loop I mean you tell it no don’t implement this service, look at this file instead and mimic that and instead it does what it did before.
Using two different VS code profiles is an interesting idea beyond AI. I get confused every time I have different projects open because code always looks the same. Maybe it would make sense to have a different theme for every project.
For me the real limit is the amount of code I can read and lucidly understand to spot issues in a given day.
Every time you use these tools irresponsibly, for instance for what I like to call headless programming (vibe coding), understand that you are incurring tech debt. Not just in terms of your project but personal debt regarding what you SHOULD have learned in order to implement the solution.
It’s like using ChatGPT in high school: it can be a phenomenal tutor, or it can do everything for you and leave you worse off.
The general lesson from this is that Results ARE NOT everything.
Another really nice use case is building very sophisticated test tooling. Normally a company might not allocate enough resources to a task like that, but with Claude Code it's a no-brainer. It can also create very sophisticated mocks, like a db mock that can parse all the queries in the codebase and apply them to in-memory fake tables. That would be a total pain to build and maintain by hand, but with Claude Code it takes literally minutes.
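(To make that idea concrete, here is a hypothetical Python sketch of that kind of in-memory db mock; the real thing would parse whatever query shapes your codebase actually uses, this toy handles just two:)

```python
import re

class FakeDb:
    """A toy in-memory stand-in for a database, for tests only.

    It understands only a couple of query shapes; anything else raises,
    which is exactly what you want in a test.
    """

    def __init__(self):
        self.tables: dict[str, list[dict]] = {}

    def execute(self, sql: str, params: dict | None = None):
        params = params or {}
        if m := re.match(r"INSERT INTO (\w+)", sql, re.I):
            # Treat the params dict as the inserted row.
            self.tables.setdefault(m.group(1), []).append(dict(params))
            return []
        if m := re.match(r"SELECT \* FROM (\w+) WHERE (\w+) = :(\w+)", sql, re.I):
            table, column, key = m.groups()
            return [row for row in self.tables.get(table, [])
                    if row.get(column) == params.get(key)]
        raise NotImplementedError(f"query shape not mocked: {sql}")

# Usage in a test:
db = FakeDb()
db.execute("INSERT INTO users", {"id": 1, "name": "Ada"})
assert db.execute("SELECT * FROM users WHERE id = :id", {"id": 1})[0]["name"] == "Ada"
```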
Hill I’m willing to die on (metaphorically):
If your test structure is a pain to interact with, that usually means some bad decisions somewhere in the architecture of your project.
Sure, but that's pretty orthogonal to the above. Say I had to create a parser+mapper for a domain-specific query language that then gets mapped to a number of SQL backends. I used CC to create a custom test harness that can use SQLC-style files to drive tests. They are way more readable and much easier to maintain than plain Rust tests for this task. It took half a day to create with CC; it would probably have taken a week without it.
In my experience LLMs are notoriously bad at tests, so this is, to me, one of the worst use cases possible.
In my experience they are great for test tooling. For actual tests, after I have covered a number of cases it's very workable to tell it to identify gaps and edge cases and propose tests; then I'd say I accept about 70% of its suggestions.
In JS, yes. It stacks with good testing stories, it’s quite good.
While people's experience with LLMs is pretty varied and subjective, saying they're bad at writing tests just isn't true. Claude Code is incredible at writing tests and testing infrastructure.
It's worth mentioning that one should tell CC not to overmock, and to produce only truly relevant tests. I use an agent that I invoke to spot this stuff, because I've run into some truly awful overmocked non-tests before.
> Painting by hand just doesn’t have the same appeal anymore when a single concept can just appear and you shape it into the thing you want with your code review and editing skills.
In the meanwhile, one of the most anticipated games in the industry, the second chapter of an already acclaimed product, has its art totally hand-painted.
I think it's two schools of thought, end product vs process. It seems a lot of people who like AI only care about getting the end product, and don't care how it was made. On the other hand, some people are invested in how something is made, and see the process of creation and refinement as a part of the end product itself.
People are not invested in how something is made, imo; it's just that it's better and more beautiful and connects better with them. I agree, though, that AI enthusiasts only care about getting stuff done, but I won't call it an "end product".
Tried it a few times, and I feel I'm paying to become a worse developer for a maybe 30% speed increase in total.
The question is, can I fire 80 out of 100 engineers and purchase Claude Code subscription instead?
People really aren't going to like this, but OP is directionally correct.
At 100 dev shop size you're likely to have plenty of junior and middling devs, for whom tools like CC will act as a net negative in the short-mid term (mostly by slowing down your top devs who have to shovel the shit that CC pushes out at pace and that junior/mids can't or don't catch). Your top devs (likely somewhere around 1/5 of your workforce) will deliver 80% of the benefit of something like CC.
We haven't been hiring junior or even early-mid devs since around Mar/Apr. These days they cost $200/mo + $X in API spend. There's a shift in the mindset of how "dev" is being approached. It's... alarming, but it's happening.
My opinion on Claude as a ChatGPT user.
It feels like ChatGPT on cocaine. I mean, I asked for a small change and it came back with 5 solutions, changing my whole codebase.
Was it Sonnet or Opus? I've found that Sonnet will just change a few small things, Opus will go and do big bang changes.
YMMV, though, maybe it's the way I was prompting it. Try using Plan Mode and having it only make small changes.
Is this opinion on claude code or claude the model?
Is this the one that goes 'Oh no, I accidentally your whole codebase, I suck, I accept my punishment and you should never trust me again' or is that a different one?
I seem to remember the 'oh no I suck' one comes out of Microsoft's programmer world? It seems like that must be a tough environment for coders if such feelings run so close to the surface that the LLMs default to it.
That sounds like gemini: https://www.reddit.com/r/cursor/comments/1m2hkoo/did_gemini_...
I'm NOT saying it is, but without regulatory agencies having a look, or it being open source, this might well be working as intended, since Anthropic makes more money out of it.
I think the most interesting change Claude enables is letting AI try stuff. We do this all the time.
I have this sense this works best in small teams right now, because Claude wants to produce code changes and PRs. Puzzmo, where OP works, is <5 engineers.
In larger codebases, PRs don't feel like the right medium in every case for provocative AI explorations. If you're going to kick something off before a meeting and see what it might look like to solve it, it might be better to get back a plan, or a pile of regexps, or a list of teams that will care.
Having an AI produce a detailed plan for larger efforts, based on an idea, seems amazing.
Coding agents are empowering, but it is not well appreciated that they are setting a new baseline. It will soon not be considered impressive to do all the things that the author did; it will be expected. And you will not work less but the same hours -- or more, if you don't use agents.
Despite this, I think agents are a very welcome new weapon.
Within agent use skill/experience there will be a spectrum as well. Some that wield them very effectively, others maybe less so.
Does Claude Code use a different model than Claude.ai? Because Sonnet 4 and Opus 4 routinely get things wrong for me. Both of them have sent me on wild goose chases, where they confidently claimed "X is happening" about my code but were 100% wrong. They also hallucinated APIs, and just got a lot of details wrong in general.
The problem-space I was exploring was libusb and Python, and I used ChatGPT and also Claude.ai to help debug some issues and flesh out some skeleton code. Claude's output was almost universally wrong. ChatGPT got a few things wrong, but was in general a lot closer to the truth.
AI might be coming for our jobs eventually, but it won't be Claude.ai.
Pretty sure it’s the same model.
The reason that claude code is “good” is because it can run tests, compile the code, run a linter, etc. If you actually pay attention to what it’s doing, at least in my experience, it constantly fucks up, but can sort of correct itself by taking feedback from outside tools. Eventually it proclaims “Perfect!” (which annoys me to no end), and spits out code that at least looks like it satisfies what you asked for. Then if you just ignore the tests that mock all the useful behaviors out, the amateur hour mistakes in data access patterns, and the security vulnerabilities, it’s amazing!
You're right, but you can actually improve it pretty dramatically with sub agents. Once you get into a groove with sub agents, it really makes a big difference.
I stopped writing as much code because of RSI and carpal tunnel but Claude has given me a way to program without pain (perhaps an order of magnitude less pain). As much as I was wanting to reject it, I literally am going to need it to continue my career.
You aren't the first person I have heard say this. It's an under-appreciated way in which these tools are a game-changer. They are a wonderful gift to those of us prone to RSI, because they're most good at precisely the boilerplate repetitive stuff that tends to cause the most discomfort. I used to feel slightly sick every time I was faced with some big piece of boilerplate I had to hammer out, because of my RSI and also because it just makes me bored. No longer. People worry that these tools will end careers, but (for now at least) I think they can save the careers of more than a few people. A side-effect is I now enjoy programming much more, because I can operate at a level of abstraction where I am actually dealing with novel problems rather than sending my brain to sleep and my wrists to pain hell hammering out curly braces or yaml boilerplate.
Now that you point this out, since I started using Claude my RSI pain is virtually non-existent. There is so much boilerplate and repetitive work taken out when Claude can hit 90% of the mark.
Especially with very precise language. I've heard of people using speech-to-text to drive it, which opens up all sorts of accessibility windows.
I find it very effective to use a good STT/dictation app since giving sufficient detailed context to CC is very important, and it becomes tedious to type all of that.
I’ve experimented with several dictation apps, including super whisper, etc., and I’ve settled on Wispr Flow. I’m very picky about having good keyboard shortcuts for hands-free dictation mode (meaning having a good keyboard shortcut to toggle recording on and off), and of course, accuracy and speed. Wispr Flow seems to fit all my needs for now but I’d love to switch to a local-only app and ditch the $15/mo sub :)
Sorry to hear that, and whilst it wasn't my original goal to serve such a use case, I wonder if being able to interact with Claude Code via voice would help you? On MacOS it uses free defaults for TTS and ASR, but you can BYOK to other providers. https://github.com/robdmac/talkito
check out https://talonvoice.com !
Are you using dictation for text entry
Great suggestion! I will be now :)
Superwhisper is great. It's closed source, however. There may be other comparable open source options available now. I'd suggest trying Superwhisper so you know what's possible, and maybe compare to open source options after. Superwhisper runs locally and has a one-time purchase option, which makes it acceptable to me.
Talkito (I posted the link further up) is open source and unlike Superwhisper it makes Claude Code talk back to you as well - which was the original aim to be able to multitask.
Damn, I thought the functional programming era of HN was insufferable, I had no idea how bad it was going to get.
Genuinely - what's the problem with this? It seems to be someone documenting a big increase in their productivity in a way that might be actually useful to others.
They don't write like the kind of person you can dismiss out of hand and there's no obvious red flags.
Other than "I don't like AI" - what is so insufferable here?
Ah yet another attempt to push the “$200/month is a bargain!!!” narrative. Sad.
I get it. No one is really making any money yet, including openAI.
As the VC money dries up this is only going to get worse. Like ads in responses worse.
this_variable_name_is_sponsored_by_coinbase bad. Which these vibe chuckleheads will claim is no big deal because only losers read code.
> Ah yet another attempt to push the “$200/month is a bargain!!!” narrative.
Well - compared to a real developer (even junior one), it's peanuts.
Love that these folks are enhancing team productivity - not just individual - making it easier to do prototypes.
When more ideas make it further you get more shots on goal. You are almost certain to have more hits!
I was very, very skeptical. Then a couple of weeks ago I started with Atlassian's Rovo Dev CLI (Sonnet 4) and immediately managed to build and finish a couple of projects. I learned a lot, and for sure having experience and deciding on stack and architecture is a huge benefit when "guiding" an agentic coding app. I'm not sure if anyone can build and maintain a project without having at least some skills, but if you are an experienced developer, this is kind of magic.
I also appreciate the common best practice to write a requirements document in Markdown before letting the agent start. AWS' kiro.dev is really nice in separating the planning stage from the execution stage but you can use almost any "chatbot" even ChatGPT for that stage. If you suffer from ADHD or lose focus easily, this is key. Even if you decide to finish some steps manually.
It doesn't really matter if you use Claude Code (with Claude LLM), Rovo Dev CLI, Kiro, Opencode, gemini-cli, whatever. Pick the ones that offer daily free tokens and try it out. And no, they will almost never complete without any error. But just copy+paste the error to the prompt or ask some nasty questions ("Did you really implement deduplication and caching?") and usually the agent magically sees the issues and starts to fix it.
A few years ago the SRE crowd went through a toil automation phase. SWEs are now gaining the tools to do the same.
I don't know if it's something only I "perceive," but as a 50-year-old who started learning to use computers from the command line, using Claude Code's CLI mode gives me a unique sense of satisfaction.
Claude Code blows everything else out of the water for me. Which makes me certain I have a blind spot.
Has anyone who's gotten decent at Clauding had matching success with other tools?
I've switched to opencode. I use it with Sonnet for targeted refactoring tasks and Gemini to do things that touch a lot of files, which otherwise can get expensive quickly.
For me, the most compelling use of LLMs is to one shot scripts, small functions, unit tests, etc.
I don’t understand how people have the patience to do an entire application just vibe coding the whole time. As the article suggests, it doesn’t even save that much time.
If it can’t be done in one shot with simple context I don’t want it.
I did a company hackathon recently where we were encouraged to explore vibe coding more and this was essentially my take-away too. It's kinda perfect for a hackathon but it was insanely mind-numbing to relegate the problem solving and mental model-building to the LLM and sit there writing prompts all day. If that has to become my career, I genuinely might have to change career paths, but it doesn't seem like that's likely -- using it as a tool to help here and there and sometimes provide suggestions definitely feels like way to actually use it to get better results.
A lot of things that the author achieved with Claude Code is migrating or refactoring of code. To me, who started using Claude Code just two weeks ago, this seems to be one of the real strengths at the moment. We have a large business app that uses an abandoned component library and contains a lot of cruft. Migrating to another component library seemed next to impossible, but with Claude Code the whole process took me just about one week. It is making mistakes (non-matching tags for example), but with some human oversight we reached the first goal. Next goal is removing as much cruft as possible, so working on the app becomes possible or even fun again.
I remember when JetBrains made programming so much easier with their refactoring tools in IntelliJ IDEA. To me (with very limited AI experience) this seems to be a similar step, but bigger.
On the other hand, automated refactorings like those in IntelliJ scale practically infinitely, are extremely low cost, and are guaranteed to never make any mistakes.
Not saying this is more useful per se, just saying that different approaches have their pros and cons.
I tried out Claude for the first time today. I have a giant powershell script that has evolved over the years, doing a bunch of different stuff. I've been meaning to refactor it for a long time, but it's such a tangled mess that every time I try, I give up fairly quickly. GPT has not been able to split it into separate modules successfully. Today I tried Claude and it refactored it into a beautifully separated collections of modules in about 30 minutes. I am extremely impressed.
Anyone have best practices or guidelines for how to get the most out of Claude Code?
Does anyone have advice on how to hire developers who can utilize AI coding tools effectively?
My mom was an English teacher and an electronics tech later in her career. I explained to her how LLMs require a lot of technical writing and documentation up front, and how devs who haven't tried that approach, or hate it, can be quick to dismiss it when the output is bad. Her reply: "Oh, so the humanities do matter!" Hire juniors with a left-brain/right-brain mix.
We have to be careful not to anthropomorphize them but LLMs absolutely respond to nuanced word choice and definition of behavior that align with psychology (humanities). How to judge that in an interview? Maybe a “Write instructions for a robot to make a peanut butter and jelly sandwich” exercise. Make them type it. Prospects who did robotics club have an edge?
Can they touch type? I've seen experienced devs who chicken-peck; it's painful. What happens when they have to write a stream of prompts, abort, and rephrase rapidly? Schools aren't mandating typing, and I see an increase (in my own home! I tried…) of feral child-invented systems like caps lock on/off instead of shift, with weird cross-keyboard overhand reaches.
I've thought about this lately. In order to do that, you need to know where people typically stumble, and then create a rubric around that. Here are some things I'd look for:
- Ability to clearly define requirements up front (the equivalent mistake in coding interviews is to start by coding, rather than asking questions and understanding the problem + solution 100% before writing a single line of code). This might be the majority of the interview.
- Ability to anticipate where the LLM will make mistakes. See if they use perplexity/context7 for example. Relying solely on the LLM's training data is a mistake.
- A familiarity with how to parallelize work and when that's useful vs not. Do they understand how to use something like worktrees, multiple repos, or docker to split up the work?
- Uses tests (including end-to-end and visual testing)
- Can they actually deliver a working feature/product within a reasonable amount of time?
- Is the final result looking like AI slop, or is it actually performant, maintainable (by both humans and new context windows), well-designed, and follows best practices?
- Are they able to work effectively within a large codebase? (this depends on what stage you're in; if you're a larger company, this is important, but if you're a startup, you probably want the 0->1 type of interview)
- What sort of tools are they using? I'd give more weight if someone was using Claude Code, because that's just the best tool for the job. And if they're just doing the trendy thing like using Claude Agents, I'd subtract points.
- How efficiently did they use the AI? Did they just churn through tokens? Did they use the right model given the task complexity?
First you have to decide if you want juniors who are able to push tasks through and be guided by a senior, as the juniors won't understand what they are doing or why the AI is telling them to do it "wrong".
Senior developers already know how to use AI tools effectively, and are often just as fast as AI, so they only get the benefits of scaffolding.
Really, everything comes down to planning, and your success isn't going to come down to people using AI tools; it will come down to the people guiding the process, namely project managers, designers, and the architects and senior developers who will help realize the vision.
Juniors who can push tasks to completion can only be valuable if they have proper guidance; otherwise you'll just be making spaghetti.
I've used it a bit. I've done some very useful stuff, and I've given up with other stuff and just done it manually.
What it excels at is translation. This is what LLMs were originally designed for after all.
It could be between programming languages, like "translate this helm chart into a controller in Go". It will happily spit out all the structs and basic reconciliation logic. Gets some wrong but even after correcting those bits still saves so much time.
And of course writing precise specs in English, it will translate them to code. Whether this really saves time I'm not so convinced. I still have to type those specs in English, but now what I'm typing is lost and what I get is not my own words.
Of course it's good at generating boilerplate, but I never wrote much boilerplate by hand anyway.
I've found it's quite over eager to generate swathes of code when you wanted to go step by step and write tests for each new bit. It doesn't really "get" test-driven development and just wants to write untested code.
Overall I think it's without doubt amazing. But then so is a clown at a children's birthday party. Have you seen those balloon animals?! I think it's useful to remain sceptical and not be amazed by something just because you can't do it. Amazing doesn't mean useful.
I worry a lot about what's happening in our industry. Already developers get away with incredibly shoddy practices. In other industries such practices would get you struck off, licences stripped, or even sent to prison. Now we have to contend with juniors and people who don't even understand programming generating software that runs.
I can really see LLMs becoming outlawed in software development for software that matters, like medical equipment or anything that puts the public in danger. But maybe I'm being overly optimistic. I think generally people understand the dangers of an electrician mislabelling a fusebox or something, but don't understand the dangers of shoddy software.
If there were legal recourse for most software issues, you can bet the current frenzy around AI agentic coding would be handled much more carefully.
I think, like many things, laws won't appear until a major disaster happens (or you get a president on your side *).
* https://www.onlinesafetytrainer.com/president-theodore-roose...
And there is indeed software that is not covered by those tags, plenty of it in fact. It just so happens that it's a few orders of magnitude more expensive, and so you never hear about it until you're actually designing e.g. provably safe firmware for pacemakers and the like.
Seriously over engineered system. Unnecessary complexity everywhere. Illusion of progress.
“Trust, but verify”
AI, but refactor
This is an excellent insight
The problem for me is to predict what AI might get wrong. Claude can solve hard coding problems one day just to fail with basic stuff like thread safety the next. But overall I think it is clear that we reached the point where AI, if used correctly, saves a lot of development time.
I think Claude Code is great, but I really grew accustomed to the "Cursor-tab tab tab" autocomplete style. A little perplexed why the Claude Code integration into VS Code doesn't add something like this? It would make it the perfect product to me. Surprised more people do not talk about this/it isn't a more commonly requested feature.
"Cursor tab tab tab" is just nuts. I'm also getting accustomed to type carelessly, making syntax mistakes who cares if a tab can fix that. I fly with it. As per why more people dont talk about this, I have a the strong opinion that tools find success in the median of the market, not in the excellence. I think coders with a great autocomplete are a real deal. I find so boring and courtproductive chatting about a problem
Agree. I used Claude Code a bit and enjoyed it, but I also felt like I was too disconnected from the changes. I guess too much vibe coding?
Cursor is a nice balance for me still. I am automating a lot of the writing but it’s still bite size pieces that feel easier to review.
With these agentic coders you can have better conversations about the code. My favorite use case with CC is, after a day of coding, asking it for a thorough review of the changes, a file, or even the whole project, setting it to work when I go off to bed and having it rank issues and even propose a fix for the most important ones. If you get the prompt right and enable permissions, it can work for quite a long time independently.
I just use Claude Code in Cursor's terminal (both a hotkey away, very convenient). For 2 months I haven't used Cursor chat, but tab autocomplete is too good, definitely worth $20.
I get downvoted every time I praise Claude. But everyone in this thread is getting upvoted for saying the same things. Can someone explain to me the difference?
You're replying to people who had negative experiences with LLMs and posted their nuanced criticisms with: "OK but have you tried Claude 4??"
It's not helpful or justified within those conversations.
The Internet is fickle and full of dullards. It's always a toss of the dice.
Yeah, you're in a thread of people who actually use it.
Can anyone compare it with Cursor?
At this point I am 99% convinced that AI coding skeptics are nothing short of Luddites.
They would be like "but a robot will never ever clean a house as well as I would". Well, no shit, but they can still do the overwhelming majority of the work very well (or at least as well as you instruct them to) and leave you with details and orchestration.
Genuinely not yet convinced. The CEO of windsurf said the goal for the next year was to get agentic coding to the reliability and maturity that it can be used in production for mature codebases. That rhymes true.
I use autocomplete and chat with LLMs as a substitute for stack overflow. They’re great for that. Beyond that, myself and colleagues have found AI agents are not yet ready for serious work. I want them to be, I really do. But more than that I want our software to be reliable, our code to be robust and understandable, and I don’t want to worry about whether we are painting ourselves into a corner.
We build serious software infrastructure that supports other companies’ software and our biggest headache is supporting code that we built earlier this year using AI. It’s poorly written, full of bugs including problems very hard to spot from the code, and is just incomprehensible for the most part.
Other companies I know are vibe coding, and making great progress, but their software is CRUD SaaS and worst case they could start over. We do not have that luxury.
If you get as much enjoyment out of problem-solving and programming as you do doing household chores, I'm not sure why you went into this career. I agree that the folks who are 100% anti-using-it-ever are overreacting, but IME replacing the overwhelming majority of the work with vibe-coding is both mind-numbing from a developer perspective and doesn't actually get you better results (or faster results, unless you're basically cowboy coding with no regard for thoroughly ensuring correctness)
Vibe coding is when you don't check the output.
Instructing Claude Code to handle a circular dependency edge case in your code and write tests for it, while reviewing the work, definitely does not qualify as vibe coding.
I recently tried a 7-day trial version of Claude Code. I had 3 distinct experiences with it: one obviously positive, one bad, and one neutral-but-trending-positive.
The bad experience was asking it to produce a relatively non-trivial feature in an existing Python module.
I have a bunch of classes for writing PDF files. Each class corresponds to a page template in a document (TitlePage, StatisticsPage, etc). Under the hood these classes use functions like `draw_title(x, y, title)` or `draw_table(x, y, data)`. One of these tables needed to be split across multiple pages if the number of rows exceeded the page space. So I needed Claude Code to do some sort of recursive top-level driver that would add new pages to a document until it exhausted the input data.
I spent about an hour coaching Claude through the feature, and in the end it produced something that looked superficially correct, but didn't compile. After spending some time debugging, I moved on and wrote the thing by hand. This feature was not trivial even for me to implement, and it took about 2 days. It broke the existing pattern in the module. The module was designed with the idea that `one data container = one page`, so splitting data across multiple pages was a new pattern the rest of the module needed to be adapted to. I think that's why Claude did not do well.
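(A rough illustration of the kind of top-level driver being described, using the function names mentioned above with otherwise hypothetical APIs, not the actual module:)

```python
def paginate_rows(rows, rows_per_page):
    """Split a list of table rows into page-sized chunks."""
    for start in range(0, len(rows), rows_per_page):
        yield rows[start:start + rows_per_page]

def render_statistics(doc, title, rows, rows_per_page=30):
    """Render one logical table across as many pages as it needs.

    This is the part that breaks a 'one data container = one page'
    assumption: a single data container now produces N pages.
    """
    chunks = list(paginate_rows(rows, rows_per_page)) or [[]]
    for i, chunk in enumerate(chunks):
        page = doc.add_page()  # hypothetical API for appending a page
        page.draw_title(50, 800, f"{title} ({i + 1}/{len(chunks)})")
        page.draw_table(50, 760, chunk)
```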
+++
The obviously good experience with Claude was getting it to add new tests to a well-structured suite of integration tests. Adding tests to this module is a boring chore, because most of the effort goes into setting up the input data. The pattern in the test suite is something like this: IntegrationTestParent class that contains all the test logic, and a bunch of IntegrationTestA/B/C/D that do data set up, and then call the parent's test method.
Claude knocked this one out of the park. There was a clear pattern to follow, and it produced code that was perfect. It saved me 1 or 2 hours, but the cool part was that it was doing this in its own terminal window, while I worked on something else. This is a type of simple task I'd give to new engineers to expose them to existing patterns.
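(The pattern being described looks roughly like this; an illustrative sketch with hypothetical names, not the actual suite:)

```python
import unittest

class IntegrationTestParent(unittest.TestCase):
    """Owns all the test logic; subclasses only provide the data."""

    input_rows: list[dict] = []
    expected_total: int = 0

    def run_pipeline(self, rows):
        # Stand-in for the real system under test.
        return sum(row["amount"] for row in rows)

    def check_pipeline(self):
        self.assertEqual(self.run_pipeline(self.input_rows), self.expected_total)

class IntegrationTestA(IntegrationTestParent):
    def test_pipeline(self):
        # Data setup is the bulk of the effort; then call the parent's check.
        self.input_rows = [{"amount": 10}, {"amount": 5}]
        self.expected_total = 15
        self.check_pipeline()

class IntegrationTestB(IntegrationTestParent):
    def test_pipeline(self):
        self.input_rows = []
        self.expected_total = 0
        self.check_pipeline()
```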
+++
The last experience was asking it to write a small CLI tool from scratch in a language I don't know. The tool worked like this: you point it at a directory, and it then checks that there are 5 or 6 files in that directory, and that the files are named a certain way, and are formatted a certain way. If the files are missing or not formatted correctly, throw an error.
The tool was for another team to use, so they could check these files, before they tried forwarding these files to me. So I needed an executable binary that I could throw up onto Dropbox or something, that the other team could just download and use. I primarily code in Python/JavaScript, and making a shareable tool like that with an interpreted language is a pain.
So I had Claude whip something up in Golang. It took about 2 hours, and the tool worked as advertised. Claude was very helpful.
On the one hand, this was a clear win for Claude. On the other hand, I didn't learn anything. I want to learn Go, and I can't say that I learned any Go from the experience. Next time I have to code a tool like that, I think I'll just write it from scratch myself, so I learn something.
+++
Eh. I've been using "AI" tools since they came out. I was the first at my company to get the pre-LLM Copilot autocomplete, and when ChatGPT became available I became a heavy user overnight. I have tried out Cursor (hate the VSCode nature of it), and I tried out the re-branded Copilot. Now I have tried Claude Code.
I am not an "AI" skeptic, but I still don't get the foaming hype. I feel like these tools at best make me 1.5X -- which is a lot, so I will always stay on top of new tooling -- but I don't feel like I am about to be replaced.
Your bad experience is because AI can’t really reason in general. It gets some kinda reasoning via a transformer, but that’s nothing like the reasoning that goes into the problem you described.
LLMs are great at translation. Turn this English into code, essentially. But ask it to solve a novel problem like that without a description of the solution, how will it approach it? If there’s an example in its training set maybe it can recall it. Otherwise it has no capability to derive a solution.
However, most problems (novel problems, problems not in the training set) can be decomposed into simpler, known problems. At the moment the AI isn't great at driving this decomposition, so that has to be driven by a meat bag.