It is now significantly harder to figure out who understands the systems and is using AI effectively and who doesn't know shit and is just slinging LLM copypasta around. Before 2025, the underperformers/coasters were at least relatively identifiable by the paucity of their contributions. Now all of the sudden every single engineer is filing PRs, code reviews, technical design documents, and every other artifact under the sun with perfect formatting and at least superficial plausibility. This is mostly due to incredible pressure from the C-level for every engineer to be using as much AI as possible, but it's also just a game theory respopnse because it's in every engineer's best interest to be as prolific as possible.
We are absolutely drowning in documentation and code that seems legit and the only recourse is to lean on AI to help process the sheer quantity of it. I have a feeling that the fallout from this phase of the industry is going to be an exotic form of technical debt that is remarkable mostly in its enormity.
I'm sure this is gated by where you work (especially by how technically savvy your manager is), but the most effective contributors at my job tend to be the ones with near-zero (or sub-zero!) net LoC.
LLMs are prolific and they love to add shit. Truly capable engineers are able to achieve more business outcomes with less code / fewer moving parts.
The most effective contributors at your job remove more code than they add? That doesn't sound effective that sounds like digging ditches to fill them. Every line of code removed is a line that was previously added.
Turning inefficient, unreadable code into efficient, readable code often results in an overall reduction in LoC.
High-quality code and high-volume code are highly anti-correlated. Incidentally, low-quality code that is excessively long just so happens to be common complaint with AI-generated code.
Maybe it was temporarily needed, but the assumptions around it has changed and now it's unneeded. Then people built on top of that not understanding that they could simplify the whole system, and only later was that option discovered.
I definitely agree with the GP, and the point is that most often someone else (or an LLM) added all those LOC that are removed to make the system sensible.
> That doesn't sound effective that sounds like digging ditches to fill them. Every line of code removed is a line that was previously added.
Because they were added doesn't mean they were needed and even if the same person added and then removed them, it doesn't mean they are digging ditches to fill them.
The idea that "I would have written a shorter letter, but I did not have the time" also applies to code, and sometimes later you are blessed with more time than you had when implementing something under deadline pressure.
> Because they were added doesn't mean they were needed and even if the same person added and then removed them, it doesn't mean they are digging ditches to fill them.
Huh? If LoC weren't needed then adding them was unnecessary and a waste of time. Someone who is known at an organization for removing unnecessary code screams inefficiency to me. It's paying one person to create a mess then another to clean it up.
> Huh? If they weren't needed then adding them was unnecessary and a waste of time.
My previous reply already addressed this?
I can't help but think you are being purposefully obtuse if you can't acknowledge the concept of developers creating known (and hopefully temporary) technical debt due to various forms of deadline related time pressure or changing requirements.
if you're trying to use sloc as a proxy for productivity in any way, shape or form you've already lost the game.
i tend to find that the most productive teams make better decisions and work fewer hours. the quality of decisions is such a huge force multiplier that it renders actual hours worked almost an irrelevant variable.
I think it depends a little on how and where you work. In the energy industry of Europe where we are extremely regulated AI has been writing some excellent and maintainable code. Of course we can't do any of that CLEAN SOLID DRY stuff, or any abstraction and implicity really, and I imagine that AI would struggle with that. Though you have to wonder if any of those religions ever really worked when you consider that they've still failed to replace most COBOL systems 30 years later. Anyway, that's a different discussion and even Uncle Bob has moved on to functional programming.
I've yet to have Opus 4.8 fail me with defensive explict code. Often it'll write code that is better than what I might have done. I imagine it would be a nightmare to go through one of the OOP debug chains with implict error handling, but when every function has a runtime assertion which is basically the contract for how it is supposed to work and exactly what to do if it encounters a corrupt state, then things are just so much easier with AI.
I do agree with you on documentation. The amount we have has exploded in the post AI world. Which is a little ironic since the assertion is frankly what you'll need to know and not the 10 pages of prose the AI autogenerated in the shared loop (microsoft's terrible confluence). It is what it is though, and at least it's easier to meet EU compliance rules now, since those are more about the bureaucracy than actual security.
This game-theory phase seems to go hand-in-hand with totally myopic grift/gig/hustle thinking too. There is no product but the confidence act needed to win each moment.
Chalk up yet another echo of the 1920s Gilded Age? Between all these economic spasms and the simultaneous tilting towards fascism, I think there is way too much historical rhyming going on right now...
So, in other words, all the "awesome engineers" can't really tell good code from bad unless it's really obvious? Why should we listen to you about AI code being crap, then? Maybe, in the end, you don't really know. Maybe AI is better at it than you?
No. Your AI tool that summarized the comment did you dirty. The key here was this:
> perfect formatting and at least superficial plausibility
Basically, a library full of books that have nice covers is going to take time to see that all those books are just filled with ipsum lorem. Before, they coudln't stand up a fake library.
I liked this article, and I see a lot of other commenters didn't, so I'll give my take:
When starting on a new codebase, how do you make yourself into a helpful contributor as quickly as possible? I go straight for the humans and their human docs. What problem was the system originally built to solve? What was the original design, and what were its biggest problems? Who is currently using it? If you know these, reading the code is much easier because you can guess why things were done the way they are.
I think Charity is observing a very old problem and expecting the new technology to lead to a new solution of some kind. I doubt she thinks even the current generation of tools are the end of the AI software development story. She's not saying we'll drop design docs right into Claude code and walk away (design docs aren't complete either, that's why when you're ramping up you also have to talk to people, read old tickets and postmortems, etc.)
What she's observing is that, in prod, people don't like infra where it's hard to tell how it got into is current state, and so infra-as-code is what we do now. She's also observing that, "it's hard to tell how it got into its current state" is the status quo with codebases, which other people have observed going back to "Programming as Theory Building" and earlier. And she's expecting that, analogous to infra, software development will somehow be done with tools focused on making "how the code got into its current state" clearer.
I wonder if the reception is so variable due to differing exposure to 1) infra as code and 2) engineering teams that don't produce any artifacts outside of their code.
> When starting on a new codebase, how do you make yourself into a helpful contributor as quickly as possible? I go straight for the humans and their human docs. What problem was the system originally built to solve? What was the original design, and what were its biggest problems? Who is currently using it? If you know these, reading the code is much easier because you can guess why things were done the way they are.
This is the way but plenty of engineering teams don't have any human docs at all. Decisions are made in one engineer's head or in a chat that isn't saved. The spec was just a few notes in a ticket that was deleted during cleanup or lost when the team changed trackers. There's no map of the codebase or features, no ADRs, minimal observability. All you have is the code. You read the code to try and figure out what is going on then ping an engineer who made a recent commit to a specific area to ask if they remember why something was done the way it was. Someone makes a change and it breaks something on the other side of the codebase that they thought was totally unrelated, etc.
I liked the article. It was a long (and entertaining) build up to the conclusion, but I'm scratching my head how the author got there.
AI needs more discipline, yes. But theoretically that discipline can be learned much easier than becoming a good engineer.
Think of it this way... 20 years ago, to write good, scalable C code - you needed to 1) either be a genius, or 2) dedicated to the craft.
You need to learn dozens of tools like the back of your hand.
* ASan
* LSan
* UBSan
* TSan
* GDB
etc... God forbid if you needed to manually read DWARF files. Unless you're a pure genius, this is not feasible to master in a short amount of time. And in parallel, you need to learn how to design systems, too, otherwise, you're still not very good, and that's an almost completely orthogonal skillset.
Now, you simply need to be aware of the hazards in your language/framework, tell your LLM to test for them, have the infrastructure set up to see if they've adequately tested for those hazards, and maybe read the actual tests and implementation.
It is pretty easy to be able to read and understand Rust compared to debugging all the sorcery-like errors that come during Rust development... It is easy to see that you need a Loom test for certain scenarios, and to write a tool to detect if you did it.
Even if you're still working in C or Zig, it far easier to know and detect when you need to use those tools then to learn to use them all individually.
It is not hard to learn to read SQL. Almost ~50% of business professionals can. Python is barely harder. Rust can look like sorcery if you don't read a 50 page guide to understand to read it, but that's a VERY small price to pay compared to spending ~10 years learning the craft painfully by trial and error.
I'm not sure how you get from "LLMs work in mysterious ways" to "So we need more discipline" to "everything is fine."
I agree that everything is fine. I just don't think this is the clear path and thought process.
Anyone who has the determination to get things to actually work, and takes a little bit of time to understand what makes them not, should be able to leverage LLMs to work wonders.
In my opinion, LLMs are going to make things far more complicated, because the cost of building something complicated is becoming almost free.
Engineering was always about discipline and getting things to work. But you needed a set of prerequisite skills to have much value. Most of those are gone now.
It is simply far easier than before. It does require discipline, yes. But discipline is cheap compared to ~10 years of trial by fire.
> Those are not code problems. They are evaluation problems.
> Code becomes precious when it is the only place knowledge lives.
Reading AI code all day is _agonizing_. Just, a horrible way to live, and it melts people's brains at the moment you need them to be the most capable.
Manual programming has this really productive and gratifying feedback loop, where you read the code, write the code, and fix it until it compiles/runs/does what you want. AI code not only does half that for you, but it makes the "click" at the end uninspiring because you're never sure if it's cheated a bit to get to that moment.
Trying to operate with AI-generated code as the only durable artifact of programming is a dead end for the industry. Charity points to (and correct discards) architecture diagrams/specs as an interesting space to work in. My suspicion is that it's closer to the thing that's hand-written: prompts, markdown plans, and other nudges. Focus on the thing that you, as a human, produce, and that's the basis for both the core loop of "did the AI follow my instructions" and it's higher-leverage when you go to code review.
By the time you get to the PR, you've probably typed enough to Claude that you can regenerate the code, but the current industry default is to just throw away all those sessions and ship the code. That's backwards!
If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks. Large dumps of code are basically unreviewable by humans, but it seems like a lot of people have forgotten about that when it comes to LLMs.
I think it's worse than that. At least if I dumped 5k LoC on somebody in 2021, you knew I spent the time to write it, so it's "fair" to ask you to read it. But I didn't write it in 2026, so you shouldn't read it.
I think it's less about "break it down" and more about "let's communicate at the same altitude."
Breaking up a giant PR can be a tedious, time-consuming hassle, and in the past I could sympathize in practice if someone had a giant PR they didn't have time to decompose once they got it working.
But it's also the exact sort of thing that LLMs are literally perfect for in my experience so there's really no excuse anymore. I've never seen Claude fail to turn a 5k PR into a well-decomposed Graphite stack.
It is not so much forgetting as much as it is acceptance that when welcoming AI into a codebase, the code can no longer matter; that all that matters is that the properties of the system are validated. That isn't a change that comes free, so nobody should be expecting magic, it is a different set of tradeoffs. There is no such thing as a panacea.
> If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks.
I would, and all my training at Google told me to do that. But what I found after I left that comfortable box was that somehow this kind of practice is acceptable in the industry at large and you're expected to just Deal With It(tm). 5k lines isn't even high by what I've seen.
Worse the "code review" tools that people have access to in GitHub make this absolutely and totally unworkable to incrementally improve review. Messy merge commits full of "responding to code review" comments. Threads impossible to follow. Just bad tooling.
So a lot of shops, from what I've seen, are just yeeting it with very shallow reviews.
This is my observation pre agentic AI. LLMs just threw kerosene on that dumpster fire.
Are there any products out there that are capturing the prompts/sessions? I imagine you could do it in an adhoc way, asking Claude to write up a summary of the session as part of the commit message. But is there anything else that's more structured/higher level?
> What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
I've been thinking about this a whole lot recently. So much of my intuition about software development is based on 25 years of accumulated experience on how long it will take to write different bits of code.
Should I add validation for this one edge-case which won't break everything but will make a little bit of a mess if someone hits it? If that's an extra couple of hours of code I might skip it. If it's one more prompt, why wouldn't I?
That's just on the small scale. There are entire projects that I'd never previously have considered, because I don't need a custom SQLite SELECT query parsing library enough to justify spending a week or more building one. But now... https://github.com/simonw/sqlite-ast
People get VERY upset (and condescending) any time you suggest that being able to produce lines of code faster is a valuable thing. And sure, measuring output through "lines of code" is stupid.
But measuring "lines of verified code that deliver valuable" isn't stupid at all. That's the thing we can do faster now.
I read the article, and it seems she is forgetting the aphorism "all models are wrong". This is a common mistake that people who like "realistic" "simulation" RPGs often make. Any suitably comprehensive model of a thing is just the thing itself. To have a model of a location that includes all the detail of the actual location, you would need a 1:1 scale model, which is just a copy of the location.
Any plan (i.e. prompt to a model) sufficiently capable of reliably replicating 100% of the functionality of a system is likely the source code of the system itself.
This has been my position from the beginning of when agentic coding harnesses became genuinely useful.
I now do documentation driven development, and with very few exceptions I am committing code that is better written, better documented, easier to reason about and maintain, with less library overuse than I ever did as a senior lead with a smal team, and I’m doing it for 1/4 the price, at 4x the speed.
But it’s not vibe coding. Discipline is critical, as is deep systems understanding.
arent they still? or at least a lot. its too much current to win the swim race against the deluge of llm LOC. but i also disagree with some of the things the author just casually lays out, which is whether the LLMs can write good code. they write working code, but it looks written by a demogorgon and i get a bit ill seeing it. its bad but not bad in a way that a human would ever write, like i dont get that kind of sick reading spaghetti code written by new devs. it's a kind of sick like cthulhus eggs are hatching somewhere in your guts.
Ok, I like the idea and support that seniors value simplicity ... but how the hell do you stay employed for even a month (let alone until "manager time") without writing any code?
You don't just delete stuff… it's more that your pull requests remove more lines than they add. But I'm sure the person you're replying to is exaggerating, or they got promoted because of completely unrelated reasons.
Great article. I'm not sure the author is correct - but I think something is happening to the adage:
> A sufficiently detailed specification is runnable code.
In a way I think LLMs will enable the dream of 4gl and "sufficiently smart compilers"[c].
LLMs aren't smart, but they are capable. Especially capable of translation and transformation.
I can certainly see them help move the abstraction horizon at which we work - so that rigid high level descriptions of the desired logic/process along with the process for quality testing - become the relevant curated artifacts - and the generated go/rust/java/python/etc code become incidental and mutable; subject to constant rewriting as part of the deployment of systems.
[c] You know, the ones that take naive C/C++ and produce executables that fully leverage RISC/EPIC platforms to be better than CISC. See also: Intel Itanium
This is what Anthropic did with agents and $20k to write a C compiler that survived gcc’s torture suite. But the LLM knew:
1. What a C compiler was
2. What a C compiler looked like
3. What the C compiler had to do at runtime to pass gcc’s torture suite through some sort of collaborative iteration (compile, run, did it get stuck at some torture suite test or fail?)
Remove 1 and 2, or replace it with imperfect business logic, and you’re left with a system that is built to _only_ pass the tests you supply it, or in the most extreme case, print(“unit and functional tests pass!”)
I did not enjoy reading this article. The writing was fine, and each individual paragraph was fine, but the whole thing together was meandering and dare I say pointless. It was so many words and yet so little seems to have been said.
I'm not sure this article had enough thought put into it. For example:
What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
It's not so much as "the economics [...] were turned upside down", but that a manufacturing process that used to be strictly additive (akin to 3D printing) is now complemented by a subtractive process (akin to CNC milling). The "shape" that is demanded hasn't really changed, and nor has the human effort (as long as you care about achieving certain tolerances). You still have to "treasure, reuse, care for, and curate" your product to whatever degree the market demands.
Also I disagree with:
Lines of code are not the ideal artifact to review
What does "ideal" mean here? When I was growing up "show your work" was the rule for all examinations. Why? Because we're working to improve mental models and thought processes for the next generation, not just products we will release tomorrow.
I think the point is that there are better engineering artifacts to review instead of lines of code. Encoding the decisions, structure, requirements, testing, monitoring, then reviewing those and having AI generate and regenerate code based on them. The code itself doesn't matter if enough thought and rigor has gone into the structure that produces the code.
> What does "ideal" mean here? When I was growing up "show your work" was the rule for all examinations. Why? Because we're working to improve mental models and thought processes for the next generation, not just products we will release tomorrow.
They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
> They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
What I meant is that, insofar as some work has been produced with a human mind involved and where imperfect abstractions are used, one should not for whatever idealistic reasons push for reviewing the work at some coarser granularity than the details which are readily available. That's a way to foster and encourage mistakes, in both the work and in the mental model.
So when you say that code is not the place for that work to live (or more closely to the line I disagree with, that code is not an 'ideal' artifact to review), you are essentially purporting that there is a perfect abstraction that can generally be trusted, which I disagree is currently the case for an LLM spec versus produced code.
> They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
They’re important for discussion and brainstorming. They’re also important for sharing context before reviewing. But code is the only perfect representation in terms of semantics of what the computer will do.
You can have all the diagram and all the proses you want, but they’re still ambiguous.
The intro lobbed up a clear cut point of contention for the article to address. I found the following writing to loose steam on that point. I turned to skimming, and did not manage to find a conclusion.
I suspect the stance they described as one readers mistakenly took away from their previous article to in fact be their stance. Otherwise why dance around it so hard?
> The writing was fine, and each individual paragraph was fine, but the whole thing together was meandering and dare I say pointless. It was so many words and yet so little seems to have been said.
I have a doubt that one of Three Virtues of a Programmer, laziness is still considered a virtue on AI coding era.
Now that AI coding speed and performance outperformed most of human. But AI still need human to be commanded. Yes, you can let AI agent manage sub-agents but still, human is at the top of manager who order AI what should be written.
So human must command and final say on when it's done.
As defined by Larry Wall: "Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it."
That is still an enormous virtue in the AI era. It is completely the opposite of what many AI-using programmers are doing, which is being lazy in the conventional sense, minimizing their individual energy expenditure at the price of increasing the overall energy expenditure.
Being big-picture lazy is a virtue. Being individually lazy is a vice.
>"It’s easy to forget, but for most of 2025, the idea that AI-generated code was slop and might always be slop was not only a reasonable position to hold, it was the default, mainstream position.
That question was answered decisively last November."
It's easy to forget that people said this exact thing about every model after GPT 3.5. This is a standard trick the industry uses to invalidate negative experience with LLMs. 'You are prompting it wrong' becomes 'you are using Gemini, but you should use Clade' which then becomes 'well, all of your criticism is now irrelevant, because everything is fixed in this new version'.
This "discussion" about capabilities is set up to be asymmetrical and basically non-falsifiable.
You seem to be saying model capabilities aren't improving. They are. The fact that many mathematicians have looked at the result and confirmed it and solved some other problems with the technique elevates this above claims.
i mean i am very much still waiting for it to not be slop, but fable actually i think made a bit of headway in this direction, the code it writes what little of it i saw, makes me want to fall over dead slightly less than other models.
In general most developers are going to find themselves fighting incentives which will color their opinion. AI isn't there yet but if you are going to abase your whole world view on a point on a graph and not on the trajectory you are in for a bad time.
> A few days back I wrote a piece called “AI enthusiasts are in a race against time, AI skeptics are in a race against entropy.”
Guess who the author is.
> > The enthusiasts are not wrong. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.
It’s not imaginary. It’s real. This time it’s different. And on a higher level, the FOMO is real. It’s not imaginary. It’s even existential.
Why do they all write the same as well? It’s so emphatic.
> The tech is cool, but as a thinking, feeling, breathing human who cares about other people, it can be hard to get excited about anything that so many people are this upset about. It’s also hard to get excited about something when so many of the loudest voices are out there talking gleefully about putting everyone permanently out of work, and so many artists and writers and people from developing nations are talking openly about the impact on them.
> Hold your desire to jump in and berate me here, I beg you. Like I said, I will deal with the ethics and morality of using AI in my very next post. Be honest, your attention span is no more up for reading a 10,000-word essay than mine is up for writing one. (Can we blame AI for that too?)
More Inevitability Soothsaying. All our feelings are crashing with Existentinal Threat Reality.
Writing software begins with a solid design that is defensible. If you don't have that, the AI will produce slop.
Once you're happy with the design, you need a solid plan. If you don't have that, the AI will produce slop.
Once you're happy with the plan, you can set the AI loose, but don't get too complacent! Anything that you missed in the previous phases could very well lead to slop (although likely localized).
And then then, as your project matures and you gain more understanding of the space, you start to notice deficiencies in your model. This is where AI really shines: design and code changes to adapt to reality.
This is why I built https://saasufy.com/ - Vibe coders shouldn't trust themselves with backend security. Unfortunately, it's extremely difficult to get right. There's a lot to think about;
- Schema validation with appropriate size limits on all relevant fields.
- Authentication.
- Access control.
- Backpressure management and rate limiting in case a (possibly malicious) user tries to perform too many computationally expensive actions in a short time.
- Ensuring that the actions of one user doesn't throttle another user which is connected to the same process/host, e.g. using async constructs to avoid freezing the main process.
- DDoS mitigation.
- Avoiding race conditions.
- Designing a good database schema, with well chosen indexes, with deterministic IDs/idempotency to avoid double-insertion scenarios. You don't want to be forced to rely on overly complex queries with a lot of joins. This doesn't scale well and rarely necessary.
- Logging and error handling.
- Avoiding conflicts and accidental overwrite with old data when multiple users are editing different fields of the same resource concurrently.
- Efficient distribution of realtime messages.
- Scalability.
The list goes on and on... And every piece has to be implemented perfectly. This involves a huge number of carefully thought-out decisions.
It is now significantly harder to figure out who understands the systems and is using AI effectively and who doesn't know shit and is just slinging LLM copypasta around. Before 2025, the underperformers/coasters were at least relatively identifiable by the paucity of their contributions. Now all of the sudden every single engineer is filing PRs, code reviews, technical design documents, and every other artifact under the sun with perfect formatting and at least superficial plausibility. This is mostly due to incredible pressure from the C-level for every engineer to be using as much AI as possible, but it's also just a game theory respopnse because it's in every engineer's best interest to be as prolific as possible.
We are absolutely drowning in documentation and code that seems legit and the only recourse is to lean on AI to help process the sheer quantity of it. I have a feeling that the fallout from this phase of the industry is going to be an exotic form of technical debt that is remarkable mostly in its enormity.
I'm sure this is gated by where you work (especially by how technically savvy your manager is), but the most effective contributors at my job tend to be the ones with near-zero (or sub-zero!) net LoC.
LLMs are prolific and they love to add shit. Truly capable engineers are able to achieve more business outcomes with less code / fewer moving parts.
The most effective contributors at your job remove more code than they add? That doesn't sound effective that sounds like digging ditches to fill them. Every line of code removed is a line that was previously added.
Turning inefficient, unreadable code into efficient, readable code often results in an overall reduction in LoC.
High-quality code and high-volume code are highly anti-correlated. Incidentally, low-quality code that is excessively long just so happens to be common complaint with AI-generated code.
> The most effective contributors at your job remove more code than they add
Perhaps they tackle non-code-editing tasks like architecture, design, mentoring and code review (think staff and principal tasks)
> Every line of code removed is a line that was previously added
Yes. This os not a failure. Code has a surprisingly short half-life.
It’s not a failure that resources were spent writing code that was not needed?
Maybe it was temporarily needed, but the assumptions around it has changed and now it's unneeded. Then people built on top of that not understanding that they could simplify the whole system, and only later was that option discovered.
What would you keep from this?
I definitely agree with the GP, and the point is that most often someone else (or an LLM) added all those LOC that are removed to make the system sensible.
> That doesn't sound effective that sounds like digging ditches to fill them. Every line of code removed is a line that was previously added.
Because they were added doesn't mean they were needed and even if the same person added and then removed them, it doesn't mean they are digging ditches to fill them.
The idea that "I would have written a shorter letter, but I did not have the time" also applies to code, and sometimes later you are blessed with more time than you had when implementing something under deadline pressure.
> Because they were added doesn't mean they were needed and even if the same person added and then removed them, it doesn't mean they are digging ditches to fill them.
Huh? If LoC weren't needed then adding them was unnecessary and a waste of time. Someone who is known at an organization for removing unnecessary code screams inefficiency to me. It's paying one person to create a mess then another to clean it up.
> Huh? If they weren't needed then adding them was unnecessary and a waste of time.
My previous reply already addressed this?
I can't help but think you are being purposefully obtuse if you can't acknowledge the concept of developers creating known (and hopefully temporary) technical debt due to various forms of deadline related time pressure or changing requirements.
• LoC/LOC = Lines of Code
• sloc = Source Lines of Code
.. so I suppose nloc would mean Net LoC
if you're trying to use sloc as a proxy for productivity in any way, shape or form you've already lost the game.
i tend to find that the most productive teams make better decisions and work fewer hours. the quality of decisions is such a huge force multiplier that it renders actual hours worked almost an irrelevant variable.
Maybe the solution is to look out for the most silent engineers. Those that output less despite having the ability to create near infinite output.
I think it depends a little on how and where you work. In the energy industry of Europe where we are extremely regulated AI has been writing some excellent and maintainable code. Of course we can't do any of that CLEAN SOLID DRY stuff, or any abstraction and implicity really, and I imagine that AI would struggle with that. Though you have to wonder if any of those religions ever really worked when you consider that they've still failed to replace most COBOL systems 30 years later. Anyway, that's a different discussion and even Uncle Bob has moved on to functional programming.
I've yet to have Opus 4.8 fail me with defensive explict code. Often it'll write code that is better than what I might have done. I imagine it would be a nightmare to go through one of the OOP debug chains with implict error handling, but when every function has a runtime assertion which is basically the contract for how it is supposed to work and exactly what to do if it encounters a corrupt state, then things are just so much easier with AI.
I do agree with you on documentation. The amount we have has exploded in the post AI world. Which is a little ironic since the assertion is frankly what you'll need to know and not the 10 pages of prose the AI autogenerated in the shared loop (microsoft's terrible confluence). It is what it is though, and at least it's easier to meet EU compliance rules now, since those are more about the bureaucracy than actual security.
AI usage perhaps will have to be monitored by AI.
This game-theory phase seems to go hand-in-hand with totally myopic grift/gig/hustle thinking too. There is no product but the confidence act needed to win each moment.
Chalk up yet another echo of the 1920s Gilded Age? Between all these economic spasms and the simultaneous tilting towards fascism, I think there is way too much historical rhyming going on right now...
So, in other words, all the "awesome engineers" can't really tell good code from bad unless it's really obvious? Why should we listen to you about AI code being crap, then? Maybe, in the end, you don't really know. Maybe AI is better at it than you?
No. Your AI tool that summarized the comment did you dirty. The key here was this:
> perfect formatting and at least superficial plausibility
Basically, a library full of books that have nice covers is going to take time to see that all those books are just filled with ipsum lorem. Before, they coudln't stand up a fake library.
The issue comes down to time and effort.
I liked this article, and I see a lot of other commenters didn't, so I'll give my take:
When starting on a new codebase, how do you make yourself into a helpful contributor as quickly as possible? I go straight for the humans and their human docs. What problem was the system originally built to solve? What was the original design, and what were its biggest problems? Who is currently using it? If you know these, reading the code is much easier because you can guess why things were done the way they are.
Also, this blog post has gotten popular: https://blog.gpkb.org/posts/just-send-me-the-prompt/
I think Charity is observing a very old problem and expecting the new technology to lead to a new solution of some kind. I doubt she thinks even the current generation of tools are the end of the AI software development story. She's not saying we'll drop design docs right into Claude code and walk away (design docs aren't complete either, that's why when you're ramping up you also have to talk to people, read old tickets and postmortems, etc.)
What she's observing is that, in prod, people don't like infra where it's hard to tell how it got into is current state, and so infra-as-code is what we do now. She's also observing that, "it's hard to tell how it got into its current state" is the status quo with codebases, which other people have observed going back to "Programming as Theory Building" and earlier. And she's expecting that, analogous to infra, software development will somehow be done with tools focused on making "how the code got into its current state" clearer.
I wonder if the reception is so variable due to differing exposure to 1) infra as code and 2) engineering teams that don't produce any artifacts outside of their code.
> When starting on a new codebase, how do you make yourself into a helpful contributor as quickly as possible? I go straight for the humans and their human docs. What problem was the system originally built to solve? What was the original design, and what were its biggest problems? Who is currently using it? If you know these, reading the code is much easier because you can guess why things were done the way they are.
This is the way but plenty of engineering teams don't have any human docs at all. Decisions are made in one engineer's head or in a chat that isn't saved. The spec was just a few notes in a ticket that was deleted during cleanup or lost when the team changed trackers. There's no map of the codebase or features, no ADRs, minimal observability. All you have is the code. You read the code to try and figure out what is going on then ping an engineer who made a recent commit to a specific area to ask if they remember why something was done the way it was. Someone makes a change and it breaks something on the other side of the codebase that they thought was totally unrelated, etc.
You guys get tickets that tell you what to do in detail?
I liked the article. It was a long (and entertaining) build up to the conclusion, but I'm scratching my head how the author got there.
AI needs more discipline, yes. But theoretically that discipline can be learned much easier than becoming a good engineer.
Think of it this way... 20 years ago, to write good, scalable C code - you needed to 1) either be a genius, or 2) dedicated to the craft.
You need to learn dozens of tools like the back of your hand.
* ASan
* LSan
* UBSan
* TSan
* GDB
etc... God forbid if you needed to manually read DWARF files. Unless you're a pure genius, this is not feasible to master in a short amount of time. And in parallel, you need to learn how to design systems, too, otherwise, you're still not very good, and that's an almost completely orthogonal skillset.
Now, you simply need to be aware of the hazards in your language/framework, tell your LLM to test for them, have the infrastructure set up to see if they've adequately tested for those hazards, and maybe read the actual tests and implementation.
It is pretty easy to be able to read and understand Rust compared to debugging all the sorcery-like errors that come during Rust development... It is easy to see that you need a Loom test for certain scenarios, and to write a tool to detect if you did it.
Even if you're still working in C or Zig, it far easier to know and detect when you need to use those tools then to learn to use them all individually.
It is not hard to learn to read SQL. Almost ~50% of business professionals can. Python is barely harder. Rust can look like sorcery if you don't read a 50 page guide to understand to read it, but that's a VERY small price to pay compared to spending ~10 years learning the craft painfully by trial and error.
I'm not sure how you get from "LLMs work in mysterious ways" to "So we need more discipline" to "everything is fine."
I agree that everything is fine. I just don't think this is the clear path and thought process.
Anyone who has the determination to get things to actually work, and takes a little bit of time to understand what makes them not, should be able to leverage LLMs to work wonders.
In my opinion, LLMs are going to make things far more complicated, because the cost of building something complicated is becoming almost free.
Engineering was always about discipline and getting things to work. But you needed a set of prerequisite skills to have much value. Most of those are gone now.
It is simply far easier than before. It does require discipline, yes. But discipline is cheap compared to ~10 years of trial by fire.
> Those are not code problems. They are evaluation problems.
> Code becomes precious when it is the only place knowledge lives.
Reading AI code all day is _agonizing_. Just, a horrible way to live, and it melts people's brains at the moment you need them to be the most capable.
Manual programming has this really productive and gratifying feedback loop, where you read the code, write the code, and fix it until it compiles/runs/does what you want. AI code not only does half that for you, but it makes the "click" at the end uninspiring because you're never sure if it's cheated a bit to get to that moment.
Trying to operate with AI-generated code as the only durable artifact of programming is a dead end for the industry. Charity points to (and correct discards) architecture diagrams/specs as an interesting space to work in. My suspicion is that it's closer to the thing that's hand-written: prompts, markdown plans, and other nudges. Focus on the thing that you, as a human, produce, and that's the basis for both the core loop of "did the AI follow my instructions" and it's higher-leverage when you go to code review.
By the time you get to the PR, you've probably typed enough to Claude that you can regenerate the code, but the current industry default is to just throw away all those sessions and ship the code. That's backwards!
If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks. Large dumps of code are basically unreviewable by humans, but it seems like a lot of people have forgotten about that when it comes to LLMs.
I think it's worse than that. At least if I dumped 5k LoC on somebody in 2021, you knew I spent the time to write it, so it's "fair" to ask you to read it. But I didn't write it in 2026, so you shouldn't read it.
I think it's less about "break it down" and more about "let's communicate at the same altitude."
I wrote a (bait-titled) post about it: https://tern.sh/blog/stop-reading-prs/
113 files +22913 −2423
305 files +15075 −13110
153 files +21934 −8698
125 files +28120 −2398
43 files +11188 −63
118 files +21564 −647
These are the largest (6 of 35) in the past 30 days. added: 190079 removed: 39696 in the last 6 months
from one person.
Breaking up a giant PR can be a tedious, time-consuming hassle, and in the past I could sympathize in practice if someone had a giant PR they didn't have time to decompose once they got it working.
But it's also the exact sort of thing that LLMs are literally perfect for in my experience so there's really no excuse anymore. I've never seen Claude fail to turn a 5k PR into a well-decomposed Graphite stack.
It is not so much forgetting as much as it is acceptance that when welcoming AI into a codebase, the code can no longer matter; that all that matters is that the properties of the system are validated. That isn't a change that comes free, so nobody should be expecting magic, it is a different set of tradeoffs. There is no such thing as a panacea.
I think they expect you to also use an LLM to review, and I bet they are doing exactly that when asked to review someone else's code.
> If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks.
I would, and all my training at Google told me to do that. But what I found after I left that comfortable box was that somehow this kind of practice is acceptable in the industry at large and you're expected to just Deal With It(tm). 5k lines isn't even high by what I've seen.
Worse the "code review" tools that people have access to in GitHub make this absolutely and totally unworkable to incrementally improve review. Messy merge commits full of "responding to code review" comments. Threads impossible to follow. Just bad tooling.
So a lot of shops, from what I've seen, are just yeeting it with very shallow reviews.
This is my observation pre agentic AI. LLMs just threw kerosene on that dumpster fire.
Are there any products out there that are capturing the prompts/sessions? I imagine you could do it in an adhoc way, asking Claude to write up a summary of the session as part of the commit message. But is there anything else that's more structured/higher level?
We're working on it, thought it's all early. I'd love feedback: https://tern.sh
First product compares the code to the prompts and highlights places the agent made decisions you weren't involved in: https://tern.sh/docs/tours/
> What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
I've been thinking about this a whole lot recently. So much of my intuition about software development is based on 25 years of accumulated experience on how long it will take to write different bits of code.
Should I add validation for this one edge-case which won't break everything but will make a little bit of a mess if someone hits it? If that's an extra couple of hours of code I might skip it. If it's one more prompt, why wouldn't I?
This new feature would be a lot easier to understand if there was a custom API explorer for it. There's no way I could justify investing in that... unless it's just 10 minutes with Codex, and it was: https://tools.simonwillison.net/datasette-extras-explorer#ur... (linked from the release notes https://docs.datasette.io/en/latest/changelog.html#extra-sup...)
That's just on the small scale. There are entire projects that I'd never previously have considered, because I don't need a custom SQLite SELECT query parsing library enough to justify spending a week or more building one. But now... https://github.com/simonw/sqlite-ast
People get VERY upset (and condescending) any time you suggest that being able to produce lines of code faster is a valuable thing. And sure, measuring output through "lines of code" is stupid.
But measuring "lines of verified code that deliver valuable" isn't stupid at all. That's the thing we can do faster now.
I read the article, and it seems she is forgetting the aphorism "all models are wrong". This is a common mistake that people who like "realistic" "simulation" RPGs often make. Any suitably comprehensive model of a thing is just the thing itself. To have a model of a location that includes all the detail of the actual location, you would need a 1:1 scale model, which is just a copy of the location. Any plan (i.e. prompt to a model) sufficiently capable of reliably replicating 100% of the functionality of a system is likely the source code of the system itself.
This has been my position from the beginning of when agentic coding harnesses became genuinely useful.
I now do documentation driven development, and with very few exceptions I am committing code that is better written, better documented, easier to reason about and maintain, with less library overuse than I ever did as a senior lead with a smal team, and I’m doing it for 1/4 the price, at 4x the speed.
But it’s not vibe coding. Discipline is critical, as is deep systems understanding.
Before 2023 I remember everyone here on HN championed that removing lines of code was the strongest senior metric
arent they still? or at least a lot. its too much current to win the swim race against the deluge of llm LOC. but i also disagree with some of the things the author just casually lays out, which is whether the LLMs can write good code. they write working code, but it looks written by a demogorgon and i get a bit ill seeing it. its bad but not bad in a way that a human would ever write, like i dont get that kind of sick reading spaghetti code written by new devs. it's a kind of sick like cthulhus eggs are hatching somewhere in your guts.
Removing lines of code without removing functionality.
Simplification is still good. I remember one senior that only removed code when he joined the company I was at until he became a manager!
Ok, I like the idea and support that seniors value simplicity ... but how the hell do you stay employed for even a month (let alone until "manager time") without writing any code?
You don't just delete stuff… it's more that your pull requests remove more lines than they add. But I'm sure the person you're replying to is exaggerating, or they got promoted because of completely unrelated reasons.
Great article. I'm not sure the author is correct - but I think something is happening to the adage:
> A sufficiently detailed specification is runnable code.
In a way I think LLMs will enable the dream of 4gl and "sufficiently smart compilers"[c].
LLMs aren't smart, but they are capable. Especially capable of translation and transformation.
I can certainly see them help move the abstraction horizon at which we work - so that rigid high level descriptions of the desired logic/process along with the process for quality testing - become the relevant curated artifacts - and the generated go/rust/java/python/etc code become incidental and mutable; subject to constant rewriting as part of the deployment of systems.
[c] You know, the ones that take naive C/C++ and produce executables that fully leverage RISC/EPIC platforms to be better than CISC. See also: Intel Itanium
This is what Anthropic did with agents and $20k to write a C compiler that survived gcc’s torture suite. But the LLM knew:
1. What a C compiler was
2. What a C compiler looked like
3. What the C compiler had to do at runtime to pass gcc’s torture suite through some sort of collaborative iteration (compile, run, did it get stuck at some torture suite test or fail?)
Remove 1 and 2, or replace it with imperfect business logic, and you’re left with a system that is built to _only_ pass the tests you supply it, or in the most extreme case, print(“unit and functional tests pass!”)
It was also trained on gcc and clang.
I did not enjoy reading this article. The writing was fine, and each individual paragraph was fine, but the whole thing together was meandering and dare I say pointless. It was so many words and yet so little seems to have been said.
I'm not sure this article had enough thought put into it. For example:
It's not so much as "the economics [...] were turned upside down", but that a manufacturing process that used to be strictly additive (akin to 3D printing) is now complemented by a subtractive process (akin to CNC milling). The "shape" that is demanded hasn't really changed, and nor has the human effort (as long as you care about achieving certain tolerances). You still have to "treasure, reuse, care for, and curate" your product to whatever degree the market demands.Also I disagree with:
What does "ideal" mean here? When I was growing up "show your work" was the rule for all examinations. Why? Because we're working to improve mental models and thought processes for the next generation, not just products we will release tomorrow.I think the point is that there are better engineering artifacts to review instead of lines of code. Encoding the decisions, structure, requirements, testing, monitoring, then reviewing those and having AI generate and regenerate code based on them. The code itself doesn't matter if enough thought and rigor has gone into the structure that produces the code.
> What does "ideal" mean here? When I was growing up "show your work" was the rule for all examinations. Why? Because we're working to improve mental models and thought processes for the next generation, not just products we will release tomorrow.
They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
> They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
What I meant is that, insofar as some work has been produced with a human mind involved and where imperfect abstractions are used, one should not for whatever idealistic reasons push for reviewing the work at some coarser granularity than the details which are readily available. That's a way to foster and encourage mistakes, in both the work and in the mental model.
So when you say that code is not the place for that work to live (or more closely to the line I disagree with, that code is not an 'ideal' artifact to review), you are essentially purporting that there is a perfect abstraction that can generally be trusted, which I disagree is currently the case for an LLM spec versus produced code.
> They're saying that the mental models and thought processes are incredibly important but that code is not the place for that work to live.
They’re important for discussion and brainstorming. They’re also important for sharing context before reviewing. But code is the only perfect representation in terms of semantics of what the computer will do.
You can have all the diagram and all the proses you want, but they’re still ambiguous.
The intro lobbed up a clear cut point of contention for the article to address. I found the following writing to loose steam on that point. I turned to skimming, and did not manage to find a conclusion.
I suspect the stance they described as one readers mistakenly took away from their previous article to in fact be their stance. Otherwise why dance around it so hard?
> The writing was fine, and each individual paragraph was fine, but the whole thing together was meandering and dare I say pointless. It was so many words and yet so little seems to have been said.
I bet that I know why!
I enjoyed it, people post on blogs as a way to entertain themselves, not necessarily the reader.
meta, but: I gave up. I found the language really hard to follow and the point of the piece didn’t stand out to me. shrug
I have a doubt that one of Three Virtues of a Programmer, laziness is still considered a virtue on AI coding era.
Now that AI coding speed and performance outperformed most of human. But AI still need human to be commanded. Yes, you can let AI agent manage sub-agents but still, human is at the top of manager who order AI what should be written.
So human must command and final say on when it's done.
Is laziness still a good virtue in AI era?
I'd argue using AI is the epitome of laziness, at least in some sense.
If you buy that, then it follows that the more work you accomplish with AI, the "lazier" of a dev you are.
As defined by Larry Wall: "Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it."
That is still an enormous virtue in the AI era. It is completely the opposite of what many AI-using programmers are doing, which is being lazy in the conventional sense, minimizing their individual energy expenditure at the price of increasing the overall energy expenditure.
Being big-picture lazy is a virtue. Being individually lazy is a vice.
>"It’s easy to forget, but for most of 2025, the idea that AI-generated code was slop and might always be slop was not only a reasonable position to hold, it was the default, mainstream position.
That question was answered decisively last November."
It's easy to forget that people said this exact thing about every model after GPT 3.5. This is a standard trick the industry uses to invalidate negative experience with LLMs. 'You are prompting it wrong' becomes 'you are using Gemini, but you should use Clade' which then becomes 'well, all of your criticism is now irrelevant, because everything is fixed in this new version'.
This "discussion" about capabilities is set up to be asymmetrical and basically non-falsifiable.
The old model couldn't do math, the new one solved a big open problem.
"Open AI claims that its model disproven an Erdős conjecture, therefore my crappy way of arguing about software quality is valid."
I really don't know how I'm supposed to reply to stuff like this.
You seem to be saying model capabilities aren't improving. They are. The fact that many mathematicians have looked at the result and confirmed it and solved some other problems with the technique elevates this above claims.
i mean i am very much still waiting for it to not be slop, but fable actually i think made a bit of headway in this direction, the code it writes what little of it i saw, makes me want to fall over dead slightly less than other models.
So, using artificial intelligence requires more expertise than not using it?
If you ask a surgeon if you need surgery...
In general most developers are going to find themselves fighting incentives which will color their opinion. AI isn't there yet but if you are going to abase your whole world view on a point on a graph and not on the trajectory you are in for a bad time.
Thanks, great to have the perspective of thoughtful engineers who have been in the trenches for a long time
> A few days back I wrote a piece called “AI enthusiasts are in a race against time, AI skeptics are in a race against entropy.”
Guess who the author is.
> > The enthusiasts are not wrong. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.
It’s not imaginary. It’s real. This time it’s different. And on a higher level, the FOMO is real. It’s not imaginary. It’s even existential.
Why do they all write the same as well? It’s so emphatic.
> The tech is cool, but as a thinking, feeling, breathing human who cares about other people, it can be hard to get excited about anything that so many people are this upset about. It’s also hard to get excited about something when so many of the loudest voices are out there talking gleefully about putting everyone permanently out of work, and so many artists and writers and people from developing nations are talking openly about the impact on them.
> Hold your desire to jump in and berate me here, I beg you. Like I said, I will deal with the ethics and morality of using AI in my very next post. Be honest, your attention span is no more up for reading a 10,000-word essay than mine is up for writing one. (Can we blame AI for that too?)
More Inevitability Soothsaying. All our feelings are crashing with Existentinal Threat Reality.
This has been my experience with AI.
Writing software begins with a solid design that is defensible. If you don't have that, the AI will produce slop.
Once you're happy with the design, you need a solid plan. If you don't have that, the AI will produce slop.
Once you're happy with the plan, you can set the AI loose, but don't get too complacent! Anything that you missed in the previous phases could very well lead to slop (although likely localized).
And then then, as your project matures and you gain more understanding of the space, you start to notice deficiencies in your model. This is where AI really shines: design and code changes to adapt to reality.
Broadly concur with this and in fact it’s all of this is going to make doing real engineering easier in my opinion
The author makes the wrong assumption though that the majority of people who are doing engineering want to do even more engineering.
It’s my experience that most technology workers just want a high paycheck and have some kind of association with being in tech and doing cool things
> a high paycheck and have some kind of association with being in tech and doing cool things
yeh, I can see how that is now mistaken for a definition of 'engineer' or 'hacker'.
I am sorry you never knew what engineering truly means.
That is the problem imo. Most tech workers want a big check and no work. Gross. I like the work. But i do get wanting to get a nut with little effort.
> Instead of being very hard, time-consuming, and expensive to generate code
Was this article written by AI? It's certainly stupid enough!
This is why I built https://saasufy.com/ - Vibe coders shouldn't trust themselves with backend security. Unfortunately, it's extremely difficult to get right. There's a lot to think about;
- Schema validation with appropriate size limits on all relevant fields.
- Authentication.
- Access control.
- Backpressure management and rate limiting in case a (possibly malicious) user tries to perform too many computationally expensive actions in a short time.
- Ensuring that the actions of one user doesn't throttle another user which is connected to the same process/host, e.g. using async constructs to avoid freezing the main process.
- DDoS mitigation.
- Avoiding race conditions.
- Designing a good database schema, with well chosen indexes, with deterministic IDs/idempotency to avoid double-insertion scenarios. You don't want to be forced to rely on overly complex queries with a lot of joins. This doesn't scale well and rarely necessary.
- Logging and error handling.
- Avoiding conflicts and accidental overwrite with old data when multiple users are editing different fields of the same resource concurrently.
- Efficient distribution of realtime messages.
- Scalability.
The list goes on and on... And every piece has to be implemented perfectly. This involves a huge number of carefully thought-out decisions.