Things I've Done with AI

(sjer.red)

84 points | by shepherdjerred a day ago

135 comments

brotchie 21 hours ago

Not enough time, too many projects. Useful projects I did over the weekend with Opus 4.6 and GPT 5.4 (just casually chatting with it).

2025 Taxes

Dumped all PDFs of all my tax forms into a single folder, asked Claude to rename them nicely. Asked it to use Gemini 2.5 Flash to extract all tax-relevant details from all statements / tax forms. Had it put together a web UI showing all income, deductions, etc., for the year. Had it estimate my 2025 tax refund / underpayment.

Result was amazing. I now actually fully understand the tax position. It broke down all the progressive tax brackets, added notes for all the extra federal and state taxes (i.e. Medicare, CA Mental Health tax, etc).
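
A minimal sketch of the kind of bracket breakdown described above. The bracket bounds and rates are hypothetical placeholders (not real federal or state figures), and `progressive_tax` is an illustrative name, not something from the actual project:

```python
from decimal import Decimal

# Hypothetical brackets for illustration only: (upper bound, marginal rate).
# These are NOT real federal or state figures.
BRACKETS = [
    (Decimal("10000"), Decimal("0.10")),
    (Decimal("40000"), Decimal("0.20")),
    (None, Decimal("0.30")),  # no upper bound on the top bracket
]


def progressive_tax(taxable_income, brackets=BRACKETS):
    """Return (total_tax, breakdown) for a progressive schedule.

    Each breakdown row is (lower, top, rate, tax_on_slice), so the
    per-bracket contribution can be shown next to the total.
    """
    income = Decimal(taxable_income)
    total = Decimal("0")
    breakdown = []
    lower = Decimal("0")
    for upper, rate in brackets:
        if income <= lower:
            break  # nothing left to tax in this or any higher bracket
        top = income if upper is None else min(income, upper)
        slice_tax = (top - lower) * rate
        breakdown.append((lower, top, rate, slice_tax))
        total += slice_tax
        if upper is None:
            break
        lower = upper
    return total, breakdown
```

Only the slice of income inside each bracket is taxed at that bracket's rate, which is why the breakdown rows are worth displaying alongside the total.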

Finally had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.

Desk Fabrication

Planning on having a furniture maker fabricate a custom solid walnut desk for a custom office standing desk. Want to create a STEP file of the exact cuts / bevels / countersinks / etc. to help with fabrication.

Worked with Codex to plan out and then build an interactive in-browser 3D CAD experience. I can ask Codex to add some component (e.g. a grommet) and it will generate a parameterized B-rep geometry for that feature and then allow me to control the parameters live in the web UI.

Codex found the Open CASCADE Technology (OCCT) B-rep modeling library, which has a WebAssembly-compiled version, and integrated it.

Now have a WebGL view of the desk, can add various components, change their parameters, and see the impact live in 3D.

cj 21 hours ago

I love the tax use case.

What scares me though is how I've (still) seen ChatGPT make up numbers in some specific scenarios.

I have a ChatGPT project with all of my bloodwork and a bunch of medical info from the past 10 years uploaded. I think it's more context than ChatGPT can handle at once. When I ask it basic things like "Compare how my lipids have trended over the past 2 years" it will sometimes make up numbers for tests, or it will mix up the dates on certain data points.

It's usually very small errors that I don't notice until I really study what it's telling me.

And also the opposite problem: A couple days ago I thought I saw an error (when really ChatGPT was right). So I said "No, that number is wrong, find the error" and instead of pushing back and telling me the number was right, it admitted to the error (there was no error) and made up a reason why it was wrong.

Hallucinations have gotten way better compared to a couple years ago, but at least ChatGPT seems to still break down especially when it's overloaded with a ton of context, in my experience.

arjie 20 hours ago

In my case, what I like to do is extract data into machine-readable format and then once the data is appropriately modeled, further actions can use programmatic means to analyze. As an example, I also used Claude Code on my taxes:

1. I keep all my accounts in accounting software (originally Wave, then beancount)

2. Because the machinery is all programmatically queryable, the data is not in token-space, only the schema and logic

I then use tax software to prep my professional and personal returns. The LLM acts as a validator, and ensures I've done my accounts right. I have `jmap` pull my mail via IMAP, my Mercury account via a read-only transactions-only token and then I let it compare against my beancount records to make sure I've accounted for things correctly.

For the most part, you want it to be handling very little arithmetic in token-space though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparison, but when using them as an assistant you're not using them directly but as a hypothesis generator and a checker tool and if you ask it to write out the reasoning it's pretty damned good.
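
As a rough illustration of keeping the comparison out of token-space, here is a sketch of a reconciliation check. The `Txn` shape and the `reconcile` function are hypothetical; they are not beancount's or Mercury's actual data model, and matching on (date, amount) is an assumed simplification:

```python
from collections import Counter
from datetime import date
from decimal import Decimal
from typing import NamedTuple


class Txn(NamedTuple):
    """Hypothetical minimal transaction record."""
    day: date
    amount: Decimal
    memo: str


def reconcile(bank, ledger):
    """Return (missing_from_ledger, missing_from_bank).

    Transactions are matched on (day, amount) only; memos are ignored
    because the two sources rarely word them the same way. Counters are
    used so duplicate same-day, same-amount entries are handled correctly.
    """
    bank_keys = Counter((t.day, t.amount) for t in bank)
    ledger_keys = Counter((t.day, t.amount) for t in ledger)
    missing_from_ledger = bank_keys - ledger_keys  # keeps positive counts only
    missing_from_bank = ledger_keys - bank_keys
    return (
        sorted(missing_from_ledger.elements()),
        sorted(missing_from_bank.elements()),
    )
```

The model's job is then to explain or fix the short list of mismatches, not to do the matching itself.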

For me Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter trivial-to-verify use-cases that they excel at.

To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.

shepherdjerred 20 hours ago

I've gotten better results by telling it "write a Python program to calculate X"

brotchie 19 hours ago

For the tax thing: I had Claude write a CLI and a prompt for Gemini 2.5 Flash to do the structured extraction, i.e. .pdf -> JSON. The JSON schema was pretty flexible, and open to interpretation by Gemini, so it didn't produce 100% consistent JSON structures.

To then "aggregate" all of the json outputs, I had Claude look at the json outputs, and then iterate on a Python tool to programmatically do it. I saw it iterating a few times on this: write the most naive Python tool, run it, throws exception, rinse and repeat, until it was able to parse all the json files sensibly.
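
A sketch of what such an aggregation tool might end up looking like, not the actual one Claude wrote. The key names in `WAGE_KEYS` are hypothetical examples of the inconsistent field names an extractor could emit, and all function names are illustrative:

```python
import json
from decimal import Decimal, InvalidOperation
from pathlib import Path

# Hypothetical key variants an extractor might emit for the same field.
WAGE_KEYS = ("wages", "Wages", "wages_tips_other_comp", "box1")


def to_decimal(value):
    """Parse amounts that may arrive as 1234.5, "1,234.50" or "$1,234.50"."""
    if value is None:
        return None
    text = str(value).replace("$", "").replace(",", "").strip()
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


def find_first(doc, keys):
    """Search nested dicts/lists depth-first for the first parseable key."""
    if isinstance(doc, dict):
        for key in keys:
            if key in doc:
                amount = to_decimal(doc[key])
                if amount is not None:
                    return amount
        children = doc.values()
    elif isinstance(doc, list):
        children = doc
    else:
        return None
    for child in children:
        found = find_first(child, keys)
        if found is not None:
            return found
    return None


def aggregate(docs, keys=WAGE_KEYS):
    """Sum one field across documents; list the ones with no usable value."""
    total, unparsed = Decimal("0"), []
    for name, doc in docs.items():
        amount = find_first(doc, keys)
        if amount is None:
            unparsed.append(name)
        else:
            total += amount
    return total, unparsed


def load_dir(folder):
    """Read every *.json file in `folder` into a {filename: parsed} mapping."""
    return {p.name: json.loads(p.read_text()) for p in sorted(Path(folder).glob("*.json"))}
```

Returning the list of unparsed files, rather than raising on the first one, makes each round of the "run it, fix it, rerun" loop report everything still left to handle.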

dmd 20 hours ago

Yeah, in my user prompt I have "Whenever you are asked to perform any operation which could be done deterministically by a program, you should write a program to do it that way and feed it the data, rather than thinking through the problem on your own." It's worked wonders.

20 hours ago

[deleted]

cj 20 hours ago

Good call. I’ve also had better results pre-processing PDFs, extracting data into structured format, and then running prompts against that.

Which should pair well with the “write a script” tactic.

tavavex 20 hours ago

Yeah, asking for a tool to do a thing is almost always better than asking for the thing directly, I find. LLMs are kind of not there in terms of always being correct with large batches of data. And when you ask for a script, you can actually verify what's going on in there, without taking leaps of faith.

ElFitz 20 hours ago

I’d say for these use cases it’s better to make it build the tools that do the thing than to make it do the thing itself.

And it usually takes just as long.

thijsvandien 21 hours ago

I don't know, but I would never upload such sensitive information to a service like that (local models FTW!) or trust the numbers.

basch 20 hours ago

Which part is sensitive? Social is public, income is private but what is someone going to do with it?

AlecSchueler 4 hours ago

That's dream info for targeted advertising and political manipulation.

jumpman500 20 hours ago

It's not good in some job negotiations if someone has a very clear picture of what your current net worth and income is. Also in some purchases companies could price discriminate more effectively against you.

thijsvandien 20 hours ago

Now that's a question I'd feel more confident having answered by an LLM. Personally, I'm tired of arguing with "nothing to hide", which (no offense) is just terribly naive these days.

whackernews 17 hours ago

I find it really weird too, like, haven’t we done this? Also struggle to understand the motivation for arguing from this direction. Do people forget it’s the normal, default position NOT to be spied on?

whackernews 17 hours ago

Where’s the line for you? Would you upload a picture of you sat on the toilet for example?

mandeepj 20 hours ago

> Result was amazing. I now actually fully understand the tax position.

You couldn’t do that with TurboTax or block’s tax file? You don’t have to submit or pay.

MikeNotThePope 20 hours ago

Be careful with taxes. Hallucinations will cost you.

g947o 7 hours ago

We usually call that FAAFO

slopinthebag 20 hours ago

> had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.

I imagine your accountant had the same reaction I do when an amateur shows me their vibe codebase.

whattheheckheck 21 hours ago

I had AI hallucinate that you can use different container images at runtime for EMR Serverless. That was incorrect; it's only at application creation time.

Hope you don't get audited

semiquaver 21 hours ago

I feel pretty productive myself with AI but this list isn’t beating the rap that AI boosters mostly use AI to do useless stuff focused on pretending to improve productivity or projects that make it easier to use AI.

stavros 21 hours ago

Here's what I made:

* https://www.stavros.io/posts/i-made-a-voice-note-taker/ - A voice note recorder.

* https://github.com/skorokithakis/stavrobot - My secure AI personal assistant that's made my life admin massively easier.

* https://github.com/skorokithakis/macropad - A macropad.

* https://github.com/skorokithakis/sleight-of-hand - A clock that ticks seconds irregularly but is accurate for minutes.

* https://pine.town - A whimsical little massively multiplayer drawing town.

* https://encyclopedai.stavros.io - A fictional encyclopedia.

* https://justone.stavros.io - A web implementation of the board game Just One.

* https://www.themakery.cc - The website and newsletter for my maker community.

* https://theboard.stavros.io - A feature board that implements itself.

* https://github.com/skorokithakis/dracula - A blood test viewer.

* https://github.com/skorokithakis/support-email-bot - An email bot to answer common support queries for my users.

Maybe some of these will beat the rap.

profsummergig 21 hours ago

> "A clock that ticks seconds irregularly but is accurate for minutes."

Sounds like something that could be tried as a fix for a kind of OCD (obsessive seconds counting).

stavros 21 hours ago

Maybe, although it's actually giving me OCD, I think. It's really hard to tune out because of the irregular ticking. I implemented a regular mode to combat this, defeating the purpose somewhat.

bencyoung 8 hours ago

Sounds like the Chronophage clock in Cambridge: https://en.wikipedia.org/wiki/Corpus_Clock. It's purely mechanical but has odd pauses in the ticks etc.

observationist 20 hours ago

Unpredictable things catch our attention - it's the exceptions that are important to survival, and our brains evolved to cope with the stimuli that this experiment messes with.

Something like this would be anxiety inducing for most people, I bet. That'd be an excellent experiment: track heart rate, EEG, and performance on a range of cognitive tasks with 2-minute breaks between tasks, one group exposed to the irregular ticking, another exposed to regular ticking, another with silence, and one last one with pleasant white noise.

pinkmuffinere 20 hours ago

what was the motivation for originally making it with irregular ticking?

stavros 20 hours ago

It sounded fun (and it is)! My favorite mode is one that ticks each second imperceptibly fast, and then stalls for a second in one of the ticks (so that it lasts two).

It's just the right amount of "did that clock just skip a beat? Nah must just be my imagination".
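
One way to read that mode, sketched in Python for illustration only (this is not the sleight-of-hand firmware, and `stall_schedule` and its parameters are made-up names): 59 ticks run slightly short so that one 2-second stall still lands the minute exactly on time.

```python
import random
from fractions import Fraction


def stall_schedule(ticks=60, stall_length=2, rng=random):
    """Return `ticks` tick intervals (in seconds) that sum to exactly `ticks`.

    One randomly chosen tick lasts `stall_length` seconds; every other tick
    is shortened equally to make up the difference. Fractions keep the sum
    exact, so the minute boundary never drifts.
    """
    if ticks < 2 or not 0 < stall_length < ticks:
        raise ValueError("need ticks >= 2 and 0 < stall_length < ticks")
    fast = Fraction(ticks - stall_length, ticks - 1)  # 58/59 s by default
    intervals = [fast] * ticks
    intervals[rng.randrange(ticks)] = Fraction(stall_length)
    return intervals
```

With the defaults, each fast tick is 58/59 of a second, under 2% short, which fits the "imperceptibly fast" description.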

pinkmuffinere 20 hours ago

Ha, cool! I love the whimsicality of this

stavros 20 hours ago

Thanks, I love it too!

risyachka 20 hours ago

It does not matter how much stuff is built. What matters is what comes out of it.

And with AI the result of 99.9% is abandonware. Just piles of code no one will ever touch again.

Which proves the point of no productivity gains. It's just cheap dopamine hits.

danso 20 hours ago

The user you're responding to lists a "blood test viewer" [0], which looks to be a tool that turns his blood test PDFs into structured and analyzed data. You're saying that unless he continuously revises/upgrades the code, it's still "abandonware" even if it meets his needs for the near future?

[0] https://github.com/skorokithakis/dracula

sarchertech 20 hours ago

Bit rot is real. The dependencies listed here include calls into AI APIs that will stop working with time. So yes, if no one keeps this up to date it will rot into uselessness, likely very quickly.

That’s not even mentioning that this tool doesn’t do much beyond wrapping a call to Claude. And it’s using Claude to display blood test data to the end user. This is not something I’d trust an LLM to not mess up. You’d really want to double check every single result.

dsf2 19 hours ago

Also humans are not bots.

We hate having to feel like we have to double check everything. We have an asymmetric relationship with gains and losses etc.

Is it me or is this stuff flying over people's heads?

slopinthebag 19 hours ago

Just saying, you can paste the sample report into ChatGPT and it does the same thing, and even creates interactive graphs for you. I'm not sure how useful something is if a chatbot can do it, with the side benefit of being able to ask follow-up questions.

simmerup 18 hours ago

i guess the custom UI makes you believe you can trust the output, as if there’s any thought going into it rather than just an LLM hallucinating for you

tempaccount5050 20 hours ago

Missing the point. I no longer need to buy or rely on someone else for software I want to use. A lot of things I want to do ARE one offs. I can write software and throw it away when I'm done.

incr_me 20 hours ago

I know this sounds sarcastic but I really mean it: For years everyone has been monastically extolling some variation of "the best code is deleted code". Now, we have a machine that spits out infinite code that we can infinitely delete. It's a blessing that we can have shitty code generated that exposes at light speed how shitty our ideas are and have always been.

dsf2 19 hours ago

A nicer framing is original ideas and original thinking in general is very hard and doesn't come around very often.

Steve Jobs once said a thing about the belief that an idea is 90% of the work is a disease. He is and was absolutely right.

sarchertech 20 hours ago

You still need to spend plenty of time verifying they work though unless it’s something where that truly doesn’t matter.

grim_io 20 hours ago

Abandonware is what the customer wants.

Constant enshittification and UI redesigns are driven by the provider to justify monthly extortion.

saulpw 21 hours ago

Some of them definitely do not. Like a fictional encyclopedia? What is the point of that? That's like "an alphabetical novel".

And even for the ones that might "beat the rap", I don't understand from your descriptions why they are interesting or unique. A voice note recorder? Cool. There are already hundreds if not thousands of those, why did you need to make your own in the first place? I'm not saying that yours isn't special, I'm just saying that it doesn't help to post the blandest description possible if you're trying to impress people with the utility of your utility.

senko 20 hours ago

So not only does he have to show what he built with AI, what he built with AI has to be interesting and unique to you? Why? He's not selling it to you.

Seems like the bar is now it has to be a mass market product. On another post someone else commented a SaaS doesn't count if it doesn't earn sustainable revenue.

I guess OpenClaw also doesn't count because we don't know how much Peter got from OpenAI.

This is an ideological flame war, not a rational discussion. There's no convincing anyone.

timacles 3 hours ago

It’s kind of like the opening sequence of Back to the Future, when it shows all the random inventions at Doc’s house.

Yeah, they are interesting, and I guess they do something, but are any of them actually delivering value? That’s when you get into the argument of what is value and to whom. But as for AI’s role in society of generating productivity, it’s pretty disputable whether every person being able to build their own train set that turns on the toaster and makes coffee is going to move us forward as a species like, say, the internet did.

That’s really the only argument: is the use of LLMs worth the trillions of dollars and selling out the future of humanity? Not whether it is fun building quirky apps really fast.

munksbeer 16 hours ago

> Seems like the bar is now it has to be a mass market product.

The bar for this will just keep moving. Some people are heavily invested in the anti-stance, so human nature being what it is, you've little hope of changing their minds anyway.

Grimblewald 6 hours ago

no, the bar is accurate and descriptive descriptions. You know how AI words are typically hollow and devoid of meaning? loads of grammatically fine words but not actually saying anything? well, these repos are the github version of that. Lots of words but so starved of meaning I shut off mentally trying to read half of them. Some descriptions are outright lies.

saulpw 19 hours ago

I'm actually becoming an AI convert myself. If there is ideology here, it's not about AI, but about keeping trash off the streets.

For example, I checked out their "Fictional Encyclopedia". It's an absolutely terrible project, much worse than useless, because it claims to be an "encyclopedia" right in the name (the tagline is "Everything about everything"), yet it's engineered to just completely make things up, and nowhere on the page does it indicate this! I looked up my own niche open-source project, and was prepared to be at least somewhat impressed that it pulled together facts on the fly into an encyclopedic form. For the first couple of paragraphs that seemed like it might be the case, then it veered into complete fantasy and just kept going.

Like what is the point of this? I can already ask a chatbot the same question and at least then I have explicit indicators that it might be hallucinating. But this page deliberately confuses truth and reality for absolutely zero purpose. It's a waste of brain cells, for both the creator and the consumer, with no redeeming value. It's neither interesting, nor different, nor valuable. AND it's burning tokens to boot!

I mean, come on, the bar is not that high. Some of stavros' projects may even be over it. But the first projects I checked were sub-basement, and I am not interested in searching through mounds of trash for what might be a quarter dollar. I'm actually kind of disappointed that stavros didn't have (or apply) the sense or taste to whittle down that list of 11 (!) projects to some 3 that show off the value of their work. Which I'm starting to understand is everyone's issue with AI brain rot; it seems to just encourage "here's everything, I dunno, you figure it out" which is maddening and deserves the pushback it gets.

Grimblewald 6 hours ago

don't waste your time, they're a slop slinger who won't take any feedback that could feel like a hit to the ego. I've wasted too much time on them already, cut your losses and move on. Their 'safer' personal bot for example is anything but, but they won't listen to feedback.

stavros 21 hours ago

Sounds like the goalposts are moving from "not useless stuff focused on pretending to improve productivity or projects that make it easier to use AI" to "extremely useful stuff".

saulpw 21 hours ago

One issue is that I interpreted the parent as OR, not AND. "useless stuff OR productivity tools OR AI tools".

Moreover though, I'm not even saying you shouldn't do those things. I'm actually playing around with AI quite a bit, and certainly have created my share of useless/productivity tools. But it's not a flex to show off your own Flappy Birds or OpenNanoClaw clone, even if they are written in COBOL or MUMPS.

And they definitely do not have to be "extremely useful". But they should answer the question: what problem does it solve?

jjee 21 hours ago

Fair. But finally we are seeing what LLM proponents are putting forward.

And it’s exactly what I expected - lines of code. Cute. But… so what? This is not good for the AI hype, nor for any continued support for future investment.

On the other hand all this stuff is going to drive continual innovation. The more tokens generated the more model producers invest. And we might eventually get to a place of local models.

stavros 21 hours ago

I swear, I'm going to stop commenting on this site, the amount of shitting on people who use LLMs (i.e. everyone) is just impossible to deal with.

tuesdaynight 19 hours ago

Don't do that, just avoid answering the "non-believers" or whatever they are called. Your comments are insightful for me (and for a lot of other people, I'm sure). You don't need to prove that they are useful, just comment about your experience and ignore them. It's like arguing about religion, trying to make the other person flip their beliefs (a waste of time for everyone involved).

stavros 18 hours ago

I guess you're right, I really need to get better at ignoring some people. It just really got to me today because someone else looked at one of my projects for two seconds and decided to tell me off for it being "insecure" and "slop", and it kind of ruined my day.

Thanks for the support!

slopinthebag 20 hours ago

I have the opposite experience: the number of AI boosters deriding the less enthusiastic, gleefully exclaiming how someone will be "left behind" if they don't immediately adopt the latest hype cycle, or sharing AI slop and either embellishing or outright lying about its capabilities is making me want to log off forever. "Handwritten code? Don't you only care about providing maximum shareholder value?" No.

munksbeer 16 hours ago

No-one (apart from some CEOs) cares that you don't use AI, I promise you.

The thing that triggers people is comments like yours still, even at this point, claiming that AI just produces slop and everyone is just lying.

It is absurd, and people are obviously going to react to it.

slopinthebag 15 hours ago

When did I claim AI just produces slop? When did I claim everyone was lying?

If by "react" you mean make stuff up, sure.

lukan 20 hours ago

"Or projects that make it easier to use AI"

I get the sentiment, but this is natural with a groundbreaking new technology. We are still in the process of figuring out how to best apply generative LLMs in a productive way. Lots of people tinker and share their results. Most is surely hype and will get thrown away and forgotten soon, but some is solid. And I am glad for it, as I did not take part in that but now enjoy the results, since the agents have become really good.

harry8 20 hours ago

> "Or projects that make it easier to use AI"

This is exactly the same reason why the appropriate question to ask about Haskell is "where are the open source projects that are useful for something that is not programming?"

The answer for Haskell after 3 decades is very, very little. Pandoc, git-annex, xmonad. Might be something else since I last did the exercise but for Haskell the answer is not much. Then we examine why the kids (us kids of all ages) can't or don't write Haskell programs.

The answer for LLM coding may be very different. But the question "where is the software that does something that solves a problem outside its own orbit" is crucial. (You have a problem. You want to use foo to solve it, now you have two problems but you can use foo to solve a part of the second one!!)

The price of getting code written just went down. Where are the site/business launches? Apps? New ideas being built? Specifically. With links. Not general, hand-wavy "these are the sorts of things that ..." because even if it's superb analysis, without some data that can be checked it's indistinguishable from hype.

Whatever data we get will be very informative.

lukan 20 hours ago

For instance, there is an abandoned open source project I would have liked to see revived, https://www.wickeditor.com/ (an attempt at recreating Flash with web technology). Current official state in the repo: outdated dependencies, build process, etc.

I looked into doing it manually, but gave up. Way too much dirty work, and I had no energy for that.

Then I discovered that claude CLI got good - and told it to do it (with some handholding).

And it did it. Build process modernized. No more outdated dependencies. Then I added some features I missed in the original wick editor. Again, it did it and it works.

A working editor that was abandoned and missed features - now working again with the missing features. With minimal work done from my side (but I did put in work before to understand the source).

I call this a very useful result. There are lots of abandoned half-working projects out there. Lots of value to be recovered. Unlike Haskell, agents are not just busy with building agents, but real tools. Currently I have the agents refactoring an old codebase of mine. Lots of tech debt. Lots of hacks. Bad documentation. There are features I wanted to implement for ages but never did, as I did not want to touch that ugly code again. But Claude did it. It is almost scary what they are already capable of.

shepherdjerred 21 hours ago

That's a fair criticism of my personal projects. Maybe 3-4 of those could potentially see usefulness outside of myself.

At work, I would say I've done plenty of "useful" things with AI, but that's hard to show off given that I work on an internal application.

peteforde 20 hours ago

I don't think you should feel like your personal projects need to be vetted by an armchair peanut gallery. It's actually kind of offensive how so many people show up in a thread like this and demand that what sparked joy for you be formally subjected to a gauntlet of moving goalpost validation markers.

Quite simply, I don't think that they are asking or arguing in good faith.

SunshineTheCat 21 hours ago

I've actually felt the same way about some (not all) but some "productivity" hacks I've seen people post online with their OpenClaw setups.

I chuckle when I see some of them because you could achieve the same (or often faster) result by jotting a note onto a notecard and sticking it in your pocket.

Most of the other automations running don't really seem to serve any real purpose at all.

But hey, if it's fun, have at it.

gopher_space 20 hours ago

I mean I’m using it to deconstruct and reinvent my development process from the ground up, but it’s so easy to do this now and so customized for my specific needs that the idea of posting about it never crossed my mind.

bronlund 21 hours ago

If you are a parent, you know that feeling when your child is struggling with something and gets frustrated, but you keep silent and don't help because you know that the child has to figure this out by themselves. That's the same feeling I get when I hear all those doom and gloom perspectives on how AI is ruining coding :D

ssrshh 19 hours ago

Not condescending at all

bronlund 19 hours ago

My children are on their own journey and I’m just trying to be supportive. I don’t measure their worth by how much they agree or understand my perspective.

simmerup 18 hours ago

Maybe being around children is giving you a false sense of superiority

bronlund 17 hours ago

Relax. It’s going to be okay :)

simmerup 2 hours ago

case in point

archagon 18 hours ago

For some reason, AI boosters can't help but condescend. I've never seen this with the rollout of any other technology. It's like this stuff immediately becomes a core part of their personality.

bronlund 17 hours ago

I agree that this new thing is polarizing, but as with the rollout of any groundbreaking technology, the ones looking backwards are just going to be left behind.

archagon 17 hours ago

As I see it, the only reason to make such a bold claim (as opposed to just doing what you do and seeing how things shake out) is insecurity. Especially if you’re condescending about it.

bronlund 17 hours ago

This discussion is not new. Books were criticized for being addictive and antisocial. TV for rotting our brains. And even if there is some truth to these claims, I do appreciate both.

If you want to be passive aggressive without AI, the more tokens there are for the rest of us ;)

slopinthebag 15 hours ago

I largely agree with you but I still can't stand the "you're gonna be left behind!!" framing that is really common with people who are enthusiastic about AI. What does it even mean to "be left behind" in this instance, it's just a vague emotional expression.

These AI tools are not hard to learn, in fact they're super easy when you have some experience programming, so the only people who are going to be left behind are the ones who simply refuse to use the tools out of principle. And why would they care about being "left behind"? They're making a conscious choice to not use the tools. They want to be left behind!

And not everyone who is skeptical is so out of principle; some just don't see the value yet or are slowly and cautiously adopting it into their workflow. If AI-powered coding ends up being even half as good as promised, so good that denying the evidence is impossible, they can just start using it and catch right up. So who exactly is "being left behind" here? It's complete nonsense while simultaneously being extremely condescending, and I get triggered every time I read the phrase.

I don't mean anything against you personally with my ranting, it's more a general observation. Perhaps you and some others do mean it as a genuine bit of advice, like "hey, you should learn these tools or else you might struggle to find work in the future", but the sense I get most of the time are people who are gleeful that the non-believers are soon to be homeless or whatever.

bronlund 7 hours ago

Yeah, I could have formulated that in another way - 'missing out' would be a better term. I do not mean to be condescending in the sense that I look down on people who have other preferences than me. Take books as an example. I do not believe that people who can't read are of any lesser value than me, and whether you can read or not is a poor indication of how intelligent you are - or how happy you are, for that matter. Sure, information is important, but there are a lot of different types of information and a lot of different ways of acquiring it - so I do not believe that my way is any better than someone else's. That being said, being able to read and appreciate books myself, I do believe that people who can't are missing out.

We know books have been used for both good and evil, but I still think books are a wonderful invention. Same with television or the internet: the quality of the content on there doesn't really take anything away from the fact that the technology in itself is absolutely amazing. In hindsight - it has been a minute since Gutenberg - how society has adopted the written word does have real implications for the people living in it, though. If you can't read today, you will struggle. Not because there is anything wrong with you, but because the system more or less takes it for granted that you can.

And it is going to be the same with AI, but even more so. The ones who learn to master it will dominate those who don't. It will create a new form of class divide, where access to tokens and knowing how to use them are going to be the main drivers. AI is still in its early stages and we see that not everything is alright with it; take for instance the economics around it or the environmental impact it has. But still, I do believe that it is an amazing invention, and that if you do not embrace it, you are missing out.

slopinthebag 2 hours ago

If someone said "Yeah I haven't read in 5 years" would you find it reasonable to tell them they're going to be left behind? If someone did that to me in person I'd consider whacking them in the face.

Maybe don't say it online if you wouldn't say it offline.

bronlund 41 minutes ago

No, but if you claim that reading will be the end of humankind, I will have no problem leaving you behind.

slopinthebag 35 minutes ago

Sounds good buddy

adampunk 4 hours ago

I think this approach means well, but it doesn’t connect with the reality of the times. You’ve got people repeatedly insisting that times have shifted and folks are going to be left behind because that’s what’s happening. October 2025 marked a turning point in all of software— that sounds grandiose, but it’s true. The longer we pretend that it didn’t happen the harder it gets to adjust.

I think people are talking like this because we have not lived in a genuine computing revolution like this since probably the introduction of the micro computer. It’s been more than 40 years.

I get that people are mad about this. That’s real obvious when you comment in any way about the use of AI. You get told that you’re a robot, you get told that you’re not a real engineer, you get told that you’re insecure, you get told all kinds of things. So it’s super clear that people are upset because they’re being fucking childish about it. Even in a post like this one where the author tries hard to be pretty nice, we see the same sneering comments about training your own replacement and shit like that. It’s not subtle.

Where I get off the train is concluding that because they’re upset that they don’t need to be told what’s happening. All of computing is already changing. It’s already happening. It’s like if the sun winked out right now we would discover it in eight minutes, but the event has already happened. We are merely outside the cone of visibility. This shit is all happening right now. It is all real. I think it does a disservice to people to pretend as though it’s not.

[-]

slopinthebag 2 hours ago

No disrespect intended but I think you're in a bubble. Things will change, sure, but not to the same extent as the invention of the Internet for example.

[-]

bronlund 30 minutes ago

I disagree. This is like the invention of the steam engine, which may be the most important factor in kickstarting the industrial revolution. I am pretty sure none of the guys who were there had any clue as to how this was going to change the world, but I suspect at least some knew that this was something else.

People are bitching now about how AI has ruined coding, not fully grasping that for most people, there will be no code, no applications, no operating systems. AI will pretend to be all of that, and do it way better. A six year old will be able to "out-code" all of us.

This is half a year ago: https://www.youtube.com/watch?v=dGiqrsv530Y

archagon 39 minutes ago

Comments like the one you’re replying to give me a disturbing feeling — like AI is speaking through the mouths of its users, Pluribus style.

[-]

adampunk 6 minutes ago

What am I supposed to do with this, man?

Am I supposed to talk to you like this? Should I do some psychoanalysis here? If you wanna say I’m pantomimed machinery, then I think we may need to have a discussion.

Because here’s what it looks like to me: I think there’s a lot of people who arguably had a pretty good handle on how their corner of computing worked. They can understand a pretty deep dive into the stack they use, and where they have to deposit something into the intellectual hinterlands, it can safely be abstracted away on dependable, engineered machines or standards. That is no mean feat; lots of people cannot say that. The fact that someone who does say it doesn’t fully understand paging or floating point arithmetic is not a sign that something is wrong, but rather that we have succeeded in big shared engineering problems. Cool.

Some new shit is afoot. We are entering into a new, turbulent, uncertain era of computing. A lot of people who previously had a pretty confident grasp of both the core and the frontier of their work now do not understand what is driving the frontier. They have made the fact that they do not understand this everyone else’s problem. Rather than admit that they do not understand an area they used to understand, we are subjected to incessant infantile progression through what I hope are stages of grief. Because at least then it might come to an end.

Everyone has an explanation for why this is all gonna collapse tomorrow and why they don’t need to learn about it. Everyone has a smart remark about the use of AI for some very important moral reason which also means they don’t need to learn about it. They both add up to the same thing which might just be healthier if treated as a true admission of ignorance.

lagrange77 19 hours ago

What is your perspective on the matter from a parent point of view?

[-]

bronlund 17 hours ago

I think a humble and open mind is essential. I think that we reap what we sow, but also that struggle makes us robust.

I try to explain stuff to my kids, to the best of my ability, but give them room to make their own conclusions. As an old fart, there is a limit to how relevant my world will be to them - and I have to acknowledge that.

Change is scary and not always for the better, but in my humble opinion, we have nothing to lose and everything to gain.

I, For One, Welcome Our New AI Overlords :]

piker 21 hours ago

> I’ll continue use these tools with the hope that they don’t make me obsolete too quickly.

I'm starting to believe using them is more likely to make you obsolete than not.

[-]

vermilingua 21 hours ago

It baffles me that so many people are so willing to pay for the privilege of training their own replacement.

[-]

jjee 21 hours ago

But are you though?

From where I stand this thing is going to provide great leverage to those who don’t simply just write code. I personally doubt the thing will ever get to a place where it can be trusted to operate alone - it needs a team of people and to go super fast you need more people.

Moreover, the price won’t be high due to competition.

I’ve changed my view on LLMs as being good, as long as competition is fierce.

[-]

alas44 20 hours ago

Looks like an LLM-generated comment

[-]

shepherdjerred 20 hours ago

It reads like a human to me. But I understand being suspicious of an account that’s 40min old

jjee 20 hours ago

[flagged]

[-]

alas44 20 hours ago

Not sure how hacker news can effectively protect against what looks like fake users posting LLM generated comments :(

aerhardt 20 hours ago

I am pretty sure non-technical people are not going to be able to compete in any meaningful way with technical people.

shepherdjerred 21 hours ago

I completely agree. Most programmers work on rather boring and not particularly novel things. If they don't adapt, then they'll be replaced.

I do think it'll be a while before LLMs make significant contributions to complex projects, though. For example I can't imagine many maintainers of the Linux kernel use LLMs much.

[-]

piker 20 hours ago

No. That's not really where I'm coming from.

I believe your skills are atrophying when you use these things no matter how trivial the case. That compounds with their bias towards solving problems by producing more code to further reduce your productivity without them.

[-]

shepherdjerred 20 hours ago

Ah I read it wrong. I must be using LLMs too much :)

I do agree with you to some extent. I think anyone who uses LLMs will need to set aside some time writing code by hand to keep their skills sharp.

max_streese 21 hours ago

And if we do adapt we might still get replaced because fewer of us will be able to do more. Or we won't because of Jevons Paradox. Linux maintainers on the other hand can code (with and without AI) what I could not (with or without AI). So in a way becoming a more knowledgeable, more skilled programmer is the way? In any case, too much speculation about the future.

keybored 21 hours ago

That’s the most craven AI user line I’ve read. Well at least from this week.

smokel 21 hours ago

I've written an Obsidian clone for myself, which has proper Emacs keybindings. Took me a few hours too many to get in all the features that I need.

What I find interesting is that I have little motivation to open source it. Making it usable for others requires a substantial amount of time, which would otherwise be just a fraction of the development time.

[-]

xorvoid 20 hours ago

I was thinking about doing the same. Build a clone with AI custom tailored for my own quirks. And not bothering to open source it because it's too bespoke for anyone else. How hard was this? Can you share any advice?

[-]

smokel 11 hours ago

It turned out to be pretty hard in some places. I'm using CodeMirror as the basic building block, which is great, but it does not support WYSIWYG table editing out of the box. Getting that to work requires one to use a separate CodeMirror instance for the cell editor, which makes things rather complicated. For the LLM as well :)

I think I've spent ~20 hours and a couple of $100 of Claude Opus tokens in Cursor. So it's not cheaper or easier, but the amount of frustration saved with having proper Emacs keybindings might delay catastrophic global warming by a few days.

Oh, and of course I'm not compatible with all the Obsidian extensions, nor do I have proper hosting for server-side sync yet. All in all, a fool's errand, but I'm having fun.

[-]

xorvoid 2 hours ago

Thank you! Re extensions: my thinking was that if you build a clone, then extensions become irrelevant. Just build what you need directly into the software. Extension systems always seemed to me to be a second class citizen. I think I read an old story of Linus Torvalds using an old fork of microemacs and whenever he disliked something he would just go tweak its C code (e.g. key bindings). I'm kind of thinking that but done with an LLM. Software could in theory be smaller and more bespoke. And if you want it to work differently, you just prompt an LLM to change the actual source code. Then you don't need higher level configuration/customization interfaces. Simpler software.

bityard 20 hours ago

I have a theory (and I'm sure I am far from the first one to voice it) that the number of useful open source projects released to the public will be on the decline now that anyone can scratch their own itch with a few hours of vibe coding. Why would I spend hours evaluating a dozen different note-taking applications and _maybe_ find one that is _kinda close_ to what I want, if I can instead have Claude vibe me one up _exactly_ the way I want it?

(I actually did write my own note-taking application, but that was before LLMs were any good at writing code.)

[-]

archagon 18 hours ago

Because when it eventually and inevitably corrupts your data, you won't know what to do or have any recourse?

[-]

TheAceOfHearts 17 hours ago

Surely any sane person vibe coding a note taking app just has it save all the notes as markdown files to disk? At that point making a backup is trivial and they're unlikely to get corrupted.

[-]

archagon 17 hours ago

So why vibe code a version of a thing that already exists in a dozen different permutations, and with actual eyes on the codebase?

[-]

smokel 11 hours ago

In a typical open source project only one person has had a look at a particular piece of code. Only in the larger and more mature projects do people actually spend time reviewing code. Also, if you don't pay for the free code, there is often no serious recourse to recover your data either.

As stated in my first comment, Obsidian does not support Emacs keybindings properly, nor is it open source. Writing an extension to add Emacs keybindings is not at all trivial, because you have to work around a lot of existing and undocumented functionality.

There are other reasons for not vibe coding your own alternative, but as LLMs keep progressing, these reasons may become less relevant.

ipaddr 20 hours ago

I've heard a few people say I haven't written a single line of code since ...

What do people think of it?

I personally don't think that's a badge of honor. Aside from losing your coding skills you miss opportunities to generate AI pieces and connect them to existing systems that can't be fed into the AI. Plus making small changes yourself is easier than having the AI make them without messing something else up.

[-]

Maxatar 20 hours ago

I wouldn't say strictly speaking that I've written no code, but the amount of code I've written since "committing" to using Claude Code in February is absolutely minuscule.

I prefer having Claude make even small changes at this point since every change it makes ends up tweaking it to better understand something about my coding convention, standard, interpretation etc... It does pick up on these little changes and commits them to memory so that in the long run you end up not having to make any little changes whatsoever.

And to drive this point further, even prior to using LLMs, if I review someone's work and see even a single typo or something minor that I could probably just fix in a second, I still insist that the author is the one to fix it. It's something my mentor at Google did with me which at the time I kind of felt was a bit annoying, but I've come to understand their reason for it and appreciate it.

[-]

sarchertech 20 hours ago

Unfortunately Claude has a context window limit so it’s not going to keep “learning” forever.

[-]

Maxatar 20 hours ago

Sort of... Claude Code writes to a memory.md file that it uses to store important information across conversations. If I review mine it has plenty of details about things like coding convention, structure, and overall architecture of the application it's working on.

The second thing Claude Code does is that when it reaches the end of its context window it runs /compact on the session, which takes a summary of the current session, dumps it into a file, and then starts a new session with that summary. But it also retains logs of all the previous sessions that it can use and search through.

Looking over my session of Claude Code, out of the 256k tokens available, about 50k of these tokens are used among "memory" and session summaries, and 200k tokens are available to work with. The reality is that the vast majority of tokens Claude Code uses is for its own internal reasoning as opposed to being "front-end" facing so to speak.
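A rough sketch of the compaction idea described above, for illustration only. This is not how Claude Code is actually implemented: the `Session` class, the word-count "tokenizer", the truncating `summarize` helper, and the limit of 20 are all made-up assumptions.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def summarize(messages: List[str], max_words: int) -> str:
    """Placeholder summarizer: keeps only the first few words of each message."""
    return " | ".join(" ".join(m.split()[:max_words]) for m in messages)


class Session:
    """Toy session with a fixed memory block and a token limit (hypothetical)."""

    def __init__(self, memory: str, limit: int) -> None:
        self.memory = memory                  # always in context, like a memory file
        self.limit = limit                    # total context budget
        self.messages: List[str] = []         # live conversation
        self.archive: List[List[str]] = []    # full logs of compacted sessions

    def used(self) -> int:
        return count_tokens(self.memory) + sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        if self.used() > self.limit:
            self.compact()

    def compact(self) -> None:
        # Keep the full log for later search, continue with a short summary.
        self.archive.append(self.messages)
        self.messages = [summarize(self.messages, max_words=3)]


session = Session(memory="prefer short functions", limit=20)
session.add("one two three four five six seven eight nine ten")
session.add("a b c d e f g h i j")
print(session.messages, len(session.archive))
```

The memory block is counted against the budget on every turn, which is the trade-off being discussed: whatever is stored there leaves less room for the actual work.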

Additionally given that ChatGPT Codex just increased its context length from 256k to 1 million tokens, I expect Anthropic will release an update within a month or so to catch up with their own 1 million token model.

[-]

sarchertech 20 hours ago

There’s a few problems with that.

1. The closer the context gets to full the worse it performs.

2. The more context it has the less it weights individual items.

That is, Claude might learn you hate long functions and add a line about short functions. When that is the only thing in the context, it is likely to follow it very closely. But when it’s one piece of a much longer context, it is much more likely to ignore it.

3. Tokens cost money even if you are currently being subsidized.

4. You have no idea how new models and new system prompt will perform with your current memory.md file.

5. Unlike learning something yourself, anything you teach Claude is likely to start being controlled by your employer. They might not let you take it with you when you go.

[-]

shepherdjerred 19 hours ago

> 3. Tokens cost money even if you are currently being subsidized.

keep in mind that those 50k memory tokens would likely be cached after the first run and thus significantly cheaper

[-]

sarchertech 14 hours ago

Caching has so many caveats. The cache expiration window is short, if you change a document in the context it clears the cache, if you change anything in the prompt prefix it clears the cache. And there’s no reason to think that Anthropic will keep charging dramatically less for cached tokens in the future once they start trying to make a profit.
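To make the prefix point concrete, here is a toy model of prefix caching. It is not any provider's real implementation: the `PrefixCache` class and the per-block granularity are assumptions made up for illustration. All it shows is that a cached prefix is reusable only up to the first block that differs, so editing something early in the context turns everything after it into a cache miss.

```python
from typing import List, Sequence


class PrefixCache:
    """Toy prefix cache that remembers the blocks of the previous request (hypothetical)."""

    def __init__(self) -> None:
        self._last: List[str] = []

    def submit(self, blocks: Sequence[str]) -> int:
        """Return how many leading blocks were cache hits, then store this request."""
        hits = 0
        for old, new in zip(self._last, blocks):
            if old != new:
                break  # the first mismatch invalidates everything after it
            hits += 1
        self._last = list(blocks)
        return hits


cache = PrefixCache()
system = "system prompt"
memory = "memory.md v1"
doc = "design doc"

first = cache.submit([system, memory, doc, "question 1"])   # cold cache
second = cache.submit([system, memory, doc, "question 2"])  # shared prefix reused
# Editing the memory block changes position 1, so the unchanged doc after it misses too.
third = cache.submit([system, "memory.md v2", doc, "question 3"])
print(first, second, third)  # 0 3 1
```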

[-]

shepherdjerred 13 hours ago

My understanding is that Claude Code/Codex both put great effort into utilizing caching.

[-]

sarchertech 6 hours ago

Yeah of course they do because it saves them more money than they are passing on to you. That doesn’t mean that they are magically able to overcome the tradeoffs inherent to caching. All of the issues I mentioned will still invalidate your cache.

aerhardt 20 hours ago

I haven’t typed a line of code in like six months but I still review all production code and stay very connected to the codebase. I don’t feel my skills have withered at all.

stavros 21 hours ago

What did you think of Dagger? I used Earthly a while ago but the one thing I didn't like was that it couldn't parallelize runs, since it only ran on one CI instance. Other than that, I liked that I could run my entire CI pipeline locally, but didn't like it so much that I ended up using it for much else.

[-]

shepherdjerred 21 hours ago

I really like Dagger. I had a _lot_ of weird issues with Earthly, like edge cases. Dagger has been mostly solid.

It still has gaps. I don't think they've landed on the right model for CI. Like Earthly, their model is a CI runner + local cache. I believe a distributed cache (like Bazel) makes more sense.

If I were choosing between the two I'd personally always pick Dagger, but I think there is a strong argument for Earthly for simpler projects. If you're using multiple Earthfiles or a few hundred lines of Earthly, I think you've outgrown it.

[-]

stavros 19 hours ago

Thanks, I'll give it another shot!

JeanMarcS 21 hours ago

And like everyone else you trained the AI how to replace you by giving it more insight on how to prompt stuff.

[-]

shepherdjerred 21 hours ago

Yes. I also freely release almost all of the code I've ever written, aside from what I've done at work (which I would release if I legally could)

vunderba 20 hours ago

I tend to only use LLMs to complete projects that are relatively unique and that haven't been done before. Because if I'm not going to get anything out of the journey, I might as well get something out of the destination.

*Piece Together*

An animated puzzle game that I built with a fairly heavy reliance on agentic coding, especially for scaffolding. I did have to jump in and tweak some things manually (the piece-matching algorithm, responsive design, etc.), but overall I’d estimate that LLMs handled about 80% of the work. It's heavily based on the concept of animated puzzles in the early edutainment game The Island of Dr. Brain.

https://animated-puzzles.specr.net

*Lend Me Your Ears*

Lend Me Your Ears is an interactive web-based game inspired by the classic Simon toy (originally by Milton Bradley). It presents players with a sequence of musical notes and challenges them to reproduce the sequence using either an on-screen piano, MIDI keyboard, or an acoustic instrument such as a guitar.

https://lend-me-your-ears.specr.net

*Shâh Kur - Invisible Chess*

A voice controlled blindfold chess game that uses novel types of approaches (last N pieces moved hidden, fade over time, etc). I've already been playing it daily on my walks.

https://shahkur.specr.net

*Word game to find the common word*

It's based off an old word game where one person tries to come up with three words: sign, watch, bus. The other person has to think of a common word that forms compound-style words with each of them: stop.

I was quite surprised to see that this didn't exist online already.

https://common-thread.specr.net
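A minimal sketch of the core check behind the common-word game described above, assuming a tiny hand-written list of compounds. The `COMPOUNDS` set and the `common_words` helper are invented here for illustration and are not taken from the actual game.

```python
from typing import Iterable, List, Optional, Set, Tuple

# Tiny illustrative list of (first word, second word) compounds.
# A real game would need a far larger dictionary.
COMPOUNDS: Set[Tuple[str, str]] = {
    ("stop", "sign"), ("stop", "watch"), ("bus", "stop"),
    ("road", "sign"), ("wrist", "watch"), ("bus", "driver"),
}


def common_words(clues: Iterable[str],
                 compounds: Set[Tuple[str, str]] = COMPOUNDS) -> List[str]:
    """Return the words that form a compound, in either order, with every clue."""
    candidates: Optional[Set[str]] = None
    for clue in clues:
        partners = set()
        for left, right in compounds:
            if left == clue:
                partners.add(right)
            elif right == clue:
                partners.add(left)
        # Keep only the words that also paired with all earlier clues.
        candidates = partners if candidates is None else candidates & partners
    return sorted(candidates or [])


print(common_words(["sign", "watch", "bus"]))  # ['stop']
```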

*A Slide Puzzle*

Slide puzzles for qualified MENSA members. I built it for a friend who's basically a real-life equivalent of Dustin Hoffman's character from Rain Man. So you might have to rearrange a slide puzzle from the periodic table of elements, or the U.S. presidents by portrait, etc.

https://slide-puzzles.specr.net

*Glyphshift*

Transforms random words on web pages into different writing systems like Hiragana, Braille, and Morse Code to help you learn and practice reading these alphabets so you can practice the most functionally pointless task, like being able to read braille visually.

https://github.com/scpedicini/glyph-shift

All of these were built with varying levels of assistance from agentic coding. None of them were purely vibe-coded and there was a great deal of manual and unit testing to verify functionality as it was built.

[-]

fmbb 20 hours ago

> All of these were built with varying levels of assistance from agentic coding. None of them were purely vibe-coded and there was a great deal of manual and unit testing to verify functionality as it was built.

It also seems like none of them are relatively unique and all of them have been done before.

[-]

vunderba 19 hours ago

Name them. Go ahead, I'll wait.

Simon toy that's integrated into an ear training tool?

Blindfold chess with Last N moves hidden?

Mensa-style slide puzzles?

An extension that converts random words into phonetic equivalents like morse, braille, and vorticon?

I've also made some way less useful stuff like a win32 app that lets you physically grab a window and hurl it, which invokes a WM_DESTROY when it is completely off the screen.

And an app that measures low frequencies to tell if you are blowing into the mic and then increases the speed of the CPU fan to cool it down.

lowsong 20 hours ago

> At work, all that matters is that value is delivered to the business. Code needs to be maintainable so that new requirements can be met. Code follows design patterns, when appropriate, because they are known solutions to common problems, and thus are easy to talk about with others. Code has type systems and static analysis so that programmers make fewer mistakes.

This is a narrow view of software engineering. Thinking that your role is "code that works" is hardly better than thinking you're a "(human) resource that produces code". Your job is to provide value. You do that by building knowledge, not only of the system you're developing but of the problem space you're exploring, the customers you're serving, the innovations you can do that your competitors can't.

It's like saying that a soccer player's purpose is "to kick a ball" and therefore a machine that launches balls faster and further than any human will replace all soccer players, and soon all professional teams will be made up of robots.

[-]

saint-evan 20 hours ago

I think your view is sentimental. For businesses the code usually IS the value, and devs ARE human resources that produce code. It sounds cynical, but it’s basically how most orgs operate. From the company’s POV employees function as cogs in a larger system whose purpose is to generate value, considering that businesses are structured to optimize outcomes, i.e. profit. If tech appears that can produce the same output more cheaply or efficiently, companies will most definitely, as we've seen so far, explore replacing people with it. I mean take a look at corporate posture around LLMs. But I do get the point you’re making about knowledge, domain understanding, and solving real problems, because those things clearly matter in practice. From the company’s POV, though, they matter only because they help produce better code/systems, which are still the concrete artifact that embodies the business logic and operations. A symbolic model of the business itself encoded in software. So the framing of devs as human resources that produce code and code as the primary value correctly describes how many businesses see the relationship. And I don't really see the equivalence between SWE-ing in a business context and sports.

[-]

lowsong 19 hours ago

> From the company’s POV employees function as cogs in a larger system whose purpose is to generate value considering that businesses are structured to optimize outcomes i.e. Profit. If tech appears that can produce the same output more cheaply or efficiently, companies will most definitely as we've seen so far explore replacing people with it.

Businesses wish this were the case, and many will even say it or start to believe it. But it doesn't bear out to be true in practice.

Think about it this way, engineers are expensive so a company is going to want to have as few of them as possible to do as much work as possible. Long before LLMs came along there have been many rounds of "replace expensive engineers" fads.

Visual programming was going to destroy the industry, where any idiot could drag and drop a few boxes and put together software. Turns out that didn't work out and now visual programming is all but dead. Then we had consultants and software consultancies. Why keep engineers on staff and have to deal with benefits and HR functions when you can hire consultants for just long enough to get the job done and end their contracts. Then we had offshoring. Why hire expensive developers in markets like California when you can hire far cheaper engineers abroad in a country with lower wages and laxer employment law. (It's not a quality thing either, many of these engineers are unquestionably excellent.)

Or, think about what happens when software companies get acquired. It's almost unheard of for the acquiring company to lay off all of the engineering staff from the acquired company right away; if anything it's the opposite, with vesting incentives to convince engineers to stay.

If all that mattered was the code and the systems, and people were cogs that produced code that businesses wanted to optimise, then none of these actions make sense. You'd see companies offshore and use consultants with the company that does "good enough" as cheaply as possible. You'd see engineers from acquisitions be laid off immediately, replaced with cheaper staff as fast as possible.

There are businesses that operate like this; it happens all the time. But all of the most successful and profitable tech companies in the world don't do this. Why?

[-]

saint-evan 16 hours ago

>If all that mattered was the code and the systems, and people were cogs that produced code that businesses wanted to optimise, then none of these actions make sense.

No, No... Of course all that matters isn't just the code. My framing was about how organizations model the work SWE do economically.

>Visual programming was going to destroy the industry, where any idiot could drag and drop a few boxes and put together software. Turns out that didn't work out and now visual programming is all but dead. Then we had consultants and software consultancies. Why keep engineers on staff and have to deal with benefits and HR functions when you can hire consultants for just long enough to get the job done and end their contracts. Then we had offshoring. Why hire expensive developers in markets like California when you can hire far cheaper engineers abroad in a country with lower wages and laxer employment law. (It's not a quality thing either, many of these engineers are unquestionably excellent.)

It seems like we're agreeing along the same tangent. With this argument, you're admitting that businesses do see SWE as cogs in a wheel and seasonally try to replace them... The seasonality of 'make the engineer replaceable' fads really does point to businesses trying to simplify what devs actually do, since most of what they measure is working code output because it’s a tangible artifact (this is what the OP meant by implying being a working code producer at work). Knowledge, judgment, architectural intuition, and domain understanding are harder to quantify, so they disappear from the model even though they ARE the real constraint. So for the record, I do agree with you that code isn't everything but I maintain that SWEs are modelled based on working code produced, even in more successful companies that invest in domain knowledge and long-term system understanding.

Metrics, performance reviews, sprint velocity, delivery timelines, all orbit around observable artifacts because those are what management systems can actually track objectively and equitably. It's a handy abstraction just like looking only at the ins/outs of a logic gate as opposed to looking at the implementation and wiring. Of course, a NOT gate would get upset over being called a 'bit flipper'; that's not all that physically exists, but from our POV, it doesn't exactly matter. This applies to human labor even if a leaky abstraction w

[-]

lowsong an hour ago

> you're admitting that businesses do see SWE as cogs in a wheel and seasonally try to replace them...

Not quite. I agree that companies will try to do this, but every company that has tried to treat engineering staff as replaceable units of person-hours has failed.

> Metrics, performance reviews, sprint velocity, delivery timelines, all orbit around observable artifacts because those are what management systems can actually track objectively and equitably. It's a handy abstraction just like looking only at the ins/outs of a logic gate as opposed to looking at the implementation and wiring.

Yes, and these metrics are, usually, worthless.

It's not that companies and managers will not try to replace engineers with AI. I'm sure they will. I'm sure many will be laid off because "AI does it cheaper now".

My point is that companies that have gone down this route in the past have failed, and AI is no different. Companies that lean strongly into AI as a workforce replacement will fail too.

slopinthebag 20 hours ago

> Speaking in the context of solving a problem: does AI need to write beautiful code? No. It needs to write code that works. The code doesn’t need to be maintainable in the traditional sense. If you have sufficient tests, you can throw some LLMs at a pile of “bad” code and have them figure it out.

Code doesn't need to be "beautiful", but the beauty of code has nothing to do with maintainability. Linus once said "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." The actual hard part of software is not the code, it's what isn't in the code - the assumptions, relationships, feedback loops, emergent behaviours, etc. Maintainability in that regard is about system design. Imagine software as a graph, the nodes being pieces of code and the edges being those implicit relationships. LLMs are good at generating the nodes but useless at the edges.

The only thing that seems to work is to have a validation criteria (eg. a test suite) that the LLM can use to do a guided random walk towards a solution where the edges and nodes align to satisfy the criteria. This can be useful if what you are doing doesn't really matter, like in the case of all the pet projects and tools people share. But it does matter if your program assumes responsibility somewhere, like if you're handling user data. This idea of guardrail-style programming has been around for a while, but nobody drives by bouncing off the guardrails to get to their destination, because it's much more efficient to encode what a program should do instead of what it shouldn't, which is the case with this type of mega-test-driven-development. Is it more efficient to tell someone where not to go when giving directions as opposed to telling them how to get there?

Take the Cloudflare Next.js experiment for example - their version passed all the Next.js tests but still had issues because the test suite didn't even come close to encoding how the system works.

So no, you still need to care about maintainability. You don't need to obsess over code aesthetics or design patterns or whatever, but you never needed to do that. In fact, more than ever programmers need to be concerned with the edges of their software and how they can guide the LLMs to generate the nodes (code) while maintaining the invariants of the edges.

[-]

sarchertech 20 hours ago

The whole “you can just throw LLMs at the test suite and regenerate the code” thing needs to die because it can’t work for any software that has users. A test suite cannot feasibly cover every observable behavior. Every time you regenerate the code this way you’ll change a huge chunk of the thousands of little observable behaviors that aren’t fixed in place by the test suite. You can’t do this if you have users.
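A small sketch of what that looks like in practice, under made-up assumptions: the two `leaderboard` functions and the one-check "test suite" are hypothetical. Both implementations pass the suite, yet they differ in a behavior a user could see, namely the order of tied entries.

```python
from typing import Callable, List, Tuple

User = Tuple[str, int]  # (name, score)


def leaderboard_v1(users: List[User]) -> List[User]:
    """Original implementation: stable sort, so ties keep their input order."""
    return sorted(users, key=lambda u: -u[1])


def leaderboard_v2(users: List[User]) -> List[User]:
    """'Regenerated' implementation: ties end up in reverse name order."""
    return sorted(users, key=lambda u: (u[1], u[0]), reverse=True)


def passes_test_suite(impl: Callable[[List[User]], List[User]]) -> bool:
    """The only behavior this hypothetical suite pins down: scores are descending."""
    result = impl([("ann", 3), ("bob", 5), ("cat", 3)])
    scores = [score for _, score in result]
    return len(result) == 3 and scores == sorted(scores, reverse=True)


users = [("ann", 3), ("bob", 5), ("cat", 3)]
print(passes_test_suite(leaderboard_v1), passes_test_suite(leaderboard_v2))  # True True
print(leaderboard_v1(users))  # [('bob', 5), ('ann', 3), ('cat', 3)]
print(leaderboard_v2(users))  # [('bob', 5), ('cat', 3), ('ann', 3)]
```

Adding a tie-order test would close this particular gap, but the same exercise can be repeated for every other behavior the suite happens not to mention.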

Similarly to your directions analogy, I’ve been using the analogy of trying to ensure that a 1,000-restaurant franchise produces the exact same peanut butter sandwich for every customer.

It’s much easier to figure out the primitives that your employees understand and then use those primitives to describe exactly how to build a sandwich than it is to write a massive specification that describes what they should produce and just let them figure it out.