A Step Behind the Bleeding Edge: A Philosophy on AI in Dev

(somehowmanage.com)

58 points | by Ozzie_osman 2 days ago

15 comments

  • simianwords 44 minutes ago

    This is one of the more truthful and balanced articles.

    On the verification loop: I think there’s so much potential here. AI is pretty good at autonomously working on tasks that have a well defined and easy to process verification hook.

    A lot of software tasks are “migrate X to Y” and this is a perfect job for AI.

    The workflow is generally straightforward - map the old thing to the new thing and verify that the new thing works the same way. Most of this can be automated using AI.

    Wanna migrate a codebase from C to Rust? I definitely think it should be possible autonomously if the codebase is small enough. You do have to ask the AI to intelligently come up with extensive ways to verify that the old and new versions work the same: maybe a UI check, sample input/output checks against the API, and functionality checks, something like the sketch below.
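
    A minimal sketch of such a verification hook, assuming hypothetical ./old_c_binary and ./new_rust_binary builds and a samples/ directory of inputs: run both binaries on the same inputs and flag any divergence.

        import subprocess
        from pathlib import Path

        OLD = "./old_c_binary"     # hypothetical: the pre-migration C build
        NEW = "./new_rust_binary"  # hypothetical: the post-migration Rust build

        def run(binary, sample):
            """Feed one sample input to a binary and capture its stdout."""
            result = subprocess.run(
                [binary],
                input=sample.read_bytes(),
                capture_output=True,
                check=True,
            )
            return result.stdout

        def verify(sample_dir="samples"):
            """Return the sample inputs where the old and new builds diverge."""
            return [
                sample
                for sample in sorted(Path(sample_dir).iterdir())
                if run(OLD, sample) != run(NEW, sample)
            ]

        if __name__ == "__main__":
            mismatches = verify()
            print("all samples match" if not mismatches else f"mismatches: {mismatches}")

    The same shape works for the API and UI checks: anything that can be phrased as "same input, same observable output" becomes a hook the AI can iterate against.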

    • akiselev 18 minutes ago

      > On the verification loop: I think there’s so much potential here. AI is pretty good at autonomously working on tasks that have a well defined and easy to process verification hook.

      It's scary how good it's become with Opus 4.5. I've been experimenting with giving it access to Ghidra and a debugger [1] for reverse engineering and it's just been plowing through crackmes (from sites like crackmes.one where new ones are released constantly). I haven't bothered trying to have it crack any software but I wouldn't be surprised if it was effective at that too.

      I'm also working through reverse engineering several file formats by just having it write CLI scripts to export them to JSON then recreate the input file byte by byte with an import command, using either CLI hex editors or custom diff scripts (vibe coded by the agent).
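
      A rough sketch of that feedback loop, assuming hypothetical agent-written export.py and import.py scripts: export the unknown format to JSON, re-import it, and require byte-for-byte equality with the original.

          import subprocess
          import sys
          from pathlib import Path

          def round_trip_ok(original):
              """Export original -> JSON, re-import, then compare bytes with the source file."""
              original = Path(original)
              json_path = original.with_suffix(".json")
              rebuilt = original.with_suffix(".rebuilt")
              # export.py / import.py stand in for the agent-written CLI scripts
              subprocess.run([sys.executable, "export.py", str(original), str(json_path)], check=True)
              subprocess.run([sys.executable, "import.py", str(json_path), str(rebuilt)], check=True)
              return original.read_bytes() == rebuilt.read_bytes()

          if __name__ == "__main__":
              print("byte-identical" if round_trip_ok(sys.argv[1]) else "round trip diverged")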

      I still get routinely frustrated trying to use it for anything complicated but whole classes of software development problems have been reduced to vibe coding that feedback loop and then blowing through Claude Max rate limits.

      [1] Shameless plug: https://github.com/akiselev/ghidra-cli https://github.com/akiselev/debugger-cli

  • willtemperley 2 hours ago

    I'm very happy with the chat interface thanks.

    * The interface is near identical across bots

    * I can switch bots whenever I like. No integration points, no vendor lock-in.

    * It's the same risk as any big-tech website.

    * I really don't need more tooling in my life.

    • simianwords 41 minutes ago

      I think the agents are also becoming fungible at the integration layer.

      Any coding agent should be easy to plug into whatever IDE or workflow you need.

      The agents are not fully fungible, though. Each has its own characteristics.

  • kranner an hour ago

    > If you ask AI to write a document for you, you might get 80% of the deep quality you’d get if you wrote it yourself for 5% of the effort. But, now you’ve also only done 5% of the thinking.

    This, but also for code. I just don't trust new code, especially generated code; I need time to sit with it. I can't make the "if it passes all the tests" crowd understand and I don't even want to. There are things you think of to worry about and test for as you spend time with a system. If I'm going to ship it and support it, it will take as long as it will take.

    • simianwords 6 minutes ago

      Honest question: why is this not enough?

      If the code passes tests, and also works at the functionality level - what difference does it make if you’ve read the code or not?

      You could come up with pathological cases like: it passed the tests by deleting them. And the code written by it is extremely messy.

      But we know that LLMs are way smarter than this. There's a very low chance of this happening, and even if it does, a quick glance at the code can catch it.
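
      For what it's worth, even the deleted-tests case is cheap to guard against mechanically. A minimal sketch, assuming a git repo with tests under tests/:

          import subprocess
          import sys

          def deleted_test_files(base="origin/main"):
              """List test files deleted between base and HEAD."""
              out = subprocess.run(
                  ["git", "diff", "--diff-filter=D", "--name-only", base, "HEAD", "--", "tests/"],
                  capture_output=True,
                  text=True,
                  check=True,
              )
              return [line for line in out.stdout.splitlines() if line.strip()]

          if __name__ == "__main__":
              deleted = deleted_test_files()
              if deleted:
                  print("test files deleted in this change:")
                  print("\n".join("  " + path for path in deleted))
                  sys.exit(1)
              print("no test files deleted; a green run means a bit more")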

    • layer8 34 minutes ago

      Yes, regression tests are not enough. One generally has to think through code repeatedly, with different aspects in mind, to convince oneself that it is correct under all circumstances. Tests only point-check, they don’t ensure correct behavior under all conceivable scenarios.
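
      A toy illustration of why point checks aren't enough (hypothetical function and tests): everything below passes, yet the code is wrong for inputs the tests never exercise.

          def days_in_month(month, year):
              """Buggy: treats every year divisible by 4 as a leap year."""
              lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
              if month == 2 and year % 4 == 0:
                  return 29
              return lengths[month - 1]

          # Point checks: all of these pass, so the run is "green".
          assert days_in_month(1, 2023) == 31
          assert days_in_month(2, 2023) == 28
          assert days_in_month(2, 2024) == 29

          # But the behavior is wrong for a case nobody wrote down:
          # days_in_month(2, 1900) returns 29, although 1900 was not a leap year.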

    • slfreference an hour ago

      I think what LLMs do with words is similar to what artists do with software like cinema4d.

      We have control points (prompts + context) and we ask LLMs to draw a 3D surface which passes through those points satisfying some given constraints. Subsequent chats are like edit operations.

      https://youtu.be/-5S2qs32PII

      • catdog 26 minutes ago

        An LLM is an impressive, yet still imperfect and unpredictable translation machine. The code it outputs can only be as good as your prompt is precise, minus the often blatant mistakes it makes.

  • asyncadventure 2 hours ago

    This is a refreshingly pragmatic take on AI adoption. The "step behind the bleeding edge" philosophy resonates deeply - especially the security concerns around the AI gold rush mentality. The emphasis on maintaining ownership of your work while leveraging AI for toil rather than thought is spot on. Too many teams are either completely avoiding AI or blindly adopting every new tool without considering the broader implications.

    • whatevermom5 an hour ago

      Would be nice if we had a dedicated button to flag AI comments.

  • OsamaJaber 33 minutes ago

    The hardest part of "a step behind" is knowing when something has crossed over. How do you decide when a tool is mature enough to adopt?

    • simianwords 3 minutes ago

      This requires good intuition and being unemotional.

      Lots of people who become successful are the ones who can get this prediction correct.

  • piker 2 hours ago

    > “Their (ie the document’s) value stems from the discipline and the thinking the writer is forced to impose upon himself as [she] identifies and deals with trouble spots”.

    Real quote

    > "Hence their value stems from the discipline and the thinking the writer is forced to impose upon himself as he identifies and deals with trouble spots in his presentation."

    I mean seriously?

  • satisfice 40 minutes ago

    The one thing I disagree with is having the AI do its own verification. I explicitly instruct it never to check anything unless I ask it to.

    This is better because I use my own testing as a forcing function to learn and understand what the AI has done. Only after primary testing might I tell it to do its own checking.