Using AI to secure AI

(mattsayar.com)

89 points | by MattSayar 7 hours ago

26 comments

bink 4 hours ago

I think it's funny that I don't see any findings from either Claude or DataDog that couldn't be detected using static analysis. They're pretty simple code bases and maybe that's why.

I'll pay more attention when they start finding vulnerabilities in commonly used, more complex applications.

boston_clone 2 hours ago

Ah, then you’ll want to check out xbow -

https://xbow.com/blog

I believe some folks here (moyix) are active with the project.

mmsc 5 hours ago

Currently living through a great litmus test of competency versus luck by company leaders

gbrindisi 3 hours ago

We’ve kinda solved the detection of issues. What we still lack is an understanding of what’s important.

I think an underappreciated use case for LLMs is to contextualize security issues.

Rather than asking Claude to detect problems, I think it’s more useful to let it figure out the context around vulnerabilities and help triage them.

(for better or worse, I am knee-deep in this stuff)

amelius 2 hours ago

Next: using AI to sue AI.

ryao 3 hours ago

The quotation is more impactful in the original Latin: Quis custodiet ipsos custodes?

themanmaran 3 hours ago

custodes[.]ai would be a great startup name

bee_rider 2 hours ago

Actually, Custodes would have nothing to do with abominable intelligence </warhammer 40k>

ofjcihen 4 hours ago

This has already been leading to some incredible profits for security companies like mine.

So please, don’t be too loud about how terrible it is :)

scarlettadham 3 hours ago

Is this the Blackwall from Cyberpunk? Kinda reminds me of that.

ohdeargodno 4 hours ago

At this point, fuck it, do it, I'm here for the laughs now.

Let Claude run on your production servers and delete ld when something doesn't run (https://www.reddit.com/r/linux4noobs/comments/1mlveoo/help/). Let it nuke your containers and your volumes because why the fuck not (https://github.com/anthropics/claude-code/issues/5632). Let the vibecoders put out thousands of lines of shit code for their stealth B2B startup that's basically a wrapper around OpenAI and MySQL (5.7, because ChatGPT read online that MERN is a super popular stack but relational databases are gooder), then laugh at them when it inevitably gets "hacked" (the user/pw combo was admin/admin and PHPMyAdmin was open to the internet). Burn through thousands of CPU hours generating dogshit code, organising "agents" that cost you 15 cents to do a curl https://github.com/api/what-did-i-break-in/cba3df677. Have Gemini record all your meetings, then don't read the notes it made, and make another meeting with 5 different people the next week.

It will reveal a bunch of things: which companies are run by incompetent leaders, which ones are running on incompetent engineers, which ones keep existing because some dumbass VC wants to throw money in the money burning pit.

Stand back, have a laugh. When you're thrust in a circus, don't participate in the clown show.

troupo 3 hours ago

As I wrote on twitter last month: https://x.com/dmitriid/status/1947245603164996039

--- start quote ---

If you have as much as 1 year experience, your job is safe from AI: you'll make mountains of money unfucking all the AI fuckups

--- end quote ---

johntiger1 3 hours ago

Who watches the watchmen?

malfist 6 hours ago

According to my company's senior leadership there's nothing the magic dust of AI can't solve. Even problems with AI can be solved by more AI

kelseyfrog 4 hours ago

This is where it gets fun.

We're on the precipice of being able to install AI into positions of business critical processes. Hiring, billing, sales, and compliance. It's going to be great watching c-suite and VPs who are drunk on the sauce accept AI in these positions and get golden parachutes when the business ends up facing a massive external audit, fraud, and the possibility of bankruptcy.

at-fates-hands 3 hours ago

>> We're on the precipice of being able to install AI into positions of business critical processes. Hiring, billing, sales, and compliance

We're already there. Have been for several years now. I was doing RPA (robotic process automation) for about 4 years in a corporate environment. It went from "Let's automate these mundane tasks" to "How can we create a billing platform that can be totally automated?". This was back in 2021, just for reference.

>> It's going to be great watching c-suite and VPs who are drunk on the sauce

Hopefully this will be a cautionary tale of what happens when they do?

https://www.reuters.com/legal/lawsuit-claims-unitedhealth-ai...

UnitedHealth Group Inc (UNH.N), uses an artificial intelligence algorithm that systematically denies elderly patients' claims for extended care such as nursing facility stays, according to a proposed class action lawsuit, filed on Tuesday.

Family members of two now-deceased UnitedHealth beneficiaries sued the insurer in federal court in Minnesota, saying they were forced to pay out of pocket for care that doctors said was medically necessary.

TZubiri 3 hours ago

Thinking about pivoting to pentesting

bongodongobob 4 hours ago

Pfft. The hammer will come down on IT leadership, not execs.

citizenpaul 3 hours ago

>IT Leadership

You nailed it. I've found that HN users in general have a terrible understanding of how power dynamics work. Most seem to want to jam some sort of logical outcome onto a situation that only ever has one outcome. Those with power decide the outcome.

shermantanktop 2 hours ago

Word. It’s true until events overtake them. But until then, the dominant understanding of a problem is that which preserves the current power structure.

And that’s why the C-level AI mania is so fascinating - preserving the status quo usually means rejecting or controlling change. But with AI they are embracing something that could eat their status, presumably out of legitimate fear of the alternative.

andy99 4 hours ago

IT leadership will blame their subordinates, the ones that knew better - somehow in these things it's always the people who should be able to say "I told you so" that get the blame.

nicce 5 hours ago

This reminds me of "The Emperor's New Clothes" way too much.

crinkly 4 hours ago

Going through that now. I’m part of the Chernobyl clean up team already. Mostly because my part of the org is the only one with a positive ROI and isn’t a fucked up mess still so I’ve got plenty of time to deal with everyone else’s problems.

How did we get in this situation? Avoid all the fucking fads.

aurumque 4 hours ago

And yet when I suggest that replacing senior leadership is one of the highest-ROI uses of AI, they immediately shut down the conversation.

crinkly 4 hours ago

Yeah. Our senior leadership just ask ChatGPT what to do. Might as well skip the middle man.

jimt1234 6 hours ago

We must work at the same company. LOL