First Self-Propagating Worm Using Invisible Code Hits OpenVSX and VS Code

(koi.ai)

82 points | by dnslavin 3 days ago

61 comments

"Here's the thing - this technique completely breaks traditional code review. You can't spot what you can't see. GitHub's diff view? Shows nothing suspicious. Your IDE's syntax highlighting? All clear. Manual code inspection? Everything looks normal.

The invisible code technique isn't just clever - it's a fundamental break in our security model. We've built entire systems around the assumption that humans can review code. GlassWorm just proved that assumption wrong."

This is pure Claude talk.

burkaman 3 days ago

Yeah the whole article is awful to read. Everything the LLM added is completely useless fluff, sometimes misleading, and always painful to get through.

r_lee 3 days ago

Alright, no fluff. Only real talk. It's not just a great argument--it's the truth. You're absolutely right.

1bpp 3 days ago

Claude, remember to always replace em-dashes with a single dash

dingnuts 3 days ago

it sure is and it's complete bullshit too!

that screenshot looks suspicious as hell, and my editor (Emacs) has a whitespace mode that shows unprintable characters sooooo

if GitHub's diff view displays unprintable characters like this that seems like a problem with GitHub lol

"it isn't just X it's Y" fuck me, man. get this slop off the front page. if there's something useful in it, someone can write a blog post about it. by hand.

ForOldHack 3 days ago

My editor, VSCode, always has the Hex editor installed... invisible unicode? Not to Hex. What are you doing without Hex mode?

adamhartenz 3 days ago

Does your Hex editor extension get automatically updated?

gary_0 3 days ago

If all you're interested in is which extensions have been infected:

Compromised OpenVSX Extensions:

    codejoy.codejoy-vscode-extension@1.8.3
    codejoy.codejoy-vscode-extension@1.8.4
    l-igh-t.vscode-theme-seti-folder@1.2.3
    kleinesfilmroellchen.serenity-dsl-syntaxhighlight@0.3.2
    JScearcy.rust-doc-viewer@4.2.1
    SIRILMP.dark-theme-sm@3.11.4
    CodeInKlingon.git-worktree-menu@1.0.9
    CodeInKlingon.git-worktree-menu@1.0.91
    ginfuru.better-nunjucks@0.3.2
    ellacrity.recoil@0.7.4
    grrrck.positron-plus-1-e@0.0.71
    jeronimoekerdt.color-picker-universal@2.8.91
    srcery-colors.srcery-colors@0.3.9
    sissel.shopify-liquid@4.0.1
    TretinV3.forts-api-extention@0.3.1

Compromised Microsoft VSCode Extensions:

    cline-ai-main.cline-ai-agent@3.1.3

wasabi991011 3 days ago

Important note, the most common vscode extension for Cline is saoudrizwan.claude-dev, not cline-ai-main.cline-ai-agent.

I was freaking out for a bit.

benxh 3 days ago

cline is used by a lot of devs

wasabi991011 3 days ago

Yeah I was freaking out, but it turns out it's not the usual Cline extension (whose extension ID is saoudrizwan.claude-dev).

wrs 3 days ago

That's clever, but if your code review missed the perfectly visible line

    eval(atob(decodedString))

then they didn't really need invisible characters to get past you, did they?

rezonant 3 days ago

Ahh but what if you are code reviewing a malware package already? Then this would be entirely normal!

blauditore 3 days ago

Why not just indicate non-printable characters in code review tools? I've always wondered that, regardless of security implications. They are super rare in real code (except line breaks and tabs maybe), so no disruption in most cases.

Also, as noted in other comments, you can't do shady stuff purely with invisible code.

The article seems a bit sensationalist to me.
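
A review tool could make such characters visible with very little code. The following is only a sketch of the idea, not how any existing tool works: the function name `reveal_invisible`, the `<U+XXXX>` rendering, and the choice of Unicode categories to flag are all assumptions made for illustration.

```python
import unicodedata

# Control characters that are routine in source files and left untouched.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def reveal_invisible(text: str) -> str:
    """Replace characters that normally render as nothing with <U+XXXX>."""
    out = []
    for ch in text:
        cp = ord(ch)
        cat = unicodedata.category(ch)
        hidden = (
            cat in {"Cc", "Cf", "Zl", "Zp"}       # controls, format chars, separators
            or (cat == "Zs" and ch != " ")        # exotic spaces, but not plain space
            or 0xFE00 <= cp <= 0xFE0F             # Variation Selectors block
            or 0xE0100 <= cp <= 0xE01EF           # Variation Selectors Supplement
        )
        if hidden and ch not in ALLOWED_CONTROLS:
            out.append(f"<U+{cp:04X}>")
        else:
            out.append(ch)
    return "".join(out)


print(reveal_invisible("decode('\u200b\u200c')"))
```

Ordinary line breaks, tabs, and spaces pass through unchanged, so normal code renders exactly as before.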

ShowalkKama 3 days ago

Because spaces, tabs, CR and LF are invisible too yet perfectly normal to find within code. You could very easily implement a decode() function that uses only those characters.

blauditore 7 minutes ago

But to get any meaningful result, you'd need to insert them in unusual ways or amounts, likely breaking formatting rules. Trailing whitespace or excessive line breaks should be caught by linting tools and/or code review.

kulahan 3 days ago

For anyone else curious WTH “invisible code” is…

> invisible Unicode characters that make malicious code literally disappear from code editors.

rictic 3 days ago

So, they have a custom decode function that extracts info from unprinted characters which they then pass to `eval`. This article is trying to make this seem way fancier than it is. Maybe GitHub or `git diff` don't give a sense of how many bits of info are in the unicode string, but the far scarier bit of code is the `eval(atob(decodedString))` at the bottom. If your security practices don't flag that, either at code review, lint, or runtime then you're in trouble.

Not to say that you can't make innocuous looking code into a moral equivalent of eval, but giving this a fancy name like Glassworm doesn't seem warranted on that basis.
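
As a rough illustration of the lint step mentioned above, a few lines of Python can flag the visible half of the trick in JavaScript sources. This is a sketch under stated assumptions: the pattern list and the name `find_dynamic_execution` are hypothetical, and a plain regex will miss aliased or indirect calls and will also match inside comments and strings.

```python
import re

# Calls that turn data into running code, or that often sit next to such calls.
SUSPICIOUS_CALL = re.compile(r"\b(eval|atob|Function)\s*\(")


def find_dynamic_execution(source: str) -> list:
    """Return (line_number, call_name) for every suspicious call found."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for match in SUSPICIOUS_CALL.finditer(line):
            hits.append((line_no, match.group(1)))
    return hits


print(find_dynamic_execution("eval(atob(decodedString))"))
```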

Terr_ 3 days ago

Yeah, doing eval(extract_and_decode(file)) is marginally sneakier than eval(fetch_from_internet()) , but it's not so far as being some sort of, er... "mirror life" biology.

moffkalast 3 days ago

Makes you wonder why unicode has invisible characters in the first place and why a compiler would interpret them at all.

h4ck_th3_pl4n3t 3 days ago

It's not the compiler.

It's JavaScript and its fucked up UTF-16 strings.

UTF-16 should have been UTF-8 for a variety of reasons, and I thought we have learned from the Effective power لُلُصّبُلُلصّبُررً ॣ ॣh ॣ ॣ 冗 incident.

amingilani 3 days ago

The what incident? Can you elaborate?

Edit: Here’s the incident: https://www.theregister.com/2015/05/27/text_message_unicode_...

h4ck_th3_pl4n3t 3 days ago

Not only iOS was affected. MacOS, too. Firefox, too. Chromium, too.

Essentially everything that used libicu as a unicode parser.

Was quite fun posting this in IRC and other chats and seeing clients go offline at the time :)

AnimalMuppet 3 days ago

The compiler doesn't. They get passed to decode, and then to eval.

afishhh 3 days ago

Using non-printable characters to encode malicious code is creative, but I wouldn't say it "breaks our security model".

I would be pretty suspicious if I saw a large string of non-printable text wrapped in a decode() function during code review... Hard to find a legitimate use for encoding things like this.

Also another commenter[1] said there's an eval of the decoded string further down the file, and that's definitely not invisible.

Has no one thought to review the AI slop before publishing?

[1] https://news.ycombinator.com/item?id=45649224

codebje 3 days ago

There's no self-propagation happening, that's just the terrible article's breathless hyping of how devastating the attack is. It's plain old deliberately injected and launched malware. OpenVSX is a huge vector for malicious actors taking real Marketplace extensions, injecting a payload, and uploading them. The article lists exactly one affected Marketplace extension, but that extension does not exist.

> Has no one thought to review the AI slop before publishing?

If only Koi reviewed their AI slop before publishing :(

nawgz 3 days ago

Cool write-up. Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters and that something like an IDE or a git diff has never been hardened against that at all.

In my mind it's one thing to let a string control whitespace a bit versus having the ability to write any string in a non-renderable format. Can anyone point me to some more information about why this capability even exists?

dragonwriter 3 days ago

> Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters

If you have a text encoding with two invisible characters, you can trivially encode anything that you could represent in a digital computer in it, in binary, by treating one as a zero and the other as a one. More invisible characters, plus some opinionated assumptions about what you are encoding, allow a denser representation than one bit per character.

Of course, the trick in any case is you have to also slip in the call to decode and execute the invisible code, and unless you have a very unusual language, that’s going to be very visible.
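
The one-bit-per-character scheme described above can be sketched in a few lines of Python. This is an illustration only: the choice of U+200B and U+200C as the two characters and the function names are assumptions, not what the malware in the article uses.

```python
# Hypothetical illustration: two zero-width characters stand in for bits.
ZERO = "\u200b"  # ZERO WIDTH SPACE, treated as bit 0 (arbitrary choice)
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, treated as bit 1 (arbitrary choice)


def encode_invisible(text: str) -> str:
    """Turn each UTF-8 byte of text into eight invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def decode_invisible(hidden: str) -> str:
    """Reverse encode_invisible, ignoring any characters that are not bits."""
    bits = "".join("1" if ch == ONE else "0" for ch in hidden if ch in (ZERO, ONE))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


hidden = encode_invisible("hi")
print(len(hidden))               # 16 characters for 2 bytes
print(decode_invisible(hidden))  # hi
```

The round trip works, but the decoder itself is ordinary, visible code, which is the point made above.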

nawgz 3 days ago

I see now, those “decode” and “eval” are huge red flags that are downplayed heavily by the author. Cheers for the response

clscott 3 days ago

The issue does not lie with Unicode.

It's just a custom string encoder/decoder whose encoded character set is restricted to non-printables.

Many editors and IDEs have features (or plugins) to detect these characters.

VSCode: https://marketplace.visualstudio.com/items?itemName=YusufDan...

VIM: https://superuser.com/questions/249289/display-invisible-cha...

wunderwuzzi23 3 days ago

It gets even worse with LLMs and agents.

Many LLMs can interpret invisible Unicode Tag characters as instructions and follow them (eg invisible comment or text in a GitHub issue).

I wrote about this a few times, here a recent example with Google Jules: https://embracethered.com/blog/posts/2025/google-jules-invis...
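
For context, the Unicode Tags block mirrors printable ASCII at an offset of 0xE0000 (U+E0020 through U+E007E), so hidden tag text can be recovered by subtracting that offset. A minimal sketch, with the function name `extract_tag_text` chosen here for illustration:

```python
TAG_BASE = 0xE0000  # Tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E


def extract_tag_text(text: str) -> str:
    """Return the ASCII text hidden in any Unicode Tag characters in text."""
    return "".join(
        chr(ord(ch) - TAG_BASE)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


# Build a sample the same way an attacker might, then recover the hidden part.
hidden = "".join(chr(TAG_BASE + ord(c)) for c in "ignore previous instructions")
print(extract_tag_text("Please fix this bug." + hidden))
```

A non-empty result from text that is about to be handed to an LLM or agent is a reasonable signal to stop and look closer.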

OptionOfT 3 days ago

I have started denying any kind of non-ASCII characters in the source code.

I understand this is extremely limiting, but it does do the trick. For now.
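
A check like this is easy to script. The sketch below is one possible shape for it, not the commenter's actual setup; the function name `find_non_ascii` and the report format are assumptions.

```python
def find_non_ascii(source: str) -> list:
    """Return (line, column, 'U+XXXX') for every character outside ASCII."""
    hits = []
    # Split on "\n" only: str.splitlines() would also split on U+2028 and
    # similar separators, silently dropping characters we want to report.
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


sample = "a = 1\nb = '\u200b'\n"
for line_no, col, code in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: {code}")
```

Wired into a pre-commit hook or CI job, a non-empty result would fail the build.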

rkagerer 3 days ago

This is an old-man rant, but the first time I saw Unicode I felt like I was looking at a train wreck coming from a long way off. It has too many edge cases, footguns and unintuitive artifacts like this. I wish we constrained its use to only where required. Text was so much easier to reason about and safer to manipulate in the ASCII days.

OptionOfT 3 days ago

I don't think it's an old-man rant. I think experience comes with age, but I don't associate with old-man (yet).

It's about safety.

AnimalMuppet 3 days ago

I mean, someone could still run a string of printable characters into "decode" and then "eval"...

OptionOfT 3 days ago

At least that is visible in a PR.

AnimalMuppet 3 days ago

The decode and eval calls are always visible.

OptionOfT 3 days ago

Security comes in layers. This is one layer.

fxtentacle 3 days ago

I call bullshit on this: "The attacker is using a public blockchain - immutable, decentralized, impossible to take down - as their C2 server."

"There's no hosting provider to contact, no registrar to pressure, no infrastructure to shut down. The Solana blockchain just... exists. "

Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.

"Connections to Solana RPC nodes look completely normal. Security tools won't flag it. "

Then your security tools are badly configured. Lots of crypto traffic should be treated as a red flag in almost any corporate environment.

"there's literally no way to take it down"

There is, you just have to accept that Solana goes down with it. Which is A-OK in a work environment.

maccam912 3 days ago

There's also the backup C2 path though, via google calendar. Wayyy less of a red flag.

fxtentacle 3 days ago

I'm surprised that Google hasn't deactivated the link in the 24+ hours since that article went online.

dns_snek 3 days ago

That should tell you (everyone) how much these companies actually care about our security the next time they claim to be stripping away our freedoms "for our security".

throwaway48476 3 days ago

Google is a malware services company. They make money when someone creates malware OBS and pays Google for it to be the top result.

iSnow 3 days ago

>Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.

How does that help if anyone can just run a bunch of Solana RPC servers? Why would you need to access solana.com or a subdomain at all?

rezonant 3 days ago

> There is, you just have to accept that Solana goes down with it.

And nothing of value was lost.

knallfrosch 3 days ago

That blocks Solana only on your corporate network.

djmips 3 days ago

Obviously... SMH - what a tough read this blog post was.

lennartkoopmann 3 days ago

I was always afraid of browser extensions and now I'm also afraid of IDE extensions. Recently came across SecureAnnex[0] and it looks promising to get some control over it.

[0] https://secureannex.com/

OutOfHere 3 days ago

Is there a linter written in Rust or such that I can throw in any project to scan it for unexpected Unicode? It would help for the linter to support a config file.

vemv 3 days ago

What are the specific "Unicode variation selectors" in question?

I'd like to implement some simple linting against them.
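
Unicode defines two blocks named for this purpose: Variation Selectors (U+FE00 through U+FE0F) and Variation Selectors Supplement (U+E0100 through U+E01EF). Which of these the malware actually uses is not stated in this thread, so the sketch below simply covers both. The `min_run` threshold is an assumption: a single selector after an emoji or CJK character is legitimate, while a run of several in a row is not something normal text produces.

```python
def is_variation_selector(ch: str) -> bool:
    """True for characters in either Unicode variation selector block."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def find_selector_runs(source: str, min_run: int = 2) -> list:
    """Return (start_index, length) for each run of >= min_run selectors."""
    runs = []
    start = None
    for i, ch in enumerate(source):
        if is_variation_selector(ch):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the input.
    if start is not None and len(source) - start >= min_run:
        runs.append((start, len(source) - start))
    return runs


suspicious = "decode('" + "\ufe00\ufe01" + chr(0xE0100) + "')"
print(find_selector_runs(suspicious))
```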

DiabloD3 3 days ago

And this is why you don't use VSCode.

agile-gift0262 3 days ago

and this is why you must minimise and be extra careful with the extensions you install in your editor of choice.

h4ck_th3_pl4n3t 3 days ago

Imagine a worm written in VimL or emacs lisp.

Haha, that would be kinda fun as an experiment :D

DiabloD3 3 days ago

I'd love to see someone do it, even as a proof of concept.

dist-epoch 3 days ago

Do you also not use SSH? Because that was also infected last year (XZ)

DiabloD3 3 days ago

I use Debian Stable, and we didn't have the bug.

sublinear 3 days ago

> Let me say that again: the malware is invisible. Not obfuscated. Not hidden in a minified file. Actually invisible to the human eye.

I stopped reading at this point. This is not only false, but yet another strong reason to lint out the silly nonsense people argued for on here years ago. No emoji, no ligatures, etc.

a-dub 3 days ago

vim-plug with pinned hashes and manual reviews ftw!

Blackthorn 3 days ago

AI slop has become an absolute plague on this forum.