I knew it was possible :-)
https://webkit.org/blog/7846/concurrent-javascript-it-can-wo...
That's excellent work and a great read, Filip!
Yes, you did. And it's a good design. You even did the GC question justice.
My concern is more in the spirit of "Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should." Of course JS being single threaded wasn't a hard constraint. Lift it, and people like you can use the parallelism to do great things.
The problem is that most developers are not you. Shared memory concurrency is foot-artillery (especially if truly parallel). Adding threads to the JS ecosystem is selling W48 nuclear artillery shells at the toy store.
JS's ostensible limitation to a single thread forced users to do what they should have been doing anyway: message-passing, thread-per-core architecture, and actor-ish stuff. People who don't know better reach for shared memory concurrency because it seems like a good way to solve problems, but it's actually a dangerous attractor in idea space. JS engine limitations were accidentally keeping people away from it. Now that they can hear the siren's song of a mutex, they'll run aground on the hard problems of parallel programming.
Now, that's not a reason to avoid shipping such a system. It's just not something I would have chosen to implement for the masses.
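For anyone who hasn't been bitten by this, here is a minimal sketch of the classic lost-update race. It is a deterministic, single-threaded simulation: the two "threads" are step functions interleaved by hand, not real workers, and `unsynchronizedIncrement` and `cell` are made-up names for illustration only.

```typescript
// One shared 32-bit slot, standing in for memory visible to two threads.
const cell = new Int32Array(new SharedArrayBuffer(4));

// Hypothetical helper: splits an unsynchronized increment into its two
// halves so the interleaving can be scripted by hand. Each half is atomic
// on its own, but the load/store pair is not.
function unsynchronizedIncrement(shared: Int32Array) {
  let seen = 0;
  return {
    load: (): void => { seen = Atomics.load(shared, 0); },
    store: (): void => { Atomics.store(shared, 0, seen + 1); },
  };
}

const a = unsynchronizedIncrement(cell);
const b = unsynchronizedIncrement(cell);

// Bad interleaving: both "threads" load before either one stores.
a.load();
b.load();
a.store();
b.store();
const racyResult = Atomics.load(cell, 0); // one of the two increments is lost

// Atomics.add does the read-modify-write as a single indivisible step,
// so no interleaving can slip in between the read and the write.
Atomics.store(cell, 0, 0);
Atomics.add(cell, 0, 1);
Atomics.add(cell, 0, 1);
const atomicResult = Atomics.load(cell, 0);
```

With real threads the bad interleaving only shows up some of the time, which is exactly what makes this class of bug hard to find.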
I don’t understand the thread phobia
Comparing it to nukes is a bit extreme, don’t you think?
This is consistent with the endless contempt people have had for JavaScript and those that use it.
Yeah I don’t get that either
It’s a super successful language
The code needs to be not in the state of "no obvious bugs", but "obviously no bugs". Especially the programming language runtime. Otherwise there is no hope you can sustain any development whatsoever
No language runtime is ever in a state of "obviously no bugs".
Good luck demanding that of anything of JSC's or LLVM's complexity
On one hand, sure, the entire point of a programming language is to make complex ideas able to be expressed in simpler abstractions. On the other hand, we can damn well try.
Damn well trying to enforce an "obviously no bugs" rule in a language runtime would mean zero progress in language runtimes.
We certainly wouldn't have gotten to where we are with runtime and compiler quality and performance if we had damn well tried to enforce such a rule
IMO the very minimum requirement should be that you've demonstrated effort to reduce unnecessary complexity of the problem. Sure, some problems are complex enough that there might not exist an obvious solution, yet usually after a while once you're familiar with some topic the existing solutions do start to appear obvious. If they're not I'd argue we're doing something very very wrong
Adding concurrency to JavaScript definitely falls in the "complex enough" category
So does basically any feature or optimization in a JS runtime
I think it's also worth distinguishing _problem complexity_ and _solution complexity_. The problem might be really really hard (and it very obviously is in the case of adding multi-threading to JavaScript). But it does not mean that the solution has to be hard to understand. It doesn't mean that any average PHP developer (I can say that, I started with PHP) should be able to verify the correctness of the patch, but for a person who is well familiar with the area there shouldn't exist areas they can't understand.
Look at the description of your own Fil-C: it focuses on clarity of explanation of how it works, and it actually does make sense (and, hopefully, works well enough too). Compare that with the pull request sent here. I'll wait
The solution to concurrency in JS is hard to understand and I would expect even hardened JSVM folks (me included) to be super confused by it
Perhaps then it would be better to not use tools of this level of complexity.
I think LLVM is a perfect example of what happens when it's too complicated: it's slow, it's bug-ridden when you stray away from the beaten path (e.g. Rust hits bugs in LLVM like this one https://www.reddit.com/r/rust/comments/l4roqk/a_fix_for_the_... ), and it's really hard to use and understand.
It's obviously not useless because of that, but it's a great example of what happens when you cannot fully control the implementation complexity
So don't use compilers at all?
Compilers aren't made equal either. E.g. compare Visual Studio C++.NET compiler and something like Go. And Go isn't that simple either to be fair
how would you suggest we compile literally anything?
> Scalability, measured (the honest section)
Ugh.
almost spit out my drink!
Is there a human-authored description of the PR anywhere?
How are there not race conditions all over the place?
It's substantially based on my design, read the blog post I wrote (linked in another comment here)
It's a very complex thing, but not impossible. I'm very impressed that any LLM can do this
think of all the poor web devs trying to use multiple threads on top of asynchronous operations. wild.
Standard contempt for web developers.
It’s pretty incredible to me that a mammoth change like this is possible to prototype now using LLMs.
It makes me wonder how much of our software stack will become more malleable to big ideas and experiments in the future, like Filip’s idea here. Even if you don’t want to merge the code, it’s still an incredible existence proof that something like this could work.
I am shocked by how good and comprehensive the bun docs & ecosystem are.
It's so well contained I never need to look outside its ecosystem for basic components. It's a true "Batteries Included" runtime.
Last time I read the bun docs I spotted an off-by-one bug in sample code, so I opened a github issue. An AI bot responded, confirming the issue, and opened a PR to fix it - A simple "+ 1" added in the right place. Two other AI bots reviewed the PR, which went on for several rounds of "improvements". Last time I checked, neither the issue nor the PR received any human attention (actually I just checked again, and the PR has been closed by stalebot).
> (actually I just checked again, and the PR has been closed by stalebot).
Can you provide the link?
I, too, was curious to see it in practice.
Here is the ticket opened by @retr0id: https://github.com/oven-sh/bun/issues/28030
And here is the swarm of bots / LLMs / agents that open, review and bikeshed the PR before it's closed by the stalebot: https://github.com/oven-sh/bun/pull/28031
It's hilarious. But also a little sad.
Yup, that's the one.
Was the bug actually solved?
Bun is so good that it can't be used as a server, only as a local script runner.
https://discord.com/channels/876711213126520882/148058965798...
Leaks memory left and right. And the core team seems unable to fix it.
Yet I rarely hear about it being used in production systems and replacing Node.js.
Don't have much to say on the topic but recalled this excerpt from the book Coders at Work in the chapter interviewing Douglas Crockford.
``` In my experience, the worst bugs are the real-time bugs, which have to do with interactions with multiple threads. My approach to those bugs is to avoid making them. So I don't like threads. I think threads are an atrocious programming model. They're an occasionally necessary evil, but they're not necessary for most of the things we use threads for.
One of the things I like about the browser model is that we only get one thread. Some people complain about that—if you lock up that thread, then the browser's locked up. So you just don't do that. There are constantly calls for putting threads into JavaScript and so far we've resisted that. I'm really glad we have.
The event-based model, which is what we're using in the browser, works really well. The only place where it breaks down is if you have some process that takes too long. I really like the approach that Google has taken in Gears to solving that, where they have a separate process which is completely isolated that you can send a program to and it'll run there. When it's finished, it'll tell you the result and the result comes back as an event. That's a brilliant model. ```
Soo... Essentially, still threads, but no shared state between threads, and they talk through this message interface?
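Roughly, yes. A minimal sketch of that model, assuming a hypothetical `CounterActor` whose state is reachable only through messages. It runs on a single thread to stay self-contained; the JSON round-trip in `copy` is a stand-in for the copy that `postMessage` makes, and a real version would put the actor behind a worker.

```typescript
// Messages the actor understands, and the reply it can produce.
type Message =
  | { kind: "add"; amount: number }
  | { kind: "get" };

type Reply = { kind: "value"; value: number };

// Stand-in for postMessage: the receiver gets a copy, never a shared reference.
function copy<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

class CounterActor {
  private total = 0; // owned exclusively by this actor
  private mailbox: Message[] = [];

  // Enqueue a copy of the message; the sender keeps its own object.
  send(message: Message): void {
    this.mailbox.push(copy(message));
  }

  // Drain the mailbox one message at a time, like turns of an event loop.
  run(): Reply[] {
    const replies: Reply[] = [];
    let message = this.mailbox.shift();
    while (message !== undefined) {
      if (message.kind === "add") {
        this.total += message.amount;
      } else {
        const reply: Reply = { kind: "value", value: this.total };
        replies.push(copy(reply));
      }
      message = this.mailbox.shift();
    }
    return replies;
  }
}

const actor = new CounterActor();
const request = { kind: "add" as const, amount: 5 };
actor.send(request);
request.amount = 1000; // mutating after send does not touch the actor's copy
actor.send({ kind: "add", amount: 2 });
actor.send({ kind: "get" });
const replies = actor.run();
```

Because nothing is shared, there is nothing to lock: the only way to observe or change `total` is to send a message and wait for the reply.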
This is terrifying. Evidently based on prior art by Mr. Pizlo – indeed, where's the acknowledgement of that?? (edit: I missed it) – but I'm assuming that was never translated into code.
I love the idea of experimentation and innovation; I abhor the idea of it being dependent on Anthropic and their theft. I've never rooted for the Chinese labs more strongly than after seeing this.
The acknowledgement is in the PR description, section "The design, and what it's based on".
Thanks, fixed
One of the biggest things preventing software like SQL DB's from being written in TypeScript is the lack of proper threading.
I genuinely think you could write a competitively-performant multi-threaded DB in Bun + TS if you had shared-heap threads and fast atomics/locking primitives.
"I genuinely think you could write a competitively-performant multi-threaded DB in Bun + TS if you had shared-heap threads and fast atomics/locking primitives."
Not likely. Databases that attain any significant use in the field end up getting optimized to the nth degree because they're the bottleneck of every system they get put into. JavaScript runs on the "5-10x slower than C" language tier. Personally I think even picking Go, in the "2x slower than C" tier, is a huge mistake, though a few people seem to be doing OK with it. I don't think you can call it "competitive" when your C++ or Rust competition is consuming an order of magnitude fewer resources.
WASM DBs, maybe, especially as it continues to mature. Not Javascript.
You have web workers, and for shared memory and synchronisation respectively SharedArrayBuffer and the Atomics namespace.
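As a rough illustration of what those primitives give you, here is a sketch of a futex-style mutex built from `Atomics.compareExchange`, `Atomics.wait` and `Atomics.notify`. `SharedMutex` is a hypothetical name, and the worker wiring is left out: in real use the same `SharedArrayBuffer` would be posted to each worker, and each would construct its own `SharedMutex` over it. Note that browsers do not allow `Atomics.wait` on the main thread, so the blocking path is meant for workers.

```typescript
const UNLOCKED = 0;
const LOCKED = 1;

class SharedMutex {
  private readonly state: Int32Array;

  // The lock word is a single Int32 living inside the shared buffer.
  constructor(buffer: SharedArrayBuffer, byteOffset = 0) {
    this.state = new Int32Array(buffer, byteOffset, 1);
  }

  // Atomically flip UNLOCKED -> LOCKED; succeeds for exactly one caller.
  tryLock(): boolean {
    return Atomics.compareExchange(this.state, 0, UNLOCKED, LOCKED) === UNLOCKED;
  }

  lock(): void {
    while (!this.tryLock()) {
      // Sleep only while the word still reads LOCKED; if it already changed,
      // wait returns immediately and the loop retries the compare-exchange.
      Atomics.wait(this.state, 0, LOCKED);
    }
  }

  unlock(): void {
    Atomics.store(this.state, 0, UNLOCKED);
    Atomics.notify(this.state, 0, 1); // wake at most one waiter
  }
}

// Single-threaded walk-through: 4 bytes for the lock, 4 for a counter.
const sab = new SharedArrayBuffer(8);
const mutex = new SharedMutex(sab, 0);
const counter = new Int32Array(sab, 4, 1);

mutex.lock();
const contended = mutex.tryLock(); // a second acquire fails while held
Atomics.store(counter, 0, Atomics.load(counter, 0) + 1); // critical section
mutex.unlock();
const reacquired = mutex.tryLock(); // succeeds once released
mutex.unlock();
```

This always calls `Atomics.notify` on unlock, which is simple but not the cheapest option; tracking whether anyone is waiting would avoid the extra call.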
Exactly. Nothing stops you from writing a high-performance parallel database in TypeScript today. Given that runtimes and tooling are actually pretty good, I think TypeScript is actually a fine choice of language for the task.
The only thing you can't do with JS today is share a heap across threads. You have SharedArrayBuffer. You have atomics. You don't need a shared address space.
There's a high performance database called "PostgreSQL" you may have heard about. It doesn't use threads. It uses separate processes and shared memory: just like standard JavaScript, with its web workers and SharedArrayBuffer.
If not sharing an address space is good enough for PostgreSQL, it's good enough for your TypeScript database.
The problem with shared-everything, unmarked, preemptive-parallel concurrency is that 90% of the time it gets used by people who don't know they shouldn't.
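To sketch what "shared memory without a shared heap" looks like in practice: rows live in a `SharedArrayBuffer` at fixed byte offsets, and each side wraps the buffer in its own view. The row layout (a 32-bit id plus a 64-bit balance, padded to 16 bytes) and the `RowTable` name are assumptions made up for this example, not anything from a real database.

```typescript
// Hypothetical fixed-size row: [id: int32][padding][balance: float64].
const ROW_SIZE = 16;
const ID_OFFSET = 0;
const BALANCE_OFFSET = 8;

class RowTable {
  private readonly view: DataView;
  private readonly capacity: number;

  constructor(buffer: SharedArrayBuffer) {
    this.view = new DataView(buffer);
    this.capacity = Math.floor(buffer.byteLength / ROW_SIZE);
  }

  // Encode a row into the shared bytes (little-endian).
  write(index: number, id: number, balance: number): void {
    const base = this.offsetOf(index);
    this.view.setInt32(base + ID_OFFSET, id, true);
    this.view.setFloat64(base + BALANCE_OFFSET, balance, true);
  }

  // Decode a row into a fresh object that belongs to the caller's own heap.
  read(index: number): { id: number; balance: number } {
    const base = this.offsetOf(index);
    return {
      id: this.view.getInt32(base + ID_OFFSET, true),
      balance: this.view.getFloat64(base + BALANCE_OFFSET, true),
    };
  }

  private offsetOf(index: number): number {
    if (!Number.isInteger(index) || index < 0 || index >= this.capacity) {
      throw new RangeError(`row index ${index} out of range`);
    }
    return index * ROW_SIZE;
  }
}

// Two independent wrappers over the same bytes, as two workers would have.
const page = new SharedArrayBuffer(4 * ROW_SIZE);
const writerSide = new RowTable(page);
const readerSide = new RowTable(page);

writerSide.write(1, 42, 99.5);
const row = readerSide.read(1);
```

`DataView` accesses are not atomic, so concurrent writers would still need some synchronization (for example an `Atomics`-based lock) around each row.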
Are you hoping to, like, run postgres in nodejs or something?
You can get parallelism with web workers and shove sqlite over there if you like, e.g. for running more intensive queries. Beyond that I kinda don't see much of a reason to use JS for databases, except maybe for isolation (e.g. via wasm).
I honestly should print that comment and hang it on a wall.
> …competitively-performant…
Care to explain: competitive with what?
…but why? JS/TS does not seem like the right tool for the job?
It's probably what they know, so nothing new has to be learned.
Imagine somebody doing a drive-by on your repo and dropping a 270k loc PR expecting you to merge it. Bonus points if they can't even put in the 0.001% smidgen of effort to write why they think the PR is useful or necessary in their own words. Oh, but we don't have to imagine it, because there are people who actually do that!
The PR is against bun's fork of WebKit, not upstream.
The title of this post is definitely confusing, if not misleading.
Oh, my mistake, I thought they were doing the zig thing again.
Eh, Firefox/Thunderbird had multi-threaded JS in SpiderMonkey in the late 90s.
Then it was removed because it made garbage collection a real mess (the JavaScript GC needs to walk through lots of C++ data, some of which may have specific requirements for destruction/finalization).
I hope it's better this time :)
The JS/C++ interoperability is why V8 eventually added a C++ GC.
I know a thing or two about VMs. Reading this post, I thought to myself "No way it was this easy. No performance hit in the single threaded case? No way".
I was right. Buried in the middle of the post is this tidbit:
> v1 collects synchronous and stop-the-world
Ah, there it is! I knew it!
Parallel garbage collection is a very hard problem. Years of experience and subtle implementation are required to get something like ZGC. A stop-the-world garbage collector will kill tail latency in many use-cases, especially for large programs. I'd say a good GC is the hardest part of a modern VM, even harder than a good JIT: not that a JIT is easy.
Show me multi-threaded JS with generational mark, sweep, compaction, etc. running in parallel with the mutator and I'll be impressed. (The smart thing would be to base it on the JVM or CLR. Doesn't count though.)
It's all so exhausting, this current programmer culture of doing the easy part of a system thing X and presenting your work, without qualifiers, as a complete and modern X.
Sure, sure, we can have memory safe C (just don't have any data races!). Sure, we can have an AI C compiler (just don't expect type checking). Sure, we can port SQLite to Rust (but don't expect it to be fast). Sure, you can one shot a Slack clone (just don't expect performance or security). Doing the easy part of a thing is not doing the thing! You can't trust a README's feature list these days.
To be fair, given that the README is obviously unedited LLM output, the authors might not have realized that their agents cheated and made threading easy by pessimizing the GC. The LLM certainly did though.
Now, maybe the JSC really is adaptable to a multi-threaded mutator world. If it is, great. But over and over, I've seen AI say "I will defer and charter $HARD_THING" and mean "I have no idea how to do $HARD_THING, so I'm creatively reinterpreting your request to make it easy". You have to be endlessly vigilant for LLMs subtly twisting your tasks into easy versions that might technically meet the requirements but are less complete than you intended.
In contrast, I don't know that much about VMs. But if you're making a big fundamental change to a system, I do know that it shouldn't start with a single "+279,276 -4,272" PR. It starts with a small patch with the core of the change so that everyone can understand what it does and how it works.
You don't cram everything into a single 270K line PR, even (especially) with an LLM, unless you specifically don't want anyone else to look too closely at what you did.
I will never get over the overuse of adjectives like "real" in LLM outputs, it dilutes the meaning of these words.
Related, spinning "I did something poorly" into "I am being honest"
> Scalability, measured (the honest section)
so what about the other sections?!
Counting 62 em-dashes in the PR description alone, are people reading those walls of slop anymore?
No human has ever read or will ever read the PR description.
No human has read or will ever read any of the code, nor was any human thought involved in its creation.
Everything is performative now. As long as you just keep your eyes closed and believe it all works, that's all that matters.
Course not. They have an LLM summarize it for them.
I know a ton of people absolutely hate this level of "LLM code + LLM PR description + LLM PR review" but my boss would have an orgasm if I was able to use AI half as well in our org... :/
Just stop caring about quality. It makes it 10x easier to produce slop with AI if you never bother to check
I just wrote an internal report in my company.
My conclusion from the project I'm working on is that, as of this day, there is no way to have both this so-called 20x performance improvement _and_ any kind of quality. Or security if whoever is running the agent has any token in an .env anywhere on the same file system.
We'll see in which direction the CTO takes this. My bet is not on quality.
It is sad. This is a new reality. No one reads code, it is agents all the way down. It has been long enough now that I can safely say AI has not sped up project delivery nor improved quality when it did ship.
Is it the AI or the people using it? Idk
Amazing. This is what the Typescript team should have done instead of rewriting to golang -- innovate the runtime.
That doesn't help anyone using Node. I don't want to have to start using a new runtime because my compiler is slow. That's wild.
You're already using a new runtime with tsgo -- it's golang at build time -- but still running Node in prod, so the same could work here. :-)
Agreed, I would not want all TypeScript users forced to use /this/ runtime, but if the TS team shipped tsc as "oh now it uses a special fast JS runtime" (just like tsgo is a different runtime) I'd love to at least have the option of using the same special fast runtime in my own still-written-in-TS apps.
Seems I've either struck a nerve or miscommunicated, given the insta-downvotes.