Just a small comparison, compiled for release:
Boa: 23M
Brimstone: 6.3M
I don't know if closing the gap on features with Boa and hardening for production use will also bloat the binary size. Regardless, passing 97% of the spec at this size is pretty impressive.
It looks like Boa has Unicode tables compiled inside of itself: https://github.com/boa-dev/boa/tree/main/core/icu_provider
Brimstone does not appear to.
That covers the vast bulk of the difference. The ICU data is about 10.7MB in the source (boa/core/icu_provider) and may grow or shrink by some amount when compiled.
I'm not saying it's all the difference, just the bulk.
There are a few reasons why svelte little executables with small library backings aren't possible anymore, and it isn't just ambient undefined "bloat". Unicode is a big one. Correct handling of Unicode involves megabytes of tables and data that have to live somewhere, whether it's a linked library, compiled in, tables on disk, whatever. If a program touches text and needs to handle it correctly rather than just passing it through, there's a minimum size for that now.
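To make the "tables have to live somewhere" point concrete, here is a small sketch using only Rust's standard library, which already bundles a subset of the Unicode data (case mappings and character classes). Normalization and collation are not in `std`, which is where something like the ICU data comes in. The function names are made up for illustration.

```rust
/// Uppercases text using the Unicode case mappings bundled with `std`.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

/// Counts Unicode scalar values rather than bytes.
fn scalar_count(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    // U+00DF (sharp s) has no single-character uppercase form, so the
    // uppercased string has more characters than the input.
    println!("{}", shout("stra\u{df}e"));

    // U+00E9 is one scalar value but two bytes in UTF-8.
    let e_acute = "\u{e9}";
    println!(
        "{} bytes, {} scalar value(s)",
        e_acute.len(),
        scalar_count(e_acute)
    );

    // Classification is table-driven too: Greek capital omega is a letter.
    println!("{}", '\u{3a9}'.is_alphabetic());
}
```

None of this can be done with ASCII-style byte arithmetic, so even this much has to be backed by lookup data somewhere in the binary.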
Is that with any other size optimizations? I think the default release settings are mostly tuned for performance, not binary size, so it might be worth checking whether the results differ if you change things like codegen-units=1, removing panic handling, etc.
Stripping can save a huge amount of binary size. There’s lots of formatting code added for println! and family, stacktrace printing, etc. However, you lose those niceties if you strip at that level.
"Compacting garbage collector, written in very unsafe Rust" got me cracking.
Sorry for going off-topic, but I really miss the cracktros. Imagine having an Ikari intro before you boot into your OS.
Sorry for also being off-topic, but "cracking" in this case most likely refers to cracking [with laughter].
Could you compare it with Boa? It is written in Rust too.
https://github.com/boa-dev/boa
I have some benchmark results here: https://ivankra.github.io/javascript-zoo/?v8=true
It's impressively compliant, considering it's just a one-man project! Almost as fully featured as Boa, plus or minus a few things. And generally faster too, almost double the speed of Boa on some benchmarks.
Surprised at the lack of a license though.
Interesting. Hermes and QuickJS both come out looking very good in these (in terms of performance vs. binary size)
This is the first Rust project I’ve wanted to use, mostly because they say GC uses very unsafe Rust. That’s a sign that they’re using Rust effectively IMO, as who in the fuck would write a garbage collector using garbage collected Rust?
> who in the fuck would write a garbage collector using garbage collected Rust?
Rust is not garbage collected unless you explicitly opt into using Rc/Arc.
I still wouldn't call it GC in that case. It's pretty much exactly the same as std::shared_ptr in C++, and we don't usually call that GC. I don't know about the academic definition, but I draw the line at a cycle collector. (So e.g. Python is GC'd, but Rust/C++/Swift are not.)
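A minimal sketch of why the cycle collector is a reasonable place to draw that line: plain reference counting with `Rc` never frees a cycle of strong references, and the usual fix is to break the cycle by hand with `Weak`. The `Node` type and function names here are hypothetical, just for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A node that can point at another node, either strongly or weakly.
struct Node {
    strong_next: RefCell<Option<Rc<Node>>>,
    weak_next: RefCell<Option<Weak<Node>>>,
}

impl Node {
    fn new() -> Rc<Node> {
        Rc::new(Node {
            strong_next: RefCell::new(None),
            weak_next: RefCell::new(None),
        })
    }
}

/// Builds a two-node cycle out of strong references, drops both handles,
/// and reports whether the first node is still alive afterwards.
fn strong_cycle_leaks() -> bool {
    let a = Node::new();
    let b = Node::new();
    *a.strong_next.borrow_mut() = Some(Rc::clone(&b));
    *b.strong_next.borrow_mut() = Some(Rc::clone(&a));
    let probe = Rc::downgrade(&a);
    drop(a);
    drop(b);
    // Each node still holds a strong reference to the other, so neither
    // count ever reaches zero.
    probe.strong_count() > 0
}

/// Same shape, but the back edge is a `Weak`, so there is no strong cycle.
fn weak_cycle_leaks() -> bool {
    let a = Node::new();
    let b = Node::new();
    *a.strong_next.borrow_mut() = Some(Rc::clone(&b));
    *b.weak_next.borrow_mut() = Some(Rc::downgrade(&a));
    let probe = Rc::downgrade(&a);
    drop(a);
    drop(b);
    probe.strong_count() > 0
}

fn main() {
    println!("strong cycle leaked: {}", strong_cycle_leaks());
    println!("weak back edge leaked: {}", weak_cycle_leaks());
}
```

A tracing or cycle-collecting GC would reclaim the first case on its own. With `Rc` it is the programmer's job to avoid it.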
Rust is not garbage collected though.
Yes, but safe Rust enforces strict borrow checking with tracing, reference counting, etc. which would be inefficient for GC implementation.
What tracing?
Memory safety is one of Rust’s biggest selling points. It’s a bit baffling that this engine would choose to implement unsafe garbage collection.
The obvious use-case for unsafe is to implement alternative memory regimes that don’t exist in Rust already, so you can write safe abstractions over them.
Rust doesn’t have the kind of high performance garbage collection you’d want for this, so starting with unsafe makes perfect sense to me. Hopefully they keep the unsafe layer small to minimise mistakes, but it seems reasonable to me.
I'm curious if it can be done in Rust entirely though. Maybe some assembly instructions are required e.g. for trapping or setting memory fences.
If it comes to it then Rust has excellent support for inline assembly.
But how well does it play with memory fences?
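On the fences question, a minimal sketch: ordinary memory fences don't need inline assembly at all, since the standard library exposes them as `std::sync::atomic::fence`. Whether a GC like this one needs fences is not something the thread establishes. The function name below is hypothetical, and the example only shows the classic publish/consume pattern with relaxed atomics plus explicit fences.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a value from one thread to another using relaxed atomics,
/// with explicit release/acquire fences providing the ordering.
fn publish_with_fences() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed);
            // Release fence: the write to `data` cannot be reordered after
            // the write to `ready` below.
            fence(Ordering::Release);
            ready.store(true, Ordering::Relaxed);
        })
    };

    // Spin until the producer raises the flag.
    while !ready.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    // Acquire fence: pairs with the release fence above, so the read of
    // `data` below is guaranteed to observe the producer's write.
    fence(Ordering::Acquire);
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}

fn main() {
    println!("consumer saw {}", publish_with_fences());
}
```

Anything more exotic, like architecture-specific barrier instructions, would be where `core::arch::asm!` comes in.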
Rust also has some nice language features. Even unsafe Rust doesn't have the huge "undefined behaviour" surface that languages like C++ still contain.
If I were to write a toy JS runtime in Rust, I'd try to make it as safe as possible and deal with unsafe only when optimization starts to become necessary, but it's not like that's the only way to use Rust.
That’s the philosophy. Use the less constrained (but still somewhat constrained and borrow checked) unsafe to wrap/build the low level stuff, and expose a safe public API. That way you limit the exposure of human errors in unsafe code to a few key parts that can be well understood and tested.
The whole point of unsafe is to be able to circumvent the guardrails where the developer knows something the compiler isn't (yet) smart enough to understand. It's likely that implementing a high-performance GC runs afoul of quite a few of those edge cases.
Even using something as simple as Vec means using `unsafe` code (from the std library). The idea isn't to have no `unsafe` code (which is impossible). It's to limit it to small sections of your code that are much more easily verifiable.
For some use cases, that means that "user code" can have no `unsafe`. But implementing a GC is very much not one of those.
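As a rough illustration of the "small unsafe core, safe public API" idea discussed above, here is a sketch of a fixed-capacity stack. `FixedStack` is a made-up type, not anything from Brimstone or Boa, and a real GC heap is far more involved. The point is only that the single `unsafe` block sits behind an invariant the type maintains itself, so callers never write `unsafe`.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity stack. Invariant: slots `0..len` are
/// initialized, slots `len..N` are not.
pub struct FixedStack<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedStack<T, N> {
    pub fn new() -> Self {
        Self {
            slots: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Pushes a value, handing it back if the stack is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    /// Pops the most recently pushed value, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: the slot at the old `len - 1` was initialized by `push`,
        // and `len` was decremented first, so it is never read twice.
        Some(unsafe { self.slots[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for FixedStack<T, N> {
    fn drop(&mut self) {
        // Drop the remaining initialized values; uninitialized slots are
        // left alone.
        while self.pop().is_some() {}
    }
}

fn main() {
    let mut stack: FixedStack<String, 2> = FixedStack::new();
    stack.push("a".to_string()).unwrap();
    stack.push("b".to_string()).unwrap();
    println!("full stack rejects: {:?}", stack.push("c".to_string()));
    println!("popped: {:?}", stack.pop());
}
```

The only thing that needs auditing is the one `SAFETY` comment and the invariant on `len`. Everything outside the type is ordinary safe Rust.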