It's not quite clear to me why you'd use this over something more established such as protobuf, Thrift, FlatBuffers, Cap'n Proto, etc.
Cool project.
The viewer is cool, but it took me a while to find the link to it. Maybe add a link in the README next to the screenshot.
This is really interesting. At first glance, I was tempted to say "why not just use sqlite with JSON fields as the transfer format?" But everything about that would be heavier-weight in every possible way - and if I'm reading things right, this handles nested data that might itself be massive. This is really elegant.
My one eyebrow raise is - is there no binary format specification? https://github.com/creationix/rx/blob/main/rx.ts#L1109 is pretty well commented, but you can't call it a JSON alternative without having some kind of equivalent to https://www.json.org/ in all its flowchart glory!
JSON is human-readable, so why even compare it with this? Is any serialization format now just a "JSON alternative"?
- this encodes to ASCII text (unless your strings contain Unicode themselves)
- that means you can copy-paste it (good luck doing that with compressed JSON or CBOR or SQLite)
- there is a scale where JSON isn't human-readable anymore. I've seen files that are 100+MB of minified JSON all on a single very long line. No human is reading that without using some tooling.
That kind of feels like the worst of both worlds. None of the space savings/efficiency of binary, but also no human readability.
Being able to copy/paste a serialization format is not really a feature I think I would care about.
cat file.whatever | whatever2json | jq ?
(Or to avoid using cat to read, whatever2json file.whatever | jq)
Or in this case, just do `rx file.rx`. It has jq-like queries built in and supports inputs in either rx or JSON. Also, if you prefer jq, you can do `rx file.rx | jq`.
You shouldn't be using JSON for things that'd have performance implications.
As with most things in engineering, it depends. There are real logistical costs to using binary formats. This format is almost as compact as a binary format while still retaining all the nice qualities of being an ASCII-friendly encoding (you can embed it anywhere strings are allowed, including copy-paste workflows).
Think of it as a hybrid between JSON, SQLite, and generic compression. This format really excels for use cases where large read-only build artifacts are queried by worker nodes like an embedded database.
I agree in principle. However, JSON tooling has also gotten so good that other formats, when not optimized and held correctly, can be worse than JSON. For example, IME stock protocol buffers can be worse than a well-optimized JSON library (as much as it pains me to say this).
Can you imagine if a service as chatty and performance sensitive as Discord used JSON for their entire API surface?
Very cool stuff!
This did catch my eye, however: https://github.com/creationix/rx?tab=readme-ov-file#proxy-be...
While this is a neat feature, it means this is not in fact a drop-in replacement for JSON.parse, as you will be breaking any code that relies on that result being a mutable object.
True, the particular use case where this really shines is large datasets where typical usage is to read a tiny part of them. Also, there is no reason you couldn't write an rx parser that creates normal mutable objects. It could even be a hybrid one that is lazily parsed until you want to turn it mutable, and then does a normal parse to normal objects after that point.
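To make the hybrid idea above concrete, here is a rough TypeScript sketch. It is not the rx API: `LazySource`, `hybridObject`, and `countingSource` are made-up names, and the "lazy decoder" is just a plain object with a read counter standing in for a real one. Reads go straight to the source one key at a time, and the first write copies everything into a normal mutable object that is used from then on.

```typescript
// Hypothetical stand-in for a lazily decoded object. NOT the rx API.
interface LazySource {
  keys(): string[];
  get(key: string): unknown;
}

// Returns an object that reads lazily until the first write, then
// materializes into a plain mutable object and uses that afterwards.
function hybridObject(source: LazySource): Record<string, unknown> {
  let materialized: Record<string, unknown> | null = null;

  const materialize = (): Record<string, unknown> => {
    if (materialized === null) {
      materialized = {};
      for (const key of source.keys()) materialized[key] = source.get(key);
    }
    return materialized;
  };

  return new Proxy({} as Record<string, unknown>, {
    get(_target, prop) {
      if (typeof prop !== "string") return undefined;
      return materialized !== null ? materialized[prop] : source.get(prop);
    },
    set(_target, prop, value) {
      if (typeof prop !== "string") return false;
      materialize()[prop] = value; // first write triggers the full copy
      return true;
    },
    ownKeys() {
      return materialized !== null ? Object.keys(materialized) : source.keys();
    },
    getOwnPropertyDescriptor(_target, prop) {
      if (typeof prop !== "string") return undefined;
      const exists =
        materialized !== null ? prop in materialized : source.keys().includes(prop);
      if (!exists) return undefined;
      const value = materialized !== null ? materialized[prop] : source.get(prop);
      return { value, writable: true, enumerable: true, configurable: true };
    },
  });
}

// Demo source that records which keys were actually read.
function countingSource(data: Record<string, unknown>) {
  const reads: string[] = [];
  const source: LazySource = {
    keys: () => Object.keys(data),
    get: (key) => {
      reads.push(key);
      return data[key];
    },
  };
  return { source, reads };
}

const demo = countingSource({ version: 1, name: "demo", payload: [1, 2, 3] });
const doc = hybridObject(demo.source);
console.log(doc.version, demo.reads); // only "version" has been read so far
doc.name = "changed"; // from here on, doc behaves like a normal object
console.log(JSON.stringify(doc));
```

The trade-off is that the first write pays for a full decode, so this only helps code that mostly reads and occasionally mutates.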
A new random-access JSON alternative from the creator of nvm.sh, luvit.io, and js-git.
Interesting. I've heard about cursors in reference to a Rust library that was mentioned as being similar to protobuf and Cap'n Proto.
Does this duplicate the name of keys? Say if you have a thousand plain objects in an array, each with a "version" key, would the string "version" be duplicated a thousand times?
Another project a lot of people aren't aware of even though they've benefitted from it indirectly is the binary format for OpenStreetMap. It allows reading the data without loading a lot of it into memory, and is a lot faster than using sqlite would be.
Edit: the Rust library I remember may have been https://rkyv.org/
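On the key-duplication question: nothing in this thread says how rx handles it, so the following is only an illustration of what key interning generally looks like, not a description of the rx encoding. The names `Interned`, `internKeys`, and `restore` are invented for this sketch. Each distinct key is stored once in a table, and objects refer to keys by index.

```typescript
// Illustration of key interning in general. NOT the rx wire format.
interface Interned {
  keys: string[]; // each distinct key appears exactly once
  rows: [number, unknown][][]; // per object: [keyIndex, value] pairs
}

function internKeys(objects: Record<string, unknown>[]): Interned {
  const keys: string[] = [];
  const indexOf = new Map<string, number>();
  const rows = objects.map((obj) =>
    Object.entries(obj).map(([key, value]): [number, unknown] => {
      let index = indexOf.get(key);
      if (index === undefined) {
        // First time this key is seen: add it to the shared table.
        index = keys.length;
        keys.push(key);
        indexOf.set(key, index);
      }
      return [index, value];
    })
  );
  return { keys, rows };
}

function restore(packed: Interned): Record<string, unknown>[] {
  return packed.rows.map((row) => {
    const obj: Record<string, unknown> = {};
    for (const [index, value] of row) obj[packed.keys[index]] = value;
    return obj;
  });
}

// A thousand objects that all share the "version" key.
const items = Array.from({ length: 1000 }, (_, i) => ({ version: i }));
const packedItems = internKeys(items);
console.log(packedItems.keys); // the key table holds "version" once
console.log(JSON.stringify(items).length, JSON.stringify(packedItems).length);
```

Whether a given format does this per file, per object shape, or not at all is a design choice, so the actual answer for rx has to come from its source or author.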
I love these projects. I hope one of them someday emerges as the winner because (as it motivates all these libraries' authors) there's so much low-hanging fruit and free wins in changing the line format for JSON but keeping the "Good Parts" like the dead simple generic typing.
XML has EXI (Efficient XML Interchange) for precisely the reason of getting wins over the wire but keeping the nice human readable format at the ends.
this is more nuanced than the title suggests. worth reading the whole thing