It looks like you want to make money off this file format? That seems difficult. You would need to build a product around it first. I suppose some kind of a search or observability company could get funded if you have a demo. But be warned that running a company involves a lot more than developing a secret sauce.
The easiest thing is to popularize it and get a well-paying job from your fame. Make some friends and start your company together.
It doesn't exactly inspire confidence observing that the .see "archive" included in the zip distribution apparently gets further compressed by more than 2:1 within the zip archive....
Happy to answer design details (page layout, Bloom tuning, codec selection, failure modes).
Minimal Python examples for exists(key) and positions(key) are in the repo.
If anyone needs deeper materials (reproducible FULL benches, wheel artifacts, and design notes) we have an NDA-gated VDR; I can share the form on request.
Congrats on the release. The SEE approach—schema-aware delta, dictionaries, PageDir, and tuned Bloom filters—seems thoughtfully engineered. The tradeoff versus pure zstd makes sense if selective probes dominate TCO. I’ll try the quick demo; curious about failure modes and Bloom tuning across varied schemas.
It looks like you want to make money off this file format? That seems difficult. You would need to build a product around it first. I suppose some kind of a search or observability company could get funded if you have a demo. But be warned that running a company involves a lot more than developing a secret sauce.
The easiest thing is to popularize it and get a well-paying job from your fame. Make some friends and start your company together.
From OP's Github: "I am a 20-year-old university student living in Japan. Although I'm a liberal arts major, I aspire to become an engineer."
Just FYI - this is most likely vibe coding that a sycophantic AI has persuaded OP is cutting edge research.
It doesn't exactly inspire confidence observing that the .see "archive" included in the zip distribution apparently gets further compressed by more than 2:1 within the zip archive....
Happy to answer design details (page layout, Bloom tuning, codec selection, failure modes). Minimal Python examples for exists(key) and positions(key) are in the repo. If anyone needs deeper materials (reproducible FULL benches, wheel artifacts, and design notes) we have an NDA-gated VDR; I can share the form on request.
Congrats on the release. The SEE approach—schema-aware delta, dictionaries, PageDir, and tuned Bloom filters—seems thoughtfully engineered. The tradeoff versus pure zstd makes sense if selective probes dominate TCO. I’ll try the quick demo; curious about failure modes and Bloom tuning across varied schemas.
“Millisecond lookups” sounds funny when you work in game dev. Anyway, interesting idea, thanks for sharing. Where the code at, though?