13 points | by chenxi9649 16 hours ago
3 comments
Previous discussion from September, before they had distributed processing: https://news.ycombinator.com/item?id=41496033
GitHub repo: https://github.com/lakehq/sail
A few interesting notes:
- Their benchmarks show it running 4x faster than Spark on TPC-H, with a 94% cost reduction.
- Currently at 65.7% PySpark test compatibility (they talk about this in more detail in the post).
- Built in Rust, using the Tokio runtime and Arrow IPC for high performance.
- Already supports 79 of the 99 TPC-DS queries.
Also, there are some discussions on Reddit from yesterday/today:
https://www.reddit.com/r/dataengineering/comments/1gv840u/in...
https://www.reddit.com/r/rust/comments/1gwayz6/introducing_d...