Sail 0.2: Spark replacement in Rust, runs 4x faster, drop-in PySpark compatible

  • Previous discussion from September, before they had distributed processing: https://news.ycombinator.com/item?id=41496033

    GitHub repo: https://github.com/lakehq/sail

    A few interesting notes:

    - Their benchmarks show it running 4x faster than Spark on TPC-H, with a 94% cost reduction.

    - Currently at 65.7% PySpark test compatibility (they talk about this in more detail in the post)

    - Built in Rust, using the Tokio runtime and Arrow IPC for high performance

    - Already supports 79/99 TPC-DS queries
