1 comment

  • tingfirst 5 hours ago

    All streaming processors face the same fundamental problem:

    Streaming joins require maintaining state for both sides of the join

    High-cardinality data (millions of unique keys) means huge state sizes

    Traditional approach: keep everything in memory, which eventually exhausts available memory
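To make the state problem concrete, here is a minimal sketch of a purely in-memory streaming inner join. It assumes a toy symmetric hash join; the class and method names (`SymmetricHashJoin`, `on_event`, `rows_held`) are made up for illustration and are not taken from any of the systems discussed here.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Toy in-memory streaming inner join (illustrative only).

    Every row from both sides is retained forever, because a matching
    row from the other side may still arrive later.
    """

    def __init__(self):
        # One hash table per side, keyed by join key.
        self.state = {"left": defaultdict(list), "right": defaultdict(list)}

    def on_event(self, side, key, row):
        """Store the row, then return the (left, right) pairs it produces."""
        other = "right" if side == "left" else "left"
        self.state[side][key].append(row)
        # .get() avoids creating an empty entry for keys never seen on the other side.
        matches = self.state[other].get(key, [])
        if side == "left":
            return [(row, m) for m in matches]
        return [(m, row) for m in matches]

    def rows_held(self):
        """Total number of rows kept in memory across both sides."""
        return sum(len(rows) for table in self.state.values() for rows in table.values())
```

Because nothing is ever evicted, `rows_held()` grows with every event, and with millions of unique keys the two hash tables become the dominant memory cost.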

    The high-cardinality join memory problem isn't unique to Timeplus, and other engines tackle it in different ways. Apache Flink can keep join state in a disk-backed RocksDB state backend instead of holding it all on the heap, Materialize shares indexed state across multiple queries (but still requires keeping full datasets in memory), and RisingWave stores state in cloud object storage (S3/GCS) with LRU caching for hot data. What makes Timeplus different is its purpose-built optimization for the Pareto Principle, where a tiny fraction of the data generates the vast majority of the activity: it keeps hot data in memory and cold data on disk, which yields large memory savings.
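The hot/cold idea can be sketched roughly as follows. This is not how Timeplus (or any other engine mentioned above) implements it; it is a minimal illustration assuming an LRU policy, with a plain dict standing in for the on-disk tier. `TieredState` and `hot_capacity` are hypothetical names.

```python
from collections import OrderedDict


class TieredState:
    """Toy two-tier key/value state: a bounded hot tier plus a cold tier.

    Assumption: `cold` is any dict-like object. A real system would back
    it with disk or object storage; here a dict stands in for it.
    """

    def __init__(self, hot_capacity, cold=None):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # least recently used key first
        self.cold = {} if cold is None else cold

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # mark as recently used
            return self.hot[key]
        if key in self.cold:
            value = self.cold.pop(key)  # promote a cold key on access
            self._put_hot(key, value)
            return value
        return None

    def put(self, key, value):
        self.cold.pop(key, None)  # keep a key in one tier only
        self._put_hot(key, value)

    def _put_hot(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        # Demote least recently used keys once the hot tier is over capacity.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value
```

With a skewed (Pareto-like) key distribution, most lookups hit the small hot tier, so memory is bounded by `hot_capacity` rather than by the total number of keys.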