Optimizing Top K in Postgres

(paradedb.com)

46 points | by philippemnoel a day ago ago

4 comments

  • jmgimeno 26 minutes ago

    Maybe I'm wrong, but for this query:

    SELECT * FROM benchmark_logs WHERE severity < 3 ORDER BY timestamp DESC LIMIT 10;

    this index

    CREATE INDEX ON benchmark_logs (severity, timestamp);

    cannot be used as proposed: "Postgres can jump directly to the portion of the tree matching severity < 3 and then walk the timestamps in descending order to get the top K rows."

    Postgres with this index can walk to a part of the tree with severity < 3, but timestamps are sorted only for the same severity.

  • davidelettieri an hour ago

    The "But Wait, We Need Filters Too" paragraph mentions "US" filter which is introduced only later on.

    • GrayShade 37 minutes ago

      And footnote 3 is unreferenced.

  • tacone 18 minutes ago

    The issue here is the row based format. You simply can't filter on arbitrary columns with that. Either use an external warehouse or a columnar plug-in like Timescale.