We had a case where a single OR was a massive performance problem on MSSQL, but not at all on the Sybase SQLAnywhere we're migrating away from, which one might consider slightly ironic given the origins of MSSQL...
Anyway, the solution was to manually rewrite the query as a UNION ALL of the two cases, which was fast on both. I'm still annoyed, though, that MSSQL couldn't just have done that for me.
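Roughly what that rewrite looks like, as a sketch with a hypothetical table and columns (not our actual schema); the second branch excludes rows the first already returned, so UNION ALL doesn't double-count rows matching both predicates:

    -- original form, the one that was slow on MSSQL for us
    SELECT * FROM orders WHERE customer_id = @id OR salesperson_id = @id;

    -- manual rewrite: each branch can use its own index
    SELECT * FROM orders WHERE customer_id = @id
    UNION ALL
    SELECT * FROM orders WHERE salesperson_id = @id
      AND (customer_id <> @id OR customer_id IS NULL);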
I am very grateful for databases but I have so many stories of having to manhandle them into doing what seems like it should be obvious to a reasonable query optimizer. Writing one must be very hard.
disjunctions and stats could be a pretty nasty combination in mssql. i think it got a bit better ca. 2016 with the CE updates, but i’ve had quite a few occurrences where the solution was the union all approach
If optimization is really as simple as applying De Morgan's laws, surely it could be done within the query planner if that really is the main optimization switch? Or am I misreading this somehow?
Edit: I guess the main difference is that it's just calculating separate sets and then merging them, which isn't really De Morgan's, but a different way of computing the result.
In most cases, the end result is a lot messier than it looks in this minimal example. Consider a case where the query is a 12-way join with a large IN list somewhere (which is essentially an OR); would it still look as attractive to duplicate that 12-way join a thousand times?
There _are_ optimizers that try to represent this kind of AND/OR expression as sets of distinct ranges in order to at least be able to consider whether a thousand different index scans are worth it or not, with rather limited success (it doesn't integrate all that well when you get more than one table, and getting cardinalities for that many index scans can be pretty expensive).
SQLite3 turns ORs into UNIONs.
The core issue the article is pointing to is that most database indexes are B-trees, so if you have a predicate of the form (col_a = 'foo' OR col_b = 'foo'), then it is impossible to use a single B-tree lookup to find all rows that match the predicate. You'd have to do two lookups and then merge the sets. Some query optimizers can do that, or at least things that are similar in spirit (e.g. Postgres bitmap index scan), but it's much more expensive than a regular index lookup.
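A minimal sketch of that situation, with a hypothetical table t and columns col_a/col_b: neither single-column index can answer the OR on its own, so the planner has to combine two scans (e.g. via BitmapOr in Postgres).

    -- hypothetical table and indexes
    CREATE INDEX idx_t_col_a ON t (col_a);
    CREATE INDEX idx_t_col_b ON t (col_b);

    -- may be answered by two index scans merged together,
    -- rather than one B-tree lookup
    SELECT count(*) FROM t WHERE col_a = 'foo' OR col_b = 'foo';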
> but it's much more expensive than a regular index lookup.
It doesn't have to be, it just "is" in some database engines for various historical reasons.
For example, PostgreSQL 18 is the first version to support "B-tree Skip Scan" operators: https://neon.com/postgresql/postgresql-18/skip-scan-btree
Other database engines are capable of this kind of thing to various degrees.
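If I understand skip scan correctly, the classic case is a multi-column index whose leading column isn't in the predicate; a rough sketch with hypothetical names:

    -- hypothetical two-column index
    CREATE INDEX idx_t_a_b ON t (col_a, col_b);

    -- without skip scan this filter can't use idx_t_a_b effectively;
    -- with it, the engine iterates over the distinct values of col_a
    -- and does a ranged lookup on col_b within each one
    SELECT * FROM t WHERE col_b = 'foo';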
I'm not seeing De Morgan. I am seeing inclusion-exclusion, which is just a neat trick all round. Highly recommend remembering it.
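For count queries, the inclusion-exclusion identity lets you answer an OR using only AND-style filters (hypothetical table/columns again):

    -- count(A OR B) = count(A) + count(B) - count(A AND B)
    SELECT (SELECT count(*) FROM t WHERE col_a = 'foo')
         + (SELECT count(*) FROM t WHERE col_b = 'foo')
         - (SELECT count(*) FROM t WHERE col_a = 'foo' AND col_b = 'foo')
           AS or_count;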
I imagine negative filters to be a bit inefficient as well, though maybe not for a simple count.
The query optimizer knows how many items are in each index, but has no advance idea how many items will be in the result of a JOIN. An "a OR b" query on a table with millions of rows might have three hits on A, or millions of hits. The optimal query strategy for the two cases is very different.
Has anyone put machine learning in an SQL query optimizer yet?
> Has anyone put machine learning in an SQL query optimizer yet?
Yes, I think everyone has? At the very least I know that MSSQL has, because we semi-regularly run into problems with it :).
MSSQL keeps track of query statistics and uses those in future planning. SOMETIMES it just so happens that the optimization for the general case makes the outlier 100x slower, which kills overall performance.
For a while I was maintaining software that supported both MSSQL and PGSQL, and I found that, when comparing like-for-like without DB-specific tuning, MSSQL produced better query plans on average. On a database without write contention, I'd often see 30% better performance on MSSQL.
However, it was also much more likely to hit an AWFUL pathological case which completely wrecked the performance as you describe. Combined with pessimistic locking, we ended up with far more overnight support calls from the MSSQL backend than from the PGSQL backend. Usually because it suddenly decided to switch query plan at 1AM.
I wonder if there's a trade-off where an optimizer produces better average query plans but worse outliers.
At 100x, it seems like you could run both optimal strategies every time, let them race, and still come out way ahead.
You double (or more) the IO and potentially the CPU time, which is why this isn't done. It's also not always 100x; that just happens often enough. Sometimes it's only 2x or 1.5x. It's impossible to know which situation you are in, and the hard thing is that the outliers will be slow either way.
Some traditional planners try some different plans on random subsets of the data to see which plan works best. Don't need machine learning, just Bayes' rule.
> The query optimizer knows how many items are in each index, but has no advance idea how many items will be in the result of a JOIN.
Query optimizers definitely try to estimate cardinalities of joins. It's a really, really hard problem, but the typical estimate is _much_ better than “eh, no idea”.
There are many papers on ML for query planners. You can search for "Learned Query Optimization". Some use ML just for the cardinality estimation.
I think having a way to build statistics on the join itself would be helpful for this. Similar to how extended statistics^1 can help when column distributions aren't independent of each other.
But this may require some basic materialized views, which postgres doesn't really have.
[1]: https://www.postgresql.org/docs/current/planner-stats.html#P...
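For reference, Postgres extended statistics (the footnote above) are created per table and picked up on the next ANALYZE; a sketch with a hypothetical table and columns:

    -- tell the planner that city and zip are correlated rather than independent
    CREATE STATISTICS addr_city_zip_stats (dependencies, ndistinct)
      ON city, zip FROM addresses;
    ANALYZE addresses;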
could you elaborate on pg not really having matviews?
Materialized views in Postgres don't update incrementally as the data in the relevant tables updates.^1
In order to keep it up to date, the developer has to tell postgres to refresh the data and postgres will do all the work from scratch.
Incremental Materialized views are _hard_. This^2 article goes through how Materialize does it.
MSSQL does it really well from what I understand. They only have a few restrictions, though I've never used a MSSQL materialized view in production.^3
[1]: https://www.postgresql.org/docs/current/rules-materializedvi... [2]: https://www.scattered-thoughts.net/writing/materialize-decor... [3]: https://learn.microsoft.com/en-us/sql/t-sql/statements/creat...
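A minimal illustration of that refresh-from-scratch point, with hypothetical names; CONCURRENTLY keeps readers unblocked (it needs a unique index on the view) but still recomputes the whole query:

    CREATE MATERIALIZED VIEW customer_totals AS
      SELECT customer_id, sum(amount) AS total
      FROM orders
      GROUP BY customer_id;

    CREATE UNIQUE INDEX ON customer_totals (customer_id);

    -- not incremental: the defining query is re-run in full on every refresh
    REFRESH MATERIALIZED VIEW CONCURRENTLY customer_totals;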
I really like the extension pattern. I wish more of the tables at my company used it.
Another huge benefit that we're realizing as we move some of our heaviest tables to this pattern is that it makes it really easy to index a core set of fields in Elasticsearch, along with a tag to associate it with a particular domain model. This has drastically cut down on our search-specific denormalization and let us avoid expensive index schema updates.
This sort of thing is why looking at generated SQL while developing instead of just trusting the ORM to write good queries is so important.
I find query planning (and databases in general) to be very difficult to reason about, basically magic. Does anyone have some recommended reading or advice?
If you are looking to squeeze every ounce of performance from your entire application stack, I'd say you should be looking at everything your ORM produces. The ORM is basically there to speed up your developers' time to production, but most ORMs have cases where they generate terrible SQL, and you can usually run your own SQL in a stored procedure if the generated SQL is sub-optimal.

I've done this quite a few times with Microsoft's Entity Framework, but as new versions come out, it's become less common for me to have to do it. Usually I need to drop to a stored procedure for code that allows searching a large number of columns, in addition to sorting on all the columns that display. I also use stored procedures for multi-table joins with a WHERE clause when using Entity Framework. You still need to look at your generated queries, but the code is nothing like it used to be under Entity Framework on the .NET Framework, at least in my experience (YMMV). You should never just let your ORM create SQL without reviewing what it comes up with.
https://pages.cs.wisc.edu/~dbbook/ is a great overview, specifically chapters 13 and 14 on query optimization. it is difficult to reason about though, and every compiler is different. it takes time and enough examples to look at a query and have an intuition for what the plan should look like, and if there's something the compiler is not handling well.
Highly recommend https://use-the-index-luke.com/
It's very readable - I always ask new hires and interns to read it.
This website taught me a ton, even after I thought I knew more than enough about performance. Just seeing how different databases plan and execute the same SQL is a huge boon (and sometimes extremely surprising when comparing one DBMS to another).
It's a big help if you know how to retrieve and interpret execution plans for the database you use.
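In Postgres that means EXPLAIN; the ANALYZE and BUFFERS options show actual row counts and I/O, which is usually where the surprises are (SQL Server has SET STATISTICS IO ON and the actual execution plan for the same job). The query below is just a placeholder:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM t WHERE col_a = 'foo' OR col_b = 'foo';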
Yes, I was going to say, seeing the generated SQL can be almost useless without also checking the execution plan.
When you have a solid view of the schema and data sizes you can start to be more predictive about what your code will actually do, THEN you can layer on the complexity of the ORM hell code.
Query planners were considered "AI", at least among some folks, back in the day, just FYI.
I strongly dislike the way the problem is presented and the “solution” is promoted. The author mentions a merge join with a count on top, and if the database supports index merges, it can be extremely efficient in the described scenario. There are a lot of real optimizations that can be baked into such merges that the author chooses to ignore.
Generalized guidance without even mentioning the database server as a baseline, without showing plans and/or checking whether the database can be guided, is just bad teaching.
It looks like the author discovered a trick and tries to hammer everything into it by overgeneralizing.
For sure, there are definitely a lot of cool techniques (and I'm not aware of all of them)! And the first example is very much contrived to keep it small.
I'm not super familiar with the term index merge - this seems to be the term for a BitmapOr/BitmapAnd?
Is there another optimization I'm missing?
The article links to my code for my timings here: https://github.com/ethan-seal/ors_expensive
There is an optimization that went in the new release of PostgreSQL I'm excited about that may affect this - I'm not sure. See https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...
> I'm not super familiar with the term index merge - this seems to be the term for a BitmapOr/BitmapAnd?
Different databases will use similar terms for different operations, but I would guess that the comment refers to something similar to MySQL's index merge (which is essentially reading the row IDs of all the relevant ranges, then deduplicating them, then doing the final scan; it's similar to but less flexible than Postgres' BitmapOr).
Cool. I'll have to read up on that.
As an aside, MySQL will optimize WHERE IN better than OR, assuming that the predicates are all constants, and not JSON. Specifically, it sorts the IN array and uses binary search.
That said, I’m not sure if it would have any impact on this specific query; I’d need to test.
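In other words (hypothetical column and values), these are logically equivalent, but the sorted-array binary search only kicks in for the IN form:

    -- constant IN list: MySQL can sort the values and binary-search them
    SELECT * FROM t WHERE status IN ('new', 'queued', 'running');

    -- equivalent OR chain over the same column
    SELECT * FROM t
    WHERE status = 'new' OR status = 'queued' OR status = 'running';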
The article is specifically discussing cases where you have predicates on different columns OR'ed together, like col_a = 'foo' OR col_b = 'foo'.
Oof, misread that.
Can we agree that this only applies to queries where all the filter conditions use columns with indexes? If no indexes can be used, a single full table scan with OR surely is faster than multiple full table scans.
Absolutely. Though I don't recall seeing multiple sequential scans without a self-join or subquery. A basic filter within a sequential scan/loop is the most naive/simplest way of performing queries like these, so postgres falls back to that. Also, fwiw, BitmapOr is only used with indexes: https://pganalyze.com/docs/explain/other-nodes/bitmap-or.
That was the extreme case - the multi-scan is what a casual reader would get if they tried your (neat, by all means) AND-only query on a non-indexed table (or a partially indexed one, for that matter).
Gotcha, I misunderstood your comment. The multiple-counts version is definitely a very contrived example to demonstrate the overhead of BitmapOr and the general risk of sequential scans.
The OR query certainly states user intent better than the rewrite, so it's better at communicating with a future programmer.
So add a comment. “X is easier to understand” shouldn’t mean taking a 100x performance hit.
Yeah, just watch Community. Every OR splits the universe, duh.
If ORs are expensive, you've probably got a poor table design, or the ORs are for some form of nasty query that is more UI/dashboard based.
A good table has the right indexes, and a good API to deal with the table uses the RIGHT indexes in its criteria to get a good result.