SQL: Incorrect by Construction

(chreke.com)

18 points | by ingve 3 hours ago ago

13 comments

  • chasil 16 minutes ago

    This document relies strongly upon Transact-SQL:

    https://en.wikipedia.org/wiki/Transact-SQL

    A more universal industry standard is SQL/PSM, which originated from Oracle PL/SQL:

    https://en.wikipedia.org/wiki/SQL/PSM

    Demonstrating the flaws in question in the PSM standard would be more useful.

    • traderj0e 2 minutes ago

      You can in fact show the same flaws in PL/SQL or whatever else

  • traderj0e 36 minutes ago

    I've encountered this dozens of times. It's not intuitive, but this implicitly locks the row from concurrent reads, where as SELECTing first won't:

      UPDATE accounts  
      SET balance = balance - 10  
      WHERE owner = 'alice' AND balance >= 10;
    
    Another possible surprise, say two xacts do this at the same time:

      INSERT INTO foo(num) (
        SELECT 1 WHERE NOT EXISTS (
          SELECT * FROM foo WHERE num = 1
        )
      );
    
    Without a UNIQUE on num, you get num=1 twice. Of course adding UNIQUE would prevent this, but what you might not expect is UNIQUE implicitly adds a lock too. So not only do you only get num=1 once, but also both xacts are guaranteed to succeed, which in some situations is an important distinction.

    Schools teach that databases are ACID, but in most cases they aren't by default, and enabling full ACID comes with other caveats and also a large performance hit.

  • taeric an hour ago

    SQL is intended as a declarative query language. That it is not the correct tool for imperative processing of updates feels expected? And mostly fine?

    Fair that things often grow beyond their original intent.

    • traderj0e 3 minutes ago

      You run into the same issues without using the weird imperative syntax in this article.

  • gonzalohm 24 minutes ago

    If you need concurrency design your system for concurrency.

    Have a transactions table with the payer and receiver and calculate the current balance using the transactions.

    Each transaction must have a unique Id (pk)

    • grebc 7 minutes ago

      This is the answer for any serious banking/accounting software.

      Balance is calculated & stored after the fact from a known correct value.

    • traderj0e 22 minutes ago

      That is actually worse, I've been there. It's good to keep logs like that, but you can't use that for locking, you need a separate balances table.

      Edit: Well another option is to add a "pending" col and do three separate db xacts: 1. insert pending=true row 2. select balance with pending debits deducted (which ages out pending rows older than 1min) 3. update row to pending=false if successful. This is a useful pattern if you're waiting on an external system too, but not good in this case where you're just trying to update in one DB.

      • cozzyd 10 minutes ago

        your goal is to find if there is any combination of plausible transaction orders that results in a balance less than 0, so you can issue an overdraft fee.

        • traderj0e 6 minutes ago

          The original stated goal is that we want to disallow overdraft. If you want to allow it instead, then there are some followup questions like do you want to limit how much they could overdraft. But this is meant to be an example of race conditions, not a real world bank.

  • giancarlostoro 35 minutes ago

    This assumes you don't do any sort of caching or use distributed systems that can cache the data and choose to hold off to write it all to the DB. The cached system can show both users the in-process transactions as well.

    • traderj0e 29 minutes ago

      That introduces more questions, like are cache reads fully consistent

  • selimthegrim an hour ago

    Wonder what the [check-constraints] part meant or if it's a placeholder.