Instant SQL for results as you type in DuckDB UI

(motherduck.com)

163 points | by ryguyrg 4 hours ago

51 comments

  • carlineng 3 hours ago

    I just watched the author of this feature and blog post give a talk at the DataCouncil conference in Oakland, and it is obvious what a huge amount of craft, ingenuity, and care went into building it. Congratulations to Hamilton and the MotherDuck team for an awesome launch!

    • ryguyrg 3 hours ago

      Woohoo! Glad you noticed that. Hamilton is amazing.

  • jakozaur an hour ago

    It would be even better if SQL had pipe syntax. SQL is amazing, but its ordering isn’t intuitive, and only CTEs provide a reliable way to preview intermediate results. With pipes, each step could clearly show intermediate outputs.

    Example:

        FROM orders
        |> WHERE order_date >= '2024-01-01'
        |> AGGREGATE SUM(order_amount) AS total_spent GROUP BY customer_id
        |> WHERE total_spent > 1000
        |> INNER JOIN customers USING(customer_id)
        |> CALL ENRICH.APOLLO(EMAIL > customers.email)
        |> AGGREGATE COUNT(*) high_value_customer GROUP BY company.country

  • ryguyrg 4 hours ago

    In DuckDB UI and MotherDuck.

    Awesome video of feature: https://youtu.be/aFDUlyeMBc8

    Disclaimer: I’m a co-founder at MotherDuck.

    • rancar2 an hour ago

      Thanks for sharing this update with the world and including it in the local UI too.

      Feature request: enable tuning of when Instant SQL is run and displayed. The erroring out with flashing updates at nearly every keystroke while expanding a query is distracting for me personally (my brain goes into troubleshooting vs. thinking mode). I like the feature (so I will keep it on by default), but I'd like to have a few modes for it depending on my working context (specifically, tuning of update frequency at separation characters [space, comma], end of statement [semicolon/newline], and injections [paste/autocomplete]).

      • hamilton an hour ago

        Great feedback! Thanks. We agree w/ the red errors. It's not helpful when it feels like your editor is screaming at you.

  • jpambrun 2 hours ago

    I really like DuckDB's notebooks for exploration, and this feature makes them even more awesome, but the fact that I can't share, export, or commit them into a git repo feels extremely limiting. It's neat-ish that it dogfoods and stores them in a DuckDB database. It even seems to store historical versions, but I can't really do anything with it.

    • hamilton 37 minutes ago

      Definitely something we want too! (I'm the author / lead for the UI)

  • XCSme 3 hours ago

    I hope this doesn't work with DELETE queries.

    • falcor84 3 hours ago

      Maybe in the next version they could also implement support for DROP, with autocorrect for the nearest (not yet dropped) table name.

      • clgeoio 3 hours ago

        LLM powered queries that run in Agent mode so it can answer questions of your data before you know what to ask.

        • XCSme 3 hours ago

          That's actually not a bad idea, to have LLM autocomplete when you write queries, especially if you first add a comment at the top saying what you want to achieve:

          -- Select all orders for users registered in the last year, and compute average earnings per user

          SELECT ...

      • krferriter 2 hours ago

        DELETED 0 rows. Did you mean `where 1=1`? (click accept to re-run with new where clause)

    • matsonj 3 hours ago

      for clarity: Instant SQL won't automatically run queries that write or delete data or metadata. It only runs queries that read data.

    • crmi 3 hours ago

      Young Bobby Tables at it again

    • worldsayshi 3 hours ago

      Can't it just run inside a transaction that isn't committed?
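
      A minimal sketch of the commenter's idea, using Python's stdlib sqlite3 in place of DuckDB (the table and rows here are invented for illustration): execute the destructive statement inside an open transaction, read the affected-row count as the "preview", then roll back so nothing persists.

```python
import sqlite3

# Sketch: run a destructive statement inside an open transaction,
# report what it *would* do, then roll it back. Uses stdlib sqlite3
# for illustration; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

cur = conn.execute("DELETE FROM users WHERE id > 0")
would_delete = cur.rowcount   # what a "preview" could report: 3
conn.rollback()               # the delete never persists

remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(would_delete, remaining)  # 3 rows would go, 3 rows still there
```

      One catch with this approach (and perhaps why Instant SQL simply refuses to run writes) is that the statement still does all its work and holds its locks until the rollback.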

    • ryguyrg 3 hours ago

      ROFL

      • codetrotter 3 hours ago

        ROFL FROM jokes WHERE thats_a_new_one;

  • ayhanfuat 3 hours ago

    CTE inspection is amazing. I spend too much time doing that manually.

    • hamilton 2 hours ago

      Me too (author of the post here). In fact, I was watching a seasoned data engineer at MotherDuck show me how they would attempt to debug a regex in a CTE. As a longtime SQL user, I felt the pain immediately; haven't we all been there before? Instant SQL followed from that.

  • hk1337 2 hours ago

    First time seeing the FROM at the top of the query, and I'm not sure how I feel about it. It seems useful, but I'm so used to select...from.

    I'm assuming it's more of a user preference, like putting commas before each field instead of after?

    • hamilton 2 hours ago

      You can use any variation of DuckDB valid syntax that you want! I prefer to put from first just because I think it's better, but Instant SQL works with traditional select __ from __ queries.

    • ltbarcly3 2 hours ago

      Yes, it comes from a desire to impose intuition from other contexts onto something instead of building intuition with that thing.

      SQL is a declarative language. The ordering of the statements was carefully thought through.

      I will say it's harmless, though. The clauses don't have any dependency in terms of meaning, so it's fine to allow them to be reordered without changing the meaning of the query; but that's true of lots and lots of things in programming, and just having a convention is usually better than allowing anything.

      For example, you could totally allow this to be legal:

        def
            for x in whatever:
                print(x)
        print_whatever(whatever):
      
      There's nothing ambiguous about it, but why? If you're used to seeing it one way, the reordering just makes it more confusing to read; and if you aren't used to seeing it the normal way, you should at least somewhat master something before you try to improve it through cosmetic tweaks.

      I think you see this all the time, people try to impose their own comfort onto things for no actual improvement.

      • whstl an hour ago

        No, it comes from wanting to make autocompletion easier and to make variable scoping/method ordering make sense within LINQ. It is an actual improvement in this regard.

        LINQ popularized it and others followed. It does what it says.

        Btw: saying that "people try to impose their own comfort" is uncalled for.

  • crazygringo an hour ago

    Edit: never mind, thanks for the replies! I had missed the part where it showed visualizing subqueries, which is what I wanted but didn't think it did. This looks very helpful indeed!

    • Noumenon72 an hour ago

      The article says it does subqueries:

      > Getting the AST is a big step forward, but we still need a way to take your cursor position in the editor and map it to a path through this AST. Otherwise, we can’t know which part of the query you're interested in previewing. So we built some simple tools that pair DuckDB’s parser with its tokenizer to enrich the parse tree, which we then use to pinpoint the start and end of all nodes, clauses, and select statements. This cursor-to-AST mapping enables us to show you a preview of exactly the SELECT statement you're working on, no matter where it appears in a complex query.
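
      The gist of that mapping can be sketched in a few lines. This is a toy walk (not MotherDuck's actual code) over a nested dict shaped loosely like DuckDB's serialized AST, returning the chain of node types whose `query_location` starts at or before the cursor offset:

```python
# Toy cursor-to-AST mapping (not MotherDuck's actual code): walk a dict
# shaped like DuckDB's serialized AST and return the path of node types
# whose "query_location" starts at or before the cursor offset.
ast = {
    "type": "SELECT_NODE",
    "query_location": 0,
    "select_list": [
        {"type": "VALUE_CONSTANT", "query_location": 7},
        {"type": "SUBQUERY", "query_location": 10,
         "subquery": {"type": "SELECT_NODE", "query_location": 11}},
    ],
}

def node_path_for_cursor(node, cursor, path=()):
    """Deepest chain of nodes whose query_location <= cursor."""
    best = path
    if isinstance(node, dict):
        loc = node.get("query_location")
        here = path + (node["type"],) if isinstance(loc, int) and loc <= cursor else path
        children = list(node.values())
    elif isinstance(node, list):
        here = path
        children = node
    else:
        return best
    if len(here) > len(best):
        best = here
    for child in children:
        candidate = node_path_for_cursor(child, cursor, here)
        if len(candidate) > len(best):
            best = candidate
    return best

# A cursor sitting inside the subquery resolves to the inner SELECT:
print(node_path_for_cursor(ast, 12))  # ('SELECT_NODE', 'SUBQUERY', 'SELECT_NODE')
```

      The real implementation described in the post also uses the tokenizer to get end positions for each node, rather than relying only on start offsets as this sketch does.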

    • geysersam an hour ago

      > What would be helpful would be to be able to visualize intermediate results -- if my cursor is inside of a subquery, show me the results of that subquery.

      But that's exactly what they show in the blog post??

    • hamilton an hour ago

      You should read the post! This is what the feature does.

  • mritchie712 3 hours ago

    A fun function in DuckDB (which I think they're using here) is `json_serialize_sql`. It returns a JSON AST of the SQL:

        SELECT json_serialize_sql('SELECT 2');
    
    
    
        [
            {
                "json_serialize_sql('SELECT 2')": {
                    "error": false,
                    "statements": [
                        {
                            "node": {
                                "type": "SELECT_NODE",
                                "modifiers": [],
                                "cte_map": {
                                    "map": []
                                },
                                "select_list": [
                                    {
                                        "class": "CONSTANT",
                                        "type": "VALUE_CONSTANT",
                                        "alias": "",
                                        "query_location": 7,
                                        "value": {
                                            "type": {
                                                "id": "INTEGER",
                                                "type_info": null
                                            },
                                            "is_null": false,
                                            "value": 2
                                        }
                                    }
                                ],
                                "from_table": {
                                    "type": "EMPTY",
                                    "alias": "",
                                    "sample": null,
                                    "query_location": 18446744073709551615
                                },
                                "where_clause": null,
                                "group_expressions": [],
                                "group_sets": [],
                                "aggregate_handling": "STANDARD_HANDLING",
                                "having": null,
                                "sample": null,
                                "qualify": null
                            },
                            "named_param_map": []
                        }
                    ]
                }
            }
        ]
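
    One way to consume that output is with nothing but Python's stdlib json module. This sketch hard-codes a trimmed copy of the result above for illustration; a real client would get the string from DuckDB itself:

```python
import json

# A trimmed copy of the json_serialize_sql output above, pasted as a
# string for illustration; a client would get this from DuckDB itself.
raw = """
{
  "error": false,
  "statements": [
    {
      "node": {
        "type": "SELECT_NODE",
        "select_list": [
          {
            "class": "CONSTANT",
            "type": "VALUE_CONSTANT",
            "query_location": 7,
            "value": {"type": {"id": "INTEGER"}, "is_null": false, "value": 2}
          }
        ]
      }
    }
  ]
}
"""

doc = json.loads(raw)
assert not doc["error"]                  # the statement parsed cleanly
node = doc["statements"][0]["node"]
constant = node["select_list"][0]
# query_location 7 is the offset of "2" in the text "SELECT 2"
print(constant["query_location"], constant["value"]["value"])  # 7 2
```
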
    • hamilton 2 hours ago

      Indeed, we are! We worked with DuckDB Labs to add the query_location information, which we're also enriching with the tokenizer to draw a path through the AST to the cursor location. I've been wanting to do this since forever, and now that we have it, there's actually a long tail of inspection / debugging / enrichment features we can add to our SQL editor.

    • krferriter 2 hours ago

      This is a very cool feature. I don't know how useful it is or how I'd use it right now but I think I am going to get into some benchmarking and performance tweaking soon and this could be handy.

  • wodenokoto 3 hours ago

    Will this be available in duckdb -ui?

    Are MotherDuck's editor features available on-prem? My understanding is that MotherDuck is a data warehouse SaaS.

    • 1egg0myegg0 3 hours ago

      It is already available in the local DuckDB UI! Let us know what you think!

      -Customer software engineer at MotherDuck

      • ukuina 2 hours ago

        Does local DuckDB UI work without an internet connection?

  • potatohead24 2 hours ago

    It's neat, but the CTE selection bit errors out more often than not and erroneously selects more than the current CTE.

    • hamilton 2 hours ago

      Can you say more? Where does it error out? Sounds like a bug; if you could post an example query, I bet we can fix that.

  • Vaslo 44 minutes ago

    I moved from pandas and SQLite to polars and DuckDB. Such an improvement in these new tools.

  • makotech221 3 hours ago

    Delete From dbo.users w...

    (129304 rows affected)

    • CurtHagenlocher 3 hours ago

      The blog specifically says that they're getting the SQL AST so presumably they would not execute something like a DELETE.

      • hamilton 2 hours ago

        Correct. We only enable fast previews for SELECT statements, which is the actual hard problem. That said, at some point we're likely to also add support for previewing a CTAS before you actually run it.

      • makotech221 2 hours ago

        Cool. Now, there's this thing called a joke...

  • ltbarcly3 2 hours ago

    This is such a bizarre feature.

    • thenaturalist 2 hours ago

      On first glance possibly, on second glance not at all.

      First, repeated analyst queries are a major usage driver in SQL databases: think iterating on a query and executing it again and again.

      Another huge factor in the same vein is running dev pipelines with limited data to validate a change works when modelling complex data.

      This is currently a front-end feature, but underneath lies effective caching.

      The underlying tech is driving down usage cost which is a big thing for data practitioners.

    • hamilton 2 hours ago

      What about it is bizarre?

      • pixl97 an hour ago

        It's probably different for DuckDB, but with something like Microsoft SQL Server, tossing these random queries at a database of any size could have some weird performance impacts. For example: statistics on columns you don't want them on, unindexed queries with slow performance, temp tables being dumped out to disk, etc.

        • hamilton an hour ago

          I agree. One thing that's neat about Instant SQL is that, for many reasons, you can't do this in any other DBMS. You really need DuckDB's specific architecture and ergonomics.

  • sannysanoff 3 hours ago

    Please finally add the q language with proper integration with your tables, so that our precious q-SQL is available there. Stop reinventing the wheel; let's at least catch up to the previous generation (in terms of convenience). Make the final step.