Dataframely: A polars-native data frame validation library

(tech.quantco.com)

30 points | by sito42 12 hours ago

7 comments

  • account-5 7 hours ago

    I've used pandas before, and recently became aware of polars as it's part of Nushell by way of a plugin.

    Why would you use polars over pandas?

    • NeutralForest 5 hours ago

      Something not mentioned yet is that pandas is eager and copies memory quite liberally, so even moderately sized datasets can blow up your RAM. Polars has an eager API but also a lazy one; I know they're also working on a query optimizer for lazy queries, so you could theoretically handle even large data on a laptop, for example.

    • gpderetta 4 hours ago

      Because this way you wouldn't be using pandas. That must be a good reason by itself.

      pandas is extremely powerful, but everything seems way more complex than it needs to be.

    • ayhanfuat 6 hours ago

      Better syntax, speed and memory.

      • mettamage 6 hours ago

        Seconding this. The syntax feels SQL-like, which I'm a fan of since I also sometimes have to write SQL queries. It keeps that mindset sharp and gives me a clear paradigm to think in when I need to reason about a polars dataframe.

        • account-5 6 hours ago

          Thanks both, the SQL-like syntax is definitely something I'm enjoying in Nushell.

  • camkego 7 hours ago

    Just started using Polars; great to learn that data frame validation even exists. Thanks.