Pandas feels clunky coming from R. What about Haskell?

(mchav.github.io)

23 points | by mchav 2 days ago

7 comments

  • kermatt 2 days ago

    Although not quite the same without a pipe operator, https://pola.rs was an improvement for me when missing the R dataframe syntax.

    • minimaxir 2 days ago

      OP makes that concession in the first section of the post. (I may or may not have made a similar comment before deleting it in knee-jerk shame.)

  • rgavuliak 2 days ago

    > This has a great SQL-ish API. Python is similar but starts to be a little clunky since it requires you to think about indices:

    groupby has an as_index parameter for this very purpose

    > Deducting the discount

    You focus on doing the subtraction during the group by. Is there any good reason for this? You could do it either as a step before, or after summing up both columns. Putting too many things into one command is not good practice, yet you benchmark the language based on how easy it is to do said bad practice.
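
For readers following along, here is a rough sketch of what the comment above describes: `as_index=False` on `groupby`, and the subtraction done as a separate step either after or before the aggregation. The `orders` frame and its column names (`category`, `price`, `discount`) are made up for illustration and are not taken from the linked post.

```python
import pandas as pd

# Hypothetical order data; the column names are illustrative only.
orders = pd.DataFrame({
    "category": ["a", "a", "b"],
    "price": [10.0, 20.0, 5.0],
    "discount": [1.0, 2.0, 0.5],
})

# as_index=False keeps the grouping key as an ordinary column
# instead of moving it into the index.
totals = orders.groupby("category", as_index=False)[["price", "discount"]].sum()

# Option 1: sum both columns first, then subtract.
totals["net"] = totals["price"] - totals["discount"]

# Option 2: subtract row by row first, then group and sum.
net_first = (
    orders.assign(net=orders["price"] - orders["discount"])
    .groupby("category", as_index=False)["net"]
    .sum()
)

print(totals)
print(net_first)
```

Both options give the same net total per category here, since subtraction distributes over the sum; which one reads better is the style question the thread is debating.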

    • mchav a day ago

      I think the original author picked this example to broadly illustrate how easy it is to make ad hoc changes to your query without worrying a lot about implementation details. Polars, for example, converges on a similar API and gives you the flexibility. You can iterate, then refactor easily later to what you consider good practice.

      • rgavuliak a day ago

        For me, the whole piping approach made everything less readable and harder to debug compared to a string of commands.

  • internet_points 2 days ago

    > You need to read the previous line to understand what

    that ended rather abruptly?

  • 2 days ago
    [deleted]