1 comment

  • mtauraso 3 hours ago

    Author here. Built this while working on astronomy data pipelines where we process terabyte-scale datasets. We kept hitting a frustrating pattern: libraries promised great performance, benchmarks looked solid, but our pipelines were mysteriously slow. CPU and memory looked fine, yet tasks that should have taken minutes took hours.

    The culprit was consistently I/O. Either we were making millions of tiny operations, or the "optimized" storage layer wasn't doing what we expected. But there was no easy way to see actual I/O behavior without leaving Jupyter and diving into system tools.
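    For anyone curious what "seeing actual I/O behavior" can look like without leaving Python: on Linux, the kernel exposes per-process I/O counters in `/proc/self/io`, and you can diff them around a block of code. This is just a stdlib-only sketch of the idea, not how the tool itself is implemented:

    ```python
    import tempfile
    from pathlib import Path

    def io_counters():
        """Parse this process's I/O counters from /proc/self/io (Linux-only).

        Fields include syscr/syscw (read/write syscall counts) and
        read_bytes/write_bytes (actual block-device traffic).
        """
        text = Path("/proc/self/io").read_text()
        return {key: int(val) for key, val in
                (line.split(": ") for line in text.strip().splitlines())}

    before = io_counters()
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"hello" * 1000)
        f.flush()  # force the buffered bytes out as a real write syscall
    after = io_counters()

    print("write syscalls issued:", after["syscw"] - before["syscw"])
    ```

    Diffing counters like this is coarse (it captures everything the process did in between), but it's often enough to spot "millions of tiny operations" at a glance.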

    So we built this. Now %%iops at the top of a cell immediately shows: "Oh, 50,000 separate writes instead of buffering. That's why."
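    To make the "50,000 separate writes instead of buffering" failure mode concrete, here's a self-contained sketch (not the tool itself) using a fake raw device that counts how many write calls actually reach it. The `CountingRaw` class is purely illustrative:

    ```python
    import io

    class CountingRaw(io.RawIOBase):
        """Fake 'device' that counts how many write calls reach it."""
        def __init__(self):
            self.calls = 0
        def writable(self):
            return True
        def write(self, b):
            self.calls += 1
            return len(b)

    payload = b"x" * 64  # one small record

    # Unbuffered: every record becomes its own write to the device.
    raw = CountingRaw()
    for _ in range(50_000):
        raw.write(payload)
    unbuffered_calls = raw.calls  # 50,000 separate writes

    # Buffered: a BufferedWriter coalesces records in userspace first.
    raw2 = CountingRaw()
    buf = io.BufferedWriter(raw2, buffer_size=1024 * 1024)
    for _ in range(50_000):
        buf.write(payload)
    buf.flush()
    buffered_calls = raw2.calls  # a handful (~3.2 MB through a 1 MB buffer)

    print(unbuffered_calls, "vs", buffered_calls)
    ```

    Same bytes, wildly different operation counts, and it's the operation count (not throughput) that was quietly killing our pipelines.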

    It's been invaluable for debugging performance gaps between expectations and real-world behavior in our workloads.

    Interesting sidenote: I'm a pretty extreme AI skeptic, but wanted to see how far current tools could be taken. With minor edits, all the code, documentation, and even this HN submission were generated by Claude/Copilot. The results surprised me. Happy to answer questions or hear if others have hit similar performance mysteries.