We knew in 2017 that PyTorch was the future, so we moved all our research and teaching to it: https://www.fast.ai/posts/2017-09-08-introducing-pytorch-for... .
I found out that in the embedded world (think microcontrollers without an MMU), TensorFlow Lite is still the only game in town (pragmatically speaking) for vendor-supported hardware acceleration.
I recently tried to port my model to JAX. Got it all working the "JAX way", and I believe I did everything correctly, with one neat top-level jax.jit() applied to the training step. Unfortunately I could not replicate the performance boost of torch.compile(). I have not yet delved under the hood to find the culprit, but my model is fairly simple, so I was sort of expecting JAX JIT to perform just as well as, if not better than, torch.compile().
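Roughly, my setup boils down to a single jitted step like the sketch below (model_apply and the optax optimizer here are simplified stand-ins for my actual model and optimizer, not the real code):

    import jax
    import jax.numpy as jnp
    import optax  # assumed optimizer library, not necessarily what I actually use

    def model_apply(p, x):
        # placeholder model: a single linear layer standing in for the real thing
        return x @ p["w"] + p["b"]

    optimizer = optax.adam(1e-3)

    @jax.jit  # one top-level jit over the whole training step
    def train_step(params, opt_state, x, y):
        def loss_fn(p):
            pred = model_apply(p, x)
            return jnp.mean((pred - y) ** 2)
        loss, grads = jax.value_and_grad(loss_fn)(params)
        updates, opt_state = optimizer.update(grads, opt_state, params)
        params = optax.apply_updates(params, updates)
        return params, opt_state, loss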
Has anyone else had similar experiences?
JAX code usually ends up being way faster than equivalent torch code for me, even with torch.compile. There are common performance killers, though. Notably, using Python control flow (if statements, loops) instead of jax.lax primitives (where, cond, scan, etc).
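For example (a toy sketch, not your code): a data-dependent Python `if` can't be traced under jit, while jnp.where / lax.cond keeps the branch inside the compiled graph, and lax.scan does the same for loops:

    import jax
    import jax.numpy as jnp

    # This version fails under jit: a Python `if` needs a concrete boolean,
    # but x is an abstract tracer inside jax.jit.
    # def clip_bad(x):
    #     if x > 1.0:
    #         return 1.0
    #     return x

    @jax.jit
    def clip_good(x):
        # the branch stays inside the compiled graph
        return jnp.where(x > 1.0, 1.0, x)

    @jax.jit
    def running_sum(xs):
        # lax.scan replaces a Python for-loop, so the loop is compiled once
        def step(carry, x):
            carry = carry + x
            return carry, carry
        _, out = jax.lax.scan(step, 0.0, xs)
        return out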
Interesting. Thanks for your input. I tried to adhere to the JAX paradigm as laid out in the documentation, so I already have a fully static graph.
I would test how much of the total FLOP capability of your hardware you are actually using. Take the first-order terms of your model and estimate how many FLOPs you need per data point (a good guide is 6 × parameters for training if you mostly have large matrix multiplies and nonlinearity/norm layers), then compare your measured throughput for a given input size against the theoretical peak of the GPU (e.g. ~1e15 FLOP/s for bfloat16 on an H100 or H200). If you are already over 50%, it is unlikely you can get big gains without very considerable effort, and at that point plain JAX or PyTorch is most likely not sufficient. If you are in the 2–20% range there is probably some low-hanging fruit left, and the closer you are to only 1% the easier it is to see dramatic gains.
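As a back-of-envelope version of that (all numbers below are made-up examples; plug in your own parameter count, data points per step, measured step time and your GPU's peak):

    # Rough MFU (model FLOPs utilization) estimate, using the ~6 * params
    # FLOPs-per-data-point rule of thumb for training.
    params = 125e6               # example parameter count
    points_per_step = 64 * 1024  # example: batch size * sequence length
    step_time_s = 0.25           # example measured wall-clock time per step
    peak_flops = 1e15            # ballpark bfloat16 peak of one H100

    flops_per_step = 6 * params * points_per_step
    achieved = flops_per_step / step_time_s
    print(f"achieved: {achieved:.2e} FLOP/s, MFU: {achieved / peak_flops:.1%}")
    # with these example numbers: achieved: 1.97e+14 FLOP/s, MFU: 19.7%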
> In 2019, the war for ML frameworks has two remaining main contenders: PyTorch and TensorFlow. My analysis suggests that researchers are abandoning TensorFlow and flocking to PyTorch in droves.
Seems they were pretty spot on! https://trends.google.com/trends/explore?date=all&q=pytorch,...
But to be fair, it was kind of obvious around 2023 even without looking at metrics/data; you just had to look at what the researchers publishing novel research were using.
Any similar articles that are a bit more up to date, maybe even for 2025?
I feel like it was all pretty obvious by late 2017. Prototyping and development in PyTorch was so much easier - it felt just like writing normal Python code. And the supposed performance benefits of the static computation graph in TensorFlow didn't materialize for most workloads. Nobody wanted to use TensorFlow - though you often had to when working on existing codebases.
I think the only thing that could have saved TensorFlow at that point would have been some sort of enormous performance boost that would only work with their computation model. I'm assuming Google's plan was to make it easy to run the same TensorFlow code on GPUs and TPUs, and then swoop in with TPUs that massively outperformed GPUs (at least on a performance-per-dollar basis). But that never really happened.
Interesting point about the shift towards PyTorch. It really has been fascinating to see how preferences in frameworks can impact the entire research landscape. I remember back in 2017, I felt like I was constantly hearing about TensorFlow everywhere, and then out of nowhere, PyTorch just started gaining this insane momentum. It was almost like watching a sports team come out of nowhere to win the championship!
In my experience, a lot of it comes down to the community and the ease of use. Debugging in PyTorch feels way more intuitive, and I wonder if that’s why so many people are gravitating toward it. I’ve seen countless tutorials and workshops pop up for PyTorch compared to TensorFlow recently, which speaks volumes to how quickly things can change.
But then again, TensorFlow's got its enterprise backing, and I can't help but think about the implications of that. How long can PyTorch ride this wave before it runs into pressure from industry demands? And as we look toward 2025, do you think we'll see a third contender emerge, or will it continue to be this two-horse race?
TensorFlow was an overengineered Google-style mess and they constantly made breaking changes.
All the graph building and session running was way too complex, with too much global state; variable sharing was complicated and based on naming, variable scopes, name scopes and so on.
It was an okay try, but that design simply didn't work well for the quick prototyping, iterating, and debugging that's crucial in research.
PyTorch was much closer to just writing straightforward NumPy code. TensorFlow 2 then tried to catch up with "eager mode", but under the hood it was still a graph: tracing often broke, and you had to write your code very carefully and within its limitations.
In the end, PyTorch also developed proper production and serving tools as well as graph compilation, so now there's basically no reason to go to TensorFlow. Not even Google researchers use it (they use JAX). I guess some industries still use it, but at some point I expect Google to shut down TF and focus on the JAX ecosystem, with some kind of conversion tools for TF.
> But then again, TensorFlow's got its enterprise backing, and I can't help but think about the implications of that. How long can PyTorch ride this wave before it runs into pressure from industry demands?
PyTorch has a huge collection of companies, organizations and other entities backing it; it's not gonna suddenly disappear, that much is clear. Take a look at https://pytorch.org/foundation/ for a sample.
The thing about TensorFlow in 2017 was that everyone acknowledged how difficult it was to use. While it was almost the only game in town, no one was happy. That's exactly the kind of situation where an upstart can come in and disrupt.
It’s still all PyTorch.
Unless you’re working at Google, then maybe you use JAX.
JAX is quite popular in many labs outside of Google doing large-scale training runs, because up until recently the parallelism ergonomics were way better. PyTorch core is catching up (maybe it already has with the latest release, I haven’t used it yet), and there are a lot of PyTorch-using projects to study, though.
In 2019 I delivered an instance segmentation project using Mask R-CNN and TensorFlow.
Nowadays it looks like YOLO absolutely dominates this segment. Can any data scientists chime in?
SAM (Segment Anything Model) by Meta is a popular go-to choice for off the shelf segmentation.
But the exciting new research is moving beyond the narrow task of segmentation. It's not just about new models that get better scores, but about building larger multimodal systems, broader task definitions, etc.
I haven’t used RCNN, but trained a custom YOLOv5 model maybe 3-4 years ago and was very happy with the results.
I think people have continued to work on it. There’s no single lab or developer behind it; the comparisons mostly seem to focus on the speed/mAP trade-off.
One nice thing is that even with modest hardware, it’s low enough latency to process video in real time.
lil' self promo but I made a similar blog post in 2018.
I gave MXNet a bit of an outsized score in hindsight, but outside of that I think I got things mostly right.
https://source.coveo.com/2018/08/14/deep-learning-showdown/