lol, it took me 48 hours to do (and re-do, and re-do) this test + write it up, and now that I've convinced myself to stop changing bits and just publish it... Google has just announced the Gemma 4 QAT models :-D
It wouldn't change the core of my article though, since the bottleneck is still the memory bandwidth on the old 16GB M1.