TurboQuant: Redefining AI efficiency with extreme compression

(research.google)

96 points | by ray__ 3 hours ago

10 comments

  • moktonar 3 minutes ago

    Aren’t polar coordinates still n-1 angles plus 1 for the radius for an n-dim vector? If so, I understand that the angles can be quantized better, but when the radius r is big the error is large for highly quantized angles, right? What am I missing?

    • amitport a minute ago

      r is a single value per vector. You don't have to quantize it, you can keep it and quantize the billion+ other coordinates of the vector.
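A tiny 2-D sketch of that point (hypothetical code, not the actual TurboQuant/PolarQuant implementation): keep the radius in full precision and quantize only the angle. The absolute error does grow with r, but the error relative to ||v|| is bounded by the angular step, no matter how large r gets.

```python
import numpy as np

def quantize_polar_2d(v, bits=8):
    """Keep the radius exact; quantize only the angle to `bits` bits."""
    r = np.hypot(v[0], v[1])           # radius, stored in full precision
    theta = np.arctan2(v[1], v[0])     # angle in (-pi, pi]
    step = 2 * np.pi / 2 ** bits
    code = int(np.round(theta / step)) % 2 ** bits   # one small integer
    theta_hat = code * step
    return r, code, r * np.array([np.cos(theta_hat), np.sin(theta_hat)])

v = np.array([300.0, 400.0])           # large radius on purpose
r, code, v_hat = quantize_polar_2d(v)
# ||v - v_hat|| <= r * (pi / 2**bits): the absolute error scales with r,
# but the *relative* error depends only on the number of angle bits.
rel_err = np.linalg.norm(v - v_hat) / r
```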

  • benob 33 minutes ago

    This is the worst lay-people explanation of an AI component I have seen in a long time. It doesn't even seem AI generated.

    • spencerflem 31 minutes ago

      I think it is, though:

      “ TurboQuant, QJL, and PolarQuant are more than just practical engineering solutions; they’re fundamental algorithmic contributions backed by strong theoretical proofs. These methods don't just work well in real-world applications; they are provably efficient and operate near theoretical lower bounds.”

      • benob 26 minutes ago

        Maybe they quantized the model parameters a bit too much...

  • bluequbit an hour ago

    I did not understand what PolarQuant is.

    Is it something like pattern-based compression, where the algorithm finds repeating patterns and creates an index of those common symbols or numbers?

    • mrugge 39 minutes ago

      1. An efficient recursive transform of the KV embeddings into polar coordinates. 2. Quantization of the resulting angles, without the need for explicit normalization. This saves memory via a key insight: the angles follow a known distribution with an analytical form.
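A minimal sketch of those two steps, under my own assumptions (a textbook recursive hyperspherical transform plus naive uniform angle quantization; the actual PolarQuant transform and its distribution-aware quantizer may differ in detail):

```python
import numpy as np

def to_hyperspherical(v):
    """Recursively peel an n-dim vector into a radius and n-1 angles."""
    v = np.asarray(v, dtype=float)
    r = np.linalg.norm(v)
    # angles[i] is the angle between v[i] and the norm of the tail v[i+1:]
    angles = [np.arctan2(np.linalg.norm(v[i + 1:]), v[i])
              for i in range(len(v) - 2)]
    angles.append(np.arctan2(v[-1], v[-2]))   # last angle keeps the sign
    return r, np.array(angles)

def from_hyperspherical(r, angles):
    v = np.empty(len(angles) + 1)
    s = r                                     # running product r * prod(sin)
    for i, a in enumerate(angles):
        v[i] = s * np.cos(a)
        s *= np.sin(a)
    v[-1] = s
    return v

def quantize_angles(angles, bits=8):
    """Naive uniform quantizer; no per-vector normalization step needed."""
    lo = np.zeros(len(angles)); lo[-1] = -np.pi   # first n-2 angles: [0, pi]
    hi = np.full(len(angles), np.pi)              # last angle: (-pi, pi]
    levels = 2 ** bits - 1
    codes = np.round((angles - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo + codes / levels * (hi - lo)

rng = np.random.default_rng(0)
v = rng.normal(size=8)
r, ang = to_hyperspherical(v)
codes, ang_hat = quantize_angles(ang)         # one byte per angle
v_hat = from_hyperspherical(r, ang_hat)
rel_err = np.linalg.norm(v - v_hat) / r       # small: error is purely angular
```

Because the radius is kept exact, the only reconstruction error comes from the quantized angles, and it stays proportional to the angular step.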

      • quotemstr 20 minutes ago

        Reminds me vaguely of the Burrows-Wheeler transform in bzip2.

    • Maxious 36 minutes ago
      • spencerflem 24 minutes ago

        I like the visualization, but I don’t understand the grid quantization. If every point is on the unit circle, aren’t all the center grid cells unused?
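A quick hypothetical check of that observation (my own toy setup, not the visualization's actual grid): quantize many unit vectors with a uniform 16x16 grid over [-1, 1]^2 and count which cells ever get used. Only the ring of cells the circle passes through is reachable; the interior and corner cells are wasted code space, which is one motivation for quantizing angles instead of coordinates.

```python
import numpy as np

n = 16                                     # grid is n x n over [-1, 1]^2
theta = np.linspace(0, 2 * np.pi, 100_000, endpoint=False)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # unit circle
cells = np.floor((pts + 1) / 2 * n).astype(int)
cells = np.clip(cells, 0, n - 1)           # points at exactly +1 land inside
used = len({(i, j) for i, j in cells})
print(f"{used} of {n * n} cells used")     # roughly 4n of the n^2 cells
```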