> But matrix multiplication, to which our civilization is now devoting so many of its marginal resources, has all the elegance of a man hammering a nail into a board.
is the most interesting one.
A man hammering a nail into a board can be both beautiful and elegant! If you've ever seen someone effortlessly hammer nail after nail into wood without having to think hardly at all about what they're doing, you've seen a master craftsman at work. Speaking as a numerical analyst, I'd say a well multiplied matrix is much the same. There is much that goes into how deftly a matrix might be multiplied. And just as someone can hammer a nail poorly, so too can a matrix be multiplied poorly. I would say the matrices being multiplied in service of training LLMs are not a particularly beautiful example of what matrix multiplication has to offer. The fast Fourier transform viewed as a sparse matrix factorization of the DFT and its concomitant properties of numerical stability might be a better candidate.
If the O(n^3) schoolbook multiplication were the best that could be done, then I'd totally agree that "it's simply the nature of matrices to have a bulky multiplication process". Yet there's a whole series of algorithms (from the Strassen algorithm onward) that use ever-more-clever ways to recursively batch things up and decrease the asymptotic complexity, most of which aren't remotely practical. And for all I know, it could go on forever down to O(n^(2+ε)). Overall, I hate not being able to get a straight answer for "how hard is it, really".
Maybe the problem is that matrices are too general.
You can have very beautiful algorithms when you assume the matrices involved have a certain structure. You can even have that A*B == B*A, if A and B have a certain structure.
I think this sentence:
> But matrix multiplication, to which our civilization is now devoting so many of its marginal resources, has all the elegance of a man hammering a nail into a board.
is the most interesting one.
A man hammering a nail into a board can be both beautiful and elegant! If you've ever seen someone effortlessly hammer nail after nail into wood without having to think hardly at all about what they're doing, you've seen a master craftsman at work. Speaking as a numerical analyst, I'd say a well multiplied matrix is much the same. There is much that goes into how deftly a matrix might be multiplied. And just as someone can hammer a nail poorly, so too can a matrix be multiplied poorly. I would say the matrices being multiplied in service of training LLMs are not a particularly beautiful example of what matrix multiplication has to offer. The fast Fourier transform viewed as a sparse matrix factorization of the DFT and its concomitant properties of numerical stability might be a better candidate.
Do you disagree with my take or think I’m missing Witt’s point? I’d be happy to hear from people who disagree with me.
Ignore me then because I agree with you. :) He sounds like someone who upon first hearing jazz to complain it was ugly.
If the O(n^3) schoolbook multiplication were the best that could be done, then I'd totally agree that "it's simply the nature of matrices to have a bulky multiplication process". Yet there's a whole series of algorithms (from the Strassen algorithm onward) that use ever-more-clever ways to recursively batch things up and decrease the asymptotic complexity, most of which aren't remotely practical. And for all I know, it could go on forever down to O(n^(2+ε)). Overall, I hate not being able to get a straight answer for "how hard is it, really".
Maybe the problem is that matrices are too general.
You can have very beautiful algorithms when you assume the matrices involved have a certain structure. You can even have that A*B == B*A, if A and B have a certain structure.