It's always so surprising to me that stuff like Zitron's latest eruption gets traction, but when the actual guys who are actually building the actual stuff actually talk about what they're actually doing? Silence.
Petabit network? Multi-region AI training? This shit is nuts!
Also, I feel like they answer so many of the questions the gotcha folks so frequently raise. Like this:
Satya Nadella 00:02:41
There is coupling from the model architecture to what is the physical plant that’s optimized. And it’s also scary in that sense, which is, there’s going to be a new chip that’ll come out. Take Vera Rubin Ultra. That’s going to have power density that’s going to be so different, with cooling requirements that are going to be so different. So you kind of don’t want to just build all to one spec. That goes back a little bit to the dialogue we’ll have, which is that you want to be scaling in time as opposed to scale once and then be stuck with it.
But somehow "oh, they're just so dumb, don't they realize GPUs depreciate?" is the level of analysis typically on offer when it comes to these huge datacenters.