AMD has great hardware, but their software is a different story. It’s poorly documented, unstable, and doesn’t deliver good performance for end users.
I’ve been working with the AMD MI300X for a few weeks, trying to get matrix multiplication running with tools like CK, Triton, and hipBLAS. The performance is only about 50% of the theoretical peak (FP16: 650 TFLOPS vs. 1300 TFLOPS in the whitepaper). Note that this is with matrices initialized to zero. With random floats, performance drops by 20%; this is confirmed in AMD’s documentation.
Meanwhile, the H100, the MI300X’s competitor, has a theoretical FP16 performance of 1000 TFLOPS, and I can achieve 800-900 TFLOPS with matrix multiplication using CUTLASS and random-float initialization.
AMD needs to improve their software quickly if they want to catch up with NVIDIA.
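For anyone who wants to check these percentages against their own runs: GEMM throughput is usually reported by counting 2·M·N·K floating-point operations for an M×K by K×N multiply and dividing by the measured time. Below is a minimal sketch of that arithmetic, assuming that convention. The function names are made up for illustration, and the only numbers plugged in are the TFLOPS figures quoted above.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOPs for C[m, n] = A[m, k] @ B[k, n], counting each multiply-add as 2 ops."""
    return 2 * m * n * k


def achieved_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in TFLOPS for one matmul that took `seconds` of wall time."""
    return matmul_flops(m, n, k) / seconds / 1e12


def utilization(achieved: float, peak: float) -> float:
    """Fraction of the theoretical peak actually reached (same units for both)."""
    return achieved / peak


if __name__ == "__main__":
    # Figures quoted in the comment above, in TFLOPS (FP16).
    print(f"MI300X: {utilization(650, 1300):.0%} of peak")
    print(f"H100:   {utilization(800, 1000):.0%} to {utilization(900, 1000):.0%} of peak")
```

To use it with a real benchmark, time one kernel launch (after warm-up, with device synchronization) and pass the matrix sizes and elapsed seconds to `achieved_tflops`.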
Doesn't seem to be released yet. https://github.com/ROCm/llvm-project does not have a 6.3 tag.
Same for https://github.com/ROCm/rocm_smi_lib/releases
MIOpen has a "release/rocm-rel-6.3" branch, but I don't see a corresponding tag. I think a press release somehow got ahead of the software release.
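The tag check above can be scripted instead of clicked through. This is a rough sketch that parses the output of `git ls-remote --tags`. It assumes release tags are named `rocm-<version>` (e.g. `rocm-6.2.4`), which is an assumption about the naming scheme and may not hold for every ROCm repository. The helper names are hypothetical.

```python
import subprocess


def tag_names(ls_remote_output: str) -> set:
    """Extract tag names from `git ls-remote --tags` output (one '<sha>\\t<ref>' per line)."""
    names = set()
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue
        ref = parts[1].strip()
        if ref.startswith("refs/tags/"):
            name = ref[len("refs/tags/"):]
            # Annotated tags also show up as '<name>^{}'; fold them into one entry.
            if name.endswith("^{}"):
                name = name[:-3]
            names.add(name)
    return names


def has_release_tag(ls_remote_output: str, version: str) -> bool:
    """True if a tag 'rocm-<version>' or 'rocm-<version>.<patch>' exists.

    The 'rocm-' prefix is an assumed naming convention, not something confirmed here.
    """
    prefix = f"rocm-{version}"
    return any(
        name == prefix or name.startswith(prefix + ".")
        for name in tag_names(ls_remote_output)
    )


def fetch_tags(repo_url: str) -> str:
    """Run `git ls-remote --tags` against a remote (needs git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", repo_url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A live check would look like `has_release_tag(fetch_tags("https://github.com/ROCm/llvm-project"), "6.3")`. The parsing is kept separate from the network call so it can be tested on saved output.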
Anyone know how to find the list of AMD GPU/Accelerator hardware that ROCm 6.3 supports? Usually AMD drops an old line or two every time they update ROCm.
https://rocm.docs.amd.com/projects/radeon/en/latest/docs/com...
The latest support matrix basically only lists these bleeding-edge cards: "AMD Radeon RX 7900 XTX, AMD Radeon RX 7900 XT, AMD Radeon RX 7900 GRE, AMD Radeon PRO W7900, AMD Radeon PRO W7900DS, AMD Radeon PRO W7800".
Surely I'm misinterpreting this and that can't be all the cards they support with the latest ROCm. Does anyone know a more complete list?
You're looking only at the Radeon support page. https://rocm.docs.amd.com/en/latest/compatibility/compatibil...
It's a bit confusing, but there are both "ROCm" and "ROCm on Radeon" releases. Despite what the names seem to imply, the standard "ROCm" releases support both Radeon and Instinct GPUs. "ROCm on Radeon" seems to be ROCm as packaged and released by the team that does "Radeon Software for Linux" instead of by the original ROCm packaging and release teams.
yeah that page is wrong? or there is some context I'm missing.
I think this is what you want https://rocm.docs.amd.com/projects/install-on-linux/en/lates...
edit: well this is still 6.2.4, but the other page says only the bleeding-edge cards are supported for the 6.0 and 5.7 families. Sadly, this is my experience with AMD documentation.
double edit: yeah that page you linked to is just "Radeon" branded, and that's most consumer cards. Sometimes the more professional cards get something like "Radeon Pro" but that's not "Radeon"...
last edit I swear: https://rocm.docs.amd.com/en/docs-6.2.4/ is the current docs. Changing the version to .9 or something nonsensical gives a 404, but changing it to https://rocm.docs.amd.com/en/docs-6.3.0/ asks me to log in at the moment. I think someone forgot to release this or it's inbound soon. So we don't currently know.
Sad that they don't support any of their more popular cards, like say the Radeon 7800 XT, which is a current-generation card. The 7800 XT has 16GB of VRAM and looks pretty good at $400 these days ($500 at launch).
The Radeon 7800 XT is officially supported on Windows [1], and I'm pretty sure it works fine on Linux even if it's not considered officially supported. It's Navi 32, just like the Radeon PRO V710, which is officially supported on Linux.
[1]: https://rocm.docs.amd.com/projects/install-on-windows/en/lat...
Ah, if we go by the MI50 family (gfx906), then it's the typical AMD support length for compute: 4-5 years since release. So you'd better be buying on release day at full price and not 2-3 years later if you go AMD.
I'm not sure where to find a better list, but I'm pretty sure those are the capital-s "Supported" configurations that you can expect to productively yell at AMD about, not the full list of what ROCm actually works with. I even seem to remember something about a Navi 10 problem being fixed in this release.
Maybe it's because of the /radeon/ in the link, which means they only list the consumer cards.