AMD Releases ROCm Version 6.3

(insidehpc.com)

40 points | by ankitg12 a day ago

13 comments

  • ducviet00 a day ago

    AMD has great hardware, but their software is a different story. It’s poorly documented, unstable, and doesn’t deliver good performance for end users.

    I’ve been working with the AMD MI300X for a few weeks, trying to get matrix multiplication running with tools like CK, Triton, or hipBLAS. However, the performance is only about 50% of the theoretical peak (FP16: 650 TFLOPS vs. 1300 TFLOPS in the whitepaper). Note that this is with matrices initialized to zero. When using random floats, performance drops by 20%; this is confirmed in AMD’s documentation.

    Meanwhile, the H100, the MI300X’s competitor, has a theoretical FP16 performance of 1000 TFLOPS, and I can achieve 800-900 TFLOPS with matrix multiplication using CUTLASS and random-float initialization.
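
    For readers who want to check the arithmetic behind these figures, here is a minimal sketch. It assumes the usual 2*M*N*K operation count for a dense GEMM; the function names are made up for illustration, and the only numbers used are the ones quoted above.

```python
def gemm_flops(m, n, k):
    """Floating-point operations for C = A @ B, with A (m x k) and B (k x n).

    Each of the m*n outputs needs k multiplies and k adds, hence 2*m*n*k.
    """
    return 2 * m * n * k


def achieved_tflops(m, n, k, seconds):
    """Throughput in TFLOPS for one GEMM that took `seconds` to run."""
    return gemm_flops(m, n, k) / seconds / 1e12


def utilization(achieved, peak):
    """Fraction of the theoretical peak that was actually reached."""
    return achieved / peak


# Figures quoted in the comment (TFLOPS, FP16).
mi300x = utilization(650, 1300)
h100_low = utilization(800, 1000)
h100_high = utilization(900, 1000)
print(f"MI300X: {mi300x:.0%}, H100: {h100_low:.0%}-{h100_high:.0%}")
```

    To get a measured number, time the kernel yourself (after warm-up, with the device synchronized) and pass the elapsed seconds to achieved_tflops.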

    AMD needs to improve their software quickly if they want to catch up with NVIDIA.

  • amstan a day ago

    Doesn't seem to be released yet. https://github.com/ROCm/llvm-project does not have a 6.3 tag.

    Same for https://github.com/ROCm/rocm_smi_lib/releases

    • 0xcde4c3db a day ago

      MIOpen has a "release/rocm-rel-6.3" branch, but I don't see a corresponding tag. I think a press release somehow got ahead of the software release.
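
      One way to check for a tag without clicking through GitHub is to parse the output of `git ls-remote --tags`. This is a rough sketch: the helper names are hypothetical, and the "rocm-6.2.4" style tag name in the usage note is an assumption about how the repos name their tags, not something confirmed in this thread.

```python
import re
import subprocess


def parse_tags(ls_remote_output):
    """Extract tag names from `git ls-remote --tags` output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue
        ref = parts[1]
        if ref.startswith("refs/tags/"):
            name = ref[len("refs/tags/"):]
            # Annotated tags also appear with a "^{}" suffix; fold them together.
            if name.endswith("^{}"):
                name = name[:-3]
            tags.add(name)
    return sorted(tags)


def has_version_tag(tags, version):
    """True if any tag contains `version` as a standalone version prefix.

    "6.3" matches "rocm-6.3.0" but not "rocm-6.30" or "rocm-5.6.3".
    """
    pattern = re.compile(r"(^|[^0-9.])" + re.escape(version) + r"($|[^0-9])")
    return any(pattern.search(tag) for tag in tags)


def list_remote_tags(repo_url):
    """Fetch tag names from a remote; needs git installed and network access."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", repo_url],
        capture_output=True, text=True, check=True,
    )
    return parse_tags(result.stdout)
```

      For example, has_version_tag(list_remote_tags("https://github.com/ROCm/llvm-project"), "6.3") would report whether a 6.3 tag has shown up yet, assuming the tag name embeds the version number.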

  • superkuh a day ago

    Anyone know how to find the list of AMD GPU/Accelerator hardware that ROCm 6.3 supports? Usually AMD drops an old line or two every time they update ROCm.

    https://rocm.docs.amd.com/projects/radeon/en/latest/docs/com...

    When looking at the latest support matrix it basically only supports these bleeding edge cards, "AMD Radeon RX 7900 XTX, AMD Radeon RX 7900 XT, AMD Radeon RX 7900 GRE, AMD Radeon PRO W7900, AMD Radeon PRO W7900DS, AMD Radeon PRO W7800".

    Surely I'm misinterpreting this and that can't be all the cards they support with latest ROCm. Does anyone know a more complete list?

    • burnte 21 hours ago

      You're looking only at the Radeon support page. https://rocm.docs.amd.com/en/latest/compatibility/compatibil...

      • slavik81 18 hours ago

        It's a bit confusing, but there's both "ROCm" and "ROCm on Radeon" releases. Despite what seems to be implied by the names, the standard "ROCm" releases support both Radeon and Instinct GPUs. "ROCm on Radeon" seems to be ROCm as packaged and released by the team that does "Radeon Software for Linux" instead of by the original ROCm packaging and release teams.

    • StrangeDoctor a day ago

      yeah that page is wrong? or there is some context I'm missing.

      I think this is what you want https://rocm.docs.amd.com/projects/install-on-linux/en/lates...

      edit: well this is still 6.2.4, but the other page says only the bleeding edge cards are supported for 6.0 and 5.7 family. Sadly this is my experience with amd documentation.

      double edit: yeah that page you linked to is just "Radeon" branded, and that's most consumer cards. Sometimes the more professional cards get something like "Radeon Pro" but that's not "Radeon"...

      last edit I swear: https://rocm.docs.amd.com/en/docs-6.2.4/ is the current docs. Changing the version in that URL to .9 or something nonsensical gives a 404, but changing it to https://rocm.docs.amd.com/en/docs-6.3.0/ asks me to log in at the moment. I think someone forgot to release this or it's inbound soon. So we don't currently know.
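
      The URL probing above can be scripted. This is only a sketch: the URL pattern is taken from the two links in this comment, the helper names are made up, and what status a given version returns can change at any time. Note that urlopen follows redirects, so a login redirect may show up as a 200 on the login page rather than as an error.

```python
import urllib.error
import urllib.request

# Base taken from the URLs quoted in the comment above.
DOCS_BASE = "https://rocm.docs.amd.com/en"


def docs_url(version):
    """Build the versioned docs URL, e.g. docs_url("6.2.4")."""
    return f"{DOCS_BASE}/docs-{version}/"


def probe(url, timeout=10.0):
    """Return the HTTP status code for `url` (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report the code instead.
        return err.code
```

      Calling probe(docs_url("6.3.0")) and comparing against probe(docs_url("6.2.4")) would show whether the 6.3.0 docs have gone public yet.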

      • sliken a day ago

        Sad that they don't support any of their more popular cards, like say the Radeon RX 7800 XT, which is a current-generation card. The 7800 XT has 16GB of VRAM and looks pretty good at $400 these days ($500 at launch).

      • superkuh a day ago

        Ah, if we go by the MI50 family (gfx906) then it's the typical AMD support length for compute: 4-5 years since release. So you'd better buy on release day at full price, not 2-3 years later, if you go AMD.

    • 0xcde4c3db a day ago

      I'm not sure where to find a better list, but I'm pretty sure those are the capital-s "Supported" configurations that you can expect to productively yell at AMD about, not the full list of what ROCm actually works with. I even seem to remember something about a Navi 10 problem being fixed in this release.

    • papichulo2023 a day ago

      Maybe it's because of the /radeon/ in the link, which means they only list the consumer cards.