Bringing Up DeepSeek-V4-Flash on AMD MI300X

(fergusfinn.com)

93 points | by kkm 11 hours ago

11 comments

edg5000 27 minutes ago

Checked out this company about a year ago and they only offered small models. Now I see they have GLM-fp8/Kimi and DeepSeek V4 Pro. Since workloads are predominantly cached input, I'm surprised to see no separate price for cached input vs uncached. I hope the prices will drop significantly; with these prices you'll end up with thousands in monthly costs quickly. Hopefully more hardware companies will be on the market in the coming years. If the Chinese eventually start competing with the current memory makers, maybe that will help.

maCDzP 8 hours ago

I train on AMD MI250X and managed to get Gemma 4 31B to work - but it took a lot of work on the software side.

kkm 8 hours ago

This is very interesting, planning to write about it?

kkm 9 hours ago

Also, the vLLM patch accompanying the blog post: https://github.com/doublewordai/vllm-amd-blog-doubleword

mezark 9 hours ago

We at doubleword are bullish on AMD for low-interactivity inference - it just takes a bigger lift on the software side...

brcmthrowaway 7 hours ago

Are you long AMD?