Jensen knows what he is doing with the CUDA stack and workstations. AMD needs to beat that more than it needs to think about bigger hardware. Most people are not going to risk years learning an arcane stack for an architecture that holds less than 10% of the GPGPU market.
Can someone with more knowledge give me a software overview of what AMD is offering?
Which SDKs do they offer that can do neural network inference and/or training? I'm just asking because I looked into this a while ago and felt a bit overwhelmed by the number of options. It feels like AMD is trying many things at the same time, and I’m not sure where they’re going with all of it.
FYI: ROCm support status isn't crucial for casual AI users right now: standard proprietary AMD drivers have included Vulkan API support for roughly 10 years. It's slower, but llama.cpp supports it, and so do many one-click automagic LLM apps like LM Studio.
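As a concrete sketch, the Vulkan path via the Python bindings looks something like this, assuming llama-cpp-python was installed with its Vulkan backend enabled (the cmake flag below matches current llama.cpp docs but has changed names across versions, and the model path is a placeholder):

```python
# Install with Vulkan enabled (flag name may differ on older versions):
#   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU through the Vulkan backend;
# the model path is a placeholder for whatever GGUF file you have locally.
llm = Llama(model_path="./models/model.Q4_K_M.gguf", n_gpu_layers=-1)

out = llm("Say hi.", max_tokens=32)
print(out["choices"][0]["text"])
```

No ROCm installation is involved at any point, which is the whole appeal on older or officially unsupported AMD cards.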
Don't call us, we will call you when that future is the present.
Is Bob Page leading the effort?
I hear [“Atropos log, abandoning Helios”](https://returnal.fandom.com/wiki/Helios) and have an emotional reaction every time this comes up in the news.
I hope AMD can produce a chip that matches the H100 in training workloads.
Honestly, that was a hard read. I hope that guy gets an MI355 just for writing this.
AMD deserves exactly zero of the credulity this writer heaps onto them. They just spent four months not supporting their RDNA4 lineup in ROCm after launch; AMD is functionally capable of day-120 support. None of the benchmarks disambiguated where the performance is coming from. 100% they are lying on some level, e.g. representing their FP4 performance against FP8/FP16.
AMD's future should be figuring out how to reproduce the performance numbers they “claim” they are getting.
What really matters is how much of "Software++: ROCm 7 Released" I can use on a regular consumer laptop, like I can with CUDA.
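For anyone wondering whether a ROCm PyTorch build actually sees their consumer GPU, a quick check looks roughly like this (a minimal sketch: PyTorch's ROCm builds reuse the `torch.cuda` namespace, and `HSA_OVERRIDE_GFX_VERSION` is the commonly cited workaround for officially unsupported consumer cards; the right value depends on your GPU):

```python
import os

# Commonly cited workaround for officially unsupported consumer cards:
# spoof a supported gfx target before torch loads. "10.3.0" is the value
# often reported for RDNA2 cards; check what applies to your hardware.
# os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

import torch

# ROCm builds of PyTorch reuse the torch.cuda API, so these calls work as-is.
print(torch.cuda.is_available())  # True if the ROCm runtime sees a GPU
print(torch.version.hip)          # HIP/ROCm version string (None on CUDA builds)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```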
ROCm really is hit or miss depending on the use case.
Plus their consumer card support is questionable, to say the least. I really wish it were a viable alternative, but swapping to CUDA saved me some headaches and a ton of time.
Having to run MIOpen benchmarks for HIP can take forever.
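If it helps anyone: the long waits are usually MIOpen's exhaustive kernel search, and it can be dialed back. A hedged sketch (the env var is from MIOpen's docs, but behavior can differ across ROCm versions):

```python
import os

# MIOpen's find mode controls how exhaustively it benchmarks conv kernels.
# "FAST" skips most of the search; the default is what takes forever.
# Must be set before MIOpen initializes, i.e. before importing torch.
os.environ["MIOPEN_FIND_MODE"] = "FAST"

import torch

# On ROCm builds the cudnn benchmark flag gates MIOpen's tuning pass too;
# leaving it False avoids the per-shape exhaustive search.
torch.backends.cudnn.benchmark = False
```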