Intel is hitting the AI problem with too many sticks
If there’s one company with the resources, market share, and developer relationships to break NVIDIA’s dominance in AI, it’s Intel. And yet it is the green company, NVIDIA, that is profiting from the deep learning boom and the explosion of services built on it. The reason is simple: virtually all of these applications rely on CUDA libraries, which require NVIDIA hardware to run. NVIDIA has even launched its own CPU, named Grace and based on the ARM ISA, to displace Intel and AMD from specialized data centers.
However, every success, even one built on your own work, also owes something to luck, and luck sometimes arrives in the form of a rival’s missteps. Frankly, Intel in AI is the perfect example of a house divided against itself: too many proposals that are incompatible with one another, some even competing for the same space.
The best way to appreciate the mess is to list the options:
- AI-focused AVX-512 instructions that run on some Xeon and Intel Core processors.
- Still on the CPU, the AMX units, which are Tensor-style cores built into the processor itself.
- Speaking of Tensor-style cores, they also appear in Intel ARC GPUs under the name XMX.
- Intel FPGAs can also be used for this.
- All of this without mentioning the multiple chips specialized in deep and machine learning that Intel has been developing.
As you can see, there are too many bets on a single problem: executing AI algorithms, whether on a PC, a workstation, or a server.
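The fragmentation above shows up even at the ISA level: whether a given Intel chip supports AVX-512 or AMX depends on the exact model. A minimal sketch of how you might probe this on Linux, using the flag names the kernel publishes in `/proc/cpuinfo` (which flags appear varies by CPU and kernel version):

```python
# Sketch: check which of Intel's AI-oriented ISA extensions the current
# CPU reports, by parsing the "flags" line from /proc/cpuinfo (Linux).

def cpu_ai_features(cpuinfo_text: str) -> dict:
    """Map each AI-related ISA extension to whether the CPU reports it."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
            break
    wanted = {
        "AVX-512 (base)": "avx512f",
        "AVX-512 VNNI": "avx512_vnni",  # int8 dot products for inference
        "AMX tiles": "amx_tile",        # Sapphire Rapids tensor units
        "AMX int8": "amx_int8",
    }
    return {name: flag in flags for name, flag in wanted.items()}

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(cpu_ai_features(f.read()))
    except FileNotFoundError:
        print("Not on Linux; /proc/cpuinfo unavailable")
```

On a desktop Core chip you will typically see everything report `False`; only specific Xeon parts light up the AMX entries, which is exactly the inconsistency the list above describes.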
OneAPI as a solution to Intel’s problems
Instead of pursuing a unified hardware solution, Intel’s answer was to create a universal development API, called OneAPI, that spans every type of hardware made by the company co-founded by the late Gordon Moore. In practice, this throws the different hardware options into a fight among themselves and leaves it to end users to pick the winner. In other words, it is a cruel game of musical chairs in which the various AI solutions will be eliminated one by one until only a single chair remains.
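The "one front-end, many competing back-ends" idea can be sketched in a few lines. Everything below is hypothetical and illustrative, not OneAPI’s actual API: it only shows how a single selection routine can hide the CPU/GPU/FPGA rivalry from the programmer while quietly deciding which hardware wins:

```python
# Hypothetical sketch of a "one API, many back-ends" device selector.
# Names and the preference policy are invented for illustration; real
# oneAPI/SYCL runtimes use device selectors with a similar role.

from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    kind: str        # "cpu", "gpu", or "fpga"
    available: bool

def select_backend(backends, preference=("gpu", "cpu", "fpga")):
    """Return the first available back-end in preference order."""
    for kind in preference:
        for b in backends:
            if b.available and b.kind == kind:
                return b
    raise RuntimeError("no usable device")

devices = [
    Backend("Xeon AMX", "cpu", True),
    Backend("Arc XMX", "gpu", False),   # e.g. no discrete GPU installed
    Backend("Agilex FPGA", "fpga", False),
]
print(select_backend(devices).name)   # falls back to the CPU path
```

The design choice is the point of the article: the API does not resolve the hardware conflict, it merely defers it to whatever the runtime finds in the machine.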
Guide to shooting yourself in the foot
It has long been demonstrated that the base architecture of a GPU, with a few changes, makes an excellent AI processor, especially when paired with high-bandwidth memory: sustaining the compute throughput of these units requires genuinely large bandwidths. The big problem is that CPUs are usually fed by RAM optimized for latency rather than bandwidth, and bandwidth is precisely what the enormous number of AI operations demands.
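The bandwidth argument can be made concrete with a quick back-of-the-envelope calculation. The throughput and intensity figures below are illustrative, not any specific chip’s:

```python
# Back-of-the-envelope: memory bandwidth needed to keep an AI unit fed,
# as a function of arithmetic intensity (operations per byte moved).
# All numbers here are hypothetical, chosen only to show the scale.

def required_bandwidth_gbs(tops: float, ops_per_byte: float) -> float:
    """GB/s of memory traffic needed to sustain `tops` tera-ops/s."""
    return tops * 1e12 / ops_per_byte / 1e9

# A memory-bound op (e.g. a matrix-vector product with no data reuse)
# performs only a couple of operations per byte it loads...
low_intensity = required_bandwidth_gbs(tops=100, ops_per_byte=2)
# ...while a large, well-blocked matrix multiply reuses each byte often.
high_intensity = required_bandwidth_gbs(tops=100, ops_per_byte=100)

print(f"{low_intensity:.0f} GB/s vs {high_intensity:.0f} GB/s")
```

At 100 TOPS and 2 ops/byte you would need 50,000 GB/s, orders of magnitude beyond what a latency-optimized DDR subsystem delivers, which is why HBM or heavy on-chip reuse becomes mandatory.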
So much so that Intel’s Sapphire Rapids, in order to take advantage of the AMX units in its 56 Golden Cove cores, needs to use HBM memory. And if you are wondering why there is no ARC A790 with a higher core count, this is why: Intel wants its entire server AI bet to ride on sales of these processors, since that is Intel’s main business.
However, there is a reason a GPU works better here, and not just because NVIDIA has proven it: the operations and instructions used in AI are extremely simple, so they fit comfortably in a GPU core that is much smaller. A graphics chip with 56 cores would cost far less than a CPU of that size; just look at how mammoth Sapphire Rapids is.
A graphics card can be placed in any computer
Continuing the previous argument: if you want to work in any AI-related discipline on a PC, all you have to do is buy a graphics card. That is why Intel’s AI strategy is flawed; not everyone has the means or the resources to set up a high-calibre server or workstation. To put things in perspective, the official list price of the Intel Xeon Max 9480 is approximately $12,000, and that is for the CPU alone. As you can understand, these are not parts for someone with a normal computer.
The idea behind the fourth-generation Xeon is simple: why would you want a graphics card to run inference if your CPU already has the necessary units for it? The problem is that another division within Intel was developing a GPU for exactly the same purpose.