For Jim Keller, the x86 and CUDA architectures are a swamp and he “offers himself” to OpenAI


We have discussed the “vision” of the legendary chip designer Jim Keller several times in the past: personally engaged at Tenstorrent, a company specializing in the design and development of accelerated-computing solutions for artificial intelligence applications, Keller has laid out his ideas on AI, RISC-V, and edge computing, describing the x86 architecture as weighed down by features maintained today mainly for the sake of backwards compatibility. In another article we explored the topic in depth by comparing the x86 and ARM ISAs.

Why Jim Keller calls the x86 and CUDA architectures swamps

In recent days, Keller has expressed harsh criticism of two milestones of the chip landscape: he describes the x86 and CUDA architectures as outright swamps. The designer highlights a critical aspect shared by both: the “stratification” they have undergone as they grew over time, and the resulting accumulation of functionality.

All of this, as noted in the introduction, certainly makes the platforms complete and backwards compatible, but it also weighs them down with a performance impact that is by now more than evident. According to Keller, x86 and CUDA were built by “adding one thing at a time”, without a homogeneous, harmonious architectural vision.

NVidia CUDA: an ecosystem that has become too fragmented

We have seen what NVidia CUDA is and how widely it is used today for parallel computing. Focusing precisely on the solution developed by Jen-Hsun Huang’s company, Keller argues that CUDA has acquired several “specialized features” over the years, leading to a fragmented ecosystem. The addition of specific tools such as Triton, TensorRT, Neon, and Mojo would confirm the stratification mentioned above, which contributes to a complex software configuration.

“Virtually no one writes in CUDA”, observes Keller. “If you write in CUDA, you are probably not fast. (…) This is why Triton, TensorRT, Neon and Mojo exist”. NVidia itself offers tools that don’t rely solely on CUDA. For example, Triton Inference Server is an open-source NVidia tool that simplifies the deployment of AI models at scale, supporting frameworks such as TensorFlow, PyTorch, and ONNX (also used by Google).

NVidia’s TensorRT is an optimizer for deep-learning workloads on the company’s GPUs. It takes models trained in various frameworks, such as TensorFlow and PyTorch, and prepares them for “production” use, reducing latency and increasing throughput. The models thus become suitable for real-time artificial-intelligence applications such as image classification, object detection, and natural language processing.
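To make the idea of this kind of optimization concrete, here is a toy sketch (not the real TensorRT API — the layer names and fusion rule are invented for illustration) of one technique such optimizers apply: fusing adjacent layers so that fewer separate GPU kernels need to run, which lowers latency.

```python
# Hypothetical illustration of layer fusion, one of the graph optimizations
# an inference optimizer like TensorRT performs. Everything here is a toy
# model: real optimizers work on framework graphs, not lists of strings.

FUSABLE = {("conv", "relu"), ("matmul", "add")}  # hypothetical fusable pairs

def fuse_layers(layers: list[str]) -> list[str]:
    """Greedily merge adjacent fusable pairs into single fused ops."""
    fused, i = [], 0
    while i < len(layers):
        if i + 1 < len(layers) and (layers[i], layers[i + 1]) in FUSABLE:
            fused.append(f"{layers[i]}+{layers[i + 1]}")  # one fused kernel
            i += 2
        else:
            fused.append(layers[i])
            i += 1
    return fused

model = ["conv", "relu", "conv", "relu", "matmul", "add"]
print(fuse_layers(model))  # 6 ops collapse into 3 fused ops
```

Fewer ops mean fewer kernel launches and fewer round trips through GPU memory, which is one of the ways production latency drops without retraining the model.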

Keller brings grist to his own mill

According to Keller, many open-source software development platforms for AI can be used more efficiently than CUDA.

The ambitious plan of Sam Altman, CEO of OpenAI, who wants to raise something like $5,000-7,000 billion (trillions, mind you, not billions…), has literally shocked Keller, who is actively involved with his Tenstorrent in the development of AI and high-performance computing (HPC) processors.

While the industry is still discussing OpenAI’s large-scale projects and analysts are questioning their feasibility, Keller makes a startling claim: he and his company could accomplish what Altman plans with a budget of less than $1 trillion.

Altman’s plan involves a radical expansion of the semiconductor supply chain, with the risk of factory overcapacity and excessive depreciation of processors on the market. Keller agrees with NVidia CEO Huang on one point: it is essential that chips for AI applications become more sophisticated, focusing more on functionality and overall quality than on raw numbers.

“Start by eliminating margin stacking,” Keller wrote, suggesting that OpenAI reduce the overlapping profit margins that each participant in the supply chain adds. He recommends no more than two or three levels. Beyond that, it is crucial to “make the chips much faster so that there is a direct match between hardware and software. It’s harder, but achievable”.
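The arithmetic behind Keller’s “margin stacking” point is simple compounding: each supply-chain level applies its markup on top of the previous level’s price. A minimal sketch, with an invented base cost and invented uniform 30% margins purely for illustration:

```python
# Hypothetical illustration of margin stacking in a chip supply chain.
# The base cost and the 30% per-level margin are invented numbers; the
# point is only that margins compound multiplicatively with each level.

def stacked_price(base_cost: float, margins: list[float]) -> float:
    """Final price after each supply-chain level applies its margin in turn."""
    price = base_cost
    for m in margins:
        price *= 1 + m
    return price

base = 100.0  # hypothetical manufacturing cost

print(f"5 levels at 30% each: {stacked_price(base, [0.30] * 5):.2f}")
print(f"3 levels at 30% each: {stacked_price(base, [0.30] * 3):.2f}")
```

With these made-up figures, five levels multiply the base cost by about 3.7x while three levels multiply it by about 2.2x, which is why Keller’s recommendation to cap the chain at two or three levels translates directly into cheaper silicon at the end of the pipeline.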

Opening image credit: iStock.com – Digital43
