OpenAI and Broadcom announce chip designed for LLM inference at scale


OpenAI, the company behind ChatGPT, Codex, and the models that power those tools, and Broadcom, an established silicon supplier, have announced a new chip called Jalapeno, designed specifically for large language model (LLM) inference in data centers.

The chip is intended for deployment in large data centers, and both companies describe it as the first generation in a long-term project in which the chips will be refined over time.

Broadcom says this ASIC (Application-Specific Integrated Circuit) was designed for LLM inference based on “detailed insights” from the company’s conversations with OpenAI researchers, and that the development of the chip was informed by OpenAI’s own roadmap for future models and products. The design and production of the chip took nine months.

The promise is that this chip is better tailored to the current needs of LLMs than the inference systems running in existing data centers.

OpenAI claims that “initial testing shows that Jalapeno will deliver significantly better performance per watt than the current state-of-the-art,” but notes that formal performance measurement has not yet been done, and that “a detailed technical report will be presented in the coming months.”


