OpenAI unveils its first custom chip, built by Broadcom

By AI Maestro | June 24, 2026 | 2 min read

OpenAI has unveiled its first custom inference processor, named Jalapeño, with manufacturing handled by Broadcom.

The new chip was designed specifically for OpenAI’s inference systems, and the company stated that its own AI models assisted in the development process.

While testing is still underway, early results indicate better performance-per-watt than current state-of-the-art alternatives.

The partnership was officially announced in October, though rumors of a custom chip, intended to reduce OpenAI’s reliance on Nvidia’s GPUs, had circulated well before then. Google and Amazon have both built custom chips for a similar purpose. Such chips are often called “AI accelerators”: silicon designed specifically to speed up machine learning workloads.

OpenAI president Greg Brockman explained the company’s approach to chip development on its in-house podcast, shortly after the Broadcom partnership was announced.

“We have a deep understanding of the workload,” Brockman said in the episode. “We’ve really been looking for specific workloads that are underserved, [and asking] how can we build something that will be able to accelerate what’s possible?”

Jalapeño is designed specifically for inference, the process of running already-trained AI models in response to user requests. In the announcement, OpenAI emphasized the chip’s low operating cost when running real-time coding models. More compute-intensive tasks like pre-training will likely still rely on Nvidia hardware, but even small reductions in inference costs could meaningfully improve the company’s bottom line.

Optimizing inference may prove to be a crucial factor in the economics of AI going forward, and that optimization is likely to take place at every level of the stack. OpenAI already builds agentic products like Codex and the models that power them, as well as data centers to run those models. Moving into purpose-built chips lets the company go a step further, as it explained in its announcement.

“OpenAI is not only developing frontier models or building products on top of them; it is designing the infrastructure underneath them: chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience,” the company wrote. “Because OpenAI operates across the stack, each layer can be optimized around the same goal: making its models faster, more reliable, and more affordable for users.”

What it means

This move signals a shift in how OpenAI handles its own infrastructure. By controlling the silicon, the company aims to lower the cost of running models in real time, particularly for coding tasks. While heavy training workloads will likely remain on Nvidia hardware, cheaper inference could improve profitability. The strategy is to build the entire stack, from the underlying chips to the final product experience, so that every layer works toward the same efficiency goals.
