AMD Instinct MI300A APU Enters Volume Production: Up To 4X Faster Than NVIDIA H100 In HPC, Twice As Efficient

In addition to the MI300X for AI, AMD is also announcing that its Instinct MI300A APU has entered volume production and is expected to offer the world’s fastest HPC performance when it launches next year.

AMD Propels HPC To The Next Level With Instinct MI300A APUs, 4X Faster & 2X More Efficient Than NVIDIA H100

We have waited for years for AMD to finally deliver on the promise of an Exascale-class APU and the day is nearing as we move closer to the launch of the Instinct MI300A. Today, AMD confirmed that the MI300A APU entered volume production this quarter and is on the path to becoming the world’s fastest HPC solution when it becomes available in 2024.

The AMD Instinct MI300A APU combines several architectures and interconnect technologies, with Zen 4, CDNA 3, and the 4th Gen Infinity Architecture at the forefront. Some of the highlights of the MI300A APUs include:

  • Up To 61 TFLOPS FP64 Compute
  • Up To 122 TFLOPS FP32 Compute
  • Up To 128 GB HBM3 Memory
  • Up To 5.3 TB/s Memory Bandwidth
  • 146 Billion Transistors

The packaging on the MI300A is very similar to that of the MI300X, except that it uses TCO-optimized memory capacities and adds Zen 4 cores. So let’s get down to the details of this exascale horsepower for next-gen HPC and AI data centers.

AMD Instinct MI300A Accelerator.

One of the active dies has two CDNA 3 GCDs cut out and replaced with three Zen 4 CCDs, which bring their own separate pool of cache and core IPs. You get 8 cores and 16 threads per CCD, so that’s a total of 24 cores and 48 threads on the active die. There’s also 24 MB of L2 cache (1 MB per core) and a separate pool of L3 cache (32 MB per CCD, 96 MB in total). It should be remembered that the CDNA 3 GCDs have their own separate L2 cache.
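As a quick sanity check of the CPU-side totals quoted above, here is a minimal sketch that derives them from the per-CCD figures. The function and parameter names are hypothetical, chosen only for illustration, and the per-CCD cache pool is assumed to be the CCD's L3.

```python
def cpu_totals(ccds=3, cores_per_ccd=8, threads_per_ccd=16,
               l2_mb_per_core=1, ccd_cache_mb=32):
    """Derive package-level CPU totals from the per-CCD figures in the article.

    All names here are illustrative; the default values are the ones quoted
    in the text (three Zen 4 CCDs, 8 cores / 16 threads each, 1 MB L2 per
    core, 32 MB of separate cache per CCD).
    """
    cores = ccds * cores_per_ccd
    return {
        "cores": cores,
        "threads": ccds * threads_per_ccd,
        "l2_mb": cores * l2_mb_per_core,
        # Assumption: the "separate pool of cache" is the per-CCD L3.
        "ccd_cache_mb": ccds * ccd_cache_mb,
    }


totals = cpu_totals()
print(totals)
```

With the article's figures this gives 24 cores, 48 threads, and 24 MB of L2, matching the text, plus 96 MB for the three per-CCD cache pools combined.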

AMD Instinct MI300A Accelerator with CDNA 3 & Zen 4 dies.

On the GPU side, AMD has enabled a total of 228 Compute Units based on the CDNA 3 architecture, which equals 14,592 cores. That’s 38 Compute Units per GPU chiplet. Summing up the highlighted features of the AMD Instinct MI300 Accelerators, we have:

  • First Integrated CPU+GPU Package
  • Aimed At The Exascale Supercomputer Market
  • AMD MI300A (Integrated CPU + GPU)
  • 146 Billion Transistors
  • Up To 24 Zen 4 Cores
  • CDNA 3 GPU Architecture
  • 228 Compute Units (14,592 cores)
  • Up To 128 GB HBM3 Memory
  • Up To 8 Chiplets + 8 Memory Stacks (5nm + 6nm process)
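The GPU figures quoted above (228 Compute Units, 38 per GPU chiplet, 14,592 cores) can be cross-checked with a little arithmetic. The sketch below only divides the article's own numbers; the variable names are hypothetical, and the derived values (number of GPU chiplets, cores per Compute Unit) are implied by those numbers rather than stated in the text.

```python
# Figures quoted in the article.
TOTAL_CUS = 228              # enabled CDNA 3 Compute Units
CUS_PER_GPU_CHIPLET = 38     # Compute Units per GPU chiplet
TOTAL_CORES = 14_592         # total GPU cores

# divmod returns (quotient, remainder); a zero remainder means the
# quoted figures divide evenly and are consistent with each other.
gpu_chiplets, cu_remainder = divmod(TOTAL_CUS, CUS_PER_GPU_CHIPLET)
cores_per_cu, core_remainder = divmod(TOTAL_CORES, TOTAL_CUS)

print(f"GPU chiplets implied: {gpu_chiplets} (remainder {cu_remainder})")
print(f"Cores per Compute Unit implied: {cores_per_cu} (remainder {core_remainder})")
```

Both divisions come out even: the numbers imply six GPU chiplets and 64 cores per Compute Unit.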

Coming to the performance figures, AMD once again compared the MI300A against the H100, but this time in HPC-specific workloads. In OpenFOAM, the Instinct MI300A APU delivers up to 4x the performance, an uplift that comes mainly from the unified memory layout, GPU performance, and overall memory capacity and bandwidth. The system also offers up to 2x the performance per watt when compared to NVIDIA’s Grace Hopper Superchips.

AMD also confirmed that the Instinct MI300A APUs are now shipping and will also be used to power the next-gen El Capitan supercomputer, which is expected to deliver up to 2 Exaflops of compute. It should be mentioned that AMD is the only company to have broken past the 1 Exaflop barrier, with the Frontier supercomputer, which is also the most efficient system on the planet.
