World’s Fastest Supercomputer Can’t Run a Day Without Failure

Building a supercomputer is always challenging, but building the industry’s first exascale-class system means running into wholly unexpected problems and demands extensive work on both hardware and software. Unfortunately, that appears to be the case with Oak Ridge National Laboratory’s Frontier supercomputer, which can barely last a day without numerous hardware failures.

ORNL’s Frontier is the industry’s first system designed to deliver up to 1.685 FP64 ExaFLOPS of peak performance, using AMD’s 64-core EPYC Trento processors, Instinct MI250X compute GPUs, and HPE’s Slingshot interconnects while drawing 21 MW of power. HPE built the system on its Cray EX architecture, which is designed for scale-out applications, primarily ultra-fast supercomputers.

