Intel Officially Launches Sapphire Rapids and HPC-optimized Max Series

By Tiffany Trader

January 10, 2023

After a number of delays, Intel has launched its fourth-generation Intel Xeon Scalable processor, codenamed Sapphire Rapids, the successor to Ice Lake. Manufactured on the Intel 7 node (formerly known as 10nm) and sporting up to 60 Golden Cove cores per processor plus new dedicated accelerator cores, the platform offers a 1.53x average performance gain over the prior generation and a 2.9x average performance per watt efficiency improvement for targeted workloads using the new accelerators, according to Intel.

The launch, held today as a global livestreamed watch party, also included the recently remonikered Max series CPU and GPU, which were previously called “Sapphire Rapids HBM” and “Ponte Vecchio,” respectively. 

The Sapphire Rapids family includes 52 SKUs (see chart) grouped across 10 segments, inclusive of the Max series: 11 are optimized for 2-socket performance (8 to 56 cores, 150-350 watts), 7 for 2-socket mainline performance (12 to 36 cores, 150-300 watts), 10 target four- and eight-socket systems (8 to 60 cores, 195-350 watts), and there are 3 single-socket optimized parts (8 to 32 cores, 125-250 watts). There are also SKUs optimized for cloud, networking, storage, media and other workloads.

The lineup for the “HPC Optimized” Xeon Max series SKUs includes 32-, 40-, 48-, 52- and 56-core versions. All five of these 2-socket parts top out at 350 watts, and list pricing runs from $7,995 for the 32-core 9462 to $12,980 for the 56-core 9480. There are two SKUs more expensive than the 9480 Max series: the 60-core 8490H, which runs a cool $17,000, and the 48-core 8460H at $13,923.
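For readers comparing these list prices, a quick per-core calculation can help. The sketch below uses only the four core counts and prices quoted above; the dictionary layout and variable names are illustrative, and list price per core is only a rough proxy that ignores memory type, clocks and socket scalability.

```python
# Core counts and list prices (USD) as quoted in the article.
skus = {
    "Max 9462": (32, 7_995),
    "Max 9480": (56, 12_980),
    "8490H": (60, 17_000),
    "8460H": (48, 13_923),
}

# List price divided by core count for each SKU.
price_per_core = {name: price / cores for name, (cores, price) in skus.items()}

# Print from cheapest to most expensive per core.
for name, ppc in sorted(price_per_core.items(), key=lambda kv: kv[1]):
    print(f"{name:>9}: ${ppc:,.2f} per core")
```

On these figures, the 56-core Max 9480 comes out lowest per core and the 48-core 8460H highest.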

Intel Max series CPU SKUs

At a press event in Hillsboro, Oregon, last month, Intel Senior Fellow Ronak Singhal referenced the wide span of SKUs, saying: “Customers will say you guys have too many SKUs, can you guys reduce the number of SKUs, but can you add these three SKUs that are really, really important? So we have this push and pull with our customers.”

New capabilities in fourth-gen Intel Xeon Scalable processors include PCIe 5.0, DDR5 memory, and support for CXL 1.1.

Source: Intel

The 56-core 8480+ top-of-bin two-socket (non-HBM) part – with 40% more cores than its Ice Lake counterpart – achieved gen-over-gen performance uplifts across a number of benchmarks, delivering a 1.5x improvement on Stream Triad, a 1.4x improvement for HPL and a 1.6x improvement on HPCG. Intel testing across a dozen-plus real-world applications (including WRF, Black Scholes, Monte Carlo and OpenFoam) showed similar speedups, with the greatest gain for a physics workload, CosmoFlow (2.6x).

Max series CPU (photo taken at December 2022 Hillsboro, Oregon, press event)

The Max series CPU is the first x86 processor with integrated High Bandwidth Memory. It offers a 3.7x gain in performance for memory-bound workloads, according to Intel, and requires 68 percent less energy than “deployed competitive systems.” On the AlphaFold2 application, the Xeon Max CPU showed a 3x speedup over the Ice Lake processor in Intel testing. Notable for HPC benchmark watchers, the Max series processor achieves a nearly 2.4x speedup on HPCG and a 3.5x speedup for Stream Triad, compared with the DDR-only Sapphire Rapids equivalent. The HBM in the Max series CPU offered no performance improvement for the High Performance Linpack benchmark.

Max series GPU products and form factors

The Max series “Ponte Vecchio” GPU, also launched today, contains over 100 billion transistors in a 47-tile package with up to 128 Xe HPC cores. Depending on the form factor, it supports up to 128GB HBM2e memory and delivers up to 52 peak FP64 teraflops. Combining the Max series GPU with the Max series CPU platform (in a three-to-one GPU:CPU ratio) offers a 12.9x performance boost for LAMMPS molecular dynamics workloads, compared with an Ice Lake platform without GPUs, according to benchmarking conducted by Intel. The addition of Max GPUs (six GPUs added to a 2-CPU server) translated into a 9.9x boost versus a Max series CPU-only platform for the same workload. The high bandwidth memory on the host CPUs enabled a 1.55x performance improvement compared to using DDR5 only. (Photo of demonstration given in Hillsboro, Oregon, last month.)
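The two LAMMPS figures use different baselines, so it can be useful to see what they imply together. The sketch below divides one by the other to estimate the Max series CPU-only gain over Ice Lake. This is a derived number, not one Intel reported, and it assumes both speedups were measured on the same GPU-equipped configuration and the same problem.

```python
# Speedups quoted in the article (Intel benchmarking, LAMMPS).
gpu_platform_vs_ice_lake = 12.9      # Max CPU + Max GPU vs. Ice Lake without GPUs
gpu_platform_vs_max_cpu_only = 9.9   # Max CPU + Max GPU vs. Max CPU only

# ASSUMPTION: both ratios share the same numerator (same GPU platform and
# workload), so dividing them cancels it and leaves Max CPU-only vs. Ice Lake.
implied_max_cpu_vs_ice_lake = gpu_platform_vs_ice_lake / gpu_platform_vs_max_cpu_only

print(f"Implied Max CPU-only speedup over Ice Lake: {implied_max_cpu_vs_ice_lake:.2f}x")
```

Under that assumption the CPU-only platform would be roughly 1.3x faster than Ice Lake on this workload, with most of the 12.9x coming from the GPUs.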

Both Max series parts were originally expected to debut in the Aurora supercomputer, but because of delays, the initial deployment is using the non-HBM Sapphire Rapids in addition to the Max series “Ponte Vecchio” GPUs. The HBM-equipped Max series CPU will now debut in the HPE-built Crossroads supercomputer, which is under construction at Los Alamos National Lab. Researchers there are reporting performance improvements of up to 8.6x for pre-production Intel Max hardware over Intel Broadwell-generation HPC systems at LANL with no code changes. The average improvement seen is 4x, according to Jim Lujan, HPC Platforms/Projects Program Director, LANL.

Max series CPU products have also been selected for CTS-2 systems at Lawrence Livermore National Laboratory and Sandia National Laboratories, and for the Camphor 3 supercomputer at Kyoto University, with Dell as the server partner for both projects. Argentina is getting ready to deploy a Max+Max system from Lenovo for the country’s National Meteorological Service this spring.

Aurora system blade architecture (2 Sapphire Rapids CPUs and 6 Max series GPUs)

The Max series CPUs are now part of an upgrade path for Aurora at Argonne National Laboratory. The Intel/HPE system currently being installed has 20,000 Sapphire Rapids CPUs and 60,000 Max series GPUs in a form factor that Intel calls the exascale compute platform, or ECP (a clear nod to the Exascale Computing Project). The lab plans to swap in the Max CPU HBM parts this year. Pasting in the new CPUs could take on the order of 5,000 hours, according to an Intel person familiar with the project who figured on it taking about 30 minutes per blade (x10,000 blades).
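The swap-time estimate is simple arithmetic, and the blade count can be cross-checked against the CPU and GPU totals. The sketch below uses the article's figures, taking 2 CPUs and 6 GPUs per blade from the Aurora blade description. The `parallel_crews` value is purely hypothetical, included only to show how the serial estimate would shrink if work proceeded on several blades at once.

```python
# Figures quoted in the article.
total_cpus = 20_000
total_gpus = 60_000
cpus_per_blade = 2       # Aurora blade: 2 Sapphire Rapids CPUs
gpus_per_blade = 6       # Aurora blade: 6 Max series GPUs
minutes_per_blade = 30   # rough estimate from the Intel source

# Both totals should imply the same number of blades.
blades_from_cpus = total_cpus // cpus_per_blade
blades_from_gpus = total_gpus // gpus_per_blade

# Serial labor estimate: one blade at a time.
total_hours = blades_from_cpus * minutes_per_blade / 60

# HYPOTHETICAL: number of crews working in parallel (not from the article).
parallel_crews = 10
elapsed_hours = total_hours / parallel_crews

print(f"Blades: {blades_from_cpus:,} (CPU count) / {blades_from_gpus:,} (GPU count)")
print(f"Total labor: {total_hours:,.0f} hours")
print(f"Elapsed with {parallel_crews} crews (hypothetical): {elapsed_hours:,.0f} hours")
```

Both totals give 10,000 blades, and 10,000 blades at 30 minutes each works out to the 5,000 hours cited.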

A testbed for evaluating and debugging the technologies for the 2-plus-exaflops-peak Aurora system is located at the Jones Farm site in Hillsboro, Oregon. Called Borealis, it is a two-rack, 128-blade system – with another one-rack, 64-blade system providing additional testing opportunities. Borealis has a twin system named Sunspot that is installed and operational at Argonne. Sunspot is the test and development system for the Aurora supercomputer, which is slated to launch this year at Argonne. Intel is currently updating Borealis with the Max series CPUs.

The Borealis system at Intel’s HPC lab in Hillsboro, Oregon. Credit: Intel.

Built-in acceleration and new licensing options

Sapphire Rapids introduces four new dedicated accelerators (in addition to AVX-512, which debuted with the Xeon Phi “Knights Landing” product in 2016): 

Intel Advanced Matrix Extensions (Intel AMX) accelerates deep learning (DL) inference and training workloads, such as natural language processing (NLP), recommendation systems, and image recognition.

Intel Data Streaming Accelerator (Intel DSA) drives high performance for storage, networking, and data-intensive workloads by improving streaming data movement and transformation operations.

Intel In-Memory Analytics Accelerator (Intel IAA) improves analytics performance while offloading tasks from CPU cores to accelerate database query throughput and other workloads.

Intel Dynamic Load Balancer (Intel DLB) provides efficient hardware-based load balancing by dynamically distributing network data across multiple CPU cores as the system load varies.

With a new service called Intel On Demand (formerly referred to as software-defined silicon, SDSi), customers will have the option to have some of these accelerators turned on or upgraded post-purchase. “On Demand will give end customers the flexibility to choose fully featured premium SKUs or the opportunity to add features at any time throughout the lifecycle of the Xeon processor,” Intel stated. Pricing will vary depending on the license model. On Demand currently applies to the following features: Intel Dynamic Load Balancer, Intel Data Streaming Accelerator, Intel In-Memory Analytics Accelerator, Intel QuickAssist Technology and Intel Software Guard Extensions. Note that the Max series CPUs and the socket-scalable (-H tagged) SKUs do not have On Demand capability; nor does the 8-core single-socket part (3408U).

Sapphire Rapids ecosystem partners include AWS, Cisco, Dell Technologies, Fujitsu, Google Cloud, HPE, IBM Cloud, Inspur, Lenovo, Microsoft Azure, Nvidia, Oracle, Supermicro, VMware and others. Intel reports more than 30 Max series CPU system designs are coming to market and 15 system designs based on the Max series GPU are also in development.
