Intel Promises 3x Performance Boost With Its Next-Gen Sapphire Rapids HBM ‘Xeon Scalable’ CPU Lineup
According to Intel, the Sapphire Rapids-SP will come in two package variants, a standard, and an HBM configuration. The standard variant will feature a chiplet design composed of four XCC dies that will feature a die size of around 400mm2. This is the die size for a singular XCC die and there will be four in total on the top Sapphire Rapids-SP Xeon chip. Each die will be interconnected via EMIB which has a pitch size of 55u and a core pitch of 100u. The standard Sapphire Rapids-SP Xeon chip will feature 10 EMIB interconnects and the entire package will measure at a mighty 4446mm2. Moving over to the HBM variant, we are getting an increased number of interconnects which sit at 14 and are needed to interconnect the HBM2E memory to the cores. When comparing 3rd Gen Intel Xeon Scalable processors to the upcoming Sapphire Rapids HBM processors, we are seeing two- to three-times performance increases across weather research, energy, manufacturing, and physics workloads2. At the keynote, Ansys CTO Prith Banerjee also shows that Sapphire Rapids HBM delivers up to 2x performance increase on real-world workloads from Ansys Fluent and ParSeNet. The four HBM2E memory packages will feature 8-Hi stacks so Intel is going for at least 16 GB of HBM2E memory per stack for a total of 64 GB across the Sapphire Rapids-SP package. Talking about the package, the HBM variant will measure at an insane 5700mm2 or 28% larger than the standard variant. Compared to the recently leaked EPYC Genoa numbers, the HBM2E package for Sapphire Rapids-SP would end up 5% larger while the standard package will be 22% smaller.
Intel Sapphire Rapids-SP Xeon (Standard Package) - 4446mm2 Intel Sapphire Rapids-SP Xeon (HBM2E Package) - 5700mm2 AMD EPYC Genoa (12 CCD Package) - 5428mm2
Intel also states that the EMIB link provides twice the bandwidth density improvement and 4 times better power efficiency compared to standard package designs. Interestingly, Intel calls the latest Xeon lineup Logically monolithic which means that they are referring to the interconnect that’ll offer the same functionality as a single-die would but technically, there are four chiplets that will be interconnected together. You can read the full details regarding the standard 56 core & 112 thread Sapphire Rapids-SP Xeon CPUs here.
Intel Xeon SP Families (Preliminary):
As for the footnotes for the Intel Sapphire Rapids HBM ‘Xeon Scalable’ CPU performance, you can see them below:
Test by Intel as of 04/26/2022. 1-node, 2x Intel® Xeon® Platinum 8360Y CPU, 72 cores, HT On, Turbo On, Total Memory 256GB (16x16GB DDR4 3200 MT/s ), SE5C6200.86B.0021.D40.2101090208, Ubuntu 20.04, Kernel 5.10, 0xd0002a0, ifort 2021.5, Intel MPI 2021.5.1, build knobs: -xCORE-AVX512 –qopt-zmm-usage=high Test by Intel as of 04/19/22. 1-node, 2x Pre-production Intel® Xeon® Scalable Processor codenamed Sapphire Rapids Plus HBM, >40 cores, HT ON, Turbo ON, Total Memory 128 GB (HBM2e at 3200 MHz), BIOS Version EGSDCRB1.86B.0077.D11.2203281354, ucode revision=0x83000200, CentOS Stream 8, Linux version 5.16, ifort 2021.5, Intel MPI 2021.5.1, build knobs: -xCORE-AVX512 –qopt-zmm-usage=high
OpenFOAM
Test by Intel as of 01/26/2022. 1-node, 2x Intel® Xeon® Platinum 8380 CPU), 80 cores, HT On, Turbo On, Total Memory 256 GB (16x16GB 3200MT/s, Dual-Rank), BIOS Version SE5C6200.86B.0020.P23.2103261309, 0xd000270, Rocky Linux 8.5 , Linux version 4.18., OpenFOAM® v1912, Motorbike 28M @ 250 iterations; Build notes: Tools: Intel Parallel Studio 2020u4, Build knobs: -O3 -ip -xCORE-AVX512 Test by Intel as of 01/26/2022 1-node, 2x Pre-production Intel® Xeon® Scalable Processor codenamed Sapphire Rapids Plus HBM, >40 cores, HT Off, Turbo Off, Total Memory 128 GB (HBM2e at 3200 MHz), preproduction platform and BIOS, CentOS 8, Linux version 5.12, OpenFOAM® v1912, Motorbike 28M @ 250 iterations; Build notes: Tools: Intel Parallel Studio 2020u4, Build knobs: -O3 -ip -xCORE-AVX512
WRF
Test by Intel as of 05/03/2022. 1-node, 2x Intel® Xeon® 8380 CPU, 80 cores, HT On, Turbo On, Total Memory 256 GB (16x16GB 3200MT/s, Dual-Rank), BIOS Version SE5C6200.86B.0020.P23.2103261309, ucode revision=0xd000270, Rocky Linux 8.5, Linux version 4.18, WRF v4.2.2 Test by Intel as of 05/03/2022. 1-node, 2x Pre-production Intel® Xeon® Scalable Processor codenamed Sapphire Rapids Plus HBM, >40 cores, HT ON, Turbo ON, Total Memory 128 GB (HBM2e at 3200 MHz), BIOS Version EGSDCRB1.86B.0077.D11.2203281354, ucode revision=0x83000200, CentOS Stream 8, Linux version 5.16, WRF v4.2.2
YASK
Test by Intel as of 05/9/2022. 1-node, 2x Intel® Xeon® Platinum 8360Y CPU, 72 cores, HT On, Turbo On, Total Memory 256GB (16x16GB DDR4 3200 MT/s ), SE5C6200.86B.0021.D40.2101090208, Rocky linux 8.5, kernel 4.18.0, 0xd000270, Build knobs: make -j YK_CXX=‘mpiicpc -cxx=icpx’ arch=avx2 stencil=iso3dfd radius=8, Test by Intel as of 05/03/22. 1-node, 2x Pre-production Intel® Xeon® Scalable Processor codenamed Sapphire Rapids Plus HBM, >40 cores, HT ON, Turbo ON, Total Memory 128 GB (HBM2e at 3200 MHz), BIOS Version EGSDCRB1.86B.0077.D11.2203281354, ucode revision=0x83000200, CentOS Stream 8, Linux version 5.16, Build knobs: make -j YK_CXX=‘mpiicpc -cxx=icpx’ arch=avx2 stencil=iso3dfd radius=8,
Ansys Fluent
Test by Intel as of 2/2022 1-node, 2x Intel ® Xeon ® Platinum 8380 CPU, 80 cores, HT On, Turbo On, Total Memory 256 GB (16x16GB 3200MT/s, Dual-Rank), BIOS Version SE5C6200.86B.0020.P23.2103261309, ucode revision=0xd000270, Rocky Linux 8.5 , Linux version 4.18, Ansys Fluent 2021 R2 Aircraft_wing_14m; Build notes: Commercial release using Intel 19.3 compiler and Intel MPI 2019u Test by Intel as of 2/2022 1-node, 2x Pre-production Intel® Xeon® Scalable Processor code names Sapphire Rapids with HBM, >40 cores, HT Off, Turbo Off, Total Memory 128 GB (HBM2e at 3200 MHz), preproduction platform and BIOS, CentOS 8, Linux version 5.12, Ansys Fluent 2021 R2 Aircraft_wing_14m; Build notes: Commercial release using Intel 19.3 compiler and Intel MPI 2019u8
Ansys ParSeNet
Test by Intel as of 05/24/2022. 1-node, 2x Intel® Xeon® Platinum 8380 CPU, 80 cores, HT On, Turbo On, Total Memory 256GB (16x16GB DDR4 3200 MT/s [3200 MT/s]), SE5C6200.86B.0021.D40.2101090208, Ubuntu 20.04.1 LTS, 5.10, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (2021.4-Product Build 20210904), oneDNN (v2.5.0) Test by Intel as of 04/18/2022. 1-node, 2x Pre-production Intel® Xeon® Scalable Processor codenamed Sapphire Rapids Plus HBM, 112 cores, HT On, Turbo On, Total Memory 128GB (HBM2e 3200 MT/s), EGSDCRB1.86B.0077.D11.2203281354, CentOS Stream 8, 5.16, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (2021.4-Product Build 20210904), oneDNN (v2.5.0)