Formerly codenamed Perseus, the Neoverse N2 is the direct successor to the Neoverse N1. The Neoverse N series is Arm’s mainstream line of infrastructure CPU cores, designed to scale across the entire spectrum of devices, from ultra-low-TDP parts at the edge all the way to cloud server CPUs with TDPs in the hundreds of watts. The N2 follows Arm’s more traditional power and area constraints and takes a more balanced approach to improving performance. Nonetheless, as with the V1, Arm is talking about a 40% IPC uplift between the N1 and the new N2. The Neoverse N2 is also the first Arm CPU to implement the ARMv9 ISA along with the SVE2 extension.
The Neoverse N2 CPU IP is also offered by Arm as a POP IP physical implementation. Arm says that by taking advantage of the latest 5-nanometer process, customers can expect a 40% IPC improvement over the previous Neoverse N1, which was implemented on a 7nm node, at roughly the same power envelope and area. Moving to the latest node is also said to have the potential to improve frequency by about ten percent, although historically customers have not typically gone all out on frequency.
In terms of development timeline, the Neoverse N2 was designed slightly after the Neoverse V1 and was, therefore, able to borrow many of its features. In particular, Arm says that a large portion of the front-end (including the branch prediction unit) is very similar. Many of the very large buffers in the V1 are considerably smaller on the N2, and the corresponding bandwidths have been reduced proportionally. For example, the MOP cache on the Neoverse N2 holds 1.5K entries (half that of the V1) and is identical to the one found in the Cortex-A77 and A78. In fact, for the most part, the Neoverse N2 is very similar to the Cortex-A77, having similar dispatch bandwidths and ROB sizes.
Perhaps the biggest difference between the Neoverse V1 and N2 is the ISA support. The Neoverse N2 is the first infrastructure CPU core to feature the ARMv9 ISA as well as SVE2. In terms of vector capabilities, the Neoverse N2 maintains 2x128b vector units, which are now capable of SVE/SVE2 in addition to NEON/FP.
Vector Performance (FLOPs)
| uArch | Sunny Cove | Zen 3 | Zeus (Neoverse V1) | Perseus (Neoverse N2) |
|---|---|---|---|---|
| EUs | 2 × 512-bit FMA | 2 × 256-bit FMA | 2 × 256-bit FMA | 2 × 128-bit FMA |
| DP FLOPs | 32 FLOPs/clk | 16 FLOPs/clk | 16 FLOPs/clk | 8 FLOPs/clk |
| SP FLOPs | 64 FLOPs/clk | 32 FLOPs/clk | 32 FLOPs/clk | 16 FLOPs/clk |
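The per-clock figures in the table follow from simple arithmetic: the number of FMA units, times the number of lanes per vector, times two, since a fused multiply-add is conventionally counted as two floating-point operations. The following is a minimal sketch of that calculation. The function name `peak_flops_per_clk` and the `cores` dictionary are illustrative and not taken from any Arm, Intel, or AMD material, and the result is a theoretical peak that ignores any issue or bandwidth limits.

```python
def peak_flops_per_clk(units: int, vector_bits: int, element_bits: int) -> int:
    """Theoretical peak FLOPs per clock for `units` FMA pipes.

    Assumes every pipe can issue one full-width FMA per cycle and that
    an FMA counts as two floating-point operations (multiply + add).
    """
    lanes = vector_bits // element_bits  # elements processed per vector
    return units * lanes * 2


# (number of FMA units, vector width in bits), as listed in the table
cores = {
    "Sunny Cove": (2, 512),
    "Zen 3": (2, 256),
    "Zeus": (2, 256),
    "Perseus": (2, 128),
}

for name, (units, width) in cores.items():
    dp = peak_flops_per_clk(units, width, 64)  # double precision
    sp = peak_flops_per_clk(units, width, 32)  # single precision
    print(f"{name}: {dp} DP FLOPs/clk, {sp} SP FLOPs/clk")
```

Halving the vector width from 256 bits to 128 bits halves the lane count, which is why Perseus lands at half of Zeus in both rows.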
Comparing the new Neoverse N2 to the Neoverse V1, Arm says the N2 is expected to be roughly 25% smaller at ISO-process and configuration, which should enable denser packing of cores. Compared to the first-generation Neoverse N1, the N2 is said to deliver around 1.4x the IPC, while the V1 is said to push this further to about 1.5x. All three cores should have the same peak frequency capabilities. At ISO-process and configuration, the N2 is roughly 1.3x the area and 1.45x the power of the N1, whereas the V1 is around 1.7x the area. Since the N1 was designed to take advantage of the 7nm process and the N2 targets a more recent 5nm node, Arm says that most of the area and power differences should be absorbed by the node improvement, ending up at about 1.0x.
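The ratios quoted above can be cross-checked against each other with a few lines of arithmetic. The sketch below normalizes everything to the N1 (1.0x) at ISO-process and uses only the figures given in the text. The "IPC per area" and "IPC per power" values are derived here purely for illustration and are not numbers Arm has published. The function and variable names are hypothetical.

```python
# Arm's quoted ratios relative to the Neoverse N1 at ISO-process
# and configuration (N1 = 1.0x for every metric).
N2 = {"ipc": 1.4, "area": 1.3, "power": 1.45}
V1 = {"ipc": 1.5, "area": 1.7}  # no V1 power ratio is given in the text


def relative_size(core_area: float, reference_area: float) -> float:
    """Area of one core expressed as a fraction of a reference core."""
    return core_area / reference_area


def per_unit(ipc: float, cost: float) -> float:
    """Derived figure of merit: IPC uplift divided by area or power ratio."""
    return ipc / cost


n2_vs_v1_area = relative_size(N2["area"], V1["area"])
print(f"N2 area relative to V1: {n2_vs_v1_area:.2f}x "
      f"({(1 - n2_vs_v1_area) * 100:.0f}% smaller)")

print(f"N2 IPC per area vs N1:  {per_unit(N2['ipc'], N2['area']):.2f}x")
print(f"V1 IPC per area vs N1:  {per_unit(V1['ipc'], V1['area']):.2f}x")
print(f"N2 IPC per power vs N1: {per_unit(N2['ipc'], N2['power']):.2f}x")
```

Dividing 1.3x by 1.7x gives about 0.76x, which is consistent with the "roughly 25% smaller" claim. At ISO-process, the derived per-area numbers suggest the N2 gains slightly more IPC than it costs in area, while the V1 trades area efficiency for peak performance.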
In a direct microarchitectural comparison, the numbers for both the Neoverse N2 and V1 are incredibly impressive. For the Neoverse N2, Arm is claiming an average IPC uplift of 32% across a whole array of benchmarks including SPEC CPU2006, SPEC CPU2017, and various other server and networking workloads. In a direct ISO-frequency comparison on the SPEC CPU2006 benchmark, the Neoverse N2 is said to have a 1.4x uplift in IPC. In various networking workloads, such as those using Nginx, Arm says it observed over 1.55x IPC improvement.
On the Neoverse V1, due to bigger core changes, Arm is touting an average of 48% IPC uplift across a whole array of benchmarks. For the SPEC CPU2006 benchmark, the company says it has observed over 1.5x IPC improvement at ISO-frequency. Since the Neoverse V1 implements wider vector units, for a certain class of workloads (such as those involving crypto and packet processing), the wider 256b SVE units result in an IPC uplift of well over 2x.
Putting Everything Together
Although an enormous amount of work has gone into this for over a decade, for those of us observing things from the outside, it has been only a few short years since Arm first announced Neoverse back at Arm TechCon 2018. At the time, the company outlined its roadmap for the next few generations of infrastructure CPU IP cores – Ares, Zeus, and Poseidon.
“I want to make it very clear that when it comes to the data center, Arm is all-in in terms of the technologies that we are creating and the partnership we are building to enable efficient and low-cost data centers.” – Simon Segars
As part of that announcement, Arm committed itself to significant performance improvements gen-over-gen – far better than the competition’s recent generational improvements. The company promised to deliver a performance improvement of 30% or higher with each generation. Today’s launch of the Neoverse N2 and Neoverse V1 fully validates that effort and commitment. The launch is also changing the dynamics between Arm’s main competition – Intel and AMD – and their own cloud vendors. This poses a significant threat to Intel’s success among the Super 7. The offering of IP CPU cores with exceedingly high performance that can be integrated into their own custom silicon and used as a scale/performance/cost differentiator is highly appealing. Historically, Intel could command higher prices for its top server SKUs due to their uncontested leading performance. Intel’s half-decade-long execution and manufacturing problems have all but eroded that position. We have already seen Amazon AWS take this approach, and it’s inconceivable that the other six have not at least run the numbers.