Last week Arm hosted its Vision Day. Along with the launch of the ARMv9 architecture, the company took a bit of time to talk about its near-term roadmap going into 2022.
Since the introduction of the Cortex-A72 in 2015, Arm has been introducing a new big core microarchitecture on a consistent yearly cadence. The company’s most recent big core is the Cortex-A78, formerly known as Hercules, which was launched last year. Arm’s next big core is Matterhorn, which will likely be announced in the coming months. Makalu will succeed Matterhorn sometime in 2022.
In the infrastructure space, Arm is reporting that the Neoverse V1 delivers around 2.4x the performance of SoC implementations that use Cortex-A72 cores. It’s worth pointing out that prior to the introduction of the Neoverse N1, most Arm server chips relied on the Cortex-A72 (the “Cosmos” platform) as their core implementation. One major example is Amazon’s AWS Graviton. Amazon has since introduced the AWS Graviton2, which implements Arm’s more recent Neoverse N1.
On the client side, Matterhorn, Arm’s successor to the Cortex-A78, is expected to launch this year. Over the next two generations, Arm is expecting another 30% improvement in IPC (at ISO-process/frequency). In other words, if the two generations contribute roughly equal single-thread performance improvements, we should expect about a 14% IPC improvement in each generation, since the two gains compound.
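The per-generation figure follows from compounding rather than simple halving. As a rough sketch, assuming both generations deliver the same relative uplift (the function name and parameters here are illustrative, not anything Arm published):

```python
def per_generation_gain(cumulative_gain: float, generations: int) -> float:
    """Equal per-generation gain that compounds to the cumulative gain.

    Assumes every generation contributes the same relative uplift,
    which is a simplification for illustration only.
    """
    return (1.0 + cumulative_gain) ** (1.0 / generations) - 1.0


# Arm's stated target: ~30% IPC improvement over the next two generations.
gain = per_generation_gain(0.30, 2)
print(f"Per-generation IPC gain: {gain:.1%}")
```

Two successive gains of that size multiply back out to the 30% total, whereas two 15% gains would compound to slightly more than 30%.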
Dropping 32-bit Support
It’s also worth pointing out that at Arm DevSummit 2020, Arm announced that it plans to drop 32-bit support from its big cores. “From 2022, our big cores will be 64-bit only,” said Paul Williamson, SVP of Client device business at Arm. Arm argues that 64-bit can improve the performance of some workloads by as much as 15-30%. “In game development, 64-bit can provide an overall framerate uplift ranging from 9.5-16.7%,” he added. Since Makalu is planned for 2022, it will most likely be the first core to drop AArch32. What’s interesting is that Arm was specifically referring to the big cores, possibly implying that the small cores will continue to support 32-bit for longer. It’s unclear how, or if, a mixture of 32-bit-capable and 64-bit-only cores will remain supported under the company’s big.LITTLE architecture, which allows mixing of cores. Arm added that it expects to see “64-bit only mobile devices” by 2023, which implies 32-bit support will be dropped altogether.
Full SoC Tuning
Beyond improving the implementation of the cores, Arm says it is working on improving the overall performance of the SoC through the optimization of the cache, memory, and frequency. These parameters are further tweaked by vendors in order to achieve the right power-performance balance. The slide below is meant to highlight the benefit of fine-tuning the entire SoC and the impact of parameters such as cache size and memory latency on final performance. Latency reduction, for example, can deliver up to 1% performance uplift per 5ns. In other words, reducing the latency by 60ns, from 150ns to 90ns as stated on the graph, yields as much as 12% performance uplift. Likewise, doubling the cache size can yield as much as 9% performance uplift. The company pointed out that the difference between a fully system-optimized SoC and one that simply relies on the latest generation of cores without further optimization could be as much as an entire generation on its own.
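The latency arithmetic above can be written out as a quick sketch. It assumes the uplift scales linearly across the quoted range, which is how the 12% figure is derived here, and it treats the result as an upper bound; the function and parameter names are hypothetical:

```python
def latency_uplift_pct(from_ns: float, to_ns: float,
                       pct_per_step: float = 1.0,
                       step_ns: float = 5.0) -> float:
    """Upper-bound performance uplift (in percent) from a latency reduction.

    Assumes a linear "up to 1% per 5ns" relationship; real workloads
    will vary, so this is only a back-of-the-envelope estimate.
    """
    return (from_ns - to_ns) / step_ns * pct_per_step


uplift = latency_uplift_pct(150, 90)
print(f"Up to {uplift:.0f}% uplift from a {150 - 90}ns latency reduction")
```

The same linear rule gives 6% for a 30ns reduction, though nothing in Arm's figures guarantees the relationship holds outside the 150ns to 90ns range shown.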
Arm has not yet disclosed which cores will be the first to implement ARMv9. More details will likely come when Matterhorn is launched sometime later this year.