Ten years ago, Arm launched a new 64-bit architecture. ARMv8 did more than just extend the virtual address space of ARMv7: it cleaned up and streamlined the architecture and eliminated legacy quirks. ARMv8 has proven to be an extremely successful architecture, gaining numerous new features over the following decade. Today, Arm is launching its successor, ARMv9, for the next decade.
ARMv8 (specifically the A profile) has undergone significant enhancements over the last few years. Arm works closely with partners on defining those features. Related features are typically bundled together and introduced as new decimal versions (e.g., v8.1, v8.2, etc.) on a consistent cadence. Arm also provides slightly modified ARMv8 ISAs called “profiles” for different classes of devices: Application (A) for general-purpose processors, Real-Time (R), and Microcontroller (M). This article focuses on the A profile, since that’s the main ISA found in Arm-based mobile, desktop, and server computers and is the only ARMv9 profile Arm is launching today. Arm stated that other ARMv9 profiles are planned for the future.
The first few revisions (8.1 and 8.2) concentrated on general enhancements. v8.2 also added the Scalable Vector Extension (SVE), although it was made optional and support was limited to a few cores.
With v8.3 through v8.5, new security-related features were added. Some were a response to the Meltdown and Spectre vulnerabilities, while others try to address decades-old memory-related vulnerabilities such as buffer overflows. For example, ARMv8.3 added Pointer Authentication, an optional extension that allows the address in a register to be authenticated before it is used for a data reference or an indirect branch. The extension introduced new instructions for signing and authenticating pointers against a chosen context (e.g., a valid return address within a stack frame). A signed pointer carries a Pointer Authentication Code (PAC), which is embedded in the otherwise unused upper bits of the pointer value itself and is used to tell valid pointers from invalid ones. Pointer Authentication quickly made its way into Apple’s A12 SoC.
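The sign-then-authenticate flow can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the 48-bit address width, the 16-bit PAC field, the HMAC-SHA256 stand-in for the hardware's cipher, and the names `sign_pointer` and `authenticate_pointer` are all illustrative, not the architecture's actual algorithm or instruction semantics.

```python
import hashlib
import hmac

ADDRESS_BITS = 48   # assumed virtual address width (illustrative)
PAC_BITS = 16       # assumed size of the PAC field in the unused upper bits
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def _compute_pac(address: int, context: int, key: bytes) -> int:
    """Derive a short code from the address, a context value, and a secret key."""
    message = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[: PAC_BITS // 8], "little")


def sign_pointer(pointer: int, context: int, key: bytes) -> int:
    """Return the pointer with a PAC placed in its upper bits."""
    address = pointer & ADDRESS_MASK
    return (_compute_pac(address, context, key) << ADDRESS_BITS) | address


def authenticate_pointer(signed: int, context: int, key: bytes) -> int:
    """Check the PAC and return the plain address; raise if it does not match."""
    address = signed & ADDRESS_MASK
    pac = signed >> ADDRESS_BITS
    if pac != _compute_pac(address, context, key):
        raise ValueError("pointer authentication failed")
    return address


# Hypothetical usage: sign a return address using the stack pointer as context.
key = b"per-process secret key"
return_address = 0x7FFF12345678
stack_pointer = 0x7FFFFFFFE000

signed = sign_pointer(return_address, stack_pointer, key)
assert authenticate_pointer(signed, stack_pointer, key) == return_address
```

An attacker who overwrites the stored return address without knowing the key would have to guess a matching PAC, so in this model a corrupted pointer is rejected at authentication time rather than silently followed.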
ARMv8.3 through v8.5 also added support for Secure Exception Level 2 (Secure EL2) and for additional cryptographic hashing algorithms. Nested virtualization (a guest hypervisor running at EL1) and NEON complex-number instructions were added in the same period. ARMv8.5 added a number of new ways of addressing speculative execution attacks. It also added memory tagging, which restricts memory accesses to pointers that carry the correct tag, and Branch Target Identification (BTI), an extension that allows indirect branches to land only on a valid set of destinations.
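A toy Python model shows how a tag check catches a buffer overflow. It is a simplified sketch, not the hardware's behavior in full: the `TaggedMemory` class, the `TagCheckFault` exception, and the byte-sized loads are illustrative assumptions. The 16-byte granule and the 4-bit tag held in pointer bits 59:56 follow the Memory Tagging Extension's layout.

```python
GRANULE = 16      # memory is tagged in 16-byte granules
TAG_SHIFT = 56    # the 4-bit tag sits in pointer bits 59:56
TAG_MASK = 0xF


class TagCheckFault(Exception):
    """Raised when a pointer's tag does not match the tag of the memory it hits."""


class TaggedMemory:
    """Toy memory whose granules each carry a 4-bit allocation tag."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)

    def allocate(self, start: int, length: int, tag: int) -> int:
        """Tag every granule of a region and return a pointer carrying that tag."""
        first = start // GRANULE
        last = (start + length - 1) // GRANULE
        for granule in range(first, last + 1):
            self.tags[granule] = tag & TAG_MASK
        return ((tag & TAG_MASK) << TAG_SHIFT) | start

    def load(self, pointer: int) -> int:
        """Read one byte, faulting when pointer tag and memory tag differ."""
        address = pointer & ((1 << TAG_SHIFT) - 1)
        pointer_tag = (pointer >> TAG_SHIFT) & TAG_MASK
        if pointer_tag != self.tags[address // GRANULE]:
            raise TagCheckFault(f"tag mismatch at {address:#x}")
        return self.data[address]


# Hypothetical usage: two adjacent 16-byte buffers with different tags.
memory = TaggedMemory(64)
buffer = memory.allocate(start=0, length=16, tag=0x3)
neighbour = memory.allocate(start=16, length=16, tag=0x9)

memory.load(buffer + 15)  # last byte of the buffer: tags match
try:
    memory.load(buffer + 16)  # one byte past the end lands in the neighbour
except TagCheckFault as fault:
    print("overflow caught:", fault)
```

Because adjacent allocations are given different tags here, the out-of-bounds read is stopped at the first byte past the buffer instead of quietly returning the neighbour's data.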
“The Memory Tagging Extension (MTE) will be an integral part of the first generation of ARMv9 CPUs available in the next year, and software support for using memory tagging is being introduced as part of Android 11 and openSUSE,” – Richard Grisenthwaite, Arm SVP and Chief Architect.
This leads us to ARMv9, which Arm is launching today. ARMv9 is fully compatible with ARMv8 and introduces a number of major extensions designed to improve security and to provide specialized acceleration for better performance and efficiency on DSP and ML workloads. Like its predecessor, ARMv9 will cover all three profiles (A, M, and R) in the future.
As far as features and extensions go, ARMv9.0’s starting point is ARMv8.5. Since not all features suit all markets and devices, some ARMv8.x features were made optional. Some features that started out optional were made mandatory in later versions of ARMv8.x, so we expect some of today’s optional features to become mandatory later on in ARMv9.x as well. That said, since the ARMv9.0 starting point is ARMv8.5, all mandatory features in ARMv8.5 are mandatory in ARMv9.0. Incremental enhancements will continue for v9.x in the same manner as Arm has done with ARMv8.x.
To improve DSP and ML workloads, ARMv9 will introduce a number of new vector extensions. ARMv9 relies on the second-generation Scalable Vector Extension (SVE2) as the foundation for those extensions. SVE2 extends the capabilities of SVE with operations that are much more orthogonal to NEON. As the name implies, the extension uses a vector-length-agnostic programming model: the architecture specification lets each implementation choose its own vector width, anywhere from 128 to 2,048 bits.
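The vector-length-agnostic idea can be sketched in plain Python. This is a simulation rather than real SVE code: the function `vla_add`, its `vector_bits` parameter, the 32-bit element size, and the restriction to multiples of 128 bits are assumptions made for illustration. The point is that the loop never hard-codes a width. It asks how many lanes the "hardware" has and masks off the lanes that run past the end of the array.

```python
def vla_add(a: list, b: list, vector_bits: int) -> list:
    """Add two equal-length arrays with a loop that works for any vector width."""
    # Assumed constraint for this sketch: widths are multiples of 128 bits.
    if vector_bits % 128 or not 128 <= vector_bits <= 2048:
        raise ValueError("vector width must be 128..2048 bits in steps of 128")

    lanes = vector_bits // 32  # number of 32-bit elements per vector
    n = len(a)
    result = [0.0] * n
    i = 0
    while i < n:
        # Predicate: True for lanes still inside the array (handles the tail).
        active = [i + lane < n for lane in range(lanes)]
        for lane, enabled in enumerate(active):
            if enabled:
                result[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by however many lanes this implementation has
    return result


# Hypothetical usage: the same code gives the same answer at every width.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
assert vla_add(a, b, 128) == vla_add(a, b, 512) == vla_add(a, b, 2048)
```

A narrow implementation simply takes more loop iterations than a wide one, so one binary could in principle run unchanged on both, which is what separates this model from fixed-width SIMD such as NEON.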
Currently, only the Fujitsu A64FX CPU supports SVE. Last year, Arm announced that the Neoverse N2 and V1 will be the first CPUs from Arm to support SVE. Those CPUs are expected to launch this year. Arm says that over the next few years it plans to extend SVE further with “substantial enhancements in performing matrix-based calculations.”
ARMv9’s emphasis on SVE makes it clear that this will likely be the way forward. Arm says it expects all new vector code to be written in SVE2, with NEON retained only for legacy reasons.
Confidential Compute Architecture (CCA)
A second component of ARMv9, to be introduced in the future, is the Arm Confidential Compute Architecture, or Arm CCA. The ultimate goal is to protect software data while it is in use, on whatever device the software is deployed on. CCA is designed to get rid of the old assumption that privileged software, be it the underlying operating system or even the hypervisor, needs access to private software data.
According to Arm, the Confidential Compute Architecture introduces a new concept called Realms. These are dynamically created, secure, private enclaves separated from both the existing secure and non-secure worlds. Realms, in theory, are entirely protected from any form of inspection or access by any piece of software, or even the host, through the use of a fourth, entirely private address space. “Realms use a small amount of trust in the testable management software that is inherently separate from the operating system and hypervisor,” said Grisenthwaite. In essence, Realms should be able to protect the user’s data even if the device’s operating system is compromised.
More detailed information on Arm CCA will be disclosed later this year, around summer. Arm expects the Arm CCA extension to show up in silicon implementations within the next two to three years.