With the large advances in the performance and power efficiency of its small cores, Intel has been looking into integrating the two types of cores on a single chip. The goal is to offer the best single-thread performance when desired while enabling greater power efficiency when higher throughput is needed, such as in background or heavily multi-threaded tasks. But while the hardware has been possible for some time, the software support needed for smooth, real-time thread migration based on what a thread is currently executing has lagged considerably behind.
This article is part of a series covering Intel’s 2021 Architecture Day:
- Intel’s Gracemont Small Core Eclipses Last-Gen Big Core Performance
- Intel Details Golden Cove For Next-Generation Client and Server CPUs
- Intel Unveils Alder Lake: Next-Generation Mainstream Heterogeneous Multi-Core SoC
- Intel Unveils Sapphire Rapids: Next-Generation Server CPUs
- Intel Introduces Thread Director For Heterogeneous Multi-Core Workload Scheduling
- Intel’s Mount Evans: Intel’s First ASIC DPU
- Intel Unveils Xe HPG – Discrete Graphics For Gamers
- Intel Unveils Xe HPC And Ponte Vecchio
Today at Intel’s 2021 Architecture Day, the company unveiled its Thread Director technology. Thread Director gives the operating system visibility into the detailed behavior of the running workload, sampled directly from the hardware’s performance-monitoring telemetry, which helps the OS make better runtime scheduling decisions. In a way, Thread Director adds many new dimensions to how the OS determines where to schedule a thread. For example, instead of choosing a core based only on whether a task is in the background or foreground, the OS can now receive hints from the hardware indicating that a certain workload should be assigned to a more performant core. Conversely, Thread Director can also hint that a thread currently running on a performance core could be migrated to an efficient core without hurting performance. It’s worth pointing out that core assignment is based on the current context, so a thread that changes behavior (e.g., spin-waiting on some event) may be moved off the performance core and placed on an efficient core until it needs to do more real work.
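To make the idea of context-based hints concrete, here is a minimal toy sketch in Python. It is not Intel's interface or the Windows scheduler: the `ThreadHint` fields, the `preferred_core` function, and the 1.2 threshold are all hypothetical names and values chosen purely for illustration. It only shows how a decision based on the current hint changes when a thread starts spin-waiting.

```python
from dataclasses import dataclass
from enum import Enum


class CoreType(Enum):
    PERFORMANCE = "P"
    EFFICIENT = "E"


@dataclass
class ThreadHint:
    """Hypothetical per-thread hint; not Intel's actual data format."""
    thread_id: int
    # Assumed metric: how much faster this thread runs on a P-core than an E-core.
    perf_gain_on_p_core: float
    is_spin_waiting: bool = False


def preferred_core(hint: ThreadHint, gain_threshold: float = 1.2) -> CoreType:
    """Pick a core type from the current hint only (the 'current context')."""
    if hint.is_spin_waiting:
        # A spinning thread does no real work, so an efficient core is enough.
        return CoreType.EFFICIENT
    if hint.perf_gain_on_p_core >= gain_threshold:
        # The thread benefits noticeably from the bigger core.
        return CoreType.PERFORMANCE
    # Little to gain from a P-core: keep it on (or move it to) an E-core.
    return CoreType.EFFICIENT


worker = ThreadHint(thread_id=1, perf_gain_on_p_core=1.8)
print(preferred_core(worker).name)  # PERFORMANCE

worker.is_spin_waiting = True       # behavior changes: now waiting on an event
print(preferred_core(worker).name)  # EFFICIENT
```

The point of the sketch is that the decision is re-evaluated from the latest hint, so the same thread can be steered to a different core type as its behavior changes.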
Intel says that it has worked closely with Microsoft to enable this feature on the upcoming Windows 11 release.
Intel demonstrated the technology running on Windows 11 on the Alder Lake SoC.
In the demonstration below, which runs typical media/content-creation software, the dark green boxes represent threads executing mostly scalar instructions, while the dark blue boxes represent threads executing mostly vector instructions. Both kinds of threads are prioritized onto the performance cores. Light blue boxes represent background tasks, which are prioritized onto the efficiency cores.
In the demo below, a fully multi-threaded synthetic workload is shown. In this situation, all threads will be distributed across all the available cores.
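The placement behavior described in the two demos can be approximated with a small toy model. The sketch below is an assumption-laden illustration, not the actual Windows 11 or Thread Director policy: the `place_threads` function, the thread kinds (`"scalar"`, `"vector"`, `"background"`), and the core labels are all hypothetical. Foreground threads take performance cores first, background threads take efficiency cores first, and either kind spills over to whatever idle core remains.

```python
def place_threads(threads, p_cores, e_cores):
    """Assign each thread to an idle core.

    threads: list of (name, kind) tuples, where kind is 'scalar',
             'vector', or 'background' (hypothetical labels).
    Returns a dict mapping thread name -> core label such as 'P0' or 'E1'.
    """
    free_p = [f"P{i}" for i in range(p_cores)]
    free_e = [f"E{i}" for i in range(e_cores)]
    placement = {}

    # Stable sort: foreground (scalar/vector) threads are placed before background ones.
    ordered = sorted(threads, key=lambda t: t[1] == "background")

    for name, kind in ordered:
        if kind == "background":
            preferred, fallback = free_e, free_p
        else:
            preferred, fallback = free_p, free_e
        pool = preferred if preferred else fallback
        if not pool:
            break  # no idle cores left; remaining threads would have to time-share
        placement[name] = pool.pop(0)
    return placement


# Mixed workload: foreground work lands on P-cores, background work on E-cores.
mixed = [("indexer", "background"), ("ui", "scalar"), ("encode", "vector")]
print(place_threads(mixed, p_cores=2, e_cores=2))

# Fully multi-threaded workload: threads spread across every available core.
heavy = [(f"t{i}", "scalar") for i in range(4)]
print(place_threads(heavy, p_cores=2, e_cores=2))
```

In the second call there are more foreground threads than performance cores, so the extra threads spill onto the efficiency cores and every core ends up busy, mirroring the synthetic multi-threaded demo.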