Intel Unveils the Tremont Microarchitecture: Going After ST Performance

Stephen Robinson Presenting Tremont

Among the many reveals made by Intel at the 2018 Intel Architecture Day was the company’s high-efficiency small core roadmap. At the time, Raja Koduri said that the common theme for all upcoming small cores is improved single-thread performance. Koduri noted that the small cores have much more room to grow and that Intel was pursuing an aggressive roadmap.

As per the roadmap, the next planned small core was Tremont. Intel says that Tremont will be shipping in a whole range of products. The first announced product to feature Tremont is Lakefield, but other products will be announced in the future.

 

Today, Intel is unveiling the details of the Tremont microarchitecture at the Linley Fall Processor Conference.

Design Targets

Tremont is Intel’s next-generation low-power x86 core. It is designed to go into a whole range of products, from IoT to mobile to edge and microserver SoCs. The first and foremost focus was single-threaded performance: running all recent x86 software with good performance was important. An additional vector of optimization involved networking (and various multi-core use cases), which influenced some of the overall design goals. For that reason, power per core and area per core were two important additional design targets.

High Level Changes

For a small core, Tremont is very beefy. The front end is twice as wide as that of Goldmont Plus: a 6-wide decoder split into two clusters (we go over the details on the next page) that can decode up to 6 instructions each cycle. The core can allocate and retire 4 instructions each cycle. Intel also upgraded the branch predictor, which it says is very similar to the ones found on its big cores.
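As a rough illustration of how these widths interact, the sketch below treats each stage as a simple per-cycle cap and takes the minimum. The figures are the ones quoted above; the stage names, the dictionary layout, and the min-of-widths model are assumptions made for illustration, not Intel's description of the pipeline, and real throughput depends on many other factors.

```python
# Per-cycle widths as quoted in the article. The names are illustrative only.
TREMONT_WIDTHS = {"decode": 6, "allocate": 4, "retire": 4}


def sustained_width(widths):
    """Upper bound on instructions per cycle: the narrowest stage wins."""
    return min(widths.values())


def narrowest_stages(widths):
    """Names of the stages that set the upper bound, in sorted order."""
    limit = sustained_width(widths)
    return sorted(name for name, width in widths.items() if width == limit)


if __name__ == "__main__":
    print(sustained_width(TREMONT_WIDTHS))
    print(narrowest_stages(TREMONT_WIDTHS))
```

Under this simplified model, the wider decoder does not raise the sustained ceiling by itself; the ceiling is set by whichever stage is narrowest.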

At the back end, things have been widened to match. There are now 10 execution ports. On the memory side, Tremont has dual load/store pipelines, so it can handle 2 loads, 2 stores, or 1 of each. The typical configuration for Tremont is a quad-core module, but that can vary by product. Tremont has a shared L2 cache that, depending on the product, can be configured for up to 4.5 MiB. Likewise, depending on the product, there may be an additional last-level cache on top of this, which Tremont also supports.
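The memory and cache figures above lend themselves to a quick back-of-the-envelope check. The sketch below enumerates the load/store mixes that fill both memory pipelines and works out the L2 share per core for a quad-core module at the maximum configuration. The function names are hypothetical, and the even per-core split is purely arithmetic: the L2 is shared, not partitioned.

```python
MEM_PIPES = 2          # dual load/store pipelines, per the article
MAX_L2_MIB = 4.5       # largest L2 configuration mentioned
CORES_PER_MODULE = 4   # typical quad-core module


def full_issue_mixes(pipes=MEM_PIPES):
    """All (loads, stores) pairs that occupy every memory pipeline."""
    return [(loads, pipes - loads) for loads in range(pipes, -1, -1)]


def l2_share_per_core(l2_mib=MAX_L2_MIB, cores=CORES_PER_MODULE):
    """Average L2 capacity per core in MiB (the cache itself is shared)."""
    return l2_mib / cores


if __name__ == "__main__":
    print(full_issue_mixes())
    print(l2_share_per_core())
```

With these numbers, the mixes are 2 loads, 1 load plus 1 store, or 2 stores, and the maximum L2 configuration averages 1.125 MiB per core in a quad-core module.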

 
 


