Intel Unveils the Tremont Microarchitecture: Going After ST Performance

Stephen Robinson Presenting Tremont

Among the many reveals made by Intel at the 2018 Intel Architecture Day was the company’s high-efficiency small core roadmap. At the time, Raja Koduri said that the common theme for all upcoming small cores is single-thread performance improvements. Koduri noted that the small cores have much more room to grow and that an aggressive roadmap was being driven.

As per the roadmap, the next planned small core was Tremont. Intel says that Tremont will be shipping in a whole range of products. The first announced product to feature Tremont is Lakefield, but other products will be announced in the future.


Today, Intel is unveiling the details of the Tremont microarchitecture at the Linley processor conference.

Design Targets

Tremont is Intel’s next-generation low-power x86 core. It is designed to go into a whole range of products, from IoT to mobile to edge and microserver SoCs. The first and foremost focus was on single-threaded performance. Running all recent x86 software with good performance was also important. An additional vector of optimization involved networking (and various multi-core use cases), which influenced some of the overall design goals. To that end, power per core and area per core were two important additional design targets.

High Level Changes

For a small core, Tremont is very beefy. The front end is twice as wide as that of Goldmont Plus: a 6-wide decoder split into two clusters (we go over the details on the next page) that can decode up to 6 instructions each cycle. The core can allocate and retire 4 instructions each cycle. Intel also upgraded the branch predictor, which it says is very similar to the ones found on its big cores.
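As a rough illustration of what those widths imply, the sketch below derives an upper bound on sustained throughput from the per-cycle figures quoted above. It is a toy model, not a description of Intel's pipeline: it ignores stalls, branch mispredictions, and cache misses, treats every instruction as one allocation slot, and assumes the 6-wide decoder is split evenly into 3 per cluster. All function and constant names are hypothetical.

```python
from math import ceil

# Per-cycle widths quoted in the article for Tremont.
DECODE_CLUSTERS = 2
DECODE_PER_CLUSTER = 3  # assumption: 6-wide decode split evenly across two clusters
ALLOCATE_WIDTH = 4
RETIRE_WIDTH = 4


def decode_width() -> int:
    """Total instructions decoded per cycle across both clusters."""
    return DECODE_CLUSTERS * DECODE_PER_CLUSTER


def sustained_ipc_bound() -> int:
    """Upper bound on sustained instructions per cycle: the narrowest stage."""
    return min(decode_width(), ALLOCATE_WIDTH, RETIRE_WIDTH)


def min_cycles(instructions: int) -> int:
    """Fewest cycles needed to push `instructions` through the narrowest stage."""
    return ceil(instructions / sustained_ipc_bound())


if __name__ == "__main__":
    print("decode width:", decode_width())
    print("sustained IPC bound:", sustained_ipc_bound())
    print("min cycles for 1000 instructions:", min_cycles(1000))
```

Under these assumptions the decoder can run ahead of allocation, so the 4-wide allocate and retire stages, not the 6-wide decode, set the ceiling on sustained throughput.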

At the back end, things have been made equally wide. There are now 10 execution ports. On the memory side, Tremont supports dual load/store pairs (2 loads, 2 stores, or 1 of each). The typical configuration for Tremont is a quad-core module, but that can vary by product. Tremont has a shared L2 cache that, depending on the product, can be configured for up to 4.5 MiB. Likewise, depending on the product, there may be an additional last-level cache on top of this, which Tremont supports.
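The load/store statement can be read as a simple per-cycle budget of two memory operations in any combination. The sketch below encodes that reading. The budget constant comes from the text, while the function names are hypothetical and the model says nothing about operand widths or port assignment, which the article does not specify.

```python
from itertools import product

MAX_MEM_OPS_PER_CYCLE = 2  # "dual load/store pairs" per the article


def can_issue(loads: int, stores: int) -> bool:
    """True if this many loads and stores fit in one cycle's memory budget."""
    return loads >= 0 and stores >= 0 and loads + stores <= MAX_MEM_OPS_PER_CYCLE


def full_width_mixes() -> list[tuple[int, int]]:
    """All (loads, stores) combinations that use the full per-cycle budget."""
    counts = range(MAX_MEM_OPS_PER_CYCLE + 1)
    return [(l, s) for l, s in product(counts, repeat=2)
            if l + s == MAX_MEM_OPS_PER_CYCLE]


if __name__ == "__main__":
    for loads, stores in full_width_mixes():
        print(f"{loads} load(s) + {stores} store(s)")
```

The three full-width mixes it enumerates correspond to the "2 loads, 2 stores, or 1 of each" cases listed in the text.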


1 year ago

I think the discussion of single-thread perf against the Core™ line has to be caveated by the fact that the graph seems to only show Sunny Cove scaling to at most 2x the power consumption of Tremont. Considering that we see Skylake parts in the wild hitting their top perf at close to 30W, and that the slide is for Lakefield, I suspect we are seeing a case here where Sunny Cove is not really getting to stretch its legs due to power constraints.

It’s also doubtful that Tremont is designed to scale to 4+ GHz the way the Core series is, so the absolute perf delta may be much larger, even if the IPC has gotten closer. Note the lack of absolute scale numbers anywhere in these slides.

Just going from the extremely few numbers available online, the Goldmont Plus 4C @ 2.8 GHz was somewhere between 2-3x behind the Core i3-8100 @ 3.6 GHz on the workloads tested. +30% IPC on the Goldmont side and +18% IPC on Sunny Cove still leaves a pretty sizable gap at peak.
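The arithmetic behind that estimate can be sketched as follows. This is only an illustration of the commenter's reasoning: it assumes the quoted IPC gains apply directly to the two parts being compared, that clock speeds stay where they were, and that performance scales linearly with IPC. The function name and defaults are hypothetical.

```python
def projected_gap(old_gap: float,
                  small_core_gain: float = 0.30,
                  big_core_gain: float = 0.18) -> float:
    """Scale an existing big-core/small-core performance ratio by both IPC gains.

    old_gap is how many times faster the big core was before either uplift.
    Assumes unchanged clocks and performance proportional to IPC.
    """
    return old_gap * (1.0 + big_core_gain) / (1.0 + small_core_gain)


if __name__ == "__main__":
    for gap in (2.0, 3.0):
        print(f"{gap:.1f}x before -> {projected_gap(gap):.2f}x after")
```

With the 2-3x starting range from the comment, this works out to roughly 1.8x to 2.7x, which is the "sizable gap" being described.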

1 year ago

A couple of reports state that Elkhart Lake will be a 10nm, 32 Tremont Core + gen11 graphics low power server chip. Some reports also suggest it might include some Sunny Cove cores, as in the Lakefield hybrid.,40286.html?amp_js_v=0.1&usqp=mq331AQEKAFwAQ==

Reply to  JayN
1 year ago

Sorry, not 32 cores … 32 gen11 EU. I don’t see a document of how many Tremont cores go in these server chips.

1 year ago

6-way decode… AMD is still 4
