WikiChip Fuse
ISSCC 2018: The IBM z14 Microprocessor And System Control Design
A look at the changes and enhancements that were implemented by IBM in their z14 mainframe microprocessor and system control chips.

System Control (SC) Chip

The system control chip serves two unique purposes: it facilitates the drawer-to-drawer connectivity and it contains the L4 cache. Like the CPU, the SC chip is also fabricated on 14HP and runs at 2.6 GHz. Containing 9.7 billion transistors on a massive 696 mm² die, the chip has 21.7 kilometers of wire and roughly 20,000 C4s.
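To put those figures in perspective, the quoted totals can be normalized by die area. The following is a minimal sketch using only the numbers in the paragraph above; the constant and function names are illustrative, and the C4 count is the article's rounded "roughly 20,000".

```python
# Figures quoted in the article for the z14 SC chip.
TRANSISTORS = 9.7e9      # 9.7 billion transistors
DIE_AREA_MM2 = 696       # die area in mm^2
WIRE_KM = 21.7           # total wire length in kilometers
C4_COUNT = 20_000        # "roughly 20,000" C4s (rounded in the article)


def per_mm2(total, area_mm2=DIE_AREA_MM2):
    """Average amount of `total` per square millimeter of die."""
    return total / area_mm2


# Average densities across the whole die (not per functional block).
transistors_m_per_mm2 = per_mm2(TRANSISTORS) / 1e6   # millions per mm^2
wire_m_per_mm2 = per_mm2(WIRE_KM * 1000)             # meters per mm^2
c4_per_mm2 = per_mm2(C4_COUNT)

print(f"{transistors_m_per_mm2:.1f} M transistors/mm^2")
print(f"{wire_m_per_mm2:.1f} m of wire/mm^2")
print(f"{c4_per_mm2:.1f} C4s/mm^2")
```

These are die-wide averages; the eDRAM arrays and the logic regions will differ considerably from the mean.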

z14 system control chip (ISSCC 2018, IBM)

In the center are the L4 directory and control logic. The L4 directory consists of 160 MiB of eDRAM. Above and below the directory is the L4 data-flow logic; all data flowing from the I/O ports to the L4 caches goes through that unit.

 

The directory and control. (Die: ISSCC 2018, IBM)

Surrounding the directory and control are four slices of 168 MiB of L4 eDRAM cache for a total of 672 MiB.

The L4 cache. (Die: ISSCC 2018, IBM)

To the left and the right of the caches are the A-bus links. The A-bus links are the interconnects that go from one SC chip to another SC chip for drawer-to-drawer connectivity. They use differential pairs running at 7.8 Gb/s per lane, giving 670 Gb/s per link and a total of 2 Tb/s of drawer-to-drawer bandwidth. On the top and bottom of the die are the X-bus interfaces. Those links are single-ended interfaces running at 5.2 Gb/s per lane, giving 800 Gb/s per link and a total of 4.8 Tb/s of CP-SC bandwidth.
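The per-lane, per-link, and total figures can be cross-checked against each other. The sketch below derives the lane and link counts they imply; those counts are not stated in the article, so treat them as inferred values, and the dictionary keys and function name are illustrative.

```python
# Bandwidth figures as quoted in the article, all in Gb/s.
A_BUS = {"lane_gbps": 7.8, "link_gbps": 670, "total_gbps": 2000}
X_BUS = {"lane_gbps": 5.2, "link_gbps": 800, "total_gbps": 4800}


def implied_counts(bus):
    """Return (lanes per link, number of links) implied by the quoted rates.

    These are inferred ratios, not figures given in the source.
    """
    lanes_per_link = bus["link_gbps"] / bus["lane_gbps"]
    links = bus["total_gbps"] / bus["link_gbps"]
    return lanes_per_link, links


a_lanes, a_links = implied_counts(A_BUS)
x_lanes, x_links = implied_counts(X_BUS)

print(f"A-bus: ~{a_lanes:.1f} lanes per link, ~{a_links:.2f} links")
print(f"X-bus: ~{x_lanes:.1f} lanes per link, ~{x_links:.2f} links")
```

The ratios come out close to three A-bus links and six X-bus links. The lane counts are not whole numbers, which suggests the quoted per-link figures are rounded.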

The X-bus and A-bus interfaces. (Die: ISSCC 2018, IBM)

DT Gradient, Variations, and Yield Problems

The transition to a new process technology is not without risk. IBM’s transition to 14nm resulted in unique problems that needed to be addressed. One of the unique features of this 14nm FinFET on SOI process is its deep trench (DT) structures. Those very deep structures, around 3 microns in depth, can create large physical stresses on the wafer. The localized planar distortions worsen the gate height and CMP process controls.

14HP Deep Trenches (ISSCC 2018, IBM)

In practice, the die comprises areas with eDRAM with high DT density (around 10%) and non-eDRAM areas with low DT density (around 1%). The large difference in DT density results in significant variations in performance. A special process was developed to perform DT density gradient analysis across the entire design, locating areas with low DT density and high DT density in order to detect possible problems.
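The article does not describe how IBM's gradient analysis works internally. As a rough illustration of the idea only, the sketch below divides a toy floorplan into tiles, assigns each tile a DT density, and flags tiles whose density differs sharply from a neighbor. The tile grid, the threshold, and all function names are hypothetical; only the 10% and 1% densities come from the text.

```python
def max_neighbor_step(grid):
    """For each tile, the largest absolute density difference
    to any 4-connected neighbor (up, down, left, right)."""
    rows, cols = len(grid), len(grid[0])
    steps = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    diff = abs(grid[r][c] - grid[nr][nc])
                    steps[r][c] = max(steps[r][c], diff)
    return steps


def flag_tiles(grid, threshold):
    """Return (row, col) of tiles whose neighbor step exceeds the threshold."""
    steps = max_neighbor_step(grid)
    return [(r, c)
            for r, row in enumerate(steps)
            for c, step in enumerate(row)
            if step > threshold]


# Toy floorplan: eDRAM tiles (~10% DT density) around a low-density core (~1%).
E, L = 0.10, 0.01
toy_floorplan = [
    [E, E, E, E],
    [E, L, L, E],
    [E, L, L, E],
    [E, E, E, E],
]

# Hypothetical threshold: flag any step larger than 5 percentage points.
flagged = flag_tiles(toy_floorplan, threshold=0.05)
print(f"{len(flagged)} of 16 tiles sit on a steep density boundary")
```

In this toy layout every tile on either side of the eDRAM-to-core boundary is flagged, while the four corner tiles, which only touch other eDRAM tiles, are not. A floorplan with uniform density would produce no flags at all.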

 
 

In the original SC chip design, SRAM was used for the L4 directories. This created a situation where the surrounding L4 eDRAM caches consisted of mostly high-density DTs while the central area, which was mostly SRAM, consisted of low-density DTs. The die image below on the left is the original SC chip, with the blue area consisting of low-density DTs. The DT gradient variations in that design led to near-zero yield in hardware. Major modifications were needed in order to reduce the distortions. One of those changes was replacing the SRAM in the directories with eDRAM to reduce the gradient across the design. The changes to the caches and directories can be seen in the picture on the right. The final design had a more uniform DT baseline across both eDRAM and non-eDRAM areas, which resulted in acceptable distortion and yield projections.

SC Chip DT Gradient (original vs modified) (ISSCC 2018, IBM)



curious
1 year ago

I demoed OpenSUSE on z/39 and I couldn’t find much real-world applications to run on it. Mainframes are fast and great for parallel processing, but unless you can find software or write software that can be used for that architecture, it doesn’t do me any good. When can IBM release CentOS or Ubuntu using a x86_64 or ARM architecture mainframe which has better software support?

saidone
Reply to  curious
1 year ago

Of course nobody would buy a Z System for running Linux only: never seen one of those not running some mix of CICS, z/TPF, IMS, etc. mostly on z/OS and/or Linux on some ancillary systems like JVM apps or networking on z/VM.

Tom
1 year ago

“The z14 cores are very large, measuring 28 mm² in die area. ”
I’m fairly certain that is wrong….

BlackDove
Reply to  David Schor
1 month ago

There is something puzzling about the memory feeds and speeds. A lot of the slides I’ve seen say 9.6Gb/s per link, but it’s actually 9.6GB/s per link for Centaur DIMMs isn’t it? The big B little b makes a huge difference. I also can’t find where any of the feeds and speeds of z systems make sense when added up. Several slides I’ve seen from James Warnock say 1.6Tb/s memory bandwidth on z13, then another paper (by Chris Berry) says 384GB/s “drawer level memory bandwidth” on z13 which is also claimed to be >3x zEC12 and the only papers I could…
