You have probably never heard of Esperanto, and there’s a good reason for that: the startup was cloaked in secrecy until recently. At the 7th RISC-V Workshop, Esperanto finally gave us a glimpse into what it has been up to.
Esperanto Technologies was founded in 2014 and has pretty significant backing from companies such as Western Digital. Esperanto’s president and CEO is none other than Dave Ditzel. And yes, we mean that Ditzel. He co-authored “The Case for RISC” with Dave Patterson, designed probably a half-dozen SPARC architectures (including leading the 64-bit SPARC development team), and, perhaps most notably, co-founded Transmeta, the company that took on Intel. Interestingly enough, before moving on to his most recent adventure at Esperanto, he was working at Intel on a project that attempted to replace all three of Intel’s architectures (Atom, Client, and Server) with a single brand new architecture.
At the 7th RISC-V Workshop, Ditzel explained some of the work Esperanto has been doing. The company initially planned on designing its own ISA, but after a careful analysis of RISC-V it decided that RISC-V could more than meet its needs – particularly due to its ability to support custom extensions, a feature of the ISA Esperanto is making extensive use of. There are already a number of commercial RISC-V solutions, but nothing in the area of high-end performance; more specifically, nothing that can currently compete against top ARM IP cores. This is where Esperanto comes in.
RISC-V IP Cores
Esperanto plans on paving the way for RISC-V to enter the high-performance computing market. As part of that plan they have been designing at least three RISC-V related IP cores:
- ET-Maxion – a high-performance RISC-V core comparable to the best ARM IP cores available today
- ET-Minion – an energy-efficient RISC-V core for high TeraFLOP computing
- ET-Graphics – a RISC-V-based graphics processor
Designs for the cores have been done in TSMC’s leading-edge 7nm process using design methodologies comfortable for large IP customers (i.e., human-readable synthesizable Verilog using standard CAD flow). This is in contrast to many existing RISC-V solutions that rely on tools such as Chisel and Rocket Chip generator which are great tools developed by Berkeley for rapid prototyping but are largely unfamiliar to engineers in the industry. Ditzel noted that Esperanto plans on having a very strong physical design effort around the 7nm process with careful consideration of energy, architecture, circuits, and physical design. “What we are trying to do is build a flagship for RISC-V. We want a product that’s going to draw attention to great RISC-V performance,” Ditzel said.
ET-Maxion is going to be their flagship core, and it will also be licensable. They hope to achieve the highest single-thread performance, comparable to or exceeding the best ARM IP cores available today. This core is optimized for 7nm and is designed to run a standard Linux OS. It is a 64-bit RV64GC processor which originally started out as the Berkeley Out-of-Order Machine v2 (BOOM v2) but has since been substantially improved.
ET-Minion is Esperanto’s energy-efficient core designed to do all the heavy floating point work. That is, this core was designed to achieve the highest floating point throughput with a high degree of energy efficiency. The ultimate goal with this core is to greatly reduce the energy cost per operation. As with the ET-Maxion, this core is also a 64-bit RISC-V processor, in-order, and will also incorporate the new RISC-V vector extension Esperanto has been working on. Designed to talk to various accelerators, ET-Minion has support for multithreading to absorb the various latencies involved with those operations. Like the ET-Maxion, this core will be offered as a licensable core with aggressive 7nm design for the various physical design aspects such as the caches and register files.
ET-Graphics is a graphics solution that is based on RISC-V. Esperanto has designed a shader compiler which generates RISC-V instructions and is capable of distributing the workload over a large set of cores.
An AI Monster
Although Esperanto will be licensing the cores they have been designing, they do plan on producing their own products. The first product they want to deliver is the highest TeraFLOP per Watt machine learning computing system. Ditzel noted that the overall design is scalable in both performance and power. The chips will be designed in 7nm and will feature a heterogeneous multi-core architecture.
There are 16 of the large out-of-order ET-Maxion cores, each with its own private L1 and L2 caches. Additionally, Ditzel said they plan on including 4,096 ET-Minion cores – each with everything noted above. That is, each of those cores will include the full vector extension with the full vector floating point unit. In case you’re wondering, the exact width of those vector units has not been disclosed. By the way, this chip is not being designed just as an accelerator. Esperanto stated that it will also be capable of booting standard Linux stand-alone.
Given there are 4,096 small ET-Minion cores and 16 large ET-Maxions, we can probably expect the chip to consist of 16 clusters of 256 ET-Minions and 1 ET-Maxion.
Curiously, the overall chip architecture shares a lot with the PEZY-SC2 we detailed earlier this year. Both chips blur the lines between CPUs, GPUs, and DSPs, and both use a very similar heterogeneous multi-core architecture. The SC2 consists of 128 “cities”, each with 16 processing elements (PEs), alongside 6 high-performance, out-of-order MIPS management cores, while Esperanto uses 16 high-performance, out-of-order RISC-V management cores. Both rely on very tiny in-order cores to achieve high FLOP throughput by integrating a full floating-point vector unit into each core. Both companies even use multithreading on those tiny cores to hide the various latencies from software (PEZY uses 8-way SMT, while Esperanto has not disclosed how many threads it plans to have per core).
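To make the core-count comparison concrete, here is a small back-of-the-envelope sketch in Python. It only restates the figures quoted above. The `ChipLayout` class and its field names are made up for illustration, and the 16-by-256 split for Esperanto is our own speculation rather than anything Esperanto has disclosed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChipLayout:
    """Hypothetical helper holding the core counts quoted in the article."""
    name: str
    management_cores: int      # large out-of-order cores
    groups: int                # clusters (Esperanto, speculated) or cities (PEZY)
    workers_per_group: int     # small in-order cores per group
    threads_per_worker: Optional[int] = None  # None means "not disclosed"

    @property
    def worker_cores(self) -> int:
        return self.groups * self.workers_per_group

    @property
    def workers_per_management_core(self) -> float:
        return self.worker_cores / self.management_cores

    @property
    def hardware_threads(self) -> Optional[int]:
        # Cannot be computed when the thread count per core is undisclosed.
        if self.threads_per_worker is None:
            return None
        return self.worker_cores * self.threads_per_worker


# Assumption: 16 clusters of 256 ET-Minions, one ET-Maxion per cluster.
esperanto = ChipLayout("Esperanto (speculated layout)", 16, 16, 256)
# PEZY-SC2: 128 cities of 16 PEs, 6 MIPS management cores, 8-way SMT.
pezy_sc2 = ChipLayout("PEZY-SC2", 6, 128, 16, threads_per_worker=8)

for chip in (esperanto, pezy_sc2):
    print(chip.name, chip.worker_cores,
          round(chip.workers_per_management_core, 1), chip.hardware_threads)
```

On these numbers Esperanto would have twice as many small cores as the SC2, with fewer small cores per management core. The thread count stays unknown until Esperanto discloses how many threads each ET-Minion supports.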
There is no doubt that Esperanto will achieve high compute power. The real bottleneck in the system will most likely be the data flow. Not much has been disclosed about the memory subsystem, but we do know they plan on integrating a high-bandwidth DRAM interface. It may interest you to know that Dave Ditzel is also the CEO of ThruChip Communications, the company that facilitates the licensing of ThruChip Interface (TCI) technology. Ditzel has been promoting the technology as a much cheaper alternative to through-silicon vias (TSVs). It’s possible we might see something in that direction in the future.
It’s worth noting that for this processor, Esperanto armed the ET-Minion, which already incorporates the new vector extension, with two additional domain-specific extensions for machine learning. One such extension is a tensor extension to augment the vector instructions. Ditzel claims those extensions greatly improve energy efficiency in machine learning workloads. All the cores share the same address space and likely make use of TileLink. TileLink, which was also presented at this year’s RISC-V Workshop, is a free and open source scalable cache-coherent fabric for RISC-V.
All in all, Esperanto is moving in the right direction for completing the RISC-V ecosystem. Current solutions are rather weak when it comes to high-end offerings, and that is precisely the problem Esperanto is tackling. The lower cost and growing selection of RISC-V solutions, coupled with a rapidly growing software ecosystem, represent a real and viable alternative to ARM.