As both Mobile World Congress and Embedded World 2018 get fully underway, a new startup has just launched a very interesting microprocessor.
GreenWaves Technologies is a fabless startup founded in 2014 and based in the Grenoble area, France. The company currently has fifteen employees and partnerships with both ETH Zürich and the University of Bologna. Most recently, it closed a €3.1M financing round in August 2017 in order to bring this very product to market.
Today GreenWaves has announced their very first product – the GAP8 IoT/AI application processor. The news comes just over a month after we reported on Esperanto’s ambitious high-performance RISC-V cores. Unlike Esperanto, GreenWaves is going in the very opposite direction – the ultra-low-power IoT segment, where battery and solar power are suitable power sources. In particular, the GAP8 is designed for IoT inference processing, allowing applications to analyze data right from the image, audio, and motion sensors while communicating over relatively low-data-rate networks, keeping the upstream data throughput requirements low. For example, the chip can classify a QVGA (320×240) image for object detection/recognition using a CNN model every three minutes for 10 years on a small 3.6 Wh battery. In addition to vision acceleration, this chip is particularly suited for a whole set of other applications such as audio signal sampling and speech recognition, machine health monitoring, and various always-on applications that need to react to some intelligent sensory event.
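The battery claim can be sanity-checked with a little arithmetic. The sketch below assumes 365-day years, a battery with no self-discharge, and that the full 3.6 Wh reaches the chip; the function and variable names are illustrative only. The result is an upper bound on the energy available per classification, not a figure from GreenWaves.

```python
# Back-of-the-envelope energy budget for the "one QVGA classification every
# three minutes for 10 years on a 3.6 Wh battery" claim.
# Assumptions (not from GreenWaves): 365-day years, no battery self-discharge,
# and every joule in the battery reaches the chip.

def energy_budget(battery_wh, years, interval_minutes):
    """Return (average power in W, number of inferences, joules per inference)."""
    hours = years * 365 * 24
    avg_power_w = battery_wh / hours
    inferences = hours * 60 / interval_minutes
    joules_per_inference = battery_wh * 3600 / inferences
    return avg_power_w, inferences, joules_per_inference

avg_w, count, joules = energy_budget(battery_wh=3.6, years=10, interval_minutes=3)
print(f"average power budget: {avg_w * 1e6:.1f} uW")
print(f"classifications:      {count:,.0f}")
print(f"energy per result:    {joules * 1e3:.2f} mJ (sleep time included)")
```

Under these assumptions the whole system has to average roughly 41 µW, which leaves about 7.4 mJ per classification including the time spent asleep in between.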
Logically and physically, the chip consists of two partitions – the low-power microcontroller and the compute engine. Despite what the name implies, the GAP8 is a nona-core processor featuring 9 fully compatible RISC-V cores. One of the cores is located in the microcontroller part of the chip while the other cores are clustered together in the compute engine.
By the way, if you have been following the RISC-V ecosystem this might seem a little familiar. The GAP8 is actually an enhanced derivative of the Parallel Ultra Low Power (PULP) open-source core. In fact, one of the co-founders and the CTO of GreenWaves is Eric Flamand. He designed the PULP RISC-V DSP extensions and is the main developer of the PULP-optimized GCC toolchain for RI5CY. This is great because it boosts the credibility of an open-source hardware project when one of its contributors feels confident enough to commercialize a derivative of it.
The MCU side sits on its own power domain, decoupled from the compute engine. This part of the chip is designed as an ultra-low-power MCU with a single core that handles system management, such as controlling peripherals, but it can also double as a general-purpose low-power core. This is important because it means you only need to fire up the compute engine when absolutely necessary while keeping low-intensity workloads on this core. The core comes with its own 20 KiB of private L1 cache. The peripherals, along with the clock generator, are configurable.
Effort was made to make the MCU feel and behave like any other MCU on the market. As such, the MCU can be programmed using the familiar standard GCC/GDB toolchain derived from the one developed by the RISC-V Foundation. Two real-time operating systems are supported – the PULP OS and Arm Mbed OS. GreenWaves has developed a full set of drivers for both of those operating systems in order to allow access to all the GAP8 peripherals. In the future GreenWaves hopes to expand support to other operating systems.
The MCU has most of the expected I/O interfaces, including SPI, I2S, I2C, CPI, UART, and Serial I/Q. Additionally, the chip has support for 128 Mb/s LVDS (TIA/EIA-644 compliant) as well as a HyperBus interface so up to 16 MiB of low-power SDRAM or RAM can be hooked up.
The compute engine is where things get a bit more interesting. The remaining eight cores are grouped into a cluster optimized for parallel workloads. The cluster shares the same 64 KiB data and 16 KiB instruction caches. The engine sits on entirely separate voltage and frequency domains, allowing it to be entirely power-gated when not in use or clocked to match the desired power level. Getting from the off state to full operation is supposedly fairly quick, so applications should be able to smoothly offload work from the MCU side to the compute engine when needed. In addition to the eight cores, there is a separate hardware convolution engine (HWCE) designed specifically to accelerate inference calculations for convolutional neural networks (CNNs). The HWCE shares the same memory with the rest of the cluster. GreenWaves has software libraries for deep learning (CNN-based), image processing (e.g., people counting and visual monitoring), data analysis, and encryption, all optimized for the compute engine.
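To see why power-gating the cluster matters, a simple two-state duty-cycle model helps. The sketch below is illustrative only: the 30 mW active figure, the 5 µW gated figure, the 180-second period, and the 40 µW budget are all assumed placeholder values, not GAP8 measurements, and the function names are hypothetical.

```python
# Simple two-state duty-cycle model for a power-gated compute cluster.
# All numbers below are hypothetical placeholders, not GAP8 measurements.

def average_power(active_w, idle_w, active_s, period_s):
    """Average power when the cluster runs for active_s out of every period_s."""
    duty = active_s / period_s
    return duty * active_w + (1.0 - duty) * idle_w

def max_active_time(budget_w, active_w, idle_w, period_s):
    """Longest active burst per period that still meets an average power budget."""
    if budget_w <= idle_w:
        return 0.0  # the gated state alone already exceeds the budget
    burst = period_s * (budget_w - idle_w) / (active_w - idle_w)
    return min(period_s, burst)

# Assumed: 30 mW with the cluster on, 5 uW with it gated, one burst every 180 s.
ACTIVE_W, IDLE_W, PERIOD_S = 30e-3, 5e-6, 180.0

p = average_power(ACTIVE_W, IDLE_W, active_s=0.2, period_s=PERIOD_S)
t = max_active_time(budget_w=40e-6, active_w=ACTIVE_W, idle_w=IDLE_W, period_s=PERIOD_S)
print(f"average power with a 0.2 s burst: {p * 1e6:.1f} uW")
print(f"burst allowed by a 40 uW budget:  {t * 1e3:.0f} ms")
```

With numbers in this range the cluster can only be awake for a fraction of a percent of the time, which is why a fast transition out of the gated state is as important as raw throughput.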
All in all, GreenWaves claims the chip can do up to 200 MOPS at 1 mW and up to 10 GOPS at a few tens of mW. Keep in mind that the goal here is very low power consumption (just 1 milliwatt to a few tens of milliwatts), so it’s not directly competing with products such as Arm’s recently announced Project Trillium, which, while efficient from an operations-per-watt perspective, are designed for a mobile power budget of a few watts – over two orders of magnitude greater than what the GAP8 targets. The company has some additional benchmark claims on their blog if you’re interested.
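Those two operating points can be turned into efficiency figures with a quick calculation. In the sketch below, the 200 MOPS at 1 mW point is the company's claim; the 20 mW and 50 mW values are only an assumed reading of "a few tens of mW", and the helper names are illustrative.

```python
# Convert the quoted throughput/power points into efficiency figures.
# 200 MOPS at 1 mW is the claimed figure; the 20-50 mW range used for the
# 10 GOPS point is an assumed reading of "a few tens of mW".

def gops_per_watt(ops_per_second, watts):
    """Throughput per watt, in billions of operations per second per watt."""
    return ops_per_second / watts / 1e9

def picojoules_per_op(ops_per_second, watts):
    """Energy spent on each operation, in picojoules."""
    return watts / ops_per_second * 1e12

low_point = (200e6, 1e-3)
print(f"200 MOPS @ 1 mW: {gops_per_watt(*low_point):.0f} GOPS/W, "
      f"{picojoules_per_op(*low_point):.1f} pJ/op")

for mw in (20, 50):  # assumed bounds, not vendor numbers
    watts = mw * 1e-3
    print(f"10 GOPS @ {mw} mW: {gops_per_watt(10e9, watts):.0f} GOPS/W, "
          f"{picojoules_per_op(10e9, watts):.1f} pJ/op")
```

The low-power point works out to 200 GOPS/W, or 5 pJ per operation; the high-performance point depends entirely on where within "a few tens of mW" the chip actually lands.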
Enhanced RISC-V Cores
As noted earlier, all nine cores are RISC-V cores. Those cores also support the integer multiplication and division (M) and compressed instructions (C) standard extensions. Additionally, the cores fully comply with the RISC-V privileged instruction specs, allowing secure code execution with the assistance of a programmable memory protection unit (MPU), which allows memory access permissions to be defined for separate regions. Following the approach started by PULP, GreenWaves used the RISC-V standard ISA extension mechanism to enhance the cores and improve performance for DSP-centric operations, specifically operations frequently found in the algorithms executed by the compute engine. They have added specific instructions for operating on Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), Bayesian, Boosting, Visual Location, Fast Fourier Transforms, Cepstral Analysis, and various others. GreenWaves claims those supplemental instructions significantly reduce the number of cycles needed to perform many common algorithms, increasing the performance and efficiency of execution. It’s worth reiterating that GreenWaves is a key contributor to PULP and that many of the enhancements will be fed back into the project.
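Conceptually, a region-based MPU checks each access against a small table of address ranges and permissions. The sketch below is a generic model of that idea, not the GAP8 register layout; the `Region` type, the addresses, and the permission flags are all hypothetical.

```python
# Generic model of a region-based memory protection unit (MPU).
# Hypothetical illustration: the region layout, addresses and flag names are
# invented for this example and do not describe the GAP8 hardware.
from dataclasses import dataclass
from enum import Flag, auto

class Perm(Flag):
    READ = auto()
    WRITE = auto()
    EXEC = auto()

@dataclass(frozen=True)
class Region:
    base: int
    size: int
    perms: Perm

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size

def access_allowed(regions, addr, wanted):
    """The first region covering addr decides; unmapped addresses are denied."""
    for region in regions:
        if region.contains(addr):
            return (region.perms & wanted) == wanted
    return False

regions = [
    Region(0x0000_0000, 0x4000, Perm.READ | Perm.EXEC),   # code: read/execute
    Region(0x1000_0000, 0x2000, Perm.READ | Perm.WRITE),  # data: read/write
]

print(access_allowed(regions, 0x0000_0100, Perm.EXEC))   # fetch from code
print(access_allowed(regions, 0x0000_0100, Perm.WRITE))  # write to code
print(access_allowed(regions, 0x2000_0000, Perm.READ))   # unmapped address
```

In this model the instruction fetch is permitted, while the write into the code region and the read from an unmapped address are both rejected.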
It’s interesting to point out that in addition to GreenWaves, Esperanto and at least 2 other companies are independently developing non-standard RISC-V extensions for tensor and artificial neural network operations. Additionally, the GAP8 vector extension implementation looks to be different from the standard RISC-V vector extension currently being drafted. While portable software should mask most of the differences, this might come back to hurt them in the future.
The company is preparing a development kit which will include the GAP8 SDK and the GAPDUINO board which is also an Arduino Uno compatible Master/Shield. The boards are already available for pre-order (priced at 100 euros) on their site with roll-out expected to begin in April 2018.
The GAP8 comes in an 84-pin advanced QFN (aQFN) package and is fabricated on TSMC’s 55nm low power process. First samples came back just a week ago. The GAP8 is priced at $5 per unit for large volumes.
GreenWaves is already working on their next product. If all goes well, they plan on raising 7 million euros this year once their first silicon and design wins are in hand, which will enable them to add 10 more members to the team to further support their development effort. Their next-gen GAP chip is slated to launch sometime in the first quarter of 2019. The chip will feature a number of incremental improvements, including better energy efficiency from moving to a 28/22nm FD-SOI process. They also have a third product in the works that will incorporate third-party RF IP to enable wireless connectivity.