Special Processors to Drive IoT and Wearables (Part 1 of 2)

V.P. Sampath is a senior member of IEEE and a member of the Institution of Engineers India. Currently a technical architect at AdeptChips, Bengaluru, he has published international papers on VLSI and networks.


Globally, there will be over 20 billion connected devices in the next five years, representing $7 trillion in revenue. This corresponds to roughly three devices for every person and a value higher than the 2016 GDP of every country in the world except the US and China. Many of these systems, including implantables, wearables, printed electronics and the Internet of Things (IoT), will have ultra-low power and area requirements. These applications will therefore rely on ultra-low-power general-purpose microcontrollers and microprocessors, making them the most abundant type of processor produced and used.

Two types of processors are available at present: commercial off-the-shelf (COTS) and bespoke. The two share many features, such as a pipeline and cache.

Specifically, COTS processors satisfy the needs of the purchasing organisation without the need to commission custom-made, or bespoke, solutions. Compared to custom processors, they are cheap, flexible (they avoid binding a solution to a single hardware/software source) and backward-compatible with legacy products. They provide current-technology solutions and shorten design-to-production cycles, and their large user base generally uncovers design defects early.

While COTS processors are likely to be built for optimal average-case performance, bespoke processors are designed to ease the certification process. Typical evidence required for certification includes the worst-case execution time of instructions, hardware reliability and information on systematic design flaws in the processor.

Bespoke processors are used for applications with ultra-low area and power constraints. Low-power processors are widely used and are expected to power a large number of emerging applications. Such processors tend to be simple, run relatively simple applications and do not support non-determinism, which makes symbolic simulation-based techniques a good fit for them.

Power and area efficiency concerns

Microprocessors and microcontrollers used in emerging area- and power-constrained connected applications are designed to include a wide variety of functionalities in order to support a large number of diverse applications with different requirements. An embedded system, on the other hand, is typically designed for a small number of applications that run over and over again on a general-purpose processor for the lifetime of the system. Given that a particular application may use only a small subset of the functionalities provided by a general-purpose processor, there may be a considerable amount of logic in the processor that the application never uses.

Cost concerns drive many connected applications to use general-purpose microprocessors and microcontrollers instead of much more area- and power-efficient ASICs. Among other benefits, the development cost of a microprocessor IP core can be amortised by the IP core licensor over a large number of chip makers and licensees. Given the mismatch between the extreme area and power constraints of emerging applications and the relative inefficiency of general-purpose microprocessors and microcontrollers compared to their ASIC counterparts, there is a considerable opportunity to make microprocessor-based solutions for these applications much more area- and power-efficient.

One big source of area inefficiency in a microprocessor is that a general-purpose microprocessor is designed to target an arbitrary application and thus contains many more gates than a specific application needs. The unused gates continue to consume power, resulting in significant power inefficiency, too. While adaptive power management techniques help to reduce the power consumed by unused gates, their effectiveness is limited by the coarse granularity at which they must be applied, as well as by significant implementation overheads such as domain isolation and state retention. These techniques also worsen area inefficiency.

Fig. 1: General-purpose processors are overdesigned for a specific application (top). The bespoke processor design methodology allows a microprocessor IP licensor or licensee to target different applications efficiently without additional software or hardware development cost (bottom)


One approach to significantly increase the area and power efficiency of a microprocessor for a given application is to eliminate all logic in the microprocessor IP core that the application will not use. The result is a design tailored to the application, a bespoke processor, that has significantly lower area and power requirements than the original microprocessor IP, which targets an arbitrary application.

As long as the approach to create a bespoke processor is automated, the resulting design retains the cost benefits of a microprocessor IP, since no additional hardware or software needs to be developed. Also, since no logic used by the application is eliminated, area and power benefits come at no performance cost. The resulting bespoke processor does not require programmer intervention or hardware support either, since the software application can still run, unmodified, on the bespoke processor.
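The idea of automatically finding logic that an application can never exercise can be sketched with a toy three-valued (0, 1, unknown) gate simulation. This is only an illustrative sketch, not the methodology's actual tooling: the netlist, the gate names (g1 to g4), the input names (mode, a, b) and the helper constant_gates are all hypothetical, and the example covers purely combinational logic, whereas a real processor would also require tracking state across cycles. Inputs that depend on application data are marked unknown (X); a gate whose output is still a known constant under those conditions is a candidate for removal and replacement with that constant.

```python
X = "X"  # unknown value: depends on application inputs


def and3(a, b):
    """Three-valued AND: a controlling 0 decides the output even if the other input is X."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X


def or3(a, b):
    """Three-valued OR: a controlling 1 decides the output even if the other input is X."""
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X


def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return X if a == X else 1 - a


# Hypothetical toy netlist, listed in topological order: gate -> (operation, inputs).
NETLIST = {
    "g1": ("AND", ("mode", "a")),
    "g2": ("AND", ("g1", "b")),
    "g3": ("OR", ("a", "b")),
    "g4": ("NOT", ("g3",)),
}


def simulate(netlist, inputs):
    """Evaluate every gate once, propagating X where the value cannot be known."""
    values = dict(inputs)
    for name, (op, ins) in netlist.items():
        args = [values[i] for i in ins]
        if op == "NOT":
            values[name] = not3(args[0])
        elif op == "AND":
            values[name] = and3(*args)
        else:
            values[name] = or3(*args)
    return values


def constant_gates(netlist, inputs):
    """Return gates whose output is a known constant for the given input conditions."""
    values = simulate(netlist, inputs)
    return {g: values[g] for g in netlist if values[g] != X}


# Assumption: the application never enables 'mode', while a and b are data-dependent.
removable = constant_gates(NETLIST, {"mode": 0, "a": X, "b": X})
print(removable)  # {'g1': 0, 'g2': 0}
```

In this toy case, gates g1 and g2 could be replaced by a constant 0, while g3 and g4 must stay because their outputs depend on application data.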

Static application analysis represents another approach for determining unusable logic for an application. However, application analysis may not identify the maximum amount of logic that can be removed, since unused logic does not correspond only to software-visible architectural functionalities but also to fine-grained and software-invisible micro-architectural functionalities.
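The limitation of static application analysis can be illustrated with a minimal sketch that scans a program's instructions and reports which functional units are never referenced. The instruction-to-unit mapping (UNIT_OF), the unit names and the four-instruction program below are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from instruction mnemonics to the functional unit they use.
UNIT_OF = {
    "add": "alu",
    "sub": "alu",
    "and": "alu",
    "mul": "multiplier",
    "div": "divider",
    "ld": "load_store",
    "st": "load_store",
    "br": "branch",
}


def unused_units(program):
    """Return the functional units that no instruction in the program refers to."""
    used = {UNIT_OF[line.split()[0]] for line in program if line.strip()}
    return sorted(set(UNIT_OF.values()) - used)


# Hypothetical application: a load, an add, a store and a branch.
program = [
    "ld r1, 0(r2)",
    "add r1, r1, r3",
    "st r1, 0(r2)",
    "br loop",
]

print(unused_units(program))  # ['divider', 'multiplier']
```

An analysis at this level can flag only whole, software-visible units. It cannot see, for instance, gates inside the ALU that this particular program never toggles, which is why it may miss removable logic at the micro-architectural level.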

Nimeesh Kumar

