# **A Novel Energy Efficient Adder Architecture for Image Processing Applications**

**Karshankant Sharma<sup>1</sup> , Vijay Kumar Magraiya<sup>2</sup> and Abhay Khedkar<sup>3</sup>**

*<sup>1</sup>Research Scholar, <sup>2</sup>Assistant Professor, <sup>3</sup>Assistant Professor, Electronics and Comm. Dept., SRCEM Banmore, Morena, India*

*Abstract—* **Battery-operated devices now integrate so much functionality on a single chip that very high energy efficiency is required. This efficiency can be achieved by designing efficient arithmetic circuits, which perform most of the processing within these cores. In this paper, we propose a novel adder architecture that improves power and speed simultaneously at the cost of a minor loss in accuracy. The proposed adder is well suited to image and video processing applications. To evaluate its efficacy, the proposed and existing adder architectures are modeled in MATLAB to obtain the error metrics and implemented in Tanner to obtain the design metrics. Simulation results show that the proposed adder significantly reduces power, area and delay with a small loss in accuracy.**

*Keywords—* **Image processing, Adder, High-speed integrated circuits, VLSI, Low power design.**

## **I. INTRODUCTION**

With the increasing use of portable devices and the growing functionality they offer, energy efficiency has become the prime challenge for the VLSI designer. Circuits with high energy consumption not only shorten battery lifetime but also worsen the reliability of the device. The first approach used to achieve an efficient design is transistor scaling [1], as it improves all the design parameters simultaneously. With technology scaling, however, transistor sizes have reached the nanoscale regime, where process and other variations become severe. The process and temperature variation compensation techniques [2] are becoming more costly than the advantage gained from scaling. Hence, scaling the device dimensions alone no longer yields an energy-efficient design. The design of high-speed and low-power VLSI architectures therefore needs efficient arithmetic processing units that are optimized for the performance parameters, namely speed and power consumption [3].

Adders are key components in general-purpose microprocessors and digital signal processors. They are also used in many other operations such as subtraction, multiplication and division. As a result, the speed of the adder largely determines the speed of these systems.

Moreover, there are several applications, called error-tolerant applications [4], in which minor errors can be tolerated. For these applications, approximate circuits can be designed that provide an approximate value of the result with improved design metrics. Hence, approximate design can improve all the design parameters simultaneously at the cost of a minor, acceptable loss in accuracy. In these applications, designing a fully accurate circuit wastes power, area and performance.

Based on the characteristics of digital VLSI design, some novel concepts and design techniques have been proposed. The concept of error tolerance (ET) [5] and the PCMOS technology are two of them. By definition, a circuit is error tolerant if: 1) it contains defects that cause internal errors and may cause external errors, and 2) the system that incorporates this circuit still produces acceptable results. At first glance, this "imperfect" attribute does not seem appealing.

Increasingly large data sets and the need for instant response require adders to be both wide and fast. The traditional ripple-carry adder (RCA) is therefore no longer suitable for wide adders because of its low speed. Many different types of fast adders, such as the carry-skip adder (CSK), carry-select adder (CSL), and carry-look-ahead adder (CLA), have been developed [6]. Many low-power adder design techniques have also been proposed. However, there is always a trade-off between speed and power. Error-tolerant design is a potential solution to this problem: by sacrificing some accuracy, an error-tolerant adder (ETA) can attain a large improvement in both power consumption and speed.

The rest of the paper is organized as follows. Section II describes the conventional full adder, and Section III presents the proposed approximate full adders. The multi-bit adder built from the proposed full adders is given in Section IV. Section V presents the simulation results, and Section VI concludes the paper.

## **II. CONVENTIONAL FULL ADDER DESIGN**

Several full adder circuits are available in the literature. In this paper, we consider the mirror adder for our study. The mirror adder, whose circuit diagram is shown in Figure 1, consists of only 24 transistors [1].



**Figure 1: Circuit diagram of Mirror adder**

From the circuit diagram, we can see that the mirror adder produces the complemented values of the sum and carry out for the given inputs. To obtain the true sum and carry-out values, additional inverters can be appended.

## **III. APPROXIMATE FULL ADDER DESIGN**

This section first discusses techniques to improve the speed and reduce the area of the design, and then applies those techniques to the conventional mirror adder to obtain efficient approximate mirror adders.

The delay of a circuit arises from the charging and discharging of its node capacitances: the larger the node capacitance, the larger the delay [7]. To reduce the delay, i.e. to improve the speed of the design, we must reduce the node capacitance. Furthermore, decreasing the number of transistors used to implement a design reduces the overall area [8]. Hence, in this work we reduce the number of transistors in the full adder circuit, which lowers the node capacitance and at the same time provides a smaller area. Removing transistors from the circuit, however, introduces errors, so the transistors have to be eliminated carefully such that the introduced error is small. One important precaution is that no open-circuit or short-circuit path should be formed when the transistors are removed.
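As a first-order illustration of why a smaller node capacitance shortens the delay, the switching node can be treated as a lumped RC circuit. This is a standard textbook approximation rather than a model taken from this paper; $R_{eq}$ (equivalent on-resistance of the driving transistors) and $C_L$ (total capacitance at the node) are symbols introduced here only for illustration.

$$
t_{pd} \approx 0.69\, R_{eq}\, C_{L}
$$

Under this approximation, removing transistors attached to a node lowers $C_L$ and therefore $t_{pd}$, provided the resistance of the remaining charging or discharging path is unchanged.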

The first approximate full adder, obtained after carefully eliminating a few transistors, is shown in Figure 2. This approximate full adder requires six fewer transistors than the original full adder.



**Figure 2: Approximate Mirror adder 1 (AMA1)**

Similarly, a close look at the truth table of the full adder shows that the sum is equal to the inverse of the carry out for six of the eight input conditions. We can therefore simply set the sum equal to the inverse of the carry out. Doing this directly would increase the load at the carry-out node, so a buffer stage is placed before the sum output. The resulting simplified mirror adder is shown in Figure 3. From the figure, we can see that approximate mirror adder 2 (AMA2) requires only 11 transistors.
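The six-of-eight observation can be checked by enumerating the exact full adder truth table. The short Python sketch below is only a behavioural check, not a circuit model; the function names are chosen here for illustration.

```python
from itertools import product


def full_adder(a, b, cin):
    """Exact 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


all_inputs = list(product((0, 1), repeat=3))

# Input combinations for which the exact Sum equals NOT(exact Cout).
matches = [
    (a, b, cin)
    for a, b, cin in all_inputs
    if full_adder(a, b, cin)[0] == 1 - full_adder(a, b, cin)[1]
]
mismatches = [combo for combo in all_inputs if combo not in matches]

print(len(matches), "of 8 input combinations satisfy Sum == NOT Cout")
print("mismatching inputs (A, B, Cin):", mismatches)
```

The only two mismatches are the all-zeros and all-ones inputs, so replacing the sum logic with an inverted carry changes the sum output in exactly those two cases.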



**Figure 3: Approximate Mirror Adder 2 (AMA2)**

Similarly, we have noticed that the value of Cout is equal to A (the first input) for six of the eight input conditions. If we set Cout equal to A and then calculate the sum from Cout, we obtain another approximate adder. The resulting circuit diagram of approximate mirror adder 3 (AMA3) is shown in Figure 4. This adder requires only 11 transistors and thus significantly reduces the area. Moreover, the lower node capacitance that results from the reduced number of transistors increases the speed of the adder. Table 1 gives the truth table of the conventional and the approximate mirror adders. Incorrect values for each approximate adder are highlighted in red.
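The claim that Cout equals A in six of the eight cases can be verified in the same way. This is a minimal behavioural sketch in Python; it checks only the carry approximation and makes no assumption about how the sum is generated in the transistor-level circuit.

```python
from itertools import product


def exact_carry(a, b, cin):
    """Exact carry out of a 1-bit full adder (majority of the three inputs)."""
    return (a & b) | (b & cin) | (a & cin)


# Input combinations (A, B, Cin) for which approximating Cout by A is wrong.
carry_mismatches = [
    (a, b, cin)
    for a, b, cin in product((0, 1), repeat=3)
    if exact_carry(a, b, cin) != a
]

print("Cout != A for inputs:", carry_mismatches)
```

The approximation fails only when B and Cin agree with each other and disagree with A.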

**Inputs** | Accurate | Approximate FA outputs  $\bf{A}$  **B C**<sub>in</sub> **S Co S**<sub>1</sub> **C**<sub>**o1**</sub> **S**<sub>2</sub> **C**<sub>**o2**</sub> **S**<sub>3</sub> **C**<sub>**o3**</sub> 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 0 0 0 0 0 | 1 | 0 | 1 | 0 | 0 | 1 | <mark>0</mark> | 0 | 0 | 1 | 0 0 | 1 | 1 | 0 | 1 | 0 | 1 | <mark>1 | 0 | 1 | 0</mark> 1 0 0 1 0 1 0 0 0 1 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 0 1 0 1 0 1 1 1

1 1 1 1 1 0 1 1 1 1 1

**Table 1: Truth table of accurate and approximate adders**

## **IV. PROPOSED MULTIBIT ADDER**

In real applications, we require a multi-bit adder, which can be built from full adders. The simplest multi-bit architecture is the ripple-carry adder (RCA). The RCA has a small area and a very simple architecture, but the large delay caused by its long carry propagation chain limits its use. To increase the speed of the adder, the carries can be computed in advance and supplied to each full adder; the resulting design is called the carry-look-ahead adder (CLA). The CLA provides the highest speed, but its large area overhead limits its use. Alternatively, short RCA chains can be used to build a wide adder in which each block contains two RCAs, one with logic "0" and one with logic "1" at its carry input, and the carry out of the previous block selects between their results. The resulting adder is known as the carry-select adder (CSL). Although the CSL improves the speed of addition, its large area overhead hinders its use in applications.
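The ripple-carry and carry-select organisations described above can be sketched behaviourally in Python. This is an illustrative model only: it captures the logic (bit-serial carry chain, and per-block precomputation for carry-in 0 and 1) but not delay or area, and the 4-bit block size is an assumption made for the example.

```python
def full_adder(a, b, cin):
    """Exact 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def ripple_carry_add(x, y, width, cin=0):
    """Chain `width` full adders; returns (sum modulo 2**width, carry_out)."""
    result, carry = 0, cin
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


def carry_select_add(x, y, width, block=4):
    """Each block is computed for carry-in 0 and 1; the incoming carry selects one."""
    result, carry = 0, 0
    for lo in range(0, width, block):
        w = min(block, width - lo)
        mask = (1 << w) - 1
        xb, yb = (x >> lo) & mask, (y >> lo) & mask
        sum0, carry0 = ripple_carry_add(xb, yb, w, cin=0)
        sum1, carry1 = ripple_carry_add(xb, yb, w, cin=1)
        s, carry = (sum1, carry1) if carry else (sum0, carry0)
        result |= s << lo
    return result, carry


print(ripple_carry_add(12345, 54321, 16))
print(carry_select_add(12345, 54321, 16))
```

Both functions return the same result; in hardware the carry-select version is faster because the two candidate results of each block are ready before the incoming carry arrives, at the cost of duplicating the RCA in every block.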

To improve all the design parameters simultaneously, we propose a novel adder that uses approximate full adders at the least significant bit (LSB) positions. Accurate mirror adders are used at the most significant bit (MSB) positions to limit the introduced error, while approximate mirror adders are used at the LSB positions to improve the design metrics. The resulting approximate RCA 1 (ARCA1) is shown in Figure 4.
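The LSB/MSB partitioning can be illustrated with a small behavioural model in Python. Several choices here are assumptions made only for the sketch: the number of approximate LSB positions (`approx_bits`, not stated in the text), and the use of the "Sum = NOT Cout" cell described in Section III as the approximate full adder. It is not a model of any specific ARCA variant.

```python
def exact_fa(a, b, cin):
    """Exact 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def approx_fa_sum_not_cout(a, b, cin):
    """Illustrative approximate cell: exact carry, Sum forced to NOT Cout."""
    cout = (a & b) | (b & cin) | (a & cin)
    return 1 - cout, cout


def approx_rca(x, y, width=16, approx_bits=4, approx_cell=approx_fa_sum_not_cout):
    """Ripple-carry adder with approximate cells in the lowest `approx_bits` positions."""
    result, carry = 0, 0
    for i in range(width):
        cell = approx_cell if i < approx_bits else exact_fa
        s, carry = cell((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result | (carry << width)  # final carry becomes bit `width`


x, y = 12345, 54321
print("exact :", x + y)
print("approx:", approx_rca(x, y))
```

Because this particular cell keeps the carry exact, the error stays confined to the approximate LSB positions; a cell that also approximates the carry could propagate errors into the accurate MSB part.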



Similarly, the other approximate RCAs are designed using the other approximate full adders, as shown in Figure 5.



**(b) Approximate RCA-3 (ARCA3)**



From the architectural diagrams, we can see that the proposed adders significantly reduce the area owing to the reduced number of transistors in the approximate full adders. The next section discusses the experimental setup and methodology used to evaluate the proposed approximate adders.

## **V. EXPERIMENTAL RESULTS & ANALYSIS**

This section introduces the error metrics and the design metrics used to evaluate the proposed design.

**Error Metrics**: The error metrics are evaluated by modeling the proposed adders in MATLAB and simulating them with 1 million random input patterns. To model an approximate adder, the approximate full adder is first described according to the truth table given in Table 1. Using these approximate full adders together with accurate full adders, 16-bit adders are built. These approximate adders are then simulated with 1 million random inputs and the corresponding error metrics are evaluated. The error metrics in Table 2 show that the proposed adders have a low mean error, which is acceptable for most error-tolerant applications. From the table it can be seen that ARCA1 has a lower mean error and MSE than ARCA2 and ARCA3, and a higher PSNR, which is desirable. However, these better error metrics come at the cost of slightly worse design metrics, as can be seen from the design metrics in Table 4. Thus, ARCA1 is better suited to applications that tolerate less error, while ARCA2 and ARCA3 suit applications that can tolerate more error.

**Table 2: Comparison of error metrics**

| Adder Type | Mean Error $(\mu)$    | MSE   | Std. dev. $(\sigma)$ | PSNR   |
|------------|-----------------------|-------|----------------------|--------|
| ARCA1      | $7.28 \times 10^{-4}$ | 0.443 | $2.1 \times 10^{-3}$ | 125.9  |
| ARCA2      | $1.3 \times 10^{-3}$  | 1.123 | $3.4 \times 10^{-3}$ | 116.7  |
| ARCA3      | $.6x10$ <sup>*</sup>  | 0.970 | $3.1 \times 10^{-3}$ | 118.13 |
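The evaluation flow described above (random input patterns, comparison against exact addition, then mean error, MSE, standard deviation and PSNR) can be sketched in Python as follows. This is not the MATLAB model used in the paper, and it will not reproduce the values in Table 2. The assumptions are labelled in the code: the approximate cell (Sum = NOT Cout), the number of approximate LSBs, the unnormalised error definition, the PSNR peak value, and a sample count reduced from 1 million to keep the run short.

```python
import math
import random


def exact_fa(a, b, cin):
    """Exact 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def approx_fa(a, b, cin):
    """Assumed approximate cell for this sketch: exact carry, Sum = NOT Cout."""
    cout = (a & b) | (b & cin) | (a & cin)
    return 1 - cout, cout


def approx_rca(x, y, width, approx_bits):
    """Ripple-carry add with approximate cells in the lowest approx_bits positions."""
    result, carry = 0, 0
    for i in range(width):
        cell = approx_fa if i < approx_bits else exact_fa
        s, carry = cell((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result | (carry << width)


def error_metrics(width=16, approx_bits=4, samples=20000, seed=0):
    """Monte Carlo error metrics; error = approximate result - exact result."""
    rng = random.Random(seed)
    peak = (1 << (width + 1)) - 1  # assumed peak: largest (width+1)-bit output
    errors = []
    for _ in range(samples):
        x, y = rng.getrandbits(width), rng.getrandbits(width)
        errors.append(approx_rca(x, y, width, approx_bits) - (x + y))
    mean = sum(errors) / samples
    mse = sum(e * e for e in errors) / samples
    std = math.sqrt(max(mse - mean * mean, 0.0))
    psnr = float("inf") if mse == 0 else 10 * math.log10(peak * peak / mse)
    return {"mean": mean, "mse": mse, "std": std, "psnr": psnr}


print(error_metrics())
```

Increasing `approx_bits` in this sketch raises the MSE and lowers the PSNR, which mirrors the accuracy versus design-metric trade-off discussed for ARCA1, ARCA2 and ARCA3.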

**Design Metrics:** To estimate the design metrics of the proposed adder architectures and other well-known adders, all the designs are implemented in Tanner 14.1 and simulated with a 45 nm technology file. For a fair comparison, identical transistor sizing and the same power supply are used for the proposed and reference designs. The three primary design parameters, namely area, power and delay, are determined to evaluate the effectiveness of the proposed adders. The schematic of AMA1 is shown in Figure 6.



**Figure 6: Schematic diagram of AMA1**

Using these AMA cells, ARCA1 is implemented in Tanner as shown in Figure 7.



**Figure 7: Schematic diagram of ARCA1 on Tanner**

Similarly, the other designs are also implemented in Tanner to evaluate the design metrics. Table 3 shows the design parameters of the conventional accurate full adder and all the approximate full adders. From the table, we can see that the proposed AMA1, AMA2 and AMA3 reduce the PDP by 40.58%, 59.94% and 49.96%, respectively, over the conventional full adder. The designs also show a significant reduction in area, power and delay simultaneously.

**Table 3: Comparison of design metrics**

| FA Type | Area (# Tran.) | Power (µW) | Delay (ps) | PDP (fJ) |
|---------|----------------|------------|------------|----------|
| CMA     | 24             | 0.117      | 64.4       | 0.754    |
| AMA1    | 16             | 0.09       | 53.1       | 0.448    |
| AMA2    | 14             | 0.114      | 26.7       | 0.304    |
| AMA3    | 11             | 0.093      | 44.1       | 0.415    |
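The PDP reduction relative to the conventional mirror adder (CMA) follows from the PDP column of Table 3 as (PDP of CMA minus PDP of the approximate adder) divided by the PDP of CMA. The Python sketch below applies this formula to the table entries as printed; because those entries are rounded, the recomputed percentages may differ from the figures quoted in the text.

```python
# PDP values in fJ, copied from the PDP column of Table 3.
pdp_fj = {"CMA": 0.754, "AMA1": 0.448, "AMA2": 0.304, "AMA3": 0.415}


def pdp_reduction_percent(reference, candidate):
    """Percentage reduction of `candidate` relative to `reference`."""
    return 100.0 * (reference - candidate) / reference


reductions = {
    name: round(pdp_reduction_percent(pdp_fj["CMA"], value), 2)
    for name, value in pdp_fj.items()
    if name != "CMA"
}

for name, percent in reductions.items():
    print(f"{name}: {percent}% lower PDP than CMA")
```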

Similarly, the design metrics of the proposed approximate adders and other accurate adders are evaluated and shown in Table 4. From the table, we can see that the proposed ARCA1, ARCA2 and ARCA3 reduce the PDP by 12.27%, 22.7% and 67.6%, respectively, over the conventional adder. The designs also show a significant reduction in area, power and delay simultaneously. Thus, the proposed ARCA adders can be used effectively in portable battery-operated devices where power and energy take priority over accuracy.





## **VI. CONCLUSION**

In this paper, we have proposed novel approximate adders that improve power, area and delay simultaneously at the cost of a minor, acceptable error. The proposed adders are well suited to image/video processing applications that can tolerate a small amount of error. The efficacy of the proposed adders is evaluated by modeling them in MATLAB and implementing them in Tanner. The simulation results show that the proposed adders achieve an acceptable PSNR, while the design metrics show a significant reduction in area, power and delay over the accurate adder architecture. Present-day mobile and other battery-operated devices demand highly energy-efficient arithmetic operations, and the proposed adders are well suited to these devices.

## **REFERENCES**

[1] V. Gupta, D. Mohapatra, S. Park, A. Raghunathan, and K. Roy, "IMPACT: Imprecise adders for low-power approximate computing," in *Low Power Electronics and Design (ISLPED) 2011 International Symposium on*, Aug. 2011, pp. 409–414.

[2] Ning Zhu, et al., "An Enhanced Low-Power High-Speed Adder For Error-Tolerant Application," *International Symposium on IC*, vol. 18, no. 8, pp. 69–72, Aug. 2009.

[3] Ning Zhu, et al., "Design of Low-Power High-Speed Truncation-Error-Tolerant Adder and Its Application in Digital Signal Processing," *IEEE Trans. on VLSI*, vol. 18, no. 8, pp. 1225–1229, Aug. 2010.

[4] Melvin A. Breuer and Haiyang Zhu, "Error-tolerance and multi-media," in *Proc. of the 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing*, 2006.

[5] M. A. Breuer, S. K. Gupta, and T. M. Mak, "Design and error-tolerance in the presence of massive numbers of defects," *IEEE Design Test Computer*, vol. 24, no. 3, pp. 216–227, May-Jun. 2004.

[6] B. Ramkumar, H. M. Kittur, and P. M. Kannan, "ASIC implementation of modified faster carry save adder," *Eur. J. Sci. Res.*, vol. 42, no. 1, pp. 53–58, 2010.

[7] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, "Low-power digital signal processing using approximate adders," *Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on*, vol. 32, no. 1, pp. 124–137, Jan. 2013.

[8] P. Kulkarni, P. Gupta, and M. Ercegovac, "Trading accuracy for power with an underdesigned multiplier architecture," in *VLSI Design (VLSI Design), 2011 24th International Conference on*, 2011, pp. 346–351.