Leakage Power Reduction with Various IO Standards and Dynamic Voltage Scaling in Vedic Multiplier on Virtex-6 FPGA

The 8-bit design processes 256 times as many input combinations as the 4-bit Vedic multiplier while using approximately 6 times the basic elements, 2 times the IO buffers, and approximately 1.5 times the total power dissipation. HSTL_I_12, SSTL18_I and LVCMOS12 are the most energy efficient IO standards in the HSTL, SSTL and LVCMOS families respectively. Device static power and design static power are the two types of static power dissipation. Device static power, also known as leakage power, is dissipated when the device is powered but not configured. Design static power is the power dissipated when the bit file of the design is downloaded to the FPGA but there is no switching activity. The design static power dissipation of the 8-bit Vedic multiplier is almost double that of the 4-bit Vedic multiplier. The device static (leakage) power dissipation of the 8-bit Vedic multiplier is almost equal to that of the 4-bit Vedic multiplier on a 40nm FPGA.


Introduction
The Input Output Standard (IOSTANDARD) constraint is both a mapping constraint and a synthesis constraint. Modern Programmable Logic Devices (PLDs), such as Field Programmable Gate Arrays (FPGAs), support a variety of input/output (I/O) standards from 29 logic families, as shown in Table 1. In this work, three of the 29 logic families available on the FPGA are used: Low Voltage Complementary Metal Oxide Semiconductor (LVCMOS), Stub Series Terminated Logic (SSTL) and High-Speed Transceiver Logic (HSTL). In the LVCMOS family, LVCMOS12 delivers the best power specific performance and LVCMOS25 delivers the worst [1, 3, 4]. In the HSTL family, HSTL_I_12 delivers the best power specific performance and HSTL_I_DCI_18 delivers the worst [3, 4]. In the SSTL family, SSTL18_I delivers the best power specific performance and SSTL2_II_DCI (SSTL2_D) delivers the worst [4].
Indian Journal of Science and Technology, Vol 9(25)
The XC4000 FPGA was based on 500nm technology, which uses a 5V supply voltage. The supply voltage was reduced to 3.3V with the Spartan-XL FPGA family and again to 2.5V with the Virtex FPGA family. With 40nm technology based FPGAs, the supply voltage is in the range of -0.5V to 1.1V, as shown in Table 1.

Applicable Elements
The IO standard is applicable in Verilog code, VHSIC Hardware Description Language (VHDL) code, the User Constraint File (UCF), the Physical Constraint File (PCF) and the Xilinx Constraint File (XCF). The target design is a multiplier because multipliers are widely used in wireless communication. Energy efficient wireless communication requires an energy efficient multiplier. A CMOS 65 nm sigma delta frequency synthesizer with an embedded capacitor multiplier supports gigabit baseband rates for millimeter wave communication [9]. The capacitor multiplier achieves an equivalent value of 540 pF and saves 90% of the area of the main capacitor [9]. A multiplier circuit and wireless communication apparatus that adjust the output level of a desired multiple wave to a desired range are discussed in [10]. An apparatus for receiving signals, which includes a Low Noise Amplifier (LNA) configured to receive a Radio Frequency (RF) signal, uses 16 multipliers [11].
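As a minimal sketch of how an IO standard might be applied through a UCF, the Python snippet below generates one IOSTANDARD constraint per pin. The port names A11, B11 and product are taken from the Figure 2 caption. The helper name `ucf_iostandard_lines` is hypothetical, pin location constraints are omitted, and it is an assumption that only the data pins are constrained.

```python
def ucf_iostandard_lines(ports: dict[str, int], iostandard: str) -> list[str]:
    """Return one UCF IOSTANDARD constraint per bit of each port."""
    lines = []
    for name, width in ports.items():
        for bit in range(width):
            # UCF addresses individual bus bits with angle brackets, e.g. A11<0>
            lines.append(f'NET "{name}<{bit}>" IOSTANDARD = {iostandard};')
    return lines


# Ports of the 8-bit multiplier as named in the Figure 2 caption
ports_8bit = {"A11": 8, "B11": 8, "product": 16}
constraints = ucf_iostandard_lines(ports_8bit, "LVCMOS12")

print("\n".join(constraints[:3]))
print(len(constraints), "constrained pins")
```

Swapping the second argument for HSTL_I_12 or SSTL18_I would produce the corresponding constraints for the other two families studied here.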

Top Level Schematic of Vedic Multiplier
Figure 1 shows the top level schematic of the 8-bit Vedic multiplier, which is based on the Vedic formula called Urdhva-Tiryagbhyam.

RTL Schematic of Vedic Multiplier
This 8-bit Vedic multiplier uses four 4-bit Vedic multipliers and three 8-bit full adders, as shown in Figure 2.
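The decomposition into four half-width multipliers and three additions can be sketched behaviorally in Python. This is an illustrative model only, not the RTL of this work: the function name `vedic_mul`, the recursion down to 1-bit products, and the exact order in which the three additions are combined are assumptions.

```python
def vedic_mul(a: int, b: int, width: int) -> int:
    """Multiply two unsigned `width`-bit integers (width a power of 2)
    using four half-width multiplications and three additions."""
    if width == 1:
        return a & b  # a 1-bit product is a single AND gate

    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # Four half-width multipliers (the role played by the four 4-bit
    # blocks in the 8-bit case)
    q0 = vedic_mul(a_lo, b_lo, half)
    q1 = vedic_mul(a_hi, b_lo, half)
    q2 = vedic_mul(a_lo, b_hi, half)
    q3 = vedic_mul(a_hi, b_hi, half)

    # Three additions combine the partial products
    cross = q1 + q2                  # adder 1: crosswise terms
    upper = cross + (q3 << half)     # adder 2: add the high product
    return q0 + (upper << half)      # adder 3: add the low product


print(vedic_mul(0xFF, 0xFF, 8))
```

In hardware the adder widths are fixed; the Python integers here are unbounded, so the sketch checks the arithmetic of the decomposition rather than the bit widths of the adders.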

Logic Utilization
The 4-bit Vedic multiplier can multiply 256 combinations of two 4-bit inputs, from 0×0 to 15×15. The 8-bit Vedic multiplier can multiply 65536 combinations of two 8-bit inputs, from 0×0 to 255×255. The 8-bit design therefore handles 256 times as many input combinations while using approximately 6 times the basic elements and 2 times the IO buffers. The 4-bit Vedic multiplier uses 26 basic elements, compared with 153 basic elements in the 8-bit Vedic multiplier. LUTN denotes an N-input LUT. LUT2, LUT3, LUT4, LUT5, and LUT6 are the five different LUTs available in the 40nm technology based Virtex-6 FPGA. There is no LUT3 or LUT5 in the 4-bit Vedic multiplier. In the 8-bit Vedic multiplier, Xilinx Synthesis Technology (XST) uses all five LUT types. The 4-bit Vedic multiplier uses 50% fewer IO buffers, 90.91% fewer LUT2, 100% fewer LUT3, 73.81% fewer LUT4, 100% fewer LUT5, and 80.56% fewer LUT6 than the 8-bit Vedic multiplier, as shown in Table 2.
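The input-combination and basic-element ratios quoted above follow from simple arithmetic, sketched here in Python. The element counts (26 and 153) are the figures stated in the text; the function name is illustrative.

```python
def input_combinations(width: int) -> int:
    """Number of operand pairs for two unsigned `width`-bit inputs."""
    return (2 ** width) ** 2


combos_4bit = input_combinations(4)  # 16 * 16 operand pairs
combos_8bit = input_combinations(8)  # 256 * 256 operand pairs

# Basic element counts quoted in the text
elements_4bit, elements_8bit = 26, 153

combo_ratio = combos_8bit // combos_4bit
element_ratio = elements_8bit / elements_4bit

print(combos_4bit, combos_8bit, combo_ratio)
print(f"basic element ratio: {element_ratio:.2f}")
```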

Power Dissipation on FPGA
Total FPGA power = Device Static + Design Static + Design Dynamic
Dynamic power is modeled to account for load capacitance, supply voltage, and operating frequency. In this paper, the primary concern is the analysis and reduction of static power. There are two types of static power: device static power and design static power. Device static power, also known as leakage power, is dissipated when the device is powered but not configured. Device static power is mainly sensitive to ambient temperature. Design static power is the additional power dissipated when the device is configured but there is no switching activity. Design static power includes static power in I/O DCI terminations, clock managers, etc.
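The power breakdown above can be written as a small Python sketch. The text only states that dynamic power depends on load capacitance, supply voltage, and operating frequency; the activity-factor form alpha * C * V^2 * f used below is the standard textbook CMOS approximation and is an assumption here, not necessarily the model used by the vendor power tool. All numeric values are hypothetical and are not measurements from this work.

```python
def dynamic_power(activity: float, capacitance_f: float, vdd: float, freq_hz: float) -> float:
    """Textbook switching-power estimate: alpha * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd ** 2 * freq_hz


def total_fpga_power(device_static: float, design_static: float, design_dynamic: float) -> float:
    """Total FPGA power = device static + design static + design dynamic."""
    return device_static + design_static + design_dynamic


# Hypothetical values for illustration only
p_dyn = dynamic_power(activity=0.125, capacitance_f=10e-12, vdd=1.0, freq_hz=100e6)
p_total = total_fpga_power(device_static=0.5, design_static=0.25, design_dynamic=p_dyn)

print(f"dynamic = {p_dyn:.6f} W, total = {p_total:.6f} W")
```

The quadratic dependence on supply voltage in this approximation applies to the dynamic term; the static terms studied in this paper are reported from measurements in the tables rather than from a formula.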

Working of Vedic Multiplier Based on Urdhva-Tiryagbhyam Sutra in Vedic Mathematics
Urdhva-Tiryagbhyam means vertically and crosswise.
Let us consider the multiplication of 24 by 34.
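For this case the sutra proceeds in three steps. Vertically, the units digits give 4 × 4 = 16: write 6 and carry 1. Crosswise, 2 × 4 + 4 × 3 = 20, plus the carry gives 21: write 1 and carry 2. Vertically, the tens digits give 2 × 3 = 6, plus the carry gives 8. The product is 816. A minimal Python sketch of the same steps follows; the function name is illustrative.

```python
def urdhva_two_digit(x: int, y: int) -> int:
    """Multiply two decimal numbers below 100 vertically and crosswise."""
    x_tens, x_units = divmod(x, 10)
    y_tens, y_units = divmod(y, 10)

    # Step 1 (vertical): units digit times units digit
    carry, d0 = divmod(x_units * y_units, 10)
    # Step 2 (crosswise): sum of the two cross products plus the carry
    carry, d1 = divmod(x_tens * y_units + x_units * y_tens + carry, 10)
    # Step 3 (vertical): tens digit times tens digit plus the carry
    top = x_tens * y_tens + carry

    return top * 100 + d1 * 10 + d0


print(urdhva_two_digit(24, 34))  # prints 816
```

The binary multiplier applies the same pattern with bits in place of decimal digits, so each digit product becomes an AND operation.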

Power Analysis of 8-bit Vedic Multiplier on 40nm FPGA
SSTL dissipates the most power among the three IO standard families, whereas LVCMOS is the most power optimized IO standard family available on the 40nm FPGA, as shown in Table 6. At 0.5V, power dissipation is reduced by 69.26% and 58.28% when LVCMOS and HSTL, respectively, are used in place of SSTL. As the supply voltage increases, the difference in power dissipation decreases. At 1.5V, there is only a 26.76% and 22.17% reduction in power dissipation when LVCMOS12 and HSTL_I_12, respectively, are used in place of SSTL2_D.
Device static power is also known as leakage power. Leakage power is not significantly affected by variation in IO standard, whether within the same IO standard family or across families. Between SSTL and LVCMOS12, there is a 6.14% change in power dissipation at 1.5V and a 0.67% change at 0.5V, as shown in Table 7. When the supply voltage is scaled down from 1.5V to 0.5V, there is an 86.84%, 87.11% and 87.58% reduction in leakage power for LVCMOS25, HSTL_D18 and SSTL2_D respectively, as shown in Table 5. When HSTL_I_12, SSTL18_I and HSTL_D18 are used in place of SSTL2_D, there is an 84.09%, 83.06%, and 66.31% reduction, respectively, in the design static power dissipation of the Vedic multiplier at 0.5V supply voltage, as shown in Table 8. Voltage scaling has little effect on this reduction: a similar percentage reduction with variation in IO standard is observed at 1.0V, 1.2V and 1.5V as at 0.5V. There is a 26-27mW reduction in power for both HSTL and SSTL IO standards, as shown in Table 6.
There is an 84.1%, 83.01%, and 66.39% reduction in the design static power dissipation of the Vedic multiplier when HSTL_I_12, SSTL18_I and HSTL_D18, respectively, are used in place of SSTL2_D at 0.5V supply voltage, as shown in Table 9.

For LVCMOS IO Standard
When the supply voltage is scaled from 1.5V to 1.2V, 1.0V and 0.5V, there is a 63.08%, 77.04%, and 86.86% saving in total power dissipation respectively for the LVCMOS12 IO standard on the 40nm FPGA. With LVCMOS25, a similar saving in power dissipation is observed, as shown in Table 10. With the LVCMOS IO standard, the power dissipation of the 4-bit and 8-bit Vedic multipliers is the same.
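Because the three LVCMOS12 savings are all stated relative to 1.5V, the saving between two of the lower voltages can be derived from them. The sketch below is a derived illustration from the quoted percentages, not an additional measurement; the function name `rebase_saving` is hypothetical.

```python
def rebase_saving(saving_to_target: float, saving_to_mid: float) -> float:
    """Percent saving at the target voltage relative to an intermediate voltage.

    Both arguments are percent savings relative to the same baseline voltage.
    """
    remaining_target = 1 - saving_to_target / 100
    remaining_mid = 1 - saving_to_mid / 100
    return (1 - remaining_target / remaining_mid) * 100


# LVCMOS12 savings relative to 1.5 V, as quoted in the text
saving_1v2, saving_1v0, saving_0v5 = 63.08, 77.04, 86.86

saving_0v5_vs_1v2 = rebase_saving(saving_0v5, saving_1v2)
saving_0v5_vs_1v0 = rebase_saving(saving_0v5, saving_1v0)

print(f"0.5 V vs 1.2 V: {saving_0v5_vs_1v2:.2f}%")
print(f"0.5 V vs 1.0 V: {saving_0v5_vs_1v0:.2f}%")
```

The same rebasing applies to any of the per-standard savings reported against the 1.5V baseline.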

Data Collection Phase
The HSTL IO standard is used in both the 4-bit and the 8-bit Vedic multiplier. With HSTL_I_12, the 8-bit Vedic multiplier dissipates just 13.16% more power than the 4-bit Vedic multiplier, whereas it processes 256 times as many input combinations (65536 versus 256; equivalently, the 4-bit multiplier covers 99.61% fewer combinations), as shown in Table 11. For the 8-bit Vedic multiplier with the HSTL_I_12 IO standard, there is a 63.21%, 77.09% and 86.93% reduction in leakage power when the supply voltage is scaled down from 1.5V to 1.2V, 1.0V and 0.5V respectively, as shown in Table 12. The design static power dissipation of the 8-bit Vedic multiplier is almost double that of the 4-bit Vedic multiplier, as shown in Table 13. In the 8-bit Vedic multiplier, when the supply voltage is scaled from 1.5V to 1.2V, 1.0V and 0.5V, there is a 2.75%, 4.81%, and 9.28% saving in design static power with HSTL_I_12 and a 1.36%, 2.22%, and 4.44% saving in design static power with HSTL_D18.
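The 99.61% figure is the share of the 8-bit input space that the 4-bit multiplier does not cover, which differs from a "percent more" comparison taken against the 4-bit count. A short Python sketch makes the two definitions explicit; the function names are illustrative.

```python
def percent_fewer(smaller: float, larger: float) -> float:
    """How much smaller `smaller` is than `larger`, as a percent of `larger`."""
    return (larger - smaller) / larger * 100


def percent_more(larger: float, smaller: float) -> float:
    """How much larger `larger` is than `smaller`, as a percent of `smaller`."""
    return (larger - smaller) / smaller * 100


combos_4bit, combos_8bit = 256, 65536  # combination counts quoted in the text

fewer = percent_fewer(combos_4bit, combos_8bit)
more = percent_more(combos_8bit, combos_4bit)

print(f"4-bit covers {fewer:.2f}% fewer combinations than 8-bit")
print(f"8-bit covers {more:.0f}% more combinations than 4-bit")
```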

SSTL IO Standards
With the SSTL18_I and SSTL2_D IO standards, the 8-bit Vedic multiplier dissipates only 13.81% and 34.61% more power, respectively, than the 4-bit Vedic multiplier at 0.5V supply voltage, whereas its data width is double that of the 4-bit Vedic multiplier. For SSTL18_I, the most energy efficient IO standard in the SSTL family, an 82.97%, 73.43% and 60.09% reduction in power dissipation is achieved at 0.5V compared with 1.5V, 1.2V and 1.0V supply voltage respectively, as shown in Table 14.
The device static (leakage) power dissipation of the 8-bit Vedic multiplier is almost equal to that of the 4-bit Vedic multiplier on the 40nm FPGA. For the 8-bit Vedic multiplier with the SSTL2_D IO standard, 63.83%, 77.78%, and 87.58% of the power is saved when the supply voltage is scaled from 1.5V to 1.2V, 1.0V and 0.5V respectively, as shown in Table 15. The design static power of the 8-bit Vedic multiplier is almost double that of the 4-bit Vedic multiplier, as shown in Table 16.
For the SSTL18_I IO standard, when the supply voltage is scaled down from 1.5V to 1.2V, 1.0V and 0.5V, there is a 2.6%, 4.22% and 8.76% reduction in design static power dissipation respectively.

Conclusion
The 8-bit Vedic multiplier multiplies 65536 combinations of two 8-bit inputs, from 0×0 to 255×255, compared with 256 combinations of two 4-bit inputs, from 0×0 to 15×15, for the 4-bit Vedic multiplier. Our 8-bit design processes 256 times as many input combinations as the 4-bit Vedic multiplier while using only about 6 times the basic elements, 2 times the IO buffers, approximately 1.5 times the total power dissipation and 2 times the design static power dissipation. HSTL_I_12, SSTL18_I and LVCMOS12 are the most energy efficient IO standards in the HSTL, SSTL and LVCMOS families respectively. Device static power and design static power are the main components of static power dissipation. Both the 4-bit and 8-bit Vedic multipliers are implemented on a 40nm FPGA. When the supply voltage is scaled down from 1.5V to 1.2V, 1.0V and 0.5V, there is a 63.08%, 77.04%, and 86.86% saving in total power dissipation respectively for the LVCMOS12 IO standard. With HSTL_I_12, the 8-bit Vedic multiplier dissipates just 13.16% more power than the 4-bit Vedic multiplier, whereas it processes 256 times as many input combinations (65536 versus 256; equivalently, the 4-bit multiplier covers 99.61% fewer combinations). The design static power dissipation of the 8-bit Vedic multiplier is almost double that of the 4-bit Vedic multiplier. The device static (leakage) power dissipation of the 8-bit Vedic multiplier is almost equal to that of the 4-bit Vedic multiplier. For the 8-bit Vedic multiplier with the SSTL2_D IO standard, 63.83%, 77.78%, and 87.58% of device static power is saved when the supply voltage is scaled down from 1.5V to 1.2V, 1.0V and 0.5V respectively.

Future Scope
This design is implemented on a 40nm technology based FPGA. In future, this design can be re-implemented on a 28nm FPGA and a 20nm UltraScale FPGA. This design of the 8-bit Vedic multiplier can be extended to a 16-bit Vedic multiplier and a 32-bit Vedic multiplier. There is also open scope to integrate this multiplier into an existing ALU and FIR filter to design a Vedic ALU, Vedic FIR filter, and Vedic math co-processor. Here, we are using the LVCMOS, HSTL and SSTL IO standards. There is also wide scope for other IO standards.

Figure 2. LAB1, LAB2, LAB3, and LAB4 are four instances of the 4-bit Vedic multiplier. FA11, FA12 and FA13 are three instances of the full adder. Inputs are A11 [7:0] and B11 [7:0]. Output is product [15:0].

Figure 1. Top Level Schematic of 8-bit Vedic Multiplier.

Figure 3. Comparison of Design Static Power Dissipation for SSTL18_I.

Figure 4. Comparison of Design Static Power Dissipation for SSTL2_D.
A high speed, low power digital multiplier that takes advantage of Vedic multiplication algorithms together with a very efficient leakage control technique called McCMOS technology is designed in [12]. This work extends McCMOS technology to the LVCMOS, HSTL and SSTL IO standards available on 40nm process technology to control both device static and design static power.

Table 4. Total Power Dissipation on 40nm FPGA

Table 6. Design Static Power Dissipation on 40nm FPGA

Table 7. Total Power Dissipation on 40nm FPGA

Table 9. Design Static Power Dissipation on 40nm FPGA

Table 10. Total Power Dissipation of Vedic Multiplier Using LVCMOS on 40nm FPGA

Table 11. Total Power Dissipation of Vedic Multiplier Using HSTL on 40nm FPGA

Table 12. Device Static (Leakage) Power of Vedic Multiplier Using HSTL on 40nm FPGA

Table 14. Total Power Dissipation of Vedic Multiplier Using SSTL on 40nm FPGA

Table 15. Device Static (Leakage) Power of Vedic Multiplier Using SSTL on 40nm FPGA