Swarm Based Approach for the Optimization of Power-Delay Product in a CMOS Repeater Driven RC Interconnect Line

DOI : 10.17577/IJERTV2IS90757


Mohit Gupta, M.Tech. Student (ECE Department), GNDEC, Ludhiana

Navneet Kaur, Assistant Prof. (ECE Department), GNDEC, Ludhiana

Dr. Sandeep Singh Gill, Associate Prof. (ECE Department), GNDEC, Ludhiana

Abstract

In this work, the Particle Swarm Optimization (PSO) technique has been applied to a CMOS repeater-driven RC interconnect line to minimize the power-delay product. The work aims at minimizing the delays associated with the RC parasitics at 180 nm technology and hence optimizing the power-delay product. Researchers have proposed various techniques to reduce interconnect delay; the most effective is to insert buffers along the RC interconnect to drive large interconnect loads. Although inserting repeaters reduces the overall delay, a non-optimal number of repeaters in the chain may increase both delay and power dissipation. There is therefore a need to optimize parameters such as the number of repeaters, the repeater size and the applied voltage. In this work, the power-delay product has been optimized using the evolutionary technique known as particle swarm optimization. Simulations are also carried out using the Cadence Spectre tool at 180 nm technology, and the results are compared with those obtained from the PSO simulation in MATLAB.

Keywords: Interconnects, Repeaters, Optimize, Lumped, Power-Delay Product

  1. Introduction

    In recent years, the size of complementary metal-oxide-semiconductor (CMOS) integrated circuits has continued to decrease. As we move into deep submicron technology, long interconnect delays have become a common on-chip feature. As a result, wire delays dominate over gate delays. With a linear increase in length, interconnect delay increases quadratically because both the interconnect resistance and capacitance increase linearly [1], [2]. Large interconnect loads not only affect circuit performance but also degrade the waveform shape and cause excessive short-circuit power to be dissipated in the stage loading a CMOS logic gate, filter, etc.

    Interconnect delay is the combined effect of the parasitic resistance, capacitance and inductance of the line. Each of these parasitic elements increases linearly with the interconnect length. This increase in the interconnect parameters becomes most pronounced as feature sizes enter the nanometer era. Many design techniques have therefore been developed to minimize the propagation delay of global interconnects. Repeaters are often used to minimize the delay of signals propagating through interconnect lines that are best modeled as an RC impedance. However, the number of repeaters plays a key role in the overall delay of the repeater-loaded interconnect: if the number of repeaters in the chain is not optimal, the delay increases. Moreover, as the number of repeaters in a chain increases, the power dissipation also increases. The aim is therefore to optimize the number of repeaters for minimum interconnect delay. The repeater size and the applied voltage also affect the RC interconnect delay. There should thus be an optimal value of each of these factors that minimizes delay and power together, that is, minimizes the power-delay product.

  2. Lumped RC Interconnect Model

    The electrical properties of an interconnect wire are described mainly by three parameters: capacitance, resistance and inductance. These parasitic elements affect the electrical behavior of the circuit and influence its delay, power dissipation and reliability. In this paper, interconnects with RC parasitic elements are considered, using a simple L-shaped model [3].

    Figure 1. Interconnect RC models

    RC delays should only be considered when the rise (fall) time at the line input is smaller than RC, the rise (fall) time of the line. When this condition is not met, the change in signal is slower than the propagation delay of the wire, and a lumped capacitive model suffices.

  3. Interconnect Delay Reduction using Repeaters

    With the continued scaling of process technology, the interconnect resistance per unit length continues to increase, the capacitance per unit length remains roughly constant and logic delay continues to decrease. These trends have caused interconnect delay to dominate logic delay. Interconnect-driven timing optimization techniques such as wire sizing, buffer insertion and gate sizing have gained widespread acceptance in deep submicron design. In particular, buffer insertion has been successful in reducing interconnect delay. To first order, interconnect delay is proportional to the square of the wire length. Inserting buffers effectively divides the wire into smaller segments, which makes the interconnect delay almost linear in length (plus the buffer delays). Since the delay of a long unbuffered line is quadratic in its length, long interconnects are divided into a number of segments with repeaters or buffers [4]. The most popular design approach to reduce the propagation delay of long wires is therefore to introduce intermediate buffers, also called repeaters, in the interconnect line.
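The quadratic-versus-linear argument above can be illustrated with a small Python sketch. It is only a first-order illustration, not the delay model used in this paper: each segment is treated as an independent lumped RC stage with a 90% delay of RC·ln(10), every segment adds a fixed buffer delay, and buffer loading is ignored. The function name `line_t90`, the line values and the buffer delay `T_BUF` are hypothetical.

```python
import math

LN10 = math.log(10.0)  # 90% point of a single-pole RC step response: t = RC * ln(10)


def line_t90(r_total, c_total, segments=1, t_buf=0.0):
    """First-order 90% delay of a line cut into equal lumped-RC segments.

    Assumption: each segment is an independent lumped RC stage and adds a
    fixed buffer delay t_buf; loading by the buffers themselves is ignored.
    """
    r_seg = r_total / segments
    c_seg = c_total / segments
    return segments * (LN10 * r_seg * c_seg + t_buf)


R, C = 1e3, 1e-12   # hypothetical line: 1 kOhm, 1 pF in total
T_BUF = 0.1e-9      # hypothetical delay added per buffer, in seconds

unbuffered = line_t90(R, C)
doubled = line_t90(2 * R, 2 * C)  # twice the length gives 2R and 2C

# Sweep the number of segments and keep the one with the smallest delay.
best_k = min(range(1, 11), key=lambda k: line_t90(R, C, k, T_BUF))

print(f"unbuffered: {unbuffered * 1e9:.2f} ns, doubled length: {doubled * 1e9:.2f} ns")
print(f"best segment count under these assumptions: {best_k}")
```

Under these assumptions the wire term falls as 1/k while the buffer term grows as k, so too many repeaters increase the delay again, which is the trade-off the repeater count has to resolve.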

    Figure 2. RC interconnect model having n repeaters [5]

    Figure 2 shows the equivalent circuit of a repeater-driven interconnect, where a number of equal-sized repeaters are inserted along the interconnect. When n uniform repeaters are inserted, the interconnect capacitance of each subsection (C/n) appears in parallel with the repeater capacitance (Crep).

    In this work, a resistive interconnect is considered, modeled as a lumped RC load with resistance R = 1 kΩ and capacitance C = 1 pF. To find the number of repeaters that gives minimum delay, the t90% delay has been calculated with repeaters inserted in the RC interconnect model. Figure 3 shows a long interconnect with RC parasitics and six repeaters inserted along it. The delay of the repeater-loaded interconnect is calculated while varying the supply voltage and the number of repeaters.
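As a point of reference for the delays reported later, the 90% delay of the bare lumped load can be worked out by hand. The derivation below assumes an ideal step input driving a single-pole RC stage and reads the stated load as R = 1 kΩ and C = 1 pF. It ignores the driver and is not the repeater-chain expression of this paper.

```latex
\begin{align*}
v(t) &= V_{DD}\left(1 - e^{-t/RC}\right) \\
0.9\,V_{DD} &= V_{DD}\left(1 - e^{-t_{90\%}/RC}\right)
  \;\Longrightarrow\; t_{90\%} = RC\,\ln 10 \approx 2.3\,RC \\
t_{90\%} &\approx 2.3 \times (1\,\mathrm{k\Omega}) \times (1\,\mathrm{pF}) \approx 2.3\,\mathrm{ns}
\end{align*}
```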

    Figure 3. RC interconnect with six repeaters

    Since the delay also depends on the supply voltage VDD and the transistor width Wn, for each repeater count the delay has been calculated for a set of VDD values at a particular Wn (MOS width). Two transistor sizings have been considered: Wp = 3 µm, Wn = 1 µm and Wp = 1.8 µm, Wn = 0.6 µm.

    Figure 4. CMOS inverter with Wp = 1.8µm; Wn = 0.6µm

    After calculating the t90% for a set of repeaters, readings have been analyzed for minimal delay with respect to voltage and technology.

    The analytical expression for ninety percent delay (t90%) of a repeater loaded RC interconnect is given by [6], [7], [8]:

    t90% = [expression not recoverable from the source text] (1)

    Similarly, the expression for an even number of repeaters is:

    t90% = [expression not recoverable from the source text] (2)

    The above delay expressions depend on the saturation drain conductances Gdon and Gdop, defined for NMOS and PMOS respectively. Here VDD is the supply voltage, Crep is the repeater capacitance and n is the number of repeaters. Gdon, Gdop and Crep in turn depend on the supply voltage and the transistor widths (Wn, Wp).

    The dependence of Gdon, Gdop and Crep on VDD and Wn is given by the following expressions:

    $G_{don} = 0.5\,(W_n)^{0.95}\,(V_{DD})^{1.29}$ (in mS)

    $C_{rep} = 85.78\,(W_n)^{0.575}\,(V_{DD})^{0.612}$ (in fF)

    $G_{dop} = 0.47\,(W_n)^{1.05}\,(V_{DD})^{1.51}$ (in mS)

    After substituting Gdon, Gdop and Crep into equation (1), the delay is obtained in terms of VDD, Wn and n.

  4. Power Minimization

    In this work, power analysis of repeater-loaded interconnects is also carried out. Because of the high integration densities and high speeds of today's VLSI circuits, power dissipation is a prime criterion. The total power dissipation includes dynamic power, short-circuit power and static (leakage) power. In long interconnects, the dynamic (switching) power and the short-circuit power are appreciably larger than the leakage power.

    Thus the total power dissipation in an interconnect is equal to the sum of the dynamic power, the short-circuit power and the leakage power. The expression is given by [4]:

    $P_{total} = P_{dynamic} + P_{short\text{-}circuit} + P_{leakage}$

    Since this work is based on long interconnects, the static power dissipation can be neglected for 180 nm technology [9].

    There are three power dissipation mechanisms:

    Dynamic Power: This is the power consumed in charging and discharging the load capacitance. Dynamic power has been well studied and is characterized by the following well-known expression:

    $P_{dynamic} = \alpha\,C_L\,V_{DD}^{2}\,f$

    Here f is the clock frequency, VDD is the supply voltage, $C_L$ is the load capacitance and α is the switching factor (here α = 1).

    In the repeater-loaded RC interconnect, the dynamic power is given by the expression

    $P_{dynamic}$ = [expression not recoverable from the source text]
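The fitted expressions for Gdon, Gdop and Crep, together with the standard dynamic power relation P = α·C_L·VDD²·f, are easy to evaluate numerically. The Python sketch below does so under two labeled assumptions: Wn is entered in µm and VDD in volts (the fits do not state their input units explicitly), and the function names are hypothetical. The 1 pF load and the 50 MHz frequency are the values quoted elsewhere in this paper.

```python
def g_don_mS(w_n_um, vdd):
    """NMOS saturation drain conductance in mS (Wn assumed in um, VDD in V)."""
    return 0.5 * w_n_um ** 0.95 * vdd ** 1.29


def g_dop_mS(w_n_um, vdd):
    """PMOS saturation drain conductance in mS (Wn assumed in um, VDD in V)."""
    return 0.47 * w_n_um ** 1.05 * vdd ** 1.51


def c_rep_fF(w_n_um, vdd):
    """Repeater capacitance in fF (Wn assumed in um, VDD in V)."""
    return 85.78 * w_n_um ** 0.575 * vdd ** 0.612


def dynamic_power_W(c_load_F, vdd, f_hz, alpha=1.0):
    """Standard switching power alpha * C_L * VDD^2 * f, in watts."""
    return alpha * c_load_F * vdd ** 2 * f_hz


# Evaluate the fits at the two supply voltages discussed in the results.
for vdd in (1.2, 1.8):
    print(f"VDD={vdd} V: Gdon={g_don_mS(1.0, vdd):.3f} mS, "
          f"Gdop={g_dop_mS(1.0, vdd):.3f} mS, Crep={c_rep_fF(1.0, vdd):.1f} fF")

# Switching power of the bare 1 pF wire capacitance at 1.8 V and 50 MHz.
p_wire = dynamic_power_W(1e-12, 1.8, 50e6)
print(f"wire-only dynamic power: {p_wire * 1e6:.0f} uW")
```

All three fitted quantities grow with both Wn and VDD, which is why the supply voltage and the transistor width appear alongside the repeater count as optimization variables.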

    Short-Circuit Power: This is the power consumed when the signal applied at the input of a CMOS inverter has a finite slew rate: a direct current path exists between VDD and ground while the input signal switches between Vtn and VDD+Vtp. It is a function of the input transition time, the output load capacitance and the transistor size [4]:

    $P_{short\text{-}circuit} = tr\,V_{DD}\,I_{sc}\,W_n\,f$

    Here Isc is the short-circuit current.

    Leakage Power: The leakage current comprises the sub-threshold current and the gate leakage current.

    $P_{leakage} = V_{DD}\,I_{Leakage}$

    where VDD is the supply voltage and ILeakage is the leakage current.

    Therefore the total power in this case is written as the sum of the dynamic and short-circuit power:

    $P_{total} = tr\,(\ldots + \ldots) + tr\,V_{DD}\,I_{sc}\,W_n\,f$ (3)

    The number of repeaters is also varied to minimize the delay and power. Total power dissipation is obtained using transient analysis. The simulations are carried out using the Cadence Virtuoso tool.

  5. Power-Delay Product

    The power-delay product is a fundamental parameter that is often used to measure the quality and performance of a CMOS process and of gate designs. As a physical quantity, the power-delay product can be interpreted as the average energy required for a gate to switch its output voltage from low to high and from high to low [10].

    At 180 nm technology, leakage power can be neglected, so in this work only dynamic power and short-circuit power are considered. Since a trade-off between power and delay is required, this work aims at finding the optimum number of repeaters at which the product of power and delay is minimum.

    Mathematically, PDP can be formulated as

    PDP = Power × Delay

    Using the expressions for power (3) and delay (1 or 2) above, the PDP has been obtained:

    $PDP = t_{90\%} \times P_{total}$ (4)

    where t90% is taken from equation (1) or (2) and P_total from equation (3). This is the final analytical expression of the power-delay product, which is to be optimized over three variables, namely the number of repeaters, the supply voltage and the transistor width.

  6. Optimization using Particle Swarm Technique

    The problem of interconnect optimization can also be solved using evolutionary algorithms. Recently, these algorithms have been gaining popularity for solving optimization problems in fields such as antennas, controllers and digital signal processing. VLSI is another field in which optimization problems can be solved effectively using these algorithms, in less time than with traditional simulation techniques. Kennedy and Eberhart (1995) proposed an approach called Particle Swarm Optimization, which is inspired by the choreography of a bird flock [11]. The idea of this approach is to simulate the movements of a group (population) of birds searching for food. The approach can be seen as a distributed behavioral algorithm that performs a multidimensional search. In the simulation, the behavior of each individual is affected by either the best local individual (i.e. within a certain neighborhood) or the best global individual. The approach thus uses the concept of a population and a measure of performance similar to the fitness value used in evolutionary algorithms.

    The PSO algorithm consists of just three steps, which are repeated until some stopping condition is met.

    1. Evaluate the fitness of each particle

    2. Update individual and global best fitnesses and positions

    3. Update velocity and position of each particle

    The first two steps are fairly trivial. Fitness evaluation is conducted by supplying the candidate solution to the objective function. Individual and global best fitnesses and positions are updated by comparing the newly evaluated fitnesses against the previous individual and global best fitnesses, and replacing the best fitnesses and positions as necessary. The velocity and position update step is responsible for the optimization ability of the PSO algorithm.

    The velocity of each particle in the swarm is updated using the following equation:

    $v_i(t+1) = w\,v_i(t) + c_1 r_1\,[\hat{x}_i(t) - x_i(t)] + c_2 r_2\,[g(t) - x_i(t)]$

    Here i is the particle index, $v_i(t)$ and $x_i(t)$ are the velocity and position of particle i at iteration t, $\hat{x}_i(t)$ is the individual best position of the particle and $g(t)$ is the global best position of the swarm.

    The parameters w, c1 and c2 (0 ≤ w ≤ 1.2, 0 ≤ c1 ≤ 2 and 0 ≤ c2 ≤ 2) are user-supplied coefficients. The values r1 and r2 (0 ≤ r1 ≤ 1 and 0 ≤ r2 ≤ 1) are random values regenerated for each velocity update.

    Once the velocity for each particle is calculated, each particle's position is updated by applying the new velocity to the particle's previous position:

    $x_i(t+1) = x_i(t) + v_i(t+1)$

    This process is repeated until some stopping condition is met. Some common stopping conditions include: a preset number of iterations of the PSO algorithm, a number of iterations since the last update of the global best candidate solution, or a predefined target fitness value.
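The three-step loop and the two update rules can be sketched in a few lines of Python. This is a minimal global-best PSO written for illustration, not the MATLAB implementation used in this work. The function name `pso_minimize`, the default coefficient values, the clamping of positions to the search bounds and the sphere test function are all assumptions made for the example, and a preset number of iterations is used as the stopping condition.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO; bounds is a list of (low, high) per dimension."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # individual best positions
    best_fit = [float("inf")] * n_particles     # individual best fitnesses
    g_pos, g_fit = pos[0][:], float("inf")      # global best position and fitness

    for _ in range(n_iter):                     # stopping rule: fixed iteration count
        # Step 1: evaluate the fitness of each particle.
        fits = [objective(p) for p in pos]
        # Step 2: update individual and global bests.
        for i, fit in enumerate(fits):
            if fit < best_fit[i]:
                best_fit[i], best_pos[i] = fit, pos[i][:]
            if fit < g_fit:
                g_fit, g_pos = fit, pos[i][:]
        # Step 3: update velocity and position of each particle.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()   # regenerated for every update
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamping to the bounds is an added assumption, not part of the text.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
    return g_pos, g_fit


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

The coefficients chosen here lie inside the ranges quoted above. Replacing `sphere` with a power-delay product model and the bounds with the allowed ranges of the design variables gives the kind of search described in this section.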

    In this work, the main objective is to optimize the power-delay product (PDP). As already discussed, the delay of a repeater-loaded RC interconnect is minimized by repeater insertion, but a non-optimal number of repeaters in the chain can lead to excessive power dissipation. A trade-off between delay and power is therefore required, such that the delay is minimized while the power dissipation is kept under control. To obtain the minimum PDP and the corresponding optimal number of repeaters, NMOS width and supply voltage, the particle swarm optimization technique has been implemented in MATLAB. The power-delay product given by equation (4) is the objective function, and the algorithm aims at minimizing its value.
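One practical detail of this formulation is that PSO moves particles through a continuous space, whereas the number of repeaters must be a whole number. The Python sketch below shows one possible way to wrap delay and power models into a single PDP objective while rounding the repeater count. It is an assumption about how such an objective could be organized, not the authors' MATLAB code. The name `make_pdp_objective` is hypothetical, and `toy_delay` and `toy_power` are arbitrary stand-ins that do not reproduce the delay and power expressions of this paper.

```python
def make_pdp_objective(delay_fn, power_fn):
    """Build a PDP objective from caller-supplied delay and power models.

    A particle is the vector x = [n, Wn, VDD]; n is rounded to a whole
    number of repeaters (at least one) before the models are evaluated.
    """
    def objective(x):
        n = max(1, round(x[0]))
        w_n_um, vdd = x[1], x[2]
        return delay_fn(n, w_n_um, vdd) * power_fn(n, w_n_um, vdd)
    return objective


# Toy stand-ins, for illustration only (not the models of this paper).
def toy_delay(n, w_n_um, vdd):
    return (1.0 / n + 0.1 * n) / (w_n_um * vdd)


def toy_power(n, w_n_um, vdd):
    return (1.0 + 0.2 * n * w_n_um) * vdd ** 2


pdp = make_pdp_objective(toy_delay, toy_power)
value = pdp([3.35, 0.55, 1.26])   # a fractional repeater count is evaluated as n = 3
print(value)
```

Rounding inside the objective keeps the optimizer itself unchanged. The optimizer may still report a fractional repeater count, which then has to be read as the nearest whole number.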

  7. Simulation Results

    The optimal interconnect design problem is solved using particle swarm optimization, and the results are verified using the Cadence Spectre simulator at 180 nm technology. The device has been implemented with transistor channel widths of Wp = 3 µm and Wn = 1 µm (here Wp = 3Wn) and channel lengths of Ln = Lp = 0.18 µm, where the subscripts n and p refer to NMOS and PMOS respectively. A voltage step of magnitude equal to the CMOS supply voltage is applied at the input of the first repeater. The input is a symmetric voltage pulse with a bit rate of 5 × 10^7 bps (f = 50 MHz, pulse period 20 ns). First, the simulation readings were obtained using the Cadence Spectre tool at 180 nm.

    Tables 1 and 2 show the effect of supply voltage on the time delay (t90%) for the two transistor widths. They show how the time delay varies with the number of repeaters as the voltage is scaled. The results show that the delay increases as the voltage is scaled down. For instance, at VDD = 1.8 V, the delay obtained for n = 1 is 4.89 ns for Wp = 3 µm (Wn = 1 µm), while for Wp = 1.8 µm (Wn = 0.6 µm) it is 6.89 ns. So as the transistor width is reduced, the time delay increases.

    Table 1. Variation in ninety percent (t90%) time delay (ns) with number of repeaters and supply voltage for Wp = 3 µm; Wn = 1 µm

    Table 2. Variation in ninety percent (t90%) time delay (ns) with number of repeaters and supply voltage for Wp = 1.8 µm; Wn = 0.6 µm

    From Table 1, the minimum time delay is 2.47 ns at VDD = 1.8 V, with an optimum number of repeaters n = 6; as the voltage is scaled down, the optimum number of repeaters gradually decreases. Similarly, from Table 2, the optimum number of repeaters is 8 at VDD = 1.8 V, with a corresponding minimum time delay of 3.89 ns.

    After finding the optimized number of repeaters for minimum time delay at different voltages the corresponding power dissipated is calculated and hence power-delay product has been obtained for both transistor widths.

    Table 3. Variation in Power Delay Product (PDP) with change in supply voltage for Wp = 3 µm; Wn = 1 µm

    Table 4. Variation in Power Delay Product (PDP) with change in supply voltage for Wp = 1.8µm; Wn = 0.6µm

    Tables 3 and 4 show that as the voltage is scaled down, the corresponding PDP value also decreases until 1.2 V. From the readings, the optimized PDP value is 342.63 µW·ns at an optimized number of repeaters no = 5, supply voltage VDD = 1.2 V and Wn = 1 µm, with a time delay of 4.44 ns.
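Because the PDP is quoted in µW·ns while the PSO results are expressed in Ws, a quick unit check helps when comparing the two. The short Python sketch below uses only the two figures quoted above (342.63 µW·ns and 4.44 ns). The implied power is simply their ratio and is derived here for illustration; it is not a value reported in this paper.

```python
pdp_uW_ns = 342.63   # optimized PDP quoted above, in uW*ns
delay_ns = 4.44      # corresponding time delay, in ns

# PDP = power * delay, so the implied average power is the ratio of the two.
power_uW = pdp_uW_ns / delay_ns

# 1 uW * 1 ns = 1e-6 W * 1e-9 s = 1e-15 Ws, and 1 Ws = 1 J.
pdp_Ws = pdp_uW_ns * 1e-6 * 1e-9

print(f"implied power: {power_uW:.2f} uW")
print(f"PDP: {pdp_Ws:.4e} Ws")
```

The converted value is of the order of 10^-13 Ws, the same order of magnitude as the PSO figures quoted in this section.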

    To implement the particle swarm optimization technique, equation (4) has been taken as the objective function. The PSO parameters were set, and the optimized PDP value was then obtained together with the corresponding optimized number of repeaters, supply voltage (VDD) and CMOS transistor width (Wn).

    Figure 5. PSO-based results for the power-delay product of the RC interconnect for an even number of repeaters at 180 nm technology.

    The results show that before applying PSO, the power-delay product is $10^{-13}$ Ws at no = 8, Wno = 0.868 µm and VDD = 1.78 V. After applying PSO, the optimized values are a power-delay product of $10^{-13}$ Ws at no = 3.35 (approx. 3), Wno = 0.55 µm and optimum VDD = 1.26 V.

    Table 5 compares the PSO results with those obtained using Cadence simulation.

    Table 5. Results obtained by Cadence simulations and PSO for optimal interconnect design for minimum PDP at 180 nm technology

    Table 5 shows that the results of particle swarm optimization are in close agreement with the Cadence results. The time required to solve the problem using PSO is much shorter than that of the simulation technique. The algorithm is thus very useful for solving complex optimization problems that otherwise take longer when solved using traditional simulation techniques.

  8. Conclusion

    The particle swarm optimization technique has been successfully implemented to find the optimum power-delay product at an optimized number of repeaters, supply voltage VDD and CMOS transistor width (Wn). The simulation results obtained using the Cadence Spectre tool are in good agreement with the results obtained using the particle swarm optimization technique in MATLAB. The percentage error between the two results for the power-delay product comes out to be 9.7%. The power-delay product obtained from Cadence simulation attains its minimum value at a low supply voltage, VDD = 1.2 V, for both transistor widths; beyond this value the power-delay product increases as VDD is increased.

    The particle swarm optimization technique has proved to be an efficient technique for obtaining the optimal values of the power-delay product and the repeater variables simultaneously.

  9. References

    1. Bakoglu, H. B. (1990), Circuits, Interconnections, and Packaging for VLSI, Reading, MA: Addison-Wesley.

    2. Bothra, S., Rogers, B., Kellam, M. and Osburn, C. M. (1993), Analysis of the effects of scaling on interconnect delay in ULSI circuits, IEEE Trans. Electron Devices, vol. 40, pp. 591-597.

    3. Rabaey, J. M. (2002), Digital Integrated Circuits, Prentice Hall of India.

    4. Banerjee, K. and Mehrotra, A. (2002), A Power-Optimal Repeater Insertion Methodology for Global Interconnects in Nanometer Designs, IEEE Trans. Electron Devices, vol. 49, no. 11, pp. 2001-2007.

    5. Chandel, R., Sarkar, S. and Agarwal, R. P. (2007), An Analysis of Interconnect Delay Minimization by Low-Voltage Repeater Insertion, J. Microelectronics, vol. 38, pp. 649-655.

    6. Adler, V. and Friedman, E. G. (1998), Repeater Design to Reduce Delay and Power in Resistive Interconnects, IEEE Trans. Circuits Systems-II: Analog and Digital Signal Processing, vol. 45, no. 5, pp. 607-616.

    7. Adler, V. and Friedman, E. G. (2000), Uniform repeater insertion in RC trees, IEEE Trans. Circuits and Syst., vol. 47, no. 10, pp. 1515-1523.

    8. Chandel, R. and Sumit, R. (2012), Optimal Design of Repeaters using GA for VLSI Interconnects, Int. J. of Information and Telecommunication Technology (IJITT), vol. 4, no. 1.

    9. Narasimhan, A. and Sridhar, R. (2010), Variability Aware Low-Power Delay Optimal Buffer Insertion for Global Interconnects, IEEE Trans. Circuits Syst. I: Reg. Papers, vol. 57, no. 12, pp. 3055-3063.

    10. Kang, S. M. and Leblebici, Y. (2003), CMOS Digital Integrated Circuits, Tata McGraw Hill Publications, New Delhi.

    11. Kennedy, J. and Eberhart, R. (1995), Particle Swarm Optimization, Proceedings of IEEE International Conference on Neural Networks, vol. 4, pp. 1942-1948.
