Wednesday, November 2, 2011

Ground Bounce


Ground Bounce Definition


Ground bounce is usually seen on high density VLSI where insufficient precautions have been taken to supply a logic gate with a sufficiently low resistance connection (or sufficiently high capacitance) to ground. In this phenomenon, when the gate is turned on, enough current flows through the emitter-collector circuit that the silicon in the immediate vicinity of the emitter is pulled high, sometimes by several volts, thus raising the local ground, as perceived by the transistor, to a value significantly above true ground. Relative to this local ground, the BASE voltage can go negative, thus shutting off the transistor. As the excess local charge dissipates, the transistor turns back on, possibly causing a repeat of the phenomenon, sometimes up to a half-dozen bounces.
Ground bounce is one of the leading causes of "hung" or metastable gates in modern digital circuit design. This happens because the ground bounce effectively puts the input of a flip-flop at a voltage level that is neither a one nor a zero at clock time, or causes untoward effects in the clock itself. A similar phenomenon, called VCC sag, may be seen on the collector side, where VCC is pulled unnaturally low.






Ground Bounce describes a condition in which a device's output [really a number of outputs] switches from High to Low and causes a voltage change on other pins.

"Ground Bounce is a voltage oscillation between the ground pin on a component package and the ground reference level on the component die. Essentially it is caused by a current surge passing through the lead inductance of the package." (IDT)
The problem is caused by the large current flowing through the ground pin, which develops a voltage drop across the lead inductance. This voltage drop on the ground line creates two main problems: it raises the chip above ground [0 volts] potential, which increases the device's input threshold level, and it increases the voltage level on an output pin that is not switching. Because a quiet output is affected by the other switching outputs, this is also called Simultaneous Switching Noise. Ground Bounce is really an issue of lost noise margin, and is sometimes also called Ground Bounce Noise. The faster the slew rate of the logic family, the worse the problem becomes.


Normally when the information is provided in the data sheet ground bounce is given as Volp [Voltage Output Low Pulse], or output ground bounce. Ground Bounce will be given as some maximum voltage pulse, or a peak voltage below some maximum value.


Ground Bounce Elimination



With Glue Logic, the ground pins may have been moved around to reduce the inductance. Older families of Glue Logic used the far side pins as power and ground. For example, a 14-pin IC would use pin 14 for power and pin 7 for ground. Newer ICs moved the power and ground pins to the center pins of the IC [pins 3 and 12 in this example].

Using a Surface Mount Device [SMD] instead of a Through Hole device will reduce the lead inductance. Normally SMD components are smaller, their leads are closer together, and they have a lower lead inductance.
For FPGAs with hundreds of possible output pins the situation may change, and it is more up to the designer to deal with the issue. In most cases only a small portion of an FPGA's pins are connected to a separate set of power and ground connections. Every dozen Input/Output [I/O] pins switch off their own power and ground pins, so any one group of switching pins does not affect any other group of pins.

Start a noise budget to determine whether the ground bounce, the rise in ground potential, affects the design [Noise Margin Calculation]. The voltage developed over the ground lead is proportional to the rate of change in current, so the faster the logic family the worse the problem becomes: V = L * [di/dt]. The more outputs switching at the same time, the larger the current value, and the greater the voltage bounce. Ground Bounce also occurs when the outputs switch from a 0 to a 1, but to a much smaller degree.
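As a rough illustration of the V = L * [di/dt] relation, the sketch below estimates the bounce when several outputs switch together through one shared ground lead. The function name and the example numbers (lead inductance, current step per output, edge time) are assumptions chosen for illustration, not values from any data sheet.

```python
def ground_bounce_volts(lead_inductance_h, delta_i_per_output_a, edge_time_s, n_outputs):
    """Estimate peak ground bounce from V = L * di/dt.

    Assumes every switching output returns its current through the same
    ground lead and that the current ramps linearly over the edge time.
    """
    di_dt = n_outputs * delta_i_per_output_a / edge_time_s
    return lead_inductance_h * di_dt


# Hypothetical numbers: 5 nH lead, 20 mA per output, 2 ns edge.
for n in (1, 4, 8):
    v = ground_bounce_volts(5e-9, 20e-3, 2e-9, n)
    print(f"{n} outputs switching: about {v:.2f} V")
```

Halving the edge time or doubling the number of switching outputs doubles the estimate, which is why faster logic families and wide buses are the usual trouble spots.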

Series Termination

Series termination of the line is one method of reducing ground bounce [Trace Termination Methods]. Series termination resistors slow the rate of change of the output, and so reduce the instantaneous current on the ground line. However, placing series resistors on all the possible output lines may not be practical. Also, resistor pull-ups on the line cause the ground bounce voltage to increase. The pull-up resistor allows the load capacitance to charge to its full value, so as the line switches, maximum current is delivered back to the driver. When practical, eliminate pull-up resistors on devices with an issue; use pull-down resistors or series resistors if possible. Reducing the loading on the driver also reduces ground bounce. Ground Bounce may also be called Ground Lift.


Design Pitfalls

Ground Bounce is easy to understand, but you have to know the condition could exist. Failing to realize that the condition may exist could lead to a circuit failure. A one bit error [bit bounce] could occur at any time, even weeks after system testing, when enough bits change in the same direction at the same time. Any possible random data pattern could cause a ground bounce and an uncontrolled bit change on any nearby data pin. The only real way to test for this is to send every possible bit pattern down the bus while watching all the other unused pins for a change.
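The exhaustive pattern test can be planned in software before it is run on hardware. The sketch below enumerates every transition of a hypothetical 8-bit bus and counts how many outputs fall from 1 to 0 together. The bus width and the limit of 6 simultaneous falling outputs are assumptions for illustration only.

```python
BUS_WIDTH = 8        # assumed bus width
FALLING_LIMIT = 6    # assumed count of simultaneous 1->0 edges worth flagging
MASK = (1 << BUS_WIDTH) - 1


def falling_count(before, after):
    """Number of bits that switch from 1 to 0 between two bus words."""
    return bin(before & ~after & MASK).count("1")


def risky_transitions(limit=FALLING_LIMIT):
    """Yield every (before, after) pair with at least `limit` falling bits."""
    for before in range(1 << BUS_WIDTH):
        for after in range(1 << BUS_WIDTH):
            if falling_count(before, after) >= limit:
                yield before, after


# Find the single worst transition and count the flagged ones.
all_pairs = ((b, a) for b in range(1 << BUS_WIDTH) for a in range(1 << BUS_WIDTH))
worst = max(all_pairs, key=lambda pair: falling_count(*pair))
print(f"worst case: {worst[0]:08b} -> {worst[1]:08b}")
print("transitions at or above the limit:", sum(1 for _ in risky_transitions()))
```

The flagged pairs are the ones worth applying first on the bench while watching the quiet pins.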

There are a number of other Logic Hazards when designing digital circuits.


Simultaneous Switching Outputs


Switching Outputs vs. Propagation Delay

In addition to causing ground bounce, simultaneously switching outputs also cause the propagation delay of the output to increase. The greater the number of outputs switching simultaneously, the larger the increase in propagation delay. Note that the graph shows Number of Outputs Switching vs. delta change in propagation delay for the outputs. Five different IC packages are shown, including SSOP, TSSOP, TVSOP, and LFBGA. The green trace representing the LFBGA shows a larger increase in delta prop delay because the LFBGA contains 96 pins instead of the 48 pins of the other packages. Of course, because the increase in prop delay is unintended, it could also be considered noise. Graphic credit: TI.


Design Guidelines [from Altera]


1. Add decoupling capacitors for as many VCC/GND pairs as possible.

2. Place the decoupling capacitors as close as possible to the power and ground pins of the device.

3. Limit load capacitance by buffering loads with an external device, or by reducing the number of devices that drive the bus.

4. Eliminate sockets whenever possible.

5. Reduce the number of outputs that can switch simultaneously and/or distribute them evenly throughout the device.
6. Use multi-layer PCBs that provide separate VCC and ground planes.
7. Add appropriate resistors in series to each of the switching outputs to limit the current flow into each of the outputs.
8. Use surface mount capacitors to minimize the lead inductance.
9. Use low effective series resistance (ESR) capacitors. 
10. Each GND pin/via should be connected to the ground plane individually.
11. Eliminate pull-up resistors or use pull-down resistors.
 ... "Some bus applications use pull-up resistors to create a default high value for the bus. These resistors cause the load capacitances to charge up to the maximum voltage. Consequently, the driving device produces a higher level of ground bounce..."

editor note; there were additional design guidelines listed from this source, but they only related to programmable devices and are not listed here.

Of course, implementing some of these guidelines may not be possible, but they are guidelines and not design rules. For example, it may not be possible to reduce the number of outputs switching at the same time, as an 8-pin bus driver IC may have all its outputs switching half of the time. Also, since the bus driver in this example is already the buffer, there would be no way to reduce its load capacitance.

Acronyms Defined
LFBGA; Low-profile Fine-pitch Ball Grid Array
SSOP; Shrink Small-Outline Package
TSSOP; Thin Shrink Small-Outline Package
TVSOP; Thin Very Small-Outline Package
Volp; Voltage Output Low Peak

Sunday, October 23, 2011

Floor Planning



The first step in the Physical Design flow is Floor Planning. Floorplanning is the process of identifying structures that should be placed close together, and allocating space for them in such a manner as to meet the sometimes conflicting goals of available space (cost of the chip), required performance, and the desire to have everything close to everything else.
Based on the area of the design and the hierarchy, a suitable floorplan is decided upon. Floor Planning takes into account the macros used in the design, memory, other IP cores and their placement needs, the routing possibilities, and the area of the entire design. Floor planning also decides the IO structure and the aspect ratio of the design. A bad floorplan will lead to wasted die area and routing congestion.
In many design methodologies, Area and Speed are considered to be things that should be traded off against each other. The reason is probably that routing resources are limited, and the more routing resources that are used, the slower the design will operate. Optimizing for minimum area allows the design to use fewer resources, and also allows the sections of the design to be closer together. This leads to shorter interconnect distances, fewer routing resources used, faster end-to-end signal paths, and even faster and more consistent place and route times. Done correctly, there are no negatives to Floor-planning.
As a general rule, data-path sections benefit most from Floorplanning, and random logic, state machines, and other non-structured logic can safely be left to the placer section of the place and route software.
Data paths are typically the areas of your design where multiple bits are processed in parallel with each bit being modified the same way with maybe some influence from adjacent bits. Example structures that make up data paths are Adders, Subtractors, Counters, Registers, and Muxes.

Friday, October 21, 2011

Latch-Up


What is latch up in CMOS design and ways to prevent it?

A problem inherent in the p-well and n-well processes is the relatively large number of junctions formed in these structures and the consequent presence of parasitic diodes and transistors.

Latch-up is a condition in which the parasitic components give rise to the establishment of a low resistance conducting path between VDD and VSS, with disastrous results.

Latch-up may be induced by glitches on the supply rails or by incident radiation.

Latch-up pertains to a failure mechanism wherein a parasitic thyristor (such as a parasitic silicon controlled rectifier, or SCR) is inadvertently created within a circuit, causing a high amount of current to continuously flow through it once it is accidentally triggered or turned on. Depending on the circuits involved, the amount of current flow produced by this mechanism can be large enough to result in permanent destruction of the device due to electrical overstress (EOS).

Preventions for Latch-Up
  • adding well and substrate taps: for example, in an inverter, add an N+ tap in the n-well (which holds the PMOS) and connect it to Vdd, and add a P+ tap in the p-substrate (which holds the NMOS) and connect it to Vss.
  • increasing the substrate doping level, with a consequent drop in the value of Rs.
  • reducing Rp by control of fabrication parameters and by ensuring a low contact resistance to Vss.
  • introducing guard rings.

Latchup in Bulk CMOS
A byproduct of the Bulk CMOS structure is a pair of parasitic bipolar transistors. The collector of each BJT is connected to the base of the other transistor in a positive feedback structure. A phenomenon called latchup can occur when (1) both BJT's conduct, creating a low resistance path between Vdd and GND and (2) the product of the gains of the two transistors in the feedback loop, b1 x b2, is greater than one. The result of latchup is at the minimum a circuit malfunction, and in the worst case, the destruction of the device.
Cross section of parasitic transistors in Bulk CMOS
Equivalent Circuit
Latchup may begin when Vout drops below GND due to a noise spike or an improper circuit hookup (Vout is the base of the lateral NPN Q2). If sufficient current flows through Rsub to turn on Q2 (I x Rsub > 0.7 V), this will draw current through Rwell. If the voltage drop across Rwell is high enough, Q1 will also turn on, and a self-sustaining low resistance path between the power rails is formed. If the gains are such that b1 x b2 > 1, latchup may occur. Once latchup has begun, the only way to stop it is to reduce the current below a critical level, usually by removing power from the circuit.
The most likely place for latchup to occur is in pad drivers, where large voltage transients and large currents are present.
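The turn-on and gain conditions described above can be written as a simple check. This is only a first-order sketch: the 0.7 V turn-on voltage comes from the text, while the function name and the example current, resistance, and gain values are hypothetical.

```python
V_BE_ON = 0.7  # approximate base-emitter turn-on voltage, in volts


def latchup_possible(i_sub_a, r_sub_ohm, i_well_a, r_well_ohm, beta1, beta2):
    """First-order latchup check for the parasitic Q1/Q2 pair.

    Both parasitic transistors must turn on (I x R drop above about 0.7 V)
    and the loop gain beta1 x beta2 must exceed one.
    """
    q2_on = i_sub_a * r_sub_ohm > V_BE_ON
    q1_on = i_well_a * r_well_ohm > V_BE_ON
    return q2_on and q1_on and beta1 * beta2 > 1.0


# Hypothetical values: 1 mA through 1 kohm gives a 1 V drop on each side.
print(latchup_possible(1e-3, 1000.0, 1e-3, 1000.0, 50.0, 0.1))
# Lowering Rsub to 100 ohm keeps Q2 off, so the loop cannot close.
print(latchup_possible(1e-3, 100.0, 1e-3, 1000.0, 50.0, 0.1))
```

The second call shows why lowering the substrate or well resistance, or the gain product, breaks the feedback loop.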
Preventing latchup
Fab/Design Approaches

  1. Reduce the gain product b1 x b2
     • moving the n-well and the n+ source/drain farther apart increases the width of the base of Q2 and reduces the gain b2, but also reduces circuit density
     • a buried n+ layer in the well reduces the gain of Q1
  2. Reduce the well and substrate resistances, producing lower voltage drops
     • a higher substrate doping level reduces Rsub
     • reduce Rwell by making a low resistance contact to GND
     • guard rings around the p- and/or n-well, with frequent contacts to the rings, reduce the parasitic resistances
CMOS transistors with guard rings
Systems Approaches
  1. Make sure power supplies are off before plugging in a board. A "hot plug in" of an unpowered circuit board or module may cause signal pins to see surge voltages more than 0.7 V higher than Vdd, which rises more slowly to its peak value. When the chip comes up to full power, sections of it could be latched.
  2. Carefully protect electrostatic protection devices associated with I/O pads with guard rings. Electrostatic discharge can trigger latchup. ESD enters the circuit through an I/O pad, where it is clamped to one of the rails by the ESD protection circuit. Devices in the protection circuit can inject minority carriers in the substrate or well, potentially triggering latchup.
  3. Radiation, including x-rays, cosmic, or alpha rays, can generate electron-hole pairs as they penetrate the chip. These carriers can contribute to well or substrate currents.
  4. Sudden transients on the power or ground bus, which may occur if large numbers of transistors switch simultaneously, can drive the circuit into latchup. Whether this is possible should be checked through simulation.
References:
http://www.ece.drexel.edu/courses/ECE-E431/latch-up/latch-up.html

Tuesday, October 4, 2011

Delays in ASIC Design




We encounter several types of delays in ASIC design. They are as follows:

·         Gate delay or Intrinsic delay
·         Net delay or Interconnect delay or Wire delay or Extrinsic delay or Flight time
·         Transition or Slew
·         Propagation delay
·         Contamination delay

Wire delays or extrinsic delays are calculated using output drive strength, input capacitance and wire load models. Other delays are intrinsic properties of each and every gate.
Delays depend on several interdependent electrical properties [Nekoogar]:

  • Input capacitance of the logic gate is a function of output state, output loads and input slew rate.
  • Internal timing arcs and output slew rate are a function of switching input(s).
  • Capacitance of the wire is dependent on frequency.
  • Internal timing arcs are a function of input slew rates.
  • Output slew rate is a function of input slew rate on each input.
  • Wires exhibit RLC characteristics instead of lumped RC.

Gate Delay
Transistors within a gate take a finite time to switch. This means that a change on the input of a gate takes a finite time to cause a change on the output. [Magma]
Gate delay = function of (input transition (slew) time, Cnet + Cpin)
or
Gate delay = function of (input transition (slew) time, Cload)
where Cload = Cnet + Cpin
Cnet --> net capacitance
Cpin --> pin capacitance of the driven cell
Cell delay is the same as gate delay.

How is gate delay calculated?
Cell or gate delay is calculated using Non-Linear Delay Models (NLDM). NLDM is highly accurate because it is derived from SPICE characterization. The delay is a function of the input transition time (i.e. slew) of the cell, the wire capacitance, and the pin capacitance of the driven cells. A slow input transition slows the rate at which the cell's transistors can change state from logic 1 to logic 0 (or logic 0 to logic 1), and a large output load Cload (Cnet + Cpin) does the same; both increase the delay of the logic gate.

There is another NLDM table in the library to calculate output transition. Output transition of a cell becomes the input transition of the next cell down the chain.


·         Table models are usually two-dimensional to allow lookups based on the input slew and the output load (Cload). A sample table is given below.

timing() {
  related_pin : "CKN";
  timing_type : falling_edge;
  timing_sense : non_unate;
  cell_rise(delay_template_7x7) {
    index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");
    index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");
    values ( \
      "0.225894, 0.249015, 0.285537, 0.352680, 0.484244, 0.748180, 1.279570", \
      "0.231295, 0.254415, 0.290938, 0.358081, 0.489646, 0.753585, 1.284980", \
      "0.243754, 0.266878, 0.303398, 0.370542, 0.502105, 0.766044, 1.297440", \
      "0.267240, 0.290389, 0.326908, 0.394052, 0.525615, 0.789561, 1.320950", \
      "0.307080, 0.330200, 0.366721, 0.433861, 0.565425, 0.829373, 1.360760", \
      "0.380552, 0.403875, 0.440426, 0.507569, 0.639136, 0.903084, 1.434500", \
      "0.497588, 0.521769, 0.558548, 0.625744, 0.757301, 1.021260, 1.552680");
  }
  rise_transition(delay_template_7x7) {
    index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");
    index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");
    values ( \
      "0.040574, 0.068619, 0.125391, 0.246672, 0.497688, 1.005982, 2.030120", \
      "0.040570, 0.068618, 0.125390, 0.246672, 0.497688, 1.005940, 2.030240", \
      "0.040565, 0.068616, 0.125389, 0.246650, 0.497770, 1.006180, 2.030120", \
      "0.040532, 0.068612, 0.125387, 0.246670, 0.497710, 1.006164, 2.030100", \
      "0.040578, 0.068621, 0.125392, 0.246636, 0.497688, 1.006182, 2.030040", \
      "0.041763, 0.069211, 0.125662, 0.246758, 0.497726, 1.005930, 2.030000", \
      "0.045813, 0.071321, 0.126671, 0.247154, 0.497846, 1.005962, 2.030180");
  }
}


index_1 --> input transition values
index_2--> output load capacitance values
values--> delay values

Situation 1:
Input transition and output load values match with table index values

If both the input transition and the output load values match table index values, the corresponding delay value is read directly from the delay “values” table.

Situation 2:
Output load value does not match the table index values

·         When the actual load capacitance value does not fall directly on one of the load-axis index points, the delay is determined by interpolation from the closest points. Note that to carry out this interpolation, the input transition point must match one of the table index values.
·         Determine the equation for the line segment connecting the two nearest points in the table.


To do this, first find the slope:
Slope m = (y2-y1)/(x2-x1), where (y2-y1) is the delay segment (generally in ns) on the y-axis and (x2-x1) is the load segment (generally in pF) on the x-axis.
·         Solve for the delay at the load point of interest.

The linear equation is:
y = mx+c
where
y-->delay (ns)
m-->slope
x-->load capacitance (pF)
c-->intercept, c = y1 - m*x1

i.e. delay = y1 + slope*(load point of interest - x1). The constant c is zero only if the line happens to pass through the origin.

Load point of interest means the load capacitance value for which the delay has to be calculated.
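
A minimal sketch of this interpolation, using the first two load points of the cell_rise table at an input transition of 0.012. It uses the point-slope form, delay = y1 + m * (x - x1), so the line passes through both table points. The function name and the load value of 0.003 are illustrative choices.

```python
def interpolate_delay(load, load_lo, load_hi, delay_lo, delay_hi):
    """Linear interpolation of delay between two neighbouring load index points."""
    slope = (delay_hi - delay_lo) / (load_hi - load_lo)
    return delay_lo + slope * (load - load_lo)


# First row of cell_rise (input transition 0.012), first two load points.
delay = interpolate_delay(0.003, 0.001278, 0.0046008, 0.225894, 0.249015)
print(f"interpolated delay: {delay:.6f}")
```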

Situation 3:
Neither the input transition nor the output load value matches the table index values

·         If both input transition and load capacitance values do not match exactly with the look up table index values then bilinear interpolation is used.
·         Multiple linear interpolations (~3) are performed on the closest table data points (~4) surrounding the point of interest in the look up table.
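
The three interpolations on four surrounding points can be sketched as follows. The data is the upper-left 3x3 corner of the cell_rise table, with rows following index_1 (input transition) and columns following index_2 (output load). The helper names are hypothetical, and real timing tools may differ in detail.

```python
from bisect import bisect_right

# Upper-left 3x3 corner of the cell_rise table.
SLEW = [0.012, 0.032, 0.074]               # index_1: input transition
LOAD = [0.001278, 0.0046008, 0.0112464]    # index_2: output load
DELAY = [
    [0.225894, 0.249015, 0.285537],
    [0.231295, 0.254415, 0.290938],
    [0.243754, 0.266878, 0.303398],
]


def lower_index(axis, x):
    """Index i of the segment axis[i]..axis[i + 1] used for x (clamped to the table)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lerp(x, x0, x1, y0, y1):
    """Linear interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def cell_delay(slew, load):
    """Bilinear interpolation: two passes along load, then one along slew."""
    i = lower_index(SLEW, slew)
    j = lower_index(LOAD, load)
    d_lo = lerp(load, LOAD[j], LOAD[j + 1], DELAY[i][j], DELAY[i][j + 1])
    d_hi = lerp(load, LOAD[j], LOAD[j + 1], DELAY[i + 1][j], DELAY[i + 1][j + 1])
    return lerp(slew, SLEW[i], SLEW[i + 1], d_lo, d_hi)


print(f"{cell_delay(0.020, 0.003):.6f}")
```

When the inputs land exactly on index values, the function returns the table entry itself, so Situation 1 is covered as a special case.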

Situation 4:
Output load value does not match the table index values and is outside the table boundary

·         When the load point is outside the boundary of the index, the delay is extrapolated from the closest known points.
·         A lookup value too far outside the range of the table could lead to inaccuracy. [Cadence]
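
One simple way to picture this is to extend the last table segment in a straight line and flag the result as out of range. The sketch below assumes that approach for a load above the largest index point; it is not taken from any particular tool, and the function name and the load value of 0.25 are illustrative.

```python
def extrapolate_delay(load, loads, delays):
    """Extend the last table segment linearly beyond the largest load index.

    Returns (delay, out_of_range) so the caller can flag the result.
    """
    slope = (delays[-1] - delays[-2]) / (loads[-1] - loads[-2])
    delay = delays[-1] + slope * (load - loads[-1])
    return delay, load > loads[-1]


# Last two load points of the first cell_rise row (input transition 0.012).
loads = [0.10454, 0.212148]
delays = [0.748180, 1.279570]
delay, flagged = extrapolate_delay(0.25, loads, delays)
print(f"delay {delay:.4f}, outside table: {flagged}")
```

The farther the load is from the last index point, the less the straight-line assumption can be trusted, which is the inaccuracy the note above warns about.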

Intrinsic delay

·         Intrinsic delay is the delay internal to the gate. This is from input pin of the cell to output pin of the cell.
·         It is defined as the delay between an input and output pair of a cell when a near-zero slew is applied to the input pin and the output does not see any load. It is caused by the internal capacitance associated with the cell's transistors.
·         This delay is largely dependent on the size of the transistors forming the gate, because increasing the size of the transistors increases the internal capacitances.

References
[Nekoogar] Farzad Nekoogar, “Timing Verification of Application Specific Integrated Circuits”, Prentice Hall
[Magma] Magma Blast Fusion User Guides
[Cadence] Cadence SOC Encounter User Guides