ECE 471 Final Project – 4b Adder with Razor

OUT: 2/25/2011

IN: 3/10/2011, in class (ORAL PRESENTATION)

IN: 3/18/2011, 11:59PM (WRITTEN PRESENTATION HAND-IN)

  • Project: 4b Adder with Razor Error Detection
  • Objective: Design a pipelined 4b adder in a 250nm (or better) CMOS process that can detect delay variations of up to 20%, optimized for the best FoM (Figure of Merit: the most 4b-adder operations for the LEAST amount of power consumed)
  • What you need to turn in:
  • Final, comprehensive project report with all design details. Please state all assumptions made clearly. One report per team.
  • Oral presentation: March 10, Midterm presentation: 5-minute presentation.
  • Each team builds one ADDER-with-RAZOR: please share the load equally (team dynamics)
  • Target Specifications:
  • 20%-80% edge rate for A, B, CLK = 80ps
  • Clock frequency => As fast as possible
  • Vcc = 2.5V, 0.25um CMOS at the highest clock frequency
  • Delay variation detection up to 20% of the clock cycle time
  • Minimize power consumption: Figure of Merit is:

Clock Frequency (Gigahertz or # of Operations/second) / Power (Watts)
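
As a quick sanity check on the FoM definition above, the ratio can be computed from a measured clock frequency and average power. The sketch below is a minimal illustration; it assumes one 4b add completes per clock cycle once the pipeline is full, and the function name and the numbers are hypothetical placeholders, not targets.

```python
def figure_of_merit(clock_hz: float, power_w: float) -> float:
    """Return FoM = clock frequency / power, in operations per second per watt.

    Assumes one 4b-adder operation per clock cycle (pipeline full).
    """
    if power_w <= 0:
        raise ValueError("power must be positive")
    return clock_hz / power_w


# Hypothetical example values, for illustration only.
fom = figure_of_merit(clock_hz=1.0e9, power_w=2.0e-3)
print(f"FoM = {fom:.3e} ops/s/W")
```

Use the average power from the same simulation run that establishes the clock frequency, so that both numbers describe the same operating point.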

  • Process Parameters:
  • Process: 0.25um CMOS
  • You should show process scalability to 65nm-CMOS
  • Design Choices:
  • Error Detection: 1) RAZOR circuit for determining error

2) Alternatives (e.g. Intel error resiliency, Keith Bowman)

  • Adder: 1) Ripple-Carry, Manchester-tree, Carry lookahead, or
    conditional-sum

2) any static or dynamic/domino circuit technique

  • Flip-flop: use a conventional flop in your library
  • DO only schematic design — we will worry about physical layout later.
  • Deliverables:
  • Simulations that show the design works at Vdd=2.5V
  • Simulations that show your circuit can catch delay variations up to 20% larger than the cycle time
  • Simulations that show that the design doesn’t work if there are large variations beyond 20% of the cycle time, AND if the min-path delay causes a race
  • Simulations of FoM across Monte Carlo Variations (use just 100mV offset in particularly important nodes, similar to Midterm-#2). SHOW THAT THE RAZOR CIRCUIT IS ABLE TO DETECT A SUBSET OF DELAY VARIATIONS THAT ARISE FROM PROCESS VARIATIONS.
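
The three simulation cases in the deliverables above (error caught, error missed, min-path race) can be reasoned about with a simple timing-window model before running HSPICE. The sketch below is only an idealized illustration. It assumes the main flop samples at the end of the cycle and the Razor shadow element samples one detection window (20% of the cycle) later, and it ignores setup and hold times. The function name and labels are hypothetical.

```python
def classify_razor(t_clk: float, d_max: float, d_min: float,
                   window_frac: float = 0.20) -> str:
    """Classify one pipeline stage under an idealized Razor timing model.

    t_clk: clock period; d_max: longest path delay; d_min: shortest path delay.
    All three must use the same time unit. Setup/hold times are ignored.
    """
    window = window_frac * t_clk  # detection window after the clock edge
    if d_min < window:
        # Next-cycle data arrives before the shadow element samples.
        return "min-path race"
    if d_max <= t_clk:
        return "no error"      # data arrives before the main flop samples
    if d_max <= t_clk + window:
        return "detected"      # main flop is late, shadow element is correct
    return "undetected"        # data misses the shadow element as well


# Hypothetical delays in ps with a 1000 ps clock period.
for d_max, d_min in [(900, 300), (1100, 300), (1300, 300), (900, 100)]:
    print(d_max, d_min, classify_razor(1000, d_max, d_min))
```

The same classification can be applied to the path delays measured in each Monte Carlo run to tally how many process-induced delay variations fall inside the detection window.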

NOTE: THE FULL LAYOUT IS NOT REQUIRED. To save you time and energy, you will not be responsible for doing the entire 4b adder-with-razor layout.

Instead, to help save you time but to ensure that you understand what is going on:

  1. LAYOUT: Do a layout of the FA and make sure it passes LVS/PEX/DRC. In the final report appendix, I want to see the completed layout of the FA/HA.
  2. SCHEMATIC/LAYOUT DIFFERENCE: I want you to simulate the worst-case delay path of the FA in HSPICE for both the schematic and the extracted layout. NOTE that schematic simulation is NO SUBSTITUTE for the LAYOUT simulation (i.e. the parasitic loading of the internal capacitances will affect your delay, so the schematic is not representative of what is actually going on internally once the layout area is accounted for). Also note that the energy consumed in the layout is NOT the same as in the schematic.
  3. BLOCK DIAGRAM FLOORPLAN: To save you energy, instead of doing the full layout in Cadence, draw the FA/HA, D-FF, and RAZOR cells in POWERPOINT as physical SQUARE blocks, and ‘MAP OUT’ how you would connect them up in layout. For example, illustrate one line of D-FFs with RAZOR along the top of the 4b adder, another along the bottom, and the FA/HA cells in the center.
  4. SIMULATE: Assuming you understand that the layout and the schematic are not the same (and you’ve proven that to yourself from [2] above), you can do the final 4b adder simulations using the schematic only.
  5. CLOCK SKEW: I still want you to understand the clock skew issue. After you’ve mapped out the block diagram/floorplan in PPT, you should know the relative X/Y dimensions of everything. Route your clock in a clockwise fashion to the TOP and BOTTOM flip-flops of the pipelined adder. Do this in M1 (to make the RC delay look the worst). Plot the clock at the first D-FF and at the last D-FF on the same HSPICE graph to show the clock skew between them.

The easiest way to do this is to model the metal segment between adjacent D-FFs as a parasitic element and make it a symbolic block in HSPICE. That way, when you simulate the entire D-FF chain, you will get a “reasonable” number for the skew with the actual D-FF clock loading. Note that you are loading the clock not only with the metal wiring, but also with the clock input capacitance of the many D-FFs, etc.

Assume WORST-CASE clock skew: the clock comes in from the top left of the D-FFs, circulates clockwise, and exits at the bottom right of the adder’s output D-FFs. Assume worst-case wire loading of 1um-wide M1 clock wiring.
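
Before building the HSPICE chain, a first-order Elmore estimate of the clock routing described above gives a rough expectation for the skew. The sketch below treats the clock route as an RC ladder with one wire segment per D-FF, lumping each segment's wire capacitance together with the D-FF clock input capacitance at the far end of the segment. The function name and all element values are hypothetical placeholders; take real values from your process data and your floorplan dimensions. This is a hand estimate, not a replacement for the HSPICE plot.

```python
def elmore_delays(n_flops: int, r_seg: float, c_wire_seg: float,
                  c_ff: float) -> list[float]:
    """Elmore delay from the clock driver to each D-FF along a serial route.

    r_seg: resistance of one M1 segment (ohms)
    c_wire_seg: wire capacitance of one M1 segment (farads)
    c_ff: clock input capacitance of one D-FF (farads)
    """
    c_node = c_wire_seg + c_ff  # capacitance lumped at each D-FF tap
    delays = []
    t = 0.0
    for i in range(1, n_flops + 1):
        # Segment i carries the charge for every tap from i to the end.
        downstream_c = c_node * (n_flops - i + 1)
        t += r_seg * downstream_c
        delays.append(t)
    return delays


# Hypothetical values: 13 D-FFs, 1.6 ohm and 4 fF per segment, 5 fF per D-FF.
d = elmore_delays(n_flops=13, r_seg=1.6, c_wire_seg=4e-15, c_ff=5e-15)
print(f"Estimated first-to-last skew: {(d[-1] - d[0]) * 1e12:.3f} ps")
```

The estimate grows roughly with the square of the number of D-FFs on the route, which is why a single serial M1 route is the worst case.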

  • You can forget the energy-optimal point for this final project. The RAZOR circuit doesn’t work with Vdd ~ 0.3V.
  • SCALING: We would like to know how your design scales across technologies: 0.25um and 65nm. If you can get 32nm, 24nm, or 16nm to work, I will give you EXTRA credit.

GRADUATE STUDENTS ARE REQUIRED TO DO AN EXTRA PROJECT. Please come see me to discuss. Some examples:

–Analog: Opamps built using inverters

–Digital: Adiabatic Near-Threshold

–Power Gating: add power gating to your design

• Undergrad: EXTRA CREDIT if you do any of the above as an undergraduate

TEMPLATE for IEEE format paper.