ECE 511 2004f   2004-10-6
 

Lab 3: Fast Arithmetic, Pipelining, Packages


Please read the requirements for lab assignments.

In this lab, you will obtain greater speed in arithmetic through the use of pipelining. Note that the Spartan2 device we are using has only 140 pins, so your design's pin count cannot exceed this number (place and route will fail if it does).
 

Exercise

To check your disk space, type:
  % cd
  % fs lq
To see what's taking all the space, type:
  % du -k .

Lab

Part A: Growth of Delay

Measure the minimum clock period (from synthesis only) for at least four different sizes of multipliers (e.g. 8, 16, 32, 64 bits) and use this to estimate the clock period as a function of the number of bits in the multiplier. (Hand in only calculation and function.)
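
One possible way to turn the measurements into a function is a least-squares fit. The sketch below is only an illustration: it assumes two candidate models, T = a + b*n and T = a + b*log2(n), and picks the one with the smaller residual. The function names (fit_line, compare_models) are hypothetical, and the numbers in the example are synthetic placeholders, not synthesis results. Substitute your own measured periods.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ a + b*x. Returns (a, b, rss)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares: smaller means the model fits the points better.
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss

def compare_models(bits, periods_ns):
    """Fit T = a + b*n and T = a + b*log2(n) to the same measurements."""
    return {
        "linear": fit_line(bits, periods_ns),
        "log2": fit_line([math.log2(n) for n in bits], periods_ns),
    }

# Synthetic placeholder data (exactly linear by construction), NOT measurements.
bits = [8, 16, 32, 64]
periods_ns = [2.0 * n + 5.0 for n in bits]

fits = compare_models(bits, periods_ns)
best = min(fits, key=lambda name: fits[name][2])
print(best, fits[best])
```

Other model shapes (for example, quadratic in n) may suit your data better; the two used here are assumptions chosen for illustration.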


 

Part B: Reducing Critical Paths

Measure the minimum clock period for at least four different sizes of counters (e.g. 32, 64, 128, 256, 512 bits) and use this to estimate the clock period as a function of the number of bits in the counter. To conserve IO pins, direct only the 8 most significant bits to the output. (Hand in only calculation and function.)

Carry-save arithmetic does not fully propagate carries, but rather, stores results in a redundant format with 2 bits representing each bit of the result.  VHDL code for a carry-save adder is provided as an example.

Design a carry-save counter.  If you had used a simple binary adder, you would tie one adder input to "00001" and connect the output through a register to the other adder input.  Using the carry-save adder, connect sum_part1 and sum_part2 through registers to adden2 and adden3.   To avoid running out of pins, you may want to send only the most significant bits "off-chip".

  1. Again, express the minimum clock period as a function of the number of bits in the counter.
A note for the curious: this carry-save arithmetic technique is also used in fast multipliers.
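
To make the register wiring concrete, here is a small behavioral model of a carry-save counter. It is a sketch under assumptions, not the provided VHDL: it assumes sum_part1 carries the bitwise sum and sum_part2 carries the shifted carries (the provided code may assign them the other way), and the Python function names are hypothetical. The constant 1 plays the role of the "00001" input.

```python
def carry_save_add(addend1, addend2, addend3, width):
    """3-input carry-save adder: no carry ripples between bit positions."""
    mask = (1 << width) - 1
    # Per-bit sum of the three inputs (assumed to correspond to sum_part1).
    sum_part1 = (addend1 ^ addend2 ^ addend3) & mask
    # Per-bit majority, shifted left one place (assumed to correspond to sum_part2).
    majority = (addend1 & addend2) | (addend1 & addend3) | (addend2 & addend3)
    sum_part2 = (majority << 1) & mask
    return sum_part1, sum_part2

def carry_save_counter(width, cycles):
    """Clock the counter 'cycles' times; return the decoded count after each edge."""
    mask = (1 << width) - 1
    reg1 = 0  # register feeding adden2
    reg2 = 0  # register feeding adden3
    values = []
    for _ in range(cycles):
        reg1, reg2 = carry_save_add(1, reg1, reg2, width)
        # Decoding the redundant form needs a full addition; this is done
        # here only to check the model, not inside the counting loop.
        values.append((reg1 + reg2) & mask)
    return values

print(carry_save_counter(5, 8))
```

The count is held as the pair (reg1, reg2), and its conventional binary value is their sum modulo 2^width. Because each output bit depends on only three input bits, the logic between registers does not grow with the counter width in this model.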
 

Part C: Pipelines

Change directories to your Xilinx working directory and run:
  % setenv BAK $MODELSIM
  % setenv MODELSIM
  % compxlib -s mti_se -f spartan2:c -l vhdl -o ./
  % setenv MODELSIM $BAK
  % ln -s XilinxCoreLib/ xilinxcorelib

From Xilinx, click Project -> Add new source.
Select Coregen IP. Give it a name of pipe2_mult
Click Next, then Finish
In the Coregen window that pops up, double click "Math functions" then select "multipliers".
In the sub-window to the right, double click "multiplier".
Set the name to pipe2_mult. Click next.
Set the input widths of ports A and B to 16 bits. Click next.
The output width should be 32 bits. Click next.
Set the Pipelining to "Minimum Pipelining". Click "Generate".
Click on "Data Sheet" to open a PDF data sheet on the multiplier.
Once the multiplier has been generated, you can dismiss the multiplier creation window.

Back in Xilinx you should see pipe2_mult in the sources window. Repeat the steps above but set the pipelining to "Maximum Pipelining".

With the coregen module selected, check the functional view for the entity name and port map.

Each design must include the coregen module with pipeline registers (D flip-flops) before and after it. The data input and output of your multiplier must be registered in order for registered-performance timing analysis to provide a report for your entire circuit. Measure the total number of logic cells in the design. Throughput is the maximum clock frequency (MHz). We will define latency as the number of pipeline register stages, not counting the first one, times the minimum clock period.

Compare the size, throughput, and latency of the two variants of the design.
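
The two definitions above reduce to simple arithmetic, sketched below. The function names are hypothetical, and the period is assumed to be given in nanoseconds so that throughput comes out in MHz.

```python
def throughput_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a minimum clock period in ns."""
    return 1000.0 / min_period_ns

def latency_ns(register_stages, min_period_ns):
    """Latency as defined in this lab: (register stages - 1) * minimum period."""
    return (register_stages - 1) * min_period_ns

# Illustrative values only: 3 register stages at a 10 ns minimum period.
print(throughput_mhz(10.0), latency_ns(3, 10.0))
```

Note that a deeper pipeline can raise throughput while also raising latency, since more stages multiply the (shorter) clock period.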