From LaPET electronics

Lecture 9: 2010-11-08 (programmable logic)

A few notes

Only Monday's lab is required this week. Feel free to skip Thursday's session, whether or not you are taking the GRE this weekend.
If you do come to lab on Thursday, Jose and I will be there to discuss final project ideas with you.
If you like, you can download a free copy of the Xilinx ISE "webpack" software from http://www.xilinx.com/tools/webpack.htm .
Documentation for the Digilent CMOD board can be found at http://www.digilentinc.com/Products/Detail.cfm?Prod=CMOD
I've written some Verilog Notes to help you get started.

Lecture

FPGA overview

This week we introduce programmable logic. After spending last week tediously connecting wires between discrete logic components, you will quickly appreciate the usefulness of Field Programmable Gate Arrays (FPGAs). (Technically, the Xilinx device that we will use for the next few labs -- the XC2C64A, mounted on a CMOD board from Digilent -- is a Complex Programmable Logic Device (CPLD), not an FPGA. As far as I can tell, a CPLD is just a smaller and less versatile FPGA.)

The FPGA market is currently dominated by two major vendors, Altera and Xilinx. I spent about ten years (roughly 1998--2008) working with Altera FPGAs, and since 2008 I have worked with Xilinx FPGAs. From my point of view, the most substantial difference between the two vendors is the software that you use to compile your design files into bits that can be loaded into the vendors' chips. The two vendors' FPGAs offer a similar range of features. The main reason we are using a Xilinx device in this course is that Digilent sells, for $18, a small Xilinx FPGA (or CPLD) mounted on a tiny circuit board that plugs readily into the breadboards that you already use in lab. This makes it easy for you to compare the FPGA with the discrete 74LSnn chips that you used in last week's lab.

A high-end modern FPGA contains in a single chip the equivalent of O(1 million) of the discrete logic gates that you used in Lab 8. And that chip can be reprogrammed many times to perform different functions, just by changing the program that you compile into the chip. Thus, a digital design that used to require connecting many 74LSnn components together with copper traces on a printed circuit board now just requires a sort of programming that causes logic to be internally connected within the FPGA. As a result, sophisticated digital designs are far easier to implement and enormously easier to modify in situ. (Hence "field programmable.")

There are three prevailing ways of developing FPGA programs: drawing schematic diagrams that allow you to pretend you are wiring together 74-series logic components; writing VHDL code; and writing Verilog code. I have tried all three and have a strong preference for Verilog, so that is what we will use.

Verilog was originally developed for simulating and testing logic, rather than for implementing logic, so the Verilog language contains some features that cannot be synthesized by the Xilinx compiler. (For example, a Verilog simulator can print out the value of a variable every time it changes -- something a real chip can't do on its own.) We will discuss only the synthesizable subset of Verilog (with a few exceptions, such as printing out variable contents). Nowadays I imagine that Verilog is used more often for FPGA synthesis than for generic logic simulation.
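As a small illustration of a simulation-only feature, here is a minimal sketch. The module name monitor_demo and its signals are invented for this example and are not part of the course files. The $monitor system task prints a line whenever one of its arguments changes, which is handy in a simulator but has no hardware equivalent, so it belongs only in test benches.

```verilog
// monitor_demo.v -- simulation-only sketch (hypothetical file name)

`timescale 1ns / 1ps
`default_nettype none

module monitor_demo;
    reg  a=0, b=0;
    wire a_and_b = a & b;

    initial begin
        // Print a line each time a, b, or a_and_b changes value
        $monitor("t=%0t a=%b b=%b a_and_b=%b", $time, a, b, a_and_b);
        #100; a = 1;
        #100; b = 1;
        #100; a = 0;
        #100; $finish;
    end
endmodule
```

The explicit delays (#100) and $finish are likewise meaningful only to the simulator.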

Probably the best way to start learning Verilog is just to dive in. When you have questions, you can ask me or you can consult the Verilog book chapters I handed out last week. I will start by implementing in Verilog the logic that you implemented last week with individual 74LSnn chips. In tonight's lab, you will compile these examples into your own FPGA and try them out. There may also be a couple of examples in which I give you a Verilog program that is nearly complete and you add a few lines to flesh out the details.

Then, once you've had this week to get comfortable with the FPGA and the Xilinx software, we'll make next week's lab a bit more challenging.

Simple gate example (NOT, NAND, AND, OR, XOR)

// nand_etc.v

`timescale 1ns / 1ps
`default_nettype none

module nand_etc
(
    input  wire a,
    input  wire b,
    output wire not_a,
    output wire a_nand_b,
    output wire a_and_b,
    output wire a_or_b,
    output wire a_xor_b
);
    assign not_a    = ~a;
    assign a_nand_b = ~(a&b);
    assign a_and_b  = a&b;
    assign a_or_b   = a|b;
    assign a_xor_b  = a^b;
endmodule

Normally you write a "test bench" for your design, so that you can try it out in the simulator before you try it on real hardware. This test bench runs inside the Xilinx "ISIM" simulator. Note that the test benches use some non-synthesizable features of Verilog: explicit time delays, initial values, $finish, etc.

// nand_etc_tb.v

`timescale 1ns / 1ps
`default_nettype none

module nand_etc_tb;
    // Inputs for UUT
    reg a=0;
    reg b=0;
    // Outputs of UUT
    wire not_a;
    wire a_nand_b;
    wire a_and_b;
    wire a_or_b;
    wire a_xor_b;

    // Instantiate the Unit Under Test (UUT)
    nand_etc uut 
    (
	.a(a), 
	.b(b), 
	.not_a(not_a), 
	.a_nand_b(a_nand_b), 
	.a_and_b(a_and_b), 
	.a_or_b(a_or_b), 
	.a_xor_b(a_xor_b)
    );

    initial begin
        // Wait 100 ns for simulator reset to finish
        #100;

        // Drive first set of inputs
        a = 0; b = 0;

        // Wait 100 ns then drive next set of inputs
        #100; a = 1; b = 0;

        // Wait 100 ns then drive next set of inputs
        #100; a = 0; b = 1;

        // Wait 100 ns then drive last set of inputs
        #100; a = 1; b = 1;

        // Wait 100 ns then terminate simulation
        #100;
        $finish;
    end

endmodule

Four-bit full adder example

This first version will implement the adder using individual gates.

// fourbit_fulladder_v1.v

`timescale 1ns / 1ps
`default_nettype none

module fourbit_fulladder_v1
(
    input  wire c0,
    input  wire a1, a2, a3, a4,
    input  wire b1, b2, b3, b4,
    output wire sum1, sum2, sum3, sum4,
    output wire c4
);
    wire c1, c2, c3;
    assign sum1 = a1^b1^c0;
    assign sum2 = a2^b2^c1;
    assign sum3 = a3^b3^c2;
    assign sum4 = a4^b4^c3;
    assign c1 = a1&b1 | a1&c0 | b1&c0;
    assign c2 = a2&b2 | a2&c1 | b2&c1;
    assign c3 = a3&b3 | a3&c2 | b3&c2;
    assign c4 = a4&b4 | a4&c3 | b4&c3;
endmodule

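This version uses Verilog's ability to do arithmetic! Note that Verilog sizes an addition to fit the widest operand or the assignment target, whichever is wider. Because the target {cout,sum} is five bits wide, the sum of the two four-bit numbers is computed at five bits -- the carry bit being the most-significant bit of the result.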

// fourbit_fulladder_v2.v

`timescale 1ns / 1ps
`default_nettype none

module fourbit_fulladder_v2
(
    input  wire       cin,
    input  wire [3:0] a,
    input  wire [3:0] b,
    output wire [3:0] sum,
    output wire       cout
);
    assign {cout,sum} = a+b+cin;
endmodule
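The role of the assignment target's width can be seen in a short simulation-only sketch. The module name add_width_demo and the signal sum4 are made up for this illustration. With a=9 and b=8 the true sum is 17, which does not fit in four bits: the four-bit target sum4 should come out as 1 with the carry lost, while {cout,sum} keeps the carry in cout.

```verilog
// add_width_demo.v -- simulation-only sketch (hypothetical file name)

`timescale 1ns / 1ps
`default_nettype none

module add_width_demo;
    reg  [3:0] a = 4'd9, b = 4'd8;
    wire [3:0] sum4;   // four-bit target: the carry is discarded
    wire [3:0] sum;
    wire       cout;

    assign sum4       = a + b;   // evaluated at 4 bits
    assign {cout,sum} = a + b;   // evaluated at 5 bits

    initial begin
        #1;  // let the continuous assignments settle
        $display("sum4=%d  cout=%b sum=%d", sum4, cout, sum);
        $finish;
    end
endmodule
```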

Here is a test bench that tests out both versions of the adder at the same time.

// fourbit_fulladder_tb.v

`timescale 1ns / 1ps
`default_nettype none

module fourbit_fulladder_tb;
    // Inputs (for both v1 and v2 adders)
    reg        cin=0;
    reg  [3:0] a=0, b=0;
    // Outputs of fourbit_fulladder_v2
    wire [3:0] sum;
    wire       cout;
    // Outputs of fourbit_fulladder_v1
    wire       sum1, sum2, sum3, sum4, c4;

    // Instantiate the Unit Under Test (UUT) (both of them)
    fourbit_fulladder_v1 uut_v1
    (
	.c0(cin), 
	.a1(a[0]), .a2(a[1]), .a3(a[2]), .a4(a[3]),
	.b1(b[0]), .b2(b[1]), .b3(b[2]), .b4(b[3]),
	.sum1(sum1), .sum2(sum2), .sum3(sum3), .sum4(sum4), .c4(c4)
    );
    fourbit_fulladder_v2 uut_v2
    (
	.cin(cin), .a(a), .b(b),
	.sum(sum), .cout(cout)
    );

    integer i=0, j=0, k=0;
    initial begin
	#100;  // Wait 100 ns for simulator reset to finish
	for (i=0; i<2; i=i+1) begin
	    for (j=0; j<16; j=j+1) begin
		for (k=0; k<16; k=k+1) begin
		   cin = i;
		   a = j;
		   b = k;
		   #100;  // Wait 100 ns so that we can see output update
		end
	    end
	end
	$finish;
    end
endmodule
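The test bench above only produces waveforms to inspect by eye. One possible variation, sketched here under the assumption that it is compiled together with fourbit_fulladder_v1.v and fourbit_fulladder_v2.v, has the simulator compare both adders against the expected sum and report mismatches. The module name fourbit_fulladder_check_tb and the variables expected and errors are invented for this sketch.

```verilog
// fourbit_fulladder_check_tb.v -- self-checking sketch (hypothetical file name)
// Compile together with fourbit_fulladder_v1.v and fourbit_fulladder_v2.v

`timescale 1ns / 1ps
`default_nettype none

module fourbit_fulladder_check_tb;
    reg        cin=0;
    reg  [3:0] a=0, b=0;
    wire       sum1, sum2, sum3, sum4, c4;   // outputs of v1
    wire [3:0] sum;                          // outputs of v2
    wire       cout;

    fourbit_fulladder_v1 uut_v1
    (
        .c0(cin),
        .a1(a[0]), .a2(a[1]), .a3(a[2]), .a4(a[3]),
        .b1(b[0]), .b2(b[1]), .b3(b[2]), .b4(b[3]),
        .sum1(sum1), .sum2(sum2), .sum3(sum3), .sum4(sum4), .c4(c4)
    );
    fourbit_fulladder_v2 uut_v2
    (
        .cin(cin), .a(a), .b(b),
        .sum(sum), .cout(cout)
    );

    integer   i=0, j=0, k=0, errors=0;
    reg [4:0] expected=0;
    initial begin
        #100;  // Wait 100 ns for simulator reset to finish
        for (i=0; i<2; i=i+1) begin
            for (j=0; j<16; j=j+1) begin
                for (k=0; k<16; k=k+1) begin
                    cin = i; a = j; b = k;
                    expected = i+j+k;   // at most 31, so it fits in 5 bits
                    #100;  // let the outputs settle
                    // Both adders should agree with the expected sum
                    if ({c4,sum4,sum3,sum2,sum1} !== expected ||
                        {cout,sum} !== expected) begin
                        errors = errors + 1;
                        $display("MISMATCH cin=%b a=%d b=%d", cin, a, b);
                    end
                end
            end
        end
        $display("done, %0d mismatches", errors);
        $finish;
    end
endmodule
```

The !== operator is used so that an unknown (x) output counts as a mismatch rather than being silently ignored.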

Decoder example

Note that I made the outputs active-high, whereas the 74LS138's outputs are active-low.

// decoder.v

`timescale 1ns / 1ps
`default_nettype none

module decoder
(
    input  wire       e1, e2, e3,
    input  wire [2:0] a,
    output wire [7:0] o
);
    // Selected output is HIGH if E3 & not E2 & not E1; else LOW
    wire e = e3 & ~e2 & ~e1;
    // Drive selected output to 'e'; others to LOW
    assign o[0] = (a==0 ? e : 0);
    assign o[1] = (a==1 ? e : 0);
    assign o[2] = (a==2 ? e : 0);
    assign o[3] = (a==3 ? e : 0);
    assign o[4] = (a==4 ? e : 0);
    assign o[5] = (a==5 ? e : 0);
    assign o[6] = (a==6 ? e : 0);
    assign o[7] = (a==7 ? e : 0);
    // Note that I made the outputs active high here,
    // whereas the 74LS138 outputs are active low
endmodule
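The eight conditional assignments can also be written more compactly. The following is only an alternative sketch (the module name decoder_shift is hypothetical): it shifts a single 1 into bit position a when the enable condition holds, which should behave the same as the decoder above.

```verilog
// decoder_shift.v -- alternative sketch (hypothetical module name)

`timescale 1ns / 1ps
`default_nettype none

module decoder_shift
(
    input  wire       e1, e2, e3,
    input  wire [2:0] a,
    output wire [7:0] o
);
    // Enabled if E3 & not E2 & not E1, as in the decoder above
    wire e = e3 & ~e2 & ~e1;
    // Shift a single 1 into position 'a' when enabled; else all LOW
    assign o = e ? (8'b0000_0001 << a) : 8'b0000_0000;
endmodule
```

Because it has the same ports as decoder, it could be dropped into decoder_tb.v by changing only the instantiated module name.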

Test bench.

// decoder_tb.v

`timescale 1ns / 1ps
`default_nettype none

module decoder_tb;
    // Inputs for decoder
    reg  [2:0] a=0;
    reg        e1=0, e2=0, e3=0;
    // Outputs of decoder
    wire [7:0] o;

    // Instantiate the Unit Under Test (UUT)
    decoder uut
    (
        .a(a), .e1(e1), .e2(e2), .e3(e3),
        .o(o)
    );

   integer i=0, j=0;
   initial begin
	#100;  // Wait 100 ns for simulator reset to finish
	for (i=0; i<8; i=i+1) begin
	    for (j=0; j<8; j=j+1) begin
		a = i;
		{e3,e2,e1} = j;
		#100;  // Wait 100 ns so that we can see output update
	    end
	end
	$finish;
    end
endmodule

Shift register example

Note Verilog syntax for synchronous logic: always @ (posedge clk)

// shiftreg.v

`timescale 1ns / 1ps
`default_nettype none

module shiftreg
(
    input  wire       clk,
    input  wire       d,
    output wire [5:0] o
);
    reg [5:0] oreg=0;
    always @ (posedge clk) begin
        oreg[5] <= oreg[4];
        oreg[4] <= oreg[3];
        oreg[3] <= oreg[2];
        oreg[2] <= oreg[1];
        oreg[1] <= oreg[0];
        oreg[0] <= d;
    end
    assign o = oreg;
endmodule
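The six non-blocking assignments (<=) in the always block all read the old register contents before any of them takes effect, which is why the order in which they are written does not matter. The same shift can be expressed in one line with concatenation, as in this sketch (the module name shiftreg_concat is made up for illustration):

```verilog
// shiftreg_concat.v -- alternative sketch (hypothetical module name)

`timescale 1ns / 1ps
`default_nettype none

module shiftreg_concat
(
    input  wire       clk,
    input  wire       d,
    output wire [5:0] o
);
    reg [5:0] oreg=0;
    always @ (posedge clk) begin
        // Old bits 4..0 move up to bits 5..1; d enters at bit 0
        oreg <= {oreg[4:0], d};
    end
    assign o = oreg;
endmodule
```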

Test bench.

// shiftreg_tb.v

`timescale 1ns / 1ps
`default_nettype none

module shiftreg_tb;
    // Inputs for shiftreg
    reg        clk=0, d=0;
    // Outputs of shiftreg
    wire [5:0] o;

    // Instantiate the Unit Under Test (UUT)
    shiftreg uut
    (
        .clk(clk), .d(d),
        .o(o)
    );

   integer i=0;
   initial begin
	#100;  // Wait 100 ns for simulator reset to finish
	for (i=0; i<20; i=i+1) begin
            // Send in a short pulse at i=0 and a longer pulse at i=10
            d = (i==0 || i==10 || i==11);
	    #100;  // Wait 100 ns, jiggle the clock, and wait another 100 ns
            clk = 1;
	    #100;
            clk = 0;
	end
	$finish;
    end
endmodule

Counter example

// counter.v

`timescale 1ns / 1ps
`default_nettype none

module counter
(
    input  wire       clk,
    output wire [5:0] o
);
    reg [5:0] oreg=0;
    always @ (posedge clk) begin
        oreg <= oreg+1;
    end
    assign o = oreg;
endmodule

Test bench.

// counter_tb.v

`timescale 1ns / 1ps
`default_nettype none

module counter_tb;
    // Inputs for counter
    reg        clk=0;
    // Outputs of counter
    wire [5:0] o;

    // Instantiate the Unit Under Test (UUT)
    counter uut
    (
        .clk(clk),
        .o(o)
    );

   integer i=0;
   initial begin
	#100;  // Wait 100 ns for simulator reset to finish
	for (i=0; i<100; i=i+1) begin
	    #100;  // Wait 100 ns, jiggle the clock, and wait another 100 ns
            clk = 1;
	    #100;
            clk = 0;
	end
	$finish;
    end
endmodule
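A common alternative to toggling the clock by hand inside a for loop is a free-running clock generator. The sketch below assumes it is compiled together with counter.v; the module name counter_clk_tb is invented for this example.

```verilog
// counter_clk_tb.v -- alternative test bench sketch (hypothetical file name)
// Compile together with counter.v

`timescale 1ns / 1ps
`default_nettype none

module counter_clk_tb;
    // Inputs for counter
    reg        clk=0;
    // Outputs of counter
    wire [5:0] o;

    counter uut
    (
        .clk(clk),
        .o(o)
    );

    // Toggle the clock every 100 ns (200 ns period), forever
    always #100 clk = ~clk;

    initial begin
        #20000;   // run for 100 clock periods
        $finish;
    end
endmodule
```

Since the counter is six bits wide, 100 clock periods are enough to watch it wrap around from 63 back to 0.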

Lab

Part 1

I have combined several of the example programs from lecture into a single Verilog design file, lab9.v. Download this file as well as lab9_tb.v (Verilog test bench), lab9.ucf (Xilinx pin assignments), and lab9.xise (Xilinx project file) from http://positron.hep.upenn.edu/p364/lab9
Start up Xilinx ISE on your lab computer, open lab9.xise, compile, simulate.
Once this works, plug your FPGA board into your breadboard.
- Notes about power, etc.
- Notes about logic levels.
- Notes about having only 12 of these to last us the rest of the semester.
I have pre-programmed the lab9 project into your FPGA:
- Pins 2 and 3 are the A and B inputs for the NAND, AND, OR, and XOR gates; pin 2 is the A input for the inverter.
- Pin 22 is the inverter output: NOT A.
- Pin 23 is the NAND output: A NAND B.
- Pin 24 is the AND output: A AND B.
- Pin 25 is the OR output: A OR B.
- Pin 26 is the XOR output: A XOR B.
- For the four-bit full adder, pin 4 is Cin; pins 10,11,12,13 are A0,A1,A2,A3; pins 14,15,16,17 are B0,B1,B2,B3; pins 27,28,29,30 are Sum0,Sum1,Sum2,Sum3; pin 31 is Cout.
- For the decoder, pins 32,33,34,35,36,37,38,39 are O0-O7; pins 2,3,4 are re-used for A0-A2; and pin 1 is an active-high enable.
- Note that all of this stuff together uses up about 30% of the logic cells in the chip (see fitter report after you compile) -- and this is one of the smallest Xilinx chips available.

/*
 * lab9.v
 * Skeletal program from which to begin PHYS364 Lab 9
 * begun 2010-11-04 by Bill Ashmanskas, ashmansk@hep.upenn.edu
 */

`timescale 1ns / 1ps
`default_nettype none


module lab9
(
   /*
    * I chose these dumb names for the I/O signals
    * so that the mapping to pin numbers on the CMOD
    * DIP package is obvious.  Normally you would
    * give the I/O pins mnemonic names and would put
    * them into the .ucf file accordingly.
    */
    input  wire pin01, pin02, pin03, pin04,
    input  wire pin09, pin10, pin11, pin12, pin13,
    input  wire pin14, pin15, pin16, pin17, pin18,
    output wire pin22, pin23, pin24, pin25, pin26,
    output wire pin27, pin28, pin29, pin30, pin31,
    output wire pin32, pin33, pin34, pin35, pin36,
    output wire pin37, pin38, pin39, pin40
);
    nand_etc nand_etc_instance (
        .a(pin02), .b(pin03),
	.not_a(pin22), .a_nand_b(pin23), 
	.a_and_b(pin24), .a_or_b(pin25), .a_xor_b(pin26));
    fourbit_fulladder fourbit_fulladder_instance (
        .cin(pin04), 
        .a({pin13,pin12,pin11,pin10}), 
        .b({pin17,pin16,pin15,pin14}),
	.sum({pin30,pin29,pin28,pin27}), .cout(pin31));
    decoder decoder_instance (
        .e1(1'b0), .e2(1'b0), .e3(pin01), .a({pin04,pin03,pin02}),
	.o({pin39,pin38,pin37,pin36,pin35,pin34,pin33,pin32}));
endmodule


module nand_etc
(
    input  wire a,
    input  wire b,
    output wire not_a,
    output wire a_nand_b,
    output wire a_and_b,
    output wire a_or_b,
    output wire a_xor_b
);
    assign not_a    = ~a;
    assign a_nand_b = ~(a&b);
    assign a_and_b  = a&b;
    assign a_or_b   = a|b;
    assign a_xor_b  = a^b;
endmodule


module fourbit_fulladder
(
    input  wire       cin,
    input  wire [3:0] a,
    input  wire [3:0] b,
    output wire [3:0] sum,
    output wire       cout
);
    assign {cout,sum} = a+b+cin;
endmodule


module decoder
(
    input  wire       e1, e2, e3,
    input  wire [2:0] a,
    output wire [7:0] o
);
    // Selected output is HIGH if E3 & not E2 & not E1; else LOW
    wire e = e3 & ~e2 & ~e1;
    // Drive selected output to 'e'; others to LOW
    assign o[0] = (a==0 ? e : 0);
    assign o[1] = (a==1 ? e : 0);
    assign o[2] = (a==2 ? e : 0);
    assign o[3] = (a==3 ? e : 0);
    assign o[4] = (a==4 ? e : 0);
    assign o[5] = (a==5 ? e : 0);
    assign o[6] = (a==6 ? e : 0);
    assign o[7] = (a==7 ? e : 0);
    // Note that I made the outputs active high here,
    // whereas the 74LS138 outputs are active low
endmodule

# lab9.ucf -- mapping Verilog names to FPGA pins

NET "pin01"  LOC="12" | IOSTANDARD=LVTTL ;
NET "pin02"  LOC="13" | IOSTANDARD=LVTTL ;
NET "pin03"  LOC="14" | IOSTANDARD=LVTTL ;
NET "pin04"  LOC="16" | IOSTANDARD=LVTTL ;
NET "pin09"  LOC="18" | IOSTANDARD=LVTTL ;
NET "pin10"  LOC="19" | IOSTANDARD=LVTTL ;
NET "pin11"  LOC="20" | IOSTANDARD=LVTTL ;
NET "pin12"  LOC="21" | IOSTANDARD=LVTTL ;
NET "pin13"  LOC="22" | IOSTANDARD=LVTTL ;
NET "pin14"  LOC="23" | IOSTANDARD=LVTTL ;
NET "pin15"  LOC="27" | IOSTANDARD=LVTTL ;
NET "pin16"  LOC="28" | IOSTANDARD=LVTTL ;
NET "pin17"  LOC="29" | IOSTANDARD=LVTTL ;
NET "pin18"  LOC="30" | IOSTANDARD=LVTTL ;
NET "pin22"  LOC="31" | IOSTANDARD=LVTTL ;
NET "pin23"  LOC="32" | IOSTANDARD=LVTTL ;
NET "pin24"  LOC="33" | IOSTANDARD=LVTTL ;
NET "pin25"  LOC="34" | IOSTANDARD=LVTTL ;
NET "pin26"  LOC="36" | IOSTANDARD=LVTTL ;
NET "pin27"  LOC="37" | IOSTANDARD=LVTTL ;
NET "pin28"  LOC="38" | IOSTANDARD=LVTTL ;
NET "pin29"  LOC="39" | IOSTANDARD=LVTTL ;
NET "pin30"  LOC="40" | IOSTANDARD=LVTTL ;
NET "pin31"  LOC="41" | IOSTANDARD=LVTTL ;
NET "pin32"  LOC="42" | IOSTANDARD=LVTTL ;
NET "pin33"  LOC="43" | IOSTANDARD=LVTTL ;
NET "pin34"  LOC="44" | IOSTANDARD=LVTTL ;
NET "pin35"  LOC="1"  | IOSTANDARD=LVTTL ;
NET "pin36"  LOC="2"  | IOSTANDARD=LVTTL ;
NET "pin37"  LOC="3"  | IOSTANDARD=LVTTL ;
NET "pin38"  LOC="5"  | IOSTANDARD=LVTTL ;
NET "pin39"  LOC="6"  | IOSTANDARD=LVTTL ;
NET "pin40"  LOC="8"  | IOSTANDARD=LVTTL ;

Part 2

Now load lab9_synchronous.v, which includes the counter and the shift register shown in lecture today.
Compile it, simulate it, load it into your FPGA, make it display to a bank of LEDs. (Probably the block of 10 LEDs in a DIP package is easiest to wire up.)
Now modify the program so that the counter shows its output on a 7-segment LED display.
- pin 1 = lower left; pin 2 = bottom; pin 4 = lower right; pin 6 = upper right; pin 7 = top; pin 9 = upper left; pin 10 = center; pin 3 = common LED positive line.
Now modify the program so that the counter adds a 4-bit number (set with a DIP switch bank) each cycle instead of just adding 1.
- Show that it works for counting up by 1, 2, 3. Does it stop when you count up by zero?
- What value do you set on the switches to make the counter count down by ones? Down by twos?

/*
 * lab9_synchronous.v
 * Skeletal program from which to begin PHYS364 Lab 9 (part 2)
 * begun 2010-11-04 by Bill Ashmanskas, ashmansk@hep.upenn.edu
 */

`timescale 1ns / 1ps
`default_nettype none


module lab9_synchronous
(
   /*
    * I chose these dumb names for the I/O signals
    * so that the mapping to pin numbers on the CMOD
    * DIP package is obvious.  Normally you would
    * give the I/O pins mnemonic names and would put
    * them into the .ucf file accordingly.
    */
    input  wire pin01, pin02, pin03, pin04,
    input  wire pin09, pin10, pin11, pin12, pin13,
    input  wire pin14, pin15, pin16, pin17, pin18,
    output wire pin22, pin23, pin24, pin25, pin26,
    output wire pin27, pin28, pin29, pin30, pin31,
    output wire pin32, pin33, pin34, pin35, pin36,
    output wire pin37, pin38, pin39, pin40
);
    counter counter_instance (
        .clk(pin01),
        .o({pin40,pin39,pin38,pin37,pin36,pin35}));
    shiftreg shiftreg_instance (
	.clk(pin01), .d(pin02),
        .o({pin34,pin33,pin32,pin31,pin30,pin29}));
endmodule


module counter
(
    input  wire       clk,
    output wire [5:0] o
);
    reg [5:0] oreg=0;
    always @ (posedge clk) begin
        oreg <= oreg+1;
    end
    assign o = oreg;
endmodule


module shiftreg
(
    input  wire       clk,
    input  wire       d,
    output wire [5:0] o
);
    reg [5:0] oreg=0;
    always @ (posedge clk) begin
        oreg[5] <= oreg[4];
        oreg[4] <= oreg[3];
        oreg[3] <= oreg[2];
        oreg[2] <= oreg[1];
        oreg[1] <= oreg[0];
        oreg[0] <= d;
    end
    assign o = oreg;
endmodule

Part 3

Modify the shift register in lab9_synchronous.v so that it requires no input (except the 1 Hz clock). Instead:
- on each clock cycle, bits 1,2,3,4,5 take on the old values of bits 0,1,2,3,4;
- if all bits were zero, bit 0 gets a one;
- otherwise, bit 0 gets a zero: every bit moves to the right by one place and a zero is put into bit 0.
Display the output on your rectangular bank of LEDs.
Now re-wire the output so that the 7-segment LED display rotates through these six states:
