Von Neumann (Princeton) Architecture

N15page 1 of 10

Microprocessors

von Neumann (Princeton) architecture

instructions and data reside in same memory space (same address and data busses)

cannot fetch a new instruction and read/write data at the same time (bottleneck)

Harvard architecture

separate memory for instructions and data

data stored in RAM, program often stored in EEPROM with external serial load

Microprocessor history

Apollo guidance computer = 12,300 transistors (4,100 chips, one 3-input NOR gate per chip)

Chip / Instructions / Data / Addr / Transistors / Year / Notes (- = not listed)
Intel
4004 / 46 / 4b / 12b / 2,300 / 1971 / -
8008 / 48 / 8b / 14b / 3,500 / 1972 / -
8080 / 256 / 8b / 16b / 4,500 / 1974 / several support chips with different voltages
8085 / 256 / 8b / 16b / 6,500 / 1977 / single +5V supply
8086 / - / 16b / 20b / 29,000 / 1978 / -
8087 / - / - / - / - / 1980 / first floating point coprocessor
8088 / - / 8b / 20b / 29,000 / 1979 / IBM-PC
80186 / - / 16b / 20b / 55,000 / 1982 / -
80286 / - / 16b / 24b / 134,000 / 1982 / -
80386 / - / 32b / 32b / 275,000 / 1985 / -
80486 / - / 32b / 32b / 1,180,235 / 1989 / -
Pentium / - / 64b / 32b / 3,100,000 / - / -
Pentium Pro / - / 64b / 32b / 5,500,000 / - / -
Pentium II / - / 64b / 32b / 7,500,000 / - / -
Pentium III / - / - / - / 9,500,000 / - / -
Pentium IV / - / - / - / 42,000,000 / - / -
MOS Technology
6502 / 256 / 8b / 16b / 3,510 / 1975 / Apple II
Zilog
Z80 / 256 / 8b / 16b / 8,500 / 1976 / 8080 compatible
Motorola
6800 / 72 / 8b / 16b / 4,100 / 1974 / -
68000 / - / 16b / 24b / 68,000 / 1979 / Macintosh, UNIX
68020 / - / 32b / 32b / 200,000 / 1984 / UNIX
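The Addr column fixes how much memory each chip can reach: n address lines select 2^n locations. A minimal sketch of that arithmetic follows; the function name addressable_locations is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct locations selectable by an address bus `bits` wide.
 * Hypothetical helper, not taken from any data sheet. */
uint64_t addressable_locations(unsigned bits)
{
    assert(bits < 64);              /* keep the shift well defined */
    return (uint64_t)1 << bits;     /* 2^bits */
}
```

With this, 12b (4004) gives 4,096 locations, 16b (8080, 6502, Z80) gives 65,536, and 20b (8086, 8088) gives 1,048,576.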

Main functions of Central Processing Unit (CPU)

Arithmetic logic unit (ALU)

Registers for temporary storage

Typically size of external data bus

Accumulator or Working (input and output of ALU)

Temporary (second input to ALU)

Flag status register (Carry, Zero, Negative, Interrupt Disable, etc.)

Instruction buffer

Control buffer

Data buffer

General purpose

Typically size of external address bus

Address buffer

Program counter (PC)

Stack pointer (SP)

Instruction decode lookup table (LUT)

Bus drivers
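How the ALU, the W and TEMP registers and the flag register interact can be sketched with an 8-bit add. This is an illustrative model only: struct flags and alu_add are names invented here, and only three of the flags listed above are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag status register with three of the flags named above. */
struct flags {
    bool carry;
    bool zero;
    bool negative;
};

/* Add TEMP to W the way an 8-bit ALU would, updating the flags. */
uint8_t alu_add(uint8_t w, uint8_t temp, struct flags *f)
{
    assert(f != NULL);
    unsigned sum = (unsigned)w + temp;      /* up to 9 significant bits */
    uint8_t result = (uint8_t)sum;          /* keep the low 8 bits */
    f->carry = sum > 0xFF;                  /* bit 8 fell off the top */
    f->zero = (result == 0);
    f->negative = (result & 0x80) != 0;     /* bit 7 is the sign bit */
    return result;
}
```

For example, 0xFF + 0x01 returns 0x00 with Carry and Zero set, while 0x70 + 0x20 returns 0x90 with only Negative set.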

6502 CPU

8085 architecture

Classic one byte instruction (8b internal data, 8b external data, 16b external address)

move contents of PCL to address buffer low

move contents of PCH to address buffer high

set MEMR

fetch instruction byte from memory into data buffer

move contents of data buffer into instruction buffer

decode instruction to set internal control lines

increment PC by one

typical functions

a) arithmetic operations - result in W

single operand on W - zero, NOT, increment/decrement, two's complement, set/clear bit, rotate/shift

two operand on W and TEMP - add, AND, OR, XOR

b) internal move from register source to register destination
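The one byte instruction cycle above can be sketched as a small simulator. This is a rough model under stated assumptions: the opcode values, struct cpu and step_one_byte are invented for illustration and do not correspond to any real instruction set, and memory is assumed to be a MEM_SIZE byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 65536u   /* 16b address bus; memory is assumed to be this large */

/* Hypothetical opcodes, chosen only for this sketch. */
enum { OP_INC_W = 0x01, OP_NOT_W = 0x02, OP_ADD_W_TEMP = 0x03, OP_MOV_TEMP_W = 0x04 };

struct cpu {
    uint16_t pc;
    uint16_t address_buffer;
    uint8_t data_buffer;
    uint8_t instruction_buffer;
    uint8_t w;
    uint8_t temp;
};

void step_one_byte(struct cpu *c, const uint8_t *memory)
{
    assert(c != NULL && memory != NULL);

    uint8_t pcl = (uint8_t)(c->pc & 0xFF);
    uint8_t pch = (uint8_t)(c->pc >> 8);
    c->address_buffer = pcl;                                      /* PCL -> address buffer low  */
    c->address_buffer = (uint16_t)(c->address_buffer | (pch << 8)); /* PCH -> address buffer high */

    c->data_buffer = memory[c->address_buffer];   /* set MEMR, fetch byte */
    c->instruction_buffer = c->data_buffer;       /* data buffer -> instruction buffer */

    switch (c->instruction_buffer) {              /* decode selects the operation */
    case OP_INC_W:      c->w = (uint8_t)(c->w + 1);        break;
    case OP_NOT_W:      c->w = (uint8_t)~c->w;             break;
    case OP_ADD_W_TEMP: c->w = (uint8_t)(c->w + c->temp);  break;
    case OP_MOV_TEMP_W: c->temp = c->w;                    break;
    default:            break;                    /* unknown opcode: treated as a no-op */
    }
    c->pc++;                                      /* increment PC by one */
}
```

Each call consumes exactly one byte of program memory, which is why PC advances by one regardless of the operation.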

Classic two byte instruction (8b internal data, 8b external data, 16b external address)

same as one byte instruction

decode instruction to set internal control lines

increment PC by one

fetch a second byte into data buffer (usually contains a constant value)

move contents of data buffer into W (or another register)

increment PC again

typical functions

load constant into register destination
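A two byte "load constant" instruction can be sketched the same way: the first fetch delivers the opcode, the second delivers the constant. The opcode value OP_LOAD_W_IMM and the names below are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 65536u   /* memory is assumed to hold this many bytes */

enum { OP_LOAD_W_IMM = 0x10 };   /* hypothetical opcode */

struct cpu {
    uint16_t pc;
    uint8_t instruction_buffer;
    uint8_t data_buffer;
    uint8_t w;
};

/* One memory read at PC followed by a PC increment. */
static uint8_t fetch_byte(struct cpu *c, const uint8_t *memory)
{
    c->data_buffer = memory[c->pc];   /* PC -> address buffer, set MEMR, read */
    c->pc++;
    return c->data_buffer;
}

void step_two_byte(struct cpu *c, const uint8_t *memory)
{
    assert(c != NULL && memory != NULL);
    c->instruction_buffer = fetch_byte(c, memory);   /* first byte: opcode */
    if (c->instruction_buffer == OP_LOAD_W_IMM) {
        c->w = fetch_byte(c, memory);                /* second byte: constant -> W */
    }
}
```

After one step over the bytes 0x10 0x2A, W holds 0x2A and PC has advanced by two.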

Classic three byte instruction (8b internal data, 8b external data, 16b external address)

same as one byte instruction

decode instruction to set internal control lines

increment PC by one

fetch a second byte into data buffer (usually contains low byte of an address)

move contents of data buffer into W (or another register)

increment PC again

fetch a third byte into data buffer (usually contains high byte of an address)

move contents of data buffer into TEMP (or another register)

increment PC again

move W into address buffer low

move TEMP into address buffer high

if read from memory

set MEMR

fetch byte from memory location into data buffer

move data buffer into W

if write to memory

move content of a register into data buffer

set MEMW

write data buffer into memory location

typical functions

a) read variable from memory

b) write variable to memory

c) conditional jump - JZ, JNZ, JC, JNC, JPOS, JNEG

move W and TEMP into PCL and PCH instead of address buffer if test is true

d) unconditional jump - JMP

e) jump to subroutine - JSR (see below)
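The three byte sequence above can be sketched as follows: the low and high address bytes land in W and TEMP, are combined into a 16b address, and that address is used either for a memory access or as the new PC. The opcodes, the extra general purpose register reg and all names are assumptions for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 65536u   /* memory is assumed to hold this many bytes */

/* Hypothetical opcodes, chosen only for this sketch. */
enum { OP_LOAD = 0x20, OP_STORE = 0x21, OP_JZ = 0x22, OP_JMP = 0x23 };

struct cpu {
    uint16_t pc;
    uint8_t w;
    uint8_t temp;
    uint8_t reg;     /* general purpose register used as the store source */
    bool zero;       /* Zero flag tested by JZ */
};

void step_three_byte(struct cpu *c, uint8_t *memory)
{
    assert(c != NULL && memory != NULL);

    uint8_t opcode = memory[c->pc++];   /* first byte: opcode */
    c->w = memory[c->pc++];             /* second byte: low byte of address  */
    c->temp = memory[c->pc++];          /* third byte: high byte of address */
    uint16_t address = (uint16_t)(c->w | (c->temp << 8));

    switch (opcode) {
    case OP_LOAD:  c->w = memory[address];           break;  /* set MEMR, read into W */
    case OP_STORE: memory[address] = c->reg;         break;  /* set MEMW, write register */
    case OP_JZ:    if (c->zero) { c->pc = address; } break;  /* W, TEMP -> PCL, PCH */
    case OP_JMP:   c->pc = address;                  break;
    default:       break;                                    /* unknown opcode: no-op */
    }
}
```

A jump differs from a load or store only in the last step: the assembled address goes into PC instead of the address buffer.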

Arduino program compiled into AVR instructions

int a, b, c; // 16b signed

a = 4;

b = 6;

c = a + b;

first pass of compiler assigns memory locations for data

hex location    data

0x0100          lsB(a)
0x0101          msB(a)
0x0102          lsB(b)
0x0103          msB(b)
0x0104          lsB(c)
0x0105          msB(c)
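The layout above puts the least significant byte at the lower address. A short sketch of splitting a 16b value into that layout and reading it back follows; store_int16 and load_int16 are hypothetical helpers, and the memory array stands in for the data space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 16b signed value as lsB then msB, as in the table above. */
void store_int16(uint8_t *memory, uint16_t address, int16_t value)
{
    assert(memory != NULL);
    uint16_t bits = (uint16_t)value;                 /* two's complement bit pattern */
    memory[address] = (uint8_t)(bits & 0xFF);        /* lsB at the lower address */
    memory[address + 1] = (uint8_t)(bits >> 8);      /* msB at the next address  */
}

/* Reassemble the value from its two bytes. */
int16_t load_int16(const uint8_t *memory, uint16_t address)
{
    assert(memory != NULL);
    uint16_t bits = (uint16_t)(memory[address] | (memory[address + 1] << 8));
    return (int16_t)bits;   /* assumes the usual two's complement conversion */
}
```

Storing 4 at 0x0100 leaves 0x04 at 0x0100 and 0x00 at 0x0101, matching the first two rows of the table.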

second pass of compiler creates instructions (hex opcode)

hex opcode     command, args       comment

a = 4;

84 e0          ldi r24, 0x04       Load Immediate - load lsB(a) into register 24 (r24)
90 e0          ldi r25, 0x00       Load Immediate - load msB(a) into register 25 (r25)
90 93 01 01    sts 0x0101, r25     Store Direct to Data Space - store msB(a) to memory
80 93 00 01    sts 0x0100, r24     Store Direct to Data Space - store lsB(a) to memory

b = 6;

86 e0          ldi r24, 0x06       Load Immediate - load lsB(b) into r24
90 e0          ldi r25, 0x00       Load Immediate - load msB(b) into r25
90 93 03 01    sts 0x0103, r25     Store Direct to Data Space - store msB(b) to mem
80 93 02 01    sts 0x0102, r24     Store Direct to Data Space - store lsB(b) to mem

c = a + b;

80 91 02 01    lds r24, 0x0102     Load Direct from Data Space - load lsB(b) from mem into r24
90 91 03 01    lds r25, 0x0103     Load Direct from Data Space - load msB(b) from mem into r25
20 91 00 01    lds r18, 0x0100     Load Direct from Data Space - load lsB(a) from mem into r18
30 91 01 01    lds r19, 0x0101     Load Direct from Data Space - load msB(a) from mem into r19
82 0f          add r24, r18        Add without Carry - add lsB(b) plus lsB(a), result in r24
93 1f          adc r25, r19        Add with Carry - add msB(b) plus msB(a) plus carry, result in r25
90 93 05 01    sts 0x0105, r25     Store Direct to Data Space - store msB(c) to mem
80 93 04 01    sts 0x0104, r24     Store Direct to Data Space - store lsB(c) to mem

11 bytes of text (excluding spaces and semicolons), 6 bytes of data, 52 bytes of opcode instructions
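The add / adc pair in the listing is how an 8b ALU adds 16b values: the low bytes are added first, and the carry out of that add is fed into the add of the high bytes. A minimal sketch of the idea, with invented function names add8 and add16_bytewise:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 8b add with carry in; the carry out is written back through *carry. */
static uint8_t add8(uint8_t x, uint8_t y, bool *carry)
{
    assert(carry != NULL);
    unsigned sum = (unsigned)x + y + (*carry ? 1u : 0u);
    *carry = sum > 0xFF;
    return (uint8_t)sum;
}

/* 16b add built from two 8b adds, mirroring add r24, r18 / adc r25, r19. */
uint16_t add16_bytewise(uint16_t a, uint16_t b)
{
    bool carry = false;                                                   /* add: no carry in */
    uint8_t low = add8((uint8_t)(b & 0xFF), (uint8_t)(a & 0xFF), &carry); /* like add */
    uint8_t high = add8((uint8_t)(b >> 8), (uint8_t)(a >> 8), &carry);    /* like adc */
    return (uint16_t)(low | (high << 8));
}
```

For a = 4 and b = 6 no carry occurs and the result is 10; for 0x00FF + 0x0001 the low add overflows and the carry produces 0x0100.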

Historic evolution of CPU

More registers

More arithmetic functions

More instructions in instruction set

Bigger instruction decoder LUT

Slower performance per instruction due to propagation delay through the larger decoder

Reduced instruction set computer (RISC)

Fewer registers

Fewer instructions

Smaller decoder LUT

Faster performance per instruction

PIC and Atmel are RISC with modified Harvard architecture

Call subroutine

1) push contents of PC onto stack and increment SP

2) load address of instructions for subroutine into PC

3) execute subroutine

Return from subroutine

4) pop old value of PC from stack and decrement SP

5) load value from stack into PC

Maximum size of stack limits number of levels of subroutine nesting
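The call and return steps, and the nesting limit set by the stack size, can be sketched as below. The stack depth of 8, the struct layout and the function names are assumptions for illustration; PC is assumed to already point at the instruction after the call when it is pushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_DEPTH 8   /* hypothetical maximum nesting level */

struct cpu {
    uint16_t pc;
    uint16_t stack[STACK_DEPTH];
    unsigned sp;                    /* index of the next free stack slot */
};

/* Returns false when the stack is full, i.e. the nesting is too deep. */
bool call_subroutine(struct cpu *c, uint16_t target)
{
    assert(c != NULL);
    if (c->sp >= STACK_DEPTH) {
        return false;
    }
    c->stack[c->sp++] = c->pc;      /* 1) push PC and increment SP */
    c->pc = target;                 /* 2) load subroutine address into PC */
    return true;
}

/* Returns false when there is no saved PC to return to. */
bool return_from_subroutine(struct cpu *c)
{
    assert(c != NULL);
    if (c->sp == 0) {
        return false;
    }
    c->pc = c->stack[--c->sp];      /* 4) and 5) decrement SP, pop old PC into PC */
    return true;
}
```

With STACK_DEPTH set to 8, the ninth nested call fails, which is the limit the last line above refers to.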

Interrupt

hardware generated subroutine - use stack to remember old PC

need special interrupt handler instructions

need list of addresses for different handler routines (interrupt vector)

mask off other interrupts
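The interrupt sequence can be sketched as a lookup in a vector of handler addresses, with other interrupts masked while a handler runs. This is a simplified model: a single saved_pc field stands in for the stack, and NUM_IRQS, the struct and the function names are all invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_IRQS 4   /* hypothetical number of interrupt sources */

struct cpu {
    uint16_t pc;
    uint16_t saved_pc;            /* stands in for the stack: one level only */
    bool interrupts_masked;       /* interrupt disable flag */
    uint16_t vector[NUM_IRQS];    /* handler address for each interrupt source */
};

/* Returns false if the request is ignored (masked or unknown source). */
bool enter_interrupt(struct cpu *c, unsigned irq)
{
    assert(c != NULL);
    if (irq >= NUM_IRQS || c->interrupts_masked) {
        return false;
    }
    c->saved_pc = c->pc;              /* remember old PC */
    c->interrupts_masked = true;      /* mask off other interrupts */
    c->pc = c->vector[irq];           /* jump through the interrupt vector */
    return true;
}

void return_from_interrupt(struct cpu *c)
{
    assert(c != NULL);
    c->pc = c->saved_pc;              /* restore old PC */
    c->interrupts_masked = false;     /* allow interrupts again */
}
```

A second request that arrives while a handler is running is rejected until return_from_interrupt clears the mask.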