Introduction to Binary Computers

Modified version

Introduction to Binary Computers

Steve Jost

January 1997

Table of Contents

1. A Brief History of Computers 1

2. Binary and Hexadecimal Numbers 2

3. Decimal-to-Binary Conversion 3

4. Binary Arithmetic 5

5. Representation of Negative Numbers 7

6. Bitwise Operations 8

7. Introduction to MACHINE-16 9

8. Addressing Modes 15

1. A Brief History of Computers.

1200 The modern abacus was invented.

1642 Blaise Pascal invented the first calculating machine at the age of 19.

1672 Gottfried Leibniz built a calculating machine that could add, subtract, multiply, and divide.

1837 Charles Babbage built the first mechanical computer capable of multistep programs. This machine was about 100 years ahead of its time and little further progress was made until the 20th century.

1842 Probably the first programmer was Ada Augusta, the Countess of Lovelace. She worked with Babbage helping him write programs for his “analytical engine.” She noted that the machine could not “originate anything” but could only do “what we know how to order it to perform.” The computer language ADA developed in the 1970's was named after her.

1887 Dr. Herman Hollerith developed a punched card reading machine to assist in tabulating data for the

1890 U. S. census. Without this machine, processing this data would have taken more than ten years! With the machine, the data processing for the census only took three years.

1937 Howard Aitken who was a professor at Harvard built the Mark I digital computer. It was controlled with electromagnetic relays and received its input from punched cards.

1939 The first electronic computer using vacuum tubes for switching was constructed by Dr. John Vincent Atanasoff at Iowa State college and was called the ABC (Atanasoff-Berry Computer). It was built for solving systems of simultaneous equations.

1940 The first general purpose computer was built by Atanasoff together with John Mauchly and J. Presper Eckert. This computer was called ENIAC (Electronic Numerical Integrator and Calculator) and funded by the U.S. army. It could do 300 multiplications per second which was 300 times faster than any other machine of the day, but is contained 18,000 vacuum tubes and weighed 30 tons. The ENIAC was programmed by externally manipulating plugs and switches. The machine was used by the army until 1955.

1945 John von Neumann proposed two ideas which made modern high speed computers possible. These ideas were using binary numbers to store data, and storing instructions as data rather than entering them by switches or plugs as was done in earlier machines.

1949 The transistor was invented by Bardeen, Braitain, and Shockley at Bell Telephone Laboratories, who shared the Nobel Prize for this invention which made modern electronics possible.

1949 The first stored program electronic computer was developed at Cambridge University. It was built by M.V. Wilkes and called EDSAC (Electronic Delay Storage Automatic Calculator.)

1950 Computer industry forcasters concluded that about 10 stored memory computers would meet the demand for the entire U.S for years to come. This turned out to be one of the worst forecasts in history.

1954 International Business machines (IBM) sold the first computer for record keeping and business organization. This machine was less expensive than other computers available at the time and was widely accepted. These early computers were called first generation computers. They were programmed in binary machine language which a difficult and error prone process. First generation computers were originally designed for scientific operations.

1952 The first high level language was invented by Dr. Grace Hopper. She developed a compiler for the language called A-2 which converted instructions into machine language.

1954 The FORTRAN (FORmula TRANslator) language was developed at IBM by a programming team headed by John Backus. FORTRAN is still widely used for scientific applications.

1959 The first second generation computers were introduced which were smaller and faster than the first generation ones. They used solid state devices such as diodes and transistors instead of vacuum tubes. In addition computers which accepted instructions in high level languages became widespread.

1959 To meet the increasing demand for data processing, the language COBOL (COmmon Business Oriented Language) was developed. This language became popular because it was written in a quasi-English form that could be more easily understood by nonprogrammers than other languages of that time.

1963 The BASIC (Beginners All Purpose Symbolic Instruction Code) language was developed at Dartmouth college by John Kemeny and Thomas Kurtz. It was introduced as a language which was easy for students to learn, and was available on a timesharing computer which allowed several users to take turns sharing the central processing unit of the computer.

1970 The C programming language was designed by Dennis Ritchie. Its name comes from the fact that it was an improved version of the language BCPL or B for short. Many applications which formerly were written in assembly language are now written in C.

1976 The first Cray supercomputer was built. It is capable of performing more than 100 million floating point operations per second (100 megaflops). Computers with the capability of performing more than 10 billion floating point operations per second (10 gigaflops) will be possible within the next ten years.

2. Binary Numbers for Computers

Many electronic hardware devices have two natural states such as conducting or not conducting, magnetized or not magnetized, positive or negative, on or off, etc. Using the binary numbers 0 or 1 greatly simplifies the design of electronic computers. However, because long sequences of 0's and 1's are difficult to read and understand, binary digits are conventionally grouped to make them more comprehensible. The following table shows the terms that are sometimes used to denote various sized groups of binary digits.

Number of Binary Digits Term

======

1 bit

4 nibble

8 byte

16 word

32 longword

Here are some of the common C datatypes and their properties.

C Datatype / Number of bits / Number of bytes / Minimum Value / Maximum Value
char / 8 / 1 / -128 / 127
unsigned char / 8 / 1 / 0 / 255
short int / 16 / 2 / -32,768 / 32,767
unsigned short int / 16 / 2 / 0 / 65,536
long int / 32 / 4 / -2,147,483,648 / 2,147,483,647
unsigned long int / 32 / 4 / 0 / 4,294,967,296
float / 32 / 4 / -3.40 x 10^38 / 3.40 x 10^38
double / 64 / 8 / -1.79 x 10^308 / 1.79 x 10^308

The size of the datatype int depends on the particular implementation of C being used. On a mainframe or minicomputer, the likely size of an intis 4 bytes, while on a personal computer or microcomputer, its size is probably 2 bytes. As an introduction to digital computers, we will study a simplified computer called MACHINE-16 which only uses numbers of length 8 bits or 1 byte. The reason for the name MACHINE-16 is that it has 16 instructions for manipulating data. MACHINE-16 is a Von Neumann machine, described in Section 1, in the sense that both data and instructions can be stored in its memory. The instructions are executed sequentially one at a time, with the possibility of looping back to repeat a block of instructions several times. Although modern assembly languages are much more powerful than the language of MACHINE-16, the 16 instructions in its instruction set are powerful enough to solve a wide variety of problems. Some of these problems will be discussed in Section 6.

3. Decimal-to-Binary Conversion

Because MACHINE-16 uses 8 bit binary arithmetic, all of our examples will be with 8 bit numbers. Initially, we will only consider positive or unsigned binary numbers. Later, we will see how negative numbers can be represented with eight bits. To convert a binary number into base 10 or vice versa, it is convenient to use the Binary-Hex-Decimal conversion table shown in below. A binary 1 means that that power of two is present while a 0 means that it is absent. Here is a table showing the various powers of two that we will need:

BinaryDecimal

======

1 1

10 2

100 4

1000 8

10000 16

100000 32

1000000 64

10000000 128

For example 01001101 = 64 + 8 + 4 + 1 = 77.

To convert a decimal number to binary, the procedure is reversed: express the decimal number as a sum of powers of 2 and then write this sum as a sequence of binary bits. For example, to convert the decimal number 77 into binary, we write

-64

- 8

- 4

- 1

We select for each subtraction, the largest power of 2 which is less than or equal to the amount remaining. We then represent those powers of 2 present as 1 and those powers of 2 missing as 0. This gives 64 + 8 + 4 + 1 = 01001101.

Because long binary numbers are hard to read, computer programmers prefer to express them in a form that is more easily read. The most common choice of notation today is the hexadecimal (base 16) representation where each four bit group of digits (nibble) is represented by a single hexadecimal digit. These digits are shown in the following Binary-Hex-Conversion table. Because we will be using the hex representation for machine code input to MACHINE-16, it is essential that this table be memorized.

BinaryHexDecimal

======

0000 0 0

0001 1 1

0010 2 2

0011 3 3

0100 4 4

0101 5 5

0110 6 6

0111 7 7

1000 8 8

1001 9 9

1010 a 10

1011 b 11

1100 c 12

1101 d 13

1110 e 14

1111 f 15

To represent a binary number in hex notation, simply replace each nibble by its corresponding hex digit:

0110 / 1001 / 1110 / 1001 / 0100 / 1100 / 0101 / 0001
6 / 9 / e / 9 / 4 / c / 5 / 1

The binary numbers we consider will be 8 bit numbers, each of which can be represented by 2 hex digits. We will employ the C language representation which uses the prefix “0x” (zero x) for hex numbers. For example the expression 0x5d represents the binary number 01011101. It is important to note that the internal binary representation of an integer is independent of the form that the number takes when it is printed. The binary number 01011101 looks like 93 when printed with the C++ manipulator dec , like

5d when printed with hex, and like ]when printed as a regular char.

Exercises

Assume that all binary numbers consist of 8 bits and are unsigned, that is, are in the range 0 to 255. Signed binary numbers are discussed in Section 5.

1. Convert the following binary numbers to decimal:

01101011, 01110000, 00000111, 11111111, 01010101

2. Convert the following decimal numbers to binary:

49, 7, 153, 200, 191, 128, 93

3. Convert the following hex numbers to binary:

0x41, 0x3f, 0x83, 0xef, 0x20, 0x2a, 0xbb

4. Convert the following binary numbers to hex:

1101101, 11101101, 11010000, 00000111, 11111111

5. Binary Arithmetic

The four basic operations addition, subtraction, multiplication, and division are all easily performed with binary numbers. In fact they are actually easier to perform in binary because the addition and multiplication tables are so small. Here are basic addition facts in binary.

0 + 0 = 0

0 + 1 = 1

1 + 0 = 1

1 + 1 = 10 /* Result is 0, carry 1 */

0 x 0 = 0

0 x 1 = 0

1 x 0 = 0

1 x 1 = 1

Carrying and borrowing are also performed as in decimal arithmetic. To add the numbers 0x3a and

0x27, first convert to binary, and add from right to left, carrying 1 to the next column when the result is 10.

58 0x3a ---> 00111010

+39 +0x27 ---> 00100111

------

97 0x61 <--- 01100001

To subtract two binary numbers, borrow 1 when the bottom digit is 1 and the top digit is 0. For example, to compute 0x3a - 0x27,

58 0x3a ---> 00111010

-39 - 0x27 ---> 00100111

------

19 0x13 <--- 00010011

Binary multiplication is also similar to decimal multiplication. The top factor is used as a partial product for each binary 1 in the bottom factor. The top factor is shifted by as many places to the left as the power of 2 that the binary 1 represents. To multiply 0x1e by 0x06,

20 0x14 ---> 00010100

x 6 x 0x06 ---> 00000110

------

000101000 Shift by 1 bit

000101000 Shift by 2 bits

------

120 0x78 <--- 0001111000

Only the rightmost eight bits are retained in the answer. Any bits to the left of these 8 bits are called overflow bits. In case any of the overflow bits are nonzero, an overflow has occurred and the rightmost eight bits which are retained in the answer are invalid. Because integer arithmetic operations are not checked for overflow in the C language, it is possible to obtain an incorrect answer when two numbers are multiplied or added. For example, when we multiply 58 and 39 and we store the result as unsigned char (8 bits), we obtain the answer 214 when the actual answer is 2262. The following calculation shows why:

58 0x3a ---> 00111010

x 39 x 0x27 ---> 00100111

------

00111010

------

0100011010110 Retain eight low order bits

Only the rightmost eight bits 11010110 (hex 0xd6or 214 decimal) are retained.

Division is also performed as it is for decimal numbers. Unlike base 10 long division, no guesswork is involved to decide how many times the divisor goes into the trial dividend. Either it goes (quotient = 1) or it doesn't (quotient = 0). As an example, divide 213 (binary 1101010, hex 0xd5) by 9 (binary

00001001, hex 0x09).

10111 <-- quotient

------

1001 ) 11010101

10010000

------

1000101

100100

------

1111

1001

----

110 <-- remainder

The answer is 23 (binary 00010111, hex 0x17) with a remainder of 6 (binary 00000110).

Exercise

Convert the following hex numbers to binary, perform the following operations, and convert back.

0x45 + 0x3f, 0x5a + 0x74, 0xe2 - 0xaf, 0xa4 - 0x9c,

0x0a x 0x2f, 0x47 x 0x1b, 0xc4 / 0x2e, 0xbe / 0x19

5. Representation of Negative Numbers

To represent negative numbers, we use the analogy of an automobile odometer. Because an odometer display has only five decimal digits to the left of the decimal point, only numbers in the range 0 to 99999 can be shown even though the actual number of miles may be much greater. The number 0 represents 0, 100000, 200000, etc., but it can also represent the negative numbers -100000, -200000, etc. The number 99999 represents 199999, 299999, and also -1, -100001, -200001, etc. By adopting the convention that 1 to 99999 represent positive numbers and 50000 to 99999 represent negative numbers, all numbers from starting from -50000 to 49999 can be represented. In the case of 8 bit binary numbers, we let 00000000 through 01111111 represent positive numbers and 10000000 through 11111111 represent negative numbers. By this convention, if the leftmost bit is 0, the number is positive, and if the leftmost bit is 1, the number is negative. In this way, an 8 bit binary number can represent all integers between -128 and 127. Such a representation is called a signed integer. The following table shows how to represent negative numbers in binary.

Binary Hex Decimal

01111111 7f 127

01111110 7e 126

......

00000010 02 2

00000001 01 1

00000000 00 0

11111111 ff -1

11111110 fe -2

......

10000001 81 -127

10000000 80 -128

Converting a an eight bit binary positive number into the corresponding negative one involves subtracting it from 100000000. (The eight bit representation of 100000000 is 00000000.) For example, to convert 0x05 = 00000101 to a negative number, subtract it from 100000000 to obtain 11111011. This operation is called obtaining the two's complement of a binary number. In general the two's complement of a binary number is formed by subtracting it from 2n where n is the number of bits. Another way of performing the two's complement is to complement each digit (replace 0 by 1 and 1 by 0 ) and add 1 to the result. The complement of 00000101 is 11111010 and adding 1 produces 11111011. To verify that these numbers are actually negatives of each other, we can add them to see that the sum is 100000000 = 00000000 when truncated to eight bits.

The beauty of two's complement arithmetic is that no special rules are needed to accommodate negative numbers. Merely perform the operation and keep the rightmost eight bits. Here are some examples:

- 29 ---> 11100011

+ 42 ---> 00101010

------

13 <--- 00001101

-29 ---> 11100011

- (-42) ---> 11010110

------

13 <--- 00001101

(-6) ---> 11111010

x (-5) ---> 11111011

------

...111111111111010

...11111111111010

...111111111010

...11111111010

...1111111010

...111111010

...11111010

------retain

30 <--- ...000000000011110 8 bits

6. Bitwise Operations

In addition to the 4 basic operations discussed in Section 3, C also allows the following operations: bitwise and ( & ), bitwise or ( | ), bitwise exclusive or ( ^ ), and bitwise complement (~). These bitwise operations are defined by the following tables.

x / y / x&y
0 / 0 / 0
1 / 0 / 0
0 / 1 / 0
1 / 1 / 1
x / y / x | y
0 / 0 / 0
1 / 0 / 1
0 / 1 / 1
1 / 1 / 1
x / y / x ^ y
0 / 0 / 0
1 / 0 / 1
0 / 1 / 1
1 / 1 / 0
x / ! x
0 / 1
1 / 0

For eight bit binary numbers, these operations are performed on each column of bit pairs separately. For example, to compute 52 & 45, first convert to binary, perform the bitwise and 00110100 & 00101101 = 00100100, and convert back to decimal which gives 36. The bitwise and is useful for isolating certain bits in a binary number to use in further computations. In the implementation of MACHINE-16 which is discussed in Sections 7 to 9, an eight bit machine instruction can be broken down into two parts, the opcode which consists of bits 1 to 4, and the addressing mode which consists of bits 7 and 8. (Bits 5 and 6 will not be used in our implementation.)

To isolate the opcode, we can perform a bitwise and with the mask 0xf0 = 11110000. The bits 1 to 4 will remain unchanged, but the bits 5 to 8 will be changed to zero. This can be performed by the C statement opcode = instruction & 0xf 0; To obtain the addressing mode, use the statement

mode = instruction & 0x03;

because 0x03 = 00000011.

Exercises

1. Find the values of the following expressions:

0x75 & 0xfb, 0x3c | 0xe9, 0xfa ^ 0xa5, ~c7

2. In a hypothetical operating system, the bits 5 to 27 in the the 4 byte integer variable psw (processor status word) represent the state of the system. The 32 bits are numbered from 0 to 31 from left to right. Write a hex mask which will isolate these bits (Hint: First write the mask in binary).

7. Introduction to MACHINE-16

The MACHINE-16 is a simple hypothetical stored memory computer which has 64 bytes of memory and 16 instructions. Its capabilities are comparable to the earliest computers built in the 1930's, but it is thousands of times faster because it will be implemented on a modern computer. The memory can be thought of as an array of bytes (sometimes called cells) with addresses numbered 0x00 to 0x3f.

This memory array will be represented by the global array of char. We will refer to each byte as a memory cell which will be abbreviated as MC. Here is an example:

01010110 01001001 00101001 00101001 01001010 10010100 ...

MC 0 MC 1 MC 2 MC 3 MC 4 MC 5 ...

Memory cells with addresses 0x00 and 0x01 (referred to in the C program as mem[0] and mem[1]) each have a special purpose. The memory cell with address 0x00 is called the program counter (PC) and is used to hold the address of the instruction which is currently executing. (This will make sense later when you see what the instructions are.) The memory cell with address 0x01 is called the accumulator (AC) and is used to hold the result of an arithmetic operation.

A MACHINE-16 instruction consists of two bytes (four nibbles or 16 bits). The first nibble is the opcode and the second nibble is the addressing mode. The table below describes the 16 MACHINE-16 opcodes. In these descriptions, PC refers to the program counter, AC refers to the accumulator. The opcodes are referred to as 0x00, 0x10, 0x20 rather than 0x0, 0x1, 0x2b because the second nibble is cleared to zero using a bitmask (see Section 9).

Opcode / Mnemonic / Meaning
0x00 / STOP / exit(1);
0x10 / LOAD / AC = MC;
0x20 / STORE / MC = AC;
0x30 / CLEAR / MC = 0;
0x40 / INCR / MC++;
0x50 / DECR / MC--;
0x60 / ADD / AC += MC;
0x70 / SUBT / AC -= MC;
0x80 / MULT / AC *= MC;
0x90 / DIV / AC /= MC;
0xa0 / JUMP / Jump to address in MC
0xb0 / BGZ / If AC > 0 jump to
address in MC
0xc0 / PRIND / Print MC as int
0xd0 / PRINC / Print MC as char
0xe0 / SCAND / Read MC as int
0xf0 / SCANC / Read MC as char

The second byte (nibbles 3 and 4) of the instruction determines the memory cell (MC) which contains the information to be used by the instructions 0x00, 0x60, 0x70, 0x80, 0x90, 0xa0, 0xb0, 0xc0, and 0xd0. It can also determine the address where information is to be put (instructions 0x20, 0x30, 0xe0, 0xf0). In the preceding table of operands, the symbol = means assignment as in C++, ++ means increment, and -- means decrement. The symbols +=, -=, *=, and /= have the same meanings as the corresponding C++ assignment operators. We distinguish between assignment (A=5;) and initialization (char A=5;) in exactly the same way that the C++ language does.