CIS 258 Assembly Language Programming - Lecture Notes

Chapter 2: Introduction to the Assembler

An example assembly language program:

LABEL      OP-CODE OPERAND         COMMENT

BACK_SP    EQU     $08             ASCII code for backspace
DELETE     EQU     $01             ASCII code for delete
CAR_RET    EQU     $0D             ASCII code for carriage return
           ORG     $400            Data origin
LINE       DS.B    64              Reserve 64 bytes for line buffer
*
* Input a character and store it in a buffer
*
           ORG     $001000         Program origin
           LEA     LINE,A2         A2 points to line buffer
NEXT       BSR     GET_DATA        Call subroutine to get input
           CMP.B   #BACK_SP,D1     Test for back space
           BEQ     MOVE_LEFT       If back space then deal with it
           CMP.B   #DELETE,D1      Test for delete
           BEQ     CANCEL          If delete then deal with it
           CMP.B   #CAR_RET,D1     Test for carriage return
           BEQ     EXIT            If carriage return then exit
           MOVE.B  D1,(A2)+        Else store input in memory
           BRA     NEXT            Repeat
MOVE_LEFT  LEA     -1(A2),A2       Move buffer pointer back
           BRA     NEXT            and continue
CANCEL     LEA     LINE,A2         Reset the buffer pointer
           BRA     NEXT            and continue
GET_DATA   MOVE    #5,D0           Read a byte into D1
           TRAP    #15
EXIT       RTS
           END     $1000

An assembly language program is written in four columns (or fields):

  • Label - A label must begin in column one. The label must consist of only letters, numbers and underscore characters (no spaces). The maximum size of the label is determined by the assembler. For readability purposes keep labels a reasonable size. If the first character is an asterisk '*' then the entire line is treated as a comment.
  • Op-code - The operation code column contains either an instruction for the microprocessor or an assembler directive.
  • Operand - The operand field contains additional information needed by the op-code or assembler directive.
  • Comment - An optional comment field ignored by the assembler.

Typically an instruction must be contained on one line. Additional lines may be used for comments if necessary by preceding them with an asterisk '*'. The formatting of the source code is entirely up to the programmer but for readability purposes it is best to stick with the formatting used in this example.

Dollar signs '$' are used to prefix hexadecimal numbers. Percent signs '%' are used to prefix binary numbers. A number with no prefix is treated as a decimal value. Enclosing text in single quotes 'ABC' converts the characters into ASCII codes. The pound sign '#' is used to indicate literal values.
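As a short illustration of these prefixes, the four lines below all place the same byte value, decimal 65, into register D0; only the notation of the literal differs. This is a sketch for comparison and is not part of the example program.

```asm
           MOVE.B  #65,D0          Decimal literal
           MOVE.B  #$41,D0         Hexadecimal literal ($41 = 65)
           MOVE.B  #%01000001,D0   Binary literal (%01000001 = 65)
           MOVE.B  #'A',D0         ASCII literal (the code for 'A' is $41)
```

Dropping the '#' changes the meaning: MOVE.B $41,D0 would read the byte stored at memory address $41 instead of loading the number $41.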

Most assemblers are not case sensitive; that is, label and LABEL are treated as the same word. However, this is entirely up to the creators of the assembler.

Most assemblers generate a formatted program listing that may look like the following:

Source file: CHP2.X68

Assembled on: 02-01-24 at: 09:20:43

by: X68K PC-2.1 Copyright (c) University of Teesside 1989,93

Defaults: ORG $0/FORMAT/OPT A,BRL,CEX,CL,FRL,MC,MD,NOMEX,NOPCO

 1  00000008            BACK_SP:   EQU     $08             ;ASCII code for backspace
 2  00000001            DELETE:    EQU     $01             ;ASCII code for delete
 3  0000000D            CAR_RET:   EQU     $0D             ;ASCII code for carriage return
 4  00000400                       ORG     $400            ;Data origin
 5  00000400  00000040  LINE:      DS.B    64              ;Reserve 64 bytes for line buffer
 6                      *
 7                      * Input a character and store it in a buffer
 8                      *
 9  00001000                       ORG     $001000         ;Program origin
10  00001000  45F80400             LEA     LINE,A2         ;A2 points to line buffer
11  00001004  6100002A  NEXT:      BSR     GET_DATA        ;Call subroutine to get input
12  00001008  0C010008             CMP.B   #BACK_SP,D1     ;Test for back space
13  0000100C  67000016             BEQ     MOVE_LEFT       ;If back space then deal with it
14  00001010  0C010001             CMP.B   #DELETE,D1      ;Test for delete
15  00001014  67000014             BEQ     CANCEL          ;If delete then deal with it
16  00001018  0C01000D             CMP.B   #CAR_RET,D1     ;Test for carriage return
17  0000101C  67000018             BEQ     EXIT            ;If carriage return then exit
18  00001020  14C1                 MOVE.B  D1,(A2)+        ;Else store input in memory
19  00001022  60E0                 BRA     NEXT            ;Repeat
20  00001024  45EAFFFF  MOVE_LEFT: LEA     -1(A2),A2       ;Move buffer pointer back
21  00001028  60DA                 BRA     NEXT            ;and continue
22  0000102A  45F80400  CANCEL:    LEA     LINE,A2         ;Reset the buffer pointer
23  0000102E  60D4                 BRA     NEXT            ;and continue
24  00001030  303C0005  GET_DATA:  MOVE    #5,D0           ;Read a byte into D1
25  00001034  4E4F                 TRAP    #15
26  00001036  4E75      EXIT:      RTS
27  00001000                       END     $1000

Lines: 27, Errors: 0, Warnings: 0.

The first column in the above listing is a line number (used for reference only). The second column is the memory address (in hex). The third column is the data stored there (only the first eight hex digits are shown). This data may be CPU instructions or program data. The remaining columns are the formatted program listing. Notice that a colon has been added to the end of each label and a semicolon to the start of each comment.

Assembler Directives

Some of the words used in an assembly language program are directives to the assembler. Some of the directives are:

EQU - The equate directive links a name (label) to a value, similar to defining a constant in C++. In the previous example, the line

BACK_SP    EQU     $08             ASCII code for backspace

equates BACK_SP with the hex value $08, which is the ASCII code for backspace. In the program we can now use the name BACK_SP to represent $08. This makes the program much easier to read and modify. Once a name has been equated to a value, it may in turn be used in other equate directives. For example:

FRAME      EQU     128
FRAME2     EQU     FRAME+16

Length     EQU     30
Width      EQU     25
Area       EQU     Length*Width
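To show an equated expression in use, here is a minimal sketch that repeats the three equates and then uses Area as a literal. The MOVE.W line is an addition for illustration. EQU itself reserves no memory; the assembler simply substitutes the value wherever the name appears.

```asm
Length     EQU     30
Width      EQU     25
Area       EQU     Length*Width    Evaluated by the assembler: 30*25 = 750
           MOVE.W  #Area,D0        Assembles the same as MOVE.W #750,D0
```

If Length or Width is changed later, Area and every instruction that uses it pick up the new value the next time the program is assembled.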

DC - The Define Constant directive should not be confused with declaring a constant in C++. The DC directive instructs the assembler to place the values that follow it into memory at the current location. The directive has three forms: DC.B for byte data, DC.W for word (16-bit) data, and DC.L for long word (32-bit) data. For example:

           ORG     $1000           Start of the data region
First      DC.B    10,66           The values 10 and 66 are stored in consecutive bytes
           DC.L    $0A1234         The value $000A1234 is stored as one long word
Date       DC.B    'April 8 1985'  The ASCII characters are stored as 12 bytes
           DC.L    1,2             Two long words are set up with the values 1 and 2

The 68000 microprocessor requires that word and long word values be stored at even memory addresses, and the assembler adjusts memory locations accordingly. The book implies that long word data must fall on a long word address boundary, but this is not true: an even address is sufficient. Beginning with the 68020, later members of the 68000 family removed this word boundary restriction.

The result of the above code would be:

001000:  0A 42                                  DC.B 10,66
001002:  00 0A 12 34                            DC.L $0A1234
001006:  41 70 72 69 6C 20 38 20 31 39 38 35    DC.B 'April 8 1985'
001012:  00 00 00 01 00 00 00 02                DC.L 1,2
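In the result above every word and long word already happens to start at an even address, so no adjustment is visible. As a minimal sketch of the adjustment, consider an odd number of bytes followed by a word. The labels ODD and VALUE and the origin $2000 are made up for illustration, and what the assembler leaves in the skipped byte is an assumption that depends on the assembler.

```asm
           ORG     $2000
ODD        DC.B    1,2,3           Three bytes at $2000, $2001 and $2002
VALUE      DC.W    $1234           $2003 is odd, so the word is placed at $2004
```

The label VALUE therefore has the value $2004, not $2003.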

DS - The Define Storage directive reserves the specified amount of memory at the current location. DS is also qualified by .B, .W, or .L. Unlike DC, no data is stored in the reserved memory. The assembler will force DS.W and DS.L locations to start in even memory addresses.

TABLE      DS.W    256             Reserve 256 words for TABLE
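The amount of memory reserved is the operand multiplied by the size of the data type. The sketch below places TABLE between two other reservations to show how the addresses advance; the labels COUNT and FLAG and the origin $400 are assumptions added for illustration.

```asm
           ORG     $400
COUNT      DS.W    1               1 word = 2 bytes, $400-$401
TABLE      DS.W    256             256 words = 512 bytes, $402-$601
FLAG       DS.B    1               Next free byte, $602
```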

ORG - The origin directive sets the location counter, which keeps track of where the next item is to be located in memory. The operand following ORG is the absolute value of the origin.

           ORG     $400            Origin for data

END - The end directive tells the assembler that the end of the program has been reached. Most assemblers do not require an operand for END. However, the Teesside 68000 cross-assembler supplied with the text requires a single operand that specifies the starting point of the program. Assuming the program starts at $1000, the END directive would be:

           END     $1000

Some Examples of Assembly Language Instructions

MOVE  D0,NUM      Copy the contents of register D0 to memory location NUM
MOVE  NUM,D0      Copy the contents of memory location NUM to register D0
MOVE  $400,D0     Copy the contents of memory location $400 to register D0
MOVE  #$400,D0    Put the number $400 into register D0
ADD   NUM,D0      Add the contents of memory location NUM to register D0
ADD   D0,NUM      Add the contents of register D0 to memory location NUM
CMP   NUM,D0      Compare register D0 to memory location NUM
CMP   #10,D0      Compare register D0 to the decimal number 10
CMP   #$10,D0     Compare register D0 to the hexadecimal number 10 (decimal 16)
BEQ   NEXT        Branch on EQual to label NEXT (taken if Z = 1 in the CCR)
BRA   NEXT        Branch Always to label NEXT
SUB   NUM,D0      Subtract the contents of location NUM from register D0
SUB   #5,D0       Subtract 5 from register D0
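The following minimal sketch combines several of these instructions into one short program. The data value 7, the label SAME, and the use of STOP #$2700 to halt are assumptions made for illustration; how a program should end depends on the simulator or system it runs on.

```asm
           ORG     $400
NUM        DC.W    7               Word variable with initial value 7
*
           ORG     $1000
           MOVE.W  NUM,D0          D0 = 7, the contents of NUM
           ADD.W   #3,D0           D0 = 7 + 3 = 10
           CMP.W   #10,D0          D0 equals 10, so Z is set; D0 is unchanged
           BEQ     SAME            Z = 1, so the branch is taken
           SUB.W   #1,D0           Skipped in this run
SAME       MOVE.W  D0,NUM          NUM now holds 10
           STOP    #$2700          Halt (assumed way to end the run)
           END     $1000
```

With a different initial value in NUM the comparison would fail, the branch would fall through, and the SUB instruction would execute before the result is stored.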