CIS 258 Assembly Language Programming - Lecture Notes
Chapter 2: Introduction to the Assembler
An example assembly language program:
LABEL      OP-CODE  OPERAND      COMMENT
BACK_SP    EQU      $08          ASCII code for backspace
DELETE     EQU      $01          ASCII code for delete
CAR_RET    EQU      $0D          ASCII code for carriage return
           ORG      $400         Data origin
LINE       DS.B     64           Reserve 64 bytes for line buffer
*
* Input a character and store it in a buffer
*
           ORG      $001000      Program origin
           LEA      LINE,A2      A2 points to line buffer
NEXT       BSR      GET_DATA     Call subroutine to get input
           CMP.B    #BACK_SP,D1  Test for back space
           BEQ      MOVE_LEFT    If back space then deal with it
           CMP.B    #DELETE,D1   Test for delete
           BEQ      CANCEL       If delete then deal with it
           CMP.B    #CAR_RET,D1  Test for carriage return
           BEQ      EXIT         If carriage return then exit
           MOVE.B   D1,(A2)+     Else store input in memory
           BRA      NEXT         Repeat
MOVE_LEFT  LEA      -1(A2),A2    Move buffer pointer back
           BRA      NEXT         and continue
CANCEL     LEA      LINE,A2      Reset the buffer pointer
           BRA      NEXT         and continue
GET_DATA   MOVE     #5,D0        Read a byte into D1
           TRAP     #15
EXIT       RTS
           END      $1000
An assembly language program is written in four columns (or fields):
- Label - A label must begin in column one. The label must consist of only letters, numbers and underscore characters (no spaces). The maximum size of the label is determined by the assembler. For readability purposes keep labels a reasonable size. If the first character is an asterisk '*' then the entire line is treated as a comment.
- Op-code - The operation code column contains either an instruction for the microprocessor or an assembler directive.
- Operand - The operand field contains additional information needed by the op-code or assembler directive.
- Comment - An optional comment field ignored by the assembler.
Typically an instruction must be contained on one line. Additional lines may be used for comments if necessary by preceding them with an asterisk '*'. The formatting of the source code is entirely up to the programmer but for readability purposes it is best to stick with the formatting used in this example.
Dollar signs '$' are used to prefix hexadecimal numbers. Percent signs '%' are used to prefix binary numbers. A number with no prefix is treated as a decimal value. Enclosing text in single quotes 'ABC' converts the characters into ASCII codes. The pound sign '#' is used to indicate literal values.
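As a small illustration of these prefixes, the following sketch loads the same value, 65, into D0 in four different ways and then contrasts a literal with a memory reference. It assumes the assembler accepts a quoted character as an immediate operand, which is common but worth confirming for the assembler in use.

```asm
        MOVE.B  #$41,D0         Hexadecimal 41 = decimal 65
        MOVE.B  #65,D0          Decimal 65 (no prefix)
        MOVE.B  #%01000001,D0   Binary 01000001 = decimal 65
        MOVE.B  #'A',D0         ASCII code for 'A' = decimal 65
        MOVE.B  $41,D0          No '#': read the byte stored AT address $41
```

The first four lines put the number itself into D0. The last line, without the '#', treats $41 as a memory address and copies whatever byte is stored there.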
Most assemblers are not case sensitive; that is, label and LABEL would be treated as the same word. However, this is entirely up to the creators of the assembler.
Most assemblers generate a formatted program listing that may look like the following:
Source file: CHP2.X68
Assembled on: 02-01-24 at: 09:20:43
by: X68K PC-2.1 Copyright (c) University of Teesside 1989,93
Defaults: ORG $0/FORMAT/OPT A,BRL,CEX,CL,FRL,MC,MD,NOMEX,NOPCO
 1  00000008            BACK_SP:    EQU     $08          ;ASCII code for backspace
 2  00000001            DELETE:     EQU     $01          ;ASCII code for delete
 3  0000000D            CAR_RET:    EQU     $0D          ;ASCII code for carriage return
 4  00000400                        ORG     $400         ;Data origin
 5  00000400  00000040  LINE:       DS.B    64           ;Reserve 64 bytes for line buffer
 6                      *
 7                      * Input a character and store it in a buffer
 8                      *
 9  00001000                        ORG     $001000      ;Program origin
10  00001000  45F80400              LEA     LINE,A2      ;A2 points to line buffer
11  00001004  6100002A  NEXT:       BSR     GET_DATA     ;Call subroutine to get input
12  00001008  0C010008              CMP.B   #BACK_SP,D1  ;Test for back space
13  0000100C  67000016              BEQ     MOVE_LEFT    ;If back space then deal with it
14  00001010  0C010001              CMP.B   #DELETE,D1   ;Test for delete
15  00001014  67000014              BEQ     CANCEL       ;If delete then deal with it
16  00001018  0C01000D              CMP.B   #CAR_RET,D1  ;Test for carriage return
17  0000101C  67000018              BEQ     EXIT         ;If carriage return then exit
18  00001020  14C1                  MOVE.B  D1,(A2)+     ;Else store input in memory
19  00001022  60E0                  BRA     NEXT         ;Repeat
20  00001024  45EAFFFF  MOVE_LEFT:  LEA     -1(A2),A2    ;Move buffer pointer back
21  00001028  60DA                  BRA     NEXT         ;and continue
22  0000102A  45F80400  CANCEL:     LEA     LINE,A2      ;Reset the buffer pointer
23  0000102E  60D4                  BRA     NEXT         ;and continue
24  00001030  303C0005  GET_DATA:   MOVE    #5,D0        ;Read a byte into D1
25  00001034  4E4F                  TRAP    #15
26  00001036  4E75      EXIT:       RTS
27  00001000                        END     $1000
Lines: 27, Errors: 0, Warnings: 0.
The first column in the above listing is a line number (used for reference only). The second column is the memory address (in hex). The third column is the data stored there (only the first eight digits are shown). This data may be CPU instructions or program data. The remaining columns are the formatted program listing. Notice that a colon has been added to the end of each label and a semicolon has been added to the start of each comment.
Assembler Directives
Some of the words used in an assembly language program are directives to the assembler. Some of the directives are:
EQU - The equate directive links a name (label) to a value, similar to defining a constant in C++. In the previous example, the line
BACK_SP    EQU      $08          ASCII code for backspace
equates BACK_SP with the hex value $08, which is the ASCII code for backspace. In the program we can now use the word BACK_SP to represent $08. This makes our program much easier to read and modify. Once a name has been equated to a value, it may in turn be used in other equate directives, for example:
FRAME      EQU      128
FRAME2     EQU      FRAME+16
Length     EQU      30
Width      EQU      25
Area       EQU      Length*Width
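To illustrate how equated names are then used, the sketch below reuses Length, Width and Area from the example above. EQU reserves no memory; the assembler simply substitutes the value wherever the name appears, so the two MOVE instructions should assemble to the same machine code.

```asm
Length  EQU     30
Width   EQU     25
Area    EQU     Length*Width    Evaluated by the assembler: 30*25 = 750
        MOVE.W  #Area,D0        Same as the next line
        MOVE.W  #750,D0         Put the number 750 into D0
```

If Length or Width is later changed, Area and every instruction that uses it are updated automatically the next time the program is assembled.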
DC - The Define Constant directive should not be confused with declaring a constant in C++. The DC directive instructs the assembler to place the following values into memory at the current location. The directive has three forms: DC.B for byte data, DC.W for word (16-bit) data, and DC.L for long (32-bit) data. For example:
           ORG      $1000           Start of the data region
First      DC.B     10,66           The values 10 and 66 are stored in consecutive bytes
           DC.L     $0A1234         The value $000A1234 is stored as one long word
Date       DC.B     'April 8 1985'  The ASCII characters are stored as 12 bytes
           DC.L     1,2             Two long words are set up with the values 1 and 2
The 68000 microprocessor requires that word and long word values be stored at even memory addresses, and the assembler will adjust the memory locations accordingly. The book implies that long word data must fall on a long word address boundary; however, this is not true, since an even (word) address is sufficient. Beginning with the 68020 and continuing in later members of the 68000 family, this word boundary restriction was removed.
The result of the above code would be:
001000:  0A 42                                DC.B 10,66
001002:  00 0A 12 34                          DC.L $0A1234
001006:  41 70 72 69 6C 20 38 20 31 39 38 35  DC.B 'April 8 1985'
001012:  00 00 00 01 00 00 00 02              DC.L 1,2
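In the example above every item happens to end on an even address, so no adjustment is needed. The even-address rule becomes visible when a DC.B holding an odd number of bytes is followed by a DC.W. The following is a sketch that assumes the assembler pads automatically, as described earlier; the labels and the address $2000 are chosen only for illustration, and the value left in the skipped byte depends on the assembler.

```asm
        ORG     $2000
BYTES   DC.B    1,2,3           Stored at $2000, $2001 and $2002
WORD1   DC.W    $1234           $2003 is skipped; the word starts at $2004
```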
DS - The Define Storage directive reserves the specified amount of memory at the current location. DS is also qualified by .B, .W, or .L. Unlike DC, no data is stored in the reserved memory. The assembler will force DS.W and DS.L locations to start in even memory addresses.
TABLE      DS.W     256          Reserve 256 words for TABLE
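Because DS only reserves space, a program should give the location a value before reading it. A minimal sketch, using a hypothetical variable named COUNT:

```asm
        ORG     $400
COUNT   DS.W    1               Reserve one word; its contents are undefined
        ORG     $1000
        MOVE.W  #0,COUNT        Set COUNT to zero at run time
        ADD.W   #1,COUNT        COUNT now holds 1
```

Had COUNT been declared with DC.W 0 instead, the zero would be placed in memory when the program is loaded rather than when it runs.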
ORG - The origin directive sets the location counter, which keeps track of where the next item is to be located in memory. The operand following ORG is the absolute value of the origin.
           ORG      $400         Origin for data
END - The end directive tells the assembler that the end of a program has been reached. Most assemblers do not require an operand for END. However, the Teesside 68000 cross-assembler supplied with the text requires a single parameter that specifies the starting point of the program. Assuming the starting point of the program was $1000, the END directive would be:
           END      $1000
Some Examples of Assembly Language Instructions
MOVE  D0,NUM     Copy the contents of register D0 to memory location NUM
MOVE  NUM,D0     Copy the contents of memory location NUM to register D0
MOVE  $400,D0    Copy the contents of memory location $400 to register D0
MOVE  #$400,D0   Put the number $400 into register D0
ADD   NUM,D0     Add the contents of memory location NUM to register D0
ADD   D0,NUM     Add the contents of register D0 to memory location NUM
CMP   NUM,D0     Compare register D0 to memory location NUM
CMP   #10,D0     Compare register D0 to the number 10 (decimal)
CMP   #$10,D0    Compare register D0 to the number 10 hexadecimal (16 decimal)
BEQ   NEXT       Branch on EQual to label NEXT (if Z = 1 in the CCR)
BRA   NEXT       Branch Always to label NEXT
SUB   NUM,D0     Subtract the contents of location NUM from register D0
SUB   #5,D0      Subtract 5 from register D0
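The instructions above can be combined into a short program. The following is a sketch only: the labels NUM, RESULT and DONE and the data value 12 are made up for illustration, and STOP #$2700 is assumed to be an acceptable way to halt in the simulator being used.

```asm
        ORG     $400            Data origin
NUM     DC.W    12              Initial value to test
RESULT  DS.W    1               Reserve one word for the answer
        ORG     $1000           Program origin
        MOVE    NUM,D0          Copy the contents of NUM to D0
        CMP     #10,D0          Compare D0 to the number 10
        BEQ     DONE            If D0 = 10 then skip the subtraction
        SUB     #5,D0           Else subtract 5 from D0
DONE    MOVE    D0,RESULT       Store D0 in RESULT
        STOP    #$2700          Halt the processor
        END     $1000
```

With NUM set to 12 the branch is not taken and RESULT receives 7. With NUM set to 10 the branch is taken and RESULT receives 10.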