PROJECT 3

COMPILING TO AN INTERMEDIATE LANGUAGE

Write a compiler that compiles programs written in the language D# described below.

The first statement in a D# program should be: main;

and the last statement should be: end main;

The first statement above should be followed by declarations for all the variables that the program employs. These should have the same form as declarations for variables of type “int” in C, but without initial values specified, e.g.:

int x3, y2;

int dd2;

The “main”, “end main” and “int” statements should not occur anywhere else in the program.

The rest of the program should consist of statements of the following form:

input identifier; (this should produce the prompt: Enter a number)

output expression;

identifier = expression;

The expressions referred to above should consist of an identifier, a number, or a sum of identifiers and numbers.

Here is an example of a program written in the D# language which I will call “test3”:

main;

int xx;

int kk44, n;

input kk44;

kk44 = kk44 + xx +2;

output kk44 + n;

end main;

The name of the output file should be constructed from that of the input source file. Thus, assuming your compiler is called “compiler3”, if the user source program is called e.g. “aaa”, then the DOS command

compiler3 aaa

should produce a Masm program called aaa.asm

while if the user source program is called “bbb”, then the command

compiler3 bbb

should produce a Masm program called bbb.asm.
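The naming rule above can be sketched in C as follows. This is only one possible approach: the helper name make_asm_name is hypothetical, and the sketch assumes a C library that provides snprintf (on an older DOS compiler, strcpy followed by strcat would do the same job).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<base>.asm" from the source name given on
   the command line. Returns 0 on success, or -1 if the result would not
   fit in the buffer out of size out_size. */
int make_asm_name(const char *base, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s.asm", base);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

In the main program one might call make_asm_name(argv[1], name, sizeof name) and then open the output file with ff = fopen(name, "w").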

We have a lot of other work to cover in the semester, so we will require only that the above compiler be written in the simplest possible way. Normally one would employ a symbol table, as was done in Assignment 3, and have the compiler pinpoint syntactic errors, as in Project 2, but we will omit these requirements for this project (though they will be required in a later, more comprehensive project).

Also, for this project only, our code will employ only register ax for arithmetic.

Here is a suggested grammar. You can change the grammar if you wish, but not the language to be implemented.

prog → Main ; declaration_list code_start statement_list End Main ;

declaration_list → declaration_list declaration | declaration

declaration → Int identifier_list ;

identifier_list → identifier_list , Identifier | Identifier

code_start →

statement_list → statement_list statement | statement

statement → assignment_statement | input_statement | output_statement

assignment_statement → Identifier = expression ;

input_statement → Input Identifier ;

output_statement → Output expression ;

expression → expression + primary | primary

primary → Identifier | Number

Here are some suggestions on how to generate the code (assuming that the file variable allocated to your output file is called e.g. “ff”):

Define YYSTYPE in both the Lex and Yacc definition files as follows:

typedef struct

{char symbol[20];} yystype;

#define YYSTYPE yystype

In your Lex definition file, when an identifier or number (i.e. integer) is matched, push yytext onto the symbol stack, using code such as:

strcpy(yylval.symbol, yytext);

and then return Identifier or Number as the case may be.
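Because the symbol field holds only 20 bytes, a plain strcpy would overflow it for a lexeme of 20 or more characters. A minimal sketch of a bounded copy is given below; the helper name set_symbol is hypothetical, and silently truncating long names is an assumption rather than a requirement of the project.

```c
#include <string.h>

typedef struct { char symbol[20]; } yystype;

/* Hypothetical helper: copy a lexeme into the 20-byte symbol field,
   truncating if necessary so the field is always NUL-terminated. */
void set_symbol(yystype *v, const char *text)
{
    strncpy(v->symbol, text, sizeof v->symbol - 1);
    v->symbol[sizeof v->symbol - 1] = '\0';
}
```

In a Lex action this could be used as set_symbol(&yylval, yytext); followed by the usual return of Identifier or Number.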

In your main program, within the Yacc definition file, before calling yyparse(), generate onto the output file:

.model small

extrn getdec:near, putdec:near

.586

.stack 100h

.data

and after calling yyparse(), generate onto the output file:

mov ah, 4ch

int 21h

main endp

end main
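One way to organise the fixed header and trailer text is to place each in its own function, called before and after yyparse() respectively. The function names emit_prologue and emit_epilogue are hypothetical, and the indentation of the instructions is a stylistic choice; the emitted lines are exactly those listed above.

```c
#include <stdio.h>

/* Header lines, written before yyparse() is called. */
void emit_prologue(FILE *ff)
{
    fputs(".model small\n", ff);
    fputs("extrn getdec:near, putdec:near\n", ff);
    fputs(".586\n", ff);
    fputs(".stack 100h\n", ff);
    fputs(".data\n", ff);
}

/* Trailer lines, written after yyparse() returns. */
void emit_epilogue(FILE *ff)
{
    fputs("    mov ah, 4ch\n", ff);
    fputs("    int 21h\n", ff);
    fputs("main endp\n", ff);
    fputs("end main\n", ff);
}
```

The main program would then read, in outline: emit_prologue(ff); yyparse(); emit_epilogue(ff); fclose(ff);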

For the production

identifier_list → identifier_list , Identifier

generate a declaration for the identifier (the alternative identifier_list → Identifier needs the same action, using $1 in place of $3), with code such as:

fprintf(ff, "%s%s\n", $3.symbol, " dw ?");
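To illustrate what this action writes, the sketch below wraps the same fprintf in a hypothetical helper, emit_declaration, so that it can be tried outside the parser. For the declaration int x3, y2; it would be called once per identifier and would produce the lines "x3 dw ?" and "y2 dw ?" in the data segment.

```c
#include <stdio.h>

/* Hypothetical helper: emit one word-sized, uninitialised variable,
   in the same form as the Yacc action shown above. */
void emit_declaration(FILE *ff, const char *name)
{
    fprintf(ff, "%s%s\n", name, " dw ?");
}
```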

For the production

code_start →

generate:

.code

main proc

mov ax, @data

mov ds, ax

Whenever code is generated to put something into register ax, let the associated entry on the symbol stack be changed to “ax”. For instance, the code for:

expression → expression + primary

could be of the following form:

if (strcmp($1.symbol, "ax") != 0){

fprintf(ff, "%s%s\n", " mov ax,", $1.symbol);

strcpy($$.symbol, "ax");

}

fprintf(ff, "%s%s\n", " add ax,", $3.symbol);
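The effect of this action can be traced outside the parser with a small stand-alone sketch. The function gen_add below is hypothetical: its parameters result, left and right stand in for $$, $1 and $3, and it sets the result to "ax" unconditionally, which gives the same outcome as the action above when $$ starts out as a copy of $1.

```c
#include <stdio.h>
#include <string.h>

typedef struct { char symbol[20]; } yystype;

/* Hypothetical stand-in for the action of: expression -> expression + primary.
   left plays the role of $1, right of $3, and result of $$. */
void gen_add(FILE *ff, yystype *result, const yystype *left, const yystype *right)
{
    if (strcmp(left->symbol, "ax") != 0)
        fprintf(ff, "    mov ax, %s\n", left->symbol);  /* load left operand */
    fprintf(ff, "    add ax, %s\n", right->symbol);     /* add right operand */
    strcpy(result->symbol, "ax");                       /* the sum is now in ax */
}
```

For the expression kk44 + xx + 2 from test3, the first reduction emits mov ax, kk44 and add ax, xx; the second finds its left operand already marked “ax” and emits only add ax, 2.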

Call your compiler Dsharp.exe, and employ the following batch file (i.e. script file), named b.bat:

Dsharp %1

Masm %1.asm

%1

Then to compile and execute a program in the D# language such as test3 given above, employ:

b test3

If all goes well, the first line of this batch file will in this case create test3.asm.

The second line will create test3.exe, and the last line will execute test3.exe.