PROJECT 3
COMPILING TO AN INTERMEDIATE LANGUAGE
Write a compiler which compiles programs written in the language D# described below.
The first statement in a D# program should be: main;
and the last statement should be: end main;
The first statement should be followed by declarations for all the variables that the program employs. These should have the same form as declarations for variables of type “int” in C, but without initial values specified, e.g.:
int x3, y2;
int dd2;
The “main”, “end main” and “int” statements should not occur anywhere else in the program.
The rest of the program should consist of statements of the following form:
input identifier; (this should produce the prompt: Enter a number)
output expression;
identifier = expression;
The expressions referred to above should consist of an identifier, a number, or a sum of identifiers and numbers.
Here is an example of a program written in the D# language which I will call “test3”:
main;
int xx;
int kk44, n;
input kk44;
kk44 = kk44 + xx +2;
output kk44 + n;
end main;
The name of the output file should be constructed from the name of the input source file. Thus, assuming your compiler is called “compiler3”, if the user source program is called e.g. “aaa”, then the DOS command
compiler3 aaa
should produce a Masm program called aaa.asm
while if the user source program is called “bbb”, then the command
compiler3 bbb
should produce a Masm program called bbb.asm.
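As a minimal sketch of the naming rule above, the output name can be built by appending ".asm" to the source name taken from the command line. The helper name make_output_name and its return convention are assumptions made for illustration; they are not part of the assignment.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "<source>.asm" into out.
   Returns 0 on success, -1 if out is too small to hold the name. */
int make_output_name(const char *source, char *out, size_t out_size)
{
    assert(source != NULL && out != NULL);
    int needed = snprintf(out, out_size, "%s.asm", source);
    return (needed >= 0 && (size_t)needed < out_size) ? 0 : -1;
}
```

In the main program this could be called as make_output_name(argv[1], name, sizeof name) before opening the output file with fopen(name, "w").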
We have a lot of other work to cover in the semester, so we will require only that the above compiler be written in the simplest possible way. Normally one would employ a symbol table, as was done in Assignment 3, and have the compiler pinpoint syntactic errors, as in Project 2, but we will omit these requirements for this project (though they will be required in a later, more comprehensive project).
Also, for this project only, our code will employ only register ax for arithmetic.
Here is a suggested grammar. You can change the grammar if you wish, but not the language to be implemented.
prog → Main ; declaration_list code_start statement_list End Main ;
declaration_list → declaration_list declaration | declaration
declaration → Int identifier_list ;
identifier_list → identifier_list , Identifier | Identifier
code_start →
statement_list → statement_list statement | statement
statement → assignment_statement | input_statement | output_statement
assignment_statement → Identifier = expression ;
input_statement → Input Identifier ;
output_statement → Output expression ;
expression → expression + primary | primary
primary → Identifier | Number
Here are some suggestions on how to generate the code (assuming that the file variable allocated to your output file is called e.g. “ff”):
Define YYSTYPE in both the Lex and Yacc definition files as follows:
typedef struct
{char symbol[20];} yystype;
#define YYSTYPE yystype
In your Lex definition file, when an identifier or number (i.e. integer) is matched, push yytext onto the symbol stack, using code such as:
strcpy(yylval.symbol, yytext);
and then return Identifier or Number as the case may be.
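Note that strcpy will write past the 20-character symbol array if a lexeme is 20 characters or longer. The following is a sketch of a bounded copy that could be used instead; the helper name set_symbol and its truncation behaviour are assumptions for illustration, not something the assignment requires.

```c
#include <assert.h>
#include <string.h>

typedef struct
{char symbol[20];} yystype;

/* Hypothetical helper: copies at most 19 characters of text into
   val->symbol and always terminates the string.
   Returns 1 if the whole lexeme fit, 0 if it was truncated. */
int set_symbol(yystype *val, const char *text)
{
    assert(val != NULL && text != NULL);
    size_t len = strlen(text);
    size_t max = sizeof val->symbol - 1;   /* leave room for '\0' */
    size_t n = len < max ? len : max;
    memcpy(val->symbol, text, n);
    val->symbol[n] = '\0';
    return len <= max;
}
```

In a Lex action this would be used as set_symbol(&yylval, yytext); followed by the usual return of Identifier or Number.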
In your main program, within the Yacc definition file, before calling yyparse(),
generate onto the output file:
.model small
extrn getdec:near, putdec:near
.586
.stack 100h
.data
and after calling yyparse(), generate onto the output file:
mov ah, 4ch
int 21h
main endp
end main
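One way to organise these two steps is to wrap the fixed lines in a pair of functions that take the output file as an argument. This is only a sketch: the function names emit_prologue and emit_epilogue are hypothetical, and the leading space on the instruction lines is a layout choice, not a requirement.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the lines required before yyparse(). */
void emit_prologue(FILE *ff)
{
    assert(ff != NULL);
    fprintf(ff, ".model small\n");
    fprintf(ff, "extrn getdec:near, putdec:near\n");
    fprintf(ff, ".586\n");
    fprintf(ff, ".stack 100h\n");
    fprintf(ff, ".data\n");
}

/* Hypothetical helper: writes the lines required after yyparse(). */
void emit_epilogue(FILE *ff)
{
    assert(ff != NULL);
    fprintf(ff, " mov ah, 4ch\n");
    fprintf(ff, " int 21h\n");
    fprintf(ff, "main endp\n");
    fprintf(ff, "end main\n");
}
```

The main program would then call emit_prologue(ff), yyparse() and emit_epilogue(ff) in that order, and close ff afterwards.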
For the production
identifier_list → identifier_list , Identifier
generate a declaration for the identifier, with code such as:
fprintf(ff, "%s%s\n", $3.symbol, " dw ?");
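The other alternative of this rule, where identifier_list is a single Identifier, presumably needs the same declaration, with $1.symbol in place of $3.symbol; otherwise the first name in each int statement would not be declared. A small sketch of a shared helper follows; the name emit_declaration is an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one uninitialised word declaration,
   e.g. "kk44 dw ?", for the given identifier. */
void emit_declaration(FILE *ff, const char *name)
{
    assert(ff != NULL && name != NULL);
    fprintf(ff, "%s dw ?\n", name);
}
```

Both alternatives could then call it, as emit_declaration(ff, $3.symbol) and emit_declaration(ff, $1.symbol) respectively.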
For the production
code_start →
generate:
.code
main proc
mov ax, @data
mov ds, ax
Whenever code is generated to put something into register ax, let the associated entry on symbol stack be changed to “ax”. For instance, the code for:
expression → expression + primary
could be of the following form:
if (strcmp($1.symbol, "ax") != 0) {
    fprintf(ff, "%s%s\n", " mov ax,", $1.symbol);
    strcpy($$.symbol, "ax");
}
fprintf(ff, "%s%s\n", " add ax,", $3.symbol);
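To see what this action produces, the following sketch restates it as an ordinary C function that can be run outside Yacc. The function name gen_add and its parameters are assumptions for illustration: lhs plays the role of $1, rhs of $3, and result of $$.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct
{char symbol[20];} yystype;

/* Hypothetical stand-in for the action of: expression -> expression + primary */
void gen_add(FILE *ff, yystype *result, const yystype *lhs, const yystype *rhs)
{
    assert(ff != NULL && result != NULL && lhs != NULL && rhs != NULL);
    if (strcmp(lhs->symbol, "ax") != 0) {
        /* Left operand is still a name or number, so load it first. */
        fprintf(ff, " mov ax,%s\n", lhs->symbol);
    }
    fprintf(ff, " add ax,%s\n", rhs->symbol);
    strcpy(result->symbol, "ax");   /* the running sum now lives in ax */
}
```

For the expression kk44 + xx + 2 from test3, the first reduction has kk44 on the left, so it emits mov ax,kk44 followed by add ax,xx and marks the result as "ax". The second reduction sees "ax" on the left, skips the mov, and emits only add ax,2.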
Call your compiler Dsharp.exe, and employ the following bat file (i.e. script file), named b.bat:
Dsharp %1
Masm %1.asm
%1
Then to compile and execute a program in the D# language such as test3 given above, employ:
b test3
If all goes well, the first line of this bat file will in this case create test3.asm.
The second line will create test3.exe, and the last line will execute test3.exe.