Compiler Design Lab Manual

CompilerDesignLabManual

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 9

Compiler Design Lab Manual
Department of Computer Science & Engineering
College of Engineering Trivandrum
1
Contents
1 Preface 3
1.1 Pre-Requisites .................................... 3
1.2 Course Objectives .................................. 3
1.3 List of Exercises .................................. 3
2 Finite Automata 4
2.1 Lab - Computing the closures of all states of an NFA with transitions ..... 5
2.2 Lab - Program to convert an NFA with transitions to an NFA without transitions 6
2.3 Lab - Program to convert an NFA to DFA ..................... 7
2.4 Lab - Program to minimize any given DFA ..................... 9
1. Preface
A compiler is computer software that transforms computer code written in one program-
ming language (the source language) into another programming language (the target lan-
guage). Compilers are a type of translator that support digital devices, primarily comput-
ers. The name compiler is primarily used for programs that translate source code from
a high-level programming language to a lower level language (e.g., assembly language,
object code, or machine code) to create an executable program.
1.1. Pre-Requisites
CS331 System Software Lab
1.2. Course Objectives
To implement different phases of a compiler
To implement and test simple optimization techniques
To give exposure to compiler writing tools
1.3. List of Exercises
1. Design and implement a lexical analyzer for a given language and the lexical
analyzer should ignore redundant spaces, tabs and new lines
2. Implementation of lexical analyzer using lex tool
3. Generate YACC Specification for a few syntactic categories
Program to recognize a valid arithmetic expression that uses operators +,
-, *, /
Program to recognize a valid variable which starts with a letter followed
by any number of letters and digits
Implementation of Calculator using Lex and YACC
Convert the BNF Rules into YACC Form and write code to generate ab-
stract syntax tree
4. Write program to find - closure of all states of any given NFA with transitions
5. Write program to convert NFA with transitions to an NFA without transitions
6. Write program to convert an NFA to DFA
7. Write program to minimize any given DFA
8. Develop an operator precedence parser for a given language
9. Write program to simulate First and Follow of any grammar
10. Construct a recursive descent parser for a given language
11. Construct a shift reduce parser for a given language
12. Write a program to perform loop unrolling
13. Write a program to perform constant propogation
14. Implement intermediate code generation for simple expressions
15. Implement the back end of a compiler which takes as input a three address code
and produces the 8086 assembly language instructions that can be assembled and
run using an 8086 assembler. The target assembly instructions can be Move, Add,
Sub, Jump etc.
2. Finite Automata
In the theory of computation, a branch of theoretical computer science, a deterministic
finite automaton (DFA)—also known as a deterministic finite acceptor (DFA) and a deter-
ministic finite state machine (DFSM) or a deterministic finite state automaton (DFSA)—is
a finite-state machine that accepts or rejects strings of symbols and only produces a unique
computation (or run) of the automaton for each input string. Deterministic refers to the
uniqueness of the computation. In search of the simplest models to capture finite-state
machines, Warren McCulloch and Walter Pitts were among the first researchers to intro-
duce a concept similar to finite automata in 1943.
Formally, A deterministic finite automata (DFA) is a 5-tuple, (Q, Σ, δ, q0, F ),
where each of them represents:-
a finite set of states Q
a finite set of input symbols called the alphabet Σ
a transition table δconsisting of a single state as entry for each state-symbol pair
a start state q0
a set of final states F
Similarly, An NFA (Non-Deterministic Finite Automata) without is also repre-
sented by a 5-tuple where the transition table contains an element of the power set of Qas
an entry for each state-symbol pair. An NFA with will also be represented by a 5-tuple
where the transition table will contain an element of the power set of Qas an entry for
each state-symbol pair as well as state-pair.
Any language that is represented by an NFA with transitions can be represented
by an NFA without transitions and any language that can be represented by an NFA
without transitions can be represented by a Deterministic Finite Automata (DFA). An
NFA without transitions is a more restricted version of an NFA with transitions and a
DFA is a more restricted version of an NFA without transitions.
2.1. Lab - Computing the closures of all states of an NFA with transitions
For convenience, the programming language used to explain concepts in the lab manual
will be C++.
Input:- -NFA
Output:- List of -closures of all states of the NFA
Initially, we consider the NFA with transitions to be represented by objects of a
structure of the form,
struct enfa
{
int num_states, num_alphabets;
vector<int> final_states;
vector<vector<vector<int> > > transititon_table;
};
We consider the default start state of all finite automata in the upcoming exercises
of the section to be state 0. We also assume the symbols in the alphabet are numbered
from 0to num alphabets, which represents the number of symbols in the input alphabet
of the NFA with transitions.
Vectors are dynamically sized arrays which has the ability to resize itself auto-
matically as and when an element is inserted or deleted. The following link provides a
basic documentation of the functions associated with vectors:- Vector Functions. Vectors
performs similar functions as performed by lists in Python.
A convention used throughout is that a vector which contains the element -1 only
is considered to be the null set. transition table[i][0] represents the entry in the epsilon
transition entry for state iin the transition table, while transition table[i][j]represents
the entry in the transition table for state iand input symbol j1. Obtain an -NFA as
input from the user using the following function,
enfa get_enfa()
The above function should obtain the variables of the enfa from the user through command
prompt. The next step is to compute the closure of all states of the -NFA. -closure of a
state qis defined as the set of states along with qthat can be reached by only following
-transitions.
The function which computes the closure of a state of the -NFA will have the
form,
vector<int> compute_closure(enfa a, int k)
which will compute the closure of -NFA afor the state kand return the -closure of state
kin a vector. The algorithm for the above function is provided as follows:-
Initialize a list tcontaining only state k
Initialize an iterator to the first element of the list t
While iterator has not crossed the last element of the list t
Append all states in the i-pair in the transition table of -NFA awhich is
not previously present in list tto t
Set the iterator to the next element of the list t
Return list tas the -closure for state kin -NFA a
2.2. Lab - Program to convert an NFA with transitions to an NFA without
transitions
Input:- -NFA
Output:- NFA without -transitions
We consider an NFA without epsilon transitions to be represented by objects of a
structure of the form,
struct nfa
{
int num_states, num_alphabets;
vector<int> final_states;
vector<vector<vector<int> > > transititon_table;
};
Alternatively, an object of the struct enfa can be used to reprsent an NFA without
transitions where all the state-pairs in the transitions will correspond to NULL sets.
As in the previous exercise, we consider the default start state of the NFA to be
0.transitiontable[i][j]represents the entry in the transition table for state iand input
symbol j.
Obtain an NFA with transitions as input from the user. Now the aim of the
exercise is to convert an NFA with -transitions to an NFA without -transitions, which
will be performed by a function having the form,
nfa convert_enfa(enfa a)
The algorithm to compute the above provided function is provided as follows,
Initialize an empty object of type nfa with variable name t
Initialize t.num states =a.num states,t.num alphabets =a.num alphabets
and t.final states =a.final states
Iterate through each of the state iin Q
Initialize lto the closure of state iof -NFA a
Iterate through each of the input symbol jin Σ
Initialize an empty list of states f
Iterate through each state kin l
·Add all states of a.transition table[k][j+ 1] to f
Remove all the duplicates from f
Compute the -closure cof f
Set t.transition table[i][j] = c
Return tas the NFA without -transitions corresponding to the -NFA a
2.3. Lab - Program to convert an NFA to DFA
Input:- NFA without -transitions
Output:- DFA
We represent DFA by objects of the structure,
struct dfa
{
int num_states, num_alphabets;
vector<int> final_states;
vector<vector<int> > transititon_table;
};
As usual, we consider state 0to be the default start state of the DFA. Also, since in a DFA
there exists exaactly one state transition for every state-symbol pair, transition table[i][j]
represents the state to which it undergoes transition on encountering input symbol jfrom
state i.
We obtain an NFA without -transitions as input from the user, which is performed
by a function of the form,
nfa get_nfa()
Now the objective of the exercise to convert it into a corresponding DFA. We will per-
form this conversion using the subset conversion method. Let the number of states in
the NFA be num states and the number of input symbols in the alphabet of the NFA
be numalphabets. Then, the number of states in the DFA initially will be 2(num states)
and the alphabet of the DFA will consist of num alphabets input symbols. The function
which computes the corresponding DFA will have the form,
dfa convert_nfa(nfa a)
The algorithm to compute the above function is as follows,
Initialize an empty object of type dfa having variable name t
Initialize t.num states = 2(a.num states)
Initialize t.num alphabets =a.num alphabets
Let Qtrepresent the states present in the DFA tand Σtrepresent the symbols in
the alphabet of the DFA t. Iterate through each of the states iin Qt
Let xbe the subset of states of Q(set of states of the NFA a) corresponding
to state iin DFA awhich is obtained by the function nfa state(int i)
(nfa state(int i)will be described later)
Iterate through each input symbol jin Σt
Initialize an empty list of states f
Iterate through each of the states kin x
·Append the states present in a.transition table[k][j]to f
Remove all the duplicate states from f
If fis empty, set t.transition table[i][j]=2(a.num states)1; Else,
Identify the DFA state ccorresponding to the subset of states fof
the NFA using the function dfa state(vector < int > f)and set
t.transition table[i][j] = c
Consider all the subsets of a.final states, identify the DFA state corresponding
to it and include it in the set t.final states (set of final states of the DFA t)
The function, all the elements of the power set of the set of states of NFA, ac-
cording to the binary representation of numbers between 1and 2(a.num states). The null set
is assigned a DFA state number of 2(a.num states)1. The function dfa state(vector <
int > a), computes the DFA state as follows,
int dfa_state(vector<int> a)
{
int j = 0;
for(int i = 0; i < a.size(); i++)
{
j += pow(2, a[i]);
}
return (j-1);
}
The function nfa state(int i)will increment i, compute it’s binary representa-
tion, and then add the unit positions where 1 occurs into to a set of states. This subset of
states, then corresponds to the DFA state i.
Optionally, we can identify the unreachable states from state 0by running a
Breadth First Search starting at state 0, when the DFA is considered as a graph. All
the states that have not been traversed during the BFS traversal can be eliminated
from the DFA. Correspondingly, the variables t.num states,t.transition table and
t.final states has to be modified.
2.4. Lab - Program to minimize any given DFA
Input:- DFA
Output:- Minimized DFA corresponding to the input DFA
We use an object of the same structure to represent a DFA. Obtain a DFA as input
from the user using a function of the form,
dfa get_dfa()
Now, the objective of the exercise is to minimize the given DFA, which will be
performed using a function of the form,
dfa minimize_dfa(dfa a)
The algorithm to compute the above provided function is as follows,
Initialize a list of lists twhich contains two lists, where t[0] contains the list of
non-final states and t[1] contains the list of final states
Initialize a matrix mof size a.num states a.num states and set every cell in
the matrix to 0
Initialize a flag variable fto 1
For all state pairs (x, y), Set m[x][y]=1if xis a final state and yis a non-final
state or vice-versa
While fis set to 1
Set fto 0
For all states ifrom 0to a.num states 2
For all states jfrom i+ 1 to a.num states 1
·If for any symbol uin Σ, (m[i][j] = 0 and
m[a.transition table[i][u]][a.transition table[j][u]] =
1), Then Set m[i][j]=1and f= 1
Initialize a DFA object and represent those pair of states (a, b)which has m[a][b] =
0by a single state ain the minimized DFA
The above method of minimization of DFA is using the Myhill-Nerode Theorem.

Navigation menu