Compiler Design Lab Manual

CompilerDesignLabManual

User Manual:

Open the PDF directly: View PDF .
Page Count: 9

Preface
Finite Automata

Department of Computer Science & Engineering

College of Engineering Trivandrum

Contents

1 Preface 3

1.1 Pre-Requisites .................................... 3

1.2 Course Objectives .................................. 3

1.3 List of Exercises .................................. 3

2 Finite Automata 4

2.1 Lab - Computing the closures of all states of an NFA with transitions ..... 5

2.2 Lab - Program to convert an NFA with transitions to an NFA without transitions 6

2.3 Lab - Program to convert an NFA to DFA ..................... 7

2.4 Lab - Program to minimize any given DFA ..................... 9

1. Preface

A compiler is computer software that transforms computer code written in one program-

ming language (the source language) into another programming language (the target lan-

guage). Compilers are a type of translator that support digital devices, primarily comput-

ers. The name compiler is primarily used for programs that translate source code from

a high-level programming language to a lower level language (e.g., assembly language,

object code, or machine code) to create an executable program.

1.1. Pre-Requisites

•CS331 System Software Lab

1.2. Course Objectives

•To implement different phases of a compiler

•To implement and test simple optimization techniques

•To give exposure to compiler writing tools

1.3. List of Exercises

1. Design and implement a lexical analyzer for a given language and the lexical

analyzer should ignore redundant spaces, tabs and new lines

2. Implementation of lexical analyzer using lex tool

3. Generate YACC Speciﬁcation for a few syntactic categories

•Program to recognize a valid arithmetic expression that uses operators +,

-, *, /

•Program to recognize a valid variable which starts with a letter followed

by any number of letters and digits

•Implementation of Calculator using Lex and YACC

•Convert the BNF Rules into YACC Form and write code to generate ab-

stract syntax tree

4. Write program to ﬁnd - closure of all states of any given NFA with transitions

5. Write program to convert NFA with transitions to an NFA without transitions

6. Write program to convert an NFA to DFA

7. Write program to minimize any given DFA

8. Develop an operator precedence parser for a given language

9. Write program to simulate First and Follow of any grammar

10. Construct a recursive descent parser for a given language

11. Construct a shift reduce parser for a given language

12. Write a program to perform loop unrolling

13. Write a program to perform constant propogation

14. Implement intermediate code generation for simple expressions

15. Implement the back end of a compiler which takes as input a three address code

and produces the 8086 assembly language instructions that can be assembled and

run using an 8086 assembler. The target assembly instructions can be Move, Add,

Sub, Jump etc.

2. Finite Automata

In the theory of computation, a branch of theoretical computer science, a deterministic

ﬁnite automaton (DFA)—also known as a deterministic ﬁnite acceptor (DFA) and a deter-

ministic ﬁnite state machine (DFSM) or a deterministic ﬁnite state automaton (DFSA)—is

a ﬁnite-state machine that accepts or rejects strings of symbols and only produces a unique

computation (or run) of the automaton for each input string. Deterministic refers to the

uniqueness of the computation. In search of the simplest models to capture ﬁnite-state

machines, Warren McCulloch and Walter Pitts were among the ﬁrst researchers to intro-

duce a concept similar to ﬁnite automata in 1943.

Formally, A deterministic ﬁnite automata (DFA) is a 5-tuple, (Q, Σ, δ, q0, F ),

where each of them represents:-

•a ﬁnite set of states Q

•a ﬁnite set of input symbols called the alphabet Σ

•a transition table δconsisting of a single state as entry for each state-symbol pair

•a start state q0

•a set of ﬁnal states F

Similarly, An NFA (Non-Deterministic Finite Automata) without is also repre-

sented by a 5-tuple where the transition table contains an element of the power set of Qas

an entry for each state-symbol pair. An NFA with will also be represented by a 5-tuple

where the transition table will contain an element of the power set of Qas an entry for

each state-symbol pair as well as state-pair.

Any language that is represented by an NFA with transitions can be represented

by an NFA without transitions and any language that can be represented by an NFA

without transitions can be represented by a Deterministic Finite Automata (DFA). An

NFA without transitions is a more restricted version of an NFA with transitions and a

DFA is a more restricted version of an NFA without transitions.

2.1. Lab - Computing the closures of all states of an NFA with transitions

For convenience, the programming language used to explain concepts in the lab manual

will be C++.

Input:- -NFA

Output:- List of -closures of all states of the NFA

Initially, we consider the NFA with transitions to be represented by objects of a

structure of the form,

struct enfa

{

int num_states, num_alphabets;

vector<int> final_states;

vector<vector<vector<int> > > transititon_table;

};

We consider the default start state of all ﬁnite automata in the upcoming exercises

of the section to be state 0. We also assume the symbols in the alphabet are numbered

from 0to num alphabets, which represents the number of symbols in the input alphabet

of the NFA with transitions.

Vectors are dynamically sized arrays which has the ability to resize itself auto-

matically as and when an element is inserted or deleted. The following link provides a

basic documentation of the functions associated with vectors:- Vector Functions. Vectors

performs similar functions as performed by lists in Python.

A convention used throughout is that a vector which contains the element -1 only

is considered to be the null set. transition table[i][0] represents the entry in the epsilon

transition entry for state iin the transition table, while transition table[i][j]represents

the entry in the transition table for state iand input symbol j−1. Obtain an -NFA as

input from the user using the following function,

enfa get_enfa()

The above function should obtain the variables of the enfa from the user through command

prompt. The next step is to compute the closure of all states of the -NFA. -closure of a

state qis deﬁned as the set of states along with qthat can be reached by only following

-transitions.

The function which computes the closure of a state of the -NFA will have the

form,

vector<int> compute_closure(enfa a, int k)

which will compute the closure of -NFA afor the state kand return the -closure of state

kin a vector. The algorithm for the above function is provided as follows:-

•Initialize a list tcontaining only state k

•Initialize an iterator to the ﬁrst element of the list t

•While iterator has not crossed the last element of the list t

–Append all states in the i-pair in the transition table of -NFA awhich is

not previously present in list tto t

–Set the iterator to the next element of the list t

•Return list tas the -closure for state kin -NFA a

2.2. Lab - Program to convert an NFA with transitions to an NFA without 

transitions

Input:- -NFA

Output:- NFA without -transitions

We consider an NFA without epsilon transitions to be represented by objects of a

structure of the form,

struct nfa

{

int num_states, num_alphabets;

vector<int> final_states;

vector<vector<vector<int> > > transititon_table;

};

Alternatively, an object of the struct enfa can be used to reprsent an NFA without 

transitions where all the state-pairs in the transitions will correspond to NULL sets.

As in the previous exercise, we consider the default start state of the NFA to be

0.transitiontable[i][j]represents the entry in the transition table for state iand input

symbol j.

Obtain an NFA with transitions as input from the user. Now the aim of the

exercise is to convert an NFA with -transitions to an NFA without -transitions, which

will be performed by a function having the form,

nfa convert_enfa(enfa a)

The algorithm to compute the above provided function is provided as follows,

•Initialize an empty object of type nfa with variable name t

•Initialize t.num states =a.num states,t.num alphabets =a.num alphabets

and t.final states =a.final states

•Iterate through each of the state iin Q

–Initialize lto the closure of state iof -NFA a

–Iterate through each of the input symbol jin Σ

∗Initialize an empty list of states f

∗Iterate through each state kin l

·Add all states of a.transition table[k][j+ 1] to f

∗Remove all the duplicates from f

∗Compute the -closure cof f

∗Set t.transition table[i][j] = c

•Return tas the NFA without -transitions corresponding to the -NFA a

2.3. Lab - Program to convert an NFA to DFA

Input:- NFA without -transitions

Output:- DFA

We represent DFA by objects of the structure,

struct dfa

{

int num_states, num_alphabets;

vector<int> final_states;

vector<vector<int> > transititon_table;

};

As usual, we consider state 0to be the default start state of the DFA. Also, since in a DFA

there exists exaactly one state transition for every state-symbol pair, transition table[i][j]

represents the state to which it undergoes transition on encountering input symbol jfrom

state i.

We obtain an NFA without -transitions as input from the user, which is performed

by a function of the form,

nfa get_nfa()

Now the objective of the exercise to convert it into a corresponding DFA. We will per-

form this conversion using the subset conversion method. Let the number of states in

the NFA be num states and the number of input symbols in the alphabet of the NFA

be numalphabets. Then, the number of states in the DFA initially will be 2(num states)

and the alphabet of the DFA will consist of num alphabets input symbols. The function

which computes the corresponding DFA will have the form,

dfa convert_nfa(nfa a)

The algorithm to compute the above function is as follows,

•Initialize an empty object of type dfa having variable name t

•Initialize t.num states = 2(a.num states)

•Initialize t.num alphabets =a.num alphabets

•Let Qtrepresent the states present in the DFA tand Σtrepresent the symbols in

the alphabet of the DFA t. Iterate through each of the states iin Qt

–Let xbe the subset of states of Q(set of states of the NFA a) corresponding

to state iin DFA awhich is obtained by the function nfa state(int i)

(nfa state(int i)will be described later)

–Iterate through each input symbol jin Σt

∗Initialize an empty list of states f

∗Iterate through each of the states kin x

·Append the states present in a.transition table[k][j]to f

∗Remove all the duplicate states from f

∗If fis empty, set t.transition table[i][j]=2(a.num states)−1; Else,

Identify the DFA state ccorresponding to the subset of states fof

the NFA using the function dfa state(vector < int > f)and set

t.transition table[i][j] = c

•Consider all the subsets of a.final states, identify the DFA state corresponding

to it and include it in the set t.final states (set of ﬁnal states of the DFA t)

The function, all the elements of the power set of the set of states of NFA, ac-

cording to the binary representation of numbers between 1and 2(a.num states). The null set

is assigned a DFA state number of 2(a.num states)−1. The function dfa state(vector <

int > a), computes the DFA state as follows,

int dfa_state(vector<int> a)

{

int j = 0;

for(int i = 0; i < a.size(); i++)

{

j += pow(2, a[i]);

}

return (j-1);

}

The function nfa state(int i)will increment i, compute it’s binary representa-

tion, and then add the unit positions where 1 occurs into to a set of states. This subset of

states, then corresponds to the DFA state i.

Optionally, we can identify the unreachable states from state 0by running a

Breadth First Search starting at state 0, when the DFA is considered as a graph. All

the states that have not been traversed during the BFS traversal can be eliminated

from the DFA. Correspondingly, the variables t.num states,t.transition table and

t.final states has to be modiﬁed.

2.4. Lab - Program to minimize any given DFA

Input:- DFA

Output:- Minimized DFA corresponding to the input DFA

We use an object of the same structure to represent a DFA. Obtain a DFA as input

from the user using a function of the form,

dfa get_dfa()

Now, the objective of the exercise is to minimize the given DFA, which will be

performed using a function of the form,

dfa minimize_dfa(dfa a)

The algorithm to compute the above provided function is as follows,

•Initialize a list of lists twhich contains two lists, where t[0] contains the list of

non-ﬁnal states and t[1] contains the list of ﬁnal states

•Initialize a matrix mof size a.num states ∗a.num states and set every cell in

the matrix to 0

•Initialize a ﬂag variable fto 1

•For all state pairs (x, y), Set m[x][y]=1if xis a ﬁnal state and yis a non-ﬁnal

state or vice-versa

•While fis set to 1

–Set fto 0

–For all states ifrom 0to a.num states −2

∗For all states jfrom i+ 1 to a.num states −1

·If for any symbol uin Σ, (m[i][j] = 0 and

m[a.transition table[i][u]][a.transition table[j][u]] =

1), Then Set m[i][j]=1and f= 1

•Initialize a DFA object and represent those pair of states (a, b)which has m[a][b] =

0by a single state ain the minimized DFA

The above method of minimization of DFA is using the Myhill-Nerode Theorem.

Compiler Design Lab Manual

CompilerDesignLabManual

Navigation menu

Versions of this User Manual:

Views

Navigation