Project 3: Constant Propagation and Live Variable Analysis
Project 3: Constant Propagation and Live Variable Analysis
Due Date: 03/16/09
Programs should be submitted using the turnin
program on csnet.cs.tamu.edu
by the announced time. Submissions should
be in the form of a single file named project3_UIN.tar.gz
file (where UIN is your UIN number)
containing the relevant source code and a README file explaining
your approach (see Creating
.tar.gz
File below on how to create and submit this file.)
Introduction
This project is intended to allow you to apply some more complex algorithms
to the MiniPolaris compiler.
The algorithms you will implement in Project 3 are constant
propagation and live variable analysis.
To simplify design, the project is broken into four parts :
-
Substitution of symbolic constants defined in PARAMETER
statements.
-
Perform the live variable analysis.
-
Substitution of all constant values and expressions.
-
Deadcoding based upon constant conditions and live variable analysis.
Input and Output
Input
The input to your program will be a FORTRAN 77 program. This will be
specified
on the command line when running your copy of MiniPolaris, e.g.
mini_polaris -f sw test.f > test.list
In this example, test.f is a FORTRAN program
which you want to
compile.
We save the output to the file test.list
which may be examined
for correctness. The file sw contains your own
copy of the switchfile
in your work directory. (see Switches.)
Output
The output that MiniPolaris produces goes to stdout
(error
messages to stderr) and is a
listing containing information
produced by the compiler during each pass of the compiler. Converting
this
list file to an actual FORTRAN file is done as follows :
list2src < test.list > test.P.f
The .P.f ending indicates the file potentially
contains parallelization
directives, which is not relevant for this assignment, but which is
important
to the purpose of the Polaris compiler.
The output FORTRAN program produced by project 4 should
contain all
the statements of the original, with the exception of deadcoded
statements.
The driver program will provide the option of printing out the current
subprogram just after you have completed the pass, either in normal
FORTRAN
or a detailed listing of the abstract syntax tree. (see Switches.)
Program Design
Switches
All the routines for this project should be controlled by
switches
(such as the file sw used in the above example.)
For pass you will use the following switches :
constprop 1 perform live variable analysis and propagate constants, remove dead code based on information from live variable analysis and constant propagation (1=within each routine,0=none)
p_constprop 2 print after substitution pass
These values are the default as specified in the file switches
in your work directory (do not remove this file since it is a symbolic
link
to the default switchfile -- copy
it to another file and specify that filename with the -f
option
when running mypolaris.)
So, if you use a different switchfile which has propagate
set
to 0, then the constant propagation routine will not be called.
Constant Propagation Pass
Interface
The constant propagation pass should be implemented in the cvdl/constant
directory. The file constant.h shows the
prototype
of the function called from the driver program
void propagate_constants(ProgramUnit & pgm);
This is a simple pass which takes one program unit as input, transforms
any relevant statements inside, and completes, returning nothing.
On linux, the file cvdl/constant/Makefile
includes the lines:
CPPSRCS = constant.cc # add your other .cc files here
CPPHDRS = constant.h # add your other .h files here
If you create other source files, you should add them to these two
lines
so they will be compiled and linked into the final program.
Algorithm Design
propagate_constants() is given a Program Unit in
SSA form on which to work.
It need not output any information (though you will find it useful to
do
so for the purposes of learning the AST representation of the two
statement
types as well as debugging your code.)
General Organization
It is recommended that you organize your algorithm as follows
:
-
Propagate constant expressions, but do not replace actual expressions
-
Propagate PARAMETER values (one pass)
-
Repeat until no changes occur :
-
Perform live variable analyis.
-
Propagate constant values and expressions (while loop)
-
Deadcode any statements determined unreachable
-
Substitute associated constant expressions for the actual expresions
Thus, you could use WorkSpace derived objects to
store information
associated with the various expressions until the final substitution.
Since
Expression type objects do not have a WorkSpaceStack
associated with them, you will have to store such associations in a WorkSpace
object for each Statement. This allows you to
determine from which
statement/expression a given constant expression was taken during the
iterative
process.
Part A : Propagation of PARAMETER
constants
In FORTRAN, a PARAMETER declaration provides a
method for declaring
the equivalent of a const variable in C. Any
identifier specified
in a PARAMETER declaration can only be used in a
RHS expression.
A PARAMETER declaration has nothing to do with
the arguments provided
to a function of subroutine, as is often the usage of the term in other
programming languages. We can therefore be certain it will be a
constant
value. Conveniently, Polaris provides a means whereby we can discover
if
a given expression represents an identifier defined as a parameter.
First, determine if the expression is an identifier type (use
the op()
member function of the Expression object.) If so,
test to see
if the symbol this expression represents is defined in a PARAMETER
declaration. To do this, you will need to find the symbol of the
expression
(expr.symbol())
and from this, determine if the
symbol class is SYMBOLIC_CONSTANT_CLASS. If so,
you'll need to
substitute the constant value for this expression.
Part B : Live variable analysis and propagation of constant values and expressions
Live variable analyis and constant propagation are both examples of data flow problems. Therefore,
they are somewhat similar. This will make ik easier to implement them. You can use the same technique
to do both which will reduce the amount of code that needs to be implemented. For more information
on live variable analysis you can consult the Dragon book. Obviously, Live variable analyis computes what
variables are still alive at a give statement (i.e. what variables might still be referenced later in the program).
For more information on dataflow problems you can consult the Dragon book in the dataflow-framework section. For your case it
will be even easier since the program is already in SSA form and you don't have to worry anymore about
use-def issues. In general, both algorithms do the following:
Iterate over the CFG, and for every statement compute information that changes within the node
(i.e. GEN, KILL sets), accumulate the infomation from all incoming DATAFLOW edges (meet operator), and
compute information flowing out of the DATAFLOW node (transfer function). Remember that constant propagation is a
forward dataflow problem while live variable analysis is a backward dataflow problem.
Relevant statements are those
containing an expression, either lhs or rhs. Statements including
subroutine or function calls will have potential side-effects. Those
variables which you can assume are changed
upon a call to a subprogram include :
-
any static variables, i.e. those in :
-
a DATA declaration (initialized)
-
an EQUIVALENCE declaration *
-
a COMMON block
-
any formal arguments of the current ProgramUnit
-
any actual arguments to the called subprogram
* variables declared
in an EQUIVALENCE
declaration are problematic when doing data flow analysis, particularly
when arrays are involved. For that reason, you should automatically
treat
any variables involved in an equivalence as non-constant.
Thus, any expression which qualifies as modified can be regarded as
killed
at the point of the subroutine or function call, and will be
non-constant
until the next definition.
You will initialize the constant status of each expression to
one of
three states :
-
constant
-
unknown
-
non-constant
You will then repeatedly propagate any constant expressions from the
definition
to the use, so long as the expression is the same value over all paths
from the definition to the use. In Polaris, you can simply test two Expression
type objects being equal with the == operator
(which will be false
if the expression involves a function call, whereas !=
may also
be false if it involves a function call.) When your loop encounters a
stable
state (i.e. no changes are made) you will exit the (part B) loop.
Upon completing the loop for part B (and part C when that is
added)
you will then make the substitution of any propagated expressions
(according
to the switch value substitute.). Here since the
input program unit is in SSA form, it is very easy to find out use-def
relationship. You could use the algorithm on page 74, class slide set
8. This algorithm is not based on SSA directly but it should not be
difficult for you to adapt it to SSA program. There is another
algorithm on Wolfe book , section 6.6. This algorithm is efficient but
a bit hard to understand. You could also
implement your project based on it if you are interested.
Part C : Deadcode elimination
You will then
evaluate
any conditional branches whose condition is resolved as constant. If
you
know an expression is going to be .TRUE. or .FALSE.
you
can deadcode any statements which will never be
reached. You can
deadcode assignment statements by simply removing them. Once this
is done, you will want to repeat part B. When the loop for part B
completes
(i.e. reaches a stable state) and no more code is changed by
deadcoding,
you can then complete the pass by substituting the expressions which
were
propagated into the actual statements.
It is highly recommended that you implement this function one
step at
a time, compiling and testing your latest addition to verify that your
additions are correct (see Examples.)
Compiling
You can re-compile your files from within the constant
directory
as long as you have compiled mini_polaris for once. In another word, there
is
already a mini_polaris generated.
If you have not compiled mini_polaris yet, change to your
work directory and type make
If you make some changes and want to recompile, just type make
at the constant directory. Or, if you want to do it from the top-level
makefile, you should
do
(suppose you are at the constant directory)
rm libconstprop_pkg.so.1
cd ~/minipolaris
make
Examples
Sample programs are provided in Samples
. In the proj-constant subdirectory, you will find a number of testcases
that focus on various problems. for this project there will be no solutions for now. You are required
to think what variables can be replaced with constants and what code can be removed.
Creating .tar.gz File
The files necessary for submission should be all source code (.cc
and .h) and the Makefiles in the directory cvdl/constant
in your personal work directory. To create the file proj-constant.tar.gz,
first clean out all non-relevant files from the constant
directory,
including any objects or libraries (i.e., any .o
files.) Then
change to your cvdl directory and then tar and
compress the files
as in the following example :
cd ~/minipolaris/cvdl/constant
rm *.o libconstprop_pkg.so.1
Then
# also, remove any additional files (test files, subdirectories, etc.) which
# you have added
cd ..
tar cf - constant | gzip > project4_UIN.tar.gz
Note:
If you also modify your bblock or gotos passes, please include the all
source files by doing like this: tar cf - constant bblock gotos | gzip
> project4_UIN.tar.gz
gzcat project4_UIN.tar.gz | tar tvf - # this will list the files for
verification
You need to log on csnet.cs.tamu.edu
and turn in project4_UIN.tar.gz

