This paper describes high-level objects and functions that are potentially user-visible in a PIPS1 [20] interactive environment. It defines the internal software interface between a user interface and program analyses and transformations. This is clearly not a user guide but can be used as a reference guide, the best one before source code because PIPS user interfaces are very closely mapped on this document: some of their features are automatically derived from it.
Objects can be viewed and functions activated by one of PIPS existing user interfaces: tpips2 , the tty style interface which is currently recommended, pips3 [6], the old batch interface, improved by many shell scripts4 , wpips and epips, the X-Window System interfaces. The epips interface is an extension of wpips which uses Emacs to display more information in a more convenient way. Unfortunately, right now these window-based interfaces are no longer working and have been replaced by gpips. It is also possible to use PIPS through a Python API, pyps.
From a theoretical point of view, the object types and functions available in PIPS define an heterogeneous algebra with constructors (e.g. parser), extractors (e.g. prettyprinter) and operators (e.g. loop unrolling). Very few combinations of functions make sense, but many functions and object types are available. This abundance is confusing for casual and experiences users as well, and it was deemed necessary to assist them by providing default computation rules and automatic consistency management similar to make. The rule interpretor is called pipsmake6 and described in [5]. Its key concepts are the phase, which correspond to a PIPS function made user-visible, for instance, a parser, the resources, which correspond to objects used or defined by the phases, for instance, a source file or an AST (parsed code), and the virtual rules, which define the set of input resources used by a phase and the set of output resources defined by the phase. Since PIPS is an interprocedural tool, some real inpu resources are not known until execution. Some variables such as CALLERS or CALLEES can be used in virtual rules. They are expanded at execution to obtain an effective rule with the precise resources needed.
For debugging purposes and for advanced users, the precise choice and tuning of an algorithm can be made using properties. Default properties are installed with PIPS but they can be redefined, partly or entirely, by a properties.rc file located in the current directory. Properties can also be redefined from the user interfaces, for example with the command setproperty when the tpips interface is used.
As far as their static structures are concerned, most object types are described in more details in PIPS Internal Representation of Fortran and C code7 . A dynamic view is given here. In which order should functions be applied? Which object do they produce and vice-versa which function does produce such and such objects? How does PIPS cope with bottom-up and top-down interprocedurality?
Resources produced by several rules and their associated rule must be given alias names when they should be explicitly computed or activated by an interactive interfaceFI: I do not understand.. This is otherwise not relevant. The alias names are used to generate automatically header files and/or test files used by PIPS interfaces.
No more than one resource should be produced per line of rule because different files are automatically extracted from this one8 . Another caveat is that all resources whose names are suffixed with _file are considered printable or displayable, and the others are considered binary data, even though they may be ASCII strings.
This LATEX file is used by several procedures to derive some pieces of C code and ASCII files. The useful information is located in the PipsMake areas, a very simple literate programming environment... For instance alias information is used to generate automatically menus for window-based interfaces such as wpips or gpips. Object (a.k.a resource) types and functions are renamed using the alias declaration. The name space of aliases is global. All aliases must have different names. Function declarations are used to build a mapping table between function names and pointer to C functions, phases.h. Object suffixes are used to derive a header file, resources.h, with all resource names. Parts of this file are also extracted to generate on-line information for wpips and automatic completion for tpips.
The behavior of PIPS can be slightly tuned by using properties. Most properties are linked to a particular phase, for instance to prettyprint, but some are linked to PIPS infrastructure and are presented in Chapter 2.
To understand and to be able to write new rules for pipsmake, a few things need to be known.
The rule:
Properties are also declared in this file. For instance
The following variables are defined to handle interprocedurality:
These variables are used in the rule definitions and instantiated before pipsmake infers which resources are pre-requisites for a rule.
This paper also defines and describes global variables used to modify or fine tune PIPS behavior. Since global variables are useful for some purposes, but always dangerous, PIPS programmers are required to avoid them or to declare them explicitly as properties. Properties have an ASCII name and can have boolean, integer or string values.
Casual users should not use them. Some properties are modified for them by the user interface and/or the high-level functions. Some property combinations may be meaningless. More experienced users can set their values, using their names and a user interface.
Experienced users can also modify properties by inserting a file called properties.rc in their local directory. Of course, they cannot declare new properties, since they would not be recognized by the PIPS system. The local property file is read after the default property file, $PIPS_ROOT/etc/properties.rc. Some user-specified property values may be ignored because they are modified by a PIPS function before it had a chance to have any effect. Unfortunately, there is no explicit indication of usefulness for the properties in this report.
The default property file can be used to generate a custom version of properties.rc. It is derived automatically from this documentation, Documentation/pipsmake-rc.tex.
PIPS behavior can also be altered by Shell environment variables. Their generic names is XXXX_DEBUG_LEVEL, where XXXX is a library or a phase or an interface name (of course, there are exceptions). Theoretically these environment variables are also declared as properties, but this is generally forgotten by programmers. A debug level of 0 is equivalent to no tracing. The amount of tracing increases with the debug level. The maximum useful value is 9.
Another Shell environment variable, NEWGEN_MAX_TABULATED_ELEMENTS, is useful to analyze large programs. Its default value is 12,000 but it is not uncommon to have to set it up to 200,000.
Properties are listed below on a source library basis. Properties used in more than one library or used by PIPS infrastructure are presented first. Section 2.3 contains information about properties related to infrastructure, external and user interface libraries. Properties for analyses are grouped in Chapter 6. Properties for program transformations, parallelization and distribution phases are listed in the next section in Chapters 8 and 7. User output produced by different kinds of prettyprinters are presented in Chapter 9. Chaper 10 is dedicated to properties of the libraries added by CEA to implement Feautrier’s method.
Rule and object declaration are grouped in chapters: input files (Chapter 3), syntax analysis and abstract syntax tree (Chapter 4), analyses (Chapter 6), parallelizations (Chapter 7), program transformations (Chapter 8) and prettyprinters of output files (Chapter 9). Chapter 10 describes several analyses defined by Paul Feautrier. Chapter 11 contains a set of menu declarations for the window-based interfaces.
Virtually every PIPS programmer contributed some lines in this report. Inconsistencies are likely. Please report them to the PIPS team9 !
Options are called properties in PIPS. Most of them are related to a specific phase, for instance the dependence graph computation. They are declared next to the corresponding phase declaration. But some are related to one library or even to several libraries and they are declared in this chapter.
Skip this chapter on first reading. Also skip this chapter on second reading because you are unlikely to need these properties until you develop in PIPS.
Are DO loops bodies executed at least once (F-66 style), or not (Fortran 77)?
ONE_TRIP_DO FALSE
is useful for use/def and semantics analysis but is not used for region analyses. This dangerous property should be set to FALSE. It is not consistently checked by PIPS phases, because nobody seems to use this obsolete Fortran feature.
With
LOG_TIMINGS FALSE
it is possible to display the amount of real, cpu and system times directly spent in each phase as well as the times spent reading/writing data structures from/to PIPS database. The computation of total time used to complete a pipsmake request is broken down into global times, a set of phase times which is the accumulation of the times spent in each phase, and a set of IO times, also accumulated through phases.
Note that the IO times are included in the phase times.
With
LOG_MEMORY_USAGE FALSE
it is possible to log the amount of memory used by each phase and by each request. This is mainly useful to check if a computation can be performed on a given machine. This memory log can also be used to track memory leaks. Valgrind may be more useful to track memory leaks.
PIPS infrastructure is based on a few external libraries, Newgen and Linear, and on three key PIPS1 libraries:
Newgen offers some debugging support to check object consistency (gen_consistent_p and gen_defined_p), and for dynamic type checking. See Newgen documentation[31][?].
This library is external and offers an independent debugging system.
The following properties specify how null (
SYSTEM_NULL "<null␣system>"
), undefined
SYSTEM_UNDEFINED "<undefined␣system>"
) or non feasible systems
SYSTEM_NOT_FEASIBLE "{0==-1}"
are prettyprinted by PIPS.
With
CHECK_RESOURCE_USAGE FALSE
it is possible to log and report differences between the set of resources actually read and written by the procedures called by pipsmake and the set of resources declared as read or written in pipsmake.rc file.
ACTIVATE_DEL_DERIVED_RES TRUE
controls the rule activation process that may delete from the database all the derived resources from the newly activated rule to make sure that non-consistent resources cannot be used by accident.
PIPSMAKE_CHECKPOINTS 0
controls how often resources should be saved and freed. 0 means never, and a positive value means every n applications of a rule. This feature was added to allow long big automatic tpips scripts that can coredump and be restarted latter on close to the state before the core. As another side effect, it allows to free the memory and to keep memory consumption as moderate as possible, as opposed to usual tpips runs which keep all memory allocated. Note that it should not be too often saved, because it may last a long time, especially when entities are considered on big workspaces. The frequency may be adapted in a script, rarely at the beginning to more often latter.
Shell environment variables PIPSDBM_DEBUG_LEVEL can be set to ? to check object consistency when they are stored in the database, and to ? to check object consistency when they are stored or retrieved (in case an intermediate phase has corrupted some data structure unwillingly).
You can control what is done when a workspace is closed and resources are save. The
PIPSDBM_RESOURCES_TO_DELETE "obsolete"
property can be set to to ”obsolete” or to ”all”.
Note that it is not managed from pipsdbm but from pipsmake which knows what is obsolete or not.
The top-level library is built on top of the pipsmake and pipsdbm libraries to factorize functions useful to build a PIPS user interface or API.
Property
USER_LOG_P TRUE
controls the logging of the session in the database of the current workspace. This log can be processed by PIPS utility logfile2tpips to generate automatically a tpips script which can be used to replay the current PIPS session, workspace by workspace, regardless of the PIPSuser interface used.
Property
ABORT_ON_USER_ERROR FALSE
specifies how user errors impact execution once the error message is printed on stderr: return and go ahead, usually when PIPS is used interactively, or core dump for debugging purposes and for script executions, especially non-regression tests.
Property
ACTIVE_PHASES "PRINT_SOURCE␣PRINT_CODE␣PRINT_PARALLELIZED77_CODE␣PRINT_CALL_GRAPH␣PRINT_ICFG␣TRANSFORMERS_INTER_FULL␣INTERPROCEDURAL_SUMMARY_PRECONDITION␣PRECONDITIONS_INTER_FULL␣ATOMIC_CHAINS␣RICE_SEMANTICS_DEPENDENCE_GRAPH␣MAY_REGIONS"
specifies which pipsmake phases should be used when several phases can be used to produce the same resource. This property is used when a workspace is created. A workspace is the database maintained by PIPS to contain all resources defined for a whole application or for the whole set of files used to create it.
Resources that create ambiguities for pipsmake are at least:
This list must be updated according to new rules and new resources declared in this file. Note that no default parser is usually specified in this property, because it is selected automatically according to the source file suffixes when possible.
Until October 2009, the active phases were:
They still are used for the old non-regression tests.
tpips is one of PIPS user interfaces.
TPIPS_NO_EXECUTION_MODE FALSE
controls whether we shall execute the instructions of just check the syntax:
TPIPS_IS_A_SHELL FALSE
controls whether tpips should behave as an extended shell and consider any input command that is not a tpips command a Shell command.
User warnings may be turned off. Definitely, this is not the default option! Most warnings must be read to understand surprising results. This property is used by library misc.
NO_USER_WARNING FALSE
By default, PIPS reports errors generated by system call stat which is used in library pipsdbm to check the time a resource has been written and hence its temporal consistency.
WARNING_ON_STAT_ERROR TRUE
An input program is a set of user Fortran 77 or C source files and a name, called a workspace. The files are looked for in the current directory, then by using the colon-separated PIPS_SRCPATH variable for other directories where they might be found. The first occurrence of the file name in the ordered directories is chosen, which is consistent with PATH and MANPATH behaviour.
The source files are splitted by PIPS at the program initialization phase to produce one PIPS-private source file for each procedure, subroutine or function, and for each block data. A function like fsplit is used and the new files are stored in the workspace, which simply is a UNIX sub-directory of the current directory. These new files have names suffixed by .f.orig.
Since PIPS performs interprocedural analyses, it expects to find a source code file for each procedure or function called. Missing modules can be replaced by stubs, which can be made more or less precise with respect to their effects on formal parameters and global variables. A stub may be empty. Empty stubs can be automatically generated if the code is properly typed (see Section 3.4).
The user source files should not be edited by the user once PIPS has been started because these editions are not going to be taken into account unless a new workspace is created. But their preprocessed copies, the PIPS source files, safely can be edited while running PIPS. The automatic consistency mechanism makes sure that any information displayed to the user is consistent with the current state of the sources files in the workspace. These source files have names terminated by the standard suffix, .f.
New user source files should be automatically and completely re-built when the program is no longer under PIPS control, i.e. when the workspace is closed. An executable application can easily be regenerated after code transformations using the tpips1 interface and requesting the PRINTED_FILE resources for all modules, including compilation units in C:
display PRINTED_FILE[%ALL]
Note that compilation units can be left out with:
display PRINTED_FILE[%ALLFUNC]
In both cases with C source code, the order of modules may be unsuitable for direct recompilation and compilation units should be included anyway, but this is what is done by explicitly requesting the code regeneration as described in § 3.5.
Note that PIPS expects proper ANSI Fortran 77 code. Its parser was not designed to locate syntax errors. It is highly recommended to check source files with a standard Fortran compiler (see Section 3.2) before submitting them to PIPS.
The Fortran files specified as input to PIPS by the user are preprocessed in various ways.
In case of failure, a warning is displayed. Note that if the program cannot be compiled properly with a Fortran compiler, it is likely that many problems will be encountered within PIPS.
The next property also triggers this preliminary syntactic verification.
CHECK_FORTRAN_SYNTAX_BEFORE_PIPS FALSE
PIPS requires source code for all leaves in its visible call graph. By default, a user error is raised by Function initializer if a user request cannot be satisfied because some source code is missing. It also is possible to generate some synthetic code (a.k.a. stubs or ???) and to update the current module list but this is not a very satisfying option because all interprocedural analysis results are going to be wrong. The user should retrieve the generated .f files in the workspace, under the Tmp directory, and add some assignments (def) and uses. The user modified synthetic files should then be saved and used to generate a new workspace.
Valid settings: error generate query.
PREPROCESSOR_MISSING_FILE_HANDLING "error"
Moreover the preprocessing options used can be extended with the following environment variables:
The Shell variable PIPS_CPP_FLAGS may be used to locate the include files. Directories to search are specified with the -I option.
The output of this phase is a set of .f_initial files in per-module subdirectories. They constitute the resource INITIAL_FILE.
A second step of preprocessing is performed to produce SOURCE_FILE files with standard Fortran suffix .f from the .f_initial files. The two preprocessing steps are shown in Figure 3.1.
A source_file contains the code of exactly one module. Source files are created from user source files at program initialization by fsplit or a similar function if fsplit is not available (see Section 3.2). A source file may be updated by the user4 , but not by PIPS. Program transformations are performed on the internal representation (see 4) and visible in the prettyprinted output (see 9).
Source code splitting and preprocessing, e.g. cpp, are performed by the function create_workspace() from the top-level library, in collaboration with db_create_workspace() from library pipsdbm which creates the workspace directory. The user source files have names suffixed by .f or .F if cpp must be applied. They are split into original user_files with suffix .f.orig. These so-called original user files are in fact copies stored in the workspace. The syntactic PIPS preprocessor is applied to generate what is known as a source_file by PIPS. This process is fully automatized and not visible from PIPS user interfaces. However, the cpp preprocessor actions can be controlled using the Shell environment variable PIPS_CPP_FLAGS.
Function initializer is only called when the source code is not found. If the user code is properly typed, it is possible to force initializer to generate empty stubs by setting properties PREPROCESSOR_MISSING_FILE_HANDLING 3.2 and, to avoid inconsistency, PARSER_TYPE_CHECK_CALL_SITES 4.2.1.4. But remember that many Fortran codes use subroutines with variable numbers of arguments and with polymorphic types. Fortran varargs mechanism can be achieved by using or not the second argument according to the first one. Polymorphism can be useful to design an IO package or generic array subroutine, e.g. a subroutine setting an array to zero or a subroutine to copy an array into another one.
The current default option is to generate a user error if some source code is missing. This decision was made for two reasons:
Note: the generation of the resource user_file here above is mainly directed in having the resource concept here. More thought is needed to have the concept of user files managed by pipsmake.
MUST appear after initializer:
In C, the initializer can generate directly a c_source_file and its compilation unit.
The unsplit 3.5 phase regenerates user files from available printed_file. The various modules that where initially stored in single file are appended together in a file with the same name. Not that just fsplit is reversed, not a preprocessing through cpp. Also the include file preprocessing is not reversed.
Regeneration of user files. The various modules that where initially stored in single file are appended together in a file with the same name.
The abstract syntax tree, a.k.a intermediate representation, a.k.a. internal representation, is presented in [21] and in PIPS Internal Representation of Fortran and C code1 .
Program entities are stored in PIPS unique symbol table2 , called entities. Fortran entities, like intrinsics and operators, are created by bootstrap at program initialization. The symbol table is updated with user local and global variables when modules are parsed or linked together. This side effect is not disclosed to pipsmake.
The entity data structure is described in PIPS Internal Representation of Fortran and C code3 .
The declaration of new intrinsics is not easy because it was assumed that there number was fixed and limited by the Fortran standard. In fact, Fortran extensions define new ones. To add a new intrinsic, C code in bootstrap/bootstrap.c and in effects-generic/intrinsics.c must be added to declare its name, type and Read/Write memory effects.
Information about entities generated by the parser is printed out conditionally to property: PARSER_DUMP_SYMBOL_TABLE 4.2.1.4. which is set to false by default.
Each module source code is parsed to produce an internal representation called parsed_code and a list of called module names, callees.
Source code is assumed to be fully Fortran-77 compliant. On the first encountered error, the parser may be able to emit a useful message or the non-analyzed part of the source code is printed out.
PIPS input language is standard Fortran 77 with few extensions and some restrictions. The input character set includes underscore, _, and varying length variable names, i.e. they are not restricted to 6 characters.
PARSER_EXPAND_STATEMENT_FUNCTIONS 4.2.1.4
is set. If the substitution is considered possibly unsafe, a warning is displayed.
These parser restrictions were based on funding constraints. They are mostly alleviated by the preprocessing phase. PerfectClub and SPEC-CFP95 benchmarks are handled without manual editing, but for ENTRY statements which are obsoleted by the current Fortran standard.
For debugging purposes, it is possible to print a summary of the symbol table, when enabling this property:
PARSER_DUMP_SYMBOL_TABLE FALSE
The prettyprint of the symbol table for a Fortran module is generated with:
Some subtle errors occur because the PIPS parser uses a fixed format. Columns 73 to 80 are ignored, but the parser may emit a warning if some characters are encountered in this comment field.
PARSER_WARN_FOR_COLUMNS_73_80 TRUE
PIPS has been initially developed to parse correct Fortran compliant programs only. Real applications use lots of ANSI extensions… and they are not always correct! To make sure that PIPS output is correct, the input code should be checked against ANSI extensions using property
CHECK_FORTRAN_SYNTAX_BEFORE_PIPS
(see Section 3.2) and the property below should be set to false.
PARSER_ACCEPT_ANSI_EXTENSIONS TRUE
Currently, this property is not used often enough in PIPS parser which let go many mistakes... as expected by real users!
PIPS has been developed to parse correct Fortran-77 compliant programs only. Array ranges are used to improve readability. They can be generated by PIPS prettyprinter. They are not parsed as correct input by default.
PARSER_ACCEPT_ARRAY_RANGE_EXTENSION FALSE
Each argument list at calls to a function or a subroutine is compared to the functional type of the callee. Turn this off if you need to support variable numbers of arguments or if you use overloading and do not want to hear about it. For instance, an IO routine can be used to write an array of integers or an array of reals or an array of complex if the length parameter is appropriate.
Since the functional typing is shaky, let’s turn it off by default!
PARSER_TYPE_CHECK_CALL_SITES FALSE
The PIPS implementation of Allen&Kennedy algorithm cannot cope with labeled DO loops because the loop, and hence its label, may be replicated if the loop is distributed. The parser can generate an extra CONTINUE statement to carry the label and produce a label-free loop. This is not the standard option because PIPS is designed to output code as close as possible to the user source code.
PARSER_SIMPLIFY_LABELLED_LOOPS FALSE
Most PIPS analyses work better if loop bounds are affine. It is sometimes possible to improve results for non-affine bounds by assigning the bound to an integer variables and by using this variable as bound.
PARSER_LINEARIZE_LOOP_BOUNDS FALSE
The entry construct can be seen as an early attempt at object-oriented programming. The same object can be processed by several function. The object is declared as a standard subroutine or function and entry points are placed in the executable code. The entry points have different sets of formal parameters, they may share some common pieces of code, they share the declared variables, especially the static ones.
The entry mechanism is dangerous because of the flow of control between entries. It is now obsolete and is not analyzed directly by PIPS. Instead each entry may be converted into a first class function or subroutine and static variables are gathered in a specific common. This is the default option. If the substitution is not acceptable, the property may be turned off and entries results in a parser error.
PARSER_SUBSTITUTE_ENTRIES TRUE
Alternate returns are put among the obsolete Fortran features by the Fortran 90 standard. It is possible (1) to refuse them (option ”NO”), or (2) to ignore them and to replace alternate returns by STOP (option ”STOP”), or (3) to substitute them by a semantically equivalent code based on return code values (option ”RC” or option ”HRC”). Option (2) is useful if the alternate returns are used to propagate error conditions. Option (3) is useful to understand the impact of the alternate returns on the control flow graph and to maintain the code semantics. Option ”RC” uses an additional parameter while option ”HRC” uses a set of PIPS run-time functions to hide the set and get of the return code which make declaration regeneration less useful. By default, the first option is selected and alternate returns are refused.
To produce an executable code, the declarations must be regenerated: see property PRETTYPRINT_ALL_DECLARATIONS 9.2.18.6 in Section 9.2.18.6. This is not necessary with option ”HRC”. Fewer new declarations are needed if variable PARSER_RETURN_CODE_VARIABLE 4.2.1.4 is implicitly integer because its first letter is in the I-N range.
With option (2), the code can still be executed if alternate returns are used only for errors and if no errors occur. It can also be analyzed to understand what the normal behavior is. For instance, OUT regions are more likely to be exact when exceptions and errors are ignored.
Formal and actual label variables are replaced by string variables to preserve the parameter ordre and as much source information as possible. See PARSER_FORMAL_LABEL_SUBSTITUTE_PREFIX 4.2.1.4 which is used to generate new variable names.
PARSER_SUBSTITUTE_ALTERNATE_RETURNS "NO"
PARSER_RETURN_CODE_VARIABLE "I_PIPS_RETURN_CODE_"
PARSER_FORMAL_LABEL_SUBSTITUTE_PREFIX "FORMAL_RETURN_LABEL_"
The internal representation can be hidden and the alternate returns can be prettyprinted at the call sites and modules declaration by turning on the following property:
PRETTYPRINT_REGENERATE_ALTERNATE_RETURNS FALSE
PRETTYPRINT_REGENERATED_LABEL "12345"
This is useful to link modules processed by PIPS with user modules using the alternate return constructs if alternate returns are replaced by STOP statement in the code processed by PIPS. The actual labels present in call sites may be dead code eliminated and a fake non-conflicting label must be used. It is defined by property PRETTYPRINT_REGENERATED_LABEL 4.2.1.4.
If all modules have been processed by PIPS, it is possible not to regenerate alternate returns and to use a code close to the internal representation. If they are regenerated in the call sites and module declaration, they are nevertheless not used by the code generated by PIPS which is consistent with the internal representation.
Here is a possible implementation of the two PIPS run-time subroutines required by the hidden return code (”HRC”) option:
subroutine SET_I_PIPS_RETURN_CODE_(irc)
common /PIPS_RETURN_CODE_COMMON/irc_shared
irc_shared = irc
end
subroutine GET_I_PIPS_RETURN_CODE_(irc)
common /PIPS_RETURN_CODE_COMMON/irc_shared
irc = irc_shared
end
Note that the subroutine names depend on the PARSER_RETURN_CODE_VARIABLE 4.2.1.4 property. They are generated by prefixing it with SET_ and GET_. There implementation is free. The common name used should not conflict with application common names. The ENTRY mechanism is not used because it would be desugared by PIPS anyway.
By default, assigned GO TO and ASSIGN statements are not accepted. These constructs are obsolete and will not be part of future Fortran standards.
However, it is possible to replace them automatically in a way similar to computed GO TO. Each ASSIGN statement is replaced by a standard integer assignment. The label is converted to its numerical value. When an assigned GO TO with its optional list of labels is encountered, it is transformed into a sequence of logical IF statement with appropriate tests and GO TO’s. According to Fortran 77 Standard, Section 11.3, Page 11-2, the control variable must be set to one of the labels in the optional list. Hence a STOP statement is generated to interrupt the execution in case this happens, but note that compilers such as SUN f77 and g77 do not check this condition at run-time (it is undecidable statically).
PARSER_SUBSTITUTE_ASSIGNED_GOTO FALSE
Assigned GO TO without the optional list of labels are not processed. In other words, PIPS make the optional list mandatory for substitution. It usually is quite easy to add manually the list of potential targets.
Also, ASSIGN statements cannot be used to define a FORMAT label. If the desugaring option is selected, an illegal program is produced by PIPS parser.
This property controls the processing of Fortran statement functions by text substitution in the parser. No other processing is available and the parser stops with an error message when a statement function declaration is encountered.
The default used to be not to perform this unchecked replacement, which might change the semantics of the program because type coercion is not enforced and actual parameters are not assigned to intermediate variables. However most statement functions do not require these extra-steps and it is legal to perform the textual substitution. For user convenience, the default option is textual substitution.
Note that the parser does not have enough information to check the validity of the transformation, but a warning is issued if legality is doubtful. If strange results are obtained when executing codes transformed with PIPS, his property should be set to false.
A better method would be to represent them somehow a local functions in the internal representation, but the implications for pipsmake and other issues are clearly not all foreseen…(Fabien Coelho).
PARSER_EXPAND_STATEMENT_FUNCTIONS TRUE
This parser takes a different Fortran file but applies the same processing as the previous parser. The Fortran file is the result of the preprocessing by the hpfc_filter 7.2.2.1 phase of the original file in order to extract the directives and switch them to a Fortran 77 parsable form. As another side-effect, this parser hides some callees from pipsmake. This callees are temporary functions used to encode HPF directives. Their call sites are removed from the code before requesting full analyses to PIPS. This parser is triggered automatically by the hpfc_close 7.2.2.5 phase when requested. It should never be selected or activated by hand.
A C file is seen in PIPS as a compilation unit, that contains all the objects declarations that are global to this file, and as many as module (function or procedure) definitions defined in this file.
Thus the compilation unit contains the file-global macros, the include statements, the local and global variable definitions, the type definitions, and the function declarations if any found in the C file.
When the PIPS workspace is created by PIPS preprocessor, each C file is preprocessed4 using for instance gcc -E5 and broken into a new which contains only the file-global variable declarations, the function declarations and the type definitions, and one C file for each C function defined in the initial C file.
The new compilation units must be parsed before the new files, containing each one exactly one function definition, can be parsed. The new compilation units are named like the initial file names but with a bang extension.
For example, considering a C file foo.c with 2 function definitions:
After preprocessing, it leads to a file foo.cpp_processed.c that is then split into a new foo!.cpp_processed.c compilation unit containing
and 2 module files containing the definitions of the 2 functions, a calc.c
and a main.c
Note that it is possible to have an empty compilation unit and no module file if the original file does not contain sensible C informations (such as an empty file containing only blank characters and so on).
The resource COMPILATION_UNIT.declarations produced by compilation_unit_parser is a special resource used to force the parsing of the new compilation unit before the parsing of its associated functions. It is in fact a hash table containing the file-global C keywords and typedef names defined in each compilation unit.
In fact phase compilation_unit_parser also produces parsed_code and callees resources for the compilation unit. This is done to work around the fact that rule c_parser was invoked on compilation units by later phases (in particular for the computation of initial preconditions), breaking the declarations of function prototypes. These two resources are not declared here because pipsmake gets confused between the different rules to compute parsed code : there is no simple way to distinguish between compilation units and modules at some times and handling them similarly at other times.
If you want to parse some C code using tpips, it is necessary to select the C parser with
A prettyprint of the symbol table for a C module can be generated with
PIPS analyses and transformations take advantage of a hierarchical control flow graph, which preserves structured part of code as such, and uses a control flow graph only when no syntactic representation is available (see [20]). The encoding of the relationship between structured and unstructured parts of code is explained elsewhere, mainly in the PIPS Internal Representation of Fortran and C code6 .
The controlizer 4.3 removes GOTO statements in the parsed code and generates a similar representation with small CFGs.
The hierarchical control flow graph built by the controlizer 4.3 is pretty crude. The partial control flow graphs, called unstructured statements, are derived from syntactic constructs. The control scope of an unstructured is the smallest enclosing structured construct, whether a loop, a test or a sequence. Thus some statements, which might be seen as part of structured code, end up as nodes of an unstructured.
Note that sequences of statements are identified as such by controlizer 4.3. Each of them appears as a unique node.
Also, useless CONTINUE statements may be added as provisional landing pads and not removed. The exit node should never have successors but this may happen after some PIPS function calls. The exit node, as well as several other nodes, also may be unreachable. After clean up, there should be no unreachable node or the only unreachable node should be the exit node. Function unspaghettify 8.3.3.1 (see Section 8.3.3.1) is applied by default to clean up and to reduce the control flow graphs after controlizer 4.3.
The GOTO statements are transformed in arcs but also in CONTINUE statements to preserve as many user comments as possible.
The top statement of a module returned by the controlizer 4.3 used to contain always an unstructured instruction with only one node. Several phases in PIPS assumed that this always is the case, although other program transformations may well return any kind of top statement, most likely a block. This is no longer true. The top statement of a module may contain any kind of instruction.
Control restructuring eliminates empty sequences but as empty true or false branch of structured IF. This semantic property of PIPS Internal Representation of Fortran and C code7 is enforced by libraries effects, regions, hpfc, effects-generic.
WARN_ABOUT_EMPTY_SEQUENCES FALSE
By unsetting this property unspaghettify 8.3.3.1 is not applied implicitly in the controlizer phase.
UNSPAGHETTIFY_IN_CONTROLIZER TRUE
The next property is used to convert C for loops into C while loops. The purpose is to speed up the re-use of Fortran analyses and transformation for C code. This property is set to false by default and should ultimately disappear.
FOR_TO_WHILE_LOOP_IN_CONTROLIZER FALSE
The next property is used to convert C for loops into C do loops when syntactically possible. The conversion is not safe because the effect of the loop body on the loop index is not checked. The purpose is to speed up the re-use of Fortran analyses and transformation for C code. This property is set to false by default and should disappear soon.
FOR_TO_DO_LOOP_IN_CONTROLIZER FALSE
This can also explicitly applied by calling the phase described in § 8.3.3.4.
To able deeper code transformation, FORMATs can be gathered at the very beginning of the code or at the very end according to the following options in the unspaghettify or control restructuring phase.
GATHER_FORMATS_AT_BEGINNING FALSE
GATHER_FORMATS_AT_END FALSE
To display the statistics about cleaning-up sequences and removing useless CONTINUE or empty statement.
CLEAN_UP_SEQUENCES_DISPLAY_STATISTICS FALSE
There is a trade-off between keeping the comments associated to labels and goto and the cleaning that can be do on the control graph.
By default, do not fuse empty control nodes that have labels or comments:
FUSE_CONTROL_NODES_WITH_COMMENTS_OR_LABEL FALSE
Although this phases should be spread elsewhere in this manual, we have put some pedagogical phases useful to jump into PIPS first.
A phase that displays, in debug mode, statements matching an XPath expression on the internal representation:
Prepends a comment to the first statement of a module. Useful to apply post-processing after PIPS.
The comment to add is selected by this property:
PREPEND_COMMENT "/*␣This␣comment␣is␣added␣by␣PREPEND_COMMENT␣phase␣*/"
This phase inserts a call to function MY_TRACK just before the first statement of a module. Useful as a pedagogical example to explore the internal representation and Newgen.
The called function could be defined by this property:
PREPEND_CALL "MY_TRACK"
but it is not.
Analyses encompass the computations of call graphs, the memory effects, reductions, use-def chains, dependence graphs, interprocedural checks (flinter), semantics information (transformers and preconditions), continuations, complexities, array regions, dynamic aliases and complementary regions.
All lists of callees are needed to build the global lists of callers for each module. The callers and callees lists are used by pipsmake to control top-down and bottom-up analyses. The call graph is assumed to be a DAG, i.e. no recursive cycle exists, but it is not necessarily connected.
The height of a module can be used to schedule bottom-up analyses. It is zero if the module has no callees. Else, it is the maximal height of the callees plus one.
The depth of a module can be used to schedule top-down analyses. It is zero if the module has no callers. Else, it it the maximal depth of the callers plus one.
The following pass generates a uDrawGraph1 version of the callgraph. Its quite partial since it should rely on an hypothetical all callees, direct and indirect, resource.
The data structures used to represent memory effects and their computation are described in [21]. Another description is available on line, in PIPS Internal Representation of Fortran and C code2 Technical Report.
Note that the standard name in the Dragon book is likely ot be Gen and Kill sets, but PIPS uses the more general concept of effect developped by P. Jouvelot and D. Gifford.
The proper effects of a statement basically are a list of variables that may be read or written by the statement. They are used to build use-def chains and then the dependence graph.
Proper means that the effects of a compound statement do not include the effects of lower level statements. For instance, the body of a loop, true and false branches of a test statement, control nodes in an unstructured statement ... are ignored to compute the proper effects of a loop, a test or an unstructured.
Summary effects (see Section 6.2.4) of a called module are used to compute the proper effects at the corresponding call sites. They are translated from the callee’s scope into the caller’s scope. The translation is based on the actual-to-formal binding. If too many actual arguments are defined, a user warning is issued but the processing goes on because a simple semantics is available: ignore useless actual arguments. If too few actual arguments are provided, a user error is issued because the effects of the call are not defined. It would be necessary to assume either a general READ or a general WRITE for each read or written formal parameter with no actual counterpartThis is implemented for calls by reference as in Fortran. The call-by-value implementation is on-going. See Amira Mensi..
Variables private to loop are handled like regular variable.
To be continued...by whom?
Cumulated effects of statements are also lists of read or written variables. Cumulated means that the effects of a compound statement, do loop, test or unstructured, include the effects of the lower level statements such as a loop body or a test branch.
Cumulated memory effects do not take into account local effects on private variables, such as variables declared in C blocks or in Fortran parallel DO loops.
Summary data flow information is the simplest interprocedural information needed to take procedure into account in a parallelizer. They were introduced in Parafrase.
The summary_effects 6.2.4 of a module are the cumulated effects of its top level statement, but effects on local dynamic variables are ignored (because they cannot be observed by the callers3 ) and subscript expressions of remaining effects are eliminated.
IN and OUT memory effects of a statement s are memory locations whose input values are used by statement s or whose output values are used by statement s continuation. Variables allocated in the statement are not part of the IN or OUT effects. Variables defined before they are used ar not part of the IN effects. OUT effects require an interprocedural analysis4
The concept of proper references is not yet clearly defined. The original idea is to keep track of the actual objects of newgen domain reference used in the program representation of the current statement, while retaining if they correspond to a read or a write of the corresponding memory locations. Proper references are represented as effects.
For C programs, where memory accesses are not necessarily represented by objects of newgen domain reference, the semantics of this analysis is unclear.
Cumulated references gather proper references over the program code, without taking into account the modification of memory stores by the program execution.
FC: I should implement real summary references?
Print SDFI just after computation:
EFFECTS_PRINT_SDFI TRUE
Filter this variable in phase filter_proper_effects 6.2.2.
EFFECTS_FILTER_ON_VARIABLE ""
When set to TRUE, EFFECTS_POINTER_MODIFICATION_CHECKING 6.2.7 enables pointer modification checking during the computation of cumulated effects and/or RW regions. Since this is still at experimentation level, it’s default value is FALSE. This property should disappear when pointer modification analyses are more mature.
EFFECTS_POINTER_MODIFICATION_CHECKING FALSE
The proper reductions are computed from a code.
The cumulated reductions propagate the reductions in the code, upwards.
This pass summarizes the reductions candidates found in a module for export to its callers. The summary effects should be used to restrict attention to variable of interest in the translation?
Some possible (simple) transformations could be added to the code to mark reductions in loops, for latter use in the parallelization.
The following is NOT implemented. Anyway, should the cumulated_reductions be simply used by the prettyprinter instead?
Use-def and def-use chains are a standard data structure in optimizing compilers [1]. These chains are used as a first approximation of the dependence graph. Chains based on regions (see Section 6.11) are more effective for interprocedural parallelization.
If chains based on regions have been selected, the simplest dependence test must be used because regions carry more information than any kind of preconditions. Preconditions and loop bound information already are included in the region predicate.
The algorithm used to compute use-def chains is original because it is based on PIPS hierarchical control flow graph and not on a unique control flow graph.
This algorithm generates inexistent dependencies on loop indices. These dependence arcs appear between DO loop headers and implicit DO loops in IO statements, or between one DO loop header and unrelated DO loop bound expressions using that index variable. It is easy to spot the problem because loop indices are not privatized. A prettyprint option,
PRETTYPRINT_ALL_PRIVATE_VARIABLES 9.2.18.5.1
must be set to true to see if the loop index is privatized or not). The problem disappears when some loop indices are renamed.
The problem is due to the internal representation of DO loops: PIPS has no way to distinguish between initialization effects and increment effects. They have to be merged as proper loop effects. To reduce the problem, proper effects of DO loops do not include the index read effect due to the loop incrementation.
Artificial arcs are added to... (Pierre Jouvelot, help!).
Such chains are required for effective interprocedural parallelization. The dependence graph is annotated with proper regions, to avoid inaccuracy due to summarization at simple statement level (see Section 6.11).
Region-based chains are only compatible with the Rice Fast Dependence Graph option (see Section 6.5.1) which has been extended to deal with them5 . Other dependence tests do not use region descriptors (their convex system), because they cannot improve the Rice Fast Dependence test based on regions.
Beware : this option is for experimental use only; resulting parallel code may not be equivalent to input code (see the explanations below).
When in_out_regions_chains 6.4.4 is selected, IN and OUT regions (see Sections 6.11.5 and 6.11.8) are used at call sites instead of READ and WRITE regions. For all other statements, usual READ and WRITE regions are used.
As a consequence, arrays and scalars which could be declared as local in callees, but are exposed to callers because they are statically allocated or are formal parameters, are ignored, increasing the opportunities to detect parallel loops. But as the program transformation which consists in privatizing variables in modules is not yet implemented in PIPS, the code resulting from the parallelization with in_out_regions_chains 6.4.4 may not be equivalent to the original sequential code.
As for region-based chains (see Section 6.4.3), the simplest dependence test should be selected for best results.
The following loop in Subroutine inout cannot be parallelized legally because Subroutine foo uses a static variable, y. However, PIPS will display this loop as (potentially) parallel if the in_out option is selected for use-def chain computation. Remember that IN/OUT regions require MUST regions to obtain interesting results (see Section 6.11.5).
subroutine inout(a,n)
real a(n)
do i = 1, n
call foo(a(i))
enddo
end
subroutine foo(x)
save y
y = x
x = x + y
end
It is possible to put use-use dependence arcs in the dependence graph. This is useful for estimation of cache memory traffic and of communication for distributed memory machine (e.g. you can parallelize only communication free loops). Beware of use-use dependence on scalar variables. You might expect scalars to be broadcasted and/or replicated on each processor but they are not handled that way by the parallelization process unless you manage to have them declared private with respect to all enclosing loops.
This feature is not supported by PIPS user interfaces. Results may be hard to interpret. It is useful to print the dependence graph.
KEEP_READ_READ_DEPENDENCE FALSE
It is possible to mask effects on local variables in loop bodies. This is dangerous with current version of Allen & Kennedy which assumes that all the edges are present, the ones on private variables being partially ignored but for loop distribution. In other words, this property should always be set to false.
CHAINS_MASK_EFFECTS FALSE
It also is possible to keep only true data-flow (Def – Use) dependences in the dependence graph. This was an attempt at mimicking the effect of direct dependence analysis and at avoiding privatization. However, direct dependence analysis is not implemented in the standard tests and spurious def-use dependence arcs are taken into account.
CHAINS_DATAFLOW_DEPENDENCE_ONLY FALSE
These last two properties are not consistent with PIPS current development (1995/96). It is assumed that all dependence arcs are present in the dependence graph. Phases using the latter should be able to filter out irrelevant arcs, e.g. pertaining to privatized variables.
The default disambiguation test is based on variables names. Array and scalar variables are handled in the same way. However it is possible to refine the chain graph by using constant subscript expressions.
CHAINS_DISAMBIGUATE_CONSTANT_SUBSCRIPTS FALSE
The dependence graph is used primarily by the parallelization algorithms. A dependence graph is a refinement of use-def chains (Section 6.4). It is location-based and not value-based.
There are several ways to compute a dependence graph. Some of them are fast (Banerjee’s one for instance) but provide poor results, others might be slower (Rémi Triolet’s one for instance) but produce better results.
Three different dependence tests are available, all based on Fourier-Motzkin elimination improved with a heuristics for the integer domain. The fast version uses subscript expressions only (unless regions were used to compute use-def chains, in which case regions are used instead). The full version uses subscript expressions and loop bounds. The semantics version uses subscript expressions and preconditions (see 6.8).
Note that for interprocedural parallelization precise array regions only are used by the fast dependence test if the proper kind of use-def chains has been previously selected (see Section 6.4.3).
There are several kinds of dependence graphs. Most of them share the same overall data structure: a graph with labels on arcs and vertices. usually, the main differences are in the labels that decorate arcs; for instance, Kennedy’s algorithm requires dependence levels (which loop actually creates the dependence) while algorithms originated from CSRD prefer DDVs (relations between loop indices when the dependence occurs). Dependence cones introduced in [22, 23, 24, 25] are even more precise [4].
The computations of dependence level and dependence cone [41] are both implemented in PIPS. DDV’s are not computed. Currently, only dependence levels are exploited by parallelization algorithms.
The dependence graph can be printed with or without filters (see Section 9.8). The standard dependence graph includes all arcs taken into account by the parallelization process (Allen & Kennedy [2]), except those that are due to scalar private variables and that impact the distribution process only. The loop carried dependence graph does not include intra-iteration dependences and is a good basis for iteration scheduling. The whole graph includes all arcs, but input dependence arcs.
It is possible to gather some statistics about dependences by turning on property RICEDG_PROVIDE_STATISTICS 6.5.6.2 (more details in the properties). A Shell script from PIPS utilities, print-dg-statistics, can be used in combination to extract the most relevant information for a whole program.
During the parallelization phases, is is possible to ignore arcs related to states of the libc, such as the heap memory management, because thread-safe libraries do perform the updates within critical sections. But these arcs are part of the use-def chains and of the dependence graph. If they were removed instead of being ignored, use-def elimination would remove all free statements.
The main contributors for the design and development of dependence analysis are Rémi Triolet, François Irigoin and Yi-qing Yang [41]. The code was improved by Corinne Ancourt and Béatrice Creusillet.
Use subscript expressions only (unless regions were used to compute use-def chains, in which case regions are used instead). rice_regions_dependence_graph is a synonym for this rule, but emits a warning if region_chains is not selected.
Use subscript expressions and loop bounds.
Uses subscript expressions and preconditions (see 6.8).
Synonym for rice_fast_dependence_graph, except that it emits a warning when region_chains is not selected.
This property seems to be now obsolete. The dependence test choice is now controlled directly and only by rules in pipsmake. The procedures called by these rules may use this property. Anyway, it is useless to set it manually.
DEPENDENCE_TEST "full"
Provide the following counts during the dependence test. There are three parts: numbers of dependencies and independences (fields 1-10), dimensions of referenced arrays and dependence natures (fields 11-25) and the same information for constant dependencies (fields 26-40), decomposition of the dependence test in elementary steps (fields 41-49), use and complexity of Fourier-Motzkin’s pair-wise elimination (fields 50, 51 and 52-68).
Note: field 1 minus field 2 is the number of array dependencies.
Note: field 5 must be greater or equal to field 4.
Note: the sum of fields 5 to 8 and field 2 equals field 1
Note: the sum of fields 11 to 25 should be equal to the sum of field 9 and 2 minus field 1.
Note: the fields 26 to 40 must be less than or equal to the corresponding fields 11 to 25
Note: the sum of fields 41 to 49 equals field 2
The results are stored in the current workspace in MODULE.resulttestfast, MODULE.resultesttestfull, or MODULE.resulttestseman according to the test selected.
RICEDG_PROVIDE_STATISTICS FALSE
Provide the statistics above and count all array reference pairs including these involved in call statement.
RICEDG_STATISTICS_ALL_ARRAYS FALSE
Only take into account true flow dependences (Def – Use) during the computation of SCC? Note that this is different from the CHAINS_DATAFLOW_DEPENDENCE_ONLY option which doesn’t compute the whole graph. Warning: this option potentially yields incorrect parallel code.
RICE_DATAFLOW_DEPENDENCE_ONLY FALSE
Here are the properties used to control the printing of dependence graphs in a file called module_name.dg. These properties should not be used explicitly because they are set implicitly by the different print-out procedures available in pipsmake.rc. However, not all combinations are available from pipsmake.rc.
PRINT_DEPENDENCE_GRAPH FALSE
To print the dependence graph without the dependences on privatized variables
PRINT_DEPENDENCE_GRAPH_WITHOUT_PRIVATIZED_DEPS FALSE
To print the dependence graph without the non-loop-carried dependences:
PRINT_DEPENDENCE_GRAPH_WITHOUT_NOLOOPCARRIED_DEPS FALSE
To print the dependence graph with the dependence cones:
PRINT_DEPENDENCE_GRAPH_WITH_DEPENDENCE_CONES FALSE
To print the dependence graph in a computer friendly format defined by Deborah Whitfield (SRU):
PRINT_DEPENDENCE_GRAPH_USING_SRU_FORMAT FALSE
The default option is to compute the dependence graph only for loops which can be parallelized using Allen & Kennedy algorithm. However it is possible to compute the dependences in all cases, even for loop containing test, goto, etc... by setting this option to TRUE.
Of course, this information is not used by the parallelization phase which is restricted to loops meeting the A&K conditions. By the way, the hierarchical control flow graph is not exploited either by the parallelization phase.
COMPUTE_ALL_DEPENDENCES FALSE
Function flinter 6.6 performs some intra and interprocedural checks about formal/actual argument pairs, use of COMMONs,... It was developed by Laurent Aniort and Fabien Coelho. Ronan Keryell added the uninitialized variable checking.
In the past, flinter 6.6 used to require MODULE.summary_effects to check the parameter passing modes and to make sure that no module would attempt an assignment on an expression. However, this kind of bug is detected by the effect analysis… which was required by flinter.
Resource CALLEES.code is not explicitly required but it produces the global symbols which function flinter 6.6 needs to check parameter lists.
Computes statistics about loops in module. It computes the number of perfectly and imperfectly nested loops and gives their depths. And it gives the number of nested loops which we can treat with our algorithm.
PIPS semantics analysis targets integer scalar variables. It is a two-pass process, with a bottom-up pass computing transformers 6.8.1, and a top-down pass propagating preconditions 6.8.5. Transformers and preconditions are specially powerful case of return and jump functions [13]. They abstract relations between program states with polyhedra and encompass most standard interprocedural constant propagations as well as most interval analyses. It is a powerful relational symbolic analysis.
Unlike [19] their computations are based on PIPS Hierarchical Control Flow Graph and on syntactic constructs instead of a standard flow graph. The best presentation of this part of PIPS is in [29].
A similar analysis is available in Parafrase-2 []. It handles polynomial equations between scalar integer variables. SUIF [] also performs some kind of semantics analysis.
The semantics analysis part of PIPS was designed and developed by François Irigoin.
RK: The following is hard to read without any example for someone that knows nothing about PIPS... FI: do you want to have everything in this documentation?
A transformer is an approximate relation between the symbolic initial values of scalar variables and their values after the execution of a statement, simple or compound (see [21] and [29]). In abstract interpretation terminology, a transformer is an abstract command linking the input abstract state of a statement and its output abstract state.
By default, only integer scalar variables are analyzed, but properties can be set to handle boolean, string and floating point scalar variables6 : SEMANTICS_ANALYZE_SCALAR_INTEGER_VARIABLES 6.8.11.1 SEMANTICS_ANALYZE_SCALAR_BOOLEAN_VARIABLES 6.8.11.1 SEMANTICS_ANALYZE_SCALAR_STRING_VARIABLES 6.8.11.1 SEMANTICS_ANALYZE_SCALAR_FLOAT_VARIABLES 6.8.11.1 SEMANTICS_ANALYZE_SCALAR_COMPLEX_VARIABLES 6.8.11.1
Transformers can be computed intraprocedurally by looking at each function independently or they can be computed interprocedurally starting with the leaves of the call tree7 .
Intraprocedural algorithms use cumulated_effects 6.2.3 to handle procedure calls correctly. In some respect, they are interprocedural since call statements are accepted. Interprocedural algorithms use the summary_transformer 6.8.2 of the called procedures.
Fast algorithms use a very primitive non-iterative transitive closure algorithm (two possible versions: flow sensitive or flow insensitive). Full algorithms use a transitive closure algorithm based on vector subspace (i.e. à la Karr [32]) or one based on the discrete derivatives [?]. The iterative fix point algorithm for transformers (i.e. Halbwachs/Cousot [19] is implemented but not used because the results obtained with transitive closure algorithms are faster and up-to-now sufficient.RK: Still true? FI: To be deleted? Properties are set to select the transitive closure algorithm used (see the Semantics section of the property documentation []).
SEMANTICS_FIX_POINT_OPERATOR 6.8.11.6
Additional information, such as array declarations and array references, can be used to improve transformers. See the property documentation for:
SEMANTICS_TRUST_ARRAY_DECLARATIONS 6.8.11.2 SEMANTICS_TRUST_ARRAY_REFERENCES 6.8.11.2
Within one procedure, the transformers can be computed in forward mode, using precondition information gathered along. Transformers can also be recomputed once the preconditions are available. In both cases, more precise transformers are obtained because the statement can be better modelized using precondition information. For instance, a non-linear expression can turn out to be linear because the values of some variables are numerically known and can be used to simplify the initial expression. See properties:
SEMANTICS_RECOMPUTE_EXPRESSION_TRANSFORMERS 6.8.11.4
SEMANTICS_COMPUTE_TRANSFORMERS_IN_CONTEXT 6.8.11.4
SEMANTICS_RECOMPUTE_FIX_POINTS_WITH_PRECONDITIONS 6.8.11.6
and phase refine_transformers 6.8.1.6.
Unstructured control flow graphs can lead to very long transformer computations, whose results are usually not interesting. Their sizes are limited by two properties:
SEMANTICS_MAX_CFG_SIZE2 6.8.11.3 SEMANTICS_MAX_CFG_SIZE1 6.8.11.3
discussed in the property documentation.
Default value were set in the early nineties to obtain results fast enough for live demonstrations. They have not been changed to preserve the non-regression tests. However since 2005, processors are fast enough to use the most precise options in all cases.
A transformer map contains a transformer for each statement of a module. It is a mapping from statements to transformers (type statement_mapping, which is not a NewGen file). Transformers maps are stored on and retrieved from disk by pipsdbm.
Build the fast intraprocedural transformers.
Build the improved intraprocedural transformers.
Build the fast interprocedural transformers.
Build the improved interprocedural transformers (This should be used as default option.).
Rebuild the interprocedural transformers using interprocedural preconditions. Intraprocedural preconditions are also used to refine all transformers.
A summary transformer is an interprocedural version of the module statement transformer, obtained by eliminating dynamic local, a.k.a. stack allocated, variables. The filtering is based on the module summary effects. Note: each module has a UNIQUE top-level statement.
A summary_transformer 6.8.2 is of Newgen type transformer.
All DATA initializations contribute to the global initial state of the program. The contribution of each module is computed independently. Note that variables statically initialized behave as static variables and are preserved between calls according to Fortran standard. The module initial states are abstracted by an initial precondition based on integer scalar variables only.
Note: To be extended to handle C code. To be extended to handle properly unknown modules.
All initial preconditions, including the initial precondition for the main, are combined to define the program precondition which is an abstraction of the program initial state.
The program precondition can only be used for the initial state of the main procedure. Although it appears below for all interprocedural analyses and it always is computed, it only is used when a main procedure is available.
A summary precondition is of type ”transformer”, but the argument list must be empty as it is a simple predicate on the initial state. So in fact it is a state predicate.
The intraprocedural summary precondition uses DATA statement for the main module and is the TRUE constant for all other modules.
Interprocedural summary preconditions can be requested instead. They are not described in the same section in order to introduce the summary precondition resource at the right place in pipsmake.rc.
No menu is declared to select either intra- or interprocedural summary preconditions.
A precondition for a statement s in a module m is a predicate true for every state reachable from the initial state of m, in which s is executed. A precondition is of NewGen type ”transformer” (see PIPS Internal Representation of Fortran and C code8 ) and preconditions is of type statement_mapping.
Option preconditions_intra 6.8.5.2 associates a precondition to each statement, assuming that no information is available at the module entry point.
Inter-procedural preconditions may be computed with intra-procedural transformers but the benefit is not clear. Intra-procedural preconditions may be computed with inter-procedural transformers. This is faster that a full interprocedural analysis because there is no need for a top-down propagation of summary preconditions. This is compatible with code transformations like partial_eval 8.4.2, suppress_dead_code 8.3.1 and use_def_elimination 8.3.2.
Since these two options for transformer and precondition computations are independent and that transformers_inter_full 6.8.1.5 and preconditions_inter_full 6.8.5.4 must be both (independently) selected to obtain the best possible results. These two options are recommended.
Only build the preconditions in a module without any interprocedural propagation. The fast version uses a fast but crude approximation of preconditions for unstructured code.
Option preconditions_inter_fast 6.8.5.3 uses the module own precondition derived from its callers as initial state value and propagates it downwards in the module statement.
The fast versions use no fix-point operations for loops.
Option preconditions_inter_full 6.8.5.4 uses the module own precondition derived from its callers as initial state value and propagates it downwards in the module statement.
The full versions use fix-point operations for loops.
By default, summary preconditions are computed intraprocedurally. The interprocedural option must be explicitly activated.
An interprocedural summary precondition for a module is derived from all its call sites. Of course, preconditions must be known for all its callers’ statements. The summary precondition is the convex hull of all call sites preconditions, translated into a proper environment which is not necessarily the module’s frame. Because of invisible global and static variables and aliasing, it is difficult for a caller to know which variables might be used by the caller to represent a given memory location. To avoid the problem, the current summary precondition is always translated into the caller’s frame. So each module must first translate its summary precondition, when receiving it from the resource manager (pipsdbm) before using it.
Note: the previous algorithm was based on a on-the-fly reduction by convex hull. Each time a call site was encountered while computing a module preconditions, the callee’s summary precondition was updated. This old scheme was more efficient but not compatible with program transformations because it was impossible to know when the summary preconditions of the modules had to be reset to the infeasible (a.k.a. empty) precondition.
An infeasible precondition means that the module is never called although a main is present in the workspace. If no main module is available, a TRUE precondition is generated. Note that, in both cases, the impact of static initializations propagated by link edition is taken into account although this is prohibited by the Fortran Standard which requires a BLOCKDATA construct for such initializations. In other words, a module which is never called has an impact on the program execution and its declarations should not be destroyed.
The following rule is obsolete. It is context sensitive and its results depends on the history of commands performed on the workspace.
Total preconditions are interesting to optimize the nominal behavior of a terminating application. It is assumed that the application ends in the main procedure. All other exits, aborts or stops, explicit or implicit such as buffer overflows and zero divide and null pointer dereferencing, are considered exceptions. This also applies at the module level. Modules nominally return. Other control flows are considered exceptions. Non-terminating modules have an empty total precondition9 . The standard preconditions can be refined by anding with the total preconditions to get information about the nominal behavior. Similar sources of increased accuracy are the array declarations and the array references, which can be exploited directly with properties described in section 6.8.11.2. These two properties should be set to true whenever possible.
Hence, a total precondition for a statement s in a module m is a predicate true for every state from which the final state of m, in which s is executed, is reached. It is an over-approximation of the theoretical total precondition. So, if the predicate is false, the final control state cannot be reached. A total precondition is of NewGen type ”transformer” (see PIPS Internal Representation of Fortran and C code10 ) and total_preconditions is of type statement_mapping.
The relationship with continuations (see Section 6.9) is not clear. Total preconditions should be more general but no must version exist.
Option total_preconditions_intra 6.8.7.2 associates a precondition to each statement, assuming that no information is available at the module return point.
Inter-procedural total preconditions may be computed with intra-procedural transformers but the benefit is not clear. Intra-procedural total preconditions may be computed with inter-procedural transformers. This is faster than a full interprocedural analysis because there is no need for a top-down propagation of summary total postconditions.
Since these two options for transformer and total precondition computations are independent, transformers_inter_full 6.8.1.5 and total_preconditions_inter 6.8.7.3 must be both (independently) selected to obtain the best possible results.
Status: This is a set of experimental passes. The intraprocedural part is implemented. The interprocedural part is not implemented yet, waiting for an expressed practical interest. Neither C for loops nor repeat loops are supported.
Only build the total preconditions in a module without any interprocedural propagation. No specific condition must be met when reaching a RETURN statement.
Option total_preconditions_inter 6.8.7.3 uses the module own total postcondition derived from its callers as final state value and propagates it backwards in the module statement. This total module postcondition must be true when the RETURN statement is reached.
The program postcondition is only used for the main module.
The summary total precondition of a module is the total precondition of its statement limited to information observable by callers, just like a summary transformer (see Section 6.8.2).
A summary total precondition is of type ”transformer”.
A final postcondition for a module is derived from all its call sites. Of course, total postconditions must be known for all its callers’ statements. The summary total postcondition is the convex hull of all call sites total postconditions, translated into a proper environment which is not necessarily the module’s frame. Because of invisible global and static variables and aliasing, it is difficult for a caller to know which variables might be used by the caller to represent a given memory location. To avoid the problem, the current summary total postcondition is always translated into the caller’s frame. So each module must first translate its summary total postcondition, when receiving it from the resource manager (pipsdbm) before using it.
A summary total postcondition is of type ”transformer”.
The program postcondition cannot be derived from the source code. It should be defined explicitly by the user. By default, the predicate is always true. But you might want some variables to have specific values, e.g. KMAX==1, or signs,KMAX>1 or relationships KMAX>JMAX.
By default, the semantic analysis is restricted to scalar integer variables as they are key variables to understand scientific code behavior. However it is possible to analyze scalar variables with other data types. Fortran LOGICAL variables are represented as 0/1 integers. Character string constants and floating point constants are represented as undefined values.
The analysis is thus limited to constant propagation for character strings and floating point values whereas integer and boolean variables are processed with a relational analysis.
Character string constants of fixed maximal length could be translated into integers but the benefit is not yet assessed because they are not much used in the benchmark and commercial applications we have studied. The risk is to increase significantly the number of overflows encountered during the analysis.
SEMANTICS_ANALYZE_SCALAR_INTEGER_VARIABLES TRUE
SEMANTICS_ANALYZE_SCALAR_BOOLEAN_VARIABLES FALSE
SEMANTICS_ANALYZE_SCALAR_STRING_VARIABLES FALSE
SEMANTICS_ANALYZE_SCALAR_FLOAT_VARIABLES FALSE
SEMANTICS_ANALYZE_SCALAR_COMPLEX_VARIABLES FALSE
For every module, array declaration are assumed to be correct with respect to the standard: the upper bound must be greater than or equal to the lower bound. When implicit, the lower bound is one. The star upper bound is neglected.
This property is turned off by default because it might slow down PIPS quite a lot without adding any useful information because loop bounds are usually different from array bounds.
SEMANTICS_TRUST_ARRAY_DECLARATIONS FALSE
For every module, array references are assumed to be correct with respect to the declarations: the subscript expressions must have values lower than or equal to the upper bound and greater than or equal to the lower bound.
This property is turned off by default because it might slow down PIPS quite a lot without adding any useful information.
SEMANTICS_TRUST_ARRAY_REFERENCES FALSE
Perform “meet” operations for semantics analysis. This property is managed by pipsmake which often sets it to TRUE. See comments in pipsmake documentation to turn off convex hull operations for a module or more if they last too long.
SEMANTICS_FLOW_SENSITIVE FALSE
Complex control flow graph may require excessive computation resources. This may happen when analyzing a parser for instance.
SEMANTICS_ANALYZE_UNSTRUCTURED TRUE
To reduce execution time, this property is complemented with a heuristics to turn off the analysis of very complex unstructured.
If the control flow graph counts more than SEMANTICS_MAX_CFG_SIZE1 6.8.11.3 vertices, use effects only.
SEMANTICS_MAX_CFG_SIZE2 20
If the control flow graph counts more than SEMANTICS_MAX_CFG_SIZE1 6.8.11.3 but less than SEMANTICS_MAX_CFG_SIZE2 6.8.11.3 vertices, perform the convex hull of its elementary transformers and take the fixpoint of it. Note that SEMANTICS_MAX_CFG_SIZE2 6.8.11.3 is assumed to be greater than or equal to SEMANTICS_MAX_CFG_SIZE1 6.8.11.3.
SEMANTICS_MAX_CFG_SIZE1 20
Without preconditions, transformers can be precise only for affine expressions. Approximate transformers can sometimes be derived for other expressions, involving for instance products of variables or divisions.
However, a precondition of an expression can be used to refine the approximation. For instance, some non-linear expressions can become affine because some of the variables have constant values, and some non-linear expressions can be better approximated because the variables signs or ranges are known.
To be backward compatible and to be conservative for PIPS execution time, the default value is false.
Not implemented yet.
SEMANTICS_RECOMPUTE_EXPRESSION_TRANSFORMERS FALSE
Intraprocedural preconditions can be computed at the same time as transformers and used to improve the accuracy of expression and statement transformers. Non-linear expressions can sometimes have linear approximations over the subset of all possible stores defined by a precondition. In the same way, the number of convex hulls can be reduced if a test branch is never used or if a loop is always entered.
SEMANTICS_COMPUTE_TRANSFORMERS_IN_CONTEXT FALSE
The default value is false for reverse compatibility and for speed.
To be refined later; basically, use callee’s transformers instead of callee’s effects when computing transformers bottom-up in the call graph; when going top-down with preconditions, should we care about unique call site and/or perform meet operation on call site preconditions ?
SEMANTICS_INTERPROCEDURAL FALSE
This property is used internally and is not user selectable.
CPU time and memory space are cheap enough to compute loop fix points for transformers. This property implies SEMANTICS_FLOW_SENSITIVE 6.8.11.3 and is not user-selectable.
SEMANTICS_FIX_POINT FALSE
The default fix point operator, called transfer, is good for induction variables but it is not good for all kinds of code. The default fix point operator is based on the transition function associated to a loop body. A computation of eigenvectors for eigenvalue 1 is used to detect loop invariants. This fails when no transition function but only a transition relation is available. Only equations can be found.
The second fix point operator, called pattern, is based on a pattern matching of elementary equations and inequalities of the loop body transformer. Obvious invariants are detected. This fix point operator is not better than the previous one for induction variables but it can detect invariant equations and inequalities.
A third fix point operator, called derivative, is based on finite differences. It was developed to handled DO loops desugared into WHILE loops as well as standard DO loops. The loop body transformer on variable values is projected onto their finite differences. Invariants, both equations and inequalities, are deduced directly from the constraints on the differences and after integration. This third fix point operator should be able to find at least as many invariants as the two previous one, but at least some inequalities are missed because of the technique used. For instance, constraints on a flip-flop variable can be missed. Unlike Cousot-Halbwachs fix point (see below), it does not use Chernikova steps and it should not slow down analyses.
This property is user selectable and its default value is transfer. The default value is the only one which has been seriously validated.
SEMANTICS_FIX_POINT_OPERATOR "transfer"
The property SEMANTICS_PATTERN_MATCHING_FIX_POINT has been removed and replaced by option pattern of the previous property.
This property was defined to select one of Cousot-Halbwachs’s heuristics and to compute fix points with inequalities and equalities for loops. These heuristics could be used to compute fix points for transformers and/or preconditions. This option implies SEMANTICS_FIX_POINT 6.8.11.6 and SEMANTICS_FLOW_SENSITIVE 6.8.11.3. It has not been implemented yet in PIPS11 because its accuracy has not yet been required, but is now badly named because there is no direct link between inequality and Halbwachs. Its default value is false and it is not user selectable.
SEMANTICS_INEQUALITY_INVARIANT FALSE
Because of convexity, some fix points may be improved by using some of the information carried by the preconditions. Hence, it may be profitable to recompute loop fix point transformer when preconditions are being computed.
The default value is false because this option slows down PIPS and does not seem to add much useful information in general.
SEMANTICS_RECOMPUTE_FIX_POINTS_WITH_PRECONDITIONS FALSE
Preconditions reflect by default all knowledge gathered about the current state (i.e. store). However, it is possible to restrict the information to variables actually read or written, directly or indirectly, by the statement following the precondition.
SEMANTICS_FILTERED_PRECONDITIONS FALSE
Output semantics results on stdout
SEMANTICS_STDOUT FALSE
Debug level for semantics used to be controlled by a property. A Shell variable, SEMANTICS_DEBUG_LEVEL, is used instead.
Continuation conditions are attached to each statement. They represent the conditions under which the program will not stop in this statement. Under- and over-approximations of these conditions are computed.
Complexities are symbolic approximations of the execution times of statements. They are computed interprocedurally and based on polynomial approximations of execution times. Non-polynomial execution times are represented by unknown variables which are not free with respect to the program variables. Thus non-polynomial expressions are equivalent to polynomial expressions over a larger set of variables.
Probabilities for tests should also result in unknown variables (still to be implemented). See [42].
A summary_complexity is the approximation of a module execution times. It is translated and used at call sites.
Complexity estimation could be refined (i.e. the number of unknown variables reduced) by using transformers to combine elementary complexities using local states, rather than preconditions to combine elementary complexities relatively to the module initial state. The same options exist for region computation. The initial version [36] used the initial state for combinations. The new version [11] delays evaluation of variable values as long as possible but does not really use local states.
The first version of the complexity estimator was designed and developed by Pierre Berthomier. It was restricted to intra-procedural analysis. This first version was enlarged and validated on real code for SPARC-2 machines by Lei Zhou [42]. Since, it has been modified slightly by François Irigoin. For simple programs, complexity estimation are strongly correlated with execution times. The estimations can be used to see if program transformations are beneficial.
Known bugs: tests and while loops are not correctly handled because a fixed probably of 0.5 is systematically assumed.
Complexity estimation is based on a set of basic operations and fixed execution times for these basic operation. The choice of the set is critical but fixed. Experiments by Lei Zhou showed that it should be enlarged. However, the basic times, which also are critical, are tabulated. New sets of tables can easily be developed for new processors.
Uniform complexity tables contain a unit execution time for all basic operations. They nevertheless give interesting estimations for SPARC SS-10, especially for -O2/-O3 optimized code.
Local variables are eliminated from the complexity associated to the top statement of a module in order to obtain the modules’ summary complexity.
Tables for floating point complexity estimation are set to 0 for non-floating point operations, and to 1 for all floating point operations, including intrinsics like SIN.
This enables the default specification within the properties to be considered.
The following properties control the static estimation of dynamic code execution time.
Trace the walk across a module’s internal representation:
COMPLEXITY_TRACE_CALLS FALSE
Trace all intermediate complexities:
COMPLEXITY_INTERMEDIATES FALSE
Print the complete cost table at the beginning of the execution:
COMPLEXITY_PRINT_COST_TABLE FALSE
The cost table(s) contain machine and compiler dependent information about basic execution times, e.g. time for a load or a store.
It is possible to specify a list of variables which must remain literally in the complexity formula, although their numerical values are known (this is OK) or although they have multiple unknown and unrelated values during any execution (this leads to an incorrect result).
Formal parameters and imported global variables are left unevaluated.
They have relatively high priority (FI: I do not understand this comment by Lei).
This list should be empty by default (but is not for unknown historical reasons):
COMPLEXITY_PARAMETERS "IMAX␣LOOP"
Controls the printing of accuracy statistics:
COMPLEXITY_PRINT_STATISTICS 0
This property is used to select a set of basic execution times. These times depend on the target machine, the compiler and the compilation options used. It is shown in [42] that fixed basic times can be used to obtain accurate execution times, if enough basic times are considered, and if the target machine has a simple RISC processor. For instance, it is not possible to use only one time for a register load. It is necessary to take into account the nature of the variable, i.e. formal parameter, dynamic variable, global variable, and the nature of the access, e.g. the dimension of an accessed array. The cache can be ignored an replacer by an average hit ratio.
Different set of elementary cost tables are available:
In the future, we might add a sparc-2 table...
The different elementary table names are defined in complexity-local.h. They presently are operation, memory, index, transcend and trigo.
The different tables required are to be found in $PIPS_LIBDIR/complexity/xyz, where xyz is specified by this property:
COMPLEXITY_COST_TABLE "all_1"
For the moment, we have designed two ways to solve the complexity combination problem. Since symbolic complexity formulae use program variables it is necessary to specify in which store they are evaluated. If two complexity formulae are computed relatively to two different stores, they cannot be directly added.
The first approach, which is implemented, uses the module initial store as universal store for all formulae (but possibly for the complexity of elementary statements). In some way, symbolic variable are evaluated as early as possible as soon as it is known that they won’t make it in the module summary complexity.
This first method is easy to implement when the preconditions are available but it has at least two drawbacks:
The second approach, which is not implemented, delay variable evaluation as late as possible. Complexities are computed and given relatively to the stores used by each statements. Two elementary complexities are combined together using the earliest store. The two stores are related by a transformer (see Section 6.8.11). Such an approach is used to compute MUST regions as precisely as possible (see Section 6.11.9).
A simplified version of the late evaluation was implemented. The initial store of the procedure is the only reference store used as with the early evaluation, but variables are not evaluated right away. They only are evaluated when it is necessary to do so. This not an ideal solution, but it is easy to implement and reduces considerably the number of unknown values which have to be put in the formulae to have correct results.
COMPLEXITY_EARLY_EVALUATION FALSE
Array regions are functions mapping a memory store onto a convex set of array elements. They are used to represent the memory effects of modules or statements. Hence, they are expressed with respect to the initial store of the module or to the store immediately preceding the execution of the statement they are associated with.
Apart from the array name and its dimension descriptors (or ϕ variables), an array region contains three additional informations:
Unfortunately, for historical reasons, MUST is still used in the implementation instead of EXACT, and actual MUST regions are not computed. Moreover, the must_regions option in fact computes exact and may regions.
MAY regions are flow-insensitive regions, whereas MUST regions are flow sensitive. Any array element touched by any execution of a statement is in the MAY region of this statement. Any array element in the MUST region of a statement is accessed by any execution of this statement.
For instance, the region:
<A(ϕ1,ϕ2)-W-EXACT-{ϕ1==I, ϕ1==ϕ2}>
Internally, regions are of type effect and as such can be used to build use-def chains (see Section 6.4.3). Regions chains are built using proper regions which are particular READ and WRITE regions. For simple statements (assignments, calls to intrinsic functions), summarization is avoided to preserve accuracy. At this inner level of the program control flow graph, the extra amount of memory necessary to store regions without computing their convex hull should not be too high compared to the expected gain for dependence analysis. For tests and loops, proper regions contain the regions associated to the condition or the range. And for external calls, proper regions are the summary regions of the callee translated into the caller’s name space, to which are merely appended the regions of the expressions passed as argument (no summarization for this step).
Together with READ/WRITE regions and IN regions are computed their invariant versions for loop bodies (MODULE.inv_regions and MODULE.inv_in_regions). For a given loop body, they are equal to the corresponding regions in which all variables that may be modified by the loop body (except the current loop index) are eliminated from the descriptors (convex polyhedron). For other statements, they are equal to the empty list of regions.
MAY READ and WRITE region analysis was first designed by Rémi Triolet [39] and then revisited by François Irigoin [40]. Alexis Platonoff [36] implemented the first version of region analysis in PIPS. These regions were computed with respect to the initial stores of the modules. François Irigoin and, mainly, Béatrice Creusillet [11, 12, 10], added new functionalities to this first version as well as functions to compute MUST regions, and IN and OUT regions.
Array regions for C programs are currently under development.
This function computes the MAY regions in a module.
This function computes the MUST regions in a module.
Module summary regions provides an approximation of the effects it’s execution has on its callers variables as well as on global and static variables of its callees.
IN regions are flow sensitive regions. They are read regions not covered (i.e. not previously written) by assignments in the local hierarchical control-flow graph. There is no way with the current pipsmake-rc and pipsmake to express the fact that IN (and OUT) regions must be calculated using must_regions 6.11.3 (a new kind of resources, must_regions 6.11.3, should be added). The user must be knowledgeable enough to select must_regions 6.11.3 first.
See Section 6.11.8.
OUT regions are also flow sensitive regions. They are downward exposed written regions which are also used (i.e. imported) in the continuation of the program. They are also called exported regions. Unlike READ, WRITE and IN regions, they are propagated downward in the call graph and in the hierarchical control flow graphs of the subroutines.
if MUST_REGIONS is true, then it computes regions using the algorithm described in report E/181/CRI, called T−1 algorithm. It provides more accurate regions, and preserve MUST approximations more often. As it is more costly, its default value is FALSE. EXACT_REGIONS is true for the moment for backward compatibility only.
EXACT_REGIONS TRUE
MUST_REGIONS FALSE
The default option is to compute regions without taking into account array bounds. The next property can be turned to TRUE to systematically add them in the region descriptors. Both options have their advantages and drawbacks.
REGIONS_WITH_ARRAY_BOUNDS FALSE
The current implementation of effects (simple effects as well as convex array regions) relies on a generic engine which is independent of the effect descriptor representation. The current representation for array regions, parameterized integer convex polyhedra, allows various patterns an provides the ability to exploit context information at a reasonable expense. However, some very common patterns such as nine-point stencils used in seismic computations or red-black patterns cannot be represented. It has been a long lasting temptation to try other representations [10].
A Complementary sections (see Section 6.13) implementation was formerly began as a set of new phases by Manjunathaiah Muniyappa, but is not maintained anymore.
And Nga Nguyen more recently created two properties to switch between regions and disjunctions of regions (she has already prepared basic operators). For the moment, they are always FALSE.
DISJUNCT_REGIONS FALSE
DISJUNCT_IN_OUT_REGIONS FALSE
Statistics may be obtained about the computation of array regions. When the next property (REGIONS_OP_STATISTICS) is set to TRUE statistics are provided about operators on regions (union, intersection, projection,…). The second next property turns on the collection of statistics about the interprocedural translation.
REGIONS_OP_STATISTICS FALSE
REGIONS_TRANSLATION_STATISTICS FALSE
Dynamic aliases are pairs (formal parameter, actual parameter) of regions generated at call sites. An “IN alias pair” is generated for each IN region of a called module and an “OUT alias pair” for each OUT region. For EXACT regions, the transitive, symmetric and reflexive closure of the dynamic alias relation results in the creation of equivalence classes of regions (for MAY regions, the closure is different and does not result in an equivalence relation, but nonetheless allows us to define alias classes). A set of alias classes is generated for a module, based on the IN and OUT alias pairs of all the modules below it in the callgraph. The alias classes for the whole workspace are those of the module which is at the root of the callgraph, if the callgraph has a unique root. As an intermediate phase between the creation of the IN and OUT alias pairs and the creation of the alias classes, “alias lists” are created for each module. An alias list for a module is the transitive closure of the alias pairs (IN or OUT) for a particular path through the callgraph subtree rooted in this module.
Display the dynamic alias pairs (formal region, actual region) for the IN regions of the module.
Display the dynamic alias pairs (formal region, actual region) for the OUT regions of the module.
Display the transitive closure of the dynamic aliases for the module.
Display the dynamic alias equivalence classes for this module and those below it in the callgraph.
A new representation of array regions added in PIPS by Manjunathaiah Muniyappa. This anlysis is not maintained anymore.
This function computes the complementary sections in a module.