1.0.7 (revision 953)
Opari2 is a tool to automatically instrument C, C++ and Fortran source code files in which OpenMP is used. Function calls to a POMP2 API are inserted around OpenMP directives. By implementing this API, detailed measurements regarding the runtime behavior of an OpenMP application can be made. A conforming POMP2 implementation needs to implement all POMP2 functions; see pomp2_lib.h for the full list.
OpenMP 3.0 introduced tasking to OpenMP. To support this feature the POMP2 adapter needs to do some bookkeeping in regard to specific task IDs. The pomp2_lib.c provided with this package includes the necessary code so it is strongly advised to use it as a basis for writing an adapter to your own tool.
A detailed description of the first Opari version has been published by Mohr et al. in "Design and prototype of a performance tool interface for OpenMP" (The Journal of Supercomputing, 23, 2002).
Opari2 was developed with Autotools. After downloading and unpacking, change into your build directory and perform the following steps:
See the file INSTALL for further information.
To create an instrumented version of an OpenMP application, each file of interest is transformed by the OPARI2 tool. The application is then linked against the POMP2 runtime measurement library and optionally to a special initialization file (see section LINKING (startup initialization only) and SUMMARY for further details).
A call to Opari2 has the following syntax:
Usage: opari2 [OPTION] ... infile [outfile]
with the following options and parameters:
[--f77|--f90|--c|--c++]   [OPTIONAL] Specifies the programming language
                          of the input source file. This option is only
                          necessary if the automatic language detection
                          based on the input file suffix fails.
[--free-form]             [OPTIONAL] Specifies that free formatting is
                          used for Fortran source files. This is the
                          default for Fortran 90/95.
[--fix-form]              [OPTIONAL] Specifies that fixed formatting is
                          used for Fortran source files. This is the
                          default for Fortran 77.
[--nosrc]                 [OPTIONAL] If specified, OPARI2 does not
                          generate #line constructs, which preserve
                          the original source file and line number
                          information, in the transformation
                          process. This option might be necessary if
                          the OpenMP compiler does not understand #line
                          constructs. The default is to generate #line
                          constructs.
[--nodecl]                [OPTIONAL] Disables the generation of
                          POMP2_DLISTXXXXX macros. These are used in the
                          parallel directives of the instrumentation to
                          make the region handles shared. By using this
                          option the shared clause is used directly on
                          the parallel directive with the respective
                          region handles.
[--tpd]                   [OPTIONAL] Adds the clause 'copyin(<pomp_tpd>)'
                          to any parallel construct. This allows
                          passing data from the creating thread to its
                          children. The variable is declared externally
                          in all files, so it needs to be defined by
                          the pomp library.
[--disable=<constructs>]  [OPTIONAL] Disable the instrumentation of
                          manually-annotated POMP regions or the
                          more fine-grained OpenMP constructs such as
                          !$OMP ATOMIC. <constructs> is a comma
                          separated list of the constructs for which
                          the instrumentation should be disabled.
                          Accepted tokens are atomic, critical, master,
                          flush, single, ordered or locks (as well as
                          sync to disable all of them) or regions.
[--task=                  Special treatment for the task directive
     abort|warn|remove]   abort:   Stop instrumentation with an error
                                   message when encountering a task
                                   directive.
                          warn:    Resume but print a warning.
                          remove:  Remove all task directives.
[--untied=                Special treatment for the untied task attribute.
     abort|keep|no-warn]  The default behavior is to remove the untied
                          attribute, thus making all tasks tied, and print
                          out a warning.
                          abort:   Stop instrumentation with an error
                                   message when encountering a task
                                   directive with the untied attribute.
                          keep:    Do not remove the untied attribute.
                          no-warn: Do not print out a warning.
[--tpd-mangling=          [OPTIONAL] If programming languages are mixed
     gnu|intel|sun|pgi|   (C and Fortran), the <pomp_tpd> needs to use
     ibm|cray]            the Fortran mangled name also in C files.
                          This option specifies to use the mangling
                          scheme of the gnu, intel, sun, pgi, ibm or
                          cray compiler. The default is to use the
                          mangling scheme of the compiler used to build
                          opari2.
[--version]               [OPTIONAL] Prints version information.
[--help]                  [OPTIONAL] Prints this help text.
infile                    Input file name.
[outfile]                 [OPTIONAL] Output file name. If not
                          specified, opari2 uses the name
                          infile.mod.suffix if the input file is
                          called infile.suffix.
Report bugs to <scorep-bugs@groups.tu-dresden.de>.
If you run Opari2 on the input file example.c it will create two files:
example.mod.c is the instrumented version of example.c, i.e. it contains the original code plus calls to the POMP2 API referencing handles to the OpenMP regions identified by Opari2.
example.c.opari.inc contains the OpenMP region handle definitions together with all the relevant data needed by the handles. This compile time context (CTC) information is encoded into a string for maximum portability. For each region, the tuple (region_handle, ctc_string) is passed to an initializing function (POMP2_Assign_handle()). All calls to these initializing functions are gathered in a function named POMP2_Init_reg_XXX_YY, where XXX_YY is unique for each compilation unit.
At some point during the runtime of the instrumented application, the region handles need to be initialized using the information stored in the CTC string. This can be done in one of two ways:
1. all at once, during startup of the measurement system (startup initialization), or
2. individually, when a region is first accessed (runtime initialization).
We highly recommend the first option (startup initialization), as it incurs much less runtime overhead than the second (no locking, no lookup needed). In this case all POMP2_Init_reg_XXX_YY functions introduced by opari2 need to be called. See LINKING (startup initialization only) for further details. For runtime initialization, the CTC string is passed as an argument to the relevant POMP2 function calls.
As mentioned above, CTC strings are passed to different POMP2 functions. These functions need to parse the string in order to process the encoded information. With POMP2_Region_info and ctcString2RegionInfo() the opari2 package provides the means to do this; see pomp2_region_info.h.
The CTC string is a string in the format "length*key=value*key=value*[key=value]**", for example:
82*regionType=parallel*sscl=xmpl.c:61:61*escl=xmpl.c:66:66*hasIf=1**
Mandatory keys are:
Optional keys are
The optional values are set to 0 by default, i.e. the presence of the key denotes the presence of the respective clause.
You can use the function ctcString2RegionInfo() to decode CTC strings. It can be found in pomp2_region_info.c and pomp2_region_info.h, installed under <opari-prefix>/share/opari2/devel.
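To make the format concrete, here is a minimal sketch of how a single value could be looked up in a CTC string. It is not the package's ctcString2RegionInfo(); the function name ctc_get_value and its return codes are hypothetical. The sketch assumes that values contain no '*' character, and it skips the leading length field without validating it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the opari2 package.
 * Copies the value stored under `key` into `out`.
 * Returns 1 if the key was found, 0 if it is absent,
 * and -1 if the string is malformed or `out` is too small. */
static int
ctc_get_value(const char* ctc, const char* key, char* out, size_t out_size)
{
    assert(ctc != NULL && key != NULL && out != NULL && out_size > 0);

    /* The first field is the length; continue after the first '*'. */
    const char* cursor = strchr(ctc, '*');
    if (cursor == NULL) {
        return -1;
    }
    cursor++;

    size_t key_len = strlen(key);

    /* Each field has the form key=value and ends with '*'.
     * An empty field (the second '*' of "**") ends the string. */
    while (*cursor != '\0' && *cursor != '*') {
        const char* end = strchr(cursor, '*');
        const char* eq  = strchr(cursor, '=');
        if (end == NULL || eq == NULL || eq > end) {
            return -1;
        }
        if ((size_t)(eq - cursor) == key_len
            && strncmp(cursor, key, key_len) == 0) {
            size_t value_len = (size_t)(end - eq - 1);
            if (value_len >= out_size) {
                return -1;
            }
            memcpy(out, eq + 1, value_len);
            out[value_len] = '\0';
            return 1;
        }
        cursor = end + 1;
    }
    return 0;
}
```

Called on the example string above with the key regionType, this sketch would copy "parallel" into the output buffer; a key that does not occur, such as an optional clause key that was not emitted, yields 0, which matches the convention that absent optional keys default to 0.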
For startup initialization, all POMP2_Init_reg_XXX_YY functions that can be found in the object files and libraries of the application are called. This is done by creating an additional compilation unit that contains calls to the following POMP2 functions:
The resulting object file is linked to the application. During startup of the measurement system the only thing to be done is to call POMP2_Init_region() which then calls all POMP2_Init_reg_XXX_YY functions.
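The structure described here can be pictured with the following sketch. Everything prefixed with sketch_ is hypothetical and only mimics the shape of the generated code: the per-unit functions stand in for POMP2_Init_reg_XXX_YY, sketch_assign_handle stands in for POMP2_Assign_handle(), and the CTC strings are placeholders. The real signatures are declared in the headers shipped with the package.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region handle: here it only remembers its CTC string. */
typedef struct {
    const char* ctc_string;
} sketch_region;

static int sketch_assigned_handles = 0;

/* Plays the role of POMP2_Assign_handle(): bind a handle to its CTC string. */
static void
sketch_assign_handle(sketch_region* handle, const char* ctc_string)
{
    assert(handle != NULL && ctc_string != NULL);
    handle->ctc_string = ctc_string;
    sketch_assigned_handles++;
}

/* Handles of two instrumented compilation units (made up for illustration). */
static sketch_region sketch_region_a1, sketch_region_a2, sketch_region_b1;

/* One function per compilation unit, in the spirit of POMP2_Init_reg_XXX_YY. */
static void
sketch_init_reg_unit_a(void)
{
    sketch_assign_handle(&sketch_region_a1, "placeholder CTC string a1");
    sketch_assign_handle(&sketch_region_a2, "placeholder CTC string a2");
}

static void
sketch_init_reg_unit_b(void)
{
    sketch_assign_handle(&sketch_region_b1, "placeholder CTC string b1");
}

/* Plays the role of the additional compilation unit: a single entry point
 * that calls every per-unit initializer found in the objects and libraries. */
static void
sketch_init_all_regions(void)
{
    sketch_init_reg_unit_a();
    sketch_init_reg_unit_b();
}
```

In this arrangement the measurement system makes one call at startup, and no locking or lookup is needed later when a region is entered.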
In order to create the additional compilation unit (for example pomp2_init_file.c) the following command sequence can be used:
% `opari2-config --nm` <objs_and_libs> | \
    `opari2-config --region-initialization` > pomp2_init_file.c
Here, <objs_and_libs> denotes the entire set of object files and libraries that were instrumented by opari2.
For portability reasons, nm and the awk script that creates the additional file are not called directly but via the provided opari2-config tool.
A call to the opari2-config tool has the following syntax:
Usage: opari2-config [OPTION] ... <command>
with the following commands:
--nm                      Prints the nm command.
--region-initialization   Prints the script used to create the
                          pomp2_init_regions.c file.
--create-pomp2-regions    Prints the whole command necessary
     <object files>       for creating the initialization file.
--awk-cmd                 [Deprecated, use --region-initialization instead.]
                          Prints the awk command.
--awk-script              [Deprecated, use --region-initialization instead.]
                          Prints the awk script.
--egrep                   [Deprecated, use --region-initialization instead.]
                          Prints the egrep command.
--cflags                  Prints compiler options to include
                          installed headers.
--version                 Prints the opari2 version number.
--interface-version       Prints the pomp2 API version that
                          instrumented files conform to.
--opari2-revision         Prints the revision number of the
                          OPARI2 package.
--common-revision         Prints the revision number of the
                          common package.
--help                    Prints this help text.
and the following options:
[--build-check]           Tells opari2-config to use build paths
                          instead of install paths. Used for build
                          testing.
[--config=<config file>]  Reads in a configuration from the given
                          file.
Report bugs to <scorep-bugs@groups.tu-dresden.de>.
For manual user instrumentation the following pragmas are provided.
C/C++:
#pragma pomp inst init
#pragma pomp inst begin(region_name)
#pragma pomp inst altend(region_name)
#pragma pomp inst end(region_name)
#pragma pomp noinstrument
#pragma pomp instrument
Fortran:
!$POMP INST INIT
!$POMP INST BEGIN(region_name)
!$POMP INST ALTEND(region_name)
!$POMP INST END(region_name)
!$POMP NOINSTRUMENT
!$POMP INSTRUMENT
Users can specify code regions, such as functions, with INST BEGIN and INST END. If a region has several exit points (return, break, exit, ...), all but the last need to be marked with INST ALTEND pragmas. The INST INIT pragma should be used for initialization at the beginning of main if no other initialization method is used. The NOINSTRUMENT and INSTRUMENT pragmas turn the instrumentation of OpenMP pragmas off and on. Between NOINSTRUMENT and INSTRUMENT, no pragmas are instrumented except parallel regions, which are always instrumented to allow correct thread management in the performance tool. See the EXAMPLE section for an example of how to use user instrumentation.
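As a small illustration of the BEGIN/ALTEND/END rule, the sketch below marks a function with two exit points. The function find_first_negative and its region name are made up for this example; only the pragmas themselves come from the list above. A compiler that does not know the pomp pragmas ignores them (possibly with a warning), so the file also builds without OPARI2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user function: returns the index of the first negative
 * entry in `values`, or -1 if there is none. */
int
find_first_negative(const int* values, int n)
{
    assert(values != NULL || n == 0);

#pragma pomp inst begin(find_first_negative)
    for (int i = 0; i < n; i++) {
        if (values[i] < 0) {
            /* Early exit: every exit point except the last uses ALTEND. */
#pragma pomp inst altend(find_first_negative)
            return i;
        }
    }
    /* Last exit point of the region: marked with END. */
#pragma pomp inst end(find_first_negative)
    return -1;
}
```

The early return sits inside a braced block on purpose: the pragma becomes a statement after instrumentation, so placing it in an unbraced if body would change which statement the if controls.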
The directory <prefix>/share/opari2/doc/example contains the following files:
example.c example.f Makefile
The Makefile contains all required information for building the instrumented and uninstrumented binaries. It demonstrates the compilation and linking steps as described above.
Additional examples which illustrate the use of user instrumentation can be found in <prefix>/share/opari2/doc/example_user_instrumentation. The folder contains the following files:
example_user_instrumentation.c example_user_instrumentation.f Makefile
Opari2 uses a new mechanism to link files. The main advantage is that an opari.rc file is no longer needed. Libraries can now be preinstrumented and parallel builds are supported. To achieve this, the handles for parallel regions are instrumented using a ctc_string.
The POMP2 interface is not compatible with the original POMP interface. All functions of the new API begin with POMP2_. The declaration prototypes can be found in pomp2_lib.h.
The POMP2_Parallel_fork() call has an additional argument to pass the requested number of threads to the POMP2 library. This allows the library to prepare data structures and allocate memory for the threads before they are created. The value passed to the library is determined as follows:
If a num_threads clause is present, the expression inside this clause is evaluated into a local variable pomp_num_threads. This variable is afterwards passed in the call to POMP2_Parallel_fork() and in the num_threads clause itself.
If no num_threads clause is present, the value of omp_get_max_threads() is stored in pomp_num_threads and passed to the POMP2_Parallel_fork() call.
In Fortran, instead of omp_get_max_threads(), a wrapper function pomp_get_max_threads_XXX_X is used. This function is needed to avoid multiple definitions of omp_get_max_threads(), since we do not know whether it is defined in the user code or not. Removing all definitions in the user code would require much more Fortran parsing than is done with opari2, since function definitions cannot easily be distinguished from variable definitions.
If it is necessary for the POMP2 library to pass information from the master thread to its children, the option --tpd can be used. Opari2 uses the copyin clause to pass a threadprivate variable pomp_tpd to the newly spawned threads at the beginning of a parallel region. This is a 64-bit integer variable, since Fortran does not allow pointers. However, a pointer can be stored in this variable, passed to child threads with the copyin clause (in C/C++ or Fortran) and later be cast back to a pointer in the pomp library.
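The pointer round trip through a 64-bit integer can be sketched as follows. All names prefixed with sketch_ are hypothetical: sketch_pomp_tpd is a plain stand-in for the threadprivate pomp_tpd variable, and sketch_thread_data stands for whatever per-thread record a measurement library might keep. The copyin step itself is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 64-bit integer pomp_tpd. The real variable is
 * threadprivate and defined by the POMP2 library (assumption: plain
 * static storage is enough for this single-threaded illustration). */
static int64_t sketch_pomp_tpd = 0;

/* Hypothetical per-thread bookkeeping record of a measurement library. */
struct sketch_thread_data {
    int parent_region_id;
};

/* Master thread side: store the pointer in the integer variable. */
static void
sketch_store_tpd(struct sketch_thread_data* data)
{
    /* The scheme only works if a pointer fits into 64 bits. */
    assert(sizeof(void*) <= sizeof(int64_t));
    sketch_pomp_tpd = (int64_t)(intptr_t)data;
}

/* Child thread side: cast the integer back to the original pointer. */
static struct sketch_thread_data*
sketch_load_tpd(void)
{
    return (struct sketch_thread_data*)(intptr_t)sketch_pomp_tpd;
}
```

Converting through intptr_t keeps the pointer-to-integer conversion explicit before the value is widened to the fixed 64-bit type.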
To support mixed programming (C/Fortran), the variable name depends on the name mangling of the Fortran compiler. This means that for GNU, Sun, Intel and PGI C compilers the variable is called pomp_tpd_, and for IBM it is called pomp_tpd in C. In Fortran it is always called pomp_tpd. The --tpd-mangling option can be used to change this. The variable is declared extern in all program units, so the pomp library contains the actual variable declaration of pomp_tpd as a 64-bit integer.
In OpenMP 3.0 the new tasking construct was introduced. All parts of a program are now implicitly executed as tasks and the user gets the possibility of creating tasks that can be scheduled for asynchronous execution. Furthermore these tasks can be interrupted at certain scheduling points and resumed later on (see the OpenMP API 3.0 for more detailed information).
Opari2 inserts calls to POMP2_Task_create_begin and POMP2_Task_create_end to allow the task creation time to be recorded. For the task execution time, calls to POMP2_Task_begin and POMP2_Task_end are inserted. To correctly record a profile or a trace of a program execution, the different task instances need to be distinguished. Since OpenMP does not provide task IDs, the performance measurement system needs to create and maintain its own. This cannot be done by source code instrumentation alone, as performed by Opari2, but requires some management of task IDs at runtime. To allow the measurement system to manage these IDs, additional task ID parameters (pomp_old_task/pomp_new_task) were added to all functions belonging to OpenMP constructs which are task scheduling points. This package includes a "dummy" library, which can be used as an adapter to your measurement system. The library contains all the relevant functionality to keep track of the different task instances, and it is highly recommended to use it as a template when implementing your own adapter.
For more detailed information on this mechanism see:
"How to Reconcile Event-Based Performance Analysis with Tasking in OpenMP"
by Daniel Lorenz, Bernd Mohr, Christian Rössel, Dirk Schmidl, and Felix Wolf
In: Proc. of 6th Int. Workshop on OpenMP (IWOMP), LNCS, vol. 6132, pp. 109-121
DOI: 10.1007/978-3-642-13217-9_9
The typical usage of OPARI2 consists of the following steps:
1. Call OPARI2 for each input source file
% opari2 file1.f90
...
% opari2 fileN.f90
2. Compile all modified output files *.mod.* using the OpenMP compiler
3. Generate the initialization file
% `opari2-config --nm` <objs_and_libs> | \
    `opari2-config --region-initialization` > pomp2_init_file.c
4. Link the resulting object files against the pomp2 runtime measurement library.