vesnin_xl-V1.1.xml
IBM XL Compiler Flags, Common Operating System Commands and Environment Settings
Compilers: IBM XL C/C++ Version 13.1.5 for Linux
Compilers: IBM XL Fortran Version 15.1.5 for Linux
Operating systems: Red Hat Enterprise Linux Server release 7
Last updated: $LastChangedDate: 2017-12-16 09:45:29 -0400 (Sat, 16 Dec 2017) $ revision $LastChangedRevision: 001 $
Determines substitute path names for XL Fortran executables such as the compiler, assembler, linker, and preprocessor.
It can be used in combination with the -t option, which determines which of these components are affected by -B.
Example : -B/opt/at10.0/share/libhugetlbfs/
Macro to have compiler always inline externs if specified.
Pass the --hugetlbfs-align flag to the linker so that we can control (by environment variable HUGETLB_ELFMAP) which program segments are placed in hugepages.
Pass the --hugetlbfs-link=BDT flag to the linker so that the text, initialized data, and BSS segments of the application are backed by hugepages.
Link the Engineering and Scientific Subroutine Library (ESSL).
Link with libhugetlbfs.so. This enables heap to be backed by the 16 Megabyte pages.
Link the mathematical acceleration subsystem libraries (MASS), which contain libraries of tuned mathematical intrinsic functions.
Link with tcmalloc's library for Linux on POWER. This is a library that optimizes calls to new, delete, malloc and free.
Instructs the linker to include libdl.a to enable the dynamic linking loader. Links the "/usr/lib/libdl.a" library.
Pass the -q flag to the linker causing the final executable to have the relocation information.
Instructs the linker to allow multiple definitions and the first definition will be used.
Normally when a symbol is defined multiple times, the linker will report a fatal error.
Turn off the effect of the --whole-archive flag.
Instructs the linker to include every object file in the specified library, rather than searching the library for the required object files.
Example : "-Wl,--whole-archive /usr/lib/libhugetlbfs.a"
Link with the Apache C++ Standard Library ("stdcxx"). "libstd8d.so" is a 32-bit shared library with optimization enabled.
Adds the directory for the Apache C++ Standard Library to the search path at link time.
-O
-O enables the level of optimization that represents the best tradeoff between compilation speed and run-time performance.
If you need a specific level of optimization, specify the appropriate numeric value.
Currently, -O is equivalent to -O2.
-O2
-O2 performs a set of optimizations that are intended to offer improved performance without an unreasonable
increase in time or storage that is required for compilation including :
- Eliminates redundant code
- Basic loop optimization
- Can structure code to take advantage of -qarch and -qtune settings
-O3
-O3 performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics
of the program slightly, unless -qstrict is specified. We recommend these optimizations when the desire for run-time
speed improvements outweighs the concern for limiting compile-time resources.
The optimizations provided include:
- In-depth memory access analysis
- Better loop scheduling
- High-order loop analysis and transformations (-qhot=level=0)
- Inlining of small procedures within a compilation unit by default
- Eliminating implicit compile-time memory usage limits
- Widening, which merges adjacent load/stores and other operations
- Pointer aliasing improvements to enhance other optimizations
-O3 is equivalent to the following flags :
-O4
-O4 is equivalent to the following flags:
- -O3
- -qipa=level=1
- -qarch=auto
- -qtune=auto
- -qsimd=auto
-O5
-O5 provides all of the functionality of the -O4 option, but also provides the functionality of the -qipa=level=2 option.
-O5 is equivalent to the following flags :
-q64
Generates 64-bit ABI binaries. The default is to generate 64-bit ABI binaries on little-endian Linux.
-qalias
-qalias=ansi | noansi :
If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object.
The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
-qalias=std | nostd :
Indicates whether the compilation units contain any non-standard aliasing. If so, specify nostd.
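As a rough illustration of what type-based aliasing allows and forbids, the C sketch below reads the bit pattern of a float in a way that stays valid under the ANSI aliasing rules. The function name float_bits_safe is hypothetical, and the sketch assumes a 32-bit IEEE 754 float; the non-conforming variant is shown only as a comment.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the bytes of a float into an integer.
 * memcpy does not access the float through an incompatible lvalue,
 * so this stays valid when type-based aliasing (ansi) is in effect. */
uint32_t float_bits_safe(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Non-conforming variant, shown for contrast only:
 *
 *     return *(uint32_t *)&f;
 *
 * It reads a float object through a uint32_t lvalue. Code written this
 * way is the kind that may need noansi, because the optimizer is allowed
 * to assume the two types never share storage. */
```

Source that relies on the commented-out form is a typical reason to request the noansi suboption instead of rewriting the code.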
-qalign
Specifies what aggregate alignment rules the compiler uses for file compilation, where the alignment options are:
- bit_packed : The compiler uses the bit_packed alignment rules.
- full : The compiler uses the RISC System/6000 alignment rules. This is the same as power.
- mac68k : The compiler uses the Macintosh alignment rules. This suboption is valid only for 32-bit compilations.
- natural : The compiler maps structure members to their natural boundaries.
- packed : The compiler uses the packed alignment rules.
- power : The compiler uses the RISC System/6000 alignment rules.
- twobyte : The compiler uses the Macintosh alignment rules. This suboption is valid only for 32-bit compilations.
The default is -qalign=full.
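To make the effect of an alignment rule concrete, the following C sketch compares the member offsets of the same structure under the platform's default layout and under a packed layout. The struct and function names are hypothetical, and #pragma pack is used here only as a source-level stand-in for a packed rule; that substitution is an assumption of the sketch, not something this flag description states.

```c
#include <stddef.h>

/* Laid out with the platform's default member alignment. */
struct natural_rec {
    char tag;
    int  value;
};

/* Same members with padding removed (assumed stand-in for a packed rule). */
#pragma pack(push, 1)
struct packed_rec {
    char tag;
    int  value;
};
#pragma pack(pop)

/* Offset of 'value' when the compiler is free to insert padding. */
size_t natural_value_offset(void)
{
    return offsetof(struct natural_rec, value);
}

/* Offset of 'value' when members are packed back to back. */
size_t packed_value_offset(void)
{
    return offsetof(struct packed_rec, value);
}
```

On typical ABIs the default layout places value at the alignment boundary of int, while the packed layout places it immediately after tag.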
Indicates that the compiler understands how to do alloca(). This flag is not supported on little-endian Linux.
-qarch
auto selects the processor on which the compile is being done. pwr5x is the POWER5+ processor.
Supported values for this flag are :
- auto - Use the processor on which the program is compiled.
- pwr8 - The POWER8 processor based systems.
- pwr7 - The POWER7 processor based systems.
- pwr6e - The POWER6 processor in "Enhanced" mode based systems.
- pwr6 - The POWER6 processor based systems.
- pwr5x - The POWER5+ processor based systems.
- pwr5 - The POWER5 processor based systems.
- pwr4 - The POWER4 processor based systems.
- ppc970 - The PPC970 processor based systems.
-qassert
-qassert=refalign | norefalign | contig :
- refalign specifies that all pointers inside the compilation unit only point to data that is naturally aligned according to the length of the pointer types.
- contig specifies that the compiler can perform optimizations according to the memory layout of the objects occupying contiguous blocks of memory.
-qenablevmx
Enables the generation of vector instructions for processors that support them.
Tell the compiler that enum size is small.
Specifies that, if either -lessl or -lesslsmp is specified, then Engineering and Scientific Subroutine Library
(ESSL) routines should be used in place of some Fortran 90 intrinsic procedures when there is a safe opportunity to do so.
The compiler generates additional symbol information for use by the "fdpr" binary optimization tool.
-qhot
The supported values for suboption are :
- arraypad - The compiler will pad any arrays where it infers that there may be a benefit.
- level=0 - The compiler performs a limited set of high-order loop transformations.
- level=1 - The compiler performs its full set of high-order loop transformations.
- simd - Replaces certain instruction sequences with vector instructions.
- vector - Replaces certain instruction sequences with calls to the MASS library.
Specifying -qhot without suboptions implies -qhot=nosimd, -qhot=noarraypad, -qhot=vector and -qhot=level=1.
The -qhot option is also implied by -O4 and -O5.
This option inlines glue code that optimizes external function calls when compiling.
The inline option specifies the threshold and limit of inlined functions. Example : -qinline=40.
The inline suboption specifies the threshold and limit of inlined functions.
Examples : -qipa=inline=limit=1000 and -qipa=inline=threshold=100
-qipa=level
Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA).
The level determines the amount of interprocedural analysis and optimization that is performed.
- level=0 does only minimal interprocedural analysis and optimization.
- level=1 turns on inlining, limited alias analysis, and limited call-site tailoring.
- level=2 turns on full interprocedural data flow and alias analysis.
The partition suboption specifies the size of the program sections that are analysed together.
Larger partitions may produce better analysis but require more storage. Default is medium.
-qipa=threads
The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads,
which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, may be used.
N must be a positive integer.
Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread.
This option does not affect the code in the final binary created.
Indicates that a program, designed to execute in a large page memory environment, can take advantage
of large 16 MB pages provided on POWER4 and higher based systems. This flag is not supported on little-endian Linux.
This option specifies that no functions are to be inlined.
-qnoenablevmx
Disables the generation of vector instructions.
Suppresses interprocedural analysis (IPA), which is enabled by default at optimization levels -O4 and -O5.
-qnoprefetch
The noprefetch option will not add any prefetch instructions automatically.
Do not use the XL compiler thread information.
The option used in the first pass of a profile directed feedback compile that causes PDF information to be generated.
The profile directed feedback optimization gathers data on both execution path and data values.
It does not use hardware counters, nor gather any data other than path and data values for PDF specific optimizations.
The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.
-qprefetch
Inserts prefetch instructions automatically where there are opportunities to improve code performance.
- -qprefetch=aggressive : Aggressively prefetch data.
- -qprefetch=dscr option causes the Data Streams Control Register to be set to the value specified when executing this program.
Example : -qprefetch=dscr=42
Adds the restrict type qualifier to the pointer parameters within all functions without modifying the source file.
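For readers unfamiliar with the qualifier, the C sketch below writes restrict out by hand on a small kernel; the option described above applies the same assumption to pointer parameters without editing the source. The function name add_arrays is hypothetical.

```c
#include <stddef.h>

/* Hypothetical kernel. With restrict on every pointer parameter, the
 * compiler may assume that dst, a, and b never overlap, so loads from
 * a and b need not be repeated after each store to dst. */
void add_arrays(size_t n, double *restrict dst,
                const double *restrict a, const double *restrict b)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The assumption is only safe when callers really pass non-overlapping buffers; calling such a function with aliased arguments has undefined behavior.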
Causes the C++ compiler to generate Run Time Type Identification code.
Specifies that all local variables be treated as STATIC.
-qsimd
-qsimd : enables the generation of vector instructions for processors that support them.
-qnosimd : disables the generation of vector instructions.
Default : whether -qsimd is specified or not, -qsimd=auto is implied at the -O3 or higher optimization level;
-qsimd=noauto is implied at the -O2 or lower optimization level.
Causes the Fortran compiler to allocate dynamic arrays on the heap instead of the stack.
Causes the compiler to automatically generate parallel code using OMP controls when possible.
Tell the compiler that OMP controls are used to identify parallel code.
Specifies the size of the register allocation spill area in bytes.
-qstrict
Ensures that optimizations done by default at -O3 and higher,
and, optionally at -O2, do not alter the semantics of a program.
The -qstrict=all, -qstrict=precision, -qstrict=exceptions, -qstrict=ieeefp, and -qstrict=order
suboptions and their negative forms are group suboptions that affect multiple, individual suboptions.
Group suboptions act as if either the positive or the no form of every suboption of the group is specified.
Default:
- Always -qstrict or -qstrict=all when the -qnoopt or -O0 optimization level is in effect.
- -qstrict or -qstrict=all is the default when the -O2 or -O optimization level is in effect.
- -qnostrict or -qstrict=none is the default when -O3 or a higher optimization level is in effect.
<suboptions_list> is a colon-separated list of one or more of the following:
- all | none : all disables all semantics-changing transformations, including those controlled by
the ieeefp, order, library, precision, and exceptions suboptions. none enables these transformations.
- precision | noprecision : precision disables all transformations that are likely to affect floating-point precision,
including those controlled by the subnormals, operationprecision, association, reductionorder, and library suboptions.
noprecision enables these transformations.
- exceptions | noexceptions : exceptions disables all transformations likely to affect exceptions or be affected by them,
including those controlled by the nans, infinities, subnormals, guards, and library suboptions.
noexceptions enables these transformations.
- ieeefp | noieeefp : ieeefp disables transformations that affect IEEE floating-point compliance, including
those controlled by the nans, infinities, subnormals, zerosigns, and operationprecision suboptions.
noieeefp enables these transformations.
- nans | nonans : nans disables transformations that may produce incorrect results in the presence of, or that
may incorrectly produce IEEE floating-point signaling NaN (not-a-number) values.
nonans enables these transformations.
- infinities | noinfinities : infinities disables transformations that may produce incorrect results in the presence of,
or that may incorrectly produce floating-point infinities.
noinfinities enables these transformations.
- subnormals | nosubnormals : subnormals disables transformations that may produce incorrect results in the presence of,
or that may incorrectly produce IEEE floating-point subnormals (formerly known as denorms).
nosubnormals enables these transformations.
- zerosigns | nozerosigns : zerosigns disables transformations that may affect or be affected by whether the sign of a floating-point zero is correct.
nozerosigns enables these transformations.
- operationprecision | nooperationprecision : operationprecision disables transformations that produce
approximate results for individual floating-point operations.
nooperationprecision enables these transformations.
- order | noorder : order disables all code reordering between multiple operations that may affect results or
exceptions, including those controlled by the association, reductionorder, and guards suboptions.
noorder enables code reordering.
- association | noassociation : association disables reordering operations within an expression.
noassociation enables reordering operations.
- reductionorder | noreductionorder: reductionorder disables parallelizing floating-point reductions.
noreductionorder enables these reductions.
- guards | noguards : guards disables moving operations past guards or calls which control whether the operation should be executed or not.
noguards enables moving these operations.
- library | nolibrary : library disables transformations that affect floating-point library functions.
nolibrary enables these transformations.
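The association suboption exists because floating-point addition is not associative. The C sketch below, which assumes IEEE 754 double arithmetic in the default rounding mode, evaluates the same three-term sum with both groupings; the function names are hypothetical.

```c
/* Adds three doubles as (a + b) + c. The volatile temporary forces the
 * intermediate result to be rounded to double before the second add. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Adds the same values as a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e16, b = -1e16, and c = 1.0, sum_left returns 1.0 while sum_right returns 0.0, because 1.0 is lost when it is added to -1e16 first. A compiler that is allowed to reorder operations within an expression may turn one form into the other, which is the kind of change the association suboption disables.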
-qtune
The supported values for suboption are :
- auto - Use the processor on which the program is compiled.
- pwr8 - The POWER8 processor based systems.
- pwr7 - The POWER7 processor based systems.
- pwr6e - The POWER6 processor in "Enhanced" mode based systems.
- pwr6 - The POWER6 processor based systems.
- pwr5x - The POWER5+ processor based systems.
- pwr5 - The POWER5 processor based systems.
- pwr4 - The POWER4 processor based systems.
- ppc970 - The PPC970 processor based systems.
Specifies whether to use volatile or non-volatile vector registers. Volatile vector registers are registers whose
value is not preserved across function calls so the compiler will not depend on values in them across function calls.
-qxlf90
suboption can be one of the following :
- signedzero | nosignedzero : Determines how the SIGN(A,B) function handles signed real 0.0.
In addition, determines whether negative internal values will be prefixed with a minus
when formatted output would produce a negative sign zero.
- autodealloc | noautodealloc : Determines whether the compiler deallocates allocatable arrays that are declared locally
without either the SAVE or the STATIC attribute and have a status of currently allocated when the subprogram terminates.
- oldpad | nooldpad : When the PAD= specifier is present in the INQUIRE statement, specifying -qxlf90=nooldpad
returns UNDEFINED when there is no connection, or when the connection is for unformatted I/O.
This behavior conforms with the Fortran 95 standard and above. Specifying -qxlf90=oldpad preserves the Fortran 90 behavior.
- Default: signedzero, autodealloc and nooldpad for the xlf95, xlf95_r, xlf95_r7 and f95 invocation commands.
nosignedzero, noautodealloc and oldpad for all other invocation commands.
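The signedzero suboption can be pictured with a small C model of SIGN(A,B). This is only a sketch of our reading of the description above: the function sign_model and its honor_signed_zero parameter are hypothetical, and the mapping to the two suboptions is an assumption rather than a statement of how the XL Fortran run time is implemented.

```c
#include <math.h>

/* Hypothetical C model of Fortran SIGN(A,B) for doubles.
 * honor_signed_zero != 0 models signedzero: the sign bit of b decides,
 *   so b = -0.0 produces a negative result.
 * honor_signed_zero == 0 models nosignedzero: only b < 0 counts as
 *   negative, so -0.0 is treated like +0.0. */
double sign_model(double a, double b, int honor_signed_zero)
{
    double magnitude = signbit(a) ? -a : a;
    int negative = honor_signed_zero ? (signbit(b) != 0) : (b < 0.0);
    return negative ? -magnitude : magnitude;
}
```

The two modes differ only when the second argument is a negative zero; for every other value of b they return the same result.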
Specifies library search directory for the Apache C++ Standard Library for use by the runtime linker.
The information is recorded in the object file and passed to the runtime linker.
Parameter | Description | Executable name
a | Assembler | as
b | Low-level optimizer | xlfcode
c | Compiler front end | xlfentry
d | Disassembler | dis
F | C preprocessor | cpp
h | Array language optimizer | xlfhot
I | High-level optimizer, compile step | ipa
l | Linker | ld
z | Binder | bolt
-qchars=signed : Causes the compiler to treat the type "char" as signed instead of the default of unsigned.
-qchars=unsigned : Causes the compiler to treat the type "char" as unsigned. This is the default.
Note: this particular portability flag is included for 526.blender_r per the recommendation in its documentation - see
http://www.spec.org/cpu2017/Docs/benchmarks/526.blender_r.html.
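The practical difference between the two settings shows up when a plain char holding a byte value above 127 is widened to int. The C sketch below uses hypothetical helper names and reports whichever convention the current compilation uses.

```c
#include <limits.h>

/* Widens a plain char to int. For the byte 0xFF the result is -1 when
 * plain char is signed and 255 when it is unsigned. */
int widen_plain(char c)
{
    return (int)c;
}

/* The explicitly qualified types behave the same under either setting. */
int widen_signed(signed char c)
{
    return (int)c;
}

int widen_unsigned(unsigned char c)
{
    return (int)c;
}

/* Returns nonzero when plain char is signed in this compilation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Code that compares plain char values against constants such as 0xFF, or uses them as array indexes, is the usual reason a benchmark asks for one setting or the other.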
Permits the usage of "//" to introduce a comment that lasts until the end of the current source line, as in C++.
Adds an underscore to global entities to match the C compiler ABI
Indicates that the input Fortran source program is in fixed form.
Do not use the XL compiler compat macros.
<suboption> must be one of the following suboptions:
- be : Specifies that I/O operations on unformatted data files use big-endian byte order.
- le : Specifies that I/O operations on unformatted data files use little-endian byte order.
Default: -qufmt=le
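To show what the two byte orders mean for data on disk, the C sketch below writes the same 32-bit value into a byte buffer in each order. It illustrates byte order only, not the XL Fortran unformatted record layout; the function names are hypothetical.

```c
#include <stdint.h>

/* Stores v most-significant byte first (big-endian order). */
void put_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Stores v least-significant byte first (little-endian order). */
void put_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}
```

A data file produced in one order and read back assuming the other yields byte-reversed values, which is why the suboption has to match the files a program reads.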
xlc
xlc
xlc_r
xlc_at
xlc_r_at
32-bit binaries are produced by default on big-endian Linux. Only 64-bit compilation is supported on little-endian Linux.
The xlc_r invocation is the thread-safe version of the xlc compiler. The xlc_at and xlc_r_at invocations link with the IBM Advanced Toolchain libraries.
xlC
xlC
xlC_r
xlC_at
xlC_r_at
32-bit binaries are produced by default on big-endian Linux. Only 64-bit compilation is supported on little-endian Linux.
The xlC_r invocation is the thread-safe version of the xlC compiler. The xlC_at and xlC_r_at invocations link with the IBM Advanced Toolchain libraries.
xlf95
xlf95
xlf95_r
xlf95_at
xlf95_r_at
32-bit binaries are produced by default on big-endian Linux. Only 64-bit compilation is supported on little-endian Linux.
The xlf95_r invocation is the thread-safe version of the xlf95 compiler. The xlf95_at and xlf95_r_at invocations link with the IBM Advanced Toolchain libraries.
Compilation conforms to the ISO C99 standard and accepts implementation-specific language extensions.
Causes the compiler to output a traceback if it abends.
Specifies whether to include standard object code in the object files.
The noobject suboption can substantially reduce overall
compilation time, by not generating object code during the first IPA phase.
This option does not affect the code in the final binary created.
Specifies the size of the compiler's internal program storage areas, in bytes. Example : -qspillsize=512.
-qsuppress=msg1:msg2
-qsuppress
Suppresses the message with the message number specified. Examples : -qsuppress=1500-036 and -qsuppress=cmpmsg.
Suppresses informational, language-level, and warning messages. This option sets -qflag=e:e.