-----------------------------------------------------------------------------------
Fujitsu PRIMEPOWER flags/tunables description (May.09 2005)
(Each section is sorted in case insensitive, alphabetical order)

Table of Contents
  [1] Fujitsu Parallelnavi 2.3 compiler flag description
  [2] Sun Studio 9 flag description
  [3] Environment Variables
  [4] Kernel Parameters (/etc/system)
  [5] Commands for feedback control
-----------------------------------------------------------------------------------
[1] Fujitsu Parallelnavi 2.3 compiler flag description

Compiler options    Remark
-----------------------------------------------------------------------------------
[Fortran]

-Am
    Required if a source file contains modules which will be referenced by
    USE statements in other source files or if a source file contains USE
    statements that reference modules in another source file. The -Am option
    creates a module information file (module_name.mod) for each module
    compiled, either in the current directory or in a directory specified by
    the -Mdirectory option. frt searches for module information files in the
    current directory and also in directories specified by the -Mdirectory
    and/or -Idirectory options.

-dy/-dn
    In c, specify y or n. -dy specifies dynamic linking in the linker. -dn
    specifies static linking in the linker. The default is -dy. This option
    and its argument are passed to the linker.

-f omitmsg
    Set the level of diagnostic messages output and inhibit specific
    messages. omitmsg can be one of the characters i, w, or s, and/or a list
    of msgnum. If several arguments are specified, they must be delimited by
    commas.
      i       All messages are output, this is the default.
      w       i level messages are not output.
      s       i and w level messages are not output.
      msgnum  Message number msgnum is inhibited. msgnum must be an i or w
              level message.

-Fixed
    Specifies that Fortran source programs are written in fixed source form.
    If file.f or file.F is specified as a Fortran source program, the -Fixed
    option is effective by default.

-K opt
    Controls specific optimizations and code generation. If several of these
    are specified at the same time, they must be delimited by commas.

    alignc[=N]
        Aligns global data entries on an N-byte boundary. N can be from 1 to
        32768.

    alignl[=N]
        Aligns local data entries on an N-byte boundary. N can be from 1 to
        32768.

    commonpad[=N]
        Inserts padding elements in common blocks for efficient use of
        cache. N can be from 4 to 4096 bytes. When N is omitted, the
        compiler automatically determines a suitable value.

    dalign
        Generates instructions assuming that eight-byte integer data,
        double-precision real data, double-precision complex data,
        quadruple-precision real data or quadruple-precision complex data
        referred to by dummy arguments or pointers is aligned on eight-byte
        boundaries.

    eval
        Optimizes by changing the method of operator evaluation. Specifying
        this option may give rise to side effects (precision errors and
        runtime exceptions) in the execution results, leading to unintended
        results. This option is effective only if the -O option is also
        specified.

    fast_GP2[={0|1|2|3}]
        Specifies the best optimization level suitable for a system equipped
        with SPARC64 V. The level can be 0, 1, 2 or 3. If the level is not
        specified, 1 is used. In addition, -Kprefetch_model=kind is chosen
        automatically according to the compiling machine.
        -Kprefetch_model=L is chosen when compiling on a system that is not
        equipped with SPARC64 V.
          0  Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
             prefetch,SPARC64_GP2,V8PLUS,VIS1,gs options.
          1  Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
             prefetch,SPARC64_GP2,V8PLUS,VIS1,gs options.
          2  Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
             prefetch,SPARC64_GP2,V8PLUS,VIS1,gs,eval options.
          3  Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
             prefetch,SPARC64_GP2,V8PLUS,VIS1,gs,eval,preex options.

    FMADD
        Uses the combined multiply-add/subtract floating-point instructions.
        Either the -KV8PLUS or the -KV9 option must also be specified.

    frecipro
        Converts floating-point division into multiplication by the
        reciprocal.

    fsimple
        Creates an object program by applying optimizations that simplify
        floating-point operations.

    fuse
        Fuses neighboring loops.

    GREG_SYSTEM
        Makes global registers g5 through g7 (when -KV9 is available, g6 and
        g7 are used) subject to register allocation in the compile stage.
        These registers are reserved for the system.

    gs
        Performs global instruction scheduling. It is ignored if the -O5
        option is specified after it.

    largepage[=level]
        Creates an executable file that uses the Parallelnavi largepage
        functionality. The level can be 1 or 2. If the level is not
        specified, 1 is used.
          1  The largepage functionality applies to the data and the heap
             areas.
          2  In addition to -Klargepage=1, the largepage functionality
             applies to the stack area.

    mfunc
        Converts intrinsic functions (including the power operation) into
        multi-operation functions. Single precision real type LOG and EXP,
        and double precision real type LOG, EXP and power operation are
        targets for this optimization. Either the -KV8PLUS or the -KV9
        option must also be specified.

    NOFMADD
        Suppresses use of the combined multiply-add/subtract floating-point
        instructions. -KNOFMADD is the default.

    nomfunc
        Suppresses conversion of an intrinsic function or an operation into
        a multi-operation function. -Knomfunc is the default.

    noprefetch
        Suppresses use of the prefetch instruction.

    nounroll
        Prevents loop unrolling optimizations.

    ns
        Initializes the FPU in non-standard mode of operation. Used mostly
        to suppress underflow interruptions.
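The option sets listed for the Fortran -Kfast_GP2 levels can be written out as a small lookup. The following is only a sketch of that table in Python; the helper name expand_fast_gp2 is hypothetical, and the automatic choice of -Kprefetch_model described for this option is not modelled.

```python
# Options induced by the Fortran -Kfast_GP2=<level> option, transcribed
# from the level table in this section. The helper name is hypothetical.
BASE_K = ["fsimple", "dalign", "ns", "fuse", "mfunc", "prefetch",
          "SPARC64_GP2", "V8PLUS", "VIS1", "gs"]
EXTRA_K = {0: [], 1: [], 2: ["eval"], 3: ["eval", "preex"]}


def expand_fast_gp2(level=1):
    """Return the options induced by -Kfast_GP2=<level> (level 1 if omitted)."""
    if level not in EXTRA_K:
        raise ValueError("level must be 0, 1, 2 or 3")
    return ["-O4", "-K" + ",".join(BASE_K + EXTRA_K[level])]


print(" ".join(expand_fast_gp2(3)))
```

Levels 0 and 1 expand to the same list here because the table gives identical option sets for them.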
    preex
        Optimizes by moving the evaluation of invariant expressions ahead of
        branch instructions.

    pg
        Generates instructions to produce a profile file for subsequent
        optimization (global instruction scheduling etc.).

    prefetch[=level]
        If -KSPARC64_GP is in effect and level is not specified, the
        compiler selects 2. If -KSPARC64_GP2 is in effect and level is not
        specified, the compiler selects 3.
          1  Basic level: prefetches array elements in the innermost loop
             only.
          2  In addition to -Kprefetch=1, generates prefetch instructions in
             the loop pre-header for array elements that are accessed in
             the first iteration of the loop.
          3  In addition to -Kprefetch=2, when the access stride for array
             elements is larger than the cache line size, the compiler
             generates a prefetch instruction for each access of cache line
             size.
          4  In addition to -Kprefetch=3, prefetch with address calculation
             is executed.
          5  In addition to -Kprefetch=4, prefetching is applied to array
             data which are accessed indirectly.

    prefetch_cache_level=N
        Specifies the cache level into which data is prefetched. It implies
        that the -KSPARC64_GP2 and -Kprefetch={2|3|4|5} options are in
        effect. N can be specified as follows:
          1  Data is prefetched into the first-level cache. The normal
             prefetch instruction is used.
          2  Data is prefetched into the second-level cache only. A prefetch
             instruction that prefetches only into the second-level cache
             is used.
          3  Both 1 and 2 are in effect. Two kinds of prefetch instruction
             are used, so that a higher level of prefetching is achieved.

    prefetch_infer
        The compiler assumes memory accesses to be continuous and generates
        prefetch instructions.

    pu
        Optimizes by using an existing profile information file.

    SPARC64_GP2
        Optimization for SPARC64 V is applied.

    unroll[=N]
        Performs loop unrolling. N is the upper limit on the number of
        unrolling expansions and should be from 2 to 100. When N is omitted,
        the compiler automatically determines a suitable value.
        Default is -Knounroll if -O0 or -O1 is specified, and -Kunroll if
        -O2 or higher is specified.

    V8PLUS
        Indicates that SPARC V8+ instructions should be used.

    V9
        Indicates that SPARC V9 instructions should be used.

    VIS1
        Specifies output of Visual Instruction Set (VIS) version 1.0
        instructions. Either the -KV8PLUS or the -KV9 option must also be
        specified.

-O[level]
    Specifies the optimization level. The level can be 0, 1, 2, 3, 4 or 5.
    If level is omitted, level 3 is assumed. If the -O option is not
    specified, -O2 is assumed. If the -O option is specified together with
    the -g or -Kcover option, the level is treated as 0 and optimization is
    not done. In addition to the -O option, the Fortran compiler supports
    the -K and -x options for optimization.
      0  No optimization.
      1  Basic optimization.
      2  Loop unrolling in addition to -O1.
      3  Global instruction scheduling, loop tiling and restructuring of
         nested loops in addition to -O2.
      4  Further optimization of loop restructuring (full unrolling,
         splitting for promoting loop exchange, etc.) in addition to -O3.
      5  Creates an object program by applying further optimizations of
         register allocation in addition to -O4.

-SSL2
    The whole set of routines from SSL II, SSL II Thread-Parallel
    Capabilities and BLAS/LAPACK becomes part of the link-edit libraries.

-x inline
    Expands calls to external, internal and module procedures to the
    corresponding lines in the calling procedure. -, pgm1[,pgm2] ..., stno,
    dsizeK or dir=dirname1[,dir=dirname2] ... can be specified in the
    argument inline. If several are specified, they must be delimited by
    commas.
      -    Expands user-defined procedures which have 30 or fewer
           executable statements.
      pgm  Only expands the procedures specified by the argument pgm. For
           pgm, specify the external procedure name for an external
           procedure, and the host procedure name + '.' + internal
           procedure name for an internal procedure. For an internal
           procedure in a module procedure, specify module name + '.' +
           module procedure name + '.' + internal procedure name. pgm may
           be combined with stno or dsizeK.

-x stno
    Expands user-defined procedures which have stno or fewer executable
    statements. stno may be combined with pgm or dsizeK.

-x dir=dirname
    Performs inline expansion of procedures that are defined in files under
    the directory dirname and referenced in the file currently being
    compiled. Only files under the directory whose suffixes are .f, .for,
    .f90 or .f95 are targets of this optimization. The argument dirname is
    the directory name, and the suboption (dir=dirname) can be specified
    multiple times using a comma as a delimiter. When multiple arguments
    are specified, files under all the directories are targets of this
    optimization.

[C]

-K opt
    The -K option can use multiple parameters. For example, -Klib,PIC can be
    used instead of -Klib -KPIC.

    cfunc
        Uses high speed mathematical functions and library functions
        (malloc, calloc, realloc, free) prepared by this compilation system.
        This option functionally includes the -Kmfunc option. This option is
        effective if the -Kmfunc option is also specified. This option is
        ignored if the -mt, -KOMP or -Kparallel option is also specified.

    crossfile
        Specifies the crossfile optimization. If a program consists of
        several files, the compiler refers to these files at one time and
        analyzes data dependency and control relations across them. This
        optimization is called the crossfile optimization. This option is
        effective only if the -O option is also specified. When this option
        is specified, -Kiopt and -Kxi=N are assumed. When N is omitted, the
        compiler automatically determines a suitable value. This option is
        ignored if the -g, -KV9, -KOMP or -Kparallel option is also
        specified. This function can be used under the Parallelnavi
        environment.
    dalign
        Generates instructions assuming that eight-byte integer data,
        double-precision real data, double-precision complex data,
        quadruple-precision real data or quadruple-precision complex data
        referred to by dummy arguments or pointers is aligned on eight-byte
        boundaries.

    eval
        Optimizes by changing the method of operator evaluation. Specifying
        this option may give rise to side effects (precision errors and
        runtime exceptions) in the execution results, leading to unintended
        results. This option is effective only if the -O option is also
        specified.

    fast_GP2[={0|1|2|3}]
        Performs optimization for the SPARC64 V series. When -Kfast_GP2 is
        specified without a level, -Kfast_GP2=1 is assumed. This option is
        ignored if the -g or -Kcover option is also specified. This option
        forcibly overrides the -O0, -O1, -O2 and -KV8 options. -Kfast_GP2=3
        can be used under the Parallelnavi environment.
          -Kfast_GP2=0  Performs the same optimization as the -O3 -Klib
                        -Kdalign -KSPARC64_GP2 -KV8PLUS -Kgs options.
          -Kfast_GP2=1  Performs the same optimization as the -O3 -Klib
                        -Kdalign -KSPARC64_GP2 -KV8PLUS -Kgs options.
          -Kfast_GP2=2  Adds the -Keval option to -Kfast_GP2=1.
          -Kfast_GP2=3  Adds the -Kcrossfile option to -Kfast_GP2=1.
                        -Kfast_GP2=1 is assumed if the -KV9, -KOMP or
                        -Kparallel option is also specified.

    GREG
        The global registers g2 through g7 (when the -KV9 option is
        available, g2, g3, g6 and g7 are used) are subject to register
        allocation in the compile stage. This is equivalent to specifying
        the -KGREG_APPLI,GREG_SYSTEM option.

    gs
        Performs global instruction scheduling. It is ignored if the -O5
        option is specified after it.

    lib
        Recognizing the operation of the standard functions, this option
        replaces them with faster, inline expanded standard functions. If a
        user-defined function with the same name as a standard function is
        used, results unintended by the user may occur. This option is
        effective only if the -O option is also specified.
    pg
        Generates an instruction sequence that produces the profile
        information the compiler refers to in order to perform optimization
        (global instruction scheduling, etc.). This option is effective only
        if the -O option is also specified.

    preex
        Optimizes by moving the evaluation of invariant expressions beyond
        branches. Specifying this option may give rise to side effects in
        the execution results, leading to results unintended by the user.
        This option is effective only if the -O option is also specified.

    pu[=file]
        Performs optimization (global instruction scheduling, etc.) using
        the program runtime profile information obtained by specifying the
        -Kpg option. If both the -Kpg and -Kcrossfile options are specified,
        the name of the profile file produced by the -Kpg option has to be
        specified as file. The number of CPUs and the maximum number of
        threads cannot be changed between the executions with -Kpg and
        -Kpu. This option is effective only if the -O option is also
        specified.

    SPARC64_GP2
        Optimization for SPARC64 V is applied.

    V8PLUS
        Indicates that SPARC V8+ instructions should be used.

    V9
        Indicates that SPARC V9 instructions should be used.

-O[n]
    In n, specify the level of optimization as 0, 1, 2, 3, 4 or 5. When the
    -O option is specified without n, -O3 is assumed. The higher the level
    of optimization, the shorter the execution time and the longer the
    compile time. The higher levels of optimization functionally include
    the lower levels of optimization. This option is not valid for .s
    files.

    Optimization level 0
        No optimization is performed. This is equivalent to not specifying
        the -O option.

    Optimization level 1
        Optimization is performed through detailed analysis of program
        control flow.

    Optimization level 2
        In addition to the optimization of optimization level 1, the
        following optimization is performed:
          - Loop unrolling
        This may increase the object size.
    Optimization level 3
        In addition to the optimization of optimization level 2, the
        following optimizations are performed:
          - Loop unrolling (expanded)
          - Software pipelining
          - Repeated application of optimization functions
        Repeated application of optimization functions means that the
        optimization functions performed in optimization level 1 are
        repeatedly performed until there is no room for further
        optimization.

    Optimization level 4
        In addition to the optimization of optimization level 3, the
        following optimizations are performed:
          - Splitting for promoting loop exchange
          - -KGREG_APPLI option is assumed.

    Optimization level 5
        In addition to the optimization of optimization level 4, the
        following optimizations are performed:
          - Register allocation (expanded)

-----------------------------------------------------------------------------------
[2] Sun Studio 9 flag description

Compiler options    Remark
-----------------------------------------------------------------------------------
cc  (C compiler)
    Invoke the Sun Studio 9 Compiler C.

CC  (C++ compiler)
    Invoke the Sun Studio 9 Compiler C++.

-crit  (optimizer)
    Enable optimization of critical control paths.

-dalign  (C, C++, Fortran)
    Assume data is naturally aligned.

-Dalloca=__builtin_alloca  (Portability flag)
    Portability switch, used for 176.gcc: allow use of compiler's internal
    builtin alloca.

-depend  (Fortran)
    Synonym for -xdepend.

-DHOST_WORDS_BIG_ENDIAN  (Portability flag)
    Portability switch, used for 176.gcc: controls how bytes are numbered
    within a word.

-D__MATHERR_ERRNO_DONTCARE  (C)
    Allows the compiler to assume that your code does not rely on setting
    of the errno variable.

-DSPEC_CPU2000_SOLARIS  (Portability flag)
    Portability switch, used for 253.perlbmk: selects header files and code
    paths compatible with Solaris.

-DSUN  (Portability flag)
    Portability switch, used for 186.crafty: selects header files and code
    paths compatible with Solaris.
-DSYS_HAS_CALLOC_PROTO  (Portability flag)
    Portability switch, used for 254.gap: allows use of the designated
    prototype.

-DSYS_HAS_IOCTL_PROTO  (Portability flag)
    Portability switch, used for 254.gap: allows use of the designated
    prototype.

-DSYS_HAS_SIGNAL_PROTO  (Portability flag)
    Portability switch, used for 254.gap: allows use of the designated
    prototype.

-DSYS_HAS_TIME_PROTO  (Portability flag)
    Portability switch, used for 254.gap: allows use of the designated
    prototype.

-DSYS_IS_USG  (Portability flag)
    Portability switch, used for 254.gap: selects code compatible with
    USG-based systems.

-e  (Portability, Fortran)
    Portability switch, used for 178.galgel: allows source lines to be up
    to 132 characters long.

f90  (Fortran compiler)
    Invoke the Sun Studio 9 Compiler Fortran 90.

-fast  (C)
    A convenience option, this switch selects the following switches that
    are defined elsewhere in this page:
      -D__MATHERR_ERRNO_DONTCARE -fns -fsimple=2 -fsingle
      -xalias_level=basic -xbuiltin=%all -xdepend -xlibmil -xlibmopt
      -xmemalign=8s -xO5 -xprefetch=auto,explicit -xtarget=native

-fast  (C++)
    A convenience option, this switch selects the following switches that
    are defined elsewhere in this page:
      -dalign -fns -fsimple=2 -ftrap=%none -xbuiltin=%all -xlibmil
      -xlibmopt -xO5 -xtarget=native

-fast  (Fortran)
    A convenience option, this switch selects the following switches that
    are defined elsewhere in this page:
      -dalign -depend -fns -fsimple=2 -ftrap=common -xlibmil -xlibmopt
      -xO5 -xpad=local -xprefetch=auto,explicit -xtarget=native
      -xvector=yes

-fixed  (Portability, Fortran)
    Portability switch, used for 178.galgel: assume fixed-format source
    input.

-fns  (C, C++, Fortran)
    Selects faster (but nonstandard) handling of floating point arithmetic
    exceptions and gradual underflow.

-fsimple=  (C, C++, Fortran)
    Controls simplifying assumptions for floating point arithmetic:

-fsimple=0
    Permits no simplifying assumptions. Preserves strict IEEE 754
    conformance.
-fsimple=1
    Allows the optimizer to assume:
      - The IEEE 754 default rounding/trapping modes do not change after
        process initialization.
      - Computations producing no visible result other than potential
        floating-point exceptions may be deleted.
      - Computations with Infinity or NaNs as operands need not propagate
        NaNs to their results. For example, x*0 may be replaced by 0.
      - Computations do not depend on sign of zero.

-fsimple=2
    Permits more aggressive floating point optimizations that may cause
    programs to produce different numeric results due to changes in
    rounding. Even with -fsimple=2, the optimizer still is not permitted to
    introduce a floating point exception in a program that otherwise
    produces none.

-fsingle  (C)
    Evaluate float expressions as single precision.

-ftrap=common  (C, C++, Fortran)
    Sets the IEEE 754 trapping mode to common exceptions (invalid, division
    by zero, and overflow).

-ftrap=%none  (C, C++, Fortran)
    Turns off all IEEE 754 trapping modes.

-library=iostream  (Portability, C++)
    Portability switch, used for 252.eon: allow use of the classic iostream
    library.

-ll2amm  (linker)
    Include a library containing chip specific memory routines.

-lm  (linker)
    Include the math library.

-lmopt  (linker)
    Include the optimized math library. This option usually generates
    faster code, but may produce slightly different results. Usually these
    results will differ only in the last bit.

-noex  (C++)
    Do not allow C++ exceptions. A throw specification on a function is
    accepted but ignored; the compiler does not generate exception code.

-O  (Fortran)
    A synonym for -xO3.

-Qoption
    Pass flags along to compiler phase:
      f90comp  Fortran first pass
      iropt    Global optimizer
      cg       Code Generator

-Qoption cg  (code generator)
    See -Wc, below. (The code generator phase is addressed via -Qoption cg
    in Fortran and C++; and via -Wc in C.)
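The -fsimple=1 assumptions about x*0 and the sign of zero are easier to judge against what strict IEEE 754 arithmetic actually returns. The sketch below uses Python only as a convenient calculator, on the assumption that its float type is an IEEE 754 double, as it is on common platforms; it does not involve the Sun compilers.

```python
import math


def times_zero(x):
    """Multiply by zero with ordinary IEEE 754 double arithmetic."""
    return x * 0.0


nan_result = times_zero(float("nan"))  # a NaN operand propagates to the result
inf_result = times_zero(float("inf"))  # Infinity * 0 is invalid and yields NaN
neg_result = times_zero(-3.0)          # the result is -0.0, not +0.0

print(math.isnan(nan_result), math.isnan(inf_result),
      math.copysign(1.0, neg_result))
```

Replacing x*0 by the constant 0 changes all three results, which is why the substitution is permitted only once the optimizer may assume that NaN propagation and the sign of zero do not matter.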
-Qoption cg -Qeps:enabled=1  (code generator)
    See -Wc,-Qeps:enabled=1

-Qoption cg -Qeps:ws=  (code generator)
    See -Wc,-Qeps:ws=

-Qoption cg -Qgsched-T  (code generator)
    See -Wc,-Qgsched-T

-Qoption cg -Qgsched-trace_late=1  (code generator)
    See -Wc,-Qgsched-trace_late=1

-Qoption iropt  (optimizer)
    See -W2, below. (The optimizer can be addressed either via -Qoption
    iropt in Fortran and C++; or via -W2 in C.)

-Qoption iropt -Addint:sf=  (optimizer)
    When considering whether to interchange loops, set memory store
    operation weight to n. A higher value of n indicates a greater
    performance cost for stores.

-Qoption iropt -Ainline[:cp=][:cs=][:inc=][:irs=][:mi][:recursion=1]  (optimizer)
    See -W2,-Ainline[:cp=][:cs=][:inc=][:irs=][:mi][:recursion=1]

-Qoption iropt -Apf:llist=:noinnerllist  (optimizer)
    Do speculative prefetching for link-list data structures:
      llist=        perform prefetching n iterations ahead
      noinnerllist  do not attempt for innermost loops.

-Qoption iropt -Atile:skewp[:b]  (optimizer)
    Perform loop tiling which is enabled by loop skewing. Loop skewing is a
    transformation that transforms a non-fully interchangeable loop nest to
    a fully interchangeable loop nest. The optional b sets the tiling block
    size to n.

-Qoption iropt -Aujam:inner=g  (optimizer)
    Increase the probability that small-trip-count inner loops will be
    fully unrolled.

RM_SOURCES = lapak.f90  (SPEC tools)
    This option allows building the benchmark 178.galgel without its copy
    of the lapak sources; instead, the LAPACK entry points in the sunperf
    library are used.

rm -rf ./feedback.profile ./SunWS_cache  (Unix)
    Remove any profile feedback information from previous runs.

-W,
    Pass flags along to compiler phase (2=optimizer, c=code generator).

-W2,-Abcopy  (optimizer)
    Increase the probability that the compiler will perform memcpy/memset
    transformations.
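The two spellings for passing a flag to a compiler phase (-Qoption with a phase name in Fortran and C++, -W with a phase code in C) follow a regular pattern, sketched below in Python. The function names are hypothetical, and only the two phases for which this section gives both spellings (iropt/2 and cg/c) are covered.

```python
# Phase names used with -Qoption (Fortran, C++) and the matching phase
# codes used with -W (C), as given in this section.
PHASE_CODE = {"iropt": "2", "cg": "c"}
CODE_PHASE = {code: phase for phase, code in PHASE_CODE.items()}


def qoption_to_w(flag):
    """Translate '-Qoption <phase> <option>' into '-W<code>,<option>'."""
    parts = flag.split()
    if len(parts) != 3 or parts[0] != "-Qoption" or parts[1] not in PHASE_CODE:
        raise ValueError("expected '-Qoption iropt|cg <option>'")
    return "-W" + PHASE_CODE[parts[1]] + "," + parts[2]


def w_to_qoption(flag):
    """Translate '-W<code>,<option>' into '-Qoption <phase> <option>'."""
    head, sep, option = flag.partition(",")
    if not sep or len(head) != 3 or head[:2] != "-W" or head[2] not in CODE_PHASE:
        raise ValueError("expected '-W2,<option>' or '-Wc,<option>'")
    return "-Qoption " + CODE_PHASE[head[2]] + " " + option


print(qoption_to_w("-Qoption cg -Qeps:enabled=1"))
print(w_to_qoption("-W2,-crit"))
```

The f90comp phase is left out on purpose, since no -W code is listed for it.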
-W2,-Ainline[:cp=][:cs=][:inc=][:irs=][:mi][:recursion=1]  (optimizer)
    Control the optimizer's loop inliner:
      (without a value)  Perform Inter-Procedural Analysis (IPA)-based
                         inlining.
      cp=                The minimum call site frequency counter in order
                         to consider a routine for inlining.
      cs=                Set inline callee size limit to n. The unit
                         roughly corresponds to the number of instructions.
      inc=               The inliner is allowed to increase the size of the
                         program by up to n%.
      irs=               Allow routines to increase by up to n. The unit
                         roughly corresponds to the number of instructions.
      mi                 Perform maximum inlining (without considering code
                         size increase).
      recursion=1        Allow routines that are called recursively to
                         still be eligible for inlining.

-W2,-crit  (optimizer)
    Enable optimization of critical control paths.

-W2,-Apf:llist=:noinnerllist  (optimizer)
    Do speculative prefetching for link-list data structures:
      llist=        perform prefetching n iterations ahead
      noinnerllist  do not attempt for innermost loops.

-W2,-Ashort_ldst  (optimizer)
    Convert multiple short memory operations into single long memory
    operations.

-W2,-whole  (optimizer)
    Do whole program optimizations.

-Wc,-Qdepgraph-early_cross_call=1  (code generator)
    There are several scheduling passes in the compiler. This option allows
    early passes to move instructions across call instructions.

-Wc,-Qeps:enabled=1  (code generator)
    Use enhanced pipeline scheduling (EPS) and selective scheduling
    algorithms for instruction scheduling.

-Wc,-Qeps:ws=  (code generator)
    Set the EPS window size, that is, the number of instructions it will
    consider across all paths when trying to find independent instructions
    to schedule a parallel group. Larger values may result in better run
    time, at the cost of increased compile time.

-Wc,-Qgsched-T  (code generator)
    Sets the aggressiveness of the trace formation, where n is 4, 5, or 6.
    The higher the value of n, the lower the branch probability needed to
    include a basic block in a trace.
-Wc,-Qgsched-trace_late=1  (code generator)
    Turns on the late trace scheduler.

-Wc,-Qipa:valueprediction  (code generator)
    Use profile feedback data to predict values and attempt to generate
    faster code along these control paths, even at the expense of possibly
    slower code along paths leading to different values. Correct code is
    generated for all paths.

-Wc,-Qlp=[-av=][-t=][-fa=][-fl=]  (code generator)
    Control irregular loop prefetching:
      lp=   Turns the module on (1) or off (0) (default is on for F90; off
            for C/C++)
      -av=  Sets the prefetch look ahead distance, in bytes. Default is
            256.
      -t=   Sets the number of attempts at prefetching. If not specified,
            t=2 if -xprefetch_level=3 has been set; otherwise, defaults to
            t=1.
      -fa=  1=Force user settings to override internally computed values.
      -fl=  1=Force the optimization to be turned on for all languages.

-Wc,-Qms_pipe-pref  (code generator)
    Turn off prefetching within modulo scheduling.

-xalias_level=[basic|std|strong]  (C)
    Allows the compiler to perform type-based alias analysis at the
    specified alias level:
      basic   Assume that memory references that involve different C basic
              types do not alias each other.
      std     Assume aliasing rules described in the ISO 1999 C standard.
      strong  In addition to the restrictions at the std level, assume that
              pointers of type char * are used only to access an object of
              type char; and assume that there are no interior pointers.

-xalias_level=compatible  (C++)
    Allows the compiler to assume that layout-incompatible types are not
    aliased.

-xarch=  (C, C++, Fortran)
    Limit the set of instructions the compiler may use to generic,
    generic64, native, native64, v7, v8a, v8, v8plus, v8plusa, v8plusb, v9,
    v9a, v9b.
    Typical settings include:
      UltraSPARC-II, 32-bit mode:   v8plusa
      UltraSPARC-II, 64-bit mode:   v9a
      UltraSPARC-III, 32-bit mode:  v8plusb
      UltraSPARC-III, 64-bit mode:  v9b
    For more information, see the Fortran User's Guide at docs.sun.com

-xbuiltin=%all  (C, C++)
    Substitute intrinsic functions or inline system functions where
    profitable for performance.

-xchip=  (C, C++, Fortran)
    Specifies the target processor for use by the optimizer. c must be one
    of: generic, native, old, super, super2, micro, micro2, hyper, hyper2,
    powerup, ultra, ultra2, ultra2i, ultra3, ultra3cu, ultra3i, ultra4,
    386, 486, pentium, pentium_pro, pentium3, pentium4

-xcache=  (C, C++, Fortran)
    Defines the cache properties for use by the optimizer. c must be one of
    the following:
      * native (set parameters for the host environment)
      * s1/l1/a1
      * s1/l1/a1:s2/l2/a2
      * s1/l1/a1:s2/l2/a2:s3/l3/a3
    The si/li/ai are defined as follows:
      si  The size of the data cache at level i, in kilobytes.
      li  The line size of the data cache at level i, in bytes.
      ai  The associativity of the data cache at level i.

-xdepend  (C, Fortran)
    Analyze loops for inter-iteration data dependencies, and do loop
    restructuring.

-xinline=  (C, C++, Fortran)
    Turn off inlining.

-xipo[=2]  (C, C++, Fortran)
    Perform optimizations across all object files in the link step:
      0=off
      1=on
      2=performs whole-program detection and analysis

-xlibmil  (C, C++, Fortran)
    Use inline expansion for math library, libm.

-xlibmopt  (C++, Fortran)
    Select the optimized math library.

-xlic_lib=sunperf  (C, C++, Fortran)
    Link with Sun supplied licensed sunperf library.

-xlinkopt  (C, C++, Fortran)
    Perform link-time optimizations, such as branch optimization and cache
    coloring.

-xO  (C, C++, Fortran)
    Specify optimization level n:
      -xO1  Does only basic local optimizations (peephole).
      -xO2  Do basic local and global optimizations, such as induction
            variable elimination, common subexpression elimination,
            constant propagation, register allocation, and basic block
            merging.
      -xO3  Add global optimizations at the function level, loop unrolling,
            and software pipelining.
      -xO4  Adds automatic inlining of functions in the same file.
      -xO5  Uses optimization algorithms that may take significantly more
            compilation time or that do not have as high a probability of
            improving execution time, such as speculative code motion.

-xpad=common[:]  (Fortran)
    If multiple same-sized arrays are placed in common, insert padding
    between them for better use of cache. n specifies the amount of padding
    to apply, in units that are the same size as the array elements. If no
    parameter is specified then the compiler selects one automatically.

-xpad=local  (Fortran)
    Pad local variables, for better use of cache.

-xpagesize=  (C, C++, Fortran)
    Set the preferred page size for running the program.

-xprefetch=auto,explicit  (C, C++, Fortran)
    Allow generation of prefetch instructions. -xprefetch and
    -xprefetch=yes are synonyms for -xprefetch=auto,explicit.

-xprefetch=latx:  (C, C++, Fortran)
    Adjust the compiler's assumptions about prefetch latency by the
    specified factor. Typically values in the range of 0.5 to 2.0 will be
    useful. A lower number might indicate that data will usually be cache
    resident; a higher number might indicate a relatively larger gap
    between the processor speed and the memory speed (compared to the
    assumptions built into the compiler).

-xprefetch=no%auto  (C, C++, Fortran)
    Turn off prefetch instruction generation.

-xprefetch_level=  (C, C++, Fortran)
    Control the level of searching that the compiler does for prefetch
    opportunities by setting n to 1, 2, or 3, where higher numbers mean to
    do more searching. The default is 2.

-xprofile=collect:./feedback  (C, C++, Fortran)
    Collect profile data for feedback-directed optimization, and store it
    in a subdirectory of the current directory, named ./feedback.

-xprofile=use:./feedback  (C, C++, Fortran)
    Use data collected for profile feedback. Look for it in a subdirectory
    of the current directory, named ./feedback.
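The two -xprofile entries are normally used as a pair around a training run: build with collect, run the program once, then rebuild with use. The Python sketch below only assembles the three command lines and runs nothing. The helper name, the source and executable names, the default -xO5, the use of the conventional -o output flag and the "./exe" training command are all illustrative assumptions.

```python
def feedback_build_steps(compiler, sources, exe,
                         profile_dir="./feedback", opt_flags=("-xO5",)):
    """Return the command lists for a collect build, a training run and a use build."""
    common = [compiler, *opt_flags]
    collect = common + ["-xprofile=collect:" + profile_dir, "-o", exe, *sources]
    train = ["./" + exe]  # hypothetical training run that writes the profile
    use = common + ["-xprofile=use:" + profile_dir, "-o", exe, *sources]
    return [collect, train, use]


steps = feedback_build_steps("cc", ["main.c"], "bench")
for step in steps:
    print(" ".join(step))
```

Passing the same optimization flags to both builds is a deliberate choice in this sketch, so that the collected profile describes the same code that the second build optimizes.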
-xregs=syst  (C, C++, Fortran)
    Allows use of the system reserved registers %g6 and %g7, and %g5 if not
    already allowed by -xarch value.

-xrestrict  (C)
    Treat pointer-valued function parameters as restricted pointers.

-xsafe=mem  (C, C++, Fortran)
    Enables the use of non-faulting loads when used in conjunction with
    -xarch=v8plus. Assumes that no memory based traps will occur.

-xsfpconst  (C, C++, Fortran)
    Represents unsuffixed floating-point constants as single precision.

-xtarget=[system_name]  (C, C++, Fortran)
    Selects options appropriate for the system where the compile is taking
    place, including architecture, chip, and cache sizes. (These can also
    be controlled separately, via -xarch, -xchip, and -xcache,
    respectively.)

-xunroll=n  (C, C++, Fortran)
    Specifies whether or not the compiler optimizes (unrolls) loops. n is a
    positive integer. When n is 1, it is a command and the compiler unrolls
    no loops. When n is greater than 1, -xunroll=n merely suggests to the
    compiler that it unroll loops n times.

-xvector  (C, Fortran)
    Allow the compiler to transform math library calls within loops into
    calls to the vector math library.

-----------------------------------------------------------------------------------
[3] Environment Variables

Flag    Remark
-----------------------------------------------------------------------------------
LD_LIBRARY_PATH=
    Specify the locations to resolve dynamic link dependencies.

LD_PRELOAD=mpss.so.1
    Allow use of the mpss.so.1 shared object, which provides a means by
    which preferred stack and/or heap page sizes can be selected.

MPSSHEAP=
    Specify the preferred page size for heap. The specified page size is
    applied to all created processes.

MPSSSTACK=
    Specify the preferred page size for stack. The specified page size is
    applied to all created processes.

ulimit -s unlimited
    Allow stack size to grow without limit.

-----------------------------------------------------------------------------------
[4] Kernel Parameters (/etc/system)

System Tunable    Remark
-----------------------------------------------------------------------------------
autoup
    The frequency of file system sync operations.

consistent_coloring
    Controls the page coloring policy. It can be set to one of the
    following:
      0  (default) dynamic (uses various vaddr bits)
      1  static (virtual=paddr)

tune_t_fsflushr
    The number of seconds between fsflush invocations for checking dirty
    memory.

--------------------------------------------------------------------------------
[5] Commands for feedback control

Command    Remark
--------------------------------------------------------------------------------
Parallelnavi compiler:
    fdo_pre0 = rm -rf `pwd`*.f.d
    fdo_pre0 = rm -rf `pwd`*.fbk
    Remove the profile data generated at the last feedback-optimized
    compilation.

Sun Studio 9 compiler:
    fdo_pre0 = rm -rf `pwd`/..feedback.profile
    fdo_pre0 = rm -rf `pwd`/SunWS_cache
    Remove the profile data generated at the last feedback-optimized
    compilation.
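The cleanup that the fdo_pre0 commands perform before a feedback-optimized build can be sketched in Python as shown here. This is an illustration, not a replacement for the commands above: the function name is hypothetical, and it assumes the profile data sits directly in the working directory under the names used in this document (*.f.d, *.fbk, feedback.profile, SunWS_cache).

```python
import glob
import os
import shutil
import tempfile


def remove_profile_data(workdir, patterns):
    """Delete files and directories in workdir that match any glob pattern."""
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(workdir, pattern)):
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)  # directory, like rm -rf on a directory
            else:
                os.remove(path)      # plain file or symbolic link
            removed.append(os.path.basename(path))
    return sorted(removed)


# Demonstration in a throwaway directory so that no real data is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.f.d", "b.fbk", "keep.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    os.mkdir(os.path.join(demo_dir, "SunWS_cache"))
    removed = remove_profile_data(
        demo_dir, ["*.f.d", "*.fbk", "feedback.profile", "SunWS_cache"])
    survivors = sorted(os.listdir(demo_dir))

print(removed, survivors)
```

A pattern that matches nothing is skipped silently, which mirrors how rm -rf treats a missing path.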