Name

kf90 - compiler driver used with kapf to optimize application performance

Syntax

kf90 [ switches ] filename

Description

The command invokes a series of language translators for the compilation of Fortran programs. For each input file, the translators are executed in the following order by default:

       * KAP high-level optimizer
       * compiler
       * linker

Execution of one or more of the translators may be inhibited. Switches which apply to the preprocessor and high-level optimizer are recognized and passed accordingly. Unrecognized switches are passed to the compiler only. Linker switches are passed to the linker. Files with unrecognized extensions will be passed to the compiler untouched.

EXAMPLES

Switches

COMMAND LINE SWITCHES

-[no]fkap[='Fortran_kap_path']

Default: -fkap='/usr/bin/kapf'

This switch enables or inhibits execution of the KAP Fortran high-level optimizer and optionally specifies an alternative path to it.

-fkapargs='kap_option_string'

This switch passes switches to the KAP Fortran high-level optimizer.

-[no]f90[='Fortran_compiler_path']

Default: -f90='/usr/bin/f90'

This switch enables or inhibits execution of the Fortran compiler and optionally specifies an alternative path to it.

-fext=Fortran_file_extension_string

Default: -fext=f

Treat files with the indicated extension as Fortran source files.

-v

Print the commands invoking passes as they execute. This switch is also passed to the compiler.

-tmpdir=temporary_directory_path_string

Default: -tmpdir=/tmp/

The directory in which to place temporary files. This value may also be set with the environment variable TMPDIR.

-sif[={kap}]
-S

Default: off

Save intermediate files. Specifying -sif is equivalent to -sif=kap. Specifying -S is equivalent to -sif=kap and also passes -S to the compiler, which saves the assembly language output. Intermediate file naming conventions are as follows:

       K<file>.f        - KAP Fortran output file

The path and switch strings shown above must be enclosed in single or double quotes if they contain white space characters.

FILES

file.f90    - input Fortran file
file.out    - output KAP listing file
file.o    - output object file

COMMAND LINE SWITCHES

-ag=<list>
Long name: -aggressive=<list>
Default value: -nag
-nag=<list>
Long name: -noaggressive=<list>
-aggressive=a means that kapf90 will pad COMMON blocks in an attempt to avoid cache line collisions. This assumes the following:

       * All COMMON blocks will be visible to kapf90 in the course of processing the entire file.
       * If the same COMMON block has two different layouts, these two layouts are fully independent and do not pass values between each other.
-arclm=<integer>
Long name: -arclimit=<integer>
Default value: -arclm=5000
The arclimit switch is used to increase the size of the dependence arc data structure that kapf90 uses to perform data dependence analysis. This data structure is dynamically allocated on a loop-nest by loop-nest basis.

The formula which is used to estimate the number of dependence arcs for a given loop nest is:

dependence_array_size=max(#_of_statements * 4, arclimit value)

This is an estimate because kapf90 is assuming that each statement, in the worst case, would have 4 dependence arcs.
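
As a rough illustration of the formula above, the following Python sketch computes the estimate. The function and parameter names are hypothetical and are not part of kapf90; only the formula and the default arclimit of 5000 come from the text.

```python
def dependence_array_size(num_statements, arclimit=5000):
    """Estimate from the formula above: max(statements * 4, arclimit)."""
    # kapf90 assumes at most 4 dependence arcs per statement (worst case).
    return max(num_statements * 4, arclimit)


print(dependence_array_size(100))    # small loop nest: the arclimit value dominates
print(dependence_array_size(2000))   # large loop nest: 4 arcs per statement dominates
```

Under this reading, raising -arclimit only matters for loop nests with fewer than arclimit/4 statements; larger nests already get the statement-based estimate.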

-a=<list>
Long name: -assume=<list>
Default value: -a=cel
-nas
Long name: -noassume
list can contain the following characters:
a
Allow multiple aliasing
b
Allow array bounds violation
c
Constant arguments are assigned to temporaries in procedure and function calls
e
Equivalenced variables do not refer to the same memory location inside one DO loop nest
l
Last value assignments are necessary

To disable all the above assumptions, give -noassume on the command line.

-chl=<integer>[,<integer>]
Long name: -cacheline=<integer>[,<integer>]
Default value: -chl=32,32
The cacheline switch informs kapf90 of the width of the memory channel in bytes between cache and main memory. Cacheline can take a second argument. When two arguments are specified, the first argument gives the width of the memory channel between the primary cache and the secondary cache, and the second argument gives the width of the memory channel between the secondary cache and main memory. Omitting the second argument, or specifying it as 32 (the default), instructs KAP not to optimize secondary cache usage.
-cplc=<integer>
Long name: -cache_prefetch_line_count=<integer>
Default value: -cplc=0
The cache_prefetch_line_count switch gives the number of additional lines prefetched into the cache during a cache miss.

-chs=<integer>[,<integer>]
Long name: -cachesize=<integer>[,<integer>]
Default value: -chs=8,0
The cachesize switch informs kapf90 of the size in kilobytes of the cache memory.

When two arguments are specified, the first argument gives the size of the primary cache, and the second argument gives the size of the secondary cache. Omitting the second argument, or specifying it as 0 (the default), tells KAP not to optimize secondary cache usage.

-cmp[=<file>]
Short name: -cmp[=<file>]
-nocmp
Short name: -ncmp
Default value: <file>.cmp.f90, <file>.cmp.f
The -cmp switch causes KAP to save the optimized source program under the file name of your choice. By default, kf90 names the optimized source <file>.cmp.f90 when the source file extension is .f90. If the source file extension is .f, .for, or .FOR, the default is to name the optimized source <file>.cmp.f.

The kapf90 default is to name the optimized source program <file>.cmp, regardless of the input file extension. Because the Fortran 90 compiler will not process a file with the default .cmp extension, you should override the default. For example, use the -cmp switch in the kapf90 command line to rename the optimized source <file>.cmp.f90.

Both kf90 and kapf90 place the optimized source file in the current directory. To disable generation of the optimized Fortran output file, enter -nocmp on the command line.

-conc
Long name: -concurrentize
Default value: -noconc
-noconc
Long name: -noconcurrentize
The concurrentize switch directs KAP to restructure the source code for parallel processing.

Setting -noconcurrentize disables parallel execution and allows all serial optimizations to take place. You can enable and disable parallel execution on a module by module basis using KAP directives or on a loop by loop basis using KAP assertions.

Programs containing many loops which require synchronization or programs that have loops with small iteration counts may run more slowly when parallelized. In these cases you should disable parallel execution.

-cp=<list>
Long name: -cmpoptions=<list>
Default value: -cp=n
-ncp
Long name: -nocmpoptions
The cmpoptions switch specifies optional additional information or formatting to include in the transformed code file (.cmp). <list> can contain the following characters:
i
Insert special numbers that reference the original code
n
Create transformed code from internal data structures

Specifying -cmpoptions=n instructs kapf90 to create the transformed code from its internal data structures. Specifying -nocmpoptions will instruct kapf90 to use lines from the source file, where feasible. Using the internal data structures for the code will provide consistent indentation and formatting but also all new labels and other changes from the source code. This may make relating source and transformed code more difficult.

Special line numbers are # line comments which may appear in the transformed program file in order to reference line numbers of the original code. The line in the transformed code that immediately follows a # line comment is either the transformed version of the line in the original code that is referenced, or a line which kapf90 inserted before the referenced line. The name of the source file from the command line is included in the form it had on the kapf90 command line.

-[n]ds
Long name: -[no]datasave
Default value: -ds
The datasave switch instructs kapf90 to treat local variables in a subroutine or function which appear in DATA statements as if they were also in SAVE statements. That is, their values will be retained between invocations of the subroutine or function. This is the practice of many commercial Fortran compilers. This choice affects certain optimizations performed by kapf90. The nodatasave switch complies with the Fortran-77 standard.
-dr=<list>
Long name: -directives=<list>
Default value: -dr=ak
-ndr
Long name: -nodirectives
The directives switch controls which directives are accepted by kapf90. <list> can contain the following characters:
a
kapf90 assertions are accepted
k
kapf90 !*$* or *$* directives are accepted
v
VAST CVD$ directives

Setting -nodirectives disables the acceptance of all directives.

-[n]dl
Long name: -[no]dlines
Default value: -ndl
The dlines switch allows a D in column 1 to be treated as a blank character. The rest of that line is then parsed as a normal Fortran statement. By default, kapf90 treats these lines as comments. This switch is useful for the inclusion or exclusion of debugging lines.
-dpr=<integer>
Long name: -dpregisters=<integer>
Default value: -dpr=32
The dpregisters switch specifies the number of DOUBLE PRECISION registers each processor has.
-eiifg=<integer>
Long name: -each_invariant_if_growth=<integer>
Default value: -eiifg=20
When a loop contains an IF statement whose condition does not change from one iteration to another (that is, the condition is loop invariant), the same test must be repeated for every iteration. The code can often be made more efficient by floating the IF outside the loop and putting the THEN and ELSE sections into their own loops.

This gets more complicated when there is other code in the loop, since a copy of it must be included in both the THEN and ELSE loops. The each_invariant_if_growth switch limits the amount of additional code generated in a single loop through invariant IF floating. The total amount of additional code generated in a program unit through invariant IF floating can be limited with the max_invariant_if_growth switch.
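
The transformation described above can be sketched as follows. Python is used here only to illustrate the idea; KAP itself operates on Fortran source, and the function names and the loop body are invented for this example.

```python
def original(a, b, flag):
    # The test on flag is repeated on every iteration although it never changes.
    out = []
    for x, y in zip(a, b):
        t = x * 2              # "other code" in the loop
        if flag:
            out.append(t + y)
        else:
            out.append(t - y)
    return out


def floated(a, b, flag):
    # The invariant IF is tested once; the other code is copied into both loops.
    out = []
    if flag:
        for x, y in zip(a, b):
            t = x * 2
            out.append(t + y)
    else:
        for x, y in zip(a, b):
            t = x * 2
            out.append(t - y)
    return out


print(original([1, 2, 3], [4, 5, 6], True) == floated([1, 2, 3], [4, 5, 6], True))
```

The duplicated `t = x * 2` line is the code growth that the invariant IF growth switches bound.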

-escape
Long name: -[no]escape
Default value: -escape
The -escape switch causes KAP to scan escape characters in input lines.
-fpr=<integer>
Long name: -fpregisters=<integer>
Default value: -fpr=32
The fpregisters switch specifies the number of single-precision (ordinary floating-point) registers each processor has.
-ff
Long name: -freeformat
Default value: -nff
The freeformat switch removes the standard column restrictions for Fortran source code. For example, source lines can be up to 132 columns wide and use an ampersand (&) at the end of the line to indicate continuation. See the Fortran Language Reference manual for more information.

Setting -freeformat=f90 allows KAP to accept Fortran 90 conventions and extensions. Continuation lines are indicated with an ampersand (&) as the first character of the continuation line.

The -freeformat switch is off by default, and the usual Fortran 90 fixed-form conventions apply. For example, lines are truncated after column 72 unless you specify the DEC Fortran 90 flag -extend_source. A character (except a zero or a blank) in column 6 indicates a continuation line.

-fuse
Long name: -fuse
Default value: -nofuse
The fuse switch tells KAP to perform loop fusion. Loop fusion is a conventional compiler optimization that transforms two adjacent loops into a single loop. Data dependence tests allow fusion of more loops than standard techniques allow. Before KAP can perform loop fusion, you must specify the switch -scalaropt=2 or -optimize=5.
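
A minimal sketch of what loop fusion does, written in Python purely for illustration (KAP works on Fortran source; the function names and loop bodies are made up for this example):

```python
def unfused(n):
    a = [0.0] * n
    b = [0.0] * n
    for i in range(n):          # first loop
        a[i] = i * 2.0
    for i in range(n):          # adjacent loop over the same range
        b[i] = a[i] + 1.0
    return a, b


def fused(n):
    a = [0.0] * n
    b = [0.0] * n
    for i in range(n):          # one loop does the work of both
        a[i] = i * 2.0
        b[i] = a[i] + 1.0       # reads only a[i], already set in this iteration
    return a, b


print(unfused(4) == fused(4))
```

Fusion is legal here because the second statement depends only on a value produced earlier in the same iteration; this is the kind of condition a data dependence test has to establish.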
-fuselevel
Long name: -fuselevel
Default value: -fuselevel=0
The fuselevel=1 switch causes KAP to attempt loop fusion after making additional passes through the source program to gather information about data dependencies. To activate fuselevel=1, you must also use the fuse switch.

The default is fuselevel=0. The effect of fuselevel=0 is equivalent to setting the fuse switch.

-generateh
Default value:  off
KAP automatically sets the -generateh switch for you. Digital recommends that you do not set the -generateh switch.

KAP needs two passes to resolve Fortran 90 forward declarations. The first pass, the generateh pass, builds the information needed to analyze the program for forward references.

-hdir=directory_name
Default value: -hdir=current_directory
The -hdir=directory_name switch specifies the name of the directory where the KAP -generateh pass stores the temporary files containing information about forward references. The -useh switch picks up the information from that directory. The default is the current directory.

KAP automatically sets the -hdir switch for you. Digital recommends that you do not set the -hdir switch.

-heap=<integer>
Long name: -heaplimit=<integer>
Default value: -heaplimit=116
KAP may require large amounts of memory to process your source code. The -heaplimit option specifies the maximum size, in megabytes, to which the KAP heap can grow. If this limit is exceeded, KAP will stop processing your source code and try to exit with an "out of memory" error message.

If you choose a -heaplimit setting that is greater than the amount of memory that your machine has available, KAP may run out of memory before it reaches the -heaplimit.

KAP relies upon the operating system to tell it that the process has run out of memory before that problem occurs. Some operating systems kill KAP without first telling KAP that there is insufficient memory. In that case, KAP may stop processing your code and exit in an undefined manner. Using -heaplimit makes a graceful exit more likely.

-hli=<integer>
Long name: -hoist_loop_invariants=<integer>
Default value: -hli=1
The hoist_loop_invariants switch controls code hoisting of loop-invariant expressions from loops. Note that this switch is independent of the switches, each_invariant_if_growth and max_invariant_if_growth, that control the floating of invariant-IFs out of loops. The possible settings for hoist_loop_invariants are the following:
0 -- Turns off the hoisting of invariant code from loops.
1 -- Floats all loop invariant expressions not under the control of an IF-structure within the given loop nest.
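
Hoisting a loop-invariant expression can be sketched as follows. This is an illustrative Python example with invented names, not KAP output; KAP applies the transformation to Fortran loops.

```python
def not_hoisted(xs, scale, offset):
    out = []
    for x in xs:
        # scale + offset does not depend on x but is recomputed each iteration.
        out.append(x * (scale + offset))
    return out


def hoisted(xs, scale, offset):
    factor = scale + offset     # invariant expression computed once, before the loop
    out = []
    for x in xs:
        out.append(x * factor)
    return out


print(not_hoisted([1, 2, 3], 2, 3) == hoisted([1, 2, 3], 2, 3))
```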

-[n]ig
Long name: -[no]ignoreoptions
Default value: -nig
The ignoreoptions switch allows the user to direct kapf90 to ignore a !*$*OPTIONS or *$*OPTIONS line at the beginning of a file, thereby having the command line switches override the options card. The default is to accept the command line switches specified on the !*$*OPTIONS line.
-inc=<path name>
Long name: -include=<path name>
Default value: off
The include switch allows the user to specify an alternate directory for locating the files specified in INCLUDE statements.

An include file whose name does not begin with a slash (/) is sought first in the directory containing the file containing the INCLUDE statement or directive, then in the directory named in the include switch.

-inl[=<names>]
Long name: -inline[=<names>]
Default value:  off
-ninl=<names>
Long name: -noinline=<names>
Default value:  off
-[no]interchange
Long name: -interchange
Default value: -interchange
-ninterchange
Long name: -nointerchange
Use the interchange switch to enable or disable loop interchanging. KAP enables loop interchange when -interchange is specified and the -optimize level is at least 1 or the -scalaropt level is 3. If you specify -nointerchange, KAP disables loop interchange regardless of the -optimize or -scalaropt levels. Loop interchanging is enabled by default.
-intl
Long name: -interleave
Default value: -interleave
-nintl
Long name: -nointerleave
The -interleave switch controls loop unrolling and rescheduling. Interleaved unrolling can help the compiler recognize quad-word loads and stores, which are more efficient than ordinary loads and stores. It does this by first unrolling the loop as in ordinary loop unrolling. Second, the statements in the loop are interchanged where possible to make references to the same array adjacent to each other. Interleaved unrolling can be demonstrated by the example below:

       real A(100), B(100)

       do I = 1, 100
          A(I) = 99.
          B(I) = 100.
       enddo

       print *, A, B
       end

The output from KAP with interleaved unrolling turned on, -interleave, is:

       real A(100), B(100)

       do I=1,97,4
          A(I) = 99.
          A(I+1) = 99.
          A(I+2) = 99.
          A(I+3) = 99.
          B(I) = 100.
          B(I+1) = 100.
          B(I+2) = 100.
          B(I+3) = 100.
       end do

       print *, A, B
       end

The code produced with interleaved unrolling turned off, -nointerleave, is:

       real A(100), B(100)

       do I=1,97,4
          A(I) = 99.
          B(I) = 100.
          A(I+1) = 99.
          B(I+1) = 100.
          A(I+2) = 99.
          B(I+2) = 100.
          A(I+3) = 99.
          B(I+3) = 100.
       end do

       print *, A, B
       end

The default value is -interleave.

-ipa[=<names>]
Long name: -ipa[=<names>]
Default value:  off
-nipa=<names>
Long name: -noipa=<names>
Default value:  off

The inline and ipa switches provide kapf90 with a list of routines to inline or analyze. If the switch is given without an argument list, kapf90 will try to inline/analyze all the called functions in the inlining universe specified by the inline_from../ipa_from.. switches. If a list of names is included, for example, -inline=mkcoef,yval, then just the routines named will be inlined/analyzed. Additionally, -ipa causes KAP to give information in the annotated listing about appropriate settings for the -ind, -inll, and -ipall switches on a loop by loop basis.

The no forms instruct kapf90 to inline/analyze all routines except those in the list. The list is required.

-inlc=<names>
Long name: -inline_and_copy=<names>
Default value:  off

The inline_and_copy command line switch functions like the inline switch, except that if all CALLs or references to a subprogram are inlined, the text of the routine is not optimized but is copied unchanged to the transformed code file. This is intended for use when inlining routines from the same file as the call. Inline_and_copy has no special effect when the routines being inlined are taken from a library or another source file.

After a subprogram has been inlined everywhere it is used, leaving it unoptimized saves compilation time. When a program involves multiple source files, the unoptimized routine will still be available in case one of the other source files contains a reference to it, so no errors will result.

Note: The inline_and_copy algorithm assumes that all CALLs and references to the routine precede it in the source file. If the routine is referenced after the text of the routine, and that particular call site cannot be inlined, the unoptimized version of the routine will be invoked.

-incr[=<file>]
Long name: -inline_create[=<file>]
Default value:  off
-ipacr[=<file>]
Long name: -ipa_create[=<file>]
Default value:  off

The inline_create and ipa_create switches instruct kapf90 to build a library file containing partially analyzed routines for later inlining. The library created is used with the inline_from_libraries or ipa_from_libraries switch. Libraries created with inline_create can be used with either inlining or interprocedural analysis, since they contain essentially complete descriptions of the functions included. Libraries created with ipa_create can be used only with interprocedural analysis, since they do not have the complete text of the functions, just the data relationship information.

Any filename can be used for the library name. An extension .klib is preferred for maximum compatibility with the ...from_libraries switches. If either of these switches is given without a file name, the created library is named <file>.klib, where <file> is the source file name with any trailing .f, .ftn, or .for removed.

-ind=<integer>
Long name: -inline_depth=<integer>
Default value:  -ind=2
The inline_depth switch sets the maximum level of subprogram nesting which kapf90 will attempt to inline. Higher values instruct kapf90 to trace CALLs and function references further. The values and their meanings are:
1-10
Inline routines to this depth.
0
Use the default value.
-1
Inline only routines which do not contain subroutine CALLs or function references.

The !*$*[no]inline directive, when enabled, is not affected by the inline_depth restrictions.

-ipad=<integer>
Long name: -ipa_depth=<integer>
Default value:  -ipad=2
The ipa_depth switch sets the maximum level of subprogram nesting which kapf90 will attempt to analyze. Higher values instruct kapf90 to trace CALLs and function references further. The values and their meanings are:
1-10
Analyze routines to this depth.
0
Use the default value.
-1
Analyze only routines which do not contain subroutine CALLs or function references.

The !*$*[no]ipa directive, when enabled, is not affected by the ipa_depth restrictions.

-inff=<file>,<file>
Long name: -inline_from_files=<file>,<file>
Default value: current source file

-ipaff=<file>,<file>
Long name: -ipa_from_files=<file>,<file>
Default value: current source file
-infl=<library>,<library>
Long name: -inline_from_libraries=<library>,<library>
Default value:  off
-ipafl=<library>,<library>
Long name: -ipa_from_libraries=<library>,<library>
Default value:  off
The .._from_.. switches provide kapf90 with the locations of functions available for inlining/interprocedural analysis. The total set of available functions is called the inlining or IPA universe.

The .._from_files switches take the names of source files and directories containing source files. Including a directory, for example, -ipaff=/usr/ipalib, is equivalent to the UNIX notation /usr/ipalib/*.c. Do not use shell wild card characters in the list of files and directories.

The .._from_libraries switches take the names of libraries created with the .._create switches and directories containing such libraries. In directories, the kapf90 libraries are identified by the extension .klib .

Multiple files/libraries or directories may be given in one .._from_.. switch, separated by commas. Multiple .._from_.. switches may be specified on the command line.

-inll=<integer>
Long name: -inline_looplevel=<integer>
Default value:  -inll=2
-ipall=<integer>
Long name: -ipa_looplevel=<integer>
Default value:  -ipall=2

The .._looplevel switches enable the user to limit inlining to just functions which are referenced in nested loops where the effects of reduced function call overhead or enhanced optimizations will be multiplied.

The parameter is defined from the most deeply nested function reference. For example, -inll=1 restricts inlining to functions referenced in the deepest loop nest, and -inll=3 restricts inlining to those routines referenced at the three deepest levels. The DO loop nest level of each function reference is included in the optional calling tree section of the listing files.

The !*$*[NO]INLINE and !*$*[NO]IPA directives, when enabled, are not affected by the looplevel restrictions.

-inm
Long name:  -inline_manual
Default value: off
-ipam
Long name:  -ipa_manual
Default value: off
The inline_manual and ipa_manual switches instruct kapf90 to recognize the !*$*ASSERT [NO]IPA directives. This allows manual control over which functions are inlined/analyzed at which call sites.

The default is to ignore these directives. They are enabled when any inlining (IPA) switch is given on the command line. When -inline_manual or -ipa_manual is included on the command line, the !*$*INLINE or !*$*IPA directives are enabled without enabling the automatic inlining algorithms. Since !*$*[NO]INLINE and !*$*[NO]IPA override the -inline=/-ipa=, -inline_depth, and -.._looplevel command line switches, they can be used along with command line control to select routines or call sites which the regular selection algorithm would reject or to prevent specific routines or CALL sites from being inlined/analyzed.

-inline_optimize=<integer>
Long name:  -inline_optimize=<integer>
Default value:  -inline_optimize=0
-ipa_optimize=<integer>
Long name:  -ipa_optimize=<integer>
Default value:  -ipa_optimize=0
The inline_optimize=<integer> and ipa_optimize=<integer> switches aid in optimizing large codes. These switches cause other KAP switches to be set, depending on the value you substitute for <integer>, as follows:
0
-noipa, -noinline
1
-ipa, -inline
2
-ipa, -inline, -[ipa,inline]_looplevel=3, -[ipa,inline]_depth=10, -heaplimit=500, -noarclimit
3
-ipa, -inline, -[ipa,inline]_looplevel=10, -[ipa,inline]_depth=10, -heaplimit=500, -noarclimit
-i=<file>
Long name: -input=<file>
Default value: whatever filename is given on the command line.
For some mainframe operating systems, -i=<file> may be required to specify the input file name. For other operating systems, simply list the filename on the command line. When -input is specified without a file name, kapf90 reads the source code from standard input.
-int=<integer>
Long name: -integer=<integer>
Default value: -int=4
The integer switch specifies a size in bytes, N, for the default size of INTEGER variables. When N=2 or 4, take INTEGER*N as the default INTEGER type. When N=0, use the ordinary default length for INTEGER variables.
-intlog

Default value: -intlog
The intlog switch enables the mixing of integer and logical operands in expressions. When integer operands are used with logical operators, the operations are performed in a bitwise manner. When logical operands are used with arithmetic operators, the operands are treated as integers.
-lc=<name>
Long name: -library_calls=<name>
Default value: off
The library_calls switch directs kapf90 to replace sections of code with calls to standard numerical library routines which have the same functionality. This can simplify the source code, and if a version of the library which has been highly tuned for the target machine is available, the use of the standard package will improve performance of the application program. For example, if you specify this switch and you link the application with the Digital Extended Math Library (DXML), calls to the DXML Basic Linear Algebra Subroutines (BLAS) will replace sections of code. Use the following command:

       kf90 -fkapargs='-lc' -ldxml myprog.f90

The argument for library_calls identifies which library to create CALLs for. The DXML BLAS libraries are: blas1, which performs vector-vector operations such as dot product; blas2, which performs matrix-vector operations such as matrix-vector multiplication; and blas3, which performs matrix-matrix multiplication. To specify both blas1 and blas2, specify blas12. To specify both blas2 and blas3, specify blas23; this is the recommended switch. Specifying blas is equivalent to specifying blas23. This switch can be disabled within a section of code with the C*$* optimize=o directive. This switch is disabled if -roundoff=0.

CAUTION: This switch will introduce calls to BLAS routines to be linked from system libraries. Use of this switch can cause a collision between KAP generated BLAS routine names and user-provided routines in the source code. Even if the user-provided routines are identical in function to the library routines, rename or remove the user routines, since the linker will not use the optimized library routines if the user's calls to routines can be satisfied with the user-provided routines.

-lm=<integer>
Long name: -limit=<integer>
Default value: -lm=20000
To reduce compile time, kapf90 estimates how long it spends analyzing each loop nest construct. If a loop is too deeply nested, kapf90 ignores the outer loop and recursively visits the inner loops. The loop nest limit is a rough dial to control what kapf90 considers too deeply nested. For further information, refer to the KAP Fortran 90 for Digital UNIX User's Guide.
-ln=<integer>
Long name: -lines=<integer>
Default value: -ln=55
The listing generated by kapf90 is paginated for printing on a line printer. The number of lines per page on the listing may be changed using the -lines= switch. The -lines=0 switch directs kapf90 to paginate at subroutine boundaries.
-l=<file>
Long name: -list=<file>
-nl
Long name: -nolist
Default value: -nl
The list switch allows kapf90 to generate an annotated listing of the user's program. On most systems, the default name of the listing file is derived from the input file name; however, on some systems the listing file name must be explicit. If -list=file is specified, the listing is written to that file. To disable generation of the listing file, enter -nolist on the command line.
-lw=<integer>
Long name: -listingwidth=<integer>
Default value: -lw=132
The listingwidth switch sets the maximum line length for the listing file produced by kapf90. This setting affects the format of the loop summary table (-listoptions=l) and the kapf90 switches table (-listoptions=k). The fixed setting, 132, is optimal for most line printers. At present, no other values are allowed.
-kind=<integer>
Long name: -kind=<integer>
Default: -kind=4
The kind switch establishes the value for the Fortran 90 KIND type parameter used when KIND has not been specified or KIND=0 is specified. kind applies to all data types: logical, integer, real, and complex. The values for -kind are 4 or 8, with 4 being the default. The kind switch allows you to change the underlying precision of computations without violating the Fortran 90 standard constraints that default logical, default integer, and default real occupy the same amount of storage and that default double precision and default complex occupy twice the storage of default real.
-lo=<list>
Long name: -listoptions=<list>
Default: -lo=o
The listoptions switch tells kapf90 what information to include in the listing and error files. <list> can contain the following characters:
c
Calling tree at end of program listing
k
kapf90 switches used are printed at the end of each program unit
l
A loop-by-loop optimization table
n
Program unit names, as processed, written to the error file
o
An annotated listing of the original program
p
Compilation performance statistics
s
A summary of the optimizations performed
t
Annotated listing of transformed program

-log=<integer>
Long name: -logical=<integer>
Default value: -log=4
The logical switch specifies a size in bytes, N, for the default size of LOGICAL variables. When N=1, 2, or 4, take LOGICAL*N as the default LOGICAL type. When N=0, use the ordinary default length for LOGICAL variables.

-mc=<integer>
Long name: -minconcurrent=<integer>
Default value: -mc=1700

The minconcurrent switch sets the level of work in a loop above which KAP executes the loop in parallel. The range of values for this switch is all numbers greater than or equal to 0. The higher the minconcurrent value, the more iterations and/or statements the loop body must have to run concurrently.

Executing a loop in parallel incurs overhead that varies with different systems. If a loop has little work, the overhead required to set up parallel execution may make the loop execute more slowly than it would using serial execution. At compilation time, KAP estimates the amount of work inside a loop on the basis of loop computations and loop iterations: it multiplies the loop iteration count by the sum of the non-index operands/results and the non-assignment operators, then compares this estimate with the minconcurrent value. If the estimated amount of work is greater than the minconcurrent value, KAP generates parallel code for the loop. Otherwise, the loop execution is serial. If the DO loop bounds are known at compilation time, KAP computes the exact iteration count. However, if the DO loop bounds are unknown, KAP generates a block IF around the parallel code. The block IF allows a runtime decision whether or not to execute the loop in parallel. This is called a two-version loop.

To disable the generation of two-version loops throughout the program, use the command line switch minconcurrent=0. To disable this action in specific DO loops, use the minconcurrent directive.

Specifying the minconcurrent switch automatically enables the concurrentize switch.
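
The work estimate described above can be sketched in Python as follows. The function names are hypothetical, and exactly how KAP counts operands and operators in a given statement is an assumption here; only the multiplication rule, the comparison, and the default threshold of 1700 come from the text.

```python
def estimated_work(iterations, operands_and_results, operators):
    # Iteration count times (non-index operands/results + non-assignment operators).
    return iterations * (operands_and_results + operators)


def runs_in_parallel(iterations, operands_and_results, operators, minconcurrent=1700):
    # Parallel code is chosen only when the estimate exceeds the threshold.
    return estimated_work(iterations, operands_and_results, operators) > minconcurrent


# Assumed counting for a body like A(I) = B(I) + C(I):
# three operands/results (A, B, C) and one non-assignment operator (+).
print(runs_in_parallel(100, 3, 1))    # estimate of 400: stays serial
print(runs_in_parallel(1000, 3, 1))   # estimate of 4000: runs in parallel
```

When the iteration count is not known at compilation time, the same comparison would have to be made at run time, which is the role of the block IF in a two-version loop.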

-ma=<list>
Long name: -machine=<list>
Default value: -ma=s

<list> is one of the following three characters:

n
Prefer optimization of non-stride-1 loops.
o
Do not parallelize innermost loops when optimizing. Parallelize only outermost loops.
s
Prefer optimization of stride-1 inner loops.

-miifg=<integer>
Long name: -max_invariant_if_growth=<integer>
Default value: -miifg=500
When a loop contains an IF statement whose condition does not change from one iteration to another (a loop-invariant IF), the same test must be repeated for every iteration. The code can often be made more efficient by floating the IF outside the loop and putting the THEN and ELSE sections into their own loops.

This gets more complicated when there is other code in the loop, since a copy of it must be included in both the THEN and ELSE loops. The max_invariant_if_growth switch allows the user to limit the total number of additional lines of code generated in each program unit through invariant IF restructuring.

This can be controlled on a loop-by-loop basis with the !*$*MAX_INVARIANT_IF_GROWTH (<integer>) directive. The maximum amount of additional code generated in a single loop through invariant IF floating can be limited with the each_invariant_if_growth switch.
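
The effect of invariant IF floating can be sketched by hand as follows. The program is an illustrative assumption, not KAP output; the names flag, a, b, and c are hypothetical. It shows the test being hoisted out of the loop and the other code in the loop being duplicated in both resulting loops, which is the code growth the switch limits.

```fortran
program invariant_if_sketch
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), c(n)
  logical :: flag
  integer :: i

  b = 1.0
  flag = .true.

  ! Before: the test on flag never changes, yet it runs on every iteration.
  do i = 1, n
     c(i) = 2.0 * b(i)            ! other code in the loop
     if (flag) then
        a(i) = c(i) + 1.0
     else
        a(i) = c(i) - 1.0
     end if
  end do
  print *, 'before floating:', a

  ! After: the test runs once; the other code is copied into both loops.
  if (flag) then
     do i = 1, n
        c(i) = 2.0 * b(i)
        a(i) = c(i) + 1.0
     end do
  else
     do i = 1, n
        c(i) = 2.0 * b(i)
        a(i) = c(i) - 1.0
     end do
  end if
  print *, 'after floating: ', a
end program invariant_if_sketch
```

Both versions compute the same values of a; the second trades extra source lines for one test instead of n.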

-namepart=<integer>,<integer>
Long name: -namepartitioning=<integer>,<integer>
Default value: -nonamepart
This switch tells KAP to look at distinct array names and limit the number of arrays that appear in a loop to avoid cache thrashing. That is, this switch breaks a loop containing, for example, references to arrays A and B into two loops. One loop references array A and the other loop references array B.

The two arguments, i and j, of a -namepartitioning=i,j switch control name partitioning as follows:
i --- specifies the minimum number of partitions. This is the preferred smallest number of distinct arrays in each distributed loop.
j --- specifies the maximum number of partitions. This is the preferred largest number of distinct arrays in each distributed loop.
If no arguments appear with the -namepartitioning switch, KAP uses its default values of 2 for the minimum and 8 for the maximum number of partitions.

Before KAP can perform name partitioning, you must specify the switch -scalaropt=n where n is greater than or equal to 3.

The -nonamepartitioning switch explicitly prevents name partitioning.
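
The A and B example above can be written out as a hand-made sketch. It assumes the simplest case, in which the two statements are independent of each other, and it is not KAP output; with only two arrays it shows the principle rather than a realistic cache-thrashing case.

```fortran
program name_partition_sketch
  implicit none
  integer, parameter :: n = 1000
  real :: a(n), b(n)
  integer :: i

  ! Original loop: references two distinct arrays, A and B.
  do i = 1, n
     a(i) = real(i)
     b(i) = 2.0 * real(i)
  end do
  print *, 'single loop:     ', a(n), b(n)

  ! After partitioning by name: one loop per array.
  do i = 1, n
     a(i) = real(i)
  end do
  do i = 1, n
     b(i) = 2.0 * real(i)
  end do
  print *, 'distributed loops:', a(n), b(n)
end program name_partition_sketch
```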

-nat[=<list> ]
Long name: -natural[=<list> ]
Default value: -nat
-nnat
Long name: -nonatural
The natural switch selects between natural alignment, in which, for example, REAL*8 entities always start on double-word boundaries, and non-alignment of data elements in COMMON blocks.

Natural alignment specifies that variables and arrays in COMMON blocks will start on boundaries which correspond to their size. Items which take up two words, such as COMPLEX arrays, will start on double-word boundaries; single-word items, such as REAL variables, will start on word boundaries; half-word items, such as INTEGER*2 variables, will start on half-word boundaries. The natural alignment can improve program speed by making memory access simpler.

This optimization is safe when:

· All COMMON blocks will be visible to kapf90 in the course of processing the source file.
· If the same COMMON block has two different layouts, no data is passed between the different layouts; they are fully independent.

The default, nonatural, causes variables and arrays to be packed tightly into COMMON blocks. This can reduce memory usage but slow the program.
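
The difference between the two layouts can be made concrete with a small offset calculation. The sketch below is an assumption-laden model, not a description of what the compiler emits: it takes a hypothetical COMMON block holding an INTEGER*2 item, a REAL*8 item, and a 4-byte REAL item in that order, and it assumes that natural alignment means rounding each offset up to a multiple of the item's size, as the description above suggests.

```fortran
program common_layout_sketch
  implicit none
  integer, parameter :: nitems = 3
  ! Hypothetical member sizes in bytes: INTEGER*2, REAL*8, REAL (4 bytes).
  integer :: sizes(nitems) = (/ 2, 8, 4 /)
  integer :: packed, natural, k

  packed = 0
  natural = 0
  do k = 1, nitems
     ! Natural alignment (assumed rule): round the offset up to a
     ! multiple of the item size before placing the item.
     natural = ((natural + sizes(k) - 1) / sizes(k)) * sizes(k)
     print *, 'item', k, ' packed offset', packed, ' natural offset', natural
     packed = packed + sizes(k)
     natural = natural + sizes(k)
  end do
  print *, 'total bytes: packed', packed, ' natural', natural
end program common_layout_sketch
```

Under these assumptions the REAL*8 item sits at byte offset 2 in the packed layout and at offset 8 in the natural layout, and the block grows from 14 to 20 bytes, which is the memory-for-speed trade described above.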

- [n ]1
Long name: - [no ]onetrip
Default value: -n1
The onetrip switch allows the user to specify one-trip DO loops. Many pre-Fortran-77 compilers implemented DO loops whose bodies were always executed at least once, even if the loop index initial value was higher than the final value. This switch informs kapf90 that the DO loops in the file being processed assume this feature.
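
The difference between the two loop semantics can be seen in the following sketch. The first loop uses the standard behavior of Fortran 77 and later, where a loop whose initial value exceeds its final value executes zero times. The second loop is a hand-written emulation of one-trip behavior, included only for comparison; it is not how a one-trip compiler is implemented.

```fortran
program onetrip_sketch
  implicit none
  integer :: i, count

  ! Standard semantics: initial value 5 exceeds final value 1, so the
  ! body is skipped entirely.
  count = 0
  do i = 5, 1
     count = count + 1
  end do
  print *, 'standard semantics, iterations =', count

  ! Emulated one-trip semantics: the body runs first, the test follows.
  count = 0
  i = 5
  do
     count = count + 1
     i = i + 1
     if (i > 1) exit
  end do
  print *, 'one-trip semantics, iterations =', count
end program onetrip_sketch
```

The first count is 0 and the second is 1. A program written for a one-trip compiler may depend on that single execution.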

-o=<integer>
Long name: -optimize=<integer>
Default value: -o=5
The optimize switch sets the optimization level, ranging from 0 to 5. The meanings of levels are as follows:
0
No optimization performed
1
Only simple analysis and optimization performed
Induction variables recognized
DO loop interchanging techniques applied
2
Lifetime analysis performed
More powerful data dependence tests performed
3
More loop interchanging performed
Special case data dependence tests performed
Wraparound variables recognized
4
Loop interchanging around reductions
More exact data dependence tests performed
5
Array expansion enabled

The enter gate, exit gate, and independent directives will be generated.

-pio
Long name: -parallelio
Default value: -nopio

The parallelio switch allows parallel execution of loops with I/O. Use this switch when you know the I/O will not execute. An example is a test for an error condition that causes a message to be printed.

-rl=<integer>
Long name: -real=<integer>
Default value: -rl=4
The real switch tells KAP the Fortran 90 compiler's default size, in bytes, for REAL variables. The value N, as in REAL*N, can be 4 or 8. To change the default size of REAL variables, for example from 4 to 8, first set the Fortran 90 compiler switch -r=8. Next, tell KAP the new size with the -real=8 switch.

-r=<integer>
Long name: -roundoff=<integer>
Default value: -r=3
The roundoff switch allows the user to specify how much change from the serial roundoff error is tolerable. If an arithmetic reduction is accumulated in a different order than in the scalar program, the roundoff error is accumulated differently and the final result may differ from the original program's output. While the difference is usually insignificant, certain restructuring transformations performed by kapf90 must be disabled in order to obtain exactly the same results as the scalar program. These transformations, referenced below, are discussed in Chapter 7.

kapf90 classifies its transformations by the amount of difference in roundoff error that can accumulate so the user can decide what level of roundoff error differences is allowable. The roundoff command line switch has the values 0 to 3.

The meaning of each roundoff level is as follows. The levels are cumulative: each level performs what is listed for it in addition to what is listed for the previous levels.

0
No roundoff-changing transformations
1
Expression simplification and code floating enabled
Arithmetic reductions recognized
Loop interchanging around arithmetic reductions allowed if optimize >= 4
Loop rerolling if scalaropt >= 1
2
Reciprocal substitution performed to move an expensive division outside of loop
3
Recognize real induction variables if scalaropt >= 2 or optimize >= 1
Memory management enabled if -scalaropt = 3
Expressions such as A / B / C can be rotated to A / (B * C)
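
Why accumulating a reduction in a different order can change the result is easiest to see with a small sum. The sketch below is illustrative only: the data values are chosen by hand to exaggerate the effect, and splitting the sum into two partial sums is an assumed stand-in for the kind of reordering a restructured reduction might perform.

```fortran
program roundoff_sketch
  implicit none
  real :: x(4), serial_sum, odd_sum, even_sum
  integer :: i

  x = (/ 1.0e8, 1.0, -1.0e8, 1.0 /)

  ! Serial order: x(1) + x(2) + x(3) + x(4), left to right.
  serial_sum = 0.0
  do i = 1, 4
     serial_sum = serial_sum + x(i)
  end do

  ! Reordered: two partial sums that are combined at the end.
  odd_sum = 0.0
  even_sum = 0.0
  do i = 1, 3, 2
     odd_sum = odd_sum + x(i)
     even_sum = even_sum + x(i+1)
  end do

  print *, 'serial order   :', serial_sum
  print *, 'reordered order:', odd_sum + even_sum
end program roundoff_sketch
```

In single-precision arithmetic, adding 1.0 to 1.0e8 is typically lost to rounding, so the serial sum loses one of the two small terms while the reordered sum keeps both. Setting -roundoff=0 tells KAP not to make changes of this kind.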

-rt=<routine_name>[,<routine_name>...]
Long name: -routine=<routine_name>[,<routine_name>...]
Default value: -noroutine
The routine switch allows you to specify switches that apply only to specific routines within the source file KAP processes. The only switches that routine can specify are:
-each_invariant_if_growth
-max_invariant_if_growth
-optimize
-roundoff
-scalaropt
-skip
-unroll
-unroll2
-unroll3
Place the routine switch after the name of the DEC Fortran source file. <routine_name> must be a routine in the source file.

-sv=<list>
Long name: -save=<list>
Default value: -sv=manual_adjust
The save switch instructs kapf90 whether or not to perform live variable analysis to determine if the value of a local scalar variable in a subroutine or function needs to be saved between invocations of the routine being processed. SAVE statements will be generated for any variables requiring them. kapf90 will not delete or ignore a SAVE statement coded by the user.

Saving local variables may be required for correct execution of the program but can restrict kapf90 optimizations.

With -save=manual , kapf90 assumes that the user has inserted the necessary SAVE statements into the code and performs no corresponding analysis of its own. The user-written SAVE statements are assumed to be correct and sufficient.

Specifying -save=all tells kapf90 that all routine-local variables and COMMON blocks are retained between invocations. This is as if all variables and COMMON blocks were in SAVE statements.

-so=<integer>
Long name: -scalaropt=<integer>
Default value: -so=3
The scalaropt switch sets the level of serial transformations performed. Unlike the scalaropt switch, the !*$*SCALAR OPTIMIZE directive sets the level of loop-based optimizations only, such as loop fusion, and not straight-line code optimizations, such as dead code elimination.

The levels and their optimizations are:

0
No scalar optimizations performed
1
IF loops changed into DO loops
Simple code floating out of loops performed
Inaccessible or unused code removed
Forward substitution of variables performed
Dusty deck IF transformations enabled
2
Full range of scalar optimizations enabled
Invariant IFs floated out of loops
Induction variable recognition
Loop rerolling if roundoff >= 1
Loop unrolling, loop peeling, loop fusion
3
Memory management performed if -roundoff = 3
Additional dead code elimination performed during output conversion
-scan=<integer>
Long name: -scan=<integer>
Default value: -scan=72
The scan switch allows the user to set the length of the Fortran input lines. kapf90 treats characters in columns beyond the value of the scan switch as comments and ignores them. The value must be 72, 120, or 132.

-sasc=<integer>
Long name: -setassociativity=<integer>
Default value: -sasc=1
The setassociativity switch provides information on the mapping of physical addresses in main memory to cache pages. The default, 1 , says that a datum in main memory can be placed in only one place in cache. If this cache page is in use, it will have to be rewritten or flushed in order to copy the newly accessed page into cache.
-skip

Default value: -noskip
The skip switch tells kapf90 to ignore the specified routines. For example, the command:

kapf90 program.f90 -skip=temp_sub_1 -skip=temp_sub_2

tells KAP to process all the program units in DEC Fortran source file program.f90 except for temp_sub_1 and temp_sub_2.

-srlcd

Default value: -nosrlcd
The srlcd switch tells kapf90 to remove loop-carried dependencies. KAP holds in temporary scalars the array values that are read or written across multiple loop iterations. Faster temporary/register accesses replace slower memory accesses in the loop body.
Srlcd stands for Scalar Replacement of Loop Carried Dependencies.
Before KAP can remove loop-carried dependencies, you must specify the switch -scalaropt=n where n is greater than or equal to 2.
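
Scalar replacement of a loop-carried value can be sketched by hand as follows. The example is an assumption about the general shape of the transformation, not KAP output; the temporary t and the second array a2 are hypothetical names used so that both versions can be compared in one program.

```fortran
program srlcd_sketch
  implicit none
  integer, parameter :: n = 6
  real :: a(n), a2(n), b(n), t
  integer :: i

  b = (/ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 /)
  a(1) = 10.0
  a2(1) = 10.0

  ! Original: each iteration reloads a(i-1), which the previous
  ! iteration has just stored.
  do i = 2, n
     a(i) = a(i-1) + b(i)
  end do

  ! After scalar replacement: the carried value stays in the scalar t,
  ! so the loop body no longer reads the array element back from memory.
  t = a2(1)
  do i = 2, n
     t = t + b(i)
     a2(i) = t
  end do

  print *, 'original loop:', a
  print *, 'with scalar t:', a2
end program srlcd_sketch
```

Both loops produce the same array values; the second performs one fewer array read per iteration.
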
-su=<list>
Long name: -suppress=<list>
Default value: no suppression
kapf90 produces several types of messages that range from syntax warning and error messages to messages about the optimizations performed. Use the switches below to disable the following classes of messages:
d
Data dependence messages
e
Syntax error messages
i
Informational messages
n
Not optimized messages
q
Questions
s
Standardized messages
w
Syntax warning messages

-sy=<list>
Long name: -syntax=<list>
Default value: accepts all dialects listed below
The syntax switch directs kapf90 as to whether to check for compliance with certain syntactic rules. The default is to accept the superset of the ANSI Fortran 77 standard defined by DEC Fortran, which includes many common Fortran 77 extensions.

The syntax settings are as follows:

a
Checks for strict compliance with ANSI standard. Warning and error messages are issued for syntax which does not conform to the standard.
v
Accepts the extensions and interpretations of DEC Fortran
f90
Checks for strict compliance with the ANSI Fortran 90 standard. With -syntax=f90, failures occur when you mix logical and integer variables. See the manpage for -intlog.

-tune=<architecture>
Long name: -tune=<architecture>
Default value: -tune=host
The KAP preprocessor determines whether the host architecture is ev4 or ev5 and then optimizes your program for that architecture by default. If you compile a program on one architecture but plan to run it on another, you should override the default by setting -tune equal to the architecture where the program will run. For example, if you compile a program on ev4 architecture, but plan to run it on ev5, use -tune=ev5.

- [n ]ty
Long name: - [no ]type
Default value: -nty
The type switch instructs kapf90 to issue warning messages for variables not explicitly typed. This is as if there were an IMPLICIT NONE at the top of each program unit. The notype default suppresses this checking.

-ur=<integer>
Long name: -unroll
Default value: -ur=4
The unroll, unroll2, and unroll3 switches control inner-loop unrolling. -scalaropt=2 must be in effect to engage the unroll switch.

The syntax for unroll is as follows:

Long form:    -unroll=<#it>

Short form:    -ur=<#it>

where    <#it>    is the maximum number of iterations to unroll
   =0    use default values to unroll
   =1    no unrolling

The default, 4, means at most 4 iterations will be unrolled.
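
What unrolling by 4 looks like can be sketched by hand as follows. The program is illustrative and is not KAP output; the cleanup loop for the leftover iterations and the helper variable m are assumptions about one common way to write an unrolled loop.

```fortran
program unroll_sketch
  implicit none
  integer, parameter :: n = 10
  real :: a(n), b(n)
  integer :: i, m

  b = 1.0

  ! Original loop: one element per iteration.
  do i = 1, n
     a(i) = 2.0 * b(i)
  end do
  print *, 'rolled:  ', a

  ! Unrolled by 4: four elements per iteration over the largest
  ! multiple of 4 that fits, then a cleanup loop for the remainder.
  m = n - mod(n, 4)
  do i = 1, m, 4
     a(i)   = 2.0 * b(i)
     a(i+1) = 2.0 * b(i+1)
     a(i+2) = 2.0 * b(i+2)
     a(i+3) = 2.0 * b(i+3)
  end do
  do i = m + 1, n
     a(i) = 2.0 * b(i)
  end do
  print *, 'unrolled:', a
end program unroll_sketch
```

With n = 10 the unrolled loop handles elements 1 through 8 in two iterations and the cleanup loop handles elements 9 and 10.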

-ur2=<integer>
Long name: -unroll2
Default value: -ur2=160
-scalaropt=2 must be in effect to engage the unroll2 switch.
The syntax for unroll2 is as follows:

Long form:    -unroll2=<weight>

Short form:    -ur2=<weight>

where    <weight> is the maximum weight, estimate of work, in
       an unrolled loop. Work is estimated by counting
       operands and operators in a loop.

The default, 160, means a maximum work of 160 in an unrolled iteration.

-ur3=<integer>
Long name: -unroll3
Default value: -ur3=1
-scalaropt=2 must be in effect to engage the unroll3 switch. The -unroll3=n switch sets the lower limit for unrolling. If there are fewer than n units of work in the loop, the loop will not be unrolled. The amount of work in each loop iteration is shown in the loop table in the annotated listing. The switch should be left at 1, the default.

-useh
Default value:  off
KAP automatically sets the -useh switch correctly for you. Digital recommends that you do not set the -useh switch.

KAP needs two passes to resolve Fortran 90 forward declarations. The second pass, the -useh pass, resolves any forward references.

Directives

KAP supports the following directives:

!*$* arclimit (0-500)
!*$* [no]concurrentize
!*$* each_invariant_if_growth (0-100)
!*$* [no]inline [here | routine | global][(name[,name...])]
!*$* [no]ipa [here | routine | global][(name[,name...])]
!*$* limit (>0)
!*$* max_invariant_if_growth (0-1000)
!*$* minconcurrent
!*$* optimize (0-5)
!*$* roundoff (0-3)
!*$* scalar optimize (0-3)
!*$* unroll(<#it>[,<weight>])
See the KAP Fortran 90 for Digital UNIX(TM) User's Guide for more details.

Assertions

KAP supports the following assertions:

!*$* assert [no]argument aliasing
!*$* assert [no]bounds violations
!*$* assert concurrent call
!*$* assert do (concurrent)
!*$* assert do (concurrent call)
!*$* assert do (serial)
!*$* assert do prefer (concurrent)
!*$* assert do prefer (serial)
!*$* assert [no]equivalence hazard
!*$* assert [no]last value needed
!*$* assert permutation
!*$* assert no recurrence
!*$* assert relation (<name> .XX. <variable/constant>)
!*$* assert no sync
!*$* assert [no] temporaries for constant arguments

SEE ALSO

f90(1)
cpp(1)
cc(1)
KAP Fortran 90 User's Guide
kapf90 man page

