This page is meant to be a collection of best practices for scientific
computing in Fortran.
This page is under construction.
1. Philosophy
2. Dialect
Under no circumstances should one begin to write a new code in FORTRAN
77. Some people seem to believe that it is not possible, or that they
are not able, to write a code in modern Fortran (90/95, 2003) which is
as numerically expedient as the same code in FORTRAN 77. (By
numerically expedient I mean execution time, cache hits/misses, etc.)
While it is possible to write a slower code in modern Fortran by using
inappropriate algorithms and data structures, or by giving variables
and procedures certain attributes, in general (and in my personal
experience) there is no reason why a code written in modern Fortran
should be any slower than one written in the 77 dialect. Indeed, the
modern dialects include language features which can enhance
performance, along with a wide variety of features which allow the
programmer to drastically improve the style, modularity, readability,
and functionality of the code. I am very slowly working on this page
and the rest of my website, so please bear with me. Below is a brief
outline taken from an ARSC HPC Users' newsletter (with some markup I
have added).
2.1. Modularity Tip
The following was extracted from the ARSC HPC Users' Newsletter, Number 399, 01/16/2009, by Lee Higbie.
Many programs are still written in FORTRAN 77 with constructs that are
relatively difficult to maintain compared to those available in FORTRAN
90/95. This article encourages you to take advantage of some of the
newer constructs to improve the software engineering of your Fortran. I
will describe modules and how they lead to more understandable and safer
code than common blocks.
*Some Terminology*
The "heap" is a section of memory that is managed separately from other
system memory and, for most HPC machines and all ARSC machines, is the
memory with the largest addresses. Statically allocated data and
instructions are limited to the first 2 GB of address space by default
for the AMD x86_64 architecture. This is known as the small mcmodel,
the McModel, I guess. Currently Pingo and Ognip only support this
small memory model, which is insufficient for very large programs. One
way for Fortran codes to take advantage of the great wasteland of memory
with large addresses is to dynamically allocate memory. All allocated
data are on the heap, starting at the largest available address and
working down toward 2 GB.
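As a minimal sketch of dynamic allocation, the program below requests an array at run time rather than declaring it statically. The program name heap_demo, the array name temperature, and the grid sizes are assumptions made up for illustration. Where the storage actually lands is up to the compiler and operating system, although allocatable arrays are typically placed on the heap.

```fortran
program heap_demo
  implicit none
  ! Hypothetical grid dimensions, chosen only for illustration.
  integer, parameter :: lonDm = 360, latDm = 180, hgtDm = 50
  real, allocatable :: temperature(:,:,:)
  integer :: ierr

  ! Request the storage at run time instead of declaring a static array.
  allocate(temperature(lonDm, latDm, hgtDm), stat=ierr)
  if (ierr /= 0) error stop "allocation of temperature failed"

  temperature = 0.0
  print '(a,i0)', "elements allocated: ", size(temperature)

  ! Release the storage once it is no longer needed.
  deallocate(temperature)
end program heap_demo
```

Checking stat= lets the program report a failed allocation itself instead of aborting inside the run-time library.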
I use the non-standard term "parameters" for the items in a CALL
statement or function reference and "arguments" for those in the
SUBROUTINE or FUNCTION statement itself. (The standard calls these
actual arguments and dummy arguments, respectively.) You give
parameters (to your children) and get arguments (from your parent).
*Out with the Old*
So, you have a program, maybe old, maybe new. Maybe it looks nice, with
nice structure, no GOTO statements, and free format text, and maybe it
uses features that were not in the standard until Fortran 90/95, but
it may still not be a Fortran 90/95 program. In my terminology, a
program is not FORTRAN 90/95 if it or any of its subroutines use any of
the following:
- fixed form or column-oriented source code
- COMMON blocks
- DATA statements
- multiple declaration statements applying to a single variable
- statement functions
This list is hardly comprehensive. It includes some common Fortran 77
structures that I think should be banned.
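To make a few of these items concrete, here is a small free-form sketch of how a DATA statement, a split declaration, and a statement function might be replaced. It is only an illustration: the module name conversions and the names freezing_f and cels_to_fahr are hypothetical and do not come from the newsletter.

```fortran
module conversions
  implicit none
  private
  public :: freezing_f, cels_to_fahr

  ! One declaration carries type, attribute, and value together,
  ! instead of a REAL statement followed by a separate DATA statement.
  real, parameter :: freezing_f = 32.0

contains

  ! A module function takes the place of a statement function such as
  !   c2f(c) = 1.8 * c + 32.0
  elemental function cels_to_fahr(cels) result(fahr)
    real, intent(in) :: cels
    real :: fahr
    fahr = 1.8 * cels + freezing_f
  end function cels_to_fahr

end module conversions

program conversions_demo
  use conversions, only: cels_to_fahr
  implicit none
  print '(f0.1)', cels_to_fahr(100.0)
end program conversions_demo
```

Because the function is ELEMENTAL, the same call also accepts a whole array of temperatures.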
First a description of modules for Fortran 77 users. Conceptually the
data part of modules is like common that is referenced by name instead
of location. Referencing by name means that the name is the same in all
parts of the program and there are no aliases for the data it
references.
In addition to providing program-wide names that refer to specific
arrays, for example, modules allow you to
- Describe all the characteristics of the data in one place (type,
  rank, volatility, …).
- Select only those data items you want (without a dummy
  variable to skip over them, as you need in common).
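Both points can be sketched in a few lines. In the example below the data are described once in a module, and a routine imports only the one item it needs with USE ... ONLY. The names grid_data, physics, warm_surface, surfacePressure, and surfaceTemperature, and the tiny grid size, are assumptions for illustration.

```fortran
module grid_data
  implicit none
  ! Type, rank, and attributes are stated once, here.
  integer, parameter :: lonDm = 4, latDm = 3
  real, save :: surfacePressure(lonDm, latDm) = 1013.25
  real, save :: surfaceTemperature(lonDm, latDm) = 0.0
end module grid_data

module physics
  implicit none
contains
  subroutine warm_surface(delta)
    ! Import only the item needed; surfacePressure is not visible here.
    use grid_data, only: surfaceTemperature
    real, intent(in) :: delta
    surfaceTemperature = surfaceTemperature + delta
  end subroutine warm_surface
end module physics

program grid_demo
  use grid_data, only: surfaceTemperature
  use physics, only: warm_surface
  implicit none
  call warm_surface(1.5)
  print '(f0.1)', surfaceTemperature(1, 1)
end program grid_demo
```

With a common block, warm_surface would have had to declare a placeholder for surfacePressure just to reach the array stored after it.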
The Fortran standards groups have tried to make modules more like OO
objects with functions and subroutines allowed in the module. One
important consequence of putting subprograms in modules is that
compilers verify that parameter lists match argument lists, so some of
the most common interface problems will be caught by the compiler.
Compilers have to check parameter type and shape because of subroutine
and function name overloading. (Several similar subprograms can have
the same name but have different type or shape of arguments.)
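The overloading mentioned above might look like the following sketch, in which one generic name covers two specific functions that differ in the rank of their argument. The names temperature_tools, mean_temperature, mean_1d, and mean_2d are hypothetical.

```fortran
module temperature_tools
  implicit none
  private
  public :: mean_temperature

  ! One generic name; the compiler selects the specific procedure
  ! from the type and rank of the item passed in the call.
  interface mean_temperature
    module procedure mean_1d, mean_2d
  end interface mean_temperature

contains

  function mean_1d(t) result(m)
    real, intent(in) :: t(:)
    real :: m
    m = sum(t) / size(t)
  end function mean_1d

  function mean_2d(t) result(m)
    real, intent(in) :: t(:,:)
    real :: m
    m = sum(t) / size(t)
  end function mean_2d

end module temperature_tools

program overload_demo
  use temperature_tools, only: mean_temperature
  implicit none
  real :: column(4), slab(2,2)
  column = [1.0, 2.0, 3.0, 4.0]
  slab = 2.0
  print '(f0.1)', mean_temperature(column)   ! resolves to mean_1d
  print '(f0.1)', mean_temperature(slab)     ! resolves to mean_2d
  ! A call such as mean_temperature(3) is rejected at compile time,
  ! because no specific procedure accepts an integer scalar.
end program overload_demo
```

The same interface information is what lets the compiler reject a call whose type or rank does not match, which an external FORTRAN 77 routine would accept silently.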
Suppose, probably for reasons of saving memory space, a legacy program
uses the same memory locations for two arrays. In Fortran 77, it might
have:
      REAL celTmp(lonDm, latDm, hgtDm), fahTmp(lonDm, latDm, hgtDm)
      COMMON /tempbl/ celTmp, fahTmp
! uses 4 * (lonDm * latDm * hgtDm + lonDm * latDm * hgtDm) bytes, the
! same as before
The problem I see with this approach is that the compiler has no way to
verify much of anything about any of the temperature variables. If a new
programmer sees one of these variables and tries to use it at a time
when the data in it are the other type, it will most assuredly lead to
interesting results. Also, bizarre outcomes are likely if the relative
memory allocation of the two data types varies, as can happen when
porting to a new machine. If the declarations are not inserted by an
include statement, then all sorts of other mistakes that the compiler
cannot detect are likely.
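A module-based alternative could look something like the sketch below. It is an illustration under stated assumptions, not the newsletter's own code: the module name temperature_data, the routine convert_to_fahrenheit, the name fahrTemperature, and the tiny 2x2x2 grid are made up here, and only celsTemperature is a name used in the text.

```fortran
module temperature_data
  implicit none
  ! Each array is allocated only while its data are valid, so outside
  ! of a conversion only one of them occupies memory.
  real, allocatable, save :: celsTemperature(:,:,:)
  real, allocatable, save :: fahrTemperature(:,:,:)
contains
  ! Replace the Celsius data with the Fahrenheit equivalent.
  subroutine convert_to_fahrenheit()
    if (.not. allocated(celsTemperature)) &
      error stop "celsTemperature is not allocated"
    if (allocated(fahrTemperature)) deallocate(fahrTemperature)
    allocate(fahrTemperature(size(celsTemperature, 1), &
                             size(celsTemperature, 2), &
                             size(celsTemperature, 3)))
    fahrTemperature = 1.8 * celsTemperature + 32.0
    ! From here on the Celsius data no longer exist.
    deallocate(celsTemperature)
  end subroutine convert_to_fahrenheit
end module temperature_data

program temperature_demo
  use temperature_data
  implicit none
  allocate(celsTemperature(2, 2, 2))
  celsTemperature = 100.0
  call convert_to_fahrenheit()
  print '(f0.1)', fahrTemperature(1, 1, 1)
  print '(l1)', allocated(celsTemperature)
end program temperature_demo
```

The explicit ALLOCATED test is a portable way to catch misuse. Touching a deallocated array without such a test is an error that many compilers only report when run-time checking is enabled.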
Notice here that the longer names are self-descriptive. There is much
less chance of a new programming team member thinking celsTemperature is
a cell-temporary variable (as with the F77 celTmp), improving safety.
With this structure, an attempt to use data when it is not allocated
will generate a trap.
Summarizing, if your programs include common, you should start a project
to rewrite them to use modules instead. Fortran modules provide the
data sharing visibility of common blocks, but in a manner that provides
for much more maintainable and safer code, and they provide other
features that improve code clarity.
3. Style
4. Performance
5. Pitfalls
5.1. Initializing Local variables
Do not initialize local variables in their declarations in procedures!
In Fortran this implicitly gives the variable the SAVE attribute: the
initialization is performed only once, and the variable retains its
value upon exiting the procedure and still holds that old value upon
re-entering it. If this behaviour is desired, the variable should be
explicitly given the SAVE attribute to prevent confusion. In Example 1
the variable int has (probably inadvertently) acquired the SAVE
attribute. Example 2 accomplishes the same thing as Example 1, but it
is much clearer and stylistically preferred. Example 3 is probably
what the author of Example 1 was trying to accomplish. Out of all the
examples, this is the only procedure which will work as intended when
called more than once. In Examples 1 and 2, int will retain a value of
SIZE(in,1) after the first call, so on every later call the DO loop
will exit during its first iteration.
Example 1

    SUBROUTINE example(in, out)
      REAL, DIMENSION(:,:), INTENT(in)  :: in
      REAL, DIMENSION(:),   INTENT(out) :: out
      INTEGER :: int = 0   ! DON'T DO THIS!!!!!!!!!!!!!
      DO
        int = int + 1
        IF (int >= SIZE(in,1)) THEN
          EXIT
        END IF
        ! ...
      END DO
    END SUBROUTINE example
Example 2

    SUBROUTINE example(in, out)
      REAL, DIMENSION(:,:), INTENT(in)  :: in
      REAL, DIMENSION(:),   INTENT(out) :: out
      INTEGER, SAVE :: int = 0   ! This is equivalent to 1 but stylistically
                                 ! preferred.
      DO
        int = int + 1
        IF (int >= SIZE(in,1)) THEN
          EXIT
        END IF
        ! ...
      END DO
    END SUBROUTINE example
Example 3

    SUBROUTINE example(in, out)
      REAL, DIMENSION(:,:), INTENT(in)  :: in
      REAL, DIMENSION(:),   INTENT(out) :: out
      INTEGER :: int   ! This is probably what the author in 1 meant to
                       ! accomplish.
      int = 0
      DO
        int = int + 1
        IF (int >= SIZE(in,1)) THEN
          EXIT
        END IF
        ! ...
      END DO
    END SUBROUTINE example
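The effect is easy to observe with a stripped-down sketch. The two functions below differ only in where the local variable gets its starting value. The names counters, count_saved, and count_fresh are hypothetical and are not part of the examples above.

```fortran
module counters
  implicit none
contains
  ! Pitfall: the initializer is applied once, and calls is implicitly SAVEd.
  function count_saved() result(n)
    integer :: n
    integer :: calls = 0
    calls = calls + 1
    n = calls
  end function count_saved

  ! Intended behaviour: calls is reset by an assignment on every entry.
  function count_fresh() result(n)
    integer :: n
    integer :: calls
    calls = 0
    calls = calls + 1
    n = calls
  end function count_fresh
end module counters

program save_demo
  use counters, only: count_saved, count_fresh
  implicit none
  integer :: i, saved, fresh
  do i = 1, 3
    saved = count_saved()
    fresh = count_fresh()
    print '(a,i0,a,i0)', "saved: ", saved, "  fresh: ", fresh
  end do
end program save_demo
```

Over the three calls the first function returns 1, 2, 3, while the second returns 1 every time.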