[esp-r] Re: ESPr validation with gfortran/gcc4

Fri Feb 27 17:37:47 GMT 2009

>Has anyone recently performed validation on more recent toolchains 
>under a Linux platform (gcc4.3/gfortran)?
>
>I tend to keep my system updated, and am wondering if a separate 
>version of gcc3/g77 is still necessary to avoid truncation/numerical 
>discrepancies between compilers.

Hi Scott,

We've undertaken validation with gcc/gfortran 4.1 and 4.2, as well as
the Intel compiler. Results are mixed: 4.1.2 seems the most stable (and
is now recommended over gcc 3.4.X / g77), while 4.2+ produces numerical
exceptions. The next release (11.7) will likely default the Install
script to use gFortran. 

A complete summary of the pre-release testing (including
g77/gfortran/ifort/sun-f90 comparisons) completed for 11.6 follows.

- Alex 

Alex Ferguson
Sustainable Buildings and Communities (Housing Group)
CanmetENERGY
Natural Resources Canada
P:+1 613 995 3294
F:+1 613 996 9909
Alex.Ferguson at NRCan-RNCan.gc.ca

====================== 
SUMMARY: 
====================== 

 - ESP-r compiles correctly on Linux, Cygwin, MinGW, Solaris and OSX. 
 - Portability testing suggests that GCC 4.1.2 builds are actually more 
   reliable than GCC3 builds. 
 - GCC 4.2.3 builds are not as robust. 
 - Some uncertainty remains in Intel builds, as well as builds on OSX. 

Recommendations: 
 - Proceed with release 11.6. 
 - The ESP-r 'reference' platform should be changed from 
   Linux / GCC 3.4.6 to Linux / GCC 4.1.2 
 - Cygwin / GCC 3.4.6  and SUN CC builds should continue to be 
   designated as stable 
 - GCC 4.2+ and Intel builds should be used with caution. 
 - Compiler optimization should be used with caution. 
 - OSX / GCC 3.4.6 and MinGW / GCC 3.4.6 builds should be used with 
   caution. 

Detailed results follow. 

====================== 
Static Analysis: 
====================== 

The Forcheck static analyzer warns of no additional coding errors in the
current version of pre-release-patches. The total number of errors,
warnings and informational messages has declined since the last release.

====================== 
Regression testing: 
====================== 

Changes since the last revision have introduced significant numerical
differences into the results. But for the first time, all the numerical
differences were anticipated by developers, and no unintended effects
were observed during testing.

Two revisions introduced numerical differences into the results: 

 r2838: This commit patched a long-standing bug in the water tank 
        storage model. Predicted water temperatures in the plant 
        network may differ by as much as 3 °C; predicted casual gains 
        in the space may differ by as much as 3 W. 

 r3088: This commit corrected an error in the application of relaxation 
        to the plant network solution scheme. Predicted heat transfer 
        in the plant may differ by as much as 400 W (1%); predicted 
        temperatures in the plant may differ by as much as 18 °C in 
        extreme cases. 

====================== 
Portability testing: 
====================== 
The release candidate installs successfully on the 32-bit platform,
compiler and graphics library combinations tabulated below. We've also
added support for the Intel compiler suite on Linux. Jon Hand has been
experimenting with support for 64-bit architecture, but this work has
not yet been generalized for inclusion in the public release. 

          Support for ESP-r on various compiler, platform 
                and graphics library combinations. 
    ============================================================= 
    Compiler     Library    Linux   Cygwin   MinGW   Sun   OSX 
    ============================================================= 
    GCC 3.4.X    X11                                  O     O 
    SUN CC       X11                                  O 
    GCC 3.4.4    X11                  O 
                 noX                  O 
    GCC 3.4.5    noX                           * 
    GCC 3.4.6    X11          O 
                 GTK          O 
                 noX          O 
    INTEL        noX          O 
                 X11          O 
                 GTK          O 
    GCC 4.1.2    X11          O 
    GCC 4.2.3    X11          O 
    ------------------------------------------------------------- 
    Notes: 
    O: Installs correctly. 
    *: MinGW only installs correctly to absolute DOS paths 
    (i.e. C:/ESRU) 
    ============================================================= 

We've also undertaken a comprehensive comparison of the numerical
results produced by ESP-r on various platforms. Previously, these
comparisons were limited to various versions of the GCC compiler suite
on Linux and Cygwin. Since the last release, Jon Hand has worked to add
support for XML-enabled builds on OSX and Solaris, using both the GCC
and Sun compiler suites. Along with new support for the Intel compiler,
these new capabilities add important reference points for our ongoing
evaluation of GCC4/gFortran.

For the last few years, we've designated GCC 3.4.6/Linux builds as the
reference version of ESP-r. Most ESP-r development and automated testing
is based on this platform, and bps's numerical predictions on this
platform receive more scrutiny than on any other. GCC 3's Fortran 77
compiler (g77) has been obsoleted by gFortran, a Fortran 90 compiler released
with GCC 4. Results differ somewhat between g77 and gFortran builds, and
we've been wondering if they're caused by bugs in g77 or gFortran, or
perhaps poor ESP-r code that one of the compilers is more sensitive to.
For the first time, we have additional reference points (Intel and Sun
CC/f90) to perform these comparisons with.

The following chart summarizes the significant differences observed
between the GCC 4.1.2 (X11, Linux) build and other platform and compiler
combinations. The "Maximum difference" is the largest difference
observed across the test cases, excluding the exceptional test cases
discussed in the section "Problematic test cases", below.

      Comparison of results from GCC4.1.2-compiled versions 
      (X11, Linux) with builds from other platform/compilers. 
    ============================================================= 
    Arch.     Platform    Compiler      Maximum difference (%) 
    ------------------------------------------------------------- 
    IA-32     CYGWIN      GCC 3.4.4            3.18 
              LINUX       GCC 3.4.6            3.18 
                          GCC 4.2.3            0.   (^1) 
                          Intel -O0            0.75 (^2) 
                          Intel -O2           38.781 
    PowerPC   OSX         GCC 3.4             16.1 
    Sparc     SOLARIS     Sun CC/f90           1.80 
    ------------------------------------------------------------- 
    NOTES: 
      1. Some test cases produced numerical exceptions and 
         floating point errors. See "Problematic Test Cases". 
      2. One test case produced a larger error (~13%). See below. 
    ============================================================= 
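
A "maximum difference (%)" figure of the kind tabulated above could be
computed along the following lines. This is only a sketch: it assumes
the difference is taken relative to the first build's value, which is
consistent with the 13.293 % quoted further down for bld_hc_ISO15099 but
is not spelled out in the test scripts' documentation here. The function
name and the second metric are hypothetical; the first metric and its
two values are the ones reported for that test case later in this
message.

```python
def max_percent_difference(reference, candidate):
    """Return (metric, percent) for the largest relative deviation
    of `candidate` from `reference`. Both are dicts of metric -> value."""
    worst_metric, worst_pct = None, 0.0
    for metric, ref_value in reference.items():
        if metric not in candidate or ref_value == 0:
            continue  # cannot express a relative difference for this metric
        pct = abs(candidate[metric] - ref_value) / abs(ref_value) * 100.0
        if pct > worst_pct:
            worst_metric, worst_pct = metric, pct
    return worst_metric, worst_pct


# g77 vs Intel values quoted for bld_hc_ISO15099, plus one made-up metric.
g77 = {
    "building:zone_05:thermal_loads:net_load:month_01": 1499.8,
    "hypothetical:metric": 20.0,
}
intel = {
    "building:zone_05:thermal_loads:net_load:month_01": 1300.4,
    "hypothetical:metric": 20.1,
}

metric, pct = max_percent_difference(g77, intel)
print(f"{metric}: {pct:.2f} %")
```

With these inputs the zone_05 net load dominates at roughly 13.3 %, in
line with the reported figure (which was computed from unrounded values).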

Detailed scrutiny of the results revealed: 

 - gFortran vs Intel iFort / SUN f90 
   --------------------------------- 
   With optimization disabled, the results from SUN and Intel compilers 
   consistently agree more closely with GCC 4.1.2 / gFortran than they 
   do with GCC 3.4.6 / g77. These results suggest GCC 4.1.2 can be used 
   with confidence, and indeed, should be recommended in place of g77. 

 - Stability of gFortran 
   --------------------- 
   GCC 4.2.3 builds encounter numerical exceptions in a handful of test 
   cases that ran correctly in all other compilers (See "problematic 
   test cases", below). I recommend GCC 4.2+ builds be regarded as 
   'beta' until this issue is resolved. 

 - Intel iFort results: effect of optimization 
   ------------------------------------------- 
   With optimization deactivated, the Intel compiler consistently 
   produced results close to the gFortran and Sun f90 builds. But when 
   the default optimization is activated (using the -O2 option), 
   agreement between the Intel and GCC 4.1.2 builds deteriorates 
   considerably. Nearly all of the models exhibit small, non-trace 
   differences, and in every case these differences were slightly or 
   significantly larger than those observed with optimization 
   deactivated. 

   It's possible this error is due to bugs in Intel's optimization 
   algorithm, but poor code in ESP-r is more likely the culprit. For 
   instance, when -O0 is specified, the compiler ensures that argument 
   mismatches in procedure calls are properly converted, but does not 
   perform these conversions by default. 

   This problem might occur in GCC builds as well. By default, GCC uses 
   no optimization. I suspect similar issues might appear if -O2 is 
   specified for g77/gFortran builds. 

   For these reasons, application of optimization options with ESP-r 
   is not presently recommended. 
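
To give a feel for why an unconverted argument mismatch matters, here is
a small Python analogy (ESP-r itself is Fortran; this is not ESP-r
code). It assumes a 32-bit INTEGER is handed to a routine expecting a
32-bit REAL, and that without conversion the routine simply reads the
caller's bytes as a real. Both function names are hypothetical.

```python
import struct


def convert_int_to_real(value):
    """What a proper conversion yields: the same number as a real."""
    return float(value)


def reinterpret_int_as_real(value):
    """What an unconverted mismatch can yield: the integer's 4 bytes
    read back as if they were an IEEE 754 single-precision real."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


n = 3
print(convert_int_to_real(n))      # the intended value, 3.0
print(reinterpret_int_as_real(n))  # a tiny denormal, nowhere near 3.0
```

The two results differ by dozens of orders of magnitude, so a mismatch
that is silently converted in one build and left alone in another would
be enough on its own to produce the kind of disagreement described
above.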

 - Intel iFort results: test case bld_hc_ISO15099 /HC 
   -------------------------------------------------- 
   Newly added for this release, this test case continues to cause 
   headaches. Intel-compiled versions of bps exhibit significant 
   differences from g77/gFortran builds: 

      - MAX error (W)                199.37 W  ( 13.293 %) 
      - Predicted value - g77        1499.8 W 
                          Intel      1300.4 W 
        [ observed in: 
          building:zone_05:thermal_loads:net_load:month_01 (max) ] 

   The corresponding test case that does not activate the ISO15099 
   correlation does not exhibit the same error, suggesting the Intel 
   compiler is exposing a sensitive component of the ISO15099 
   algorithm, or vice-versa. 

   The SUN CC/f90 results we have do not include this test case, 
   precluding assessment of whether the g77/gFortran or Intel 
   predictions are more accurate. For this reason, the Intel compiler 
   suite is only recommended for beta testing for the time being. 

 - OSX / GCC builds 
   ---------------- 
   A handful of test cases exhibited surprising sensitivity when run 
   with GCC 3.4 compiled builds on OSX. The cause of these differences 
   is not known; OSX builds should be used with caution for the time 
   being. 

=========================== 
Problematic test cases 
=========================== 

 - esru_benchmark_model / bld_basic_af2_summer 
                        / bld_basic_af2_winter 
   ------------------------------------------- 
   We've previously observed that this coarse-timestep air flow 
   network test case produces significant differences between g77 and 
   gFortran builds. The long, half-hour timesteps cause small 
   differences in the numerical computations to be exaggerated in 
   the aggregated output; increasing time resolution vastly improves 
   agreement between the compilers. 

   The Linux / Intel, Solaris / SUN CC-f90, and OSX / g77 platforms all 
   exhibit the same sensitivity. They produce differing results at 
   coarse timesteps, but agreement with Linux / g77 improves at higher 
   time resolutions. 

   Since we've observed that each compiler combination produces 
   dissimilar results and that the high-resolution version is more 
   useful, this test case has little value. For this reason, I 
   recommend we delete it. 

 - plt_boundary_conditions / connected_flow 
                             connected_temperature 
                             unconnected_controls 
                             unconnected_flow 
                             unconnected_temperature 
   ------------------------------------------------- 
   While they run correctly with every other compiler (including 
   GCC 4.1.2), these test cases appear to produce numerical exceptions 
   with GCC 4.2.3.