SPEC® MPIL2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

SGI

SGI Altix ICE 8400EX
(Intel Xeon X5690, 3.46 GHz)

MPI2007 license: 4
Test sponsor: SGI
Tested by: SGI
Test date: Jun-2011
Hardware Availability: Feb-2011
Software Availability: Aug-2011

Results Table

               Base                                             Peak
Benchmark      Ranks Seconds Ratio Seconds Ratio Seconds Ratio  Ranks Seconds Ratio Seconds Ratio Seconds Ratio
121.pop2        3072    94.7  41.1    75.6  51.4    75.7  51.4   2048    59.7  65.1    58.8  66.1    58.7  66.3
122.tachyon     3072     127  15.4    58.1  33.5    58.8  33.0   3072     127  15.4    58.1  33.5    58.8  33.0
125.RAxML       3072    79.2  36.9    79.5  36.7    79.5  36.7   3072    79.2  36.9    79.5  36.7    79.5  36.7
126.lammps      3072    33.6  73.1    33.9  72.4    33.9  72.5   3072    33.6  73.1    33.9  72.4    33.9  72.5
128.GAPgeofem   3072    72.5  81.8    72.9  81.5    72.6  81.8   3072    72.5  81.8    72.9  81.5    72.6  81.8
129.tera_tf     3072    36.7  30.0    36.5  30.1    36.7  29.9   3072    36.7  30.0    36.5  30.1    36.7  29.9
132.zeusmp2     3072    37.6  56.4    37.5  56.5    37.4  56.6   2048    34.2  61.9    33.8  62.8    34.0  62.3
137.lu          3072    39.3   107    39.4   107    39.5   106   2048    33.6   125    33.6   125    33.6   125
142.dmilc       3072    24.9   148    25.0   147    25.0   148   3072    24.9   148    25.0   147    25.0   148
143.dleslie     3072     660  4.70     660  4.70     660  4.70   2048    31.0   100    31.0  99.9    32.0  97.0
145.lGemsFDTD   3072    89.1  49.5    89.1  49.5    88.8  49.7   2048    82.8  53.3    82.8  53.3    83.1  53.1
147.l2wrf2      3072    90.4  90.8    90.2  91.0    90.7  90.4   3072    90.4  90.8    90.2  91.0    90.7  90.4

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
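
The median markers do not survive in plain text, but they can be recovered from the three ratios in each row. The Python sketch below does that for the base runs and then combines the medians with a geometric mean, which is how SPEC suites conventionally summarize per-benchmark ratios. The names BASE_RATIOS, median_ratios and overall_ratio are illustrative, and the resulting figure is a recomputation from rounded table values, not the official metric.

```python
from statistics import geometric_mean, median

# Base ratios for the three runs of each benchmark, copied from the table.
BASE_RATIOS = {
    "121.pop2": (41.1, 51.4, 51.4),
    "122.tachyon": (15.4, 33.5, 33.0),
    "125.RAxML": (36.9, 36.7, 36.7),
    "126.lammps": (73.1, 72.4, 72.5),
    "128.GAPgeofem": (81.8, 81.5, 81.8),
    "129.tera_tf": (30.0, 30.1, 29.9),
    "132.zeusmp2": (56.4, 56.5, 56.6),
    "137.lu": (107, 107, 106),
    "142.dmilc": (148, 147, 148),
    "143.dleslie": (4.70, 4.70, 4.70),
    "145.lGemsFDTD": (49.5, 49.5, 49.7),
    "147.l2wrf2": (90.8, 91.0, 90.4),
}


def median_ratios(ratios):
    """Return the median of the three run ratios for each benchmark."""
    return {name: median(runs) for name, runs in ratios.items()}


def overall_ratio(ratios):
    """Geometric mean of the per-benchmark median ratios."""
    return geometric_mean(list(median_ratios(ratios).values()))


if __name__ == "__main__":
    for name, value in median_ratios(BASE_RATIOS).items():
        print(f"{name:<14} {value:>6}")
    print(f"geometric mean of medians: {overall_ratio(BASE_RATIOS):.1f}")
```

The same two functions can be applied to the peak columns by substituting the peak ratios.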
Hardware Summary
Type of System: Homogeneous
Compute Node: SGI Altix ICE 8400EX Compute Node
Interconnect: InfiniBand (MPI and I/O)
File Server Node: SGI InfiniteStorage Nexis 2000 NAS
Total Compute Nodes: 256
Total Chips: 512
Total Cores: 3072
Total Threads: 6144
Total Memory: 6 TB
Base Ranks Run: 3072
Minimum Peak Ranks: 2048
Maximum Peak Ranks: 3072
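
The totals above follow from the per-node figures in the compute node description (256 nodes, 2 chips per node, 6 cores per chip, 2 threads per core, 24 GB per node). A minimal Python sketch of that arithmetic, with a hypothetical Cluster class introduced only for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Per-node figures taken from the compute node description."""
    nodes: int
    chips_per_node: int
    cores_per_chip: int
    threads_per_core: int
    memory_gb_per_node: int

    @property
    def chips(self) -> int:
        return self.nodes * self.chips_per_node

    @property
    def cores(self) -> int:
        return self.chips * self.cores_per_chip

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def memory_tb(self) -> float:
        # 1 TB is taken as 1024 GB here, which reproduces the 6 TB total.
        return self.nodes * self.memory_gb_per_node / 1024


ice = Cluster(nodes=256, chips_per_node=2, cores_per_chip=6,
              threads_per_core=2, memory_gb_per_node=24)

if __name__ == "__main__":
    print(ice.chips, ice.cores, ice.threads, ice.memory_tb)
```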
Software Summary
C Compiler: Intel C++ Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
C++ Compiler: Intel C++ Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
Fortran Compiler: Intel Fortran Composer XE 2011 for Linux,
Version 12.0.3.174 Build 20110309
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: SGI MPT 2.04 Patch 10789
Other MPI Info: OFED 1.4.2
Pre-processors: None
Other Software: None

Node Description: SGI Altix ICE 8400EX Compute Node

Hardware
Number of nodes: 256
Uses of the node: compute
Vendor: SGI
Model: SGI Altix ICE 8400EX (Intel Xeon X5690, 3.46 GHz)
CPU Name: Intel Xeon X5690
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Six Core, 3.46 GHz, 6.4 GT/s QPI
Intel Turbo Boost Technology up to 3.73 GHz
Hyper-Threading Technology enabled
CPU MHz: 3467
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 24 GB (6 x 4 GB 2Rx4 PC3-10600R-9, ECC)
Disk Subsystem: None
Other Hardware: None
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x QDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Adapter Driver: OFED-1.4.2
Adapter Firmware: 2.7.8200
Operating System: SUSE Linux Enterprise Server 11 SP1,
Kernel 2.6.32.13-0.4-default
Local File System: NFSv3
Shared File System: NFSv3 IPoIB
System State: Multi-user, run level 3
Other Software: SGI ProPack 7SP1 for Linux,
Build 701r3.sles11-1005252113
SGI Tempo Compute Node 2.1,
Build 701r3.sles11-1005252113

Node Description: SGI InfiniteStorage Nexis 2000 NAS

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: SGI
Model: SGI Altix XE 270 (Intel Xeon X5670, 2.93 GHz)
CPU Name: Intel Xeon X5670
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 12
Cores per chip: 6
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.33 GHz
Hyper-Threading Technology enabled
CPU MHz: 2933
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 12 MB I+D on chip per chip
Other Cache: None
Memory: 96 GB (12*8 GB DDR3-1333 CL9 DIMMs)
Disk Subsystem: 8.8 TB RAID 5
60 x 146 GB SAS (Seagate Cheetah 15K.5)
Other Hardware: None
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Number of Adapters: 2
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x QDR
Ports Used: 2
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MT26428 ConnectX IB QDR
(PCIe x8 Gen2 5 GT/s)
Adapter Driver: OFED-1.4.0
Adapter Firmware: 2.7.0
Operating System: SUSE Linux Enterprise Server 11 (x86_64)
Kernel 2.6.27.19-5-default
Local File System: xfs
Shared File System: --
System State: Multi-user, run level 3
Other Software: SGI Foundation Software 2, Build
700r3.sles11-1004061553

Interconnect Description: InfiniBand (MPI and I/O)

Hardware
Vendor: Mellanox Technologies and SGI
Model: None
Switch Model: SGI QDR_1.5_HYPR_2454 with Mellanox Device 48438
(Infiniscale IV)
Number of Switches: 64
Number of Ports: 36
Data Rate: InfiniBand 4x QDR
Firmware: 5040005
Topology: Enhanced Hypercube
Primary Use: MPI and I/O traffic

Submit Notes

The config file option 'submit' was used.
For peak benchmarks that used 2048 MPI ranks, four ranks
were assigned to each CPU chip, leaving 2 cores per chip idle.
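
The placement described above can be checked against the hardware summary (512 chips, 6 cores per chip). The Python sketch below is only an arithmetic illustration; the function name ranks_per_chip is hypothetical and it assumes ranks are spread evenly over all chips, one rank per core.

```python
def ranks_per_chip(ranks: int, chips: int, cores_per_chip: int) -> tuple[int, int]:
    """Return (ranks placed on each chip, cores left idle on each chip).

    Assumes an even spread of ranks over all chips and at most one
    rank per physical core.
    """
    per_chip, leftover = divmod(ranks, chips)
    if leftover:
        raise ValueError("ranks do not divide evenly across chips")
    if per_chip > cores_per_chip:
        raise ValueError("more ranks per chip than physical cores")
    return per_chip, cores_per_chip - per_chip


if __name__ == "__main__":
    for ranks in (2048, 3072):
        used, idle = ranks_per_chip(ranks, chips=512, cores_per_chip=6)
        print(f"{ranks} ranks: {used} per chip, {idle} cores idle per chip")
```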

General Notes

 Software environment:
   export MPI_REQUEST_MAX=65536
   export MPI_TYPE_MAX=32768
   export MPI_BUFS_THRESHOLD=1
   export MPI_IB_RAILS=2
   ulimit -s unlimited

 BIOS settings:
   AMI BIOS version 080016
   Hyper-Threading Technology enabled (default)
   Intel Turbo Boost Technology enabled (default)
   Intel Turbo Boost Technology activated in the OS via
     /etc/init.d/acpid start
     /etc/init.d/powersaved start
     powersave -f

 Job Placement:
   In the base run, each MPI job was assigned to a topologically compact
   set of nodes, i.e. the minimum number of switches needed was
   used for each job: 2 switches for 96 ranks,
   4 switches for 192 ranks, 8 switches for 384 ranks,
   16 switches for 768 ranks, 32 switches for 1536 ranks,
   and 64 switches for 3072 ranks.
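
   Every pair in the list above works out to 48 ranks per switch. The
   Python sketch below reproduces the list under that inferred figure;
   the constant RANKS_PER_SWITCH and the function switches_needed are
   illustrative names, not part of any SGI tool.

```python
import math

# Inferred from the job placement note: 96 ranks on 2 switches,
# 3072 ranks on 64 switches, i.e. 48 ranks per switch in the base run.
RANKS_PER_SWITCH = 48


def switches_needed(ranks: int, ranks_per_switch: int = RANKS_PER_SWITCH) -> int:
    """Smallest number of switches that can host the given rank count."""
    if ranks <= 0:
        raise ValueError("ranks must be positive")
    return math.ceil(ranks / ranks_per_switch)


if __name__ == "__main__":
    for ranks in (96, 192, 384, 768, 1536, 3072):
        print(f"{ranks:>4} ranks -> {switches_needed(ranks):>2} switches")
```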

 Additional notes regarding interconnect:
   The InfiniBand network consists of two independent planes,
   with half the switches in the system allocated to each plane.
   I/O traffic is restricted to one plane, while MPI traffic can
   use both planes.

Compiler Invocation

C benchmarks:

 icc 

C++ benchmarks:

126.lammps:  icpc 

Fortran benchmarks:

 ifort 

Benchmarks using both Fortran and C:

 icc   ifort 

Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 

Base Optimization Flags

C benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xSSE4.2   -no-prec-div   -ansi-alias 

Fortran benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xSSE4.2   -no-prec-div 

Peak Optimization Flags

C benchmarks:

122.tachyon:  basepeak = yes 
125.RAxML:  basepeak = yes 
142.dmilc:  basepeak = yes 

C++ benchmarks:

126.lammps:  basepeak = yes 

Fortran benchmarks:

129.tera_tf:  basepeak = yes 
137.lu:  -O3   -xSSE4.2   -no-prec-div 
143.dleslie:  Same as 137.lu 
145.lGemsFDTD:  Same as 137.lu 

Benchmarks using both Fortran and C:

121.pop2:  -O3   -xSSE4.2   -no-prec-div 
128.GAPgeofem:  basepeak = yes 
132.zeusmp2:  Same as 121.pop2 
147.l2wrf2:  basepeak = yes 
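
The peak entries above use two shorthands: "basepeak = yes", which in SPEC reports means the base build is reused for peak, and "Same as", which points at another benchmark's peak flags. The Python sketch below expands both into explicit flag strings. The dictionaries and the function peak_flags are a hypothetical illustration built from the flags listed in this report, not a SPEC tool.

```python
BASEPEAK = "basepeak"
COMMON = "-O3 -xSSE4.2 -no-prec-div"

# Base optimization flags per benchmark, as listed in this report.
BASE_FLAGS = {
    "121.pop2": COMMON, "122.tachyon": COMMON, "125.RAxML": COMMON,
    "126.lammps": COMMON + " -ansi-alias", "128.GAPgeofem": COMMON,
    "129.tera_tf": COMMON, "132.zeusmp2": COMMON, "137.lu": COMMON,
    "142.dmilc": COMMON, "143.dleslie": COMMON, "145.lGemsFDTD": COMMON,
    "147.l2wrf2": COMMON,
}

# Peak entries: BASEPEAK, an explicit flag string, or ("same_as", benchmark).
PEAK_SPEC = {
    "121.pop2": COMMON,
    "122.tachyon": BASEPEAK,
    "125.RAxML": BASEPEAK,
    "126.lammps": BASEPEAK,
    "128.GAPgeofem": BASEPEAK,
    "129.tera_tf": BASEPEAK,
    "132.zeusmp2": ("same_as", "121.pop2"),
    "137.lu": COMMON,
    "142.dmilc": BASEPEAK,
    "143.dleslie": ("same_as", "137.lu"),
    "145.lGemsFDTD": ("same_as", "137.lu"),
    "147.l2wrf2": BASEPEAK,
}


def peak_flags(benchmark: str) -> str:
    """Expand a peak entry into the explicit optimization flags."""
    spec = PEAK_SPEC[benchmark]
    if spec == BASEPEAK:
        return BASE_FLAGS[benchmark]
    if isinstance(spec, tuple):
        return peak_flags(spec[1])
    return spec


if __name__ == "__main__":
    for name in sorted(PEAK_SPEC):
        print(f"{name:<14} {peak_flags(name)}")
```

In this report the expanded peak flags match the base flags for every benchmark, so the peak differences in the results table come from the rank counts (2048 instead of 3072) rather than from compiler options.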

Other Flags

C benchmarks:

 -lmpi 

C++ benchmarks:

126.lammps:  -lmpi 

Fortran benchmarks:

 -lmpi 

Benchmarks using both Fortran and C:

 -lmpi 

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel12_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/SGI_x86_64_Intel12_flags.xml.