SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

IBM Corporation

IBM Power 575

MPI2007 license: 0005
Test date: Jun-2008
Test sponsor: IBM Corporation
Hardware Availability: May-2008
Tested by: IBM Corporation
Software Availability: May-2008

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

               Base                                              Peak
Benchmark      Ranks Seconds Ratio Seconds Ratio Seconds Ratio   Ranks Seconds Ratio Seconds Ratio Seconds Ratio
104.milc          64     281  5.56     280  5.58     280  5.60      64     281  5.56     280  5.58     280  5.60
107.leslie3d      64     408  12.8     406  12.9     406  12.8      64     408  12.8     406  12.9     406  12.8
113.GemsFDTD      64     408  15.5     406  15.5     406  15.5      64     408  15.5     406  15.5     406  15.5
115.fds4          64     237  8.22     237  8.23     240  8.13      64     237  8.22     237  8.23     240  8.13
121.pop2          64     548  7.53     548  7.53     548  7.54      64     548  7.53     548  7.53     548  7.54
122.tachyon       64     775  3.61     775  3.61     774  3.61      64     775  3.61     775  3.61     774  3.61
126.lammps        64     384  7.59     383  7.62     382  7.62      64     384  7.59     383  7.62     382  7.62
127.wrf2          64     722  10.8     721  10.8     721  10.8      64     722  10.8     721  10.8     721  10.8
128.GAPgeofem     64     650  3.18     731  2.83     175  11.8      64     650  3.18     731  2.83     175  11.8
129.tera_tf       64     654  4.24     654  4.23     653  4.24      64     654  4.24     654  4.23     653  4.24
130.socorro       64     148  25.8     147  26.0     211  18.1      64     148  25.8     147  26.0     211  18.1
132.zeusmp2       64     480  6.46     477  6.50     477  6.51      64     480  6.46     477  6.50     477  6.51
137.lu            64     275  13.3     278  13.2     274  13.4      64     275  13.3     278  13.2     274  13.4
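The median measurement mentioned above can be recovered from the three timed runs of each benchmark. The following is a minimal Python sketch, not part of the SPEC tools; the names `base_runs` and `median_run` are illustrative, and only a few rows of the table are transcribed.

```python
from statistics import median

# Seconds for the three base runs, in run order, transcribed from the table above.
base_runs = {
    "104.milc": [281, 280, 280],
    "128.GAPgeofem": [650, 731, 175],
    "130.socorro": [148, 147, 211],
    "137.lu": [275, 278, 274],
}

def median_run(seconds):
    """Return the median elapsed time of an odd number of runs."""
    return median(seconds)

medians = {name: median_run(runs) for name, runs in base_runs.items()}

for name, secs in medians.items():
    print(f"{name:<14} median = {secs} s")
```

For 128.GAPgeofem the three runs differ widely (650, 731, and 175 seconds), so the median is 650 seconds rather than the fastest run.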
Hardware Summary
Type of System: Homogeneous
Compute Nodes: IBM Power 575, IBM Power 575
Interconnects: InfiniBand, Gigabit Ethernet
File Server Node: IBM Power 575
Head Node: IBM Power 575
Total Compute Nodes: 2
Total Chips: 32
Total Cores: 64
Total Threads: 64
Total Memory: 256 GB
Base Ranks Run: 64
Minimum Peak Ranks: 64
Maximum Peak Ranks: 64
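The totals in this summary follow from the per-node figures given in the two node descriptions of this report. As a quick consistency check, here is a small Python sketch; the variable names are illustrative.

```python
# Per-node figures taken from the node descriptions in this report.
nodes = 2
chips_per_node = 16
cores_per_chip = 2
threads_per_core = 1
memory_gb_per_node = 128
base_ranks = 64

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core
total_memory_gb = nodes * memory_gb_per_node
ranks_per_core = base_ranks / total_cores

print(total_chips, total_cores, total_threads, total_memory_gb, ranks_per_core)
```

With these inputs the run places one MPI rank on each core.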
Software Summary
C Compiler: IBM XL C/C++ Enterprise Edition V9.0, updated with the Oct2007 PTF
C++ Compiler: IBM XL C/C++ Enterprise Edition V9.0, updated with the Oct2007 PTF
Fortran Compiler: IBM XL Fortran Enterprise Edition V11.1, updated with the Oct2007 PTF
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: IBM Parallel Environment for AIX V4.3.2.2
Other MPI Info: --
Pre-processors: --
Other Software: None

Node Description: IBM Power 575

Hardware
Number of nodes: 1
Uses of the node: compute, head, fileserver
Vendor: IBM Corporation
Model: IBM Power 575
CPU Name: POWER6
CPU(s) orderable: 32 cores
Chips enabled: 16
Cores enabled: 32
Cores per chip: 2
Threads per core: 1
CPU Characteristics:
CPU MHz: 4700
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 4 MB I+D on chip per core
L3 Cache: 32 MB I+D off chip per chip
Other Cache: None
Memory: 128 GB (64x2 GB) DDR2 533 MHz
Disk Subsystem: 1x146 GB SFF SAS, 10K RPM
Other Hardware: None
Adapter: Integrated
Number of Adapters: 1
Slot Type: --
Data Rate: 1 Gbps
Ports Used: 1
Interconnect Type: Gigabit Ethernet
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Number of Adapters: 2
Slot Type: GX++
Data Rate: 4x DDR 20 Gbps
Ports Used: 4
Interconnect Type: DDR InfiniBand
Software
Adapter: Integrated
Adapter Driver: fileset devices.chrp.IBM.lhea.rte 5.3.8.2
Adapter Firmware: --
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Adapter Driver: fileset devices.common.IBM.ib.rte 5.3.8.2
Adapter Firmware: --
Operating System: IBM AIX V5.3 with the 5300-08-02 Technology Level
Local File System: AIX/JFS2
Shared File System: NFS over Ethernet
System State: Multi-user
Other Software: APAR IZ26983 (software update for InfiniBand adapter drivers); IBM LoadLeveler for AIX V3.4.3.2

Node Description: IBM Power 575

Hardware
Number of nodes: 1
Uses of the node: compute
Vendor: IBM Corporation
Model: IBM Power 575
CPU Name: POWER6
CPU(s) orderable: 32 cores
Chips enabled: 16
Cores enabled: 32
Cores per chip: 2
Threads per core: 1
CPU Characteristics:
CPU MHz: 4700
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 4 MB I+D on chip per core
L3 Cache: 32 MB I+D off chip per chip
Other Cache: None
Memory: 128 GB (64x2 GB) DDR2 533 MHz
Disk Subsystem: 1x146 GB SFF SAS, 10K RPM
Other Hardware: None
Adapter: Integrated
Number of Adapters: 1
Slot Type: --
Data Rate: 1 Gbps
Ports Used: 1
Interconnect Type: Gigabit Ethernet
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Number of Adapters: 2
Slot Type: GX++
Data Rate: 4x DDR 20 Gbps
Ports Used: 4
Interconnect Type: DDR InfiniBand
Software
Adapter: Integrated
Adapter Driver: fileset devices.chrp.IBM.lhea.rte 5.3.8.2
Adapter Firmware: --
Adapter: IBM Dual 2-port 4x DDR Host Channel Adapter
Adapter Driver: fileset devices.common.IBM.ib.rte 5.3.8.2
Adapter Firmware: --
Operating System: IBM AIX V5.3 with the 5300-08-02 Technology Level
Local File System: AIX/JFS2
Shared File System: NFS over Ethernet
System State: Multi-user
Other Software: APAR IZ26983 (software update for InfiniBand adapter drivers); IBM LoadLeveler for AIX V3.4.3.2

Interconnect Description: InfiniBand

Hardware
Vendor: QLogic
Model: --
Switch Model: QLogic SilverStorm 9024
Number of Switches: 2
Number of Ports: 24
Data Rate: InfiniBand 4x DDR 20 Gbps
Firmware: 4.2.1.1.1
Topology: linear
Primary Use: MPI Communication

Interconnect Description: Gigabit Ethernet

Hardware
Vendor: IBM Corporation
Model: Cisco Systems WS-C6509-E Catalyst 6500 9-slot Chassis System
Switch Model: Cisco Systems WS-X6748-GE-TX CEF720 48-port 10/100/1000 Mb Ethernet card; Cisco Systems WS-SUP720-3B 2-port Supervisor Engine 720 Rev. 5.2
Number of Switches: 1
Number of Ports: 48
Data Rate: 1 Gbps
Firmware: 01ES330_034_034
Topology: --
Primary Use: File system

General Notes

113.GemsFDTD (base): Applied maxprocandstop src.alt
129.tera_tf (base): Applied fixbuffer src.alt
127.wrf2 (base): Applied fixcalling src.alt
All ulimits were set to unlimited.
The "petaskbind.sh" script was used to bind each task to a unique processor.
POE environment variables set before executing benchmarks:
 CWD                 =/specmpi/mpi2007-1.0
 MP_ADAPTER_USE      =shared
 MP_EUILIB           =us
 MP_EUIDEVICE        =sn_all
 MP_SHARED_MEMORY    =yes
 MP_SINGLE_THREAD    =yes
 MP_WAIT_MODE        =poll
 MP_EAGER_LIMIT      =65536
 MP_BUFFER_MEM       =67108864
 MP_POLLING_INTERVAL =80000000
 MP_USE_BULK_XFER    =yes
 MP_BULK_MIN_MSG_SIZE=65536
 MP_STDINMODE        =none
 MP_LABELIO          =no
 MP_HOSTFILE         =$CWD/r35.64-2node
Other environment variables:
 MEMORY_AFFINITY     =MCM
 LDR_CNTRL           =DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K
 XLFRTEOTPS          =intrinthds=1
The submit command uses the petaskbind.sh script to bind ranks to logical processors:
 poe $CWD/petaskbind.sh $command -procs $ranks
The Gigabit Ethernet switch is shared among many nodes, not only those in the cluster used for this benchmark.
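The environment settings and submit command in these notes can be collected into a launcher. The following Python sketch only builds the environment and argument list and does not invoke poe. The benchmark command line passed in is hypothetical because the report does not give the real value of $command, the contents of petaskbind.sh are not shown, and reading MP_EAGER_LIMIT and MP_BUFFER_MEM as byte counts is an assumption.

```python
import shlex

cwd = "/specmpi/mpi2007-1.0"

# POE settings copied from the notes above; values are strings, as in a shell.
poe_env = {
    "MP_ADAPTER_USE": "shared",
    "MP_EUILIB": "us",
    "MP_EUIDEVICE": "sn_all",
    "MP_SHARED_MEMORY": "yes",
    "MP_SINGLE_THREAD": "yes",
    "MP_WAIT_MODE": "poll",
    "MP_EAGER_LIMIT": "65536",
    "MP_BUFFER_MEM": "67108864",
    "MP_POLLING_INTERVAL": "80000000",
    "MP_USE_BULK_XFER": "yes",
    "MP_BULK_MIN_MSG_SIZE": "65536",
    "MP_STDINMODE": "none",
    "MP_LABELIO": "no",
    "MP_HOSTFILE": f"{cwd}/r35.64-2node",
}

other_env = {
    "MEMORY_AFFINITY": "MCM",
    "LDR_CNTRL": "DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K",
    "XLFRTEOTPS": "intrinthds=1",
}

def poe_command(command, ranks):
    """Build the argv list for: poe $CWD/petaskbind.sh $command -procs $ranks"""
    return ["poe", f"{cwd}/petaskbind.sh", *shlex.split(command), "-procs", str(ranks)]

# Hypothetical benchmark command; the real $command is not given in the report.
argv = poe_command("./benchmark_exe input.dat", 64)
print(" ".join(shlex.quote(arg) for arg in argv))

# Size settings converted for readability (assumes the values are byte counts).
eager_limit_kib = int(poe_env["MP_EAGER_LIMIT"]) // 1024
buffer_mem_mib = int(poe_env["MP_BUFFER_MEM"]) // (1024 ** 2)
print(eager_limit_kib, "KiB eager limit,", buffer_mem_mib, "MiB buffer memory")
```

To launch for real, one option would be `subprocess.run(argv, env={**os.environ, **poe_env, **other_env})` on a system where poe is installed.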

Base Compiler Invocation

C benchmarks:

 /usr/bin/mpcc_r 

C++ benchmarks:

126.lammps:  /usr/bin/mpCC_r 

Fortran benchmarks:

 /usr/bin/mpxlf95_r 

Benchmarks using both Fortran and C:

 /usr/bin/mpcc_r   /usr/bin/mpxlf95_r 

Base Portability Flags

107.leslie3d: -qfixed
115.fds4: -DSPEC_MPI_LC_NO_TRAILING_UNDERSCORE -qfixed
121.pop2: -DSPEC_MPI_AIX
127.wrf2: -DNOUNDERSCORE -DSPEC_MPI_AIX
130.socorro: -DSPEC_NO_UNDERSCORE -qcpluscmt
132.zeusmp2: -qfixed -DSPEC_SINGLE_UNDERSCORE
137.lu: -qfixed

Base Optimization Flags

C benchmarks:

 -O4 -qarch=pwr6 -qtune=pwr6 -q64

C++ benchmarks:

126.lammps: -O4 -qarch=pwr6 -qtune=pwr6 -qstrict -q64

Fortran benchmarks:

 -O4 -qarch=pwr6 -qtune=pwr6 -qalias=nostd -q64

Benchmarks using both Fortran and C:

 -O4 -qarch=pwr6 -qtune=pwr6 -qalias=nostd -q64

Base Other Flags

C benchmarks:

 -w -qsuppress=1500-036 -qipa=noobject -qipa=threads

C++ benchmarks:

126.lammps: -w -qsuppress=1500-036 -qipa=noobject -qipa=threads

Fortran benchmarks:

 -w -qsuppress=1500-036 -qsuppress=cmpmsg -qipa=noobject -qipa=threads

Benchmarks using both Fortran and C:

 -w -qsuppress=1500-036 -qsuppress=cmpmsg -qipa=noobject -qipa=threads
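The base compiler invocation, portability, optimization, and other flags listed above combine into one command line per benchmark. The Python sketch below assembles that line for the Fortran benchmarks. The order in which the flag groups are concatenated is an assumption, since the report lists the groups separately, and `base_command` is an illustrative helper rather than anything from the SPEC tools.

```python
# Flag groups transcribed from the Base sections above (Fortran benchmarks only).
compiler = "/usr/bin/mpxlf95_r"

# Per-benchmark portability flags; Fortran benchmarks not listed here have none.
portability = {
    "107.leslie3d": ["-qfixed"],
    "137.lu": ["-qfixed"],
}

optimization = ["-O4", "-qarch=pwr6", "-qtune=pwr6", "-qalias=nostd", "-q64"]
other = ["-w", "-qsuppress=1500-036", "-qsuppress=cmpmsg",
         "-qipa=noobject", "-qipa=threads"]

def base_command(benchmark):
    """Return the assumed base compile command as a list of arguments."""
    # Assumed order: compiler, portability, optimization, other flags.
    return [compiler, *portability.get(benchmark, []), *optimization, *other]

for name in ("107.leslie3d", "113.GemsFDTD"):
    print(name, "->", " ".join(base_command(name)))
```

Source file names and output options are omitted because the report does not list them.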

Peak Optimization Flags

C benchmarks:

104.milc:  basepeak = yes 
122.tachyon:  basepeak = yes 

C++ benchmarks:

126.lammps:  basepeak = yes 

Fortran benchmarks:

107.leslie3d:  basepeak = yes 
113.GemsFDTD:  basepeak = yes 
129.tera_tf:  basepeak = yes 
137.lu:  basepeak = yes 

Benchmarks using both Fortran and C:

115.fds4:  basepeak = yes 
121.pop2:  basepeak = yes 
127.wrf2:  basepeak = yes 
128.GAPgeofem:  basepeak = yes 
130.socorro:  basepeak = yes 
132.zeusmp2:  basepeak = yes 
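Because every benchmark is marked basepeak = yes, the peak columns of the results table are expected to repeat the base measurements. The short Python sketch below checks this for a few transcribed rows; the `results` structure and the choice of rows are illustrative.

```python
# (base, peak) seconds for the three runs, transcribed from the results table.
results = {
    "104.milc": ([281, 280, 280], [281, 280, 280]),
    "128.GAPgeofem": ([650, 731, 175], [650, 731, 175]),
    "137.lu": ([275, 278, 274], [275, 278, 274]),
}

# Every benchmark in this report sets basepeak = yes.
basepeak = {name: True for name in results}

# A benchmark is a mismatch if basepeak is set but its peak runs differ from base.
mismatches = [
    name
    for name, (base, peak) in results.items()
    if basepeak[name] and base != peak
]
print("mismatches:", mismatches)
```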

The flags files that were used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/MPI2007_flags.20080828.html,
http://www.spec.org/mpi2007/flags/MPI2007_flags.0.20080828.html,
http://www.spec.org/mpi2007/flags/MPI2007_flags.1.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/mpi2007/flags/MPI2007_flags.20080828.xml,
http://www.spec.org/mpi2007/flags/MPI2007_flags.0.20080828.xml,
http://www.spec.org/mpi2007/flags/MPI2007_flags.1.xml.