SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

Intel Corporation

Endeavor (Intel Xeon E5-2697 v3, 2.60 GHz,
DDR4-2133 MHz, SMT on, Turbo on)

SPECmpiM_peak2007 = Not Run

MPI2007 license: 13
Test sponsor: Intel Corporation
Tested by: Pavel Shelepugin
Test date: Aug-2014
Hardware Availability: Sep-2014
Software Availability: May-2014

Results Table

                     Base
Benchmark     Ranks  Seconds  Ratio  Seconds  Ratio  Seconds  Ratio
104.milc        112     85.1   18.4     84.2   18.6     84.0   18.6
107.leslie3d    112      212   24.6      213   24.5      213   24.5
113.GemsFDTD    112      193   32.7      192   32.9      193   32.7
115.fds4        112     89.4   21.8     89.0   21.9     90.2   21.6
121.pop2        112      239   17.2      240   17.2      240   17.2
122.tachyon     112      108   25.9      108   25.9      108   25.9
126.lammps      112      143   20.4      143   20.4      141   20.6
127.wrf2        112      177   44.1      175   44.4      176   44.3
128.GAPgeofem   112     67.6   30.6     67.6   30.6     67.4   30.7
129.tera_tf     112      139   19.9      159   17.4      139   19.9
130.socorro     112     78.1   48.9     78.0   48.9     79.0   48.3
132.zeusmp2     112      111   27.9      112   27.8      112   27.8
137.lu          112      100   36.7      100   36.6      101   36.3

Peak (Ranks, Seconds, Ratio): no entries.
Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
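
SPEC computes the overall metric as the geometric mean of the per-benchmark median ratios. The sketch below applies that rule to the base ratios listed in the Results Table. It is an illustration, not the SPEC tools' own code: the variable and function names are hypothetical, and because the table shows rounded ratios the value it prints can differ in the last digit from an officially reported score.

```python
import math
from statistics import median

# Base ratios copied from the Results Table (three runs per benchmark).
base_ratios = {
    "104.milc": (18.4, 18.6, 18.6),
    "107.leslie3d": (24.6, 24.5, 24.5),
    "113.GemsFDTD": (32.7, 32.9, 32.7),
    "115.fds4": (21.8, 21.9, 21.6),
    "121.pop2": (17.2, 17.2, 17.2),
    "122.tachyon": (25.9, 25.9, 25.9),
    "126.lammps": (20.4, 20.4, 20.6),
    "127.wrf2": (44.1, 44.4, 44.3),
    "128.GAPgeofem": (30.6, 30.6, 30.7),
    "129.tera_tf": (19.9, 17.4, 19.9),
    "130.socorro": (48.9, 48.9, 48.3),
    "132.zeusmp2": (27.9, 27.8, 27.8),
    "137.lu": (36.7, 36.6, 36.3),
}


def median_ratios(runs):
    """Pick the median of the three measured ratios for each benchmark."""
    return {name: median(values) for name, values in runs.items()}


def geometric_mean(values):
    """Geometric mean, computed via logarithms for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


medians = median_ratios(base_ratios)
overall = geometric_mean(medians.values())
print(f"geometric mean of median base ratios: {overall:.1f}")
```

Taking the median first means a single slow run, such as the 159-second run of 129.tera_tf, does not affect the overall figure.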
Hardware Summary
Type of System: Homogeneous
Compute Node: Endeavor Node
Interconnects: IB Switch, Gigabit Ethernet
File Server Node: NFS
Total Compute Nodes: 4
Total Chips: 8
Total Cores: 112
Total Threads: 224
Total Memory: 256 GB
Base Ranks Run: 112
Minimum Peak Ranks: --
Maximum Peak Ranks: --
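
The totals in the Hardware Summary follow from the per-node figures given in the Endeavor Node description (2 chips, 14 cores per chip, 2 threads per core, 64 GB). A minimal sketch of that arithmetic, with a hypothetical `Node` type and `cluster_totals` helper introduced only for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    chips: int
    cores_per_chip: int
    threads_per_core: int
    memory_gb: int


def cluster_totals(node, count):
    """Scale one node's resources up to `count` identical nodes."""
    cores = node.chips * node.cores_per_chip * count
    return {
        "chips": node.chips * count,
        "cores": cores,
        "threads": cores * node.threads_per_core,
        "memory_gb": node.memory_gb * count,
    }


# Per-node values taken from the Endeavor Node description in this report.
endeavor = Node(chips=2, cores_per_chip=14, threads_per_core=2, memory_gb=64)
totals = cluster_totals(endeavor, count=4)
print(totals)
```

The base run uses 112 ranks, which matches the total core count, so it is consistent with one rank per physical core even though SMT is enabled.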
Software Summary
C Compiler: Intel C++ Composer XE 2013 for Linux, Version 14.0.3.174 Build 20140422
C++ Compiler: Intel C++ Composer XE 2013 for Linux, Version 14.0.3.174 Build 20140422
Fortran Compiler: Intel Fortran Composer XE 2013 for Linux, Version 14.0.3.174 Build 20140422
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: Intel MPI Library 4.1.3.049 for Linux
Other MPI Info: None
Pre-processors: No
Other Software: None

Node Description: Endeavor Node

Hardware
Number of nodes: 4
Uses of the node: compute
Vendor: Intel
Model: R2208WTTYC1
CPU Name: Intel Xeon E5-2697 v3
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 28
Cores per chip: 14
Threads per core: 2
CPU Characteristics: Intel Turbo Boost Technology up to 3.6 GHz, 9.6 GT/s QPI, Hyper-Threading enabled
CPU MHz: 2600
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 35 MB I+D on chip per chip, 35 MB shared / 14 cores
Other Cache: None
Memory: 64 GB (8 x 8 GB 2Rx4 PC4-17000R-15, ECC)
Disk Subsystem: ATA INTEL SSDSA2BZ20, SSDSC2BB80
Other Hardware: None
Adapter: Intel (ESB2) 82575EB Dual-Port Gigabit Ethernet Controller
Number of Adapters: 1
Slot Type: PCI-Express x8
Data Rate: 1Gbps Ethernet
Ports Used: 2
Interconnect Type: Ethernet
Adapter: Mellanox MCX353A-FCAT ConnectX-3
Number of Adapters: 1
Slot Type: PCIe x8 Gen3
Data Rate: InfiniBand 4x FDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Intel (ESB2) 82575EB Dual-Port Gigabit Ethernet Controller
Adapter Driver: e1000
Adapter Firmware: None
Adapter: Mellanox MCX353A-FCAT ConnectX-3
Adapter Driver: OFED 3.5-2-MIC-rc1
Adapter Firmware: 2.31.5050
Operating System: Red Hat EL 6.5, kernel 2.6.32-358
Local File System: Linux/xfs
Shared File System: NFS
System State: Multi-User
Other Software: IBM Platform LSF Standard 9.1.1.1

Node Description: NFS

Hardware
Number of nodes: 1
Uses of the node: fileserver
Vendor: Intel
Model: S7000FC4UR
CPU Name: Intel Xeon CPU
CPU(s) orderable: 1-4 chips
Chips enabled: 4
Cores enabled: 16
Cores per chip: 4
Threads per core: 2
CPU Characteristics: --
CPU MHz: 2926
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 8 MB I+D on chip per chip, 4 MB shared / 2 cores
L3 Cache: None
Other Cache: None
Memory: 64 GB
Disk Subsystem: 8 disks, 500GB/disk, 2.7TB total
Other Hardware: None
Adapter: Intel 82563GB Dual-Port Gigabit Ethernet Controller
Number of Adapters: 1
Slot Type: PCI-Express x8
Data Rate: 1Gbps Ethernet
Ports Used: 1
Interconnect Type: Ethernet
Software
Adapter: Intel 82563GB Dual-Port Gigabit Ethernet Controller
Adapter Driver: e1000e
Adapter Firmware: N/A
Operating System: Red Hat EL 5 Update 4
Local File System: None
Shared File System: NFS
System State: Multi-User
Other Software: None

Interconnect Description: IB Switch

Hardware
Vendor: Mellanox
Model: Mellanox MSX6025F-1BFR
Switch Model: Mellanox MSX6025F-1BFR
Number of Switches: 46
Number of Ports: 36
Data Rate: InfiniBand 4x FDR
Firmware: 9.2.8000
Topology: Fat tree
Primary Use: MPI traffic

Interconnect Description: Gigabit Ethernet

Hardware
Vendor: Force10 Networks, Cisco Systems
Model: Force10 S50N, Force10 C300, Cisco WS-C4948E-F
Switch Model: Force10 S50N, Force10 C300, Cisco WS-C4948E-F
Number of Switches: 13
Number of Ports: 48
Data Rate: 1Gbps Ethernet, 10Gbps Ethernet
Firmware: 8.3.2.0, 12.2(54)WO
Topology: Star
Primary Use: Cluster File System

Submit Notes

The config file option 'submit' was used.

General Notes

130.socorro (base): "nullify_ptrs" src.alt was used.

 MPI startup command:
   mpiexec.hydra command was used to start MPI jobs.
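
   The report does not give the exact submit line. As a hedged illustration only, the sketch below assembles an mpiexec.hydra command for the 112-rank base run using the `-n` (total ranks) and `-ppn` (ranks per node) options of Intel MPI. The 28-ranks-per-node placement, the `hydra_command` helper, and the executable name are assumptions, not values taken from this result.

```python
import shlex


def hydra_command(ranks, ranks_per_node, executable, args=()):
    """Build an argument list for mpiexec.hydra (illustrative, not the tested command)."""
    return [
        "mpiexec.hydra",
        "-n", str(ranks),             # total number of MPI ranks
        "-ppn", str(ranks_per_node),  # ranks placed on each node (assumed)
        executable,
        *args,
    ]


# Hypothetical executable name; 112 ranks over 4 nodes gives 28 ranks per node.
cmd = hydra_command(112, 28, "./benchmark_exe")
print(shlex.join(cmd))
```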

 BIOS settings:
   Intel Hyper-Threading Technology (SMT): Enabled (default is Enabled)
   Intel Turbo Boost Technology (Turbo)  : Enabled (default is Enabled)

 RAM configuration:
   Compute nodes have 1x8-GB RDIMM on each memory channel.

 Network:
   Forty-six 36-port switches: 18 core switches and 28 leaf switches.
   Each leaf switch has one link to each core switch. The remaining 18 ports
   on 25 of the 28 leaf switches are used for compute nodes. On the other
   3 leaf switches the ports are used for FS nodes and other peripherals.
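
   The port counts in this description can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the constant names are hypothetical, and the resulting node capacity is an upper bound implied by the port budget, not a node count stated in this report.

```python
PORTS_PER_SWITCH = 36
CORE_SWITCHES = 18
LEAF_SWITCHES = 28
COMPUTE_LEAVES = 25  # leaf switches whose free ports serve compute nodes

# One uplink from every leaf switch to every core switch.
uplinks_per_leaf = CORE_SWITCHES
downlinks_per_leaf = PORTS_PER_SWITCH - uplinks_per_leaf

# Each core switch terminates one link from every leaf switch.
core_ports_used = LEAF_SWITCHES

total_switches = CORE_SWITCHES + LEAF_SWITCHES
max_compute_nodes = COMPUTE_LEAVES * downlinks_per_leaf

print(f"switches: {total_switches}")
print(f"node-facing ports per leaf: {downlinks_per_leaf}")
print(f"core ports in use: {core_ports_used} of {PORTS_PER_SWITCH}")
print(f"upper bound on compute nodes: {max_compute_nodes}")
```

   With 18 uplinks and 18 node-facing ports per leaf switch, the fat tree has as much uplink capacity as node-facing capacity on each leaf.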

 Job placement:
   Each MPI job was assigned to a topologically compact set of nodes, i.e.
   the minimal needed number of leaf switches was used for each job: 1 switch
   for 28/56/112/224/448 ranks, 2 switches for 896 ranks, 4 switches for 1792 ranks,
   8 switches for 3584 ranks.
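
   The switch counts listed above can be reproduced from the topology. The sketch below assumes 28 ranks per node (one per core, as in the 112-rank base run on 4 nodes) and 18 compute nodes per leaf switch. The `leaf_switches_needed` function is a hypothetical helper, not part of the scheduler configuration.

```python
import math

RANKS_PER_NODE = 28  # assumption: one MPI rank per physical core
NODES_PER_LEAF = 18  # node-facing ports on each leaf switch


def leaf_switches_needed(ranks):
    """Smallest number of leaf switches that can host `ranks` MPI ranks."""
    nodes = math.ceil(ranks / RANKS_PER_NODE)
    return math.ceil(nodes / NODES_PER_LEAF)


for ranks in (28, 56, 112, 224, 448, 896, 1792, 3584):
    print(f"{ranks:>5} ranks -> {leaf_switches_needed(ranks)} leaf switch(es)")
```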

 IBM Platform LSF was used for job submission. It has no impact on performance.
   Information can be found at: http://www.ibm.com

Base Compiler Invocation

C benchmarks:

 mpiicc 

C++ benchmarks:

126.lammps:  mpiicpc 

Fortran benchmarks:

 mpiifort 

Benchmarks using both Fortran and C:

 mpiicc   mpiifort 

Base Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
126.lammps:  -DMPICH_IGNORE_CXX_SEEK 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 
130.socorro:  -assume nostd_intent_in 

Base Optimization Flags

C benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xCORE-AVX2   -no-prec-div 

Fortran benchmarks:

 -O3   -xCORE-AVX2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xCORE-AVX2   -no-prec-div 

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/EM64T_Intel140_flags.20140908.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/EM64T_Intel140_flags.20140908.xml.