SPEC® ACCEL™ ACC Result

Copyright 2015-2018 Standard Performance Evaluation Corporation

Supermicro (Test Sponsor: NVIDIA Corporation)

Tesla V100-PCIE-16GB

SuperServer 1029GQ-TRT

SPECaccel_acc_base = 11.5 

SPECaccel_acc_peak = 11.5 

ACCEL license: 019
Test date: Jul-2018
Test sponsor: NVIDIA Corporation
Hardware Availability: Nov-2017
Tested by: NVIDIA Corporation
Software Availability: Aug-2018

Benchmark results graph

Hardware

CPU Name: Intel Xeon Gold 6148
CPU Characteristics:
CPU MHz: 2400
CPU MHz Maximum: 3700
FPU: Integrated
CPU(s) enabled: 40 cores, 2 chips, 20 cores/chip, 2 threads/core
CPU(s) orderable: 1,2 chips
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 1 MB I+D on chip per core
L3 Cache: 28160 KB I+D on chip per chip
Other Cache: None
Memory: 384 GB (12 x 32 GB 2Rx8 PC4-2666V-R)
Disk Subsystem: 512GB Samsung 960 PRO M.2 PCIe 3.0 x4 NVMe Solid State Drive
Other Hardware: None

Accelerator

Accel Model Name: Tesla V100
Accel Vendor: NVIDIA Corporation
Accel Name: Tesla V100-PCIE-16GB
Type of Accel: GPU
Accel Connection: PCIe
Does Accel Use ECC: Yes
Accel Description: See notes
Accel Driver: NVIDIA UNIX x86_64 Kernel Module 390.46

Software

Operating System: CentOS Linux release 7.4.1708 (Core) 3.10.0-693.17.1.el7.x86_64
Compiler: PGI Professional Edition, Release 18.7 LLVM
File System: xfs
System State: Run level 3 (multi-user)
Other Software: None

Results Table

               Base                                               Peak
Benchmark      Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio   Seconds  Ratio
303.ostencil   9.23     15.7    9.16     15.8    9.16     15.8    9.23     15.7    9.16     15.8    9.16     15.8
304.olbm       38.6     11.8    38.6     11.8    38.5     11.8    38.6     11.8    38.6     11.8    38.5     11.8
314.omriq      41.7     22.9    41.7     22.9    41.9     22.8    41.7     22.9    41.7     22.9    41.9     22.8
350.md         11.3     22.4    11.2     22.5    11.3     22.3    11.3     22.4    11.2     22.5    11.3     22.3
351.palm       141      2.63    141      2.63    140      2.64    141      2.63    141      2.63    140      2.64
352.ep         53.2     9.96    53.4     9.93    53.2     9.97    53.2     9.96    53.4     9.93    53.2     9.97
353.clvrleaf   35.6     12.5    35.9     12.4    35.6     12.5    35.6     12.5    35.9     12.4    35.6     12.5
354.cg         34.3     11.9    34.0     12.0    33.8     12.1    34.3     11.9    34.0     12.0    33.8     12.1
355.seismic    27.9     13.3    27.7     13.3    27.8     13.3    27.9     13.3    27.7     13.3    27.8     13.3
356.sp         22.0     12.5    22.1     12.5    21.9     12.6    22.0     12.5    22.1     12.5    21.9     12.6
357.csp        19.3     14.0    19.2     14.0    19.3     14.0    19.3     14.0    19.2     14.0    19.3     14.0
359.miniGhost  39.1     9.43    38.9     9.48    39.5     9.33    39.1     9.43    38.9     9.48    39.5     9.33
360.ilbdc      30.5     12.0    30.5     12.0    30.5     12.0    30.5     12.0    30.5     12.0    30.5     12.0
363.swim       74.6     3.08    74.7     3.08    74.8     3.08    74.6     3.08    74.7     3.08    74.8     3.08
370.bt         8.92     25.0    8.79     25.4    8.78     25.4    8.92     25.0    8.79     25.4    8.78     25.4

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
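
As a cross-check, the overall base metric can be recomputed from the per-benchmark ratios. The sketch below assumes the overall SPECaccel_acc_base value is the geometric mean of each benchmark's median ratio, which is how SPEC describes its overall metrics. The names base_ratios and overall_metric are illustrative and not part of the SPEC tools. The inputs are the rounded ratios printed in the results table, so the last digit may differ slightly from SPEC's own calculation.

```python
import math
import statistics

# Base ratios for the three runs of each benchmark, copied from the results table.
base_ratios = {
    "303.ostencil": (15.7, 15.8, 15.8),
    "304.olbm": (11.8, 11.8, 11.8),
    "314.omriq": (22.9, 22.9, 22.8),
    "350.md": (22.4, 22.5, 22.3),
    "351.palm": (2.63, 2.63, 2.64),
    "352.ep": (9.96, 9.93, 9.97),
    "353.clvrleaf": (12.5, 12.4, 12.5),
    "354.cg": (11.9, 12.0, 12.1),
    "355.seismic": (13.3, 13.3, 13.3),
    "356.sp": (12.5, 12.5, 12.6),
    "357.csp": (14.0, 14.0, 14.0),
    "359.miniGhost": (9.43, 9.48, 9.33),
    "360.ilbdc": (12.0, 12.0, 12.0),
    "363.swim": (3.08, 3.08, 3.08),
    "370.bt": (25.0, 25.4, 25.4),
}


def overall_metric(ratios_by_benchmark):
    """Geometric mean of the median ratio of each benchmark (assumed SPEC rule)."""
    medians = [statistics.median(runs) for runs in ratios_by_benchmark.values()]
    log_sum = sum(math.log(m) for m in medians)
    return math.exp(log_sum / len(medians))


score = overall_metric(base_ratios)
print(f"Recomputed SPECaccel_acc_base: {score:.3g}")
```

To three significant digits the recomputed value agrees with the reported SPECaccel_acc_base of 11.5. Because every benchmark is marked basepeak = yes, the same calculation applies to the peak metric.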

Submit Notes

The config file option 'submit' was used.
Submit command: numactl -C 1 -m 0 $command
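
The submit line wraps each benchmark invocation in numactl: -C 1 restricts execution to CPU 1 and -m 0 restricts memory allocation to NUMA node 0. The sketch below only illustrates how the $command placeholder expands, using Python's string.Template. It is not how the SPEC tools perform the substitution, and the benchmark command line shown is hypothetical.

```python
import shlex
from string import Template

# Submit template taken from the submit notes.
SUBMIT = Template("numactl -C 1 -m 0 $command")


def expand_submit(command):
    """Return the argv list after substituting $command into the submit template."""
    return shlex.split(SUBMIT.substitute(command=command))


# Hypothetical benchmark command line, for illustration only.
argv = expand_submit("./benchmark_exe input.dat")
print(argv)
```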

Operating System Notes

 Stacksize set to 'unlimited'

Platform Notes

 Sysinfo program /local/home/aglobus/spec-accel/Docs/sysinfo
 $Rev: 6965 $ $Date:: 2015-04-21 #$ c05a7f14b1b1765e3fe1df68447e8a35
 running on perf-sky2.pgi.net Wed Jul 25 20:12:07 2018

 This section contains SUT (System Under Test) info as seen by
 some common utilities.  To remove or add to this section, see:
   http://www.spec.org/accel/Docs/config.html#sysinfo

 From /proc/cpuinfo
    model name : Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
       2 "physical id"s (chips)
       80 "processors"
    cores, siblings (Caution: counting these is hw and system dependent.  The
    following excerpts from /proc/cpuinfo might not be reliable.  Use with
    caution.)
       cpu cores : 20
       siblings  : 40
       physical 0: cores 0 1 2 3 4 8 9 10 11 12 16 17 18 19 20 24 25 26 27 28
       physical 1: cores 0 1 2 3 4 8 9 10 11 12 16 17 18 19 20 24 25 26 27 28
    cache size : 28160 KB
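
The counts above can be reconciled with the "CPU(s) enabled" line in the hardware table by simple arithmetic. The sketch below uses only the values printed in this section. The variable names are illustrative, and it assumes "siblings" counts logical processors per chip, which is how /proc/cpuinfo reports it.

```python
# Values reported by /proc/cpuinfo in this section.
chips = 2                 # "physical id"s
cores_per_chip = 20       # "cpu cores"
siblings_per_chip = 40    # "siblings": logical processors per chip
processors = 80           # "processors"

# Core IDs listed for physical 0; the IDs are not contiguous but there are 20 of them.
physical0 = "0 1 2 3 4 8 9 10 11 12 16 17 18 19 20 24 25 26 27 28"
core_ids = [int(x) for x in physical0.split()]

threads_per_core = siblings_per_chip // cores_per_chip
total_cores = chips * cores_per_chip
logical_processors = total_cores * threads_per_core

print(f"{total_cores} cores, {chips} chips, {cores_per_chip} cores/chip, "
      f"{threads_per_core} threads/core, {logical_processors} logical processors")
```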

 From /proc/meminfo
    MemTotal:       394873648 kB
    HugePages_Total:      20
    Hugepagesize:       2048 kB
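
For comparison with the 384 GB listed in the hardware table, the meminfo values can be converted to larger units. The sketch below assumes that the "kB" unit in /proc/meminfo means 1024 bytes and that each 32 GB DIMM holds 32 GiB. The gap between installed and reported memory is typically memory reserved by firmware and the kernel, which this report does not itemize.

```python
# Values from /proc/meminfo and the hardware table.
mem_total_kib = 394873648      # MemTotal, in units of 1024 bytes
hugepages_total = 20           # HugePages_Total
hugepage_size_kib = 2048       # Hugepagesize
installed_gib = 12 * 32        # 12 x 32 GB DIMMs

mem_total_gib = mem_total_kib / 1024**2
hugepages_mib = hugepages_total * hugepage_size_kib // 1024
unreported_gib = installed_gib - mem_total_gib

print(f"MemTotal: {mem_total_gib:.2f} GiB of {installed_gib} GiB installed")
print(f"Not reported in MemTotal: {unreported_gib:.2f} GiB")
print(f"Huge page pool: {hugepages_mib} MiB")
```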

 /usr/bin/lsb_release -d
    CentOS Linux release 7.4.1708 (Core)

 From /etc/*release* /etc/*version*
    centos-release: CentOS Linux release 7.4.1708 (Core)
    centos-release-upstream: Derived from Red Hat Enterprise Linux 7.4 (Source)
    os-release:
       NAME="CentOS Linux"
       VERSION="7 (Core)"
       ID="centos"
       ID_LIKE="rhel fedora"
       VERSION_ID="7"
       PRETTY_NAME="CentOS Linux 7 (Core)"
       ANSI_COLOR="0;31"
       CPE_NAME="cpe:/o:centos:centos:7"
    redhat-release: CentOS Linux release 7.4.1708 (Core)
    system-release: CentOS Linux release 7.4.1708 (Core)
    system-release-cpe: cpe:/o:centos:centos:7

 uname -a:
    Linux perf-sky2.pgi.net 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58
    UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

 run-level 3 Mar 29 15:36

 SPEC is set to: /local/home/aglobus/spec-accel
    Filesystem                   Type  Size  Used Avail Use% Mounted on
    /dev/mapper/centos_sky2-root xfs   472G   60G  413G  13% /
 Additional information from dmidecode:

    Warning: Use caution when you interpret this section. The 'dmidecode' program
    reads system data which is "intended to allow hardware to be accurately
    determined", but the intent may not be met, as there are frequent changes to
    hardware, firmware, and the "DMTF SMBIOS" standard.


 (End of data from sysinfo program)
 Information from pgaccelinfo
 CUDA Driver Version:           9010
 NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  390.46
 Device Number:                 0
 Device Name:                   Tesla V100-PCIE-16GB
 Device Revision Number:        7.0
 Global Memory Size:            16945512448
 Number of Multiprocessors:     80
 Concurrent Copy and Execution: Yes
 Total Constant Memory:         65536
 Total Shared Memory per Block: 49152
 Registers per Block:           65536
 Warp Size:                     32
 Maximum Threads per Block:     1024
 Maximum Block Dimensions:      1024, 1024, 64
 Maximum Grid Dimensions:       2147483647 x 65535 x 65535
 Maximum Memory Pitch:          2147483647B
 Texture Alignment:             512B
 Clock Rate:                    1380 MHz
 Execution Timeout:             No
 Integrated Device:             No
 Can Map Host Memory:           Yes
 Compute Mode:                  default
 Concurrent Kernels:            Yes
 ECC Enabled:                   Yes
 Memory Clock Rate:             877 MHz
 Memory Bus Width:              4096 bits
 L2 Cache Size:                 6291456 bytes
 Max Threads Per SMP:           2048
 Async Engines:                 7
 Unified Addressing:            Yes
 Managed Memory:                Yes
 Concurrent Managed Memory:     Yes
 Preemption Supported:          Yes
 Cooperative Launch:            Yes
   Multi-Device:                Yes
 PGI Default Target:            -ta=tesla:cc70
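
Several of the pgaccelinfo fields are easier to read after a unit conversion. The sketch below derives a few such figures from the values listed above. It assumes the CUDA driver version is encoded as 1000 * major + 10 * minor, and that the device memory transfers data on both clock edges (double data rate). The latter is not stated in this report, so the bandwidth is a theoretical estimate and not a measured result.

```python
# Values copied from the pgaccelinfo output.
cuda_driver_version = 9010
global_memory_bytes = 16945512448
multiprocessors = 80
max_threads_per_sm = 2048
memory_clock_mhz = 877
memory_bus_bits = 4096

# CUDA encodes the driver version as 1000 * major + 10 * minor.
driver_major = cuda_driver_version // 1000
driver_minor = (cuda_driver_version % 1000) // 10

global_memory_gib = global_memory_bytes / 2**30
max_resident_threads = multiprocessors * max_threads_per_sm

# Assumption: double data rate memory, hence the factor of 2.
peak_bandwidth_gbs = memory_clock_mhz * 1e6 * 2 * memory_bus_bits / 8 / 1e9

print(f"CUDA driver: {driver_major}.{driver_minor}")
print(f"Global memory: {global_memory_gib:.2f} GiB")
print(f"Max resident threads: {max_resident_threads}")
print(f"Theoretical memory bandwidth: {peak_bandwidth_gbs:.0f} GB/s")
```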

General Notes

Yes: The test sponsor attests, as of date of publication, that CVE-2017-5754 (Meltdown)
is mitigated in the system as tested and documented.
Yes: The test sponsor attests, as of date of publication, that CVE-2017-5753 (Spectre variant 1)
is mitigated in the system as tested and documented.
Yes: The test sponsor attests, as of date of publication, that CVE-2017-5715 (Spectre variant 2)
is mitigated in the system as tested and documented.

Base Compiler Invocation

C benchmarks:

 pgcc 

Fortran benchmarks:

 pgfortran 

Benchmarks using both Fortran and C:

 pgcc   pgfortran 

Base Optimization Flags

C benchmarks:

 -Mllvm   -V18.7   -fast   -Mfprelaxed   -Mnouniform   -acc   -ta=tesla:cc70 

Fortran benchmarks:

 -Mllvm   -V18.7   -fast   -Mfprelaxed   -Mnouniform   -acc   -ta=tesla:cc70 

Benchmarks using both Fortran and C:

353.clvrleaf:  -Mllvm   -V18.7   -fast   -Mfprelaxed   -Mnouniform   -acc   -ta=tesla:cc70 
359.miniGhost:  -Mllvm   -V18.7   -fast   -Mfprelaxed   -Mnouniform   -acc   -ta=tesla:cc70   -Mnomain 

Peak Optimization Flags

C benchmarks:

303.ostencil:  basepeak = yes 
304.olbm:  basepeak = yes 
314.omriq:  basepeak = yes 
352.ep:  basepeak = yes 
354.cg:  basepeak = yes 
357.csp:  basepeak = yes 
370.bt:  basepeak = yes 

Fortran benchmarks:

350.md:  basepeak = yes 
351.palm:  basepeak = yes 
355.seismic:  basepeak = yes 
356.sp:  basepeak = yes 
360.ilbdc:  basepeak = yes 
363.swim:  basepeak = yes 

Benchmarks using both Fortran and C:

353.clvrleaf:  basepeak = yes 
359.miniGhost:  basepeak = yes 

The flags files that were used to format this result can be browsed at
https://www.spec.org/accel/flags/PGI-Platform-Multicore-OMP.html,
https://www.spec.org/accel/flags/pgi2018_flags.html.

You can also download the XML flags sources by saving the following links:
https://www.spec.org/accel/flags/PGI-Platform-Multicore-OMP.xml,
https://www.spec.org/accel/flags/pgi2018_flags.xml.