                           SPECaccel(R)2023 Result

                 Supermicro Tesla A100 PCIe 80GB 120GQ-TNRT
                      Test Sponsor: NVIDIA Corporation

accel2023 License: 9045                          Test date: Oct-2023
Test sponsor: NVIDIA Corporation     Hardware availability: Mar-2023
Tested by:    NVIDIA Corporation     Software availability: Nov-2023

               Base   Base   Base      Base         Peak   Peak   Peak      Peak
Benchmarks     Model  Ref.   Run Time  Ratio        Model  Ref.   Run Time  Ratio
-------------- ------ ------ --------- ---------    ------ ------ --------- ---------
403.stencil    LOP       440       243      1.81 S  LOP       440       243      1.81 S
403.stencil    LOP       440       244      1.81 S  LOP       440       244      1.81 S
403.stencil    LOP       440       243      1.81 *  LOP       440       243      1.81 *
404.lbm        LOP       455       136      3.34 *  LOP       455       136      3.34 *
404.lbm        LOP       455       136      3.34 S  LOP       455       136      3.34 S
404.lbm        LOP       455       136      3.35 S  LOP       455       136      3.35 S
450.md         LOP       600       451      1.33 *  LOP       600       451      1.33 *
450.md         LOP       600       452      1.33 S  LOP       600       452      1.33 S
450.md         LOP       600       451      1.33 S  LOP       600       451      1.33 S
452.ep         LOP       415       221      1.88 *  LOP       415       221      1.88 *
452.ep         LOP       415       221      1.88 S  LOP       415       221      1.88 S
452.ep         LOP       415       221      1.87 S  LOP       415       221      1.87 S
453.clvrleaf   LOP      1000       190      5.25 S  LOP      1000       190      5.25 S
453.clvrleaf   LOP      1000       191      5.24 S  LOP      1000       191      5.24 S
453.clvrleaf   LOP      1000       191      5.24 *  LOP      1000       191      5.24 *
455.seismic    LOP       780       243      3.21 S  LOP       780       243      3.21 S
455.seismic    LOP       780       244      3.19 *  LOP       780       244      3.19 *
455.seismic    LOP       780       245      3.18 S  LOP       780       245      3.18 S
456.spF        LOP       475       141      3.36 S  LOP       475       141      3.36 S
456.spF        LOP       475       141      3.36 *  LOP       475       141      3.36 *
456.spF        LOP       475       144      3.29 S  LOP       475       144      3.29 S
457.spC        LOP       540       224      2.41 S  LOP       540       224      2.41 S
457.spC        LOP       540       223      2.42 S  LOP       540       223      2.42 S
457.spC        LOP       540       223      2.42 *  LOP       540       223      2.42 *
459.miniGhost  LOP       590       336      1.75 S  LOP       590       336      1.75 S
459.miniGhost  LOP       590       337      1.75 *  LOP       590       337      1.75 *
459.miniGhost  LOP       590       339      1.74 S  LOP       590       339      1.74 S
460.ilbdc      LOP       555       235      2.36 S  LOP       555       235      2.36 S
460.ilbdc      LOP       555       235      2.36 *  LOP       555       235      2.36 *
460.ilbdc      LOP       555       235      2.36 S  LOP       555       235      2.36 S
463.swim       LOP       440       213      2.06 *  LOP       440       213      2.06 *
463.swim       LOP       440       213      2.07 S  LOP       440       213      2.07 S
463.swim       LOP       440       224      1.96 S  LOP       440       224      1.96 S
470.bt         LOP      1055       218      4.83 S  LOP      1055       218      4.83 S
470.bt         LOP      1055       218      4.83 *  LOP      1055       218      4.83 *
470.bt         LOP      1055       218      4.83 S  LOP      1055       218      4.83 S
=======================================================================================
403.stencil    LOP       440       243      1.81 *  LOP       440       243      1.81 *
404.lbm        LOP       455       136      3.34 *  LOP       455       136      3.34 *
450.md         LOP       600       451      1.33 *  LOP       600       451      1.33 *
452.ep         LOP       415       221      1.88 *  LOP       415       221      1.88 *
453.clvrleaf   LOP      1000       191      5.24 *  LOP      1000       191      5.24 *
455.seismic    LOP       780       244      3.19 *  LOP       780       244      3.19 *
456.spF        LOP       475       141      3.36 *  LOP       475       141      3.36 *
457.spC        LOP       540       223      2.42 *  LOP       540       223      2.42 *
459.miniGhost  LOP       590       337      1.75 *  LOP       590       337      1.75 *
460.ilbdc      LOP       555       235      2.36 *  LOP       555       235      2.36 *
463.swim       LOP       440       213      2.06 *  LOP       440       213      2.06 *
470.bt         LOP      1055       218      4.83 *  LOP      1055       218      4.83 *
 SPECaccel 2023_base         2.57
 SPECaccel 2023_peak         2.57

HARDWARE
--------
            CPU Name: Intel Xeon Gold 6338
            Max MHz.: 3200
             Nominal: 2000
             Enabled: 64 cores, 2 chips, 2 threads/core
           Orderable: 2 chips
            Cache L1: 32 KB I + 48 KB D on chip per core
                  L2: 1280 KB I+D on chip per core
                  L3: 48 MB I+D on chip per chip
               Other: None
              Memory: 512 GB (16x 32GB, PC3200 CL3 DDR4)
             Storage: 1TB NVMe
               Other: None
    Base Threads Run: 1
   Min. Peak Threads: 1
   Max. Peak Threads: 1

ACCELERATOR
-----------
    Accel Model Name: A100 PCIe 80GB
        Accel Vendor: NVIDIA
          Accel Name: Tesla A100 PCIe 80GB
       Type of Accel: GPU
    Accel Connection: PCIe 4.0 16x
  Does Accel Use ECC: Yes
   Accel Description: See Notes
        Accel Driver: NVIDIA UNIX x86_64 Kernel Module 525.60.13

SOFTWARE
--------
                  OS: Ubuntu 20.04.6 LTS
                      5.4.0-153-generic
            Compiler: C/Fortran: Version 23.11 of NVHPC SDK
            Firmware: 1.1b 11/01/2021
         File System: ext4
        System State: Run level 3 (multi-user)
               Other: None
 Base Parallel Model: LOP
    Base Threads Run: 1
Peak Parallel Models: LOP
   Max. Peak Threads: 1
   Min. Peak Threads: 1

Operating System Notes
----------------------
 Shell stacksize set to unlimited via "limit stacksize unlimited"

Platform Notes
--------------
 Information from nvaccelinfo
 CUDA Driver Version:           12000
 NVRM version:                  NVIDIA UNIX x86_64 Kernel Module 525.60.13 Wed Nov 30 06:39:21 UTC 2022

 Device Number:                 0
 Device Name:                   NVIDIA A100 80GB PCIe
 Device Revision Number:        8.0
 Global Memory Size:            85024112640
 Number of Multiprocessors:     108
 Concurrent Copy and Execution: Yes
 Total Constant Memory:         65536
 Total Shared Memory per Block: 49152
 Registers per Block:           65536
 Warp Size:                     32
 Maximum Threads per Block:     1024
 Maximum Block Dimensions:      1024, 1024, 64
 Maximum Grid Dimensions:       2147483647 x 65535 x 65535
 Maximum Memory Pitch:          2147483647B
 Texture Alignment:             512B
 Clock Rate:                    1410 MHz
 Execution Timeout:             No
 Integrated Device:             No
 Can Map Host Memory:           Yes
 Compute Mode:                  default
 Concurrent Kernels:            Yes
 ECC Enabled:                   Yes
 Memory Clock Rate:             1512 MHz
 Memory Bus Width:              5120 bits
 L2 Cache Size:                 41943040 bytes
 Max Threads Per SMP:           2048
 Async Engines:                 3
 Unified Addressing:            Yes
 Managed Memory:                Yes
 Concurrent Managed Memory:     Yes
 Preemption Supported:          Yes
 Cooperative Launch:            Yes
 Default Target:                cc80

 Sysinfo program /local/home/mcolgrove/ACCELV2/bin/sysinfo
 Rev: r6622 of 2021-04-07 b1a7d5f8f71be5aff70a755cad7211a0
 running on ice2 Tue Oct 24 10:32:15 2023

 SUT (System Under Test) info as seen by some common utilities.
 For more information on this section, see
    https://www.spec.org/cpu2017/Docs/config.html#sysinfo

 From /proc/cpuinfo
    model name : Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
       2  "physical id"s (chips)
       128 "processors"
    cores, siblings (Caution: counting these is hw and system dependent. The following
    excerpts from /proc/cpuinfo might not be reliable. Use with caution.)
       cpu cores : 32
       siblings  : 64
       physical 0: cores 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
       25 26 27 28 29 30 31
       physical 1: cores 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
       25 26 27 28 29 30 31

 From lscpu from util-linux 2.34:
    Architecture:                    x86_64
    CPU op-mode(s):                  32-bit, 64-bit
    Byte Order:                      Little Endian
    Address sizes:                   46 bits physical, 57 bits virtual
    CPU(s):                          128
    On-line CPU(s) list:             0-127
    Thread(s) per core:              2
    Core(s) per socket:              32
    Socket(s):                       2
    NUMA node(s):                    2
    Vendor ID:                       GenuineIntel
    CPU family:                      6
    Model:                           106
    Model name:                      Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
    Stepping:                        6
    Frequency boost:                 enabled
    CPU MHz:                         1774.338
    CPU max MHz:                     3200.0000
    CPU min MHz:                     800.0000
    BogoMIPS:                        4000.00
    Virtualization:                  VT-x
    L1d cache:                       3 MiB
    L1i cache:                       2 MiB
    L2 cache:                        80 MiB
    L3 cache:                        96 MiB
    NUMA node0 CPU(s):               0-31,64-95
    NUMA node1 CPU(s):               32-63,96-127
    Vulnerability Itlb multihit:     Not affected
    Vulnerability L1tf:              Not affected
    Vulnerability Mds:               Not affected
    Vulnerability Meltdown:          Not affected
    Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT vulnerable
    Vulnerability Retbleed:          Not affected
    Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via
    prctl and seccomp
    Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user
    pointer sanitization
    Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB
    filling, PBRSB-eIBRS SW sequence
    Vulnerability Srbds:             Not affected
    Vulnerability Tsx async abort:   Not affected
    Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
    pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
    pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
    nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
    ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt
    tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
    epb cat_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi
    flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
    cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt
    avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc
    cqm_occup_llc cqm_mbm_total cqm_mbm_local wbnoinvd dtherm ida arat pln pts
    avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni
    avx512_bitalg tme avx512_vpopcntdq rdpid md_clear pconfig flush_l1d
    arch_capabilities

 From lscpu --cache:
    NAME ONE-SIZE ALL-SIZE WAYS TYPE        LEVEL
    L1d       48K       3M   12 Data            1
    L1i       32K       2M    8 Instruction     1
    L2       1.3M      80M   20 Unified         2
    L3        48M      96M   12 Unified         3

 /proc/cpuinfo cache data
    cache size : 49152 KB

 From numactl --hardware
 WARNING: a numactl 'node' might or might not correspond to a physical chip.
   available: 2 nodes (0-1)
   node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
   28 29 30 31 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
   89 90 91 92 93 94 95
   node 0 size: 257622 MB
   node 0 free: 209175 MB
   node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
   56 57 58 59 60 61 62 63 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
   112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
   node 1 size: 257990 MB
   node 1 free: 244915 MB
   node distances:
   node   0   1
     0:  10  20
     1:  20  10

 From /proc/meminfo
    MemTotal:       527987596 kB
    HugePages_Total:       0
    Hugepagesize:       2048 kB

 /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor has
    performance

 /usr/bin/lsb_release -d
    Ubuntu 20.04.6 LTS

 From /etc/*release* /etc/*version*
    debian_version: bullseye/sid
    os-release:
       NAME="Ubuntu"
       VERSION="20.04.6 LTS (Focal Fossa)"
       ID=ubuntu
       ID_LIKE=debian
       PRETTY_NAME="Ubuntu 20.04.6 LTS"
       VERSION_ID="20.04"
       HOME_URL="https://www.ubuntu.com/"
       SUPPORT_URL="https://help.ubuntu.com/"

 uname -a:
    Linux ice2 5.4.0-153-generic #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023 x86_64
    x86_64 x86_64 GNU/Linux

 Kernel self-reported vulnerability status:

 CVE-2018-12207 (iTLB Multihit):                        Not affected
 CVE-2018-3620 (L1 Terminal Fault):                     Not affected
 Microarchitectural Data Sampling:                      Not affected
 CVE-2017-5754 (Meltdown):                              Not affected
 mmio_stale_data:                                       Mitigation: Clear CPU buffers;
                                                        SMT vulnerable
 retbleed:                                              Not affected
 CVE-2018-3639 (Speculative Store Bypass):              Mitigation: Speculative Store
                                                        Bypass disabled via prctl and
                                                        seccomp
 CVE-2017-5753 (Spectre variant 1):                     Mitigation: usercopy/swapgs
                                                        barriers and __user pointer
                                                        sanitization
 CVE-2017-5715 (Spectre variant 2):                     Mitigation: Enhanced IBRS, IBPB:
                                                        conditional, RSB filling,
                                                        PBRSB-eIBRS: SW sequence
 CVE-2020-0543 (Special Register Buffer Data Sampling): Not affected
 CVE-2019-11135 (TSX Asynchronous Abort):               Not affected

 run-level 3 Aug 10 17:47

 SPEC is set to: /local/home/mcolgrove/ACCELV2
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/nvme1n1p1 ext4  916G   24G  846G   3% /local

 From /sys/devices/virtual/dmi/id
     Vendor:         Supermicro
     Product:        SYS-120GQ-TNRT
     Product Family: SMC X12

 Cannot run dmidecode; consider saying (as root)
    chmod +s /usr/sbin/dmidecode

 BIOS:
    BIOS Vendor:       American Megatrends International, LLC.
    BIOS Version:      1.1b
    BIOS Date:         11/01/2021

 (End of data from sysinfo program)

Compiler Version Notes
----------------------
==============================================================================
 C       | 457.spC(base)
------------------------------------------------------------------------------
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/crt1.o: in function `_start':
(.text+0x24): undefined reference to `main'
pgacclnk: child process exit status 1: /usr/bin/ld

nvc 23.11-0 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 C       | 403.stencil(base) 404.lbm(base) 452.ep(base) 470.bt(base)
------------------------------------------------------------------------------
nvc 23.11-0 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 Fortran | 450.md(base) 455.seismic(base) 456.spF(base) 460.ilbdc(base)
         | 463.swim(base)
------------------------------------------------------------------------------
nvfortran 23.11-0 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

==============================================================================
 Fortran, C | 453.clvrleaf(base) 459.miniGhost(base)
------------------------------------------------------------------------------
nvfortran 23.11-0 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
nvc 23.11-0 64-bit target on x86-64 Linux -tp icelake-server
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
------------------------------------------------------------------------------

Base Compiler Invocation
------------------------
C benchmarks:
     nvc

Fortran benchmarks:
     nvfortran

Benchmarks using both Fortran and C:
     nvfortran nvc

Base Portability Flags
----------------------
 403.stencil: -DSPEC_NO_NOTHING
     457.spC: -mcmodel=medium -Wl,--no-relax

Base Optimization Flags
-----------------------
C benchmarks:
     -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

Fortran benchmarks:
     -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

Benchmarks using both Fortran and C:

  453.clvrleaf: -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

 459.miniGhost: -Mnomain -Ofast -mp=gpu -Mfprelaxed -Mstack_arrays -static-nvidia

Peak Optimization Flags
-----------------------
C benchmarks:

   403.stencil: basepeak = yes

       404.lbm: basepeak = yes

        452.ep: basepeak = yes

       457.spC: basepeak = yes

        470.bt: basepeak = yes

Fortran benchmarks:

        450.md: basepeak = yes

   455.seismic: basepeak = yes

       456.spF: basepeak = yes

     460.ilbdc: basepeak = yes

      463.swim: basepeak = yes

Benchmarks using both Fortran and C:

  453.clvrleaf: basepeak = yes

 459.miniGhost: basepeak = yes

The flags file that was used to format this result can be browsed at
http://www.spec.org/accel2023/flags/nv2023_flags_v2.html

You can also download the XML flags source by saving the following link:
http://www.spec.org/accel2023/flags/nv2023_flags_v2.xml

SPECaccel is a registered trademark of the Standard Performance Evaluation
Corporation.  All other brand and product names appearing in this result are
trademarks or registered trademarks of their respective holders.
-------------------------------------------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact info@spec.org.
Copyright 2023 Standard Performance Evaluation Corporation
Tested with SPECaccel2023 v2.0.17 on 2023-10-24 13:32:15-0400.
Report generated on 2023-12-06 13:07:33 by accel2023 ASCII formatter v112.
Originally published on 2023-11-08.
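
Reader's note on the composite scores: the sketch below cross-checks the
SPECaccel 2023_base figure against the per-benchmark rows of the results
table. It assumes, as in other SPEC suites, that the run marked "*" is the
median of the three runs and that the composite is the geometric mean of the
selected ratios. The names RESULTS and geometric_mean are illustrative only
and are not part of the SPEC tools. Run times in the report are printed as
whole seconds, so a ratio recomputed from them can differ from the printed
ratio in the last digit; the composite is therefore taken from the printed
ratios.

```python
import math
from statistics import median

# name: (reference time in s, three measured run times in s,
#        ratio printed for the selected "*" run), copied from the results table.
RESULTS = {
    "403.stencil":   (440,  [243, 244, 243], 1.81),
    "404.lbm":       (455,  [136, 136, 136], 3.34),
    "450.md":        (600,  [451, 452, 451], 1.33),
    "452.ep":        (415,  [221, 221, 221], 1.88),
    "453.clvrleaf":  (1000, [190, 191, 191], 5.24),
    "455.seismic":   (780,  [243, 244, 245], 3.19),
    "456.spF":       (475,  [141, 141, 144], 3.36),
    "457.spC":       (540,  [224, 223, 223], 2.42),
    "459.miniGhost": (590,  [336, 337, 339], 1.75),
    "460.ilbdc":     (555,  [235, 235, 235], 2.36),
    "463.swim":      (440,  [213, 213, 224], 2.06),
    "470.bt":        (1055, [218, 218, 218], 4.83),
}


def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: with three runs, the selected run is the one with the median time.
selected_times = {name: median(times) for name, (_, times, _) in RESULTS.items()}

# Ratio = reference time / run time, recomputed from the rounded run times.
recomputed_ratios = {
    name: ref / selected_times[name] for name, (ref, _, _) in RESULTS.items()
}

# Composite score from the ratios exactly as printed in the report.
reported_ratios = [ratio for (_, _, ratio) in RESULTS.values()]
score = geometric_mean(reported_ratios)

print(f"SPECaccel 2023_base (recomputed): {score:.2f}")
```

Because every benchmark is run with basepeak = yes, the peak ratios are
identical to the base ratios, and the same calculation applies to
SPECaccel 2023_peak.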