SPEC® CFP2006 Result

Copyright 2006-2017 Standard Performance Evaluation Corporation

Sugon

Sugon A320-G30 (AMD EPYC 7401P)

CPU2006 license: 9046
Test date: Dec-2017
Test sponsor: Sugon
Hardware Availability: Dec-2017
Tested by: Sugon
Software Availability: Oct-2017
Benchmark results graph
Hardware
CPU Name: AMD EPYC 7401P
CPU Characteristics: AMD Turbo CORE technology up to 3.00 GHz
CPU MHz: 2000
FPU: Integrated
CPU(s) enabled: 24 cores, 1 chip, 24 cores/chip, 2 threads/core
CPU(s) orderable: 1 chip
Primary Cache: 64 KB I + 32 KB D on chip per core
Secondary Cache: 512 KB I+D on chip per core
L3 Cache: 64 MB I+D on chip per chip, 8 MB shared / 3 cores
Other Cache: None
Memory: 256 GB (8 x 32 GB 2Rx4 PC4-2667V-R, running at 2400)
Disk Subsystem: 1 x 800 GB SATA, SSD
Other Hardware: None

Software

Operating System: SUSE Linux Enterprise Server 12 SP3, Kernel 4.4.73-5-default
Compiler: C/C++/Fortran: Version 4.5.2.1 of x86 Open64 Compiler Suite (from AMD)
Auto Parallel: No
File System: ext4
System State: Run level 3 (Multi User)
Base Pointers: 64-bit
Peak Pointers: 32/64-bit
Other Software: None

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark     Base                                               Peak
              Copies Seconds Ratio Seconds Ratio Seconds Ratio   Copies Seconds Ratio Seconds Ratio Seconds Ratio
410.bwaves        48    1057   617    1057   617    1058   616       24     501   651     500   652     501   652
416.gamess        48    1109   847    1113   844    1115   843       48     988   951     988   952     988   951
433.milc          48     846   521     847   520     847   520       24     341   646     341   646     341   646
434.zeusmp        48     408  1070     406  1070     417  1050       48     401  1090     398  1100     398  1100
435.gromacs       48     436   786     424   808     415   826       48     305  1120     304  1130     303  1130
436.cactusADM     48     517  1110     515  1110     518  1110       24     244  1180     239  1200     242  1190
437.leslie3d      48     999   452    1001   451    1000   451       24     370   609     370   610     370   609
444.namd          48     514   749     515   748     513   750       48     433   890     432   890     432   891
447.dealII        48     366  1500     359  1530     359  1530       48     335  1640     337  1630     336  1630
450.soplex        48     829   483     827   484     827   484       24     372   538     371   539     372   538
453.povray        48     242  1060     242  1050     243  1050       48     196  1300     194  1320     193  1320
454.calculix      48     340  1170     341  1160     342  1160       48     364  1090     363  1090     364  1090
459.GemsFDTD      48    1245   409    1241   410    1243   410       24     577   442     576   442     579   440
465.tonto         48     531   889     530   890     530   891       24     255   925     254   931     253   932
470.lbm           48     762   866     762   865     767   860       24     357   925     358   920     356   927
481.wrf           48     707   758     708   757     709   756       24     350   766     350   765     351   765
482.sphinx3       48    1445   647    1447   647    1447   646       24     521   898     521   898     521   898
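
SPEC CPU2006 rate metrics are the geometric mean of the per-benchmark median ratios. The overall score is not shown in this section, so the following is only a minimal Python sketch of that aggregation, applied to the base ratios from the table above. The name `rate_metric` is a hypothetical helper, and because the table ratios are already rounded, the result is an approximation rather than the officially reported value.

```python
from math import exp, log
from statistics import median

# Base ratios (three runs per benchmark), copied from the results table.
BASE_RATIOS = {
    "410.bwaves": (617, 617, 616),
    "416.gamess": (847, 844, 843),
    "433.milc": (521, 520, 520),
    "434.zeusmp": (1070, 1070, 1050),
    "435.gromacs": (786, 808, 826),
    "436.cactusADM": (1110, 1110, 1110),
    "437.leslie3d": (452, 451, 451),
    "444.namd": (749, 748, 750),
    "447.dealII": (1500, 1530, 1530),
    "450.soplex": (483, 484, 484),
    "453.povray": (1060, 1050, 1050),
    "454.calculix": (1170, 1160, 1160),
    "459.GemsFDTD": (409, 410, 410),
    "465.tonto": (889, 890, 891),
    "470.lbm": (866, 865, 860),
    "481.wrf": (758, 757, 756),
    "482.sphinx3": (647, 647, 646),
}


def rate_metric(ratios_by_benchmark):
    """Geometric mean of each benchmark's median ratio (hypothetical helper)."""
    medians = [median(runs) for runs in ratios_by_benchmark.values()]
    # Average the logarithms, then exponentiate, to get the geometric mean.
    return exp(sum(log(m) for m in medians) / len(medians))


print(f"Approximate base rate metric: {rate_metric(BASE_RATIOS):.0f}")
```

The same function can be applied to a dictionary of peak ratios built the same way.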

Submit Notes

The config file option 'submit' was used.
'numactl' was used to bind copies to the cores.
See the configuration file for details.

Operating System Notes

'ulimit -s unlimited' was used to set environment stack size
'ulimit -l 2097152' was used to set environment locked pages in memory limit

runspec command invoked through numactl i.e.:
numactl --interleave=all runspec <etc>

Set dirty_ratio=8 to limit dirty cache to 8% of memory
Set swappiness=1 to swap only if necessary
Set zone_reclaim_mode=1 to free local node memory and avoid remote memory
sync then drop_caches=3 to reset caches before invoking runspec
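
The three persistent settings above correspond to standard Linux `vm.*` sysctl keys. The notes do not say how they were applied, so the following is only a sketch, assuming sysctl-style `key = value` lines. `render_sysctl` is a hypothetical helper that formats the lines and changes nothing on the system.

```python
# Values taken from the notes above; the sysctl-file format is an assumption.
VM_SETTINGS = {
    "vm.dirty_ratio": 8,        # limit dirty page cache to 8% of memory
    "vm.swappiness": 1,         # swap only if necessary
    "vm.zone_reclaim_mode": 1,  # prefer reclaiming local node memory
}


def render_sysctl(settings):
    """Format settings as sysctl.conf-style lines (hypothetical helper)."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


print(render_sysctl(VM_SETTINGS))
```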

Transparent huge pages were enabled for this run (OS default)

Set vm/nr_hugepages=43008 in /etc/sysctl.conf
mount -t hugetlbfs nodev /mnt/hugepages
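
For a sense of scale, the size of the reserved huge page pool can be worked out from the page count. This is a rough sketch under two assumptions: a 2 MiB huge page size (not stated in these notes, though it is the usual x86-64 default and matches the `2m` values in the -HP flags), and the 256 GB of installed memory treated as 256 GiB.

```python
HUGE_PAGE_MIB = 2       # assumed huge page size in MiB (not stated in the notes)
NR_HUGEPAGES = 43008    # vm/nr_hugepages value from the notes above
INSTALLED_GIB = 256     # installed memory from the hardware section, taken as GiB


def hugepage_pool_gib(pages, page_mib=HUGE_PAGE_MIB):
    """Size in GiB of a pool of `pages` huge pages of `page_mib` MiB each."""
    return pages * page_mib / 1024


pool = hugepage_pool_gib(NR_HUGEPAGES)
print(f"{pool:.0f} GiB reserved, {pool / INSTALLED_GIB:.1%} of installed memory")
```

Under these assumptions the pool comes to about a third of installed memory.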

Platform Notes

BIOS settings:
Determinism Slider = Power
cTDP Control = Manual
cTDP = 200

General Notes

Environment variables set by runspec before the start of the run:
HUGETLB_LIMIT = "896"
LD_LIBRARY_PATH = "/home/cpu2006/amd1603-rate-libs-revB/32:/home/cpu2006/amd1603-rate-libs-revB/64"

The binaries were built with the AMD supported x86 Open64 Compiler Suite,
which is only available from AMD at
http://developer.amd.com/tools-and-sdks/cpu-development/x86-open64-compiler-suite/
Binaries were compiled on a system with 2 x AMD Opteron 6378 chips + 128 GB Memory using RHEL 6.3

Base Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Base Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
447.dealII:  -DSPEC_CPU_LP64 
450.soplex:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 
482.sphinx3:  -DSPEC_CPU_LP64 

Base Optimization Flags

C benchmarks:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

C++ benchmarks:

 -Ofast   -static   -CG:load_exe=0   -OPT:malloc_alg=1   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -D__OPEN64_FAST_SET   -march=bdver2   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Fortran benchmarks:

 -Ofast   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256   -HP:bd=2m:heap=2m   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs 

Benchmarks using both Fortran and C:

 -Ofast   -OPT:malloc_alg=1   -HP:bd=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -mso   -march=bdver1   -mno-fma4   -mno-xop   -mno-tbm   -WB,   -Wl,   -z,muldefs   -LNO:blocking=off   -LNO:simd_peel_align=on   -OPT:rsqrt=2   -OPT:unroll_size=256 

Peak Compiler Invocation

C benchmarks:

 opencc 

C++ benchmarks:

 openCC 

Fortran benchmarks:

 openf95 

Benchmarks using both Fortran and C:

 opencc   openf95 

Peak Portability Flags

410.bwaves:  -DSPEC_CPU_LP64 
416.gamess:  -DSPEC_CPU_LP64 
433.milc:  -DSPEC_CPU_LP64 
434.zeusmp:  -DSPEC_CPU_LP64 
435.gromacs:  -DSPEC_CPU_LP64 
436.cactusADM:  -DSPEC_CPU_LP64   -fno-second-underscore 
437.leslie3d:  -DSPEC_CPU_LP64 
444.namd:  -DSPEC_CPU_LP64 
453.povray:  -DSPEC_CPU_LP64 
454.calculix:  -DSPEC_CPU_LP64 
459.GemsFDTD:  -DSPEC_CPU_LP64 
465.tonto:  -DSPEC_CPU_LP64 
470.lbm:  -DSPEC_CPU_LP64 
481.wrf:  -DSPEC_CPU_LINUX   -DSPEC_CPU_CASE_FLAG   -DSPEC_CPU_LP64   -fno-second-underscore 

Peak Optimization Flags

C benchmarks:

433.milc:  -Ofast   -CG:movnti=1   -CG:locs_best=on   -HP:bdt=2m:heap=2m   -IPA:plimit=7000   -IPA:callee_limit=1200   -OPT:struct_array_copy=2   -OPT:alias=field_sensitive   -mso   -march=bdver1   -mno-fma4 
470.lbm:  -Ofast   -CG:cmp_peep=on   -OPT:keep_ext=on   -HP:bdt=2m:heap=2m   -IPA:plimit=8000   -IPA:small_pu=100   -march=bdver1   -mno-fma4   -mso 
482.sphinx3:  -Ofast   -m32   -IPA:plimit=1000   -OPT:malloc_alg=2   -CG:cmp_peep=on   -CG:p2align=0   -CG:load_exe=1   -CG:dsched=on   -INLINE:aggressive=on   -LNO:prefetch=2   -LNO:prefetch_ahead=4   -mso   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

C++ benchmarks:

444.namd:  -Ofast   -IPA:plimit=3000   -LNO:ignore_feedback=off   -CG:local_sched_alg=0   -CG:load_exe=0   -OPT:unroll_size=256   -fno-exceptions   -HP:bdt=2m:heap=2m   -LNO:if_select_conv=1   -OPT:alias=disjoint   -LNO:psimd_iso_unroll=ON   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm 
447.dealII:  -Ofast   -D__OPEN64_FAST_SET   -static   -INLINE:aggressive=on   -LNO:opt=1   -LNO:simd=2   -fno-emit-exceptions   -m32   -OPT:unroll_times_max=8   -OPT:unroll_size=256   -OPT:unroll_level=2   -HP:bdt=2m:heap=2m   -GRA:unspill=on   -CG:cmp_peep=on   -CG:movext_icmp=off   -TENV:frame_pointer=off   -march=bdver1   -mno-fma4 
450.soplex:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -O3   -LNO:ignore_feedback=off   -INLINE:aggressive=on   -OPT:RO=1   -OPT:IEEE_arith=3   -OPT:IEEE_NaN_Inf=off   -OPT:fold_unsigned_relops=on   -fno-exceptions   -CG:p2align=0   -m32   -mno-fma4   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1 
453.povray:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -CG:pre_local_sched=off   -CG:p2align=0   -CG:p2align_split=on   -CG:dsched=on   -INLINE:aggressive=on   -HP:bd=2m:heap=2m   -OPT:transform=2   -OPT:alias=disjoint   -WOPT:aggcm=0   -march=bdver2   -mno-fma4   -WB,   -mno-xop   -mno-tbm   -Wl,   -z,muldefs 

Fortran benchmarks:

410.bwaves:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -OPT:Ofast   -OPT:treeheight=on   -LNO:blocking=off   -LNO:ignore_feedback=off   -LNO:fu=4   -LNO:loop_model_simd=on   -LNO:simd_rm_unity_remainder=on   -WOPT:aggstr=0   -HP:bdt=2m:heap=2m   -CG:cmp_peep=on   -march=bdver2   -mno-fma4 
416.gamess:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:fu=6   -LNO:blocking=0   -LNO:simd=2   -OPT:ro=3   -OPT:recip=on   -CG:local_sched_alg=1   -HP:bdt=2m:heap=2m   -WOPT:sib=on   -march=bdver1   -mno-fma4 
434.zeusmp:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:interchange=off   -IPA:plimit=1500   -HP:bdt=2m:heap=2m   -march=bdver2   -mno-fma4 
437.leslie3d:  -Ofast   -CG:pre_minreg_level=2   -LNO:simd=0   -LNO:fusion=2   -HP:bdt=2m:heap=2m   -mso   -march=bdver1   -mno-fma4 
459.GemsFDTD:  -Ofast   -IPA:plimit=1500   -OPT:unroll_size=1024   -OPT:unroll_times_max=16   -LNO:fission=2   -CG:local_sched_alg=2   -HP   -march=bdver1   -mno-fma4 
465.tonto:  -Ofast   -OPT:alias=no_f90_pointer_alias   -LNO:blocking=off   -CG:load_exe=1   -CG:local_sched_alg=3   -IPA:plimit=525   -HP:bdt=2m:heap=2m   -march=bdver2   -WB,   -mno-fma4   -mno-tbm   -mno-xop 

Benchmarks using both Fortran and C:

435.gromacs:  -Ofast   -OPT:rsqrt=2   -HP:bdt=2m:heap=2m   -CG:local_sched_alg=2   -CG:load_exe=3   -GRA:unspill=on   -march=bdver2   -mno-fma4   -LNO:simd=3 
436.cactusADM:  -fb_create fbdata(pass 1)   -fb_opt fbdata(pass 2)   -Ofast   -LNO:blocking=off   -LNO:prefetch=2   -LNO:pf2=0   -LNO:prefetch_ahead=4   -HP   -CG:locs_shallow_depth=1   -CG:load_exe=0   -CG:dsched=on   -WOPT:sib=on   -march=bdver2   -mno-fma4 
454.calculix:  -Ofast   -OPT:unroll_size=256   -OPT:alias=disjoint   -GRA:optimize_boundary=on   -CG:dsched=on   -HP:bdt=2m:heap=2m   -march=bdver1   -mno-fma4 
481.wrf:  -Ofast   -LNO:blocking=off   -LANG:copyinout=off   -IPA:callee_limit=5000   -GRA:prioritize_by_density=on   -HP   -WOPT:sib=on   -march=bdver1   -mno-fma4 

The flags files that were used to format this result can be browsed at
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.html,
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.html.

You can also download the XML flags sources by saving the following links:
http://www.spec.org/cpu2006/flags/x86-openflags-rate-revA-I.xml,
http://www.spec.org/cpu2006/flags/Sugon-Naples-Platform-Settings-revC-I.xml.