SPEC CPU2017 Platform Settings for Supermicro Systems

Operating System Tuning Parameters

kernel.randomize_va_space (ASLR)
This setting can be used to select the type of process address space randomization. Defaults differ based on whether the architecture supports ASLR, whether the kernel was built with the CONFIG_COMPAT_BRK option or not, or the kernel boot options used.
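Possible settings:
  0: Turn process address space randomization off.
  1: Randomize the addresses of the mmap base, stack, and VDSO pages.
  2: Additionally randomize the heap.
Disabling ASLR can make process execution more deterministic and runtimes more consistent. For more information see the randomize_va_space entry in the Linux sysctl documentation.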
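A minimal sketch of inspecting this setting, assuming a Linux system with procfs mounted and the values 0, 1, and 2 as described in the kernel sysctl documentation. The helper name aslr_describe is hypothetical and exists only for illustration. The write requires root, so it is left as a comment.

```sh
# Map a randomize_va_space value to a short description.
# aslr_describe is a hypothetical helper name used only in this sketch.
aslr_describe() {
    case "$1" in
        0) echo "ASLR disabled" ;;
        1) echo "randomize stack, mmap base, and VDSO" ;;
        2) echo "randomize stack, mmap base, VDSO, and heap" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Report the current value if the procfs entry is readable.
f=/proc/sys/kernel/randomize_va_space
if [ -r "$f" ]; then
    aslr_describe "$(cat "$f")"
fi

# To disable ASLR for a benchmark run (requires root):
#   sysctl -w kernel.randomize_va_space=0
```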
Transparent Hugepages (THP)
THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. It is designed to hide much of the complexity of using huge pages from system administrators and developers. Huge pages increase the memory page size from 4 kilobytes to 2 megabytes. This provides significant performance advantages on systems with highly contended resources and large memory workloads. If memory utilization is too high, or memory is so badly fragmented that huge pages cannot be allocated, the kernel assigns smaller 4k pages instead. Most recent Linux OS releases have THP enabled by default.
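THP usage is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/enabled. Possible values:
  never: Entirely disable THP usage.
  madvise: Enable THP usage only inside regions marked MADV_HUGEPAGE using madvise(2).
  always: Enable THP usage system-wide.
THP creation is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/defrag. Possible values:
  never: If no THP are available to satisfy a request, do not attempt to make any.
  defer: An allocation requesting THP when none are available gets normal pages, and THP creation is requested in the background.
  defer+madvise: Acts like "always" for regions marked MADV_HUGEPAGE and like "defer" for all other regions.
  madvise: Acts like "always", but only for regions marked MADV_HUGEPAGE.
  always: An allocation requesting THP when none are available stalls until some are made.
An application that always requests THP can often benefit from waiting for an allocation until those huge pages can be assembled.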
For more information see the Linux transparent hugepage documentation.
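Both sysfs files list every choice on one line and mark the active one with square brackets, for example "always [madvise] never". The following is a minimal sketch of reading the active choice. The helper name thp_active is hypothetical, and the writes require root, so they are shown as comments only.

```sh
# Extract the active choice from a sysfs line such as "always [madvise] never".
# thp_active is a hypothetical helper name used only in this sketch.
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active value of each knob if it is readable on this system.
for knob in enabled defrag; do
    path=/sys/kernel/mm/transparent_hugepage/$knob
    if [ -r "$path" ]; then
        echo "$knob: $(thp_active "$(cat "$path")")"
    fi
done

# To change a setting (requires root):
#   echo always > /sys/kernel/mm/transparent_hugepage/enabled
#   echo always > /sys/kernel/mm/transparent_hugepage/defrag
```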
dirty_ratio
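This is the percentage of total available memory that can be filled with dirty data (modified pages not yet written back) before a process generating writes must itself start writing the modifications to disk. It can be set through a command like "sysctl -w vm.dirty_ratio=8".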
swappiness
This control defines how aggressively the kernel swaps memory pages. Increasing the value causes more frequent swapping. The default value is 60. A value of 1 tells the kernel to swap processes to disk only if absolutely necessary. This can be set through a command like "sysctl -w vm.swappiness=1".
zone_reclaim_mode
zone_reclaim_mode selects a more or less aggressive approach to reclaiming memory when a zone runs out of memory. It controls whether memory reclaim is performed on the local NUMA node or whether allocations fall back to other nodes. To tell the kernel to free local node memory rather than grabbing free memory from remote nodes, it can be set through a command like "sysctl -w vm.zone_reclaim_mode=1".
drop_caches
Writing to this setting causes the kernel to drop clean caches, as well as reclaimable slab objects like dentries and inodes. Once dropped, their memory becomes free. Set through "sysctl -w vm.drop_caches=3" to free slab objects and pagecache.
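The four vm settings above can be sketched together as a dry run. The helper name vm_tuning_cmds is hypothetical, and the values are simply the ones quoted in the text, not a recommendation. Because drop_caches only frees clean objects, the sketch runs sync first so that more memory can be released.

```sh
# Print the sysctl commands for the vm settings described above.
# vm_tuning_cmds is a hypothetical helper; values are those quoted in the text.
vm_tuning_cmds() {
    for kv in vm.dirty_ratio=8 vm.swappiness=1 vm.zone_reclaim_mode=1; do
        echo "sysctl -w $kv"
    done
    # Flush dirty pages first: drop_caches only releases clean objects.
    echo "sync"
    echo "sysctl -w vm.drop_caches=3"
}

# Show the current values (read-only, no root needed).
for name in dirty_ratio swappiness zone_reclaim_mode; do
    if [ -r "/proc/sys/vm/$name" ]; then
        echo "vm.$name = $(cat "/proc/sys/vm/$name")"
    fi
done

# Dry run: print the commands without applying them.
vm_tuning_cmds
```

To apply the settings, the printed commands can be run as root, for example with "vm_tuning_cmds | sh".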
CPUFreq scaling governor:

Governors are in-kernel, pre-configured power schemes for the CPU that allow the clock speed of the CPUs to be changed on the fly. On Linux systems, the governor can be set for all CPUs through the cpupower utility with a command like "cpupower frequency-set -g (governor_name)".

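Below are the governors in the Linux kernel:
  performance: Run the CPU at the maximum frequency.
  powersave: Run the CPU at the minimum frequency.
  userspace: Run the CPU at a user-specified frequency.
  ondemand: Scale the frequency dynamically according to current load, jumping to the highest frequency under load and backing off as idle time increases.
  conservative: Scale the frequency dynamically according to current load, but more gradually than ondemand.
  schedutil: Scheduler-driven CPU frequency selection.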
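A minimal sketch of checking which governor each CPU is using, assuming the cpufreq sysfs interface is present. The helper name governor_summary is hypothetical. The cpupower call changes system state and requires root, so it is shown as a comment.

```sh
# Summarize governors across CPUs as "<count> <governor>" lines.
# Reads standard input, one governor name per line.
# governor_summary is a hypothetical helper name used only in this sketch.
governor_summary() {
    sort | uniq -c | awk '{print $1, $2}'
}

# Expand the per-CPU sysfs files; if cpufreq is absent the pattern stays
# unexpanded and the readability check below skips the report.
set -- /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
if [ -r "$1" ]; then
    cat "$@" | governor_summary
fi

# To select a governor on all CPUs (requires root and the cpupower utility):
#   cpupower frequency-set -g performance
```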

tuned-adm:

A command-line interface for switching between the tuning profiles available in supported Linux distributions. The distribution-provided profiles are located in /usr/lib/tuned and the user-defined profiles in /etc/tuned. To set a profile, one can issue the command "tuned-adm profile (profile_name)". Below are details about some relevant profiles.
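A minimal sketch of checking for a profile before switching to it, assuming the two profile directories named above. The helper name profile_exists is hypothetical, and its optional directory arguments exist only so that it can be tried without tuned installed. The profile name in the final comment is only an example.

```sh
# Succeed if a profile directory exists in either search location.
# $1 is the profile name; $2 and $3 optionally override the directories.
# profile_exists is a hypothetical helper name used only in this sketch.
profile_exists() {
    for dir in "${2:-/usr/lib/tuned}" "${3:-/etc/tuned}"; do
        if [ -d "$dir/$1" ]; then
            return 0
        fi
    done
    return 1
}

# Show the profile currently in effect, if tuned-adm is installed.
if command -v tuned-adm >/dev/null 2>&1; then
    tuned-adm active || true
fi

# To switch profiles (requires root), for example:
#   tuned-adm profile throughput-performance
```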


Firmware / BIOS / Microcode Settings

Determinism Control:
This BIOS option selects how AGESA determinism is controlled. AGESA is an acronym for "AMD Generic Encapsulated Software Architecture." AGESA is a bootstrap protocol by which system devices on AMD64-architecture mainboards are initialized. It is responsible for the initialization of the processor cores, memory, and the HyperTransport controller. Available settings are:
Determinism Enable:
This BIOS option enables or disables AGESA determinism to control performance. AGESA is an acronym for "AMD Generic Encapsulated Software Architecture." AGESA is a bootstrap protocol by which system devices on AMD64-architecture mainboards are initialized. It is responsible for the initialization of the processor cores, memory, and the HyperTransport controller. "Performance determinism" tells the processor to run in a consistent manner, which makes benchmark and performance-test results repeatable. The processor runs at the best performance achievable with little deviation, allowing repeatable runs. Available settings are:
cTDP Control:
This BIOS option is for "Configurable TDP (cTDP)". It allows the user to set a customized value for TDP. Available settings are:
cTDP:
TDP is an acronym for “Thermal Design Power.” TDP is the recommended target for power used when designing the cooling capacity for a server. EPYC processors are able to control this target power consumption within certain limits. This capability is referred to as “configurable TDP” or "cTDP." cTDP can be used to reduce power consumption for greater efficiency, or in some cases, increase power consumption above the default value to provide additional performance. cTDP is controlled using a BIOS option.

The default EPYC cTDP value corresponds with the microprocessor’s nominal TDP. For the EPYC 9354, the default value is 280W. The default cTDP value is set at a good balance between performance and energy efficiency. The EPYC 9354 cTDP can be reduced as low as 240W, which will minimize the power consumption for the processor under load, but at the expense of peak performance. Increasing the EPYC 9354 cTDP to 300W will maximize peak performance by allowing the CPU to maintain higher dynamic clock speeds, but will make the microprocessor less energy efficient. Note that at maximum cTDP, the CPU thermal solution must be capable of dissipating at least 300W or the EPYC 9354 processor might engage in thermal throttling under load.

The available cTDP ranges for each EPYC model are in the table below:
Model         Nominal TDP   Minimum cTDP   Maximum cTDP*
EPYC 9754     360W          320W           400W
EPYC 9734     340W          320W           400W
EPYC 9684X    400W          320W           400W
EPYC 9654     360W          320W           400W
EPYC 9654P    360W          320W           400W
EPYC 9554     360W          320W           400W
EPYC 9554P    360W          320W           400W
EPYC 9374F    320W          320W           400W
EPYC 9354     280W          240W           300W
EPYC 9354P    280W          240W           300W
EPYC 9224     200W          200W           240W
EPYC 9174F    320W          320W           400W
EPYC 9124     200W          200W           240W
* cTDP must remain below the thermal solution design parameters or thermal throttling could be frequently encountered.
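The table can be read as a range check on a requested cTDP value. The following is a minimal sketch of that check, with the minimum and maximum values copied from the table. The helper names ctdp_range and ctdp_ok are hypothetical, and the actual setting is made in the BIOS, not from the shell.

```sh
# Print "<min> <max>" cTDP in watts for an EPYC model number from the table.
# ctdp_range and ctdp_ok are hypothetical helper names used only in this sketch.
ctdp_range() {
    case "$1" in
        9754|9734|9684X|9654|9654P|9554|9554P|9374F|9174F) echo "320 400" ;;
        9354|9354P) echo "240 300" ;;
        9224|9124)  echo "200 240" ;;
        *) return 1 ;;   # model not in the table
    esac
}

# Succeed only if the requested wattage ($2) is within the range for model $1.
ctdp_ok() {
    range=$(ctdp_range "$1") || return 1
    min=${range% *}
    max=${range#* }
    [ "$2" -ge "$min" ] && [ "$2" -le "$max" ]
}

ctdp_ok 9354 300 && echo "EPYC 9354 at 300W: within range"
ctdp_ok 9354 320 || echo "EPYC 9354 at 320W: out of range"
```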
IOMMU:
The I/O Memory Management Unit (IOMMU) extends the AMD64 system architecture by adding support for address translation and system memory access protection on DMA transfers from peripheral devices. IOMMU also helps filter and remap interrupts from peripheral devices. Available settings are:
Package Power Limit Control:
This is a per-processor Package Power Limit (PPT) value that applies to all populated processors in the system. It can be set to limit the PPT to a certain value. Available settings are:
Package Power Limit:
Sets a custom processor Package Power Limit (PPT) value to be used on all populated processors in the system. For example, a setting of 240 applies a 240W PPT. The PPT is used as the ASIC power limit.
APBDIS:
APBDIS disables IO Boost on the uncore. For users who need to block the uncore optimizations that affect base core clock speed, this option provides a way to disable that behavior. It locks the fabric clock to the non-boosted speed. Available settings are:
NUMA Nodes Per Socket:
Specifies the number of desired NUMA nodes per socket. This option allows the user to divide the memory that each socket has into a certain number of NUMA memory nodes for optimal memory bandwidth. Available settings are:
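After changing this option, the resulting layout can be checked from the OS. The following is a minimal sketch that counts the NUMA node directories exposed in sysfs. The helper name count_numa_nodes is hypothetical, and its optional argument exists only so that the function can be pointed at another directory.

```sh
# Count NUMA nodes by listing node directories under a sysfs-style root.
# $1 optionally overrides the root, which defaults to the standard location.
# count_numa_nodes is a hypothetical helper name used only in this sketch.
count_numa_nodes() {
    ls -d "${1:-/sys/devices/system/node}"/node[0-9]* 2>/dev/null | wc -l | tr -d ' '
}

echo "NUMA nodes visible to the OS: $(count_numa_nodes)"

# numactl --hardware (if installed) also shows per-node memory and distances.
```

As a rough expectation, a two-socket system set to four NUMA nodes per socket would show 8 nodes, assuming no other option (such as treating each L3 cache as a NUMA domain) adds further nodes.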
SMT Control:
This option enables or disables the logical processor cores (simultaneous multithreading, SMT) on the processor. Enabling SMT can improve overall performance for most workloads. Some floating-point or HPC workloads may achieve higher performance with SMT disabled. Available settings are:
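Whether the BIOS setting took effect can be confirmed from Linux. The following is a minimal sketch, assuming a kernel that exposes the SMT state under /sys/devices/system/cpu/smt. The helper name smt_describe is hypothetical.

```sh
# Translate the contents of the smt/active file into a readable message.
# smt_describe is a hypothetical helper name used only in this sketch.
smt_describe() {
    case "$1" in
        1) echo "SMT active" ;;
        0) echo "SMT inactive" ;;
        *) echo "SMT state unknown" ;;
    esac
}

# Report the state if the kernel exposes it.
f=/sys/devices/system/cpu/smt/active
if [ -r "$f" ]; then
    smt_describe "$(cat "$f")"
fi
```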
ACPI SRAT L3 cache As NUMA Domain:
Controls generation of distance information in the ACPI System Locality Information Table (SLIT) and NUMA proximity domains in the System Resource Affinity Table (SRAT). Enabling this feature can increase performance for workloads that are NUMA aware and optimized. Available settings are:
TSME:
This option enables or disables Transparent Secure Memory Encryption (TSME). Enabling TSME can improve security by encrypting the data in memory. Disable it for lower memory latency. Available settings are:
SEV Control:
This option enables or disables Secure Encrypted Virtualization (SEV). SEV is an extension of SME that effectively enables per-virtual-machine SME. In other words, SEV enables running encrypted virtual machines in which the code and data of the VM are private to the VM and may only be decrypted within the VM itself. Available settings are: