OS Images |
os_Image_1(1)
|
Hardware Description |
hw_1
|
Number of Systems |
1
|
SW Environment |
non-virtual
|
Tuning |
BIOS Settings:
- SMT Control set to Auto
- IOMMU set to Enabled
- NUMA nodes per socket set to NPS4
- Determinism Control set to Manual
- Determinism Slider set to Power
- cTDP Control set to Manual
- cTDP set to 240
- Package Power Limit Control set to Manual
- Package Power Limit set to 240
- Enforce POR set to Accept
- Overclock set to Enabled
- Memory Clock Speed set to 1467MHz
- L1 Stream HW Prefetcher set to Disable
- L2 Stream HW Prefetcher set to Disable
|
Notes |
notes
|
|
JVM Instances |
jvm_Ctr_1(1), jvm_Backend_1(8), jvm_TxInjector_1(8)
|
OS Image Description |
os_1
|
Tuning |
- cpupower -c all frequency-set -g performance
- tuned-adm profile throughput-performance
- echo 10000 > /proc/sys/kernel/sched_cfs_bandwidth_slice_us
- echo 0 > /proc/sys/kernel/sched_child_runs_first
- echo 16000000 > /proc/sys/kernel/sched_latency_ns
- echo 1000 > /proc/sys/kernel/sched_migration_cost_ns
- echo 28000000 > /proc/sys/kernel/sched_min_granularity_ns
- echo 9 > /proc/sys/kernel/sched_nr_migrate
- echo 100 > /proc/sys/kernel/sched_rr_timeslice_ms
- echo 1000000 > /proc/sys/kernel/sched_rt_period_us
- echo 990000 > /proc/sys/kernel/sched_rt_runtime_us
- echo 0 > /proc/sys/kernel/sched_schedstats
- echo 1 > /proc/sys/kernel/sched_tunable_scaling
- echo 50000000 > /proc/sys/kernel/sched_wakeup_granularity_ns
- echo 3000 > /proc/sys/vm/dirty_expire_centisecs
- echo 500 > /proc/sys/vm/dirty_writeback_centisecs
- echo 40 > /proc/sys/vm/dirty_ratio
- echo 10 > /proc/sys/vm/dirty_background_ratio
- echo 10 > /proc/sys/vm/swappiness
- echo 0 > /proc/sys/kernel/numa_balancing
- echo always > /sys/kernel/mm/transparent_hugepage/defrag
- echo always > /sys/kernel/mm/transparent_hugepage/enabled
- Add cgroup_disable=memory,cpu,cpuacct,blkio,hugetlb,pids,cpuset,perf_event,freezer,devices,net_cls,net_prio to GRUB_CMDLINE_LINUX_DEFAULT
- ulimit -n 1024000
- UserTasksMax=970000
- DefaultTasksMax=970000
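The numeric `/proc/sys` values above can be checked after a reboot with a small read-only helper. This is a minimal sketch, not part of the reported configuration: the function name `check_tunable` is hypothetical, and it only compares plain single-value files, so it does not suit the `transparent_hugepage` entries, which list all choices and bracket the active one.

```shell
#!/bin/sh
# check_tunable FILE EXPECTED (hypothetical helper, not from the report):
# compare the current content of a tunable file with the reported value.
check_tunable() {
    file=$1
    expected=$2
    # A tunable that is absent or unreadable on the running kernel is skipped.
    if [ ! -r "$file" ]; then
        echo "SKIP $file (not readable on this kernel)"
        return 0
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file = $actual"
    else
        echo "DIFF $file = $actual (expected $expected)"
        return 1
    fi
}
```

For example, `check_tunable /proc/sys/vm/swappiness 10` or `check_tunable /proc/sys/kernel/numa_balancing 0`. The helper never writes to the file, so it can be run without changing the system under test.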
|
Notes |
None
|
Parts of Benchmark |
Controller
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms2g -Xmx2g -Xmn1536m -XX:+UseParallelOldGC -XX:CICompilerCount=2 -XX:ParallelGCThreads=2
|
Tuning |
None
|
Notes |
Used numactl to interleave memory across all NUMA nodes
|
Parts of Benchmark |
Backend
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms32736M -Xmx32736M -Xmn31736M -XX:-UseDynamicNumberOfGCThreads -XX:CICompilerCount=32 -XX:AllocatePrefetchInstr=2 -XX:+UseParallelOldGC -XX:ParallelGCThreads=32 -XX:LargePageSizeInBytes=2m -XX:-UseAdaptiveSizePolicy -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseLargePages -XX:TLABAllocationWeight=55 -XX:ThreadStackSize=512 -XX:SurvivorRatio=31 -XX:TargetSurvivorRatio=95 -XX:InlineSmallCode=10k -XX:MaxGCPauseMillis=300 -XX:LoopUnrollLimit=100 -XX:+UseTransparentHugePages
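
As a rough illustration of what these heap sizes imply, the arithmetic below combines the flags on this command line with the count of 8 Backend JVMs listed under JVM Instances. The variable names are illustrative only, and the figures cover the Java heap alone, not other JVM memory.

```shell
#!/bin/sh
# Values taken from the Backend command line and the JVM instance count.
heap_mb=32736    # -Xms / -Xmx
young_mb=31736   # -Xmn
backends=8       # jvm_Backend_1(8)

# Heap left for the old generation in each Backend JVM.
old_mb=$((heap_mb - young_mb))
# Combined Java heap across all Backend JVMs.
total_mb=$((heap_mb * backends))

echo "old generation per Backend JVM: ${old_mb} MB"
echo "combined Backend heap: ${total_mb} MB"
```

This prints 1000 MB for the per-JVM old generation and 261888 MB for the combined Backend heap.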
|
Tuning |
None
|
Notes |
Used numactl to affinitize each Backend JVM to one NUMA node:
- numactl --cpunodebind=0 --membind=0
- numactl --cpunodebind=1 --membind=1
- numactl --cpunodebind=2 --membind=2
- numactl --cpunodebind=3 --membind=3
- numactl --cpunodebind=4 --membind=4
- numactl --cpunodebind=5 --membind=5
- numactl --cpunodebind=6 --membind=6
- numactl --cpunodebind=7 --membind=7
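
The eight bindings above follow one pattern: JVM number N is bound to the CPUs and memory of node N. A minimal sketch that generates the same prefixes is shown here. The function names `numa_prefix` and `print_bindings` are hypothetical, and because the report does not show the full launch command, the sketch prints only the numactl prefix.

```shell
#!/bin/sh
# numa_prefix NODE (hypothetical helper): print the numactl prefix that
# binds a process to the CPUs and memory of one NUMA node.
numa_prefix() {
    echo "numactl --cpunodebind=$1 --membind=$1"
}

# print_bindings COUNT (hypothetical helper): print one prefix per node,
# for nodes 0 through COUNT-1.
print_bindings() {
    node=0
    while [ "$node" -lt "$1" ]; do
        numa_prefix "$node"
        node=$((node + 1))
    done
}
```

Calling `print_bindings 8` prints the eight prefixes listed above, one per line.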
|
Parts of Benchmark |
TxInjector
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms2g -Xmx2g -Xmn1536m -XX:+UseParallelOldGC -XX:CICompilerCount=2 -XX:ParallelGCThreads=2
|
Tuning |
None
|
Notes |
Used numactl to affinitize each Transaction Injector JVM to one NUMA node:
- numactl --cpunodebind=0 --membind=0
- numactl --cpunodebind=1 --membind=1
- numactl --cpunodebind=2 --membind=2
- numactl --cpunodebind=3 --membind=3
- numactl --cpunodebind=4 --membind=4
- numactl --cpunodebind=5 --membind=5
- numactl --cpunodebind=6 --membind=6
- numactl --cpunodebind=7 --membind=7
|
|