The SPECjbb2015 Benchmark - Known Issues (October 4, 2019)

This is a place where SPEC has collected descriptions of, and solutions to, installation, build, and runtime problems encountered by people using the SPECjbb2015 benchmark. If your issue is not among the known issues, please bring it to the attention of SPECjbb2015 Support via e-mail to support@spec.org with "SPECjbb" in the subject line.

The reported default value for specjbb.forkjoin.workers is different from what I've previously seen

Response time spikes in Response-Throughput (RT) graph in the HTML report

In some cases, benchmark metric "max-jOPS" run-to-run variability could be high

  • Metric "max-jOPS" is determined during RT graph building. RT graph building starts at the 0% step and increases the IR (Injection Rate) in increments of 1% of HBIR (High Bound Injection Rate). Each step is observed for settling (3 sec minimum and 30 sec maximum) and then for 60 sec of steady state, and is evaluated against a passing criterion. The first RT step that fails the passing criterion is called First Failure, and the successful IR of the RT step just before it is reported as "max-jOPS". Since each RT step is evaluated for 60 sec, a very long GC pause can cause First Failure to occur well before full system capacity is reached. In this case, the user will see the red max-jOPS line in the RT graph much earlier than the usual end of the graph. Even after First Failure, the benchmark keeps testing RT step levels until three consecutive RT steps fail, to show the user more clearly where the RT step failures are happening. The user can also look at the IR/PR accuracy graph at the end of the HTML report to see the passing criterion details. Under this evaluation criterion, variability in the duration of long GC pauses and/or in their temporal location in the RT graph results in run-to-run variability of max-jOPS. On most systems we tested, run-to-run variability was very small.
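    The selection rule described above can be sketched in shell. This is only an illustration of the rule as stated here, not the benchmark's own code: the function name rt_summary and the input format (one P or F per RT step, in order of increasing IR) are assumptions made for the example.

```shell
#!/bin/sh
# Illustration only: the function name and the P/F input format are assumptions.
# Each argument is the outcome of one RT step, in order of increasing IR:
#   P = the step met the passing criterion, F = the step failed it.
# Prints two numbers:
#   1. the index of the step just before First Failure, i.e. the step whose
#      IR would be reported as max-jOPS (-1 if step 0 itself fails);
#   2. the index of the last step tested, given that testing ends after
#      three consecutive failures.
rt_summary() {
    step=0
    max_step=""
    consecutive_failures=0
    last_tested=-1
    for outcome in "$@"; do
        last_tested=$step
        if [ "$outcome" = "F" ]; then
            # First Failure fixes max-jOPS at the previous step.
            if [ -z "$max_step" ]; then
                max_step=$((step - 1))
            fi
            consecutive_failures=$((consecutive_failures + 1))
            if [ "$consecutive_failures" -ge 3 ]; then
                break
            fi
        else
            consecutive_failures=0
        fi
        step=$((step + 1))
    done
    # No failure seen: the last tested step is the best passing one.
    if [ -z "$max_step" ]; then
        max_step=$last_tested
    fi
    echo "$max_step $last_tested"
}
```

    For example, "rt_summary P P P F P F F F P" reports step 2 as the max-jOPS step even though step 4 passed, which is the early red line effect that a single long GC pause can produce.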

Benchmark metric "critical-jOPS" run-to-run variability could be high

  • Metric "critical-jOPS" is calculated from the 99th percentile of response time at every RT step level until full system capacity ("max-jOPS") is reached. Because the criterion is the 99th percentile of response time, critical-jOPS is very sensitive to GC pauses. On most systems tested with an optimized configuration, critical-jOPS has very small run-to-run variability. In any configuration where the durations and temporal locations of long GC pauses are random, critical-jOPS may show more run-to-run variability. In particular, systems running SUSE Linux exhibited very high run-to-run variability.
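    To see why a 99th-percentile criterion reacts so strongly to a few slow transactions, consider the following sketch. It uses the nearest-rank definition of a percentile, which is an assumption made for illustration (the benchmark's exact percentile method is not described here), and it does not compute critical-jOPS itself. The function name p99 is hypothetical.

```shell
#!/bin/sh
# Illustration only: nearest-rank percentile, chosen for simplicity.
# Prints the 99th-percentile value of the numbers given as arguments,
# e.g. response times in milliseconds.
p99() {
    count=$#
    [ "$count" -gt 0 ] || return 1
    # Nearest rank = ceil(0.99 * count), computed in integer arithmetic.
    rank=$(( (99 * count + 99) / 100 ))
    printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}
```

    With 100 samples, one 900 ms outlier among 5 ms responses leaves the result at 5, but two such outliers move it to 900. A pause that lands in one RT step in one run and in a different step in the next run can therefore change the per-step percentiles noticeably.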

In rare cases, benchmark metric max-jOPS > 100% HBIR

  • The initial phase of the benchmark determines a rough approximation of full system capacity called HBIR (High Bound Injection Rate). On most systems tested, max-jOPS occurs at around 80-90% of HBIR. In rare cases, max-jOPS can exceed 100% of HBIR.

Scaling of >16 groups inside a single OS image

  • In testing, the benchmark scales very well when running a large number of groups across multiple OS images. Inside a single OS image, scaling is reasonable up to 16 groups. With more than 16 groups, scaling of max-jOPS and critical-jOPS is poor due to a network-resource-related bottleneck inside a single OS image. Once a more precise cause is identified, this document will be updated accordingly.

No connections among SPECjbb2015-Distributed instances running across OS images when firewall enabled

  • SPECjbb2015-Distributed instances running across OS images may not be able to connect if a firewall is enabled. The firewall blocks TCP/IP communication among the Java instances running across OS images, so they are unable to communicate with each other. Disabling the firewall should resolve this issue.

CPU utilization of less than 90%

  • With good optimization and tuning, a user should be able to achieve ~90% CPU utilization. If CPU utilization is <90%, it is suggested that -Dspecjbb.forkjoin.workers= be set to twice the number of available processor threads for each backend for better performance. By default the benchmark tries to set this property to the number of available processor threads, but affinity and/or a configuration running multiple groups makes it difficult for the benchmark to determine the optimal value for this setting.
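    As a minimal sketch of the suggestion above, the helper below builds the option from the number of processor threads available to one backend. The function name forkjoin_workers_opt is hypothetical, and how many threads each backend actually has depends on your affinity settings and group count, so that number is left as an input.

```shell
#!/bin/sh
# Illustration only: the function name is hypothetical.
# Given the number of processor threads available to one backend,
# print the JVM option suggested above (twice that number).
forkjoin_workers_opt() {
    threads_per_backend=$1
    echo "-Dspecjbb.forkjoin.workers=$((threads_per_backend * 2))"
}
```

    For a backend bound to 8 processor threads, "forkjoin_workers_opt 8" prints -Dspecjbb.forkjoin.workers=16, which would then be added to that backend's java command line in the run script.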

Exception at the beginning of the run

  • When multiple instances take a long time to complete the handshake with the controller, exceptions are thrown. These exceptions are harmless and can be ignored.

Submit errors during the run

  • During the benchmark run, a "submit error" message is reported in several cases. Some of these errors are fatal while others are harmless. Please refer to the controller log for more detailed information about these error messages.

A "Validation level CORRECTNESS is missing from the Validation Reports" error occurs

  • During the benchmark run, an attempt is made to test the load 3 steps above max-jOPS. This is to show that the max-jOPS determined is indeed the full sustained capacity of the system, and not a much lower value caused by, for example, a severe glitch such as a full system GC. Some systems may not be able to recover from the load at 3 steps above max-jOPS, in which case validation is skipped, resulting in this error. In such cases the user-tunable property "specjbb.controller.maxir.maxFailedPoints" can be lowered to "1", which should help the system recover so that validation is not skipped.
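    A sketch of one way to apply this setting, assuming the property is passed as a -D option on the controller's java command line in the same way as the -Dspecjbb.forkjoin.workers= option mentioned elsewhere in this document. The variable name CONTROLLER_OPTS is hypothetical; use whatever variable your run script passes to the controller JVM.

```shell
#!/bin/sh
# Hypothetical variable name: append the property to the options that
# your run script passes to the controller JVM.
CONTROLLER_OPTS="${CONTROLLER_OPTS:-} -Dspecjbb.controller.maxir.maxFailedPoints=1"
```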

After a completed benchmark run, the ssh session is closed

  • This behavior can be changed by removing the 'exit 0' line from the end of the script used to run the benchmark.

All benchmark results are located in the benchmark root directory

  • Benchmark results can be written to a different location by editing the script used to run the benchmark to include the desired path. In *.sh, change the line 'result=./$timestamp' to 'result=./result_dir/$timestamp'; in *.bat, change 'set result=%timestamp: =0%' to 'set result=result_dir\%timestamp: =0%'.

Missing Maven dependency POM file for the grizzly-framework

  • When building experimental versions of the benchmark, the removal of a grizzly-framework artifact from the public Maven repository may cause the build to fail:

    [WARNING] The POM for org.glassfish.grizzly:grizzly-framework:jar:2.3.19_p1_internal is missing, no dependency information available

    Users can work around this issue by using the install-file goal of the Maven Install Plugin to install the missing POM file and artifact into the user's local Maven repository. Use the following example to complete the install.

    $ mkdir tmp-grizzly-framework
    $ unzip -q lib/grizzly-framework-*.jar -d tmp-grizzly-framework/
    $ ls tmp-grizzly-framework/META-INF/maven/org.glassfish.grizzly/grizzly-framework/pom.xml
    tmp-grizzly-framework/META-INF/maven/org.glassfish.grizzly/grizzly-framework/pom.xml
    $ mvn install:install-file -DgroupId=org.glassfish.grizzly -DartifactId=grizzly-framework -Dpackaging=jar -Dversion=2.3.19_p1_internal -Dfile=lib/grizzly-framework-2.3.19_p1_internal.jar -DpomFile=tmp-grizzly-framework/META-INF/maven/org.glassfish.grizzly/grizzly-framework/pom.xml

    For the kit 1.02 release, two bespoke JAXB artifacts are used, so the same type of issue may occur after the release. These are the Maven coordinates of the bespoke artifacts.

    com.sun.xml.bind:jaxb-api:jar:2.4.0-b180830.0359
    com.sun.xml.bind:jaxb-impl:jar:2.4.0-b180830.0438

    Use the above example to install the missing Maven artifacts.

You're seeing a "Too many open files" error message in your TxInjector log files

  • Here's an example of what the message looks like:
    • <Thu Nov 08 02:42:07 PST 2018> org.spec.jbb.driver: Submit Error for InStore Purchase, isSaturate = false : Transaction Error: Transaction had failed (non-fatal); caused by org.spec.jbb.sm.tx.NotEnoughCreditException: Not enough credit, got Error: Communication problem; caused by java.io.IOException: NIO Client threw unexpected exception; caused by org.spec.jbb.core.objectpool.PoolException: Exception while connecting to 127.0.0.1:46599; caused by java.util.concurrent.ExecutionException: java.net.SocketException: Too many open files; caused by java.net.SocketException: Too many open files
  • This has been seen when using a JDK-11 release, though it can occur with any release. To address this, you need to increase the OS's file descriptor (fd) limit. We've found that 64K is sufficient in our testing, but the setting may need to be different depending on your SUT.
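    The following sketch checks the file descriptor limit of the shell that will launch the benchmark JVMs and tries to raise it. The function names are hypothetical, and 65536 is simply the "64K" figure mentioned above. "ulimit -n" is supported by common shells such as bash, dash, and ksh, but raising the limit beyond the hard limit is OS-specific and usually needs administrator rights.

```shell
#!/bin/sh
# Illustration only: function names and the 65536 default are choices
# made for this sketch.

# Succeeds if the current soft limit on open files is at least the
# requested value (or is unlimited).
fd_limit_at_least() {
    wanted=${1:-65536}
    current=$(ulimit -n)
    [ "$current" = "unlimited" ] || [ "$current" -ge "$wanted" ]
}

# Tries to raise the soft limit for this shell and the processes it
# starts. Prints a warning if the hard limit does not allow it.
ensure_fd_limit() {
    wanted=${1:-65536}
    fd_limit_at_least "$wanted" && return 0
    ulimit -n "$wanted" 2>/dev/null && return 0
    echo "warning: could not raise open-file limit to $wanted" >&2
    return 1
}
```

    Calling "ensure_fd_limit 65536" near the top of the run script, before any JVM is started, lets the TxInjector, backend, and controller processes inherit the higher limit.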

The reporter fails with a "java.lang.InternalError: java.lang.reflect.InvocationTargetException"

  • If you are using Oracle Java 11 (or later), and the reporter fails with an exception whose trace begins with:
    • java.lang.InternalError: java.lang.reflect.InvocationTargetException
              at java.desktop/sun.font.FontManagerFactory$1.run(FontManagerFactory.java:86)
              ...
  • This is because, starting in Oracle Java 11, licensed fonts were removed from the release as part of the effort to make OracleJDK and OpenJDK equivalent. OracleJDK now relies on the OS to provide the fonts (or replacement fonts).
    • To date, we have only had reports of missing fonts from some of the minimally packaged OS bundles.
  • To correct the font issue, do one of the following:
    1. Run the benchmark on an OS that bundles the needed fonts.
    2. Install the missing fonts onto your OS.
    3. Run the reporter separately using an earlier JDK release, giving it the raw file generated from the benchmark run. See the run scripts included with the benchmark kit for details, but as an example, you can run the reporter similar to this:
      • java -jar specjbb2015.jar -m reporter -raw specjbb2015-M-20181108-00001.raw

Reporter fails to generate plots in the result report

  • If Java exceptions relating to "font" occur while generating the report, the "fontconfig" package may be missing. Install the "fontconfig" package on the operating system. For example:
    • Linux (Ubuntu):
      • $ apt install fontconfig
    • AIX:
      • Download "fontconfig-2.11.95-3.aix6.1.ppc.rpm" and "freetype2-2.8-1.aix6.1.ppc.rpm" from IBM's AIX Toolbox
      • $ rpm -i fontconfig-2.11.95-3.aix6.1.ppc.rpm freetype2-2.8-1.aix6.1.ppc.rpm

Using Java SE 9 and Java SE 10 with the benchmark

  • Due to a change introduced with the Java module system, you need to add the following option to your java execute line(s) in your run scripts.
    • --add-modules=java.xml.bind
  • This option is permitted only with Java SE 9 and Java SE 10. Results obtained with any other release may not be submitted if this option was used.
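    One way to keep a single run script usable across JDK releases is to add the option only when it applies. The sketch below is illustrative: the function name xml_bind_opt is hypothetical, and the caller is assumed to already know the feature release number of the JDK in use.

```shell
#!/bin/sh
# Illustration only: the function name is hypothetical.
# Given a Java feature release number (8, 9, 10, 11, ...), print the
# extra option needed for Java SE 9 and 10, or nothing otherwise.
xml_bind_opt() {
    case "$1" in
        9|10) echo "--add-modules=java.xml.bind" ;;
        *)    : ;;
    esac
}
```

    For example, "xml_bind_opt 10" prints the option, while "xml_bind_opt 8" and "xml_bind_opt 11" print nothing, so the result can be appended to the java command line unconditionally.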

The benchmark very occasionally hangs.

  • A rarely occurring bug was found, and fixed in the v1.03 benchmark.
    • If you are using SPECjbb2015 v1.02 or earlier, please start using the v1.03 kit (or later).
    • Overview:
      • There was a race between the LockManagers and the GC to flush out stale entries from the ConcurrentHashMap of locks causing early collection/processing of Soft/WeakReferences. This resulted in a hash map entry getting stuck in the hash map forever, which caused an infinite loop in LockManager.getLock()

Disclaimer

For the latest update to this document, please check here: http://www.spec.org/jbb2015/docs/knownissues.html.
Product and service names mentioned herein may be the trademarks of their respective owners.
Copyright (c) 2007-2019 Standard Performance Evaluation Corporation (SPEC).
All Rights Reserved.