To check for possible updates to this document, please see http://www.spec.org/jbb2015/docs/SPECjbb2015-Result_File_Fields.html
ABSTRACT
This document describes the various fields in the result file making up the complete SPECjbb2015 benchmark result disclosure.
2.10 INVALID or WARNING or COMMENTS
4. Overall SUT (System Under Test) Description
5.1.27 # and type of Network Interface Cards (NICs) Installed
5.1.28 Power Supply Quantity and Rating (W)
5.1.30 Cabinet/Housing/Enclosure
5.4 Java Virtual Machine (JVM)
7. SUT or Driver configuration
7.3.2 JVM Instance Description
8.3 Last Success jOPS/First Failure jOPS for SLA points Table
11. Rate of non-critical failures
12. Delay between performance status pings
14. Controller time offset from Time Server
The SPECjbb2015 benchmark (Java Server Benchmark) is SPEC's benchmark for evaluating the performance of server-side Java. Like its predecessors, SPECjbb2000 and SPECjbb2005, the SPECjbb2015 benchmark evaluates the performance of server-side Java by emulating a three-tier client/server system (with emphasis on the middle tier). The benchmark exercises the implementations of the JVM (Java Virtual Machine), JIT (Just-In-Time) compiler, garbage collection, threads and some aspects of the operating system. It also measures the performance of CPUs, caches, memory hierarchy and the scalability of shared memory processors (SMPs). In addition, the benchmark reports response time while gradually increasing the load, reporting not only full system capacity throughput, but also throughput under a response time constraint.
The benchmark suite consists of three separate software modules:
The top bar shows the measured SPECjbb2015 benchmark result and gives some general information regarding this test run.
The headline of the performance report includes one field displaying the hardware vendor and the name of the system under test. If this report is for a historical system, the declaration "(Historical)" must be added to the model name. A second field shows the max-jOPS and critical-jOPS, prefixed by an "Invalid" indicator if the current result does not pass the validity checks implemented in the benchmark.
The name of the organization or individual that sponsored the test. Generally, this is the name of the license holder.
The SPEC license number of the organization or individual that ran the benchmark.
The date when all the hardware necessary to run the result is generally available. For example, if the CPU is available in Aug-2007, but the memory is not available until Oct-2007, then the hardware availability date is Oct-2007 (unless some other component pushes it out farther).
The name of the organization or individual that ran the test and submitted the result.
The name of the city, state and country where the test took place. If there are installations in multiple geographic locations, that must also be listed in this field.
The date when all the software necessary to run the result is generally available. For example, if the operating system is available in Aug-2007, but the JVM is not available until Oct-2007, then the software availability date is Oct-2007 (unless some other component pushes it out farther).
The date when the test is run. This value is automatically supplied by the benchmark software; the time reported by the system under test is recorded in the raw result file.
The date when this report will be published after finishing the review. This date is automatically filled in with the correct value by the submission tool provided by SPEC. By default this field is set to "Unpublished" by the software generating the report.
Any inconsistencies with the run and reporting rules causing a failure of one of the validity checks implemented in the report generation software will be reported here, and all pages of the report file will be stamped with an "Invalid" watermark if this happens. The printed text shows more details about which of the run rules wasn't met and the reason why. A more detailed explanation may also be found at the end of the report in the sections "Run Properties" or "Validation Details". If there are any special waivers or other comments from the SPEC editor, those will also be listed here.
This section describes the result details as a graph (jOPS and Response time), the SPECjbb2015 benchmark category, number of groups and links to other sections of the report.
The header of this section indicates which SPECjbb2015 benchmark category was run and how many Groups were configured using the property "specjbb.group.count".
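For example (illustrative only, not taken from an actual report), a MultiJVM run configured for two Groups would have specjbb.group.count=2 set in the benchmark properties, and the header would show "MultiJVM" as the category and "2" as the number of Groups.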
The raw data for this graph can be found by clicking on the graph. The graph only shows the Response-Throughput (RT) phase of the benchmark. The initial phase that finds the High Bound Injection Rate (HBIR, an approximate upper bound on throughput) and the validation phase at the end of the run are not part of this graph. The x-axis shows jOPS (Injection Rate: IR) as the system is tested at RT step levels that increase gradually in increments of 1% of HBIR. The y-axis shows response time (min, various percentiles, max), where the 99th percentile determines the critical-jOPS metric, shown as a yellow vertical line. The last successful RT step level before the "First Failure" RT step level is marked with a red vertical line and reflects the max-jOPS metric of the benchmark. The benchmark continues to test a few RT step levels beyond the "First Failure" RT step level. There should usually be very few RT step levels that pass beyond the "First Failure" RT step level; otherwise it indicates that, with more tuning, the system should be able to reach a higher max-jOPS. To view details about levels beyond the "First Failure" RT step level, a user needs to view either controller.out or the level-1 report output.
The following section of the report file gives the system under test (SUT) overview.
Company which sells the system.
URL of system vendor.
Single Supplier or Parts Built
Possible values for this property are:
The total number of configured systems.
[YES / NO].
The total number of configured systems. Please refer to the Run and Reporting Rules document for the definition of a system. For example, a rack-based blade system can be one system with many blade nodes, either all running under a single OS image or each running its own OS image.
[YES / NO].
The number of nodes configured on each system.
The total number of chips installed on all system(s) in the overall SUT.
The total number of cores installed on all system(s) in the overall SUT.
The total number of threads on all system(s) in the overall SUT.
The total amount of memory installed on all system(s) in the overall SUT.
The total number of OS images installed on all system(s) in the overall SUT.
Environment mode. [Virtual / Non-virtual]
The following section of the report file describes the hardware and the software of the system under test (SUT) used to run the reported benchmark with the level of detail required to reproduce this result.
The following section of the report file describes the hardware and the software of the system under test (SUT) used to run the reported benchmark with the level of detail required to reproduce this result. The same fields are also valid for the driver system(s) HW and SW description. For a driver system, some fields, such as memory, may not need to be described in as much detail as for the SUT.
HW Name.
The Company name which sells the system.
The URL of the system vendor.
The HW availability (month-year) of the system.
The model name identifying the system under test.
The number of systems under test.
The form factor for this system.
In multi-node configurations, this is the form factor for a single node. For rack-mounted systems, specify the number of rack units. For blades, specify "Blade". For other types of systems, specify "Tower" or "Other".
The number of nodes per system.
A manufacturer-determined processor formal name.
Technical characteristics to help identify the processor, such as number of cores, frequency, cache size etc.
If the CPU is capable of automatically running the processor core(s) faster than the nominal frequency and this feature is enabled, this field should also list the feature and the maximum frequency it enables on that CPU (e.g.: "Intel Turbo Boost Technology up to 3.46GHz").
If this CPU clock feature is present but is disabled, no additional information is required here.
The number of Chips Per System.
The number of Cores Per System.
The number of Cores Per Chip.
The number of Threads Per System.
The number of Threads Per Core.
The HW version (if there is one), and the BIOS version.
The nominal (marked) clock frequency of the CPU, expressed in megahertz.
If the CPU is capable of automatically running the processor core(s) faster than the nominal frequency and this feature is enabled, then the CPU Characteristics field must list additional information, at least the maximum frequency and the use of this feature.
Furthermore if the enabled/disabled status of this feature is changed from the default setting this must be documented in the System Under Test Notes field.
Description (size and organization) of the CPU's primary cache. This cache is also referred to as "L1 cache".
Description (size and organization) of the CPU's secondary cache. This cache is also referred to as "L2 cache".
Description (size and organization) of the CPU's tertiary, or "L3 cache".
Description (size and organization) of any other levels of cache memory.
A description of the disk drive(s) (count, model, size, type, rotational speed and RAID level if any) used to boot the operating system and to hold the benchmark software and data during the run.
The file system used.
Total size of memory in the SUT in GB.
Number and size of memory modules used for testing.
Detailed description of the system main memory technology, sufficient for identifying the memory used in this test.
Potentially there can be multiple instances of this field if different types of DIMMs have been used for this test, one separate field for each DIMM type.
Since the introduction of DDR4 memory there are two slightly different formats.
The recommended formats are described here.
DDR4 Format:
N x gg ss pheRxff PC4v-wwwwaa-m
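As a purely illustrative example (not taken from an actual report), a module description following this format might read "8 x 32 GB 2Rx4 PC4-2133P-R", i.e. eight 32 GB dual-rank, x4-organized, registered DDR4-2133 DIMMs.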
References:
A description of the network controller(s) (number, manufacturer, type, ports and speed) installed on the SUT.
The number of power supplies that are installed in this node and the power rating for each power supply. Both entries should show "None" if the node is powered by a shared power supply.
Any additional equipment added to improve performance and required to achieve the reported scores.
The model name identifying the enclosure housing the tested nodes.
Additional descriptions about the shared HW.
Description of additional performance relevant components not covered in the fields above
Additional Notes.
Other HW, such as network switch(es), or other software.
Other Hardware/Software Name.
The name of the company which sells the item listed under Other Hardware/Software.
The URL of the company which sells the item listed under Other Hardware/Software.
Other Hardware/Software Version.
The HW/SW availability (month-year).
The bitness of the Other HW or SW if applicable; otherwise 'n/a'.
Additional Notes.
System OS Section
OS name.
The OS vendor name.
The OS vendor URL.
OS version.
The OS availability (month-year).
OS Bitness.
Additional OS Notes.
Note that OS tuning info is placed under the separate configuration section described in section 7.2.3.
JVM Section
JVM name.
The JVM vendor name.
The JVM vendor URL.
Version of the JVM.
The JVM availability (month-year).
JVM Bitness.
Additional JVM Notes.
Note that JVM tuning info is placed under the separate configuration section described in section 7.3.4.
This section covers the topology for the SUT and driver system (Distributed category only). The first sub-section shows a summary of the deployment of JVM and OS images across H/W systems. Later sub-sections give details about the JVM instances across the OS images deployed for each H/W configuration for the SUT and driver system (Distributed category only).
This section covers how JVM instances are deployed inside OS images and how those OS images are deployed across HW systems.
Describes the OS images deployed on a given system HW configuration.
The format is OS_image_type (number of them deployed on this system). OS_image_type should match one of the OS configurations described in section 13.2.
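As an illustration only (not taken from an actual report), an entry of "os_1 (4)" would indicate four instances of the OS image type os_1 deployed on that system.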
The name of the HW as described in the product section, e.g. 'hw_1'.
The number of systems using the exact same deployment.
Virtual or non-virtual.
Any tuning.
Any notes.
Describes the JVM instances deployed in a given OS image.
The format is a list of JVM instances, each of the form JVM_image_type (number of them deployed in this OS image). JVM_image_type should match one of the JVM Instance configurations described in section 13.3.
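As an illustration only (not taken from an actual report), an entry of "jvm_1 (2), jvm_2 (1)" would indicate two instances of the JVM instance type jvm_1 and one instance of jvm_2 deployed in that OS image.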
The name of the OS product as described in the product section, e.g. 'os_1'.
Any tuning.
Any Notes.
Describes a JVM Instance.
The name of the benchmark agent this JVM instance will run. It can be Composite (for the Composite category) or, for the MultiJVM and Distributed categories, it can be Controller, TxInjector or Backend.
The name of the JVM product as described in the product section, e.g. 'jvm_1'.
The command line parameters used to launch this JVM instance (see the illustrative example below).
Any tuning.
Any notes.
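As a purely illustrative example of the kind of content that appears in the command line parameters and tuning fields (the exact options are JVM-specific and vendor-dependent), a Backend JVM instance might be launched with options such as "-Xms4g -Xmx4g -XX:+UseParallelGC".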
Details about max-jOPS and critical-jOPS calculations.
Shows the last few RT (Response-Throughput) step levels close to max-jOPS. "Pass" means that the RT step level passed and "fail" means the system did not pass that RT step level. The last successful RT step level before the first failed RT step level is chosen as max-jOPS.
This is a complex calculation. jOPS at various SLAs (Service Level Agreements) are calculated using the data shown in the table called "Last Success jOPS/First Failure jOPS for SLA points" as well as the other RT step levels in between those two levels. The geometric mean of the jOPS at these SLAs represents the critical-jOPS metric. This metric can be 0 if the jOPS for any one or more of the five SLAs (10ms, 25ms, 50ms, 75ms, 100ms), detailed later in the report, is 0.
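In simplified form (the per-SLA jOPS values themselves come from the more detailed calculation described above):

    critical-jOPS = (jOPS@10ms x jOPS@25ms x jOPS@50ms x jOPS@75ms x jOPS@100ms)^(1/5)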
The first column lists the various SLA points (different response time thresholds), while the first title row lists response time percentiles. The entry for a given SLA (for example 10000 us = 10ms) and percentile (for example the 99th percentile) has two data values in the format [Last Success jOPS/First Failure jOPS]. Last Success jOPS is the last successful RT step level whose 99th percentile of response time over all samples was 10ms or less. If the 99th percentile of response time was never 10ms or below, the data value will be '-'. First Failure jOPS is the RT step level where, for the first time, the 99th percentile of response time over all samples was more than 10ms. Data points with a red background are used in the calculation of the critical-jOPS metric.
This is one of the validation criteria. This graph only shows RT phase step levels. jOPS for the RT step levels are on the x-axis and the number of probes as a % of total jOPS is on the y-axis (logarithmic scale). Two horizontal lines show the limits. To have good confidence in the response time, a good % of the total jOPS must be issued as probes. For more details, please refer to the validation section of the Run and Reporting Rules document.
This is one of the validation criteria. Total requests are issued to maintain a request mix. This graph only shows RT phase step levels. jOPS for the RT step levels are on the x-axis and the y-axis shows (Actual % in the mix - Expected % in the mix). For more details about the passing criteria, please refer to the validation section of the Run and Reporting Rules document.
This is one of the validation criteria. If these non-critical failures are 0 during the RT phase, only a message is printed. If the non-critical failures during the RT phase are >0, then a graph is shown. In that graph, jOPS for the RT step levels are on the x-axis and the number of non-critical failures for each RT step level is on the y-axis. Transaction Injectors (TxI) issue requests for the Backend(s) to process. Many times, for various reasons, a TxI will time out after waiting for a threshold period; this is counted as a non-critical failure. For more details about the passing criteria, please refer to the validation section of the Run and Reporting Rules document.
This is one of the validation criteria. The x-axis is time in milliseconds (msec). The y-axis shows the delay time in msec. The validation criteria apply to the whole RT phase and not to individual RT step levels. Also, the minimum y-axis value is 5 sec, as that is the passing criterion, chosen to reduce the size of the .raw file for submission. If a user wants to see y-axis data starting at 0, the user needs to generate a level-1 report, which will contain the detailed graph. For more details about the passing criteria, please refer to the validation section of the Run and Reporting Rules document.
This graph shows the relationship between IR (Injection Rate), aIR (Actual Injection Rate) and actual PR (Processed Rate). The graph shows all the phases, starting from the HBIR (High Bound Injection Rate) search, through RT phase warm-up and the RT phase, to the validation phase at the end. The x-axis shows the iteration number, where an iteration is a time period for which IR/aIR/PR are evaluated. IR is the target injection rate, actual IR is the IR that could actually be issued for a given iteration, and PR is the total processed rate for that iteration. To pass an iteration, IR/aIR/PR must be within a certain % of each other. The y-axis shows how far Actual IR and Actual PR are from IR as the base. If those are within the low and high bounds, that iteration passed; otherwise it failed. A user will see many failures during the HBIR search. During the RT phase, until max-jOPS is found, there can be some failures, as a certain number of retries are allowed. For more details about the passing criteria, please refer to the Run and Reporting Rules document.
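The following minimal Java sketch (not part of the benchmark kit) summarizes this pass/fail idea for a single iteration; the lowBound and highBound parameters are placeholders for the exact tolerances defined in the Run and Reporting Rules document:

    // Illustrative sketch only, not benchmark source code.
    public class IterationCheck {
        // An iteration passes when the actual IR and the processed rate both stay
        // within [lowBound, highBound] relative to the target IR.
        static boolean iterationPasses(double targetIR, double actualIR, double processedRate,
                                       double lowBound, double highBound) {
            double aIR = actualIR / targetIR;      // Actual IR relative to the target IR
            double pr  = processedRate / targetIR; // Processed Rate relative to the target IR
            return aIR >= lowBound && aIR <= highBound
                && pr  >= lowBound && pr  <= highBound;
        }
    }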
This is one of the validation criteria, which applies to Composite and MultiJVM runs on virtualized systems. It validates the Controller time correctness and consistency against a Time Server running on the native host. The x-axis is time in milliseconds (msec). The y-axis shows the mean offset from the Time Server (calculated during the Response-Throughput curve) minus the specific offset at this point, in msec. The validation criteria apply to the RT phase. There are three time offset validation criteria: there must not be more than 10 points where |Mean-offset| > 50 msec; there must be no points where |Mean-offset| > 500 msec; and the offset STDDEV during RT must not be > 100 msec.
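The following minimal Java sketch (not part of the benchmark kit, and assuming the deviations |Mean-offset| have already been collected in msec) summarizes these three checks:

    // Illustrative sketch only, not benchmark source code.
    public class OffsetCheck {
        // deviations[i] is the difference between the mean offset and the offset
        // at point i (in msec) during the RT phase.
        static boolean offsetChecksPass(double[] deviations) {
            long over50 = 0;
            double sumSq = 0.0;
            for (double d : deviations) {
                if (Math.abs(d) > 500) return false; // no point may deviate by more than 500 msec
                if (Math.abs(d) > 50)  over50++;     // count points deviating by more than 50 msec
                sumSq += d * d;
            }
            double stddev = Math.sqrt(sumSq / deviations.length);
            return over50 <= 10 && stddev <= 100;    // at most 10 such points; STDDEV not above 100 msec
        }
    }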
This section covers the run properties which are being set by the user.
Details about validation are listed here.
Provides details about the different types of validation.
The SPECjbb2015 benchmark Run and Reporting Rules document specifically lists the properties which can be set by the user. If the user-settable properties are set within the compliant range, this section prints the message "PASSED" for all agents.
If a user sets a non-settable property to a value different from the default and/or sets a user-settable property outside the compliant range, that property and the corresponding agent are listed here, along with a message that the run is INVALID.
The benchmark also has data structures which must be kept in synchronization and must match certain criteria. If these criteria fail, the run is declared INVALID.
Lists other compliance checks as well as the High Bound maximum and High Bound settled values during the HBIR (High Bound Injection Rate) search phase.
Product and service names mentioned herein may be the trademarks of their respective owners.
Copyright 2007-2019 Standard Performance Evaluation Corporation (SPEC).
All Rights Reserved.