SPEC Glossary
Definitions

array
A collection of data items usually laid out linearly in memory for simple access with an integer index from the base of the array. Many scientific applications use arrays to contain the dataset that they are analyzing. The larger the dataset, the larger the array needs to be to fit the data.

associate member
A membership class available to non-profit organizations at a significantly reduced cost. It allows access to the benchmarks under development on the condition of significant involvement in the development process. Each group has its own rules and requirements regarding associate membership; contact SPEC for details.

availability date
The date upon which that part of the system becomes generally available, that is, available to anyone willing to pay the appropriate price and take immediate delivery.

baseline
For SPEC's purposes, "baseline" refers to a configuration that is more general and hopefully simpler than one tuned for a specific benchmark. Usually a "baseline" configuration needs to be effective across a variety of workloads, and there may be further restrictions, such as requirements about the ease of use of any features utilized. Commonly, "baseline" is the alternative to a "peak" configuration.

benchmark
A reference point. Originally: a mark on a workbench used to compare the lengths of pieces so as to determine whether one was longer or shorter than desired. For computers: a "benchmark" is a test, or set of tests, designed to compare the performance of one computer system against the performance of others. Note: a benchmark is not necessarily a capacity planning tool. That is, benchmarks may not be useful in attempting to guess the correct size of a system required for a particular use. In order to be effective in capacity planning, it is necessary for the test to be easily configurable to match the targeted use.
In order to be effective as a benchmark, it is necessary for the test to be rigidly specified so that all systems tested perform comparable work. These two goals are often at direct odds with one another, with the result that benchmarks are usually useful for comparing systems against each other, but some other test is often required to establish what kind of system is appropriate for an individual's needs.

benchmark sponsor
Every benchmark code of SPEChpc96 has a technical advisor who is knowledgeable about the code and the scientific/engineering problem, possibly with the help of experts outside the SPEC organization.

binary
To be specific, binary refers to a numeric representation that is composed of (frequently very long) sequences of only two values, usually '0' and '1'. Deep down at their very core, most computers really only understand '0' and '1' (in other words, some little bit of information is either "off" or "on"); thus, the term binary is frequently used to describe anything already translated to the form that is closest to what the system understands natively.

chip
The term "chip" identifies the actual microprocessor, the physical package containing one or more "cores".

compiler
A program that translates (presumably) human-readable source code into a form that is native for a particular machine.

core
The term "core" is used to identify the core set of architectural, computational processing elements that provide the functionality of a CPU.

CPU2000
CPU2000 is the current version of the CPU component benchmark suite from SPEC. It replaces CPU95.

CPU2006
CPU2006 is the name given to the ongoing effort to replace the current CPU2000 product.

CPU92
CPU92 is a now outdated CPU component benchmark suite from SPEC. It was replaced by CPU95.

CPU95
CPU95 is an earlier version of the CPU component benchmark suite from SPEC, which replaced CPU92 and the even older CPU89. This suite has in turn been replaced by CPU2000.

CPU intensive
A term that SPEC often uses to mean applications that are primarily bound by the available processing power. Typically, these spend most of their time performing calculations, comparisons, or transformations, do little or no I/O, and spend very little time in the operating system.

dataset
The set of inputs for a particular benchmark. There may be more than one dataset available for each benchmark, each serving a different purpose (e.g. measurement versus testing) or configured for different problem sizes (small, medium, large, ...).

double precision
A level of floating point accuracy that usually requires twice the space for each value as single precision, but provides considerably more precision. For most systems running the SPEC CPU tests from the OSG (e.g. CPU2000), double precision implies a 64-bit value.

executable
As an adjective, executable means that the described item can be executed. In computer talk, executable has also been used as a noun, where it means "an executable program" or, in other words, something that is ready to run without further modification. Commonly, the term executable is used to refer to the binary file that is the final result of compiling source code.

fileset
A pre-defined set of files that are used within a benchmark workload. Usually a fileset has specific characteristics that are relevant to how the benchmark performs its work.

floating point
A class of arithmetic, typically used in scientific applications. Much like the values displayed by your calculator, the values can range from very large down to minute fractions, but only the first several digits are available. Floating point is commonly used when the values being calculated can be very large, into the billions, or else involve fractions; e.g. the number of miles from Earth to the next galaxy (billions and billions), or the precise temperature of a feverish baby (101.8). Floating point is the alternative to integer.
For the purposes of classification for the CPU benchmarks, SPEC classifies an application as a floating point application if that application typically spends 10% or more of its time calculating floating point values.

full disclosure report
The complete documentation of a benchmark's results along with the relevant system and benchmark configuration information. There should be sufficient detail and coverage for someone else to be able to reproduce the tests. Each result available on this server has such a disclosure available.

geometric mean
A mean ("average" value) that is obtained through the use of multiplication and Nth roots rather than by addition and division. To calculate it, take the Nth root (the power of 1/N) of the product of all N terms. The geometric mean has the interesting property that a certain percentage change in any one of the terms has the same effect as the same percentage change in any of the other terms, and even successive changes in the same term will have the same effect as if the changes were instead spread over other terms. What this means in benchmarking terms is that a 10% improvement in one benchmark has the same effect on the overall mean as a 10% improvement on any of the other benchmarks, and that another 10% improvement on that benchmark will have the same effect as the last 10% improvement. Thus no one benchmark in a suite becomes more important than any of the others in the suite.

HPG
High-Performance Group. One of several groups within the SPEC organization. HPG has created the benchmark suite SPEChpc96, aiming at high-end machines including both shared-memory and distributed-memory architectures.

HPSC
High-Performance Steering Committee. The executive group within HPG. Currently HPG and HPSC are the same (i.e., all HPG members are part of HPSC).

HTTP
HyperText Transfer Protocol. The protocol over TCP/IP by which the WWW communicates.
The specification for HTTP is available from the World Wide Web Consortium, which develops such standards.

integer
A class of arithmetic, commonly used in computers. Integer arithmetic deals only in whole numbers; e.g. 1, 2, 99, 4563. Any calculation that does not result in a whole number is truncated back to a whole number; the fractional part is thrown away. For example, 9 / 4 = 2 and not 2.25 or two and a quarter. Typically, computers can perform integer arithmetic more quickly than they can any other form of arithmetic, so most programs do as much work as they can in integer. However, most computers have significant limits on the values they can manage in integer format. Besides the lack of fractions, many computers cannot handle integer values beyond the millions. Thus integers can be used to count time, or to keep track of all the pennies in your bank account. However, most scientific applications deal with large values or need to be more precise than just throwing away the fractions. These kinds of applications then make use of floating point arithmetic. For the SPEC CPU benchmarks, applications are classified as "integer" if they spend less than 1% of their time performing floating point calculations (which covers most non-scientific applications, e.g. compilers, utilities, simulators, etc.).

LADDIS
The name of a performance group that originated the benchmark that came to be known as SPEC SFS. The name is an acronym formed from the names of the original member companies:

library
In computer terms, a library is a collection of subroutines provided by the operating system or development environment that can be used to perform certain common tasks (e.g. read something off of disk, create a window on the display, sort an array of values, calculate the cosine of a value, etc.).

license agreement
An agreement that each licensee accepts prior to use of a product. In the SPEC case, this agreement covers what can and cannot be done with the SPEC benchmarks, usually stating that any public use of any SPEC metrics must come from tests that were in complete agreement with the run and reporting rules for that benchmark.

load generator
Something that provides part of a workload to a SUT for a benchmark. Commonly in SPEC usage, this term applies to a "client" system that is used to drive the SUT over a LAN; however, this term can also be used to describe a process (either on a "client" or on the SUT) which is generating a load for the benchmark.

load level
For any benchmark which submits various amounts of work to a SUT, a load level is one such amount of work. This is usually expressed in terms of expected throughput, such as "a load level of 100 operations per second was tried, but the SUT was not able to keep up and was only able to complete 80..."

metric
The final results of a benchmark. The significant statistics reported from a benchmark run. Each benchmark defines what the valid metrics are for that particular benchmark.

object code
Object code is commonly the product of running source code through a compiler. It is usually a binary representation of the program statements translated into a form that is understood natively by the processor.

ops
Operations Per Second. Usually the units of a throughput metric (for example in SPEC SFS97_R1). The average number of operations performed per second, where the "operation" has been specified by the benchmark standard.

OSG
The Open Systems Group within SPEC.
This group works on benchmarks for evaluating the performance of systems running 'open' (or publicly defined) operating systems (e.g. UNIX and its derivatives, as well as NT and VMS). See the OSG home page.

OSSC
Open Systems Steering Committee. The executive decision-making body within the OSG.

parallelizable
The property of a computer program, or program segment, that allows for the parallel execution of parts of the same program. Parallel programming covers a wide range of degrees, from the very small grain (e.g. similar operations on multiple elements of the same array or matrix) to the large grain (e.g. simultaneous execution of unrelated procedures).

peak
For SPEC's purposes, a "peak" configuration is one where the configuration is tuned especially to get the best result for a single, specific workload. Typically, this demonstrates the highest performance levels achievable. "Peak" is often used in combination with "baseline" configurations.

performance neutral
Performance neutral means that there is no significant difference in performance. For example, a performance neutral source code change would be one which would not have any significant impact on the performance as measured by the benchmark.

portability
Portability flags or changes are those which are necessary for the correct execution of a benchmark. That is, the benchmark will not run or will produce the wrong output without these flags or changes.

portable
In computer terms, portable means that the code in question can be easily taken to a different system and made to work there. Code that is dependent upon quirks or specific resources of a certain system is usually considered not to be portable because of the difficulty of supporting these dependencies on the new system. The use of standardized definitions and interfaces, e.g.
ANSI C and POSIX, greatly aids portability because the difficult dependencies are hidden behind the standardized interfaces and the difficulties are shifted from the programmer to the system provider.

reference time
The amount of time that a particular benchmark took to run on a specific reference platform.

reporting rules
The set of benchmark rules that defines what constitutes a valid full disclosure for that benchmark. Usually these define which parts of the benchmark configuration and the system configuration(s) need to be detailed and disclosed.

response time
The amount of time from when an action is requested until the time that the request completes and is returned to the requestor.

result
The value of the primary metric being reported for the benchmark.

run rules
The set of benchmark rules that defines what constitutes a valid test with that benchmark. Usually these define legal configurations, experimental limitations, and any operating constraints.

script
A file that contains a sequence of instructions for an interpreter, the "script" for that interpreter to follow.

SFS93
Known as SPEC SFS, SFS93 is the NFS server benchmark which evolved from LADDIS.

SFS97
SPEC SFS97 is the NFS server benchmark which replaced SFS93.

SFS97_R1
SFS97_R1 is version 3 of the NFS benchmark, replacing the SFS97 suite.

shell
A UNIX term for a command interpreter and its environment. Thus, typically a program that supports the interpretation and execution of commands.

single precision
A level of floating point accuracy that usually requires half the space for each value as double precision, but provides considerably less precision. For most systems running the SPEC CPU tests from the OSG (e.g. CPU95), single precision implies a 32-bit value.

source code
The human-readable form of a computer program. This is typically the form in which the program is written, read, and modified by its human author(s).

SPEC
Standard Performance Evaluation Corporation.
SPEC is an organization of computer industry vendors dedicated to developing standardized benchmarks and publishing reviewed results. See SPEC's home page.

SPEC95
A common (mis)name for the CPU95 benchmarks. Likewise, SPEC89 implies CPU89, SPEC92 should be CPU92, and SPEC2000 is CPU2000.

SPECchem96
Official name of the Gamess application of SPEChpc96; an application representative of computations used by the chemical industry.

SPEChpc96
The first benchmark suite released by SPEC/HPG. Includes the two applications Seismic and Gamess.

SPECjvm98
SPECjvm98 is the current Java Virtual Machine benchmark suite from SPEC.

SPECmark
SPECmarks were the metrics for SPEC's original CPU89 benchmarks. Now, the term is often used to refer collectively to the CPU95 ratio speed metrics.

SPECrate
A "SPECrate" is a throughput metric based on the SPEC CPU benchmarks (such as SPEC CPU95). This metric measures a system's capacity for processing jobs of a specified type in a given amount of time. Note: This metric is used in the same way for multi-processor systems and for uni-processors. It is not necessarily a measure of how fast a processor might be, but rather a measure of how much work the one or more processors can accomplish. The other kind of metric from the SPEC CPU suites is the SPECratio, which measures the speed at which a system completes a specified job.

SPECratio
A measure of how fast a given system might be. The "SPECratio" is calculated by taking the elapsed time that was measured for a system to complete a specified job, and dividing that into the reference time (the elapsed time that job took on a standardized reference machine). This measures how quickly, or more specifically how many times faster than a particular reference machine, one system can perform a specified task. "SPECratios" are one style of metric from the SPEC CPU benchmarks; the other is SPECrates.
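As a rough illustration of the "SPECratio" and "geometric mean" entries, the sketch below divides a reference time by a measured time for each of three benchmarks and then combines the ratios with a geometric mean. The timings, the function names, and the choice to combine ratios this way are assumptions made for the example, not figures or procedures taken from any SPEC suite.

```python
import math


def spec_ratio(reference_seconds, measured_seconds):
    """Reference time divided by measured time (higher means faster)."""
    return reference_seconds / measured_seconds


def geometric_mean(values):
    """Nth root of the product of N positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (reference, measured) elapsed times in seconds.
timings = [(1000.0, 250.0), (1800.0, 600.0), (900.0, 450.0)]
ratios = [spec_ratio(ref, meas) for ref, meas in timings]
overall = geometric_mean(ratios)


def gain_from_improving(index, factor=1.10):
    """Relative change in the overall mean when one ratio improves by `factor`."""
    boosted = list(ratios)
    boosted[index] *= factor
    return geometric_mean(boosted) / overall


# Improving any single benchmark by 10% moves the overall mean by the same amount.
gains = [gain_from_improving(i) for i in range(len(ratios))]

if __name__ == "__main__":
    print("ratios:", ratios)
    print("overall:", overall)
    print("gains:", gains)
```

Every entry in `gains` is the same value (the cube root of 1.10 for three benchmarks), which is the property the glossary describes: no single benchmark weighs more than the others.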

SPECseis96
Official name of the Seismic application of SPEChpc96; an application representative of computations used by the seismic industry.

SPECweb2005
SPECweb2005 is a standardized performance test for WWW servers, the successor to SPECweb99 and SPECweb99_SSL. The benchmark consists of different workloads (both SSL and non-SSL), such as banking and e-commerce, and writes dynamic content in scripting languages to more closely model real-world deployments. The web server also communicates with a lightweight backend to simulate an application/database server.

SPECweb96
SPECweb96 is SPEC's first attempt at a benchmark for WWW servers. It measures a server's ability to handle HTTP/1.0 GET requests from a number of external "client" drivers.

SPECweb99
SPECweb99 is one of the current web server benchmarks, which replaced the SPECweb96 product.

sponsor
In the OSG: the entity that has accepted the license agreement. In other words, the people who are responsible for ensuring that the results were obtained in accordance with any existing run and reporting rules. For the HPG, see benchmark sponsor, who is a technical advisor for a particular benchmark.

steering committee
Part of the SPEC bureaucracy; each free-standing group within SPEC has a steering committee which acts as the key decision-making body, with full membership votes typically being reserved for benchmark ratifications and elections.

SUT
System Under Test. The system being tested, as distinct from anything in the testbed being used to drive the test.

TCP/IP
A networking protocol developed for the creation of a robust "internet", that is, a connection across a variety of local networking mechanisms. Or: the protocol used to connect to and through what is known today as the Internet (or just the 'Net). The internet uses a layered architecture with several protocols.
The TCP (Transmission Control Protocol) defines session-based communications, and the IP (Internet Protocol) addresses the lower-level issues of packet fragmentation and routing.

testbed
The entire test setup, including the SUT and any external systems used to drive, coordinate, or monitor the benchmark.

vectorizable
The property of a computer program, or program segment, that allows for the simultaneous execution of operations on different data values, thus making it possible to allocate the work to a set of operators and accomplish the work in parallel. One example of work that is very vectorizable is taking an entire matrix of values and multiplying each by 2; it is possible for different operators to work on different cells of the matrix at the same time. One example of work that is not vectorizable is adding to each item in an array the value of the preceding item in the array; each calculation is dependent upon the results of the preceding calculation, so there is no way to perform the operations at the same time. Vectorization is only one subclass (probably one of the most restrictive subclasses) of parallelizable programming.

warm up
A period of time prior to when the actual measurement is taken, during which the workload has already been started in an effort to get the SUT to a stable and consistent state.

workload
The workload is the definition of the units of work that are to be performed during a benchmark run.
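The two examples in the "vectorizable" entry can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and plain Python loops stand in for the "operators" the entry mentions. Visiting the elements in a different order is used here as a stand-in for doing them at the same time.

```python
def double_each(values, order):
    """Multiply every element by 2, visiting the indices in the given order."""
    result = list(values)
    for i in order:
        # Each output depends only on its own input, so the order is irrelevant.
        result[i] = 2 * values[i]
    return result


def add_preceding(values, order):
    """Add to each item the (already updated) preceding item, in the given order."""
    result = list(values)
    for i in order:
        if i > 0:
            # Each step reads the result of the previous step, so order matters.
            result[i] += result[i - 1]
    return result


data = [1, 2, 3, 4]
forward = list(range(len(data)))
backward = forward[::-1]

doubled_forward = double_each(data, forward)
doubled_backward = double_each(data, backward)
chained_forward = add_preceding(data, forward)
chained_backward = add_preceding(data, backward)

if __name__ == "__main__":
    print(doubled_forward, doubled_backward)   # same result either way
    print(chained_forward, chained_backward)   # results differ
```

Doubling gives the same answer in either order, which is why the work can be handed out to independent operators. The chained addition only gives the intended answer when the steps run one after another, as the entry describes.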