SPEC SFS(R)2014_vda Result

IBM Corporation : IBM DeepFlash 150 with Spectrum Scale 4.2.1
SPEC SFS2014_vda = 1700 Streams (Overall Response Time = 4.12 msec)

===============================================================================

Performance
===========

   Business      Average
    Metric       Latency      Streams      Streams
   (Streams)      (msec)      Ops/Sec      MB/Sec
 ------------ ------------ ------------ ------------
          170          1.6         1700          784
          340          1.8         3401         1568
          510          1.9         5102         2355
          680          2.3         6803         3134
          850          2.7         8503         3917
         1020          2.7        10204         4713
         1190          2.8        11904         5491
         1360          3.2        13605         6281
         1530          3.9        15306         7065
         1700         30.0        16980         7821

===============================================================================

Product and Test Information
============================

+---------------------------------------------------------------+
|          IBM DeepFlash 150 with Spectrum Scale 4.2.1           |
+---------------------------------------------------------------+
Tested by            IBM Corporation
Hardware Available   July 2016
Software Available   July 2016
Date Tested          August 2016
License Number       11
Licensee Locations   Almaden, CA USA

IBM DeepFlash 150 provides an essential big-data building block for
petabyte-scale, cost-constrained, high-density and high-performance storage
environments. It delivers the response times of an all-flash array at a highly
competitive cost. DeepFlash 150 is an ideal choice for accelerating big data
systems and other workloads that require high performance and sustained
throughput.

IBM Spectrum Scale provides unified file and object software-defined storage
for high-performance, large-scale workloads on-premises or in the cloud. When
deployed together, DeepFlash 150 and IBM Spectrum Scale create a storage
solution that provides workload flexibility, an extraordinarily low
cost-to-performance ratio, and the data lifecycle management and storage
services required by enterprises grappling with high-volume, high-velocity
data challenges.

Solution Under Test Bill of Materials
=====================================

Item
 No   Qty  Type        Vendor        Model/Name  Description
---- ---- ----------- ------------- ----------- -------------------------------
  1    2  DeepFlash   IBM           9847-IF2    Each DeepFlash 150 includes 64
          150                                   flash module storage slots. In
                                                this model half of the slots
                                                are filled, each with an 8 TB
                                                flash module.
  2   12  Spectrum    Lenovo        x3650-M4    Spectrum Scale client and
          Scale                                 server nodes. Lenovo model
          Nodes                                 number 7915D3x.
  3    2  InfiniBand  Mellanox      SX6036      36-port non-blocking managed
          Switch                                56 Gbps InfiniBand/VPI SDN
                                                switch.
  4    1  Ethernet    SMC           SMC8150L2   50-port 10/100/1000 Mbps
          Switch      Networks                  Ethernet switch.
  5   20  InfiniBand  Mellanox      MCX456A-F   2-port PCI FDR InfiniBand
          Adapter                               adapter used in the Spectrum
                                                Scale client nodes.
  6    4  InfiniBand  Mellanox      MCX354A-    2-port PCI FDR InfiniBand
          Adapter                   FCBT        adapter used in the Spectrum
                                                Scale server nodes.
  7    2  Host Bus    Avago         SAS         2-port PCI 12 Gbps SAS adapter
          Adapter     Technologies  9300-8e     used in one of the Spectrum
                                                Scale server nodes for
                                                attachment to the DeepFlash
                                                150.
  8    2  Host Bus    Avago         SAS         4-port PCI 12 Gbps SAS adapter
          Adapter     Technologies  9305-16e    used in the other Spectrum
                                                Scale server node for
                                                attachment to the DeepFlash
                                                150.

Configuration Diagrams
======================

1) sfs2014-20160912-00016.config1.png (see SPEC SFS2014 results webpage)

Component Software
==================

Item                          Name and
 No   Component    Type       Version       Description
---- ------------ ---------- ------------- -----------------------------------
  1  Spectrum     Spectrum    4.2.1         The Spectrum Scale File System is
     Scale Nodes  Scale File                a distributed file system that
                  System                    runs on both the server nodes and
                                            client nodes to form a cluster.
                                            The cluster allows for the
                                            creation and management of single
                                            namespace file systems.
  2  Spectrum     Operating   Red Hat       The operating system on the client
     Scale Nodes  System      Enterprise    nodes was 64-bit Red Hat
                              Linux 7.2     Enterprise Linux version 7.2.
                              for x86_64
  3  DeepFlash    Storage     2.1.2         The software runs on the IBM
     150          Server                    DeepFlash 150 and is installed
                                            with the included DFCLI tool.

Hardware Configuration and Tuning - Physical
============================================

+----------------------------------------------------------------------+
|                      Spectrum Scale Client Nodes                     |
+----------------------------------------------------------------------+
Parameter Name    Value            Description
----------------- ---------------- ------------------------------------------
verbsPorts        mlx5_0/1/1       InfiniBand device names and port numbers.
                  mlx5_1/1/2
verbsRdma         enable           Enables InfiniBand RDMA transfers between
                                   Spectrum Scale client nodes and server
                                   nodes.
verbsRdmaSend     1                Enables the use of InfiniBand RDMA for
                                   most Spectrum Scale daemon-to-daemon
                                   communication.
Hyper-Threading   disabled         Disables the use of two threads per core
                                   in the CPU. The setting was changed in the
                                   BIOS menus of the client nodes.

+----------------------------------------------------------------------+
|                      Spectrum Scale Server Nodes                     |
+----------------------------------------------------------------------+
Parameter Name    Value            Description
----------------- ---------------- ------------------------------------------
verbsPorts        mlx4_0/1/1       InfiniBand device names and port numbers.
                  mlx4_0/2/2
                  mlx4_1/1/1
                  mlx4_1/2/2
verbsRdma         enable           Enables InfiniBand RDMA transfers between
                                   Spectrum Scale client nodes and server
                                   nodes.
verbsRdmaSend     1                Enables the use of InfiniBand RDMA for
                                   most Spectrum Scale daemon-to-daemon
                                   communication.
scheduler         noop             Specifies the I/O scheduler used for the
                                   DeepFlash 150 block devices.
nr_requests       32               Specifies the I/O block layer request
                                   descriptors per request queue for the
                                   DeepFlash 150 block devices.
Hyper-Threading   disabled         Disables the use of two threads per core
                                   in the CPU. The setting was changed in the
                                   BIOS menus of the server nodes.

Hardware Configuration and Tuning Notes
---------------------------------------

The first three configuration parameters were set using the "mmchconfig"
command on one of the nodes in the cluster. The verbs settings in the tables
above allow for efficient use of the InfiniBand infrastructure. The settings
determine when data are transferred over IP and when they are transferred
using the verbs protocol. The InfiniBand traffic went through two switches,
item 3 in the Bill of Materials.

The block device parameters "scheduler" and "nr_requests" were set on the
server nodes with echo commands for each DeepFlash device. The parameters can
be found at "/sys/block/DEVICE/queue/{scheduler,nr_requests}", where DEVICE is
the block device name.

The last parameter disabled Hyper-Threading on the client and server nodes.
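
For reference, a minimal sketch of how settings of this kind can be applied is
shown below. It is an illustration only, not the exact command history used
for this submission; the node class names ("clientNodes", "serverNodes") and
the block device name ("sdX") are hypothetical placeholders.

    # Verbs/RDMA attributes set with mmchconfig from any node in the cluster.
    mmchconfig verbsPorts="mlx5_0/1/1 mlx5_1/1/2" -N clientNodes
    mmchconfig verbsPorts="mlx4_0/1/1 mlx4_0/2/2 mlx4_1/1/1 mlx4_1/2/2" -N serverNodes
    mmchconfig verbsRdma=enable,verbsRdmaSend=1

    # Block device tuning on the server nodes, repeated for every DeepFlash
    # 150 block device (sdX is a placeholder device name).
    echo noop > /sys/block/sdX/queue/scheduler
    echo 32 > /sys/block/sdX/queue/nr_requests

The verbs attributes typically take effect only after the Spectrum Scale
daemon is restarted on the affected nodes.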

Software Configuration and Tuning - Physical
============================================

+----------------------------------------------------------------------+
|                       Spectrum Scale - All Nodes                      |
+----------------------------------------------------------------------+
Parameter Name          Value      Description
----------------------- ---------- -------------------------------------------
ignorePrefetchLUNCount  yes        Specifies that only maxMBpS, and not the
                                   number of LUNs, should be used to
                                   dynamically allocate prefetch threads.
maxblocksize            1M         Specifies the maximum file system block
                                   size.
maxMBpS                 10000      Specifies an estimate of how many megabytes
                                   of data can be transferred per second into
                                   or out of a single node.
maxStatCache            0          Specifies the number of inodes to keep in
                                   the stat cache.
numaMemoryInterleave    yes        Enables memory interleaving on NUMA based
                                   systems.
pagepoolMaxPhysMemPct   90         Percentage of physical memory that can be
                                   assigned to the page pool.
scatterBufferSize       256K       Specifies the size of the scatter buffers.
workerThreads           1024       Controls the maximum number of concurrent
                                   file operations at any one instant, as well
                                   as the degree of concurrency for flushing
                                   dirty data and metadata in the background
                                   and for prefetching data and metadata.

+----------------------------------------------------------------------+
|                     Spectrum Scale - Server Nodes                     |
+----------------------------------------------------------------------+
Parameter Name          Value      Description
----------------------- ---------- -------------------------------------------
nsdBufSpace             70         Sets the percentage of the page pool that
                                   is used for NSD buffers.
nsdMaxWorkerThreads     3072       Sets the maximum number of threads to use
                                   for block-level I/O on the NSDs.
nsdMinWorkerThreads     3072       Sets the minimum number of threads to use
                                   for block-level I/O on the NSDs.
nsdMultiQueue           64         Specifies the maximum number of queues to
                                   use for NSD I/O.
nsdThreadsPerDisk       3          Specifies the maximum number of threads to
                                   use per NSD.
nsdThreadsPerQueue      48         Specifies the maximum number of threads to
                                   use per NSD I/O queue.
nsdSmallThreadRatio     1          Specifies the ratio of small-I/O thread
                                   queues to large-I/O thread queues.
pagepool                80G        Specifies the size of the cache on each
                                   node. On server nodes the page pool is used
                                   for NSD buffers.

+----------------------------------------------------------------------+
|                     Spectrum Scale - Client Nodes                     |
+----------------------------------------------------------------------+
Parameter Name          Value      Description
----------------------- ---------- -------------------------------------------
pagepool                16G        Specifies the size of the cache on each
                                   node.

Software Configuration and Tuning Notes
---------------------------------------

The configuration parameters were set using the "mmchconfig" command on one of
the nodes in the cluster. The parameters listed in the tables above reflect
values that might be used in a typical streaming environment with Linux nodes.

Service SLA Notes
-----------------

There were no opaque services in use.
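
As a rough illustration of how the attributes above are applied (again, not
the exact command history for this submission), the cluster-wide values can be
set once and the node-specific values set against a node class; "serverNodes"
and "clientNodes" below are hypothetical node class names.

    mmchconfig workerThreads=1024,maxMBpS=10000,maxblocksize=1M,maxStatCache=0
    mmchconfig pagepool=80G,nsdBufSpace=70,nsdMaxWorkerThreads=3072 -N serverNodes
    mmchconfig pagepool=16G -N clientNodes

    # mmlsconfig reports the resulting configuration; most of these
    # attributes take effect after the Spectrum Scale daemon is restarted
    # on the affected nodes.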

Storage and Filesystems
=======================

Item                                                             Stable
 No   Description                           Data Protection      Storage  Qty
---- ------------------------------------- -------------------- -------- -----
  1  64 7 TB LUNs from two DeepFlash 150   Spectrum Scale       Yes        64
     systems.                              synchronous
                                           replication
  2  300 GB 10K mirrored HDD pair in the   RAID-1               No         10
     Spectrum Scale client nodes, used to
     store the OS.

Number of Filesystems   1
Total Capacity          245 TiB
Filesystem Type         Spectrum Scale File System

Filesystem Creation Notes
-------------------------

A single Spectrum Scale file system was created with a 1 MiB block size for
data and metadata, a 4 KiB inode size, a 32 MiB log size, 2 replicas for both
data and metadata, and relatime semantics for access-time updates. The file
system was spread across all of the Network Shared Disks (NSDs). The client
nodes each had an ext4 file system that hosted the operating system.

Storage and Filesystem Notes
----------------------------

Each DeepFlash 150 presented 32 JBOF LUNs to one of the server nodes. An NSD
was created from each LUN. All of the NSDs attached to the first server node
were placed in one failure group, and all of the NSDs attached to the second
server node were placed in a second failure group. The file system was
configured with 2 data and 2 metadata replicas, so a copy of all data and
metadata was present on each DeepFlash 150.

The cluster used a two-tier architecture: the client nodes performed
file-level operations and transmitted data requests to the server nodes, which
performed the block-level operations. In Spectrum Scale terminology the load
generators are NSD clients and the server nodes are NSD servers. The NSDs were
the storage devices specified when creating the Spectrum Scale file system.
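
For illustration only, the NSD and file system layout described above could be
built with NSD stanzas and the standard Spectrum Scale administration commands
sketched below. The device names, NSD names, server host names, stanza file
name, and file system name are hypothetical; the block size, inode size, log
size, and replication values are the ones listed in the Filesystem Creation
Notes, and options not detailed in this report (such as the relatime setting)
are omitted.

    # nsd.stanza -- one entry per DeepFlash 150 LUN (64 in total); LUNs
    # attached to the first server node go to failure group 1, LUNs attached
    # to the second server node go to failure group 2.
    %nsd: device=/dev/sdb nsd=nsd001 servers=server1 usage=dataAndMetadata failureGroup=1
    %nsd: device=/dev/sdc nsd=nsd002 servers=server1 usage=dataAndMetadata failureGroup=1
    ...
    %nsd: device=/dev/sdb nsd=nsd033 servers=server2 usage=dataAndMetadata failureGroup=2

    mmcrnsd -F nsd.stanza
    mmcrfs fs1 -F nsd.stanza -B 1M -m 2 -M 2 -r 2 -R 2 --inode-size 4K -L 32M
    mmmount fs1 -a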

Transport Configuration - Physical
==================================

Item                   Number of
 No   Transport Type   Ports Used  Notes
---- ---------------- ----------- ---------------------------------------------
  1  1 GbE cluster         12      Each node connects to a 1 GbE administration
     network                       network with MTU=1500.
  2  FDR InfiniBand        28      Each client node has 2 FDR links, and each
     cluster network               server node has 4 FDR links, to a shared FDR
                                   InfiniBand cluster network.

Transport Configuration Notes
-----------------------------

The 1 GbE network was used for administrative purposes. All benchmark traffic
flowed through the Mellanox SX6036 InfiniBand switches. Each client node had
two active InfiniBand ports and each server node had four active InfiniBand
ports. Each client node InfiniBand port was on a separate FDR fabric for RDMA
connections between nodes.

Switches - Physical
===================

                                            Total   Used
Item                                        Port    Port
 No   Switch Name          Switch Type      Count   Count  Notes
---- -------------------- ---------------- ------- ------ ---------------------
  1  SMC 8150L2           10/100/1000        50      12    The default
                          Mbps Ethernet                    configuration was
                                                           used on the switch.
  2  Mellanox SX6036 #1   FDR InfiniBand     36      14    The default
                                                           configuration was
                                                           used on the switch.
  3  Mellanox SX6036 #2   FDR InfiniBand     36      14    The default
                                                           configuration was
                                                           used on the switch.

Processing Elements - Physical
==============================

Item
 No   Qty  Type  Location         Description                Processing Function
---- ---- ----- ---------------- -------------------------- --------------------
  1   20  CPU   Spectrum Scale   Intel(R) Xeon(R) CPU        Spectrum Scale
                client nodes     E5-2630 v2 @ 2.60GHz,       client, load
                                 6-core                      generator, device
                                                             drivers
  2    4  CPU   Spectrum Scale   Intel(R) Xeon(R) CPU        Spectrum Scale NSD
                server nodes     E5-2630 v2 @ 2.60GHz,       server, device
                                 6-core                      drivers

Processing Element Notes
------------------------

Each of the Spectrum Scale client nodes had 2 physical processors. Each
processor had 6 cores with one thread per core.

Memory - Physical
=================

                              Size in   Number of
Description                     GiB     Instances   Nonvolatile   Total GiB
---------------------------- --------- ----------- ------------- -----------
Spectrum Scale node system      128         12           V           1536
memory

Grand Total Memory Gibibytes                                          1536

Memory Notes
------------

In the client nodes Spectrum Scale reserves a portion of the physical memory
for file data and metadata caching. In the server nodes a portion of the
physical memory is reserved for NSD buffers. A portion of the memory is also
reserved for buffers used for node-to-node communication.

Stable Storage
==============

Stable writes and commit operations in Spectrum Scale are not acknowledged
until the NSD server receives an acknowledgment of write completion from the
underlying storage system, which in this case is the DeepFlash 150. The
DeepFlash 150 does not have a cache, so writes are acknowledged only after the
data has been written to the flash modules.

Solution Under Test Configuration Notes
=======================================

The solution under test was a Spectrum Scale cluster optimized for streaming
environments. The NSD client nodes were also the load generators for the
benchmark, and the benchmark was executed from one of the client nodes. All of
the Spectrum Scale nodes were connected to a 1 GbE switch and two FDR
InfiniBand switches. Each DeepFlash 150 was connected to a single server node
via four 12 Gbps SAS connections. Each server node had 2 SAS adapters: one
server node had two Avago SAS 9300-8e adapters, and the other server node had
two Avago SAS 9305-16e adapters.

Other Solution Notes
====================

None

Dataflow
========

The 10 Spectrum Scale client nodes were the load generators for the benchmark.
Each load generator had access to the single-namespace Spectrum Scale file
system. The benchmark accessed a single mount point on each load generator,
and each of the mount points corresponded to a single shared base directory in
the file system. The NSD clients processed the file operations, and the data
requests to and from disk were serviced by the Spectrum Scale server nodes.

Other Notes
===========

IBM, IBM Spectrum Scale, and IBM DeepFlash 150 are trademarks of International
Business Machines Corp., registered in many jurisdictions worldwide. Intel and
Xeon are trademarks of the Intel Corporation in the U.S. and/or other
countries. Mellanox is a registered trademark of Mellanox Ltd.

Other Report Notes
==================

None

===============================================================================

Generated on Wed Mar 13 16:52:34 2019 by SpecReport
Copyright (C) 2016-2019 Standard Performance Evaluation Corporation