.. _setup_cloud:

***********************************
Set up Your Cloud for the Benchmark
***********************************

.. role:: bash(code)
    :language: bash
    
The SPEC Cloud IaaS 2016 Benchmark assumes that you know how to set up your own cloud.
You define all aspects of what SPEC calls the 'system under test' (SUT).

The benchmark also assumes the following:

   * Your SUT meets the minimum functional and configuration requirements.
   * Your SUT can access the Internet to retrieve updates and patches, or you can transfer updates and patches to it as needed.
   * You know how to edit text configuration files.

Before you begin, make sure that you meet the basic requirements
described below; they are needed for a successful run.

Basic Cloud Requirements
=========================

   * Three physical machines with or without virtualization software.
   * Cloud management software (e.g., OpenStack or a public cloud).
   * Enough disk storage to hold 50 GB of database files and logs.

Operating System Requirements
=============================

   * A \*Nix-compliant base operating system for instances that supports the :bash:`sh` or :bash:`bash` shell.
   * The same \*Nix user account and password across all instances (e.g., set up :bash:`cbuser` as the \*Nix account in instance images).
   * Remote access to all cloud instances via :bash:`ssh` client commands without interactive prompts.
     If cloud instances are behind a firewall, access must be set up through a jump box or a VPN to the benchmark harness machine.
   * :bash:`sudo` installed and configured to allow the benchmark user (:bash:`cbuser`) to perform admin-level tasks.
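
The access requirements above can be checked from the benchmark harness machine
before a run. The following is a minimal sketch and not part of the benchmark kit:
the function name ``check_instance`` and the ``SSH`` override variable are
hypothetical, and :bash:`cbuser` is the example account named above.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical helper: verify non-interactive ssh and sudo on one instance.
    # SSH may be overridden, for example to point at a wrapper script.
    SSH=${SSH:-ssh}

    check_instance() {
        local host=$1 user=${2:-cbuser}
        # BatchMode=yes makes ssh fail instead of prompting for input;
        # "sudo -n true" fails if sudo would need to ask for a password.
        $SSH -o BatchMode=yes -o ConnectTimeout=10 "${user}@${host}" 'sudo -n true'
    }

Calling ``check_instance <instance-address>`` returns success only if both the
login and the :bash:`sudo` call complete without a prompt. If the instances sit
behind a jump box, the OpenSSH ``-J`` (ProxyJump) option is one way to route the
same check through it.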


Suggested Storage Space For Workload and Benchmark Harness Machines
===================================================================

The benchmark has two workloads, KMeans and YCSB. Each workload
has two 'roles' defined in CBTOOL. These roles roughly correspond to the
load/data generator (ycsb/hadoop name node) and the workload itself (cassandra/hadoop
cluster). The roles are ycsb and seed (Cassandra seed node) for YCSB, and
hadoopmaster and hadoopslave for KMeans.

The storage requirements for these roles are defined below.

==================   =================   ============================
  Role               Local free space         Usage Considerations
==================   =================   ============================
YCSB                        50 GB        Hold runtime log files for YCSB
SEED                        50 GB        Hold NoSQL database
HADOOPMASTER                50 GB        Hold runtime log files and KMeans driver
HADOOPSLAVE                 50 GB        Hold data and log files
==================   =================   ============================

The recommended disk size for the benchmark harness machine, which runs CBTOOL and
the benchmark drivers, is 50 GB. This machine also stores the
results of each experiment.
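
A quick way to confirm the free space on an instance image or on the harness
machine is to compare the output of :bash:`df` against the 50 GB figure. This
is a sketch only: the function name ``has_free_space`` is hypothetical, and it
assumes that 50 GB can be treated as 50 GiB for the comparison.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical helper: succeed if the filesystem holding PATH has at
    # least MIN_GB gibibytes available (default 50).
    has_free_space() {
        local path=$1 min_gb=${2:-50}
        local avail_kb
        # -P prints one POSIX-format line per filesystem; -k uses 1024-byte
        # blocks. Column 4 of the second line is the available space.
        avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
        [ "$avail_kb" -ge $(( min_gb * 1024 * 1024 )) ]
    }

For example, ``has_free_space /var 50 || echo "not enough space"`` reports a
shortfall on the filesystem that holds ``/var``; substitute the directory where
your data and logs are actually stored.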


Cloud Management Software
==========================

The SPEC Cloud IaaS 2016 Benchmark 
has been tested with the following cloud platforms.

* Amazon EC2
* DigitalOcean
* Google Compute Engine
* OpenStack (Juno, Kilo, Liberty, Mitaka, Newton; newer versions should also work if the OpenStack API has not changed)
* Rackspace
* IBM SoftLayer

The SPEC Cloud IaaS 2016 Benchmark incorporates normal cloud management
tasks into its workload sequence. These include creating and deleting
instances from one or more instance images. The list above shows the cloud
management systems (also known as cloud managers) that SPEC tested during the
development cycle and considers supported. If your cloud manager is not on
this list, you need to write your own set of adapters for CBTOOL and
have them reviewed by the SPEC OSG Cloud Subcommittee before submitting
results.


Cloud Management Interface and Adapters
---------------------------------------

The SPEC Cloud IaaS 2016 Benchmark manager, CBTOOL, uses a defined set of
cloud and benchmark management tasks during the test sequence.  When you
build a new adapter, you should identify the corresponding capabilities 
or command sequences that implement tasks such as:

#. Provision instance - create compute instances and, optionally, install required software;
#. Provision storage for instances;
#. Provision application instance - distribute generated workload configuration files, start workload specific services, and determine service availability;
#. Start/stop specific load driver for an application instance;
#. Monitor application instance availability and responsiveness during workload runs;
#. Collect workload results (log files, or command line responses);
#. Stop workload servers;
#. Destroy application instance(s);
#. Destroy instances.


NTP Server and Timezone for White Box Cloud 
===========================================

For non-instance machines in a white box cloud, use of the UTC timezone
is recommended. These machines should get their time via NTP from the 
same NTP server(s) as the test instances.  However, this is not required 
for a compliant run.
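
The timezone recommendation can be checked with a one-line test on each
non-instance machine. This is a minimal sketch; the function name ``is_utc`` is
hypothetical, and the check only looks at the timezone abbreviation reported by
:bash:`date`.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical helper: succeed if the local timezone abbreviation is UTC.
    is_utc() {
        [ "$(date +%Z)" = "UTC" ]
    }

    # Print a warning on stderr when the machine is not set to UTC.
    is_utc || echo "warning: timezone is $(date +%Z), not UTC" >&2

On systemd-based hosts, :bash:`timedatectl` also reports the configured
timezone and whether the system clock is synchronized; other NTP clients
provide their own status commands.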

Block Storage Support
=====================

Most clouds support some form of block storage, which allows additional block-based
volumes to be attached to running instances. These volumes can be backed in many
ways, such as NAS devices, SAN-based LUNs, or network filesystems. They can be attached to
all kinds of instances, including VMs, containers, and bare-metal machines.

As long as the CBTOOL adapter for the cloud you are benchmarking supports attaching these
volumes at benchmark runtime, CBTOOL makes use of them automatically. If you configure
CBTOOL to attach an extra volume, then during the application configuration step CBTOOL
scans all volumes visible from within the instance that host neither the root filesystem
nor swap. If it finds one larger than 1 GB (a threshold that also excludes cloud-init
volumes), it automatically creates a filesystem on the volume and instructs both Cassandra
and HDFS to store their data on it.

By default, none of the CBTOOL cloud adapters attaches a volume; the user must request this
explicitly. When a volume is requested, CBTOOL uses it, and the effects of doing so
(even if the volume is slightly slower) will almost certainly appear in the results
reported to SPEC at submission time.

If you are preparing such a submission, it is very important to disclose this configuration
in the YAML comments of your submission, even though the benchmark collects this information
from within the instance. It must be obvious during the review process, from the YAML and
not only from the CBTOOL configuration used before submission, that your instances are
configured this way.
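
To predict which volume would qualify under the rule described above, you can
list the block devices inside an instance and filter them by hand. The sketch
below only approximates that rule and is not CBTOOL's own code: the function
name ``candidate_volumes`` is hypothetical, 1 GB is treated as 1 GiB, and the
filter assumes that :bash:`lsblk` lists partitions directly after their parent disk.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical helper, not CBTOOL code: read "NAME TYPE SIZE [MOUNTPOINT]"
    # lines (children listed after their parent disk) and print the disks
    # that are larger than 1 GiB and hold neither the root filesystem nor swap.
    candidate_volumes() {
        awk -v min=$(( 1024 * 1024 * 1024 )) '
            function emit() { if (disk != "" && ok) print disk }
            $2 == "disk"                { emit(); disk = $1; ok = ($3 > min) }
            $4 == "/" || $4 == "[SWAP]" { ok = 0 }
            END                         { emit() }
        '
    }

    # Raw (-r), headerless (-n) listing with sizes in bytes (-b).
    if command -v lsblk >/dev/null 2>&1; then
        lsblk -b -n -r -o NAME,TYPE,SIZE,MOUNTPOINT | candidate_volumes
    fi

Any device name printed here is a plausible target for the extra volume; an
empty result suggests that no additional volume is attached to the instance.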