Guest workload driver

Registered by Joe Talerico

Integrate CBTool into Rally to manage Guest workloads.

VMTask is a perfect intersection point.

Blueprint information

Status:
Started
Approver:
Boris Pavlovic
Priority:
High
Drafter:
Joe Talerico
Direction:
Approved
Assignee:
Boris Pavlovic
Definition:
Approved
Series goal:
None
Implementation:
Started
Milestone target:
None
Started by
Boris Pavlovic

Whiteboard

#Rohan
Can you please give more details about this integration? Are you planning to add a guest agent to manage multiple guest tools (CBTool etc.)?

#Joe Talerico / rook
Rohan - CloudBench (CBTool) currently has a mechanism to control guest tools. I see two possible paths to success here.

Path 1 (more likely to work):
Rally creates the users/tenants/networks and passes that information to CBTool, which creates the Guests within that user's space, runs the benchmark within the guests, and returns the results to Rally.

Path 2 (less likely to work):
Rally creates the users/tenants/networks and the Guests, then passes the guest IP information to CloudBench. This path is more difficult to implement because CloudBench keeps the workload descriptions in its own templates: https://github.com/ibmcb/cbtool/blob/master/configs/templates/PUBLIC_application_instances.txt

Example Netperf workload:
[AI_TEMPLATES : NETPERF]
SUT = netclient->netserver
LOAD_GENERATOR_ROLE = netclient
LOAD_MANAGER_ROLE = netclient
METRIC_AGGREGATOR_ROLE = netclient
CAPTURE_ROLE = netserver
LOAD_BALANCER = $False
LOAD_BALANCER_TARGET_PORT = 80
LOAD_BALANCER_TARGET_URL = unknown
LOAD_BALANCER_TARGET_CHILDREN = 2
NETCLIENT_SETUP1 = cb_check_netperf_client.sh
NETSERVER_SETUP1 = cb_check_netperf_server.sh
START = cb_netperf.sh
LOAD_PROFILE = tcp_stream
LOAD_LEVEL = 1
LOAD_DURATION = uniformIXIXI70I90
# "Special" modifier parameters for the AI NETPERF. These should be set on
# YOUR configuration file, not on this template! Please DO NOT uncomment them
# here.
#SYNC_COUNTER_NAME = synchronization_counter
#CONCURRENT_AIS = 2
#SYNC_CHANNEL_NAME = synchronization_channel
#RUN_COUNTER_NAME = experiment_id_counter

The above describes the number of guests and the role of the guests that will be launched.
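
As a rough illustration of how the SUT line encodes guests and roles, here is a minimal Python sketch (not CBTool code). It assumes roles are separated by "->" and that an optional "N_x_" prefix gives the instance count, as in the '1_x_hadoopmaster->3_x_hadoopslave' value seen in the applist output further down. Treating a bare role name as a single guest is an assumption, and parse_sut is a hypothetical helper name.

```python
def parse_sut(sut):
    """Split a SUT string into (role, count) pairs.

    Hypothetical helper, not part of CBTool.
    """
    roles = []
    for part in sut.split("->"):
        part = part.strip()
        if "_x_" in part:
            # Count-prefixed form, e.g. "3_x_hadoopslave".
            count, role = part.split("_x_", 1)
            roles.append((role, int(count)))
        else:
            # Assumption: a bare role name means one guest.
            roles.append((part, 1))
    return roles


netperf = parse_sut("netclient->netserver")
print(netperf)
print("guests to launch:", sum(count for _, count in netperf))
```

For the Netperf template this gives one netclient and one netserver, i.e. two guests.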

To view all the workloads currently automated with CloudBench, please see: https://github.com/ibmcb/cbtool/tree/master/scripts

# Marcio Silva/ibmcb

Some comments:

1 - Typically, every workload (internally called a "Virtual Application" or "Application Instance"; some examples can be seen in a figure in https://github.com/ibmcb/cbtool/wiki/DOC:-Architecture-Layers) uses pre-created images that are simply imported into Glance.

2 - To help with creating images that contain all the requirements for a given workload, some workloads (but not all, at the moment) ship with a rudimentary "automated installer". It can grab the bits and pieces from multiple repositories (for example, YCSB, Java and Cassandra) and configure them automatically before the image is captured. An example usage of such installers can be seen in the second bullet of STEP 1 in https://github.com/ibmcb/cbtool/wiki/HOWTO:-Preparing-a-VM-to-be-used-with-CBTOOL-on-a-real-cloud. Important: these installers are provided mostly as a guideline/documentation of the steps needed to configure a workload. Since they rely on external repositories, they are not guaranteed to work all the time.

3 - Regarding the "Path 1" mentioned by Joe Talerico: if Rally creates the users/tenants/networks/keypairs, it can then just call the CloudBench API, passing these as parameters. CloudBench will instantiate a new workload and let it run for as long as needed, collecting both Guest OS performance (Host OS is also possible, but it requires special configuration) and Application performance.

An illustrative example:

>>> from lib.api.api_service_client import *
>>> api = APIClient("http://172.16.1.250:7070")
>>> api.applist("TESTSIMCLOUD")
[]
>>> api.appattach("TESTSIMCLOUD", "hadoop", temp_attr_list="credentials=userfromrally-passwdfromrally-tenantfromrally,ssh_key_name=keyfromrally")
>>> api.applist("TESTSIMCLOUD")
[{'temp_attr_list': 'credentials=userfromrally-passwdfromrally-tenantfromrally,keyname=keyfromrally', 'vms_nr': '4', 'load_balancer_target_ip': 'none', 'attempts': '3', 'vms': '6F6B47E4-D450-5DDC-803B-126571DB569D|hadoopmaster|vm_1,7AB7F7DC-6E91-5395-B822-2B2CDF9AE66C|hadoopslave|vm_2,0B85167F-682C-5CA0-B7A3-E13AE9227699|hadoopslave|vm_3,DA8A553F-526B-50C8-98A2-83F01626051A|hadoopslave|vm_4', 'load_generator_role': 'hadoopmaster', 'staging': 'none', 'uuid': '361A5E25-BE40-5FB6-9203-DE2482BBE6B2', 'mgt_002_provisioning_request_sent': '0', 'seconds_before_save': '0', 'aidrs_name': 'none', 'load_balancer_target_children': '2', 'ai_departed': '0', 'load_generator_vm': '6F6B47E4-D450-5DDC-803B-126571DB569D', 'load_generator_target_ip': '81.189.241.102,103.23.161.252,77.48.147.32', 'metric_aggregator_ip': '59.252.175.194', 'cloud_name': 'TESTSIMCLOUD', 'runstate_parallelism': '5', 'credentials': 'userfromrally-passwdfromrally-tenantfromrally', 'ai_arriving': '0', 'vmc_arrived': '4', 'load_profile': 'terasort', 'name': 'ai_2', 'ai_arrived': '0', 'dont_start_qemu_scraper': 'True', 'suts': '1', 'vm_reservations': '0', 'ssh_key_name': 'cbtool_rsa', 'mode': 'controllable', 'vm_creation': 'explicit', 'load_manager_ip': '59.252.175.194', 'execute_parallelism': '6', 'runstate_supported': 'True', 'vm_departed': '0', 'arrival': '1407763887', 'load_balancer_target_vm': 'none', 'load_balancer_target_role': 'none', 'drivers_per_sut': '0', 'aidrs': 'none', 'dont_start_load_manager': 'False', 'notification': 'False', 'command_originated': '1407763884', 'load_generator_target_role': 'hadoopslave', 'cloud_ip': '59.252.175.194', 'sut': '1_x_hadoopmaster->3_x_hadoopslave', 'save_on_attach': 'False', 'load_balancer_target_port': '80', 'load_balancer': 'False', 'vm_failed': '0', 'experiment_id': 'EXP-08-11-2014-09-25-15-AM-EDT', 'capture_supported': 'True', 'metric_aggregator_role': 'hadoopmaster', 'debug_remote_commands': 'False', 'ai_reservations': '0', 'ai_failed': '1', 
'detach_parallelism': '20', 'pattern': 'none', 'login': 'klabuser', 'base_dir': '/home/msilva/cloudbench/lib/auxiliary//../..', 'load_level': 'uniformIXIXI1I3', 'resize_supported': 'True', 'update_frequency': '1', 'vm_arrived': '0', 'hadoopmaster_resize1': 'cb_restart_hadoop_cluster.sh', 'update_attempts': '720', 'start': 'cb_hadoop_job.sh', 'attach_parallelism': '4', 'cloud_hostname': 'F39E13A0-FE8A-50A2-876A-541C3703CA40.simcloud.com', 'type': 'hadoop', 'load_generator_ip': '59.252.175.194', 'username': 'msilva', 'load_generator_target_vm': '7AB7F7DC-6E91-5395-B822-2B2CDF9AE66C,0B85167F-682C-5CA0-B7A3-E13AE9227699,DA8A553F-526B-50C8-98A2-83F01626051A', 'mgt_003_provisioning_request_completed': '0', 'hadoopmaster_setup2': 'cb_start_hadoop_cluster.sh', 'execute_json_filename_prefix': 'cb', 'hadoopmaster_setup1': 'cb_config_hadoop_cluster.sh', 'load_balancer_target_url': 'unknown', 'execute_script_name': 'execute_on_staging.sh', 'run_limit': '100000', 'drivers_nr': '0', 'lifetime': 'none', 'max_ais': '1', 'capture_role': 'hadoopslave', 'replicated_vms': '0', 'load_manager_role': 'hadoopmaster', 'run_application_scripts': 'False', 'vmc_departed': '0', 'load_duration': '60', 'base_type': 'hadoop', 'hadoopslave_resize1': 'cb_restart_hadoop_cluster.sh', 'load_manager_vm': '6F6B47E4-D450-5DDC-803B-126571DB569D', 'vm_destruction': 'explicit', 'metric_aggregator_vm': '6F6B47E4-D450-5DDC-803B-126571DB569D', 'vmc_failed': '0', 'log_output_command': 'True', 'mgt_001_provisioning_request_originated': '1407763884', 'identity': '/home/msilva/cloudbench/lib/auxiliary//../../credentials/cbtool_rsa', 'tracking': 'none', 'counter': '6', 'timeout': '240', 'keyname': 'keyfromrally', 'command': 'aiattach TESTSIMCLOUD hadoop default default none none none credentials=userfromrallypasswdfromrallytenantfromrally,keyname=keyfromrally', 'vm_arriving': '0', 'hadoopslave_setup1': 'cb_config_hadoop_cluster.sh', 'model': 'sim', 'hadoopslave_setup2': 'cb_start_hadoop_cluster.sh'}]
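
To make the input and output of the example above easier to handle from the Rally side, here is a minimal Python sketch with two hypothetical helpers (neither is part of CBTool or Rally). build_temp_attr_list composes the temp_attr_list string in the "credentials=user-password-tenant,key=value" shape used in the call. The call passes ssh_key_name while the recorded output shows keyname, so the attribute name is left as a parameter. parse_vms splits the 'vms' field of an applist entry, assuming each guest is encoded as "uuid|role|name" and guests are comma-separated, as in the output above.

```python
def build_temp_attr_list(user, password, tenant, key_name,
                         key_attr="ssh_key_name"):
    """Compose a temp_attr_list string from Rally-created resources.

    Hypothetical helper. Values containing "-" or "," would be
    ambiguous in this format and are not handled here.
    """
    credentials = "-".join([user, password, tenant])
    return "credentials={},{}={}".format(credentials, key_attr, key_name)


def parse_vms(vms_field):
    """Turn the 'vms' field of an applist entry into a list of dicts."""
    guests = []
    for entry in vms_field.split(","):
        uuid, role, name = entry.split("|")
        guests.append({"uuid": uuid, "role": role, "name": name})
    return guests


attrs = build_temp_attr_list("userfromrally", "passwdfromrally",
                             "tenantfromrally", "keyfromrally")
print(attrs)

vms = parse_vms("6F6B47E4-D450-5DDC-803B-126571DB569D|hadoopmaster|vm_1,"
                "7AB7F7DC-6E91-5395-B822-2B2CDF9AE66C|hadoopslave|vm_2")
print([vm["role"] for vm in vms])
```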

After the Hadoop Virtual Application Instance is deployed, Rally can both control it (appalter, appcapture, appdetach) and collect metrics from it (get_latest_app_data, get_latest_system_data).
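
A minimal sketch of the metric-collection side, in Python. Only the method names get_latest_app_data and get_latest_system_data come from the comment above; the (cloud_name, uuid) argument lists, the return shapes, and the FakeAPI and collect_metrics names are assumptions made so the sketch runs without a CloudBench server. Check them against the real APIClient before relying on this.

```python
class FakeAPI:
    """Hypothetical stand-in for the CloudBench APIClient."""

    def get_latest_app_data(self, cloud_name, uuid):
        # Assumed signature and return shape.
        return {"uuid": uuid, "kind": "app"}

    def get_latest_system_data(self, cloud_name, uuid):
        # Assumed signature and return shape.
        return {"uuid": uuid, "kind": "system"}


def collect_metrics(api, cloud_name, vm_uuids):
    """Fetch the latest application and guest OS samples for each guest."""
    results = {}
    for uuid in vm_uuids:
        results[uuid] = {
            "app": api.get_latest_app_data(cloud_name, uuid),
            "system": api.get_latest_system_data(cloud_name, uuid),
        }
    return results


metrics = collect_metrics(FakeAPI(), "TESTSIMCLOUD", ["vm-uuid-1", "vm-uuid-2"])
print(sorted(metrics))
```

Taking the client as an argument keeps the Rally-side logic testable without a running CloudBench instance.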
