@eadan/hcloud

hCloud

First release
forked from @eadan/k8s-hcloud

Simulating HASH's Cloud Infrastructure

This is a simulation which demonstrates how HASH's cloud infrastructure responds to user requests to run simulations on hCloud. hCloud executes each simulation in a dedicated Kubernetes pod with a fixed allocation of CPU and memory resources.

Requests are generated according to real-world data. The request distribution was uploaded as a dataset — distribution.csv — and represents the proportion of daily requests received in each hour of the day. Simulations are executed as soon as the request is received, unless the cluster does not currently have enough compute resources available. In that case, the request is queued until adequate resources become available. Compute resources become available when another simulation completes, or when a new compute node is added to the cluster by the autoscaler.
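The arrival process described above can be sketched as follows. This is a minimal illustration, not the project's actual behavior code: the hourly shares are placeholders for what distribution.csv would supply, the value of requests_per_day is illustrative, and the assumption that per-minute arrivals are Poisson-distributed around the hourly rate is ours.

```python
import math
import random

# Hypothetical hourly request shares (in the project these come from
# distribution.csv): 24 proportions that sum to 1.0.
hourly_proportion = [1 / 24] * 24

REQUESTS_PER_DAY = 500  # illustrative value for the requests_per_day global


def requests_this_minute(minute_of_day, rng=random):
    """Sample the number of requests arriving in one simulated minute."""
    hour = (minute_of_day // 60) % 24
    # Expected arrivals this minute: daily total * hourly share / 60 minutes.
    lam = REQUESTS_PER_DAY * hourly_proportion[hour] / 60
    # Knuth's inverse-transform Poisson sampler.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1
```

Summing this sampler over the 1,440 minutes in a simulated day yields a request count close to requests_per_day, shaped by the hourly distribution.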

The parameters controlling this simulation may be tuned through the globals.json file. These parameters are:

  1. requests_per_day: the number of user requests received per day.
  2. experiment_time_seconds: a triangular distribution which specifies the duration of each simulation run. This distribution is parameterised by a min, mode and max, in seconds.
  3. node_specs: the number of CPU cores and gigabytes of memory in each compute node in the cluster.
  4. pod_specs: the number of CPU cores and gigabytes of memory allocated to the pod for each simulation run.
  5. autoscale_bounds: the autoscaler automatically adds and removes nodes in an effort to maintain cluster utilisation between the given bounds. If cluster utilisation exceeds max_util, then one node is added to the cluster. Similarly, if utilisation drops below min_util, then one node is removed from the cluster. At each time step, at most one node will be removed or added.
  6. autoscale_delay_minutes: the time delay before a request to add a node to the cluster is fulfilled. There is no delay when removing a node.
  7. initial_num_nodes: the starting number of nodes in the cluster.
  8. min_num_nodes: the minimum number of nodes required to be in the cluster at all times.
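A globals.json exercising these parameters might look like the following. The field names match the list above, but the values and the nested shapes (e.g. min/mode/max for the triangular distribution, cpu/memory for the specs) are illustrative guesses; the project's own globals.json is authoritative.

```json
{
  "requests_per_day": 500,
  "experiment_time_seconds": { "min": 60, "mode": 300, "max": 1800 },
  "node_specs": { "cpu": 8, "memory": 32 },
  "pod_specs": { "cpu": 2, "memory": 4 },
  "autoscale_bounds": { "min_util": 0.4, "max_util": 0.8 },
  "autoscale_delay_minutes": 5,
  "initial_num_nodes": 3,
  "min_num_nodes": 1
}
```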

Each step in the simulation represents a passing of one minute.
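The per-step autoscaling rule from parameters 5, 6, and 8 can be sketched like this. All names mirror the globals, the values are illustrative, and the bookkeeping (a countdown list for node-add requests in flight) is our own assumption about how the delay might be modelled.

```python
# Illustrative values for the corresponding globals.
MIN_UTIL, MAX_UTIL = 0.4, 0.8
AUTOSCALE_DELAY_MINUTES = 5
MIN_NUM_NODES = 1


def autoscale_step(num_nodes, used_cores, cores_per_node, pending_adds):
    """One simulated minute of autoscaling.

    pending_adds is a list of countdown timers (in minutes) for node-add
    requests still in flight. Returns the updated (num_nodes, pending_adds).
    """
    # Fulfil any add request whose delay has elapsed.
    pending_adds = [t - 1 for t in pending_adds]
    num_nodes += sum(1 for t in pending_adds if t <= 0)
    pending_adds = [t for t in pending_adds if t > 0]

    # At most one node is added or removed per step.
    util = used_cores / (num_nodes * cores_per_node)
    if util > MAX_UTIL:
        # Request one new node; it only joins after the delay.
        pending_adds.append(AUTOSCALE_DELAY_MINUTES)
    elif util < MIN_UTIL and num_nodes > MIN_NUM_NODES:
        # Removal is immediate and respects the cluster minimum.
        num_nodes -= 1
    return num_nodes, pending_adds
```

Note the asymmetry described above: removals take effect in the same step, while additions only count toward capacity once autoscale_delay_minutes steps have passed.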

The Analysis tab contains several plots illustrating how the cluster evolves as experiment requests are received and executed.

  1. Experiment Pods shows the number of actively running and queued simulation experiments at each time step.
  2. Cluster CPU Usage shows the total number of CPU cores in the cluster and the current utilisation at each time step.
  3. Cluster Memory Usage shows the total amount of memory in the cluster and the current utilisation at each time step.
  4. Number of Nodes shows the number of compute nodes in the cluster at each time step.