Parallelism
Running an Experiment on several machines at once is straightforward with the SigOpt API. Before you run experiments in parallel, make sure you know how to set up Runs and how to create an Experiment.

Create an Experiment

Create your experiment on a controller machine; you only need to perform this step once. Make a note of your experiment's ID, because you'll need it in subsequent steps.
Then set the parallel_bandwidth field to the number of parallel workers you are using in your experiment. This field lets SigOpt take your level of parallelism into account and provide better run configurations.

Initialize the Workers

Initialize each of your workers with the EXPERIMENT_ID from the experiment that you just created. All workers, whether individual threads or machines, will receive the same experiment ID.

Run SigOpt Optimization Experiments in Parallel

Now, start the optimization loop on each worker machine. Workers will individually communicate with SigOpt's API, creating Runs, and evaluating your metric.
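As a rough sketch of what each worker might run, the loop below assumes the SigOpt Python client exposes `sigopt.get_experiment`, `experiment.loop()`, `run.params`, and `run.log_metric`, and that the experiment ID reaches the worker through an `EXPERIMENT_ID` environment variable. `evaluate_model` is a hypothetical stand-in for your own training code, and `main` is only an illustration of how a worker could be wired up; check the client documentation for the exact calls in your version.

```python
import os


def evaluate_model(hidden_layer_size, activation_function):
    """Hypothetical stand-in for training a model and returning holdout accuracy."""
    bonus = 0.05 if activation_function == "relu" else 0.0
    return min(1.0, hidden_layer_size / 1024 + bonus)


def run_worker(experiment):
    """Request Runs until the experiment stops yielding them.

    `experiment` is assumed to behave like the object returned by the
    SigOpt client: loop() yields Runs that work as context managers.
    """
    completed = 0
    for run in experiment.loop():
        with run:
            accuracy = evaluate_model(
                run.params.hidden_layer_size,
                run.params.activation_function,
            )
            run.log_metric("holdout_accuracy", accuracy)
        completed += 1
    return completed


def main():
    # Imported here so the sketch can be read and tested without the client installed.
    import sigopt

    # Assumption: every worker is started with the same EXPERIMENT_ID.
    experiment = sigopt.get_experiment(os.environ["EXPERIMENT_ID"])
    run_worker(experiment)
```

Every worker runs the same code; the only shared state is the experiment ID, which is why no explicit coordination between workers appears in the loop.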

Why This Works

A major benefit of SigOpt's parallelization is that each worker communicates asynchronously with the SigOpt API, so you do not need to manage tasks yourself.
SigOpt acts as a distributed scheduler for your Runs, ensuring that each worker machine receives parameter assignments at the moment it asks for a new parameter configuration. SigOpt tracks which Runs are currently active, so machines running jobs independently will not receive duplicate configurations.
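The idea can be illustrated with a toy model. The sketch below is not SigOpt's implementation: `ToyScheduler` is a hypothetical in-memory stand-in for the API that hands each worker thread the next unused configuration under a lock, so workers that ask at different times never receive the same one.

```python
import itertools
import threading


class ToyScheduler:
    """Hypothetical in-memory stand-in for a central suggestion service."""

    def __init__(self, budget):
        sizes = range(32, 513, 32)
        activations = ["relu", "tanh"]
        # Pre-generate distinct configurations, capped at the budget.
        grid = itertools.product(sizes, activations)
        self._pending = list(itertools.islice(grid, budget))
        self._lock = threading.Lock()

    def next_configuration(self):
        # The lock makes "pick one and hand it out" a single atomic step.
        with self._lock:
            return self._pending.pop() if self._pending else None


def worker(scheduler, results):
    # Each worker asks for work whenever it is free; no worker waits on another.
    while True:
        config = scheduler.next_configuration()
        if config is None:
            break
        results.append(config)


def run_workers(budget=30, parallel_bandwidth=2):
    scheduler = ToyScheduler(budget)
    results = []
    threads = [
        threading.Thread(target=worker, args=(scheduler, results))
        for _ in range(parallel_bandwidth)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_workers()` returns one entry per budgeted configuration with no repeats, regardless of how the threads interleave.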

Example Setups

In these examples, each machine must be configured with a SigOpt API token.

Scenario 1:

The user has a code repository, a local computer (e.g., a MacBook), and a group of remote machines with copies of the code repository.
On your local computer, create an experiment.yml file with the following contents:
    name: sigopt parallel example
    parameters:
      - name: hidden_layer_size
        type: int
        bounds:
          min: 32
          max: 512
      - name: activation_function
        type: categorical
        categorical_values: ['relu', 'tanh']
    metrics:
      - name: holdout_accuracy
        strategy: optimize
        objective: maximize
        threshold: 0.1
    parallel_bandwidth: 2
    budget: 30
Create an Experiment using the CLI command:
    $ sigopt create experiment
Connect to each remote machine (e.g., via ssh) and start a parallel worker with the CLI command below, replacing 1234 with your experiment's ID:
    $ sigopt start-worker 1234 python run-model.py

Scenario 2:

The user has a code repository, a coordination host (e.g., a local or remote machine), and a group of remote machines with copies of the code repository.
On the coordination host, create an Experiment:
    import sigopt

    experiment = sigopt.create_experiment(
        name="sigopt parallel example",
        parameters=[
            dict(name="hidden_layer_size", type="int", bounds=dict(min=32, max=512)),
            dict(name="activation_fn", type="categorical", categorical_values=["relu", "tanh"]),
        ],
        metrics=[
            dict(name="holdout_accuracy", strategy="optimize", objective="maximize"),
            dict(name="inference_time", strategy="constraint", objective="minimize", threshold=0.1),
        ],
        parallel_bandwidth=1,
        budget=30,
    )
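Start parallel workers:
    # run_command_on_machine is a placeholder for however you launch commands on
    # your remote machines (e.g., over ssh); it is not part of the SigOpt client.
    for machine_number in range(experiment.parallel_bandwidth):
        run_command_on_machine(
            machine_number,
            f"sigopt start-worker {experiment.id} python run-model.py",
        )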
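The helper `run_command_on_machine` is left undefined in this scenario. One possible sketch, assuming passwordless ssh access, a remote shell that starts in the code repository, and a hypothetical `WORKER_HOSTS` list that maps machine numbers to host names, launches the command with `subprocess.Popen`:

```python
import subprocess

# Hypothetical host names; replace with your own worker machines.
WORKER_HOSTS = ["worker-0.example.com", "worker-1.example.com"]


def build_ssh_command(machine_number, command):
    """Return the argv list that runs `command` on the chosen host over ssh."""
    return ["ssh", WORKER_HOSTS[machine_number], command]


def run_command_on_machine(machine_number, command):
    """Start `command` on a remote machine without waiting for it to finish."""
    return subprocess.Popen(build_ssh_command(machine_number, command))
```

Because `Popen` returns immediately, a loop over machines starts every worker without blocking; call `wait()` on the returned handles if the coordination host should pause until the workers finish.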