Parallelism
Running an Experiment on several machines at once is easy and natural with the SigOpt API. Before you start running experiments in parallel, make sure you know how to create and run an experiment.
Create an Experiment
Create your experiment on a controller machine; you only need to perform this step once. Be sure to make a note of your experiment's id because you'll need it in subsequent steps.
Then set the parallel_bandwidth field to the number of parallel workers you are using in your experiment. This field lets SigOpt take your level of parallelism into account and provide better suggestions.
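As a minimal sketch of this step, the helper below builds the keyword arguments for creating an experiment sized for a given number of workers. The parameter names, metric name, and budget are illustrative assumptions, not values from this guide; only `parallel_bandwidth` is the field described above.

```python
def build_experiment_config(num_workers, observation_budget=60):
    """Return keyword arguments for an experiment run by num_workers workers."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return {
        "name": "Parallel experiment (illustrative)",
        # Hypothetical search space; replace with your own parameters.
        "parameters": [
            {"name": "learning_rate", "type": "double",
             "bounds": {"min": 1e-4, "max": 1.0}},
            {"name": "num_layers", "type": "int",
             "bounds": {"min": 1, "max": 8}},
        ],
        # Hypothetical metric name.
        "metrics": [{"name": "accuracy", "objective": "maximize"}],
        "observation_budget": observation_budget,
        # One open Suggestion per worker at any time.
        "parallel_bandwidth": num_workers,
    }


config = build_experiment_config(num_workers=4)

# On the controller, assuming the sigopt Python client is installed:
#   from sigopt import Connection
#   experiment = Connection(client_token=YOUR_API_TOKEN).experiments().create(**config)
#   print(experiment.id)  # note this ID for the workers
```

Keeping the configuration in one function makes it harder for `parallel_bandwidth` and the actual worker count to drift apart.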
Initialize the Workers
Initialize each of your workers with the EXPERIMENT_ID from the experiment that you just created. All workers, whether individual threads or machines, will receive the same experiment ID.
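One way to hand the ID to a worker, sketched here under the assumption that it is passed through an `EXPERIMENT_ID` environment variable (the delivery mechanism is a choice of this example, not something SigOpt prescribes):

```python
import os


def read_experiment_id(environ=os.environ):
    """Return the experiment ID this worker should join, or fail loudly."""
    experiment_id = environ.get("EXPERIMENT_ID", "").strip()
    if not experiment_id:
        raise RuntimeError(
            "EXPERIMENT_ID is not set; copy it from the controller machine"
        )
    return experiment_id
```

Failing at startup when the variable is missing avoids a worker silently creating or joining the wrong experiment.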
Run SigOpt Optimization Experiments in Parallel
Now, start the optimization loop on each worker machine. Each worker communicates with SigOpt's API on its own, creating Suggestions, evaluating your metric, and reporting Observations.
Why This Works
A large benefit of SigOpt's parallelization is that each worker communicates asynchronously with the SigOpt API, so you do not need to worry about task management.
SigOpt acts as a distributed scheduler for your SigOpt Suggestions, ensuring that each worker machine receives parameter assignments at the moment it asks for a new parameter configuration. SigOpt tracks which Suggestions are currently open, so machines independently creating Suggestions will not receive duplicates.
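The bookkeeping idea can be illustrated with a toy in-memory model. This is not SigOpt's implementation and makes no API calls; `ToyScheduler` and `demo` are hypothetical names used only to show how tracking open items lets independent workers avoid duplicates without coordinating with each other.

```python
import threading


class ToyScheduler:
    """Toy stand-in for a service that tracks open suggestions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self.open = {}  # suggestion id -> name of the worker holding it

    def create_suggestion(self, worker):
        # Each request gets a fresh ID, recorded as open until it is reported.
        with self._lock:
            self._next_id += 1
            self.open[self._next_id] = worker
            return self._next_id

    def report_observation(self, suggestion_id):
        # Reporting a result closes the suggestion.
        with self._lock:
            del self.open[suggestion_id]


def demo(num_workers=4, per_worker=25):
    scheduler = ToyScheduler()
    received = []
    received_lock = threading.Lock()

    def work(name):
        for _ in range(per_worker):
            suggestion_id = scheduler.create_suggestion(name)
            with received_lock:
                received.append(suggestion_id)
            scheduler.report_observation(suggestion_id)

    threads = [
        threading.Thread(target=work, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received, scheduler.open
```

Workers never talk to each other here; all coordination happens inside the scheduler, which mirrors the division of responsibility described above.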
Example Setups
In these examples, each machine needs to be configured with an API token.
Core Module
These code snippets show a suggested controller/worker division of labor, and how to incorporate metadata to track which machines have reported Observations.
Controller: Create Experiment, Spin Up Workers
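A minimal controller sketch follows. It assumes `conn` is a `sigopt.Connection` (or any object with the same `experiments().create(...)` call), and that workers are started as local threads; `run_controller` and `worker_fn` are hypothetical names. On a real cluster you would replace the thread launch with whatever starts your remote machines.

```python
import threading


def run_controller(conn, experiment_config, worker_fn, num_workers):
    """Create one experiment, then start num_workers workers on its ID."""
    # Keep parallel_bandwidth in step with the number of workers launched.
    config = dict(experiment_config, parallel_bandwidth=num_workers)
    experiment = conn.experiments().create(**config)

    # Every worker receives the same experiment ID plus its own name.
    threads = [
        threading.Thread(target=worker_fn, args=(experiment.id, f"worker-{i}"))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return experiment.id
```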
Worker: Run Optimization Loop with Metadata
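The worker loop might look like the sketch below. It assumes `conn` is a `sigopt.Connection`, that the experiment has a single metric named `accuracy` (a hypothetical name), and that `evaluate_model` is your own function mapping assignments to a metric value. The `host_name` metadata key is an illustrative choice.

```python
def run_worker(conn, experiment_id, host_name, evaluate_model,
               metric_name="accuracy"):
    """Request Suggestions and report Observations until the budget is used."""
    reported = 0
    experiment = conn.experiments(experiment_id).fetch()
    while experiment.progress.observation_count < experiment.observation_budget:
        # Tag the Suggestion so it is clear which machine is working on it.
        suggestion = conn.experiments(experiment_id).suggestions().create(
            metadata={"host_name": host_name},
        )
        value = evaluate_model(suggestion.assignments)
        # Tag the Observation with the same host name.
        conn.experiments(experiment_id).observations().create(
            suggestion=suggestion.id,
            values=[{"name": metric_name, "value": value}],
            metadata={"host_name": host_name},
        )
        reported += 1
        # Re-fetch so the loop sees Observations reported by other workers.
        experiment = conn.experiments(experiment_id).fetch()
    return reported
```

Because the budget check uses the experiment's shared observation count, each worker stops once the experiment as a whole is finished. Several workers can pass the check at the same moment, so the final count may exceed the budget by a few Observations.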
Recovering Open Suggestions
In the event that one or more of your machines fail, you may have a Suggestion or two in an open state. You can list open Suggestions and continue to work on them:
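A sketch of resuming open Suggestions, assuming `conn` is a `sigopt.Connection`, a single metric named `accuracy` (hypothetical), and your own `evaluate_model` function:

```python
def resume_open_suggestions(conn, experiment_id, evaluate_model,
                            metric_name="accuracy"):
    """Evaluate every open Suggestion and report an Observation for it."""
    open_suggestions = conn.experiments(experiment_id).suggestions().fetch(
        state="open",
    )
    resumed = []
    # iterate_pages() walks through all pages of the paginated result.
    for suggestion in open_suggestions.iterate_pages():
        value = evaluate_model(suggestion.assignments)
        conn.experiments(experiment_id).observations().create(
            suggestion=suggestion.id,
            values=[{"name": metric_name, "value": value}],
        )
        resumed.append(suggestion.id)
    return resumed
```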
Or you can simply delete open Suggestions:
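Deleting them could be wrapped as follows, again assuming `conn` is a `sigopt.Connection`; `discard_open_suggestions` is a hypothetical helper name:

```python
def discard_open_suggestions(conn, experiment_id):
    """Delete all Suggestions of the experiment that are still open."""
    conn.experiments(experiment_id).suggestions().delete(state="open")
```

Deleting is the simpler choice when the failed machines' partial work cannot be recovered.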
AI Module
Scenario 1:
The user has a code repository, a local computer (e.g., a MacBook), and a group of remote machines with copies of the code repository.
On your local computer, create an experiment.yml file with the following contents:
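The original file contents are not reproduced here. As a hedged illustration, the script below writes an experiment.yml with a hypothetical search space and metric; the field layout (`parameters`, `metrics`, `parallel_bandwidth`, `budget`) is an assumption based on the experiment fields discussed above and should be checked against the AI Module reference.

```python
from pathlib import Path

# Illustrative template; names and bounds are placeholders.
EXPERIMENT_YML = """\
name: Parallel experiment (illustrative)
parameters:
  - name: learning_rate
    type: double
    bounds:
      min: 0.0001
      max: 1.0
metrics:
  - name: accuracy
    strategy: optimize
    objective: maximize
parallel_bandwidth: {num_workers}
budget: {budget}
"""


def write_experiment_yml(path, num_workers, budget=30):
    """Write the experiment definition and return the rendered text."""
    text = EXPERIMENT_YML.format(num_workers=num_workers, budget=budget)
    Path(path).write_text(text)
    return text
```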
Create an Experiment using the CLI command:
Remotely connect to each of the remote machines (e.g., via ssh) and start parallel workers with the CLI command:
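A sketch of how the per-machine commands could be assembled. It assumes the worker subcommand is `sigopt start-worker EXPERIMENT_ID <your command>`, that `run.py` is your training script (a hypothetical name), and that the remote shell already starts in the repository directory. The function only builds argument lists; pass each one to `subprocess.run` to execute it.

```python
import shlex


def worker_commands(hosts, experiment_id, entrypoint=("python", "run.py")):
    """Build one ssh command per host that starts a worker for the experiment."""
    remote = shlex.join(
        ["sigopt", "start-worker", str(experiment_id), *entrypoint]
    )
    return [["ssh", host, remote] for host in hosts]
```

For example, `worker_commands(["gpu-1"], 12345)` returns `[["ssh", "gpu-1", "sigopt start-worker 12345 python run.py"]]`.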
Scenario 2:
The user has a code repository, a coordination host (e.g., a local or remote machine), and a group of remote machines with copies of the code repository.
On the coordination host, create an Experiment:
Start parallel workers:
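A worker in this scenario might run a loop like the sketch below. It assumes `experiment` is the object returned by `sigopt.create_experiment(...)` on the coordination host, or by `sigopt.get_experiment(EXPERIMENT_ID)` on a remote machine, and that the metric is named `accuracy` (hypothetical). `evaluate_model` is your own function.

```python
def run_parallel_worker(experiment, evaluate_model, metric_name="accuracy"):
    """Work through runs of a shared experiment until none are left."""
    completed = 0
    # Each iteration yields a new run with suggested parameter values.
    for run in experiment.loop():
        # The context manager ends the run when the block exits.
        with run:
            value = evaluate_model(run.params)
            run.log_metric(metric_name, value)
        completed += 1
    return completed
```

Every machine runs the same function against the same experiment, so adding a worker only requires giving one more machine the experiment ID.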