Search for answers to common questions. Don't see your question here? Shoot us a message!
A parameter’s importance value is a measure of that parameter’s capacity to predict optimized metric values. It is estimated by fitting an ExtraTreesRegressor and using the feature_importances_ of the fit model, which reflect how much each parameter reduces impurity across the trees. If you have two optimized metrics, the importances are computed independently for each metric.
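To make the mechanism concrete, here is a minimal sketch of the same idea outside SigOpt: fit an ExtraTreesRegressor on (parameter values, metric value) pairs and read off the impurity-based feature_importances_. The data and the dependence on the first parameter are invented purely for illustration:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)

# Hypothetical observation history: 200 observations of 3 parameters.
X = rng.uniform(size=(200, 3))
# A made-up metric that depends strongly on the first parameter only.
y = 2.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.05, size=200)

model = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances; nonnegative and normalized to sum to 1.
importances = model.feature_importances_
```

Here the first parameter dominates the metric, so it receives the largest importance value.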
You can view your parameter importances via your experiment dashboard, or using Python:
import sigopt

# API_TOKEN and EXPERIMENT_ID are placeholders for your own values.
conn = sigopt.Connection(client_token=API_TOKEN)
metric_importances = conn.experiments(EXPERIMENT_ID).metric_importances().fetch()
Yes, you can! SigOpt is framework agnostic, and lightweight enough to plug into existing modeling workflows. All you need is a SigOpt account and the ability to connect to our API. See here to get started. If you’re looking to bring your own optimization engine, check out this page to see how we can support that.
Does the type of a parameter impact the difficulty of the optimization process in any way? Should we take this into account when setting our experiment budget? Are any types more, or less, difficult for SigOpt to search through?
The types of the parameters within a given experiment definitely affect the way SigOpt approaches the optimization problem. int and double parameters are treated similarly, but categorical parameters lack the ordinal relationships we can exploit and learn from as effectively, which is why we suggest a higher experiment budget for them and enforce a stricter limit on the number of categorical values you can search through. Defining a log transformation when the distribution of a parameter is known also makes the optimization “less difficult,” in the sense that SigOpt will most likely find a performant solution in fewer iterations. Check out our best practices page for more details!
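For instance, a double parameter like a learning rate, whose useful values span several orders of magnitude, can be declared with a log transformation. The parameter name and bounds below are purely illustrative:

```python
# Illustrative parameter definition; the name and bounds are made up.
parameter = dict(
    name="learning_rate",
    type="double",
    bounds=dict(min=1e-5, max=1.0),
    transformation="log",  # search in log space rather than linear space
)
```

Note that log-transformed parameters need strictly positive bounds.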
For parameters of type double or int, should we take the width of the specified range into account when setting our experiment budget? The wider the range, the more difficult it should be to find the best value, right?
Technically yes, the larger the range, the larger the space SigOpt will have to search through to find optimal areas. However, we are much more impacted by the dimension of the problem, rather than the range of any particular parameter. If you already know with some certainty where your parameters are most performant, narrowing the range will speed up the process, but assuming that you’re not setting an abnormally broad range, we should be able to learn where the “good” values are within the suggested budget.
Could you give us more details about why we should run more observations when using parallelism (even to get the same result in less wall-clock time)?
Bayesian optimization is classically a sequential process – it uses results from previous observations to update its beliefs. If we have 100 observations performed in series, the results from each one can inform the next suggestion, so each suggestion has the maximal amount of information to improve upon: the first observation helps build the second, the third builds off of that, and so on. By the time we get to the 11th observation, it has the knowledge of 10 increasingly improved configurations behind it. If, however, we have 100 observations performed in parallel batches of 10, then the first 10 are essentially naive, and the next 10 only have the knowledge of 10 naive observations to build on (instead of 10 increasingly informed ones). So parallelism is beneficial, but with diminishing returns.
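One toy way to quantify this, assuming perfectly synchronous batches: observation i (1-indexed) can only learn from batches that finished before it started. This sketch is our own simplification, not SigOpt's internal accounting:

```python
def informed_prior_count(i, batch_size):
    """Number of completed observations whose results can inform
    observation i (1-indexed), assuming synchronous batches."""
    return ((i - 1) // batch_size) * batch_size

# Sequential: every observation sees all of its predecessors.
sequential_info = sum(informed_prior_count(i, 1) for i in range(1, 101))
# Parallel batches of 10: each batch only sees the batches before it,
# and the entire first batch sees nothing at all.
parallel_info = sum(informed_prior_count(i, 10) for i in range(1, 101))
```

Under this toy model, the batched run accumulates strictly less prior information in total than the sequential run, even though both perform 100 observations.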
A key component of creating a SigOpt experiment is thinking about what results and insights you want to see.
For some experiments the goal is simply to maximize or minimize some number of metrics. When defining such an experiment, the metric object will look similar to the following:
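For example, for an experiment that simply maximizes a single accuracy metric, the metrics list might look like this (the metric name is just a placeholder):

```python
# Hypothetical metric definition for a plain maximization experiment.
metrics = [
    dict(
        name="holdout_accuracy",
        objective="maximize",
    ),
]
```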
However, many modelers have a good understanding of what they want to see, or at least some minimum criteria for success. These can be things like a minimum accuracy or a maximum inference time determined by the business.
When creating a SigOpt experiment, you have the option to provide this information in the form of metric thresholds, which helps guide the optimizer and hopefully leads you to good results faster.
When creating the experiment, you’ll just need to add two lines of code to the metric-object:
metrics = [
    dict(
        name="holdout_accuracy [%]",
        threshold=75,
    ),
    dict(
        name="inference_time [ms]",
        threshold=3,
    ),
]
And now the optimizer knows what you are looking for! Of course you can always update the thresholds throughout the experiment!