All Constraints Experiment
Early in the experimentation process, users often want to understand the relationship between parameters and metrics. In particular, users may want to study which parameter regions consistently yield high-performing models. By conducting an experiment in which every metric is a Constraint Metric, SigOpt users can efficiently search for many high-performing models, as defined through constraints on each of the metrics under analysis. All-Constraint experiments focus on diverse parameter configurations, increasing the chances of finding models that meet business goals.
Diversity Accelerates Model Development
Let us go through an example. Suppose we want to classify chess end-games for White King and Rook against Black King. We use the UCI dataset known as Chess created by Michael Bain and Arthur van Hoff at the Turing Institute, Glasgow, UK. We are interested in performing hyperparameter tuning of XGBoost models. We will use the following parameter space in our experiments:
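The original code listing for this search space is not reproduced here; the sketch below shows one plausible way to express it in the SigOpt Python client's dictionary format. The parameter names mirror those discussed later (eta on a log10 scale, gamma, max_depth, min_child_weight, num_boost_round), while the exact types and bounds are illustrative assumptions rather than the values used in the original experiment.

```python
# A plausible XGBoost search space for this example. The bounds are
# illustrative assumptions; eta is assumed to be tuned on a log10 scale.
parameters = [
    dict(name="eta", type="double", bounds=dict(min=-5, max=0)),  # log10 of the learning rate
    dict(name="gamma", type="double", bounds=dict(min=0, max=10)),
    dict(name="max_depth", type="int", bounds=dict(min=2, max=30)),
    dict(name="min_child_weight", type="double", bounds=dict(min=1, max=10)),
    dict(name="num_boost_round", type="int", bounds=dict(min=10, max=300)),
]
```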
Defining our metrics
Now, let us say that we want to search for models with a high F1 score and low model complexity. We then define two metrics: the f1_score and the average_depth of the model. We are interested in models that achieve an f1_score higher than 0.8 and an average_depth lower than 10. It is also a good idea to store other metrics to inspect the models further. For example, we can keep track of each model's inference_time on the test set.
If we are confident that f1_score and average_depth capture everything about our problem, we can run a Multimetric Experiment to search for the Pareto Efficient Frontier points. The minimum-performance thresholds can (optionally) be incorporated as Metric Thresholds.
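As a rough sketch (assuming the Core Module's dictionary format for metrics), such a Multimetric definition might look like the following, with both metrics using the optimize strategy and the thresholds set to the 0.8 and 10 values discussed above:

```python
# Multimetric definition: both metrics are optimized, with optional thresholds.
multimetric_metrics = [
    dict(name="f1_score", objective="maximize", strategy="optimize", threshold=0.8),
    dict(name="average_depth", objective="minimize", strategy="optimize", threshold=10),
]
```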
Our new All-Constraint experiment looks very similar, but replaces the optimize strategy with the constraint strategy.
SigOpt allows our users to store additional metrics for consideration during analysis of the experiment. These should be defined during experiment creation as well.
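Putting these together, a minimal sketch of the All-Constraint metric list (again assuming the dictionary format used above) might be: f1_score and average_depth with the constraint strategy and the thresholds from before, plus inference_time stored for later analysis.

```python
# All-Constraint definition: every tracked objective is a Constraint Metric,
# and inference_time is stored rather than optimized or constrained.
all_constraint_metrics = [
    dict(name="f1_score", objective="maximize", strategy="constraint", threshold=0.8),
    dict(name="average_depth", objective="minimize", strategy="constraint", threshold=10),
    dict(name="inference_time", objective="minimize", strategy="store"),
]
```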
Running our experiment
With the above lists of parameters and metrics, we can easily create SigOpt experiments:
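A condensed sketch of that workflow with the Core Module is shown below, using the parameter and metric lists sketched above. The budget value and the evaluate_model helper (which trains an XGBoost model from a suggestion's assignments and returns the metric values) are hypothetical placeholders.

```python
from sigopt import Connection

conn = Connection(client_token="SIGOPT_API_TOKEN")  # placeholder token

# Create the two experiments from the parameter and metric lists defined above.
multimetric = conn.experiments().create(
    name="XGBoost Multimetric",
    parameters=parameters,
    metrics=multimetric_metrics,
    observation_budget=120,  # illustrative budget
)
all_constraint = conn.experiments().create(
    name="XGBoost All-Constraint",
    parameters=parameters,
    metrics=all_constraint_metrics,
    observation_budget=120,  # illustrative budget
)

# A typical suggestion/observation loop for each experiment.
for experiment in (multimetric, all_constraint):
    for _ in range(experiment.observation_budget):
        suggestion = conn.experiments(experiment.id).suggestions().create()
        values = evaluate_model(suggestion.assignments)  # hypothetical helper
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id,
            values=values,  # e.g. [dict(name="f1_score", value=0.84), ...]
        )
```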
After running these experiments, we observed the results below. In blue, we show the metric values resulting from SigOpt suggestions. In orange, we display the final results for each experiment. For the Multimetric Experiment, the best observations are the points on the Pareto Efficient Frontier. For an All-Constraint experiment, all points that meet the user's constraints are returned by the Best Assignments endpoint (Best Run for the AI Module). Notice that the Multimetric experiment finds many dominant points, while the All-Constraint experiment finds more configurations that satisfy the user's constraints.
Dealing with unforeseen requirements
An All-Constraint experiment finds more points that satisfy the user's constraints, at the cost of a less well-defined Pareto frontier. Why is this valuable? Suppose that, after this experiment, we talk to other stakeholders of our project; now they explicitly state that low inference time is critical for our application. Instead of rerunning this experiment (which could take a while), we decide to revisit our current results. Below we display the results after filtering the points by inference time (less than 0.1s).
Since our Multimetric experiment had a limited goal (highest f1_score and lowest average_depth), all of its models failed to achieve low inference time. All-Constraint experiments recognize that other goals may exist, and they search for a diverse range of outcomes to serve future demands. Specifically, note that:
The All-Constraint experiment found nine viable models, whereas the Multimetric experiment did not find models with low inference time.
None of the points from our earlier Pareto Efficient Frontier met this inference time requirement.
Analyzing parameters
The value of an All-Constraint experiment is most striking when we use our Parallel Coordinate plot. See the comparison to the Multimetric experiment below when we filter the models by inference_time. Notice that only models with low num_boost_round remain active.
There are some useful insights to be gained here about the parameters and the resulting metric values.
High num_boost_round yields a high F1 score -- this is not surprising, but our Multimetric experiment learns this and then spends its energy exploiting that information to make a better Pareto frontier. In contrast, All-Constraint finds models with low num_boost_round. That is critical for producing models with good performance and faster inference time.
For the full range of satisfactory models, all models require eta (learning rate) values between [-1.5, 0].
Most viable models have gamma values less than 3.
All-Constraint finds more models with lower max_depth than Multimetric, especially between values 5 and 15.
The entire range of min_child_weight values seems to produce acceptable results -- the metrics seem unaffected by this parameter alone. However, for satisfactory models, it looks like max_depth and min_child_weight are inversely correlated.
Conceptualizing the Value of an All-Constraint Experiment
Creating an All-Constraint Experiment
Below we create a new SigOpt All-Constraint experiment using the above XGBoost hyperparameter tuning example. The goal of such an experiment is to explore high-performing regions of the parameter space effectively. Recall that the main distinction is a list of Constraint Metrics with no optimized metrics. The SigOpt engine will automatically focus on diverse parameter configurations rather than on the optimal achievable values for each metric. As discussed earlier, for an exploration strategy that focuses on the Pareto Efficient Frontier of two metrics, we recommend users run a Multimetric Experiment instead.
Core Module
AI Module
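The AI Module sketch below assumes the sigopt Python client with sigopt.create_experiment available; the Core Module variant mirrors the conn.experiments().create call shown earlier, using the same constraint metric list. The names, bounds, and budget are illustrative.

```python
import sigopt

# All-Constraint experiment: only Constraint Metrics (plus a stored metric),
# no optimized metric, and a required budget.
experiment = sigopt.create_experiment(
    name="XGBoost All-Constraint",
    parameters=[
        dict(name="eta", type="double", bounds=dict(min=-5, max=0)),  # log10 of learning rate
        dict(name="gamma", type="double", bounds=dict(min=0, max=10)),
        dict(name="max_depth", type="int", bounds=dict(min=2, max=30)),
        dict(name="min_child_weight", type="double", bounds=dict(min=1, max=10)),
        dict(name="num_boost_round", type="int", bounds=dict(min=10, max=300)),
    ],
    metrics=[
        dict(name="f1_score", objective="maximize", strategy="constraint", threshold=0.8),
        dict(name="average_depth", objective="minimize", strategy="constraint", threshold=10),
        dict(name="inference_time", objective="minimize", strategy="store"),
    ],
    budget=120,  # illustrative; a budget is required for All-Constraint experiments
)
```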
As your Experiment executes, report the metric values to the corresponding Run:
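A sketch of that loop with the AI Module is shown below; train_xgboost_model is a hypothetical helper that trains a model from run.params and returns the three measured metric values.

```python
for run in experiment.loop():
    with run:
        # train_xgboost_model is a hypothetical helper returning the measured metrics
        f1, avg_depth, infer_time = train_xgboost_model(run.params)
        run.log_metric("f1_score", f1)
        run.log_metric("average_depth", avg_depth)
        run.log_metric("inference_time", infer_time)
```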
Selecting and Updating the Metric Thresholds
In many applications, it is straightforward to specify the minimum performance criteria for each metric. For example, inference time and model size are limited by the production setting's desired response time and memory constraints. A simple lower bound on accuracy is the fraction of examples in the majority class. For regression problems, a constant predictor that always reports the average training value gives the minimum level of performance expected of an intelligent system.
To conduct an effective exploration, we recommend users set conservative threshold values. SigOpt understands that configurations that do not meet the constraints are undesirable; therefore, setting a high threshold at the beginning of your experimentation can prematurely discourage SigOpt from sampling promising regions of the parameter space. As the experiment progresses, the metric thresholds can be updated through the API or on the experiment's properties page in our web application. For more information, see how to update your metric constraints.
Core Module
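A sketch of tightening the thresholds with the Core Module is shown below; EXPERIMENT_ID and the new threshold values are placeholders.

```python
from sigopt import Connection

conn = Connection(client_token="SIGOPT_API_TOKEN")  # placeholder token

# Tighten the thresholds on the existing experiment (values are illustrative).
conn.experiments(EXPERIMENT_ID).update(
    metrics=[
        dict(name="f1_score", threshold=0.85),
        dict(name="average_depth", threshold=8),
    ],
)
```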
Limitations
A budget must be set when an All-Constraint experiment is created.
The maximum number of constraint metrics is 4.
The maximum number of dimensions for All-Constraint is 50.
Experiments with Parameter Conditions are not permitted.
Multitask experiments are not permitted.