Tuning XGBoost Models
sigopt.xgboost.experiment
The sigopt.xgboost.experiment function simplifies the hyperparameter tuning process of an XGBoost model by automatically creating and running a SigOpt AI Experiment. This function also extends the automatic parameter, metric, and metadata logging of our sigopt.xgboost.run API to the SigOpt experimentation platform.
However, this automatic logging is only one of its features; sigopt.xgboost.experiment offers the following crucial improvements over the existing SigOpt AI Experiment API when tuning an XGBoost model:
A simplified and streamlined API that knows the exact problem it is tuning (XGBoost) and makes intelligent decisions accordingly.
Automatic selection of the parameter search space, optimization metric, and the tuning budget.
A preset list of standard optimization metrics to choose from.
An improved hyperparameter optimization routine that leverages advanced methods in metalearning and multi-fidelity optimization to learn a more performant model in less time.
This API has been designed with ease-of-use in mind, so that you may run an XGBoost Experiment as effortlessly as possible.
Examples
To give you an initial feel for how you might use the sigopt.xgboost.experiment API, we provide multiple examples showcasing its simplicity and flexibility. Our API aims to reduce the overall complexity of intelligent experimentation and hyperparameter optimization by automatically selecting parameters, metrics, and even the budget where needed.
The examples below increase in complexity.
Automatic Experiment Configuration
The parameter search space, metric, and budget are determined by SigOpt based on the training data provided.
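As a minimal sketch of what this looks like in code, the example below passes only a name in the experiment_config and lets SigOpt choose everything else. The training-data argument name (dtrain) and the dataset used are assumptions for illustration.

```python
# Minimal sketch: SigOpt selects the parameter space, metric, and budget.
# Assumptions: the training-data argument name (dtrain) and the dataset.
import sigopt.xgboost
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25)

dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

experiment = sigopt.xgboost.experiment(
    experiment_config={"name": "XGBoost classifier, fully automatic"},
    dtrain=dtrain,          # training data (argument name assumed)
    evals=[(dval, "val")],  # validation set used to compute the metric
    params={},              # no XGBoost parameters held fixed
)
```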
These simple examples are made possible by key advances from SigOpt Research. It is worth noting that this increase in simplicity corresponds to a decrease in flexibility; if you opt to omit the metric, for example, you will optimize the metric we select for you.
Input Arguments for sigopt.xgboost.experiment
The API for an XGBoost Experiment takes the following arguments; a sketch of a full call follows the argument descriptions below.
experiment_config (dict): The configuration of the Experiment. See the following section for more information.
evals (xgboost.DMatrix or array<(xgboost.DMatrix, string)>): The validation set(s). If a list is provided, the first dataset is used to compute the optimization metric.
params (dict): The XGBoost parameters, e.g. tree_method, that you plan to keep fixed throughout the course of the Experiment. See the XGBoost Parameters documentation for more information.
num_boost_round (int): Optional. The number of boosting rounds. Leave this argument blank if num_boost_round is specified in the parameters field of the experiment_config.
early_stopping_rounds (int): Optional. XGBoost stops training when the validation metric has not improved for early_stopping_rounds rounds. NOTE: SigOpt sets early_stopping_rounds to 10 by default. To turn off early stopping, explicitly set it to None.
run_options (dict): Optional. A dictionary specifying the autologging capabilities. See the SigOpt XGBoost Run documentation for more information.
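Putting these arguments together, a call might look like the following sketch. The training-data argument name (dtrain) and the dataset are assumptions for illustration; run_options is omitted here since it is optional.

```python
# Sketch of a fuller call combining the arguments described above.
# Assumptions: the training-data argument name (dtrain) and the dataset.
import sigopt.xgboost
import xgboost as xgb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

experiment = sigopt.xgboost.experiment(
    experiment_config={"name": "XGBoost regressor"},
    dtrain=dtrain,
    # The first entry in evals is used to compute the optimization metric.
    evals=[(dval, "validation")],
    # XGBoost parameters held fixed for every Run in the Experiment.
    params={"tree_method": "hist", "objective": "reg:squarederror"},
    num_boost_round=200,
    # Defaults to 10; set to None to disable early stopping.
    early_stopping_rounds=20,
)
```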
The experiment_config argument is the most important to understand: it not only determines how your Experiment executes, but also offers the most flexibility and extensibility of all the XGBoost Experiment API arguments. Thus, we explain it next.
Specifying sigopt.xgboost.experiment through experiment_config
An experiment config has the following keys; an illustrative config is sketched after the key descriptions below.
name (string): Name of the Experiment.
parameters (array<Parameter>): Optional. An array of Parameter objects. See the Parameter Space subsection below for more information.
metrics (array<Metric> or string): Optional. An array of Metric objects or a string. See the Metric Space subsection below for more information.
budget (int): Optional. An integer defining the minimum number of SigOpt Runs in a given SigOpt Experiment.
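For instance, a config that sets all four keys might look like the sketch below; the parameter bounds, metric, and budget values are illustrative choices rather than recommendations.

```python
# Illustrative experiment_config using all four keys.
# The bounds, metric, and budget are arbitrary example values.
experiment_config = {
    "name": "XGBoost classifier, explicit configuration",
    "parameters": [
        {"name": "max_depth", "type": "int", "bounds": {"min": 2, "max": 12}},
        {"name": "eta", "type": "double", "bounds": {"min": 1e-4, "max": 1.0}},
    ],
    "metrics": "accuracy",  # or an array of Metric objects (see Metric Space)
    "budget": 30,
}
```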
Parameter Space
We show an illustrative example of how to set the parameter space below.
There are three different ways of specifying the Experiment parameters:
name only: SigOpt autoselects the bounds and type.
name and type: SigOpt autoselects the bounds.
name, type, and bounds/categorical_values: explicit parameter specification.
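A specification that mixes all three styles might look like the following sketch; the particular parameters chosen here are illustrative, and whether bounds can be autoselected for a given parameter depends on the list discussed below.

```python
# Parameter space sketch mixing the three specification styles.
parameters = [
    # 1. name only: SigOpt autoselects the type and bounds.
    {"name": "eta"},
    # 2. name and type: SigOpt autoselects the bounds.
    {"name": "max_depth", "type": "int"},
    # 3. name, type, and bounds/categorical_values: explicit specification.
    {"name": "min_child_weight", "type": "double", "bounds": {"min": 0.0, "max": 10.0}},
    {"name": "grow_policy", "type": "categorical", "categorical_values": ["depthwise", "lossguide"]},
]

experiment_config = {
    "name": "XGBoost with a mixed parameter space",
    "parameters": parameters,
}
```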
These specifications may be mixed, as in the example above. Currently, SigOpt only autoselects the bounds for a predetermined list of parameters; any parameter that is not on this list must have its bounds or categorical_values explicitly stated.
Metric Space
The metric space of an Experiment is defined by both the metrics argument of the experiment_config and the datasets listed in the evals argument.
There are two ways of specifying the metric space: as a single supported metric name (a string) or as an array of Metric objects. A sketch of both appears after the table below.
Below is a table of the metrics we natively support for classification and regression, along with the default metric for each task.
Classification: accuracy, F1, precision, recall (default: accuracy)
Regression: mean absolute error, mean squared error (default: mean squared error)
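As a sketch, the two forms of the metrics field might look like the following. The single-string form uses one of the supported metric names above; the Metric object fields shown (objective, strategy) follow the general SigOpt Metric format and are an assumption here.

```python
# Option 1: a single supported metric name as a string.
experiment_config = {
    "name": "XGBoost classifier, string metric",
    "metrics": "F1",
}

# Option 2: an array of Metric objects (the objective/strategy fields
# follow the general SigOpt Metric format and are assumptions here).
experiment_config = {
    "name": "XGBoost classifier, explicit metrics",
    "metrics": [
        {"name": "accuracy", "objective": "maximize", "strategy": "optimize"},
        {"name": "precision", "strategy": "store"},
    ],
}
```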