Hyperparameter Optimization
PHOTONAI offers easy access to several established hyperparameter optimization strategies.
Grid Search
An exhaustive search through a manually specified subset of the hyperparameter space. The grid is defined by a finite list of values for each hyperparameter.
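A minimal sketch of selecting grid search via the Hyperpipe constructor; the optimizer identifier 'grid_search' reflects common PHOTONAI usage, and the remaining constructor arguments are illustrative assumptions.

```python
# Sketch: exhaustive evaluation of every point in the hyperparameter grid.
from photonai.base import Hyperpipe
from sklearn.model_selection import KFold

pipe = Hyperpipe('grid_search_pipe',
                 optimizer='grid_search',
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))
```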
Random Grid Search
Random sampling from a manually specified subset of the hyperparameter space. The grid is defined by a finite list of values for each hyperparameter; a specified number of random configurations from this grid is then tested.
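A minimal sketch of random grid search, assuming the optimizer identifier 'random_grid_search' and an optimizer parameter n_configurations for the number of sampled grid points.

```python
# Sketch: sample a fixed number of configurations from the specified grid.
from photonai.base import Hyperpipe
from sklearn.model_selection import KFold

pipe = Hyperpipe('random_grid_search_pipe',
                 optimizer='random_grid_search',
                 optimizer_params={'n_configurations': 30},  # assumed parameter: how many grid points to sample
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))
```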
Random Search
A grid-free selection of configurations based on the hyperparameter space. For numerical parameters, values are sampled only within the given interval limits. The creation of configurations is limited by time or a maximum number of runs.
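A minimal sketch of random search with both limits set; the optimizer identifier 'random_search' and n_configurations follow the description above, while the time-limit parameter name is an assumption.

```python
# Sketch: grid-free random search with both a configuration and a time limit.
from photonai.base import Hyperpipe
from sklearn.model_selection import KFold

pipe = Hyperpipe('random_search_pipe',
                 optimizer='random_search',
                 optimizer_params={'n_configurations': 25,   # maximum number of tested configurations
                                   'limit_in_minutes': 10},  # assumed name for the time limit; search stops at whichever limit is hit first
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))
```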
If the optimizer_params contain both a time limit and a limit on the number of configurations, the search aborts as soon as either limit is reached. The default limit for Random Search is n_configurations=10.
Scikit-Optimize
Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements several methods for sequential model-based optimization. skopt aims to be accessible and easy to use in many contexts.
Scikit-optimize usage and implementation details are available here. A detailed parameter documentation can be found here.
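A minimal sketch of sequential model-based optimization with scikit-optimize; the optimizer identifier 'sk_opt' and the n_configurations parameter are assumptions, and the SVC element with FloatRange/Categorical hyperparameters is purely illustrative.

```python
# Sketch: sequential model-based optimization via scikit-optimize.
from photonai.base import Hyperpipe, PipelineElement
from photonai.optimization import FloatRange, Categorical
from sklearn.model_selection import KFold

pipe = Hyperpipe('skopt_pipe',
                 optimizer='sk_opt',
                 optimizer_params={'n_configurations': 25},  # assumed parameter: budget of evaluated configurations
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))

# Continuous and categorical ranges let the surrogate model propose new configurations.
pipe += PipelineElement('SVC',
                        hyperparameters={'C': FloatRange(0.1, 10),
                                         'kernel': Categorical(['linear', 'rbf'])})
```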
Nevergrad
Nevergrad is a gradient-free optimization platform, which makes it well suited for optimizing over the hyperparameter space. A great advantage is that evolutionary algorithms are implemented here in addition to Bayesian techniques.
Nevergrad usage and implementation details available here.
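A minimal sketch of using a Nevergrad algorithm as the optimizer; the identifier 'nevergrad' and both optimizer parameter names ('facade', 'n_configurations') are assumptions, with 'NGOpt' being one of Nevergrad's registered algorithms.

```python
# Sketch: gradient-free optimization with a Nevergrad algorithm.
from photonai.base import Hyperpipe
from sklearn.model_selection import KFold

pipe = Hyperpipe('nevergrad_pipe',
                 optimizer='nevergrad',
                 optimizer_params={'facade': 'NGOpt',        # assumed: name of the Nevergrad algorithm to use
                                   'n_configurations': 30},  # assumed: evaluation budget
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))
```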
SMAC
SMAC (sequential model-based algorithm configuration) is a versatile tool for optimizing algorithm parameters. The main core consists of Bayesian Optimization in combination with an aggressive racing mechanism to efficiently decide which of two configurations performs better.
SMAC usage and implementation details available here.
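A minimal sketch of optimization with SMAC; the identifier 'smac' and the optimizer parameter names ('facade', 'wallclock_limit') are assumptions about how the SMAC wrapper is configured.

```python
# Sketch: Bayesian optimization with an aggressive racing mechanism via SMAC.
from photonai.base import Hyperpipe
from sklearn.model_selection import KFold

pipe = Hyperpipe('smac_pipe',
                 optimizer='smac',
                 optimizer_params={'facade': 'SMAC4HPO',        # assumed: which SMAC facade to use
                                   'wallclock_limit': 60 * 10}, # assumed: overall time budget in seconds
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))
```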
Switch Optimizer
This optimizer is special, as it uses the strategies above to optimize the same dataflow for different learning algorithms in a switch ("OR") element at the end of the pipeline.
For example, you can use Bayesian optimization for each learning algorithm and specify that each algorithm gets 25 configurations to test.
This is different to a global optimization, in which, after an initial exploration phase, computational resources are dedicated to the best performing learning algorithm only.
By distributing computational resources equally across the learning algorithms, better comparability between the algorithms is achieved. Depending on the use case, this can be desirable.
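A minimal sketch of the switch optimizer with a per-algorithm budget; the identifier 'switch', the 'name' parameter for the inner optimizer, and the per-algorithm n_configurations are assumptions, and the two estimators in the Switch element are illustrative.

```python
# Sketch: per-algorithm optimization with a Switch ("OR") element at the end of the pipeline.
from photonai.base import Hyperpipe, PipelineElement, Switch
from sklearn.model_selection import KFold

pipe = Hyperpipe('switch_pipe',
                 optimizer='switch',
                 optimizer_params={'name': 'sk_opt',          # assumed: inner optimizer applied per learning algorithm
                                   'n_configurations': 25},   # assumed: budget per learning algorithm
                 metrics=['accuracy'],
                 best_config_metric='accuracy',
                 inner_cv=KFold(n_splits=5))

# Each estimator in the Switch gets its own, equally sized optimization budget.
estimator_switch = Switch('estimators')
estimator_switch += PipelineElement('SVC')
estimator_switch += PipelineElement('RandomForestClassifier')
pipe += estimator_switch
```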