Syne Tune: Large-Scale and Reproducible Hyperparameter Optimization
This package provides state-of-the-art algorithms for hyperparameter optimization (HPO) with the following key features:
Wide coverage (>20) of different HPO methods, including:
Asynchronous versions to maximize utilization and distributed versions (i.e., with multiple workers);
Multi-fidelity methods supporting model-based decisions (BOHB, MOBSTER, Hyper-Tune, DyHPO, BORE);
Hyperparameter transfer learning to speed up (repeated) tuning jobs;
Multi-objective optimizers that can tune multiple objectives simultaneously (such as accuracy and latency).
HPO can be run in different environments (locally, AWS, simulation) by changing just one line of code.
Out-of-the-box tabulated benchmarks that allow you to simulate results in seconds while preserving the real dynamics of asynchronous or synchronous HPO with any number of workers.
New tutorial: Using Syne Tune for Transfer Learning. Transfer learning allows us to speed up the current tuning job by learning from related previous tuning runs. Syne Tune provides a number of transfer HPO methods and makes it easy to implement new ones. Thanks to Sigrid for this contribution.
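One simple transfer HPO idea can be sketched in a few lines: restrict each hyperparameter's search range to the "bounding box" spanned by the best configurations found on related previous tasks. The helper below is a conceptual sketch of that idea, not the Syne Tune API:

```python
# Conceptual sketch of bounding-box transfer learning (not the Syne Tune API):
# shrink each hyperparameter's search range to the span of the best
# configurations observed on previous, related tuning tasks.
def bounding_box(prior_results, top_k=3):
    """prior_results: one list per previous task of (config, metric) pairs,
    where lower metric is better and config is a dict of numeric values.
    Returns {name: (low, high)} covering the top_k configs of every task."""
    lows, highs = {}, {}
    for results in prior_results:
        best = sorted(results, key=lambda cm: cm[1])[:top_k]
        for config, _ in best:
            for name, value in config.items():
                lows[name] = min(lows.get(name, value), value)
                highs[name] = max(highs.get(name, value), value)
    return {name: (lows[name], highs[name]) for name in lows}
```

Tuning then proceeds as usual, but inside the smaller box, which typically concentrates the search on regions that worked well before.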
DyHPO. This is a recent multi-fidelity method, which can be seen as an alternative to ASHA, MOBSTER, or Hyper-Tune. Unlike these, decisions on whether to promote paused trials are made based on the surrogate model. Our implementation differs from the published work by using a Gaussian process surrogate model, and by a promotion rule which is a hybrid between DyHPO and ASHA.
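The contrast between the two promotion styles can be sketched in a few lines of pure Python (a conceptual sketch, not Syne Tune's implementation): ASHA promotes a paused trial if it ranks in the top 1/eta at its rung, while a DyHPO-style rule ranks paused trials by a surrogate model's score. A hybrid, as hinted at above, can apply the ASHA filter first and then let the model choose among the survivors:

```python
# Conceptual sketch (not Syne Tune's implementation) of promotion rules.
def asha_promotable(trial_metric, rung_metrics, eta=3):
    """True if trial_metric is in the top 1/eta of rung_metrics (lower=better)."""
    cutoff = max(1, len(rung_metrics) // eta)
    return trial_metric <= sorted(rung_metrics)[cutoff - 1]

def hybrid_promote(paused, rung_metrics, model_score, eta=3):
    """paused: {trial_id: metric at this rung}. model_score: trial_id ->
    surrogate-predicted improvement (higher=better). Applies the ASHA
    top-fraction filter, then lets the model pick among survivors."""
    survivors = [t for t, m in paused.items()
                 if asha_promotable(m, rung_metrics, eta)]
    if not survivors:
        return None
    return max(survivors, key=model_score)
```

A pure DyHPO-style rule would skip the `asha_promotable` filter and rank all paused trials by `model_score` alone.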
New tutorial: How to Contribute a New Scheduler. Learn how to implement your own scheduler, wrap external code, or modify one of the existing templates in order to get your job done.
New tutorial: Benchmarking in Syne Tune. Would you like to run many experiments in parallel, or launch training jobs on different instances, all by adapting some simple scripts to your needs? Then our benchmarking mechanism is for you.
New tutorial: Progressive ASHA. PASHA is a variant of ASHA in which the maximum amount of resources (e.g., the maximum number of training epochs) is not fixed up front, but adapted during tuning. This can lead to savings when training on large datasets. Thanks to Ondre for this contribution.
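The core PASHA idea can be sketched conceptually (this is a simplified sketch, not Syne Tune's code): keep the current maximum resource as long as the ranking of trials at the top rung agrees with the ranking one rung below, and grow it when the two rankings disagree:

```python
# Conceptual sketch of PASHA's adaptive maximum resource (not Syne Tune's code).
def rankings_agree(metrics_top, metrics_below):
    """metrics_*: {trial_id: metric} at two consecutive rungs (lower=better).
    Compares only trials that were evaluated at both rungs."""
    common = sorted(set(metrics_top) & set(metrics_below))
    rank_top = sorted(common, key=lambda t: metrics_top[t])
    rank_below = sorted(common, key=lambda t: metrics_below[t])
    return rank_top == rank_below

def next_max_resource(current_max, metrics_top, metrics_below, grow_factor=3):
    """Keep current_max while the top-rung ranking is stable; otherwise grow
    it by grow_factor, so trials can be trained for longer."""
    if rankings_agree(metrics_top, metrics_below):
        return current_max
    return current_max * grow_factor
```

If the ranking stabilizes early, trials never need to run to the full budget, which is where the savings on large datasets come from.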