Advances in neural networks and deep learning have renewed interest in algorithms that automate the tuning of the growing list of hyper-parameters in these high-dimensional models. Open-source libraries such as scikit-learn provide ready access to simple but inefficient algorithms such as exhaustive (grid) search and random search. Recently, Snoek et al. showed that statistical hyper-parameter optimization approaches produce better results than human experts and are more efficient than exhaustive or random search in high-dimensional domains such as image and speech recognition.[1] Similarly, Bergstra et al. further improved efficiency and performance with their Sequential Model-Based Global Optimization (SMBO) approach, which approximates the computationally expensive model training and evaluation step with a cheap surrogate model.[2] In this paper we demonstrate these hyper-parameter optimization algorithms on several toy and real-world problems, including machine learning problem types not previously optimized with SMBO.
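To make the baseline strategies concrete, the following sketch contrasts exhaustive and random search using scikit-learn's built-in utilities. The estimator (an SVM), the parameter ranges, and the digits dataset are illustrative assumptions, not the configurations evaluated in this paper.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Exhaustive (grid) search: evaluates every combination in the grid.
grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]},
    cv=3,
)
grid.fit(X, y)
print("grid search best:", grid.best_params_, grid.best_score_)

# Random search: samples a fixed budget of configurations from
# continuous distributions; often competitive at much lower cost.
rand = RandomizedSearchCV(
    SVC(),
    param_distributions={
        "C": loguniform(1e-1, 1e2),
        "gamma": loguniform(1e-4, 1e-2),
    },
    n_iter=12,
    cv=3,
    random_state=0,
)
rand.fit(X, y)
print("random search best:", rand.best_params_, rand.best_score_)
```

The SMBO idea of substituting a cheap surrogate for the expensive objective can likewise be sketched in a few lines. This toy loop assumes a Gaussian-process surrogate and an expected-improvement acquisition function on a synthetic one-dimensional objective; it illustrates the general technique only, not the specific estimators of [2].

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):
    # Stand-in for an expensive model-training-and-validation run.
    return np.sin(3 * x) + 0.1 * x ** 2

rng = np.random.default_rng(0)
candidates = np.linspace(-3, 3, 200).reshape(-1, 1)
X_obs = rng.uniform(-3, 3, size=(3, 1))  # a few initial evaluations
y_obs = objective(X_obs).ravel()

for _ in range(10):
    # 1. Fit the cheap surrogate to all evaluations so far.
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)

    # 2. Expected improvement over the best observation (minimization).
    best = y_obs.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    # 3. Evaluate the true objective only at the most promising point.
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

print("best x:", X_obs[y_obs.argmin()].item(), "f(x):", y_obs.min())
```

Each iteration spends one expensive evaluation where the surrogate predicts the most improvement, which is why SMBO-style methods need far fewer evaluations than exhaustive or random search in practice.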
[1] Snoek, J., Larochelle, H., and Adams, R. P., "Practical Bayesian Optimization of Machine Learning Algorithms," NIPS, 2012.
[2] Bergstra, J., Bardenet, R., Bengio, Y., and Kégl, B., "Algorithms for Hyper-Parameter Optimization," NIPS, 2011.