forked from mlpack/mlpack
Add a hyper-parameter tuning module #2
Open
micyril wants to merge 27 commits into simple_cv from hpt
27 commits
7c393ff Add a cross-validation wrapper for optimizers (micyril)
f2d11fc Add grid-search optimization (micyril)
15e35b4 Add hyper-parameter tuning (micyril)
fa78d07 Add support for metrics that should be maximized (micyril)
42d16ee Add interface for accessing the best model (micyril)
33a1120 Fix lambda sets initialization (micyril)
dd1db7c Fix misspellings (micyril)
92d75a1 Fix CVFunctionTest (micyril)
0b7690c Fix style (micyril)
5eb3dc7 Extend documentation (micyril)
56c2ff1 Fix typo (micyril)
c6f897f Make optimal lambdas in the "middle" of the space (micyril)
54ac867 Add passing MLAlgorithm type to CVFunction (micyril)
46e8c25 Add braces around multiline statement (micyril)
7b9604c Use more verbose names for BAIndex and PIndex (micyril)
cce56f5 Use the word "fixed" for "bound" (micyril)
85060c8 Revert "Make optimal lambdas in the "middle" of the space" (micyril)
6744351 Refactor GridSearch to use DatasetMapper (micyril)
9486986 Use the new GridSearch interface in the HPT module (micyril)
fc53e01 Add separate comments (micyril)
6f65136 Add gradient evaluation to CVFunction (micyril)
5a6df55 Add support for GradientDescent in the HPT module (micyril)
fd172d9 Refactor template conditions for InitAndOptimize (micyril)
fd95da5 Add assertion for argument types of Optimize (micyril)
7de610e Add assertion that input collections aren't empty (micyril)
ab46e58 Add const getters in HyperParameterTuner (micyril)
5ac45dd Make CVFunction cache computations for gradient (micyril)
```diff
@@ -5,6 +5,7 @@ set(DIRS
   cv
   data
   dists
+  hpt
   kernels
   math
   metrics
```
@@ -0,0 +1,15 @@

```cmake
set(SOURCES
  cv_function.hpp
  cv_function_impl.hpp
  deduce_hp_types.hpp
  fixed.hpp
  hpt.hpp
  hpt_impl.hpp
)

set(DIR_SRCS)
foreach(file ${SOURCES})
  set(DIR_SRCS ${DIR_SRCS} ${CMAKE_CURRENT_SOURCE_DIR}/${file})
endforeach()

set(MLPACK_SRCS ${MLPACK_SRCS} ${DIR_SRCS} PARENT_SCOPE)
```
@@ -0,0 +1,172 @@

```cpp
/**
 * @file cv_function.hpp
 * @author Kirill Mishchenko
 *
 * A cross-validation wrapper for optimizers.
 *
 * mlpack is free software; you may redistribute it and/or modify it under the
 * terms of the 3-clause BSD license.  You should have received a copy of the
 * 3-clause BSD license along with mlpack.  If not, see
 * http://www.opensource.org/licenses/BSD-3-Clause for more information.
 */
#ifndef MLPACK_CORE_HPT_CV_FUNCTION_HPP
#define MLPACK_CORE_HPT_CV_FUNCTION_HPP

#include <mlpack/core.hpp>

namespace mlpack {
namespace hpt {

/**
 * This wrapper serves for adapting the interface of the cross-validation
 * classes to the one that can be utilized by the mlpack optimizers.
 *
 * This class is not supposed to be used directly by users. To tune
 * hyper-parameters see HyperParameterTuner.
 *
 * @tparam CVType A cross-validation strategy.
 * @tparam MLAlgorithm The machine learning algorithm used in cross-validation.
 * @tparam TotalArgs The total number of arguments that are supposed to be
 *   passed to the Evaluate method of a CVType object.
 * @tparam BoundArgs Types of arguments (wrapped into the BoundArg struct) that
 *   should be passed into the Evaluate method of a CVType object but are not
 *   going to be passed into the Evaluate method of a CVFunction object.
 */
template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
class CVFunction
{
 public:
  /**
   * Initialize a CVFunction object.
   *
   * @param cv A cross-validation object.
   * @param relativeDelta Relative increase of arguments for calculation of
   *     partial derivatives (by the definition). The exact increase for some
   *     particular argument is equal to the absolute value of the argument
   *     multiplied by the relative increase (see also the documentation for the
   *     minDelta parameter).
   * @param minDelta Minimum increase of arguments for calculation of partial
   *     derivatives (by the definition). This value is going to be used when it
   *     is greater than the increase calculated with the rules described in the
   *     documentation for the relativeDelta parameter.
   * @param BoundArgs Arguments that should be passed into the Evaluate method
   *     of the CVType object but are not going to be passed into the Evaluate
   *     method of this object.
   */
  CVFunction(CVType& cv,
             const double relativeDelta,
             const double minDelta,
             const BoundArgs&... args);

  /**
   * Run cross-validation with the bound and passed parameters.
   *
   * @param parameters Arguments (rather than the bound arguments) that should
   *     be passed into the Evaluate method of the CVType object.
   */
  double Evaluate(const arma::mat& parameters);

  /**
   * Evaluate numerically the gradient of the CVFunction with the given
   * parameters.
   *
   * @param parameters Arguments (rather than the bound arguments) that should
   *     be passed into the Evaluate method of the CVType object.
   * @param gradient Vector to output the gradient into.
   */
  void Gradient(const arma::mat& parameters, arma::mat& gradient);

  //! Access and modify the best model so far.
  MLAlgorithm& BestModel() { return bestModel; }

 private:
  //! The type of tuples of BoundArgs.
  using BoundArgsTupleType = std::tuple<BoundArgs...>;

  //! The amount of bound arguments.
  static const size_t BoundArgsAmount =
      std::tuple_size<BoundArgsTupleType>::value;

  /**
   * A struct that finds out whether the next argument for the Evaluate method
   * of a CVType object should be a bound argument at the position BoundArgIndex
   * rather than an element of parameters at the position ParamIndex.
   */
  template<size_t BoundArgIndex,
           size_t ParamIndex,
           bool BoundArgsIndexInRange = BoundArgIndex < BoundArgsAmount>
  struct UseBoundArg;

  //! A reference to the cross-validation object.
  CVType& cv;

  //! The bound arguments.
  BoundArgsTupleType boundArgs;

  //! The best objective so far.
  double bestObjective;

  //! The best model so far.
  MLAlgorithm bestModel;

  //! Relative increase of arguments for calculation of gradient.
  double relativeDelta;

  //! Minimum absolute increase of arguments for calculation of gradient.
  double minDelta;

  /**
   * Collect all arguments and run cross-validation.
   */
  template<size_t BoundArgIndex,
           size_t ParamIndex,
           typename... Args,
           typename = typename
               std::enable_if<BoundArgIndex + ParamIndex < TotalArgs>::type>
  inline double Evaluate(const arma::mat& parameters, const Args&... args);

  /**
   * Run cross-validation with the collected arguments.
   */
  template<size_t BoundArgIndex,
           size_t ParamIndex,
           typename... Args,
           typename = typename
               std::enable_if<BoundArgIndex + ParamIndex == TotalArgs>::type,
           typename = void>
  inline double Evaluate(const arma::mat& parameters, const Args&... args);

  /**
   * Put the bound argument (at the BoundArgIndex position) as the next one.
   */
  template<size_t BoundArgIndex,
           size_t ParamIndex,
           typename... Args,
           typename = typename std::enable_if<
               UseBoundArg<BoundArgIndex, ParamIndex>::value>::type>
  inline double PutNextArg(const arma::mat& parameters, const Args&... args);

  /**
   * Put the element (at the ParamIndex position) of the parameters as the next
   * one.
   */
  template<size_t BoundArgIndex,
           size_t ParamIndex,
           typename... Args,
           typename = typename std::enable_if<
               !UseBoundArg<BoundArgIndex, ParamIndex>::value>::type,
           typename = void>
  inline double PutNextArg(const arma::mat& parameters, const Args&... args);
};


} // namespace hpt
} // namespace mlpack

// Include implementation
#include "cv_function_impl.hpp"

#endif
```
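The `Evaluate`/`PutNextArg` pair in this header walks over the argument positions one at a time, taking each position either from the bound arguments or from `parameters`. A simplified sketch of that idea follows. It is not the PR's code: it uses C++17 `if constexpr` instead of the `std::enable_if` overloads, a `std::vector<double>` instead of `arma::mat`, and the names `Fixed`, `UseBound` and `Interleave` are hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the bound-argument wrapper: a value tagged with
// the position it should occupy in the final argument list.
template<std::size_t I, typename T>
struct Fixed
{
  static constexpr std::size_t index = I;
  T value;
};

// True if the next position (BoundIdx + ParamIdx) belongs to the bound
// argument number BoundIdx; false if it belongs to a free parameter.
template<std::size_t BoundIdx, std::size_t ParamIdx, typename BoundTuple>
constexpr bool UseBound()
{
  if constexpr (BoundIdx < std::tuple_size<BoundTuple>::value)
    return std::tuple_element<BoundIdx, BoundTuple>::type::index ==
        BoundIdx + ParamIdx;
  else
    return false;
}

// Build the argument list position by position, then call f with it.
template<std::size_t TotalArgs,
         std::size_t BoundIdx = 0,
         std::size_t ParamIdx = 0,
         typename F,
         typename BoundTuple,
         typename... Args>
double Interleave(F&& f,
                  const BoundTuple& bound,
                  const std::vector<double>& params,
                  const Args&... args)
{
  if constexpr (BoundIdx + ParamIdx == TotalArgs)
  {
    // Every position is filled: run the wrapped call.
    return f(args...);
  }
  else if constexpr (UseBound<BoundIdx, ParamIdx, BoundTuple>())
  {
    // Append the next bound argument.
    return Interleave<TotalArgs, BoundIdx + 1, ParamIdx>(
        f, bound, params, args..., std::get<BoundIdx>(bound).value);
  }
  else
  {
    // Append the next free parameter.
    assert(ParamIdx < params.size());
    return Interleave<TotalArgs, BoundIdx, ParamIdx + 1>(
        f, bound, params, args..., params[ParamIdx]);
  }
}
```

With three positions and one bound argument at index 1, calling `Interleave<3>(f, std::make_tuple(Fixed<1, int>{7}), {1.0, 2.0})` would invoke `f(1.0, 7, 2.0)`. As in the header, the bound arguments are assumed to be listed in increasing order of their indices.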
@@ -0,0 +1,168 @@

```cpp
/**
 * @file cv_function_impl.hpp
 * @author Kirill Mishchenko
 *
 * The implementation of the class CVFunction.
 *
 * mlpack is free software; you may redistribute it and/or modify it under the
 * terms of the 3-clause BSD license.  You should have received a copy of the
 * 3-clause BSD license along with mlpack.  If not, see
 * http://www.opensource.org/licenses/BSD-3-Clause for more information.
 */
#ifndef MLPACK_CORE_HPT_CV_FUNCTION_IMPL_HPP
#define MLPACK_CORE_HPT_CV_FUNCTION_IMPL_HPP

namespace mlpack {
namespace hpt {

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex, size_t ParamIndex>
struct CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::UseBoundArg<
    BoundArgIndex, ParamIndex, true>
{
  using BoundArgType =
      typename std::tuple_element<BoundArgIndex, BoundArgsTupleType>::type;

  static const bool value = BoundArgType::index == BoundArgIndex + ParamIndex;
};

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex, size_t ParamIndex>
struct CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::UseBoundArg<
    BoundArgIndex, ParamIndex, false>
{
  static const bool value = false;
};

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::CVFunction(
    CVType& cv,
    const double relativeDelta,
    const double minDelta,
    const BoundArgs&... args) :
    cv(cv),
    boundArgs(args...),
    bestObjective(std::numeric_limits<double>::max()),
    relativeDelta(relativeDelta),
    minDelta(minDelta)
{ /* Nothing left to do. */ }

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
double CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::Evaluate(
    const arma::mat& parameters)
{
  return Evaluate<0, 0>(parameters);
}

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
void CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::Gradient(
    const arma::mat& parameters,
    arma::mat& gradient)
{
  gradient = arma::mat(arma::size(parameters));
  arma::mat increasedParameters = parameters;
  double originalParametersEvaluation = Evaluate(parameters);
  for (size_t i = 0; i < parameters.n_rows; ++i)
  {
    double delta = std::max(std::abs(parameters(i)) * relativeDelta, minDelta);
    increasedParameters(i) += delta;
    gradient(i) =
        (Evaluate(increasedParameters) - originalParametersEvaluation) / delta;
    increasedParameters(i) = parameters(i);
  }
}

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex,
         size_t ParamIndex,
         typename... Args,
         typename>
double CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::Evaluate(
    const arma::mat& parameters,
    const Args&... args)
{
  return PutNextArg<BoundArgIndex, ParamIndex>(parameters, args...);
}

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex,
         size_t ParamIndex,
         typename... Args,
         typename,
         typename>
double CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::Evaluate(
    const arma::mat& /* parameters */,
    const Args&... args)
{
  double objective = cv.Evaluate(args...);

  // Change the best model if we have got a better score, or if we probably
  // have not assigned any valid (trained) model yet.
  if (bestObjective > objective ||
      bestObjective == std::numeric_limits<double>::max())
  {
    bestObjective = objective;
    bestModel = std::move(cv.Model());
  }

  return objective;
}

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex,
         size_t ParamIndex,
         typename... Args,
         typename>
double CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::PutNextArg(
    const arma::mat& parameters,
    const Args&... args)
{
  return Evaluate<BoundArgIndex + 1, ParamIndex>(
      parameters, args..., std::get<BoundArgIndex>(boundArgs).value);
}

template<typename CVType,
         typename MLAlgorithm,
         size_t TotalArgs,
         typename... BoundArgs>
template<size_t BoundArgIndex,
         size_t ParamIndex,
         typename... Args,
         typename,
         typename>
double CVFunction<CVType, MLAlgorithm, TotalArgs, BoundArgs...>::PutNextArg(
    const arma::mat& parameters,
    const Args&... args)
{
  return Evaluate<BoundArgIndex, ParamIndex + 1>(
      parameters, args..., parameters(ParamIndex, 0));
}

} // namespace hpt
} // namespace mlpack

#endif
```
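The `Gradient` method above approximates each partial derivative with a forward difference whose step is `max(|x_i| * relativeDelta, minDelta)`. The following standalone sketch shows the same rule without Armadillo or the cross-validation machinery; the function name `ForwardDifferenceGradient` and the use of `std::function` and `std::vector` are assumptions made only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Forward-difference approximation of the gradient of f at x.
// The step for coordinate i is max(|x[i]| * relativeDelta, minDelta), so
// coordinates near zero still get a usable step.
std::vector<double> ForwardDifferenceGradient(
    const std::function<double(const std::vector<double>&)>& f,
    const std::vector<double>& x,
    const double relativeDelta,
    const double minDelta)
{
  assert(minDelta > 0.0);

  std::vector<double> gradient(x.size());
  std::vector<double> shifted = x;
  const double fx = f(x);  // Evaluated once and reused for every coordinate.

  for (std::size_t i = 0; i < x.size(); ++i)
  {
    const double delta = std::max(std::abs(x[i]) * relativeDelta, minDelta);
    shifted[i] += delta;
    gradient[i] = (f(shifted) - fx) / delta;
    shifted[i] = x[i];  // Restore the coordinate before moving on.
  }

  return gradient;
}
```

For an objective backed by cross-validation, every call to `f` is a full cross-validation run, so this scheme costs one run for the base point plus one per coordinate.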
I think you can simplify this; `if (objective < bestObjective)` would be sufficient to also capture the case where `bestObjective == DBL_MAX`.

Do you mean to use `DBL_MAX` instead of `std::numeric_limits<double>::max()`?

No, I was being lazy: it is quicker to type `DBL_MAX` than `std::numeric_limits<double>::max()`, and I was using a phone so I did not want to type too much. :)

The idea of what I was saying, though, is that there is no need to check if `bestObjective == std::numeric_limits<double>::max()`; if that condition holds, then it will always be true that `objective < bestObjective` (...assuming that `objective` is not `std::numeric_limits<double>::max()`, but even if that is true, that corner case can be handled by relaxing the conditional to `if (objective <= bestObjective)` and thereby taking the better objective even when it is the same). Let me know if I can clarify further.

There are some trade-offs: we can use fewer lines (and CPU cycles) for condition checking, but we will potentially execute the `if` body more often. Also, the implementation with one condition is inconsistent with `GridSearch`; as a result, we can provide a model trained with different hyper-parameters than the ones returned from `Optimize`.

Ah, I did not realize relaxing the condition made it inconsistent. In that case it seems like we need to leave it as-is.
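The difference between the two conditions shows up only on ties. The sketch below replays a sequence of objectives and reports which evaluation's model would be kept under each rule. The function names are hypothetical, and the comparison with `GridSearch` rests on the assumption, taken from the comment above, that the optimizer returns the first point attaining the best objective.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Rule used in the PR: replace the stored model on a strict improvement, or
// while the stored objective is still the initial sentinel value.
std::size_t KeptIndexStrict(const std::vector<double>& objectives)
{
  assert(!objectives.empty());
  const double sentinel = std::numeric_limits<double>::max();
  double best = sentinel;
  std::size_t kept = 0;
  for (std::size_t i = 0; i < objectives.size(); ++i)
  {
    if (best > objectives[i] || best == sentinel)
    {
      best = objectives[i];
      kept = i;
    }
  }
  return kept;
}

// Relaxed rule discussed in review: a single comparison that also replaces
// the stored model on ties.
std::size_t KeptIndexRelaxed(const std::vector<double>& objectives)
{
  assert(!objectives.empty());
  double best = std::numeric_limits<double>::max();
  std::size_t kept = 0;
  for (std::size_t i = 0; i < objectives.size(); ++i)
  {
    if (objectives[i] <= best)
    {
      best = objectives[i];
      kept = i;
    }
  }
  return kept;
}
```

For the objectives `{0.5, 0.3, 0.3}`, the strict rule keeps the model from evaluation 1 and the relaxed rule keeps the one from evaluation 2. If the optimizer reports the hyper-parameters of evaluation 1, only the strict rule returns the matching model.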