Merged
2 changes: 1 addition & 1 deletion .JuliaFormatter.toml
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
style = "sciml"
yas_style_nesting=true
yas_style_nesting=false
ignore = ["docs"]
5 changes: 3 additions & 2 deletions docs/src/API.md
Expand Up @@ -25,11 +25,12 @@ Results are stored in a `FidesSolution` struct:
Fides.FidesSolution
```

## Hessian Approximations
## Hessian Options

In cases where the Hessian is too expensive or difficult to compute, several Hessian approximations are supported. The BFGS method is often effective:
Multiple Hessian options and approximation methods are available. When the Hessian is too costly or difficult to compute, the `BFGS` method is often a good choice:
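As a hedged sketch, the options documented below can be constructed as follows (the `DFP` keyword and the positional `phi` argument of `Broyden` follow the constructor signatures in `src/hessian_update.jl`; the no-argument `SR1()` form is an assumption):

```julia
using Fides

hess_bfgs = Fides.BFGS()                          # often a good default
hess_sr1 = Fides.SR1()                            # assumed no-argument constructor
hess_dfp = Fides.DFP(; enforce_curv_cond = true)
hess_broyden = Fides.Broyden(0.5)                 # phi interpolates between BFGS and DFP
hess_custom = Fides.CustomHessian()               # when a Hessian function is given to FidesProblem
```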

```@docs
Fides.CustomHessian
Fides.BFGS
Fides.SR1
Fides.DFP
Expand Down
34 changes: 18 additions & 16 deletions docs/src/tutorial.md
Expand Up @@ -4,7 +4,7 @@ This overarching tutorial describes how to solve an optimization problem with Fi

## Input - a Function to Minimize

Fides requires a function to minimize, its gradient and optionally its Hessian. In this tutorial, the nonlinear Rosenbrock function is used:
Fides requires a function to minimize, its gradient and optionally its Hessian. In this tutorial, we use the nonlinear Rosenbrock function:

```math
f(x_1, x_2) = (1.0 - x_1)^2 + 100.0(x_2 - x_1^2)^2
Expand All @@ -19,7 +19,7 @@ end
nothing # hide
```

In particular, `x` may be either a `Vector` or a `ComponentVector` from [ComponentArrays.jl](https://github.com/SciML/ComponentArrays.jl). Fides also requires the gradient, and optionally Hessian function. In this example, for convenience we compute both via automatic differentiation using [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl):
Here, `x` may be either a `Vector` or a `ComponentVector` from [ComponentArrays.jl](https://github.com/SciML/ComponentArrays.jl). Fides also requires a gradient function, and optionally a Hessian function. In this example, for convenience we compute both via automatic differentiation using [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl):

```@example 1
using ForwardDiff
Expand All @@ -42,30 +42,34 @@ x0 = [ 2.0, 2.0]
prob = FidesProblem(f, grad!, x0; lb = lb, ub = ub)
```

Where `x0` is the initial guess for parameter estimation, and `lb` and `ub` are the lower and upper parameter bounds (defaulting to `-Inf` and `Inf` if unspecified). The problem is then minimized by calling `solve`, and when the Hessian is unavailable or too expensive to compute, a Hessian approximation is chosen during this step:
Where `x0` is the initial guess for parameter estimation, and `lb` and `ub` are the lower and upper parameter bounds (defaulting to `-Inf` and `Inf` if unspecified). The problem is then minimized by calling `solve`. When the Hessian is unavailable or too expensive to compute, a Hessian approximation is provided during this step:

```@example 1
sol = solve(prob, Fides.BFGS()) # hide
sol = solve(prob, Fides.BFGS())
```

Several Hessian approximations are supported (see the [API](@ref API)), and of these `BFGS` generally performs well. Additional tuning options can be set by providing a [`FidesOptions`](@ref) struct via the `options` keyword in `solve`, and a full list of available options is documented in the [API](@ref API).
Several Hessian approximations are supported (see the [API](@ref API)), and `BFGS` generally performs well. Additional tuning options can be set by providing a [`FidesOptions`](@ref) struct via the `options` keyword in `solve`, and a full list of available options can be found in the [API](@ref API) documentation.
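As a hedged sketch of passing options (reusing `prob` from above; the `maxiter` and `gatol` keywords match the names in `src/options.jl`):

```julia
options = FidesOptions(; maxiter = 2000, gatol = 1e-8)
sol = solve(prob, Fides.BFGS(); options = options)
```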

## Optimization with a User-Provided Hessian

If the Hessian (or a suitable approximation such as the [Gauss–Newton approximation](https://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm)) is available, providing it can improve convergence. To provide a Hessian function to `FidesProblem`, do:

```@example 1
prob = FidesProblem(f, grad!, x0; hess! = hess!, lb = lb, ub = ub)
sol = solve(prob) # hide
sol = solve(prob)
nothing # hide
```

Since a Hessian function is provided, no Hessian approximation needs to be specified.
Then, when solving the problem, use the `Fides.CustomHessian()` Hessian option:

```@example 1
sol = solve(prob, Fides.CustomHessian()) # hide
sol = solve(prob, Fides.CustomHessian())
```

## Performance Tip: Computing Derivatives and Objective Simultaneously

Internally, the objective function and its derivatives are computed simultaneously by Fides. Hence, runtime can be reduced if intermediate quantities are reused between the objective and derivative computations. To take advantage of this, a `FidesProblem` can be created with a function that computes the objective and gradient (and optionally the Hessian) for a given input. For example, when only the gradient is available:
Internally, the objective function and its derivatives are computed simultaneously by Fides. Hence, runtime can be reduced if it is possible to reuse intermediate quantities between the objective and derivative computations. To take advantage of this, a `FidesProblem` can be created with a function that computes the objective and gradient (and optionally the Hessian) for a given input. For example, when only the gradient is available:

```@example 1
function fides_obj(x)
Expand All @@ -74,13 +78,12 @@ function fides_obj(x)
return (obj, g)
end

hess = false
prob = FidesProblem(fides_obj, x0, hess; lb = lb, ub = ub)
prob = FidesProblem(fides_obj, x0; lb = lb, ub = ub)
sol = solve(prob, Fides.BFGS()) # hide
sol = solve(prob, Fides.BFGS())
```

Here, the variable `hess` indicates whether the objective function also returns the Hessian. When a Hessian function is available, do:
When a Hessian function is available, do:

```@example 1
function fides_obj(x)
Expand All @@ -90,10 +93,9 @@ function fides_obj(x)
return (obj, g, H)
end

hess = true
prob = FidesProblem(fides_obj, x0, hess; lb = lb, ub = ub)
sol = solve(prob) # hide
sol = solve(prob)
prob = FidesProblem(fides_obj, x0; lb = lb, ub = ub)
sol = solve(prob, Fides.CustomHessian()) # hide
sol = solve(prob, Fides.CustomHessian())
```

In this simple example, no runtime benefit is obtained as not quantities are reused between objective and derivative computations. However, if quantities can be reused (for example, when gradients are computed for ODE models), runtime can be noticeably reduced.
In this simple example, no runtime benefit is obtained as no quantities are reused between objective and derivative computations. However, if quantities can be reused (for example, when gradients are computed for ODE models), runtime can be noticeably reduced.
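To make the reuse concrete, here is a hypothetical sketch for the Rosenbrock function in which the residual terms are computed once and shared between the objective and the gradient (the `FidesProblem` call mirrors the constructor used above):

```julia
function fides_obj_shared(x)
    # Intermediate residuals, computed once and reused below
    r1 = 1.0 - x[1]
    r2 = x[2] - x[1]^2
    obj = r1^2 + 100.0 * r2^2
    # Gradient assembled from the same residuals
    g = [-2.0 * r1 - 400.0 * x[1] * r2, 200.0 * r2]
    return (obj, g)
end

prob = FidesProblem(fides_obj_shared, x0; lb = lb, ub = ub)
sol = solve(prob, Fides.BFGS())
```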
2 changes: 1 addition & 1 deletion src/Fides.jl
Expand Up @@ -21,7 +21,7 @@ const LOGGING_LEVELS = ["warning", "info", "error", "debug"]
const InputVector = Union{Vector{<:Real}, ComponentVector{<:Real}}

include(joinpath(@__DIR__, "hessian_update.jl"))
const HessianUpdate = Union{BB, SR1, BG, BFGS, DFP, Broyden}
const HessianUpdate = Union{BB, SR1, BG, BFGS, DFP, Broyden, CustomHessian}

include(joinpath(@__DIR__, "problem.jl"))
include(joinpath(@__DIR__, "options.jl"))
Expand Down
18 changes: 15 additions & 3 deletions src/hessian_update.jl
@@ -1,3 +1,15 @@
"""
CustomHessian()

User-provided Hessian function.

The Hessian function should be provided when creating a `FidesProblem`.

See also: [`FidesProblem`](@ref)
"""
struct CustomHessian
end

"""
BB(; init_hess = nothing)

Expand Down Expand Up @@ -90,7 +102,7 @@ struct BFGS{T <: Union{Nothing, AbstractMatrix}}
init_with_hess::Bool
end
function BFGS(; init_hess::Union{Nothing, AbstractMatrix} = nothing,
enforce_curv_cond::Bool = true)
enforce_curv_cond::Bool = true)
init_with_hess = _get_init_with_hess(init_hess)
return BFGS(init_hess, enforce_curv_cond, init_with_hess)
end
Expand All @@ -117,7 +129,7 @@ struct DFP{T <: Union{Nothing, AbstractMatrix}}
init_with_hess::Bool
end
function DFP(; init_hess::Union{Nothing, AbstractMatrix} = nothing,
enforce_curv_cond::Bool = true)
enforce_curv_cond::Bool = true)
init_with_hess = _get_init_with_hess(init_hess)
return DFP(init_hess, enforce_curv_cond, init_with_hess)
end
Expand Down Expand Up @@ -151,7 +163,7 @@ struct Broyden{T <: Union{Nothing, AbstractMatrix}}
init_with_hess::Bool
end
function Broyden(phi::AbstractFloat; init_hess::Union{Nothing, AbstractMatrix} = nothing,
enforce_curv_cond::Bool = true)
enforce_curv_cond::Bool = true)
init_with_hess = _get_init_with_hess(init_hess)
return Broyden(phi, init_hess, enforce_curv_cond, init_with_hess)
end
16 changes: 8 additions & 8 deletions src/options.jl
Expand Up @@ -56,12 +56,12 @@ struct FidesOptions{T <: Union{String, Nothing}}
history_file::T
end
function FidesOptions(; maxiter::Integer = 1000, fatol::Float64 = 1e-8,
frtol::Float64 = 1e-8, gatol::Float64 = 1e-6, grtol::Float64 = 0.0,
xtol::Float64 = 0.0, maxtime::Float64 = Inf, verbose = "warning",
subspace_solver::String = "2D", stepback_strategy::String = "reflect",
delta_init::Float64 = 1.0, mu::Float64 = 0.25, eta::Float64 = 0.75,
theta_max = 0.95, gamma1::Float64 = 0.25, gamma2::Float64 = 2.0,
history_file = nothing)::FidesOptions
frtol::Float64 = 1e-8, gatol::Float64 = 1e-6, grtol::Float64 = 0.0,
xtol::Float64 = 0.0, maxtime::Float64 = Inf, verbose = "warning",
subspace_solver::String = "2D", stepback_strategy::String = "reflect",
delta_init::Float64 = 1.0, mu::Float64 = 0.25, eta::Float64 = 0.75,
theta_max = 0.95, gamma1::Float64 = 0.25, gamma2::Float64 = 2.0,
history_file = nothing)::FidesOptions
if !(stepback_strategy in STEPBACK_STRATEGIES)
throw(ArgumentError("$(stepback_strategy) is not a valid stepback strategy. \
Valid options are $(STEPBACK_STRATEGIES)"))
Expand All @@ -76,8 +76,8 @@ function FidesOptions(; maxiter::Integer = 1000, fatol::Float64 = 1e-8,
end

return FidesOptions(maxiter, fatol, frtol, gatol, grtol, xtol, maxtime, verbose,
stepback_strategy, subspace_solver, delta_init, mu, eta, theta_max,
gamma1, gamma2, history_file)
stepback_strategy, subspace_solver, delta_init, mu, eta, theta_max,
gamma1, gamma2, history_file)
end

"""
Expand Down
41 changes: 27 additions & 14 deletions src/problem.jl
Expand Up @@ -15,13 +15,17 @@ Optimization problem to be minimized with the Fides Newton Trust Region optimize
- `lb`: Lower parameter bounds. Defaults to `-Inf` if not specified.
- `ub`: Upper parameter bounds. Defaults to `Inf` if not specified.

!!! note
In case a Hessian function is provided to `FidesProblem`, `Fides.CustomHessian()` must be
passed to `solve`.

See also [solve](@ref) and [FidesOptions](@ref).

FidesProblem(fides_obj, x0, hess::Bool; lb = nothing, ub = nothing)
FidesProblem(fides_obj, x0; lb = nothing, ub = nothing)

Optimization problem created from a function that computes:
- `hess = false`: Objective and gradient; `fides_obj(x) -> (obj, g)`.
- `hess = true`: Objective, gradient and Hessian; `fides_obj(x) -> (obj, g, H)`.
- Objective and gradient; `fides_obj(x) -> (obj, g)`.
- Objective, gradient and Hessian; `fides_obj(x) -> (obj, g, H)`.

Internally, Fides computes the objective function and derivatives simultaneously. Therefore,
this constructor is the most runtime-efficient option when intermediate quantities can be
Expand Down Expand Up @@ -65,7 +69,7 @@ struct FidesProblem{T <: AbstractVector}
user_hessian::Bool
end
function FidesProblem(f::Function, grad!::Function, x0::InputVector; hess! = nothing,
lb = nothing, ub = nothing)
lb = nothing, ub = nothing)
_lb = _get_bounds(x0, lb, :lower)
_ub = _get_bounds(x0, ub, :upper)
# To ensure correct input type to f, grad!, hess! a variable having the same type as
Expand All @@ -76,22 +80,31 @@ function FidesProblem(f::Function, grad!::Function, x0::InputVector; hess! = not
user_hessian = !isnothing(hess!)
return FidesProblem(fides_objective, fides_objective_py, x0, _lb, _ub, user_hessian)
end
function FidesProblem(fides_objective::Function, x0::InputVector, hess::Bool; lb = nothing,
ub = nothing)
function FidesProblem(
fides_objective::Function, x0::InputVector; lb = nothing, ub = nothing)
_lb = _get_bounds(x0, lb, :lower)
_ub = _get_bounds(x0, ub, :upper)
# Get number of output arguments
ret = fides_objective(x0)
if length(ret) < 2 || length(ret) > 3
throw(ArgumentError("Fides objective function can only return 2 or 3 values, not \
$(length(ret))"))
end
# See xinput comment above
xinput = similar(x0)
if hess == false
if length(ret) == 2
hess = false
fides_objective_py = _get_fides_objective(fides_objective, nothing, xinput, true)
else
elseif length(ret) == 3
hess = true
fides_objective_py = _get_fides_objective(fides_objective, xinput, true)
end
return FidesProblem(fides_objective, fides_objective_py, x0, _lb, _ub, hess)
end

function _get_fides_objective(f::Function, grad!::Function, hess!::Union{Function, Nothing},
xinput::InputVector, py::Bool)::Function
function _get_fides_objective(
f::Function, grad!::Function, hess!::Union{Function, Nothing},
xinput::InputVector, py::Bool)::Function
if !isnothing(hess!)
fides_objective = (x) -> let _grad! = grad!, _f = f, _hess! = hess!,
_xinput = xinput, _py = py
Expand All @@ -106,14 +119,14 @@ function _get_fides_objective(f::Function, grad!::Function, hess!::Union{Functio
return fides_objective
end
function _get_fides_objective(f_grad::Function, ::Nothing, xinput::InputVector,
py::Bool)::Function
py::Bool)::Function
fides_objective = (x) -> let _f_grad = f_grad, _xinput = xinput, _py = py
return _fides_objective(x, _f_grad, nothing, _xinput, _py)
end
return fides_objective
end
function _get_fides_objective(f_grad_hess::Function, xinput::InputVector,
py::Bool)::Function
py::Bool)::Function
fides_objective = (x) -> let _f_grad_hess = f_grad_hess, _xinput = xinput, _py = py
return _fides_objective(x, _f_grad_hess, _xinput, _py)
end
Expand All @@ -127,7 +140,7 @@ function _fides_objective(x, f::Function, grad!::Function, xinput::InputVector,
return _get_fides_results(obj, g, py)
end
function _fides_objective(x, f::Function, grad!::Function, hess!::Function,
xinput::InputVector, py::Bool)
xinput::InputVector, py::Bool)
_get_xinput!(xinput, x)
obj = f(xinput)
g = _grad_fides(xinput, grad!)
Expand Down Expand Up @@ -159,7 +172,7 @@ function _hess_fides(x::InputVector, hess!::Function)::Matrix
end

function _get_bounds(x0::InputVector, bound::Union{InputVector, Nothing},
which_bound::Symbol)::AbstractVector
which_bound::Symbol)::AbstractVector
@assert which_bound in [:lower, :upper] "Only lower and upper bounds are supported"
!isnothing(bound) && return bound
_bound = similar(x0)
Expand Down