# LearningToOptimize Design

LearningToOptimize (L2O) is a Julia framework that streamlines the use of (deep) learning methods for approximating the solution to parametric optimization problems. Users communicate to L2O:

- the parametric JuMP model
- sampling functions/distributions for the parameters
- dataset-generation settings
- a neural network that takes the parameter vector as input and returns a prediction for all the variables of the model
- a training-loop body using the L2O API, or custom training code

The model definition code is identical to the code for solving a single instance in JuMP using `Parameter`:

```julia
using JuMP, HiGHS

N = 10
m = Model()
@variable(m, x[1:N])
set_lower_bound.(x, 0.0)
set_upper_bound.(x, 1.0)
@variable(m, p[1:N] ∈ Parameter.(0.5))
@constraint(m, x .≥ p)
@objective(m, Min, sum(x))
```

Since we are interested in a distribution over the parameters, users have to specify that distribution:

```julia
using Distributions
import LearningToOptimize as L2O

dist = product_distribution([Uniform(0.0, 1.0) for _ in 1:N])
sampler = L2O.make_sampler(dist)
```

Note: it may be worth forcing/letting users formulate with (a subset of) `InfiniteOpt.jl` directly instead.

Next, define the neural network and training loop; this depends on the learning method. L2O comes with a flexible `Trainer` type inspired by Python's `pytorch-lightning`. Below are some examples of this API, which complements an extensive public lower-level API.
First, a proxy (self-supervised) example:

```julia
import KernelAbstractions as KA
using Flux

proxy_nn = L2O.MLP(m,  # default -> all parameters to all variables
    hidden_sizes = (1024, 1024, 1024),
    activation = Flux.softplus,
    batchnorm = true,
)

unlabeled_data = L2O.generate_data(m, sampler, config = (
    num_samples = 100_000,
    solve = false,
))

proxy_trainer = L2O.Trainer(m, unlabeled_data, proxy_nn, config = (
    batch_size = 64,
    epochs = 1000,
    rule = Flux.Adam,
    backend = KA.CUDABackend(),
    projection = :bound_clamp,  # clamp to bound constraints
    # projection = :euclidean,  # InfiniteExaModels for batched projection
    # projection = func,        # custom projection (must be batched, GPU-friendly)
    # projection = :none,       # default, no-op
    loss = (tr, p_batch) -> begin
        x_pred = tr.nn(p_batch)
        x_proj = tr.project_x(x_pred)
        tr.objective_value(x_proj)
    end,
))
L2O.train!(proxy_trainer)
```

Next, a PINN-style (MSE + constraint-violation penalty) example:

```julia
pinn_nn = L2O.MLP(m)
M = 1e5  # penalty weight

labeled_data = L2O.generate_data(m, sampler, config = (
    num_samples = 100_000,
    solve = true,
    optimizer = HiGHS.Optimizer,
))

pinn_trainer = L2O.Trainer(m, labeled_data, pinn_nn, config = (
    batch_size = 64,
    epochs = 1000,
    backend = KA.CUDABackend(),
    # no projection
    loss = (tr, p_batch, x_true_batch) -> begin  # 3-argument form when solve=true
        x_pred = tr.nn(p_batch)
        v_pred = tr.constraint_violations(x_pred)
        Flux.mse(x_pred, x_true_batch) + M * sum(v_pred)
    end,
))
L2O.train!(pinn_trainer)
```

Next, a primal-dual penalty method:

```julia
pd_nn = L2O.MLP(m, predict_dual = true)
M = 1e5  # penalty weight

pd_data = L2O.generate_data(m, sampler, config = (
    num_samples = 100_000,
    solve = true,
    store_duals = true,
))

pd_trainer = L2O.Trainer(m, pd_data, pd_nn, config = (
    batch_size = 64,
    epochs = 1000,
    backend = KA.CUDABackend(),
    projection = :none,
    loss = (tr, p_batch, xy_true) -> begin
        x_true, y_true = tr.split(xy_true)
        xy_pred = tr.nn(p_batch)
        x_pred, y_pred = tr.split(xy_pred)
        v_x_pred = tr.constraint_violations(x_pred, p_batch)
        v_y_pred = tr.dual_constraint_violations(y_pred, p_batch)
        Flux.mse(xy_pred, xy_true) + M * (sum(v_x_pred) + sum(v_y_pred))
    end,
))
L2O.train!(pd_trainer)
```

Finally, a DLL (Dual Lagrangian Learning) example:

```julia
import DualLagrangianLearning as DLL

DLL.decompose!(m, DLL.BoundDecomposition)
dll_nn = DLL.MLP(m)

dll_trainer = DLL.Trainer(m, unlabeled_data, dll_nn, config = (
    batch_size = 64,
    epochs = 1000,
    backend = KA.CUDABackend(),
))
L2O.train!(dll_trainer)
```

## BatchOptInterface

For batch-mode training/inference, L2O automates the generation of functions that evaluate problem data for a batch of parameter values/predictions. This is facilitated by the InfiniteExaModels package. Specifically, L2O formulates an InfiniteOpt model that mimics the user model but with all variables depending on infinite parameters. The objective is set to the expectation of the user objective over the parameters, and the number of supports is set to the batch size. InfiniteExaModels then transcribes this model to an ExaModel, which finally allows L2O to call the batched GPU problem-data kernels in ExaModels.

### Obstacles

1. ExaModels does not support Float32. It is unclear where it hard-codes Float64, but KernelAbstractions cannot compile with Float32. It is also unclear whether InfiniteOpt supports Float32, though that is less critical.
2. ExaModels only supports `constraints(x)`, not `constraints(x, p)`. We would need to modify the values in `SIMDFunction.itr` on every batch.
3. ExaModels automatically reduces over the SIMD dimension. For basic use cases this is fine, but we should be able to give the user, e.g., the violations for each sample.
4. The ExaModels <-> MOI mapping needs to be robust. InfiniteExaModels should make this easy.
5. Variable batch-size support would be nice.
6. It may be advantageous to have two levels of SIMD: one for the batch and one as in a normal ExaModel.
7. We need access to the ExaCore for computing violations. That should be easy.
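The InfiniteOpt formulation step described above can be illustrated on the toy model with real `InfiniteOpt.jl` syntax. This is a hand-written sketch of what L2O would construct internally, not existing L2O code; the names `im` and `ξ` are illustrative. Each parameter `p[i]` becomes an infinite parameter, the supports play the role of the batch, and the objective is the expectation over the parameters:

```julia
using InfiniteOpt, Distributions

N, batch_size = 10, 64

im = InfiniteModel()
# each parameter p[i] becomes an infinite parameter; its supports form the batch
@infinite_parameter(im, ξ[1:N] ~ Uniform(0.0, 1.0), num_supports = batch_size)
# every decision variable now depends on ξ
@variable(im, 0 <= x[1:N] <= 1, Infinite(ξ))
@constraint(im, [i = 1:N], x[i] >= ξ[i])
# expectation of the user objective over the parameter distribution
@objective(im, Min, expect(sum(x), ξ))
```

InfiniteExaModels would then transcribe `im` over its supports, yielding one SIMD-replicated copy of the constraints and objective per batch element.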
Due by June 29, 2026