
Commit 3c39941

Author: Alexey Stukalov
docs: sync with Sem refactor
1 parent c1e7b04, commit 3c39941

9 files changed: 61 additions & 98 deletions

docs/src/developer/implied.md

Lines changed: 11 additions & 17 deletions

````diff
@@ -13,9 +13,9 @@ end
 and a method to update!:
 
 ```julia
-import StructuralEquationModels: objective!
+import StructuralEquationModels: update!
 
-function update!(targets::EvaluationTargets, implied::MyImplied, model::AbstractSemSingle, params)
+function update!(targets::EvaluationTargets, implied::MyImplied, params)
 
     if is_objective_required(targets)
         ...
@@ -31,7 +31,7 @@ function update!(targets::EvaluationTargets, implied::MyImplied, model::Abstract
 end
 ```
 
-As you can see, `update` gets passed as a first argument `targets`, which is telling us whether the objective value, gradient, and/or hessian are needed.
+As you can see, `update!` gets passed as a first argument `targets`, which is telling us whether the objective value, gradient, and/or hessian are needed.
 We can then use the functions `is_..._required` and conditional on what the optimizer needs, we can compute and store things we want to make available to the loss functions. For example, as we have seen in [Second example - maximum likelihood](@ref), the `RAM` implied type computes the model-implied covariance matrix and makes it available via `implied.Σ`.
 
@@ -56,7 +56,7 @@ Empty placeholder for models that don't need an implied part.
 - `specification`: either a `RAMMatrices` or `ParameterTable` object
 
 # Examples
-A multigroup model with ridge regularization could be specified as a `SemEnsemble` with one
+A multigroup model with ridge regularization could be specified as a `Sem` with one
 model per group and an additional model with `ImpliedEmpty` and `SemRidge` for the regularization part.
 
 # Extended help
@@ -75,26 +75,20 @@ end
 ### Constructors
 ############################################################################################
 
-function ImpliedEmpty(;
-    specification,
-    meanstruct = NoMeanStruct(),
-    hessianeval = ExactHessian(),
+function ImpliedEmpty(
+    spec::SemSpecification;
+    hessianeval::HessianApprox = ExactHessian(),
     kwargs...,
 )
-    return ImpliedEmpty(hessianeval, meanstruct, convert(RAMMatrices, specification))
+    ram_matrices = convert(RAMMatrices, spec)
+    return ImpliedEmpty(hessianeval, MeanStruct(ram_matrices), ram_matrices)
 end
 
 ############################################################################################
 ### methods
 ############################################################################################
 
-update!(targets::EvaluationTargets, implied::ImpliedEmpty, par, model) = nothing
-
-############################################################################################
-### Recommended methods
-############################################################################################
-
-update_observed(implied::ImpliedEmpty, observed::SemObserved; kwargs...) = implied
+update!(targets::EvaluationTargets, implied::ImpliedEmpty, par) = nothing
 ```
 
-As you see, similar to [Custom loss functions](@ref) we implement a method for `update_observed`.
+As you see, similar to [Custom loss functions](@ref) we implement a constructor.
````
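With the refactored API recorded in this file, a custom implied type only needs the three-argument `update!` method (the `model` argument is gone). A minimal sketch of what an implementation could look like; the `MyImplied` struct, its `Σ` field, and the body comments are illustrative assumptions, not part of the commit:

```julia
import StructuralEquationModels: update!
using StructuralEquationModels: SemImplied, EvaluationTargets, is_objective_required

# Hypothetical implied type; the field name Σ mirrors the RAM example above.
mutable struct MyImplied <: SemImplied
    Σ::Matrix{Float64}  # model-implied covariance, filled by update!
end

function update!(targets::EvaluationTargets, implied::MyImplied, params)
    if is_objective_required(targets)
        # compute and cache whatever the loss functions will read,
        # e.g. overwrite implied.Σ from the current params
    end
    return nothing
end
```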

docs/src/developer/loss.md

Lines changed: 0 additions & 11 deletions

````diff
@@ -166,17 +166,6 @@ end
 
 ## Additional functionality
 
-### Update observed data
-
-If you are planing a simulation study where you have to fit the **same model** to many **different datasets**, it is computationally beneficial to not build the whole model completely new everytime you change your data.
-Therefore, we provide a function to update the data of your model, `replace_observed(model(semfit); data = new_data)`. However, we can not know beforehand in what way your loss function depends on the specific datasets. The solution is to provide a method for `update_observed`. Since `Ridge` does not depend on the data at all, this is quite easy:
-
-```julia
-import StructuralEquationModels: update_observed
-
-update_observed(ridge::Ridge, observed::SemObserved; kwargs...) = ridge
-```
-
 ### Access additional information
 
 If you want to provide a way to query information about loss functions of your type, you can provide functions for that:
````

docs/src/developer/optimizer.md

Lines changed: 0 additions & 8 deletions

````diff
@@ -25,12 +25,6 @@ struct MyoptResult{O <: SemOptimizerMyopt} <: SEM.SemOptimizerResult{O}
     ...
 end
 
-############################################################################################
-### Recommended methods
-############################################################################################
-
-update_observed(optimizer::SemOptimizerMyopt, observed::SemObserved; kwargs...) = optimizer
-
 ############################################################################################
 ### additional methods
 ############################################################################################
@@ -43,8 +37,6 @@ and `SEM.sem_optimizer_subtype(::Val{:Myopt})` returns `SemOptimizerMyopt`.
 This instructs *SEM.jl* to use `SemOptimizerMyopt` when `:Myopt` is specified as the engine for
 model fitting: `fit(..., engine = :Myopt)`.
 
-A method for `update_observed` and additional methods might be usefull, but are not necessary.
-
 Now comes the essential part: we need to provide the [`fit`](@ref) method with `SemOptimizerMyopt`
 as the first positional argument.
````

docs/src/developer/sem.md

Lines changed: 8 additions & 9 deletions

````diff
@@ -1,13 +1,14 @@
 # Custom model types
 
-The abstract supertype for all models is `AbstractSem`, which has two subtypes, `AbstractSemSingle{O, I, L}` and `AbstractSemCollection`. Currently, there are 2 subtypes of `AbstractSemSingle`: `Sem`, `SemFiniteDiff`. All subtypes of `AbstractSemSingle` should have at least observed, implied, loss and optimizer fields, and share their types (`{O, I, L}`) with the parametric abstract supertype. For example, the `SemFiniteDiff` type is implemented as
+The abstract supertype for all models is [`AbstractSem`](@ref). Currently, there are 2 concrete subtypes:
+`Sem{L <: Tuple}` and `SemFiniteDiff{S <: AbstractSem}`.
+A `Sem` model holds a tuple of `LossTerm`s (each wrapping an `AbstractLoss`) and a vector of parameter labels. Both single-group and multigroup models are represented as `Sem`.
+
+`SemFiniteDiff` wraps any `AbstractSem` and substitutes dedicated gradient/hessian evaluation with finite difference approximation:
 
 ```julia
-struct SemFiniteDiff{O <: SemObserved, I <: SemImplied, L <: SemLoss} <:
-       AbstractSemSingle{O, I, L}
-    observed::O
-    implied::I
-    loss::L
+struct SemFiniteDiff{S <: AbstractSem} <: AbstractSem
+    model::S
 end
 ```
 
@@ -17,6 +18,4 @@ Additionally, you can change how objective/gradient/hessian values are computed
 evaluate!(objective, gradient, hessian, model::SemFiniteDiff, params) = ...
 ```
 
-Additionally, we can define constructors like the one in `"src/frontend/specification/Sem.jl"`.
-
-It is also possible to add new subtypes for `AbstractSemCollection`.
+Additionally, we can define constructors like the one in `"src/frontend/specification/Sem.jl"`.
````
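Given the wrapper design this file introduces, switching a model to finite-difference derivatives amounts to wrapping it. A hedged sketch, assuming `model` is an existing `AbstractSem` built elsewhere:

```julia
using StructuralEquationModels

# Wrap an existing model so that gradients/hessians are obtained via
# finite differences instead of dedicated analytic evaluation.
model_fd = SemFiniteDiff(model)

# The wrapper is itself an AbstractSem, so it can be fit like any model.
fit_result = fit(model_fd)
```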

docs/src/internals/types.md

Lines changed: 12 additions & 8 deletions

````diff
@@ -2,12 +2,16 @@
 
 The type hierarchy is implemented in `"src/types.jl"`.
 
-`AbstractSem`: the most abstract type in our package
-- `AbstractSemSingle{O, I, L} <: AbstractSem` is an abstract parametric type that is a supertype of all single models
-  - `Sem`: models that do not need automatic differentiation or finite difference approximation
-  - `SemFiniteDiff`: models whose gradients and/or hessians should be computed via finite difference approximation
-- `AbstractSemCollection <: AbstractSem` is an abstract supertype of all models that contain multiple `AbstractSem` submodels
+[`AbstractLoss`](@ref): is the base abstract type for all loss functions:
+- `SemLoss{O <: SemObserved, I <: SemImplied}`: is the subtype of `AbstractLoss`, which is the
+  base for all SEM-specific loss functions ([`SemML`](@ref), [`SemWLS`](@ref) etc) that
+  evaluate how closely the implied covariation structure (represented by the object of type `I`)
+  matches the observed one (contained in the object of type `O`);
+- regularizing terms (e.g. [`SemRidge`](@ref)) are implemented as subtypes of `AbstractLoss`.
 
-Every `AbstractSemSingle` has to have `SemObserved`, `SemImplied`, and `SemLoss` fields (and can have additional fields).
-
-`SemLoss` is a container for multiple `SemLossFunctions`.
+[`AbstractSem`](@ref) is the base abstract type for all SEM models. It has two concrete subtypes:
+- `Sem{L <: Tuple} <: AbstractSem`: the main SEM model type that implements a list of weighted
+  loss terms (using [`LossTerm`](@ref) wrapper around `AbstractLoss`) and allows modeling both single
+  and multi-group SEMs and combining them with regularization terms.
+- `SemFiniteDiff{S <: AbstractSem} <: AbstractSem`: a wrapper around any `AbstractSem` that
+  substitutes dedicated gradient/hessian evaluation with finite difference approximation.
````

docs/src/performance/mixed_differentiation.md

Lines changed: 10 additions & 10 deletions

````diff
@@ -2,22 +2,22 @@
 
 This way of specifying our model is not ideal, however, because now also the maximum likelihood loss function lives inside a `SemFiniteDiff` model, and this means even though we have defined analytical gradients for it, we do not make use of them.
 
-A more efficient way is therefore to specify our model as an ensemble model:
+A more efficient way is therefore to specify our model as a combined model with multiple loss terms:
 
 ```julia
-model_ml = Sem(
-    specification = partable,
-    data = data,
-    loss = SemML
+ml_term = SemML(
+    SemObservedData(data = data, specification = partable),
+    RAMSymbolic(partable)
 )
 
-model_ridge = SemFiniteDiff(
-    specification = partable,
-    data = data,
-    loss = myridge
+ridge_term = SemFiniteDiff(
+    SemML(
+        SemObservedData(data = data, specification = partable),
+        RAMSymbolic(partable)
+    )
 )
 
-model_ml_ridge = SemEnsemble(model_ml, model_ridge)
+model_ml_ridge = Sem(ml_term, ridge_term)
 
 model_ml_ridge_fit = fit(model_ml_ridge)
 ```
````

docs/src/performance/simulation.md

Lines changed: 2 additions & 15 deletions

````diff
@@ -57,19 +57,7 @@ model = Sem(
     data = data_1
 )
 
-model_updated = replace_observed(model; data = data_2, specification = partable)
-```
-
-If you are building your models by parts, you can also update each part seperately with the function `update_observed`.
-For example,
-
-```@example replace_observed
-
-new_observed = SemObservedData(;data = data_2, specification = partable)
-
-my_optimizer = SemOptimizer()
-
-new_optimizer = update_observed(my_optimizer, new_observed)
+model_updated = replace_observed(model, data_2)
 ```
 
 ## Multithreading
@@ -90,7 +78,7 @@ model1 = Sem(
     data = data_1
 )
 
-model2 = deepcopy(replace_observed(model; data = data_2, specification = partable))
+model2 = deepcopy(replace_observed(model, data_2))
 
 models = [model1, model2]
 fits = Vector{SemFit}(undef, 2)
@@ -104,5 +92,4 @@ end
 
 ```@docs
 replace_observed
-update_observed
 ```
````
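The simplified `replace_observed(model, new_data)` signature this file adopts makes a simulation loop compact. A sketch under the assumption that `model` was built once beforehand and `datasets` is a vector of data tables (both names are illustrative):

```julia
using StructuralEquationModels

# Hypothetical simulation loop: fit the same model structure to many
# datasets without rebuilding the model from scratch each time.
fits = map(datasets) do new_data
    model_i = replace_observed(model, new_data)  # swap in the new data
    fit(model_i)
end
```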
Lines changed: 13 additions & 15 deletions

````diff
@@ -1,31 +1,29 @@
 # Collections
 
-With StructuralEquationModels.jl, you can fit weighted sums of structural equation models.
-The most common use case for this are [Multigroup models](@ref).
-Another use case may be optimizing the sum of loss functions for some of which you do know the analytic gradient, but not for others.
-In this case, you can optimize the sum of a `Sem` and a `SemFiniteDiff` (or any other differentiation method).
+With StructuralEquationModels.jl, you can fit weighted sums of structural equation models.
+The most common use case for this are [Multigroup models](@ref).
+Another use case may be optimizing the sum of loss functions for some of which you do know the analytic gradient, but not for others.
+In this case, you can combine `SemLoss` terms with finite-difference-wrapped terms in a single `Sem` model.
 
-To use this feature, you have to construct a `SemEnsemble` model, which is actually quite easy:
+To use this feature, you construct a `Sem` model with multiple loss terms:
 
 ```julia
-# models
-model_1 = Sem(...)
+# loss terms
+loss_1 = SemML(observed_1, implied_1)
 
-model_2 = SemFiniteDiff(...)
+loss_2 = SemML(observed_2, implied_2)
 
-model_3 = Sem(...)
-
-model_ensemble = SemEnsemble(model_1, model_2, model_3)
+model = Sem(loss_1, loss_2)
 ```
 
-So you just construct the individual models (however you like) and pass them to `SemEnsemble`.
-You may also pass a vector of weigths to `SemEnsemble`. By default, those are set to ``N_{model}/N_{total}``, i.e. each model is weighted by the number of observations in it's data (which matches the formula for multigroup models).
+So you just construct the individual `SemLoss` (or other `AbstractLoss`) terms and pass them to `Sem`.
+You may also pass weights by using the pair syntax `loss => weight`. By default, SEM loss terms are weighted by ``N_{model}/N_{total}``, i.e. each term is weighted by the number of observations in its data (which matches the formula for multigroup models).
 
 Multigroup models can also be specified via the graph interface; for an example, see [Multigroup models](@ref).
 
 # API - collections
 
 ```@docs
-SemEnsemble
-AbstractSemCollection
+Sem
+LossTerm
 ```
````
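The `loss => weight` pair syntax introduced in this file can be sketched as follows; the term names and the 0.01 ridge weight are illustrative assumptions, not values from the commit:

```julia
using StructuralEquationModels

# Combine an ML term (default N_model/N_total weight) with an explicitly
# weighted regularization term via the pair syntax from the new docs.
model = Sem(
    ml_term,             # hypothetical SemML loss term, default weight
    ridge_term => 0.01,  # hypothetical ridge term with explicit weight
)
```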

docs/src/tutorials/collection/multigroup.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -6,7 +6,7 @@ using StructuralEquationModels
 
 As an example, we will fit the model from [the `lavaan` tutorial](https://lavaan.ugent.be/tutorial/groups.html) with loadings constrained to equality across groups.
 
-We first load the example data.
+We first load the example data.
 We have to make sure that the column indicating the group (here called `school`) is a vector of `Symbol`s, not strings - so we convert it.
 
 ```@setup mg
@@ -59,19 +59,19 @@ You can then use the resulting graph to specify an `EnsembleParameterTable`
 groups = [:Pasteur, :Grant_White]
 
 partable = EnsembleParameterTable(
-    graph,
+    graph,
     observed_vars = observed_vars,
     latent_vars = latent_vars,
     groups = groups)
 ```
 
-The parameter table can be used to create a `SemEnsemble` model:
+The parameter table can be used to create a multigroup `Sem` model:
 
 ```@example mg; ansicolor = true
-model_ml_multigroup = SemEnsemble(
+model_ml_multigroup = Sem(
     specification = partable,
     data = dat,
-    column = :school,
+    semterm_column = :school,
     groups = groups)
 ```
````
