Simulation studies

Replace observed data

In simulation studies, a common task is fitting the same model to many different datasets. It would be a waste of resources to reconstruct the complete model for each dataset. We therefore provide the function replace_observed to change the observed part of a model, without necessarily reconstructing the other parts.

For the model from "A first model", you would use it as follows:

using StructuralEquationModels

observed_vars = [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]
latent_vars = [:ind60, :dem60, :dem65]

graph = @StenoGraph begin

    # loadings
    ind60 → fixed(1)*x1 + x2 + x3
    dem60 → fixed(1)*y1 + y2 + y3 + y4
    dem65 → fixed(1)*y5 + y6 + y7 + y8

    # latent regressions
    ind60 → dem60
    dem60 → dem65
    ind60 → dem65

    # variances
    _(observed_vars) ↔ _(observed_vars)
    _(latent_vars) ↔ _(latent_vars)

    # covariances
    y1 ↔ y5
    y2 ↔ y4 + y6
    y3 ↔ y7
    y8 ↔ y4 + y6

end

partable = ParameterTable(
    graph,
    latent_vars = latent_vars, 
    observed_vars = observed_vars
)
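data = example_data("political_democracy")

# split the example data into two datasets
data_1 = data[1:30, :]
data_2 = data[31:75, :]

# build the full model once, using the first dataset
model = Sem(
    specification = partable,
    data = data_1
)

# swap in the second dataset without rebuilding the other parts
model_updated = replace_observed(model; data = data_2, specification = partable)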

If you are building your models by parts, you can also update each part separately with the function update_observed. For example:


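# build only the observed part from the new dataset
new_observed = SemObservedData(; data = data_2, specification = partable)

my_optimizer = SemOptimizer()

# update the optimizer part so that it matches the new observed part
new_optimizer = update_observed(my_optimizer, new_observed)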

Multithreading

!!! danger "Thread safety"
    This is only relevant when you are planning to fit updated models in parallel.

    Models generated by `replace_observed` may share objects in memory (e.g. some parts of
    `model` and `model_updated` are the same objects).
    Therefore, fitting both of these models in parallel will lead to **race conditions**,
    possibly crashing your computer.
    To avoid these problems, make an independent copy (for example with `deepcopy`),
    so that the models you fit in parallel no longer share any objects.
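To make the sharing problem concrete, here is a minimal sketch in plain Julia. It does not use StructuralEquationModels at all: `ToyModel` and `toy_replace_data` are hypothetical stand-ins, invented only to illustrate how an "updated" object can keep pointing at the same mutable buffer as the original, and how `deepcopy` breaks that link.

```julia
# Toy stand-in for a model; NOT a type from StructuralEquationModels.
struct ToyModel
    data::Matrix{Float64}        # the "observed" part
    workspace::Vector{Float64}   # mutable buffer reused while fitting
end

# Hypothetical helper: swap the data but reuse the existing workspace.
toy_replace_data(m::ToyModel, data::Matrix{Float64}) = ToyModel(data, m.workspace)

model_a = ToyModel(rand(30, 3), zeros(3))
model_b = toy_replace_data(model_a, rand(45, 3))

# Both models point at the very same buffer, so concurrent writes would collide.
shared_before = model_b.workspace === model_a.workspace

# deepcopy duplicates every nested object, so nothing is shared afterwards.
model_c = deepcopy(model_b)
shared_after = model_c.workspace === model_a.workspace

# A write through model_b is visible in model_a, but not in the deep copy.
model_b.workspace[1] = 1.0
println((shared_before, shared_after, model_a.workspace[1], model_c.workspace[1]))
```

In this toy setting the identity check `===` is true before the copy and false after it. Whether and which parts of a real model are shared depends on the package internals, which is why copying everything with `deepcopy` is the conservative choice.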

Taking into account the warning above, fitting multiple models in parallel becomes as easy as:

model1 = Sem(
    specification = partable,
    data = data_1
)

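# deep-copy the updated model so that it shares no memory with `model`
model2 = deepcopy(replace_observed(model; data = data_2, specification = partable))

models = [model1, model2]
fits = Vector{SemFit}(undef, 2)

# each iteration writes only to its own slot of `fits`
Threads.@threads for i in 1:2
    fits[i] = fit(models[i])
end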
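In a simulation study there are usually many more than two models, so the loop above can be wrapped in a small helper. The sketch below is plain Julia and only illustrates the pattern: `fit_all` is a hypothetical name, not a function of StructuralEquationModels, and the toy "models" are ordinary vectors whose "fit" is their mean. It assumes the models passed in are independent of each other (for example, deep copies).

```julia
# Hypothetical helper, not part of StructuralEquationModels:
# apply `fit_fn` to every element of `models`, one task per thread chunk.
function fit_all(fit_fn, models::Vector)
    results = Vector{Any}(undef, length(models))
    Threads.@threads for i in eachindex(models)
        # each iteration writes only to its own slot, so no lock is needed
        results[i] = fit_fn(models[i])
    end
    return results
end

# Toy usage: the "models" are plain vectors and "fitting" is taking the mean.
toy_models = [collect(1.0:n) for n in 1:4]
toy_fits = fit_all(v -> sum(v) / length(v), toy_models)
println(toy_fits)
```

With real models, the same pattern would be called with the package's fitting function and a vector of independent models. The loop only runs in parallel if Julia was started with more than one thread, for example via `julia --threads 4`.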

API

replace_observed
update_observed