Splittable random numbers for reproducible training #259

[Bagger](https://github.com/CitrineInformatics/lolo/blob/main/src/main/scala/io/citrine/lolo/bags/Bagger.scala#L101-L110) and [MultiTaskBagger](https://github.com/CitrineInformatics/lolo/blob/main/src/main/scala/io/citrine/lolo/bags/MultiTaskBagger.scala#L51-L56) both train the individual models in parallel. Because the order in which the models are trained is not controlled, Lolo random forests are inherently non-reproducible, even when the bagging and the RNGs for the base learners are identical.

There are ways of guaranteeing reproducibility across multiple threads, and we should make use of them:

- [SplittableRandom in Java](https://docs.oracle.com/javase/8/docs/api/java/util/SplittableRandom.html)
- [A discussion in the context of NumPy](https://numpy.org/doc/stable/reference/random/parallel.html)
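To make the idea concrete, here is a minimal sketch of the splitting approach using `java.util.SplittableRandom`. It is not Lolo code: `SplitRngSketch`, `trainAll`, and `trainOne` are hypothetical names, and `trainOne` is only a stand-in that consumes random numbers the way a base learner might. The key point is that the per-model generators are split off sequentially on a single thread, so model `i` gets the same stream no matter which thread ends up training it or in what order.

```java
import java.util.Arrays;
import java.util.SplittableRandom;
import java.util.stream.IntStream;

final class SplitRngSketch {
    // Hypothetical stand-in for training one base learner:
    // it only consumes random numbers from its own generator.
    static double trainOne(SplittableRandom rng) {
        double sum = 0.0;
        for (int i = 0; i < 1000; i++) {
            sum += rng.nextDouble();
        }
        return sum;
    }

    static double[] trainAll(long seed, int numModels) {
        SplittableRandom root = new SplittableRandom(seed);

        // Split sequentially, before any parallel work starts,
        // so the generator assigned to model i depends only on the seed and on i.
        SplittableRandom[] rngs = new SplittableRandom[numModels];
        for (int i = 0; i < numModels; i++) {
            rngs[i] = root.split();
        }

        // Train in parallel; each task touches only its own generator,
        // so thread scheduling cannot change what any model draws.
        return IntStream.range(0, numModels)
                .parallel()
                .mapToDouble(i -> trainOne(rngs[i]))
                .toArray();
    }

    public static void main(String[] args) {
        double[] first = trainAll(42L, 8);
        double[] second = trainAll(42L, 8);
        // Two runs with the same seed produce identical per-model results.
        System.out.println(Arrays.equals(first, second));
    }
}
```

In Lolo the same pattern would presumably mean splitting one generator per bagged model (and per bootstrap sample) before the parallel section, then passing each model its own generator instead of sharing one across threads.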
