Thoughts about integrating optimizations into the analysis stack

Thinking about integration with _BiocSingular_ in particular.

I think the best option would be to write a separate package with an S4 generic that takes an instance of a matrix-like object and returns a `BiocSingularParam` object that provides the "best" choice of algorithm (in terms of the speed/accuracy trade-off). This approach has several advantages:

- Isolate implementation (in _BiocSingular_) from decisions about the choice of algorithm, which makes it easier to maintain as I only have to implement things rather than explicitly choose between them.
- Provide users with greater control - if they don't like the generated `BiocSingularParam` object, or if they know better (e.g., about their file system access speeds), they can just use their own.
- Simplify extensions for community-defined matrix representations, as anyone can write methods for the generic if they know that their representation is fast or slow to access.
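To make the proposal concrete, here is a minimal sketch of what such a generic could look like. The generic name `chooseSVDParam`, the toy `SlowMatrix` class and the 0.5 cutoff are all hypothetical placeholders, not anything that exists in _BiocSingular_; only `ExactParam()`, `IrlbaParam()`, `RandomParam()` and `runSVD()` are real.

```r
library(methods)
library(BiocSingular)  # ExactParam(), IrlbaParam(), RandomParam(), runSVD()

# Hypothetical generic (name and signature are assumptions): takes a
# matrix-like object and returns a BiocSingularParam object.
setGeneric("chooseSVDParam", signature = "x",
    function(x, k, ...) standardGeneric("chooseSVDParam"))

# Fallback for any matrix-like object. The 0.5 cutoff is an arbitrary
# placeholder, not a tuned value.
setMethod("chooseSVDParam", "ANY", function(x, k, ...) {
    if (k >= 0.5 * min(dim(x))) ExactParam() else IrlbaParam()
})

# Toy stand-in for a community-defined representation that is slow to access.
# Its author can register a method without touching BiocSingular itself.
setClass("SlowMatrix", contains = "matrix")
setMethod("chooseSVDParam", "SlowMatrix", function(x, k, ...) RandomParam())

# The user calls the generic explicitly, then passes the result along
# (or swaps in their own BiocSingularParam object instead).
set.seed(1)
x <- matrix(rnorm(2000), nrow = 100)
param <- chooseSVDParam(x, k = 5)
out <- runSVD(x, k = 5, BSPARAM = param)
```

The decision logic lives entirely in the methods of the generic, so _BiocSingular_ only ever sees a finished `BiocSingularParam` object.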

The downside of the above strategy is that the choice of algorithm is not entirely transparent in the analysis stack, as the user has to actively call the new function that generates the new `BiocSingularParam` object. But I would argue that the choice of algorithm would not have been transparent in the first place, e.g., switching between IRLBA and RSVD can give results that differ beyond numerical precision, while switching between approximate and exact SVD will result in changes to the random seed.
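One rough way to check the point about the random seed is to compare the RNG state before and after a call. This is only a sketch; `advancesSeed` is a hypothetical helper written for illustration, and the outcome for each algorithm depends on the installed package versions.

```r
library(BiocSingular)

# Hypothetical helper: reports whether runSVD() advanced the global RNG state
# when run with the given BiocSingularParam object.
advancesSeed <- function(x, k, param) {
    set.seed(42)                 # also guarantees that .Random.seed exists
    before <- .Random.seed
    runSVD(x, k = k, BSPARAM = param)
    !identical(before, .Random.seed)
}

set.seed(1)
x <- matrix(rnorm(2000), nrow = 100)
seedUse <- c(
    exact  = advancesSeed(x, 5, ExactParam()),
    irlba  = advancesSeed(x, 5, IrlbaParam()),
    random = advancesSeed(x, 5, RandomParam())
)
seedUse
```

Any algorithm that reports `TRUE` here will shift the random numbers seen by downstream steps, so swapping it in or out changes later results even if the SVD itself agrees.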

Plus, if you write a separate package, you can also put in functions to perform empirical benchmarking for people who are _really_ interested in optimizing their SVD speeds. Those won't go into _BiocSingular_.
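As a rough idea of what the empirical benchmarking could look like, here is a sketch that times `runSVD()` under a few candidate parameter objects. The function name `benchmarkSVDParams` and its arguments are hypothetical, and a real version would probably want to subsample large inputs rather than time the full matrix.

```r
library(BiocSingular)

# Hypothetical helper: median elapsed time (in seconds) of runSVD() for each
# candidate BiocSingularParam object, over 'reps' repetitions.
benchmarkSVDParams <- function(x, k,
    params = list(exact = ExactParam(), irlba = IrlbaParam(), random = RandomParam()),
    reps = 3)
{
    vapply(params, function(p) {
        times <- vapply(seq_len(reps), function(i) {
            system.time(runSVD(x, k = k, BSPARAM = p))[["elapsed"]]
        }, numeric(1))
        median(times)
    }, numeric(1))
}

set.seed(100)
x <- matrix(rnorm(50000), nrow = 500)
timings <- benchmarkSVDParams(x, k = 10)

# Name of the fastest candidate on this machine and this input.
fastest <- names(which.min(timings))
```

This only measures speed; accuracy would need a separate comparison against the exact SVD.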
