Recommended deployment strategy? #1

@nrminor

Description

Hello,

I've been reviewing WEPP since I came across it on the SPHERES Slack last week. I think the idea of calling haplotypes against an UShER MAT is really exciting, and I'd like to try it on some of our own wastewater datasets.

That said, I'm a bit puzzled by the expected deployment strategy. It looks like you generally recommend running Snakemake with 32 cores, and from what I can tell, the C++ code brings entire alignment map datasets into memory. To my mind, this is intensive enough that I'd want to put it on our HPC cluster. But then Snakemake itself launches the dashboard frontend using nginx on a privileged port, a persistent Node.js server, and direct web browser access to localhost, all of which implies that WEPP should instead be run locally.
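For context, the kind of HPC deployment I had in mind is something like a Snakemake cluster profile, so the heavy compute steps run as batch jobs rather than on whatever machine hosts the dashboard. This is only a sketch of what I'd try; I'm assuming a Snakemake 8+ SLURM setup, and none of the resource values below come from the WEPP docs:

```yaml
# Hypothetical ~/.config/snakemake/wepp-slurm/config.yaml
# (illustrative values only -- not from the WEPP documentation).
executor: slurm
jobs: 32
default-resources:
  mem_mb: 32000        # roughly the in-memory alignment footprint I'd expect
  runtime: 240         # minutes
  slurm_partition: general
```

Invoked with something like `snakemake --profile wepp-slurm`, this would keep the 32-core recommendation while pushing the memory-heavy C++ steps onto compute nodes.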

Given that, and given how tightly coupled the C++ build, the data processing, and the dashboard are, would you recommend that users only run WEPP on high-end personal workstations (32+ cores, 32GB+ RAM, sudo access)? I would try running on my HPC with DASHBOARD_ENABLED=False and then copying all the Snakemake outputs to a local clone of WEPP, but then I'd need to compile WEPP a second time for a different target, and because build/wepp is an input dependency for multiple Snakemake rules, I worry that much of the workflow would re-run locally--all just to serve the dashboard!
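The workaround I was imagining for the re-run problem relies on Snakemake's `--touch` flag to mark the copied outputs as up to date, so the local rebuild of build/wepp doesn't invalidate everything upstream of the dashboard. Again, just a sketch; I'm assuming DASHBOARD_ENABLED is settable via `--config`, which may not match how WEPP actually reads it:

```shell
# On the local clone, after copying the HPC results into place and
# recompiling build/wepp for the local target:

# Mark all existing outputs as newer than their inputs so Snakemake
# does not re-run the heavy rules just because build/wepp changed.
snakemake --touch --cores 1

# Then run only what remains (ideally just the dashboard-facing rules).
# Assumes DASHBOARD_ENABLED is a Snakemake config key -- unverified.
snakemake --cores 4 --config DASHBOARD_ENABLED=True
```

Even so, this feels fragile, which is why I'm asking what deployment you actually intend.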

Hopefully I'm just overthinking this and there's a better way. Thanks again for your work on this; I'm happy to clarify anything further.

--Nick
