
Conversation

@bgruening (Member)

No description provided.

toolshed.g2.bx.psu.edu/repos/ecology/srs_preprocess_s2/srs_preprocess_s2/.*:
  mem: 16
toolshed.g2.bx.psu.edu/repos/ecology/wildlife_megadetector_huggingface/wildlife_megadetector_huggingface/.*:
  gpus: 1
@bgruening (Member Author):

Is it OK to specify GPUs here?

Collaborator:

If it doesn't work without one, I would say yes.

mem: 12
mem: 24
env:
  EGGNOG_DBMEM: --dbmem
@bgruening (Member Author):

I have a related question here. @cat-bro, do you know whether on AU all those envs are passed into the containers without any extra magic?
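For reference, a minimal sketch of how such an entry typically looks in a TPV config is shown below. The tool key is hypothetical, and whether EGGNOG_DBMEM then actually shows up inside a Singularity container is exactly the question above; it may depend on how the destination injects the job environment.

# sketch only; the tool key is illustrative, not the exact entry in this PR
toolshed.g2.bx.psu.edu/repos/galaxyp/eggnog_mapper/eggnog_mapper/.*:
  mem: 24
  env:
    EGGNOG_DBMEM: --dbmem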

cores: 4
mem: 30
scheduling:
  require:
@bgruening (Member Author):

I can remove that. But what is the preferred way to indicate that this tool can/should run in Singularity?

    - singularity
rules:
  - if: input_size >= 0.01
    gpus: 1
@bgruening (Member Author):

Flagging the GPU here, in case we don't want this in a shared DB.

  - if: input_size >= 0.01
    gpus: 1
    params:
      singularity_run_extra_arguments: ' --nv '
@bgruening (Member Author):

This enables Singularity to set up the container for GPU/NVIDIA use: --nv binds the host's NVIDIA drivers and libraries into the container.

Collaborator:

My initial thought would be: keep this, but remove the scheduling tag for Singularity and add a comment? But in general, running things in Singularity is the recommended approach, or not?
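Putting the pieces of this thread together, the entry under discussion would look roughly like the sketch below. This is illustrative only: the tool key is a placeholder, the values are taken from the diff above, and whether the scheduling block stays or is replaced by a comment is exactly the open question.

# sketch; tool key is a placeholder
some.toolshed.tool.id/.*:
  cores: 4
  mem: 30
  scheduling:
    require:
      - singularity   # possibly drop this and leave a comment instead, per the suggestion above
  rules:
    - if: input_size >= 0.01
      gpus: 1
      params:
        singularity_run_extra_arguments: ' --nv '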

toolshed.g2.bx.psu.edu/repos/iuc/nanopolishcomp_eventaligncollapse/nanopolishcomp_eventaligncollapse/.*:
  cores: 10
  mem: 12
toolshed.g2.bx.psu.edu/repos/iuc/ncbi_fcs_gx/ncbi_fcs_gx/.*:
@bgruening (Member Author):

This is a very expensive and inefficient tool for VGP. TODO: look at ORG's rule.

toolshed.g2.bx.psu.edu/repos/iuc/pureclip/pureclip/.*:
  cores: 2
  mem: 32
  # 4GB is enough for most of the runs as it seems
@bgruening (Member Author), Dec 24, 2025:

I had this in my notes; no idea why we have so much more memory here :(

Collaborator:

Yes, but one must take care of the edge cases, otherwise people cannot run their jobs (if OOM killing is in place).

@bgruening (Member Author):

I added a comment and put the 30 GB back in.
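For the record, the comment-plus-restored-memory approach would look roughly like the sketch below, with values taken from the diff above; whether the final figure is 30 or 32 GB is whatever ends up in the file.

toolshed.g2.bx.psu.edu/repos/iuc/pureclip/pureclip/.*:
  cores: 2
  # 4 GB seems to be enough for most runs, but edge cases need much more
  # (jobs would otherwise be OOM-killed)
  mem: 32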

cores: 12
mem: 92
- id:
  if: input_size >= 1
@bgruening (Member Author):

We bail out when we see large files and recommend SPAdes instead.
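A sketch of how such a bail-out is typically expressed as a TPV rule, assuming the fail: directive; the tool key, threshold, and message below are placeholders, not the actual entry in this PR.

# placeholder sketch; tool key, threshold, and wording are illustrative
some.resource.heavy.assembler/.*:
  rules:
    - if: input_size >= 1
      fail: Input is too large for this tool on this instance; please consider rnaSPAdes instead.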

Collaborator:

This one is interesting. I don’t think this rule belongs in the shared DB, because it would prevent reproducible research in some cases. Apart from this, there is no guarantee that all Galaxies would have rnaspades installed.

@bgruening (Member Author):

> This one is interesting. I don’t think this rule belongs in the shared db because it would prevent reproducible research in some cases.

Mh, given the varying resource limits across our instances, I don't see a problem for reproducible research. Let's use the shared-db to coordinate the deprecation of resource-heavy legacy tools and align on more sustainable software choices.

@bgruening (Member Author):

But of course I'm happy to take this rule out. IMHO, ORG has something else as well.

@bgruening changed the title from "WIP: EU migrate" to "Migrate a few more entries from EU" on Dec 24, 2025.
@bgruening (Member Author) left a comment:

Thanks, I adopted this

toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/.*:
  mem: 30.4
cores: 1
mem: 6
Collaborator:

I am frequently seeing max memory for jobs over 10GB, sometimes as high as 30GB.

@bgruening (Member Author):

So should we just derive a rule here?

Suggested change:
-  mem: 6
+  # might be a good target for a memory rule
+  mem: 30.4
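If the goal is a rule rather than a flat bump, one way to express it is a TPV rule keyed on input size, as in the sketch below; the threshold is a made-up placeholder, not a measured value.

# sketch only; the input_size threshold is a placeholder
cores: 1
mem: 6
rules:
  - if: input_size >= 0.5
    mem: 30.4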
