The ScaleBio CROP-seq workflow requires a number of dependencies to run. These include ScaleBio-developed and open-source executables, Python libraries, and similar components. There are three alternative ways to provide these dependencies; select whichever is easiest on your system and follow the instructions below.
If your system supports docker containers, this is the recommended way to handle all dependencies for the ScaleBio CROP-seq workflow. We provide pre-built docker containers and the workflow is set up to use them automatically.
This is enabled by adding -profile docker to the nextflow command-line.
If your system does not support docker, singularity (2.3.x or newer) is an alternative that is enabled on many HPC clusters. Setting -profile docker,singularity (no space) will use the singularity engine for all dependencies. The environment variable NXF_SINGULARITY_CACHEDIR can be used to control where singularity images are stored. This should be a writable location that is available on all compute nodes. Similarly, TMPDIR should be changed from the default /tmp to a location writable from the container if necessary.
See Nextflow Containers for details and additional configuration options.
One important point is that all input and output paths need to be available (bind-mounted) inside the containers. For docker, Nextflow will set the relevant options automatically at runtime; for singularity, this requires user mounts to be enabled in the system-wide configuration (see the notes in the Nextflow singularity documentation).
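For the singularity setup described above, the two environment variables could be prepared before launching the workflow roughly as follows. This is a minimal sketch: SHARED_DIR and its default value are hypothetical placeholders, not names used by the workflow, and should be replaced with a location that is visible and writable on every compute node.

```shell
# SHARED_DIR is a hypothetical placeholder; point it at storage that is
# mounted on all compute nodes (the default below is for illustration only).
SHARED_DIR="${SHARED_DIR:-$PWD/scale_shared}"

# Where Nextflow stores the singularity images it pulls
export NXF_SINGULARITY_CACHEDIR="$SHARED_DIR/singularity_cache"
# Temporary directory that is writable from inside the container
export TMPDIR="$SHARED_DIR/tmp"

mkdir -p "$NXF_SINGULARITY_CACHEDIR" "$TMPDIR"

# Fail early if either directory is not writable
for dir in "$NXF_SINGULARITY_CACHEDIR" "$TMPDIR"; do
    if [ ! -w "$dir" ]; then
        echo "ERROR: $dir is not writable" >&2
        exit 1
    fi
done

# Then launch the workflow as usual, adding:
#   -profile docker,singularity
```

Checking writability up front only catches problems on the node where the script runs; it does not prove that the same path is mounted on the compute nodes.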
Another option is using the Conda package manager. Nextflow can automatically create conda environments with most dependencies. This mode is selected by setting -profile conda. In this case the following additional steps need to be completed:
- Install and update conda
conda update -n base -c defaults conda
- Install ScaleBio Tools
/PATH/TO/ScaleRNA/envs/download-scale-tools.sh
- If running from a sequencer runFolder (.bcls) Illumina BCL Convert is required to be installed (and available on $PATH)
- If nextflow throws an error installing packages while using -profile conda, there is a more verbose yaml file with a comprehensive list of all python packages and their versions, specified in envs (scaleRna_verbose.conda.yml and scaleCROP_verbose.conda.yml)
  - To run nextflow with these conda files, edit the nextflow.config and replace the process.conda section of the conda profile with the appropriate yml. So
    process.conda = "$projectDir/envs/scaleRna.conda.yml" becomes process.conda = "$projectDir/envs/scaleRna_verbose.conda.yml"
    and
    conda = "$projectDir/envs/scaleCROP.conda.yml" becomes conda = "$projectDir/envs/scaleCROP_verbose.conda.yml"
See the Nextflow documentation for additional details on conda support in Nextflow.
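The nextflow.config edit described in the list above (switching the conda profile to the verbose yml files) could also be scripted instead of done by hand. The following is only a sketch: the function name use_verbose_envs is hypothetical, and it assumes the two yml paths appear in the config exactly as quoted above.

```shell
# Hypothetical helper: rewrite a nextflow.config so the conda profile points
# at the *_verbose.conda.yml files. A backup is kept next to the original.
use_verbose_envs() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "ERROR: $config not found" >&2
        return 1
    fi
    # Keep a copy of the current config, then regenerate the file from it
    cp "$config" "$config.bak" || return 1
    sed -e 's#envs/scaleRna\.conda\.yml#envs/scaleRna_verbose.conda.yml#' \
        -e 's#envs/scaleCROP\.conda\.yml#envs/scaleCROP_verbose.conda.yml#' \
        "$config.bak" > "$config"
}

# Example call (the path is a placeholder):
#   use_verbose_envs /PATH/TO/nextflow.config
```

Running the function a second time leaves the paths unchanged, since the patterns no longer match, but it overwrites the .bak file with the already edited config.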
As a final alternative it is also possible to simply install the required dependencies directly, either by hand or using Conda.
A list of all requirements can be found in envs/scaleRna.conda.yml and envs/scaleCROP.dockerfile. All tools need to be available on $PATH or in /PATH/TO/crispr_tools/bin/
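When installing dependencies by hand, a quick check that each tool can be found on $PATH or in the extra bin directory may save a failed run. The sketch below is illustrative only: find_tool, check_tools and TOOLS_BIN are hypothetical names, and the actual list of tools to check should be taken from the requirement files mentioned above.

```shell
# Extra directory searched in addition to $PATH (placeholder from the text)
TOOLS_BIN="${TOOLS_BIN:-/PATH/TO/crispr_tools/bin}"

# Print where a tool was found; return non-zero if it is missing
find_tool() {
    name="$1"
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
    elif [ -x "$TOOLS_BIN/$name" ]; then
        echo "$TOOLS_BIN/$name"
    else
        echo "MISSING: $name" >&2
        return 1
    fi
}

# Check every tool given as an argument and print the number of missing ones
check_tools() {
    missing=0
    for tool in "$@"; do
        find_tool "$tool" >/dev/null || missing=$((missing + 1))
    done
    echo "$missing"
}
```

For example, `check_tools bcl-convert` would print 0 if Illumina BCL Convert is installed, assuming its executable is named bcl-convert on your system, and 1 otherwise, with the missing tool reported on stderr.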