53 changes: 28 additions & 25 deletions README.md
@@ -39,8 +39,9 @@

Welcome to the official repository of ROADIES, a novel pipeline for inferring phylogenetic species trees directly from raw genomic assemblies. ROADIES offers a fully automated, scalable, and easy-to-use solution, eliminating manual steps and allowing flexible control over the trade-off between accuracy and runtime.

**For a detailed overview of ROADIES' features and configuration options, please visit our [Wiki](https://turakhialab.github.io/ROADIES/).**
### 🟡 For a detailed overview of ROADIES' features and configuration options, please visit our [Wiki](https://turakhialab.github.io/ROADIES/).

### 🟡 If you encounter issues while running the pipeline, please refer to [this page](https://turakhialab.github.io/ROADIES/troubleshooting/) for common errors and troubleshooting tips.
<br>

<div align="center">
@@ -60,7 +61,7 @@ Please follow any of the options below to install ROADIES in your system.

### <a name="conda"></a> Option 1: Install via Bioconda (Recommended)

1. Install Conda (if not installed):
**Step 1:** Install Conda (if not installed):

```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
@@ -71,7 +72,7 @@ export PATH="$HOME/miniconda3/bin:$PATH"
source ~/.bashrc
```

2. Configure Conda channels:
**Step 2:** Configure Conda channels:

```
conda config --add channels defaults
@@ -85,7 +86,7 @@ conda config --add channels conda-forge

Verify the installation by running `conda` in your terminal.

3. Create and activate a custom environment:
**Step 3:** Create and activate a custom environment:

```
conda create -n roadies_env python=3.9 ete3 seaborn
@@ -94,13 +95,13 @@ conda create -n roadies_env python=3.9 ete3 seaborn
conda activate roadies_env
```

4. Install ROADIES:
**Step 4:** Install ROADIES:

```
conda install roadies=0.1.10
```

5. Locate the installed files:
**Step 5:** Locate the installed files:

```
cd $CONDA_PREFIX/ROADIES
@@ -111,22 +112,22 @@ You will be able to find the contents of the repository within this ROADIES fold

If you would like to install ROADIES using DockerHub, follow these steps:

1. Pull the ROADIES image from DockerHub:
**Step 1:** Pull the ROADIES image from DockerHub:

```
docker pull ang037/roadies:latest
```
2. Launch a container:
**Step 2:** Launch a container:

```
docker run -it ang037/roadies:latest
```

These commands will launch the Docker container in interactive mode, with the roadies_env environment activated and the working directory set to the ROADIES repository containing all necessary files. Once you are able to access the ROADIES repository, refer to the [Quick Start](#start) to run the pipeline.
These commands will launch the Docker container in interactive mode, with the `roadies_env` environment activated and the working directory set to the ROADIES repository containing all necessary files. Once you are able to access the ROADIES repository, refer to the [Quick Start](#start) to run the pipeline.

### <a name="docker"></a> Option 3: Install via Local Docker Build

1. Clone the ROADIES repository:
**Step 1:** Clone the ROADIES repository:

```
git clone https://github.com/TurakhiaLab/ROADIES.git
@@ -135,7 +136,7 @@ git clone https://github.com/TurakhiaLab/ROADIES.git
cd ROADIES
```

2. Build and run the Docker container:
**Step 2:** Build and run the Docker container:

```
docker build -t roadies_image .
@@ -148,7 +149,7 @@ Once you are able to access the ROADIES repository, refer to [Quick Start](#star

### <a name="script"></a> Option 4: Install via Source Script

1. Install the following dependencies (**requires sudo access**):
**Step 1:** Install the following dependencies (**requires sudo access**):

- Java Runtime Environment (Version 1.7 or higher)
- Python (Version 3.9 or higher)
@@ -164,7 +165,7 @@ For Ubuntu, you can install these dependencies with:
sudo apt-get install -y wget unzip make g++ python3 python3-pip python3-setuptools git default-jre libgomp1 libboost-all-dev cmake
```

2. Clone the repository:
**Step 2:** Clone the repository:

```
git clone https://github.com/TurakhiaLab/ROADIES.git
@@ -173,7 +174,7 @@ git clone https://github.com/TurakhiaLab/ROADIES.git
cd ROADIES
```

3. Run the installation script:
**Step 3:** Run the installation script:

```
chmod +x roadies_env.sh
@@ -192,19 +193,19 @@ After successful setup (Setup complete message), your environment `roadies_env`

After installing using one of the options mentioned in [Quick Install](#usage), you're ready to run ROADIES! To get started:

1. Download the test dataset (11 Drosophila genomes):
**Step 1:** Download the test dataset (11 Drosophila genomes):

```
mkdir -p test/test_data && cat test/input_genome_links.txt | xargs -I {} sh -c 'wget -O test/test_data/$(basename {}) {}'
```

This will save the datasets in a separate `test/test_data` folder within the repository.

2. Run the pipeline
**Step 2:** Run the ROADIES pipeline

#### IMPORTANT: ROADIES by default runs multiple iterations for generating highly accurate trees. For quick testing, use `--noconverge` to run a single iteration.

**Full run (multiple iterations)**
**Full run (multiple iterations) - Default**
```
python run_roadies.py --cores 16
```
@@ -215,11 +216,13 @@ python run_roadies.py --cores 16
python run_roadies.py --cores 16 --noconverge
```

3. Output:
**Step 3:** Access final species tree

- Final **UNROOTED** newick tree saved as `roadies.nwk` in a separate `output_files` folder.
- Intermediate files (if `--noconverge` not used) saved in a separate `converge_files` folder.
**Default mode:**
The final species tree (in Newick format) for each iteration is saved in a separate `converge_files/iteration_<iteration_number>` folder. The tree from the latest iteration is the most confident and accurate.

**If `--noconverge` is used:**
The final species tree (in Newick format) is saved as `roadies.nwk` in a separate `output_files` folder.
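
Because default mode writes one tree per iteration folder, it can help to locate the most recent folder programmatically. The following is a minimal sketch, assuming the folders are named exactly `iteration_<iteration_number>` as described above. The helper name `latest_iteration_dir` is hypothetical and not part of ROADIES, and the name of the tree file inside each folder is not assumed here.

```python
import re
from pathlib import Path
from typing import Optional


def latest_iteration_dir(root: str) -> Optional[Path]:
    """Return the iteration_<n> folder under `root` with the largest n.

    Returns None when `root` does not exist or holds no such folder.
    Hypothetical helper: only the folder naming scheme comes from the README.
    """
    pattern = re.compile(r"iteration_(\d+)")
    root_path = Path(root)
    if not root_path.is_dir():
        return None

    best: Optional[Path] = None
    best_n = -1
    for child in root_path.iterdir():
        match = pattern.fullmatch(child.name)
        # Compare numerically so that iteration_10 sorts after iteration_2.
        if child.is_dir() and match and int(match.group(1)) > best_n:
            best_n = int(match.group(1))
            best = child
    return best
```

For the Quick Start run, this would be called as `latest_iteration_dir("converge_files")`.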

#### NOTE: ROADIES outputs unrooted trees by default. You can reroot trees on your own or use the provided `reroot.py` script in `workflow/scripts/` (given a reference rooted species tree as input).

@@ -229,7 +232,7 @@ python run_roadies.py --cores 16 --noconverge

If you want to run ROADIES with your own datasets, follow these steps:

1. Specify Input Dataset:
**Step 1:** Specify Input Dataset:

- Edit `config.yaml` file (found in the ROADIES directory - `config` folder).
- Update the `GENOMES` field with paths to your `.fa` or `.fa.gz` genome assemblies. Ensure all input genomic assemblies are in `.fa` or `.fa.gz` format and named according to the species' name (e.g., `Aardvark.fa`).
@@ -240,12 +243,12 @@ If you want to run ROADIES with your own datasets, follow these steps:
faSplit byname <input_dir> <output_dir>
```
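
Before editing the `GENOMES` field, it may be worth checking that every file in the input folder follows the naming rule from Step 1 (`.fa` or `.fa.gz`, named after the species). The sketch below is one way to do that, under the assumption that all assemblies sit directly in a single folder. The function name `species_from_genomes` is hypothetical and not part of ROADIES.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Longest suffix first so that "Zebra.fa.gz" is not treated as "Zebra.fa" + ".gz".
ALLOWED_SUFFIXES = (".fa.gz", ".fa")


def species_from_genomes(genome_dir: str) -> Tuple[Dict[str, Path], List[Path]]:
    """Map species names to genome files and collect files with other extensions.

    Hypothetical helper: the species name is taken to be the file name with
    the `.fa` or `.fa.gz` suffix removed (e.g. `Aardvark.fa` -> `Aardvark`).
    """
    species: Dict[str, Path] = {}
    rejected: List[Path] = []
    for path in sorted(Path(genome_dir).iterdir()):
        if not path.is_file():
            continue
        for suffix in ALLOWED_SUFFIXES:
            if path.name.endswith(suffix) and len(path.name) > len(suffix):
                species[path.name[: -len(suffix)]] = path
                break
        else:
            # No allowed suffix matched this file.
            rejected.append(path)
    return species, rejected
```

Any path returned in the second list would need to be renamed or converted before it is listed in `config.yaml`.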

2. Configure Other Parameters:
**Step 2:** Configure Other Parameters:

- Modify other parameters in `config.yaml` as needed.
- Refer to detailed settings on the [Wiki](https://turakhialab.github.io/ROADIES/).

3. Run the Pipeline:
**Step 3:** Run the Pipeline:

```
python run_roadies.py --cores 16
@@ -264,9 +267,9 @@ python run_roadies.py --cores 16 --mode balanced
python run_roadies.py --cores 16 --mode fast
```
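
When ROADIES is launched from another script, the command line can be assembled as an argument list rather than a shell string. This is a minimal sketch that uses only the flags shown in this README (`--cores`, `--mode`, `--noconverge`). The function name `roadies_command` is hypothetical, and the mode value is passed through without validation because the full set of accepted modes is not listed here.

```python
from typing import List, Optional


def roadies_command(
    cores: int, mode: Optional[str] = None, noconverge: bool = False
) -> List[str]:
    """Build the argument list for `python run_roadies.py` (hypothetical helper)."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    cmd = ["python", "run_roadies.py", "--cores", str(cores)]
    if mode is not None:
        cmd += ["--mode", mode]
    if noconverge:
        # Single-iteration run, as described in the Quick Start.
        cmd.append("--noconverge")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the ROADIES repository directory.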

The output species tree (unrooted) in Newick format will be saved as `roadies.nwk` in the `output_files` folder.
The final unrooted species tree (in Newick format) for each iteration is saved in a separate `ALL_OUT_DIR/iteration_<iteration_number>` folder, where `ALL_OUT_DIR` is configured in `config/config.yaml`. The tree from the latest iteration is the most confident and accurate.

### For troubleshooting, contributing, or SLURM cluster usage, refer to [Wiki](https://turakhialab.github.io/ROADIES/)
### For contributing to the code or for SLURM cluster usage, refer to the [Wiki](https://turakhialab.github.io/ROADIES/contribution)

<br>

59 changes: 36 additions & 23 deletions docs/troubleshooting.md
@@ -1,10 +1,38 @@
# Troubleshooting Steps

## 1. Mamba not found in the shell
## Error 1. Issues with PASTA

### Solution

If the pipeline fails because PASTA fails, install PASTA from source by running the following commands. Run them from the main ROADIES repository directory (after `cd ROADIES`), within the activated Conda environment:

```bash
git clone https://github.com/smirarab/pasta.git
git clone https://github.com/smirarab/sate-tools-linux.git
cd pasta
python3 setup.py develop --user
```

Then, in the `align.smk` file (inside the `workflow/rules` directory of the ROADIES repository), replace every instance of `pasta.py` with `python pasta/run_pasta.py`, and every instance of `run_seqtools.py` with `python pasta/run_seqtools.py`.

After making this change, re-run the ROADIES pipeline.
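
The two replacements in `align.smk` can also be applied with a short script instead of by hand. The sketch below is one possible approach, not part of ROADIES: the function name `patch_align_rules` is hypothetical, and it assumes the commands appear in the rule file as bare `pasta.py` and `run_seqtools.py` invocations. The lookbehind keeps the substitution from matching text that was already patched, so running it twice leaves the file unchanged.

```python
import re

# Match the bare script names only. The lookbehind skips names that are
# already part of a longer path or identifier (e.g. "pasta/run_pasta.py").
_PASTA = re.compile(r"(?<![\w/])pasta\.py\b")
_SEQTOOLS = re.compile(r"(?<![\w/])run_seqtools\.py\b")


def patch_align_rules(text: str) -> str:
    """Return `text` with the two PASTA entry points rewritten (hypothetical helper)."""
    text = _PASTA.sub("python pasta/run_pasta.py", text)
    return _SEQTOOLS.sub("python pasta/run_seqtools.py", text)
```

A typical use would read `workflow/rules/align.smk`, pass its contents through `patch_align_rules`, and write the result back after reviewing the difference.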

## Error 2. Environment conflict

### Solution

If you encounter the following error message - `"ls: relocation error: /lib64/libacl.so.1: symbol getxattr, version ATTR_1.0 not defined in file libattr.so.1 with link time reference"`, please run the following command to resolve it:

```bash
export LD_LIBRARY_PATH=/usr/lib64/:${LD_LIBRARY_PATH}
```

## Error 3. Mamba not found in the shell

When running the following command:
```bash
$ python ROADIES-main/run_roadies.py --cores 1
$ python ROADIES/run_roadies.py --cores 1
```
You may encounter this error:

@@ -17,10 +45,7 @@ Building DAG of jobs...
CreateCondaEnvironmentException:
The 'mamba' command is not available in the shell /usr/bin/bash that will be used by Snakemake. You have to ensure that it is in your PATH, e.g., first activating the conda base environment with `conda activate base`.The mamba package manager (https://github.com/mamba-org/mamba) is a fast and robust conda replacement. It is the recommended way of using Snakemake's conda integration. It can be installed with `conda install -n base -c conda-forge mamba`. If you still prefer to use conda, you can enforce that by setting `--conda-frontend conda`.
```

### Cause

The `mamba` package manager is missing or not available in the environment.
This means the `mamba` package manager is missing or not available in the environment.

### Solution

@@ -56,42 +81,30 @@ cmd = [
python run_roadies.py --cores 16
```

## 2. Conda not recognized

### Cause
## Error 4. Conda not recognized

Conda is not added to your system's PATH.
This can happen if conda is not added to your system's PATH.

### Solution

Ensure conda is added to the PATH by running the following commands:
To resolve this, please ensure conda is added to the PATH by running the following commands:

```bash
export PATH="$HOME/miniconda3/bin:$PATH"
source ~/.bashrc
```
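
To confirm that the PATH change took effect, the shell's view of the tools can be checked from Python. This is a minimal sketch using the standard-library `shutil.which`. The helper name `missing_tools` is hypothetical, and the tool names checked (`conda`, `mamba`) are the ones mentioned in this troubleshooting page.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["conda", "mamba"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("conda and mamba are both available.")
```

An empty result means both commands resolve in the environment that will launch the pipeline.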

## 3. Singularity issues

### Cause

Problems arise when trying to run the pipeline with Singularity.
## Error 5. Handling dependencies (glibc)

### Solution

We recommend using Docker instead of Singularity. Ensure Docker is installed and running on your system. We have also provided Bioconda support for users who face issues with Singularity.

## 4. Handling dependencies (glibc)

Ensure that the glibc version on your system is updated to 2.29 or higher. Update your system libraries if necessary. Otherwise you may encounter this error:

```bash
workflow/scripts/lastz_32: /lib64/libm.so.6: version 'GLIBC_2.29' not found
```
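
The glibc requirement can be checked ahead of a run. The sketch below assumes a Linux system where Python's `platform.libc_ver()` can detect glibc. The helper names `parse_version` and `glibc_is_sufficient` are hypothetical, and the default minimum of 2.29 is taken from the text above.

```python
import platform
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.31' into (2, 31) for numeric comparison."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_sufficient(version: str, minimum: str = "2.29") -> bool:
    """Compare numerically, so that '2.4' is correctly older than '2.29'."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        status = "OK" if glibc_is_sufficient(version) else "too old for lastz_32"
        print(f"glibc {version}: {status}")
    else:
        print("Could not detect a glibc version on this system.")
```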

## 5. PASTA fails with insufficient core count

### Cause
## Error 6. PASTA fails with insufficient core count

PASTA fails when the number of cores is insufficient for the number of instances.
