A Comprehensive Protein-DNA Interface Generation Tool with Residue Propensity Map analysis

Welcome to the Protein-DNA Interface Generation repository! 🚀

Introduction
Repository Structure
Workflow Stages
Installation and Dependencies
Usage
Docker Usage
Testing
Contributing
License

Introduction

This project offers a pipeline to:

Process multi-chain PDB files placed in the input/ folder.
Split them into chain-specific files in split_chain/.
Use Naccess to generate .asa, .rsa, and .int outputs for both the entire complex and individual chains in rsa/.
Produce final residue propensity maps and other interface analysis results in CSV format under interface/.
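
The chain-splitting step listed above can be illustrated with a minimal sketch. It is not the repository's own script: the function name `split_pdb_by_chain` and the three-atom example are hypothetical, and the sketch only assumes the standard PDB convention that the chain ID sits in column 22 of `ATOM`/`HETATM` records.

```python
from collections import defaultdict


def split_pdb_by_chain(pdb_text):
    """Group ATOM/HETATM records by chain ID (column 22 of the PDB record)."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    # Close each per-chain block with an END record.
    return {cid: "\n".join(lines) + "\nEND\n" for cid, lines in chains.items()}


# Hypothetical three-atom complex: one protein chain (A) and one DNA chain (B).
example = """\
ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N
ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C
ATOM      3  P    DA B   1       1.000   2.000   3.000  1.00  0.00           P
"""

for chain_id, text in split_pdb_by_chain(example).items():
    print(chain_id, len(text.splitlines()) - 1, "atom records")
```

In a real run, each returned block would be written to its own file under split_chain/; the file naming scheme is left to the pipeline's scripts.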

It leverages:

Python: Data parsing and scripting tasks.
Fortran: Performance-heavy computations.
Shell: Workflow automation.
Docker: Reproducible and consistent environment.
Snakemake: Automated workflow orchestration.

Repository Structure

Protein_DNA_Interface_Generation/
├── input/                  # Raw PDB or other input files
├── split_chain/            # Contains split chain PDB files
├── rsa/                    # Naccess outputs (.asa, .rsa, .int) for complex & chains
├── interface/              # Final residue propensity maps (CSV) & summary outputs
├── scripts/                # Python, Shell, Fortran scripts
├── docker/                 # Docker configuration and resources
├── Snakefile               # Main Snakemake workflow definition
└── README.md               # Project documentation (this file)

input/: Place your raw PDB files here to be processed.
split_chain/: Contains the individual chain-specific PDB files generated by the workflow.
rsa/: Holds the output from Naccess (e.g., .asa, .rsa, .int) run on both the entire complex and individual chains.
interface/: Stores final CSV results with residue-based interface metrics and any summary files.
scripts/: Key scripts for chain splitting, interface analysis, and more.
docker/: Docker setup to help package and run the entire pipeline in a container.

Workflow Stages

Input Parsing
- Reads .pdb files from input/.
Chain Splitting
- Splits each file by chain, outputting them to split_chain/.
Naccess Runs
- Computes accessible surface areas for both the complex and each individual chain; results go to rsa/.
Interface Computation
- Uses Naccess outputs to identify interface residues and compute relevant metrics.
Results Aggregation
- Final CSV files summarizing residue-based interface stats are written into interface/.
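
One common way to identify interface residues from accessible surface area (ASA) values is to compare each residue's ASA in the isolated chain with its ASA in the complex. The sketch below assumes that approach; the function name, the dictionary layout, the example values, and the 1.0 Å² cutoff are all illustrative assumptions, not the criteria used by the scripts in scripts/.

```python
def interface_residues(asa_isolated, asa_complex, min_buried=1.0):
    """Return residues whose ASA drops by at least `min_buried` (in Å^2)
    when going from the isolated chain to the complex."""
    buried = {}
    for residue, asa_free in asa_isolated.items():
        # A residue absent from the complex table is treated as unchanged.
        delta = asa_free - asa_complex.get(residue, asa_free)
        if delta >= min_buried:
            buried[residue] = round(delta, 2)
    return buried


# Hypothetical ASA values keyed by (chain, residue number, residue name).
asa_chain = {("A", 10, "ARG"): 120.5, ("A", 11, "GLY"): 40.0, ("A", 12, "LYS"): 95.2}
asa_cplx = {("A", 10, "ARG"): 60.3, ("A", 11, "GLY"): 40.0, ("A", 12, "LYS"): 94.9}

print(interface_residues(asa_chain, asa_cplx))
```

Here only ARG 10 loses enough surface area to count as an interface residue; the cutoff is exposed as a parameter because the appropriate value depends on the analysis.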

Installation and Dependencies

Clone the Repository

git clone https://github.com/mhtjsh/Protein_DNA_Interface_Generation.git
cd Protein_DNA_Interface_Generation

Install Dependencies
- Python (3.7+ recommended)
- Snakemake (install via pip or conda):
```
pip install snakemake
```
- Fortran Compiler (e.g., gfortran)
- Naccess (needed for the accessible surface area step)
- Shell (usually installed by default)
- Docker (optional, but recommended for reproducible runs)
Check Installation
```
snakemake --version
```
A valid Snakemake version (e.g., 7.x) should be displayed.

Usage

Prepare Input
- Place your raw .pdb files in input/.
Run the Workflow
```
snakemake --cores 1 --latency-wait 10
```
This command runs all the steps: splitting PDB files, running Naccess, and generating interface results.
Customization (Optional)
- Modify or add rules in the Snakefile.
- Update any scripts in scripts/ to customize the pipeline.

Common Snakemake Options

Dry Run
```
snakemake -n --cores 1
```
Shows the planned jobs without executing them.
Force All Steps
```
snakemake --cores 1 --forceall --latency-wait 10
```
Re-runs every rule, even when its outputs already exist.
Workflow DAG
```
snakemake --dag | dot -Tpng > dag.png
```
Exports a directed acyclic graph (DAG) of the workflow.

Docker Usage

We provide a ready-to-use Docker image to facilitate a reproducible environment. Below are instructions to pull or build the image, and run the pipeline inside a container.

Pull the Pre-built Image (Recommended)

docker pull mhtjsh/protein-dna-interface

Build the Image Yourself (Optional)

git clone https://github.com/mhtjsh/Protein_DNA_Interface_Generation.git
cd Protein_DNA_Interface_Generation
docker build -t mhtjsh/protein-dna-interface .

Running the Container

Basic Run Using Example Data

docker run --rm -it mhtjsh/protein-dna-interface

Mounting Input and Output Folders

To process your own input PDB files and retrieve outputs on your host system:

docker run --rm -it \
  -v /home/mhtjsh/Protein_DNA_Interface_Generation/input:/app/input \
  -v /home/mhtjsh/Protein_DNA_Interface_Generation/output:/app/output \
  mhtjsh/protein-dna-interface

Ensure your local input/ directory has .pdb files before running.
Results will be placed in output/.
Adjust the local path (/home/mhtjsh/Protein_DNA_Interface_Generation) to your actual directory if needed.

Testing

Manual Testing
- Place a test PDB file in input/.
- Run Snakemake or the Docker container, verifying outputs in split_chain/, rsa/, and interface/.
Automated Testing
- Create minimal test data and a test rule in the Snakefile or a CI configuration (e.g., GitHub Actions).
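
As a starting point for an automated check, the sketch below reports which stage directories are missing or empty after a run. The helper name `empty_stage_dirs` is hypothetical, and the check deliberately does not assume any particular output file names.

```python
from pathlib import Path

# Output directories named in the repository structure.
STAGE_DIRS = ("split_chain", "rsa", "interface")


def empty_stage_dirs(root="."):
    """Return the stage directories under `root` that are missing or hold no files."""
    root = Path(root)
    empty = []
    for name in STAGE_DIRS:
        stage = root / name
        if not stage.is_dir() or not any(p.is_file() for p in stage.iterdir()):
            empty.append(name)
    return empty


if __name__ == "__main__":
    problems = empty_stage_dirs()
    if problems:
        print("no output found in:", ", ".join(problems))
    else:
        print("all stages produced output")
```

A CI job could run the workflow on a small test PDB file and then fail if this function returns a non-empty list.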

Contributing

All contributions are welcome! To contribute:

Fork this repository.
Create a new feature branch.
Submit a pull request with your changes.

License

This project is distributed under an open-source license (e.g., MIT). See LICENSE for details.
