SANNO

The official implementation for "SANNO".

Datasets

We provide preprocessed datasets for easy reproduction.

Download datasets from: Dataset Link

Installation

To use SANNO, follow these steps:

Create a conda environment:

conda create -n SANNO python=3.7
conda activate SANNO

Install dependencies:

pip install -r requirements.txt
# or install SANNO directly with pip
pip install SANNO

Install PyTorch and PyG (PyTorch Geometric) builds that match your CUDA version. The example below uses torch-1.13.1+cu117 on Ubuntu 20.04.4 LTS:

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install torch_geometric==2.3.0 # must be this version
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
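
After installation, a quick check can confirm that the pinned versions were picked up. The following is a minimal sketch, not part of SANNO: the helper names `installed_version` and `check_versions` are hypothetical, and the expected versions are the ones from the example above, so adjust them for a different CUDA build.

```python
import importlib
from typing import Dict, Optional

# Versions named in the installation example; change them for another CUDA build.
EXPECTED = {"torch": "1.13.1", "torch_geometric": "2.3.0"}


def installed_version(module_name: str) -> Optional[str]:
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any import failure is treated as "not installed" for this check.
        return None
    return getattr(module, "__version__", None)


def check_versions(expected: Dict[str, str]) -> Dict[str, bool]:
    """Map each module name to True when its installed version matches."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # torch reports e.g. "1.13.1+cu117", so drop the local build suffix.
        report[name] = found is not None and found.split("+")[0] == wanted
    return report


for name, ok in check_versions(EXPECTED).items():
    print(f"{name}: {'OK' if ok else 'missing or unexpected version'}")
```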

Usage

Data Preprocessing

To run SANNO, first create AnnData objects from the raw data.

We require two types of datasets for this project: reference data and query data. Both datasets should be provided in .h5ad format, with cells stored in obs and genes/features stored in var.

Reference Data

Format: .h5ad
Content:
- obs: Cell metadata, including a mandatory cell_type column indicating the true cell type labels.
- var: Gene/feature metadata.
- obsm: Spatial coordinates stored under the key pos, representing the relative positions of cells as a 2D numpy array (n_cells x 2).

Query Data

Format: .h5ad
Content:
- obs: Cell metadata (cell type labels are not required).
- var: Gene/feature metadata.
- obsm: Spatial coordinates stored under the key pos, representing the relative positions of cells as a 2D numpy array (n_cells x 2).
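
The layout described above can be checked before training. The sketch below is illustrative only and not part of SANNO: `format_problems` is a hypothetical helper that relies solely on the attributes named in this section (`obs`, `obsm`, the `cell_type` column, and the `pos` key) plus AnnData's `n_obs`.

```python
from typing import List

REFERENCE_LABEL_COLUMN = "cell_type"  # mandatory in the reference obs
SPATIAL_KEY = "pos"                   # obsm key holding (n_cells x 2) coordinates


def format_problems(adata, is_reference: bool) -> List[str]:
    """Return readable problems; an empty list means the layout matches."""
    problems = []
    # Only the reference data needs true cell type labels.
    if is_reference and REFERENCE_LABEL_COLUMN not in adata.obs.columns:
        problems.append("obs is missing the 'cell_type' column")
    if SPATIAL_KEY not in adata.obsm:
        problems.append("obsm is missing the 'pos' key")
    else:
        shape = tuple(adata.obsm[SPATIAL_KEY].shape)
        if shape != (adata.n_obs, 2):
            problems.append(
                f"obsm['pos'] has shape {shape}, expected ({adata.n_obs}, 2)"
            )
    return problems
```

With anndata installed, the helper could be called as `format_problems(anndata.read_h5ad("path/to/train_adata.h5ad"), is_reference=True)` for the reference data, and with `is_reference=False` for the query data.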

Cell Type Annotation

The processed data are used as input to SANNO. A reference dataset is provided so that SANNO can extract embeddings and annotate the query cells while incorporating spatial transcriptomics information from the reference:

cd SANNO/SANNO

# --gpu_index: GPU index; --type: project type (st2st, st2sc, or sc2sc)
# --dataset: project name; --train_dataset: reference data
# --test_dataset: query data; --log: log path
python main_xy_adj.py --gpu_index 3 \
                      --type st2st \
                      --dataset project_name \
                      --train_dataset path/to/train_adata.h5ad \
                      --test_dataset path/to/test_adata.h5ad \
                      --log log
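
When launching several runs from a script, the same command can be assembled in Python. This is a minimal sketch, assuming only the flags shown in the command above; `build_command` is a hypothetical helper, and the paths and project name are placeholders.

```python
import shlex
from typing import List


def build_command(gpu_index: int, project_type: str, dataset: str,
                  train_dataset: str, test_dataset: str,
                  log: str = "log") -> List[str]:
    """Assemble the argument list for main_xy_adj.py from the documented flags."""
    return [
        "python", "main_xy_adj.py",
        "--gpu_index", str(gpu_index),     # GPU index
        "--type", project_type,            # project type
        "--dataset", dataset,              # project name
        "--train_dataset", train_dataset,  # reference data
        "--test_dataset", test_dataset,    # query data
        "--log", log,                      # log path
    ]


command = build_command(3, "st2st", "project_name",
                        "path/to/train_adata.h5ad", "path/to/test_adata.h5ad")
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

To actually start the run, the list could be passed to `subprocess.run(command, check=True)` from inside the SANNO/SANNO directory. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.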

The project type must be selected based on the nature of the reference and query datasets. The following modes are supported:

st2st – For cases where both the reference and query datasets are spatial transcriptomics.
st2sc – For cases where the reference dataset is spatial transcriptomics, and the query dataset is single-cell transcriptomics.
sc2sc – For cases where both the reference and query datasets are single-cell transcriptomics.
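
The choice among the three modes can be written as a small decision function. This is only a sketch of the rule stated above; `select_project_type` is a hypothetical name and not part of SANNO.

```python
def select_project_type(reference_is_spatial: bool, query_is_spatial: bool) -> str:
    """Pick the --type value from the nature of the reference and query data."""
    if reference_is_spatial and query_is_spatial:
        return "st2st"
    if reference_is_spatial and not query_is_spatial:
        return "st2sc"
    if not reference_is_spatial and not query_is_spatial:
        return "sc2sc"
    # A single-cell reference with a spatial query is not among the listed modes.
    raise ValueError("unsupported combination: single-cell reference, spatial query")


print(select_project_type(True, False))
```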

Running the above command will generate three output files in the output path:

acc.csv: Contains the overall accuracy of SANNO's predictions on the query data.
embedding.h5ad: An AnnData file storing the embeddings extracted by SANNO.
Reports: A set of logs recorded during the training process.
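
After a run, it can be useful to confirm that the named output files were written. The sketch below checks only the two file names given above; `missing_outputs` is a hypothetical helper, and the output directory is whatever path your run writes to.

```python
from pathlib import Path
from typing import List

# File names taken from the list above; the log reports are not checked here.
EXPECTED_FILES = ["acc.csv", "embedding.h5ad"]


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected output files that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means both files are present. The embeddings can then be loaded with `anndata.read_h5ad`.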

Tutorial

Tutorial 1: Cell annotations within samples (HubMap CL A & HubMap CL B)

Install the required environment according to Installation.
Download the datasets from HubMap CL.
Preprocess the datasets according to the Data Preprocessing standards.
For details on data preprocessing and training, run the tutorial notebook HubMap_CL_intra.ipynb.

Tutorial 2: Cell annotations across samples (Tonsil & BE)

Install the required environment according to Installation.
Download the datasets from Tonsil_BE.
Preprocess the datasets according to the Data Preprocessing standards.
For details on data preprocessing and training, run the tutorial notebook HubMap_CL_intra.ipynb.

Citation

If you use SANNO in your research, please cite:

@article{SANNO,
  title={{SANNO: A Graph-Transformer Enhanced Optimal Transport Tool for Spatial Transcriptomic Annotation}},
  author={Yuansong Zeng and Yuanze Chen and Ningyuan Shangguan and Wenbin Li and Hongyu Zhang and Zheng Wang and Huiying Zhao}
}
