Merge pull request #1 from BenjaminIsaac0111/InteractionModelDebugging
I felt the model was getting too complicated and technical debt was building up. I simplified it to the point where it at least learns something with no extended components beyond the standard MIL-like interaction model built on transformer layers. I might revisit some of the initial ideas, but I think this version is a good launchpad for extending them. I improved the tests, but they still need to be more robust and give better coverage. I will work on this...
Thank you for your interest in contributing! As a project at the intersection of deep learning and pathology, we value rigorous, well-tested contributions.

## Project Status

> [!IMPORTANT]
> This project is a **Work in Progress**. We are actively refining the core interaction logic and scaling behaviors. Expect breaking changes in the CLI and data schemas.

## Intellectual Property & Licensing

SpatialTranscriptFormer is protected under a **Proprietary Source Code License**.

- **Academic/Non-Profit**: We encourage contributions from the research community. Contributions made under an academic affiliation are generally welcome.
- **Commercial/For-Profit**: Contributions from commercial entities or individuals intended for profit-seeking use require a separate agreement.
- **Assignment**: By submitting a Pull Request, you agree that your contributions will be licensed under the project's existing license, granting the author the right to include them in both the open-access and proprietary versions of the software.

## Development Workflow

### 1. Environment Setup

Use the provided setup scripts to ensure a consistent development environment:

```bash
# Windows
.\setup.ps1

# Linux/HPC
bash setup.sh
```

### 2. Coding Standards

We use `black` for formatting and `flake8` for linting. Please ensure your code passes these checks before submitting.

```bash
black .
flake8 src/
```

### 3. Testing

All new features must include unit tests in the `tests/` directory. We use `pytest` for our test suite.

```bash
# Run all tests
.\test.ps1 # Windows
bash test.sh # Linux
```
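
As a rough illustration of the kind of unit test this section asks for, the sketch below shows a small test module in the style `pytest` discovers (functions named `test_*`). The helper `log1p_counts` and the file name are hypothetical and do not come from this repository; they only stand in for a real feature under test.

```python
# tests/test_example.py  (hypothetical file name, for illustration only)
import math


def log1p_counts(counts):
    """Hypothetical helper: log(1 + x) transform of raw expression counts."""
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return [math.log1p(c) for c in counts]


def test_log1p_counts_maps_zero_to_zero():
    # log(1 + 0) is exactly 0.0
    assert log1p_counts([0, 0]) == [0.0, 0.0]


def test_log1p_counts_preserves_order():
    # The transform is monotonic, so increasing input stays increasing
    out = log1p_counts([1, 10, 100])
    assert out == sorted(out)


def test_log1p_counts_rejects_negative_values():
    # Plain try/except keeps this sketch free of third-party imports
    try:
        log1p_counts([-1])
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative counts")
```

In a real contribution the helper would be imported from the package under `src/` rather than defined in the test file, and `pytest.raises` is the usual way to assert on exceptions.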

## Pull Request Process

1. **Open an Issue**: For major changes, please open an issue first to discuss the design.
2. **Branching**: Work on a descriptive feature branch (e.g., `feature/pathway-attention-mask`).
3. **Documentation**: Update relevant files in `docs/` and the `README.md` if your change affects usage.
4. **Verification**: Ensure all CI checks (GitHub Actions) pass.

### Branch Protections

To maintain code quality and stability, the following protections are enforced on the `main` branch:

- **Require Pull Request Reviews**: All merges to `main` require at least one approval from a project maintainer.
- **Required Status Checks**: The `CI` workflow must pass successfully before a PR can be merged. This includes formatting checks (`black`) and the full test suite (`pytest`).
- **No Direct Pushes**: Pushing directly to `main` is disabled. All changes must go through the Pull Request process.
- **Linear History**: We prefer **Squash and Merge** to keep the `main` branch history clean and concise.

## Contact

For questions regarding commercial licensing or complex architectural changes, please contact the author directly.
-**Spatial Pattern Coherence**: Optimized using a composite **MSE + PCC (Pearson Correlation) loss** to prevent spatial collapse and ensure accurate morphology-expression mapping.
+- **Biologically Informed Initialization**: Gene reconstruction weights derived from known hallmark memberships.
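
To make the composite MSE + PCC loss named in the feature list above more concrete, here is a minimal pure-Python sketch. It assumes the two terms are combined as `MSE + weight * (1 - PCC)`; the diff does not say how the project weights or reduces them, so `pcc_weight`, `eps`, and the function names are illustrative assumptions rather than the repository's API.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pearson(pred, target, eps=1e-8):
    """Pearson correlation coefficient; eps guards against zero variance."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / (math.sqrt(var_p * var_t) + eps)


def composite_loss(pred, target, pcc_weight=1.0):
    """MSE plus a (1 - PCC) penalty. pcc_weight is a hypothetical knob."""
    return mse(pred, target) + pcc_weight * (1.0 - pearson(pred, target))


# Expression of one gene across four spots (made-up numbers).
target = [1.0, 2.0, 3.0, 4.0]
collapsed = [2.5, 2.5, 2.5, 2.5]  # constant prediction at the target mean

print(composite_loss(target, target))     # close to 0 for a perfect prediction
print(composite_loss(collapsed, target))  # MSE term plus the full correlation penalty
```

A constant prediction has zero correlation with the target, so it pays the full `1 - PCC` penalty on top of its MSE; this is one way a correlation term can discourage spatially collapsed outputs.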

 ## License

@@ -25,71 +35,66 @@ This project requires [Conda](https://docs.conda.io/en/latest/).

 ## Usage

-After installation, the following command-line tools are available in your `SpatialTranscriptFormer` environment:
-
 ### Download HEST Data

 Download specific subsets using filters or patterns:

 ```bash
-# List available organs
-stf-download --list_organs
-
 # Download only the Bowel Cancer subset (including ST data and WSIs)
 stf-download --organ Bowel --disease Cancer --local_dir hest_data
-
-# Download any other organ
-stf-download --organ Kidney
 ```
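
The filter semantics suggested by the `--organ` and `--disease` flags above can be sketched as a plain metadata filter. This is only an assumption about the behavior, not the tool's implementation: the function, the field names, and the sample rows below are all made up for illustration.

```python
def filter_samples(rows, organ=None, disease=None):
    """Keep rows that match every filter that is set (case-insensitive)."""

    def matches(row, key, wanted):
        # A filter left as None accepts every row
        return wanted is None or row.get(key, "").lower() == wanted.lower()

    return [
        r for r in rows
        if matches(r, "organ", organ) and matches(r, "disease", disease)
    ]


# Hypothetical metadata rows; real HEST metadata may use other columns.
samples = [
    {"id": "S1", "organ": "Bowel", "disease": "Cancer"},
    {"id": "S2", "organ": "Bowel", "disease": "Healthy"},
    {"id": "S3", "organ": "Kidney", "disease": "Cancer"},
]

selected = filter_samples(samples, organ="Bowel", disease="Cancer")
print([s["id"] for s in selected])  # ['S1']
```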

-### Split Dataset
+### Train Models
+
+We provide presets for baseline models and scaled versions of the SpatialTranscriptFormer.

-Perform patient-stratified splitting on the metadata:
+```bash
+# Recommended: Run the Interaction model with 4 transformer layers
For a complete list of configurations, see the [Training Guide](docs/TRAINING_GUIDE.md).

-Train baseline models (HE2RNA, ViT) or the proposed interaction model. For a complete list of configurations and examples, see the [Training Guide](docs/TRAINING_GUIDE.md).
Visualization plots will be saved to the `./results` directory.

 ## Documentation

-For detailed information on the data and code implementation, see:
-
+- [Models](docs/MODELS.md): Detailed model architectures and scaling parameters.
 - [Data Structure](docs/DATA_STRUCTURE.md): Organization of HEST data on disk.
-- [Dataloader](docs/DATALOADER.md): Technical implementation of the PyTorch dataset and loaders.
-- [Gene Analysis](docs/GENE_ANALYSIS.md): Analysis of available genes and modeling strategies.
-- [Pathway Mapping](docs/PATHWAY_MAPPING.md): Strategies for clinical interpretability and pathway integration.
-- [Latent Discovery](docs/LATENT_DISCOVERY.md): Unsupervised discovery of biological pathways from data.
-- [Models](docs/MODELS.md): Model architectures and literature references.
+- [Pathway Mapping](docs/PATHWAY_MAPPING.md): Clinical interpretability and pathway integration.
+- [Gene Analysis](docs/GENE_ANALYSIS.md): Modeling strategies for high-dimensional gene space.

 ## Development

 ### Running Tests

-Use the included test wrapper:
-
 ```bash
-# Run all tests
+# Run all tests (Pytest wrapper)
 .\test.ps1
 ```
+
+## Contributing
+
+We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details on our coding standards and the process for submitting pull requests. Note that this project is under a proprietary license; contributions involve an assignment of rights for non-academic use.