UCD-BDLab · ramosv · Jul 13, 2025
diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml
@@ -4,6 +4,8 @@ on:
   push:
     tags:
       - "v*.*.*"
+  release:
+    types: [published]
 
 jobs:
   build-and-publish:
@@ -12,6 +14,8 @@ jobs:
     steps:
       - name: Check out code
         uses: actions/checkout@v3
+        with:
+          ref: ${{ github.event_name == 'release' && github.event.release.tag_name || github.ref }}
 
       - name: Set up Python
         uses: actions/setup-python@v4
@@ -32,7 +36,7 @@ jobs:
           TWINE_PASSWORD: ${{ secrets.GITHUB_TOKEN }}
         run: |
           twine upload \
-            --repository-url https://upload.pypi.github.io/UCD-BDLab/BioNeuralNet \
+            --repository-url https://api.github.com/orgs/UCD-BDLab/packages/pypi/upload \
             dist/*
 
       - name: Publish to PyPI

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -66,9 +66,47 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 - **Updated Tutorials and Documentation**: New end to end jupiter notebook example.
 - **Updated Test**: All test have been updated and new ones have been added.
 
-## [1.0.1] to [1.0.9] - 2025-04-24
+## [1.1.0] - 2025-07-12
 
-- **BUG**: A bug related to rdata files missing
-- **Updated License**: BioNeuralNet is now distributed under the [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/).
+### **Added**
+- **New Embedding Integration Utility**
+  - `_integrate_embeddings(reduced, method="multiply", alpha=2.0, beta=0.5)`: 
+    - Integrates reduced embeddings with raw omics features via a multiplicative scheme:  
+    - `enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)`  
+    - (with the default `beta=0.5`, the raw term and the weight-scaled term receive equal blending coefficients).
+
+- **Graph-Generation Algorithms**
+  - `gen_similarity_graph`: k-NN Cosine / Gaussian RBF similarity graph  
+  - `gen_correlation_graph`: Pearson / Spearman co-expression graph  
+  - `gen_threshold_graph`: soft-threshold (WGCNA-style) correlation graph  
+  - `gen_gaussian_knn_graph`: Gaussian kernel k-NN graph  
+  - `gen_mutual_info_graph`: mutual-information graph
+
+- **Preprocessing Utilities**
+  - Clinical data pipeline `preprocess_clinical`
+  - Inf/NaN cleaning: `clean_inf_nan`
+  - Variance selection: `select_top_k_variance`
+  - Correlation selection (supervised / unsupervised): `select_top_k_correlation`
+  - RandomForest importance: `select_top_randomforest`
+  - ANOVA F-test selection: `top_anova_f_features`
+  - Network-pruning helpers:  
+      - `prune_network`, `prune_network_by_quantile`,  
+      - `network_remove_low_variance`, `network_remove_high_zero_fraction`
+
+- **Continuous-Deployment Workflow**  
+  Added `.github/workflows/publish.yml` to auto-publish releases to PyPI when a Git tag is pushed.
+
+- **Updated Homepage Image**  
+  Replaced the index-page illustration to depict the full BioNeuralNet workflow.
 
-- **New release**: A new release will include documentation for the other updates. (1.1.0)
+### **Changed**
+- **Comprehensive Documentation Update**
+  - Rebuilt ReadTheDocs site with a new workflow diagram on the landing page.  
+  - Synced API reference to include all new graph-generation, preprocessing, and embedding-integration functions.  
+  - Added quick-start guide, expanded tutorials, and refreshed examples/notebooks.  
+  - Updated narrative docs, docstrings, and licensing info for consistency.
+
+- **License**: Project is now distributed under the [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/).
+
+### **Fixed**
+- **Packaging Bug**: Missing `.csv` datasets and `.R` scripts in source distribution; `MANIFEST.in` updated to include all requisite data files.
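
Reviewer note on the `_integrate_embeddings` changelog entry: the blending formula is easier to check with numbers in hand. The sketch below is a minimal stand-in, not the library's implementation. The function name `integrate_embeddings_sketch`, the per-feature `weights` argument, and the min-max rescaling used for `normalized_weight` are all assumptions made for illustration; only the formula and the defaults `alpha=2.0`, `beta=0.5` come from the changelog.

```python
import numpy as np


def integrate_embeddings_sketch(raw, weights, alpha=2.0, beta=0.5):
    """Hypothetical illustration of the changelog formula (not the library code)."""
    raw = np.asarray(raw, dtype=float)          # shape (n_samples, n_features)
    weights = np.asarray(weights, dtype=float)  # shape (n_features,)

    # Assumption: "normalized_weight" is a min-max rescaling to [0, 1].
    span = weights.max() - weights.min()
    if span > 0:
        normalized_weight = (weights - weights.min()) / span
    else:
        normalized_weight = np.ones_like(weights)

    # enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)
    return beta * raw + (1.0 - beta) * (alpha * normalized_weight * raw)


raw = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])
weights = np.array([0.0, 1.0, 2.0])
enhanced = integrate_embeddings_sketch(raw, weights)
```

Under these assumptions and the default parameters, each feature is scaled by `0.5 + normalized_weight`, so a feature with the lowest weight is halved and one with the highest weight is multiplied by 1.5.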
diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@
 [![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue.svg)](https://bioneuralnet.readthedocs.io/en/latest/)
 
 
-## Welcome to BioNeuralNet 1.0.9
+## Welcome to BioNeuralNet 1.1.0
 
 ![BioNeuralNet Logo](assets/LOGO_WB.png)
 

diff --git a/bioneuralnet/__init__.py b/bioneuralnet/__init__.py
@@ -29,7 +29,7 @@
     - `datasets`: Contains example (synthetic) datasets for testing and demonstration purposes.
 """
 
-__version__ = "1.0.9"
+__version__ = "1.1.0"
 
 from .network_embedding import GNNEmbedding
 from .downstream_task import SubjectRepresentation

diff --git a/docs/jupyter_execute/Quick_Start.ipynb b/docs/jupyter_execute/Quick_Start.ipynb
@@ -913,7 +913,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "BioNeuralNet version: 1.0.9\n"
+      "BioNeuralNet version: 1.1.0\n"
      ]
     }
    ],

diff --git a/docs/jupyter_execute/TCGA-BRCA_Dataset.ipynb b/docs/jupyter_execute/TCGA-BRCA_Dataset.ipynb
@@ -60,27 +60,6 @@
     "- [Direct Download BRCA](http://firebrowse.org/?cohort=BRCA&download_dialog=true)\n"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "60a6b53c",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# adjusting global pandas options for better display on web documentation\n",
-    "import pandas as pd\n",
-    "import warnings\n",
-    "import logging\n",
-    "\n",
-    "pd.set_option(\"display.max_columns\", 5)\n",
-    "pd.set_option(\"display.expand_frame_repr\", False)\n",
-    "warnings.filterwarnings(\"ignore\", category=UserWarning)\n",
-    "warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n",
-    "logging.getLogger(\"ray\").setLevel(logging.ERROR)\n",
-    "logging.getLogger(\"ray.tune\").setLevel(logging.ERROR)\n",
-    "logging.getLogger(\"torch_geometric\").setLevel(logging.ERROR)"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "c9698b74",

diff --git a/docs/source/Quick_Start.ipynb b/docs/source/Quick_Start.ipynb
@@ -913,7 +913,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "BioNeuralNet version: 1.0.9\n"
+      "BioNeuralNet version: 1.1.0\n"
      ]
     }
    ],

diff --git a/docs/source/_autosummary/bioneuralnet.utils.graph.rst b/docs/source/_autosummary/bioneuralnet.utils.graph.rst
@@ -15,6 +15,7 @@ bioneuralnet.utils.graph
       gen_similarity_graph
       gen_snn_graph
       gen_threshold_graph
+      get_logger
 
    .. rubric:: Classes
 

diff --git a/docs/source/_autosummary/bioneuralnet.utils.preprocess.rst b/docs/source/_autosummary/bioneuralnet.utils.preprocess.rst
@@ -15,7 +15,6 @@ bioneuralnet.utils.preprocess
       multipletests
       network_remove_high_zero_fraction
       network_remove_low_variance
-      overload
       preprocess_clinical
       prune_network
       prune_network_by_quantile
@@ -28,7 +27,6 @@ bioneuralnet.utils.preprocess
 
    .. autosummary::
 
-      OrdinalEncoder
       RandomForestClassifier
       RandomForestRegressor
       RobustScaler

diff --git a/docs/source/_static/BioNeuralNet.png b/docs/source/_static/BioNeuralNet.png
diff --git a/docs/source/_static/BioNeuralNet_old1.png b/docs/source/_static/BioNeuralNet_old1.png
diff --git a/docs/source/clustering.rst b/docs/source/clustering.rst
@@ -1,158 +1,95 @@
 Correlated Clustering
 =====================
 
-BioNeuralNet includes internal modules for performing **correlated clustering** on complex networks.
-These methods extend traditional community detection by integrating **phenotype correlation**, allowing users to extract **biologically relevant, phenotype-associated modules** from any network.
+BioNeuralNet provides **correlated clustering methods** designed specifically to identify biologically relevant communities within multi-omics networks. By integrating **phenotype correlations**, these approaches enhance traditional community detection methods, capturing biologically meaningful network modules strongly associated with clinical or phenotypic outcomes.
 
-Overview
---------
+Key Features
+------------
+- **Phenotype-Aware Clustering**: Incorporates external phenotype information directly into clustering algorithms, resulting in communities that are both structurally cohesive and biologically meaningful.
+- **Flexible Application**: Methods are applicable to any network data represented as adjacency matrices, facilitating diverse research scenarios including biomarker discovery and functional module identification.
+- **Integration with Downstream Analysis**: Clusters obtained can directly feed into downstream tasks such as disease prediction, feature selection, and biomarker identification.
 
-Our framework supports three key **correlated clustering** approaches:
+Supported Clustering Methods
+----------------------------
 
-- **Correlated PageRank**:
+Correlated PageRank
+-------------------
+A variant of PageRank that biases node rankings toward phenotype-relevant nodes, prioritizing features with strong phenotype associations:
 
-  - A **modified PageRank algorithm** that prioritizes nodes based on their correlation with an external phenotype.
-
-  - The **personalization vector** is computed using phenotype correlation, ensuring that **biologically significant nodes receive more influence**.
-
-  - This method is ideal for **identifying high-impact nodes** within a given network.
+.. math::
 
-- **Correlated Louvain**:
+     \mathbf{r} = \alpha \cdot \mathbf{M} \mathbf{r} + (1 - \alpha) \mathbf{p}
 
-  - An adaptation of the **Louvain community detection algorithm**, modified to optimize for **both network modularity and phenotype correlation**.
-  - The objective function for community detection is given by:
+- :math:`\mathbf{M}`: Normalized adjacency (transition probability matrix).
+- :math:`\mathbf{p}`: Phenotype-informed personalization vector (based on correlation).
+- Ideal for ranking biologically impactful nodes.
 
-    .. math::
+Correlated Louvain
+------------------
+Modifies Louvain community detection to balance structural modularity and phenotype correlation, optimizing:
 
-       Q^* = k_L \cdot Q + (1 - k_L) \cdot \overline{\lvert \rho \rvert},
+.. math::
 
-    where:
+       Q^* = k_L \cdot Q + (1 - k_L) \cdot \overline{\lvert \rho \rvert}
 
-      - :math:`Q` is the standard **Newman-Girvan modularity**, defined as:
+- :math:`Q`: Newman-Girvan modularity, measuring network structural cohesiveness.
+- :math:`\overline{\lvert \rho \rvert}`: Mean absolute Pearson correlation between cluster features and phenotype.
+- :math:`k_L`: User-defined parameter balancing structure and phenotype relevance.
+- Efficient for identifying phenotype-enriched communities.
 
-        .. math::
+Hybrid Louvain (Iterative Refinement)
+-------------------------------------
+Combines Correlated Louvain with Correlated PageRank iteratively to refine community assignments:
 
-           Q = \frac{1}{2m} \sum_{i,j} \bigl(A_{ij} - \frac{k_i k_j}{2m} \bigr) \delta(c_i, c_j),
-
-        where :math:`A_{ij}` represents the adjacency matrix, :math:`k_i` and :math:`k_j` are node degrees, and :math:`\delta(c_i, c_j)` indicates whether nodes belong to the same community.
-      - :math:`\overline{\lvert \rho \rvert}` is the **mean absolute Pearson correlation** between the **first principal component (PC1) of the subgraph's features** and the phenotype.
-      - :math:`k_L` is a user-defined weight (e.g., :math:`k_L = 0.2`), balancing **network modularity and phenotype correlation**.
-
-  - This method **detects communities** that are both **structurally cohesive and strongly associated with phenotype**.
-
-- **Hybrid Louvain**:
-
-  - A **refinement approach** that combines **Correlated Louvain** and **Correlated PageRank** in an iterative process.
-
-  - The key steps are:
-
-    1. **Initial Community Detection**:
-
-       - The **input network (adjacency matrix)** is clustered using **Correlated Louvain**.
-       - This identifies **initial phenotype-associated modules**.
-
-    2. **Iterative Refinement with Correlated PageRank**:
-
-       - In each iteration:
-
-         - The **most correlated module** is **expanded** based on Correlated PageRank.
-         - The refined network is **re-clustered using Correlated Louvain**.
-         - This process continues **until convergence**.
-
-    3. **Final Cluster Extraction**:
-
-       - The final **phenotype-optimized modules** are extracted and returned.
-       - The quality of the clustering is measured using **both modularity and phenotype correlation metrics**.
+1. Initial clustering using Correlated Louvain identifies phenotype-associated modules.
+2. Clusters are iteratively refined by expanding highly correlated modules using Correlated PageRank.
+3. The process repeats until convergence, producing optimized phenotype-associated communities.
 
 .. figure:: _static/hybrid_clustering.png
    :align: center
-   :alt: Overview hybrid clustering workflow
-
-   **Hybrid Clustering**: Precedure and steps for the hybrid clustering method.
+   :alt: Hybrid Clustering Workflow
 
+   Workflow: Hybrid Louvain iteratively integrates Correlated PageRank and Correlated Louvain to produce refined phenotype-associated clusters.
 
-Mathematical Approach
+Comparison of Methods
 ---------------------
-
-**Correlated PageRank:**
-
-   - Correlated PageRank extends the traditional PageRank formulation by **biasing the random walk towards phenotype-associated nodes**.
-
-   - The **ranking function** is defined as:
-
-  .. math::
-
-     \mathbf{r} = \alpha \cdot \mathbf{M} \mathbf{r} + (1 - \alpha) \mathbf{p},
-
-  where:
-
-  - :math:`\mathbf{M}` is the transition probability matrix, derived from the **normalized adjacency matrix**.
-  - :math:`\mathbf{p}` is the **personalization vector**, computed using **phenotype correlation**.
-  - :math:`\alpha` is the **teleportation factor** (default: :math:`\alpha = 0.85`).
-
-- Unlike standard PageRank, which assumes a **uniform teleportation distribution**, **Correlated PageRank prioritizes phenotype-relevant nodes**.
-
-Graphical Comparison
---------------------
-
-Below is an illustration of **different clustering approaches** on a sample network:
+The figure below illustrates the difference between standard and correlated clustering methods, highlighting BioNeuralNet's ability to extract biologically meaningful modules.
 
 .. figure:: _static/clustercorrelation.png
    :align: center
-   :alt: Comparison of Correlated Clustering Methods
-
-   **Figure 2:** Comparison between SmCCNet generated clusters and Correlated Louvain clusters
-
-Integration with BioNeuralNet
-------------------------------
+   :alt: Clustering Method Comparison
 
-Our **correlated clustering methods** seamlessly integrate into **BioNeuralNet** and can be applied to **any network represented as an adjacency matrix**.
+   Comparison: Standard (SmCCNet) versus Correlated Louvain clusters.
 
-Use cases include:
+Applications and Use Cases
+--------------------------
+BioNeuralNet correlated clustering is versatile and suitable for diverse network analyses:
 
-   - **Multi-Omics Networks**: Extracting **biologically relevant subgraphs** from gene expression, proteomics, or metabolomics data.
-   - **Brain Connectivity Graphs**: Identifying **functional modules associated with neurological disorders**.
-   - **Social & Disease Networks**: Detecting **community structures in epidemiology and patient networks**.
+- **Multi-Omics Networks**: Extract biologically relevant gene/protein modules associated with clinical phenotypes.
+- **Neuroimaging Networks**: Identify functional brain modules linked to neurological diseases.
+- **Disease Networks**: Reveal patient or epidemiological network communities strongly linked to clinical outcomes.
 
-Our framework supports:
+Integration into BioNeuralNet Workflow
+--------------------------------------
+Clustering outputs seamlessly feed into downstream BioNeuralNet modules:
 
-   - **Graph Neural Network Embedding**: Training GNNs on **phenotype-optimized clusters**.
-
-   - **Predictive Biomarker Discovery**: Identifying key **features associated with disease outcomes**.
-
-   - **Customizable Modularity Optimization**: Allowing users to **adjust the trade-off between structure and phenotype correlation**.
+- **GNN Embedding Generation**: Train Graph Neural Networks on phenotype-enriched clusters.
+- **Disease Prediction (DPMON)**: Utilize phenotype-associated modules for improved predictive accuracy.
+- **Biomarker Discovery**: Extract features or modules strongly predictive of disease status.
 
-Notes for Users
----------------
-
-1. **Input Requirements**:
-
-   - Any **graph-based dataset** can be used as input, provided as an **adjacency matrix**.
-
-   - Phenotype data should be supplied in **numerical format** (e.g., disease severity scores, expression levels).
-
-2. **Cluster Comparison**:
-
-   - **Correlated Louvain extracts phenotype-associated modules.**
-
-   - **Hybrid Louvain iteratively refines clusters using Correlated PageRank.**
-
-   - Users can compare results using **modularity scores and phenotype correlation metrics**.
-
-3. **Method Selection**:
+User Recommendations
+--------------------
+- **Correlated PageRank**: Best for prioritizing individual high-impact features or nodes.
+- **Correlated Louvain**: Ideal for extracting phenotype-associated functional communities efficiently.
+- **Hybrid Louvain**: Recommended for maximal biological interpretability, particularly in complex multi-omics scenarios.
 
-   - **Correlated PageRank** is ideal for **ranking high-impact nodes in a phenotype-aware manner**.
-
-   - **Correlated Louvain** is best for **detecting phenotype-associated communities**.
-
-   - **Hybrid Louvain** provides the most refined, **biologically meaningful clusters**.
+Reference and Further Reading
+-----------------------------
+For detailed methodology and benchmarking, refer to our publication:
 
-Conclusion
-----------
+- Abdel-Hafiz et al., Frontiers in Big Data, 2022. [1]_
 
-The **correlated clustering methods** implemented in BioNeuralNet provide a **powerful, flexible framework** for extracting **highly structured, phenotype-associated modules** from any network.
-By integrating **phenotype correlation directly into the clustering process**, these methods enable **more biologically relevant and disease-informative network analysis**.
+Return to :doc:`../index`
 
-paper link: https://doi.org/10.3389/fdata.2022.894632 
+.. [1] Abdel-Hafiz, M., Najafi, M., et al. "Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification." *Frontiers in Big Data*, 5 (2022). DOI: `10.3389/fdata.2022.894632 <https://doi.org/10.3389/fdata.2022.894632>`_.
 
-Return to :doc:`../index`
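
Reviewer note on the rewritten `clustering.rst`: the two formulas kept in the new text can be sanity-checked with a small NumPy sketch. This is an illustration under stated assumptions, not BioNeuralNet's code. The function names, the use of absolute Pearson correlation to build the personalization vector, the column normalization of the adjacency matrix, and the handling of isolated nodes are all assumptions. The defaults `alpha=0.85` and `k_L=0.2` are taken from the text this diff removes.

```python
import numpy as np


def correlated_pagerank_sketch(adj, features, phenotype, alpha=0.85,
                               tol=1e-10, max_iter=1000):
    """Power iteration for r = alpha * M r + (1 - alpha) * p (illustrative only)."""
    adj = np.asarray(adj, dtype=float)
    features = np.asarray(features, dtype=float)    # shape (n_samples, n_nodes)
    phenotype = np.asarray(phenotype, dtype=float)  # shape (n_samples,)
    n = adj.shape[0]

    # Assumption: p is the absolute Pearson correlation of each node's feature
    # with the phenotype, rescaled to sum to 1 (uniform if all are zero).
    corr = np.array([np.corrcoef(features[:, j], phenotype)[0, 1] for j in range(n)])
    corr = np.abs(np.nan_to_num(corr))
    p = corr / corr.sum() if corr.sum() > 0 else np.full(n, 1.0 / n)

    # Column-normalize the adjacency matrix to get transition probabilities M.
    col_sums = adj.sum(axis=0)
    M = np.divide(adj, col_sums, out=np.zeros_like(adj), where=col_sums > 0)
    dangling = col_sums == 0
    if dangling.any():
        # Assumption: nodes without edges redistribute their rank according to p.
        M[:, dangling] = p[:, None]

    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = alpha * (M @ r) + (1.0 - alpha) * p
        converged = np.abs(r_next - r).sum() < tol
        r = r_next
        if converged:
            break
    return r


def combined_objective_sketch(adj, labels, mean_abs_rho, k_L=0.2):
    """Q* = k_L * Q + (1 - k_L) * mean|rho|, with Q the Newman-Girvan modularity."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()
    same_community = labels[:, None] == labels[None, :]
    Q = ((adj - np.outer(degrees, degrees) / two_m) * same_community).sum() / two_m
    return k_L * Q + (1.0 - k_L) * mean_abs_rho
```

In this sketch `mean_abs_rho` is passed in precomputed. The removed text describes it as the mean absolute Pearson correlation between the first principal component of a cluster's features and the phenotype, which is left out here to keep the example short.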
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -7,7 +7,7 @@
 try:
     release = metadata.version("bioneuralnet")
 except metadata.PackageNotFoundError:
-    release = "1.0.9"
+    release = "1.1.0"
 
 project = "BioNeuralNet"
 version = release