Introduce graph algorithms domain #573

Merged
JohT merged 6 commits into main from feature/introduce-graph-algorithms-domain on May 2, 2026

Conversation

JohT self-assigned this on May 1, 2026
JohT force-pushed the feature/introduce-graph-algorithms-domain branch from 24379ff to 50f0df4 on May 1, 2026 19:21
JohT force-pushed the feature/introduce-graph-algorithms-domain branch from cee7a28 to d90c57c on May 2, 2026 08:49
JohT marked this pull request as ready for review on May 2, 2026 08:50
JohT changed the base branch from feature/introduce-algorithms-and-embeddings-domain to main on May 2, 2026 08:50
JohT requested a review from Copilot on May 2, 2026 08:50

Copilot AI left a comment

Pull request overview

Introduces a new graph-algorithms vertical-slice domain (scripts + Cypher + Markdown summary) to run Neo4j GDS centrality, community-detection, and similarity analyses, and to generate a consolidated Markdown report under reports/graph-algorithms/. Also enhances report cleanup to remove empty subdirectories.

Changes:

  • Add domains/graph-algorithms/ with CSV entrypoints (centrality/community/similarity), a Markdown entrypoint, and summary report assembly.
  • Add domain-local Cypher query sets (centrality, community-detection, similarity) plus new “statistics” queries used by the Markdown summary.
  • Update cleanup logic to delete empty report subdirectories after removing empty report files.
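
The cleanup change in the last bullet can be pictured with a minimal sketch. It assumes a `find` that supports `-empty`, `-delete`, and `-mindepth` (GNU and BSD find both do). The function name `remove_empty_reports` and its argument are hypothetical and are not taken from scripts/cleanupAfterReportGeneration.sh, which may implement the step differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name and argument are assumptions for illustration).
# Removes empty report files first, then the directories left empty by that.
remove_empty_reports() {
  local reports_directory="$1"

  # Step 1: delete report files that contain no data (zero bytes).
  find "${reports_directory}" -type f -empty -delete

  # Step 2: delete directories that are empty now.
  # -delete implies depth-first traversal, so nested empty directories
  # are removed bottom-up. -mindepth 1 keeps the reports root itself.
  find "${reports_directory}" -mindepth 1 -type d -empty -delete
}
```

Running the two passes in this order matters: a directory that only held empty files becomes empty after the first pass and is then picked up by the second.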

Reviewed changes

Copilot reviewed 137 out of 137 changed files in this pull request and generated 8 comments.

File Description
scripts/cleanupAfterReportGeneration.sh Adds deletion of empty subdirectories after pruning empty report files.
domains/graph-algorithms/README.md Domain overview, entrypoints, prerequisites, and output description.
domains/graph-algorithms/PREREQUISITES.md Documents required Neo4j/GDS + enrichment/projection prerequisites.
domains/graph-algorithms/COPIED_FILES.md Traceability mapping for copied Cypher sources.
domains/graph-algorithms/centralityCsv.sh Runs centrality algorithms and writes CSV outputs under the domain report tree.
domains/graph-algorithms/communityCsv.sh Runs community-detection algorithms and writes CSV outputs under the domain report tree.
domains/graph-algorithms/similarityCsv.sh Runs node similarity (Jaccard) and writes CSV outputs plus SIMILAR relationships.
domains/graph-algorithms/graphAlgorithmsMarkdown.sh Markdown report entrypoint delegating to the summary script.
domains/graph-algorithms/summary/graphAlgorithmsSummary.sh Assembles the final Markdown report + embeds includes generated from statistics queries.
domains/graph-algorithms/summary/report.template.md Main Markdown report template with includes and fallback includes.
domains/graph-algorithms/summary/report_no_graph_data.template.md Fallback include when statistics data is missing.
domains/graph-algorithms/queries/statistics/Top_nodes_by_PageRank.cypher Statistics query for Markdown summary (top PageRank nodes).
domains/graph-algorithms/queries/statistics/Top_nodes_by_ArticleRank.cypher Statistics query for Markdown summary (top ArticleRank nodes).
domains/graph-algorithms/queries/statistics/Top_nodes_by_Betweenness.cypher Statistics query for Markdown summary (top Betweenness nodes).
domains/graph-algorithms/queries/statistics/Leiden_community_overview.cypher Statistics query for Markdown summary (Leiden community sizes).
domains/graph-algorithms/queries/statistics/SCC_overview.cypher Statistics query for Markdown summary (SCC component sizes).
domains/graph-algorithms/queries/statistics/WCC_overview.cypher Statistics query for Markdown summary (WCC component sizes).
domains/graph-algorithms/queries/statistics/LCC_nodes_by_coefficient.cypher Statistics query for Markdown summary (top LCC nodes).
domains/graph-algorithms/queries/statistics/Jaccard_similarity_top_pairs.cypher Statistics query for Markdown summary (top Jaccard similarity pairs).
domains/graph-algorithms/queries/similarity/Set_Parameters.cypher Example param settings for similarity runs.
domains/graph-algorithms/queries/similarity/Similarity_1a_Estimate.cypher Similarity estimate query (GDS).
domains/graph-algorithms/queries/similarity/Similarity_1b_Statistics.cypher Similarity statistics query (GDS).
domains/graph-algorithms/queries/similarity/Similarity_1c_Mutate.cypher Similarity mutate query (GDS).
domains/graph-algorithms/queries/similarity/Similarity_1d_Stream_Mutated.cypher Streams mutated similarity results for CSV/reporting.
domains/graph-algorithms/queries/similarity/Similarity_1e_Stream.cypher Streams similarity results (non-mutated stream variant).
domains/graph-algorithms/queries/similarity/Similarity_1f_Delete_Relationships.cypher Deletes existing SIMILAR relationships for the target node label.
domains/graph-algorithms/queries/similarity/Similarity_1g_Write_Mutated.cypher Writes SIMILAR relationships from the projection back to the graph.
domains/graph-algorithms/queries/similarity/Similarity_1h_Write.cypher Direct write variant for node similarity (GDS write).
domains/graph-algorithms/queries/similarity/Similarity_1i_Write_Node_Properties.cypher Writes per-node similarity properties to the graph.
domains/graph-algorithms/queries/community-detection/Set_Parameters.cypher Example param settings for community detection.
domains/graph-algorithms/queries/community-detection/Community_Detection_Summary.cypher Consolidated community output query for reporting.
domains/graph-algorithms/queries/community-detection/Compare_Louvain_vs_Leiden_Results.cypher Comparison query for Louvain vs Leiden results.
domains/graph-algorithms/queries/community-detection/Get_all_Packages_with_a_Community_Detection_Label.cypher Helper query to list packages labeled with community detection labels.
domains/graph-algorithms/queries/community-detection/Which_package_community_spans_multiple_artifacts.cypher Package community exploration query.
domains/graph-algorithms/queries/community-detection/Which_package_community_spans_several_artifacts_and_how_are_the_packages_distributed.cypher Package community distribution query across artifacts.
domains/graph-algorithms/queries/community-detection/Which_type_community_spans_several_artifacts_and_how_are_the_types_distributed.cypher Type community distribution query across artifacts.
domains/graph-algorithms/queries/community-detection/Type_communities_with_few_members_in_foreign_packages.cypher Type community exploration query.
domains/graph-algorithms/queries/community-detection/Type_communities_that_span_the_most_packages.cypher Type community exploration query.
domains/graph-algorithms/queries/community-detection/Type_communities_that_span_the_most_packages_with_type_statistics.cypher Type community exploration query with stats.
domains/graph-algorithms/queries/community-detection/Community_Detection_1a_Louvain_Estimate.cypher Louvain estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_1b_Louvain_Statistics.cypher Louvain stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_1c_Louvain_Mutate.cypher Louvain mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_1d_Louvain_Stream.cypher Louvain stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_1d_Stream_Intermediate_Mutated.cypher Streams intermediate communities from mutated projection.
domains/graph-algorithms/queries/community-detection/Community_Detection_1e_Louvain_Write_louvainCommunityId.cypher Louvain write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_1e_Louvain_Write_intermediateLouvainCommunityId.cypher Louvain write query for intermediate IDs.
domains/graph-algorithms/queries/community-detection/Community_Detection_2a_Leiden_Estimate.cypher Leiden estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2b_Leiden_Statistics.cypher Leiden stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2b_Leiden_Tuneable_Statistics.cypher Leiden tuneable stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2c_Leiden_Mutate.cypher Leiden mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2d_Leiden_Stream.cypher Leiden stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2d_Leiden_Write_Node_Property.cypher Leiden write query for node property (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_2d_Leiden_Tuneable_Write.cypher Leiden tuneable write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3a_StronglyConnectedComponents_Estimate.cypher SCC estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3b_StronglyConnectedComponents_Statistics.cypher SCC stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3c_StronglyConnectedComponents_Mutate.cypher SCC mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3d_StrongyConnectedComponents_Stream.cypher SCC stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3e_StronglyConnectedComponents_Write.cypher SCC write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3a_WeaklyConnectedComponents_Estimate.cypher WCC estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3b_WeaklyConnectedComponents_Statistics.cypher WCC stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3c_WeaklyConnectedComponents_Mutate.cypher WCC mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3d_WeaklyConnectedComponents_Stream.cypher WCC stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_3e_WeaklyConnectedComponents_Write.cypher WCC write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_4a_Label_Propagation_Estimate.cypher Label Propagation estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_4b_Label_Propagation_Statistics.cypher Label Propagation stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_4c_Label_Propagation_Mutate.cypher Label Propagation mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_4d_Label_Propagation_Stream.cypher Label Propagation stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_4e_Label_Propagation_Write.cypher Label Propagation write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_5a_K_Core_Decomposition_Estimate.cypher K-Core estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_5b_K_Core_Decomposition_Statistics.cypher K-Core stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_5c_K_Core_Decomposition_Mutate.cypher K-Core mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_5d_K_Core_Decomposition_Stream.cypher K-Core stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_5e_K_Core_Decomposition_Write.cypher K-Core write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_6a_Approximate_Maximum_k_cut_Estimate.cypher MaxKCut estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_6c_Approximate_Maximum_k_cut_Mutate.cypher MaxKCut mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_6d_Approximate_Maximum_k_cut_Stream.cypher MaxKCut stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_7d_Modularity.cypher Modularity stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_7d_Modularity_Members.cypher Modularity members stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_7e_Write_Modularity.cypher Writes modularity back to nodes.
domains/graph-algorithms/queries/community-detection/Community_Detection_8d_Conductance.cypher Conductance stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_8d_Conductance_Members.cypher Conductance members stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_9_Community_Metrics.cypher Combined community metrics query (modularity + conductance).
domains/graph-algorithms/queries/community-detection/Community_Detection_10a_LocalClusteringCoefficient_Estimate.cypher LCC estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_10b_LocalClusteringCoefficient_Statistics.cypher LCC stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_10c_LocalClusteringCoefficient_Mutate.cypher LCC mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_10d_LocalClusteringCoefficient_Stream.cypher LCC stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_10d_LocalClusteringCoefficient_Stream_Aggregated.cypher Aggregated LCC stream query for reporting.
domains/graph-algorithms/queries/community-detection/Community_Detection_10e_LocalClusteringCoefficient_Write.cypher LCC write query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_11a_HDBSCAN_Estimate.cypher HDBSCAN estimate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_11b_HDBSCAN_Statistics.cypher HDBSCAN stats query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_11c_HDBSCAN_Mutate.cypher HDBSCAN mutate query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_11d_HDBSCAN_Stream.cypher HDBSCAN stream query (GDS).
domains/graph-algorithms/queries/community-detection/Community_Detection_11e_HDBSCAN_Write.cypher HDBSCAN write query (GDS).
domains/graph-algorithms/queries/centrality/Set_Parameters.cypher Example param settings for centrality runs.
domains/graph-algorithms/queries/centrality/Centrality_1a_List_TopPercentile.cypher Centrality helper query (top percentile).
domains/graph-algorithms/queries/centrality/Centrality_1b_List_TopPercent.cypher Centrality helper query (top percent).
domains/graph-algorithms/queries/centrality/Centrality_1c_Label_Delete.cypher Removes “top” labels for a centrality metric.
domains/graph-algorithms/queries/centrality/Centrality_1d_Label_Add.cypher Adds “top” labels for a centrality metric.
domains/graph-algorithms/queries/centrality/Centrality_2a_Page_Rank_Estimate.cypher PageRank estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_2b_Page_Rank_Statistics.cypher PageRank stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_3c_Page_Rank_Mutate.cypher PageRank mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_3d_Page_Rank_Stream.cypher PageRank stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_3e_Page_Rank_Write.cypher PageRank write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_4a_Article_Rank_Estimate.cypher ArticleRank estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_4b_Article_Rank_Statistics.cypher ArticleRank stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_4c_Article_Rank_Mutate.cypher ArticleRank mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_4d_Article_Rank_Stream.cypher ArticleRank stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_4e_Article_Rank_Write.cypher ArticleRank write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_5a_Betweeness_Estimate.cypher Betweenness estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_5b_Betweeness_Statistics.cypher Betweenness stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_5c_Betweeness_Mutate.cypher Betweenness mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_5d_Betweeness_Stream.cypher Betweenness stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_5e_Betweeness_Write.cypher Betweenness write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_6a_Cost_effective_Lazy_Forward_CELF_Estimate.cypher CELF estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_6b_Cost_effective_Lazy_Forward_CELF_Statistics.cypher CELF stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_6c_Cost_effective_Lazy_Forward_CELF_Mutate.cypher CELF mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_6d_Cost_effective_Lazy_Forward_CELF_Stream.cypher CELF stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_6e_Cost_effective_Lazy_Forward_CELF_Write.cypher CELF write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_7b_Harmonic_Closeness_Statistics.cypher Harmonic closeness stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_7c_Harmonic_Closeness_Mutate.cypher Harmonic closeness mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_7d_Harmonic_Closeness_Stream.cypher Harmonic closeness stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_7e_Harmonic_Closeness_Write.cypher Harmonic closeness write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_8b_Closeness_Statistics.cypher Closeness stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_8c_Closeness_Mutate.cypher Closeness mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_8d_Closeness_Stream.cypher Closeness stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_8e_Closeness_Write.cypher Closeness write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_9a_Hyperlink_Induced_Topic_Search_HITS_Estimate.cypher HITS estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_9b_Hyperlink_Induced_Topic_Search_HITS_Statistics.cypher HITS stats query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_9c_Hyperlink_Induced_Topic_Search_HITS_Mutate.cypher HITS mutate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_9d_Hyperlink_Induced_Topic_Search_HITS_Stream.cypher HITS stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_9d_Hyperlink_Induced_Topic_Search_HITS_Stream_Mutated.cypher HITS stream query reading mutated properties for reporting.
domains/graph-algorithms/queries/centrality/Centrality_9e_Hyperlink_Induced_Topic_Search_HITS_Write.cypher HITS write query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_10a_Bridges_Estimate.cypher Bridges estimate query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_10d_Bridges_Stream.cypher Bridges stream query (GDS).
domains/graph-algorithms/queries/centrality/Centrality_10e_Bridges_Write.cypher Writes isBridge relationship property for detected bridges.
domains/graph-algorithms/queries/centrality/Centrality_90_Summary.cypher Centrality summary query for reporting.
.github/prompts/plan-graphAlgorithmsAndNodeEmbeddingsDomains.prompt.md Planning prompt describing intended domain split and implementation steps.
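
The summary rows above mention a main report template with includes and a fallback include for missing statistics data. A minimal sketch of that selection step follows. It assumes each statistics query result is rendered to its own include file and that an empty or missing file means "no data". The function name `select_include` and the file layout are hypothetical and are not taken from summary/graphAlgorithmsSummary.sh.

```shell
#!/usr/bin/env bash

# Hypothetical helper (names are assumptions for illustration).
# Prints the path of the include that should be embedded into the report.
select_include() {
  local generated_include="$1"
  local fallback_include="$2"

  # -s is true only if the file exists and is larger than zero bytes.
  if [ -s "${generated_include}" ]; then
    echo "${generated_include}"
  else
    echo "${fallback_include}"
  fi
}
```

A caller could then embed the selected file, for example with `cat "$(select_include "${include}" "${fallback}")"`, so that the report still renders when the graph holds no data for a section.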

Comment thread domains/graph-algorithms/communityCsv.sh Outdated
Comment thread domains/graph-algorithms/summary/report.template.md
Comment thread domains/graph-algorithms/summary/report_no_graph_data.template.md
Comment thread domains/graph-algorithms/centralityCsv.sh Outdated
Comment thread domains/graph-algorithms/communityCsv.sh Outdated
Comment thread domains/graph-algorithms/similarityCsv.sh
Comment thread domains/graph-algorithms/README.md Outdated
Comment thread scripts/cleanupAfterReportGeneration.sh Outdated
JohT force-pushed the feature/introduce-graph-algorithms-domain branch 2 times, most recently from 63f5d71 to 4153e2b on May 2, 2026 12:34
JohT requested a review from Copilot on May 2, 2026 12:34

Copilot AI left a comment

Pull request overview

Copilot reviewed 45 out of 168 changed files in this pull request and generated 4 comments.

Comment thread domains/graph-algorithms/similarityCsv.sh
Comment thread README.md Outdated
Comment thread scripts/cleanupAfterReportGeneration.sh
Comment thread domains/graph-algorithms/summary/report.template.md Outdated
JohT force-pushed the feature/introduce-graph-algorithms-domain branch from 4153e2b to def1bf5 on May 2, 2026 14:42
JohT merged commit 6e3586a into main on May 2, 2026
11 checks passed
JohT deleted the feature/introduce-graph-algorithms-domain branch on May 2, 2026 15:17