This repository contains the code and experiments for a defense framework against knowledge poisoning attacks in GraphRAG-style multi-hop QA systems. Our method, HoG-GRAG (Hop-wise Guard for GraphRAG), improves robustness by decomposing multi-hop questions into ordered subqueries, detecting poisoning-induced inconsistencies during hop-wise execution, and repairing corrupted retrieved subgraphs through targeted pruning and minimal evidence recovery.
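The decompose, detect, and repair loop described above can be illustrated with a toy sketch. This is not the repository's implementation: the `Triple` type, the function names, the example graph, and the majority-vote rule used for detection and pruning are all assumptions chosen to keep the example small, and the actual detector and repairer in `src/` may work quite differently.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str
    source: str  # hypothetical provenance tag, for illustration only


# Toy knowledge graph with one injected (poisoned) triple.
GRAPH = [
    Triple("Inception", "directed_by", "Christopher Nolan", "doc1"),
    Triple("Christopher Nolan", "born_in", "London", "doc2"),
    Triple("Christopher Nolan", "born_in", "London", "doc3"),
    Triple("Christopher Nolan", "born_in", "Paris", "poisoned_doc"),
    Triple("London", "located_in", "United Kingdom", "doc4"),
    Triple("Paris", "located_in", "France", "doc5"),
]


def retrieve(graph, entity, relation):
    """Return every triple that answers one subquery (entity, relation)."""
    return [t for t in graph if t.head == entity and t.relation == relation]


def detect_conflict(triples):
    """Flag a hop whose retrieved triples disagree on the answer entity."""
    return len({t.tail for t in triples}) > 1


def repair(triples):
    """Prune triples that disagree with the majority answer.

    Majority voting is an assumed stand-in for targeted pruning.
    """
    majority, _ = Counter(t.tail for t in triples).most_common(1)[0]
    return [t for t in triples if t.tail == majority]


def answer(graph, start, relations):
    """Execute ordered subqueries hop by hop, repairing flagged hops."""
    entity, flagged = start, []
    for hop, relation in enumerate(relations):
        triples = retrieve(graph, entity, relation)
        if not triples:
            return None, flagged
        if detect_conflict(triples):
            flagged.append(hop)
            triples = repair(triples)
        entity = triples[0].tail
    return entity, flagged


# "In which country was the director of Inception born?" as three ordered hops.
print(answer(GRAPH, "Inception", ["directed_by", "born_in", "located_in"]))
# → ('United Kingdom', [1])
```

Checking each hop separately means the injected `Paris` triple is caught at hop 1, before it can redirect the final hop to `France`.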
We evaluate this framework on multi-hop question answering using:
- Datasets
- GraphRAG Pipelines
This repository is organized around the two core components of Auto-Immune GraphRAG, detection and repair, with supporting modules for evaluation.
HoG-GRAG/
│
├── prompts/
│   ├── Question Paraphrasing.md
│   └── Response Evaluation.md
├── src/
│   ├── detection.py
│   ├── repairer.py
│   ├── trace_analysis.py
│   └── evaluation.py
├── baselines/
│   ├── Query_Paraphrasing.py
│   └── Perplexity_based.py
├── requirements.txt
└── README.md
If you use this methodology in your research, please cite:
Havva Alizadeh Noughabi, Fattane Zarrinkalam, and Ali Dehghantanha. "Defense Against Knowledge Poisoning Attack on GraphRAG." Accepted at the Annual Meeting of the Association for Computational Linguistics (ACL 2026).
