ChatGPT_Law_PropertyGraph is a modular Java-based pipeline that transforms legal PDF documents into structured knowledge graphs and compliance checklists. The system has been developed in the context of my Diploma Thesis and uses the EU AI Act as a primary case study.
The system produces:
- A property graph in GraphML format
- A Markdown checklist outlining obligations, rights, and prohibitions
High-level architecture of the legal document analysis pipeline.
The repository consists of three independent Maven projects, each located in its own subdirectory:
/LegislativeTextParser
/LawToPropertyGraphGenerator
/GraphMLToChecklist
- LegislativeTextParser: Extracts and cleans legal text from PDF files, preserving legal structure (Law → Chapter → Article → Paragraph).
- LawToPropertyGraphGenerator: Uses a Large Language Model (GPT-4o-mini) to extract legal entities and relationships, producing a property graph.
- GraphMLToChecklist: Generates a Markdown checklist summarizing legal responsibilities.
LegislativeTextParser
- Parses and cleans legal PDF documents.
- Extracts hierarchical structure: Law → Chapter → Article → Paragraph.
- Outputs:
  - law_structure.json
  - entities.txt to: src/resources/output/entities.txt
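To make the Law → Chapter → Article → Paragraph hierarchy concrete, here is a minimal sketch of how it could be modelled in Java. The record names, field names, and the `paragraphCount` helper are assumptions made for illustration; the parser's actual classes and the exact layout of law_structure.json may differ.

```java
import java.util.List;

// Hypothetical model of the hierarchy; names are illustrative, not the project's own.
record Paragraph(int number, String text) {}

record Article(int number, String title, List<Paragraph> paragraphs) {}

record Chapter(String number, String title, List<Article> articles) {}

record Law(String title, List<Chapter> chapters) {
    // Counts every paragraph by walking the hierarchy top-down.
    int paragraphCount() {
        return chapters.stream()
                .flatMap(chapter -> chapter.articles().stream())
                .mapToInt(article -> article.paragraphs().size())
                .sum();
    }
}
```

Nesting immutable records like this keeps each level's children in document order, which is one straightforward way to serialize the structure to JSON later.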
LawToPropertyGraphGenerator
- Uses OpenAI GPT-4o-mini to extract legal entities and their verbal relations.
- Applies Levenshtein distance for entity normalization.
- Constructs a unified property graph.
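The Levenshtein-based normalization step can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the class name `EntityNormalizer`, the `maxDistance` threshold, the case-insensitive comparison, and the "first seen name wins" rule are all hypothetical choices.

```java
import java.util.ArrayList;
import java.util.List;

final class EntityNormalizer {
    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Maps each name to the first previously seen name within maxDistance edits
    // (compared case-insensitively); otherwise the name becomes a new canonical form.
    static List<String> normalize(List<String> names, int maxDistance) {
        List<String> canonical = new ArrayList<>();
        List<String> result = new ArrayList<>();
        for (String name : names) {
            String match = null;
            for (String candidate : canonical) {
                if (levenshtein(name.toLowerCase(), candidate.toLowerCase()) <= maxDistance) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                canonical.add(name);
                match = name;
            }
            result.add(match);
        }
        return result;
    }
}
```

With a threshold of 1, a call such as `EntityNormalizer.normalize(List.of("provider", "providers", "deployer"), 1)` would merge "providers" into "provider" while keeping "deployer" separate. A small threshold is the safer default, since short legal terms can be only a few edits apart and still mean different things.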
- Create a file named env.properties in the root of this project with the following content:
  GPT_API_URL=https://api.openai.com/v1/chat/completions
  GPT_API_KEY=your_openai_key_here
- Copy entities.txt from the first project's output:
  From: /LegislativeTextParser/src/resources/output/entities.txt
  To: /LawToPropertyGraphGenerator/src/resources/output/entities.txt
- Output: final.graphML in src/resources/output/
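For reference, a file in the env.properties format described above can be read with the standard `java.util.Properties` class. The sketch below is an assumption about how such loading might look; the class name `EnvConfig` and the fail-fast validation are hypothetical and not taken from the project.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class EnvConfig {
    // Reads the properties file and fails fast if a required key is missing or blank.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(file)) {
            props.load(reader);
        }
        for (String key : new String[] {"GPT_API_URL", "GPT_API_KEY"}) {
            String value = props.getProperty(key);
            if (value == null || value.isBlank()) {
                throw new IllegalStateException("Missing " + key + " in " + file);
            }
        }
        return props;
    }
}
```

Calling `EnvConfig.load(Path.of("env.properties"))` from the project root would then expose the two values through `getProperty`. Keeping the key in a local file also makes it easy to exclude from version control.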
GraphMLToChecklist
- Loads the GraphML file into a JavaFX interface.
- Generates a Markdown checklist summarizing the graph nodes.
- On startup, load the following files:
  - GraphML: /LawToPropertyGraphGenerator/src/resources/output/final.graphML
  - JSON: Output from /LegislativeTextParser (e.g. law_structure.json)
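The checklist generation step can be pictured with a small sketch that turns already-loaded graph nodes into Markdown. Everything here is assumed for illustration: the `Node` record, its `category` values (for example "Obligation", "Right", "Prohibition"), and the heading and checkbox layout are hypothetical, and GraphML parsing is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ChecklistSketch {
    // Hypothetical node shape: a category plus the text of the checklist item.
    record Node(String category, String text) {}

    // Groups nodes by category (in first-seen order) and emits one
    // Markdown section with checkbox items per category.
    static String toMarkdown(List<Node> nodes) {
        Map<String, List<String>> byCategory = new LinkedHashMap<>();
        for (Node node : nodes) {
            byCategory.computeIfAbsent(node.category(), k -> new ArrayList<>()).add(node.text());
        }
        StringBuilder md = new StringBuilder();
        for (Map.Entry<String, List<String>> entry : byCategory.entrySet()) {
            md.append("## ").append(entry.getKey()).append("\n");
            for (String item : entry.getValue()) {
                md.append("- [ ] ").append(item).append("\n");
            }
            md.append("\n");
        }
        return md.toString();
    }
}
```

A `LinkedHashMap` is used so that sections appear in the order their categories are first encountered, which keeps the output stable between runs on the same graph.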
# 1. Run the legislative parser
cd LegislativeTextParser
mvn clean install
java -jar target/legislative-text-parser.jar
# 2. Copy the entities file to the second tool
cp src/resources/output/entities.txt ../LawToPropertyGraphGenerator/src/resources/output/
# 3. Add API credentials to the second tool
# Create a file named "env.properties" in the root of LawToPropertyGraphGenerator
# and paste the following two lines into it (file content, not shell commands):
#   GPT_API_URL=https://api.openai.com/v1/chat/completions
#   GPT_API_KEY=your_openai_key_here
# 4. Run the property graph generator
cd ../LawToPropertyGraphGenerator
mvn clean install
java -jar target/law-to-property-graph-generator.jar
# 5. Start the checklist generator and load the required files
cd ../GraphMLToChecklist
mvn clean install
java -jar target/graphml-to-checklist.jar
- Java 21
- Maven
- JavaFX
- OpenAI GPT-4o-mini API
- GraphML (yEd-compatible)
- JSON & Markdown
This project was developed as part of the Diploma Thesis of
Papadopoulos Konstantinos, Undergraduate Student
Department of Computer Science and Engineering, University of Ioannina
Supervisor: Professor Panos Vasiliadis
