
Commit 49545f4

Python version, langchain libraries and document upgrade (#1410)
* upgrade Python version to 3.12 and update setup instructions in documentation
* update dependencies in requirements.txt for compatibility and improvements
* fix: revert package versions in requirements.txt for compatibility
* chore: update package versions in requirements.txt and refactor imports for langchain modules
* fix: update requirements and remove redundant imports in QA and relationships modules
1 parent acc0dfc commit 49545f4

File tree

10 files changed (+91, -60 lines)


README.md

Lines changed: 24 additions & 1 deletion
@@ -8,7 +8,30 @@ Transform unstructured data (PDFs, DOCs, TXT, YouTube videos, web pages, etc.) i
 
 This application allows you to upload files from various sources (local machine, GCS, S3 bucket, or web sources), choose your preferred LLM model, and generate a Knowledge Graph.
 
----
+## Getting Started
+
+### **Prerequisites**
+- **Python 3.12 or higher** (for local/separate backend deployment)
+- Neo4j Database **5.23 or later** with APOC installed.
+  - **Neo4j Aura** databases (including the free tier) are supported.
+  - If using **Neo4j Desktop**, you will need to deploy the backend and frontend separately (docker-compose is not supported).
+
+#### **Backend Setup**
+1. Create the `.env` file in the `backend` folder by copying `backend/example.env`.
+2. Preconfigure user credentials in the `.env` file to bypass the login dialog:
+   ```bash
+   NEO4J_URI=<your-neo4j-uri>
+   NEO4J_USERNAME=<your-username>
+   NEO4J_PASSWORD=<your-password>
+   NEO4J_DATABASE=<your-database-name>
+   ```
+3. Run:
+   ```bash
+   cd backend
+   python3.12 -m venv venv
+   source venv/bin/activate  # On Windows: venv\Scripts\activate
+   pip install -r requirements.txt -c constraints.txt
+   uvicorn score:app --reload
 
 ## Key Features
 
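The new backend setup reads the Neo4j connection settings from `backend/.env`. The repo itself loads these via python-dotenv; purely as an illustration of the file format the README now documents, here is a minimal stand-in parser (keys from the diff above, placeholder values):

```python
# Minimal .env parser sketch (hypothetical; the project uses python-dotenv).
# Keys are the ones the updated README documents; values are placeholders.
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """
NEO4J_URI=neo4j+s://example.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_DATABASE=neo4j
"""
print(parse_env(sample)["NEO4J_USERNAME"])  # -> neo4j
```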

backend/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM python:3.10-slim
+FROM python:3.12-slim
 WORKDIR /code
 ENV PORT 8000
 EXPOSE 8000
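The base image bump from `python:3.10-slim` to `python:3.12-slim` matches the new documented minimum. As a hedged illustration (a hypothetical guard, not part of this commit), a backend entrypoint could fail fast on an older interpreter:

```python
import sys

# Hypothetical startup guard (not in the repo): refuse to run on an
# interpreter older than the Python 3.12 this upgrade standardizes on.
REQUIRED = (3, 12)

def interpreter_ok(version=None) -> bool:
    """Return True when the (major, minor) version meets the requirement."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) >= REQUIRED

print(interpreter_ok((3, 12, 0)))   # True
print(interpreter_ok((3, 10, 14)))  # False
```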

backend/README.md

Lines changed: 12 additions & 2 deletions
@@ -1,6 +1,11 @@
 # Project Overview
 Welcome to our project! This project is built using FastAPI framework to create a fast and modern API with Python.
 
+## Prerequisites
+
+- Python 3.12 or higher
+- pip (Python package manager)
+
 ## Feature
 API Endpoint : This project provides various API endpoint to perform specific tasks.
 Data Validation : Utilize FastAPI data validation and serialization feature.
@@ -16,9 +21,14 @@ Follow these steps to set up and run the project locally:
 
 > cd llm-graph-builder
 
-2. Install Dependency :
+2. Create a virtual environment (recommended):
+
+> python3.12 -m venv venv
+> source venv/bin/activate  # On Windows: venv\Scripts\activate
+
+3. Install Dependency :
 
-> pip install -t requirements.txt
+> pip install -r requirements.txt -c constraints.txt
 
 ## Run backend project using unicorn
 Run the server:

backend/requirements.txt

Lines changed: 40 additions & 41 deletions
@@ -1,65 +1,64 @@
-accelerate==1.7.0
-asyncio==3.4.3
-boto3==1.38.36
-botocore==1.38.36
-certifi==2025.6.15
-fastapi==0.115.12
+accelerate==1.12.0
+asyncio==4.0.0
+boto3==1.40.23
+botocore==1.40.23
+certifi==2025.8.3
+fastapi==0.116.1
 fastapi-health==0.4.0
 fireworks-ai==0.15.12
 google-api-core==2.25.1
 google-auth==2.40.3
 google_auth_oauthlib==1.2.2
 google-cloud-core==2.4.3
-json-repair==0.39.1
+json-repair==0.44.1
 pip-install==1.3.5
-langchain==0.3.25
-langchain-aws==0.2.25
-langchain-anthropic==0.3.15
-langchain-fireworks==0.3.0
-langchain-community==0.3.25
-langchain-core==0.3.65
-langchain-experimental==0.3.4
-langchain-google-vertexai==2.0.25
-langchain-groq==0.3.2
-langchain-openai==0.3.23
-langchain-text-splitters==0.3.8
-langchain-huggingface==0.3.0
+langchain==1.1.2
+langchain-aws==1.1.0
+langchain-anthropic==1.2.0
+langchain-fireworks==1.1.0
+langchain-community==0.4.1
+langchain-core==1.1.1
+langchain-experimental==0.4.0
+langchain-google-vertexai==3.1.1
+langchain-groq==1.1.0
+langchain-openai==1.1.0
+langchain-text-splitters==1.0.0
+langchain-huggingface==1.1.0
+langchain-classic==1.0.0
 langdetect==1.0.9
-langsmith==0.3.45
-langserve==0.3.1
-neo4j-rust-ext==5.28.1.0
+langsmith==0.4.55
+langserve==0.3.3
+neo4j-rust-ext==5.28.2.1
 nltk==3.9.1
-openai==1.86.0
-opencv-python==4.11.0.86
+openai==2.9.0
 psutil==7.0.0
-pydantic==2.11.7
-python-dotenv==1.1.0
+pydantic==2.12.5
+python-dotenv==1.1.1
 python-magic==0.4.27
 PyPDF2==3.0.1
-PyMuPDF==1.26.1
-starlette==0.46.2
-sse-starlette==2.3.6
+PyMuPDF==1.26.4
+starlette==0.47.3
+sse-starlette==3.0.2
 starlette-session==0.4.3
 tqdm==4.67.1
-unstructured[all-docs]
-unstructured==0.17.2
-unstructured-client==0.36.0
+unstructured[all-docs]==0.18.14
+unstructured-client==0.42.3
 unstructured-inference==1.0.5
-urllib3==2.4.0
-uvicorn==0.34.3
+urllib3==2.5.0
+uvicorn==0.35.0
 gunicorn==23.0.0
 wikipedia==1.4.0
-wrapt==1.17.2
+wrapt==1.17.3
 yarl==1.20.1
-youtube-transcript-api==1.1.0
+youtube-transcript-api==1.2.2
 zipp==3.23.0
-sentence-transformers==5.0.0
+sentence-transformers==5.1.0
 google-cloud-logging==3.12.1
 pypandoc==1.15
-graphdatascience==1.15.1
-Secweb==1.18.1
-ragas==0.3.1
+graphdatascience==1.18a1
+Secweb==1.25.2
+ragas==0.3.2
 rouge_score==0.1.2
-langchain-neo4j==0.4.0
+langchain-neo4j==0.6.0
 pypandoc-binary==1.15
 chardet==5.2.0
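Several of the pins above cross a major-version boundary (for example langchain 0.3.25 to 1.1.2, openai 1.86.0 to 2.9.0, sse-starlette 2.3.6 to 3.0.2), which is what forces the import refactors in the backend source files of this commit. A throwaway sketch (not repo code) for flagging such bumps, using old/new pairs copied from this diff:

```python
# Throwaway sketch: flag pins whose major version changed, using a few
# old/new version pairs copied from the requirements.txt diff above.
def major_bump(old: str, new: str) -> bool:
    """True when the leading version component differs."""
    return old.split(".")[0] != new.split(".")[0]

pins = {
    "langchain": ("0.3.25", "1.1.2"),
    "fastapi": ("0.115.12", "0.116.1"),
    "openai": ("1.86.0", "2.9.0"),
    "sse-starlette": ("2.3.6", "3.0.2"),
}
majors = [name for name, (old, new) in pins.items() if major_bump(old, new)]
print(majors)  # -> ['langchain', 'openai', 'sse-starlette']
```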

backend/src/QA_integration.py

Lines changed: 3 additions & 4 deletions
@@ -11,12 +11,11 @@
 from langchain_neo4j import Neo4jVector
 from langchain_neo4j import Neo4jChatMessageHistory
 from langchain_neo4j import GraphCypherQAChain
-from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.runnables import RunnableBranch
-from langchain.retrievers import ContextualCompressionRetriever
-from langchain_community.document_transformers import EmbeddingsRedundantFilter
-from langchain.retrievers.document_compressors import EmbeddingsFilter, DocumentCompressorPipeline
+from langchain_classic.retrievers import ContextualCompressionRetriever
+from langchain_classic.retrievers.document_compressors import EmbeddingsFilter, DocumentCompressorPipeline
 from langchain_text_splitters import TokenTextSplitter
 from langchain_core.messages import HumanMessage, AIMessage
 from langchain_community.chat_message_histories import ChatMessageHistory
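The pattern in this hunk repeats across the backend: `langchain.prompts` moves to `langchain_core.prompts`, `langchain.docstore.document` to `langchain_core.documents`, and the retriever/compressor classes to the new `langchain-classic` package. A hypothetical helper (not part of the repo) expressing that mapping, handy when auditing the remaining files:

```python
# Hypothetical audit helper, not repo code: maps the old langchain 0.3.x
# module paths touched by this commit to their langchain 1.x homes.
MOVES = {
    "langchain.prompts": "langchain_core.prompts",
    "langchain.docstore.document": "langchain_core.documents",
    "langchain.retrievers": "langchain_classic.retrievers",
    "langchain.retrievers.document_compressors":
        "langchain_classic.retrievers.document_compressors",
}

def rewrite_import(line: str) -> str:
    """Rewrite a `from X import Y` line to the new module path, if X moved."""
    parts = line.split()
    if len(parts) >= 4 and parts[0] == "from" and parts[2] == "import":
        new = MOVES.get(parts[1])
        if new:
            parts[1] = new
            return " ".join(parts)
    return line

print(rewrite_import("from langchain.docstore.document import Document"))
# -> from langchain_core.documents import Document
```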

backend/src/create_chunks.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 from langchain_text_splitters import TokenTextSplitter
-from langchain.docstore.document import Document
+from langchain_core.documents import Document
 from langchain_neo4j import Neo4jGraph
 import logging
 from src.document_sources.youtube import get_chunks_with_timestamps, get_calculated_timestamps

backend/src/document_sources/youtube.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-from langchain.docstore.document import Document
+from langchain_core.documents import Document
 from src.shared.llm_graph_builder_exception import LLMGraphBuilderException
 from youtube_transcript_api import YouTubeTranscriptApi
 from youtube_transcript_api.proxies import GenericProxyConfig

backend/src/llm.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 import logging
-from langchain.docstore.document import Document
+from langchain_core.documents import Document
 import os
 from langchain_openai import ChatOpenAI, AzureChatOpenAI
 from langchain_google_vertexai import ChatVertexAI

backend/src/make_relationships.py

Lines changed: 1 addition & 6 deletions
@@ -1,6 +1,5 @@
 from langchain_neo4j import Neo4jGraph
-from langchain.docstore.document import Document
-from src.shared.common_fn import load_embedding_model,execute_graph_query
+from langchain_core.documents import Document
 from src.shared.common_fn import load_embedding_model,execute_graph_query
 import logging
 from typing import List
@@ -34,7 +33,6 @@ def merge_relationship_between_chunk_and_entites(graph: Neo4jGraph, graph_docume
 MERGE (c)-[:HAS_ENTITY]->(n)
 """
 execute_graph_query(graph,unwind_query, params={"batch_data": batch_data})
-execute_graph_query(graph,unwind_query, params={"batch_data": batch_data})
 
 
 def create_chunk_embeddings(graph, chunkId_chunkDoc_list, file_name):
@@ -61,7 +59,6 @@ def create_chunk_embeddings(graph, chunkId_chunkDoc_list, file_name):
 MERGE (c)-[:PART_OF]->(d)
 """
 execute_graph_query(graph,query_to_create_embedding, params={"fileName":file_name, "data":data_for_query})
-execute_graph_query(graph,query_to_create_embedding, params={"fileName":file_name, "data":data_for_query})
 
 def create_relation_between_chunks(graph, file_name, chunks: List[Document])->list:
 logging.info("creating FIRST_CHUNK and NEXT_CHUNK relationships between chunks")
@@ -130,7 +127,6 @@ def create_relation_between_chunks(graph, file_name, chunks: List[Document])->li
 MERGE (c)-[:PART_OF]->(d)
 """
 execute_graph_query(graph,query_to_create_chunk_and_PART_OF_relation, params={"batch_data": batch_data})
-execute_graph_query(graph,query_to_create_chunk_and_PART_OF_relation, params={"batch_data": batch_data})
 
 query_to_create_FIRST_relation = """
 UNWIND $relationships AS relationship
@@ -140,7 +136,6 @@ def create_relation_between_chunks(graph, file_name, chunks: List[Document])->li
 MERGE (d)-[:FIRST_CHUNK]->(c))
 """
 execute_graph_query(graph,query_to_create_FIRST_relation, params={"f_name": file_name, "relationships": relationships})
-execute_graph_query(graph,query_to_create_FIRST_relation, params={"f_name": file_name, "relationships": relationships})
 
 query_to_create_NEXT_CHUNK_relation = """
 UNWIND $relationships AS relationship
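These hunks delete back-to-back duplicate `execute_graph_query` calls. Because each query uses Cypher `MERGE`, the second call never changed the graph; it only doubled the write round-trips. A minimal in-memory stand-in (hypothetical, not the repo's API) showing that idempotence:

```python
# Hypothetical in-memory stand-in for the MERGE semantics used above
# (not the repo's execute_graph_query): MERGE only creates what is absent.
def merge_has_entity(edges: set, batch_data: list) -> set:
    """Mimic UNWIND $batch_data ... MERGE (c)-[:HAS_ENTITY]->(n)."""
    for row in batch_data:
        edges.add((row["chunk_id"], "HAS_ENTITY", row["node_id"]))
    return edges

batch = [{"chunk_id": "c1", "node_id": "n1"},
         {"chunk_id": "c1", "node_id": "n2"}]
edges = set()
merge_has_entity(edges, batch)
after_once = set(edges)
merge_has_entity(edges, batch)  # the duplicated call this commit removes
print(edges == after_once)      # True: same graph, just a wasted round-trip
```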

docs/project_docs.adoc

Lines changed: 7 additions & 2 deletions
@@ -21,6 +21,11 @@ This document provides comprehensive documentation for the Neo4j llm-graph-build
 
 == Local Setup and Execution
 
+Prerequisites:
+- Python 3.12 or higher
+- Node.js 20 or higher
+- Docker (optional, for containerized deployment)
+
 Run Docker Compose to build and start all components:
 ....
 docker-compose up --build
@@ -38,8 +43,8 @@ yarn run dev
 ** For backend
 ....
 cd backend
-python -m venv envName
-source envName/bin/activate
+python3.12 -m venv venv
+source venv/bin/activate
 pip install -r requirements.txt
 uvicorn score:app --reload
 ....
