Fix: duplicate_graph defined at module level instead of as ParsingService method by arnavp27 · Pull Request #660 · potpie-ai/potpie

arnavp27 · 2026-02-27T21:38:40Z

Summary

Fixes a silent structural bug where duplicate_graph was accidentally defined
as a module-level function instead of as a method of the ParsingService class,
making it completely unreachable through normal usage.

Root Cause

In app/modules/parsing/graph_construction/parsing_service.py, the
duplicate_graph function was defined at column 0 (no indentation), placing
it outside the ParsingService class body. Every other method in the class is
correctly indented at 4 spaces.

# BEFORE (broken) — defined at module level, outside the class
async def duplicate_graph(self, old_repo_id: str, new_repo_id: str):
    await self.search_service.clone_search_indices(old_repo_id, new_repo_id)
    ...

# AFTER (fixed) — correctly indented as a class method
    async def duplicate_graph(self, old_repo_id: str, new_repo_id: str):
        await self.search_service.clone_search_indices(old_repo_id, new_repo_id)
        ...

What Breaks at Runtime

Two failure modes result from this bug:

AttributeError on the class - Any caller doing
service.duplicate_graph(old_id, new_id) on a ParsingService instance
would immediately get:

AttributeError: 'ParsingService' object has no attribute 'duplicate_graph'

because the method simply does not exist on the class.

AttributeError inside the function - If called directly as a standalone
function (e.g. duplicate_graph(some_obj, old_id, new_id)), the first access to
self.search_service would raise AttributeError unless the caller manually
passed a ParsingService instance (or an object exposing the same attributes)
as the first argument, which is not how it was ever intended to be called.

The function references self.search_service (line 622) and
self.inference_service (lines 627, 670). Both are instance attributes set in
ParsingService.__init__, confirming it was always intended to be an instance
method of the class.
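
The two failure modes can be reproduced in isolation. The following is a minimal sketch, not the project's code: `Service` is a hypothetical stand-in for ParsingService, and the function body is reduced to a single attribute access.

```python
class Service:
    """Hypothetical stand-in for ParsingService."""

    def __init__(self):
        self.search_service = "search"

    def ping(self):
        return "pong"


# Defined at column 0, so this is a module-level function and not a method,
# even though its first parameter is named `self`.
def duplicate_graph(self, old_repo_id, new_repo_id):
    return (self.search_service, old_repo_id, new_repo_id)


service = Service()

# Failure mode 1: the class has no such attribute.
has_method = hasattr(Service, "duplicate_graph")
try:
    service.duplicate_graph("old", "new")
    method_error = None
except AttributeError as exc:
    method_error = str(exc)

# Failure mode 2: a direct call fails on the first `self.` access
# when the first argument lacks the expected attributes.
try:
    duplicate_graph(object(), "old", "new")
    direct_error = None
except AttributeError as exc:
    direct_error = str(exc)

# The call only works if an instance is passed explicitly as `self`.
direct_result = duplicate_graph(service, "old", "new")
```

Nothing in the language flags the misplaced definition at import time, which is why the bug stays silent until a caller reaches for the method.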

Fix

Re-indented the entire duplicate_graph function body (lines 621–711) by 4
spaces so it is correctly nested inside the ParsingService class.

Test Added

Added a regression test in tests/unit/parsing/test_parsing_service_method.py
that directly proves the bug and guards against regression:

from app.modules.parsing.graph_construction.parsing_service import ParsingService


def test_duplicate_graph_is_a_method_of_parsing_service():
    assert hasattr(ParsingService, "duplicate_graph"), (
        "duplicate_graph is not a method of ParsingService. "
        "It is defined at module level due to missing indentation."
    )

This test fails on the original code and passes after the fix. It requires
no external dependencies (no database, no Neo4j, no mocks) — it purely validates
class structure.
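
A slightly stricter structural check is possible with the standard `inspect` module. The sketch below is an assumption about how such a check might look, not part of this PR; `FakeParsingService` and `is_async_method` are hypothetical names used so the example runs without importing the application.

```python
import inspect


class FakeParsingService:
    """Hypothetical stand-in; the real test would import ParsingService."""

    async def duplicate_graph(self, old_repo_id: str, new_repo_id: str):
        return None

    def sync_helper(self):
        return None


def is_async_method(cls, name):
    # getattr_static avoids triggering descriptors or __getattr__ hooks,
    # so the check looks only at what is actually defined on the class.
    attr = inspect.getattr_static(cls, name, None)
    return attr is not None and inspect.iscoroutinefunction(attr)


found_async = is_async_method(FakeParsingService, "duplicate_graph")
found_sync = is_async_method(FakeParsingService, "sync_helper")
found_missing = is_async_method(FakeParsingService, "not_defined")
```

Compared with a bare hasattr, this also fails if the attribute exists but is no longer a coroutine function.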

Files Changed

app/modules/parsing/graph_construction/parsing_service.py - indentation fix
tests/unit/parsing/test_parsing_service_method.py - new regression test

Summary by CodeRabbit

Bug Fixes
- Improved reliability of repository graph duplication with robust error propagation and clearer diagnostics for failures.
Tests
- Added regression tests to verify graph duplication behavior and prevent regressions.

coderabbitai · 2026-02-27T21:38:58Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a9d7256 and 38ffc8f.

📒 Files selected for processing (1)

app/modules/parsing/graph_construction/parsing_service.py

Walkthrough

The top-level duplicate_graph function was moved into ParsingService as an async method, duplicate_graph(self, old_repo_id, new_repo_id). It preserves the batched node/relationship copying and now clones search indices first; errors are now raised as ParsingServiceError. A unit test ensures the method exists on the class.

Changes

Cohort / File(s)	Summary
ParsingService Refactor `app/modules/parsing/graph_construction/parsing_service.py`	Moved `duplicate_graph` from module level to an async method of `ParsingService`; awaits `clone_search_indices` first; performs batched node and relationship duplication using separate driver sessions; on exception logs and raises `ParsingServiceError`.
Test Addition `tests/unit/parsing/test_parsing_service_method.py`	Added regression test asserting `ParsingService` exposes a `duplicate_graph` attribute (method).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐇 I hopped through code from field to tree,
A function found a class to be.
Async whiskers, batches tight,
Errors wrapped and logs alight.
Together now—repo twins gleam bright.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and accurately describes the main structural fix: moving duplicate_graph from module-level into the ParsingService class where it belongs.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

app/modules/intelligence/tools/code_changes_manager.py (1)
5395-5402: Avoid per-file project/service initialization inside the diff loop.

Line [5395] re-queries Project and Line [5401] recreates CodeProviderService for every file. For larger change sets, this can add significant avoidable latency and DB load. Consider resolving these once before the loop and reusing them.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/modules/intelligence/tools/code_changes_manager.py` around lines 5395 -
5402, The code is re-querying Project and instantiating CodeProviderService
inside the per-file diff loop; move the Project lookup
(db.query(Project).filter(Project.id == project_id).first()) and the
CodeProviderService creation (CodeProviderService(db)) out of the loop so they
are resolved once and reused; if the loop may contain multiple project_id
values, build a small cache keyed by project_id to reuse the same Project
instance and a single CodeProviderService per DB/session instead of recreating
them per file.
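
The caching suggestion above can be sketched as follows. This is an illustration under stated assumptions: `load_project` stands in for the db.query(Project) lookup, and `build_diffs` and the file tuples are hypothetical, not the code in code_changes_manager.py.

```python
lookup_count = 0


def load_project(project_id):
    """Stand-in for the per-project database lookup (hypothetical)."""
    global lookup_count
    lookup_count += 1
    return {"id": project_id}


def build_diffs(files):
    # Resolve each project at most once and reuse it for every file.
    project_cache = {}
    results = []
    for path, project_id in files:
        project = project_cache.get(project_id)
        if project is None:
            project = load_project(project_id)
            project_cache[project_id] = project
        results.append((path, project["id"]))
    return results


diffs = build_diffs([("a.py", 1), ("b.py", 1), ("c.py", 2)])
```

With three files across two projects, the lookup runs twice instead of three times; the saving grows with the size of the change set.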

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/modules/parsing/graph_construction/parsing_service.py`:
- Around line 705-710: In duplicate_graph(), the except block currently logs the
exception with logger.exception but swallows it; update the handler to re-raise
a descriptive error after logging (e.g., raise ParsingError(f"Failed to
duplicate graph: {e}") from e or re-raise the original exception) so callers
know the operation failed; locate the try/except inside duplicate_graph and
replace the silent swallow with a raise that wraps the caught exception to
preserve context.
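
A minimal sketch of the log-then-re-raise pattern described in this comment. `ParsingServiceError` is defined locally as a stand-in, and `copy_nodes` and `duplicate_graph_sketch` are hypothetical helpers; the real method is async and talks to Neo4j.

```python
import logging

logger = logging.getLogger(__name__)


class ParsingServiceError(Exception):
    """Local stand-in for the project's error type."""


def copy_nodes(fail):
    # Hypothetical unit of work that may fail.
    if fail:
        raise RuntimeError("database unavailable")
    return 3


def duplicate_graph_sketch(fail=False):
    try:
        return copy_nodes(fail)
    except Exception as e:
        # Log with traceback, then surface the failure to the caller.
        logger.exception("Error duplicating graph")
        # `from e` keeps the original exception as __cause__.
        raise ParsingServiceError(f"Failed to duplicate graph: {e}") from e
```

Chaining with `from e` means the caller sees one service-level exception type while the original traceback is still available for debugging.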

---

Nitpick comments:
In `@app/modules/intelligence/tools/code_changes_manager.py`:
- Around line 5395-5402: The code is re-querying Project and instantiating
CodeProviderService inside the per-file diff loop; move the Project lookup
(db.query(Project).filter(Project.id == project_id).first()) and the
CodeProviderService creation (CodeProviderService(db)) out of the loop so they
are resolved once and reused; if the loop may contain multiple project_id
values, build a small cache keyed by project_id to reuse the same Project
instance and a single CodeProviderService per DB/session instead of recreating
them per file.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1571110 and b20915a.

📒 Files selected for processing (4)

app/modules/intelligence/agents/chat_agents/history_processor.py
app/modules/intelligence/tools/code_changes_manager.py
app/modules/parsing/graph_construction/parsing_service.py
tests/unit/parsing/test_parsing_service_method.py


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/modules/parsing/graph_construction/parsing_service.py`:
- Around line 621-625: Move the await
self.search_service.clone_search_indices(old_repo_id, new_repo_id) call inside
the existing try block in duplicate_graph so any exceptions are caught and
rethrown as ParsingServiceError; locate the duplicate_graph method and the
try/except that currently wraps node_batch_size/relationship_batch_size and
ensure clone_search_indices is invoked before other work but after entering the
try, and that the except block catches errors and raises ParsingServiceError
(using the same pattern/format as other error handling in this method).
- Around line 627-628: The async method duplicate_graph() currently opens a
synchronous Neo4j session with with self.inference_service.driver.session(),
which blocks the event loop; fix by either switching InferenceService.driver to
use the async Neo4j driver and replace that block with an async context (async
with self.inference_service.driver.session() as session and await calls), or
keep the sync driver but move the blocking session work into a threadpool via
asyncio.get_running_loop().run_in_executor(...) (i.e., wrap the whole session
usage and any session.run()/transaction calls inside a function submitted to
run_in_executor). Update duplicate_graph() accordingly and ensure any subsequent
calls that expect results are awaited/returned from the executor task.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b20915a and a9d7256.

📒 Files selected for processing (2)

app/modules/parsing/graph_construction/parsing_service.py
tests/unit/parsing/test_parsing_service_method.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/unit/parsing/test_parsing_service_method.py

coderabbitai · 2026-02-27T22:35:58Z

+            with self.inference_service.driver.session() as session:
+                offset = 0


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find the method around lines 627-628
echo "=== Method signature and context around lines 627-628 ==="
sed -n '600,650p' app/modules/parsing/graph_construction/parsing_service.py | head -60

# Find the method definition that contains line 627
echo -e "\n=== Searching for method definition before line 627 ==="
sed -n '580,627p' app/modules/parsing/graph_construction/parsing_service.py | tail -50

Repository: potpie-ai/potpie

Length of output: 5089

🏁 Script executed:

#!/bin/bash
# Find InferenceService class definition
echo "=== Finding InferenceService class ==="
fd -t f -name "*.py" | xargs grep -l "class InferenceService"

# Check how driver is defined in InferenceService
echo -e "\n=== InferenceService driver initialization ==="
fd -t f -name "*.py" | xargs grep -B5 -A10 "class InferenceService"

Repository: potpie-ai/potpie

Length of output: 627

🏁 Script executed:

#!/bin/bash
# Check where inference_service is instantiated and what driver it gets
echo "=== Find inference_service instantiation ==="
rg -n "inference_service\s*=" --type py | head -20

# Check for self.inference_service definition
echo -e "\n=== Find self.inference_service assignment ==="
rg -n "self\.inference_service\s*=" --type py -A3 | head -30

Repository: potpie-ai/potpie

Length of output: 1610

🏁 Script executed:

#!/bin/bash
# Check Neo4j driver usage patterns in the codebase
echo "=== Check driver session usage patterns ==="
rg -n "driver\.session\(\)" --type py | head -20

# Check if there are any async driver usages
echo -e "\n=== Check for async driver usage ==="
rg -n "async.*driver|driver.*async" --type py | head -20

# Specifically check if session() has async variant
echo -e "\n=== Search for async session calls ==="
rg -n "async with.*session\(\)" --type py

Repository: potpie-ai/potpie

Length of output: 2694

🏁 Script executed:

#!/bin/bash
# Find and display InferenceService class and driver property
echo "=== InferenceService class definition ==="
rg -n "class InferenceService" --type py -A 30

# Find driver initialization in InferenceService
echo -e "\n=== InferenceService driver initialization ==="
rg -n "self.driver\s*=" app/modules/parsing/knowledge_graph/inference_service.py -A 3

# Check the driver property definition
echo -e "\n=== InferenceService driver property ==="
rg -n "@property" app/modules/parsing/knowledge_graph/inference_service.py -A 5 | grep -A 5 "driver"

Repository: potpie-ai/potpie

Length of output: 3468

🏁 Script executed:

#!/bin/bash
# Look at the __init__ method of InferenceService to see how driver is set
echo "=== InferenceService __init__ method ==="
sed -n '1,150p' app/modules/parsing/knowledge_graph/inference_service.py | head -100

Repository: potpie-ai/potpie

Length of output: 3634

Use the async Neo4j driver or run the synchronous session calls in an executor via loop.run_in_executor().

The async def duplicate_graph() method uses synchronous, blocking Neo4j session calls (with self.inference_service.driver.session()). InferenceService.driver is initialized with GraphDatabase.driver(), which is the synchronous driver, so these calls block the event loop when run in an async context. Either migrate to the async Neo4j driver (async with) or run the session calls in a thread pool with asyncio.get_running_loop().run_in_executor().
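
The executor option can be sketched as below. This is a hedged illustration, not the project's implementation: `copy_graph_blocking` is a hypothetical stand-in for the synchronous driver.session() work, simulated here with time.sleep, and the ticker task exists only to show that the event loop keeps running.

```python
import asyncio
import contextlib
import time


def copy_graph_blocking(old_repo_id, new_repo_id):
    # Stand-in for the blocking work done inside a sync Neo4j session.
    time.sleep(0.05)
    return f"{old_repo_id}->{new_repo_id}"


async def duplicate_graph(old_repo_id, new_repo_id):
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(
        None, copy_graph_blocking, old_repo_id, new_repo_id
    )


async def main():
    ticks = 0

    async def ticker():
        # Counts how often the loop gets control while the copy runs.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    task = asyncio.create_task(ticker())
    result = await duplicate_graph("old", "new")
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return result, ticks


result, ticks = asyncio.run(main())
```

The trade-off is that the blocking work then runs on a worker thread, so any session or state it touches must be created and used within that function rather than shared with the event loop thread.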

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/modules/parsing/graph_construction/parsing_service.py` around lines 627 - 628, The async method duplicate_graph() currently opens a synchronous Neo4j session with with self.inference_service.driver.session(), which blocks the event loop; fix by either switching InferenceService.driver to use the async Neo4j driver and replace that block with an async context (async with self.inference_service.driver.session() as session and await calls), or keep the sync driver but move the blocking session work into a threadpool via asyncio.get_running_loop().run_in_executor(...) (i.e., wrap the whole session usage and any session.run()/transaction calls inside a function submitted to run_in_executor). Update duplicate_graph() accordingly and ensure any subsequent calls that expect results are awaited/returned from the executor task.

sonarqubecloud · 2026-02-27T22:47:56Z

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)
D Maintainability Rating on New Code (required ≥ A)


coderabbitai Bot reviewed Feb 27, 2026

View reviewed changes

Comment thread app/modules/parsing/graph_construction/parsing_service.py Outdated

arnavp27 added 2 commits February 27, 2026 22:31

Fix duplicate_graph defined at module level instead of as ParsingServ…

542eaa0

…ice method

Re-raise exception in duplicate_graph to surface failures to callers

a9d7256

arnavp27 force-pushed the fix/duplicate-graph-indentation branch from b20915a to a9d7256 Compare February 27, 2026 22:32

coderabbitai Bot reviewed Feb 27, 2026

View reviewed changes

Move clone_search_indices inside try block for consistent error handling

38ffc8f

arnavp27 wants to merge 3 commits into potpie-ai:main from arnavp27:fix/duplicate-graph-indentation
