Skip to content

lock when merging#8624

Open
ajtmccarty wants to merge 2 commits intostablefrom
ajtm-03162026-lock-merge
Open

lock when merging#8624
ajtmccarty wants to merge 2 commits intostablefrom
ajtm-03162026-lock-merge

Conversation

@ajtmccarty
Copy link
Contributor

@ajtmccarty ajtmccarty commented Mar 17, 2026

Why

Running multiple merge operation at one time, either for the same branch or different branches, can cause unexpected results.

We should only allow one merge operation to proceed at a time. Before executing, the merge operation confirms that the branch being merged is still open. This prevents any branch from being merged multiple times.

Closes #6974 / IFC-1657

What changed

Adds a new very simple MergeLocker class and uses it inside of the merge_branch flow. There's also some refactoring of the merge_branch flow to move much of the logic to a _do_merge_branch function to reduce indentation levels, but there is no functional difference.

Testing

Added some component tests with more mocking than we usually do. I could make it a functional test, but it would be very expensive (diffing and merging are our two slowest operation) and kind of hard to verify.

Also did manual testing for trying to merge different branches and the same branch simultaneously.

Summary by CodeRabbit

  • Bug Fixes

    • Implemented synchronization to ensure only a single merge operation runs at a time across all branches, preventing concurrent merge conflicts.
  • Tests

    • Added test coverage for concurrent merge operations to validate synchronization behavior.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 17, 2026

Walkthrough

This change introduces merge operation locking to ensure only a single merge task executes at a time across all branches. A new MergeLocker utility class is added to manage global locks on the "all_branches" resource. The branch merge logic in tasks.py is refactored to extract merge operations into a dedicated helper function (_do_merge_branch) and acquire a global lock before execution. The BranchMerger class is now publicly exported via the merge package. Import statements are updated for clarity, and logging messages are refined to distinguish between diff locks and merge locks. Tests are added to validate concurrent merge behavior and confirm that simultaneous merge requests are properly serialized.

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR claims to close issue #6974, but the linked issue describes a frontend artifact diff spacing problem unrelated to merge locking. The actual scope mismatch is severe. Verify the correct issue numbers are linked. The PR implements merge concurrency control, which doesn't address artifact diff display issues mentioned in #6974.
Out of Scope Changes check ⚠️ Warning The PR changes are focused on merge locking and branch merge flow refactoring, but the linked issue #6974 concerns frontend artifact diff spacing—completely unrelated to the implementation. Clarify issue linkage: either correct the issue number or explain how merge locking fixes artifact diff spacing problems.
Docstring Coverage ⚠️ Warning Docstring coverage is 18.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'lock when merging' clearly summarizes the main change: introducing a lock mechanism for merge operations to prevent concurrent execution.
Description check ✅ Passed The PR description adequately covers why (concurrent merges cause issues), what changed (MergeLocker class and refactoring), and testing approach (component tests plus manual testing).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Mar 17, 2026
@codspeed-hq
Copy link

codspeed-hq bot commented Mar 17, 2026

Merging this PR will improve performance by 38.62%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 12 improved benchmarks

Performance Changes

Benchmark BASE HEAD Efficiency
test_query_rel_many 692.2 ms 522.7 ms +32.43%
test_graphql_generate_schema 516.2 ms 372.4 ms +38.62%
test_query_one_model 463.7 ms 352.2 ms +31.67%
test_schemabranch_process 1,027 ms 793.8 ms +29.37%
test_relationshipmanager_getpeer 156.7 µs 133.8 µs +17.07%
test_query_rel_one 662.6 ms 500.2 ms +32.48%
test_get_schema 324 ms 250.8 ms +29.2%
test_load_node_to_db_node_schema 66.3 ms 51.6 ms +28.43%
test_nodemanager_querypeers 1.5 ms 1.3 ms +21.6%
test_base_schema_duplicate_CoreProposedChange 2.1 ms 1.7 ms +28.21%
test_schemabranch_duplicate 7.3 ms 5.5 ms +31.47%
test_get_menu 245 ms 188.8 ms +29.79%

Comparing ajtm-03162026-lock-merge (7cc0e70) with stable (43d16b8)1

Open in CodSpeed

Footnotes

  1. No successful run was found on stable (e037919) during the generation of this report, so 43d16b8 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@ajtmccarty ajtmccarty marked this pull request as ready for review March 17, 2026 22:45
@ajtmccarty ajtmccarty requested a review from a team as a code owner March 17, 2026 22:45
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (5)
backend/infrahub/core/branch/tasks.py (4)

329-344: Use the existing workflow variable instead of calling get_workflow() again.

Line 329 assigns workflow = get_workflow(), but line 343 calls get_workflow() again when constructing BranchMerger. Use the existing variable for consistency.

♻️ Suggested fix
         merger = BranchMerger(
             db=db,
             diff_coordinator=diff_coordinator,
             diff_merger=diff_merger,
             diff_repository=diff_repository,
             source_branch=obj,
             diff_locker=DiffLocker(),
-            workflow=get_workflow(),
+            workflow=workflow,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/infrahub/core/branch/tasks.py` around lines 329 - 344, The code
creates a local variable workflow = get_workflow() but then calls get_workflow()
again when constructing BranchMerger; replace the second call with the existing
workflow variable so BranchMerger(..., workflow=workflow, ...) uses the same
instance. Update the BranchMerger constructor invocation in the block where
merger is defined to pass workflow instead of calling get_workflow() again.

390-395: Use the existing workflow variable.

Same issue as above—get_workflow() is called again on line 391 when workflow is already available from line 329.

♻️ Suggested fix
     if ipam_node_details:
-        await get_workflow().submit_workflow(
+        await workflow.submit_workflow(
             workflow=IPAM_RECONCILIATION,
             context=context,
             parameters={"branch": registry.default_branch, "ipam_node_details": ipam_node_details},
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/infrahub/core/branch/tasks.py` around lines 390 - 395, The code calls
get_workflow() again when submitting the IPAM_RECONCILIATION workflow even
though a workflow variable is already available; modify the submit call to use
the existing workflow variable (the one defined earlier as workflow) instead of
calling get_workflow(), i.e., pass workflow=workflow to submit_workflow while
leaving context, parameters (registry.default_branch and ipam_node_details)
unchanged so submit_workflow uses the existing workflow instance.

399-401: Redundant DiffRepository component creation.

diff_repository is already created on line 333 with identical parameters. The variable remains in scope after the async with block, so this second creation on line 399 appears unnecessary.

♻️ Remove redundant creation
-    diff_repository = await component_registry.get_component(DiffRepository, db=db, branch=obj)
     await diff_repository.mark_tracking_ids_merged(tracking_ids=[BranchTrackingId(name=obj.name)])
     await diff_repository.freeze_diffs_for_branch(branch_name=obj.name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/infrahub/core/branch/tasks.py` around lines 399 - 401, The second
creation of the DiffRepository via component_registry.get_component is
redundant; remove the duplicate await
component_registry.get_component(DiffRepository, db=db, branch=obj) and instead
reuse the existing diff_repository variable (created earlier in the surrounding
scope) to call mark_tracking_ids_merged(...) and freeze_diffs_for_branch(...);
if needed, ensure diff_repository remains in scope by not shadowing it or by
hoisting its declaration so the calls to mark_tracking_ids_merged and
freeze_diffs_for_branch reference the already-instantiated DiffRepository.

323-325: Consider adding a docstring for this complex helper function.

Per coding guidelines, functions should have docstrings. This helper performs significant merge coordination logic and would benefit from documentation.

📝 Suggested docstring
 async def _do_merge_branch(
     db: InfrahubDatabase, log: Logger | LoggerAdapter, obj: Branch, context: InfrahubContext
 ) -> Sequence[tuple[DiffAction, NodeChangelog]]:
+    """Execute the core branch merge operations under a global graph lock.
+
+    Performs the merge, handles schema migrations, triggers IPAM reconciliation,
+    updates branch status to MERGED, and schedules post-merge workflows.
+
+    Args:
+        db: Database session.
+        log: Logger instance for progress logging.
+        obj: Branch being merged.
+        context: Request context.
+
+    Returns:
+        Sequence of (DiffAction, NodeChangelog) tuples for event generation.
+    """
     component_registry = get_component_registry()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/infrahub/core/branch/tasks.py` around lines 323 - 325, The function
_do_merge_branch is missing a docstring; add a concise docstring above async def
_do_merge_branch explaining its purpose (coordinate branch merge operations),
describing parameters (db: InfrahubDatabase, log: Logger | LoggerAdapter, obj:
Branch, context: InfrahubContext), the return type (Sequence[tuple[DiffAction,
NodeChangelog]]), and summarizing side effects/errors it may raise; keep it
short (one-paragraph summary plus brief param/return lines) to follow project
docstring guidelines.
backend/infrahub/core/merge/merge_locker.py (1)

4-11: Consider adding a docstring and clarifying method semantics.

Per coding guidelines, classes should have docstrings. Additionally, acquire_global_lock doesn't actually acquire the lock—it returns a lock object to be used as an async context manager. Consider renaming to get_global_lock() for accuracy, or add a docstring clarifying the usage pattern.

📝 Suggested docstring addition
 class MergeLocker:
+    """Coordinates exclusive access for branch merge operations.
+
+    Uses a global lock to ensure only one merge operation runs at a time
+    across all branches in the system.
+    """
     lock_namespace = "merge"

     def __init__(self) -> None:
         self.lock_registry = lock.registry

-    def acquire_global_lock(self) -> lock.InfrahubLock:
+    def get_global_lock(self) -> lock.InfrahubLock:
+        """Return the global merge lock for use as an async context manager.
+
+        Returns:
+            InfrahubLock that serializes all merge operations.
+
+        Example:
+            async with merge_locker.get_global_lock():
+                # perform merge operation
+        """
         return self.lock_registry.get(name="all_branches", namespace=self.lock_namespace)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/infrahub/core/merge/merge_locker.py` around lines 4 - 11, The
MergeLocker class lacks a docstring and its method acquire_global_lock is
misnamed because it does not acquire the lock but returns a lock object to be
used as a context manager; add a class-level docstring describing purpose and
usage, and either rename acquire_global_lock to get_global_lock (updating
references) or keep the name and add a method docstring explaining it returns a
lock.InfrahubLock from self.lock_registry (the lock_registry attribute) which
must be used as an async context manager to actually acquire the lock.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@backend/infrahub/core/branch/tasks.py`:
- Around line 329-344: The code creates a local variable workflow =
get_workflow() but then calls get_workflow() again when constructing
BranchMerger; replace the second call with the existing workflow variable so
BranchMerger(..., workflow=workflow, ...) uses the same instance. Update the
BranchMerger constructor invocation in the block where merger is defined to pass
workflow instead of calling get_workflow() again.
- Around line 390-395: The code calls get_workflow() again when submitting the
IPAM_RECONCILIATION workflow even though a workflow variable is already
available; modify the submit call to use the existing workflow variable (the one
defined earlier as workflow) instead of calling get_workflow(), i.e., pass
workflow=workflow to submit_workflow while leaving context, parameters
(registry.default_branch and ipam_node_details) unchanged so submit_workflow
uses the existing workflow instance.
- Around line 399-401: The second creation of the DiffRepository via
component_registry.get_component is redundant; remove the duplicate await
component_registry.get_component(DiffRepository, db=db, branch=obj) and instead
reuse the existing diff_repository variable (created earlier in the surrounding
scope) to call mark_tracking_ids_merged(...) and freeze_diffs_for_branch(...);
if needed, ensure diff_repository remains in scope by not shadowing it or by
hoisting its declaration so the calls to mark_tracking_ids_merged and
freeze_diffs_for_branch reference the already-instantiated DiffRepository.
- Around line 323-325: The function _do_merge_branch is missing a docstring; add
a concise docstring above async def _do_merge_branch explaining its purpose
(coordinate branch merge operations), describing parameters (db:
InfrahubDatabase, log: Logger | LoggerAdapter, obj: Branch, context:
InfrahubContext), the return type (Sequence[tuple[DiffAction, NodeChangelog]]),
and summarizing side effects/errors it may raise; keep it short (one-paragraph
summary plus brief param/return lines) to follow project docstring guidelines.

In `@backend/infrahub/core/merge/merge_locker.py`:
- Around line 4-11: The MergeLocker class lacks a docstring and its method
acquire_global_lock is misnamed because it does not acquire the lock but returns
a lock object to be used as a context manager; add a class-level docstring
describing purpose and usage, and either rename acquire_global_lock to
get_global_lock (updating references) or keep the name and add a method
docstring explaining it returns a lock.InfrahubLock from self.lock_registry (the
lock_registry attribute) which must be used as an async context manager to
actually acquire the lock.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d32fa7a6-e9e5-400b-a065-e6760d2a3528

📥 Commits

Reviewing files that changed from the base of the PR and between e037919 and 7cc0e70.

📒 Files selected for processing (6)
  • backend/infrahub/core/branch/tasks.py
  • backend/infrahub/core/merge/__init__.py
  • backend/infrahub/core/merge/branch_merger.py
  • backend/infrahub/core/merge/merge_locker.py
  • backend/tests/component/core/diff/test_merge_task_lock.py
  • changelog/6794.fixed.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

group/backend Issue related to the backend (API Server, Git Agent)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Artifact Diff Showing spacing in diff even though there is no actual diff

1 participant