Skip to content

fix: replace in-place dataclass mutations with dataclasses.replace()#10702

Merged
bogdankostic merged 4 commits intodeepset-ai:mainfrom
Aftabbs:fix/resolve-inplace-dataclass-mutations
Mar 6, 2026
Merged

fix: replace in-place dataclass mutations with dataclasses.replace()#10702
bogdankostic merged 4 commits intodeepset-ai:mainfrom
Aftabbs:fix/resolve-inplace-dataclass-mutations

Conversation

@Aftabbs
Copy link
Contributor

@Aftabbs Aftabbs commented Feb 28, 2026

Resolves the in-place mutation warnings introduced by the _warn_on_inplace_mutation guard in PR #10650.

Running hatch run test:unit | grep "Mutating attribute" surfaced mutations across five components. Each is replaced with dataclasses.replace(instance, field=new_value) so that dataclass instances are never mutated after creation.

Problem

After the _warn_on_inplace_mutation guard was added in #10650, running unit tests with hatch run test:unit | grep "Mutating attribute" surfaced 10 in-place mutations across 5 components. These mutations trigger the warning and can cause subtle bugs when the same dataclass instance is shared across pipeline components.

Fix

Replace every in-place attribute assignment on a dataclass instance with dataclasses.replace(instance, field=value). No logic changes — only the mutation pattern is updated.

Files changed

File Attribute mutated Dataclass
components/builders/chat_prompt_builder.py _content ChatMessage
core/pipeline/pipeline.py agent_snapshot, break_point `Pip
Changed files:
  • components/builders/chat_prompt_builder.py: replace _content mutation on rendered ChatMessage copy with dataclasses.replace()
  • core/pipeline/pipeline.py: replace two-field mutation on PipelineSnapshot (agent_snapshot + break_point) with a single dataclasses.replace() call
  • components/converters/image/file_to_image.py: replace ByteStream.mime_type mutation with dataclasses.replace()
  • components/extractors/llm_metadata_extractor.py: replace Document.content mutation with dataclasses.replace() (already imported replace)
  • components/fetchers/link_content.py: replace ByteStream.mime_type mutations in both sync and async run() methods
  • components/joiners/document_joiner.py: replace Document.score mutations in _score_norm, _reciprocal_rank_fusion, and _distribution_based_rank_fusion with non-mutating list comprehensions using dataclasses.replace()

Also updates test_document_joiner.py::test_list_with_one_empty_list to compare by document ID rather than object identity, since the test previously relied on the mutation side-effect to make the assertion pass.

Fixes #10659

Related Issues

  • fixes #issue-number

Proposed Changes:

How did you test it?

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

@Aftabbs Aftabbs requested a review from a team as a code owner February 28, 2026 17:12
@Aftabbs Aftabbs requested review from bogdankostic and removed request for a team February 28, 2026 17:12
@vercel
Copy link

vercel bot commented Feb 28, 2026

Someone is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link

CLAassistant commented Feb 28, 2026

CLA assistant check
All committers have signed the CLA.

@bogdankostic
Copy link
Contributor

@Aftabbs Could you please sign the CLA?

@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 3, 2026

i have Signed it

@bogdankostic
Copy link
Contributor

It seems like your commits are not linked to your GitHub account. I think you need to add the email address used in the commits to your account for the CLA check to pass.

@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 3, 2026

@bogdankostic just now added it (5min ago and then signed the CLA)

@bogdankostic
Copy link
Contributor

The e-mail address used in the commit is aftabbs@github.com. I think you have to set your commit e-mail address in Git to the one that's linked with your GitHub account and overwrite the existing commit.

@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 3, 2026

This aftabbs@github.com this was accidently added (this is the actual one aftabbs.wwe@gmail.com) which is added and verified

@bogdankostic
Copy link
Contributor

Could you try amending your commit with your e-mail address with the following?

git config user.email "new.email@example.com"
git commit --amend --author="Your Name <new.email@example.com>" --no-edit
git push --force-with-lease

Resolves the in-place mutation warnings introduced by the
_warn_on_inplace_mutation guard in PR deepset-ai#10650.

Running `hatch run test:unit | grep "Mutating attribute"` surfaced
mutations across five components. Each is replaced with
`dataclasses.replace(instance, field=new_value)` so that dataclass
instances are never mutated after creation.

Changed files:
- components/builders/chat_prompt_builder.py: replace _content mutation
  on rendered ChatMessage copy with dataclasses.replace()
- core/pipeline/pipeline.py: replace two-field mutation on PipelineSnapshot
  (agent_snapshot + break_point) with a single dataclasses.replace() call
- components/converters/image/file_to_image.py: replace ByteStream.mime_type
  mutation with dataclasses.replace()
- components/extractors/llm_metadata_extractor.py: replace Document.content
  mutation with dataclasses.replace() (already imported `replace`)
- components/fetchers/link_content.py: replace ByteStream.mime_type
  mutations in both sync and async run() methods
- components/joiners/document_joiner.py: replace Document.score mutations in
  _score_norm, _reciprocal_rank_fusion, and _distribution_based_rank_fusion
  with non-mutating list comprehensions using dataclasses.replace()

Also updates test_document_joiner.py::test_list_with_one_empty_list to
compare by document ID rather than object identity, since the test
previously relied on the mutation side-effect to make the assertion pass.

Fixes deepset-ai#10659
@Aftabbs Aftabbs force-pushed the fix/resolve-inplace-dataclass-mutations branch from a71c736 to 2a9e5b7 Compare March 3, 2026 12:03
@github-actions github-actions bot added the type:documentation Improvements on the docs label Mar 3, 2026
@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 3, 2026

done

@bogdankostic
Copy link
Contributor

Thanks @Aftabbs, now the CLA check passes but there are other failing tests. I'd recommend going through our contributing guidelines for instructions on how to solve them.

@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 3, 2026

Fixed both failing checks , wrapped the long line in chat_prompt_builder.py to resolve the ruff E501 error, and added a reno release note under releasenotes/notes/. The Vercel failure appears to be a deployment authorization issue unrelated to the code changes. Ready for re-review!

@bogdankostic
Copy link
Contributor

I'm sorry but it seems like your newest commit again uses aftabbs@github.com as an author letting the CLA check fail again.

- Break line 272 in chat_prompt_builder.py to satisfy E501 (max 120 chars)
- Add reno release note for the dataclass mutation fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Aftabbs Aftabbs force-pushed the fix/resolve-inplace-dataclass-mutations branch from 1a9846a to 2ea244f Compare March 4, 2026 10:04
@Aftabbs
Copy link
Contributor Author

Aftabbs commented Mar 4, 2026

oh my bad, pl check once now

@vercel
Copy link

vercel bot commented Mar 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
haystack-docs Ignored Ignored Preview Mar 6, 2026 3:30pm

Request Review

Copy link
Contributor

@bogdankostic bogdankostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @Aftabbs! I left two minor comments.

Also, there are still some places left where we do in-place mutation of data classes. Running hatch run test:unit | grep "Mutating attribute" on your branch results in the following output:

/Users/bogdan.kostic/Repositories/haystack/test/components/preprocessors/test_embedding_based_document_splitter.py:315: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/preprocessors/test_embedding_based_document_splitter.py:335: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/hugging_face_tei.py:164: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/meta_field.py:405: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/sentence_transformers_similarity.py:292: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/transformers_similarity.py:321: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/components/readers/extractive.py:393: Warning: Mutating attribute 'query' on an instance of 'ExtractedAnswer' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, query=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:203: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:234: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:320: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:322: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:324: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_async_pipeline.py:149: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_async_pipeline.py:187: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_pipeline.py:242: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_pipeline.py:279: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_tracing.py:166: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/dataclasses/test_document.py:117: Warning: Mutating attribute 'id' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, id=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/test/dataclasses/test_document.py:118: Warning: Mutating attribute 'id' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, id=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
  /Users/bogdan.kostic/Repositories/haystack/haystack/document_stores/in_memory/document_store.py:436: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.

rendered_message._content = [TextContent(text=rendered_text)]
# use dataclasses.replace to avoid in-place mutation of the copied message
rendered_message: ChatMessage = replace(
deepcopy(message), _content=[TextContent(text=rendered_text)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deepcopy shouldn't be needed anymore if we use dataclasses.replace.

Comment on lines +4 to +9
Fixed in-place mutation of ``ChatMessage`` dataclass instances in
``ChatPromptBuilder``, ``DynamicChatPromptBuilder``, and
``HuggingFaceLocalChatGenerator``. Mutations now use
``dataclasses.replace()`` instead of direct attribute assignment,
preventing unintended side-effects when the same message object is
reused across multiple pipeline runs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only in-place mutation of ChatMessage but also other dataclasses, for example Document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdankostic Thanks for the feedback, have resolved this kindly review and let me know, Thanks!

…ses.replace()

Address reviewer feedback:
- Remove unnecessary deepcopy in chat_prompt_builder (dataclasses.replace
  creates a new instance, so deepcopy is redundant)
- Fix in-place mutation of Document.score in HuggingFaceTEIRanker,
  MetaFieldRanker, SentenceTransformersSimilarityRanker, and
  TransformersSimilarityRanker
- Fix in-place mutation of ExtractedAnswer.query in ExtractiveReader
- Fix in-place mutation of Document.embedding in InMemoryDocumentStore
- Fix in-place mutation in test helpers (file router, document splitter
  mocks, pipeline tests, tracing tests, document equality test)
- Update release notes to list all affected components and dataclasses
@Aftabbs Aftabbs force-pushed the fix/resolve-inplace-dataclass-mutations branch from c080b95 to 63f88be Compare March 6, 2026 12:02
Copy link
Contributor

@bogdankostic bogdankostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @Aftabbs!

@bogdankostic bogdankostic merged commit fa0281b into deepset-ai:main Mar 6, 2026
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resolve in-place dataclass mutations

3 participants