fix: replace in-place dataclass mutations with dataclasses.replace()#10702
Conversation
|
Someone is attempting to deploy a commit to the deepset Team on Vercel. A member of the Team first needs to authorize it. |
|
@Aftabbs Could you please sign the CLA? |
|
i have Signed it |
|
It seems like your commits are not linked to your GitHub account. I think you need to add the email address used in the commits to your account for the CLA check to pass. |
|
@bogdankostic just now added it (5min ago and then signed the CLA) |
|
The e-mail address used in the commit is aftabbs@github.com. I think you have to set your commit e-mail address in Git to the one that's linked with your GitHub account and overwrite the existing commit. |
|
This aftabbs@github.com this was accidently added (this is the actual one aftabbs.wwe@gmail.com) which is added and verified |
|
Could you try amending your commit with your e-mail address with the following? |
Resolves the in-place mutation warnings introduced by the _warn_on_inplace_mutation guard in PR deepset-ai#10650. Running `hatch run test:unit | grep "Mutating attribute"` surfaced mutations across five components. Each is replaced with `dataclasses.replace(instance, field=new_value)` so that dataclass instances are never mutated after creation. Changed files: - components/builders/chat_prompt_builder.py: replace _content mutation on rendered ChatMessage copy with dataclasses.replace() - core/pipeline/pipeline.py: replace two-field mutation on PipelineSnapshot (agent_snapshot + break_point) with a single dataclasses.replace() call - components/converters/image/file_to_image.py: replace ByteStream.mime_type mutation with dataclasses.replace() - components/extractors/llm_metadata_extractor.py: replace Document.content mutation with dataclasses.replace() (already imported `replace`) - components/fetchers/link_content.py: replace ByteStream.mime_type mutations in both sync and async run() methods - components/joiners/document_joiner.py: replace Document.score mutations in _score_norm, _reciprocal_rank_fusion, and _distribution_based_rank_fusion with non-mutating list comprehensions using dataclasses.replace() Also updates test_document_joiner.py::test_list_with_one_empty_list to compare by document ID rather than object identity, since the test previously relied on the mutation side-effect to make the assertion pass. Fixes deepset-ai#10659
a71c736 to
2a9e5b7
Compare
|
done |
|
Thanks @Aftabbs, now the CLA check passes but there are other failing tests. I'd recommend going through our contributing guidelines for instructions on how to solve them. |
|
Fixed both failing checks , wrapped the long line in chat_prompt_builder.py to resolve the ruff E501 error, and added a reno release note under releasenotes/notes/. The Vercel failure appears to be a deployment authorization issue unrelated to the code changes. Ready for re-review! |
|
I'm sorry but it seems like your newest commit again uses aftabbs@github.com as an author letting the CLA check fail again. |
- Break line 272 in chat_prompt_builder.py to satisfy E501 (max 120 chars) - Add reno release note for the dataclass mutation fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1a9846a to
2ea244f
Compare
|
oh my bad, pl check once now |
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
bogdankostic
left a comment
There was a problem hiding this comment.
Thanks for this PR @Aftabbs! I left two minor comments.
Also, there are still some places left where we do in-place mutation of data classes. Running hatch run test:unit | grep "Mutating attribute" on your branch results in the following output:
/Users/bogdan.kostic/Repositories/haystack/test/components/preprocessors/test_embedding_based_document_splitter.py:315: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/preprocessors/test_embedding_based_document_splitter.py:335: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/hugging_face_tei.py:164: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/meta_field.py:405: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/sentence_transformers_similarity.py:292: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/components/rankers/transformers_similarity.py:321: Warning: Mutating attribute 'score' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, score=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/components/readers/extractive.py:393: Warning: Mutating attribute 'query' on an instance of 'ExtractedAnswer' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, query=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:203: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:234: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:320: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:322: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/components/routers/test_file_router.py:324: Warning: Mutating attribute 'mime_type' on an instance of 'ByteStream' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, mime_type=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_async_pipeline.py:149: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_async_pipeline.py:187: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_pipeline.py:242: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_pipeline.py:279: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/core/pipeline/test_tracing.py:166: Warning: Mutating attribute 'content' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, content=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/dataclasses/test_document.py:117: Warning: Mutating attribute 'id' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, id=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/test/dataclasses/test_document.py:118: Warning: Mutating attribute 'id' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, id=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
/Users/bogdan.kostic/Repositories/haystack/haystack/document_stores/in_memory/document_store.py:436: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.
| rendered_message._content = [TextContent(text=rendered_text)] | ||
| # use dataclasses.replace to avoid in-place mutation of the copied message | ||
| rendered_message: ChatMessage = replace( | ||
| deepcopy(message), _content=[TextContent(text=rendered_text)] |
There was a problem hiding this comment.
deepcopy shouldn't be needed anymore if we use dataclasses.replace.
| Fixed in-place mutation of ``ChatMessage`` dataclass instances in | ||
| ``ChatPromptBuilder``, ``DynamicChatPromptBuilder``, and | ||
| ``HuggingFaceLocalChatGenerator``. Mutations now use | ||
| ``dataclasses.replace()`` instead of direct attribute assignment, | ||
| preventing unintended side-effects when the same message object is | ||
| reused across multiple pipeline runs. |
There was a problem hiding this comment.
It's not only in-place mutation of ChatMessage but also other dataclasses, for example Document.
There was a problem hiding this comment.
@bogdankostic Thanks for the feedback, have resolved this kindly review and let me know, Thanks!
…ses.replace() Address reviewer feedback: - Remove unnecessary deepcopy in chat_prompt_builder (dataclasses.replace creates a new instance, so deepcopy is redundant) - Fix in-place mutation of Document.score in HuggingFaceTEIRanker, MetaFieldRanker, SentenceTransformersSimilarityRanker, and TransformersSimilarityRanker - Fix in-place mutation of ExtractedAnswer.query in ExtractiveReader - Fix in-place mutation of Document.embedding in InMemoryDocumentStore - Fix in-place mutation in test helpers (file router, document splitter mocks, pipeline tests, tracing tests, document equality test) - Update release notes to list all affected components and dataclasses
c080b95 to
63f88be
Compare
bogdankostic
left a comment
There was a problem hiding this comment.
Thanks for the PR @Aftabbs!
Resolves the in-place mutation warnings introduced by the _warn_on_inplace_mutation guard in PR #10650.
Running
hatch run test:unit | grep "Mutating attribute"surfaced mutations across five components. Each is replaced withdataclasses.replace(instance, field=new_value)so that dataclass instances are never mutated after creation.Problem
After the
_warn_on_inplace_mutationguard was added in #10650, running unit tests withhatch run test:unit | grep "Mutating attribute"surfaced 10 in-place mutations across 5 components. These mutations trigger the warning and can cause subtle bugs when the same dataclass instance is shared across pipeline components.Fix
Replace every in-place attribute assignment on a dataclass instance with
dataclasses.replace(instance, field=value). No logic changes — only the mutation pattern is updated.Files changed
components/builders/chat_prompt_builder.py_contentChatMessagecore/pipeline/pipeline.pyagent_snapshot,break_pointreplace)Also updates test_document_joiner.py::test_list_with_one_empty_list to compare by document ID rather than object identity, since the test previously relied on the mutation side-effect to make the assertion pass.
Fixes #10659
Related Issues
Proposed Changes:
How did you test it?
Notes for the reviewer
Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:and added!in case the PR includes breaking changes.