From 0046ac6524b63ef5883c2fd7ecd6e4b0c063c946 Mon Sep 17 00:00:00 2001
From: Wolf Behrenhoff
Date: Tue, 9 Dec 2025 15:30:32 +0100
Subject: [PATCH 1/3] Improve docstring to CoW and chained assignments

---
 doc/source/user_guide/copy_on_write.rst | 30 +++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index ca6dcc4083e2d..176f26b6cbc33 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -116,10 +116,32 @@ The following code snippet updated both ``df`` and ``subset`` without CoW:
 
 This is not possible anymore with CoW, since the CoW rules explicitly forbid this.
 This includes updating a single column as a :class:`Series` and relying on the change
-propagating back to the parent :class:`DataFrame`.
-This statement can be rewritten into a single statement with ``loc`` or ``iloc`` if
-this behavior is necessary. :meth:`DataFrame.where` is another suitable alternative
-for this case.
+propagating back to the parent :class:`DataFrame`. To modify a DataFrame value in a given
+column and row, the code must be rewritten as a single assignment to ``loc`` or ``iloc``.
+When the column is given by name (``loc``) and the row by position (``iloc``), you either
+need to convert the column name to its position using :meth:`Index.get_loc` or you need
+to convert the row position to its index. Both variants as shown in the following snippet:
+
+.. code-block:: ipython
+
+    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    In [2]: df.iloc[0, df.columns.get_loc("foo")] = 100
+    In [3]: df.loc[df.index[1], "bar"] = 200
+    In [4]: df
+    Out[4]:
+       foo  bar
+    0  100    4
+    1    2  200
+    2    3    6
+
+The ``iloc`` variant works as a direct replacement of the old code ``df["foo"].iloc[0] = 100``
+while the ``loc`` variant first translates the position to the index and then finds all
+positions with that index. It does more work and only does the same if the DataFrame has
+a unique row index.
+
+Note that many such statements in the code can potentially hurt the performance. If possible,
+prefer to update the whole column at once. If you have boolean mask,
+:meth:`DataFrame.where` could be another suitable alternative for this case.
 
 Updating a column selected from a :class:`DataFrame` with an inplace method will
 also not work anymore.

From 4243d6e7019812636c344b26e8c6129369107197 Mon Sep 17 00:00:00 2001
From: Wolf Behrenhoff
Date: Wed, 10 Dec 2025 14:41:14 +0100
Subject: [PATCH 2/3] Review comment

---
 doc/source/user_guide/copy_on_write.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index 176f26b6cbc33..d19c87fcbf242 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -122,7 +122,7 @@ When the column is given by name (``loc``) and the row by position (``iloc``), y
 need to convert the column name to its position using :meth:`Index.get_loc` or you need
 to convert the row position to its index. Both variants as shown in the following snippet:
 
-.. code-block:: ipython
+.. ipython:: python
 
     In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
     In [2]: df.iloc[0, df.columns.get_loc("foo")] = 100

From d984052fcfd595beafd7567076cbf21139d36027 Mon Sep 17 00:00:00 2001
From: Richard Shadrach
Date: Sat, 10 Jan 2026 06:55:25 -0500
Subject: [PATCH 3/3] Move content to chained assignement section

---
 doc/source/user_guide/copy_on_write.rst | 52 ++++++++++++-------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/doc/source/user_guide/copy_on_write.rst b/doc/source/user_guide/copy_on_write.rst
index d19c87fcbf242..2deeade3002fb 100644
--- a/doc/source/user_guide/copy_on_write.rst
+++ b/doc/source/user_guide/copy_on_write.rst
@@ -116,32 +116,9 @@ The following code snippet updated both ``df`` and ``subset`` without CoW:
 
 This is not possible anymore with CoW, since the CoW rules explicitly forbid this.
 This includes updating a single column as a :class:`Series` and relying on the change
-propagating back to the parent :class:`DataFrame`. To modify a DataFrame value in a given
-column and row, the code must be rewritten as a single assignment to ``loc`` or ``iloc``.
-When the column is given by name (``loc``) and the row by position (``iloc``), you either
-need to convert the column name to its position using :meth:`Index.get_loc` or you need
-to convert the row position to its index. Both variants as shown in the following snippet:
-
-.. ipython:: python
-
-    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
-    In [2]: df.iloc[0, df.columns.get_loc("foo")] = 100
-    In [3]: df.loc[df.index[1], "bar"] = 200
-    In [4]: df
-    Out[4]:
-       foo  bar
-    0  100    4
-    1    2  200
-    2    3    6
-
-The ``iloc`` variant works as a direct replacement of the old code ``df["foo"].iloc[0] = 100``
-while the ``loc`` variant first translates the position to the index and then finds all
-positions with that index. It does more work and only does the same if the DataFrame has
-a unique row index.
-
-Note that many such statements in the code can potentially hurt the performance. If possible,
-prefer to update the whole column at once. If you have boolean mask,
-:meth:`DataFrame.where` could be another suitable alternative for this case.
+propagating back to the parent :class:`DataFrame`. See the
+:ref:`copy_on_write_chained_assignment` section on how to perform these kinds of
+operations.
 
 Updating a column selected from a :class:`DataFrame` with an inplace method will
 also not work anymore.
@@ -293,6 +270,29 @@ With copy on write this can be done by using ``loc``.
 
     df.loc[df["bar"] > 5, "foo"] = 100
 
+When the column is given by name (``loc``) and the row by position
+(``iloc``), you either need to convert the column name to its
+position using :meth:`Index.get_loc` or you need to convert the
+row position to its index. Both variants are shown in the following
+example.
+
+.. ipython:: python
+
+    df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    df.iloc[0, df.columns.get_loc("foo")] = 100
+    df.loc[df.index[1], "bar"] = 200
+    df
+
+The ``iloc`` variant works as a direct replacement of the chained
+assignment ``df["foo"].iloc[0] = 100`` while the ``loc`` variant first
+translates the position to the index and then finds all positions with that
+index. It does more work and is only equivalent if the DataFrame has a
+unique row index.
+
+Note that many such statements in the code can potentially hurt the performance. If possible,
+prefer to update the whole column at once. If you have boolean mask,
+:meth:`DataFrame.where` could be another suitable alternative for this case.
+
 .. _copy_on_write_read_only_na:
 
 Read-only NumPy arrays