diff --git a/hub/changelog.rst b/hub/changelog.rst
index a7e53fac6f..0e9c0a487f 100644
--- a/hub/changelog.rst
+++ b/hub/changelog.rst
@@ -1,6 +1,12 @@
 Changelog
 =========
 
+.. _changelog_2026-03-17:
+
+2026-03-17
+----------
+* Added the :ref:`$retract ` special field. Setting ``$retract: true`` on an output entity permanently removes all earlier versions of that entity id while keeping the current version.
+
 .. _changelog_2026-03-04:
 
 2026-03-04
diff --git a/hub/documentation/data-management/entity-data-model.rst b/hub/documentation/data-management/entity-data-model.rst
index 0232ad2c13..42abff1480 100644
--- a/hub/documentation/data-management/entity-data-model.rst
+++ b/hub/documentation/data-management/entity-data-model.rst
@@ -173,6 +173,84 @@ Entity fields starting with ``$`` are semi-reserved. They have special meaning a
        the current one.
      -
 
+   .. _dollar_retract:
+   * - ``$retract``
+     - If set to ``true``, all previous versions of the entity are permanently
+       removed from the dataset while the current version is retained. See
+       :ref:`Retract as history prune ` for details
+       and a configuration example.
+     -
+
+.. _dollar_retract_detail:
+
+Retract as history prune
+------------------------
+
+Setting ``$retract: true`` on an output entity permanently removes all earlier
+versions of that entity id while keeping the current version. Deletion state is
+unaffected. The operation is idempotent.
+
+.. note::
+   ``$retract`` is stored as a regular field and will propagate downstream
+   automatically. However, patterns such as :ref:`merge sources `
+   or :ref:`emit_children ` may not carry it through
+   to the output entity and require explicit handling.
+
+**Configuration example:**
+
+When a customer is offboarded their SSN is no longer needed. The pipe emits a
+sanitised entity (omitting ``ssn``) and sets ``$retract: true`` to prune the
+version history that contained it:
+
+.. code-block:: json
+
+   {
+     "_id": "customer-status-retract-history",
+     "type": "pipe",
+     "source": {
+       "type": "dataset",
+       "dataset": "customer_status"
+     },
+     "transform": {
+       "type": "dtl",
+       "rules": {
+         "default": [
+           ["if", ["eq", "_S.status", "offboarded"],
+             [
+               ["copy", "_id"],
+               ["copy", "status"],
+               ["add", "$retract", true]
+             ],
+             ["copy", "*"]
+           ]
+         ]
+       }
+     }
+   }
+
+Source entity while the customer is active:
+
+.. code-block:: json
+
+   {
+     "_id": "customer_123",
+     "status": "paid",
+     "ssn": "123456789"
+   }
+
+After offboarding, the pipe outputs:
+
+.. code-block:: json
+
+   {
+     "_id": "customer_123",
+     "status": "offboarded",
+     "$retract": true
+   }
+
+The sink writes this as the latest version and permanently removes all earlier
+versions, including those that contained ``ssn``.
+
 .. _entity_data_types:
 
 Standard types
diff --git a/hub/quick-reference.rst b/hub/quick-reference.rst
index b7790c7af2..9f3b7534a6 100644
--- a/hub/quick-reference.rst
+++ b/hub/quick-reference.rst
@@ -336,4 +336,5 @@ Entity model
    * - Special fields
      - :ref:`$children ` ·
        :ref:`$ids ` ·
-       :ref:`$replaced `
+       :ref:`$replaced ` ·
+       :ref:`$retract `