Publishing dependency tracking and side-effect calculation + dependency state summary for published and draft #369

ormsbee · 2025-08-25T22:42:39Z

Status: Data models and fundamental logic are implemented. I need to clean up a lot and add tests and some admin debug screens. I also need to do the data migration for existing containers data.

This PR introduces a publishing app concept of dependency tracking, as well as adding a PublishingSideEffect analog of the existing DraftSideEffect. The high-level goal is to be able to efficiently do operations like, "publish this entity and all its children" or answer queries like "when was the last time the published state of this entity changed (including its children)?".

The publish_from_drafts function has been modified to publish all dependencies.
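To make the cascade concrete, here is a minimal sketch of collecting an entity plus all of its transitive dependencies before publishing. It is plain Python rather than the code in this PR; the `deps` mapping and the `collect_publish_set` name are hypothetical stand-ins for the dependency rows described above.

```python
from collections import deque


def collect_publish_set(roots, deps):
    """Return every entity id reachable from `roots` via dependency edges.

    `deps` maps an entity id to the ids it depends on (e.g. a Unit to its
    child Components). Hypothetical structure, for illustration only.
    """
    to_publish = set()
    queue = deque(roots)
    while queue:
        entity_id = queue.popleft()
        if entity_id in to_publish:
            continue  # already scheduled; this also guards against cycles
        to_publish.add(entity_id)
        queue.extend(deps.get(entity_id, ()))
    return to_publish


deps = {
    "section": ["subsection"],
    "subsection": ["unit"],
    "unit": ["problem", "video"],
}
publish_set = collect_publish_set(["subsection"], deps)
```

Publishing "subsection" here pulls in the unit and both components, but not the parent "section", which would only be touched as a side effect.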

Follow-on to #328, broadly implements what was discussed in #317, though I found it simpler to model all dependencies at the PublishableEntity level, rather than splitting them out into separate Draft and Published dependencies. That being said, the dependencies_hash_digest must be computed separately, since the published and draft versions of any given dependency can be different at any given time.
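A rough sketch of why the digest has to be computed per branch: the dependency edges are shared, but the version each dependency is at can differ between draft and published. The `state_digest` helper, the `"id@version"` payload format, and the plain dicts are all assumptions for illustration (the PR itself discusses UUIDs as the identifier), and the sketch assumes the dependency graph is acyclic.

```python
import hashlib


def state_digest(entity_id, deps, versions):
    """Digest of an entity's version plus the digests of its dependencies.

    `versions` maps entity id -> version identifier on ONE branch (draft or
    published). Assumes `deps` has no cycles. All names are hypothetical.
    """
    child_digests = sorted(
        state_digest(dep_id, deps, versions) for dep_id in deps.get(entity_id, ())
    )
    payload = "|".join([f"{entity_id}@{versions[entity_id]}", *child_digests])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


deps = {"unit": ["problem", "video"]}  # shared by both branches
draft_versions = {"unit": 1, "problem": 3, "video": 1}
published_versions = {"unit": 1, "problem": 2, "video": 1}

draft_digest = state_digest("unit", deps, draft_versions)
published_digest = state_digest("unit", deps, published_versions)
```

The Unit is at the same version on both branches, yet the two digests differ because the draft of "problem" is ahead of its published version.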

openedx_learning/apps/authoring/publishing/api.py

ormsbee · 2025-09-03T03:17:21Z

@bradenmacdonald, @kdmccormick: Based on your feedback today, I reworked the admin page entries to look like this:

It's not quite obvious, but the version information for the draft is italicized if it's different from the published version + dependency state.

Create the PublishSideEffect model as the publishing analog for the DraftSideEffect model, and create these side effects for parent containers when one of the children is published. This allows us to efficiently query when any subtree of content was affected by a publish. Also introduces PublishableEntityVersionDependency as a way of tracking unpinned dependencies of a version, e.g. the children of a version of a Unit. Also introduces a new dependencies_hash_digest field to Draft and Published models to track dependency state. This PR adds Django admin functionality for this new data, as well as optimizing some existing calls that used to have to do more complex container -> entity list traversal.

bradenmacdonald · 2025-09-03T17:05:51Z

@ormsbee Cool, I think that's a bit more useful and clear.

ormsbee · 2025-09-03T17:22:49Z

@kdmccormick, @bradenmacdonald: I still have to write more tests, but I need to put this down for a day or two to attend to some long-overdue reviews. The code should be in a reviewable state, minus some union-related type annotation stuff that I have to look into later. Please review if you have time.

ormsbee · 2025-09-11T04:28:52Z

Annoying late night thought: It sounds like we may at some point elevate the state summary hash to something that external clients would use, e.g. for history comparison or caching purposes. But it's not stored in the history, since it's calculated on the Draft and Published tables. But it doesn't have to be--since any change to a Draft or a Published (even side-effects) has to be represented in the change logs, that means that we could put those hash summaries on the DraftChangeLogRecord and PublishChangeLogRecord instead. That way, you could see where a particular state was in the history of that thing, and what caused it to become that way.
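As a sketch of what that would enable, assuming each log record carried the state summary hash: you could scan the history for the first record at which an entity reached a given state and see what caused it. The `LogRecord` dataclass and its field names are hypothetical, not the actual DraftChangeLogRecord or PublishChangeLogRecord schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    log_id: int               # position of the containing log entry in history
    entity_id: str
    state_digest: str         # dependency state summary after this record
    caused_by: Optional[str]  # None for a direct change, else the entity that triggered a side effect


def first_record_with_state(records, entity_id, digest):
    """Return the earliest record where `entity_id` reached `digest`, or None."""
    for record in sorted(records, key=lambda r: r.log_id):
        if record.entity_id == entity_id and record.state_digest == digest:
            return record
    return None


history = [
    LogRecord(1, "unit", "aaa", None),
    LogRecord(2, "problem", "p01", None),
    LogRecord(2, "unit", "bbb", "problem"),  # side effect of the problem change
    LogRecord(3, "unit", "ccc", None),
]
match = first_record_with_state(history, "unit", "bbb")
```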

kdmccormick · 2025-09-11T14:19:28Z

@ormsbee I think it's a great idea

kdmccormick · 2025-09-11T14:37:55Z

Before locking ourselves into UUIDs, I'm still curious if there are any other ways to generate a dependency hash.

A bad idea (just writing it down for posterity)

What about generating the hash from the intrinsic data of the PE, so that a hash fully captures the state of some entity in a package? For example, in pseudocode:

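def hash_pev(pev):
    # combine_hashes is left undefined here, as in the original pseudocode.
    parts = [hash(pev.publishable_entity.key), hash(pev.title)]
    if pev.component_version:
        parts.extend(pev.component_version.contents.hashes)
    if pev.container_version:
        children = pev.container_version.entity_list.children
        parts.extend(hash(child) for child in children)
    return combine_hashes(*parts)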

I think this would work OK with libraries as they are today. But it would not capture any data that is hung off of the LC models in the future. It would break down as soon as someone, for example, made a custom Container subclass with some critical metadata on the ContainerVersion subclass... updating that metadata would fail to modify the dependency hash.

Maybe this could be resolved by allowing PE subclasses to register custom hashing functions? Probably not worth it.

A half-baked idea

What about just hashing the PEV by its key and version number? We already guarantee that (key, version number) uniquely identifies a PEV in a package, so it should be sufficient input for a hash whose job is to capture whether any changes have happened to a PE's dependencies (which are in that same package).

(key, version_num) only breaks down across Open edX instances, but I don't think we need to worry about that for the dependency hash, right?

ormsbee · 2025-09-11T17:15:33Z

> What about generating the hash from the intrinsic data of the PE, so that a hash fully captures the state of some entity in a package?

I know you state this is a bad idea, but I do want to explicitly highlight the tradeoffs we're making here for other people to read later, because in many ways this sort of hashing would be the gold standard for how to do this. As you point out, it would require anyone extending the data models to properly calculate a hash for their extended data. Not only would we need to create API hooks for that, it's asking a lot of people to get that hashing right.

Instead of this, both the UUID approach in this PR and the (identifier + version num) approach suggested below try to reduce the burden on developers by taking the shortcut that any PublishableEntityVersion and all the models that may hang off of it are intended to be immutable. Therefore, whatever the actual content is, we can use any shorthand identifier for it, so long as those identifiers won't conflict with each other. This means that we aren't quite as accurate as hashing the intrinsic content would be. If we add a piece of text to a problem in v2 and remove it again in v3, there is nothing in the system that marks v1 and v3 as being equivalent to each other. In exchange for this loss of accuracy, we get a simpler API and fast, predictable hash generation.

Going to reply to the (key, version) comment in a bit...
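The tradeoff described above can be sketched in a few lines. This is an illustration only: `content_hash` and `identifier_hash` are hypothetical helpers, and the `"key@version"` format is an assumption rather than anything this PR defines.

```python
import hashlib


def content_hash(text):
    """Hash the intrinsic content: identical content gives identical hashes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def identifier_hash(key, version_num):
    """Hash a shorthand identifier: cheap, but blind to what the content is."""
    return hashlib.sha256(f"{key}@{version_num}".encode("utf-8")).hexdigest()


# A problem whose text is extended in v2 and reverted in v3.
versions = {
    1: "What is 2 + 2?",
    2: "What is 2 + 2? Show your work.",
    3: "What is 2 + 2?",
}

same_by_content = content_hash(versions[1]) == content_hash(versions[3])
same_by_identifier = identifier_hash("problem-1", 1) == identifier_hash("problem-1", 3)
```

Content hashing treats v1 and v3 as the same state; identifier hashing does not, but it never needs to read or understand the extended models hanging off the version.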


kdmccormick

still reviewing, just a couple quick thoughts so far.

openedx_learning/apps/authoring/publishing/models/publishable_entity.py

bradenmacdonald · 2025-10-28T00:57:19Z

What's left to do before we can merge this PR? We need the full descendant publishing to fix some libraries bugs - see #421

openedx_learning/apps/authoring/publishing/models/publishable_entity.py

ormsbee · 2025-10-28T17:39:31Z

@bradenmacdonald: I made a major shift in where I'm storing the hashes (onto the log records), but I'm still chasing down some transaction-related bugs with some very unhelpful traces.

ormsbee · 2025-10-30T14:28:04Z

Addressed all comments so far.

tox.ini

kdmccormick

two more minor comments. the models all look good to me 👍🏻

currently reviewing the publishing-cascade, side-effect-calculation, and backfill code...

kdmccormick · 2025-10-31T14:48:40Z

openedx_learning/apps/authoring/publishing/models/entity_list.py

+    @cached_property
+    def rows(self):
+        """
+        Convenience method to iterate rows.
+
+        I'd normally make this the reverse lookup name for the EntityListRow ->
+        EntityList foreign key relation, but we already have references to
+        entitylistrow_set in various places, and I thought this would be better
+        than breaking compatibility.
+        """
+        return self.entitylistrow_set.order_by("order_num")


[optional] the only external references to entitylistrow_set are in a single test file in edx-platform. Since we're still <v1.0, imo now is the time to make these little breaking changes that make the API nicer.

kdmccormick · 2025-10-31T14:57:08Z

openedx_learning/apps/authoring/publishing/models/publish_log.py

+    it almost always will), then publishing one of those Components will alter
+    the published state of the Unit, even if the UnitVersion does not change. In
+    that case, we still consider the Unit to have been "published".


Suggested change

-    it almost always will), then publishing one of those Components will alter
-    the published state of the Unit, even if the UnitVersion does not change. In
-    that case, we still consider the Unit to have been "published".
+    it almost always will), then publishing one of those Components will alter
+    the published state of the Unit, even if the UnitVersion does not change.

[optional] IMO what we "consider to have been published" is a fuzzy UX-level concept that is subject to change as the product evolves. I suggest focusing on what's literally happening in the data model, and letting the UX decide what it means for something to "have been published" from a user's perspective.

kdmccormick

(still reviewing)

kdmccormick · 2025-10-31T19:09:13Z

openedx_learning/apps/authoring/publishing/api.py

+        ]

+        # These are the Published or Draft objects where we need to repoint the
+        # log_record (publish_log_record or draft_change_log) to point to the


Suggested change

-        # log_record (publish_log_record or draft_change_log) to point to the
+        # log_record (publish_log_record or draft_change_log_record) to point to the

kdmccormick · 2025-10-31T19:16:01Z

openedx_learning/apps/authoring/publishing/api.py

+        branch_objs_to_update_with_side_effects = []
+
+        while changes_and_affected:
+            original_change, affected = changes_and_affected.pop()


I would rename this inner variable to something like change or cause, because:

- If we take "original change" to mean "a change coming directly from the change_log" (as is implied by affected_by_original_change), then I think anything at the 2nd layer of effect or lower is no longer "original".

- It's shadowing the outer loop's original_change, which is confusing. I don't think it makes the code any more complicated for the inner loop (the one that descends the layers) to have its own variable.
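For readers without the diff open, the loop under discussion has roughly the following worklist shape. This is a simplified sketch, not the code in api.py: `compute_side_effects` and the `parents` mapping are hypothetical, and it uses the suggested `cause` name for the inner variable.

```python
def compute_side_effects(changed, parents):
    """Return (cause, affected) pairs for every ancestor touched by `changed`.

    `parents` maps an entity id to the containers that depend on it.
    Hypothetical structure, for illustration only.
    """
    side_effects = []
    seen = set()
    # Seed the worklist with the first layer: direct parents of each change.
    changes_and_affected = [
        (change, parent) for change in changed for parent in parents.get(change, ())
    ]
    while changes_and_affected:
        cause, affected = changes_and_affected.pop()
        if (cause, affected) in seen:
            continue  # already recorded; avoids duplicate side effects
        seen.add((cause, affected))
        side_effects.append((cause, affected))
        # The affected entity becomes the cause for the next layer up, so it
        # is no longer an "original" change in any meaningful sense.
        changes_and_affected.extend(
            (affected, grandparent) for grandparent in parents.get(affected, ())
        )
    return side_effects


parents = {"problem": ["unit"], "unit": ["subsection"], "subsection": ["section"]}
effects = compute_side_effects(["problem"], parents)
```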

kdmccormick

I've looked at most of the code. As always, thanks for the careful docs, they've helped a lot. I'd like to give it at least one more pass early next week.

kdmccormick · 2025-10-31T19:41:23Z

openedx_learning/apps/authoring/publishing/api.py

+            # If the Draft or Published that is affected by this change is not
+            # already in the change_log, then we have to add it.
+            affected_version_pk = affected.version_id
+


Suggested change

-            # If the Draft or Published that is affected by this change is not
-            # already in the change_log, then we have to add it.
-            affected_version_pk = affected.version_id

I think this comment is already covered (better) by your comment right above defaults={ and the extra variable here adds cognitive overhead-- I would just inline affected.version_id.

Agreed. I think this was some left-over from a previous iteration where I was doing something weirder.

kdmccormick · 2025-10-31T19:42:49Z

openedx_learning/apps/authoring/publishing/api.py

+                    'old_version_id': affected_version_pk,
+                    'new_version_id': affected_version_pk


Suggested change

-                    'old_version_id': affected_version_pk,
-                    'new_version_id': affected_version_pk
+                    'old_version_id': affected.version_id,
+                    'new_version_id': affected.version_id,

(this goes with my previous suggestion)

openedx_learning/apps/authoring/publishing/api.py

kdmccormick · 2025-10-31T20:38:16Z

openedx_learning/apps/authoring/publishing/api.py

+    By default, this will also publish all dependencies (e.g. children) of the
+    Drafts that are passed in.


Suggested change

-    By default, this will also publish all dependencies (e.g. children) of the
-    Drafts that are passed in.
+    By default, this will also publish all dependencies (e.g. unpinned children) of the
+    Drafts that are passed in.

Took me a minute to realize why this function wasn't specially handling pinned children.
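A small sketch of the distinction, assuming a row either pins a specific version or follows whatever the current version is. The `Row` dataclass and `unpinned_dependencies` helper are hypothetical and much simpler than the real EntityListRow model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    entity_id: str
    pinned_version: Optional[int] = None  # None means "follow the current version"


def unpinned_dependencies(rows):
    """Only unpinned children count as dependencies of the container version.

    A pinned child always resolves to the same version, so changes to its
    draft or published state cannot alter what the container shows.
    """
    return [row.entity_id for row in rows if row.pinned_version is None]


rows = [Row("problem"), Row("video", pinned_version=4), Row("html")]
dependencies = unpinned_dependencies(rows)
```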

ormsbee · 2025-11-03T21:49:58Z

@kdmccormick: Addressed all comments except the EntityList.entitylistrow refactoring, which I made a separate ticket for.

openedx_learning/apps/authoring/publishing/api.py

Co-authored-by: Kyle McCormick <kyle@kylemccormick.me>

kdmccormick

LGTM!

I tested the backfill, tested that publish side effects percolate upwards, and tested the reverse migration.

Notes for future work, none of it critical for Ulmo AFAICT:

- We should rip out any code which checked for a container's unpublished changes the old way (walking down the tree), if there is any of that.
- I think it could be good to add more cross-links in the admin interface, in support of devs and operators who are browsing around and trying to understand how the log changes and entities are related to one another. I didn't have the time to enumerate these places.
- As we discussed earlier, I think a lot of this code could become more readable if common abstract mixins were factored out from DraftChange{Log,LogRecord,SideEffect} and Publish{Log,LogRecord,SideEffect}.

openedx_learning/apps/authoring/publishing/api.py

This pulls in publishing dependency changes from openedx/openedx-learning#369. This fixes a bug where publishing a Content Library v2 container would publish only its direct children instead of publishing all descendants. Co-authored-by: Kyle McCormick <kyle@axim.org>

This pulls in publishing dependency changes from openedx/openedx-learning#369. This fixes a bug where publishing a Content Library v2 container would publish only its direct children instead of publishing all descendants. Backports: 190a8b8. Co-authored-by: Kyle McCormick <kyle@axim.org>

ormsbee marked this pull request as draft August 25, 2025 22:42

ormsbee changed the title ~~Simpler side effects~~ Publishing dependency tracking and side-effect calculation for Draft and Published Aug 25, 2025

ormsbee changed the title ~~Publishing dependency tracking and side-effect calculation for Draft and Published~~ Publishing dependency tracking and side-effect calculation + dependency state summary for published and draft Aug 27, 2025

ormsbee force-pushed the simpler-side-effects branch from 72688d7 to 51511f6 Compare August 28, 2025 15:49

ormsbee commented Sep 2, 2025

openedx_learning/apps/authoring/publishing/api.py

ormsbee force-pushed the simpler-side-effects branch from 0b452cf to 9323705 Compare September 3, 2025 14:41

ormsbee force-pushed the simpler-side-effects branch from 9323705 to 2e9cc4c Compare September 3, 2025 14:42

temp: linter fixups

0af01c0

temp: remove redundant early return, add comments

00e6ea6

ormsbee marked this pull request as ready for review September 3, 2025 17:20

ChrisChV mentioned this pull request Sep 17, 2025

Publish section/subsection openedx/frontend-app-authoring#1977

Closed

ormsbee added 2 commits October 2, 2025 12:20

refactor: models part of the refactoring of putting the dependency data in the log models

a74d8b1

temp: transitional state to deps in logs

dee35da

kdmccormick self-requested a review October 24, 2025 14:40

kdmccormick reviewed Oct 24, 2025

openedx_learning/apps/authoring/publishing/models/publishable_entity.py (outdated)

openedx_learning/apps/authoring/publishing/models/publishable_entity.py (outdated)

bradenmacdonald mentioned this pull request Oct 28, 2025

feat: get complete entity structure api #421

Closed

navinkarkera reviewed Oct 28, 2025

openedx_learning/apps/authoring/publishing/models/publishable_entity.py (outdated)

ormsbee added 3 commits October 28, 2025 17:45

temp: mostly through refactor to putting hashes on the log records

65793f0

temp: all the tests pass, finally

9d9ec94

temp: fixups, remove some old comments

3adc0cc

fix: version constraint was overly conservative

898be1e

rodmgwgu reviewed Oct 30, 2025

tox.ini (outdated)

temp: re-enable pylint (cough)

3e5c507

dwong2708 mentioned this pull request Oct 31, 2025

Publishing a draft Container does not recursively publish nested drafts #423

Closed

kdmccormick self-requested a review October 31, 2025 14:36

kdmccormick reviewed Oct 31, 2025

kdmccormick self-requested a review October 31, 2025 15:10

kdmccormick requested changes Oct 31, 2025

This was referenced Nov 3, 2025

The whole publish dependency thing #411

Closed

Cleanup: EntityList.entitylistrow -> rows #424

Open

temp: small renaming and comment fixes from Kyle's reviews

be1d376

kdmccormick reviewed Nov 4, 2025

openedx_learning/apps/authoring/publishing/api.py (outdated)

kdmccormick reviewed Nov 4, 2025

openedx_learning/apps/authoring/publishing/api.py (outdated)

kdmccormick reviewed Nov 4, 2025

openedx_learning/apps/authoring/publishing/api.py

kdmccormick reviewed Nov 4, 2025

openedx_learning/apps/authoring/publishing/api.py (outdated)

temp: update openedx_learning/apps/authoring/publishing/api.py

6673a69

Co-authored-by: Kyle McCormick <kyle@kylemccormick.me>

kdmccormick approved these changes Nov 6, 2025

openedx_learning/apps/authoring/publishing/api.py (outdated)

temp: comment fixups based on review comments

764c98e

ormsbee merged commit 72d082c into openedx:main Nov 7, 2025
11 checks passed

ormsbee deleted the simpler-side-effects branch November 7, 2025 00:59

ormsbee mentioned this pull request Nov 7, 2025

fix: bump learning-core to 0.30.0 openedx/edx-platform#37614

Merged

kdmccormick mentioned this pull request Nov 7, 2025

fix: bump openedx-learning to 0.30.0 [ulmo] openedx/edx-platform#37615

Merged

ormsbee mentioned this pull request Nov 19, 2025

Publish does not cascade downward openedx/frontend-app-authoring#2502

Closed
