Update outdated remote actors in place instead of upserting#3451
Open
pfefferle wants to merge 4 commits into
Open
Update outdated remote actors in place instead of upserting#3451pfefferle wants to merge 4 commits into
pfefferle wants to merge 4 commits into
Conversation
In Scheduler::update_remote_actors() the actors all come from get_outdated(), so the ap_actor post always exists and we already hold its ID. upsert() re-resolves it via get_by_uri() (a redundant lookup) and falls back to create() if the remote now reports a different id (e.g. a migration), inserting a duplicate actor during what should be a plain refresh. Update by post ID instead. Follow-up to #3450; adds a regression test that fails on upsert (duplicate) and passes on update (refreshed in place).
There was a problem hiding this comment.
Pull request overview
This PR updates the scheduled refresh of cached remote actors so that it refreshes the already-known ap_actor post in place (by post ID) instead of calling Remote_Actors::upsert(), which can create a duplicate actor when the remote metadata’s id has changed.
Changes:
- Switch
Scheduler::update_remote_actors()fromRemote_Actors::upsert( $meta )toRemote_Actors::update( $actor->ID, $meta )to avoid redundant lookups and prevent duplicate actor insertion. - Add a PHPUnit test to cover the “remote id changed” refresh scenario and ensure the cached actor is updated in place.
- Add a changelog entry describing the fix.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
includes/class-scheduler.php |
Refreshes outdated remote actors by updating the existing ap_actor record via post ID rather than upserting by URI. |
tests/phpunit/tests/includes/class-test-scheduler.php |
Adds a unit test asserting scheduled refresh updates an actor in place even if the fetched metadata reports a different id. |
.github/changelog/fix-scheduler-update-remote-actor-in-place |
Documents the user-facing fix in the changelogger format. |
Assert the ap_actor count is unchanged by the refresh (no duplicate) and that the original post is updated in place, instead of asserting exactly one actor exists in the whole test database.
A routine refresh should update an actor's profile, not relocate its identity. If the remote now reports a different (or missing) id, that is a Move or a malformed response: applying it would rewrite the cached guid in place and could collide with another cached actor (duplicate guid). Skip those and refresh in place only when the fetched id matches the cached guid. Updates the test to cover a same-identity refresh and adds one for the skipped identity-change case.
Comment on lines
+304
to
+307
| $fetched_id = isset( $meta['id'] ) && \is_string( $meta['id'] ) ? \esc_url_raw( $meta['id'] ) : ''; | ||
| if ( $fetched_id !== $actor->guid ) { | ||
| continue; | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up to #3450 (see this thread).
Proposed changes
In
Scheduler::update_remote_actors(), replaceRemote_Actors::upsert( $meta )withRemote_Actors::update( $actor->ID, $meta ).Every actor in that loop comes from
Remote_Actors::get_outdated(), so theap_actorpost always exists and we already hold its$actor->ID.upsert()throws that ID away and re-resolves the post viaget_by_uri( $meta['id'] )(a redundant lookup), then falls back tocreate()if that lookup misses, for example when a remote actor migrates to a new id. The result is a duplicate actor inserted during what should be a plain refresh. Updating by post ID refreshes the known-outdated record in place and avoids both.Other information
test_update_remote_actors_refreshes_in_place(fails onupsertby creating a duplicate, passes onupdate).Testing instructions
post_modified_gmtpast one day so the scheduled refresh picks it up.activitypub_pre_http_get_remote_object) to return metadata with a differentidthan the cached one.Scheduler::update_remote_actors().ap_actorpost (the original, refreshed in place), not a duplicate.Changelog entry
Changelog Entry Details
Significance
Type
Message
Refresh cached remote profiles in place during scheduled updates to avoid creating duplicate copies.