Update outdated remote actors in place instead of upserting #3451

Open
pfefferle wants to merge 4 commits into trunk from fix/scheduler-update-remote-actor-in-place

@pfefferle

Member

Follow-up to #3450 (see this thread).

Proposed changes

In Scheduler::update_remote_actors(), replace Remote_Actors::upsert( $meta ) with Remote_Actors::update( $actor->ID, $meta ).

Every actor in that loop comes from Remote_Actors::get_outdated(), so the ap_actor post always exists and we already hold its $actor->ID. upsert() throws that ID away and re-resolves the post via get_by_uri( $meta['id'] ), which is a redundant lookup. If that lookup misses, for example when a remote actor migrates to a new id, upsert() falls back to create(), and a duplicate actor is inserted during what should be a plain refresh. Updating by post ID refreshes the known-outdated record in place and avoids both the redundant lookup and the duplicate.
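
To make the difference concrete, here is a minimal sketch using an in-memory stand-in for the actor store. Actor_Store and its methods are hypothetical and only mirror the behavior described above (lookup by URI, create on a miss, update by post ID). They are not the plugin's Remote_Actors implementation, and the example URLs are made up.

```php
<?php
// Hypothetical in-memory stand-in for the ap_actor post store (illustration only).
final class Actor_Store {
	/** @var array<int, array{guid: string, meta: array}> Posts keyed by post ID. */
	private array $posts = [];
	private int $next_id = 1;

	// Insert a new post whose guid is the actor id.
	public function create( array $meta ): int {
		$id                 = $this->next_id++;
		$this->posts[ $id ] = [ 'guid' => $meta['id'], 'meta' => $meta ];
		return $id;
	}

	// Resolve a post ID from an actor URI, or null when nothing matches.
	public function get_by_uri( string $uri ): ?int {
		foreach ( $this->posts as $id => $post ) {
			if ( $post['guid'] === $uri ) {
				return $id;
			}
		}
		return null;
	}

	// Upsert: re-resolve by URI and create a new post when the lookup misses.
	public function upsert( array $meta ): int {
		$id = $this->get_by_uri( $meta['id'] );
		return null === $id ? $this->create( $meta ) : $this->update( $id, $meta );
	}

	// Update: overwrite the post we already hold an ID for.
	public function update( int $id, array $meta ): int {
		$this->posts[ $id ] = [ 'guid' => $meta['id'], 'meta' => $meta ];
		return $id;
	}

	public function count(): int {
		return count( $this->posts );
	}
}

$old = [ 'id' => 'https://old.example/users/alice', 'name' => 'Alice' ];
$new = [ 'id' => 'https://new.example/users/alice', 'name' => 'Alice (moved)' ];

$via_upsert = new Actor_Store();
$via_upsert->create( $old );
$via_upsert->upsert( $new ); // Lookup by the new id misses, so a second post is created.

$via_update = new Actor_Store();
$cached_id  = $via_update->create( $old );
$via_update->update( $cached_id, $new ); // Same post ID, refreshed in place.

echo $via_upsert->count(), ' ', $via_update->count(), PHP_EOL; // prints "2 1"
```

In this sketch the upsert path ends with two posts for one actor, while the update path keeps a single post.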

Other information

  • Have you written new tests for your changes, if applicable? — test_update_remote_actors_refreshes_in_place (fails on upsert by creating a duplicate, passes on update).

Testing instructions

  • Cache a remote actor, then age its post_modified_gmt past one day so the scheduled refresh picks it up.
  • Stub the remote fetch (filter activitypub_pre_http_get_remote_object) to return metadata with a different id than the cached one.
  • Run Scheduler::update_remote_actors().
  • Confirm there is still exactly one ap_actor post (the original, refreshed in place), not a duplicate.

Changelog entry

  • Automatically create a changelog entry from the details below.
Changelog Entry Details

Significance

  • Patch

Type

  • Fixed

Message

Refresh cached remote profiles in place during scheduled updates to avoid creating duplicate copies.

In Scheduler::update_remote_actors() the actors all come from get_outdated(),
so the ap_actor post always exists and we already hold its ID. upsert()
re-resolves it via get_by_uri() (a redundant lookup) and falls back to create()
if the remote now reports a different id (e.g. a migration), inserting a
duplicate actor during what should be a plain refresh. Update by post ID instead.

Follow-up to #3450; adds a regression test that fails on upsert (duplicate) and
passes on update (refreshed in place).
Copilot AI review requested due to automatic review settings June 19, 2026 15:47
@pfefferle pfefferle self-assigned this Jun 19, 2026
@pfefferle pfefferle requested a review from a team June 19, 2026 15:47

Copilot AI left a comment

Pull request overview

This PR updates the scheduled refresh of cached remote actors so that it refreshes the already-known ap_actor post in place (by post ID) instead of calling Remote_Actors::upsert(), which can create a duplicate actor when the remote metadata’s id has changed.

Changes:

  • Switch Scheduler::update_remote_actors() from Remote_Actors::upsert( $meta ) to Remote_Actors::update( $actor->ID, $meta ) to avoid redundant lookups and prevent duplicate actor insertion.
  • Add a PHPUnit test to cover the “remote id changed” refresh scenario and ensure the cached actor is updated in place.
  • Add a changelog entry describing the fix.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
--- | ---
includes/class-scheduler.php | Refreshes outdated remote actors by updating the existing ap_actor record via post ID rather than upserting by URI.
tests/phpunit/tests/includes/class-test-scheduler.php | Adds a unit test asserting scheduled refresh updates an actor in place even if the fetched metadata reports a different id.
.github/changelog/fix-scheduler-update-remote-actor-in-place | Documents the user-facing fix in the changelogger format.

Comment thread tests/phpunit/tests/includes/class-test-scheduler.php Outdated
Assert the ap_actor count is unchanged by the refresh (no duplicate) and that
the original post is updated in place, instead of asserting exactly one actor
exists in the whole test database.

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread includes/class-scheduler.php Outdated
A routine refresh should update an actor's profile, not relocate its identity.
If the remote now reports a different (or missing) id, that is a Move or a
malformed response: applying it would rewrite the cached guid in place and could
collide with another cached actor (duplicate guid). Skip those and refresh in
place only when the fetched id matches the cached guid.

Updates the test to cover a same-identity refresh and adds one for the skipped
identity-change case.
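
The identity check described above can be sketched as a standalone predicate. This is an illustration under assumptions: is_same_identity() and normalize_actor_id() are hypothetical names, and normalize_actor_id() only trims whitespace as a stand-in for WordPress's esc_url_raw() so the snippet runs outside WordPress.

```php
<?php
// Hypothetical stand-in for esc_url_raw(); it only trims whitespace here.
function normalize_actor_id( string $url ): string {
	return trim( $url );
}

/**
 * Decide whether fetched metadata describes the same actor as the cached post.
 *
 * @param mixed  $meta        Decoded remote actor document.
 * @param string $cached_guid The guid stored on the cached ap_actor post.
 */
function is_same_identity( $meta, string $cached_guid ): bool {
	// A missing or non-string id is treated as an empty string, which will not match a real guid.
	$fetched_id = is_array( $meta ) && isset( $meta['id'] ) && is_string( $meta['id'] )
		? normalize_actor_id( $meta['id'] )
		: '';

	return $fetched_id === $cached_guid;
}

$guid = 'https://example.social/users/alice';

var_dump( is_same_identity( [ 'id' => $guid, 'name' => 'Alice' ], $guid ) );            // bool(true)
var_dump( is_same_identity( [ 'id' => 'https://other.example/users/alice' ], $guid ) ); // bool(false)
var_dump( is_same_identity( [ 'name' => 'no id' ], $guid ) );                           // bool(false)
var_dump( is_same_identity( [ 'id' => [ 'not', 'a', 'string' ] ], $guid ) );            // bool(false)
```

A caller looping over outdated actors would skip the refresh whenever this returns false and update by post ID otherwise.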

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment on lines +304 to +307
$fetched_id = isset( $meta['id'] ) && \is_string( $meta['id'] ) ? \esc_url_raw( $meta['id'] ) : '';
if ( $fetched_id !== $actor->guid ) {
	continue;
}