For export processes that use the "All data added or updated since last export" filter, need to modify what's considered the "last export" #12

@jthomale

In a production environment, scheduled export processes that run regularly to sync data from Sierra to the API need to ensure that they capture all changes made since the last time the sync process ran. If they don't, you end up with records in the API that are out of sync with Sierra.

The "all data added or updated since last export" filter is supposed to ensure this happens correctly. Right now, it looks up the datetime at which the most recent export job of the same export type ran, and it uses that as the start of a date range query to pull all records that have been updated since then.

The problem is, if you manually run an export with a different filter (one that targets a specific record range, for instance), then the next scheduled process that runs uses that job as the "last export." This means anything added or updated between the previous scheduled job and the manual job is lost.

Example:

  • 2:00 PM: Scheduled ItemsToSolr sync runs. The API is up-to-date with Sierra when it finishes.
  • 3:30 PM: I manually trigger an ItemsToSolr job to load just one specific record.
  • 4:00 PM: The next scheduled ItemsToSolr sync runs. Instead of grabbing everything that was updated from 2:00 PM to 4:00 PM, it grabs everything that was updated from 3:30 PM to 4:00 PM (because the 3:30 PM job becomes the last export). Records updated between 2:00 PM and 3:30 PM never get synced.
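
The timeline above can be reproduced with a small, self-contained Python sketch. It is not the project's actual code: the `ExportJob` class, its field names, the filter labels (`"last_export"`, `"record_range"`), the user names, and the date are all hypothetical stand-ins chosen for illustration. It only models the lookup as described here, where the "last export" is the newest job of the same export type.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExportJob:
    """Hypothetical stand-in for a stored export job record."""
    export_type: str
    filter_type: str
    user: str
    ran_at: datetime


def last_export_current(jobs, export_type):
    """Newest job of the same export type, ignoring filter and user."""
    same_type = [j for j in jobs if j.export_type == export_type]
    return max(same_type, key=lambda j: j.ran_at, default=None)


day = (2024, 1, 1)  # arbitrary date, for illustration only
jobs = [
    # 2:00 PM scheduled sync
    ExportJob("ItemsToSolr", "last_export", "scheduler", datetime(*day, 14, 0)),
    # 3:30 PM manual load of one specific record
    ExportJob("ItemsToSolr", "record_range", "manual_user", datetime(*day, 15, 30)),
]

# The 4:00 PM scheduled sync computes its date range.
window_start = last_export_current(jobs, "ItemsToSolr").ran_at
window_end = datetime(*day, 16, 0)

# A record updated at 2:45 PM falls outside the 3:30 PM to 4:00 PM window.
updated_at = datetime(*day, 14, 45)
missed = not (window_start <= updated_at <= window_end)
print(window_start.time(), missed)  # 15:30:00 True
```

The window starts at the manual job's run time, so the 2:45 PM update is never picked up.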

There are a couple of potential solutions.

  1. Maybe the "last export" should be the last export of the same type that used the same filter type.
  2. Maybe the "last export" should be the last export of the same type triggered by the same user.
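
As a rough illustration of how the two options differ from the current lookup, here is a minimal Python sketch. It assumes a hypothetical `ExportJob` record with `filter_type` and `user` fields and made-up filter and user names; the real export job model may look different.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExportJob:
    """Hypothetical stand-in for a stored export job record."""
    export_type: str
    filter_type: str
    user: str
    ran_at: datetime


def last_export(jobs, export_type, *, filter_type=None, user=None):
    """Newest job of export_type, optionally narrowed by filter type or user."""
    candidates = [
        j for j in jobs
        if j.export_type == export_type
        and (filter_type is None or j.filter_type == filter_type)
        and (user is None or j.user == user)
    ]
    return max(candidates, key=lambda j: j.ran_at, default=None)


day = (2024, 1, 1)  # arbitrary date, for illustration only
jobs = [
    ExportJob("ItemsToSolr", "last_export", "scheduler", datetime(*day, 14, 0)),
    ExportJob("ItemsToSolr", "record_range", "manual_user", datetime(*day, 15, 30)),
]

# Current behavior: same export type only, so the manual job wins.
current = last_export(jobs, "ItemsToSolr")
# Option 1: same export type and same filter type.
by_filter = last_export(jobs, "ItemsToSolr", filter_type="last_export")
# Option 2: same export type and same triggering user.
by_user = last_export(jobs, "ItemsToSolr", user="scheduler")

print(current.ran_at.time())    # 15:30:00
print(by_filter.ran_at.time())  # 14:00:00
print(by_user.ran_at.time())    # 14:00:00
```

With this sample data both options resolve the "last export" to the 2:00 PM scheduled job, so the next date range would start at 2:00 PM rather than 3:30 PM.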

Fix 1 makes more logical sense, but in some cases it's useful to be able to test scheduled exports (in a development environment) without loading the entire database. In that case, you can just run an export manually and then start the scheduled exports, which then pick up from the manual run. So fix 2 may be the way to go, as long as you set up one user for (most) scheduled exports and another for manual exports. (And, of course, set the EXPORTER_AUTOMATED_USERNAME setting appropriately.)
