Preservation: AIP Creation and Ingest into ARCHive

Roles

As of January 2026, the Digital Archivist organizes the contents into AIP folders and creates metadata.csv, and the Head of Digital Stewardship makes the AIPs and ingests them into ARCHive. The Head of Digital Stewardship already does this workflow for other departments, so this division frees up more of the Digital Archivist's time for processing and spares them from learning another set of scripts and tools.

Communication is managed through the Born-Digital Collections Tracking in Planner.

AIP Creation

Create AIPs within one month of being assigned a collection in Planner.

Preparation

Move PreservationCopy from Hub to the local machine with TeraCopy. (Temporary step until the permission issues are addressed.)
Move each AIP folder into a temporary folder, which can be named with the AIP ID. (Temporary step until the script is updated.)
Update the folder column in metadata.csv to match the temporary folders.
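
After updating metadata.csv, a quick cross-check can catch a folder that was renamed on disk but not in the spreadsheet. The following is a minimal sketch, not part of the existing scripts. The function name `compare_folders` is hypothetical, and it assumes metadata.csv is UTF-8 and has a header named "folder" in any capitalization.

```python
import csv
from pathlib import Path


def compare_folders(aips_directory, metadata_csv):
    """Return (missing, extra): folders listed in the CSV but not on disk, and the reverse."""
    with open(metadata_csv, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        # Assumption: the column header is "folder", in any capitalization.
        column = next(
            (name for name in reader.fieldnames or [] if name.strip().lower() == "folder"),
            None,
        )
        if column is None:
            raise ValueError("No folder column found in " + str(metadata_csv))
        listed = {row[column].strip() for row in reader}

    # Only directories count as AIP folders; loose files are ignored.
    on_disk = {p.name for p in Path(aips_directory).iterdir() if p.is_dir()}
    return sorted(listed - on_disk), sorted(on_disk - listed)
```

Both returned lists should be empty before running the script. Anything in the first list is in metadata.csv with no matching folder, and anything in the second is a folder with no row in metadata.csv.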

TBD: Check that AIPs do not exceed the maximum size of 100 GB and 10,000 files. See aip_prep.py for automatically splitting AIPs that are too large and have no logical subdivisions. If splitting AIPs, consult with the Digital Archivist to either update the IDs in the description or establish a naming convention that keeps the AIP ID and DIP ID related but different, such as AIP ID = DIP ID + an additional sequential number.
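
Until this check is built into the workflow, the limits could be tested with something like the sketch below. It is illustrative only and is not aip_prep.py. The function names are hypothetical, and it assumes 100 GB means 100 × 1024³ bytes and that each immediate subfolder of the directory is one AIP.

```python
from pathlib import Path

MAX_BYTES = 100 * 1024**3  # Assumption: "100 GB" interpreted as 100 GiB.
MAX_FILES = 10_000


def measure_folder(folder):
    """Return (total_bytes, file_count) for every file under folder, recursively."""
    sizes = [p.stat().st_size for p in Path(folder).rglob("*") if p.is_file()]
    return sum(sizes), len(sizes)


def oversized_aips(aips_directory, max_bytes=MAX_BYTES, max_files=MAX_FILES):
    """List (folder name, bytes, file count) for AIP folders over either limit."""
    flagged = []
    for folder in sorted(Path(aips_directory).iterdir()):
        if not folder.is_dir():
            continue
        total_bytes, file_count = measure_folder(folder)
        if total_bytes > max_bytes or file_count > max_files:
            flagged.append((folder.name, total_bytes, file_count))
    return flagged
```

An empty list means no AIP folder exceeds either limit. Any folder that is listed is a candidate for splitting.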

Script

Make a copy of aips_directory, for an easier restart in case of script errors.
Verify the configuration.py paths are correct.
Skim through metadata.csv for any errors. For example, titles occasionally contain typos or smart quotes.
Run general_aip.py.
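
Part of the metadata.csv skim can be automated. The sketch below flags any cell that contains non-ASCII characters, which includes smart quotes. It is not part of general_aip.py. The function name `find_suspect_cells` is hypothetical, and it assumes the file is saved as UTF-8.

```python
import csv


def find_suspect_cells(csv_path):
    """Return (row number, column name, value) for each cell with non-ASCII characters."""
    suspects = []
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        # Row 1 is the header, so data rows start at 2 (matches spreadsheet numbering).
        for row_number, row in enumerate(csv.DictReader(f), start=2):
            for column, value in row.items():
                # Skip missing cells and overflow values, which are not plain strings.
                if isinstance(value, str) and not value.isascii():
                    suspects.append((row_number, column, value))
    return suspects
```

Legitimate accented characters in names or titles are flagged too, so treat the results as a list to review, not a list of errors. Typos still need a manual read.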

If different versions of the same AIP are being made at the same time, they must be in different aips_directory folders, or the script fails when it tries to make the second aip-id_bag (same folder name). Those folders must also have different parent directories, or the combined-fits and preservation-xml outputs get overwritten.

Some of the script outputs are retained as part of the permanent record of the collection. The Digital Archivist is responsible for saving or deleting files at the end of processing.

Quality Control

Results are documented in aip_qc_results.txt.

Review AIP log for errors.
Zip all preservation.xml and validate in the ARCHive application, adding the collection if needed using ID and title from Planner.
Compare MD5 checksums of the accession(s) to the AIP(s) using the bag manifests. See excel-md5-compare.md.
Check a sample of 1-5 AIPs (depending on number of AIPs and level of difference) in more detail:
- Bag has MD5 and SHA manifests.
- Bag has objects and metadata folders.
- The number of files in "objects" matches the number of FITS in "metadata".
- The contents of the preservation.xml match the metadata.csv and combined FITS XML.
- Look for anything that doesn't look right.
Note PASS/FAIL for the AIP Creation portion of aip_qc_results.txt.
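
The MD5 comparison step could also be done with a script. The sketch below is a possible alternative, not the documented Excel method. It assumes both the accession and the AIP have a BagIt MD5 manifest with one "checksum path" pair per line, and it compares checksums only, since file paths differ between the two bags. The function names are hypothetical.

```python
from collections import Counter


def read_manifest(manifest_path):
    """Return a Counter of the checksums in a BagIt manifest file."""
    checksums = Counter()
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                # The checksum is the first whitespace-delimited token on the line.
                checksum = line.split(maxsplit=1)[0]
                checksums[checksum.lower()] += 1
    return checksums


def compare_manifests(accession_manifest, aip_manifest):
    """Return (missing_from_aip, only_in_aip) as Counters of checksums."""
    accession = read_manifest(accession_manifest)
    aip = read_manifest(aip_manifest)
    # Counter subtraction keeps only positive counts, so duplicate files are handled.
    return accession - aip, aip - accession
```

The first Counter should be empty. Anything in it is an accession file with no match in the AIP. The second Counter is expected to contain checksums for files that exist only in the AIP, such as the contents of the metadata folder.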

ARCHive Ingest

Add the AIPs to UGA's digital preservation system. See DCWG Teams for documentation.

Job Scheduling

Copy the zipped AIPs to the ARCHive ingest folder.
Schedule the job for ingest via the ARCHive application.
Check Planner to see whether there are access copies. If so, add a Special Access Policy right and the note "See Hub Born-digital access copies if researcher requests".

Quality Control

Review the Job Summary in the ARCHive application for errors.
Compare the preservation.xml and manifest.csv to the metadata in the ARCHive application for the AIP sample.
Check the version and audit logs in the ARCHive application for the AIP sample.
Request a copy of one AIP from the sample from ARCHive.
- Only do this once per day if creating AIPs for multiple collections, as the ARCHive automatic ingest verification is sound.
- Validate the zip MD5 or unzip and validate the bag.
- Check the copy request log for the AIP in the ARCHive application.
Note PASS/FAIL for the ARCHive Ingest portion of aip_qc_results.txt.
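
To validate the MD5 of the copy requested from ARCHive, the checksum can be recalculated and compared to the value recorded when the AIP was made. The sketch below uses only the Python standard library. The function names are hypothetical, and where the expected MD5 comes from (for example, a manifest made at AIP creation) is an assumption to confirm locally.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks so large zips fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_md5):
    """Return True if the file's MD5 equals the expected value (case-insensitive)."""
    return file_md5(path) == expected_md5.strip().lower()
```

For example, `md5_matches(copy_path, recorded_md5)` returns True when the copy is identical to the zip that was ingested, where `copy_path` and `recorded_md5` are placeholders for the downloaded file and the recorded checksum.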

Documentation and Hand Off

Wait until all AIPs have successfully ingested into ARCHive and passed QC.

Hub

Move PreservationCopy back into the Hub collection folder, which now contains all the script outputs and logs. (Temporary step until the permission issues are addressed.)

At the end of processing, the Digital Archivist will move permanent documentation to the collections folder in Digital Stewardship Teams and delete all other files from Hub.

Preservation Log

Add an event for AIP creation and ingest.

Standard text for both on the same day:

Created # AIPs with general_aip.py and ingested into ARCHive with no errors.

Standard text for different days (need to wait for ARCHive ingest to complete for final QC):

Created # AIPs with general_aip.py with no errors.
Ingested into ARCHive with no errors.

Do not include the AIP count if there are multiple accessions unless it is very quick to see how many AIPs per accession. Each accession has a separate preservation log.

If there is an error during ingest, add the count of AIPs that ingested successfully (e.g., "Ingested 7 AIPs into ARCHive with no errors.") and make a separate entry describing how the errors were addressed.

For remaking AIPs due to characters in the preservation.xml: Removed special characters from preservation.xml from AIP ID LIST. Validated bag with bagit.py prior to editing, updated bag manifest with update_bag.py, tarred and zipped with 7Zip, and calculated md5 for the manifest with md5deep64. Ingested into ARCHive with no errors.

Born-Digital Collections Tracking Planner

This is how the Digital Archivist will be notified that preservation is complete.

In the checklist, mark "Preservation: AIP creation and ingest" as complete.
Add a comment if anything unusual happened or give feedback on the process.
Change the person assigned to the Digital Archivist, which generates an email notification.
Move the collection to the Finding Aid column.
