Describe mechanism for automated retention management #157
base: main
Conversation
Force-pushed 5bcdf4a to 06aeb82
samdbmg left a comment:
LGTM - one thought inlined about using tags, but it's just more reasoning; I agree with the conclusions.
Could this application note also provide clarification on media objects that were allocated but not used in any flow segments? The spec so far just says that implementations have to handle this, but there is no guidance on how long a client can expect the objects to be valid for. Would it be possible, e.g., for the implementation to define a latest time point by which flow segments must have been registered for allocated objects?
I shall have a think. But we've avoided being specific on this in the past as it's hard for us to make universally useful recommendations. Different workflows and deployments will have different requirements in this space. A large organisation-wide installation may see significant cost impacts from retaining unused content longer than necessary, but a transfer over a poor-quality connection may require a larger grace period. I think the desire to have well-defined expectations on the part of writing clients is reasonable. But my fear is that whatever number/mechanism we choose would be wrong in a large number of cases, or at least would have quite real and significant consequences for organisations using TAMS. I feel the best we can do here is call out that this is something implementations should consider, and leave it to them to make an informed decision on the best approach based on their own needs and those of their customers.
Over and above the retention of Flow Segments and the objects they reference, a thing that worries me slightly is "orphaned" objects - chunks of media which have been stored in the storage back-end (probably S3 objects) but have never been registered in a segment on a flow. For S3, implementations could use S3 object-create event notifications to register the object into a database table, with registering the object against a segment removing it from that table. Objects in the table older than a defined age are deleted. I wonder whether there is guidance anywhere for keeping a clean and orphan-free storage layer?
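A minimal sketch of the pattern described above, assuming an AWS Lambda handler subscribed to S3 `ObjectCreated` events and a hypothetical DynamoDB table named `pending_objects` (all names and the grace period are illustrative, not part of the TAMS spec):

```python
import time
import boto3

# Hypothetical table tracking objects uploaded but not yet registered
# against a Flow Segment.
dynamodb = boto3.resource("dynamodb")
pending = dynamodb.Table("pending_objects")
s3 = boto3.client("s3")

ORPHAN_AGE_SECONDS = 24 * 60 * 60  # deployment-specific grace period


def on_object_created(event, context):
    """Lambda handler for S3 ObjectCreated notifications: record the object."""
    for record in event["Records"]:
        pending.put_item(Item={
            "object_key": record["s3"]["object"]["key"],
            "bucket": record["s3"]["bucket"]["name"],
            "created_at": int(time.time()),
        })


def on_segment_registered(object_key):
    """Called from the TAMS API path when a segment references the object."""
    pending.delete_item(Key={"object_key": object_key})


def sweep_orphans():
    """Periodic job: delete objects never registered within the grace period."""
    cutoff = int(time.time()) - ORPHAN_AGE_SECONDS
    scan = pending.scan()  # fine for a sketch; use an index or TTL in practice
    for item in scan.get("Items", []):
        if item["created_at"] < cutoff:
            s3.delete_object(Bucket=item["bucket"], Key=item["object_key"])
            pending.delete_item(Key={"object_key": item["object_key"]})
```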
I think that's similar to the comment above. I guess the bit worth adding is that those objects have to be tracked by the TAMS service from the point at which they are created via the API anyway. We require that the first registration of an object against a Segment MUST be against the Flow the storage was allocated against. This is a requirement particularly aimed at implementations that support fine-grained auth, to make sure permissions can be derived at all points in the lifecycle, and to avoid any weird edge cases where storage is assigned against one flow and registered against another, or where a malicious actor might "steal" an object and register it against a flow they have permissions for between media being uploaded and it being associated with the legitimate Flow. But it's also to facilitate this sort of handling of objects which are never registered against segments. As I say above, I think there's a bunch of reasons we can only go so far with recommendations in this space. But perhaps we need to make the current language clearer.
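As an illustration only (not spec text), the first-registration rule amounts to a check like this on the segment-registration path, with all names hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectRecord:
    """Hypothetical server-side record of an allocated media object."""
    object_id: str
    allocated_flow_id: str
    first_registered_flow: Optional[str] = None


class RegistrationError(Exception):
    pass


def register_first_segment(record: ObjectRecord, flow_id: str) -> None:
    """Enforce that an object's first Segment registration is against the
    Flow its storage was allocated for."""
    if record.first_registered_flow is None:
        if record.allocated_flow_id != flow_id:
            # Blocks the "stolen object" edge case: an object uploaded for
            # one Flow cannot first be registered against a different Flow.
            raise RegistrationError(
                "first registration must be against the allocating Flow"
            )
        record.first_registered_flow = flow_id
    # ...continue with normal Flow Segment registration...
```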
As mentioned on Friday, I think we need to have a better understanding of what is going on at the Source level here - whether anything needs to be signalled, and what queries can be made to discover how long a particular source will stick around for. I think users will be most interested in understanding and managing retention on a source rather than on individual flows. We also need to understand any complexities around flows that have segments stored at the multi- and mono-essence layers, and where flows end up being collected by multiple multi-essence sources, ensuring the underlying flows are not deleted before any user expects.
I think it is reasonable to have different deadlines based on the usage, but I would strongly suggest having a consistent way to signal this expectation to clients, especially when the client and service implementation come from different vendors - e.g. an optional field signalling the deadline/timeout for registration in the object allocation response. This would then guarantee to the client that presigned URLs and uploaded objects are valid for a certain amount of time.
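For example, if the allocation response carried an optional expiry field, a client could plan its work against it. This is a sketch under the assumption of an `expires` ISO 8601 field, which is not currently in the spec:

```python
from datetime import datetime, timezone


def registration_deadline(allocation: dict):
    """Return the registration deadline from an allocation response, if the
    (hypothetical) 'expires' field is present; otherwise None."""
    expires = allocation.get("expires")
    # Assumes an ISO 8601 timestamp; exact format would need specifying.
    return datetime.fromisoformat(expires) if expires else None


def should_reallocate(allocation: dict, safety_margin_s: float = 30) -> bool:
    """True if the client should request fresh URLs rather than risk
    uploading/registering past the signalled deadline."""
    deadline = registration_deadline(allocation)
    if deadline is None:
        return False  # no signal: fall back to spec/deployment defaults
    remaining = (deadline - datetime.now(timezone.utc)).total_seconds()
    return remaining < safety_margin_s
```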
Spent a little while talking about this with @j616 this morning, and I think he's going to try and capture some of the options somewhere.

For writing, I find myself wondering whether it's enough to stipulate a minimum validity time (say, 5 min?) and expect clients to request URLs, upload objects and register them in a timely manner to meet that deadline. As in, "you may complete upload and registration any time within 5 min of making this request: beyond that it may fail". Beyond that time, the upload URL will expire, and any object uploaded to it but not registered should probably be deleted (e.g. using the same reference counting/garbage collection as deleting the last Flow Segment that references an object).

I agree on the value of being flexible in general, but in this case part of me would rather not make this a config option, because then writing clients have to include logic to handle different values. Instead, I think clients should be requesting a new page of URLs frequently enough to never run into that deadline (but calling out what the deadline is could be useful).

I suspect read is a different case: depending on your reader you might need much more time. That one probably deserves more thought: should it also be fixed? Should it be a request parameter? Or a rubric like "2x the duration of the page"?
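A sketch of the writing-client behaviour suggested above, assuming a fixed validity window (the 5 minutes is illustrative) and hypothetical `request_upload_urls`/`upload`/`register` helpers:

```python
import time

VALIDITY_WINDOW_S = 5 * 60  # illustrative fixed minimum, not a spec value


def write_segments(segments, request_upload_urls, upload, register):
    """Upload and register segments, requesting a fresh page of presigned
    URLs whenever the current page is close to its validity deadline."""
    urls, issued_at = request_upload_urls(), time.monotonic()
    for segment in segments:
        # Refresh early rather than risk an expired URL or failed
        # registration near the deadline.
        if not urls or time.monotonic() - issued_at > VALIDITY_WINDOW_S * 0.8:
            urls, issued_at = request_upload_urls(), time.monotonic()
        url = urls.pop(0)
        upload(url, segment)
        register(segment, url)
```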
Force-pushed c63e409 to 7fe2a37
I've made a first pass at ADR options for signalling garbage collection and presigned URL timeouts in 7fe2a37. I haven't selected any options yet. I personally favour 8a and 9a for the reasons outlined in the document, but I'd appreciate feedback.
Details
This PR includes an ADR that presents multiple options for the signalling and implementation of retention management, alongside the relevant tags and an Application Note for the chosen options. It also adds the tag currently used by the AWS store implementation to the listing.
Jira Issue (if relevant)
Jira URL: https://jira.dev.bbc.co.uk/browse/CLOUDFIT-5483
Related PRs
Where appropriate, indicate the order in which they should be merged.
Submitter PR Checks
(tick as appropriate)
Reviewer PR Checks
(tick as appropriate)
Info on PRs
The checks above are guidelines. They don't all have to be ticked, but they should all have been considered.