Extended Attributes at variable level: `processing_level`, `comment`, `creator_name`, `project`, `date_modified`, `date_metadata_modified`

netCDF can hold far more information than exchange files can. With the CCHDO documentation metadata extraction project under way, we will eventually need a place to put that metadata. For a while now, I have wanted to store the information contained in the "Bob Headers" in a more structured way. The following ACDD attributes, when pushed down from the global to the variable level, should enable the creation of "Bob Headers": `processing_level`, `comment`, `creator_name`, `project`, `date_modified`, `date_metadata_modified`. Each is examined further below:

* `processing_level`
  In ACDD the processing level is a freeform string. We should use it to indicate the following statuses, which very roughly correspond to the satellite community's L0 through L4 processing levels:
  * collected - water was sampled but the data have not been received
  * raw - used for CTD data but not for discrete samples
  * preliminary - data is in the file but may not have had final calibration applied
  * final - data for which no further updates are expected
  * product - we probably won't use this, but it is included since that is what L4 tends to be

  We should look for an existing controlled vocabulary for these.
* `comment`
   Free-text notes; these are usually very short for each parameter. This is the "notes" part of the Bob Headers.
* `creator_name`
   This is the PI for the parameter in question. We should use an array of strings for multiple PIs in our at-rest data files. This is the "who" part of the Bob Headers. There is also a `creator_url` attribute in which we might consider storing ORCiDs.
* `project`
   We need a way to tie together multiple variables that share the same PI/status, e.g. nutrients are usually 3~5 variables. In the ACDD docs, a program (GO-SHIP) is made up of multiple projects (Total Carbon, pH, Nutrients, CTD, etc.). Variables that have the same project value would be grouped into the "includes" list in the Bob Headers; their `comment` and `creator_name` would need to be the same to avoid ambiguity.
   There is probably no single controlled vocabulary for these project names, and they would likely benefit from some coordination with GO-SHIP.
* `date_modified`
  If the data itself is changed, this would be updated to the date the change was made in the data file. The `merge_fq` accessor already updates this.
* `date_metadata_modified`
   If only the metadata were modified, this attribute would be updated to the date the change was made. The `merge_fq` accessor already updates this if the print format is different.
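
As a rough illustration of the `project` grouping described above, the sketch below collects variables that share a `project` value into one record with "who", "notes", and "includes" fields. The variable names, attribute values, the plain-dict representation of attributes, and the record layout are all assumptions made for the example; the actual Bob Header format is not reproduced here.

```python
# Hypothetical per-variable attributes, as they might be read from a file.
variables = {
    "nitrate": {
        "project": "Nutrients",
        "creator_name": ["A. Example"],
        "comment": "Calibrated against reference material",
    },
    "phosphate": {
        "project": "Nutrients",
        "creator_name": ["A. Example"],
        "comment": "Calibrated against reference material",
    },
    "ph_total": {
        "project": "pH",
        "creator_name": "B. Example",
        "comment": "Shipboard values",
    },
}


def as_tuple(value):
    """Normalize a single string or a sequence of strings to a tuple."""
    if isinstance(value, str):
        return (value,)
    return tuple(value)


def group_by_project(variables):
    """Group variable names by `project`; reject ambiguous groups."""
    groups = {}
    for name, attrs in variables.items():
        project = attrs.get("project")
        if project is None:
            continue  # variables without a project are left out
        who = as_tuple(attrs.get("creator_name", ()))
        notes = attrs.get("comment", "")
        entry = groups.setdefault(
            project, {"who": who, "notes": notes, "includes": []}
        )
        # All variables in a project must agree on creator_name and comment.
        if entry["who"] != who or entry["notes"] != notes:
            raise ValueError(f"ambiguous metadata in project {project!r}: {name}")
        entry["includes"].append(name)
    return groups


headers = group_by_project(variables)
```

Raising an error on mismatched `comment` or `creator_name` is one possible way to enforce the "must be the same" requirement; a validator that only warns would be an equally reasonable choice.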

The only non-standard ACDD usages above are placing the attributes at the variable level rather than the global level, and the possible use of arrays of strings. We could define combining rules that put all this information into global attributes that fully conform to ACDD, but this would likely be one-way (update the globals from the variables, not the other way around). For example, the global `date_modified` would be set to the most recent date seen across all the variables that have `date_modified`.
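
The `date_modified` combining rule could look something like the following minimal sketch. It assumes attributes are available as plain dicts and that dates are stored as ISO 8601 strings with an explicit UTC offset; the function name and the sample values are hypothetical.

```python
from datetime import datetime


def combine_date_modified(variables):
    """Return the most recent per-variable date_modified, or None.

    This is one-way: the global value is derived from the variables,
    never the other way around.
    """
    dates = [
        datetime.fromisoformat(attrs["date_modified"])
        for attrs in variables.values()
        if "date_modified" in attrs
    ]
    if not dates:
        return None
    return max(dates).isoformat()


# Hypothetical per-variable attributes.
example = {
    "oxygen": {"date_modified": "2021-03-04T00:00:00+00:00"},
    "nitrate": {"date_modified": "2023-06-15T00:00:00+00:00"},
    "pressure": {},  # no date_modified, so it is ignored
}

global_date_modified = combine_date_modified(example)
```

Similar rules would be needed for the other attributes, and those are less obvious (for instance, how to merge differing `processing_level` values), so they are not sketched here.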

Things this might make possible:
* Getting a list of files updated since (or even between/before) a certain date could be done at the per-variable level by examining the `date_modified` attribute. We could even exclude simple metadata updates that didn't change the values used in science.
* Finding all the preliminary data, or excluding preliminary data from a result set.
* Know who has not turned in their data yet by examining the `processing_level` attribute for "collected" and the `creator_name` attribute. This can also be done for bottle data with flag 1.
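
The queries listed above might be sketched as follows, again assuming per-variable attributes held in plain dicts, ISO 8601 date strings with a UTC offset, and the `processing_level` values proposed earlier. All function names and sample data are hypothetical.

```python
from datetime import datetime


def modified_since(variables, since):
    """Names of variables whose data (not just metadata) changed on or after `since`."""
    cutoff = datetime.fromisoformat(since)
    return [
        name
        for name, attrs in variables.items()
        if "date_modified" in attrs
        and datetime.fromisoformat(attrs["date_modified"]) >= cutoff
    ]


def with_level(variables, level):
    """Names of variables at the given processing_level."""
    return [
        name
        for name, attrs in variables.items()
        if attrs.get("processing_level") == level
    ]


def outstanding_by_creator(variables):
    """Map each creator to the variables still marked as 'collected'."""
    pending = {}
    for name in with_level(variables, "collected"):
        creators = variables[name].get("creator_name", [])
        if isinstance(creators, str):
            creators = [creators]
        for creator in creators:
            pending.setdefault(creator, []).append(name)
    return pending


# Hypothetical per-variable attributes.
variables = {
    "oxygen": {
        "processing_level": "final",
        "creator_name": "A. Example",
        "date_modified": "2023-06-15T00:00:00+00:00",
    },
    "nitrate": {
        "processing_level": "preliminary",
        "creator_name": ["B. Example"],
        "date_modified": "2021-03-04T00:00:00+00:00",
        "date_metadata_modified": "2024-01-10T00:00:00+00:00",
    },
    "cfc_11": {
        "processing_level": "collected",
        "creator_name": ["C. Example", "D. Example"],
    },
}

recent = modified_since(variables, "2023-01-01T00:00:00+00:00")
preliminary = with_level(variables, "preliminary")
pending = outstanding_by_creator(variables)
```

Because `modified_since` only consults `date_modified`, the later `date_metadata_modified` on `nitrate` does not cause it to be reported as updated.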
