Add personal data and confidentiality indicators (fixes #58, #59) #66
deeeeeepesh wants to merge 4 commits into open-semantic-interchange:main
…d Dataset schemas Co-authored-by: deeeeeepesh <85902051+deeeeeepesh@users.noreply.github.com>
Co-authored-by: deeeeeepesh <85902051+deeeeeepesh@users.noreply.github.com>
…confidential-indicator Add data sensitivity attributes: contains_personal_data and is_confidential
# Optional: Human-readable description of the logical dataset
description: string

# Optional: Indicates if this dataset contains personal data (PII) subject to privacy regulations
Should this be extended to a classification_type rather than just an indicator of whether the data is personal or not? Or should the expression be used as a classification type? Here are some classification types that could be treated as contains_personal_data:
https://github.com/ananthdurai/schemata/blob/main/src/opencontract/v1/org/schemata/protobuf/schemata.proto#L87
Does GDPR have an official standard for data classification?
deeeeeepesh Thanks for this PR. However, a catalog working group is currently drafting a full proposal covering catalog, sensitive data classification, and governance. Let's wait for that working group's proposal so the spec can be updated cohesively. You are welcome to join the working group as well.
Summary
This PR implements two feature requests, #58 and #59.
Changes
Added two new optional boolean attributes to both Field and Dataset schemas:
- contains_personal_data
- is_confidential
Files Modified
- core-spec/osi-schema.json - JSON Schema definitions
- core-spec/spec.yaml - YAML specification
- core-spec/spec.md - Documentation with examples
Example Usage
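As a rough illustration of how the two optional boolean attributes might appear at both the dataset and field level, here is a minimal sketch. It is not the example from the PR itself: the `name` and `fields` keys, the dataset and field names, and the values are all assumptions made for illustration. Only `description`, `contains_personal_data`, and `is_confidential` are taken from the text above.

```yaml
# Hypothetical dataset entry. `name` and `fields` are assumed keys,
# not confirmed by this PR page.
name: customers
description: Customer master records
contains_personal_data: true   # optional boolean added by this PR
is_confidential: true          # optional boolean added by this PR
fields:
  - name: email
    contains_personal_data: true
    is_confidential: false
  - name: signup_channel
    # Both attributes omitted here, since they are optional.
```

Setting the flags independently at the dataset and field level is one possible reading of "both Field and Dataset schemas". How the two levels interact (for example, whether a dataset flag implies anything about its fields) is not stated in the PR text.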