17 changes: 6 additions & 11 deletions src/mixs/schema/mixs.yaml
@@ -3703,13 +3703,14 @@ slots:
title: Hazard Analysis Critical Control Points (HACCP) guide food safety term
examples:
- - value: tetrodotoxic poisoning [FOODON:03530249]
+ - value: tetrodotoxic poisoning[FOODON:03530249]; neurotoxic shellfish poisoning[FOODON:03530246]
Copilot AI Mar 11, 2026

The new multi-value example uses ; as a delimiter and omits the space before [ (e.g., poisoning[FOODON:...]). In this schema, multi-term free-text fields typically document/illustrate pipe-separated values (|), e.g. dietary_claim_use description (around mixs.yaml:5761-5763) and animal_feed_equip example ...|... (around mixs.yaml:4216). Consider updating this example to use | and the canonical termLabel [TERM:ID] spacing, or explicitly document that ; and optional spacing are intended for HACCP_term.
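A minimal sketch of the check this comment implies, assuming a multi-term value is split on the pipe and each item is then tested against the generic `termLabel [TERM:ID]` pattern that this PR removes. The helper name `invalid_items` is hypothetical, and whether the schema tooling splits values this way is an assumption.

```python
import re

# Generic "termLabel [TERM:ID]" pattern, copied from the removed `pattern:` line.
TERM = re.compile(r"^([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[[a-zA-Z]{2,}:[a-zA-Z0-9]\d+\]$")


def invalid_items(value: str, sep: str = "|") -> list[str]:
    """Return the items of a delimited value that fail the term pattern."""
    # Hypothetical helper: no whitespace trimming, so "a [X:01] | b [X:02]" would fail.
    return [item for item in value.split(sep) if not TERM.match(item)]


piped = "tetrodotoxic poisoning [FOODON:03530249]|neurotoxic shellfish poisoning [FOODON:03530246]"
pr_example = "tetrodotoxic poisoning[FOODON:03530249]; neurotoxic shellfish poisoning[FOODON:03530246]"

print(invalid_items(piped))       # every item passes, so the list is empty
print(invalid_items(pr_example))  # no pipe and no space before "[", so the whole value fails
```

Under these assumptions the pipe-separated form with a space before the bracket validates item by item, while the example added in this PR does not.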

keywords:
- food
- term
slot_uri: MIXS:0001215
multivalued: true
range: string
- pattern: ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+) \[[a-zA-Z]{2,}:[a-zA-Z0-9]\d+\]$
+ pattern: ^.+\s*\[FOODON:\d+\](;\s*\[FOODON:\d+\])*$
Collaborator Author

This regex requires FOODON ontology. Is this what we want?

Member

Thanks! Please use the pattern ^(\S[^\r\n]*) [FOODON:\d{7,8}]$ instead of ^.+\s*\[FOODON:\d+\](;\s*\[FOODON:\d+\])*$ or see my notes on dynamic enumerations.

Member

Did you intentionally remove the white-space between the label and the term id? I don't think that's consistent with other ontology term patterns in MIxS

Collaborator Author

@mslarae13 mslarae13 Jul 9, 2024

@turbomam
For the white space, do you mean whether it should be "lead poisoning [FOODON:03530243]" vs "lead poisoning[FOODON:03530243]"?

So, the white space is supposed to be there? I thought I had it set to be valid with or without it... does it matter? If so, I'll make sure I correct it. Just tell me which is correct.

Looking at the submission schema the white space should be there. So I can make that update to the regex.

Collaborator Author

@mslarae13 mslarae13 Jul 9, 2024

From your comment here : #802 (comment)

^(\S[^\r\n]*) [FOODON:\d{7,8}]$

If we want to use pattern-only validation, I suggest we go with that.

That regex ^(\S[^\r\n]*) [FOODON:\d{7,8}]$
is showing me that "lead poisoning [FOODON:03530243]" is invalid... :(

... are you sure that's right?? Or am I missing something about the formatting of the value for "lead poisoning [FOODON:03530243]" ?

I think it needs to be ^.+\s*\[FOODON:\d{7,8}]$

(https://www.ebi.ac.uk/ols4/ontologies/foodon/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FFOODON_03530221)

Collaborator Author

Screenshot 2024-07-09 at 4 53 27 PM Screenshot 2024-07-09 at 4 53 40 PM

Collaborator Author

@mslarae13 mslarae13 Dec 3, 2024

Decision: the regex in the 2nd image is good.

I'll test this and confirm then finish this PR.

Discussed 12/03:
pattern vs structured_pattern: we already have this for some of the more generic term labels and term IDs.
Look for the "settings" section in the schema.

Member

@turbomam turbomam Dec 3, 2024

I forgot to escape the square brackets around FOODON with backslashes \[F... etc
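A quick check of why the unescaped version rejected the example. This is a sketch that assumes Python's `re` module as the engine; the thread does not say which regex tester produced the screenshots. Without backslashes, `[FOODON:\d{7,8}]` is read as a character class that matches a single character, not the literal bracketed ID.

```python
import re

value = "lead poisoning [FOODON:03530243]"

# As first posted: "[FOODON:\d{7,8}]" is a character class (one of F, O, D, N, ":",
# a digit, "{", "7", ",", "8", "}"), so the value must end in a space plus one such character.
unescaped = re.compile(r"^(\S[^\r\n]*) [FOODON:\d{7,8}]$")

# With the square brackets escaped, they match literally around the FOODON ID.
escaped = re.compile(r"^(\S[^\r\n]*) \[FOODON:\d{7,8}\]$")

unescaped_matches = unescaped.match(value) is not None
escaped_matches = escaped.match(value) is not None
odd_match = unescaped.match("lead poisoning 7") is not None

print(unescaped_matches)  # False: the value ends in "]", which is not in the class
print(escaped_matches)    # True
print(odd_match)          # True: a label followed by " 7" satisfies the unescaped form
```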

Copilot AI Mar 11, 2026

This regex is effectively satisfied by any string that ends with a bracketed FOODON ID, even if earlier semicolon-separated terms don't have their own IDs (e.g., term1; term2 [FOODON:...] would match). If the intent is to validate a list of HACCP terms, each entry should include its own termLabel [FOODON:digits] (and if you keep ; as the separator, the repeating group should include the label as well). Tightening the pattern to require the termLabel + [FOODON:id] structure per item will prevent partially-specified multi-term values from being accepted.

Suggested change
- pattern: ^.+\s*\[FOODON:\d+\](;\s*\[FOODON:\d+\])*$
+ pattern: ^[^;]+\s*\[FOODON:\d+\](;\s*[^;]+\s*\[FOODON:\d+\])*$
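The difference between the two patterns in this suggested change can be checked directly. The sketch below assumes Python's `re` module. The variable names and the `term1; term2` input are illustrative, the latter adapted from the comment's own example with a real-looking ID filled in.

```python
import re

# Pattern from the diff: ".+" can span semicolons, so only the final item needs an ID.
loose = re.compile(r"^.+\s*\[FOODON:\d+\](;\s*\[FOODON:\d+\])*$")

# Pattern from the suggested change: "[^;]+" cannot cross a semicolon,
# so each ;-separated item must end with its own [FOODON:digits].
strict = re.compile(r"^[^;]+\s*\[FOODON:\d+\](;\s*[^;]+\s*\[FOODON:\d+\])*$")

partial = "term1; term2 [FOODON:03530249]"
complete = (
    "tetrodotoxic poisoning [FOODON:03530249]; "
    "neurotoxic shellfish poisoning [FOODON:03530246]"
)

for name, pattern in (("loose", loose), ("strict", strict)):
    print(name, bool(pattern.match(partial)), bool(pattern.match(complete)))
# loose accepts both values; strict accepts only the complete one
```

Note that both patterns still treat the space before `[` as optional (`\s*`), so neither enforces the `termLabel [TERM:ID]` spacing discussed earlier in the thread.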

structured_pattern:
syntax: ^{termLabel} \[{termID}\]$
interpolated: true
@@ -3741,8 +3742,7 @@ slots:
abs_air_humidity:
annotations:
Preferred_unit: gram per gram, kilogram per kilogram, kilogram, pound
- description: Actual mass of water vapor - mh20 - present in the air water vapor
- mixture
+ description: Actual mass of water vapor - mh20 - present in the air water vapor mixture
title: absolute air humidity
examples:
- value: 9 gram per gram
@@ -3793,8 +3793,7 @@ slots:
slot_uri: MIXS:0001009
required: true
additional_info:
- description: Information that doesn't fit anywhere else. Can also be used to propose
- new entries for fields with controlled vocabulary
+ description: Information that doesn't fit anywhere else. Can also be used to propose new entries for fields with controlled vocabulary
title: additional info
keywords:
- information
@@ -3808,8 +3807,7 @@ slots:
string_serialization: '{integer}{text}'
slot_uri: MIXS:0000218
adj_room:
- description: List of rooms (room number, room name) immediately adjacent to the
- sampling room
+ description: List of rooms (room number, room name) immediately adjacent to the sampling room
title: adjacent rooms
keywords:
- adjacent
@@ -3824,10 +3822,7 @@ slots:
adjacent_environment:
annotations:
Expected_value: ENVO_01001110 or ENVO_00000070
- description: Description of the environmental system or features that are adjacent
- to the sampling site. This field accepts terms under ecosystem (http://purl.obolibrary.org/obo/ENVO_01001110)
- and human construction (http://purl.obolibrary.org/obo/ENVO_00000070). Multiple
- terms can be separated by pipes
+ description: Description of the environmental system or features that are adjacent to the sampling site. This field accepts terms under ecosystem (http://purl.obolibrary.org/obo/ENVO_01001110) and human construction (http://purl.obolibrary.org/obo/ENVO_00000070). Multiple terms can be separated by pipes
title: environment adjacent to site
examples:
- value: estuarine biome [ENVO:01000020]