New delimited lengthKind behavior for delimited by following dfdl:initiator #47

@mbeckerle

We've seen this so many times now that it really must be considered for DFDL v2.0 as a new feature.

<sequence dfdl:separator="%SP; %NL;">
  ....
  <element dfdl:initiator="A)%WSP+; A)" name="location" type="notam:nzString" dfdl:lengthKind="pattern"
           dfdl:lengthPattern="[A-Z0-9 ]{1,69}(?=[ \r\n]{1,5}B\))"/>
  <element dfdl:initiator="B)%WSP+; B)" name="startOfActivity" type="notam:effectiveDateTime"/>
  ....
</sequence>

In the above, the location element can contain the separator followed by a prefix of the next element's initiator, i.e. " B" for startOfActivity. It just can't contain the whole separator plus initiator, " B)". Using lengthKind 'pattern' here, together with a custom nzString (not zero length string) type, is a hack to express that the location element is delimited by finding what must be the initiator of the next element (after the separator).

This works in simple cases like this one, because what comes next is easy to discern. It gets harder if, instead of the startOfActivity element, we had a choice with many child elements, each with its own initiator. The regex for the pattern then becomes highly complex, hard to maintain, and hard to test.
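
To illustrate how the workaround scales, here is a sketch of the same location element followed by a choice instead of a single element. The endOfActivity and schedule branches and their "C)" and "D)" initiators are assumptions made up for this illustration, not taken from a real schema. The point is that every initiator in the choice has to be repeated inside the lookahead of dfdl:lengthPattern, so the two lists must be kept in sync by hand.

```xml
<sequence dfdl:separator="%SP; %NL;">
  ....
  <!-- The lookahead must list every initiator that can follow: B), C), or D). -->
  <element dfdl:initiator="A)%WSP+; A)" name="location" type="notam:nzString" dfdl:lengthKind="pattern"
           dfdl:lengthPattern="[A-Z0-9 ]{1,69}(?=[ \r\n]{1,5}(?:B|C|D)\))"/>
  <choice>
    <element dfdl:initiator="B)%WSP+; B)" name="startOfActivity" type="notam:effectiveDateTime"/>
    <!-- Hypothetical branches, assumed for illustration only. -->
    <element dfdl:initiator="C)%WSP+; C)" name="endOfActivity" type="notam:effectiveDateTime"/>
    <element dfdl:initiator="D)%WSP+; D)" name="schedule" type="notam:nzString"/>
  </choice>
  ....
</sequence>
```

Adding or renaming a branch of the choice means editing the regex as well, and initiators with optional whitespace or multiple alternatives make the lookahead longer still.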

So the request here is for a way to specify that the element is delimited, but that what terminates it is finding (but not consuming) the initiator of whatever follows (or, presumably, end-of-data).

Labels: DFDL 2.0 (for issues associated with DFDL v2.0, the next major revision); enhancement (new feature or request)
