Skip to content

Modifying segment parsing to handle repetition with elements.#57

Open
jkamal000 wants to merge 1 commit intotastypackets:mainfrom
jkamal000:segment-repetition
Open

Modifying segment parsing to handle repetition with elements.#57
jkamal000 wants to merge 1 commit intotastypackets:mainfrom
jkamal000:segment-repetition

Conversation

@jkamal000
Copy link
Copy Markdown

Original Issue

The parsing of individual segments did not handle cases where the elements were repeated. (See example below.)
Under the current implementation in the event that an element is repeated the element is not fully parsed. The repetition delimiters are still left in the element. Additionally if the repeated element has components then parsing incorrectly splits by the component delimiter before splitting out each of the repeated elements. Please see the example below.

Example

The following is an example of the X12 RAS segment based on documentation for X12 Release 8010. More information can be found at the stedi documentation page.

Based on the documentation the RAS-03 element can repeat up to 15 times and the element itself is a composite with multiple components.

The segment is as follows:

RAS*34.25*1234567890*12345:0^67890:1

Based on documentation the segment elements are:

  • RAS-01 - 34.25
  • RAS-02 - 1234567890
  • RAS-03 (repeat 1) - 12345:0 → this has sub components 12345 and 0
  • RAS-03 (repeat 2) - 67890:1 → this has sub components 67890 and 1

However under the current implementation a parsed segment will not split by the repetition delimiter and will produce the following:

{
  '1': '34.25',
  '2': '1234567890',
  '3': '12345',
  name: 'RAS',
  '3-1': '0^67890',
  '3-2': '1'
}

Fix

To address this issue I turned the parsed field of a segment into a 3D array (was a 2D array). Now the first layer of parsed represents each of the unique element types. The second represents the list of elements that are repeated (e.g. the list of RAS-03 elements that repeat), in most cases this will only have one value. The third layer will be the list of components for each of the elements.

Additionally to depict the repetition while also keeping compatibility I added a to the formatted value so that the index of the repeated element appears AT THE END. This way all the previous formats will remain the same.
For example:

  • RAS*a still appears as { name: 'RAS', 1: 'a' },
  • RAS*a:b:c still appears as { name: 'RAS', 1: 'a', '1-1': 'b', '1-2': 'c' }

But now:

  • RAS*a:b^c:d appears as { name: 'RAS', 1: 'a', '1-1-1': 'b', '1-1-2': 'c', '1-2-2': 'd' }
  • RAS*a^c:d appears as { name: 'RAS', 1: 'a', '1-1-2': 'c', '1-2-2': 'd' }
  • RAS*a^b^c appears as { name: 'RAS', 1: 'a', '1-1-2': 'b', '1-1-3': 'c' }

Copy link
Copy Markdown
Owner

@tastypackets tastypackets left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being so late to reply, I switched phones and forgot to setup GitHub push notifications. 🙁

Thanks for catching this, it's definitely a missing feature. I do worry about the naming on the index for the repeating data, looks like the index keeps going up on each repeat count within an overall segment. I'm thinking we might want to scope it to the specific composite.

const formatted: FormattedSegment = { name: this.#name };

this.#parsed.forEach((element, elementIndex) => {
this.#parsed.forEach((elements, elementsIdx) => {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid renaming stuff in the PR?

Comment on lines +41 to +44
// skipping first element
if (componentIdx == 0 && repeatIdx == 0) return;
// because we skipped first element we need to decrement index
// avoiding shifting because modifies internal list
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these might be AI comments, could we remove them?

this.#parsed.forEach((elements, elementsIdx) => {
// First item in parsed string is the segment value, everything else is components
formatted[`${elementIndex + 1}`] = element.shift() ?? '';
formatted[`${elementsIdx + 1}`] = elements.at(0)?.at(0) ?? '';
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be able to remove the logic around adjusting the index if we pull out the first element, maybe something like:

      const [firstElement, ...subsequentElements] = elements;
      formatted[elementKey] = firstElement?.[0] ?? '';

Comment on lines +102 to +105
'3-1-2': '12345',
'3-2-2': '0',
'3-1-3': '67890',
'3-2-3': '1',
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the next composite element end with 3? Should this reset to 1? I think it's going to make it impossible/hard to track where a repeat fits into the overall segment.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you still have notifications open for this (I forgot about this pr as I was jumping between projects).
The last digit in the composite index was originally intended to keep track of the order in which the definitions occurred.

Suppose we had RAS*subCompA0:subCompA1:subCompA2^subCompB0:subCompB1subCompB2^subCompC0:subCompC1:subCompC2
In this scenario I originally intended for the last digit in the key to represent the index of the repeated element within its own "list." The idea was for a way to tell that repeated elements go A then B then C etc.

The meaning of the key was intended to be elementIdx-subElementIdx-repetitionIdx
A key like 3-2-4 has the following meaning:

  • 3 indicates it is the 3rd component of the RAS segment
  • 2 indicates the value is the second sub component (the same way it was originally parsed)
  • 4 indicates that this is a part of the 4th repetition of the 3rd component. The third component can have a repeat of 15 so I used the last value to determine its order (which is 4, where we use 1-indexing)

The output was expected like

{
  name: "RAS",
  "1": "subCompA0",           // keeping with original scheme
  "1-1-1": "subCompA1",       // sub component 1 of repetition 1 of main component 1
  "1-2-1": "subCompA2",       // sub component 2 of repetition 1 of main component 1
  "1-1-2": "subCompB0",       // sub component 1 of repetition 2 of main component 1
  "1-2-2": "subCompB1",       // sub component 2 of repetition 2 of main component 1
  "1-3-2": "subCompB2",       // sub component 3 of repetition 2 of main component 1
  // ... rest
}

I agree that there may be a little confusion with putting the index of the repetition at the end instead of using a key likeelementIdx-repetitionIdx-subElementIdx however such a key would cause old systems, where the second number is the sub element index, to break.

@jkamal000
Copy link
Copy Markdown
Author

Sorry for not addressing these sooner. I had forgotten to check this repo in the last month. I will try to address all the comments and concerns throughout the next two weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants