Thank you for embarking on this scary and uncertain path!
I took a look at the GlycoCT parsing implementation and how the resulting ambiguous parent links are represented in the Glycan/Monosaccharide/Link classes. It is pretty similar to how I've prototyped this previously. I like how ambiguity in the parent node and position is handled consistently.
First, a bug: AmbiguousLink.find_open_position fails when there are two undetermined sections with the same list of possible parents, but a single parent_position specified for each. The method only checks the parent data member (first parent from the list of possibilities), so the call to find_open_position for the second AmbiguousLink fails. This happens at GlycoCT parse time.
The GlyTouCan accession that showed me this issue is: G41857GD
Stack trace:
    Traceback (most recent call last):
      File "test.py", line 95, in <module>
        g = glycoct.loads(s)
      File "/.../GlyPy/glypy/io/glycoct.py", line 1064, in loads
        first = next(g)
      File "/.../GlyPy/glypy/io/glycoct.py", line 671, in next
        return next(self._iter)
      File "/.../GlyPy/glypy/io/glycoct.py", line 1024, in parse
        self._complete_structure()
      File "/.../GlyPy/glypy/io/glycoct.py", line 937, in _complete_structure
        result = self.postprocess()
      File "/.../GlyPy/glypy/io/glycoct.py", line 957, in postprocess
        level.postprocess()
      File "/.../GlyPy/glypy/io/glycoct.py", line 572, in postprocess
        link_obj.find_open_position()
      File "/.../GlyPy/glypy/structure/link.py", line 545, in find_open_position
        raise ValueError("Could not find a valid configurations on current parent/child pair")
    ValueError: Could not find a valid configurations on current parent/child pair
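To make the failure mode concrete, here is a toy model of what I think is happening. It does not use glypy at all: the function names, the tuple layout, and the parent labels "A"/"B" are made up for illustration, and I am assuming both undetermined sections target the same single position. The first function only ever tries the first candidate parent, the second falls through to the next candidate.

```python
# Toy model only: these names are hypothetical and are not glypy's classes.
def first_parent_only(links, occupied):
    """Mimic the suspected behaviour: try only the first candidate parent."""
    for name, parents, position in links:
        slot = (parents[0], position)
        if slot in occupied:
            raise ValueError("no open position for %s" % name)
        occupied.add(slot)


def any_parent(links, occupied):
    """Greedy alternative: fall through to the next candidate parent."""
    placed = {}
    for name, parents, position in links:
        for parent in parents:
            if (parent, position) not in occupied:
                occupied.add((parent, position))
                placed[name] = parent
                break
        else:
            # No candidate parent had this position free.
            raise ValueError("no open position for %s" % name)
    return placed


# Two undetermined sections, same candidate parents, one position each.
links = [("und1", ["A", "B"], 3), ("und2", ["A", "B"], 3)]

try:
    first_parent_only(links, set())
    failed = False
except ValueError:
    failed = True

placed = any_parent(links, set())
```

In this toy case the first strategy fails on the second link, while the greedy fallback places it on the other parent. A greedy pass is not guaranteed to find a placement in general, which is part of why I am asking for the enumeration feature.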
Second, a feature request: an AmbiguousLink.iterconfiguration-style method for Glycans that does the combinatorics over all AmbiguousLink configurations (which interact with each other, as per the issue above). Perhaps the find_open_position logic will be sufficient, but I'm not sure.
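Roughly what I have in mind, as a sketch rather than a proposal for the actual API: enumerate the Cartesian product of each link's (parent, position) choices and drop any combination in which two links claim the same slot. The function name iter_configurations, the dict layout, and the parent labels are hypothetical and not part of glypy.

```python
from itertools import product


def iter_configurations(links):
    """Yield every assignment of (parent, position) per link with no slot reused.

    `links` maps a link name to (candidate_parents, candidate_positions).
    This is a hypothetical stand-in for the real AmbiguousLink objects.
    """
    names = sorted(links)
    choices = [
        [(parent, pos) for parent in links[n][0] for pos in links[n][1]]
        for n in names
    ]
    for combo in product(*choices):
        # Two links cannot occupy the same position on the same parent.
        if len(set(combo)) == len(combo):
            yield dict(zip(names, combo))


links = {
    "und1": (["A", "B"], [3]),
    "und2": (["A", "B"], [3]),
}
configs = list(iter_configurations(links))
```

For the two-link case above this yields the two consistent placements and skips the two in which both links land on the same parent position. A real implementation would also need to respect positions already occupied by unambiguous links, which this sketch ignores.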
Thanks...