Make symbolic serialization type-accurate and bijective#2366
Conversation
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1e472f6e-62b2-4d7b-9d2a-9a3613bc9df8 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
tbennun
left a comment
There was a problem hiding this comment.
Looks great, but a few more changes and tests are required.
| _SERIALIZED_TYPED_CONSTANT = re.compile( | ||
| rf'(?<![A-Za-z0-9_\.])(?P<value>{_SERIALIZED_TYPED_CONSTANT_VALUE})(?P<suffix>{_SERIALIZED_TYPED_CONSTANT_SUFFIX})\b' | ||
| ) | ||
| _TYPECLASS_TO_LITERAL_SUFFIX = { |
There was a problem hiding this comment.
This should move to dtypes.py
There was a problem hiding this comment.
Moved the typed-literal suffix metadata/helpers into dace.dtypes in 275da54.
|
|
||
|
|
||
| class TypedConstant(sympy.AtomicExpr): | ||
| """A symbolic constant with an explicit DaCe data type.""" |
There was a problem hiding this comment.
- I am not sure what a "symbolic constant" is. This is a constant value used in a symbolic expression;
- Add an example.
There was a problem hiding this comment.
Clarified the TypedConstant docstring and added an explicit serialization example in 23f7e2f.
| return TypedConstant(expr, dtypes.float64) | ||
| if not isinstance(expr, str): | ||
| return expr | ||
| if expr == '?': |
There was a problem hiding this comment.
I would add a test for undefined symbols. Are they serialized as $? or ?? I would opt for the former.
There was a problem hiding this comment.
Added undefined-symbol round-trip coverage and switched serialization to use $? in 275da54.
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ccc5069b-3a41-411b-a3fc-7ac4d373e5b1 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ccc5069b-3a41-411b-a3fc-7ac4d373e5b1 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ccc5069b-3a41-411b-a3fc-7ac4d373e5b1 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ccc5069b-3a41-411b-a3fc-7ac4d373e5b1 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ccc5069b-3a41-411b-a3fc-7ac4d373e5b1 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/433e8b02-2ba6-4aca-9c25-cf3fd1f77802 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/433e8b02-2ba6-4aca-9c25-cf3fd1f77802 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/433e8b02-2ba6-4aca-9c25-cf3fd1f77802 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/ad1f4362-f591-4cf9-84d0-d89dc605a6a9 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/90f491eb-508c-4961-9084-e9c79a40c148 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/90f491eb-508c-4961-9084-e9c79a40c148 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/893139c2-bf40-47b2-8768-81e3358bc8bc Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/893139c2-bf40-47b2-8768-81e3358bc8bc Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/893139c2-bf40-47b2-8768-81e3358bc8bc Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
tbennun
left a comment
There was a problem hiding this comment.
The latest round of changes are clearly a symptomatic fix for a deeper issue. That is also reflected by the fact that tests are still failing in CI.
The serialization/deserialization functions were already sound w.r.t. data types of symbols, so I suspect the issues are rooted within dace and the use of properties, as well as some invalid creation of new symbols with inappropriate dtypes happening somewhere in deserialization (otherwise, a serialize-deserialize-serialize roundtrip will always result in the same symbol type, regardless of whether it is an int32 $i or an int64 symbol($i, dtype=dace.int64). Please try again without special cases.
| s_currentsymbol = 0 | ||
|
|
||
| def __new__(cls, name=None, dtype=DEFAULT_SYMBOL_TYPE, **assumptions): | ||
| def __new__(cls, name=None, dtype=DEFAULT_SYMBOL_TYPE, explicit_dtype=True, **assumptions): |
There was a problem hiding this comment.
explicit_dtype is a bad argument. It should be very simple - if the dtype of the symbol is not DEFAULT_SYMBOL_TYPE, it should be printed out. Remove this argument.
There was a problem hiding this comment.
Removed explicit_dtype and the related symbol-constructor plumbing in ff13b2b.
| def _label_str(val): | ||
| if issymbolic(val): | ||
| return dace.symbolic.serialize_symbolic(val).replace('$', '') | ||
| return str(val) | ||
|
|
||
| def _dim_to_string(dim): | ||
| if not isinstance(dim, tuple): | ||
| return _label_str(dim) | ||
| dres = _label_str(dim[0]) | ||
| if dim[1] is not None: | ||
| if dim[1] - dim[0] != 0: | ||
| off = 1 | ||
| if dim[2] is not None and (dim[2] < 0) == True: | ||
| off = -1 | ||
| dres += ':' + _label_str(dim[1] + off) | ||
| if dim[2] != 1: | ||
| if dim[1] is None: | ||
| dres += ':' | ||
| dres += ':' + _label_str(dim[2]) | ||
| return dres | ||
|
|
||
| return self.label + "[" + ", ".join( | ||
| ["{}={}".format(i, r) | ||
| for i, r in zip(self._params, [sbs.Range.dim_to_string(d) for d in self._range])]) + "]" | ||
| [f"{i}={r}" for i, r in zip(self._params, [_dim_to_string(d) for d in self._range])]) + "]" |
There was a problem hiding this comment.
Unnecessary change, please revert. Range.dim_to_string's behavior should stay unaffected by serialization.
There was a problem hiding this comment.
Reverted the map label/stringification change in ff13b2b.
| def _serialize_symbolic_with_converter(converter, value): | ||
| if not isinstance(value, (symbolic.SymExpr, sp.Basic)): | ||
| try: | ||
| value = _deserialize_symbolic_with_converter(converter, value) | ||
| except Exception: | ||
| pass | ||
| return symbolic.serialize_symbolic(value) |
There was a problem hiding this comment.
What is the purpose of this roundtrip?
There was a problem hiding this comment.
Removed that helper roundtrip in ff13b2b. The fix now uses symbolic property types directly instead.
| @@ -1452,14 +1460,17 @@ def _binop_sub(a, b): | |||
| @staticmethod | |||
| def _binop_mul(a, b): | |||
| if (_SerializedSymbolicParser._requires_preserved_structure(a) | |||
| or _SerializedSymbolicParser._requires_preserved_structure(b)): | |||
| or _SerializedSymbolicParser._requires_preserved_structure(b) or isinstance(a, sympy.Add) | |||
| or isinstance(b, sympy.Add)): | |||
| return sympy.Mul(a, b, evaluate=False) | |||
| return a * b | |||
|
|
|||
| @staticmethod | |||
| def _binop_div(a, b): | |||
| if (_SerializedSymbolicParser._requires_preserved_structure(a) | |||
| or _SerializedSymbolicParser._requires_preserved_structure(b)): | |||
| or _SerializedSymbolicParser._requires_preserved_structure(b) | |||
| or (isinstance(a, sympy.Mul) and any(isinstance(arg, sympy.Add) | |||
| for arg in a.args)) or isinstance(b, sympy.Add)): | |||
There was a problem hiding this comment.
The special cases here seem to indicate that the serialization-deserialization-serialization is inconsistent. These might not be the last special cases and this concerns me. Is there a simpler way to always disable evaluate so that we don't have to special case distributive rules etc.?
There was a problem hiding this comment.
Replaced the ad hoc operator special cases with a generic non-evaluating symbolic parse in ff13b2b.
| return '*'.join(parts) if parts else '1' | ||
|
|
||
|
|
||
| @lru_cache(maxsize=16384) |
There was a problem hiding this comment.
Tracked this down to same-name symbol serialization collisions; ff13b2b fixes the root cause without reintroducing the previous symptomatic behavior.
| if dtypes.validate_name(expr): | ||
| return symbol(expr) |
There was a problem hiding this comment.
This means that we will deserialize a symbol by name even if it is not in the standard serialized format. This is wrong. Please revert this. I think this is a symptom of an issue somewhere else, and this is not the right way to solve it.
There was a problem hiding this comment.
Reverted the name-based deserialize fallback in ff13b2b.
| def test_inferred_symbol_dtype_is_not_serialized(): | ||
| inferred = symbolic.symbol('i', dtype=dace.int64, explicit_dtype=False) | ||
|
|
||
| assert symbolic.serialize_symbolic(inferred) == '$i' |
There was a problem hiding this comment.
Remove this test as it is wrong!
| def test_list_property_pystr_to_symbolic_json_roundtrip_supports_plain_names(): | ||
| prop = ListProperty(element_type=symbolic.pystr_to_symbolic) | ||
|
|
||
| assert prop.to_json(['START']) == ['$START'] | ||
|
|
||
| restored = prop.from_json(['START']) | ||
|
|
||
| assert len(restored) == 1 | ||
| assert restored[0] == symbolic.symbol('START') | ||
|
|
||
|
|
||
| def test_dict_property_pystr_to_symbolic_json_roundtrip_supports_plain_names(): | ||
| prop = DictProperty(key_type=str, value_type=symbolic.pystr_to_symbolic) | ||
|
|
||
| assert prop.to_json({'N': 'N'}) == {'N': '$N'} | ||
|
|
||
| restored = prop.from_json({'N': 'N'}) | ||
|
|
||
| assert restored == {'N': symbolic.symbol('N')} | ||
|
|
||
|
|
There was a problem hiding this comment.
It seems to me that the error that caused these two tests is the element_type/value_type being wrongly assigned to a function. I would encourage finding a better solution for lists of symbolic expressions, as well as dicts with symbolic keys/values. Subsequently, remove all the cases where users used {element,key,value}_type=pystr_to_symbolic. Maybe the special case in properties.py should be "if the type is a subclass of/equal to sympy.Basic or set to symbolic.SymbolicType or is a subclass of/equal to SymExpr"
There was a problem hiding this comment.
Changed the symbolic list/dict properties to use real symbolic types (sympy.Basic) and removed the pystr_to_symbolic property declarations in ff13b2b.
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1dc8b311-87d3-4b93-9a7a-edc47d144171 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Reworked this around the underlying causes in ff13b2b: removed the serializer/deserializer special cases, switched symbolic properties to real symbolic types, stopped inferring non-default iterator/index symbol dtypes in the frontend, and kept the affected |
Agent-Logs-Url: https://github.com/spcl/dace/sessions/1dc8b311-87d3-4b93-9a7a-edc47d144171 Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/90e03a82-7b38-4393-8cb8-e753956a138d Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
Agent-Logs-Url: https://github.com/spcl/dace/sessions/90e03a82-7b38-4393-8cb8-e753956a138d Co-authored-by: tbennun <8348955+tbennun@users.noreply.github.com>
symbol($name, dtype=...)+ -1/+(-1*...)to subtraction)DACE_testing_serialization=1