NEM-366 Ensure deterministic ordering on file export #115

Open
Mateus-Cordeiro wants to merge 3 commits into main from NEM-366-deterministic-ordering

@Mateus-Cordeiro
Collaborator

This PR makes YAML exports deterministic by preserving model-defined field order when serializing dataclasses and Pydantic models.

PyYAML's default behaviour can produce noisy diffs due to unstable key ordering.

Changes

  • Updated yaml.py's YAML representer to handle dataclasses and Pydantic models
  • Generic objects (anything with a __dict__) now serialize only their public attributes (those not prefixed with _), with keys sorted alphabetically to avoid nondeterministic attribute order.
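
For illustration, the ordering rules listed above could be sketched roughly as follows. This is not the code from yaml.py in this PR: `to_ordered_mapping` is a hypothetical helper, Pydantic models are detected by duck typing on `model_dump` (assuming Pydantic v2), and only the standard library is used so that the ordering logic can be read on its own.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any


def to_ordered_mapping(obj: Any) -> Any:
    """Hypothetical helper: convert obj into plain containers with a stable key order."""
    is_instance = not isinstance(obj, type)
    # Dataclass instances: keep the order in which the fields were declared.
    if is_instance and is_dataclass(obj):
        return {f.name: to_ordered_mapping(getattr(obj, f.name)) for f in fields(obj)}
    # Assumption: Pydantic v2 models expose model_dump(), which follows field order.
    if is_instance and callable(getattr(obj, "model_dump", None)):
        return to_ordered_mapping(obj.model_dump())
    if isinstance(obj, dict):
        return {k: to_ordered_mapping(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_ordered_mapping(v) for v in obj]
    # Generic objects: public attributes only, keys sorted alphabetically.
    if is_instance and hasattr(obj, "__dict__"):
        return {
            k: to_ordered_mapping(v)
            for k, v in sorted(vars(obj).items())
            if not k.startswith("_")
        }
    return obj


@dataclass
class Column:  # illustrative model, not taken from the PR
    name: str
    type: str
    nullable: bool


class Legacy:  # illustrative generic object
    def __init__(self) -> None:
        self.zeta = 1
        self.alpha = 2
        self._cache = "hidden"


column_mapping = to_ordered_mapping(Column("id", "int", False))
legacy_mapping = to_ordered_mapping(Legacy())
```

A mapping built this way could then be passed to PyYAML with `yaml.safe_dump(mapping, sort_keys=False)`, so that the dumper keeps the insertion order instead of re-sorting the keys.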

Limitations

Database samples can't be guaranteed to be deterministic. At the moment, a sampling query looks like this:

SELECT * FROM "{schema}"."{table}"

At the point in the code in which we obtain samples, we know the columns of the table, so an ORDER BY clause could be created. This, however, would not guarantee that the same rows would be returned.

@JulienArzul
Collaborator

> Database samples can't be guaranteed to be deterministic. At the moment, a sampling query looks like this:
>
> SELECT * FROM "{schema}"."{table}"
>
> At the point in the code in which we obtain samples, we know the columns of the table, so an ORDER BY clause could be created. This, however, would not guarantee that the same rows would be returned.

Very good point. It would probably still be a good idea to use an ORDER BY on the samples (probably on the primary key?), so that at least two subsequent calls to build_context return the same content if the table's data hasn't changed.
Without an ORDER BY, we can't even be sure that we get back the same rows when the table's data hasn't changed.
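
A minimal sketch of what such a query builder might look like, assuming the primary-key columns are available alongside the column list. `quote_ident` and `build_sample_query` are hypothetical names, not functions from this codebase, and the fallback to ordering by every column is only one possible choice.

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_sample_query(
    schema: str,
    table: str,
    columns: Sequence[str],
    primary_key: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper: build the sampling query with a stable ORDER BY."""
    # Prefer the primary key; otherwise fall back to all known columns.
    order_columns = list(primary_key) if primary_key else list(columns)
    query = f"SELECT * FROM {quote_ident(schema)}.{quote_ident(table)}"
    if order_columns:
        query += " ORDER BY " + ", ".join(quote_ident(c) for c in order_columns)
    return query


with_pk = build_sample_query("public", "users", ["id", "email"], primary_key=["id"])
without_pk = build_sample_query("public", "users", ["id", "email"])
```

Ordering by the primary key keeps repeated runs stable as long as the data is unchanged. The all-columns fallback is weaker: some column types may not be sortable in every database, and fully duplicated rows remain interchangeable.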

