Feat!: Categorize indirect MV changes as breaking for seamless version switching #5374
if snapshot.is_materialized_view:
    # We categorize changes as breaking to allow for instantaneous switches in a virtual layer.
    # Otherwise, there might be a potentially long downtime during MVs recreation.
    snapshot.categorize_as(SnapshotChangeCategory.INDIRECT_BREAKING, forward_only)
I think this looks good, but there's still the question of what the correct behavior should be when `forward_only` is True, which leads to the version being reused despite the `INDIRECT_BREAKING` category.
The `forward_only` flag can be set due to:
- Running `sqlmesh plan --forward-only`
- The `virtual_environment_mode` is set to `dev_only` in the project config, indicating that no virtual layer should be used in production
In both cases, it is the user's explicit intent to continue using the same (existing) table version. How should we handle these scenarios?
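To make the concern concrete, here is a minimal sketch of why the category alone does not decide which table version is used. The function `resolve_version` and its arguments are hypothetical names for illustration only and are not sqlmesh's actual API; the sketch only assumes, as described above, that a forward-only change reuses the existing version.

```python
from typing import Optional


def resolve_version(
    previous_version: Optional[str],
    new_fingerprint: str,
    forward_only: bool,
) -> str:
    """Toy model (hypothetical, not sqlmesh code): pick the physical version.

    A forward-only change keeps pointing at the existing version, no matter
    which change category was assigned to the snapshot.
    """
    if forward_only and previous_version is not None:
        # Explicit user intent: keep using the existing table version.
        return previous_version
    # Otherwise a new version is built side by side with the old one.
    return new_fingerprint


# With forward_only=True the old version is reused, so a "breaking" category
# would not produce a second, switchable copy of the MV.
reused = resolve_version("v1", "v2", forward_only=True)
fresh = resolve_version("v1", "v2", forward_only=False)
```

In this model, categorizing an MV as breaking only helps when `forward_only` is False, which is the scenario the question above is about.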
Good question. I'm not sure, but maybe we should keep the old behavior in case of forward-only changes?
I added a check to preserve the original behavior for forward-only changes.
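As a rough illustration of the resulting decision, the check can be modeled as a small pure function. This is only a sketch: `Category`, `categorize_indirect_change`, and the `fallback` parameter are hypothetical stand-ins, and the category used by the original behavior is passed in rather than assumed.

```python
from enum import Enum, auto


class Category(Enum):
    """Hypothetical stand-in for the real change categories."""
    INDIRECT_BREAKING = auto()
    INDIRECT_NON_BREAKING = auto()


def categorize_indirect_change(
    is_materialized_view: bool,
    forward_only: bool,
    fallback: Category,
) -> Category:
    """Return the category for an indirectly modified snapshot.

    MVs become INDIRECT_BREAKING unless the change is forward-only, in which
    case the previous behavior (represented by `fallback`) is preserved.
    """
    if is_materialized_view and not forward_only:
        return Category.INDIRECT_BREAKING
    return fallback


# Forward-only changes and non-MV snapshots keep whatever the old logic chose.
mv_regular = categorize_indirect_change(True, False, Category.INDIRECT_NON_BREAKING)
mv_forward_only = categorize_indirect_change(True, True, Category.INDIRECT_NON_BREAKING)
```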
Force-pushed from a4baca7 to fb559d2.
Hey @izeigerman, please let me know if there’s anything else that should be changed or improved before moving forward.
georgesittas left a comment
This looks good to me, thanks @xardasos!
elif self._context_diff.indirectly_modified(snapshot.name):
    if snapshot.is_materialized_view and not forward_only:
        # We categorize changes as breaking to allow for instantaneous switches in a virtual layer.
        # Otherwise, there might be a potentially long downtime during MVs recreation.
I'm curious to understand a bit more about this. What would cause this long downtime? Can you give an example?
RisingWave supports neither create or replace nor transactional DDL (although it can alter MV A_1 and swap it with another existing MV A_2). Previously, sqlmesh would drop and recreate the MV on indirect changes, making the data unavailable until the MV was rebuilt, which can take a long time for larger MVs.
Even if an engine supports create or replace of some sort, I think having side-by-side versions of an MV allows for instantaneous rollbacks in a virtual layer.
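The difference between the two strategies can be shown with a toy simulation. Everything here is a hypothetical model (the `ToyWarehouse` class, the `a__v1`/`a__v2` names, and the "building"/"ready" states are assumptions for illustration), not RisingWave or sqlmesh behavior; it only records whether the view is queryable after each step.

```python
class ToyWarehouse:
    """Hypothetical in-memory model of physical MVs and a virtual-layer view."""

    def __init__(self):
        self.physical = {}  # physical MV name -> "building" or "ready"
        self.views = {}     # view name -> physical MV name it points at

    def is_queryable(self, view):
        # A view serves data only if its target exists and has finished building.
        return self.physical.get(self.views.get(view)) == "ready"


def drop_and_recreate(wh, view):
    """Old behavior: rebuild the MV in place; returns availability per step."""
    log = []
    target = wh.views[view]
    del wh.physical[target]            # DROP: data is gone immediately
    log.append(wh.is_queryable(view))
    wh.physical[target] = "building"   # CREATE: backfill in progress
    log.append(wh.is_queryable(view))
    wh.physical[target] = "ready"      # backfill finished
    log.append(wh.is_queryable(view))
    return log


def build_side_by_side(wh, view, new_target):
    """New behavior: build a second version, then repoint the view."""
    log = []
    wh.physical[new_target] = "building"  # old version keeps serving
    log.append(wh.is_queryable(view))
    wh.physical[new_target] = "ready"
    log.append(wh.is_queryable(view))
    wh.views[view] = new_target           # instantaneous switch
    log.append(wh.is_queryable(view))
    return log


def fresh_warehouse():
    wh = ToyWarehouse()
    wh.physical["a__v1"] = "ready"
    wh.views["a"] = "a__v1"
    return wh


recreate_log = drop_and_recreate(fresh_warehouse(), "a")
side_by_side_log = build_side_by_side(fresh_warehouse(), "a", "a__v2")
```

In this model the in-place rebuild leaves the view unqueryable until the backfill completes, while the side-by-side build keeps it queryable throughout. Because the old version remains in place, rolling back amounts to pointing the view at it again.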
Addresses issue: #5365
We now categorize indirect changes to MVs as breaking to allow for instantaneous switches in a virtual layer. This avoids potentially long downtime during MV recreation. The improvement is especially relevant for RisingWave, where chaining MVs is a common use case, as described in https://risingwave.com/understanding-materialized-views/.