Skip to content

RFC: Precise Schema Reconciler Shape Invalidation #4328

@alco

Description

@alco

Spec: Precise Schema Reconciler Shape Invalidation

Problem

When SchemaReconciler detects schema divergence between the Inspector cache and the live database, it currently invalidates all shapes on the affected table. This is overly conservative - many schema changes don't affect shapes that don't select the changed columns.

The replication stream already does column-level filtering via Shape.is_affected_by_relation_change?/2, but it only sees name/type changes (what Postgres sends in Relation messages). SchemaReconciler has access to richer column metadata but doesn't use it for precise invalidation.

Goal

Make SchemaReconciler invalidate only shapes that are actually affected by schema changes, using the same column-level filtering logic as the replication stream where possible, plus additional attribute-aware logic.

Current Flow

list_relations_with_stale_cache()
  → returns [{oid, relation}] for tables with ANY column divergence
  → remove_shapes_for_relations(stack_id, relations)
    → ShapeStatus.list_shape_handles_for_relations() -- ALL shapes for table
    → remove_shapes() -- invalidates all of them

Proposed Flow

list_relations_with_column_changes()
  → returns [{oid, relation, changed_columns}]
  → for each {oid, relation, changed_columns}:
    → build synthetic Relation with affected_columns
    → Filter.affected_shapes(filter, synthetic_relation)
    → remove only affected shapes

Column Attribute Change Analysis

For each column attribute, determine whether a change requires shape invalidation:

does_change_require_invalidation?(attribute, old_value, new_value, shape)

# Column name changed - shape selecting old name must be invalidated
def does_change_require_invalidation?(:name, old_name, new_name, shape) do
  old_name in shape.selected_columns
end

# Type changed - shape selecting column must be invalidated
def does_change_require_invalidation?(:type_id, old_type, new_type, shape) do
  # Type change affects data encoding/interpretation
  column_selected?(shape, column_name)
end

# Nullability changed
def does_change_require_invalidation?(:not_null, old_val, new_val, shape) do
  # NULL -> NOT NULL: PG requires data migration first, shape log has updates
  # NOT NULL -> NULL: No data inconsistency
  # Conservative: invalidate if shape selects column
  # Optimistic: don't invalidate (data is consistent via replication)
  
  # Decision: Don't invalidate - the replication stream already delivered
  # any data changes required for the constraint to be added
  false
end

# Generated column status changed
def does_change_require_invalidation?(:is_generated, false, true, shape) do
  # Regular -> Generated: values now computed, may differ from cached
  column_selected?(shape, column_name)
end

def does_change_require_invalidation?(:is_generated, true, false, shape) do
  # Generated -> Regular: similar concern
  column_selected?(shape, column_name)
end

# Array dimensions changed
def does_change_require_invalidation?(:array_dimensions, old_dims, new_dims, shape) do
  # Structural type change
  column_selected?(shape, column_name)
end

# PK position changed - affects record identity
def does_change_require_invalidation?(:pk_position, old_pos, new_pos, _shape) do
  # PK changes affect how records are keyed, regardless of selected columns
  # Must invalidate all shapes on the table
  true
end

# Type kind changed (e.g., enum)
def does_change_require_invalidation?(:type_kind, old_kind, new_kind, shape) do
  column_selected?(shape, column_name)
end

Attribute Classification

Attribute Change Requires Invalidation? Scope
name Yes Shapes selecting old name
type_id Yes Shapes selecting column
type_mod Yes Shapes selecting column
array_dimensions Yes Shapes selecting column
type_kind Yes Shapes selecting column
is_generated Yes Shapes selecting column
not_null No* N/A
pk_position Yes ALL shapes on table
formatted_type Derived from above -

*Nullability: The data migration required by Postgres before adding NOT NULL comes through the replication stream. Cached shape data is updated via those change events. The constraint itself doesn't affect how Electric serves data.

Implementation Steps

1. Extend list_relations_with_stale_cache to return column diff

@spec list_relations_with_stale_cache(opts :: term()) ::
  {:ok, [%{oid: oid(), relation: relation(), column_changes: [column_change()]}]}

@type column_change :: %{
  column_name: String.t(),
  change_type: :added | :removed | :modified,
  changed_attributes: [attribute_change()]
}

@type attribute_change :: %{
  attribute: atom(),
  old_value: term(),
  new_value: term()
}

2. Add compute_column_diff/2 function

@spec compute_column_diff([column_info()], [column_info()]) :: [column_change()]
def compute_column_diff(cached_columns, live_columns) do
  cached_by_name = Map.new(cached_columns, &{&1.name, &1})
  live_by_name = Map.new(live_columns, &{&1.name, &1})
  
  cached_names = MapSet.new(Map.keys(cached_by_name))
  live_names = MapSet.new(Map.keys(live_by_name))
  
  added = MapSet.difference(live_names, cached_names)
  removed = MapSet.difference(cached_names, live_names)
  common = MapSet.intersection(cached_names, live_names)
  
  added_changes = Enum.map(added, &%{column_name: &1, change_type: :added, ...})
  removed_changes = Enum.map(removed, &%{column_name: &1, change_type: :removed, ...})
  
  modified_changes = 
    common
    |> Enum.filter(fn name -> cached_by_name[name] != live_by_name[name] end)
    |> Enum.map(fn name ->
      %{
        column_name: name,
        change_type: :modified,
        changed_attributes: diff_attributes(cached_by_name[name], live_by_name[name])
      }
    end)
  
  added_changes ++ removed_changes ++ modified_changes
end

3. Add shapes_affected_by_column_changes/2

@spec shapes_affected_by_column_changes(Filter.t(), [column_change()]) :: MapSet.t(shape_id())
def shapes_affected_by_column_changes(filter, column_changes) do
  # Check for PK changes first - affects all shapes
  if any_pk_change?(column_changes) do
    Filter.all_shape_ids(filter)
  else
    affected_columns = 
      column_changes
      |> Enum.filter(&requires_invalidation?/1)
      |> Enum.map(& &1.column_name)
    
    # Build synthetic Relation and use existing filtering logic
    synthetic_relation = %Relation{
      affected_columns: affected_columns,
      columns: live_columns  # for column count check
    }
    
    Filter.affected_shapes(filter, synthetic_relation)
  end
end

4. Update SchemaReconciler to use precise invalidation

defp handle_diverged_relations({:ok, diverged}, state) do
  for %{oid: oid, relation: rel, column_changes: changes} <- diverged do
    affected_shapes = shapes_affected_by_column_changes(state.filter, changes)
    
    if MapSet.size(affected_shapes) > 0 do
      Logger.notice("Schema change in #{inspect(rel)}: invalidating #{MapSet.size(affected_shapes)} shapes")
      ShapeCleaner.remove_shapes(state.stack_id, MapSet.to_list(affected_shapes))
    end
    
    Inspector.clean(oid, state.inspector)
  end
end

Edge Cases

  1. Column renamed to existing name: Detected as remove + add, both shapes affected
  2. Type change within same type family: Still invalidate (e.g., varchar(50) -> varchar(100))
  3. Multiple attributes changed: Any invalidating change triggers invalidation
  4. Shape selects all columns: Affected by any column add/remove

Testing

  1. Add column not selected by any shape → no invalidation
  2. Add column, shape selects all → invalidation
  3. Change nullability → no invalidation
  4. Change type → invalidation of shapes selecting column
  5. Change PK → invalidation of all shapes
  6. Rename column → invalidation of shapes selecting old name

Metrics

Add telemetry for:

  • schema_reconciler.diverged_tables.count
  • schema_reconciler.shapes_invalidated.count
  • schema_reconciler.shapes_spared.count (new)

Open Questions

  1. Nullability: Should we be conservative and invalidate anyway? The argument for not invalidating is that data consistency is maintained via replication. The argument for invalidating is defense-in-depth.

  2. Default value changes: Not in current column_info. Should we track them?

  3. Constraint changes (CHECK, UNIQUE): Not tracked. Do they matter for read-only replication?

  4. Filter access: SchemaReconciler doesn't currently have access to the Filter. Need to pass it in or query ShapeStatus.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions