Subquery shapes error on first WAL event after restart: Materializer not available #3946

@alco

Summary

When Electric restarts and restores shapes with subqueries from disk, the first WAL event(s) that arrive trigger RuntimeError: Materializer for stack "..." and handle "..." is not available errors. This is a race condition between eager filter restoration and lazy consumer/materializer startup.

Root cause

On startup, two things happen in sequence:

  1. ShapeLogCollector restores filters eagerly for all persisted shapes, including shapes with subquery dependencies. After this, the filter knows about all shapes and can try to evaluate subquery WHERE clauses.

  2. Consumer and Materializer processes are started lazily — only when the first event arrives for a shape (via ConsumerRegistry).

When a WAL event arrives, the flow is:

ShapeLogCollector.handle_call
  → EventRouter.route_change
    → Filter.affected_shapes
      → WhereCondition.other_shapes_affected
        → refs_fun.(shape)
          → Materializer.get_all_as_refs
            → GenServer.call(materializer)  ← process doesn't exist yet!

The filter needs the Materializer to resolve subquery link values, but the Materializer is only started after Filter.affected_shapes returns and ConsumerRegistry.publish triggers lazy consumer startup.
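One way to close the race would be to treat a not-yet-started Materializer as a soft miss instead of raising. A minimal sketch, assuming a `whereis`-style lookup is possible for Materializer processes; `name/2` and `:get_link_values` are assumed names for illustration, not Electric's actual API:

```elixir
# Hypothetical sketch, not Electric's actual implementation.
def get_all_as_refs(stack_id, handle) do
  case GenServer.whereis(name(stack_id, handle)) do
    # Materializer not started yet (e.g. lazy startup after a restore):
    # signal "unknown" so the caller can fall back to notifying all shapes
    # for this subquery instead of raising.
    nil -> :not_available
    pid -> GenServer.call(pid, :get_link_values)
  end
end
```

An alternative would be to start Materializers eagerly at filter-restore time, at the cost of losing the lazy-startup savings for shapes that never see a WAL event.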

The error is caught by Filter.affected_shapes, which falls back to returning all shape IDs (safe but noisy).
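The safety net described above has roughly this shape (a paraphrase with assumed names, not the actual source; `do_affected_shapes/2` and `all_shape_ids/1` are illustrative):

```elixir
# Paraphrased sketch of the catch-all in Filter.affected_shapes.
def affected_shapes(filter, change) do
  try do
    do_affected_shapes(filter, change)
  rescue
    error ->
      Logger.error(
        "Unexpected error in Filter.WhereCondition.affected_shapes: " <>
          Exception.format(:error, error, __STACKTRACE__)
      )

      # Safe fallback: notify every shape rather than drop the change.
      all_shape_ids(filter)
  end
end
```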

Reproduction

  1. Create a table with some data:
CREATE TABLE hot_rows (
  id bigint PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  grp_a integer NOT NULL,
  grp_b integer NOT NULL,
  grp_c integer NOT NULL,
  payload text NOT NULL DEFAULT '',
  updated_at timestamptz NOT NULL DEFAULT now()
);
-- Insert rows with random group values, e.g.:
INSERT INTO hot_rows (grp_a, grp_b, grp_c)
SELECT (random() * 3)::int + 1, (random() * 3)::int + 1, (random() * 3)::int + 1
FROM generate_series(1, 100);
  2. Create a shape with a subquery WHERE clause:
where=id IN (SELECT id FROM hot_rows WHERE grp_a = 1)
  OR id IN (SELECT id FROM hot_rows WHERE grp_b = 1)
  OR id IN (SELECT id FROM hot_rows WHERE grp_c = 1)
  3. Restart Electric (shapes restore from disk).

  4. Update any row in the table, without any HTTP client connecting first:

UPDATE hot_rows SET payload = md5(random()::text), updated_at = now()
WHERE id = (SELECT id FROM hot_rows ORDER BY random() LIMIT 1);
  5. Observe the error flood: one error per subquery dependency shape, per change in the transaction. Increasing the number of modified rows in a single transaction, or the number of shapes, proportionally increases the number of logged errors.

Example output

Startup completes normally, shapes and filters are restored:

13:03:52.356 [notice] Restored publication filters in 3ms
13:03:52.380 [notice] Restored filters for 4 shapes in 28ms
13:03:52.380 [notice] Consumers ready in 39ms (4 shapes)

A single-row UPDATE arrives ~2 seconds later. The filter tries to evaluate the subquery WHERE clauses, but the Materializer processes don't exist yet. The errors come one per evaluation pass rather than one per subquery: the UpdatedRecord is evaluated twice, once for the old_record and once for the record, so the three subqueries produce two errors:

13:03:54.827 [info] Received relation "public"."hot_rows"

13:03:54.919 [error] Unexpected error in Filter.WhereCondition.affected_shapes:
** (RuntimeError) Materializer for stack "single_stack" and handle "16096921-1772539368604093" is not available
    (electric 1.4.11) lib/electric/shapes/consumer/materializer.ex:98: Electric.Shapes.Consumer.Materializer.genserver_get_link_values/1
    (electric 1.4.11) lib/electric/shapes/consumer/materializer.ex:111: anonymous fn/2 in Electric.Shapes.Consumer.Materializer.get_all_as_refs/2
    ...
    (electric 1.4.11) lib/electric/shapes/filter/where_condition.ex:210: anonymous fn/6 in Electric.Shapes.Filter.WhereCondition.other_shapes_affected/4
    ...
    (electric 1.4.11) lib/electric/shapes/event_router.ex:149: Electric.Shapes.EventRouter.find_affected_shapes_for_change/2

13:03:54.942 [error] Unexpected error in Filter.WhereCondition.affected_shapes:
** (RuntimeError) Materializer for stack "single_stack" and handle "16096921-1772539368604093" is not available
    ...

Only after the errors do consumer and materializer processes get lazily started:

13:03:54.943 [debug] Starting Electric.Shapes.Consumer for 16096921-1772539368604093
13:03:54.954 [debug] Started consumer for existing handle 16096921-1772539368604093
13:03:54.954 [debug] Starting Electric.Shapes.Consumer for 79160600-1772539368614088
13:03:54.955 [debug] Started consumer for existing handle 79160600-1772539368614088
13:03:54.955 [debug] Starting Electric.Shapes.Consumer for 95336946-1772539368614654
13:03:54.956 [debug] Started consumer for existing handle 95336946-1772539368614654
...
13:03:55.022 [debug] Starting Electric.Shapes.Consumer.Materializer for 16096921-1772539368604093
13:03:55.023 [debug] Starting Electric.Shapes.Consumer.Materializer for 79160600-1772539368614088
13:03:55.023 [debug] Starting Electric.Shapes.Consumer.Materializer for 95336946-1772539368614654
13:03:55.023 [debug] Starting Electric.Shapes.Consumer for 76149289-1772539368615417

Impact

  • The catch-all in Filter.affected_shapes returns all shape IDs when the error occurs, so no events are lost — but every shape gets notified of every change during the race window, which is wasteful.
  • Each change in the WAL produces one error per subquery dependency, so a multi-row transaction causes an error flood.
  • Rare in production because HTTP clients typically reconnect after a restart before the first WAL event arrives, which triggers eager consumer startup. Observed twice on the hertz instance (ELECTRIC-5D7, ELECTRIC-5PF).
