Summary
Toggling cleaning_method (and likely other state that bumps dataframe_id) shows a transient mid-state where the widget renders new data against a stale or empty column_config, then re-renders with the correct config. Visible as a brief column flash / "half update."
Currently mitigated by a JS-side bandaid (last-known-good column_config fallback in BuckarooInfiniteWidget), but the two underlying issues should be fixed at the root.
Root cause
buckaroo/dataflow/dataflow.py:481-525 (_handle_widget_change) publishes two traitlets in sequence:
- L499 / L503: self.df_data_dict = {...} (first comm sync)
- L525: self.df_display_args = temp_display_args (second comm sync)
Each self.X = ... triggers an immediate traitlet/comm sync. anywidget's useModelState subscribes per-key (change:df_data_dict, change:df_display_args), so the JS gets two separate React renders. Between them, the active view's df_viewer_config.column_config can mismatch the data, or be empty (EMPTY_DFVIEWER_CONFIG in styling_core.py:186-189).
Compounded by BuckarooWidgetInfinite.tsx:277-286: effectiveDataframeId (the AG-Grid remount key) is derived from optimistic local state (buckaroo_state.cleaning_method), so the grid remounts the instant the user clicks the toggle — before Python has sent any reply — mounting fresh against the still-stale df_display_args.
Fix 1 — Atomic publish on the Python side
Batch the two traitlet writes so the JS receives one comm payload, fires both change: events synchronously, and React 18 auto-batches into one render.
with self.hold_sync():
    self.df_data_dict = {...}
    self.df_display_args = temp_display_args
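hold_sync comes from ipywidgets.Widget, and DataFlow doesn't inherit from it directly. Needs a quick check: either the call has to be hoisted to the outer BuckarooWidget (which does inherit DOMWidget), or hold_trait_notifications on HasTraits is sufficient. Note that hold_trait_notifications only defers observer callbacks until the block exits; whether the deferred changes then go out as a single comm message is part of what needs checking. The two combined fields are small, so a single custom message (send() on the Python widget, received as msg:custom in JS) is an option if the trait-level batching doesn't compose cleanly.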
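A minimal sketch of why the batching matters, using a stdlib-only toy model rather than ipywidgets itself. ToyModel, its set() and hold_sync() methods, and the column lists are hypothetical stand-ins; the real ipywidgets.Widget.hold_sync is a context manager with the same intent, but whether it is reachable from DataFlow is the open question above.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a synced widget model (not ipywidgets)."""

    def __init__(self, on_sync):
        self._state = {"df_data_dict": {"cols": ["a"]},
                       "df_display_args": {"column_config": ["a"]}}
        self._on_sync = on_sync   # called once per outgoing payload
        self._held = None         # set of pending keys while inside hold_sync()

    def set(self, key, value):
        self._state[key] = value
        if self._held is not None:
            self._held.add(key)                  # defer until the block exits
        else:
            self._on_sync(dict(self._state))     # immediate per-key sync

    @contextmanager
    def hold_sync(self):
        self._held = set()
        try:
            yield
        finally:
            pending, self._held = self._held, None
            if pending:
                self._on_sync(dict(self._state))  # one combined payload


def publish(model, batched):
    new_data = {"cols": ["a", "b"]}
    new_args = {"column_config": ["a", "b"]}
    if batched:
        with model.hold_sync():
            model.set("df_data_dict", new_data)
            model.set("df_display_args", new_args)
    else:
        model.set("df_data_dict", new_data)
        model.set("df_display_args", new_args)


def mismatched_renders(batched):
    """Return (payloads sent, payloads where data and column_config disagree)."""
    seen = []
    publish(ToyModel(seen.append), batched)
    bad = [s for s in seen
           if s["df_data_dict"]["cols"] != s["df_display_args"]["column_config"]]
    return len(seen), len(bad)
```

In this model the unbatched path sends two payloads, the first of which pairs new data with the old column_config; the batched path sends one consistent payload. That intermediate payload is the "half update" this fix is meant to eliminate.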
Fix 2 — Server-confirmed remount key on the JS side
Drop the optimistic-state bundling from effectiveDataframeId. Use only dataframe_id (the server-confirmed id, bumped by Python on operations / cleaning / post-processing / quick-command changes). The grid then stays mounted until Python's reply arrives, and remounts cleanly against new data + new config in the same tick.
Currently:
const effectiveDataframeId = JSON.stringify([
    dataframe_id, operations, post_processing, cleaning_method, quick_command_args
]);
Target:
const effectiveDataframeId = dataframe_id;
Trade-off: a brief perceived latency between clicking the toggle and the grid changing, versus today's instant-but-half-flash. With Fix 1 in place this latency is just the Python round-trip; without Fix 1 the in-between state is still visible.
Requires verifying that Python actually bumps dataframe_id on every row-content-changing state change (operations, post_processing, cleaning_method, quick_command_args). If any of those don't bump it today, that needs adding before this fix is safe.
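One way to phrase that verification is a table-driven check: change each field in turn and report any field that leaves the id untouched. The sketch below runs against a hypothetical FakeFlow stand-in; update_state, the example field values, and the counter-style id are all assumptions, since this issue does not specify how the real dataflow is constructed or driven in tests.

```python
# Example values are made up; only the field names come from the issue.
ROW_CHANGING_FIELDS = {
    "operations": [{"op": "dropcol"}],
    "post_processing": "transpose",
    "cleaning_method": "aggressive",
    "quick_command_args": {"search": ["foo"]},
}


def fields_not_bumping(make_flow, changes=ROW_CHANGING_FIELDS):
    """Return the fields whose change leaves dataframe_id untouched."""
    stale = []
    for field, value in changes.items():
        flow = make_flow()                  # fresh object per field
        before = flow.dataframe_id
        flow.update_state(field, value)     # hypothetical entry point
        if flow.dataframe_id == before:
            stale.append(field)
    return stale


class FakeFlow:
    """Hypothetical stand-in; bumps the id only for fields in `bumping`."""

    def __init__(self, bumping):
        self.dataframe_id = 0
        self._bumping = bumping

    def update_state(self, field, value):
        if field in self._bumping:
            self.dataframe_id += 1
```

With a fake that forgets to bump on quick_command_args, fields_not_bumping returns that field name, which is the kind of gap that would need closing before the remount key can rely on dataframe_id alone.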
Current bandaid (to be removed once 1+2 land)
BuckarooInfiniteWidget holds a ref lastGoodDfvcRef keyed by view name. If incoming column_config is empty for a view we've previously seen non-empty, the prior df_viewer_config is substituted. Logged via [bk-flash] effectiveDisplayArgs falling back to last-known-good column_config.
This handles the empty case but doesn't help the column-name-mismatch case (new data vs old config with overlapping but not identical column sets). Fixes 1 and 2 do.
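The gap can be illustrated with a small Python model of the fallback logic. This is not the TypeScript in BuckarooInfiniteWidget: effective_config, mismatched, the cache dict, and the list-of-column-names shape are hypothetical, chosen only to mirror the behaviour described above.

```python
def effective_config(cache, view, incoming):
    """Substitute the last-known-good config only when the incoming one is empty.

    `cache` models lastGoodDfvcRef: view name -> last non-empty column_config.
    """
    if incoming:
        cache[view] = incoming
        return incoming
    return cache.get(view, incoming)


def mismatched(data_columns, config):
    """Columns that appear in the data or the config, but not in both."""
    return sorted(set(data_columns) ^ set(config))


cache = {}
first = effective_config(cache, "main", ["a", "b"])     # normal render
empty_case = effective_config(cache, "main", [])        # fallback kicks in
stale_case = effective_config(cache, "main", ["a", "b"])  # stale but non-empty
leftover = mismatched(["a", "c"], stale_case)           # new data has a, c
```

Here empty_case recovers the earlier config, while stale_case passes straight through because it is non-empty, leaving columns b and c mismatched against the new data.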
Acceptance
- Toggle cleaning_method back and forth ~5 times on a notebook df. No bk-flash log line about the last-known-good fallback firing. No visible column flash between toggles.
- Toggle post_processing and quick_command_args (sort/search) similarly.
- Confirm fallback ref no longer needed and remove it.