feat: Add bigframes.execution_history API to track BigQuery jobs#16588
Conversation
Code Review
This pull request implements an execution history feature to track and display BigQuery and local Polars jobs initiated during a session. Key changes include the addition of a JobMetadata dataclass, updates to ExecutionMetrics for job tracking, and a specialized _ExecutionHistory DataFrame for formatted output. Review feedback identifies opportunities to improve error logging in the HTML representation, remove redundant attribute assignments in the metrics logic, and ensure that bytes processed during local executions are consistently aggregated into session-level metrics.
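For reviewers skimming without the diff, the `JobMetadata` dataclass might look roughly like the sketch below. Only `job_id`, the Console URL, the query preview, `total_bytes_processed`, `total_slot_ms`, and `duration_seconds` are mentioned in this PR; everything else here (defaults, types, field order) is an assumption, not the actual implementation:

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical sketch of the JobMetadata dataclass described in this PR;
# exact field names, types, and defaults may differ from bigframes itself.
@dataclass
class JobMetadata:
    job_id: str
    job_url: Optional[str] = None  # Google Cloud Console URL for debugging
    query: Optional[str] = None  # truncated preview of the SQL that was run
    total_bytes_processed: int = 0
    total_slot_ms: int = 0
    duration_seconds: float = 0.0
```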
```python
except Exception:
    return super()._repr_html_()  # type: ignore
```
Using a broad `except Exception:` can hide bugs in the HTML representation logic. It's better to catch specific exceptions, or at least log the caught exception to aid debugging if the formatting fails. This will provide visibility into any issues without breaking the user's interactive session.
Suggested change:

```diff
-except Exception:
+except Exception as e:
+    logger.warning("Failed to generate custom HTML representation for execution history: %s", e)
     return super()._repr_html_()  # type: ignore
```
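As a self-contained illustration of the pattern being suggested (the class and method names here are invented for the sketch, not taken from bigframes):

```python
import logging

logger = logging.getLogger(__name__)


class HistoryReprSketch:
    # Hypothetical stand-in for the _ExecutionHistory DataFrame; only the
    # try/except fallback pattern mirrors the suggestion above.
    def _custom_html(self) -> str:
        raise ValueError("simulated formatting bug")

    def _fallback_html(self) -> str:
        return "<table>basic repr</table>"

    def _repr_html_(self) -> str:
        try:
            return self._custom_html()
        except Exception as e:
            # Logging keeps formatting bugs visible without breaking
            # the user's interactive session.
            logger.warning(
                "Failed to generate custom HTML representation: %s", e
            )
            return self._fallback_html()
```

A formatting failure in `_custom_html` is logged and the caller still gets the basic representation.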
```python
metadata.total_bytes_processed = bytes_processed
metadata.total_slot_ms = slot_millis
```
These assignments are redundant because `JobMetadata.from_job` already populates `total_bytes_processed` and `total_slot_ms` from the `query_job` object when the job is a `QueryJob`. The values from `get_performance_stats` are sourced from the same attributes on the job object. Removing these lines will make the code cleaner.
```python
if isinstance(event, bigframes.core.events.ExecutionFinished):
    if event.result and isinstance(event.result, LocalExecuteResult):
        self.execution_count += 1
        bytes_processed = event.result.total_bytes_processed or 0
```
The `execution_count` is being incremented for local Polars executions, but `bytes_processed` is not. For consistency with how other job types are handled in `count_job_stats`, `self.bytes_processed` should also be updated. This ensures that metrics like `session.bytes_processed_sum` are comprehensive. Note that the docstring for `bytes_processed_sum` might need to be updated in a separate change to reflect that it includes more than just BigQuery jobs.
Suggested change:

```diff
 bytes_processed = event.result.total_bytes_processed or 0
+self.bytes_processed += bytes_processed
```
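A minimal sketch of the aggregation being suggested; the counter names follow the snippet above, but the class and handler shape are assumptions for illustration:

```python
from dataclasses import dataclass


@dataclass
class MetricsSketch:
    # Mirrors the execution_count / bytes_processed counters discussed above.
    execution_count: int = 0
    bytes_processed: int = 0

    def record_local_result(self, total_bytes_processed) -> None:
        self.execution_count += 1
        # Aggregate local bytes too, so session-level totals such as
        # bytes_processed_sum stay comprehensive across job types.
        self.bytes_processed += total_bytes_processed or 0
```

With this change, a local execution reporting `None` bytes still increments the count but adds zero bytes.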
This PR promotes `execution_history()` to the top-level `bigframes` namespace and upgrades it to track rich metadata for every BigQuery job executed during your session.

Key User Benefits:

Easier Access: Call `bigframes.execution_history()` directly instead of digging into sub-namespaces.

Rich Metadata Tracking: Captures structured statistics for both Query Jobs and Load Jobs, including:
- `job_id` and a direct Google Cloud Console URL for easy debugging.
- Performance metrics: `total_bytes_processed`, `duration_seconds`, and `slot_millis`.
- Query details (a truncated preview of the SQL that was run).

Clean, Focused Logs: Automatically filters out internal library overhead (such as schema validations and index uniqueness checks) so your history shows only the data processing steps you actually care about.
Usage Example:
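A rough illustration of the intended call shape (this requires an authenticated BigQuery session; the exact output columns are assumptions based on the fields listed above):

```python
import bigframes
import bigframes.pandas as bpd

# Trigger a couple of jobs, then inspect what the session ran.
df = bpd.read_gbq("SELECT 1 AS x")
_ = df.to_pandas()

history = bigframes.execution_history()
# Expected to include columns such as job_id, total_bytes_processed,
# slot_millis, and a Cloud Console URL per job.
print(history)
```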
verified at:
More test cases and a notebook update will be checked in via separate PRs for easier review.
Fixes #<481840739> 🦕