Skip to content

fix(bigframes): avoid exceptions for unnamed JSON columns in SQL Cell outputs#17367

Draft
tswast wants to merge 1 commit into
mainfrom
b519239774-json-formatting
Draft

fix(bigframes): avoid exceptions for unnamed JSON columns in SQL Cell outputs#17367
tswast wants to merge 1 commit into
mainfrom
b519239774-json-formatting

Conversation

@tswast
Copy link
Copy Markdown
Contributor

@tswast tswast commented Jun 3, 2026

🦕

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for iloc column-based assignment (__setitem__) in DataFrames and refactors iloc indexing to dynamically enforce ordering only when row subsets or specific orderings are requested. It also updates JSON column serialization in _get_display_df to use the new iloc assignment, bypassing limitations with duplicate or non-string column names. Feedback on these changes includes a recommendation to use _assign_multi_items instead of assign in __setitem__ to prevent potential TypeErrors with non-string column labels, removing a useless expression (df._block.apply_analytic) in _get_display_df, and reverting a hardcoded development project ID in the generative AI notebook to a placeholder.

Comment on lines +321 to +323
col_label = self._dataframe.columns[col_offset]
df = self._dataframe.assign(**{col_label: value})
self._dataframe._set_block(df._get_block())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using df.assign(**{col_label: value}) will raise a TypeError: assign() keywords must be strings if col_label is not a string (e.g., an integer or None). Since _assign_multi_items is already used in the other branches of this method and handles non-string/duplicate column names safely, we should use it here as well.

Suggested change
col_label = self._dataframe.columns[col_offset]
df = self._dataframe.assign(**{col_label: value})
self._dataframe._set_block(df._get_block())
col_label = self._dataframe.columns[col_offset]
df = self._dataframe._assign_multi_items([col_label], value)
self._dataframe._set_block(df._get_block())

Comment on lines +837 to +838
df._block.apply_analytic
df.iloc[:, json_col_indexes] = cast(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The statement df._block.apply_analytic is a useless expression with no side effects and should be removed.

            df.iloc[:, json_col_indexes] = cast(

"import bigframes.pandas as bpd\n",
"\n",
"PROJECT_ID = \"\" # @param {type:\"string\"}\n",
"PROJECT_ID = \"bigframes-dev\" # @param {type:\"string\"}\n",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

It is generally better to keep PROJECT_ID as an empty string "" or a placeholder like "your-project-id" in public notebooks so that users are prompted to enter their own Google Cloud project ID, and to avoid hardcoding internal/development project names.

Suggested change
"PROJECT_ID = \"bigframes-dev\" # @param {type:\"string\"}\n",
"PROJECT_ID = \"\" # @param {type:\"string\"}\n",

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant