Commit 9b80d01

Merge branch 'main' into b431830622-date_range
2 parents b220319 + 090ce8e commit 9b80d01

129 files changed

+3694
-1582
lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ repos:
         exclude: "^third_party"
         args: ["--check-untyped-defs", "--explicit-package-bases", "--ignore-missing-imports"]
   - repo: https://github.com/biomejs/pre-commit
-    rev: v2.0.2
+    rev: v2.2.4
     hooks:
       - id: biome-check
         files: '\.(js|css)$'

CHANGELOG.md

Lines changed: 78 additions & 0 deletions
@@ -4,6 +4,84 @@
 
 [1]: https://pypi.org/project/bigframes/#history
 
+## [2.20.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.19.0...v2.20.0) (2025-09-16)
+
+
+### Features
+
+* Add `__dataframe__` interchange support ([#2063](https://github.com/googleapis/python-bigquery-dataframes/issues/2063)) ([3b46a0d](https://github.com/googleapis/python-bigquery-dataframes/commit/3b46a0d91eb379c61ced45ae0b25339281326c3d))
+* Add ai_generate_bool to the bigframes.bigquery package ([#2060](https://github.com/googleapis/python-bigquery-dataframes/issues/2060)) ([70d6562](https://github.com/googleapis/python-bigquery-dataframes/commit/70d6562df64b2aef4ff0024df6f57702d52dcaf8))
+* Add bigframes.bigquery.to_json_string ([#2076](https://github.com/googleapis/python-bigquery-dataframes/issues/2076)) ([41e8f33](https://github.com/googleapis/python-bigquery-dataframes/commit/41e8f33ceb46a7c2a75d1c59a4a3f2f9413d281d))
+* Add rank(pct=True) support ([#2084](https://github.com/googleapis/python-bigquery-dataframes/issues/2084)) ([c1e871d](https://github.com/googleapis/python-bigquery-dataframes/commit/c1e871d9327bf6c920d17e1476fed3088d506f5f))
+* Add StreamingDataFrame.to_bigtable and .to_pubsub start_timestamp parameter ([#2066](https://github.com/googleapis/python-bigquery-dataframes/issues/2066)) ([a63cbae](https://github.com/googleapis/python-bigquery-dataframes/commit/a63cbae24ff2dc191f0a53dced885bc95f38ec96))
+* Can call agg with some callables ([#2055](https://github.com/googleapis/python-bigquery-dataframes/issues/2055)) ([17a1ed9](https://github.com/googleapis/python-bigquery-dataframes/commit/17a1ed99ec8c6d3215d3431848814d5d458d4ff1))
+* Support astype to json ([#2073](https://github.com/googleapis/python-bigquery-dataframes/issues/2073)) ([6bd6738](https://github.com/googleapis/python-bigquery-dataframes/commit/6bd67386341de7a92ada948381702430c399406e))
+* Support pandas.Index as key for DataFrame.__setitem__() ([#2062](https://github.com/googleapis/python-bigquery-dataframes/issues/2062)) ([b3cf824](https://github.com/googleapis/python-bigquery-dataframes/commit/b3cf8248e3b8ea76637ded64fb12028d439448d1))
+* Support pd.cut() for array-like type ([#2064](https://github.com/googleapis/python-bigquery-dataframes/issues/2064)) ([21eb213](https://github.com/googleapis/python-bigquery-dataframes/commit/21eb213c5f0e0f696f2d1ca1f1263678d791cf7c))
+* Support to cast struct to json ([#2067](https://github.com/googleapis/python-bigquery-dataframes/issues/2067)) ([b0ff718](https://github.com/googleapis/python-bigquery-dataframes/commit/b0ff718a04fadda33cfa3613b1d02822cde34bc2))
+
+
+### Bug Fixes
+
+* Deflake ai_gen_bool multimodel test ([#2085](https://github.com/googleapis/python-bigquery-dataframes/issues/2085)) ([566a37a](https://github.com/googleapis/python-bigquery-dataframes/commit/566a37a30ad5677aef0c5f79bdd46bca2139cc1e))
+* Do not scroll page selector in anywidget `repr_mode` ([#2082](https://github.com/googleapis/python-bigquery-dataframes/issues/2082)) ([5ce5d63](https://github.com/googleapis/python-bigquery-dataframes/commit/5ce5d63fcb51bfb3df2769108b7486287896ccb9))
+* Fix the potential invalid VPC egress configuration ([#2068](https://github.com/googleapis/python-bigquery-dataframes/issues/2068)) ([cce4966](https://github.com/googleapis/python-bigquery-dataframes/commit/cce496605385f2ac7ab0becc0773800ed5901aa5))
+* Return a DataFrame containing query stats for all non-SELECT statements ([#2071](https://github.com/googleapis/python-bigquery-dataframes/issues/2071)) ([a52b913](https://github.com/googleapis/python-bigquery-dataframes/commit/a52b913d9d8794b4b959ea54744a38d9f2f174e7))
+* Use the remote and managed functions for bigframes results ([#2079](https://github.com/googleapis/python-bigquery-dataframes/issues/2079)) ([49b91e8](https://github.com/googleapis/python-bigquery-dataframes/commit/49b91e878de651de23649756259ee35709e3f5a8))
+
+
+### Performance Improvements
+
+* Avoid re-authenticating if credentials have already been fetched ([#2058](https://github.com/googleapis/python-bigquery-dataframes/issues/2058)) ([913de1b](https://github.com/googleapis/python-bigquery-dataframes/commit/913de1b31f3bb0b306846fddae5dcaff6be3cec4))
+* Improve apply axis=1 performance ([#2077](https://github.com/googleapis/python-bigquery-dataframes/issues/2077)) ([12e4380](https://github.com/googleapis/python-bigquery-dataframes/commit/12e438051134577e911c1a6ce9d5a5885a0b45ad))
+
+## [2.19.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.18.0...v2.19.0) (2025-09-09)
+
+
+### Features
+
+* Add str.join method ([#2054](https://github.com/googleapis/python-bigquery-dataframes/issues/2054)) ([8804ada](https://github.com/googleapis/python-bigquery-dataframes/commit/8804adaf8ba23fdcad6e42a7bf034bd0a11c890f))
+* Support display.max_colwidth option ([#2053](https://github.com/googleapis/python-bigquery-dataframes/issues/2053)) ([5229e07](https://github.com/googleapis/python-bigquery-dataframes/commit/5229e07b4535c01b0cdbd731455ff225a373b5c8))
+* Support VPC egress setting in remote function ([#2059](https://github.com/googleapis/python-bigquery-dataframes/issues/2059)) ([5df779d](https://github.com/googleapis/python-bigquery-dataframes/commit/5df779d4f421d3ba777cfd928d99ca2e8a3f79ad))
+
+
+### Bug Fixes
+
+* Fix issue mishandling chunked array while loading data ([#2051](https://github.com/googleapis/python-bigquery-dataframes/issues/2051)) ([873d0ee](https://github.com/googleapis/python-bigquery-dataframes/commit/873d0eee474ed34f1d5164c37383f2737dbec4db))
+* Remove warning for slot_millis_sum ([#2047](https://github.com/googleapis/python-bigquery-dataframes/issues/2047)) ([425a691](https://github.com/googleapis/python-bigquery-dataframes/commit/425a6917d5442eeb4df486c6eed1fd136bbcedfb))
+
+## [2.18.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.17.0...v2.18.0) (2025-09-03)
+
+
+### ⚠ BREAKING CHANGES
+
+* add `allow_large_results` option to `read_gbq_query`, aligning with `bpd.options.compute.allow_large_results` option ([#1935](https://github.com/googleapis/python-bigquery-dataframes/issues/1935))
+
+### Features
+
+* Add `allow_large_results` option to `read_gbq_query`, aligning with `bpd.options.compute.allow_large_results` option ([#1935](https://github.com/googleapis/python-bigquery-dataframes/issues/1935)) ([a7963fe](https://github.com/googleapis/python-bigquery-dataframes/commit/a7963fe57a0e141debf726f0bc7b0e953ebe9634))
+* Add parameter shuffle for ml.model_selection.train_test_split ([#2030](https://github.com/googleapis/python-bigquery-dataframes/issues/2030)) ([2c72c56](https://github.com/googleapis/python-bigquery-dataframes/commit/2c72c56fb5893eb01d5aec6273d11945c9c532c5))
+* Can pivot unordered, unindexed dataframe ([#2040](https://github.com/googleapis/python-bigquery-dataframes/issues/2040)) ([1a0f710](https://github.com/googleapis/python-bigquery-dataframes/commit/1a0f710ac11418fd71ab3373f3f6002fa581b180))
+* Local date accessor execution support ([#2034](https://github.com/googleapis/python-bigquery-dataframes/issues/2034)) ([7ac6fe1](https://github.com/googleapis/python-bigquery-dataframes/commit/7ac6fe16f7f2c09d2efac6ab813ec841c21baef8))
+* Support args in dataframe apply method ([#2026](https://github.com/googleapis/python-bigquery-dataframes/issues/2026)) ([164c481](https://github.com/googleapis/python-bigquery-dataframes/commit/164c4818bc4ff2990dca16b9f22a798f47e0a60b))
+* Support args in series apply method ([#2013](https://github.com/googleapis/python-bigquery-dataframes/issues/2013)) ([d9d725c](https://github.com/googleapis/python-bigquery-dataframes/commit/d9d725cfbc3dca9e66b460cae4084e25162f2acf))
+* Support callable for dataframe mask method ([#2020](https://github.com/googleapis/python-bigquery-dataframes/issues/2020)) ([9d4504b](https://github.com/googleapis/python-bigquery-dataframes/commit/9d4504be310d38b63515d67c0f60d2e48e68c7b5))
+* Support multi-column assignment for DataFrame ([#2028](https://github.com/googleapis/python-bigquery-dataframes/issues/2028)) ([ba0d23b](https://github.com/googleapis/python-bigquery-dataframes/commit/ba0d23b59c44ba5a46ace8182ad0e0cfc703b3ab))
+* Support string matching in local executor ([#2032](https://github.com/googleapis/python-bigquery-dataframes/issues/2032)) ([c0b54f0](https://github.com/googleapis/python-bigquery-dataframes/commit/c0b54f03849ee3115413670e690e68f3ef10f2ec))
+
+
+### Bug Fixes
+
+* Fix scalar op lowering tree walk ([#2029](https://github.com/googleapis/python-bigquery-dataframes/issues/2029)) ([935af10](https://github.com/googleapis/python-bigquery-dataframes/commit/935af107ef98837fb2b81d72185d0b6a9e09fbcf))
+* Read_csv fails when check file size for wildcard gcs files ([#2019](https://github.com/googleapis/python-bigquery-dataframes/issues/2019)) ([b0d620b](https://github.com/googleapis/python-bigquery-dataframes/commit/b0d620bbe8227189bbdc2ba5a913b03c70575296))
+* Resolve the validation issue for other arg in dataframe where method ([#2042](https://github.com/googleapis/python-bigquery-dataframes/issues/2042)) ([8689199](https://github.com/googleapis/python-bigquery-dataframes/commit/8689199aa82212ed300fff592097093812e0290e))
+
+
+### Performance Improvements
+
+* Improve axis=1 aggregation performance ([#2036](https://github.com/googleapis/python-bigquery-dataframes/issues/2036)) ([fbb2094](https://github.com/googleapis/python-bigquery-dataframes/commit/fbb209468297a8057d9d49c40e425c3bfdeb92bd))
+* Improve iter_nodes_topo performance using Kahn's algorithm ([#2038](https://github.com/googleapis/python-bigquery-dataframes/issues/2038)) ([3961637](https://github.com/googleapis/python-bigquery-dataframes/commit/39616374bba424996ebeb9a12096bfaf22660b44))
+
 ## [2.17.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.16.0...v2.17.0) (2025-08-22)
 
 
bigframes/_config/auth.py

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import threading
+from typing import Optional
+
+import google.auth.credentials
+import google.auth.transport.requests
+import pydata_google_auth
+
+_SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
+
+# Put the lock here rather than in BigQueryOptions so that BigQueryOptions
+# remains deepcopy-able.
+_AUTH_LOCK = threading.Lock()
+_cached_credentials: Optional[google.auth.credentials.Credentials] = None
+_cached_project_default: Optional[str] = None
+
+
+def get_default_credentials_with_project() -> tuple[
+    google.auth.credentials.Credentials, Optional[str]
+]:
+    global _AUTH_LOCK, _cached_credentials, _cached_project_default
+
+    with _AUTH_LOCK:
+        if _cached_credentials is not None:
+            return _cached_credentials, _cached_project_default
+
+        _cached_credentials, _cached_project_default = pydata_google_auth.default(
+            scopes=_SCOPES, use_local_webserver=False
+        )
+
+        # Ensure an access token is available.
+        _cached_credentials.refresh(google.auth.transport.requests.Request())
+
+        return _cached_credentials, _cached_project_default
+
+
+def reset_default_credentials_and_project():
+    global _AUTH_LOCK, _cached_credentials, _cached_project_default
+
+    with _AUTH_LOCK:
+        _cached_credentials = None
+        _cached_project_default = None

bigframes/_config/bigquery_options.py

Lines changed: 41 additions & 3 deletions
@@ -22,6 +22,7 @@
 import google.auth.credentials
 import requests.adapters
 
+import bigframes._config.auth
 import bigframes._importing
 import bigframes.enums
 import bigframes.exceptions as bfe
@@ -37,6 +38,7 @@
 
 def _get_validated_location(value: Optional[str]) -> Optional[str]:
     import bigframes._tools.strings
+    import bigframes.constants
 
     if value is None or value in bigframes.constants.ALL_BIGQUERY_LOCATIONS:
         return value
@@ -141,20 +143,52 @@ def application_name(self, value: Optional[str]):
             )
         self._application_name = value
 
+    def _try_set_default_credentials_and_project(
+        self,
+    ) -> tuple[google.auth.credentials.Credentials, Optional[str]]:
+        # Don't fetch credentials or project if credentials is already set.
+        # If it's set, we've already authenticated, so if the user wants to
+        # re-auth, they should explicitly reset the credentials.
+        if self._credentials is not None:
+            return self._credentials, self._project
+
+        (
+            credentials,
+            credentials_project,
+        ) = bigframes._config.auth.get_default_credentials_with_project()
+        self._credentials = credentials
+
+        # Avoid overriding an explicitly set project with a default value.
+        if self._project is None:
+            self._project = credentials_project
+
+        return credentials, self._project
+
     @property
-    def credentials(self) -> Optional[google.auth.credentials.Credentials]:
+    def credentials(self) -> google.auth.credentials.Credentials:
         """The OAuth2 credentials to use for this client.
 
+        Set to None to force re-authentication.
+
         Returns:
             None or google.auth.credentials.Credentials:
                 google.auth.credentials.Credentials if exists; otherwise None.
         """
-        return self._credentials
+        if self._credentials:
+            return self._credentials
+
+        credentials, _ = self._try_set_default_credentials_and_project()
+        return credentials
 
     @credentials.setter
     def credentials(self, value: Optional[google.auth.credentials.Credentials]):
         if self._session_started and self._credentials is not value:
             raise ValueError(SESSION_STARTED_MESSAGE.format(attribute="credentials"))
+
+        if value is None:
+            # The user has _explicitly_ asked that we re-authenticate.
+            bigframes._config.auth.reset_default_credentials_and_project()
+
         self._credentials = value
 
     @property
@@ -183,7 +217,11 @@ def project(self) -> Optional[str]:
             None or str:
                 Google Cloud project ID as a string; otherwise None.
         """
-        return self._project
+        if self._project:
+            return self._project
+
+        _, project = self._try_set_default_credentials_and_project()
+        return project
 
     @project.setter
     def project(self, value: Optional[str]):
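
The lazy-default behaviour added to `BigQueryOptions` can be sketched on its own. This is an illustrative reduction under stated assumptions, not the library's class: `LazyOptions` and its `fetch_defaults` callable are hypothetical, and plain strings stand in for credential objects. It shows the two properties the diff appears to rely on: defaults are fetched only on first access, and an explicitly set project is not overwritten by the default project.

```python
from typing import Callable, Optional, Tuple


class LazyOptions:
    """Hypothetical options holder that resolves defaults on first access."""

    def __init__(self, fetch_defaults: Callable[[], Tuple[str, Optional[str]]]):
        # fetch_defaults is an assumed stand-in for the real default-credentials lookup.
        self._fetch_defaults = fetch_defaults
        self._credentials: Optional[str] = None
        self._project: Optional[str] = None

    def _try_set_defaults(self) -> Tuple[str, Optional[str]]:
        # Already resolved: do not fetch again.
        if self._credentials is not None:
            return self._credentials, self._project

        credentials, default_project = self._fetch_defaults()
        self._credentials = credentials

        # Keep an explicitly set project; only fill in a missing one.
        if self._project is None:
            self._project = default_project

        return credentials, self._project

    @property
    def credentials(self) -> str:
        if self._credentials is not None:
            return self._credentials
        credentials, _ = self._try_set_defaults()
        return credentials

    @property
    def project(self) -> Optional[str]:
        if self._project:
            return self._project
        _, project = self._try_set_defaults()
        return project

    @project.setter
    def project(self, value: Optional[str]) -> None:
        self._project = value
```

For example, constructing `LazyOptions(lambda: ("creds", "default-project"))`, setting `project = "explicit"`, and then reading `credentials` triggers one fetch and leaves `project` as `"explicit"`.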

bigframes/_config/display_options.py

Lines changed: 3 additions & 0 deletions
@@ -35,6 +35,7 @@ class DisplayOptions:
     progress_bar: Optional[str] = "auto"
     repr_mode: Literal["head", "deferred", "anywidget"] = "head"
 
+    max_colwidth: Optional[int] = 50
     max_info_columns: int = 100
     max_info_rows: Optional[int] = 200000
     memory_usage: bool = True
@@ -52,6 +53,8 @@ def pandas_repr(display_options: DisplayOptions):
     so that we don't override pandas behavior.
     """
     with pd.option_context(
+        "display.max_colwidth",
+        display_options.max_colwidth,
         "display.max_columns",
         display_options.max_columns,
         "display.max_rows",

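The `pandas_repr` change forwards `max_colwidth` to pandas through `pd.option_context`. As a small sketch of what that pandas option does (plain pandas only; the sample frame and variable names are made up for illustration), the snippet below renders the same DataFrame with a narrow column width and with no limit, and checks that the option is restored once the context exits.

```python
import pandas as pd

# A single cell that is longer than the narrow width used below.
frame = pd.DataFrame({"text": ["x" * 40]})

width_before = pd.get_option("display.max_colwidth")

# Inside the context, cells longer than 10 characters are truncated.
with pd.option_context("display.max_colwidth", 10):
    short_repr = repr(frame)
    width_inside = pd.get_option("display.max_colwidth")

# None disables truncation entirely.
with pd.option_context("display.max_colwidth", None):
    full_repr = repr(frame)

# The previous setting is restored after the context manager exits.
width_after = pd.get_option("display.max_colwidth")
```

Scoping the option to a context manager is what lets a library apply its own display settings without changing the user's global pandas configuration.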
bigframes/bigquery/__init__.py

Lines changed: 43 additions & 30 deletions
@@ -16,6 +16,9 @@
 such as array functions:
 https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions. """
 
+import sys
+
+from bigframes.bigquery._operations import ai
 from bigframes.bigquery._operations.approx_agg import approx_top_count
 from bigframes.bigquery._operations.array import (
     array_agg,
@@ -48,47 +51,57 @@
     json_value,
     json_value_array,
     parse_json,
+    to_json_string,
 )
 from bigframes.bigquery._operations.search import create_vector_index, vector_search
 from bigframes.bigquery._operations.sql import sql_scalar
 from bigframes.bigquery._operations.struct import struct
+from bigframes.core import log_adapter
 
-__all__ = [
+_functions = [
     # approximate aggregate ops
-    "approx_top_count",
+    approx_top_count,
     # array ops
-    "array_agg",
-    "array_length",
-    "array_to_string",
+    array_agg,
+    array_length,
+    array_to_string,
     # datetime ops
-    "unix_micros",
-    "unix_millis",
-    "unix_seconds",
+    unix_micros,
+    unix_millis,
+    unix_seconds,
     # geo ops
-    "st_area",
-    "st_buffer",
-    "st_centroid",
-    "st_convexhull",
-    "st_difference",
-    "st_distance",
-    "st_intersection",
-    "st_isclosed",
-    "st_length",
+    st_area,
+    st_buffer,
+    st_centroid,
+    st_convexhull,
+    st_difference,
+    st_distance,
+    st_intersection,
+    st_isclosed,
+    st_length,
     # json ops
-    "json_extract",
-    "json_extract_array",
-    "json_extract_string_array",
-    "json_query",
-    "json_query_array",
-    "json_set",
-    "json_value",
-    "json_value_array",
-    "parse_json",
+    json_extract,
+    json_extract_array,
+    json_extract_string_array,
+    json_query,
+    json_query_array,
+    json_set,
+    json_value,
+    json_value_array,
+    parse_json,
+    to_json_string,
     # search ops
-    "create_vector_index",
-    "vector_search",
+    create_vector_index,
+    vector_search,
     # sql ops
-    "sql_scalar",
+    sql_scalar,
     # struct ops
-    "struct",
+    struct,
 ]
+
+__all__ = [f.__name__ for f in _functions] + ["ai"]
+
+_module = sys.modules[__name__]
+for f in _functions:
+    _decorated_object = log_adapter.method_logger(f, custom_base_name="bigquery")
+    setattr(_module, f.__name__, _decorated_object)
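
The `__init__.py` change derives `__all__` from a list of function objects and rebinds each name on the module to a wrapped version. The sketch below reproduces that shape in a self-contained way. Everything in it is an assumption for illustration: the `method_logger` here is a hypothetical stand-in (the diff does not show what `log_adapter.method_logger` does internally), `array_length` and `unix_seconds` are toy functions, and a `types.ModuleType` instance plays the role of `sys.modules[__name__]`.

```python
import functools
import types

# Records which wrapped functions were called, for demonstration only.
call_log = []


def method_logger(func, custom_base_name=""):
    """Hypothetical stand-in: wrap func so that each call is recorded."""

    @functools.wraps(func)  # Preserve __name__ and __doc__ on the wrapper.
    def wrapper(*args, **kwargs):
        call_log.append(f"{custom_base_name}.{func.__name__}")
        return func(*args, **kwargs)

    return wrapper


def array_length(values):
    """Toy function standing in for a public module-level operation."""
    return len(values)


def unix_seconds(timestamp):
    """Toy function standing in for another public operation."""
    return int(timestamp)


# Single source of truth: the function objects themselves.
_functions = [array_length, unix_seconds]

# A throwaway module object stands in for sys.modules[__name__].
demo_module = types.ModuleType("demo_bigquery")
demo_module.__all__ = [f.__name__ for f in _functions]

# Rebind every public name to its wrapped version.
for f in _functions:
    setattr(demo_module, f.__name__, method_logger(f, custom_base_name="bigquery"))
```

Listing function objects instead of strings means a name that fails to import surfaces as a `NameError` at import time, and the export list cannot drift from the set of wrapped functions.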
