
Commit 2f6df1d

Merge branch 'main' into output_schema
2 parents 90f8c10 + 6bf06a7 commit 2f6df1d

59 files changed: +2164 -184 lines changed


CHANGELOG.md

Lines changed: 44 additions & 0 deletions
@@ -4,6 +4,50 @@

 [1]: https://pypi.org/project/bigframes/#history

+## [2.17.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.16.0...v2.17.0) (2025-08-22)
+
+
+### Features
+
+* Add isin local execution impl ([#1993](https://github.com/googleapis/python-bigquery-dataframes/issues/1993)) ([26df6e6](https://github.com/googleapis/python-bigquery-dataframes/commit/26df6e691bb27ed09322a81214faedbf3639b32e))
+* Add reset_index names, col_level, col_fill, allow_duplicates args ([#2017](https://github.com/googleapis/python-bigquery-dataframes/issues/2017)) ([c02a1b6](https://github.com/googleapis/python-bigquery-dataframes/commit/c02a1b67d27758815430bb8006ac3a72cea55a89))
+* Support callable for series mask method ([#2014](https://github.com/googleapis/python-bigquery-dataframes/issues/2014)) ([5ac32eb](https://github.com/googleapis/python-bigquery-dataframes/commit/5ac32ebe17cfda447870859f5dd344b082b4d3d0))
+
+## [2.16.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.15.0...v2.16.0) (2025-08-20)
+
+
+### Features
+
+* Add `bigframes.pandas.options.display.precision` option ([#1979](https://github.com/googleapis/python-bigquery-dataframes/issues/1979)) ([15e6175](https://github.com/googleapis/python-bigquery-dataframes/commit/15e6175ec0aeb1b7b02d0bba9e8e1e018bd11c31))
+* Add level, inplace params to reset_index ([#1988](https://github.com/googleapis/python-bigquery-dataframes/issues/1988)) ([3446950](https://github.com/googleapis/python-bigquery-dataframes/commit/34469504b79a082d3380f9f25c597483aef2068a))
+* Add ML code samples from dbt blog post ([#1978](https://github.com/googleapis/python-bigquery-dataframes/issues/1978)) ([ebaa244](https://github.com/googleapis/python-bigquery-dataframes/commit/ebaa244a9eb7b87f7f9fd9c3bebe5c7db24cd013))
+* Add where, coalesce, fillna, casewhen, invert local impl ([#1976](https://github.com/googleapis/python-bigquery-dataframes/issues/1976)) ([f7f686c](https://github.com/googleapis/python-bigquery-dataframes/commit/f7f686cf85ab7e265d9c07ebc7f0cd59babc5357))
+* Adjust anywidget CSS to prevent overflow ([#1981](https://github.com/googleapis/python-bigquery-dataframes/issues/1981)) ([204f083](https://github.com/googleapis/python-bigquery-dataframes/commit/204f083a2f00fcc9fd1500dcd7a738eda3904d2f))
+* Format page number in table widget ([#1992](https://github.com/googleapis/python-bigquery-dataframes/issues/1992)) ([e83836e](https://github.com/googleapis/python-bigquery-dataframes/commit/e83836e8e1357f009f3f95666f1661bdbe0d3751))
+* Or, And, Xor can execute locally ([#1994](https://github.com/googleapis/python-bigquery-dataframes/issues/1994)) ([59c52a5](https://github.com/googleapis/python-bigquery-dataframes/commit/59c52a55ebea697855eb4c70529e226cc077141f))
+* Support callable bigframes function for dataframe where ([#1990](https://github.com/googleapis/python-bigquery-dataframes/issues/1990)) ([44c1ec4](https://github.com/googleapis/python-bigquery-dataframes/commit/44c1ec48cc4db1c4c9c15ec1fab43d4ef0758e56))
+* Support callable for series where method ([#2005](https://github.com/googleapis/python-bigquery-dataframes/issues/2005)) ([768b82a](https://github.com/googleapis/python-bigquery-dataframes/commit/768b82af96a5dd0c434edcb171036eb42cfb9b41))
+* When using `repr_mode = "anywidget"`, numeric values align right ([15e6175](https://github.com/googleapis/python-bigquery-dataframes/commit/15e6175ec0aeb1b7b02d0bba9e8e1e018bd11c31))
+
+
+### Bug Fixes
+
+* Address the packages issue for bigframes function ([#1991](https://github.com/googleapis/python-bigquery-dataframes/issues/1991)) ([68f1d22](https://github.com/googleapis/python-bigquery-dataframes/commit/68f1d22d5ed8457a5cabc7751ed1d178063dd63e))
+* Correct pypdf dependency specifier for remote PDF functions ([#1980](https://github.com/googleapis/python-bigquery-dataframes/issues/1980)) ([0bd5e1b](https://github.com/googleapis/python-bigquery-dataframes/commit/0bd5e1b3c004124d2100c3fbec2fbe1e965d1e96))
+* Enable default retries in calls to BQ Storage Read API ([#1985](https://github.com/googleapis/python-bigquery-dataframes/issues/1985)) ([f25d7bd](https://github.com/googleapis/python-bigquery-dataframes/commit/f25d7bd30800dffa65b6c31b0b7ac711a13d790f))
+* Fix the copyright year in dbt sample files ([#1996](https://github.com/googleapis/python-bigquery-dataframes/issues/1996)) ([fad5722](https://github.com/googleapis/python-bigquery-dataframes/commit/fad57223d129f0c95d0c6a066179bb66880edd06))
+
+
+### Performance Improvements
+
+* Faster session startup by defering anon dataset fetch ([#1982](https://github.com/googleapis/python-bigquery-dataframes/issues/1982)) ([2720c4c](https://github.com/googleapis/python-bigquery-dataframes/commit/2720c4cf070bf57a0930d7623bfc41d89cc053ee))
+
+
+### Documentation
+
+* Add examples of running bigframes in kaggle ([#2002](https://github.com/googleapis/python-bigquery-dataframes/issues/2002)) ([7d89d76](https://github.com/googleapis/python-bigquery-dataframes/commit/7d89d76976595b75cb0105fbe7b4f7ca2fdf49f2))
+* Remove preview warning from partial ordering mode sample notebook ([#1986](https://github.com/googleapis/python-bigquery-dataframes/issues/1986)) ([132e0ed](https://github.com/googleapis/python-bigquery-dataframes/commit/132e0edfe9f96c15753649d77fcb6edd0b0708a3))
+
 ## [2.15.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.14.0...v2.15.0) (2025-08-11)


bigframes/core/blocks.py

Lines changed: 21 additions & 3 deletions
@@ -387,12 +387,21 @@ def reversed(self) -> Block:
             index_labels=self.index.names,
         )

-    def reset_index(self, level: LevelsType = None, drop: bool = True) -> Block:
+    def reset_index(
+        self,
+        level: LevelsType = None,
+        drop: bool = True,
+        *,
+        col_level: Union[str, int] = 0,
+        col_fill: typing.Hashable = "",
+        allow_duplicates: bool = False,
+    ) -> Block:
         """Reset the index of the block, promoting the old index to a value column.

         Arguments:
             level: the label or index level of the index levels to remove.
             name: this is the column id for the new value id derived from the old index
+            allow_duplicates:

         Returns:
             A new Block because dropping index columns can break references
@@ -438,6 +447,11 @@ def reset_index(self, level: LevelsType = None, drop: bool = True) -> Block:
             )
         else:
             # Add index names to column index
+            col_level_n = (
+                col_level
+                if isinstance(col_level, int)
+                else self.column_labels.names.index(col_level)
+            )
             column_labels_modified = self.column_labels
             for position, level_id in enumerate(level_ids):
                 label = self.col_id_to_index_name[level_id]
@@ -447,11 +461,15 @@ def reset_index(self, level: LevelsType = None, drop: bool = True) -> Block:
                     else:
                         label = f"level_{self.index_columns.index(level_id)}"

-                if label in self.column_labels:
+                if (not allow_duplicates) and (label in self.column_labels):
                     raise ValueError(f"cannot insert {label}, already exists")
+
                 if isinstance(self.column_labels, pd.MultiIndex):
                     nlevels = self.column_labels.nlevels
-                    label = tuple(label if i == 0 else "" for i in range(nlevels))
+                    label = tuple(
+                        label if i == col_level_n else col_fill for i in range(nlevels)
+                    )
+
                 # Create index copy with label inserted
                 # See: https://pandas.pydata.org/docs/reference/api/pandas.Index.insert.html
                 column_labels_modified = column_labels_modified.insert(position, label)
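
Reviewer note: the `col_level` / `col_fill` handling above can be tried in isolation. The following is a minimal standalone sketch, not code from this commit; the helper name `build_multiindex_label` and its `level_names` parameter are hypothetical stand-ins for the `self.column_labels` state used in `reset_index`.

```python
from typing import Hashable, Sequence, Tuple, Union


def build_multiindex_label(
    label: Hashable,
    level_names: Sequence[Hashable],
    col_level: Union[str, int] = 0,
    col_fill: Hashable = "",
) -> Tuple[Hashable, ...]:
    """Place `label` at `col_level` and pad every other level with `col_fill`."""
    # Resolve a level name to its position, mirroring the col_level_n logic
    # in the diff (hypothetical helper, plain lists instead of a pandas Index).
    col_level_n = (
        col_level
        if isinstance(col_level, int)
        else list(level_names).index(col_level)
    )
    return tuple(
        label if i == col_level_n else col_fill for i in range(len(level_names))
    )


# Defaults reproduce the previous behavior: label on level 0, "" elsewhere.
print(build_multiindex_label("idx", ["outer", "inner"]))
# ('idx', '')

# A named level plus a custom fill value.
print(build_multiindex_label("idx", ["outer", "inner"], col_level="inner", col_fill="x"))
# ('x', 'idx')
```

With the default arguments the sketch yields the same tuple as the removed line (`label if i == 0 else ""`), which suggests the change is backward compatible for existing callers.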

bigframes/core/compile/ibis_compiler/scalar_op_registry.py

Lines changed: 1 addition & 1 deletion
@@ -1062,7 +1062,7 @@ def isin_op_impl(x: ibis_types.Value, op: ops.IsInOp):
     if op.match_nulls and contains_nulls:
         return x.isnull() | x.isin(matchable_ibis_values)
     else:
-        return x.isin(matchable_ibis_values)
+        return x.isin(matchable_ibis_values).fillna(False)


 @scalar_op_compiler.register_unary_op(ops.ToDatetimeOp, pass_op=True)
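
Reviewer note: the added `.fillna(False)` presumably exists because a membership test on a NULL input yields NULL rather than False under SQL-style three-valued logic. The sketch below illustrates that assumption in plain Python; `sql_style_isin` and `isin_filled` are hypothetical names and do not use ibis.

```python
from typing import Iterable, Optional


def sql_style_isin(x: Optional[int], values: Iterable[int]) -> Optional[bool]:
    """Mimic SQL's `x IN (...)`: a NULL input yields NULL, not False."""
    if x is None:
        return None
    return x in set(values)


def isin_filled(x: Optional[int], values: Iterable[int]) -> bool:
    """Same membership test, with a NULL result replaced by False."""
    result = sql_style_isin(x, values)
    return False if result is None else result


column = [1, None, 3]
print([sql_style_isin(v, [1, 2]) for v in column])  # [True, None, False]
print([isin_filled(v, [1, 2]) for v in column])     # [True, False, False]
```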

bigframes/core/compile/polars/compiler.py

Lines changed: 3 additions & 5 deletions
@@ -263,11 +263,9 @@ def _(self, op: ops.ScalarOp, l_input: pl.Expr, r_input: pl.Expr) -> pl.Expr:
         def _(self, op: ops.ScalarOp, input: pl.Expr) -> pl.Expr:
             # TODO: Filter out types that can't be coerced to right type
             assert isinstance(op, gen_ops.IsInOp)
-            if op.match_nulls or not any(map(pd.isna, op.values)):
-                # newer polars version have nulls_equal arg
-                return input.is_in(op.values)
-            else:
-                return input.is_in(op.values) or input.is_null()
+            assert not op.match_nulls  # should be stripped by a lowering step rn
+            values = pl.Series(op.values, strict=False)
+            return input.is_in(values)

         @compile_op.register(gen_ops.FillNaOp)
         @compile_op.register(gen_ops.CoalesceOp)

bigframes/core/compile/polars/lowering.py

Lines changed: 32 additions & 0 deletions
@@ -13,8 +13,10 @@
 # limitations under the License.

 import dataclasses
+from typing import cast

 import numpy as np
+import pandas as pd

 from bigframes import dtypes
 from bigframes.core import bigframe_node, expression
@@ -316,6 +318,35 @@ def lower(self, expr: expression.OpExpression) -> expression.Expression:
         return expr


+class LowerIsinOp(op_lowering.OpLoweringRule):
+    @property
+    def op(self) -> type[ops.ScalarOp]:
+        return generic_ops.IsInOp
+
+    def lower(self, expr: expression.OpExpression) -> expression.Expression:
+        assert isinstance(expr.op, generic_ops.IsInOp)
+        arg = expr.children[0]
+        new_values = []
+        match_nulls = False
+        for val in expr.op.values:
+            # coercible, non-coercible
+            # float NaN/inf should be treated as distinct from 'true' null values
+            if cast(bool, pd.isna(val)) and not isinstance(val, float):
+                if expr.op.match_nulls:
+                    match_nulls = True
+            elif dtypes.is_compatible(val, arg.output_type):
+                new_values.append(val)
+            else:
+                pass
+
+        new_isin = ops.IsInOp(tuple(new_values), match_nulls=False).as_expr(arg)
+        if match_nulls:
+            return ops.coalesce_op.as_expr(new_isin, expression.const(True))
+        else:
+            # polars propagates nulls, so need to coalesce to false
+            return ops.coalesce_op.as_expr(new_isin, expression.const(False))
+
+
 def _coerce_comparables(
     expr1: expression.Expression,
     expr2: expression.Expression,
@@ -414,6 +445,7 @@ def _lower_cast(cast_op: ops.AsTypeOp, arg: expression.Expression):
     LowerModRule(),
     LowerAsTypeRule(),
     LowerInvertOp(),
+    LowerIsinOp(),
 )


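
Reviewer note: the `LowerIsinOp` rule splits the value list into "null markers" and "usable values", then coalesces the null result of the membership test to either True or False. The following is a simplified plain-Python sketch of that idea, not the bigframes implementation. All names are hypothetical, `val is None` stands in for the `pd.isna(val) and not isinstance(val, float)` check, and `isinstance(val, expected_type)` stands in for `dtypes.is_compatible`.

```python
from typing import Any, List, Optional, Sequence, Tuple


def split_isin_values(
    values: Sequence[Any], match_nulls: bool, expected_type: type
) -> Tuple[List[Any], bool]:
    """Return (values to keep, whether a NULL input should count as a match)."""
    kept: List[Any] = []
    null_matches = False
    for val in values:
        if val is None:
            # A null in the value list only matters when match_nulls is set.
            null_matches = null_matches or match_nulls
        elif isinstance(val, expected_type):
            kept.append(val)
        # Values of an incompatible type are silently dropped.
    return kept, null_matches


def lowered_isin(
    x: Optional[Any], values: Sequence[Any], match_nulls: bool, expected_type: type
) -> bool:
    kept, null_matches = split_isin_values(values, match_nulls, expected_type)
    # Null-propagating membership test, as the polars comment in the diff describes.
    membership = None if x is None else x in kept
    # Coalesce: a null result becomes True only if nulls are meant to match.
    return null_matches if membership is None else membership


print(split_isin_values([1, "a", None], True, int))   # ([1], True)
print(lowered_isin(None, [1, None], True, int))       # True
print(lowered_isin(None, [1, None], False, int))      # False
print(lowered_isin(1, [1, "a"], False, int))          # True
```

Folding the null handling into a single coalesce step is what lets the polars compiler assert `not op.match_nulls` and call `is_in` directly.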
