
Commit b1e31e3

Merge branch 'main' into allow_large_results
2 parents 68f1a10 + 9af7130 commit b1e31e3

249 files changed

+11530
-2246
lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -46,4 +46,4 @@ repos:
4646
rev: v2.0.2
4747
hooks:
4848
- id: biome-check
49-
files: '\.js$'
49+
files: '\.(js|css)$'
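The effect of the widened `files` pattern can be checked with a short script. This is a minimal sketch, assuming the hook's `files` value is applied as a Python regular expression searched against each path; the sample paths are made up for illustration and are not taken from the repository.

```python
import re

# Pattern copied from the updated hook configuration.
FILES_PATTERN = re.compile(r"\.(js|css)$")

# Hypothetical paths, for illustration only.
paths = [
    "bigframes/display/table_widget.js",
    "bigframes/display/table_widget.css",
    "bigframes/display/anywidget.py",
    "docs/notes.css.bak",
]

# Keep only the paths the hook would be asked to check.
matched = [path for path in paths if FILES_PATTERN.search(path)]
print(matched)
```

The trailing `$` anchors the match to the end of the path, so a name such as `notes.css.bak` is not selected.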

CHANGELOG.md

Lines changed: 74 additions & 0 deletions
@@ -4,6 +4,80 @@
44

55
[1]: https://pypi.org/project/bigframes/#history
66

7+
## [2.15.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.14.0...v2.15.0) (2025-08-11)
8+
9+
10+
### Features
11+
12+
* Add `st_buffer`, `st_centroid`, and `st_convexhull` and their corresponding GeoSeries methods ([#1963](https://github.com/googleapis/python-bigquery-dataframes/issues/1963)) ([c4c7fa5](https://github.com/googleapis/python-bigquery-dataframes/commit/c4c7fa578e135e7f0e31ad3063db379514957acc))
13+
* Add first, last support to GroupBy ([#1969](https://github.com/googleapis/python-bigquery-dataframes/issues/1969)) ([41dda88](https://github.com/googleapis/python-bigquery-dataframes/commit/41dda889860c0ed8ca2eab81b34a9d71372c69f7))
14+
* Add value_counts to GroupBy classes ([#1974](https://github.com/googleapis/python-bigquery-dataframes/issues/1974)) ([82175a4](https://github.com/googleapis/python-bigquery-dataframes/commit/82175a4d0fa41d8aee11efdf8778a21bb70b1c0f))
15+
* Allow callable as a conditional or replacement input in DataFrame.where ([#1971](https://github.com/googleapis/python-bigquery-dataframes/issues/1971)) ([a8d57d2](https://github.com/googleapis/python-bigquery-dataframes/commit/a8d57d2f7075158eff69ec65a14c232756ab72a6))
16+
* Can cast locally in hybrid engine ([#1944](https://github.com/googleapis/python-bigquery-dataframes/issues/1944)) ([d9bc4a5](https://github.com/googleapis/python-bigquery-dataframes/commit/d9bc4a5940e9930d5e3c3bfffdadd2f91f96b53b))
17+
* Df.join lsuffix and rsuffix support ([#1857](https://github.com/googleapis/python-bigquery-dataframes/issues/1857)) ([26515c3](https://github.com/googleapis/python-bigquery-dataframes/commit/26515c34c4f0a5e4602d2f59bf229d41e0fc9196))
18+
19+
20+
### Bug Fixes
21+
22+
* Add warnings for duplicated or conflicting type hints in bigfram… ([#1956](https://github.com/googleapis/python-bigquery-dataframes/issues/1956)) ([d38e42c](https://github.com/googleapis/python-bigquery-dataframes/commit/d38e42ce689e65f57223e9a8b14c4262cba08966))
23+
* Make `remote_function` more robust when there are `create_function` retries ([#1973](https://github.com/googleapis/python-bigquery-dataframes/issues/1973)) ([cd954ac](https://github.com/googleapis/python-bigquery-dataframes/commit/cd954ac07ad5e5820a20b941d3c6cab7cfcc1f29))
24+
* Make ExecutionMetrics stats tracking more robust to missing stats ([#1977](https://github.com/googleapis/python-bigquery-dataframes/issues/1977)) ([feb3ff4](https://github.com/googleapis/python-bigquery-dataframes/commit/feb3ff4b543eb8acbf6adf335b67a266a1cf4297))
25+
26+
27+
### Performance Improvements
28+
29+
* Remove an unnecessary extra `dry_run` query from `read_gbq_table` ([#1972](https://github.com/googleapis/python-bigquery-dataframes/issues/1972)) ([d17b711](https://github.com/googleapis/python-bigquery-dataframes/commit/d17b711750d281ef3efd42c160f3784cd60021ae))
30+
31+
32+
### Documentation
33+
34+
* Divide BQ DataFrames quickstart code cell ([#1975](https://github.com/googleapis/python-bigquery-dataframes/issues/1975)) ([fedb8f2](https://github.com/googleapis/python-bigquery-dataframes/commit/fedb8f23120aa315c7e9dd6f1bf1255ccf1ebc48))
35+
36+
## [2.14.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.13.0...v2.14.0) (2025-08-05)
37+
38+
39+
### Features
40+
41+
* Dynamic table width for better display across devices ([#1948](https://github.com/googleapis/python-bigquery-dataframes/issues/1948)) ([a6d30ae](https://github.com/googleapis/python-bigquery-dataframes/commit/a6d30ae3f4358925c999c53b558c1ecd3ee03e6c))
42+
* Retry AI/ML jobs that fail more often ([#1965](https://github.com/googleapis/python-bigquery-dataframes/issues/1965)) ([25bde9f](https://github.com/googleapis/python-bigquery-dataframes/commit/25bde9f9b89112db0efcc119bf29b6d1f3896c33))
43+
* Support series input in managed function ([#1920](https://github.com/googleapis/python-bigquery-dataframes/issues/1920)) ([62a189f](https://github.com/googleapis/python-bigquery-dataframes/commit/62a189f4d69f6c05fe348a1acd1fbac364fa60b9))
44+
45+
46+
### Bug Fixes
47+
48+
* Enhance type error messages for bigframes functions ([#1958](https://github.com/googleapis/python-bigquery-dataframes/issues/1958)) ([770918e](https://github.com/googleapis/python-bigquery-dataframes/commit/770918e998bf1fde7a656e8f8a0ff0a8c68509f2))
49+
50+
51+
### Performance Improvements
52+
53+
* Use promote_offsets for consistent row number generation for index.get_loc ([#1957](https://github.com/googleapis/python-bigquery-dataframes/issues/1957)) ([c67a25a](https://github.com/googleapis/python-bigquery-dataframes/commit/c67a25a879ab2a35ca9053a81c9c85b5660206ae))
54+
55+
56+
### Documentation
57+
58+
* Add code snippet for storing dataframes to a CSV file ([#1943](https://github.com/googleapis/python-bigquery-dataframes/issues/1943)) ([a511e09](https://github.com/googleapis/python-bigquery-dataframes/commit/a511e09e6924d2e8302af2eb4a602c6b9e5d2d72))
59+
* Add code snippet for storing dataframes to a CSV file ([#1953](https://github.com/googleapis/python-bigquery-dataframes/issues/1953)) ([a298a02](https://github.com/googleapis/python-bigquery-dataframes/commit/a298a02b451f03ca200fe0756b9a7b57e3d1bf0e))
60+
61+
## [2.13.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.12.0...v2.13.0) (2025-07-25)
62+
63+
64+
### Features
65+
66+
* _read_gbq_colab creates hybrid session ([#1901](https://github.com/googleapis/python-bigquery-dataframes/issues/1901)) ([31b17b0](https://github.com/googleapis/python-bigquery-dataframes/commit/31b17b01706ccfcee9a2d838c43a9609ec4dc218))
67+
* Add CSS styling for TableWidget pagination interface ([#1934](https://github.com/googleapis/python-bigquery-dataframes/issues/1934)) ([5b232d7](https://github.com/googleapis/python-bigquery-dataframes/commit/5b232d7e33563196316f5dbb50b28c6be388d440))
68+
* Add row numbering local pushdown in hybrid execution ([#1932](https://github.com/googleapis/python-bigquery-dataframes/issues/1932)) ([92a2377](https://github.com/googleapis/python-bigquery-dataframes/commit/92a237712aa4ce516b1a44748127b34d7780fff6))
69+
* Implement Index.get_loc ([#1921](https://github.com/googleapis/python-bigquery-dataframes/issues/1921)) ([bbbcaf3](https://github.com/googleapis/python-bigquery-dataframes/commit/bbbcaf35df113617fd6bb8ae36468cf3f7ab493b))
70+
71+
72+
### Bug Fixes
73+
74+
* Add license header and correct issues in dbt sample ([#1931](https://github.com/googleapis/python-bigquery-dataframes/issues/1931)) ([ab01b0a](https://github.com/googleapis/python-bigquery-dataframes/commit/ab01b0a236ffc7b667f258e0497105ea5c3d3aab))
75+
76+
77+
### Dependencies
78+
79+
* Replace `google-cloud-iam` with `grpc-google-iam-v1` ([#1864](https://github.com/googleapis/python-bigquery-dataframes/issues/1864)) ([e5ff8f7](https://github.com/googleapis/python-bigquery-dataframes/commit/e5ff8f7d9fdac3ea47dabcc80a2598d601f39e64))
80+
781
## [2.12.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v2.11.0...v2.12.0) (2025-07-23)
882

983

GEMINI.md

Lines changed: 147 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,147 @@
1+
# Contribution guidelines, tailored for LLM agents
2+
3+
## Testing
4+
5+
We use `nox` to instrument our tests.
6+
7+
- To test your changes, run unit tests with `nox`:
8+
9+
```bash
10+
nox -r -s unit
11+
```
12+
13+
- To run a single unit test:
14+
15+
```bash
16+
nox -r -s unit-3.13 -- -k <name of test>
17+
```
18+
19+
- To run system tests, you can execute:
20+
21+
# Run all system tests
22+
$ nox -r -s system
23+
24+
# Run a single system test
25+
$ nox -r -s system-3.13 -- -k <name of test>
26+
27+
- The codebase must have better coverage than it had previously after each
28+
change. You can test coverage via `nox -s unit system cover` (takes a long
29+
time).
30+
31+
## Code Style
32+
33+
- We use the automatic code formatter `black`. You can run it using
34+
the nox session `format`. This will eliminate many lint errors. Run via:
35+
36+
```bash
37+
nox -r -s format
38+
```
39+
40+
- PEP8 compliance is required, with exceptions defined in the linter configuration.
41+
If you have `nox` installed, you can test that you have not introduced
42+
any non-compliant code via:
43+
44+
```bash
45+
nox -r -s lint
46+
```
47+
48+
- When writing tests, use the idiomatic "pytest" style.
49+
50+
## Documentation
51+
52+
If a method or property is implementing the same interface as a third-party
53+
package such as pandas or scikit-learn, place the relevant docstring in the
54+
corresponding `third_party/bigframes_vendored/package_name` directory, not in
55+
the `bigframes` directory. Implementations may be placed in the `bigframes`
56+
directory, though.
57+
58+
### Testing code samples
59+
60+
Code samples are very important for accurate documentation. We use the "doctest"
61+
framework to ensure the samples are functioning as expected. After adding a code
62+
sample, please ensure it is correct by running doctest. To run the doctest
62+
samples for just a single method, use a command like the following:
64+
65+
```bash
66+
pytest --doctest-modules bigframes/pandas/__init__.py::bigframes.pandas.cut
67+
```
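For reference, a docstring sample that doctest can verify looks roughly like the following. This is a sketch only: `clamp` is a hypothetical helper, not a BigFrames function, and the example runs the docstring through the standard-library `doctest` module directly instead of through `pytest`.

```python
import doctest


def clamp(value: int, low: int, high: int) -> int:
    """Limit ``value`` to the inclusive range [low, high].

    Hypothetical helper used only to illustrate the docstring format.

    **Examples:**

        >>> clamp(5, 0, 10)
        5
        >>> clamp(-3, 0, 10)
        0
        >>> clamp(42, 0, 10)
        10
    """
    return max(low, min(value, high))


# Parse the examples out of the docstring and execute them.
parser = doctest.DocTestParser()
test = parser.get_doctest(clamp.__doc__, {"clamp": clamp}, "clamp", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

Each `>>>` line is executed and its printed result is compared with the line that follows, so a sample that drifts from the implementation shows up as a failure.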
68+
69+
## Tips for implementing common BigFrames features
70+
71+
### Adding a scalar operator
72+
73+
For an example, see commit
74+
[c5b7fdae74a22e581f7705bc0cf5390e928f4425](https://github.com/googleapis/python-bigquery-dataframes/commit/c5b7fdae74a22e581f7705bc0cf5390e928f4425).
75+
76+
To add a new scalar operator, follow these steps:
77+
78+
1. **Define the operation dataclass:**
79+
- In `bigframes/operations/`, find the relevant file (e.g., `geo_ops.py` for geography functions) or create a new one.
80+
- Create a new dataclass inheriting from `base_ops.UnaryOp` for unary
81+
operators, `base_ops.BinaryOp` for binary operators, `base_ops.TernaryOp`
82+
for ternary operators, or `base_ops.NaryOp` for operators with many
83+
arguments. Note that these operators count the number of column-like
84+
arguments. A function that takes only a single column but several literal
85+
values would still be a `UnaryOp`.
86+
- Define the `name` of the operation and any parameters it requires.
87+
- Implement the `output_type` method to specify the data type of the result.
88+
89+
2. **Export the new operation:**
90+
- In `bigframes/operations/__init__.py`, import your new operation dataclass and add it to the `__all__` list.
91+
92+
3. **Implement the user-facing function (pandas-like):**
93+
94+
- Identify the canonical function from pandas / geopandas / awkward array /
95+
other popular Python package that this operator implements.
96+
- Find the corresponding class in BigFrames. For example, the implementation
97+
for most geopandas.GeoSeries methods is in
98+
`bigframes/geopandas/geoseries.py`. Pandas Series methods are implemented
99+
in `bigframes/series.py` or one of the accessors, such as `StringMethods`
100+
in `bigframes/operations/strings.py`.
101+
- Create the user-facing function that will be called by users (e.g., `length`).
102+
- If the SQL method differs from pandas or geopandas in a way that can't be
103+
made the same, raise a `NotImplementedError` with an appropriate message and
104+
link to the feedback form.
105+
- Add the docstring to the corresponding file in
106+
`third_party/bigframes_vendored`, modeled after pandas / geopandas.
107+
108+
4. **Implement the user-facing function (SQL-like):**
109+
110+
- In `bigframes/bigquery/_operations/`, find the relevant file (e.g., `geo.py`) or create a new one.
111+
- Create the user-facing function that will be called by users (e.g., `st_length`).
112+
- This function should take a `Series` for any column-like inputs, plus any other parameters.
113+
- Inside the function, call `series._apply_unary_op`,
114+
`series._apply_binary_op`, or similar, passing the operation dataclass you
115+
created.
116+
- Add a comprehensive docstring with examples.
117+
- In `bigframes/bigquery/__init__.py`, import your new user-facing function and add it to the `__all__` list.
118+
119+
5. **Implement the compilation logic:**
120+
- In `bigframes/core/compile/scalar_op_compiler.py`:
121+
- If the BigQuery function has a direct equivalent in Ibis, you can often reuse an existing Ibis method.
122+
- If not, define a new Ibis UDF using `@ibis_udf.scalar.builtin` to map to the specific BigQuery function signature.
123+
- Create a new compiler implementation function (e.g., `geo_length_op_impl`).
124+
- Register this function to your operation dataclass using `@scalar_op_compiler.register_unary_op` or `@scalar_op_compiler.register_binary_op`.
125+
- This implementation will translate the BigQuery DataFrames operation into the appropriate Ibis expression.
126+
127+
6. **Add Tests:**
128+
- Add system tests in the `tests/system/` directory to verify the end-to-end
129+
functionality of the new operator. Test various inputs, including edge cases
130+
and `NULL` values.
131+
132+
Where possible, run the same test code against pandas or GeoPandas and
133+
compare that the outputs are the same (except for dtypes if BigFrames
134+
differs from pandas).
135+
- If you are overriding a pandas or GeoPandas property, add a unit test to
136+
ensure the correct behavior (e.g., raising `NotImplementedError` if the
137+
functionality is not supported).
138+
139+
140+
## Constraints
141+
142+
- Only add git commits. Do not change git history.
143+
- Follow the spec file for development.
144+
- Check off items in the "Acceptance
145+
criteria" and "Detailed steps" sections with `[x]`.
146+
- Please do this as they are completed.
147+
- Refer back to the spec after each step.

MANIFEST.in

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@
1717
# Generated by synthtool. DO NOT EDIT!
1818
include README.rst LICENSE
1919
recursive-include third_party/bigframes_vendored *
20-
recursive-include bigframes *.json *.proto *.js py.typed
20+
recursive-include bigframes *.json *.proto *.js *.css py.typed
2121
recursive-include tests *
2222
global-exclude *.py[co]
2323
global-exclude __pycache__

bigframes/_config/display_options.py

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,8 +26,12 @@
2626
class DisplayOptions:
2727
__doc__ = vendored_pandas_config.display_options_doc
2828

29+
# Options borrowed from pandas.
2930
max_columns: int = 20
30-
max_rows: int = 25
31+
max_rows: int = 10
32+
precision: int = 6
33+
34+
# Options unique to BigQuery DataFrames.
3135
progress_bar: Optional[str] = "auto"
3236
repr_mode: Literal["head", "deferred", "anywidget"] = "head"
3337

@@ -52,6 +56,8 @@ def pandas_repr(display_options: DisplayOptions):
5256
display_options.max_columns,
5357
"display.max_rows",
5458
display_options.max_rows,
59+
"display.precision",
60+
display_options.precision,
5561
"display.show_dimensions",
5662
True,
5763
) as pandas_context:

bigframes/_importing.py

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
import importlib
1515
from types import ModuleType
1616

17+
import numpy
1718
from packaging import version
1819

1920
# Keep this in sync with setup.py
@@ -22,9 +23,13 @@
2223

2324
def import_polars() -> ModuleType:
2425
polars_module = importlib.import_module("polars")
25-
imported_version = version.Version(polars_module.build_info()["version"])
26-
if imported_version < POLARS_MIN_VERSION:
26+
# Check for necessary methods instead of the version number because we
27+
# can't trust the polars version until
28+
# https://github.com/pola-rs/polars/issues/23940 is fixed.
29+
try:
30+
polars_module.lit(numpy.int64(100), dtype=polars_module.Int64())
31+
except TypeError:
2732
raise ImportError(
28-
f"Imported polars version: {imported_version} is below the minimum version: {POLARS_MIN_VERSION}"
33+
f"Imported polars version is likely below the minimum version: {POLARS_MIN_VERSION}"
2934
)
3035
return polars_module
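
The change above swaps a version comparison for a capability probe: call the feature you need and treat a failure as "too old". The same idea can be sketched generically with the standard library. `import_with_probe` is a hypothetical helper written for this illustration, and the sketch uses the `math` module as a stand-in so that it runs without polars installed; it also catches `AttributeError` in addition to the `TypeError` handled in the diff, which is an assumption about what a missing feature might raise.

```python
import importlib
from types import ModuleType
from typing import Callable


def import_with_probe(
    module_name: str, probe: Callable[[ModuleType], object], hint: str
) -> ModuleType:
    """Import a module and verify a capability by exercising it."""
    module = importlib.import_module(module_name)
    try:
        probe(module)
    except (AttributeError, TypeError) as exc:
        raise ImportError(
            f"Imported {module_name} lacks a required feature: {hint}"
        ) from exc
    return module


# Probe that passes: math.prod exists on Python 3.8 and later.
math_module = import_with_probe("math", lambda m: m.prod([2, 3]), "math.prod")

# Probe that fails: the attribute below is deliberately nonexistent.
try:
    import_with_probe("math", lambda m: m.no_such_function(), "no_such_function")
except ImportError as exc:
    error_message = str(exc)

print(error_message)
```

A probe trades precision for robustness: it cannot report which version is installed, which is why the new error message in the diff says the version is "likely" below the minimum.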

bigframes/bigquery/__init__.py

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,9 @@
2929
)
3030
from bigframes.bigquery._operations.geo import (
3131
st_area,
32+
st_buffer,
33+
st_centroid,
34+
st_convexhull,
3235
st_difference,
3336
st_distance,
3437
st_intersection,
@@ -54,11 +57,18 @@
5457
# approximate aggregate ops
5558
"approx_top_count",
5659
# array ops
57-
"array_length",
5860
"array_agg",
61+
"array_length",
5962
"array_to_string",
63+
# datetime ops
64+
"unix_micros",
65+
"unix_millis",
66+
"unix_seconds",
6067
# geo ops
6168
"st_area",
69+
"st_buffer",
70+
"st_centroid",
71+
"st_convexhull",
6272
"st_difference",
6373
"st_distance",
6474
"st_intersection",
@@ -81,8 +91,4 @@
8191
"sql_scalar",
8292
# struct ops
8393
"struct",
84-
# datetime ops
85-
"unix_micros",
86-
"unix_millis",
87-
"unix_seconds",
8894
]
