Releases: googleapis/python-bigquery-dataframes

v1.10.0

25 Jun 23:08
2e692e9

1.10.0 (2024-06-21)

Features

  • Add dataframe.insert (#770) (e8bab68)
  • Add groupby head API (#791) (44202bc)
  • Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
  • bigframes.streaming module for continuous queries (#703) (0433a1c)
  • Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
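
As an illustration of the groupby head entry above, the following pure-Python sketch shows the semantics pandas documents for GroupBy.head: keep the first n rows of each group, in their original order. It assumes the bigframes API mirrors pandas here; the helper name groupby_head and the sample rows are hypothetical and are not part of bigframes.

```python
from collections import defaultdict


def groupby_head(rows, key, n=5):
    """Keep the first n rows seen for each distinct value of row[key]."""
    seen = defaultdict(int)  # group value -> number of rows kept so far
    kept = []
    for row in rows:
        group = row[key]
        if seen[group] < n:
            kept.append(row)
            seen[group] += 1
    return kept


# Hypothetical sample data, used only for this sketch.
rows = [
    {"team": "a", "score": 1},
    {"team": "b", "score": 2},
    {"team": "a", "score": 3},
    {"team": "a", "score": 4},
    {"team": "b", "score": 5},
]
first_two = groupby_head(rows, key="team", n=2)
print([row["score"] for row in first_two])
```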

Bug Fixes

  • Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
  • df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
  • Ensure numpy version matches in remote_function deployment (#798) (324d93c)
  • Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
  • Self-join optimization doesn't needlessly invalidate caching (#797) (1b96b80)

v1.9.0

10 Jun 22:39
b7b134e

1.9.0 (2024-06-10)

Features

  • Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
  • Support bigquery.vector_search() (#736) (dad66fd)
  • Support score() in GeminiTextGenerator (#740) (b2c7d8b)
  • Support bytes type in remote_function (#761) (4915424)
  • Support fit() in GeminiTextGenerator (#758) (d751f5c)

Bug Fixes

  • ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
  • Improve to_pandas_batches for large results (#746) (61f18cb)
  • Resolve issue with unset thread-local options (#741) (d93dbaf)

Documentation

  • Fix ML.EVALUATE spelling (#749) (7899749)
  • Remove LogisticRegression normal_equation strategy (#753) (ea5d367)

v1.8.0

03 Jun 17:11
b5a3928

1.8.0 (2024-05-31)

Features

  • merge only generates a default index if both inputs already have an index (#733) (25d049c)
  • Add +, - as unary ops, ^ binary op (#724) (968d825)
  • Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
  • Add DataFrame ~ operator (#721) (354abc1)
  • Add GeminiText 1.5 Preview models (#737) (56cbd3b)
  • Add slot_millis and add stats to session object (#725) (72e9583)
  • Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
  • Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
  • Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
  • Support ml.SimpleImputer in bigframes (#708) (4c4415f)
  • Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
  • Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
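
The type-annotation entries above mean that a decorator can read input and output types from the function signature instead of taking them as explicit arguments. The sketch below shows that general idea with the standard typing module. It is not the bigframes implementation; infer_signature and add_one are hypothetical names used only for illustration.

```python
import typing


def infer_signature(func):
    """Return (input_types, output_type) read from a function's annotations."""
    hints = typing.get_type_hints(func)
    # The "return" entry is the output type; it is None if not annotated.
    output_type = hints.pop("return", None)
    # The remaining entries are the annotated parameters, in declaration order.
    input_types = list(hints.values())
    return input_types, output_type


def add_one(x: int) -> float:
    return x + 1.0


input_types, output_type = infer_signature(add_one)
print(input_types, output_type)
```

A decorator built on this could fall back to explicit input_types and output_types arguments whenever the annotations are missing, which is why defaulting those arguments to None (see the bug fix for #729) makes them safe to omit.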

Bug Fixes

  • Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
  • Fix Null index assign series to column (#711) (ffb4b57)
  • Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
  • Warn and disable time travel for linked datasets (#712) (085fa9d)

Performance Improvements

  • Optimize dataframe-series alignment on axis=1 (#732) (3d39221)

Documentation

  • Add examples to DataFrameGroupBy and SeriesGroupBy (#701) (e7da0f0)

v1.7.0

21 May 14:00
f89b6be

1.7.0 (2024-05-20)

Features

  • read_gbq_query supports filters (9386373)
  • read_gbq suggests a correct column name when one is not found (9386373)
  • Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
  • bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
  • to_datetime supports utc=False for string inputs (#579) (adf9889)

Bug Fixes

  • read_gbq_table respects primary keys even when filters are set (#689) (9386373)
  • Fix type error in test_cluster (#698) (14d81c1)
  • Improve escaping of literals and identifiers (#682) (da9b136)
  • Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
  • Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
  • The imported samples error and use peek() (#688) (1a0b744)

Performance Improvements

  • Don't run query immediately from read_gbq_table if filters is set (9386373)
  • Use a LIMIT clause when max_results is set (9386373)

Documentation

  • Add code snippets for imported onnx tutorials (#684) (cb36e46)
  • Add code snippets for imported tensorflow model (#679) (b02c401)
  • Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)

v1.6.0

13 May 21:27
0b8b827

1.6.0 (2024-05-13)

Features

  • Add DataFrame.__delitem__ (#673) (2218c21)
  • Add Series.case_when() (#673) (2218c21)
  • Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
  • Add Series.combine (#680) (2fd1b81)
  • Series.str.split (#675) (6eb19a7)
  • Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
  • Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
  • Support gcf vpc connector in remote_function (#677) (9ca92d0)
  • Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)
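
For the strategy="quantile" entry above, quantile binning places cut points so that each bin holds roughly the same number of values, rather than spanning the same width. The sketch below shows the idea with the standard library only. It is not the KBinsDiscretizer implementation, and the exact bin edges that bigframes computes may differ; quantile_bins is a hypothetical helper.

```python
import bisect
import statistics


def quantile_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using quantile cut points."""
    # statistics.quantiles returns the n_bins - 1 interior cut points.
    edges = statistics.quantiles(values, n=n_bins, method="inclusive")
    # bisect_right counts how many cut points are <= the value.
    return [bisect.bisect_right(edges, v) for v in values]


bins = quantile_bins([1, 2, 3, 4, 5, 6, 7, 8], n_bins=4)
print(bins)
```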

Bug Fixes

  • Include index_col when selecting columns and filters in read_gbq_table (#648) (e084e54)

Dependencies

  • Add jellyfish as a dependency for spelling correction (57ccabc)
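
The spelling-correction feature above suggests a valid option when a mistyped one is given. The release uses jellyfish for this; the sketch below swaps in difflib from the standard library to show the same general technique without a third-party dependency. The location list is an illustrative subset and suggest_location is a hypothetical helper, not a bigframes function.

```python
import difflib

# Illustrative subset of location names; not the list bigframes uses.
KNOWN_LOCATIONS = ["US", "EU", "us-central1", "europe-west1", "asia-northeast1"]


def suggest_location(value, known=KNOWN_LOCATIONS):
    """Return the closest known location, or None if nothing is similar."""
    matches = difflib.get_close_matches(value, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


print(suggest_location("us-centrall1"))
```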

Documentation

  • Add code snippets for llm text generation (#669) (93416ed)
  • Add logistic regression samples (#673) (2218c21)
  • Address lint errors in code samples (#665) (4fc8964)
  • Document inlining of small data in read_* APIs (#670) (306953a)

v1.5.0

07 May 06:46
ff23b18

1.5.0 (2024-05-07)

Features

  • bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
  • Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
  • Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
  • Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
  • Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
  • Custom query labels for compute options (#638) (f561799)
  • Raise NoDefaultIndexError from read_gbq on clustered/partitioned tables with no index_col or filters set (#631) (73064dd)
  • Support index_col=False in read_csv and engine="bigquery" (73064dd)
  • Support gcf max instance count in remote_function (#657) (36578ab)
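
The thread-local options change above can be pictured with a small standard-library sketch: an options object derived from threading.local gives every thread its own copy, so a context manager that overrides a value in one thread is invisible to the others. This is not the bigframes implementation. The Options class, the option_context helper, and the "head" default are assumptions made for illustration; only the repr_mode name and the "deferred" value come from these notes.

```python
import contextlib
import threading


class Options(threading.local):
    """Each thread that touches an instance gets its own attribute values."""

    def __init__(self):
        self.repr_mode = "head"  # assumed default, for illustration only


options = Options()


@contextlib.contextmanager
def option_context(**overrides):
    """Temporarily override options for the current thread only."""
    previous = {name: getattr(options, name) for name in overrides}
    for name, value in overrides.items():
        setattr(options, name, value)
    try:
        yield
    finally:
        for name, value in previous.items():
            setattr(options, name, value)


def read_repr_mode_in_worker():
    """Read options.repr_mode from a separate thread."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(options.repr_mode))
    worker.start()
    worker.join()
    return seen[0]


with option_context(repr_mode="deferred"):
    inside_main = options.repr_mode
    inside_worker = read_repr_mode_in_worker()

print(inside_main, inside_worker, options.repr_mode)
```

The worker thread still sees the default while the main thread is inside the context manager, and the main thread's value is restored on exit.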

Bug Fixes

  • Don't raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
  • Downgrade NoDefaultIndexError to DefaultIndexWarning (#658) (2715d2b)
  • Fix bug with na in the column labels in stack (#659) (4a34293)
  • Use explicit session in PaLM2TextGenerator (#651) (e4f13c3)

Documentation

  • Add python code sample for multiple forecasting time series (#531) (16866d2)
  • Fix the Palm2TextGenerator output token size (#649) (c67e501)

v1.4.0

30 Apr 00:11
ac8f40c

1.4.0 (2024-04-29)

Features

  • Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
  • Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
  • Allow single input type in remote_function (#641) (3aa643f)
  • Expose gcf max timeout in remote_function (#639) (dfeaad0)
  • Series binary ops compatible with more types (#618) (518d315)
  • Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)

Bug Fixes

Performance Improvements

  • Automatically condense internal expression representation (#516) (03c1b0d)
  • Cache transpose to allow performant retranspose (#635) (44b738d)

Documentation

  • Add supported pandas apis on the main page (#628) (8d2a51c)
  • Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
  • Address more technical writers' feedback (#640) (1e7793c)

v1.3.0

22 Apr 23:16
7227a6a

1.3.0 (2024-04-22)

Features

  • Add Series.struct.dtypes property (#599) (d924ec2)
  • Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
  • Add quantile statistic (#613) (bc82804)
  • Expose max_batching_rows in remote_function (#622) (240a1ac)
  • Support primary key(s) in read_gbq by using as the index_col by default (#625) (75bb240)
  • Warn if location is set to unknown location (#609) (3706b4f)

Bug Fixes

  • Address technical writers' feedback (#611) (9f8f181)
  • Infer narrowest numeric type when combining numeric columns (#602) (8f9ece6)
  • Use exact median implementation by default (#619) (9d205ae)

Documentation

  • Fix rendering of examples for multiple apis (#620) (9665e39)
  • Set index_cols in read_gbq as a best practice (#624) (70015b7)

v1.2.0

16 Apr 17:08
458bfb2

1.2.0 (2024-04-15)

Features

Bug Fixes

  • Address more technical writers' feedback (#581) (4b08d92)
  • Error for object dtype on read_pandas (#570) (8702dcf)
  • Inverting int now does bitwise inversion rather than sign flip (#574) (5f1db8b)
  • loc setitem dtype issue. (#603) (b94bae9)
  • TOC menu missing plotting name (#591) (eed12c1)
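
The integer inversion fix above changes results for every input, which plain Python integers make easy to see. The sketch below only illustrates the difference between bitwise inversion and a sign flip; it does not call bigframes.

```python
values = [0, 1, 5, -3]

# Bitwise inversion (two's complement): ~v equals -v - 1.
bitwise = [~v for v in values]

# Sign flip, which is what a plain negation gives.
negated = [-v for v in values]

print(bitwise)  # [-1, -2, -6, 2]
print(negated)  # [0, -1, -5, 3]
```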

Documentation

v1.1.0

04 Apr 23:00
8add6b1

1.1.0 (2024-04-04)

Features

  • (Series|DataFrame).explode (#556) (9e32f57)
  • Add DataFrame.eval and DataFrame.query (#361) (5e28ebd)
  • Add ColumnTransformer save/load (#541) (9d8cf67)
  • Add ml.metrics.mean_squared_error (#559) (853c25e)
  • Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
  • Add transformers save/load (#552) (d805241)
  • Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
  • Expose DataFrame.bqclient to assist in integrations (#519) (0be8911)
  • read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
  • Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator (#539) (1156c1e)
  • Support max_columns in repr and make repr more efficient (#515) (54e49cf)

Bug Fixes

  • Assign NaN scalar to column error. (#513) (0a4153c)
  • Don't download 100gb onto local python machine in load test (#537) (082c58b)
  • Exclude list-like s parameter in plot.scatter (#568) (1caac27)
  • Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
  • Fix error in Series.drop(0) (#575) (75dd786)
  • Include all names in MultiIndex repr (#564) (b188146)
  • plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
  • Product operation produces float result for all input types (#501) (6873b30)
  • Reloaded transformer .transform error (#569) (39fe474)
  • Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
  • Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
  • Restore string to date/time type coercion (#565) (4ae0262)
  • Sync the notebook with embedding changes (#550) (347f2dd)
  • Use bytes limit on frame inlining rather than element count (#576) (659a161)

Performance Improvements

  • Add multi-query execution capability for complex dataframes (#427) (d2d7e33)

Dependencies

Documentation

  • bigframes.options.bigquery.project and location are optional in some circumstances (#548) (90bcec5)
  • Add "Supported pandas APIs" reference to the documentation (#542) (74c3915)
  • Add General Availability banner to README (#507) (262ff59)
  • Add operations in API docs (#557) (ea95761)
  • Add progress_bar code sample (#508) (92a1af3)
  • Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
  • Address more comments from technical writers to meet legal purposes (#571) (9084df3)
  • Fix docs of ARIMAPlus.predict (#512) (3b80f95)
  • Include Index in table-of-contents (#564) (b188146)
  • Mark Gemini model as Pre-GA (#543) (769868b)
  • Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)