Commit 2f4998f

Merge remote-tracking branch 'upstream/main' into ref/index_equiv
2 parents 76058f2 + ded256d commit 2f4998f

121 files changed

+1353
-1756
lines changed

Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -11,4 +11,5 @@ RUN apt-get install -y libhdf5-dev libgles2-mesa-dev
 RUN python -m pip install --upgrade pip
 COPY requirements-dev.txt /tmp
 RUN python -m pip install -r /tmp/requirements-dev.txt
+RUN git config --global --add safe.directory /home/pandas
 CMD ["/bin/bash"]

asv_bench/benchmarks/categoricals.py

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ def setup(self):
         )
 
         for col in ("int", "float", "timestamp"):
-            self.df[col + "_as_str"] = self.df[col].astype(str)
+            self.df[f"{col}_as_str"] = self.df[col].astype(str)
 
         for col in self.df.columns:
             self.df[col] = self.df[col].astype("category")

asv_bench/benchmarks/join_merge.py

Lines changed: 17 additions & 0 deletions
@@ -328,6 +328,23 @@ def time_i8merge(self, how):
         merge(self.left, self.right, how=how)
 
 
+class UniqueMerge:
+    params = [4_000_000, 1_000_000]
+    param_names = ["unique_elements"]
+
+    def setup(self, unique_elements):
+        N = 1_000_000
+        self.left = DataFrame({"a": np.random.randint(1, unique_elements, (N,))})
+        self.right = DataFrame({"a": np.random.randint(1, unique_elements, (N,))})
+        uniques = self.right.a.drop_duplicates()
+        self.right["a"] = concat(
+            [uniques, Series(np.arange(0, -(N - len(uniques)), -1))], ignore_index=True
+        )
+
+    def time_unique_merge(self, unique_elements):
+        merge(self.left, self.right, how="inner")
+
+
 class MergeDatetime:
     params = [
         [
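
The `UniqueMerge.setup` method added in this hunk appears to pad the deduplicated right-hand keys with non-positive integers, so that `right["a"]` ends up with exactly `N` distinct values while `left["a"]` keeps its duplicates. A scaled-down sketch of that construction follows; the smaller `N`, the seeded `default_rng` generator, and the `padding` name are assumptions made for illustration and are not part of the commit.

```python
import numpy as np
import pandas as pd

N = 1_000                 # assumption: much smaller than the benchmark's 1_000_000
unique_elements = 4_000   # assumption: scaled down from the benchmark's params
rng = np.random.default_rng(0)  # assumption: seeded generator for reproducibility

left = pd.DataFrame({"a": rng.integers(1, unique_elements, N)})
right = pd.DataFrame({"a": rng.integers(1, unique_elements, N)})

# Keep each right-hand key once, then fill the remaining rows with
# 0, -1, -2, ... which cannot collide with the positive random keys.
uniques = right["a"].drop_duplicates()
padding = pd.Series(np.arange(0, -(N - len(uniques)), -1))
right["a"] = pd.concat([uniques, padding], ignore_index=True)

# Inner merge on the shared column "a"; every left row matches at most one right row.
merged = pd.merge(left, right, how="inner")
print(len(uniques), len(merged))
```

Because the right-hand key column is unique, the inner merge can never produce more rows than `left` contains, which is the case the benchmark seems designed to time.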

asv_bench/benchmarks/timeseries.py

Lines changed: 1 addition & 1 deletion
@@ -183,7 +183,7 @@ def setup(self):
         self.dt_ts = Series(5, rng3, dtype="datetime64[ns]")
 
     def time_resample(self):
-        self.dt_ts.resample("1S").last()
+        self.dt_ts.resample("1s").last()
 
 
 class AsOf:
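
The hunk above swaps the `"1S"` frequency string for the lowercase `"1s"`, presumably to match the alias spelling preferred by newer pandas releases. A minimal sketch of a one-second resample with the lowercase alias is shown below; the half-second sample data is an assumption chosen only to make the bins easy to follow.

```python
import pandas as pd

# Six points spaced 500 ms apart, so each one-second bin holds two values.
index = pd.date_range("2000-01-01", periods=6, freq="500ms")
ts = pd.Series(range(6), index=index)

# Downsample to one-second bins and keep the last value in each bin.
out = ts.resample("1s").last()
print(out)
```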

doc/source/_static/css/getting_started.css

Lines changed: 2 additions & 1 deletion
@@ -248,6 +248,7 @@ ul.task-bullet > li > p:first-child {
 }
 
 .tutorial-card .card-header {
+  --bs-card-cap-color: var(--pst-color-text-base);
   cursor: pointer;
   background-color: var(--pst-color-surface);
   border: 1px solid var(--pst-color-border)
@@ -269,7 +270,7 @@ ul.task-bullet > li > p:first-child {
 
 
 .tutorial-card .gs-badge-link a {
-  color: var(--pst-color-text-base);
+  color: var(--pst-color-primary-text);
   text-decoration: none;
 }
 

doc/source/development/contributing_docstring.rst

Lines changed: 1 addition & 1 deletion
@@ -940,7 +940,7 @@ Finally, docstrings can also be appended to with the ``doc`` decorator.
 
 In this example, we'll create a parent docstring normally (this is like
 ``pandas.core.generic.NDFrame``). Then we'll have two children (like
-``pandas.core.series.Series`` and ``pandas.core.frame.DataFrame``). We'll
+``pandas.core.series.Series`` and ``pandas.DataFrame``). We'll
 substitute the class names in this docstring.
 
 .. code-block:: python

doc/source/development/maintaining.rst

Lines changed: 2 additions & 2 deletions
@@ -151,15 +151,15 @@ and then run::
     git bisect start
     git bisect good v1.4.0
     git bisect bad v1.5.0
-    git bisect run bash -c "python setup.py build_ext -j 4; python t.py"
+    git bisect run bash -c "python -m pip install -ve . --no-build-isolation --config-settings editable-verbose=true; python t.py"
 
 This finds the first commit that changed the behavior. The C extensions have to be
 rebuilt at every step, so the search can take a while.
 
 Exit bisect and rebuild the current version::
 
     git bisect reset
-    python setup.py build_ext -j 4
+    python -m pip install -ve . --no-build-isolation --config-settings editable-verbose=true
 
 Report your findings under the corresponding issue and ping the commit author to get
 their input.

doc/source/user_guide/enhancingperf.rst

Lines changed: 1 addition & 1 deletion
@@ -453,7 +453,7 @@ by evaluate arithmetic and boolean expression all at once for large :class:`~pan
 :func:`~pandas.eval` is many orders of magnitude slower for
 smaller expressions or objects than plain Python. A good rule of thumb is
 to only use :func:`~pandas.eval` when you have a
-:class:`.DataFrame` with more than 10,000 rows.
+:class:`~pandas.core.frame.DataFrame` with more than 10,000 rows.
 
 Supported syntax
 ~~~~~~~~~~~~~~~~
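
For context on the rule of thumb quoted in this hunk, here is a minimal sketch comparing a plain arithmetic expression with the same expression passed to `pandas.eval`. The frame size of 20,000 rows and the column names `a` and `b` are assumptions for illustration; the sketch checks only that both paths give the same result and makes no timing claim.

```python
import numpy as np
import pandas as pd

# Assumption: a frame just above the 10,000-row threshold mentioned in the text.
df = pd.DataFrame({"a": np.arange(20_000), "b": np.arange(20_000)})

# Plain pandas arithmetic, evaluated eagerly.
plain = df["a"] + df["b"]

# The same expression handed to pandas.eval as a string.
evaluated = pd.eval("df.a + df.b")

print(evaluated.equals(plain))
```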

doc/source/user_guide/io.rst

Lines changed: 1 addition & 1 deletion
@@ -6400,7 +6400,7 @@ ignored.
 In [2]: df = pd.DataFrame({'A': np.random.randn(sz), 'B': [1] * sz})
 
 In [3]: df.info()
-<class 'pandas.core.frame.DataFrame'>
+<class 'pandas.DataFrame'>
 RangeIndex: 1000000 entries, 0 to 999999
 Data columns (total 2 columns):
 A 1000000 non-null float64

doc/source/user_guide/missing_data.rst

Lines changed: 4 additions & 4 deletions
@@ -88,7 +88,7 @@ To detect these missing value, use the :func:`isna` or :func:`notna` methods.
 
 .. warning::
 
-   Experimental: the behaviour of :class:`NA`` can still change without warning.
+   Experimental: the behaviour of :class:`NA` can still change without warning.
 
 Starting from pandas 1.0, an experimental :class:`NA` value (singleton) is
 available to represent scalar missing values. The goal of :class:`NA` is provide a
@@ -105,7 +105,7 @@ dtype, it will use :class:`NA`:
     s[2]
     s[2] is pd.NA
 
-Currently, pandas does not yet use those data types using :class:`NA` by default
+Currently, pandas does not use those data types using :class:`NA` by default in
 a :class:`DataFrame` or :class:`Series`, so you need to specify
 the dtype explicitly. An easy way to convert to those dtypes is explained in the
 :ref:`conversion section <missing_data.NA.conversion>`.
@@ -253,8 +253,8 @@ Conversion
 ^^^^^^^^^^
 
 If you have a :class:`DataFrame` or :class:`Series` using ``np.nan``,
-:meth:`Series.convert_dtypes` and :meth:`DataFrame.convert_dtypes`
-in :class:`DataFrame` that can convert data to use the data types that use :class:`NA`
+:meth:`DataFrame.convert_dtypes` and :meth:`Series.convert_dtypes`, respectively,
+will convert your data to use the nullable data types supporting :class:`NA`,
 such as :class:`Int64Dtype` or :class:`ArrowDtype`. This is especially helpful after reading
 in data sets from IO methods where data types were inferred.
