Commit bf40bb5

Merge branch 'main' into bug#60723
2 parents d53dc0a + 2030d9d commit bf40bb5

12 files changed, 59 additions and 20 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -13,5 +13,5 @@ COPY requirements-dev.txt /tmp
 RUN python -m pip install -r /tmp/requirements-dev.txt
 RUN git config --global --add safe.directory /home/pandas

-ENV SHELL "/bin/bash"
+ENV SHELL="/bin/bash"
 CMD ["/bin/bash"]

doc/source/getting_started/overview.rst

Lines changed: 1 addition & 0 deletions
@@ -174,3 +174,4 @@ License
 -------

 .. literalinclude:: ../../../LICENSE
+   :language: none

doc/source/user_guide/io.rst

Lines changed: 2 additions & 2 deletions
@@ -18,10 +18,10 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     :widths: 30, 100, 60, 60

     text,`CSV <https://en.wikipedia.org/wiki/Comma-separated_values>`__, :ref:`read_csv<io.read_csv_table>`, :ref:`to_csv<io.store_in_csv>`
-    text,Fixed-Width Text File, :ref:`read_fwf<io.fwf_reader>` , NA
+    text,Fixed-Width Text File, :ref:`read_fwf<io.fwf_reader>`, NA
     text,`JSON <https://www.json.org/>`__, :ref:`read_json<io.json_reader>`, :ref:`to_json<io.json_writer>`
     text,`HTML <https://en.wikipedia.org/wiki/HTML>`__, :ref:`read_html<io.read_html>`, :ref:`to_html<io.html>`
-    text,`LaTeX <https://en.wikipedia.org/wiki/LaTeX>`__, :ref:`Styler.to_latex<io.latex>` , NA
+    text,`LaTeX <https://en.wikipedia.org/wiki/LaTeX>`__, NA, :ref:`Styler.to_latex<io.latex>`
     text,`XML <https://www.w3.org/standards/xml/core>`__, :ref:`read_xml<io.read_xml>`, :ref:`to_xml<io.xml>`
     text, Local clipboard, :ref:`read_clipboard<io.clipboard>`, :ref:`to_clipboard<io.clipboard>`
     binary,`MS Excel <https://en.wikipedia.org/wiki/Microsoft_Excel>`__ , :ref:`read_excel<io.excel_reader>`, :ref:`to_excel<io.excel_writer>`

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -718,6 +718,7 @@ I/O
 ^^^
 - Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping`` elements. (:issue:`57915`)
 - Bug in :meth:`.DataFrame.to_json` when ``"index"`` was a value in the :attr:`DataFrame.column` and :attr:`Index.name` was ``None``. Now, this will fail with a ``ValueError`` (:issue:`58925`)
+- Bug in :meth:`.io.common.is_fsspec_url` not recognizing chained fsspec URLs (:issue:`48978`)
 - Bug in :meth:`DataFrame._repr_html_` which ignored the ``"display.float_format"`` option (:issue:`59876`)
 - Bug in :meth:`DataFrame.from_records` where ``columns`` parameter with numpy structured array was not reordering and filtering out the columns (:issue:`59717`)
 - Bug in :meth:`DataFrame.to_dict` raises unnecessary ``UserWarning`` when columns are not unique and ``orient='tight'``. (:issue:`58281`)

pandas/_libs/tslibs/timedeltas.pyx

Lines changed: 2 additions & 1 deletion
@@ -1740,7 +1740,8 @@ cdef class _Timedelta(timedelta):
         Format the Timedelta as ISO 8601 Duration.

         ``P[n]Y[n]M[n]DT[n]H[n]M[n]S``, where the ``[n]`` s are replaced by the
-        values. See https://en.wikipedia.org/wiki/ISO_8601#Durations.
+        values. See Wikipedia:
+        `ISO 8601 § Durations <https://en.wikipedia.org/wiki/ISO_8601#Durations>`_.

         Returns
         -------
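
For illustration, the ``P[n]DT[n]H[n]M[n]S`` layout named in this docstring can be sketched with the standard library alone. The helper name `iso8601_duration` is hypothetical; this is not the pandas implementation, and it assumes a non-negative `datetime.timedelta` at microsecond resolution.

```python
from datetime import timedelta


def iso8601_duration(td: timedelta) -> str:
    """Hypothetical helper: render a timedelta as P[n]DT[n]H[n]M[n]S."""
    if td < timedelta(0):
        # Negative durations are out of scope for this sketch.
        raise ValueError("negative durations are not handled here")

    # timedelta normalizes itself to days, seconds (< 86400), microseconds.
    hours, remainder = divmod(td.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)

    seconds_text = str(seconds)
    if td.microseconds:
        # Append the fractional part and drop trailing zeros.
        seconds_text += "." + f"{td.microseconds:06d}".rstrip("0")

    return f"P{td.days}DT{hours}H{minutes}M{seconds_text}S"


print(iso8601_duration(timedelta(days=6, minutes=50, seconds=3)))
```

Days are never folded into months or years here, because those units have no fixed length in seconds.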

pandas/_libs/tslibs/timestamps.pyx

Lines changed: 1 addition & 1 deletion
@@ -1309,7 +1309,7 @@ cdef class _Timestamp(ABCTimestamp):
         By default, the fractional part is omitted if self.microsecond == 0
         and self._nanosecond == 0.

-        If self.tzinfo is not None, the UTC offset is also attached, giving
+        If self.tzinfo is not None, the UTC offset is also attached,
         giving a full format of 'YYYY-MM-DD HH:MM:SS.mmmmmmnnn+HH:MM'.

         Parameters

pandas/core/frame.py

Lines changed: 24 additions & 1 deletion
@@ -5880,6 +5880,8 @@ def set_index(
             Delete columns to be used as the new index.
         append : bool, default False
             Whether to append columns to existing index.
+            Setting to True will add the new columns to existing index.
+            When set to False, the current index will be dropped from the DataFrame.
         inplace : bool, default False
             Whether to modify the DataFrame rather than creating a new one.
         verify_integrity : bool, default False
@@ -5953,6 +5955,25 @@ def set_index(
         2 4       4  2014    40
         3 9       7  2013    84
         4 16     10  2014    31
+
+        Append a column to the existing index:
+
+        >>> df = df.set_index("month")
+        >>> df.set_index("year", append=True)
+                    sale
+        month year
+        1     2012    55
+        4     2014    40
+        7     2013    84
+        10    2014    31
+
+        >>> df.set_index("year", append=False)
+              sale
+        year
+        2012    55
+        2014    40
+        2013    84
+        2014    31
         """
         inplace = validate_bool_kwarg(inplace, "inplace")
         self._check_inplace_and_allows_duplicate_labels(inplace)
@@ -10265,7 +10286,9 @@ def apply(
         either the DataFrame's index (``axis=0``) or the DataFrame's columns
         (``axis=1``). By default (``result_type=None``), the final return type
         is inferred from the return type of the applied function. Otherwise,
-        it depends on the `result_type` argument.
+        it depends on the `result_type` argument. The return type of the applied
+        function is inferred based on the first computed result obtained after
+        applying the function to a Series object.

         Parameters
         ----------

pandas/io/common.py

Lines changed: 2 additions & 2 deletions
@@ -71,7 +71,7 @@

 _VALID_URLS = set(uses_relative + uses_netloc + uses_params)
 _VALID_URLS.discard("")
-_RFC_3986_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9+\-+.]*://")
+_FSSPEC_URL_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9+\-+.]*(::[A-Za-z0-9+\-+.]+)*://")

 BaseBufferT = TypeVar("BaseBufferT", bound=BaseBuffer)

@@ -291,7 +291,7 @@ def is_fsspec_url(url: FilePath | BaseBuffer) -> bool:
     """
     return (
         isinstance(url, str)
-        and bool(_RFC_3986_PATTERN.match(url))
+        and bool(_FSSPEC_URL_PATTERN.match(url))
         and not url.startswith(("http://", "https://"))
     )

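
A minimal sketch of what the pattern change in pandas/io/common.py does, using only `re`. The two regular expressions are copied from that hunk; `OLD_PATTERN`, `NEW_PATTERN`, and `looks_like_fsspec_url` are illustrative names rather than pandas API, and the sample URLs come from the test file touched by this commit.

```python
import re

# Pattern before the change: a single scheme followed by "://".
OLD_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9+\-+.]*://")
# Pattern after the change: optional "::protocol" links before "://".
NEW_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9+\-+.]*(::[A-Za-z0-9+\-+.]+)*://")


def looks_like_fsspec_url(url, pattern=NEW_PATTERN):
    """Illustrative stand-in that mirrors the three checks shown in the diff."""
    return (
        isinstance(url, str)
        and bool(pattern.match(url))
        and not url.startswith(("http://", "https://"))
    )


samples = [
    "s3://example-fsspec/",
    "filecache::s3://yet-another-fsspec/file.json",
    "https://example-site.com/data",
]
for sample in samples:
    # Show how each sample fares under the old and the new pattern.
    print(sample, looks_like_fsspec_url(sample, OLD_PATTERN), looks_like_fsspec_url(sample))
```

The chained `filecache::s3://` URL is rejected by the old pattern because `:` is not in its character class, so the match cannot reach `://`.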

pandas/io/json/_normalize.py

Lines changed: 11 additions & 11 deletions
@@ -147,7 +147,7 @@ def nested_to_record(
     return new_ds


-def _normalise_json(
+def _normalize_json(
     data: Any,
     key_string: str,
     normalized_dict: dict[str, Any],
@@ -177,7 +177,7 @@ def _normalise_json(
             if not key_string:
                 new_key = new_key.removeprefix(separator)

-            _normalise_json(
+            _normalize_json(
                 data=value,
                 key_string=new_key,
                 normalized_dict=normalized_dict,
@@ -188,7 +188,7 @@ def _normalise_json(
     return normalized_dict


-def _normalise_json_ordered(data: dict[str, Any], separator: str) -> dict[str, Any]:
+def _normalize_json_ordered(data: dict[str, Any], separator: str) -> dict[str, Any]:
     """
     Order the top level keys and then recursively go to depth

@@ -201,10 +201,10 @@ def _normalise_json_ordered(data: dict[str, Any], separator: str) -> dict[str, A

     Returns
     -------
-    dict or list of dicts, matching `normalised_json_object`
+    dict or list of dicts, matching `normalized_json_object`
     """
     top_dict_ = {k: v for k, v in data.items() if not isinstance(v, dict)}
-    nested_dict_ = _normalise_json(
+    nested_dict_ = _normalize_json(
         data={k: v for k, v in data.items() if isinstance(v, dict)},
         key_string="",
         normalized_dict={},
@@ -235,7 +235,7 @@ def _simple_json_normalize(
     Returns
     -------
     frame : DataFrame
-    d - dict or list of dicts, matching `normalised_json_object`
+    d - dict or list of dicts, matching `normalized_json_object`

     Examples
     --------
@@ -256,14 +256,14 @@ def _simple_json_normalize(
     }

     """
-    normalised_json_object = {}
+    normalized_json_object = {}
     # expect a dictionary, as most jsons are. However, lists are perfectly valid
     if isinstance(ds, dict):
-        normalised_json_object = _normalise_json_ordered(data=ds, separator=sep)
+        normalized_json_object = _normalize_json_ordered(data=ds, separator=sep)
     elif isinstance(ds, list):
-        normalised_json_list = [_simple_json_normalize(row, sep=sep) for row in ds]
-        return normalised_json_list
-    return normalised_json_object
+        normalized_json_list = [_simple_json_normalize(row, sep=sep) for row in ds]
+        return normalized_json_list
+    return normalized_json_object


 def json_normalize(

pandas/tests/io/json/test_pandas.py

Lines changed: 1 addition & 0 deletions
@@ -1753,6 +1753,7 @@ def test_read_timezone_information(self):
         [
             "s3://example-fsspec/",
             "gcs://another-fsspec/file.json",
+            "filecache::s3://yet-another-fsspec/file.json",
             "https://example-site.com/data",
             "some-protocol://data.txt",
         ],
