
Commit 0ccc362

Author: VibavariG (committed)

Merge remote-tracking branch 'upstream/main' into 59717-from-records-columns-reorder

2 parents 7c82e5b + 2237217

10 files changed: +109 −24 lines

.github/workflows/wheels.yml

Lines changed: 1 addition & 5 deletions
@@ -102,9 +102,7 @@ jobs:
         python: [["cp310", "3.10"], ["cp311", "3.11"], ["cp312", "3.12"], ["cp313", "3.13"], ["cp313t", "3.13"]]
         include:
         # TODO: Remove this plus installing build deps in cibw_before_build.sh
-        # and test deps in cibw_before_test.sh after pandas can be built with a released NumPy/Cython
-        - python: ["cp313", "3.13"]
-          cibw_build_frontend: 'pip; args: --no-build-isolation'
+        # after pandas can be built with a released NumPy/Cython
         - python: ["cp313t", "3.13"]
           cibw_build_frontend: 'pip; args: --no-build-isolation'
       # Build Pyodide wheels and upload them to Anaconda.org
@@ -187,11 +185,9 @@ jobs:
       - name: Test Windows Wheels
         if: ${{ matrix.buildplat[1] == 'win_amd64' }}
         shell: pwsh
-        # TODO: Remove NumPy nightly install when there's a 3.13 wheel on PyPI
         run: |
           $TST_CMD = @"
           python -m pip install hypothesis>=6.84.0 pytest>=7.3.2 pytest-xdist>=3.4.0;
-          ${{ matrix.python[1] == '3.13' && 'python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy;' }}
          python -m pip install `$(Get-Item pandas\wheelhouse\*.whl);
          python -c `'import pandas as pd; pd.test(extra_args=[`\"--no-strict-data-files`\", `\"-m not clipboard and not single_cpu and not slow and not network and not db`\"])`';
          "@

MANIFEST.in

Lines changed: 0 additions & 1 deletion
@@ -65,4 +65,3 @@ graft pandas/_libs/include

 # Include cibw script in sdist since it's needed for building wheels
 include scripts/cibw_before_build.sh
-include scripts/cibw_before_test.sh

ci/code_checks.sh

Lines changed: 0 additions & 1 deletion
@@ -383,7 +383,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
     -i "pandas.tseries.offsets.Week.n GL08" \
     -i "pandas.tseries.offsets.Week.normalize GL08" \
     -i "pandas.tseries.offsets.Week.weekday GL08" \
-    -i "pandas.tseries.offsets.WeekOfMonth SA01" \
     -i "pandas.tseries.offsets.WeekOfMonth.is_on_offset GL08" \
     -i "pandas.tseries.offsets.WeekOfMonth.n GL08" \
     -i "pandas.tseries.offsets.WeekOfMonth.normalize GL08" \

pandas/_libs/tslibs/np_datetime.pxd

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ cdef int string_to_dts(
     int* out_local,
     int* out_tzoffset,
     bint want_exc,
-    format: str | None = *,
+    str format = *,
     bint exact = *
 ) except? -1

pandas/_libs/tslibs/np_datetime.pyx

Lines changed: 1 addition & 1 deletion
@@ -331,7 +331,7 @@ cdef int string_to_dts(
     int* out_local,
     int* out_tzoffset,
     bint want_exc,
-    format: str | None=None,
+    str format=None,
     bint exact=True,
 ) except? -1:
     cdef:

pandas/_libs/tslibs/offsets.pyx

Lines changed: 11 additions & 0 deletions
@@ -3582,6 +3582,11 @@ cdef class WeekOfMonth(WeekOfMonthMixin):
     """
     Describes monthly dates like "the Tuesday of the 2nd week of each month".

+    This offset allows for generating or adjusting dates by specifying
+    a particular week and weekday within a month. The week is zero-indexed,
+    where 0 corresponds to the first week of the month, and weekday follows
+    a Monday=0 convention.
+
     Attributes
     ----------
     n : int, default 1
@@ -3602,6 +3607,12 @@ cdef class WeekOfMonth(WeekOfMonthMixin):
     - 5 is Saturday
     - 6 is Sunday.

+    See Also
+    --------
+    offsets.Week : Describes weekly frequency adjustments.
+    offsets.MonthEnd : Describes month-end frequency adjustments.
+    date_range : Generates a range of dates based on a specific frequency.
+
     Examples
     --------
     >>> ts = pd.Timestamp(2022, 1, 1)
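As background for the docstring wording above (not part of this diff), a minimal sketch of the zero-indexed `week` and Monday=0 `weekday` convention, using the public `pd.offsets.WeekOfMonth` API:

```python
import pandas as pd

# week=1, weekday=1 -> "the Tuesday of the 2nd week of each month"
# (week is zero-indexed, weekday follows the Monday=0 convention)
offset = pd.offsets.WeekOfMonth(week=1, weekday=1)

ts = pd.Timestamp(2022, 1, 1)
print(ts + offset)  # rolls forward to the month's second Tuesday, 2022-01-11

# the matching frequency alias can drive date_range directly
print(pd.date_range("2022-01-01", periods=3, freq="WOM-2TUE"))
```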

pyproject.toml

Lines changed: 1 addition & 2 deletions
@@ -157,7 +157,6 @@ test-command = """
 """
 free-threaded-support = true
 before-build = "bash {package}/scripts/cibw_before_build.sh"
-before-test = "bash {package}/scripts/cibw_before_test.sh"

 [tool.cibuildwheel.windows]
 before-build = "pip install delvewheel && bash {package}/scripts/cibw_before_build.sh"
@@ -173,7 +172,7 @@ test-command = """

 [[tool.cibuildwheel.overrides]]
 select = "*-musllinux*"
-before-test = "apk update && apk add musl-locales && bash {package}/scripts/cibw_before_test.sh"
+before-test = "apk update && apk add musl-locales"

 [[tool.cibuildwheel.overrides]]
 select = "*-win*"

scripts/cibw_before_build.sh

Lines changed: 3 additions & 5 deletions
@@ -1,8 +1,6 @@
-# TODO: Delete when there's PyPI NumPy/Cython releases the support Python 3.13.
-# If free-threading support is not included in those releases, this script will have
-# to whether this runs for a free-threaded build instead.
-PYTHON_VERSION="$(python -c "import sys; print(f'{sys.version_info.major}{sys.version_info.minor}')")"
-if [[ $PYTHON_VERSION == "313" ]]; then
+# TODO: Delete when there's a PyPI Cython release that supports free-threaded Python 3.13.
+FREE_THREADED_BUILD="$(python -c"import sysconfig; print(bool(sysconfig.get_config_var('Py_GIL_DISABLED')))")"
+if [[ $FREE_THREADED_BUILD == "True" ]]; then
     python -m pip install -U pip
     python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy cython
     python -m pip install ninja meson-python versioneer[toml]
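The rewritten check keys off the interpreter's build configuration instead of its version string. A small illustrative sketch (not part of the commit) of what the inlined Python one-liner evaluates:

```python
import sysconfig

# Py_GIL_DISABLED is 1 on free-threaded CPython builds (e.g. cp313t),
# 0 on regular builds, and None on interpreters that predate the flag
flag = sysconfig.get_config_var("Py_GIL_DISABLED")
print(bool(flag))  # prints "True" only for a free-threaded build
```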

scripts/cibw_before_test.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.

web/pandas/community/ecosystem.md

Lines changed: 91 additions & 0 deletions
@@ -367,6 +367,97 @@ pandas-gbq provides high performance reads and writes to and from
 these methods were exposed as `pandas.read_gbq` and `DataFrame.to_gbq`.
 Use `pandas_gbq.read_gbq` and `pandas_gbq.to_gbq`, instead.

+
+### [ArcticDB](https://github.com/man-group/ArcticDB)
+
+ArcticDB is a serverless DataFrame database engine designed for the Python Data Science ecosystem. ArcticDB enables you to store, retrieve, and process pandas DataFrames at scale. It is a storage engine designed for object storage and also supports local-disk storage using LMDB. ArcticDB requires zero additional infrastructure beyond a running Python environment and access to object storage and can be installed in seconds. Please find full documentation [here](https://docs.arcticdb.io/latest/).
+
+#### ArcticDB Terminology
+
+ArcticDB is structured to provide a scalable and efficient way to manage and retrieve DataFrames, organized into several key components:
+
+- `Object Store` Collections of libraries. Used to separate logical environments from each other. Analogous to a database server.
+- `Library` Contains multiple symbols which are grouped in a certain way (different users, markets, etc). Analogous to a database.
+- `Symbol` Atomic unit of data storage. Identified by a string name. Data stored under a symbol strongly resembles a pandas DataFrame. Analogous to tables.
+- `Version` Every modifying action (write, append, update) performed on a symbol creates a new version of that object.
+
+#### Installation
+
+To install, simply run:
+
+```console
+pip install arcticdb
+```
+
+To get started, we can import ArcticDB and instantiate it:
+
+```python
+import arcticdb as adb
+import numpy as np
+import pandas as pd
+# this will set up the storage using the local file system
+arctic = adb.Arctic("lmdb://arcticdb_test")
+```
+
+> **Note:** ArcticDB supports any S3 API compatible storage, including AWS. ArcticDB also supports Azure Blob storage.
+> ArcticDB also supports LMDB for local/file based storage - to use LMDB, pass an LMDB path as the URI: `adb.Arctic('lmdb://path/to/desired/database')`.
+
+#### Library Setup
+
+ArcticDB is geared towards storing many (potentially millions) of tables. Individual tables (DataFrames) are called symbols and are stored in collections called libraries. A single library can store many symbols. Libraries must first be initialized prior to use:
+
+```python
+lib = arctic.get_library('sample', create_if_missing=True)
+```
+
+#### Writing Data to ArcticDB
+
+Now we have a library set up, we can get to reading and writing data. ArcticDB has a set of simple functions for DataFrame storage. Let's write a DataFrame to storage.
+
+```python
+df = pd.DataFrame(
+    {
+        "a": list("abc"),
+        "b": list(range(1, 4)),
+        "c": np.arange(3, 6).astype("u1"),
+        "d": np.arange(4.0, 7.0, dtype="float64"),
+        "e": [True, False, True],
+        "f": pd.date_range("20130101", periods=3)
+    }
+)
+
+df
+df.dtypes
+```
+
+Write to ArcticDB.
+
+```python
+write_record = lib.write("test", df)
+```
+
+> **Note:** When writing pandas DataFrames, ArcticDB supports the following index types:
+>
+> - `pandas.Index` containing int64 (or the corresponding dedicated types Int64Index, UInt64Index)
+> - `RangeIndex`
+> - `DatetimeIndex`
+> - `MultiIndex` composed of above supported types
+>
+> The "row" concept in `head`/`tail` refers to the row number ('iloc'), not the value in the `pandas.Index` ('loc').
+
+#### Reading Data from ArcticDB
+
+Read the data back from storage:
+
+```python
+read_record = lib.read("test")
+read_record.data
+df.dtypes
+```
+
+ArcticDB also supports appending, updating, and querying data from storage to a pandas DataFrame. Please find more information [here](https://docs.arcticdb.io/latest/api/query_builder/).
+
+
 ## Out-of-core

 ### [Bodo](https://bodo.ai/)
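The new ecosystem entry mentions versioning, appending, and the query builder without demonstrating them; the following is a brief sketch based on the ArcticDB documentation linked above (illustrative only, not part of this diff):

```python
from arcticdb import Arctic, QueryBuilder

arctic = Arctic("lmdb://arcticdb_test")
lib = arctic.get_library("sample", create_if_missing=True)

# every write/append/update bumps the symbol's version;
# as_of pins a read to a specific version number
first_version = lib.read("test", as_of=0).data

# QueryBuilder pushes filtering into the storage engine
# before a pandas DataFrame is materialised
q = QueryBuilder()
q = q[q["b"] > 1]
filtered = lib.read("test", query_builder=q).data
```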
