Commit 200a249

Fixing a few Typos (#1220)
* Fix typos found with codespell
* Optionally integrate codespell into the repo
* Avoid updating uv package versions
1 parent 61f981b commit 200a249

16 files changed: +807 -771 lines changed

.github/workflows/build.yml

Lines changed: 4 additions & 0 deletions
@@ -48,6 +48,10 @@ jobs:
           uv run --no-project ruff check --output-format=github python/
           uv run --no-project ruff format --check python/
 
+      - name: Run codespell
+        run: |
+          uv run --no-project codespell --toml pyproject.toml
+
   generate-license:
     runs-on: ubuntu-latest
     steps:

.pre-commit-config.yaml

Lines changed: 8 additions & 0 deletions
@@ -45,5 +45,13 @@ repos:
         types: [file, rust]
         language: system
 
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.4.1
+    hooks:
+      - id: codespell
+        args: [ --toml, "pyproject.toml"]
+        additional_dependencies:
+          - tomli
+
 default_language_version:
   python: python3
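
Both the CI step and the pre-commit hook pass `--toml pyproject.toml`, so codespell's options come from the `[tool.codespell]` table this commit adds to `pyproject.toml`. The following is a minimal sketch of how such a table can be read, not codespell's own loading code. It assumes Python 3.11+ for `tomllib` and falls back to the `tomli` backport otherwise, which is presumably why the hook lists `tomli` under `additional_dependencies`. The helper name `load_codespell_settings` is hypothetical.

```python
# Sketch only: parses a copy of the [tool.codespell] table from this commit.
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib

PYPROJECT_SNIPPET = """
[tool.codespell]
skip = [
    "./target",
    "uv.lock",
    "./python/tests/test_functions.py"
]
count = true
ignore-words-list = [
    "ans",
    "IST"
]
"""


def load_codespell_settings(toml_text: str) -> dict:
    """Return the [tool.codespell] table, or an empty dict if it is absent."""
    return tomllib.loads(toml_text).get("tool", {}).get("codespell", {})


settings = load_codespell_settings(PYPROJECT_SNIPPET)
print(settings["skip"])               # paths codespell should not scan
print(settings["ignore-words-list"])  # words that are not treated as typos
```

Keeping the settings in `pyproject.toml` means the CI job and the local hook read the same skip list and ignore list.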

README.md

Lines changed: 1 addition & 1 deletion
@@ -233,7 +233,7 @@ and for `uv run` commands the additional parameter `--no-project`
 git clone git@github.com:apache/datafusion-python.git
 # cd to the repo root
 cd datafusion-python/
-# create the virtual enviornment
+# create the virtual environment
 uv sync --dev --no-install-package datafusion
 # activate the environment
 source .venv/bin/activate

docs/source/contributor-guide/ffi.rst

Lines changed: 1 addition & 1 deletion
@@ -195,7 +195,7 @@ optimization levels. If you wish to go down this route, there are two approaches
 have identified you can use.
 
 #. Re-export all of ``datafusion-python`` yourself with your extensions built in.
-#. Carefully synchonize your software releases with the ``datafusion-python`` CI build
+#. Carefully synchronize your software releases with the ``datafusion-python`` CI build
 system so that your libraries use the exact same compiler, features, and
 optimization level.

docs/source/contributor-guide/introduction.rst

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ Bootstrap:
 
 # fetch this repo
 git clone git@github.com:apache/datafusion-python.git
-# create the virtual enviornment
+# create the virtual environment
 uv sync --dev --no-install-package datafusion
 # activate the environment
 source .venv/bin/activate

docs/source/user-guide/common-operations/expressions.rst

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Arrays
 ------
 
 For columns that contain arrays of values, you can access individual elements of the array by index
-using bracket indexing. This is similar to callling the function
+using bracket indexing. This is similar to calling the function
 :py:func:`datafusion.functions.array_element`, except that array indexing using brackets is 0 based,
 similar to Python arrays and ``array_element`` is 1 based indexing to be compatible with other SQL
 approaches.
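
The 0-based versus 1-based distinction in the paragraph above can be illustrated without DataFusion. The sketch below uses a plain Python list. `array_element_one_based` is a hypothetical stand-in that only mimics the 1-based convention and is not the `datafusion.functions.array_element` API. Its out-of-range behaviour is an assumption made for this example.

```python
def array_element_one_based(values: list, index: int):
    """Hypothetical stand-in for SQL-style access: index 1 is the first element."""
    if index < 1 or index > len(values):
        # Assumption for this sketch only; a real engine may behave differently.
        raise IndexError(f"1-based index {index} out of range")
    return values[index - 1]


row = [10, 20, 30]

# Bracket indexing is 0-based, as with Python lists.
bracket_first = row[0]

# The 1-based convention reaches the same element with index 1.
sql_style_first = array_element_one_based(row, 1)

print(bracket_first, sql_style_first)
```

In other words, bracket index `i` and 1-based index `i + 1` refer to the same element.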

docs/source/user-guide/common-operations/windows.rst

Lines changed: 4 additions & 4 deletions
@@ -24,7 +24,7 @@ In this section you will learn about window functions. A window function utilize
 multiple rows to produce a result for each individual row, unlike an aggregate function that
 provides a single value for multiple rows.
 
-The window functions are availble in the :py:mod:`~datafusion.functions` module.
+The window functions are available in the :py:mod:`~datafusion.functions` module.
 
 We'll use the pokemon dataset (from Ritchie Vink) in the following examples.
 
@@ -99,8 +99,8 @@ If you do not specify a Window Frame, the frame will be set depending on the fol
 criteria.
 
 * If an ``order_by`` clause is set, the default window frame is defined as the rows between
-unbounded preceeding and the current row.
-* If an ``order_by`` is not set, the default frame is defined as the rows betwene unbounded
+unbounded preceding and the current row.
+* If an ``order_by`` is not set, the default frame is defined as the rows between unbounded
 and unbounded following (the entire partition).
 
 Window Frames are defined by three parameters: unit type, starting bound, and ending bound.
@@ -116,7 +116,7 @@ The unit types available are:
 ``order_by`` clause.
 
 In this example we perform a "rolling average" of the speed of the current Pokemon and the
-two preceding rows.
+two preceding rows.
 
 .. ipython:: python
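
The window-frame wording touched by this hunk can be made concrete with a small pure-Python sketch. It does not use DataFusion. `frame_average` is a hypothetical helper and the speed values are made up rather than taken from the pokemon dataset. With `preceding=2` it models a rows frame from two preceding rows to the current row (the "rolling average" case). With `preceding=None` it models unbounded preceding to the current row, the default described for when an ``order_by`` is set.

```python
from typing import List, Optional


def frame_average(values: List[float], preceding: Optional[int] = 2) -> List[float]:
    """Average each row with the rows before it that fall inside the frame.

    preceding=None means "unbounded preceding"; otherwise the frame starts
    `preceding` rows before the current row, clipped at the partition start.
    """
    result = []
    for i in range(len(values)):
        start = 0 if preceding is None else max(0, i - preceding)
        frame = values[start : i + 1]  # the frame always ends at the current row
        result.append(sum(frame) / len(frame))
    return result


speeds = [45.0, 60.0, 80.0, 100.0]  # illustrative values, already ordered

rolling = frame_average(speeds, preceding=2)        # two preceding rows + current row
cumulative = frame_average(speeds, preceding=None)  # unbounded preceding to current row

print(rolling)
print(cumulative)
```

The first two rows have fewer than two predecessors, so their frames are simply shorter. The two results only start to differ once more than three rows are available.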

docs/source/user-guide/data-sources.rst

Lines changed: 2 additions & 2 deletions
@@ -25,7 +25,7 @@ DataFusion provides a wide variety of ways to get data into a DataFrame to perfo
 Local file
 ----------
 
-DataFusion has the abilty to read from a variety of popular file formats, such as :ref:`Parquet <io_parquet>`,
+DataFusion has the ability to read from a variety of popular file formats, such as :ref:`Parquet <io_parquet>`,
 :ref:`CSV <io_csv>`, :ref:`JSON <io_json>`, and :ref:`AVRO <io_avro>`.
 
 .. ipython:: python
@@ -120,7 +120,7 @@ DataFusion can import DataFrames directly from other libraries, such as
 `Polars <https://pola.rs/>`_ and `Pandas <https://pandas.pydata.org/>`_.
 Since DataFusion version 42.0.0, any DataFrame library that supports the Arrow FFI PyCapsule
 interface can be imported to DataFusion using the
-:py:func:`~datafusion.context.SessionContext.from_arrow` function. Older verions of Polars may
+:py:func:`~datafusion.context.SessionContext.from_arrow` function. Older versions of Polars may
 not support the arrow interface. In those cases, you can still import via the
 :py:func:`~datafusion.context.SessionContext.from_polars` function.

pyproject.toml

Lines changed: 13 additions & 0 deletions
@@ -129,6 +129,18 @@ max-doc-length = 88
 "benchmarks/*" = ["D", "F", "T", "BLE", "FURB", "PLR", "E", "TD", "TRY", "S", "SIM", "EXE", "UP"]
 "docs/*" = ["D"]
 
+[tool.codespell]
+skip = [
+    "./target",
+    "uv.lock",
+    "./python/tests/test_functions.py"
+]
+count = true
+ignore-words-list = [
+    "ans",
+    "IST"
+]
+
 [dependency-groups]
 dev = [
     "maturin>=1.8.1",
@@ -139,6 +151,7 @@ dev = [
     "ruff>=0.9.1",
     "toml>=0.10.2",
     "pygithub==2.5.0",
+    "codespell==2.4.1",
 ]
 docs = [
     "sphinx>=7.1.2",

python/datafusion/dataframe.py

Lines changed: 2 additions & 2 deletions
@@ -588,7 +588,7 @@ def tail(self, n: int = 5) -> DataFrame:
     def collect(self) -> list[pa.RecordBatch]:
         """Execute this :py:class:`DataFrame` and collect results into memory.
 
-        Prior to calling ``collect``, modifying a DataFrme simply updates a plan
+        Prior to calling ``collect``, modifying a DataFrame simply updates a plan
         (no actual computation is performed). Calling ``collect`` triggers the
         computation.
 
@@ -767,7 +767,7 @@ def explain(self, verbose: bool = False, analyze: bool = False) -> None:
 
         Args:
             verbose: If ``True``, more details will be included.
-            analyze: If ``Tru`e``, the plan will run and metrics reported.
+            analyze: If ``True``, the plan will run and metrics reported.
         """
         self.df.explain(verbose, analyze)
