Commit 80d4e9f

Use prek and ruff for linting and update rule set (#1952)
This pull request updates code-style tooling and configuration, streamlines code formatting in notebooks, and improves import organization and configuration files. The most impactful changes are the modernization of the pre-commit setup and the consolidation of code-formatting tools, alongside broad cleanups and formatting simplifications across Jupyter notebooks and Python files.

**Tooling and Configuration Updates:**

* Updated `.pre-commit-config.yaml` to use the latest `ruff-pre-commit` repo, replaced `black` and `isort` with Ruff's formatter and import sorter, and adjusted hooks and dependencies for consistency and modern standards. [[1]](diffhunk://#diff-63a9c44a44acf85fea213a857769990937107cf072831e1a26808cfde9d096b9L17-R28) [[2]](diffhunk://#diff-63a9c44a44acf85fea213a857769990937107cf072831e1a26808cfde9d096b9L63-R58)
* Overhauled `pyproject.toml`: removed the legacy `black` and `isort` settings, expanded the Ruff configuration for linting, formatting, and per-file ignores, and added a mypy configuration for static type checking. [[1]](diffhunk://#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711L128-L132) [[2]](diffhunk://#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711L185-R182) [[3]](diffhunk://#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711L234-R259)

**Code Formatting and Import Organization:**

* Refactored multi-line imports and function calls into single lines in both Python and Jupyter notebook files for improved readability and consistency. [[1]](diffhunk://#diff-008dcb3426febd767787b1521f1fe33086313b927ea37eaab86df5fa88a51698L182-R182) [[2]](diffhunk://#diff-aa879055ed1e1322b67939f639d3bc3c12d5bc295d7c4e834effbc4f1a53fd47L36-R36) [[3]](diffhunk://#diff-77d278999778b6ae6f7aeb8a0ea8d5abbaf3e79a4ff53526648d343874eaf098L7-R7)
* Standardized import order and grouping in multiple notebooks, moving library imports to the top and ensuring proper separation between standard-library, third-party, and local imports (a short illustrative sketch of this grouping follows the commit message). [[1]](diffhunk://#diff-9ec1db58f8064a03a3bd0a6f18fe17675093c4396bf1e3dd24e929b7f06a79cdL162-R165) [[2]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L424-R423) [[3]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L644-R646) [[4]](diffhunk://#diff-f78c8ee0adaf4e4c73403184ec90e74ecc8704b06602e3ac283a4eb0aa31e189L52-R53) [[5]](diffhunk://#diff-90f4d52ca00dc82b18fa350790970ad209a3b5648d5740c56ae215619a2677e2R165) [[6]](diffhunk://#diff-90f4d52ca00dc82b18fa350790970ad209a3b5648d5740c56ae215619a2677e2L237-R242) [[7]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L945-R941)

**Notebook Code Simplification:**

* Simplified function calls and list comprehensions to single lines in notebooks to enhance clarity and maintainability. [[1]](diffhunk://#diff-7038db84d18585b0f7da5f29a8b74c49d11868f4dac32cff9c7b36b373873949L165-R166) [[2]](diffhunk://#diff-43702435914ce70eb5b8f94e455bdadee7eef74f794db43c3b926618c870c586L431-R431) [[3]](diffhunk://#diff-65cccf7b378e42bad759a58749367aefb5ef7d672bc8f1791c721865b8970456L3298-R3298) [[4]](diffhunk://#diff-44ada6db5a5f3838c865cc4be02b2550cdeaa528f0379d1f2042455f5626b352L181-R181) [[5]](diffhunk://#diff-780024224c8402230299f07fd7170df50dbcffa25a77718fd27f5ab40a9d746cL142-R142) [[6]](diffhunk://#diff-49df280ccdfe00a9f5bb767c6e9a7cba44b992af016e9d7a0bed54d652477db1L311-R311) [[7]](diffhunk://#diff-61f535306c629a7f07ad9a9c0e7b99d4bb808840d2a267a82fd86ce3e74ca697L45478-R45478) [[8]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L187-R187) [[9]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L745-R744) [[10]](diffhunk://#diff-c84d3a222eb8e8c95893972a3e136fb420615d6eba35e05fb8ac1fd0ab0fe2b5L757-R754)

**Minor Code Quality Fixes:**

* Fixed logic and formatting issues, such as the generator-based length check in `docs/source/conf.py` and string concatenation in CLI help messages, improving correctness and clarity. [[1]](diffhunk://#diff-008dcb3426febd767787b1521f1fe33086313b927ea37eaab86df5fa88a51698L182-R182) [[2]](diffhunk://#diff-6f1f61fbe373037fdf31139d07acfd57161d3557eaf7369807fea9f6ec65293fL58-R58) [[3]](diffhunk://#diff-6f1f61fbe373037fdf31139d07acfd57161d3557eaf7369807fea9f6ec65293fL114-R115) [[4]](diffhunk://#diff-4357d21435d95de54958f2aa1214e0af88de1df968b314c3f0a1736cd91d8d8aL70-R70) [[5]](diffhunk://#diff-4357d21435d95de54958f2aa1214e0af88de1df968b314c3f0a1736cd91d8d8aL79-R79)

**Dependency and Optional Package Updates:**

* Adjusted the optional dependencies in `pyproject.toml`, removing `pre-commit` and adding `prek` for development.

Together, these changes modernize the project's code-style infrastructure, simplify code in the notebooks, and improve overall maintainability and developer experience.

#1935

### Checklist
<!-- Put an 'x' in all the boxes that apply -->
- [ ] I have added tests to cover my changes or documented any manual tests.
- [ ] I have updated the [documentation](https://github.com/open-edge-platform/datumaro/tree/develop/docs) accordingly

---------

Signed-off-by: Jort Bergfeld <[email protected]>
Signed-off-by: Albert van Houten <[email protected]>
Signed-off-by: Jort Bergfeld <[email protected]>
Co-authored-by: Albert van Houten <[email protected]>
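The "standardized import order" bullet above refers to the three-group layout that Ruff's import-sorting rules (the `I` rule group) enforce. A minimal sketch is shown below; the module names are taken from the `notebooks/11_validate.ipynb` hunk further down, except for `os`, which is added here only to illustrate the standard-library group and is not part of this commit.

```python
# Group 1: standard library (illustrative only; not in this commit's diff).
import os

# Group 2: third-party packages, separated from group 1 by a blank line.
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Group 3: first-party (project) imports come last.
from datumaro.components.annotation import AnnotationType, LabelCategories
```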
1 parent 8598868 commit 80d4e9f

330 files changed, +3414 -7508 lines changed

.github/workflows/linter.yml

Lines changed: 2 additions & 2 deletions
@@ -35,12 +35,12 @@ jobs:
         shell: bash
         run: |
           python -m pip install --upgrade pip
-          pip install pre-commit
+          pip install prek

       # Execute pre-commit checks
       - name: Run pre-commit checks
         shell: bash
-        run: pre-commit run --all-files
+        run: prek run --all-files

   Zizmor-Scan-PR:
     runs-on: ubuntu-latest

.pre-commit-config.yaml

Lines changed: 13 additions & 23 deletions
@@ -4,7 +4,7 @@ default_language_version:
 repos:
   # EOF and whitespace checker
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.4.0
+    rev: v6.0.0
     hooks:
       - id: end-of-file-fixer
         exclude: |
@@ -14,24 +14,14 @@ repos:
           )$

   # Ruff
-  - repo: https://github.com/charliermarsh/ruff-pre-commit
-    rev: "v0.5.0"
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.14.4
     hooks:
-      - id: ruff
-        exclude: "tests"
-
-  # python code formatting
-  - repo: https://github.com/psf/black
-    rev: 23.1.0
-    hooks:
-      - id: black
-
-  # Sort Python import
-  - repo: https://github.com/pycqa/isort
-    rev: 5.11.5
-    hooks:
-      - id: isort
-        name: isort (python)
+      - id: ruff
+        name: Ruff linter on all Python files
+        args: ["--fix"]
+      - id: ruff-format
+        name: Ruff formatter on all Python files

   # TODO: Enable mypy as soon as possible
   # python static type checking
@@ -58,22 +48,22 @@ repos:

   # notebooks
   - repo: https://github.com/nbQA-dev/nbQA
-    rev: 1.8.5
+    rev: 1.9.1
     hooks:
-      - id: nbqa-black
       - id: nbqa-ruff
-        additional_dependencies: [ruff==0.4.10]
+        additional_dependencies: [ruff==0.14.4]
+        args: [--fix]

   # zizmor detects security issues in GitHub Actions workflows.
   - repo: https://github.com/woodruffw/zizmor-pre-commit
-    rev: v1.11.0
+    rev: v1.16.3
     hooks:
       - id: zizmor
         args: ["--min-severity", "low", "--min-confidence", "low"]

   # add bandit for security checks
   - repo: https://github.com/PyCQA/bandit
-    rev: 1.8.3
+    rev: 1.8.6
     hooks:
       - id: bandit
         args:

docs/source/conf.py

Lines changed: 3 additions & 4 deletions
@@ -179,7 +179,7 @@ def replace(app, what, name, obj, options, lines):
             for b in exclude_plugins_name:
                 if a.lower() == b:
                     names.pop(n)
-    if all(1 == len(a) for a in names):
+    if all(len(a) == 1 for a in names):
         prog_name = "".join(names).lower()
     else:
         prog_name = "_".join(names).lower()
@@ -188,9 +188,8 @@ def replace(app, what, name, obj, options, lines):
         prog = str("%(prog)s")
         lines[i] = lines[i].replace(prog, prog_name)
         lines[i] = lines[i].replace("'frame_'", r"'frame\_'")  # fix unwanted link
-        if "'|n'" not in lines[i]:
-            if "'|s'" not in lines[i]:
-                lines[i] = lines[i].replace("|n", "\n").replace("|s", " ")
+        if "'|n'" not in lines[i] and "'|s'" not in lines[i]:
+            lines[i] = lines[i].replace("|n", "\n").replace("|s", " ")


 def setup(app):
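Both edits in the `docs/source/conf.py` hunk above are behavior-preserving: the Yoda-style comparison becomes `len(a) == 1`, and the two nested `if` statements collapse into a single `and` condition. A small standalone check, using made-up sample lines rather than real Sphinx docstrings:

```python
def replace_markers_nested(line: str) -> str:
    # Original shape: two nested ifs guarding the same replacement.
    if "'|n'" not in line:
        if "'|s'" not in line:
            line = line.replace("|n", "\n").replace("|s", " ")
    return line


def replace_markers_combined(line: str) -> str:
    # Refactored shape from the commit: one combined condition.
    if "'|n'" not in line and "'|s'" not in line:
        line = line.replace("|n", "\n").replace("|s", " ")
    return line


# Hypothetical inputs for illustration only.
samples = ["usage: %(prog)s|nOptions:|s-h", "keep the literal '|n' marker", "plain text"]
assert all(replace_markers_nested(s) == replace_markers_combined(s) for s in samples)
```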

notebooks/01_merge_multiple_datasets_for_classification.ipynb

Lines changed: 2 additions & 6 deletions
@@ -162,12 +162,8 @@
      }
     ],
     "source": [
-    "eurosat_label_names = [\n",
-    "    label_cat.name for label_cat in eurosat.categories()[dm.AnnotationType.label]\n",
-    "]\n",
-    "uc_merced_label_names = [\n",
-    "    label_cat.name for label_cat in uc_merced.categories()[dm.AnnotationType.label]\n",
-    "]\n",
+    "eurosat_label_names = [label_cat.name for label_cat in eurosat.categories()[dm.AnnotationType.label]]\n",
+    "uc_merced_label_names = [label_cat.name for label_cat in uc_merced.categories()[dm.AnnotationType.label]]\n",
     "\n",
     "print(\"EuroSAT label names:\")\n",
     "print(eurosat_label_names)\n",

notebooks/03_visualize.ipynb

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@
     "\n",
     "dataset = dm.Dataset.import_from(\"coco_dataset\", format=\"coco_instances\")\n",
     "print(\"Subset candidates:\", dataset.subsets().keys())\n",
-    "subset = list(dataset.subsets().keys())[0]  # val2017\n",
+    "subset = next(dataset.subsets().keys())  # val2017\n",
     "print(\"Subset:\", subset)"
    ]
   },

notebooks/04_filter.ipynb

Lines changed: 1 addition & 3 deletions
@@ -428,9 +428,7 @@
     ],
     "source": [
     "print(\"There is no item with occlusion\")\n",
-    "filtered = dataset.clone().filter(\n",
-    "    '/item/annotation[occluded=\"False\"]', filter_annotations=True, remove_empty=True\n",
-    ")\n",
+    "filtered = dataset.clone().filter('/item/annotation[occluded=\"False\"]', filter_annotations=True, remove_empty=True)\n",
     "for item in filtered:\n",
     "    print(f\"ID: {item.id}\")\n",
     "\n",

notebooks/05_transform.ipynb

Lines changed: 1 addition & 3 deletions
@@ -3295,9 +3295,7 @@
     "source": [
     "# from datumaro.components.dataset import Dataset\n",
     "\n",
-    "val_dataset = dataset.filter(\n",
-    "    '/item[subset=\"val2017\"]'\n",
-    ")  # or Dataset(dataset.get_subset(subsets[0]))\n",
+    "val_dataset = dataset.filter('/item[subset=\"val2017\"]')  # or Dataset(dataset.get_subset(subsets[0]))\n",
     "val_dataset"
    ]
   },

notebooks/06_tiling.ipynb

Lines changed: 1 addition & 3 deletions
@@ -178,9 +178,7 @@
     "source": [
     "from datumaro.plugins.tiling import Tile\n",
     "\n",
-    "tiled_dataset = dataset.transform(\n",
-    "    Tile, grid_size=(2, 2), overlap=(0.1, 0.1), threshold_drop_ann=0.5\n",
-    ")\n",
+    "tiled_dataset = dataset.transform(Tile, grid_size=(2, 2), overlap=(0.1, 0.1), threshold_drop_ann=0.5)\n",
     "target_ids = [tiled_id for tiled_id in get_ids(tiled_dataset, subset) if target_id in tiled_id]\n",
     "print(target_ids)"
    ]

notebooks/08_e2e_example_yolo_ultralytics_trainer.ipynb

Lines changed: 1 addition & 3 deletions
@@ -139,9 +139,7 @@
      }
     ],
     "source": [
-    "splited_dataset = dataset.transform(\n",
-    "    \"random_split\", splits=[(\"train\", 0.5), (\"val\", 0.2), (\"test\", 0.3)]\n",
-    ")\n",
+    "splited_dataset = dataset.transform(\"random_split\", splits=[(\"train\", 0.5), (\"val\", 0.2), (\"test\", 0.3)])\n",
     "splited_dataset"
    ]
   },

notebooks/11_validate.ipynb

Lines changed: 6 additions & 5 deletions
@@ -159,9 +159,10 @@
      }
     ],
     "source": [
-    "import numpy as np\n",
     "import cv2\n",
+    "import numpy as np\n",
     "from matplotlib import pyplot as plt\n",
+    "\n",
     "from datumaro.components.annotation import AnnotationType, LabelCategories\n",
     "\n",
     "label_categories = dataset.categories().get(AnnotationType.label, LabelCategories())\n",
@@ -189,7 +190,7 @@
     "for report in reports[\"validation_reports\"]:\n",
     "    if report[\"anomaly_type\"] == \"NegativeLength\":\n",
     "        item = dataset.get(report[\"item_id\"], report[\"subset\"])\n",
-    "        label_id = [int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit()][0]\n",
+    "        label_id = next(int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit())\n",
     "        visualize_label_id(item, label_id)"
    ]
   },
@@ -228,13 +229,13 @@
     "    if report[\"anomaly_type\"] == \"UndefinedAttribute\":\n",
     "        item = dataset.get(report[\"item_id\"], report[\"subset\"])\n",
     "        for ann in item.annotations:\n",
-    "            for k in ann.attributes.keys():\n",
+    "            for k in ann.attributes:\n",
     "                label_categories[ann.label].attributes.add(k)\n",
     "\n",
     "    if report[\"anomaly_type\"] == \"NegativeLength\":\n",
     "        item = dataset.get(report[\"item_id\"], report[\"subset\"])\n",
     "        print(report[\"description\"])\n",
-    "        label_id = [int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit()][0]\n",
+    "        label_id = next(int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit())\n",
     "        neg_len_anns = []\n",
     "        for ann in item.annotations:\n",
     "            if ann.id == label_id:\n",
@@ -431,7 +432,7 @@
     "    if report[\"anomaly_type\"] == \"FarFromLabelMean\":\n",
     "        print(report[\"description\"])\n",
     "        item = dataset.get(report[\"item_id\"], report[\"subset\"])\n",
-    "        label_id = [int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit()][0]\n",
+    "        label_id = next(int(s) for s in str.split(report[\"description\"], \"'\") if s.isdigit())\n",
     "        visualize_label_id(item, label_id)"
    ]
   },
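The recurring change in this notebook swaps "build the whole list, then take element 0" for `next()` over a generator expression, which stops at the first quoted number in the report description. A self-contained sketch with a made-up description string (the real strings come from Datumaro's validator reports):

```python
# Hypothetical validation message; the digits are wrapped in single quotes,
# matching the split("'") / isdigit() pattern used in the notebook.
description = "Bbox annotation '42' in the item has a negative length."

# Old pattern: materialize every quoted digit group, then take the first.
label_id_old = [int(s) for s in description.split("'") if s.isdigit()][0]

# New pattern: lazily evaluate and stop at the first match. next() raises
# StopIteration when nothing matches, much as the old form raised IndexError.
label_id_new = next(int(s) for s in description.split("'") if s.isdigit())

assert label_id_old == label_id_new == 42
```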
