Merge branch 'main' into fix_zenodraft

egpbos · web-flow · commit e7261a43816b · 2025-07-28T16:41:00.000+02:00
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -1,5 +1,5 @@
 repos:
   - repo: https://github.com/rbubley/mirrors-prettier
-    rev: v3.5.3
+    rev: v3.6.2
     hooks:
       - id: prettier
diff --git a/language_guides/python.md b/language_guides/python.md
@@ -249,7 +249,6 @@ If you use another editor, perhaps it is more convenient to pick another service
 - List of other available software can be found on the [Python wiki page on debugging tools](https://wiki.python.org/moin/PythonDebuggingTools).
 
 - If you are looking for some tutorials to get started:
-
   - https://pymotw.com/2/pdb
   - https://github.com/spiside/pdb-tutorial
   - https://www.jetbrains.com/help/pycharm/2016.3/debugging.html
diff --git a/technology/datasets.md b/technology/datasets.md
@@ -25,7 +25,6 @@ SQLite is a transactional database, so if you have a dataset that is changing wi
 
 - DuckDB can also create views (virtual tables) from other sources such as files or other databases, whereas with SQLite you always have to import the data before running any queries.
 - DuckDB is multi-threaded. This can be an advantage for large databases, where aggregation queries tend to be faster than in SQLite.
-
   - However, if you have a really large dataset, say hundreds of millions of rows, and want to perform a deeply nested query, it would require a substantial amount of memory, making it infeasible to run on personal laptops.
   - There are options to customize memory handling, and push what is possible on a single machine.
 
@@ -44,7 +43,6 @@ SQLite is a transactional database, so if you have a dataset that is changing wi
   Note, if your query is deeply nested, you should have sufficient disk space for DuckDB to use; e.g. for 4 nested levels of `INNER JOIN` combined with a `GROUP BY`, we observed a disk spillover of 30x the original dataset. However, we found this was not always reliable.
 
   In these kinds of borderline cases, it might be possible to address the limitation by splitting the workload into chunks, and aggregating later, or by considering one of the alternatives mentioned below.
-
   - You can also optimize the queries for DuckDB, but that requires a deeper dive into the documentation, and understanding how DuckDB query optimisation works.
 
 - Both databases support setting (unique) indexes. Indexes are useful and sometimes necessary