
Commit 1edecda

Merge branch 'master' of https://github.com/datajoint/datajoint-python into hidden-attr-alt
2 parents 4c4bac5 + b63900b commit 1edecda

File tree: 148 files changed (+6655, -1880 lines)


.codespellrc

Lines changed: 5 additions & 0 deletions
```diff
@@ -0,0 +1,5 @@
+[codespell]
+skip = .git,*.pdf,*.svg,*.csv,*.ipynb,*.drawio
+# Rever -- nobody knows
+# numer -- numerator variable
+ignore-words-list = rever,numer
```

.github/workflows/development.yaml

Lines changed: 10 additions & 0 deletions
```diff
@@ -93,6 +93,16 @@ jobs:
         black datajoint --check -v
         black tests --check -v
         black tests_old --check -v
+  codespell:
+    name: Check for spelling errors
+    permissions:
+      contents: read
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Codespell
+        uses: codespell-project/actions-codespell@v2
   publish-docs:
     if: |
       github.event_name == 'push' &&
```

.github/workflows/docs.yaml

Lines changed: 19 additions & 0 deletions
```diff
@@ -0,0 +1,19 @@
+name: Manual docs release
+on:
+  workflow_dispatch:
+jobs:
+  publish-docs:
+    runs-on: ubuntu-latest
+    env:
+      DOCKER_CLIENT_TIMEOUT: "120"
+      COMPOSE_HTTP_TIMEOUT: "120"
+    steps:
+      - uses: actions/checkout@v3
+      - name: Deploy docs
+        run: |
+          export MODE=BUILD
+          export PACKAGE=datajoint
+          export UPSTREAM_REPO=https://github.com/${GITHUB_REPOSITORY}.git
+          export HOST_UID=$(id -u)
+          docker compose -f docs/docker-compose.yaml up --exit-code-from docs --build
+          git push origin gh-pages
```

CHANGELOG.md

Lines changed: 11 additions & 2 deletions
```diff
@@ -1,5 +1,14 @@
 ## Release notes
 
+### Upcoming
+- Added - Codespell GitHub Actions workflow
+- Added - GitHub Actions workflow to manually release docs
+- Changed - Update `datajoint/nginx` to `v0.2.6`
+- Changed - Migrate docs from `https://docs.datajoint.org/python` to `https://datajoint.com/docs/core/datajoint-python`
+- Fixed - Updated set_password to work on MySQL 8 - PR [#1106](https://github.com/datajoint/datajoint-python/pull/1106)
+- Added - Missing tests for set_password - PR [#1106](https://github.com/datajoint/datajoint-python/pull/1106)
+- Changed - Returning success count after the .populate() call - PR [#1050](https://github.com/datajoint/datajoint-python/pull/1050)
+
 ### 0.14.1 -- Jun 02, 2023
 - Fixed - Fix altering a part table that uses the "master" keyword - PR [#991](https://github.com/datajoint/datajoint-python/pull/991)
 - Fixed - `.ipynb` output in tutorials is not visible in dark mode ([#1078](https://github.com/datajoint/datajoint-python/issues/1078)) PR [#1080](https://github.com/datajoint/datajoint-python/pull/1080)
@@ -31,7 +40,7 @@
 - Fixed - Fix queries with backslashes ([#999](https://github.com/datajoint/datajoint-python/issues/999)) PR [#1052](https://github.com/datajoint/datajoint-python/pull/1052)
 
 ### 0.13.7 -- Jul 13, 2022
-- Fixed - Fix networkx incompatable change by version pinning to 2.6.3 (#1035) PR #1036
+- Fixed - Fix networkx incompatible change by version pinning to 2.6.3 (#1035) PR #1036
 - Added - Support for serializing numpy datetime64 types (#1022) PR #1036
 - Changed - Add traceback to default logging PR #1036
 
@@ -83,7 +92,7 @@
 - Fixed - `schema.list_tables()` is not topologically sorted (#838) PR #893
 - Fixed - Diagram part tables do not show proper class name (#882) PR #893
 - Fixed - Error in complex restrictions (#892) PR #893
-- Fixed - WHERE and GROUP BY clases are dropped on joins with aggregation (#898, #899) PR #893
+- Fixed - WHERE and GROUP BY classes are dropped on joins with aggregation (#898, #899) PR #893
 
 ### 0.13.0 -- Mar 24, 2021
 - Re-implement query transpilation into SQL, fixing issues (#386, #449, #450, #484, #558). PR #754
```
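The "Returning success count after the .populate() call" entry changes what callers get back from `populate()`: a dict instead of an optional error list. A minimal sketch of consuming that dict; the table, key, and error values here are hypothetical stand-ins (a real call would be something like `result = MyTable.populate(suppress_errors=True)` against a configured pipeline):

```python
# Hypothetical result dict in the shape populate() now returns.
result = {
    "success_count": 2,
    "error_list": [({"subject_id": 1}, "ValueError: bad frame")],
}

# error_list holds (key, error) pairs when suppress_errors=True.
failed_keys = [key for key, _ in result["error_list"]]
summary = f"{result['success_count']} succeeded, {len(result['error_list'])} failed"
```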

README.md

Lines changed: 20 additions & 11 deletions
````diff
@@ -5,10 +5,18 @@
 
 # Welcome to DataJoint for Python!
 
-DataJoint for Python is a framework for scientific workflow management based on relational principles. DataJoint is built on the foundation of the relational data model and prescribes a consistent method for organizing, populating, computing, and querying data.
-
-DataJoint was initially developed in 2009 by Dimitri Yatsenko in Andreas Tolias' Lab at Baylor College of Medicine for the distributed processing and management of large volumes of data streaming from regular experiments. Starting in 2011, DataJoint has been available as an open-source project adopted by other labs and improved through contributions from several developers.
-Presently, the primary developer of DataJoint open-source software is the company DataJoint (https://datajoint.com).
+DataJoint for Python is a framework for scientific workflow management based on
+relational principles. DataJoint is built on the foundation of the relational data
+model and prescribes a consistent method for organizing, populating, computing, and
+querying data.
+
+DataJoint was initially developed in 2009 by Dimitri Yatsenko in Andreas Tolias' Lab at
+Baylor College of Medicine for the distributed processing and management of large
+volumes of data streaming from regular experiments. Starting in 2011, DataJoint has
+been available as an open-source project adopted by other labs and improved through
+contributions from several developers.
+Presently, the primary developer of DataJoint open-source software is the company
+DataJoint (https://datajoint.com).
 
 ## Data Pipeline Example
 
@@ -18,7 +26,13 @@ Presently, the primary developer of DataJoint open-source software is the compan
 
 ## Getting Started
 
-- Install from PyPI
+- Install with Conda
+
+  ```bash
+  conda install -c conda-forge datajoint
+  ```
+
+- Install with pip
 
   ```bash
   pip install datajoint
@@ -33,9 +47,4 @@ Presently, the primary developer of DataJoint open-source software is the compan
 - Contribute
   - [Development Environment](https://datajoint.com/docs/core/datajoint-python/latest/develop/)
 
-  - [Guidelines](https://datajoint.com/docs/community/contribute/)
-
-- Legacy Resources (To be replaced by above)
-  - [Documentation](https://docs.datajoint.org)
-
-  - [Tutorials](https://tutorials.datajoint.org)
+  - [Guidelines](https://datajoint.com/docs/about/contribute/)
````

datajoint/__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,5 +1,5 @@
 """
-DataJoint for Python is a framework for building data piplines using MySQL databases
+DataJoint for Python is a framework for building data pipelines using MySQL databases
 to represent pipeline structure and bulk storage systems for large objects.
 DataJoint is built on the foundation of the relational data model and prescribes a
 consistent method for organizing, populating, and querying data.
```

datajoint/admin.py

Lines changed: 10 additions & 2 deletions
```diff
@@ -1,5 +1,6 @@
 import pymysql
 from getpass import getpass
+from packaging import version
 from .connection import conn
 from .settings import config
 from .utils import user_choice
@@ -14,9 +15,16 @@ def set_password(new_password=None, connection=None, update_config=None):
         new_password = getpass("New password: ")
         confirm_password = getpass("Confirm password: ")
         if new_password != confirm_password:
-            logger.warn("Failed to confirm the password! Aborting password change.")
+            logger.warning("Failed to confirm the password! Aborting password change.")
             return
-    connection.query("SET PASSWORD = PASSWORD('%s')" % new_password)
+
+    if version.parse(
+        connection.query("select @@version;").fetchone()[0]
+    ) >= version.parse("5.7"):
+        # SET PASSWORD is deprecated as of MySQL 5.7 and removed in 8+
+        connection.query("ALTER USER user() IDENTIFIED BY '%s';" % new_password)
+    else:
+        connection.query("SET PASSWORD = PASSWORD('%s')" % new_password)
     logger.info("Password updated.")
 
     if update_config or (
```
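The version gate in the set_password patch above can be exercised in isolation. `password_statement` is a hypothetical helper (ours, not from the source) that returns the SQL the patched code would issue for a given server version string, using the same `packaging.version` comparison the diff introduces:

```python
from packaging import version  # same dependency the patch imports

def password_statement(server_version, new_password):
    """Choose the password-change SQL the way the patched set_password does."""
    if version.parse(server_version) >= version.parse("5.7"):
        # SET PASSWORD is deprecated as of MySQL 5.7 and removed in 8+
        return "ALTER USER user() IDENTIFIED BY '%s';" % new_password
    return "SET PASSWORD = PASSWORD('%s')" % new_password
```

Note the real code feeds `select @@version;` output into the comparison; bare version strings like "8.0.32" parse cleanly, but distro-suffixed strings may need stripping first.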

datajoint/autopopulate.py

Lines changed: 101 additions & 80 deletions
```diff
@@ -118,7 +118,7 @@ def _job_key(self, key):
 
     def _jobs_to_do(self, restrictions):
         """
-        :return: the query yeilding the keys to be computed (derived from self.key_source)
+        :return: the query yielding the keys to be computed (derived from self.key_source)
         """
         if self.restriction:
             raise DataJointError(
@@ -180,6 +180,9 @@ def populate(
             to be passed down to each ``make()`` call. Computation arguments should be
             specified within the pipeline e.g. using a `dj.Lookup` table.
         :type make_kwargs: dict, optional
+        :return: a dict with two keys
+            "success_count": the count of successful ``make()`` calls in this ``populate()`` call
+            "error_list": the error list that is filled if `suppress_errors` is True
         """
         if self.connection.in_transaction:
             raise DataJointError("Populate cannot be called during a transaction.")
@@ -222,49 +225,62 @@ def handler(signum, frame):
 
         keys = keys[:max_calls]
         nkeys = len(keys)
-        if not nkeys:
-            return
-
-        processes = min(_ for _ in (processes, nkeys, mp.cpu_count()) if _)
 
         error_list = []
-        populate_kwargs = dict(
-            suppress_errors=suppress_errors,
-            return_exception_objects=return_exception_objects,
-            make_kwargs=make_kwargs,
-        )
+        success_list = []
 
-        if processes == 1:
-            for key in (
-                tqdm(keys, desc=self.__class__.__name__) if display_progress else keys
-            ):
-                error = self._populate1(key, jobs, **populate_kwargs)
-                if error is not None:
-                    error_list.append(error)
-        else:
-            # spawn multiple processes
-            self.connection.close()  # disconnect parent process from MySQL server
-            del self.connection._conn.ctx  # SSLContext is not pickleable
-            with mp.Pool(
-                processes, _initialize_populate, (self, jobs, populate_kwargs)
-            ) as pool, (
-                tqdm(desc="Processes: ", total=nkeys)
-                if display_progress
-                else contextlib.nullcontext()
-            ) as progress_bar:
-                for error in pool.imap(_call_populate1, keys, chunksize=1):
-                    if error is not None:
-                        error_list.append(error)
-                    if display_progress:
-                        progress_bar.update()
-            self.connection.connect()  # reconnect parent process to MySQL server
+        if nkeys:
+            processes = min(_ for _ in (processes, nkeys, mp.cpu_count()) if _)
+
+            populate_kwargs = dict(
+                suppress_errors=suppress_errors,
+                return_exception_objects=return_exception_objects,
+                make_kwargs=make_kwargs,
+            )
+
+            if processes == 1:
+                for key in (
+                    tqdm(keys, desc=self.__class__.__name__)
+                    if display_progress
+                    else keys
+                ):
+                    status = self._populate1(key, jobs, **populate_kwargs)
+                    if status is True:
+                        success_list.append(1)
+                    elif isinstance(status, tuple):
+                        error_list.append(status)
+                    else:
+                        assert status is False
+            else:
+                # spawn multiple processes
+                self.connection.close()  # disconnect parent process from MySQL server
+                del self.connection._conn.ctx  # SSLContext is not pickleable
+                with mp.Pool(
+                    processes, _initialize_populate, (self, jobs, populate_kwargs)
+                ) as pool, (
+                    tqdm(desc="Processes: ", total=nkeys)
+                    if display_progress
+                    else contextlib.nullcontext()
+                ) as progress_bar:
+                    for status in pool.imap(_call_populate1, keys, chunksize=1):
+                        if status is True:
+                            success_list.append(1)
+                        elif isinstance(status, tuple):
+                            error_list.append(status)
+                        else:
+                            assert status is False
+                        if display_progress:
+                            progress_bar.update()
+                self.connection.connect()  # reconnect parent process to MySQL server
 
         # restore original signal handler:
         if reserve_jobs:
             signal.signal(signal.SIGTERM, old_handler)
 
-        if suppress_errors:
-            return error_list
+        return {
+            "success_count": sum(success_list),
+            "error_list": error_list,
+        }
 
     def _populate1(
         self, key, jobs, suppress_errors, return_exception_objects, make_kwargs=None
@@ -275,55 +291,60 @@ def _populate1(
         :param key: dict specifying job to populate
         :param suppress_errors: bool if errors should be suppressed and returned
         :param return_exception_objects: if True, errors must be returned as objects
-        :return: (key, error) when suppress_errors=True, otherwise None
+        :return: (key, error) when suppress_errors=True,
+            True if successfully invoke one `make()` call, otherwise False
         """
         make = self._make_tuples if hasattr(self, "_make_tuples") else self.make
 
-        if jobs is None or jobs.reserve(self.target.table_name, self._job_key(key)):
-            self.connection.start_transaction()
-            if key in self.target:  # already populated
-                self.connection.cancel_transaction()
-                if jobs is not None:
-                    jobs.complete(self.target.table_name, self._job_key(key))
-            else:
-                logger.debug(f"Making {key} -> {self.target.full_table_name}")
-                self.__class__._allow_insert = True
-                try:
-                    make(dict(key), **(make_kwargs or {}))
-                except (KeyboardInterrupt, SystemExit, Exception) as error:
-                    try:
-                        self.connection.cancel_transaction()
-                    except LostConnectionError:
-                        pass
-                    error_message = "{exception}{msg}".format(
-                        exception=error.__class__.__name__,
-                        msg=": " + str(error) if str(error) else "",
-                    )
-                    logger.debug(
-                        f"Error making {key} -> {self.target.full_table_name} - {error_message}"
-                    )
-                    if jobs is not None:
-                        # show error name and error message (if any)
-                        jobs.error(
-                            self.target.table_name,
-                            self._job_key(key),
-                            error_message=error_message,
-                            error_stack=traceback.format_exc(),
-                        )
-                    if not suppress_errors or isinstance(error, SystemExit):
-                        raise
-                    else:
-                        logger.error(error)
-                        return key, error if return_exception_objects else error_message
-                else:
-                    self.connection.commit_transaction()
-                    logger.debug(
-                        f"Success making {key} -> {self.target.full_table_name}"
-                    )
-                    if jobs is not None:
-                        jobs.complete(self.target.table_name, self._job_key(key))
-                finally:
-                    self.__class__._allow_insert = False
+        if jobs is not None and not jobs.reserve(
+            self.target.table_name, self._job_key(key)
+        ):
+            return False
+
+        self.connection.start_transaction()
+        if key in self.target:  # already populated
+            self.connection.cancel_transaction()
+            if jobs is not None:
+                jobs.complete(self.target.table_name, self._job_key(key))
+            return False
+
+        logger.debug(f"Making {key} -> {self.target.full_table_name}")
+        self.__class__._allow_insert = True
+        try:
+            make(dict(key), **(make_kwargs or {}))
+        except (KeyboardInterrupt, SystemExit, Exception) as error:
+            try:
+                self.connection.cancel_transaction()
+            except LostConnectionError:
+                pass
+            error_message = "{exception}{msg}".format(
+                exception=error.__class__.__name__,
+                msg=": " + str(error) if str(error) else "",
+            )
+            logger.debug(
+                f"Error making {key} -> {self.target.full_table_name} - {error_message}"
+            )
+            if jobs is not None:
+                # show error name and error message (if any)
+                jobs.error(
+                    self.target.table_name,
+                    self._job_key(key),
+                    error_message=error_message,
+                    error_stack=traceback.format_exc(),
+                )
+            if not suppress_errors or isinstance(error, SystemExit):
+                raise
+            else:
+                logger.error(error)
+                return key, error if return_exception_objects else error_message
+        else:
+            self.connection.commit_transaction()
+            logger.debug(f"Success making {key} -> {self.target.full_table_name}")
+            if jobs is not None:
+                jobs.complete(self.target.table_name, self._job_key(key))
+            return True
+        finally:
+            self.__class__._allow_insert = False
 
     def progress(self, *restrictions, display=False):
         """
```

datajoint/blob.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -449,7 +449,7 @@ def pack_dict(self, d):
         )
 
     def read_struct(self):
-        """deserialize matlab stuct"""
+        """deserialize matlab struct"""
        n_dims = self.read_value()
        shape = self.read_value(count=n_dims)
        n_elem = np.prod(shape, dtype=int)
```
