Commit 5217cd4

Merge pull request #878 from dimitri-yatsenko/cascade-delete
Fix join error in the new query parser (#857)
2 parents 322f17f + 9955fe2 · commit 5217cd4

File tree

13 files changed: +133 -91 lines

CHANGELOG.md

Lines changed: 5 additions & 3 deletions
@@ -1,14 +1,16 @@
 ## Release notes
 
-### 0.13.0 -- Feb 15, 2021
+### 0.13.0 -- Mar 19, 2021
 * Re-implement query transpilation into SQL, fixing issues (#386, #449, #450, #484). PR #754
 * Re-implement cascading deletes for better performance. PR #839.
 * Add table method `.update1` to update a row in the table with new values PR #763
 * Python datatypes are now enabled by default in blobs (#761). PR #785
 * Added permissive join and restriction operators `@` and `^` (#785) PR #754
 * Support DataJoint datatype and connection plugins (#715, #729) PR 730, #735
-* add `dj.key_hash` alias to `dj.hash.key_hash`
-* default enable_python_native_blobs to True
+* Add `dj.key_hash` alias to `dj.hash.key_hash`
+* Default enable_python_native_blobs to True
+* Bugfix - Regression error on joins with the same attribute name (#857). PR #878
+* Bugfix - Error in `fetch1('KEY')` when `dj.config['fetch_format']='frame'` is set (#876). PR #880, #878
 * Drop support for Python 3.5
 
 ### 0.12.8 -- Jan 12, 2021
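
For context, a brief usage sketch of two of the new features listed in this changelog. It is an illustration, not code from this commit: `my_pipeline`, `Subject`, and `Session` are hypothetical, and an activated schema with a configured database connection is assumed.

from my_pipeline import Subject, Session  # hypothetical schema module

# .update1 (PR #763): update the secondary attributes of exactly one existing row,
# identified by its full primary key
Subject.update1({'subject_id': 1, 'subject_notes': 'left-handed'})

# permissive join `@` (PR #754): join without the join-compatibility check,
# matching on all namesake attributes
linked = Subject @ Session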

LNX-docker-compose.yml

Lines changed: 12 additions & 10 deletions
@@ -32,7 +32,7 @@ services:
       interval: 1s
   fakeservices.datajoint.io:
     <<: *net
-    image: raphaelguzman/nginx:v0.0.13
+    image: datajoint/nginx:v0.0.15
     environment:
       - ADD_db_TYPE=DATABASE
       - ADD_db_ENDPOINT=db:3306
@@ -72,15 +72,17 @@ services:
       - COVERALLS_SERVICE_NAME
       - COVERALLS_REPO_TOKEN
     working_dir: /src
-    command: >
-      /bin/sh -c
-      "
-      pip install --user -r test_requirements.txt;
-      pip install --user .;
-      pip freeze | grep datajoint;
-      nosetests -vsw tests --with-coverage --cover-package=datajoint && coveralls;
-      # jupyter notebook;
-      "
+    command:
+      - sh
+      - -c
+      - |
+        set -e
+        pip install --user -r test_requirements.txt
+        pip install --user .
+        pip freeze | grep datajoint
+        nosetests -vsw tests --with-coverage --cover-package=datajoint
+        coveralls
+        # jupyter notebook
     # ports:
     #   - "8888:8888"
     user: ${UID}:${GID}

README.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 [![Coverage Status](https://coveralls.io/repos/datajoint/datajoint-python/badge.svg?branch=master&service=github)](https://coveralls.io/github/datajoint/datajoint-python?branch=master)
 [![PyPI version](https://badge.fury.io/py/datajoint.svg)](http://badge.fury.io/py/datajoint)
 [![Requirements Status](https://requires.io/github/datajoint/datajoint-python/requirements.svg?branch=master)](https://requires.io/github/datajoint/datajoint-python/requirements/?branch=master)
-[![Join the chat at https://gitter.im/datajoint/datajoint-python](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/datajoint/datajoint-python?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+[![Slack](https://img.shields.io/badge/slack-chat-green.svg)](https://datajoint.slack.com/)
 
 # Welcome to DataJoint for Python!
 DataJoint for Python is a framework for scientific workflow management based on relational principles. DataJoint is built on the foundation of the relational data model and prescribes a consistent method for organizing, populating, computing, and querying data.

datajoint/condition.py

Lines changed: 8 additions & 5 deletions
@@ -45,8 +45,9 @@ def __init__(self, restriction):
 
 def assert_join_compatibility(expr1, expr2):
     """
-    Determine if expressions expr1 and expr2 are join-compatible. To be join-compatible, the matching attributes
-    in the two expressions must be in the primary key of one or the other expression.
+    Determine if expressions expr1 and expr2 are join-compatible. To be join-compatible,
+    the matching attributes in the two expressions must be in the primary key of one or the
+    other expression.
     Raises an exception if not compatible.
     :param expr1: A QueryExpression object
     :param expr2: A QueryExpression object
@@ -58,10 +59,12 @@ def assert_join_compatibility(expr1, expr2):
             raise DataJointError('Object %r is not a QueryExpression and cannot be joined.' % rel)
     if not isinstance(expr1, U) and not isinstance(expr2, U):  # dj.U is always compatible
         try:
-            raise DataJointError("Cannot join query expressions on dependent attribute `%s`" % next(r for r in set(
-                expr1.heading.secondary_attributes).intersection(expr2.heading.secondary_attributes)))
+            raise DataJointError(
+                "Cannot join query expressions on dependent attribute `%s`" % next(
+                    r for r in set(expr1.heading.secondary_attributes).intersection(
+                        expr2.heading.secondary_attributes)))
         except StopIteration:
-            pass
+            pass  # all ok
 
 
 def make_condition(query_expression, condition, columns):
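
For context, a standalone sketch (plain Python with hypothetical attribute names, not the library's API) of the rule this docstring states: two expressions may be joined only if every attribute they share belongs to the primary key of at least one of them, so a shared dependent (secondary) attribute is an error.

def check_join_compatible(secondary1, secondary2):
    """Raise if two headings share a dependent (secondary) attribute."""
    shared_dependent = set(secondary1) & set(secondary2)
    if shared_dependent:
        raise ValueError(
            "Cannot join query expressions on dependent attribute `%s`"
            % sorted(shared_dependent)[0])

# ok: the shared primary-key attributes are not listed among the secondary ones
check_join_compatible(secondary1={"dob"}, secondary2={"rig"})

# error: `rig` is a dependent attribute in both headings
# check_join_compatible(secondary1={"rig"}, secondary2={"rig"})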

datajoint/expression.py

Lines changed: 22 additions & 23 deletions
@@ -37,7 +37,7 @@ class QueryExpression:
     _restriction = None
     _restriction_attributes = None
     _left = []  # True for left joins, False for inner joins
-    _join_attributes = []
+    _original_heading = None  # heading before projections
 
     # subclasses or instantiators must provide values
     _connection = None
@@ -61,6 +61,11 @@ def heading(self):
         """ a dj.Heading object, reflects the effects of the projection operator .proj """
         return self._heading
 
+    @property
+    def original_heading(self):
+        """ a dj.Heading object reflecting the attributes before projection """
+        return self._original_heading or self.heading
+
     @property
     def restriction(self):
         """ a AndList object of restrictions applied to input to produce the result """
@@ -85,11 +90,10 @@ def from_clause(self):
         support = ('(' + src.make_sql() + ') as `_s%x`' % next(
             self._subquery_alias_count) if isinstance(src, QueryExpression) else src for src in self.support)
         clause = next(support)
-        for s, a, left in zip(support, self._join_attributes, self._left):
-            clause += '{left} JOIN {clause}{using}'.format(
+        for s, left in zip(support, self._left):
+            clause += 'NATURAL{left} JOIN {clause}'.format(
                 left=" LEFT" if left else "",
-                clause=s,
-                using="" if not a else " USING (%s)" % ",".join('`%s`' % _ for _ in a))
+                clause=s)
         return clause
 
     def where_clause(self):
@@ -241,34 +245,29 @@ def join(self, other, semantic_check=True, left=False):
             other = other()  # instantiate
         if not isinstance(other, QueryExpression):
            raise DataJointError("The argument of join must be a QueryExpression")
-        other_clash = set(other.heading.names) | set(
-            (other.heading[n].attribute_expression.strip('`') for n in other.heading.new_attributes))
-        self_clash = set(self.heading.names) | set(
-            (self.heading[n].attribute_expression for n in self.heading.new_attributes))
-        need_subquery1 = isinstance(self, Union) or any(
-            n for n in self.heading.new_attributes if (
-                n in other_clash or self.heading[n].attribute_expression.strip('`') in other_clash))
-        need_subquery2 = (len(other.support) > 1 or
-            isinstance(self, Union) or any(
-                n for n in other.heading.new_attributes if (
-                    n in self_clash or other.heading[n].attribute_expression.strip('`') in other_clash)))
+        if semantic_check:
+            assert_join_compatibility(self, other)
+        join_attributes = set(n for n in self.heading.names if n in other.heading.names)
+        # needs subquery if the FROM clause has common attributes with the other's FROM clause
+        need_subquery1 = need_subquery2 = bool(
+            (set(self.original_heading.names) & set(other.original_heading.names))
+            - join_attributes)
+        # need subquery if any of the join attributes are derived
+        need_subquery1 = need_subquery1 or any(n in self.heading.new_attributes for n in join_attributes)
+        need_subquery2 = need_subquery2 or any(n in other.heading.new_attributes for n in join_attributes)
         if need_subquery1:
            self = self.make_subquery()
         if need_subquery2:
            other = other.make_subquery()
-        if semantic_check:
-            assert_join_compatibility(self, other)
         result = QueryExpression()
         result._connection = self.connection
         result._support = self.support + other.support
-        result._join_attributes = (
-            self._join_attributes + [[a for a in self.heading.names if a in other.heading.names]] +
-            other._join_attributes)
         result._left = self._left + [left] + other._left
         result._heading = self.heading.join(other.heading)
         result._restriction = AndList(self.restriction)
         result._restriction.append(other.restriction)
-        assert len(result.support) == len(result._join_attributes) + 1 == len(result._left) + 1
+        result._original_heading = self.original_heading.join(other.original_heading)
+        assert len(result.support) == len(result._left) + 1
         return result
 
     def __add__(self, other):
@@ -371,6 +370,7 @@ def proj(self, *attributes, **named_attributes):
         need_subquery = any(name in self.restriction_attributes for name in self.heading.new_attributes)
 
         result = self.make_subquery() if need_subquery else copy.copy(self)
+        result._original_heading = result.original_heading
         result._heading = result.heading.select(
             attributes, rename_map=dict(**rename_map, **replicate_map), compute_map=compute_map)
         return result
@@ -525,7 +525,6 @@ def create(cls, arg, group, keep_all_rows=False):
         result._connection = join.connection
         result._heading = join.heading.set_primary_key(arg.primary_key)  # use left operand's primary key
         result._support = join.support
-        result._join_attributes = join._join_attributes
         result._left = join._left
         result._left_restrict = join.restriction  # WHERE clause applied before GROUP BY
         result._grouping_attributes = result.primary_key
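
For context, a standalone sketch (plain Python with hypothetical attribute sets, not the library's API) of the subquery rule introduced above for the NATURAL JOIN: an operand must be wrapped in a subquery when its pre-projection heading shares non-join attributes with the other operand's FROM clause, or when any join attribute is a computed/renamed one.

def needs_subquery(original1, original2, heading1, heading2, new1, new2):
    """Decide which operands of a NATURAL JOIN must be wrapped in subqueries."""
    join_attributes = set(heading1) & set(heading2)
    # hidden clash: pre-projection attributes shared by both FROM clauses
    # but not part of the join
    clash = (set(original1) & set(original2)) - join_attributes
    need1 = bool(clash) or any(n in new1 for n in join_attributes)
    need2 = bool(clash) or any(n in new2 for n in join_attributes)
    return need1, need2

# example: both operands projected away a shared secondary attribute `rig`,
# so both must be wrapped lest the NATURAL JOIN match on it anyway
print(needs_subquery(
    original1={"subject_id", "rig"}, original2={"session_id", "rig"},
    heading1={"subject_id"}, heading2={"session_id"},
    new1=set(), new2=set()))  # -> (True, True)

Wrapping an operand in a subquery confines the NATURAL JOIN to the projected heading, so attributes of the underlying FROM clause cannot leak into the join condition.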

datajoint/fetch.py

Lines changed: 2 additions & 1 deletion
@@ -207,6 +207,7 @@ def __call__(self, *attrs, offset=None, limit=None, order_by=None, format=None,
         except Exception as e:
             raise e
         for name in heading:
+            # unpack blobs and externals
             ret[name] = list(map(partial(get, heading[name]), ret[name]))
         if format == "frame":
             ret = pandas.DataFrame(ret).set_index(heading.primary_key)
@@ -251,7 +252,7 @@ def __call__(self, *attrs, squeeze=False, download_path='.'):
         else:  # fetch some attributes, return as tuple
             attributes = [a for a in attrs if not is_key(a)]
             result = self._expression.proj(*attributes).fetch(
-                squeeze=squeeze, download_path=download_path)
+                squeeze=squeeze, download_path=download_path, format="array")
             if len(result) != 1:
                 raise DataJointError(
                     'fetch1 should only return one tuple. %d tuples found' % len(result))
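
For context, a usage sketch of the behavior restored by pinning the internal fetch to format="array" (#876). `my_pipeline` and `Session` are hypothetical, and a configured database connection is assumed.

import datajoint as dj
from my_pipeline import Session  # hypothetical schema module

dj.config['fetch_format'] = 'frame'  # fetch() returns pandas DataFrames by default

# fetch1 must return a single record as plain values, so its internal fetch
# ignores the configured default and uses format="array"
key = (Session & 'session_id = 1').fetch1('KEY')   # dict of primary-key values
date = (Session & key).fetch1('session_date')      # scalar value, not a DataFrame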

datajoint/schemas.py

Lines changed: 4 additions & 6 deletions
@@ -298,12 +298,10 @@ def exists(self):
         """
         if self.database is None:
             raise DataJointError("Schema must be activated first.")
-        return self.database is not None and (
-            self.connection.query(
-                "SELECT schema_name "
-                "FROM information_schema.schemata "
-                "WHERE schema_name = '{database}'".format(
-                    database=self.database)).rowcount > 0)
+        return bool(self.connection.query(
+            "SELECT schema_name "
+            "FROM information_schema.schemata "
+            "WHERE schema_name = '{database}'".format(database=self.database)).rowcount)
 
     @property
     def jobs(self):
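
For context, the same existence check written against a raw connection, as a sketch only: it assumes valid credentials in `dj.config` and a hypothetical schema name `my_pipeline`.

import datajoint as dj

database = 'my_pipeline'  # hypothetical schema name
exists = bool(dj.conn().query(
    "SELECT schema_name FROM information_schema.schemata "
    "WHERE schema_name = '{}'".format(database)).rowcount)
print(exists)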

datajoint/version.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-__version__ = "0.13.dev5"
+__version__ = "0.13.dev6"
 
 assert len(__version__) <= 10  # The log table limits version to the 10 characters

docs-parts/computation/06-distributed-computing_jobs_by_key.rst

Lines changed: 6 additions & 5 deletions
@@ -3,17 +3,18 @@ This can be done by using `dj.key_hash` to convert the key as follows:
 
 .. code-block:: python
 
-    In [4]: schema.jobs & {'key_hash' : dj.key_hash({'id': 2})}
-    Out[4]:
+    In [4]: jk = {'table_name': JobResults.table_name, 'key_hash' : dj.key_hash({'id': 2})}
+    In [5]: schema.jobs & jk
+    Out[5]:
     *table_name    *key_hash      status   key      error_message  error_stac  user                 host       pid    connection_id  timestamp
     +------------+ +------------+ +--------+ +--------+ +------------+ +--------+ +------------+ +-------+ +--------+ +------------+ +------------+
     __job_results  c81e728d9d4c2f error    =BLOB=   KeyboardInterr =BLOB=      datajoint@localhost  localhost  15571  59             2017-09-04 14:
     (Total: 1)
 
-    In [5]: (schema.jobs & {'key_hash' : dj.key_hash({'id': 2})}).delete()
+    In [6]: (schema.jobs & jk).delete()
 
-    In [6]: schema.jobs & {'key_hash' : dj.key_hash({'id': 2})}
-    Out[6]:
+    In [7]: schema.jobs & jk
+    Out[7]:
     *table_name    *key_hash    status   key      error_message  error_stac  user     host     pid    connection_id  timestamp
     +------------+ +----------+ +--------+ +--------+ +------------+ +--------+ +------+ +------+ +-----+ +------------+ +-----------+
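
For context, a small sketch of what `dj.key_hash` contributes in this documentation snippet: it maps a primary-key mapping to the hex digest stored in the jobs table's `key_hash` column, which is why restricting `schema.jobs` requires hashing the key first. This runs without a database connection.

import datajoint as dj

key = {'id': 2}
print(dj.key_hash(key))  # hex digest matching the `key_hash` column shown above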

docs-parts/intro/Releases_lang1.rst

Lines changed: 5 additions & 3 deletions
@@ -1,13 +1,15 @@
-0.13.0 -- Feb 15, 2021
+0.13.0 -- Mar 19, 2021
 ----------------------
 * Re-implement query transpilation into SQL, fixing issues (#386, #449, #450, #484). PR #754
 * Re-implement cascading deletes for better performance. PR #839.
 * Add table method `.update1` to update a row in the table with new values PR #763
 * Python datatypes are now enabled by default in blobs (#761). PR #785
 * Added permissive join and restriction operators `@` and `^` (#785) PR #754
 * Support DataJoint datatype and connection plugins (#715, #729) PR 730, #735
-* add `dj.key_hash` alias to `dj.hash.key_hash`
-* default enable_python_native_blobs to True
+* Add `dj.key_hash` alias to `dj.hash.key_hash`
+* Default enable_python_native_blobs to True
+* Bugfix - Regression error on joins with the same attribute name (#857). PR #878
+* Bugfix - Error in `fetch1('KEY')` when `dj.config['fetch_format']='frame'` is set (#876). PR #880, #878
 * Drop support for Python 3.5
 
 0.12.8 -- Jan 12, 2021
