Commit 3b6e845

Merge pull request #1067 from CBroz1/master
Add support for insert CSV
2 parents e339d46 + 7692f3d

7 files changed: +101 additions, −31 deletions


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
 ### 0.14.0 -- TBA
 * Bugfix - Activating a schema requires all tables to exist even if `create_tables=False` PR [#1058](https://github.com/datajoint/datajoint-python/pull/1058)
 * Update - Populate call with `reserve_jobs=True` to exclude `error` and `ignore` keys - PR [#1062](https://github.com/datajoint/datajoint-python/pull/1062)
+* Add - Support for inserting data with CSV files - PR [#1067](https://github.com/datajoint/datajoint-python/pull/1067)
 
 ### 0.13.8 -- Sep 21, 2022
 * Add - New documentation structure based on markdown PR [#1052](https://github.com/datajoint/datajoint-python/pull/1052)
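For orientation, the new changelog entry corresponds to call sites along the following lines. This is only a sketch mirroring the documentation example added later in this commit; `mouse` is a hypothetical table instance whose heading matches the CSV header row.

```python
from pathlib import Path

# Hypothetical: `mouse` is an existing DataJoint table whose attributes
# (mouse_id, dob, sex) match the header row of mice.csv.
mouse.insert(Path("./mice.csv"))
```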

LNX-docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ services:
       interval: 1s
   fakeservices.datajoint.io:
     <<: *net
-    image: datajoint/nginx:v0.2.3
+    image: datajoint/nginx:v0.2.4
     environment:
       - ADD_db_TYPE=DATABASE
       - ADD_db_ENDPOINT=db:3306

README.md

Lines changed: 10 additions & 10 deletions
@@ -112,15 +112,15 @@ important DataJoint schema or records.
 
 ### API docs
 
-The API documentation can be built using sphinx by running
+The API documentation can be built with mkdocs using the docker compose file in
+`docs/` with the following command:
 
 ``` bash
-pip install sphinx sphinx_rtd_theme
-(cd docs-api/sphinx && make html)
+MODE="LIVE" PACKAGE=datajoint UPSTREAM_REPO=https://github.com/datajoint/datajoint-python.git HOST_UID=$(id -u) docker compose -f docs/docker-compose.yaml up --build
 ```
 
-Generated docs are written to `docs-api/docs/html/index.html`.
-More details in [docs-api/README.md](docs-api/README.md).
+The site will then be available at `http://localhost/`. When finished, be sure to run
+the same command as above, but replace `up --build` with `down`.
 
 ## Running Tests Locally
 <details>

@@ -141,11 +141,11 @@ HOST_GID=1000
 * Add entry in `/etc/hosts` for `127.0.0.1 fakeservices.datajoint.io`
 * Run desired tests. Some examples are as follows:
 
-| Use Case                     | Shell Code                                                                       |
-| ---------------------------- | ------------------------------------------------------------------------------ |
-| Run all tests                | `nosetests -vsw tests --with-coverage --cover-package=datajoint`                 |
-| Run one specific class test  | `nosetests -vs --tests=tests.test_fetch:TestFetch.test_getattribute_for_fetch1` |
-| Run one specific basic test  | `nosetests -vs --tests=tests.test_external_class:test_insert_and_fetch`         |
+| Use Case                     | Shell Code                                                                       |
+| ---------------------------- | ------------------------------------------------------------------------------ |
+| Run all tests                | `nosetests -vsw tests --with-coverage --cover-package=datajoint`                 |
+| Run one specific class test  | `nosetests -vs --tests=tests.test_fetch:TestFetch.test_getattribute_for_fetch1` |
+| Run one specific basic test  | `nosetests -vs --tests=tests.test_external_class:test_insert_and_fetch`         |
 
 
 ### Launch Docker Terminal

datajoint/table.py

Lines changed: 12 additions & 4 deletions
@@ -6,6 +6,7 @@
 import pandas
 import logging
 import uuid
+import csv
 import re
 from pathlib import Path
 from .settings import config

@@ -345,13 +346,16 @@ def insert(
         """
         Insert a collection of rows.
 
-        :param rows: An iterable where an element is a numpy record, a dict-like object, a
-            pandas.DataFrame, a sequence, or a query expression with the same heading as self.
+        :param rows: Either (a) an iterable where an element is a numpy record, a
+            dict-like object, a pandas.DataFrame, a sequence, or a query expression with
+            the same heading as self, or (b) a pathlib.Path object specifying a path
+            relative to the current directory with a CSV file, the contents of which
+            will be inserted.
         :param replace: If True, replaces the existing tuple.
         :param skip_duplicates: If True, silently skip duplicate inserts.
         :param ignore_extra_fields: If False, fields that are not in the heading raise error.
-        :param allow_direct_insert: applies only in auto-populated tables. If False (default),
-            insert are allowed only from inside the make callback.
+        :param allow_direct_insert: Only applies in auto-populated tables. If False (default),
+            insert may only be called from inside the make callback.
 
         Example:

@@ -366,6 +370,10 @@ def insert(
                 drop=len(rows.index.names) == 1 and not rows.index.names[0]
             ).to_records(index=False)
 
+        if isinstance(rows, Path):
+            with open(rows, newline="") as data_file:
+                rows = list(csv.DictReader(data_file, delimiter=","))
+
         # prohibit direct inserts into auto-populated tables
         if not allow_direct_insert and not getattr(self, "_allow_insert", True):
             raise DataJointError(
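As a standalone illustration of what the new `Path` branch produces before the normal insert logic takes over, the snippet below reads a CSV with `csv.DictReader` the same way the added code does. Note that every value comes back as a string keyed by the header row; the file name and contents are hypothetical, borrowed from the docs example further down in this commit.

```python
import csv
from pathlib import Path

# Hypothetical CSV matching the docs example in this commit:
# mouse_id,dob,sex
# 1,2016-11-19,M
path = Path("./mice.csv")

with open(path, newline="") as data_file:
    # Each row becomes a dict keyed by the header, e.g.
    # {'mouse_id': '1', 'dob': '2016-11-19', 'sex': 'M'}
    rows = list(csv.DictReader(data_file, delimiter=","))

print(rows)
```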

docs/src/query-lang/common-commands.md

Lines changed: 74 additions & 11 deletions
@@ -1,6 +1,70 @@
 
-<!-- ## Insert is present in the general docs here-->
-
+## Insert
+
+Data entry is as easy as providing the appropriate data structure to a permitted table.
+Given the following table definition, we can insert data as tuples, dicts, pandas
+dataframes, or pathlib `Path` relative paths to local CSV files.
+
+```text
+mouse_id: int # unique mouse id
+---
+dob: date # mouse date of birth
+sex: enum('M', 'F', 'U') # sex of mouse - Male, Female, or Unknown
+```
+
+=== "Tuple"
+
+    ```python
+    mouse.insert1( (0, '2017-03-01', 'M') ) # Single entry
+    data = [
+        (1, '2016-11-19', 'M'),
+        (2, '2016-11-20', 'U'),
+        (5, '2016-12-25', 'F')
+    ]
+    mouse.insert(data) # Multi-entry
+    ```
+
+=== "Dict"
+
+    ```python
+    mouse.insert1( dict(mouse_id=0, dob='2017-03-01', sex='M') ) # Single entry
+    data = [
+        {'mouse_id':1, 'dob':'2016-11-19', 'sex':'M'},
+        {'mouse_id':2, 'dob':'2016-11-20', 'sex':'U'},
+        {'mouse_id':5, 'dob':'2016-12-25', 'sex':'F'}
+    ]
+    mouse.insert(data) # Multi-entry
+    ```
+
+=== "Pandas"
+
+    ```python
+    import pandas as pd
+    data = pd.DataFrame(
+        [[1, "2016-11-19", "M"], [2, "2016-11-20", "U"], [5, "2016-12-25", "F"]],
+        columns=["mouse_id", "dob", "sex"],
+    )
+    mouse.insert(data)
+    ```
+
+=== "CSV"
+
+    Given the following CSV in the current working directory as `mice.csv`
+
+    ```console
+    mouse_id,dob,sex
+    1,2016-11-19,M
+    2,2016-11-20,U
+    5,2016-12-25,F
+    ```
+
+    We can import as follows:
+
+    ```python
+    from pathlib import Path
+    mouse.insert(Path('./mice.csv'))
+    ```
+
 ## Make
 
 See the article on [`make` methods](../../reproduce/make-method/)

@@ -31,8 +95,8 @@ data = query.fetch(as_dict=True) # (2)
 ### Separate variables
 
 ``` python
-name, img = query.fetch1('name', 'image') # when query has exactly one entity
-name, img = query.fetch('name', 'image') # [name, ...] [image, ...]
+name, img = query.fetch1('mouse_id', 'dob') # when query has exactly one entity
+name, img = query.fetch('mouse_id', 'dob') # [mouse_id, ...] [dob, ...]
 ```
 
 ### Primary key values

@@ -51,19 +115,18 @@ primary keys.
 To sort the result, use the `order_by` keyword argument.
 
 ``` python
-data = query.fetch(order_by='name') # ascending order
-data = query.fetch(order_by='name desc') # descending order
-data = query.fetch(order_by=('name desc', 'year')) # by name first, year second
-data = query.fetch(order_by='KEY') # sort by the primary key
-data = query.fetch(order_by=('name', 'KEY desc')) # sort by name but for same names order by primary key
+data = query.fetch(order_by='mouse_id') # ascending order
+data = query.fetch(order_by='mouse_id desc') # descending order
+data = query.fetch(order_by=('mouse_id', 'dob')) # by ID first, dob second
+data = query.fetch(order_by='KEY') # sort by the primary key
 ```
 
 The `order_by` argument can be a string specifying the attribute to sort by. By default
 the sort is in ascending order. Use `'attr desc'` to sort in descending order by
 attribute `attr`. The value can also be a sequence of strings, in which case, the sort
 performed on all the attributes jointly in the order specified.
 
-The special attribute name `'KEY'` represents the primary key attributes in order that
+The special attribute named `'KEY'` represents the primary key attributes in order that
 they appear in the index. Otherwise, this name can be used as any other argument.
 
 If an attribute happens to be a SQL reserved word, it needs to be enclosed in

@@ -82,7 +145,7 @@ Similar to sorting, the `limit` and `offset` arguments can be used to limit the
 to a subset of entities.
 
 ``` python
-data = query.fetch(order_by='name', limit=10, offset=5)
+data = query.fetch(order_by='mouse_id', limit=10, offset=5)
 ```
 
 Note that an `offset` cannot be used without specifying a `limit` as
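Taken together, the documented insert and fetch changes suggest a small end-to-end flow like the following. This is a sketch under the same assumptions as the docs examples above: a hypothetical `mouse` table with attributes `mouse_id`, `dob`, `sex` and a `mice.csv` file in the working directory.

```python
from pathlib import Path

# Insert rows from a local CSV (new in this PR), then read them back sorted and limited.
mouse.insert(Path("./mice.csv"), skip_duplicates=True)
ids, dobs = mouse.fetch("mouse_id", "dob", order_by="mouse_id", limit=10)
```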

local-docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ services:
       interval: 1s
   fakeservices.datajoint.io:
     <<: *net
-    image: datajoint/nginx:v0.2.3
+    image: datajoint/nginx:v0.2.4
     environment:
       - ADD_db_TYPE=DATABASE
       - ADD_db_ENDPOINT=db:3306

tests/test_university.py

Lines changed: 2 additions & 4 deletions
@@ -33,11 +33,9 @@ def test_activate():
         Enroll,
         Grade,
     ):
-        import csv
+        from pathlib import Path
 
-        with open("./data/" + table.__name__ + ".csv") as f:
-            reader = csv.DictReader(f)
-            table().insert(reader)
+        table().insert(Path("./data/" + table.__name__ + ".csv"))
 
 
 def test_fill():
