Commit 3ec0b07

committed
Update recall page
1 parent 508b250 commit 3ec0b07

1 file changed

+133
-59
lines changed

docs/src/design/recall.md

Lines changed: 133 additions & 59 deletions
@@ -1,88 +1,154 @@
11
# Work with Existing Pipelines
22

3-
This section describes how to work with database schemas without access to the original
4-
code that generated the schema. These situations often arise when the database is
5-
created by another user who has not shared the generating code yet or when the database
6-
schema is created from a programming language other than Python.
7-
83
## Loading Classes
94

10-
Typically, a DataJoint schema is created as a dedicated Python module. This module
11-
defines a schema object that is used to link classes declared in the module to tables
12-
in the database schema. With the module installed, you can simply import it to interact
13-
with its tables:
5+
This section describes how to work with database schemas without access to the
6+
original code that generated the schema. These situations often arise when the
7+
database is created by another user who has not shared the generating code yet
8+
or when the database schema is created from a programming language other than
9+
Python.
1410

1511
```python
1612
import datajoint as dj
17-
from element_calcium_imaging import scan # This and other [DataJoint Elements](https://datajoint.com/docs/elements/) are installable via `pip` or downloadable via their respective GitHub repositories.
1813
```
1914

20-
To visualize an unfamiliar schema, see commands for generating [diagrams](../../getting-started/#diagram).
15+
### Working with schemas and their modules
16+
17+
Typically a DataJoint schema is created as a dedicated Python module. This
18+
module defines a schema object that is used to link classes declared in the
19+
module to tables in the database schema. As an example, examine the university
20+
module: [university.py](https://github.com/datajoint-company/db-programming-with-datajoint/blob/master/notebooks/university.py).
21+
22+
You may then import the module to interact with its tables:
23+
24+
```python
25+
import university as uni
26+
dj.Diagram(uni)
27+
```
28+
29+
![query object preview](../images/virtual-module-ERD.svg){: style="align:center"}
30+
31+
Note that `dj.Diagram` can extract the diagram from a schema object or from a
32+
Python module containing its schema object, lending further support to the
33+
convention of one-to-one correspondence between database schemas and Python
34+
modules in a DataJoint project:
2135

22-
## Spawning Missing Classes
36+
`dj.Diagram(uni)`
2337

24-
Now, imagine we do not have access to the
25-
[Python definition of Scan](https://github.com/datajoint/element-calcium-imaging/blob/main/element_calcium_imaging/scan.py),
26-
or we're unsure if the version on our server matches the definition available. We can
27-
use the `dj.list_schemas` function to list the available database schemas.
38+
is equivalent to
39+
40+
`dj.Diagram(uni.schema)`
41+
42+
```python
43+
# students without majors
44+
uni.Student - uni.StudentMajor
45+
```
46+
47+
![query object preview](../images/StudentTable.png){: style="align:center"}
48+
49+
### Spawning missing classes
50+
51+
Now imagine that you do not have access to `university.py` or you do not have
52+
its latest version. You can still connect to the database schema but you will
53+
not have classes declared to interact with it.
54+
55+
So let's start over in this scenario.
56+
57+
You may use the `dj.list_schemas` function (new in DataJoint 0.12.0) to
58+
list the names of database schemas available to you.
2859

2960
```python
3061
import datajoint as dj
31-
dj.conn() # Establish a connection to the server.
32-
dj.list_schemas() # List the available schemas on the server.
33-
dj.Schema('schema_name').list_tables() # List the tables for a given schema from the previous step. These will appear in their raw database form, with underscores instead of camelcase and special characters for Part tables.
62+
dj.list_schemas()
63+
```
64+
65+
```text
66+
['dimitri_alter','dimitri_attach','dimitri_blob','dimitri_blobs',
67+
'dimitri_nphoton','dimitri_schema','dimitri_university','dimitri_uuid',
68+
'university']
69+
```
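
On a shared server this list can be long. The following is a minimal sketch for narrowing it down, assuming schema names follow a `<user>_<name>` prefix convention like the one visible in the output above. The helper `schemas_with_prefix` is hypothetical and not part of DataJoint; it only operates on the plain list of names that `dj.list_schemas()` returns.

```python
def schemas_with_prefix(schema_names, prefix):
    """Return the schema names that start with `prefix`, sorted alphabetically."""
    return sorted(name for name in schema_names if name.startswith(prefix))


# Names copied from the example output above; in practice, pass dj.list_schemas().
available = [
    "dimitri_alter", "dimitri_attach", "dimitri_blob", "dimitri_blobs",
    "dimitri_nphoton", "dimitri_schema", "dimitri_university", "dimitri_uuid",
    "university",
]
mine = schemas_with_prefix(available, "dimitri_")
print(mine)
```
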
70+
71+
Just as with a new schema, we start by creating a schema object to connect to
72+
the chosen database schema:
73+
74+
```python
75+
schema = dj.Schema('dimitri_university')
76+
```
77+
78+
If the schema already exists, `dj.Schema` is initialized as usual and you may plot
79+
the schema diagram. But instead of seeing class names, you will see the raw
80+
table names as they appear in the database.
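
To relate raw table names back to class names, a rough sketch follows. It assumes the usual DataJoint naming convention, in which a tier prefix (`#` for lookup, `_` for imported, `__` for computed) precedes a snake_case name and a part table is stored as `master__part`. The helper `guess_class_name` is hypothetical and only approximates what `spawn_missing_classes` does internally.

```python
def guess_class_name(table_name):
    """Guess the class name that corresponds to a raw database table name."""
    # Drop the assumed tier prefix: '#' (lookup), '_' (imported),
    # '__' (computed), '~' (hidden system tables).
    bare = table_name.lstrip("#_~")
    # A double underscore inside the name separates a master from its part table.
    pieces = bare.split("__")
    # Convert each snake_case piece to CamelCase.
    camel = [
        "".join(word.capitalize() for word in piece.split("_") if word)
        for piece in pieces
    ]
    return ".".join(camel)


for raw in ["student_major", "#department", "_scan__field"]:
    print(raw, "->", guess_class_name(raw))
```
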
81+
82+
```python
83+
# let's plot its diagram
84+
dj.Diagram(schema)
3485
```
3586

36-
Just as with a new schema, we can create a schema object to connect to the chosen
37-
database schema. If the schema already exists, `dj.Schema` is initialized as usual.
87+
![query object preview](../images/dimitri-ERD.svg){: style="align:center"}
88+
89+
You may view the diagram but, at this point, there is no way to interact with
90+
these tables. A similar situation arises when another developer has added new
91+
tables to the schema but has not yet shared the updated module code with you.
92+
Then the diagram will show a mixture of class names and database table names.
3893

39-
If a diagram shows a mixture of class names and database table names, the
40-
`spawn_missing_classes` method will spawn classes into the local namespace for any
41-
tables missing their classes. This will allow us to interact with all tables as if
42-
they were declared in the current namespace.
94+
Now you may use the `spawn_missing_classes` method to spawn classes into
95+
the local namespace for any tables missing their classes:
4396

4497
```python
4598
schema.spawn_missing_classes()
99+
dj.Diagram(schema)
46100
```
47101

48-
## Virtual Modules
102+
![query object preview](../images/spawned-classes-ERD.svg){: style="align:center"}
49103

50-
While `spawn_missing_classes` creates the new classes in the local namespace, it is
51-
often more convenient to import a schema with its Python module, equivalent to the
52-
Python command. We can mimic this import without having access to the schema using
53-
the `VirtualModule` class object:
104+
Now you may interact with these tables as if they were declared right here in
105+
this namespace:
54106

55107
```python
56-
import datajoint as dj
57-
subject = dj.VirtualModule(module_name='subject', schema_name='db_subject')
108+
# students without majors
109+
Student - StudentMajor
58110
```
59111

60-
Now, `subject` behaves as an imported module complete with the schema object and all the
61-
table classes.
112+
![query object preview](../images/StudentTable.png){: style="align:center"}
62113

63-
The class object `VirtualModule` of the `dj.Schema` class provides access to virtual
64-
modules. It creates a Python module with the given name from the name of a schema on
65-
the server and automatically adds classes to it corresponding to the tables in the
66-
schema.
114+
### Creating a virtual module
67115

68-
The function can take several parameters:
116+
Now `spawn_missing_classes` creates the new classes in the local namespace.
117+
However, it is often more convenient to import a schema with its Python module,
118+
equivalent to the Python command
69119

70-
- `module_name`: displayed module name.
120+
```python
121+
import university as uni
122+
```
71123

72-
- `schema_name`: name of the database in MySQL.
124+
We can mimic this import without having access to `university.py` using the
125+
`VirtualModule` class object:
73126

74-
- `create_schema`: if `True`, create the schema on the database server if it does not
75-
already exist; if `False` (default), raise an error when the schema is not found.
127+
```python
128+
import datajoint as dj
76129

77-
- `create_tables`: if `True`, `module.schema` can be used as the decorator for declaring
78-
new classes; if `False`, such use will raise an error stating that the module is
79-
intended only to work with existing tables.
130+
uni = dj.VirtualModule(module_name='university.py', schema_name='dimitri_university')
131+
```
80132

81-
The function returns the Python module containing classes from the schema object with
82-
all the table classes already declared inside it.
133+
Now `uni` behaves as an imported module complete with the schema object and all
134+
the table classes.
83135

84-
`create_schema=False` may be useful if we want to make sure that the schema already
85-
exists. If none exists, `create_schema=True` will create an empty schema.
136+
```python
137+
dj.Diagram(uni)
138+
```
139+
140+
![query object preview](../images/added-example-ERD.svg){: style="align:center"}
141+
142+
```python
143+
uni.Student - uni.StudentMajor
144+
```
145+
146+
![query object preview](../images/StudentTable.png){: style="align:center"}
147+
148+
`dj.VirtualModule` takes optional arguments.
149+
150+
First, `create_schema=False` ensures that an error is raised when the schema
151+
does not already exist. Set it to `True` if you want to create an empty schema.
86152

87153
```python
88154
dj.VirtualModule('what', 'nonexistent')
@@ -91,19 +157,25 @@ dj.VirtualModule('what', 'nonexistent')
91157
Returns
92158

93159
```python
160+
---------------------------------------------------------------------------
161+
DataJointError Traceback (most recent call last)
162+
.
163+
.
164+
.
94165
DataJointError: Database named `nonexistent` was not defined. Set argument create_schema=True to create it.
95166
```
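
If a missing schema should not abort a script, the error can be caught. The sketch below assumes only that `dj.VirtualModule` raises `dj.DataJointError` for a missing schema, as in the traceback above. The helper name `open_existing_schema` and its `dj_module` parameter (which allows a stand-in to be passed for testing) are hypothetical.

```python
def open_existing_schema(schema_name, dj_module=None):
    """Return a virtual module for `schema_name`, or None if it cannot be opened."""
    if dj_module is None:
        # Imported lazily so the helper can be defined without a database at hand.
        import datajoint as dj_module
    try:
        # create_schema defaults to False, so a missing schema raises an error.
        return dj_module.VirtualModule(schema_name, schema_name)
    except dj_module.DataJointError as err:
        # Note: other DataJoint errors are reported the same way in this sketch.
        print(f"Could not open schema {schema_name!r}: {err}")
        return None
```

Returning `None` keeps the example short; re-raising with a clearer message may suit production code better.
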
96167

97-
`create_tables=False` prevents the use of the schema object of the virtual module for
98-
creating new tables in the existing schema. This is a precautionary measure since
99-
virtual modules are often used for completed schemas. `create_tables=True` will add new
100-
tables to the existing schema. A more common approach in this scenario would be to
101-
create a new schema object and to use the `spawn_missing_classes` function to make the
102-
classes available.
168+
The other optional argument, `create_tables=False`, is passed to the schema
169+
object. It prevents the use of the schema object of the virtual module for
170+
creating new tables in the existing schema. This is a precautionary measure
171+
since virtual modules are often used for completed schemas. You may set this
172+
argument to `True` if you wish to add new tables to the existing schema. A
173+
more common approach in this scenario would be to create a new schema object and
174+
to use the `spawn_missing_classes` function to make the classes available.
103175

104-
However, if you do decide to create new tables in an existing schema using the virtual
105-
module, you may do so by using the schema object from the module as the decorator for
106-
declaring new tables:
176+
However, if you do decide to create new tables in an existing schema using the
177+
virtual module, you may do so by using the schema object from the module as the
178+
decorator for declaring new tables:
107179

108180
```python
109181
uni = dj.VirtualModule('university.py', 'dimitri_university', create_tables=True)
@@ -122,3 +194,5 @@ class Example(dj.Manual):
122194
```python
123195
dj.Diagram(uni)
124196
```
197+
198+
![query object preview](../images/added-example-ERD.svg){: style="align:center"}
