|
84 | 84 | ">* Nodes in this graph are represented as database **tables**. Examples of such tables include \"Subject\", \"Session\", \"Implantation\", \"Experimenter\", \"Equipment\", but also \"OptoWaveform\", \"OptoStimParams\", or \"Neuronal spikes\". \n", |
85 | 85 | "\n", |
86 | 86 | ">* The data pipeline is formed by making these tables interdependent (as the nodes are connected in a network). A **dependency** is a situation where a step of the data pipeline is dependent on a result from a sequentially previous step before it can complete its execution. A dependency graph forms an entire cohesive data pipeline. \n", |
87 | | - "While this is an accurate description, it may not be the most intuitive definition. Put succinctly, a data pipeline is a listing or a \"map\" of various \"things\" that you work with in a project, with line connecting things to each other to indicate their dependencies. The \"things\" in a data pipeline tends to be the *nouns* you find when describing a project. The \"things\" may include anything from mouse, experimenter, equipment, to experiment session, trial, two-photon scans, electric activities, to receptive fields, neuronal spikes, to figures for a publication! A data pipeline gives you a framework to:\n", |
88 | 87 | "\n", |
89 | 88 | "A [DataJoint pipeline](https://datajoint.com/docs/core/datajoint-python/0.14/concepts/terminology/) contains database table definitions, dependencies, and associated computations, together with the transformations underlying a DataJoint workflow. \n", |
90 | 89 | "\n", |
|
102 | 101 | "cell_type": "markdown", |
103 | 102 | "metadata": {}, |
104 | 103 | "source": [ |
105 | | - "#### Practical examples" |
| 104 | + "##### Practical examples" |
106 | 105 | ] |
107 | 106 | }, |
108 | 107 | { |
|
172 | 171 | "cell_type": "markdown", |
173 | 172 | "metadata": {}, |
174 | 173 | "source": [ |
175 | | - "Just by going through the description, we can start to identify **entities** that need to be stored and represented in our data pipeline:\n", |
 | 174 | + "Just by going through the description, we can start to identify **entities** that need to be stored and represented in our data pipeline:\n", |
176 | 175 | "\n", |
177 | 176 | ">* Mouse\n", |
178 | 177 | ">* Experimental session\n", |
|
207 | 206 | "cell_type": "markdown", |
208 | 207 | "metadata": {}, |
209 | 208 | "source": [ |
210 | | - "#### Concepts" |
| 209 | + "##### Concepts" |
211 | 210 | ] |
212 | 211 | }, |
213 | 212 | { |
|
222 | 221 | "\n", |
223 | 222 | "In this case, the information that uniquely identifies the `Mouse` table is their **mouse IDs** - a unique ID number assigned to each animal in the lab. This attribute is named the **primary key** of the table.\n", |
224 | 223 | "\n", |
225 | | - "| mouse_id* (*Primary key attribute*)|\n", |
| 224 | + "| Mouse_ID (*Primary key attribute*)|\n", |
226 | 225 | "|:--------: | \n", |
227 | 226 | "| 11234 |\n", |
228 | 227 | "| 11432 |" |
|
234 | 233 | "source": [ |
235 | 234 | "After some thought, we might conclude that each mouse can be uniquely identified by knowing its **mouse ID** - a unique ID number assigned to each mouse in the lab. \n", |
236 | 235 | "\n", |
237 | | - "The `mouse_id` is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. \n", |
| 236 | + "The mouse ID is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. \n", |
238 | 237 | "\n", |
239 | 238 | "Such an attribute is called the **primary key** of the table: the subset of table attributes uniquely identifying each entity in the table. The **secondary attribute** refers to any field in a table, not in the primary key.\n", |
240 | 239 | "\n", |
241 | | - "| mouse_id* (*Primary key attribute*) \n", |
| 240 | + "| Mouse_ID (*Primary key attribute*) \n", |
242 | 241 | "|:--------:| \n", |
 243 | 242 | "| 11234 |\n", |
 244 | 243 | "| 11432 |" |
|
248 | 247 | "cell_type": "markdown", |
249 | 248 | "metadata": {}, |
250 | 249 | "source": [ |
251 | | - "Once we have successfully identified the table's primary key, we can now think about what other columns, or **non-primary key attributes** - additional information **about each entry in the table that needs to be stored as well**." |
 | 250 | + "Once we have successfully identified the table's primary key, we can now think about the other columns, or **non-primary key attributes**: additional information **about each entry in the table that needs to be stored as well**." |
252 | 251 | ] |
253 | 252 | }, |
254 | 253 | { |
|
264 | 263 | "cell_type": "markdown", |
265 | 264 | "metadata": {}, |
266 | 265 | "source": [ |
267 | | - "| mouse_id* | dob | sex |\n", |
| 266 | + "| Mouse_ID | DOB | sex |\n", |
268 | 267 | "|:--------:|------------|--------|\n", |
269 | 268 | "| 11234 | 2017-11-17 | M |\n", |
270 | 269 | "| 11432 | 2018-03-04 | F |" |
|
281 | 280 | "cell_type": "markdown", |
282 | 281 | "metadata": {}, |
283 | 282 | "source": [ |
284 | | - "#### Practical example" |
| 283 | + "##### Practical example" |
285 | 284 | ] |
286 | 285 | }, |
287 | 286 | { |
288 | 287 | "cell_type": "markdown", |
289 | 288 | "metadata": {}, |
290 | 289 | "source": [ |
291 | | - "#### Schema" |
| 290 | + "##### Schema" |
292 | 291 | ] |
293 | 292 | }, |
294 | 293 | { |
|
326 | 325 | "cell_type": "markdown", |
327 | 326 | "metadata": {}, |
328 | 327 | "source": [ |
329 | | - "#### Table" |
| 328 | + "##### Table" |
330 | 329 | ] |
331 | 330 | }, |
332 | 331 | { |
333 | 332 | "cell_type": "markdown", |
334 | 333 | "metadata": {}, |
335 | 334 | "source": [ |
336 | | - "In DataJoint, you define each table as a `class`, and provide the table definition (e.g. attribute definitions) as the `definition` static string property. The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)." |
| 335 | + "In DataJoint, you define each table as a `class`, and provide the table definition (e.g., attribute definitions) as the `definition` static string property. The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)." |
337 | 336 | ] |
338 | 337 | }, |
339 | 338 | { |
|
380 | 379 | "cell_type": "markdown", |
381 | 380 | "metadata": {}, |
382 | 381 | "source": [ |
383 | | - "#### Insert operators" |
| 382 | + "##### Insert operators" |
384 | 383 | ] |
385 | 384 | }, |
386 | 385 | { |
|
520 | 519 | "cell_type": "markdown", |
521 | 520 | "metadata": {}, |
522 | 521 | "source": [ |
523 | | - "#### Data integrity" |
| 522 | + "##### Data integrity" |
524 | 523 | ] |
525 | 524 | }, |
526 | 525 | { |
|
608 | 607 | "cell_type": "markdown", |
609 | 608 | "metadata": {}, |
610 | 609 | "source": [ |
611 | | - "As with `mouse`, we should consider **what information (i.e. attributes) is needed to identify an `experimental session`** uniquely. Here is the relevant section of the project description:\n", |
| 610 | + "As with `mouse`, we should consider **what information (i.e., attributes) is needed to identify an `experimental session`** uniquely. Here is the relevant section of the project description:\n", |
612 | 611 | "\n", |
613 | 612 | "> * As a hard-working neuroscientist, you perform experiments daily, sometimes working with **more than one mouse in a day**. However, on any given day, **a mouse undergoes at most one recording session**.\n", |
614 | 613 | "> * For each **experimental session**, you want to record **what mouse you worked with** and **when you performed the experiment**. You also want to keep track of other helpful information, such as the **experimental setup** you worked on. " |
|
618 | 617 | "cell_type": "markdown", |
619 | 618 | "metadata": {}, |
620 | 619 | "source": [ |
621 | | - "Based on the above, you need to know the following information to uniquely identify a single experimental session:\n", |
 | 620 | + "Based on the above, you need to know the following two pieces of information to uniquely identify a single experimental session:\n", |
622 | 621 | "\n", |
623 | 622 | "* the date of the session\n", |
624 | 623 | "* the mouse you recorded from in that session" |
|
628 | 627 | "cell_type": "markdown", |
629 | 628 | "metadata": {}, |
630 | 629 | "source": [ |
631 | | - "Note that to uniquely identify an experimental session (or simply a **session**), we need to know the mouse used in that session. In other words, a session cannot exist without a corresponding mouse! \n", |
 | 630 | + "Note that, to uniquely identify an experimental session (or simply a `Session`), we need to know the mouse used in that session. In other words, a session cannot exist without a corresponding mouse! \n", |
632 | 631 | "\n", |
633 | 632 | "With **mouse** already represented as a table in our pipeline, we say that the session **depends on** the mouse! We could graphically represent this in an **entity relationship diagram (ERD)** by drawing the line between two tables, with the one below (**session**) depending on the one above (**mouse**)." |
634 | 633 | ] |
|
639 | 638 | "source": [ |
640 | 639 | "Thus, we will need both the **mouse** and the new attribute **session_date** to identify a single `session` uniquely. \n", |
641 | 640 | "\n", |
642 | | - "Remember that a **mouse** is already uniquely identified by its primary key - **mouse_id**. In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, along side any additional attribute(s) you specify." |
| 641 | + "Remember that a **mouse** is uniquely identified by its primary key - **mouse_id**. In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, alongside any additional attribute(s) you specify." |
643 | 642 | ] |
644 | 643 | }, |
645 | 644 | { |
|
828 | 827 | "cell_type": "markdown", |
829 | 828 | "metadata": {}, |
830 | 829 | "source": [ |
831 | | - "We will introduce the major types of queries used in DataJoint:\n", |
832 | | - "1. Restriction (`&`) and negative restriction (`-`): filter the data with certain conditions\n", |
833 | | - "2. Join (`*`): bring fields from different tables together\n", |
834 | | - "3. Projection (`.proj()`): focus on a subset of attributes\n", |
 | 830 | + "We will introduce the major types of queries and data operations used in DataJoint:\n", |
 | 831 | + "1. Restriction (`&`) and negative restriction (`-`): filter the data with certain conditions\n", |
 | 832 | + "2. Join (`*`): bring fields from different tables together\n", |
 | 833 | + "3. Projection (`.proj()`): focus on a subset of attributes\n", |
 | 834 | + "4. Fetch (`.fetch()`): pull the data from the database\n", |
 | 835 | + "5. Deletion (`.delete()`): delete entries and their dependencies\n", |
 | 836 | + "6. Drop (`.drop()`): drop the table from the schema" |
835 | 837 | ] |
836 | 838 | }, |
837 | 839 | { |
|
852 | 854 | "cell_type": "markdown", |
853 | 855 | "metadata": {}, |
854 | 856 | "source": [ |
855 | | - "#### Exact match" |
| 857 | + "##### Exact match" |
856 | 858 | ] |
857 | 859 | }, |
858 | 860 | { |
|
1017 | 1019 | "cell_type": "markdown", |
1018 | 1020 | "metadata": {}, |
1019 | 1021 | "source": [ |
1020 | | - "The result of one query can be used in another query! Let's first find **all the female mice** and store the result:" |
 | 1022 | + "The result of one query can be used in another query! Let's first find **all the female mice** and store the result:" |
1021 | 1023 | ] |
1022 | 1024 | }, |
1023 | 1025 | { |
|
1348 | 1350 | "cell_type": "markdown", |
1349 | 1351 | "metadata": {}, |
1350 | 1352 | "source": [ |
1351 | | - "Fetch it:" |
 | 1353 | + "Fetch it:" |
1352 | 1354 | ] |
1353 | 1355 | }, |
1354 | 1356 | { |
|
1609 | 1611 | "cell_type": "markdown", |
1610 | 1612 | "metadata": {}, |
1611 | 1613 | "source": [ |
1612 | | - "Note that the `.delete()` method not only deletes the entries in the table where it is called, but also all the corresponding entries in subsequent (downstream) tables!" |
 | 1614 | + "Note that the `.delete()` method not only deletes the entries in the table where it is called, but also all the corresponding entries in subsequent (downstream) tables!" |
1613 | 1615 | ] |
1614 | 1616 | }, |
1615 | 1617 | { |
|