
Commit 786b66a

committed
fix updated tutorials including changes PR#1 Kushal
1 parent 85dd9fc commit 786b66a

File tree

3 files changed: +47 additions, -43 deletions


README.md

Lines changed: 7 additions & 4 deletions
@@ -1,21 +1,24 @@
 # Welcome to DataJoint tutorials!
 
-DataJoint is an open-source library for science labs to design and build data pipelines for automated data analysis and sharing.
+DataJoint is an open-source library for scientific research labs to design and build
+data pipelines for automated data analysis and sharing.
 
-This document will guide you as a new DataJoint user through interactive tutorials organized in [Jupyter notebooks](https://jupyter-notebook.readthedocs.io/en/stable/) and written in [Python](https://www.python.org/).
+This document will guide you through interactive tutorials written in
+[Python](https://www.python.org/) and organized in [Jupyter
+notebooks](https://jupyter-notebook.readthedocs.io/en/stable/).
 
 *Please note that these hands-on DataJoint tutorials are friendly to non-expert users, and advanced programming skills are not required.*
 
 
 ## Table of contents
-- In the [tutorials](./tutorials) folder are interactive Jupyter notebooks to learn DataJoint. The calcium imaging and electrophysiology tutorials provide examples of defining and interacting with data pipelines. In addition, some fill-in-the-blank sections are included for you to code yourself!
+- The [tutorials](./tutorials) folder contains interactive Jupyter notebooks designed to teach DataJoint. The calcium imaging and electrophysiology tutorials provide examples of defining and interacting with data pipelines. In addition, some fill-in-the-blank sections are included for you to code yourself!
   - 01-DataJoint Basics
   - 02-Calcium Imaging Imported Tables
   - 03-Calcium Imaging Computed Tables
   - 04-Electrophysiology Imported Tables
   - 05-Electrophysiology Computed Tables
 
-- In the [completed_tutorials](./completed_tutorials) folder are Jupyter notebooks with the code sections completed and solved.
+- The [completed_tutorials](./completed_tutorials) folder contains Jupyter notebooks with all code sections completed and solved.
 
 - You will find the following notebooks in the [short_tutorials](./short_tutorials) folder:
   - DataJoint in 30min

tutorials/01-DataJoint Basics.ipynb

Lines changed: 29 additions & 27 deletions
@@ -84,7 +84,6 @@
 ">* Nodes in this graph are represented as database **tables**. Examples of such tables include \"Subject\", \"Session\", \"Implantation\", \"Experimenter\", \"Equipment\", but also \"OptoWaveform\", \"OptoStimParams\", or \"Neuronal spikes\". \n",
 "\n",
 ">* The data pipeline is formed by making these tables interdependent (as the nodes are connected in a network). A **dependency** is a situation where a step of the data pipeline is dependent on a result from a sequentially previous step before it can complete its execution. A dependency graph forms an entire cohesive data pipeline. \n",
-"While this is an accurate description, it may not be the most intuitive definition. Put succinctly, a data pipeline is a listing or a \"map\" of various \"things\" that you work with in a project, with line connecting things to each other to indicate their dependencies. The \"things\" in a data pipeline tends to be the *nouns* you find when describing a project. The \"things\" may include anything from mouse, experimenter, equipment, to experiment session, trial, two-photon scans, electric activities, to receptive fields, neuronal spikes, to figures for a publication! A data pipeline gives you a framework to:\n",
 "\n",
 "A [DataJoint pipeline](https://datajoint.com/docs/core/datajoint-python/0.14/concepts/terminology/) contains database table definitions, dependencies, and associated computations, together with the transformations underlying a DataJoint workflow. \n",
 "\n",
@@ -102,7 +101,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Practical examples"
+"##### Practical examples"
 ]
 },
 {
@@ -172,7 +171,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Just by going through the description, we can start to identify **entities** that need to be stored and represented in our data pipeline:\n",
+"Just by going through the description, we can start to identify **entities** that needs to be stored and represented in our data pipeline:\n",
 "\n",
 ">* Mouse\n",
 ">* Experimental session\n",
@@ -207,7 +206,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Concepts"
+"##### Concepts"
 ]
 },
 {
@@ -222,7 +221,7 @@
 "\n",
 "In this case, the information that uniquely identifies the `Mouse` table is their **mouse IDs** - a unique ID number assigned to each animal in the lab. This attribute is named the **primary key** of the table.\n",
 "\n",
-"| mouse_id* (*Primary key attribute*)|\n",
+"| Mouse_ID (*Primary key attribute*)|\n",
 "|:--------: | \n",
 "| 11234 |\n",
 "| 11432 |"
@@ -234,11 +233,11 @@
 "source": [
 "After some thought, we might conclude that each mouse can be uniquely identified by knowing its **mouse ID** - a unique ID number assigned to each mouse in the lab. \n",
 "\n",
-"The `mouse_id` is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. \n",
+"The mouse ID is then a column in the table or an **attribute** that can be used to **uniquely identify** each mouse. \n",
 "\n",
 "Such an attribute is called the **primary key** of the table: the subset of table attributes uniquely identifying each entity in the table. The **secondary attribute** refers to any field in a table, not in the primary key.\n",
 "\n",
-"| mouse_id* (*Primary key attribute*) \n",
+"| Mouse_ID (*Primary key attribute*) \n",
 "|:--------:| \n",
 "| 11234 (*Secondary attribute*)\n",
 "| 11432 (*Secondary attribute*)"
@@ -248,7 +247,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Once we have successfully identified the table's primary key, we can now think about what other columns, or **non-primary key attributes** - additional information **about each entry in the table that needs to be stored as well**."
+"Once we have successfully identified the table's primary key, we can now think about what other columns, or **non-primary key attributes** - additional information **about each entry in the table that need to be stored as well**."
 ]
 },
 {
@@ -264,7 +263,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"| mouse_id* | dob | sex |\n",
+"| Mouse_ID | DOB | sex |\n",
 "|:--------:|------------|--------|\n",
 "| 11234 | 2017-11-17 | M |\n",
 "| 11432 | 2018-03-04 | F |"
@@ -281,14 +280,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Practical example"
+"##### Practical example"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Schema"
+"##### Schema"
 ]
 },
 {
@@ -326,14 +325,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Table"
+"##### Table"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In DataJoint, you define each table as a `class`, and provide the table definition (e.g. attribute definitions) as the `definition` static string property. The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)."
+"In DataJoint, you define each table as a `class`, and provide the table definition (e.g., attribute definitions) as the `definition` static string property. The class will inherit from the `dj.Manual` class provided by DataJoint (more on this later)."
 ]
 },
 {
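For readers skimming the diff, a minimal sketch of the table-definition pattern this cell describes may help. It uses the `Mouse` attributes that appear throughout the notebook (`mouse_id`, `dob`, `sex`); the schema name and exact types are illustrative assumptions, not a copy of the notebook's own cell.

```python
import datajoint as dj

schema = dj.schema('tutorial')   # assumption: schema name used only for this sketch

@schema
class Mouse(dj.Manual):
    definition = """
    # mouse metadata
    mouse_id: int                   # primary key: unique ID assigned to each mouse
    ---
    dob: date                       # secondary attribute: date of birth
    sex: enum('M', 'F', 'unknown')  # secondary attribute: sex of the mouse
    """
```

Everything above the `---` separator forms the primary key; attributes below it are secondary attributes.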
@@ -380,7 +379,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Insert operators"
+"##### Insert operators"
 ]
 },
 {
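The insert cells that follow this heading use `insert1` for a single entry and `insert` for a batch. A hedged sketch, with made-up values:

```python
# insert a single entry as a dict (a tuple in attribute order also works)
Mouse.insert1(dict(mouse_id=11234, dob='2017-11-17', sex='M'))

# insert several entries at once
Mouse.insert([
    dict(mouse_id=11432, dob='2018-03-04', sex='F'),
    dict(mouse_id=11500, dob='2018-07-22', sex='unknown'),  # made-up example entry
])
```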
@@ -520,7 +519,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Data integrity"
+"##### Data integrity"
 ]
 },
 {
@@ -608,7 +607,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"As with `mouse`, we should consider **what information (i.e. attributes) is needed to identify an `experimental session`** uniquely. Here is the relevant section of the project description:\n",
+"As with `mouse`, we should consider **what information (i.e., attributes) is needed to identify an `experimental session`** uniquely. Here is the relevant section of the project description:\n",
 "\n",
 "> * As a hard-working neuroscientist, you perform experiments daily, sometimes working with **more than one mouse in a day**. However, on any given day, **a mouse undergoes at most one recording session**.\n",
 "> * For each **experimental session**, you want to record **what mouse you worked with** and **when you performed the experiment**. You also want to keep track of other helpful information, such as the **experimental setup** you worked on. "
@@ -618,7 +617,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Based on the above, you need to know the following information to uniquely identify a single experimental session:\n",
+"Based on the above, it seems that you need to know these two data to uniquely identify a single experimental session:\n",
 "\n",
 "* the date of the session\n",
 "* the mouse you recorded from in that session"
@@ -628,7 +627,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Note that to uniquely identify an experimental session (or simply a **session**), we need to know the mouse used in that session. In other words, a session cannot exist without a corresponding mouse! \n",
+"Note that, to uniquely identify an experimental session (or simply a `Session`), we need to know the mouse that the session was about. In other words, a session cannot existing without a corresponding mouse! \n",
 "\n",
 "With **mouse** already represented as a table in our pipeline, we say that the session **depends on** the mouse! We could graphically represent this in an **entity relationship diagram (ERD)** by drawing the line between two tables, with the one below (**session**) depending on the one above (**mouse**)."
@@ -639,7 +638,7 @@
 "source": [
 "Thus, we will need both the **mouse** and the new attribute **session_date** to identify a single `session` uniquely. \n",
 "\n",
-"Remember that a **mouse** is already uniquely identified by its primary key - **mouse_id**. In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, along side any additional attribute(s) you specify."
+"Remember that a **mouse** is uniquely identified by its primary key - **mouse_id**. In DataJoint, you can declare that **session** depends on the mouse, and DataJoint will automatically include the mouse's primary key (`mouse_id`) as part of the session's primary key, alongside any additional attribute(s) you specify."
 ]
 },
 {
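A hedged sketch of the dependency just described: the `-> Mouse` line in the definition pulls `mouse_id` into `Session`'s primary key alongside `session_date`. The secondary attributes shown are illustrative.

```python
@schema
class Session(dj.Manual):
    definition = """
    # experimental session
    -> Mouse                     # foreign key: inherits mouse_id into the primary key
    session_date: date           # date of the session
    ---
    experiment_setup: int        # illustrative secondary attribute
    experimenter: varchar(100)   # illustrative secondary attribute
    """
```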
@@ -828,10 +827,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We will introduce the major types of queries used in DataJoint:\n",
-"1. Restriction (`&`) and negative restriction (`-`): filter the data with certain conditions\n",
-"2. Join (`*`): bring fields from different tables together\n",
-"3. Projection (`.proj()`): focus on a subset of attributes\n",
+"We will introduce significant types of queries used in DataJoint:\n",
+"* 1. Restriction (`&`) and negative restriction (`-`): filter the data with certain conditions\n",
+"* 2. Join (`*`): bring fields from different tables together\n",
+"* 3. Projection (`.proj()`): focus on a subset of attributes\n",
+"* 4. Fetch (`.fetch()`): pull the data from the database\n",
+"* 5. Deletion (`.delete()`): delete entries and their dependencies\n",
+"* 6. Drop (`.drop()`): drop the table from the schema"
 ]
 },
 {
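For orientation, a brief sketch of each query operator listed in the cell above, using the `Mouse` and `Session` tables from this notebook; the restriction strings are illustrative.

```python
# 1. Restriction and negative restriction: filter rows by a condition
male_mice = Mouse & 'sex = "M"'
not_male = Mouse - 'sex = "M"'

# 2. Join: combine attributes of matching entries across tables
mouse_sessions = Mouse * Session

# 3. Projection: keep only a subset of attributes (the primary key is always kept)
dob_only = Mouse.proj('dob')

# 4. Fetch: pull query results into Python
records = male_mice.fetch()                    # structured array
records_as_dicts = male_mice.fetch(as_dict=True)

# 5. Deletion: remove entries and their dependent downstream entries
# (Mouse & 'mouse_id = 11234').delete()        # commented out: destructive

# 6. Drop: remove the table definition and all of its contents
# Mouse.drop()                                 # commented out: destructive
```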
@@ -852,7 +854,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Exact match"
+"##### Exact match"
 ]
 },
 {
@@ -1017,7 +1019,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The result of one query can be used in another query! Let's first find **all the female mice** and store the result:"
+"The result of one query can be used in another query! Let's first find `all the female mice` and `store the result`:"
 ]
 },
 {
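A hedged illustration of reusing a query result, along the lines of the cell above:

```python
# store the restriction for all female mice...
female_mice = Mouse & 'sex = "F"'

# ...and reuse it to restrict another table: sessions recorded from female mice
female_sessions = Session & female_mice
```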
@@ -1348,7 +1350,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Fetch it:"
+"Fetch it!:"
 ]
 },
 {
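The fetch cell referenced here typically looks something like the following sketch (attribute names are taken from the `Mouse` table; the exact call in the notebook may differ):

```python
# fetch all matching entries as a list of dictionaries
mice = (Mouse & 'sex = "F"').fetch(as_dict=True)

# fetch selected attributes as separate arrays
mouse_ids, dobs = Mouse.fetch('mouse_id', 'dob')

# fetch exactly one entry (raises an error unless exactly one row matches)
one_mouse = (Mouse & 'mouse_id = 11234').fetch1()
```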
@@ -1609,7 +1611,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Note that the `.delete()` method not only deletes the entries in the table where it is called, but also all the corresponding entries in subsequent (downstream) tables!"
+"Note that the `.delete()` method not only delete the entries in a table, but also all the corresponding entries in subsequent (downstream) tables!"
 ]
 },
 {
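A hedged example of the cascading behaviour described in this cell, left commented out because it permanently removes data:

```python
# Deleting upstream entries cascades: removing this mouse also removes its
# Session entries and anything that depends on them. By default DataJoint
# asks for confirmation before committing the delete.
# (Mouse & 'mouse_id = 11234').delete()
```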

tutorials/02-Calcium Imaging Imported Tables.ipynb

Lines changed: 11 additions & 12 deletions
@@ -19,7 +19,7 @@
 "During this session you will learn:\n",
 "\n",
 "* To import neuron imaging data from data files into an `Imported` table\n",
-"* To automatically trigger data importing and computations for all the missing entries with `populate`"
+"* To automatically trigger data importing and computations for all the missing entries with `Populate`"
 ]
 },
 {
@@ -33,7 +33,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"First things first, let's import `DataJoint` again."
+"First thing first, let's import `DataJoint` again."
 ]
 },
 {
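The import cell that follows is the standard one; the connection settings shown here are placeholders and in the tutorial environment are usually already configured.

```python
import datajoint as dj

# placeholder credentials -- adjust to your own database; often preset in dj_local_conf.json
dj.config['database.host'] = 'localhost'
dj.config['database.user'] = 'root'
dj.config['database.password'] = 'simple'

dj.conn()   # verify that the connection works
```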
@@ -74,11 +74,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In the `data` folder in this repository, you will find a small dataset of three different calcium imaging scans named `example_scan_02.tif`, `example_scan_03.tif`and `example_scan_01.tif`.\n",
+"In the `data` folder in this `DataJoint-Tutorials`, you can find a small dataset of three different cases of calcium imaging scans: `example_scan_02.tif`, `example_scan_03.tif`and `example_scan_01.tif`.\n",
 "\n",
 "As you might know, calcium imaging scans (raw data) are stored as *.tif* files. \n",
 "\n",
-"*NOTE: For this tutorial there is no need to deeply explore this dataset. Nevertheless, if you are curious about visualizing these example scans, we recommend you to open the TIFF with [ImageJ](https://imagej.nih.gov/ij/download.html).*"
+"*NOTE: For this tutorial there is no need to deeper explore this small dataset. Nevertheless, if you are curious about visualizing these example scans, we recommend you to open the TIFF with [ImageJ](https://imagej.nih.gov/ij/download.html).*"
 ]
 },
 {
@@ -92,7 +92,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"DataJoint pipelines commonly start with tables for `Mouse` and `Session`. Let's quickly create these tables based on what we learned in the previous session:"
+"The DataJoint pipeline commonly starts with a `schema` and the following classes for each table: `Mouse` and `Session`. Let's quickly create this pipeline's first steps as we learned it in the previous session:"
 ]
 },
 {
@@ -320,10 +320,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This tiff file contains 100 frames. \n",
+"Particularly, this example contains 100 frames. \n",
 "\n",
-"Let's calculate the average of the images over the frames and plot the result.\n",
-"\n"
+"Let's calculate the average of the images over the frames and plot the result.\n"
 ]
 },
 {
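A hedged sketch of the frame-averaging step described in this cell, assuming the scan loads as a (frames x height x width) stack; the file path and the TIFF reader are assumptions, not the notebook's exact code.

```python
import numpy as np
import matplotlib.pyplot as plt
from skimage import io   # assumption: any multi-page TIFF reader would do

frames = io.imread('../data/example_scan_01.tif')   # assumed path; shape (100, height, width)
average_frame = np.mean(frames, axis=0)             # average across the frame axis

plt.imshow(average_frame, cmap='gray')
plt.axis('off')
plt.show()
```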
@@ -402,7 +401,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We defined `average_frame` as a `longblob`, which allows us to store a NumPy array. This NumPy array will be imported and computed from the file corresponding to each scan."
+"We defined `average_frame` as a `longblob`, which allow us to store a NumPy array. This NumPy array will be imported and computed from the file corresponding to each scan."
 ]
 },
 {
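A hedged sketch of an `Imported` table with a `longblob` attribute and the `make` method that fills it. The parent table and the file lookup are schematic: the notebook derives the filename from the scan metadata, and its table may depend on a scan-level table rather than `Session` directly.

```python
import numpy as np
from skimage import io   # assumption: any TIFF reader works here

@schema
class AverageFrame(dj.Imported):
    definition = """
    -> Session                # assumption: the notebook's pipeline may use a scan-level parent
    ---
    average_frame: longblob   # the averaged frame, stored as a NumPy array
    """

    def make(self, key):
        # `key` holds the primary key of one upstream entry not yet populated here.
        # Assumption: the scan file is looked up from the entry's metadata; a fixed
        # example path is used only to keep the sketch short.
        frames = io.imread('../data/example_scan_01.tif')
        key['average_frame'] = np.mean(frames, axis=0)
        self.insert1(key)     # populate() calls make() once per missing key
```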
@@ -425,7 +424,7 @@
 "source": [
 "In DataJoint, the tier of the table indicates **the nature of the data and the data source for the table**. So far we have encountered two table tiers: `Manual` and `Imported`, and we will encounter the two other major tiers in this session. \n",
 "\n",
-"DataJoint tables in `Manual` tier, or simply **Manual tables** indicate that its contents are **manually** entered by either experimenters or a recording system, and its content **do not depend on external data files or other tables**. This is the most basic table type you will encounter, especially as the tables at the beginning of the pipeline. In the Diagram, `Manual` tables are depicted by green rectangles.\n",
+"DataJoint tables in `Manual` tier, or simply **Manual tables** indicate that its contents are **manually** entered by either experimenters or a recording system, and its content **do not depend on external data files or other tables**. This is the most basic table type you will encounter, especially as the tables at the beginning of the pipeline. In the diagram, `Manual` tables are depicted by green rectangles.\n",
 "\n",
 "On the other hand, **Imported tables** are understood to pull data (or *import* data) from external data files, and come equipped with functionalities to perform this importing process automatically, as we will see shortly! In the Diagram, `Imported` tables are depicted by blue ellipses."
 ]
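The colour-coding described in this cell refers to the pipeline diagram, which can be rendered at any point; a one-line sketch:

```python
dj.Diagram(schema)   # Manual tables render as green rectangles, Imported tables as blue ellipses
```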
@@ -450,7 +449,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Rather than filling out the content of the table manually using `insert1` or `insert` methods, we are going to make use of the `make` and `populate` logic that comes with `Imported` tables to automatically figure out what needs to be imported and perform the import."
+"Rather than filling out the content of the table manually using `insert1` or `insert` methods, we are going to make use of the `make` and `populate` logic that comes with `Imported` tables. These two methods automatically figure it out what needs to be imported, and perform the import."
 ]
 },
 {
@@ -464,7 +463,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"`Imported` tables come with a special method called `populate`. Let's call it for `AverageFrame`:\n",
+"`Imported` table comes with a special method called `populate`. Let's call it for `AverageFrame`:\n",
 "\n",
 "*Note that the following code line is intended to generate a code error.*"
 ]
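A hedged sketch of the `populate` call discussed in this cell; the notebook expects the first call to fail until the upstream entries and file paths are in place, and the keyword argument shown is optional.

```python
# trigger make() for every upstream entry that has no AverageFrame result yet
AverageFrame.populate(display_progress=True)

# report how many entries remain to be populated, without processing them
AverageFrame.progress()
```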
