Commit 4190925

committed
Update tutorial markdown
1 parent 69cef22 commit 4190925

File tree

1 file changed

+48
-162
lines changed

notebooks/tutorial.ipynb

Lines changed: 48 additions & 162 deletions
@@ -1125,50 +1125,6 @@
11251125
"multielectrode probe. "
11261126
]
11271127
},
1128-
{
1129-
"cell_type": "code",
1130-
"execution_count": 13,
1131-
"metadata": {},
1132-
"outputs": [
1133-
{
1134-
"data": {
1135-
"text/plain": [
1136-
"'# Represent a physical probe with unique identification\\nprobe : varchar(32) # unique identifier for this model of probe (e.g. serial number)\\n---\\n-> probe.ProbeType\\nprobe_comment=\"\" : varchar(1000) \\n'"
1137-
]
1138-
},
1139-
"execution_count": 13,
1140-
"metadata": {},
1141-
"output_type": "execute_result"
1142-
}
1143-
],
1144-
"source": [
1145-
"print(probe.Probe.describe())"
1146-
]
1147-
},
1148-
{
1149-
"cell_type": "code",
1150-
"execution_count": 14,
1151-
"metadata": {},
1152-
"outputs": [
1153-
{
1154-
"data": {
1155-
"text/plain": [
1156-
"# Represent a physical probe with unique identification\n",
1157-
"probe : varchar(32) # unique identifier for this model of probe (e.g. serial number)\n",
1158-
"---\n",
1159-
"probe_type : varchar(32) # e.g. neuropixels_1.0\n",
1160-
"probe_comment=\"\" : varchar(1000) # "
1161-
]
1162-
},
1163-
"execution_count": 14,
1164-
"metadata": {},
1165-
"output_type": "execute_result"
1166-
}
1167-
],
1168-
"source": [
1169-
"probe.Probe.heading"
1170-
]
1171-
},
11721128
{
11731129
"cell_type": "code",
11741130
"execution_count": 15,
@@ -1293,7 +1249,7 @@
12931249
}
12941250
],
12951251
"source": [
1296-
"ephys.ProbeInsertion.describe()"
1252+
"print(ephys.ProbeInsertion.describe())"
12971253
]
12981254
},
12991255
{
@@ -1428,77 +1384,24 @@
14281384
]
14291385
},
14301386
{
1431-
"attachments": {},
14321387
"cell_type": "markdown",
14331388
"metadata": {},
14341389
"source": [
14351390
"## Populate\n",
14361391
"\n",
14371392
"### Automatically populate tables\n",
14381393
"\n",
1439-
"`ephys.EphysRecording` is the first table in the pipeline that can be populated automatically.\n",
1440-
"If a table contains a part table, this part table is also populated during the\n",
1441-
"`populate()` call. `populate()` takes several arguments including the a session\n",
1442-
"key. This key restricts `populate()` to performing the operation on the session\n",
1443-
"of interest rather than all possible sessions which could be a time-intensive\n",
1444-
"process for databases with lots of entries.\n",
1394+
"In DataJoint, the `populate()` method is a powerful feature designed to fill tables based on the logic defined in the table's `make` method. Here's a breakdown of its functionality:\n",
14451395
"\n",
1446-
"Let's view the `ephys.EphysRecording` and its part table\n",
1447-
"`ephys.EphysRecording.EphysFile` and populate both through a single `populate()`\n",
1448-
"call."
1449-
]
1450-
},
1451-
{
1452-
"cell_type": "code",
1453-
"execution_count": 19,
1454-
"metadata": {},
1455-
"outputs": [
1456-
{
1457-
"data": {
1458-
"text/plain": [
1459-
"# Ephys recording from a probe insertion for a given session.\n",
1460-
"subject : varchar(8) # \n",
1461-
"session_datetime : datetime # \n",
1462-
"insertion_number : tinyint unsigned # \n",
1463-
"---\n",
1464-
"electrode_config_hash : uuid # \n",
1465-
"acq_software : varchar(24) # \n",
1466-
"sampling_rate : float # (Hz)\n",
1467-
"recording_datetime : datetime # datetime of the recording from this probe\n",
1468-
"recording_duration : float # (seconds) duration of the recording from this probe"
1469-
]
1470-
},
1471-
"execution_count": 19,
1472-
"metadata": {},
1473-
"output_type": "execute_result"
1474-
}
1475-
],
1476-
"source": [
1477-
"ephys.EphysRecording.heading"
1478-
]
1479-
},
1480-
{
1481-
"cell_type": "code",
1482-
"execution_count": 20,
1483-
"metadata": {},
1484-
"outputs": [
1485-
{
1486-
"data": {
1487-
"text/plain": [
1488-
"# Paths of files of a given EphysRecording round.\n",
1489-
"subject : varchar(8) # \n",
1490-
"session_datetime : datetime # \n",
1491-
"insertion_number : tinyint unsigned # \n",
1492-
"file_path : varchar(255) # filepath relative to root data directory"
1493-
]
1494-
},
1495-
"execution_count": 20,
1496-
"metadata": {},
1497-
"output_type": "execute_result"
1498-
}
1499-
],
1500-
"source": [
1501-
"ephys.EphysRecording.EphysFile.heading"
1396+
"- **Automation**: Instead of manually inserting data into each table, which can be error-prone and time-consuming, `populate()` automates the insertion based on the dependencies and relationships already established in the schema.\n",
1397+
"\n",
1398+
"- **Dependency Resolution**: `populate()` only processes keys whose upstream entries already exist in the table's parent tables; it does not populate those upstream tables for you. This maintains the integrity and consistency of the data.\n",
1399+
"\n",
1400+
"- **Part Tables**: If a table has part tables associated with it, calling `populate()` on the main table will also populate its part tables. This is especially useful in cases like `ephys.EphysRecording` and its part table `ephys.EphysRecording.EphysFile`, as they are closely linked in terms of data lineage.\n",
1401+
"\n",
1402+
"- **Restriction**: The `populate()` method can be restricted to specific entries. For instance, by providing a `session_key`, we're ensuring the method only operates on the data relevant to that particular session. This is both efficient and avoids unnecessary operations on unrelated data.\n",
1403+
"\n",
1404+
"In the upcoming cells, we'll make use of the `populate()` method to fill the `ephys.EphysRecording` table and its part table. Remember, while this operation is automated, it's essential to understand the underlying logic to ensure accurate and consistent data entry.\n"
15021405
]
15031406
},
15041407
{
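The bookkeeping behind the `populate()` bullets added in this hunk can be sketched in plain Python. This is a toy model under stated assumptions, not DataJoint's implementation: `toy_populate`, `make_recording`, the dictionary-based "table", and the sample rows are all hypothetical names and data introduced only for illustration.

```python
# Toy model of populate(): NOT DataJoint's implementation, only the bookkeeping idea.
def toy_populate(key_source, populated, make, restriction=None):
    """Call make(key) for each upstream key that is not yet populated.

    key_source  -- list of dict keys that exist upstream (hypothetical data)
    populated   -- dict mapping a frozen key to whatever make() produced for it
    restriction -- optional dict; only keys matching all of its items are processed
    """
    processed = []
    for key in key_source:
        if restriction and any(key.get(k) != v for k, v in restriction.items()):
            continue  # outside the restriction, so skip this key
        frozen = tuple(sorted(key.items()))
        if frozen in populated:
            continue  # already populated, so a repeated call does no work
        populated[frozen] = make(key)  # master row and part rows stored together
        processed.append(key)
    return processed


def make_recording(key):
    # Hypothetical make(): builds a master row plus its part-table rows.
    return {
        "master": dict(key, sampling_rate=30000.0),
        "parts": [dict(key, file_path="hypothetical/recording.bin")],
    }


upstream_keys = [
    {"subject": "subject1", "insertion_number": 0},
    {"subject": "subject2", "insertion_number": 0},
]
table = {}
first = toy_populate(upstream_keys, table, make_recording, {"subject": "subject1"})
second = toy_populate(upstream_keys, table, make_recording, {"subject": "subject1"})
print(len(first), len(second), len(table))
```

The second call processes nothing because the restricted key is already present, which is the property that makes repeated `populate()` calls safe to rerun.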
@@ -2131,26 +2034,6 @@
21312034
"downstream processing. Let's view the attributes to get a better understanding. "
21322035
]
21332036
},
2134-
{
2135-
"cell_type": "code",
2136-
"execution_count": 28,
2137-
"metadata": {},
2138-
"outputs": [
2139-
{
2140-
"data": {
2141-
"text/plain": [
2142-
"'# Manual table for defining a clustering task ready to be run\\n-> ephys.EphysRecording\\n-> ephys.ClusteringParamSet\\n---\\nclustering_output_dir=\"\" : varchar(255) # clustering output directory relative to the clustering root data directory\\ntask_mode=\"load\" : enum(\\'load\\',\\'trigger\\') # \\'load\\': load computed analysis results, \\'trigger\\': trigger computation\\n'"
2143-
]
2144-
},
2145-
"execution_count": 28,
2146-
"metadata": {},
2147-
"output_type": "execute_result"
2148-
}
2149-
],
2150-
"source": [
2151-
"ephys.ClusteringTask.describe()"
2152-
]
2153-
},
21542037
{
21552038
"cell_type": "code",
21562039
"execution_count": 29,
@@ -2187,7 +2070,7 @@
21872070
"+ `paramset_idx` \n",
21882071
"+ `task_mode` \n",
21892072
"\n",
2190-
"The `paramset_idx` attribute is tracks\n",
2073+
"The `paramset_idx` attribute tracks\n",
21912074
"your kilosort parameter sets. You can choose the parameter set with which \n",
21922075
"you want to spike sort ephys data. For example, `paramset_idx=0` may contain\n",
21932076
"default parameters for kilosort processing whereas `paramset_idx=1` contains your custom parameters for sorting. This\n",
@@ -2215,15 +2098,6 @@
22152098
")"
22162099
]
22172100
},
2218-
{
2219-
"attachments": {},
2220-
"cell_type": "markdown",
2221-
"metadata": {},
2222-
"source": [
2223-
"Notice we set the `task_mode` to `load`. Let's call populate on the `Clustering`\n",
2224-
"table in the pipeline."
2225-
]
2226-
},
22272101
{
22282102
"cell_type": "code",
22292103
"execution_count": 31,
@@ -2335,34 +2209,46 @@
23352209
"\n",
23362210
"In this tutorial, we will do some exploratory analysis by fetching the data from the database and creating a few plots.\n",
23372211
"\n",
2338-
"## Query\n",
2212+
"## Querying Data\n",
2213+
"\n",
2214+
"DataJoint provides a powerful querying system, allowing you to retrieve and work with data stored in your database seamlessly. In this section, we'll explore the fundamental querying concepts.\n",
2215+
"\n",
2216+
"#### What is a Query?\n",
2217+
"\n",
2218+
"- A query is essentially a request for data. With DataJoint, you can craft specific queries to fetch data that meets your criteria from the database.\n",
2219+
"\n",
2220+
"#### The `fetch()` Method\n",
2221+
"\n",
2222+
"- The primary method for retrieving data from a DataJoint table is `fetch()`.\n",
2223+
"- **Default Behavior**: Without any arguments, `fetch()` returns a NumPy structured array with one record per entry in the table; pass `as_dict=True` to get a list of dictionaries instead.\n",
2224+
" \n",
2225+
"#### The `fetch1()` Method\n",
2226+
"\n",
2227+
"- For queries that return exactly one entry, use `fetch1()`. It raises an error if the query matches zero or multiple entries, so restrict the query first.\n",
2228+
"- **Default Behavior**: It returns a dictionary of attributes for that one entry.\n",
2229+
"\n",
2230+
"#### Specific Attributes\n",
2231+
"\n",
2232+
"- Both `fetch()` and `fetch1()` can be made more specific by providing attributes.\n",
2233+
"- Example: `fetch1('fps')` will retrieve only the `fps` attribute of the single matching entry.\n",
2234+
"\n",
2235+
"#### Restricting Queries\n",
2236+
"\n",
2237+
"- Often, you don't want to fetch everything. Instead, you might want data related to a specific subject or session.\n",
2238+
"- DataJoint uses the `&` operator to restrict queries.\n",
2239+
"- Example: To get all session times for `subject1`, you might use:\n",
2240+
" ```python\n",
2241+
" subject1_times = (session.Session & \"subject = 'subject1'\").fetch(\"session_datetime\")\n",
2242+
" ```\n",
23392243
"\n",
2340-
"This section focuses on working with data that is already in the\n",
2341-
"database. \n",
2244+
"#### Fetching Primary Keys\n",
23422245
"\n",
2343-
"DataJoint queries allow you to view and import data from the database into a python\n",
2344-
"variable using the `fetch()` method. \n",
2246+
"- Sometimes, you just need the primary keys of entries.\n",
2247+
"- Use the `fetch(\"KEY\")` syntax for this. For instance, `(session.Session).fetch(\"KEY\")`.\n",
23452248
"\n",
2346-
"There are several important features supported by `fetch()`:\n",
2347-
"- By default, an empty `fetch()` imports a list of dictionaries containing all\n",
2348-
" attributes of all entries in the table that is queried.\n",
2349-
"- **`fetch1()`**, on the other hand, imports a dictionary containing all attributes of\n",
2350-
" one of the entries in the table. By default, if a table has multiple entries,\n",
2351-
" `fetch1()` imports the first entry in the table.\n",
2352-
"- Both `fetch()` and `fetch1()` accept table attributes as an argument to query\n",
2353-
" that particular attribute. For example `fetch1('fps')` will fetch the first\n",
2354-
" value of the `fps` attribute if it exists in the table.\n",
2355-
"- Recommended best practice is to **restrict** queries by primary key attributes of the\n",
2356-
" table to ensure the accuracy of imported data.\n",
2357-
" - The most common restriction for entries in DataJoint tables is performed\n",
2358-
" using the `&` operator. For example to fetch all session start times belonging to\n",
2359-
" `subject1`, a possible query could be `subject1_sessions =\n",
2360-
" (session.Session & \"subject = 'subject1'\").fetch(\"session_datetime\")`. \n",
2361-
"- `fetch()` can also be used to obtain the primary keys of a table. To fetch the primary\n",
2362-
" keys of a table use `<table_name>.fetch(\"KEY\")` syntax.\n",
2249+
"#### Let's Dive In!\n",
23632250
"\n",
2364-
"Let's walk through these concepts of querying by moving from simple to more\n",
2365-
"complex queries."
2251+
"Now that we've established the basics, let's delve deeper into querying with some practical examples."
23662252
]
23672253
},
23682254
{
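The fetch rules described in the "Querying Data" cell can be illustrated without a database connection. The sketch below is a toy stand-in, not the DataJoint API: `ToyQuery` and its sample rows are hypothetical, and the primary key `(subject, session_datetime)` is an assumption based on the headings shown earlier in this diff. It mirrors three behaviours: restriction with `&`, `fetch("KEY")` returning primary keys only, and `fetch1()` refusing anything other than exactly one matching entry.

```python
# Toy stand-in for a query object: NOT the DataJoint API, only its fetch rules.
class ToyQuery:
    def __init__(self, rows, primary_key):
        self.rows = [dict(r) for r in rows]
        self.primary_key = tuple(primary_key)

    def __and__(self, restriction):
        # Restricting by a dict keeps rows that match every listed attribute.
        kept = [r for r in self.rows
                if all(r.get(k) == v for k, v in restriction.items())]
        return ToyQuery(kept, self.primary_key)

    def fetch(self, *attrs):
        if attrs == ("KEY",):  # primary-key dicts only
            return [{k: r[k] for k in self.primary_key} for r in self.rows]
        if not attrs:  # whole rows as dicts, like fetch(as_dict=True)
            return [dict(r) for r in self.rows]
        if len(attrs) == 1:  # one attribute gives a flat list of values
            return [r[attrs[0]] for r in self.rows]
        return [[r[a] for r in self.rows] for a in attrs]  # one list per attribute

    def fetch1(self, *attrs):
        if len(self.rows) != 1:
            raise ValueError(f"fetch1 needs exactly one row, found {len(self.rows)}")
        row = self.rows[0]
        if not attrs:
            return dict(row)
        values = [row[a] for a in attrs]
        return values[0] if len(values) == 1 else tuple(values)


# Hypothetical rows; attribute names follow session.Session in the tutorial.
session = ToyQuery(
    [
        {"subject": "subject1", "session_datetime": "2023-01-01 10:00:00"},
        {"subject": "subject1", "session_datetime": "2023-01-02 10:00:00"},
        {"subject": "subject2", "session_datetime": "2023-01-03 10:00:00"},
    ],
    primary_key=("subject", "session_datetime"),
)

subject1_times = (session & {"subject": "subject1"}).fetch("session_datetime")
subject2_row = (session & {"subject": "subject2"}).fetch1()
all_keys = session.fetch("KEY")
```

Calling `session.fetch1()` on the unrestricted toy table raises `ValueError`, which is why restricting down to a single entry before `fetch1()` is the safer habit.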
