|
1125 | 1125 | "multielectrode probe. " |
1126 | 1126 | ] |
1127 | 1127 | }, |
1128 | | - { |
1129 | | - "cell_type": "code", |
1130 | | - "execution_count": 13, |
1131 | | - "metadata": {}, |
1132 | | - "outputs": [ |
1133 | | - { |
1134 | | - "data": { |
1135 | | - "text/plain": [ |
1136 | | - "'# Represent a physical probe with unique identification\\nprobe : varchar(32) # unique identifier for this model of probe (e.g. serial number)\\n---\\n-> probe.ProbeType\\nprobe_comment=\"\" : varchar(1000) \\n'" |
1137 | | - ] |
1138 | | - }, |
1139 | | - "execution_count": 13, |
1140 | | - "metadata": {}, |
1141 | | - "output_type": "execute_result" |
1142 | | - } |
1143 | | - ], |
1144 | | - "source": [ |
1145 | | - "print(probe.Probe.describe())" |
1146 | | - ] |
1147 | | - }, |
1148 | | - { |
1149 | | - "cell_type": "code", |
1150 | | - "execution_count": 14, |
1151 | | - "metadata": {}, |
1152 | | - "outputs": [ |
1153 | | - { |
1154 | | - "data": { |
1155 | | - "text/plain": [ |
1156 | | - "# Represent a physical probe with unique identification\n", |
1157 | | - "probe : varchar(32) # unique identifier for this model of probe (e.g. serial number)\n", |
1158 | | - "---\n", |
1159 | | - "probe_type : varchar(32) # e.g. neuropixels_1.0\n", |
1160 | | - "probe_comment=\"\" : varchar(1000) # " |
1161 | | - ] |
1162 | | - }, |
1163 | | - "execution_count": 14, |
1164 | | - "metadata": {}, |
1165 | | - "output_type": "execute_result" |
1166 | | - } |
1167 | | - ], |
1168 | | - "source": [ |
1169 | | - "probe.Probe.heading" |
1170 | | - ] |
1171 | | - }, |
1172 | 1128 | { |
1173 | 1129 | "cell_type": "code", |
1174 | 1130 | "execution_count": 15, |
|
1293 | 1249 | } |
1294 | 1250 | ], |
1295 | 1251 | "source": [ |
1296 | | - "ephys.ProbeInsertion.describe()" |
| 1252 | + "print(ephys.ProbeInsertion.describe())" |
1297 | 1253 | ] |
1298 | 1254 | }, |
1299 | 1255 | { |
|
1428 | 1384 | ] |
1429 | 1385 | }, |
1430 | 1386 | { |
1431 | | - "attachments": {}, |
1432 | 1387 | "cell_type": "markdown", |
1433 | 1388 | "metadata": {}, |
1434 | 1389 | "source": [ |
1435 | 1390 | "## Populate\n", |
1436 | 1391 | "\n", |
1437 | 1392 | "### Automatically populate tables\n", |
1438 | 1393 | "\n", |
1439 | | - "`ephys.EphysRecording` is the first table in the pipeline that can be populated automatically.\n", |
1440 | | - "If a table contains a part table, this part table is also populated during the\n", |
1441 | | - "`populate()` call. `populate()` takes several arguments including the a session\n", |
1442 | | - "key. This key restricts `populate()` to performing the operation on the session\n", |
1443 | | - "of interest rather than all possible sessions which could be a time-intensive\n", |
1444 | | - "process for databases with lots of entries.\n", |
 | 1394 | + "In DataJoint, the `populate()` method fills a table automatically by running the logic defined in the table's `make` method once for each upstream entry that has not yet been processed. Here's a breakdown of its functionality:\n", |
1445 | 1395 | "\n", |
1446 | | - "Let's view the `ephys.EphysRecording` and its part table\n", |
1447 | | - "`ephys.EphysRecording.EphysFile` and populate both through a single `populate()`\n", |
1448 | | - "call." |
1449 | | - ] |
1450 | | - }, |
1451 | | - { |
1452 | | - "cell_type": "code", |
1453 | | - "execution_count": 19, |
1454 | | - "metadata": {}, |
1455 | | - "outputs": [ |
1456 | | - { |
1457 | | - "data": { |
1458 | | - "text/plain": [ |
1459 | | - "# Ephys recording from a probe insertion for a given session.\n", |
1460 | | - "subject : varchar(8) # \n", |
1461 | | - "session_datetime : datetime # \n", |
1462 | | - "insertion_number : tinyint unsigned # \n", |
1463 | | - "---\n", |
1464 | | - "electrode_config_hash : uuid # \n", |
1465 | | - "acq_software : varchar(24) # \n", |
1466 | | - "sampling_rate : float # (Hz)\n", |
1467 | | - "recording_datetime : datetime # datetime of the recording from this probe\n", |
1468 | | - "recording_duration : float # (seconds) duration of the recording from this probe" |
1469 | | - ] |
1470 | | - }, |
1471 | | - "execution_count": 19, |
1472 | | - "metadata": {}, |
1473 | | - "output_type": "execute_result" |
1474 | | - } |
1475 | | - ], |
1476 | | - "source": [ |
1477 | | - "ephys.EphysRecording.heading" |
1478 | | - ] |
1479 | | - }, |
1480 | | - { |
1481 | | - "cell_type": "code", |
1482 | | - "execution_count": 20, |
1483 | | - "metadata": {}, |
1484 | | - "outputs": [ |
1485 | | - { |
1486 | | - "data": { |
1487 | | - "text/plain": [ |
1488 | | - "# Paths of files of a given EphysRecording round.\n", |
1489 | | - "subject : varchar(8) # \n", |
1490 | | - "session_datetime : datetime # \n", |
1491 | | - "insertion_number : tinyint unsigned # \n", |
1492 | | - "file_path : varchar(255) # filepath relative to root data directory" |
1493 | | - ] |
1494 | | - }, |
1495 | | - "execution_count": 20, |
1496 | | - "metadata": {}, |
1497 | | - "output_type": "execute_result" |
1498 | | - } |
1499 | | - ], |
1500 | | - "source": [ |
1501 | | - "ephys.EphysRecording.EphysFile.heading" |
| 1396 | + "- **Automation**: Instead of manually inserting data into each table, which can be error-prone and time-consuming, `populate()` automates the insertion based on the dependencies and relationships already established in the schema.\n", |
| 1397 | + "\n", |
 | 1398 | + "- **Dependency Resolution**: `populate()` only processes entries whose upstream dependencies already exist; it does not populate those upstream tables itself. This maintains the integrity and consistency of the data.\n", |
| 1399 | + "\n", |
| 1400 | + "- **Part Tables**: If a table has part tables associated with it, calling `populate()` on the main table will also populate its part tables. This is especially useful in cases like `ephys.EphysRecording` and its part table `ephys.EphysRecording.EphysFile`, as they are closely linked in terms of data lineage.\n", |
| 1401 | + "\n", |
| 1402 | + "- **Restriction**: The `populate()` method can be restricted to specific entries. For instance, by providing a `session_key`, we're ensuring the method only operates on the data relevant to that particular session. This is both efficient and avoids unnecessary operations on unrelated data.\n", |
| 1403 | + "\n", |
| 1404 | + "In the upcoming cells, we'll make use of the `populate()` method to fill the `ephys.EphysRecording` table and its part table. Remember, while this operation is automated, it's essential to understand the underlying logic to ensure accurate and consistent data entry.\n" |
1502 | 1405 | ] |
1503 | 1406 | }, |
1504 | 1407 | { |
|
2131 | 2034 | "downstream processing. Let's view the attributes to get a better understanding. " |
2132 | 2035 | ] |
2133 | 2036 | }, |
2134 | | - { |
2135 | | - "cell_type": "code", |
2136 | | - "execution_count": 28, |
2137 | | - "metadata": {}, |
2138 | | - "outputs": [ |
2139 | | - { |
2140 | | - "data": { |
2141 | | - "text/plain": [ |
2142 | | - "'# Manual table for defining a clustering task ready to be run\\n-> ephys.EphysRecording\\n-> ephys.ClusteringParamSet\\n---\\nclustering_output_dir=\"\" : varchar(255) # clustering output directory relative to the clustering root data directory\\ntask_mode=\"load\" : enum(\\'load\\',\\'trigger\\') # \\'load\\': load computed analysis results, \\'trigger\\': trigger computation\\n'" |
2143 | | - ] |
2144 | | - }, |
2145 | | - "execution_count": 28, |
2146 | | - "metadata": {}, |
2147 | | - "output_type": "execute_result" |
2148 | | - } |
2149 | | - ], |
2150 | | - "source": [ |
2151 | | - "ephys.ClusteringTask.describe()" |
2152 | | - ] |
2153 | | - }, |
2154 | 2037 | { |
2155 | 2038 | "cell_type": "code", |
2156 | 2039 | "execution_count": 29, |
|
2187 | 2070 | "+ `paramset_idx` \n", |
2188 | 2071 | "+ `task_mode` \n", |
2189 | 2072 | "\n", |
2190 | | - "The `paramset_idx` attribute is tracks\n", |
| 2073 | + "The `paramset_idx` attribute tracks\n", |
2191 | 2074 | "your kilosort parameter sets. You can choose the parameter set with which \n", |
2192 | 2075 | "you want to spike sort the ephys data. For example, `paramset_idx=0` may contain\n", |
2193 | 2076 | "default parameters for kilosort processing whereas `paramset_idx=1` contains your custom parameters for sorting. This\n", |
|
2215 | 2098 | ")" |
2216 | 2099 | ] |
2217 | 2100 | }, |
2218 | | - { |
2219 | | - "attachments": {}, |
2220 | | - "cell_type": "markdown", |
2221 | | - "metadata": {}, |
2222 | | - "source": [ |
2223 | | - "Notice we set the `task_mode` to `load`. Let's call populate on the `Clustering`\n", |
2224 | | - "table in the pipeline." |
2225 | | - ] |
2226 | | - }, |
2227 | 2101 | { |
2228 | 2102 | "cell_type": "code", |
2229 | 2103 | "execution_count": 31, |
|
2335 | 2209 | "\n", |
2336 | 2210 | "In this tutorial, we will do some exploratory analysis by fetching the data from the database and creating a few plots.\n", |
2337 | 2211 | "\n", |
2338 | | - "## Query\n", |
| 2212 | + "## Querying Data\n", |
| 2213 | + "\n", |
| 2214 | + "DataJoint provides a powerful querying system, allowing you to retrieve and work with data stored in your database seamlessly. In this section, we'll explore the fundamental querying concepts.\n", |
| 2215 | + "\n", |
| 2216 | + "#### What is a Query?\n", |
| 2217 | + "\n", |
| 2218 | + "- A query is essentially a request for data. With DataJoint, you can craft specific queries to fetch data that meets your criteria from the database.\n", |
| 2219 | + "\n", |
| 2220 | + "#### The `fetch()` Method\n", |
| 2221 | + "\n", |
| 2222 | + "- The primary method for retrieving data from a DataJoint table is `fetch()`.\n", |
 | 2223 | + "- **Default Behavior**: Without any arguments, `fetch()` returns a NumPy structured array with one record per entry in the table. Pass `as_dict=True` to get a list of dictionaries instead.\n", |
| 2224 | + " \n", |
| 2225 | + "#### The `fetch1()` Method\n", |
| 2226 | + "\n", |
 | 2227 | + "- For queries that match exactly one entry, use `fetch1()`. It raises an error if the query matches zero entries or more than one.\n", |
| 2228 | + "- **Default Behavior**: It returns a dictionary of attributes for that one entry.\n", |
| 2229 | + "\n", |
| 2230 | + "#### Specific Attributes\n", |
| 2231 | + "\n", |
| 2232 | + "- Both `fetch()` and `fetch1()` can be made more specific by providing attributes.\n", |
 | 2233 | + "- Example: `fetch1('fps')` will retrieve only the `fps` attribute from the single matching entry.\n", |
| 2234 | + "\n", |
| 2235 | + "#### Restricting Queries\n", |
| 2236 | + "\n", |
| 2237 | + "- Often, you don't want to fetch everything. Instead, you might want data related to a specific subject or session.\n", |
| 2238 | + "- DataJoint uses the `&` operator to restrict queries.\n", |
 | 2239 | + "- Example: To get all session times for `subject1`, you might use:\n", |
| 2240 | + " ```python\n", |
| 2241 | + " subject1_times = (session.Session & \"subject = 'subject1'\").fetch(\"session_datetime\")\n", |
| 2242 | + " ```\n", |
2339 | 2243 | "\n", |
2340 | | - "This section focuses on working with data that is already in the\n", |
2341 | | - "database. \n", |
| 2244 | + "#### Fetching Primary Keys\n", |
2342 | 2245 | "\n", |
2343 | | - "DataJoint queries allow you to view and import data from the database into a python\n", |
2344 | | - "variable using the `fetch()` method. \n", |
| 2246 | + "- Sometimes, you just need the primary keys of entries.\n", |
| 2247 | + "- Use the `fetch(\"KEY\")` syntax for this. For instance, `(session.Session).fetch(\"KEY\")`.\n", |
2345 | 2248 | "\n", |
2346 | | - "There are several important features supported by `fetch()`:\n", |
2347 | | - "- By default, an empty `fetch()` imports a list of dictionaries containing all\n", |
2348 | | - " attributes of all entries in the table that is queried.\n", |
2349 | | - "- **`fetch1()`**, on the other hand, imports a dictionary containing all attributes of\n", |
2350 | | - " one of the entries in the table. By default, if a table has multiple entries,\n", |
2351 | | - " `fetch1()` imports the first entry in the table.\n", |
2352 | | - "- Both `fetch()` and `fetch1()` accept table attributes as an argument to query\n", |
2353 | | - " that particular attribute. For example `fetch1('fps')` will fetch the first\n", |
2354 | | - " value of the `fps` attribute if it exists in the table.\n", |
2355 | | - "- Recommended best practice is to **restrict** queries by primary key attributes of the\n", |
2356 | | - " table to ensure the accuracy of imported data.\n", |
2357 | | - " - The most common restriction for entries in DataJoint tables is performed\n", |
2358 | | - " using the `&` operator. For example to fetch all session start times belonging to\n", |
2359 | | - " `subject1`, a possible query could be `subject1_sessions =\n", |
2360 | | - " (session.Session & \"subject = 'subject1'\").fetch(\"session_datetime\")`. \n", |
2361 | | - "- `fetch()` can also be used to obtain the primary keys of a table. To fetch the primary\n", |
2362 | | - " keys of a table use `<table_name>.fetch(\"KEY\")` syntax.\n", |
| 2249 | + "#### Let's Dive In!\n", |
2363 | 2250 | "\n", |
2364 | | - "Let's walk through these concepts of querying by moving from simple to more\n", |
2365 | | - "complex queries." |
| 2251 | + "Now that we've established the basics, let's delve deeper into querying with some practical examples." |
2366 | 2252 | ] |
2367 | 2253 | }, |
2368 | 2254 | { |
|