| 1 | -## Work in progress
| 2 | -You may ask questions in the chat window below or
| 3 | -refer to [legacy documentation](https://docs.datajoint.org/)
| 1 | +# Query Principles |
| 2 | + |
| 3 | +**Data queries** retrieve data from the database. |
| 4 | +A data query is performed with the help of a **query object**, which is a symbolic |
| 5 | +representation of the query that does not in itself contain any actual data. |
| 6 | +The simplest query object is an instance of a **table class**, representing the |
| 7 | +contents of an entire table. |
| 8 | + |
| 9 | +For example, if `experiment.Session` is a DataJoint table class, you can create a query |
| 10 | +object to retrieve its entire contents as follows: |
| 11 | + |
| 12 | +```python |
| 13 | +query = experiment.Session() |
| 14 | +``` |
| 15 | + |
| 16 | +More generally, a query object may be formed as a **query expression** constructed by |
| 17 | +applying [operators](operators.md) to other query objects. |
| 18 | + |
| 19 | +For example, the following query retrieves information about all sessions and scans
| 20 | +for mouse 102 (excluding sessions with no scans):
| 21 | + |
| 22 | +```python |
| 23 | +query = experiment.Session * experiment.Scan & 'animal_id = 102' |
| 24 | +``` |
| 25 | + |
| 26 | +Note that for brevity, query operators can be applied directly to class objects rather |
| 27 | +than instance objects so that `experiment.Session` may be used in place of |
| 28 | +`experiment.Session()`. |
| 29 | + |
| 30 | +You can preview the contents of the query in Python, Jupyter Notebook, or MATLAB by |
| 31 | +simply displaying the object. |
| 32 | +In the image below, the object `query` is first defined as a restriction of the table |
| 33 | +`EEG` by values of the attribute `eeg_sample_rate` greater than 1000 Hz. |
| 34 | +Displaying the object gives a preview of the entities that will be returned by `query`. |
| 35 | +Note that this preview only lists a few of the entities that will be returned. |
| 36 | +Also, the preview does not contain any data for attributes of datatype `blob`. |
| 37 | + |
| 38 | +{: style="align:center"} |
| 39 | + |
| 40 | +Defining a query object and previewing the entities returned by the query. |
| 41 | + |
| 42 | +Once the desired query object is formed, the query can be executed using its |
| 43 | +[fetch](fetch.md) methods. |
| 44 | +To **fetch** means to transfer the data represented by the query object from the |
| 45 | +database server into the workspace of the host language. |
| 46 | + |
| 47 | +```python |
| 48 | +s = query.fetch() |
| 49 | +``` |
| 50 | + |
| 51 | +Here fetching from the `query` object produces the NumPy record array `s` of the |
| 52 | +queried data. |
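
The distinction between a symbolic query object and the fetched data can be illustrated with a toy model. The sketch below is not DataJoint's implementation: the `Query` class, its `sql` method, and the in-memory SQLite `session` table are hypothetical names chosen for illustration. It only shows the general idea that composing a query builds a description, and that data moves only when `fetch` is called.

```python
import sqlite3


class Query:
    """Toy symbolic query: stores a table name and conditions, holds no data."""

    def __init__(self, conn, table, conditions=()):
        self.conn = conn
        self.table = table
        self.conditions = tuple(conditions)

    def __and__(self, condition):
        # Restriction returns a new query object; nothing is executed here.
        return Query(self.conn, self.table, self.conditions + (condition,))

    def sql(self):
        # Render the stored description as a SELECT statement.
        where = " AND ".join(f"({c})" for c in self.conditions)
        return f"SELECT * FROM {self.table}" + (f" WHERE {where}" if where else "")

    def fetch(self):
        # Only now is the statement sent to the database.
        return self.conn.execute(self.sql()).fetchall()


# Hypothetical stand-in for a database server: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE session (animal_id INTEGER, session_date TEXT)")
conn.executemany(
    "INSERT INTO session VALUES (?, ?)",
    [(101, "2017-12-30"), (102, "2018-01-05"), (102, "2018-02-11")],
)

query = Query(conn, "session") & "animal_id = 102"  # still only a description
rows = query.fetch()                                # data is transferred here
```

In this sketch `query` can be built, combined, and passed around at no cost, because the database is contacted only inside `fetch`.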
| 53 | + |
| 54 | +## Checking for returned entities |
| 55 | + |
| 56 | +The preview of the query object shown above displayed only a few of the entities |
| 57 | +returned by the query but also displayed the total number of entities that would be |
| 58 | +returned. |
| 59 | +It can be useful to know the number of entities returned by a query, or even whether a |
| 60 | +query will return any entities at all, without having to fetch the data itself.
| 61 | + |
| 62 | +The `bool` function applied to a query object evaluates to `True` if the query returns |
| 63 | +any entities and to `False` if the query result is empty. |
| 64 | + |
| 65 | +The `len` function applied to a query object determines the number of entities returned |
| 66 | +by the query. |
| 67 | + |
| 68 | +```python |
| 69 | +# number of sessions since the start of 2018. |
| 70 | +n = len(Session & 'session_date >= "2018-01-01"') |
| 71 | +``` |
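
One way to see why `len` and `bool` need not fetch the data is to let the database do the counting. The following is a minimal sketch under stated assumptions, not DataJoint's actual code: `CountableQuery` and the SQLite `session` table are hypothetical, and the sketch simply maps Python's `__len__` and `__bool__` hooks onto `COUNT(*)` and `EXISTS` statements.

```python
import sqlite3


class CountableQuery:
    """Toy query object whose size and emptiness are computed by the database."""

    def __init__(self, conn, table, condition="1"):
        self.conn = conn
        self.table = table
        self.condition = condition  # the default "1" matches every row

    def __len__(self):
        # Ask the database for the count; no rows are transferred.
        sql = f"SELECT COUNT(*) FROM {self.table} WHERE {self.condition}"
        return self.conn.execute(sql).fetchone()[0]

    def __bool__(self):
        # EXISTS lets the database stop at the first matching row.
        sql = f"SELECT EXISTS(SELECT 1 FROM {self.table} WHERE {self.condition})"
        return bool(self.conn.execute(sql).fetchone()[0])


# Hypothetical stand-in data: three sessions in an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE session (animal_id INTEGER, session_date TEXT)")
conn.executemany(
    "INSERT INTO session VALUES (?, ?)",
    [(101, "2017-12-30"), (102, "2018-01-05"), (102, "2018-02-11")],
)

since_2018 = CountableQuery(conn, "session", "session_date >= '2018-01-01'")
in_2030 = CountableQuery(conn, "session", "session_date >= '2030-01-01'")

n = len(since_2018)        # number of matching sessions
has_any = bool(in_2030)    # whether any session matches
```

Defining `__bool__` separately from `__len__` is a design choice in this sketch: an existence check can be cheaper than a full count on a large table.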
| 72 | + |
| 73 | +## Normalization in queries |
| 74 | + |
| 75 | +Query objects adhere to [entity normalization](../design/normalization.md) just
| 76 | +like the stored tables do. |
| 77 | +The result of a query is a well-defined entity set with a readily identifiable entity
| 78 | +class and designated primary attributes that jointly distinguish any two entities from |
| 79 | +each other. |
| 80 | +The query [operators](operators.md) are designed to keep the result normalized even in |
| 81 | +complex query expressions. |