Commit 7fecafb

Author: Gerit Wagner
docs: update serrializer/tests
1 parent f95320e commit 7fecafb

File tree: 6 files changed, +187 -94 lines changed


docs/source/dev_docs/parser_development.rst

Lines changed: 3 additions & 1 deletion
@@ -75,7 +75,9 @@ Implement ``parse_query_tree()`` to build the query object, creating nested quer

 .. note::

-   Check whether ``SearchFields`` can be created for nested queries (e.g., ``TI=(eHealth OR mHealth)``or only for individual terms, e.g., ``eHealth[ti] OR mHealth[ti]``.)
+   Parsers can be developed as top-down parsers (see PubMed) or bottom-up parsers (see Web of Science).
+
+   Check whether ``SearchFields`` can be created for nested queries (e.g., ``TI=(eHealth OR mHealth)`` or only for individual terms, e.g., ``eHealth[ti] OR mHealth[ti]``).

 **Parser Skeleton**

docs/source/dev_docs/parser_skeleton_tests.py

Lines changed: 0 additions & 89 deletions
This file was deleted.

docs/source/dev_docs/serializer_development.rst

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
Serializers
===========

Serializers convert a query object into a string representation.
This enables the query to be rendered for human inspection, logging, or submission to search engines.

Each serializer implements a function that takes a `Query` object and returns a string.
This supports various output formats including debugging views and platform-specific syntaxes.

Interface
---------

Serializers are typically implemented as standalone functions. The core interface is:

.. literalinclude:: serializer_skeleton.py
   :language: python

Serializers follow a shared conceptual pattern:

- Accept a `Query` object.
- Recursively traverse the query tree.
- Render each node (logical operator, term, field) into a string.
- Combine child nodes with appropriate formatting and syntax.

.. note::

   Avoid embedding platform-specific validation logic (use linters for that).

docs/source/dev_docs/serializer_skeleton.py

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
#!/usr/bin/env python3
"""Example serializer template for a custom platform."""
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from search_query.query import Query


def to_string_custom(query: Query) -> str:
    """Recursively serialize a query tree into the custom platform's syntax."""
    # Leaf node (no children)
    if not query.children:
        field = query.search_field.value if query.search_field else ""
        return f"{field}{query.value}"

    # Composite node (operator with children)
    serialized_children = [to_string_custom(child) for child in query.children]
    joined_children = f" {query.value} ".join(serialized_children)

    # Add parentheses to clarify grouping
    if len(query.children) > 1:
        joined_children = f"({joined_children})"

    # Prefix with field if applicable
    if query.search_field:
        return f"{query.search_field.value}{joined_children}"
    return joined_children
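A minimal usage sketch for the serializer skeleton, with stand-in `Query` and `SearchField` dataclasses in place of the real `search_query.query` classes (the attribute names follow the skeleton; the real constructors may differ):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchField:
    value: str


@dataclass
class Query:
    value: str
    search_field: Optional[SearchField] = None
    children: List["Query"] = field(default_factory=list)


def to_string_custom(query: Query) -> str:
    # Leaf node: optional field prefix plus the term itself
    if not query.children:
        prefix = query.search_field.value if query.search_field else ""
        return f"{prefix}{query.value}"
    # Composite node: join serialized children with the operator
    parts = [to_string_custom(child) for child in query.children]
    joined = f" {query.value} ".join(parts)
    if len(query.children) > 1:
        joined = f"({joined})"
    if query.search_field:
        return f"{query.search_field.value}{joined}"
    return joined


query = Query(
    value="OR",
    search_field=SearchField("TI="),
    children=[Query("eHealth"), Query("mHealth")],
)
print(to_string_custom(query))  # TI=(eHealth OR mHealth)
```

Note how the field prefix is emitted once for the whole group, yielding the Web-of-Science-style ``TI=(eHealth OR mHealth)`` rather than per-term fields.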

docs/source/dev_docs/tests.rst

Lines changed: 126 additions & 3 deletions
@@ -1,5 +1,128 @@
Tests
============

This section outlines best practices for writing unit tests in the `search_query` package.
Tests are primarily written using `pytest` and are organized by module (`parser`, `linter`, `translator`, etc.).

To run all tests:

::

   pytest test

Test Types
----------

1. **Tokenization Tests**

   - Purpose: Verify that a query string is tokenized correctly into the expected tokens.
   - Tools: `pytest.mark.parametrize` for multiple cases.
   - Example:

   .. code-block:: python

      @pytest.mark.parametrize(
          "query_str, expected_tokens",
          [
              (
                  "AB=(Health)",
                  [
                      Token(value="AB=", type=TokenTypes.FIELD, position=(0, 3)),
                      Token(value="(", type=TokenTypes.PARENTHESIS_OPEN, position=(3, 4)),
                      Token(value="Health", type=TokenTypes.SEARCH_TERM, position=(4, 10)),
                      Token(value=")", type=TokenTypes.PARENTHESIS_CLOSED, position=(10, 11)),
                  ],
              )
          ],
      )
      def test_tokenization(query_str: str, expected_tokens: list) -> None:
          print(
              f"Run query parser for: \n {Colors.GREEN}{query_str}{Colors.END}\n--------------------\n"
          )

          parser = XYParser(query_str)
          parser.tokenize()

          assert parser.tokens == expected_tokens, (
              f"\nExpected: {expected_tokens}\nGot:      {parser.tokens}"
          )

2. **Linter Message Tests**

   - Purpose: Verify that the linter raises expected warnings or errors for malformed input.
   - Approach:

     - Catch exceptions where necessary.
     - Use structured comparison with linter messages.

   - Example:

   .. code-block:: python

      @pytest.mark.parametrize(
          "query_str, search_field_general, messages",
          [
              (
                  '("health tracking" OR "remote monitoring") AND (("mobile application" OR "wearable device")',
                  "Title",
                  [
                      {
                          "code": "F1001",
                          "label": "unbalanced-parentheses",
                          "message": "Parentheses are unbalanced in the query",
                          "is_fatal": True,
                          "position": (47, 48),
                          "details": "Unbalanced opening parenthesis",
                      },
                      {
                          "code": "E0001",
                          "label": "search-field-missing",
                          "message": "Expected search field is missing",
                          "is_fatal": False,
                          "position": (-1, -1),
                          "details": "Search fields should be specified in the query instead of the search_field_general",
                      },
                  ],
              ),
              # add more cases here as needed...
          ],
      )
      def test_linter(query_str: str, search_field_general: str, messages: list[dict]) -> None:
          parser = XYParser(query_str, search_field_general=search_field_general)
          try:
              parser.parse()
          except SearchQueryException:
              pass  # Errors are expected in some cases

          actual_messages = parser.linter.messages
          if actual_messages != messages:
              print("Expected:")
              for m in messages:
                  print(f" - {m}")
              print("Got:")
              for m in actual_messages:
                  print(f" - {m}")

          assert actual_messages == messages

3. **Translation Tests**

   - Purpose: Confirm that parsing and serialization together produce the expected generic or structured query string.
   - Example (**TODO:** finalize this case):

   .. code-block:: python

      @pytest.mark.parametrize(
          "query_string, expected_translation",
          [
              ("TS=(eHealth) AND TS=(Review)", "AND[eHealth[TS=], Review[TS=]]"),
          ],
      )
      def test_parser_translation(query_string, expected_translation):
          parser = XYParser(query_string)
          query_tree = parser.parse()
          assert query_tree.to_generic_string() == expected_translation

.. note::

   - Use helper functions like `print_debug_tokens()` to ease debugging.
   - Combine `assert ... == ...` with `print(...)` output to ease inspection of failing cases.
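To make the tokenization contract concrete in isolation, here is a self-contained sketch with stand-in `Token`/`TokenTypes` definitions and a hypothetical regex-based tokenizer (the real `XYParser.tokenize()` is platform-specific and not shown in the docs); it reproduces the expected tokens for ``AB=(Health)``:

```python
import re
from dataclasses import dataclass
from enum import Enum


class TokenTypes(Enum):
    FIELD = "FIELD"
    PARENTHESIS_OPEN = "PARENTHESIS_OPEN"
    PARENTHESIS_CLOSED = "PARENTHESIS_CLOSED"
    SEARCH_TERM = "SEARCH_TERM"


@dataclass
class Token:
    value: str
    type: TokenTypes
    position: tuple


# Alternation order matters: match field prefixes like "AB=" before bare terms.
TOKEN_PATTERN = re.compile(r"(?P<FIELD>[A-Z]{2}=)|(?P<OPEN>\()|(?P<CLOSE>\))|(?P<TERM>\w+)")

TYPE_MAP = {
    "FIELD": TokenTypes.FIELD,
    "OPEN": TokenTypes.PARENTHESIS_OPEN,
    "CLOSE": TokenTypes.PARENTHESIS_CLOSED,
    "TERM": TokenTypes.SEARCH_TERM,
}


def tokenize(query_str: str) -> list:
    """Split a query string into typed tokens with character positions."""
    return [
        Token(m.group(), TYPE_MAP[m.lastgroup], (m.start(), m.end()))
        for m in TOKEN_PATTERN.finditer(query_str)
    ]


tokens = tokenize("AB=(Health)")
for token in tokens:
    print(token)
```

The `position` tuples produced this way match the `(start, end)` character offsets expected by the parametrized test above.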

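The `F1001` message from the linter example can likewise be reproduced with a minimal stand-alone check (a hypothetical `check_parentheses`, not the package's actual linter API), which scans the query once with a stack of unmatched opening parentheses:

```python
def check_parentheses(query_str: str) -> list:
    """Return linter-style message dicts for unbalanced parentheses."""
    messages = []
    stack = []  # positions of currently unmatched opening parentheses
    for i, ch in enumerate(query_str):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if stack:
                stack.pop()
            else:
                messages.append({
                    "code": "F1001",
                    "label": "unbalanced-parentheses",
                    "message": "Parentheses are unbalanced in the query",
                    "is_fatal": True,
                    "position": (i, i + 1),
                    "details": "Unbalanced closing parenthesis",
                })
    # Any opening parenthesis left on the stack was never closed.
    for i in stack:
        messages.append({
            "code": "F1001",
            "label": "unbalanced-parentheses",
            "message": "Parentheses are unbalanced in the query",
            "is_fatal": True,
            "position": (i, i + 1),
            "details": "Unbalanced opening parenthesis",
        })
    return messages


query = '("health tracking" OR "remote monitoring") AND (("mobile application" OR "wearable device")'
messages = check_parentheses(query)
print(messages)
```

For the malformed query from the test above, this flags the opening parenthesis at offset 47, matching the `position: (47, 48)` in the expected message list.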
docs/source/index.rst

Lines changed: 2 additions & 1 deletion
@@ -180,4 +180,5 @@ Below is a high-level overview of the core functionalities:
    dev_docs/parser_development
    dev_docs/linter_development
    dev_docs/translator_development
-   dev_docs/tests
+   dev_docs/serializer_development
+   dev_docs/tests

0 commit comments
