Commit da6c5ec

author
Gerit Wagner
committed
update docs
1 parent cb7209d commit da6c5ec

23 files changed

+342
-155
lines changed
-17.3 KB
-2.48 KB
Binary file not shown.
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
Linter
============

The linter checks should reuse (or extend) the messages from the constants module, which are documented in the `messages <../messages/errors_index.html>`_ section.
Each linter message should be defined unambiguously in the constants module.
An additional *details* parameter can be added to the linter message to explain the specific problem.

.. literalinclude:: linter_skeleton.py
   :language: python


Search Field Validation
-------------------------

Strict vs. Non-Strict Modes

.. list-table:: Search Field Validation in Strict vs. Non-Strict Modes
   :widths: 20 20 20 20 20
   :header-rows: 1

   * - **Search-Field required**
     - **Search String**
     - **Search-Field**
     - **Mode: Strict**
     - **Mode: Non-Strict**
   * - Yes
     - With Search-Field
     - Empty
     - ok
     - ok
   * - Yes
     - With Search-Field
     - Equal to Search-String
     - ok - search-field-redundant
     - ok
   * - Yes
     - With Search-Field
     - Different from Search-String
     - error: search-field-contradiction
     - ok - search-field-contradiction. Parser uses Search-String by default
   * - Yes
     - Without Search-Field
     - Empty
     - error: search-field-missing
     - ok - search-field-missing. Parser adds ``title`` as the default
   * - Yes
     - Without Search-Field
     - Given
     - ok - search-field-extracted
     - ok
   * - No
     - With Search-Field
     - Empty
     - ok
     - ok
   * - No
     - With Search-Field
     - Equal to Search-String
     - ok - search-field-redundant
     - ok
   * - No
     - With Search-Field
     - Different from Search-String
     - error: search-field-contradiction
     - ok - search-field-contradiction. Parser uses Search-String by default
   * - No
     - Without Search-Field
     - Empty
     - ok - search-field-not-specified
     - ok - Parser uses default of database
   * - No
     - Without Search-Field
     - Given
     - ok - search-field-extracted
     - ok
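
The decision table above can be read as a small function. The following is a minimal sketch, not the library's linter: the function name, its parameters, and the ``(status, message)`` return shape are assumptions made for illustration. Only the message names come from the table.

```python
from typing import Optional


def check_search_field(
    field_in_string: Optional[str],
    field_param: Optional[str],
    field_required: bool,
    strict: bool,
) -> tuple[str, Optional[str]]:
    """Return (status, message) for one row of the table (hypothetical helper)."""
    if field_in_string and field_param:
        if field_in_string == field_param:
            # Same field given twice: only strict mode reports it.
            return ("ok", "search-field-redundant" if strict else None)
        # Conflicting fields: an error in strict mode, a notice otherwise.
        return ("error" if strict else "ok", "search-field-contradiction")
    if field_in_string:
        return ("ok", None)
    if field_param:
        return ("ok", "search-field-extracted" if strict else None)
    if field_required:
        return ("error" if strict else "ok", "search-field-missing")
    return ("ok", "search-field-not-specified" if strict else None)


strict_result = check_search_field("ti", "ab", field_required=True, strict=True)
lenient_result = check_search_field(None, None, field_required=True, strict=False)
```

The parser defaults listed in the Non-Strict column (using the Search-String, adding ``title``, or using the database default) are not modeled here; the sketch only covers which message is reported.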

docs/source/dev_docs/overview.rst

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
Overview
==========================

To support a platform, a parser, linter, translator, and serializer are required.
The parser is responsible for parsing the search string, while the linter checks the parsed query for errors.
The serializer converts the parsed query back into a search string.

.. image:: documentation.png
   :width: 800px

Development setup
-------------------

.. code-block::
   :caption: Installation in editable mode with ``dev`` extras

   pip install -e ".[dev]"

A code skeleton is available for the parser, linter, translator, and tests.

docs/source/dev_docs/parser.rst

Lines changed: 0 additions & 133 deletions
This file was deleted.
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
Parser
============

To parse a list format, replace the references to numbered sub-queries with the sub-queries themselves. The resulting search string can then be parsed with the standard string parser, which avoids a redundant implementation.
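
A minimal sketch of this replacement step follows. It assumes that sub-queries reference each other with a ``#n`` notation and that the last entry is the final query. Both the notation and the ``flatten_list_query`` helper are illustrative and not part of the library.

```python
import re

# Assumption: references to numbered sub-queries look like "#1", "#2", ...
REFERENCE = re.compile(r"#(\d+)")


def flatten_list_query(sub_queries: dict[int, str]) -> str:
    """Return the last sub-query with all "#n" references expanded."""

    def expand(number: int, seen: frozenset) -> str:
        if number in seen:
            raise ValueError(f"circular reference to #{number}")
        if number not in sub_queries:
            raise ValueError(f"unknown sub-query #{number}")
        # Parenthesize each inserted sub-query to preserve operator precedence.
        return REFERENCE.sub(
            lambda m: "(" + expand(int(m.group(1)), seen | {number}) + ")",
            sub_queries[number],
        )

    return expand(max(sub_queries), frozenset())


query = flatten_list_query({1: "TI=digital", 2: "AB=health", 3: "#1 AND #2"})
```

The flattened string can then be handed to the string parser, so the list parser itself does not need its own tokenizer or operator handling.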

.. literalinclude:: parser_skeleton.py
   :language: python


TODO:

- refer to SearchFieldGeneral here!?
- recommend using a broad regex for search-fields (make sure it is mapped to the right token type) and a validator to check whether the specific value (search-field) is valid.
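
The second TODO item could look roughly like the sketch below: a broad regex recognizes anything shaped like a search-field, and a separate validator decides whether the value is known. The bracketed tag syntax, the allow-list, and all names are assumptions for illustration only.

```python
import re

# Assumption: field tags are written in square brackets, e.g. "[ti]".
BROAD_FIELD_REGEX = re.compile(r"\[[^\[\]]+\]")
# Hypothetical allow-list; a real platform would define its own.
VALID_FIELDS = {"[ti]", "[ab]", "[tiab]", "[mh]"}


def find_invalid_fields(query: str) -> list[str]:
    """Tokenize every bracketed tag, then validate each value separately."""
    tags = BROAD_FIELD_REGEX.findall(query)
    return [tag for tag in tags if tag.lower() not in VALID_FIELDS]


invalid = find_invalid_fields("Literacy[tiab] OR reading[titel]")
```

Because the regex is broad, a misspelled field such as ``[titel]`` is still tokenized as a search-field and can be reported with a precise message, instead of being misread as part of a search term.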

docs/source/dev_docs/tests.rst

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
Tests
----------------

.. literalinclude:: parser_skeleton_tests.py
   :language: python
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
Translator
============

.. literalinclude:: parser_skeleton.py
   :language: python


Mapping Fields to Standard-Fields
----------------------------------------------------------

The search fields supported by the database (Platform-Fields, referred to below as DB-Fields) do not necessarily match the standard fields (Standard-Fields) in ``constants.Fields``.

TODO: revisit this part:

We distinguish the following cases:

**1:1 matches**

Cases where a 1:1 match exists between DB-Fields and Standard-Fields are added to the ``constants.PLATFORM_FIELD_MAP``.

**1:n matches**

Cases where a DB-Field combines multiple Standard-Fields are added to the ``constants.PLATFORM_COMBINED_FIELDS_MAP``. For example, PubMed offers a search for ``[tiab]``, which combines ``Fields.TITLE`` and ``Fields.ABSTRACT``.

When parsing combined DB-Fields, the resulting query should consist of n nodes, each with the same search term and an atomic Standard-Field. For example, ``Literacy[tiab]`` should become ``(Literacy[ti] OR Literacy[ab])``. When serializing a database string, it is recommended to combine Standard-Fields into DB-Fields whenever possible.
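
Both directions of the 1:n case can be sketched on plain strings. This is only an illustration under assumptions: ``COMBINED_FIELDS`` stands in for a map like ``constants.PLATFORM_COMBINED_FIELDS_MAP``, whose actual structure is not shown here, and the helper names are hypothetical.

```python
# Assumption: a combined-field map shaped like the one described in the text;
# the entries and helper names are illustrative only.
COMBINED_FIELDS = {"[tiab]": ["[ti]", "[ab]"]}


def expand_combined_field(term: str, field: str) -> str:
    """Parsing direction: expand a combined field into an OR query."""
    atomic_fields = COMBINED_FIELDS.get(field.lower(), [field])
    if len(atomic_fields) == 1:
        return f"{term}{atomic_fields[0]}"
    return "(" + " OR ".join(f"{term}{f}" for f in atomic_fields) + ")"


def combine_fields(term: str, fields: list[str]) -> str:
    """Serializing direction: use one combined field when the map allows it."""
    if len(fields) == 1:
        return f"{term}{fields[0]}"
    for combined, atomic_fields in COMBINED_FIELDS.items():
        if sorted(fields) == sorted(atomic_fields):
            return f"{term}{combined}"
    return "(" + " OR ".join(f"{term}{f}" for f in fields) + ")"


expanded = expand_combined_field("Literacy", "[tiab]")
combined = combine_fields("Literacy", ["[ab]", "[ti]"])
```

A real implementation would work on query nodes rather than strings, but the mapping logic would be the same.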

**n:1 matches**

If multiple DB-Fields correspond to the same Standard-Field, the combination of the default DB-Field and the Standard-Field is added to the ``constants.PLATFORM_FIELD_MAP``. Non-default DB-Fields are replaced by the parser. For example, the default for MeSH terms at PubMed is ``[mh]``, but the parser also supports ``[mesh]``.
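
The replacement of non-default fields might be sketched as a simple alias lookup. The ``FIELD_ALIASES`` table and ``normalize_field`` helper are hypothetical; only the ``[mesh]`` to ``[mh]`` pair comes from the text above.

```python
# Hypothetical alias table: non-default field -> default field.
FIELD_ALIASES = {"[mesh]": "[mh]"}


def normalize_field(field: str) -> str:
    """Replace a non-default field by its default; keep other fields as-is."""
    key = field.lower()
    return FIELD_ALIASES.get(key, key)


normalized = normalize_field("[MeSH]")
```

After this step, only the default field needs an entry in the field map.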
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
import re

# handle special cases before (e.g., [tiab:~6])?
# simply replace the search_field values

# Map different variations (such as capitalization, short/long versions) to a standard_syntax_str
PREPROCESSING_MAP = {
    "TI=": r"TI=|Title=",
}


def map_to_standard(syntax_str: str) -> str:
    """Return the standard_syntax_str for a given variation."""
    for standard_key, variation_regex in PREPROCESSING_MAP.items():
        # fullmatch: the whole string must be a known variation
        if re.fullmatch(variation_regex, syntax_str, flags=re.IGNORECASE):
            return standard_key
    # No variation matched: assume syntax_str is already in standard form
    # (returning "default" would fail the lookup in SYNTAX_GENERIC_MAP)
    return syntax_str


# Map standard_syntax_str to set of generic_search_field_set
SYNTAX_GENERIC_MAP = {
    "TIAB=": {"TI=", "AB="},
    "TI=": {"TI="},
    "AB=": {"AB="},
    "TP=": {"TP="},
    "ATP=": {"TP="},
    "WOSTP=": {"TP="},
}


def syntax_str_to_generic_search_field_set(syntax_str: str) -> set:
    standard_syntax_str = map_to_standard(syntax_str)
    generic_search_field_set = SYNTAX_GENERIC_MAP[standard_syntax_str]
    return generic_search_field_set


def generic_search_field_set_to_syntax_set(generic_search_field_set: set) -> set:
    syntax_set = set()  # {} would create a dict, which has no add()
    for key, value in SYNTAX_GENERIC_MAP.items():
        if generic_search_field_set == value:
            syntax_set.add(key)
    # will add TIAB for {TI, AB} but not TI or AB
    # will add TP, ATP, and WOSTP for {TP}
    if not syntax_set:
        raise ValueError(f"no syntax string for {generic_search_field_set}")

    return syntax_set


# search_field string in syntax_1
sf_1 = "TI"
# mapped to a set of generic search fields
sf = {"TI"}
# mapped to a set of strings in syntax_2
sf_2 = {"[ti]"}

# search_field string in syntax_1
sf_1 = "TIAB"
# mapped to a set of generic search fields
sf = {"TI", "AB"}
# mapped to a set of strings in syntax_2
sf_2 = {"TI=", "AB="}

# search_field string in syntax_1
sf_1 = "TIAB"
# mapped to a set of generic search fields
sf = {"TI", "AB"}
# mapped to a set of strings in syntax_2
sf_2 = {"[tiab]"}

# search_field string in syntax_1
sf_1 = "TP"
# mapped to a set of generic search fields
sf = {"TP"}
# mapped to a set of strings in syntax_2
sf_2 = {"[tp]", "[atp]", "[wostp]"}


# linters: operate on syntax-specific fields

# parser: if len(sf) > 1:
# create OR query with children for each element in search_field

# TODO: example for creating OR queries (or combining them)
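
The ``sf_1``, ``sf``, ``sf_2`` examples above describe a translation that passes through a generic set of search fields. A minimal sketch of that idea follows. All table names, their entries, and the ``translate_field`` helper are assumptions chosen to mirror the examples, not the library's API.

```python
# Hypothetical tables: syntax_1 field -> generic set, generic set -> target field.
SYNTAX_1_TO_GENERIC = {"TI": {"TI"}, "AB": {"AB"}, "TIAB": {"TI", "AB"}}
BRACKET_SYNTAX = {
    frozenset({"TI"}): "[ti]",
    frozenset({"AB"}): "[ab]",
    frozenset({"TI", "AB"}): "[tiab]",
}
EQUALS_SYNTAX = {frozenset({"TI"}): "TI=", frozenset({"AB"}): "AB="}


def translate_field(field: str, target: dict) -> set:
    """Translate via the generic set; fall back to one field per element."""
    generic = SYNTAX_1_TO_GENERIC[field]
    combined = target.get(frozenset(generic))
    if combined is not None:
        return {combined}
    # No combined field in the target syntax: return the atomic fields,
    # which a parser would join into an OR query.
    return {target[frozenset({name})] for name in generic}


with_combined = translate_field("TIAB", BRACKET_SYNTAX)
without_combined = translate_field("TIAB", EQUALS_SYNTAX)
```

When the result contains more than one field, the caller would create an OR query with one child per field, as the parser note above suggests.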

docs/source/improve.rst

Lines changed: 3 additions & 1 deletion
@@ -105,4 +105,6 @@ References

 .. parsed-literal::

-    Cooper C, Varley-Campbell J, Booth A, et al. (2018) Systematic review identifies six metrics and one method for assessing literature search effectiveness but no consensus on appropriate use. Journal of Clinical Epidemiology 99: 53–63. DOI: 10.1016/J.JCLINEPI.2018.02.025.
+    Cooper C, Varley-Campbell J, Booth A, et al. (2018) Systematic review identifies six metrics and one method for assessing
+    literature search effectiveness but no consensus on appropriate use. Journal of Clinical Epidemiology 99: 53–63.
+    DOI: 10.1016/J.JCLINEPI.2018.02.025.
