CoLRev-Environment
diff --git a/docs/source/dev_docs/documentation.png b/docs/source/dev_docs/documentation.png
-17.3 KB
diff --git a/docs/source/dev_docs/documentation.pptx b/docs/source/dev_docs/documentation.pptx
-2.48 KB
diff --git a/docs/source/dev_docs/linter_development.rst b/docs/source/dev_docs/linter_development.rst
Lines changed: 75 additions & 0 deletions
diff --git a/docs/source/dev_docs/overview.rst b/docs/source/dev_docs/overview.rst
Lines changed: 19 additions & 0 deletions
diff --git a/docs/source/dev_docs/parser.rst b/docs/source/dev_docs/parser.rst
Lines changed: 0 additions & 133 deletions
diff --git a/docs/source/dev_docs/parser_development.rst b/docs/source/dev_docs/parser_development.rst
Lines changed: 12 additions & 0 deletions
diff --git a/docs/source/dev_docs/tests.rst b/docs/source/dev_docs/tests.rst
Lines changed: 5 additions & 0 deletions
diff --git a/docs/source/dev_docs/translator_development.rst b/docs/source/dev_docs/translator_development.rst
Lines changed: 29 additions & 0 deletions
diff --git a/docs/source/dev_docs/translator_skeleton.py b/docs/source/dev_docs/translator_skeleton.py
Lines changed: 80 additions & 0 deletions
diff --git a/docs/source/improve.rst b/docs/source/improve.rst
Lines changed: 3 additions & 1 deletion
@@ -0,0 +1,75 @@
+Linter
+============
+
+The linter checks should reuse (or extend) the messages from the constants module, which are documented in the `messages <../messages/errors_index.html>`_ section.
+The linter message should be unambiguously defined in the constants module.
+An additional *details* parameter can be added to the linter message, explaining the specific problem.
+
+.. literalinclude:: linter_skeleton.py
+   :language: python
+
+
+Search Field Validation
+-------------------------
+
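+The following table shows how the search field is validated in strict and non-strict modes, depending on whether the search string itself contains a search field and on the value of the separately specified search field.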
+
+.. list-table:: Search Field Validation in Strict vs. Non-Strict Modes
+   :widths: 20 20 20 20 20
+   :header-rows: 1
+
+   * - **Search-Field required**
+     - **Search String**
+     - **Search-Field**
+     - **Mode: Strict**
+     - **Mode: Non-Strict**
+   * - Yes
+     - With Search-Field
+     - Empty
+     - ok
+     - ok
+   * - Yes
+     - With Search-Field
+     - Equal to Search-String
+     - ok - search-field-redundant
+     - ok
+   * - Yes
+     - With Search-Field
+     - Different from Search-String
+     - error: search-field-contradiction
+     - ok - search-field-contradiction. Parser uses Search-String by default
+   * - Yes
+     - Without Search-Field
+     - Empty
+     - error: search-field-missing
+     - ok - search-field-missing. Parser adds ``title`` as the default
+   * - Yes
+     - Without Search-Field
+     - Given
+     - ok - search-field-extracted
+     - ok
+   * - No
+     - With Search-Field
+     - Empty
+     - ok
+     - ok
+   * - No
+     - With Search-Field
+     - Equal to Search-String
+     - ok - search-field-redundant
+     - ok
+   * - No
+     - With Search-Field
+     - Different from Search-String
+     - error: search-field-contradiction
+     - ok - search-field-contradiction. Parser uses Search-String by default
+   * - No
+     - Without Search-Field
+     - Empty
+     - ok - search-field-not-specified
+     - ok - Parser uses default of database
+   * - No
+     - Without Search-Field
+     - Given
+     - ok - search-field-extracted
+     - ok
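The decision logic in the table above can be sketched as a single function. This is only an illustration of the table, not the project's linter: the function name ``validate_search_field``, its parameters, and the ``(status, message)`` return shape are assumptions made for this example.

```python
from typing import Optional, Tuple


def validate_search_field(
    field_in_string: Optional[str],
    general_field: Optional[str],
    *,
    field_required: bool,
    strict: bool,
) -> Tuple[str, Optional[str]]:
    """Return (status, message) for one row of the validation table.

    field_in_string: search field found inside the search string, if any.
    general_field: separately specified search field, if any.
    """
    if field_in_string:
        if not general_field:
            return ("ok", None)
        if general_field == field_in_string:
            # Redundant information is only reported in strict mode.
            return ("ok", "search-field-redundant" if strict else None)
        # The two sources disagree: an error in strict mode only.
        return ("error" if strict else "ok", "search-field-contradiction")

    # The search string itself carries no search field.
    if general_field:
        return ("ok", "search-field-extracted" if strict else None)
    if field_required:
        return ("error" if strict else "ok", "search-field-missing")
    return ("ok", "search-field-not-specified" if strict else None)
```

For example, ``validate_search_field("ti", "ab", field_required=True, strict=True)`` yields an error with ``search-field-contradiction``, while the same input in non-strict mode is accepted with the same message attached.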
@@ -0,0 +1,19 @@
+Overview
+==========================
+
+To support a platform, a parser, linter, translator, and serializer are required.
+The parser is responsible for parsing the search string, while the linter checks the parsed query for errors.
+The serializer converts the parsed query back into a search string.
+
+.. image:: documentation.png
+   :width: 800px
+
+Development setup
+-------------------
+
+.. code-block:: bash
+   :caption: Installation in editable mode with ``dev`` extras
+
+   pip install -e ".[dev]"
+
+A code skeleton is available for the parser, linter, translator, and tests.
@@ -0,0 +1,12 @@
+Parser
+============
+
+To parse a list format, the references to numbered sub-queries should be replaced with the sub-queries themselves to create a single search string, which can then be parsed with the standard string parser. This avoids a redundant implementation.
+
+.. literalinclude:: parser_skeleton.py
+   :language: python
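The list-format approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name ``flatten_list_query`` is hypothetical, sub-queries are assumed to be given as a mapping from number to query text, and references are assumed to use a ``#n`` syntax, which may differ per platform.

```python
import re
from typing import Dict, FrozenSet


def flatten_list_query(sub_queries: Dict[int, str]) -> str:
    """Substitute "#n" references so that the last sub-query is self-contained."""

    def resolve(number: int, seen: FrozenSet[int]) -> str:
        if number in seen:
            raise ValueError(f"Circular reference to #{number}")
        if number not in sub_queries:
            raise ValueError(f"Unknown sub-query #{number}")
        # Replace every "#n" with the resolved, parenthesized sub-query n.
        return re.sub(
            r"#(\d+)",
            lambda m: "(" + resolve(int(m.group(1)), seen | {number}) + ")",
            sub_queries[number],
        )

    # Assumption: the highest-numbered line is the final query.
    return resolve(max(sub_queries), frozenset())


flat = flatten_list_query({1: "cancer OR tumor", 2: "therapy", 3: "#1 AND #2"})
print(flat)  # (cancer OR tumor) AND (therapy)
```

The resulting string can then be passed to the regular string parser, so the list parser itself only needs to handle the substitution.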
+
+
+TODO :
+
+- refer to SearchFieldGeneral here!?
+- recommend using a broad regex for search fields (make sure it is mapped to the right token type) and a validator to check whether the specific value (search field) is valid.
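The second TODO item (broad regex plus validator) could look roughly like the sketch below. The bracket syntax, the pattern ``SEARCH_FIELD_REGEX``, the set ``VALID_FIELDS``, and the function name are all assumptions chosen for illustration; they are not taken from the project's code.

```python
import re
from typing import List

# Broad pattern: anything in square brackets is tokenized as a search field,
# even if the platform does not know the specific value.
SEARCH_FIELD_REGEX = re.compile(r"\[[^\[\]]+\]")

# Hypothetical set of search fields accepted by the platform.
VALID_FIELDS = {"[ti]", "[ab]", "[tiab]", "[mh]"}


def find_invalid_search_fields(query: str) -> List[str]:
    """Return all bracketed search fields that are not in VALID_FIELDS."""
    return [
        match.group(0)
        for match in SEARCH_FIELD_REGEX.finditer(query)
        if match.group(0).lower() not in VALID_FIELDS
    ]


print(find_invalid_search_fields("Literacy[tiab] AND school[tt]"))  # ['[tt]']
```

Separating tokenization from validation in this way means that an unknown field is still recognized as a search-field token, so a check can report it specifically instead of the tokenizer failing on it.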
@@ -0,0 +1,5 @@
+Tests
+----------------
+
+.. literalinclude:: parser_skeleton_tests.py
+   :language: python
@@ -0,0 +1,29 @@
+Translator
+============
+
+.. literalinclude:: translator_skeleton.py
+   :language: python
+
+
+Mapping Fields to Standard-Fields
+----------------------------------------------------------
+
+The search fields supported by the database (Platform-Fields, referred to below as DB-Fields or Database-Fields) may not match the standard fields (Standard-Fields) in ``constants.Fields`` exactly.
+
+TODO : revisit this part:
+
+We distinguish the following cases:
+
+**1:1 matches**
+
+Cases where a 1:1 match exists between DB-Fields and Standard-Fields are added to the ``constants.PLATFORM_FIELD_MAP``.
+
+**1:n matches**
+
+Cases where a DB-Field combines multiple Standard-Fields are added to the ``constants.PLATFORM_COMBINED_FIELDS_MAP``. For example, PubMed offers a search for ``[tiab]``, which combines ``Fields.TITLE`` and ``Fields.ABSTRACT``.
+
+When parsing combined DB-Fields, the resulting standard query should consist of n nodes, each with the same search term and an atomic Standard-Field. For example, ``Literacy[tiab]`` should become ``(Literacy[ti] OR Literacy[ab])``. When serializing a database string, it is recommended to combine Standard-Fields into DB-Fields whenever possible.
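The ``Literacy[tiab]`` expansion described above can be sketched at the string level. This is a simplified illustration only: ``COMBINED_FIELDS`` is a hypothetical stand-in for ``constants.PLATFORM_COMBINED_FIELDS_MAP`` (whose actual structure is not shown here), and the regex handles only single-word terms, whereas a real parser would operate on query nodes.

```python
import re

# Hypothetical stand-in for constants.PLATFORM_COMBINED_FIELDS_MAP:
# combined DB-Field -> atomic fields it covers.
COMBINED_FIELDS = {"[tiab]": ["[ti]", "[ab]"]}

# A single-word term directly followed by a bracketed field, e.g. Literacy[tiab].
TERM_WITH_FIELD = re.compile(r"(\w+)(\[\w+\])")


def expand_combined_fields(query: str) -> str:
    """Rewrite each term with a combined field as an OR of atomic fields."""

    def expand(match: "re.Match[str]") -> str:
        term, field = match.group(1), match.group(2).lower()
        atomic_fields = COMBINED_FIELDS.get(field)
        if atomic_fields is None:
            # Not a combined field: leave the term unchanged.
            return match.group(0)
        return "(" + " OR ".join(term + f for f in atomic_fields) + ")"

    return TERM_WITH_FIELD.sub(expand, query)


print(expand_combined_fields("Literacy[tiab]"))  # (Literacy[ti] OR Literacy[ab])
```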
+
+**n:1 matches**
+
+If multiple Database-Fields correspond to the same Standard-Field, the combination of the default Database-Field and the Standard-Field is added to ``constants.PLATFORM_FIELD_MAP``. Non-default Database-Fields are replaced by the parser. For example, the default for MeSH terms in PubMed is ``[mh]``, but the parser also supports ``[mesh]``.
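Replacing non-default Database-Fields, as in the ``[mesh]`` example above, could be sketched like this. The alias table ``FIELD_ALIASES`` and the function name are hypothetical and only cover the one pair mentioned in the text.

```python
import re

# Hypothetical alias table: non-default Database-Field -> default Database-Field.
FIELD_ALIASES = {"[mesh]": "[mh]"}


def normalize_fields(query: str) -> str:
    """Replace non-default field variants with the platform default."""
    return re.sub(
        r"\[\w+\]",
        # Look up the lowercased field; keep the original text if it has no alias.
        lambda m: FIELD_ALIASES.get(m.group(0).lower(), m.group(0)),
        query,
    )


print(normalize_fields("Neoplasms[MeSH]"))  # Neoplasms[mh]
```

After this step, only the default field needs an entry in the 1:1 field map.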
@@ -0,0 +1,80 @@
+import re
+
+# handle special cases before (e.g., [tiab:~6])?
+# simply replace the search_field values
+
+# Map different variations (such as capitalization, short/long versions) to a standard_syntax_str
+PREPROCESSING_MAP = {
+    "TI=": r"TI=|Title=",
+}
+
+def map_to_standard(syntax_str: str) -> str:
+    """Return the standard syntax string for a variation, or "default" if none matches."""
+    for standard_key, variation_regex in PREPROCESSING_MAP.items():
+        if re.match(variation_regex, syntax_str, flags=re.IGNORECASE):
+            return standard_key
+    return "default"
+
+# Map standard_syntax_str to set of generic_search_field_set
+SYNTAX_GENERIC_MAP = {
+    "TIAB=": {"TI=", "AB="},
+    "TI=": {"TI="},
+    "AB=": {"AB="},
+    "TP=": {"TP="},
+    "ATP=": {"TP="},
+    "WOSTP=": {"TP="},
+}
+
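+def syntax_str_to_generic_search_field_set(syntax_str: str) -> set:
+    standard_syntax_str = map_to_standard(syntax_str)
+    # Note: "default" (returned for unmatched strings) has no entry in
+    # SYNTAX_GENERIC_MAP, so unmatched strings currently raise a KeyError.
+    generic_search_field_set = SYNTAX_GENERIC_MAP[standard_syntax_str]
+    return generic_search_field_set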
+
+def generic_search_field_set_to_syntax_set(generic_search_field_set: set) -> set:
+    syntax_set = set()
+    for key, value in SYNTAX_GENERIC_MAP.items():
+        if generic_search_field_set == value:
+            syntax_set.add(key)
+            # will add TIAB= for {TI=, AB=} but not TI= or AB=
+            # will add TP=, ATP=, and WOSTP= for {TP=}
+    if not syntax_set:
+        raise ValueError(f"No syntax string maps to {generic_search_field_set}")
+
+    return syntax_set
+
+
+# search_field string in syntax_1
+sf_1 = "TI"
+# mapped to a set of generic search fields
+sf = {"TI"}
+# mapped to a set of strings in syntax_2
+sf_2 = {"[ti]"}
+
+# search_field string in syntax_1
+sf_1 = "TIAB"
+# mapped to a set of generic search fields
+sf = {"TI", "AB"}
+# mapped to a set of strings in syntax_2
+sf_2 = {"TI=", "AB="}
+
+# search_field string in syntax_1
+sf_1 = "TIAB"
+# mapped to a set of generic search fields
+sf = {"TI", "AB"}
+# mapped to a set of strings in syntax_2
+sf_2 = {"[tiab]"}
+
+# search_field string in syntax_1
+sf_1 = "TP"
+# mapped to a set of generic search fields
+sf = {"TP"}
+# mapped to a set of strings in syntax_2
+sf_2 = {"[tp]", "[atp]", "[wostp]"}
+
+
+
+# linters: operate on syntax-specific fields
+
+# parser: if len(sf) > 1:
+#     create an OR query with one child for each element in sf
+
+# TODO : example for creating OR queries (or combining them)
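The ``sf_1`` to ``sf`` to ``sf_2`` examples in the skeleton above can be turned into a small runnable sketch that also creates an OR query when the target syntax has no combined field. Everything here is an assumption made for illustration: the maps ``SYNTAX_1`` and ``SYNTAX_2``, the generic field names, and the function ``translate_term`` do not come from the skeleton or the library.

```python
# Hypothetical maps: syntax-specific field -> set of generic search fields.
SYNTAX_1 = {
    "TI=": {"title"},
    "AB=": {"abstract"},
    "TIAB=": {"title", "abstract"},
}
# Assumption: the target syntax offers no combined title/abstract field.
SYNTAX_2 = {
    "[ti]": {"title"},
    "[ab]": {"abstract"},
}


def translate_term(term: str, field_1: str) -> str:
    """Translate a term from syntax_1 to syntax_2 via generic fields."""
    generic_fields = SYNTAX_1[field_1]

    # Prefer a single target field that covers exactly the same generic fields.
    for field_2, covered in SYNTAX_2.items():
        if covered == generic_fields:
            return f"{term}{field_2}"

    # Otherwise create an OR query with one child per generic field.
    children = []
    for generic_field in sorted(generic_fields):
        candidates = [f for f, covered in SYNTAX_2.items() if covered == {generic_field}]
        if not candidates:
            raise ValueError(f"No target field for {generic_field!r}")
        children.append(f"{term}{candidates[0]}")
    return "(" + " OR ".join(children) + ")"


print(translate_term("Literacy", "TI="))    # Literacy[ti]
print(translate_term("Literacy", "TIAB="))  # (Literacy[ab] OR Literacy[ti])
```

Sorting the generic fields only makes the output deterministic; the order of the OR children has no effect on the meaning of the query.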
@@ -105,4 +105,6 @@ References
 
 .. parsed-literal::
 
-   Cooper C, Varley-Campbell J, Booth A, et al. (2018) Systematic review identifies six metrics and one method for assessing literature search effectiveness but no consensus on appropriate use. Journal of Clinical Epidemiology 99: 53–63. DOI: 10.1016/J.JCLINEPI.2018.02.025.
+   Cooper C, Varley-Campbell J, Booth A, et al. (2018) Systematic review identifies six metrics and one method for assessing
+      literature search effectiveness but no consensus on appropriate use. Journal of Clinical Epidemiology 99: 53–63.
+      DOI: 10.1016/J.JCLINEPI.2018.02.025.