Commit 235d6cf

Database for queries and filters (#35)
* extract translator
* add cli:translate
* rename: syntax > platform
* copy objects etc
* revise serializers
* introduce serializer class
* add serializer_base
* serializer class
* minor issues
* refactor serializer
* revise tests
* update
* extract translator
* format
* refactor translators/constants modules
* revisions
* exclude docs skeletons from linters
* rename function
* revise generic_search_field_to_syntax_field
* update docs
* wos: handling unbalanced quotes
* initial code
* update import_lib files
* add importlib dependency for python3.8
* fix format
* revise
* update docs
* fix typing infos
1 parent 17e821c commit 235d6cf

11 files changed, +185 -4 lines changed

docs/source/load.rst

Lines changed: 52 additions & 1 deletion
@@ -3,6 +3,11 @@
 Load
 ====================

+Queries can be loaded from strings/files, defined as objects, or retrieved from the internal database.
+
+String/File
+-------------------------
+
 Search-query can parse queries from strings and JSON query files.
 To load a JSON query file, run the parser:

@@ -28,11 +33,57 @@ JSON files in the standard format (Haddaway et al. 2022). Example:
     "search_string": "TS=(quantum AND dot AND spin)"
 }

+
+Query objects
+-------------------------
+
 Query objects can also be created programmatically.

-Filters (TODO)
+.. code-block:: python
+
+    from search_query import OrQuery, AndQuery
+
+    # Typical building-blocks approach
+    digital_synonyms = OrQuery(["digital", "virtual", "online"], search_field="Abstract")
+    work_synonyms = OrQuery(["work", "labor", "service"], search_field="Abstract")
+    query = AndQuery([digital_synonyms, work_synonyms], search_field="Author Keywords")
+
+Database
 ---------------------

+.. code-block:: python
+
+    from search_query import database
+
+    query = database.load_query("journals_FT50")
+
+
+
+.. code-block:: python
+
+    from search_query.database import FT50, clinical_trials
+
+    print(FT50)
+    > OR[issn=1234, issn=5678, JN="MIS Quarterly", ...]
+
+    print(clinical_trials)
+    > OR[title=rct, title="clinical trial", title="randomized controlled trial", title="experiment", ...]
+
+    # Combination with custom query blocks
+    custom_block = OrQuery(...)
+    full_query = AndQuery([custom_block, clinical_trials, FT50])
+
+In addition, the ``database_queries`` module offers direct programmatic access to full queries and filters:
+
+.. code-block:: python
+
+    from search_query.database_queries import FT50
+
+    print(FT50)
+
+
+Links:
+
 - `search blocks <https://blocks.bmi-online.nl/>`_ are available under a creative-commons license
 - `overview_1 <https://sites.google.com/york.ac.uk/sureinfo/home/search-filters>`_
 - `overview_2 <https://sites.google.com/a/york.ac.uk/issg-search-filters-resource/home/https-sites-google-com-a-york-ac-uk-issg-search-filters-resource-collections-of-search-filters>`_

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -26,6 +26,9 @@ classifiers = [
 ]
 include = ["LICENSE", "README.md"]
 requires-python = ">=3.8"
+dependencies = [
+    'importlib-resources>=5.1; python_version < "3.9"'
+]

 [project.optional-dependencies]
 docs = [
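
The conditional dependency above backs the files() import fallback used in search_query/database.py. As a minimal sketch of what files() returns, the snippet below uses the standard-library json package as a stand-in for search_query; that substitution is an assumption made only so the example runs without the project installed.

```python
try:
    from importlib.resources import files  # Python 3.9+
except ImportError:
    from importlib_resources import files  # backport for Python 3.8

# Stand-in package: the standard-library "json" package, not "search_query".
package_root = files("json")

# The "/" operator joins path segments, as in files("search_query") / "json_db/...".
decoder_module = package_root / "decoder.py"

resource_name = decoder_module.name
resource_exists = decoder_module.is_file()
print(resource_name, resource_exists)
```

The returned object behaves like a path, which is why the commit can pass it on to a file loader and iterate over a directory with iterdir().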

search_query/database.py

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+#!/usr/bin/env python3
+"""Database and filters."""
+import typing
+
+try:
+    from importlib.resources import files  # Python 3.9+
+except ImportError:
+    from importlib_resources import files  # pip install importlib_resources
+
+from search_query.parser import parse
+from search_query.search_file import load_search_file
+
+if typing.TYPE_CHECKING:
+    from search_query.query import Query
+
+# mypy: disable-error-code=attr-defined
+
+
+def load_query(name: str) -> "Query":
+    """Load a query object from JSON by name."""
+    try:
+        name = name.replace(".json", "")
+        json_path = files("search_query") / f"json_db/{name}.json"  # mypy: ignore
+        # print(json_path)
+        search = load_search_file(json_path)
+        query = parse(search.search_string, platform=search.platform)
+        # print(query.to_structured_string())
+        return query
+    except FileNotFoundError as exc:
+        raise KeyError(f"No query file named {name}.json found") from exc
+
+
+def list_queries() -> typing.List[str]:
+    """List all available predefined query identifiers (without .json)."""
+
+    # TODO : also give details (dictionary?)
+
+    json_dir = files("search_query") / "json_db"
+    return [
+        file.name.replace(".json", "")
+        for file in json_dir.iterdir()
+        if file.suffix == ".json"
+    ]
+
+
+# # TODO : offer an alternative load_search_file() function
+# (which gives users access to more information)
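
The TODO in list_queries() mentions returning details as a dictionary. One possible shape is sketched below using only the standard library. The helper name list_query_details and the choice of fields (title, platform, type) are hypothetical and not part of this commit; a temporary directory stands in for search_query/json_db.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def list_query_details(json_dir: Path) -> Dict[str, Dict[str, str]]:
    """Map each query identifier to a few descriptive fields (hypothetical helper)."""
    details = {}
    for path in sorted(json_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # Missing fields fall back to an empty string instead of raising.
        details[path.stem] = {
            "title": record.get("title", ""),
            "platform": record.get("platform", ""),
            "type": record.get("type", ""),
        }
    return details


# Demo with a throwaway directory standing in for search_query/json_db.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "journals_FT50.json"
    demo_file.write_text(
        json.dumps(
            {"title": "Search filter: FT50 journals", "platform": "wos", "type": "filter"}
        ),
        encoding="utf-8",
    )
    overview = list_query_details(Path(tmp))

print(overview)
```

Keying the result by identifier keeps it compatible with load_query(name), since the same string can be passed on unchanged.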

search_query/database_queries.py

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+#!/usr/bin/env python3
+"""Database and filters."""
+from search_query.database import load_query
+
+FT50 = load_query("journals_FT50")
+AIS_8 = load_query("ais_senior_scholars_basket")
+AIS_11 = load_query("ais_senior_scholars_list_of_premier_journals")
+
+__all__ = ["FT50", "AIS_8", "AIS_11"]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+{
+    "record_info": {},
+    "authors": [{"name": "Gerit Wagner"}],
+    "date": {"data_entry": "2025-05-12"},
+    "platform": "wos",
+    "type": "filter",
+    "database": [],
+    "search_string": "SO=(\"European Journal of Information Systems\" OR \"Information Systems Journal\" OR \"Information Systems Research\" OR \"Journal of the Association for Information Systems\" OR \"Journal of Information Technology\" OR \"Journal of Management Information Systems\" OR \"Journal of Strategic Information Systems\" OR \"MIS Quarterly\") OR IS=(0960-085X OR 1476-9344 OR 1350-1917 OR 1365-2575 OR 1047-7047 OR 1526-5536 OR 1536-9323 OR 0268-3962 OR 1466-4437 OR 0742-1222 OR 1557-928X OR 0963-8687 OR 1873-1198 OR 0276-7783 OR 2162-9730)",
+    "title": "Search filter: AIS Senior Scholars Basket",
+    "description": "Search filter for the AIS Senior Scholars Basket of Journals (eight)",
+    "keywords": "AIS, journals, information systems",
+    "license": "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License."
+}
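
The journal filters in this commit share one pattern: a quoted, OR-connected list of titles under SO= combined with an OR-connected list of ISSNs under IS=. A minimal sketch of assembling such a string follows. The function build_wos_journal_filter is hypothetical; the commit stores these strings as hand-written JSON and does not necessarily generate them this way.

```python
from typing import Sequence


def build_wos_journal_filter(titles: Sequence[str], issns: Sequence[str]) -> str:
    """Combine journal titles (SO=) and ISSNs (IS=) into one OR-connected string."""
    quoted_titles = " OR ".join(f'"{title}"' for title in titles)
    joined_issns = " OR ".join(issns)
    return f"SO=({quoted_titles}) OR IS=({joined_issns})"


example = build_wos_journal_filter(
    ["MIS Quarterly", "Information Systems Research"],
    ["0276-7783", "1047-7047"],
)
print(example)
```

When such a string is written to a JSON file, the inner double quotes need escaping, which json.dumps handles automatically.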
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+{
+    "record_info": {},
+    "authors": [{"name": "Gerit Wagner"}],
+    "date": {"data_entry": "2025-05-12"},
+    "platform": "wos",
+    "type": "filter",
+    "database": [],
+    "search_string": "SO=(\"European Journal of Information Systems\" OR \"Information Systems Journal\" OR \"Information Systems Research\" OR \"Journal of the Association for Information Systems\" OR \"Journal of Information Technology\" OR \"Journal of Management Information Systems\" OR \"Journal of Strategic Information Systems\" OR \"MIS Quarterly\" OR \"Decision Support Systems\" OR \"Information & Management\" OR \"Information and Organization\") OR IS=(0960-085X OR 1476-9344 OR 1350-1917 OR 1365-2575 OR 1047-7047 OR 1526-5536 OR 1536-9323 OR 0268-3962 OR 1466-4437 OR 0742-1222 OR 1557-928X OR 0963-8687 OR 1873-1198 OR 0276-7783 OR 2162-9730 OR 0167-9236 OR 1873-5797 OR 0378-7206 OR 1872-7530 OR 1471-7727 OR 1873-7919)",
+    "title": "Search filter: AIS Senior Scholars List of Premier Journals",
+    "description": "Search filter for the AIS Senior Scholars List of Premier Journals (eleven), see https://aisnet.org/page/SeniorScholarListofPremierJournals",
+    "keywords": "AIS, journals, information systems",
+    "license": "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License."
+}
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+{
+    "record_info": {
+        "url": "https://blocks.bmi-online.nl/catalog/343"
+    },
+    "authors": [{"name": "Ket JCF"}],
+    "date": {"data_entry": "2019-01-15"},
+    "platform": "PubMed",
+    "type": "filter",
+    "database": [],
+    "search_string": "(treatment*[tiab] AND usual[tiab]) OR (standard[tiab] AND care[tiab]) OR (standard[tiab] AND treatment[tiab]) OR (routine[tiab] AND care[tiab]) OR (usual[tiab] AND medication*[tiab]) OR (usual[tiab] AND care[tiab]) OR tau[tiab] OR waitlist*[tiab] OR wait list*[tiab] OR waiting list*[tiab] OR (waiting[tiab] AND (condition[tiab] OR control[tiab])) OR wlc[tiab] OR (delay*[tiab] AND (start[tiab] OR treatment*[tiab])) OR \"no intervention\"[tiab] OR non treatment*[tiab] OR nontreatment*[tiab] OR (minim*[tiab] AND treatment*[tiab]) OR untreated group*[tiab] OR untreated control*[tiab] OR \"without any treatment\"[tiab] OR (untreated[tiab] AND (patients[tiab] OR participants[tiab] OR subjects[tiab] OR group*[tiab] OR control*[tiab])) OR non intervention*[tiab] OR (\"without any\"[tiab] AND intervention*[tiab]) OR (receiv*[tiab] AND nothing[tiab]) OR \"did not receive\"[tiab] OR standard control[tiab] OR non therap*[tiab] OR nontherap*[tiab] OR nonpsychotherap*[tiab] OR (minim*[tiab] AND therap*[tiab]) OR pseudotherap*[tiab] OR pseudo therap*[tiab] OR (therap*[tiab] AND as usual[tiab]) OR usual therap*[tiab] OR reference group[tiab] OR observation group[tiab] OR (convention*[tiab] AND treatment[tiab]) OR conventional therap*[tiab] OR standard treatment*[tiab] OR (standard[tiab] AND therap*[tiab])",
+    "title": "Search block: Treatment as usual",
+    "description": "original drafts by Sarah Dawson, Cochrane CCDAN group, 21 Jun 2016. TAU-filters may retrieve other comparative studies than RCTs, the comparison here being 'treatment as usual', in stead of placebo or gold standard therapy. See also 'no treatment'. For CINAHL the 'Treatment as usual'-filter is integrated in the RCT-filter.",
+    "keywords": "EBSCO/CINAHL; EBSCO/PsycInfo (PI); EMbase.com (EM); PubMed (PM); Study types; Therapy",
+    "license": "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License."
+}
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+{
+    "record_info": {},
+    "authors": [{"name": "Gerit Wagner"}],
+    "date": {"data_entry": "2025-05-12"},
+    "platform": "wos",
+    "type": "filter",
+    "database": [],
+    "search_string": "SO=(\"Academy of Management Journal\" OR \"Academy of Management Review\" OR \"Accounting, Organizations and Society\" OR \"Administrative Science Quarterly\" OR \"American Economic Review\" OR \"Contemporary Accounting Research\" OR \"Econometrica\" OR \"Entrepreneurship Theory and Practice\" OR \"Harvard Business Review\" OR \"Human Relations\" OR \"Human Resource Management\" OR \"Information Systems Research\" OR \"Journal of Accounting and Economics\" OR \"Journal of Accounting Research\" OR \"Journal of Applied Psychology\" OR \"Journal of Business Ethics\" OR \"Journal of Business Venturing\" OR \"Journal of Consumer Psychology\" OR \"Journal of Consumer Research\" OR \"Journal of Finance\" OR \"Journal of Financial and Quantitative Analysis\" OR \"Journal of Financial Economics\" OR \"Journal of International Business Studies\" OR \"Journal of Management\" OR \"Journal of Management Information Systems\" OR \"Journal of Management Studies\" OR \"Journal of Marketing\" OR \"Journal of Marketing Research\" OR \"Journal of Operations Management\" OR \"Journal of Political Economy\" OR \"Journal of the Academy of Marketing Science\" OR \"Management Science\" OR \"Manufacturing and Service Operations Management\" OR \"Marketing Science\" OR \"MIS Quarterly\" OR \"Operations Research\" OR \"Organization Science\" OR \"Organization Studies\" OR \"Organizational Behavior and Human Decision Processes\" OR \"Production and Operations Management\" OR \"Quarterly Journal of Economics\" OR \"Research Policy\" OR \"Review of Accounting Studies\" OR \"Review of Economic Studies\" OR \"Review of Finance\" OR \"Review of Financial Studies\" OR \"Sloan Management Review\" OR \"Strategic Entrepreneurship Journal\" OR \"Strategic Management Journal\" OR \"The Accounting Review\") OR IS=(0001-4273 OR 0363-7425 OR 0361-3682 OR 0001-8392 OR 0002-8282 OR 0823-9150 OR 0012-9682 OR 1042-2587 OR 0017-8012 OR 0018-7267 OR 0090-4848 OR 1047-7047 OR 0165-4101 OR 0021-8456 OR 0021-9010 OR 0167-4544 OR 0883-9026 OR 1057-7408 OR 0093-5301 OR 0022-1082 OR 0022-1090 OR 0304-405X OR 0047-2506 OR 0149-2063 OR 0742-1222 OR 0022-2429 OR 0022-2437 OR 0272-6963 OR 0022-3808 OR 0092-0703 OR 0025-1909 OR 1523-4614 OR 0732-2399 OR 0276-7783 OR 0030-364X OR 1047-7039 OR 0170-8406 OR 0749-5978 OR 1059-1478 OR 0033-5533 OR 0048-7333 OR 1380-6653 OR 0034-6527 OR 1572-3097 OR 0893-9454 OR 0036-8075 OR 1932-4391 OR 0143-2095 OR 0001-4826)",
+    "title": "Search filter: FT50 journals",
+    "description": "Search filter for the FT50 (Financial Times 50) journals. The FT50 list is a list of 50 academic journals in the field of business and management that are considered to be the most prestigious and influential in the world. The list is published annually by the Financial Times and is used as a benchmark for academic research and publishing in the field. More information: https://www.ft.com/content/3405a512-5cbb-11e1-8f1f-00144feabdc0",
+    "keywords": "FT50, journals, business, management",
+    "license": "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License."
+}
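
load_query() reads two attributes from each search file: search_string and platform. A small pre-check for new JSON files could look like the sketch below. Treating exactly these two fields as required is an assumption drawn from that function, and missing_fields is a hypothetical helper, not part of the package.

```python
from typing import Any, Dict, List

# Assumption: only the two attributes that load_query() reads are treated as required.
REQUIRED_FIELDS = ("platform", "search_string")


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in a search-file record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


complete = {"platform": "wos", "search_string": 'SO=("MIS Quarterly")'}
incomplete = {"platform": "PubMed", "title": "Search block: Treatment as usual"}

print(missing_fields(complete))
print(missing_fields(incomplete))
```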

search_query/linter_pubmed.py

Lines changed: 1 addition & 0 deletions
@@ -460,6 +460,7 @@ def _extract_subqueries(
         if not query.children:
             subqueries[subquery_id].append(query)

+    # pylint: disable=duplicate-code
     def check_unsupported_pubmed_search_fields(self) -> None:
         """Check for the correct format of fields."""

test/test_database.py

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#!/usr/bin/env python
+"""Tests for search query database."""
+from search_query.database import load_query
+from search_query.query import Query
+
+# ruff: noqa: E501
+# flake8: noqa: E501
+
+
+def test_load_query_journals_ft50() -> None:
+    """Test loading the JOURNALS_FT50 query file."""
+    query = load_query("journals_FT50")
+
+    assert isinstance(query, Query)
+    print(query.to_generic_string())
+    assert (
+        query.to_generic_string()
+        == 'OR[OR[SO=]["Academy of Management Journal", "Academy of Management Review", "Accounting, Organizations and Society", "Administrative Science Quarterly", "American Economic Review", "Contemporary Accounting Research", "Econometrica", "Entrepreneurship Theory and Practice", "Harvard Business Review", "Human Relations", "Human Resource Management", "Information Systems Research", "Journal of Accounting and Economics", "Journal of Accounting Research", "Journal of Applied Psychology", "Journal of Business Ethics", "Journal of Business Venturing", "Journal of Consumer Psychology", "Journal of Consumer Research", "Journal of Finance", "Journal of Financial and Quantitative Analysis", "Journal of Financial Economics", "Journal of International Business Studies", "Journal of Management", "Journal of Management Information Systems", "Journal of Management Studies", "Journal of Marketing", "Journal of Marketing Research", "Journal of Operations Management", "Journal of Political Economy", "Journal of the Academy of Marketing Science", "Management Science", "Manufacturing and Service Operations Management", "Marketing Science", "MIS Quarterly", "Operations Research", "Organization Science", "Organization Studies", "Organizational Behavior and Human Decision Processes", "Production and Operations Management", "Quarterly Journal of Economics", "Research Policy", "Review of Accounting Studies", "Review of Economic Studies", "Review of Finance", "Review of Financial Studies", "Sloan Management Review", "Strategic Entrepreneurship Journal", "Strategic Management Journal", "The Accounting Review"], OR[IS=][0001-4273, 0363-7425, 0361-3682, 0001-8392, 0002-8282, 0823-9150, 0012-9682, 1042-2587, 0017-8012, 0018-7267, 0090-4848, 1047-7047, 0165-4101, 0021-8456, 0021-9010, 0167-4544, 0883-9026, 1057-7408, 0093-5301, 0022-1082, 0022-1090, 0304-405X, 0047-2506, 0149-2063, 0742-1222, 0022-2429, 0022-2437, 0272-6963, 0022-3808, 0092-0703, 0025-1909, 1523-4614, 0732-2399, 0276-7783, 0030-364X, 1047-7039, 0170-8406, 0749-5978, 1059-1478, 0033-5533, 0048-7333, 1380-6653, 0034-6527, 1572-3097, 0893-9454, 0036-8075, 1932-4391, 0143-2095, 0001-4826]]'
+    )
