Commit a973edd

Feature/bcss 21355 subject assertion util (#125)
## Description

Adds a new util to manage assertions on subjects. This is very similar to how the Selenium tests do it, except that if the subject does not match the criteria, it checks each criterion individually to see what is causing the failure and logs it:

```
ERROR root:subject_assertion.py:69 Subject Assertion Failed
Failed criteria:
latest episode type, FOBT
latest episode status, Open
latest episode has referral date, Past
latest episode has diagnosis date, No
latest episode diagnosis date reason, NULL
```

## Context

Allows us to perform assertions on subjects using a common util.

## Type of changes

- [x] Refactoring (non-breaking change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would change existing functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I am familiar with the [contributing guidelines](https://github.com/nhs-england-tools/playwright-python-blueprint/blob/main/CONTRIBUTING.md)
- [x] I have followed the code style of the project
- [x] I have added tests to cover my changes (where appropriate)
- [x] I have updated the documentation accordingly
- [ ] This PR is a result of pair or mob programming

---

## Sensitive Information Declaration

To ensure the utmost confidentiality and protect your and others' privacy, we kindly ask you NOT to include [PII (Personal Identifiable Information) / PID (Personal Identifiable Data)](https://digital.nhs.uk/data-and-information/keeping-data-safe-and-benefitting-the-public) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PRs that contain sensitive information. We really appreciate your cooperation in this matter.

- [x] I confirm that neither PII/PID nor sensitive data are included in this PR and the codebase changes.

1 parent 67ec692 commit a973edd

5 files changed: +311 -2 lines changed

classes/subject.py

Lines changed: 61 additions & 0 deletions
@@ -10,6 +10,8 @@
 from classes.sdd_reason_for_change_type import SDDReasonForChangeType
 from classes.ss_reason_for_change_type import SSReasonForChangeType
 from classes.ssdd_reason_for_change_type import SSDDReasonForChangeType
+from utils.date_time_utils import DateTimeUtils
+import pandas as pd


 @dataclass
@@ -1241,3 +1243,62 @@ def __str__(self):
             f"datestamp={self.datestamp}"
             f"]"
         )
+
+    @staticmethod
+    def from_dataframe_row(row: pd.Series) -> "Subject":
+        """
+        Populates a Subject object from a pandas DataFrame row.
+        Handles type conversions for dates and datetimes.
+        Only fields present in the SQL query are populated.
+        """
+
+        field_map = {
+            "screening_subject_id": row.get("screening_subject_id"),
+            "nhs_number": row.get("subject_nhs_number"),
+            "surname": row.get("person_family_name"),
+            "forename": row.get("person_given_name"),
+            "datestamp": DateTimeUtils.parse_datetime(row.get("datestamp")),
+            "screening_status_id": row.get("screening_status_id"),
+            "screening_status_change_reason_id": row.get("ss_reason_for_change_id"),
+            "screening_status_change_date": DateTimeUtils.parse_date(
+                row.get("screening_status_change_date")
+            ),
+            "screening_due_date": DateTimeUtils.parse_date(
+                row.get("screening_due_date")
+            ),
+            "screening_due_date_change_reason_id": row.get("sdd_reason_for_change_id"),
+            "screening_due_date_change_date": DateTimeUtils.parse_date(
+                row.get("sdd_change_date")
+            ),
+            "calculated_screening_due_date": DateTimeUtils.parse_date(
+                row.get("calculated_sdd")
+            ),
+            "surveillance_screening_due_date": DateTimeUtils.parse_date(
+                row.get("surveillance_screen_due_date")
+            ),
+            "calculated_surveillance_due_date": DateTimeUtils.parse_date(
+                row.get("calculated_ssdd")
+            ),
+            "surveillance_due_date_change_reason_id": row.get(
+                "surveillance_sdd_rsn_change_id"
+            ),
+            "surveillance_due_date_change_date": DateTimeUtils.parse_date(
+                row.get("surveillance_sdd_change_date")
+            ),
+            "lynch_due_date": DateTimeUtils.parse_date(
+                row.get("lynch_screening_due_date")
+            ),
+            "lynch_due_date_change_reason_id": row.get(
+                "lynch_sdd_reason_for_change_id"
+            ),
+            "lynch_due_date_change_date": DateTimeUtils.parse_date(
+                row.get("lynch_sdd_change_date")
+            ),
+            "calculated_lynch_due_date": DateTimeUtils.parse_date(
+                row.get("lynch_calculated_sdd")
+            ),
+            "date_of_birth": DateTimeUtils.parse_date(row.get("date_of_birth")),
+            "date_of_death": DateTimeUtils.parse_date(row.get("date_of_death")),
+        }
+
+        return Subject(**field_map)
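
The mapping pattern in `from_dataframe_row` can be tried without a database. The following is a minimal sketch under stated assumptions: `MiniSubject` is a hypothetical cut-down stand-in for `Subject`, a plain dict stands in for the pandas row (both expose `.get`), `parse_date` is a simplified stdlib-only stand-in for `DateTimeUtils.parse_date`, and the date of birth value is made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Any, Mapping, Optional


def parse_date(val: Any) -> Optional[date]:
    """Simplified stand-in for DateTimeUtils.parse_date (no pandas handling)."""
    if val is None:
        return None
    if isinstance(val, datetime):
        return val.date()
    if isinstance(val, date):
        return val
    if isinstance(val, str):
        try:
            return datetime.strptime(val[:10], "%Y-%m-%d").date()
        except ValueError:
            return None
    return None


@dataclass
class MiniSubject:
    """Hypothetical cut-down Subject holding three of the mapped fields."""

    nhs_number: Optional[str] = None
    surname: Optional[str] = None
    date_of_birth: Optional[date] = None

    @staticmethod
    def from_row(row: Mapping[str, Any]) -> "MiniSubject":
        # Column names come from the diff above. A column missing from the
        # row yields None, because .get() is called without a default.
        field_map = {
            "nhs_number": row.get("subject_nhs_number"),
            "surname": row.get("person_family_name"),
            "date_of_birth": parse_date(row.get("date_of_birth")),
        }
        return MiniSubject(**field_map)


# Illustrative row: the surname column is deliberately absent.
row = {"subject_nhs_number": "9233639266", "date_of_birth": "1970-01-31 00:00:00"}
subject = MiniSubject.from_row(row)
```

Because every lookup goes through `.get()`, a query that selects fewer columns still produces an object, with the unselected fields left as `None`.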
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+# Utility Guide: Subject Assertion Utility
+
+This guide explains the purpose and usage of the `subject_assertion` utility found in [`utils/subject_assertion.py`](../../utils/subject_assertion.py).
+It is designed to assert that a subject with a given NHS number matches specified criteria in the database, and provides detailed logging when criteria do not match.
+
+---
+
+## Table of Contents
+
+- [Utility Guide: Subject Assertion Utility](#utility-guide-subject-assertion-utility)
+  - [Table of Contents](#table-of-contents)
+  - [Overview](#overview)
+  - [Required Arguments](#required-arguments)
+  - [How It Works](#how-it-works)
+  - [Example Usage](#example-usage)
+  - [Behaviour Details](#behaviour-details)
+  - [Best Practices](#best-practices)
+  - [Reference](#reference)
+
+---
+
+## Overview
+
+The `subject_assertion` function is used to verify that a subject in the database matches a set of criteria.
+If the subject does not match all criteria, the function checks each criterion individually (always alongside the NHS number) and logs any criterion that caused the assertion to fail.
+
+---
+
+## Required Arguments
+
+- `nhs_number` (`str`): The NHS number of the subject to check.
+- `criteria` (`dict`): A dictionary of criteria to match against the subject's attributes.
+
+---
+
+## How It Works
+
+1. The function first queries for the subject with the given NHS number and all provided criteria together.
+2. If that query returns the subject, the function returns `True`.
+3. Otherwise, it queries each criterion on its own, always paired with the NHS number.
+4. Every criterion that fails on its own is logged; if none fail individually, the log reports that the criteria combination is invalid or conflicting.
+5. The function returns `True` only if all criteria match on the first attempt; otherwise, it returns `False`.
+
+---
+
+## Example Usage
+
+Below is an example of how to use `subject_assertion` in your tests:
+
+```python
+import pytest
+from utils.subject_assertion import subject_assertion
+
+pytestmark = [pytest.mark.utils_local]
+
+def test_subject_assertion_true():
+    nhs_number = "9233639266"
+    criteria = {"screening status": "Inactive", "subject age": "> 28"}
+    assert subject_assertion(nhs_number, criteria) is True
+```
+
+See `tests_utils/test_subject_assertion_util.py` for more examples.
+
+---
+
+## Behaviour Details
+
+- The NHS number criterion is included in every query.
+- Criteria that fail individually are logged under a `Failed criteria:` heading, one `key, value` pair per line, for example:
+  - `latest episode status, Open`
+- The function will only return `True` if all criteria match on the first attempt.
+
+---
+
+## Best Practices
+
+- Use this utility to validate subject data in database-driven tests.
+- Review logs for failed criteria to diagnose why assertions did not pass.
+- Pass the NHS number as the `nhs_number` argument; the function adds it to the criteria for you.
+
+---
+
+## Reference
+
+- [`utils/subject_assertion.py`](../../utils/subject_assertion.py)
+- [`tests_utils/test_subject_assertion_util.py`](../../tests_utils/test_subject_assertion_util.py)
+- [SubjectSelectionQueryBuilder Utility Guide](SubjectSelectionQueryBuilder.md)
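
The "all together, then one at a time" flow described in the guide can be sketched without Oracle. This is an illustration under stated assumptions, not the utility itself: `assert_with_diagnosis`, `record` and `record_matches` are hypothetical names, an in-memory dict replaces the database query, and the sketch returns the failed criteria alongside the boolean so they can be inspected, whereas `subject_assertion` returns only a `bool`.

```python
import logging
from typing import Callable, Dict, List, Tuple

Criteria = Dict[str, str]


def assert_with_diagnosis(
    criteria: Criteria, matches: Callable[[Criteria], bool]
) -> Tuple[bool, List[Tuple[str, str]]]:
    """Return (passed, failed_criteria) using an all-then-each strategy."""
    # Step 1: try every criterion together.
    if matches(criteria):
        return True, []

    # Step 2: re-check each criterion on its own to find the culprits.
    failed = [
        (key, value) for key, value in criteria.items() if not matches({key: value})
    ]

    if failed:
        logging.error(
            "Subject Assertion Failed\nFailed criteria:\n%s",
            "\n".join(f"{key}, {value}" for key, value in failed),
        )
    else:
        # Each criterion passes alone, so only the combination is at fault.
        logging.error("Subject Assertion Failed: criteria conflict in combination.")
    return False, failed


# Hypothetical in-memory subject standing in for the database lookup.
record = {"screening status": "Inactive", "latest episode type": "FOBT"}


def record_matches(criteria: Criteria) -> bool:
    """Exact-match comparison; the real query builder supports richer operators."""
    return all(record.get(key) == value for key, value in criteria.items())


passed, failed = assert_with_diagnosis(
    {"screening status": "Call", "latest episode type": "FOBT"}, record_matches
)
```

Here only `screening status` is reported, because `latest episode type` still matches when checked on its own.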
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+import pytest
+from utils.subject_assertion import subject_assertion
+
+pytestmark = [pytest.mark.utils_local]
+
+nhs_number = "9233639266"
+
+
+def test_subject_assertion_true():
+    criteria = {"screening status": "Inactive", "subject age": "> 28"}
+    assert subject_assertion(nhs_number, criteria) is True
+
+
+def test_subject_assertion_false():
+    criteria = {"screening status": "Call", "subject age": "< 28"}
+    assert subject_assertion(nhs_number, criteria) is False
+
+
+def test_subject_assertion_false_with_some_true():
+    criteria = {
+        "screening status": "Inactive",
+        "subject age": "> 28",
+        "latest episode type": "FOBT",
+        "latest episode status": "Open",
+        "latest episode has referral date": "Past",
+        "latest episode has diagnosis date": "No",
+        "latest episode diagnosis date reason": "NULL",
+    }
+    assert subject_assertion(nhs_number, criteria) is False

utils/date_time_utils.py

Lines changed: 59 additions & 2 deletions
@@ -1,5 +1,6 @@
-from datetime import datetime, timedelta
-from typing import Optional
+from datetime import datetime, timedelta, date
+from typing import Optional, Union
+import pandas as pd
 import random


@@ -152,3 +153,59 @@ def generate_unique_weekday_date(start_year: int = 2025) -> str:
             base_date += timedelta(days=1)

         return base_date.strftime("%d/%m/%Y")
+
+    @staticmethod
+    def parse_date(
+        val: Optional[Union[pd.Timestamp, str, datetime, date]],
+    ) -> Optional[date]:
+        """
+        Converts a value to a Python date object if possible.
+
+        Args:
+            val: The value to convert (can be pandas.Timestamp, string, datetime, date, or None).
+
+        Returns:
+            Optional[date]: The converted date object, or None if conversion fails.
+        """
+        if pd.isnull(val):
+            return None
+        if isinstance(val, pd.Timestamp):
+            return val.to_pydatetime().date()
+        if isinstance(val, str):
+            try:
+                return datetime.strptime(val[:10], "%Y-%m-%d").date()
+            except Exception:
+                return None
+        if isinstance(val, datetime):
+            return val.date()
+        if isinstance(val, date):
+            return val
+        return None
+
+    @staticmethod
+    def parse_datetime(
+        val: Optional[Union[pd.Timestamp, str, datetime, date]],
+    ) -> Optional[datetime]:
+        """
+        Converts a value to a Python datetime object if possible.
+
+        Args:
+            val: The value to convert (can be pandas.Timestamp, string, datetime, or None).
+
+        Returns:
+            Optional[datetime]: The converted datetime object, or None if conversion fails.
+        """
+        if pd.isnull(val):
+            return None
+        if isinstance(val, pd.Timestamp):
+            return val.to_pydatetime()
+        if isinstance(val, str):
+            for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"):
+                try:
+                    return datetime.strptime(val[:19], fmt)
+                except Exception:
+                    continue
+            return None
+        if isinstance(val, datetime):
+            return val
+        return None
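
To see which strings the two parsers accept, the string branches can be reproduced with the standard library alone. This is a sketch, not the module's code: `parse_date_str` and `parse_datetime_str` are hypothetical names, the pandas branches are left out, the sample strings are made up, and `ValueError` is caught where the diff catches `Exception`.

```python
from datetime import date, datetime
from typing import Optional


def parse_date_str(val: str) -> Optional[date]:
    """Mirror of the string branch of parse_date: keep the first 10 characters."""
    try:
        return datetime.strptime(val[:10], "%Y-%m-%d").date()
    except ValueError:
        return None


def parse_datetime_str(val: str) -> Optional[datetime]:
    """Mirror of the string branch of parse_datetime: try two 19-character layouts."""
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(val[:19], fmt)
        except ValueError:
            continue
    return None


# Trailing time or fractional seconds are cut off by the slice before parsing.
from_timestamp = parse_date_str("2025-03-14 09:30:00")
from_iso = parse_datetime_str("2025-03-14T09:30:00.123456")

# Layouts other than ISO year-month-day are rejected rather than guessed.
from_uk_layout = parse_date_str("14/03/2025")

# A date-only string is too short for either datetime layout.
from_date_only = parse_datetime_str("2025-03-14")
```

One consequence worth noting: strings in the `%d/%m/%Y` layout, such as those returned by `generate_unique_weekday_date` in the same module, would come back as `None` from `parse_date`.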

utils/subject_assertion.py

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+from utils.oracle.subject_selection_query_builder import SubjectSelectionQueryBuilder
+from utils.oracle.oracle import OracleDB
+from classes.subject import Subject
+from classes.user import User
+import logging
+
+
+def subject_assertion(nhs_number: str, criteria: dict) -> bool:
+    """
+    Asserts that a subject with the given NHS number exists and matches the provided criteria.
+    Args:
+        nhs_number (str): The NHS number of the subject to find.
+        criteria (dict): A dictionary of criteria to match against the subject's attributes.
+    Returns:
+        bool: True if the subject matches the provided criteria, False if it does not.
+    """
+    nhs_number_string = "nhs number"
+    subject_nhs_number_string = "subject_nhs_number"
+    nhs_no_criteria = {nhs_number_string: nhs_number}
+    subject = Subject()
+    user = User()
+    builder = SubjectSelectionQueryBuilder()
+
+    query, bind_vars = builder.build_subject_selection_query(
+        criteria=nhs_no_criteria,
+        user=user,
+        subject=subject,
+        subjects_to_retrieve=1,
+    )
+
+    subject_df = OracleDB().execute_query(query, bind_vars)
+    subject = Subject.from_dataframe_row(subject_df.iloc[0])
+
+    criteria[nhs_number_string] = nhs_number
+
+    # Check all criteria together first
+    query, bind_vars = builder.build_subject_selection_query(
+        criteria=criteria,
+        user=user,
+        subject=subject,
+        subjects_to_retrieve=1,
+    )
+    df = OracleDB().execute_query(query, bind_vars)
+    if nhs_number in df[subject_nhs_number_string].values:
+        return True
+
+    # Check each criterion independently
+    failed_criteria = []
+    criteria_keys = [key for key in criteria if key != nhs_number_string]
+    for key in criteria_keys:
+        single_criteria = {nhs_number_string: nhs_number, key: criteria[key]}
+        query, bind_vars = builder.build_subject_selection_query(
+            criteria=single_criteria,
+            user=user,
+            subject=subject,
+            subjects_to_retrieve=1,
+        )
+        df = OracleDB().execute_query(query, bind_vars)
+        if (
+            subject_nhs_number_string not in df.columns
+            or nhs_number not in df[subject_nhs_number_string].values
+        ):
+            failed_criteria.append((key, criteria[key]))
+
+    if failed_criteria:
+        log_message = "Subject Assertion Failed\nFailed criteria:\n" + "\n".join(
+            [f"{key}, {value}" for key, value in failed_criteria]
+        )
+        logging.error(log_message)
+    else:
+        logging.error(
+            "Subject Assertion Failed: Criteria combination is invalid or conflicting."
+        )
+
+    return False
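
One detail of the function above is that `criteria[nhs_number_string] = nhs_number` writes into the dict passed by the caller, so a criteria dict reused across calls gains an `nhs number` key. If that side effect is unwanted, one possible approach is to build a new dict for the query. The sketch below is an assumption-laden illustration, not part of the commit: `with_nhs_number` and `NHS_NUMBER_KEY` are hypothetical names.

```python
from typing import Dict

# Same key string the utility uses for the NHS number criterion.
NHS_NUMBER_KEY = "nhs number"


def with_nhs_number(criteria: Dict[str, str], nhs_number: str) -> Dict[str, str]:
    """Return a new dict so the caller's criteria are left untouched."""
    return {**criteria, NHS_NUMBER_KEY: nhs_number}


caller_criteria = {"screening status": "Inactive", "subject age": "> 28"}
query_criteria = with_nhs_number(caller_criteria, "9233639266")
```

The copy is shallow, which is enough here because the criteria values are plain strings.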
