[DTOSS-11551] Add extract model #709

cameronhargreaves1-nhs · 2025-11-12T14:24:33Z

Description

We want to properly model the relationship between Appointments and the Extract they were booked in, which makes it easier to check sequence numbers.

Whenever appointments are created, a new Extract will need to be created as well.
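As a rough illustration of why linking appointments to their extract helps with sequence checks, here is a plain-Python sketch. It is not the Django model from this PR. The field names `bso_code` and `sequence_number` mirror names that appear later in this thread; the `ExtractRecord` class, the `missing_sequence_numbers` helper, the sample data, and the assumption that sequence numbers are consecutive per BSO are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractRecord:
    """Hypothetical stand-in for the Extract model: one received .dat file."""

    bso_code: str
    sequence_number: int
    appointment_ids: list[str] = field(default_factory=list)


def missing_sequence_numbers(extracts: list[ExtractRecord], bso_code: str) -> list[int]:
    """Return sequence numbers absent between the lowest and highest seen for a BSO."""
    seen = sorted(e.sequence_number for e in extracts if e.bso_code == bso_code)
    if not seen:
        return []
    expected = set(range(seen[0], seen[-1] + 1))
    return sorted(expected - set(seen))


# Made-up sample data: extract 103 was never received for BSO "ABC".
extracts = [
    ExtractRecord("ABC", 101, ["appt-1", "appt-2"]),
    ExtractRecord("ABC", 102, ["appt-3"]),
    ExtractRecord("ABC", 104, ["appt-4"]),
]
gaps = missing_sequence_numbers(extracts, "ABC")
```

With each extract stored as its own record, a gap check like this becomes a simple query over sequence numbers rather than something inferred from appointment rows.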

Jira link

Review notes

Review checklist

Check database queries are correctly scoped to current_provider

Harriethw

Looks pretty good so far! Just some minor changes/questions.

manage_breast_screening/notifications/management/commands/create_appointments.py

manage_breast_screening/notifications/tests/management/commands/test_create_appointments.py

manage_breast_screening/notifications/models.py

manage_breast_screening/notifications/tests/management/commands/test_create_appointments.py

manage_breast_screening/notifications/management/commands/create_appointments.py

manage_breast_screening/notifications/migrations/0022_extract.py

manage_breast_screening/notifications/management/commands/create_appointments.py

steventux

Looking good, a couple of small things, the main thing is around how we read the header, I think we can make it more readable.

manage_breast_screening/notifications/management/commands/create_appointments.py

steventux · 2025-11-24T09:58:11Z

manage_breast_screening/notifications/management/commands/create_appointments.py

+
+                file_headers = self.get_file_header(blob_content)
+
+                extract = Extract.objects.create(sequence_number = int(file_headers[1].strip()),


There are a lot of magic indexes in what we are doing here, which makes the code hard to read when we don't know what file_headers[1] or file_headers[4] refer to.
Could we have a pattern like:

    def create_extract(filename: str, raw_data: str) -> Extract:
        bso_code = filename.split("_")[0]
        # The first line of the file is the pipe-delimited header
        type_id, extract_id, start_date, start_time, record_count = (
            raw_data.split("\r\n")[0].split("|")
        )
        # Maybe raise if the above fails
        return Extract.objects.create(
            sequence_number=extract_id,
            bso_code=bso_code,
            filename=filename,
            record_count=record_count,
        )

just seen this doesn't look like it was added, will take a look

@steventux have pushed in latest commit - IMO the string manipulation is harder to understand than just converting to a dataframe, what do you think?

There were a few more steps to parsing the string than in your original suggestion. I guess we get all that "for free" from the pandas stuff.
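The named-field pattern discussed in this thread can be sketched without Django or pandas. This is a minimal sketch, assuming the header is the first line of the file, pipe-delimited, with five quoted fields in the order used in the suggestion. The `ExtractHeader` dataclass, the helper names, and the sample header and path values are hypothetical and not taken from the real spec.

```python
from dataclasses import dataclass

HEADER_FIELD_COUNT = 5


@dataclass(frozen=True)
class ExtractHeader:
    type_id: str
    extract_id: int
    start_date: str
    start_time: str
    record_count: int


def parse_header(raw_data: str) -> ExtractHeader:
    """Parse the first line of a pipe-delimited file into named fields."""
    lines = raw_data.splitlines()
    if not lines:
        raise ValueError("file is empty, no header line found")
    # Drop surrounding whitespace and double quotes from each field.
    fields = [part.strip().strip('"') for part in lines[0].split("|")]
    if len(fields) != HEADER_FIELD_COUNT:
        raise ValueError(
            f"expected {HEADER_FIELD_COUNT} header fields, got {len(fields)}"
        )
    type_id, extract_id, start_date, start_time, record_count = fields
    return ExtractHeader(
        type_id, int(extract_id), start_date, start_time, int(record_count)
    )


def bso_code_from_filename(filename: str) -> str:
    """Take the part of the base name before the first underscore."""
    return filename.split("/")[-1].split("_")[0]


# Made-up header line followed by one made-up data row.
sample = '"HDR"|"00000101"|"20251118"|"150721"|"000002"\r\n"row"|"data"\r\n'
header = parse_header(sample)
bso_code = bso_code_from_filename("some-dir/ABC_20251118150721_APPT_101.dat")
```

Unpacking into named variables removes the need for positional indexes such as `file_headers[1]`, and the explicit length check gives a clear error when the header is malformed.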

manage_breast_screening/notifications/tests/fixtures/ABC_20251118150721_APPT_101.dat

addressed changes

Harriethw · 2025-11-26T16:42:16Z

I tidied up the commits to make it easier to grep, but there are a few changes of note from the original PR:

adding the related_name on the ManyToMany allows us to query extracts from the Appointment, which I think is useful to have
the validation on a unique constraint (e.g. if we try to process the same Extract file twice) does happen on create once the Model is set up correctly - so no new Extract would be created, and we would be alerted.
I had to pass transaction=True into the unique test because pytest was struggling and didn't seem to think the transaction was atomic - I think it should be because we've wrapped everything in a try/catch block, but not 100% sure 🤔
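The behaviour described in the notes above (a duplicate extract is rejected on create and nothing new is stored) can be illustrated with the standard-library `sqlite3` module. This is only a sketch: the table layout, the choice of `(bso_code, sequence_number)` as the unique columns, and the `record_extract` helper are assumptions for illustration and may not match the actual model or migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE extract (
        id INTEGER PRIMARY KEY,
        bso_code TEXT NOT NULL,
        sequence_number INTEGER NOT NULL,
        UNIQUE (bso_code, sequence_number)
    )
    """
)


def record_extract(bso_code: str, sequence_number: int) -> bool:
    """Insert an extract row; return False if the same extract was already stored."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if the block raises.
        with conn:
            conn.execute(
                "INSERT INTO extract (bso_code, sequence_number) VALUES (?, ?)",
                (bso_code, sequence_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = record_extract("ABC", 101)
second = record_extract("ABC", 101)  # same file processed a second time
row_count = conn.execute("SELECT COUNT(*) FROM extract").fetchone()[0]
```

The second call hits the unique constraint, the transaction is rolled back, and the table still holds a single row.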

steventux

This looks great, very clean and well tested 💯 🥇 🛳️

steventux · 2025-11-26T18:06:36Z

manage_breast_screening/notifications/management/commands/create_appointments.py

+    def create_extract(self, filename: str, raw_data: str) -> Extract:
+        bso_code = filename.split("/")[1].split("_")[0]
+        type_id, extract_id, start_date, start_time, record_count = raw_data.split(
+            "\n"


I don't know if this matters or is even accurate, but the spec states that the line separator is CR/LF, which is \r\n in our money. I suppose the only side effect of splitting on \n would be rogue carriage returns.
Perhaps as we are already stripping quotes we could attempt to strip \r?
Not a dealbreaker.

done!
I didn't know how to get the \r to show up in the data we have though 🤷 at least it will get removed if it does turn up!
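For reference, a small sketch of the carriage-return concern raised in this thread. The header values are made up; the point is only how Python's string methods behave on CR/LF-terminated data, which can be built directly as a string literal in a test.

```python
# Made-up header and data row, joined with CR/LF as the spec describes.
header_line = '"HDR"|"00000101"|"20251118"|"150721"|"000002"'
raw_data = header_line + "\r\n" + '"row"|"data"' + "\r\n"

# Splitting on "\n" alone leaves a trailing carriage return on each line.
naive_first_line = raw_data.split("\n")[0]

# Option 1: strip the stray "\r" after splitting.
stripped_first_line = naive_first_line.rstrip("\r")

# Option 2: str.splitlines() treats "\r\n", "\n" and "\r" as line boundaries.
splitlines_first_line = raw_data.splitlines()[0]
```

Either option yields the header without the rogue `\r`; `splitlines()` has the small advantage of also coping with files that use plain `\n`.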


cameronhargreaves1-nhs changed the title ~~Dtoss 11551 add extract model~~ [DTOSS-11551] Add extract model Nov 19, 2025

cameronhargreaves1-nhs force-pushed the DTOSS-11551-add-extract-model branch from cfb2f0f to bd17c1d Compare November 19, 2025 08:30

Harriethw previously requested changes Nov 19, 2025


steventux reviewed Nov 20, 2025

manage_breast_screening/notifications/migrations/0022_extract.py

manage_breast_screening/notifications/management/commands/create_appointments.py

cameronhargreaves1-nhs force-pushed the DTOSS-11551-add-extract-model branch 2 times, most recently from 54d67c6 to 5ff80f1 Compare November 21, 2025 15:42

cameronhargreaves1-nhs marked this pull request as ready for review November 24, 2025 09:17

cameronhargreaves1-nhs requested a review from a team November 24, 2025 09:17

steventux reviewed Nov 24, 2025


Harriethw force-pushed the DTOSS-11551-add-extract-model branch from 6d65503 to 96a4d2d Compare November 26, 2025 12:58

Harriethw marked this pull request as draft November 26, 2025 13:38

Harriethw force-pushed the DTOSS-11551-add-extract-model branch 4 times, most recently from 6936293 to d7f1e57 Compare November 26, 2025 16:00

Harriethw force-pushed the DTOSS-11551-add-extract-model branch 3 times, most recently from e2962e9 to 9a1b6c9 Compare November 26, 2025 16:37

Harriethw marked this pull request as ready for review November 26, 2025 16:42

steventux approved these changes Nov 26, 2025


Harriethw added 3 commits November 27, 2025 10:16

add an Extract model

5accdeb

To store information about the .dat files we receive, from which we extract Appointment info.

Add Extract info to Appointments on creation

50fa1af

here we add an Appointment to an Extract wherever it is created

Refactor create extract method

5a29d3f

To avoid converting to data frame again

Harriethw force-pushed the DTOSS-11551-add-extract-model branch from 55e3a23 to 5a29d3f Compare November 27, 2025 10:16

Harriethw merged commit fb844a3 into main Nov 27, 2025
12 checks passed

Harriethw deleted the DTOSS-11551-add-extract-model branch November 27, 2025 10:22


