-
Notifications
You must be signed in to change notification settings - Fork 4
[DTOSS-11551] Add extract model #709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cfb2f0f to
bd17c1d
Compare
Harriethw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good so far! just some minor changes/ questions
manage_breast_screening/notifications/management/commands/create_appointments.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/management/commands/create_appointments.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/tests/management/commands/test_create_appointments.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/tests/management/commands/test_create_appointments.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/management/commands/create_appointments.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/migrations/0022_extract.py
Outdated
Show resolved
Hide resolved
manage_breast_screening/notifications/management/commands/create_appointments.py
Outdated
Show resolved
Hide resolved
54d67c6 to
5ff80f1
Compare
steventux
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, a couple of small things, the main thing is around how we read the header, I think we can make it more readable.
manage_breast_screening/notifications/management/commands/create_appointments.py
Outdated
Show resolved
Hide resolved
|
|
||
| file_headers = self.get_file_header(blob_content) | ||
|
|
||
| extract = Extract.objects.create(sequence_number = int(file_headers[1].strip()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of magic indexes to what we are doing here, it makes the code hard to read when we don't know what file_headers[1] or file_headers[4] refer to.
Could we have a pattern like:
def create_extract(filename: str) -> Extract:
bso_code = filename.split("_")[0]
type_id, extract_id, start_date, start_time, record_count = raw_data.split("\r\n").split("|")
# Maybe raise if the above fails
return Extract.objects.create(
sequence_number: extract_id,
bso_code: bso_code,
filename: filename,
record_count: record_count,
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just seen this doesn't look like it was added, will take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steventux have pushed in latest commit - IMO the string manipulation is harder to understand than just converting to a dataframe, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there were a few more steps to parsing the string than in your original suggestion - i guess we get all that "for free" from the pandas stuff
manage_breast_screening/notifications/tests/fixtures/ABC_20251118150721_APPT_101.dat
Outdated
Show resolved
Hide resolved
6d65503 to
96a4d2d
Compare
6936293 to
d7f1e57
Compare
e2962e9 to
9a1b6c9
Compare
|
I tidied up the commits to make it easier to grep, but there are a few changes of note from the original PR:
|
steventux
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, very clean and well tested 💯 🥇 🛳️
| def create_extract(self, filename: str, raw_data: str) -> Extract: | ||
| bso_code = filename.split("/")[1].split("_")[0] | ||
| type_id, extract_id, start_date, start_time, record_count = raw_data.split( | ||
| "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this matters or even accurate but the spec states that the line separator is CR/LF which is \r\n in our money. I suppose the only side effect of splitting on \n would be rogue carriage returns.
Perhaps as we are already stripping quotes we could attempt to strip \r?
Not a dealbreaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
I didn't know how to get the \r to show up in the data we have though 🤷 at least it will get removed if it does turn up!
To store information about the .dat files we receive, from which we extract Appointment info.
here we add an Appointment to an Extract wherever it is created
To avoid converting to data frame again
55e3a23 to
5a29d3f
Compare
Description
we want to model the relationship between Appontments and the Extract it was booked in properly to give us the ability to more easily check sequence numbers.
Whenever appointments are created, a new Extract will need to be created as well.
Jira link
Review notes
Review checklist