VED-754 Handle non utf 8 encoded csv (bugfix) by dlzhry2nhs · Pull Request #792 · NHSDigital/immunisation-fhir-api

dlzhry2nhs · 2025-09-05T07:31:03Z

Summary

Bugfix/handling for specific supplier settings

See https://nhsd-jira.digital.nhs.uk/browse/VED-754 for the investigation and root cause. Essentially, until we update the specification to stipulate that we expect utf-8 encoding and suppliers come into line, it is fair and reasonable to put in temporary handling.

We would want to get rid of this as soon as we can, and reject files that do not conform to spec.

Note: this only occurs in a slim minority of cases, as the differences in encoding only surfaces for non-ASCII characters, which appear quite rarely within the batch extracts.

Reviews Required

Dev

Review Checklist

ℹ️ This section is to be filled in by the reviewer.

I have reviewed the changes in this PR and they fill all or part of the acceptance criteria of the ticket, and the code is in a mergeable state.
If there were infrastructure, operational, or build changes, I have made sure there is sufficient evidence that the changes will work.
I have ensured the changelog has been updated by the submitter, if necessary.

dlzhry2nhs · 2025-09-05T07:36:37Z

Quality Gate failed

Failed conditions 0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

The release branch does not yet have the fix from main which ensures the correct version of Python + Poetry is used, so the unit tests do not run. See: coverage-fhir-api step from https://github.com/NHSDigital/immunisation-fhir-api/actions/runs/17486786687/job/49667303142?pr=792

I have verified that the unit test I have added exercises the new code.

dlzhry2nhs · 2025-09-05T14:35:48Z

recordprocessor/src/utils_for_recordprocessor.py

+            file_bytes.seek(0)
+            return False
+
+    file_bytes.seek(0)


For discussion, this is a bit naughty. It avoids the function having a side effect of fast forwarding the file to a certain point and ensures it always sets the stream back to the start.

Other option would be to create to BytesIO object. One for encoding validation, and then one to pass to the dict reader. However, concluded minimising the objects is better even is we have to manually call .seek(0).

sonarqubecloud · 2025-09-05T17:50:17Z

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

dlzhry2nhs · 2025-09-11T07:59:23Z

Closing - superseded by a separate fix.

Initial fix

6eaea79

Removed some comments once tested in environment

93448ef

dlzhry2nhs force-pushed the bugfix/VED-754-handle-non-utf-8-encoded-csv branch from b4ae9f1 to 93448ef Compare September 5, 2025 08:02

NHSDigital deleted a comment from sonarqubecloud bot Sep 5, 2025

dlzhry2nhs marked this pull request as ready for review September 5, 2025 13:34

dlzhry2nhs commented Sep 5, 2025

View reviewed changes

Skip file encoding check for internal systems

94b348f

dlzhry2nhs closed this Sep 11, 2025

dlzhry2nhs deleted the bugfix/VED-754-handle-non-utf-8-encoded-csv branch September 11, 2025 07:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

VED-754 Handle non utf 8 encoded csv (bugfix)#792

VED-754 Handle non utf 8 encoded csv (bugfix)#792
dlzhry2nhs wants to merge 3 commits intorelease-2025-09-04from
bugfix/VED-754-handle-non-utf-8-encoded-csv

dlzhry2nhs commented Sep 5, 2025 •

edited

Loading

Uh oh!

dlzhry2nhs commented Sep 5, 2025

Quality Gate failed

Uh oh!

dlzhry2nhs Sep 5, 2025

Uh oh!

sonarqubecloud bot commented Sep 5, 2025

Uh oh!

dlzhry2nhs commented Sep 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dlzhry2nhs commented Sep 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Reviews Required

Review Checklist

Uh oh!

dlzhry2nhs commented Sep 5, 2025

Quality Gate failed

Uh oh!

dlzhry2nhs Sep 5, 2025

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud bot commented Sep 5, 2025

Quality Gate failed

Uh oh!

dlzhry2nhs commented Sep 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dlzhry2nhs commented Sep 5, 2025 •

edited

Loading