Skip to content

Validate request body type in EMRCreateMixin and EMRUpdateMixin#3564

Open
RCcoders wants to merge 7 commits intoohcnetwork:developfrom
RCcoders:fix-emr-request-body-validation-mixins
Open

Validate request body type in EMRCreateMixin and EMRUpdateMixin#3564
RCcoders wants to merge 7 commits intoohcnetwork:developfrom
RCcoders:fix-emr-request-body-validation-mixins

Conversation

@RCcoders
Copy link

@RCcoders RCcoders commented Mar 8, 2026

Problem

Several viewsets rely on EMRCreateMixin and EMRUpdateMixin which directly pass request.data to Pydantic model validation.

If a request body is not a JSON object (e.g., a list or string), Python raises a TypeError during unpacking, resulting in a 500 Internal Server Error.

This can cause backend crashes across multiple endpoints.

Solution

Added validation to ensure request.data is a dictionary before processing.

If request.data is not a dict, a RestFrameworkValidationError is raised with a clear message.

Changes

  • Added request body type validation in:
    • EMRCreateMixin.create
    • EMRUpdateMixin.update

Result

Prevents system-wide TypeError crashes caused by malformed request bodies.

Summary by CodeRabbit

Release Notes

  • Security

    • Strengthened facility-aware authorization and validation for patient identifier updates to enforce proper access control.
  • Bug Fixes

    • Enhanced API request validation to enforce proper JSON object formatting across create, update, and upsert operations.
    • Improved error handling and messaging for invalid facility type configurations.

@RCcoders RCcoders requested a review from a team as a code owner March 8, 2026 18:50
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 8, 2026

📝 Walkthrough

Walkthrough

This pull request adds input type validation across EMR viewset mixins, implements facility-aware authorization for patient identifier updates, improves error handling for facility type lookups, and introduces a security test to validate cross-facility access controls are properly enforced.

Changes

Cohort / File(s) Summary
Input Validation in Base Mixins
care/emr/api/viewsets/base.py
Added dict type validation to EMRCreateMixin.create, EMRUpdateMixin.clean_update_data/update, and EMRUpsertMixin.upsert with appropriate error messages for non-dict request data.
Facility-Aware Authorization
care/emr/api/viewsets/patient.py
Enhanced update_identifier with facility resolution, validation of identifier config facility context, and authorization checks via AuthorizationController.can_write_facility_patient_identifier_config.
Error Handling Improvement
care/emr/resources/facility/spec.py
Wrapped facility_type lookup in try/except to catch KeyError and raise descriptive ValueError instead.
Security Test Coverage
care/emr/tests/test_patient_security_api.py
New test class verifying that cross-facility patient identifier updates are rejected with 403 Forbidden, confirming BOLA vulnerability protection is working as intended.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description explains the problem and solution, but lacks associated issue link and merge checklist items required by the template. Add an 'Associated Issue' section with a link and complete the merge checklist with test, documentation, and linting confirmations.
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main changes across EMRCreateMixin, EMRUpdateMixin, and EMRUpsertMixin for request body type validation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
care/emr/api/viewsets/patient.py (1)

404-407: Use the parsed request_data.facility here.

This branch re-reads request.data after PatientIdentifierConfigRequest has already validated it. Keeping the lookup tied to the parsed model avoids a second source of truth, which is harmless until it suddenly is not.

Proposed fix
-        facility_id = request.data.get("facility")
         facility = None
-        if facility_id:
-            facility = get_object_or_404(Facility, external_id=facility_id)
+        if request_data.facility:
+            facility = get_object_or_404(
+                Facility, external_id=request_data.facility
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/viewsets/patient.py` around lines 404 - 407, Replace the direct
re-read of request.data with the already-validated parsed object: use
request_data.facility (from PatientIdentifierConfigRequest) instead of
request.data.get("facility") when populating facility_id and looking up Facility
via get_object_or_404; update the facility_id/facility assignment to rely on
request_data to avoid a second source of truth.
care/emr/api/viewsets/base.py (1)

134-135: Please centralize the object-body guard.

The same isinstance(..., dict) check now lives in four places, with three different messages, and update() repeats what clean_update_data() already enforces. It works, sure, but this is exactly the kind of duplication that drifts the next time someone fixes only one path.

Proposed refactor
 class EMRCreateMixin:
+    def validate_request_body_is_object(self, request_data):
+        if not isinstance(request_data, dict):
+            raise RestFrameworkValidationError("request body must be json object")
+
     def get_form_submission_params(self, instance):
         return {"patient": instance.patient, "encounter": instance.encounter}
@@
     def create(self, request, *args, **kwargs):
-        if not isinstance(request.data, dict):
-            raise RestFrameworkValidationError("Invalid data format")
+        self.validate_request_body_is_object(request.data)
         return Response(self.handle_create(request.data))
@@
 class EMRUpdateMixin:
     def clean_update_data(self, request_data, keep_fields: set | None = None):
-        if not isinstance(request_data, dict):
-            raise RestFrameworkValidationError("request body must be json object")
+        self.validate_request_body_is_object(request_data)
         ignored_fields = {"id", "external_id", "patient", "encounter"}
@@
     def update(self, request, *args, **kwargs):
-        if not isinstance(request.data, dict):
-            raise RestFrameworkValidationError("request body must be json object")
+        self.validate_request_body_is_object(request.data)
         instance = self.get_object()
         return Response(self.handle_update(instance, request.data))
@@
 class EMRUpsertMixin:
     `@action`(detail=False, methods=["POST"])
     def upsert(self, request, *args, **kwargs):
-        if not isinstance(request.data, dict):
-            raise RestFrameworkValidationError("Invalid request data")
+        self.validate_request_body_is_object(request.data)
         datapoints = request.data.get("datapoints", [])

Also applies to: 201-202, 214-215, 272-273

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/viewsets/base.py` around lines 134 - 135, Duplicate
isinstance(request.data, dict) checks should be centralized: add a single
private helper method on the class (e.g., _ensure_body_is_dict or
validate_request_body) that performs isinstance(request.data, dict) and raises
RestFrameworkValidationError with a single standardized message, then replace
the repeated inline checks in update(), clean_update_data(), and other locations
with calls to that helper so update() no longer duplicates validation already
performed by clean_update_data(); ensure clean_update_data() no longer rechecks
the type but assumes the helper has been called, and update() calls the helper
before invoking clean_update_data().
care/emr/tests/test_patient_security_api.py (1)

30-61: Add a case that actually reaches the new permission check.

This test is rejected by the facility-mismatch branch, so it still passes even if the new can_write_facility_patient_identifier_config check in care/emr/api/viewsets/patient.py is removed. Please add a second scenario where payload["facility"] == config.facility.external_id but the user lacks facility-scoped write access; that is the regression this PR really needs covered.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/tests/test_patient_security_api.py` around lines 30 - 61, The test
currently fails fast on facility mismatch; add a second scenario in
test_update_identifier_bola_vulnerability that targets the actual permission
check by creating a PatientIdentifierConfig whose facility equals
payload["facility"] (e.g., create config_same_facility =
PatientIdentifierConfig.objects.create(facility=self.facility_a, ...)), then
POST with "config": str(config_same_facility.external_id) and "facility":
str(self.facility_a.external_id) while ensuring the test user does not have
facility-scoped write access, and assert response.status_code is
HTTP_403_FORBIDDEN and that no PatientIdentifier exists for that config and
value; this exercises the can_write_facility_patient_identifier_config branch
referenced in care/emr/api/viewsets/patient.py and ensures the permission check
is enforced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@care/emr/resources/facility/spec.py`:
- Around line 157-160: Add validation for facility_type into the Pydantic model
by creating a field validator (e.g. `@validator`('facility_type') or
`@root_validator` in the same model class) that checks the incoming value exists
in REVERSE_REVERSE_FACILITY_TYPES and raises a ValidationError/ValueError on
failure; then remove the try/except KeyError->ValueError mapping currently in
perform_extra_deserialization()/de_serialize() that sets obj.facility_type =
REVERSE_REVERSE_FACILITY_TYPES[self.facility_type] so invalid values are caught
during model validation and surface as 4xx validation errors instead of a
runtime ValueError from deserialization.

---

Nitpick comments:
In `@care/emr/api/viewsets/base.py`:
- Around line 134-135: Duplicate isinstance(request.data, dict) checks should be
centralized: add a single private helper method on the class (e.g.,
_ensure_body_is_dict or validate_request_body) that performs
isinstance(request.data, dict) and raises RestFrameworkValidationError with a
single standardized message, then replace the repeated inline checks in
update(), clean_update_data(), and other locations with calls to that helper so
update() no longer duplicates validation already performed by
clean_update_data(); ensure clean_update_data() no longer rechecks the type but
assumes the helper has been called, and update() calls the helper before
invoking clean_update_data().

In `@care/emr/api/viewsets/patient.py`:
- Around line 404-407: Replace the direct re-read of request.data with the
already-validated parsed object: use request_data.facility (from
PatientIdentifierConfigRequest) instead of request.data.get("facility") when
populating facility_id and looking up Facility via get_object_or_404; update the
facility_id/facility assignment to rely on request_data to avoid a second source
of truth.

In `@care/emr/tests/test_patient_security_api.py`:
- Around line 30-61: The test currently fails fast on facility mismatch; add a
second scenario in test_update_identifier_bola_vulnerability that targets the
actual permission check by creating a PatientIdentifierConfig whose facility
equals payload["facility"] (e.g., create config_same_facility =
PatientIdentifierConfig.objects.create(facility=self.facility_a, ...)), then
POST with "config": str(config_same_facility.external_id) and "facility":
str(self.facility_a.external_id) while ensuring the test user does not have
facility-scoped write access, and assert response.status_code is
HTTP_403_FORBIDDEN and that no PatientIdentifier exists for that config and
value; this exercises the can_write_facility_patient_identifier_config branch
referenced in care/emr/api/viewsets/patient.py and ensures the permission check
is enforced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 04bf096b-833c-4f88-9832-0beb0104bea0

📥 Commits

Reviewing files that changed from the base of the PR and between d2a2c2f and 278511c.

📒 Files selected for processing (4)
  • care/emr/api/viewsets/base.py
  • care/emr/api/viewsets/patient.py
  • care/emr/resources/facility/spec.py
  • care/emr/tests/test_patient_security_api.py

Comment on lines +157 to +160
try:
obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type]
except KeyError as exc:
raise ValueError(f"Invalid facility_type: {self.facility_type}") from exc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate facility_type in the Pydantic model, not during deserialization.

This gives a nicer message, but it still lets an invalid facility_type escape as a plain ValueError from de_serialize(), which means callers still get a 500 instead of a validation response. Moving the check into a field validator keeps bad input on the normal 4xx path and keeps perform_extra_deserialization() a little less exciting than it needs to be.

Proposed fix
 class FacilityCreateSpec(FacilityBaseSpec):
     geo_organization: UUID4
     features: list[int]
     print_templates: list[PrintTemplate] = []
+
+    `@field_validator`("facility_type")
+    `@classmethod`
+    def validate_facility_type(cls, value):
+        if value not in REVERSE_REVERSE_FACILITY_TYPES:
+            raise ValueError("Invalid facility_type")
+        return value
@@
     def perform_extra_deserialization(self, is_update, obj):
         obj.geo_organization = Organization.objects.filter(
             external_id=self.geo_organization, org_type="govt"
         ).first()
-        try:
-            obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type]
-        except KeyError as exc:
-            raise ValueError(f"Invalid facility_type: {self.facility_type}") from exc
+        obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type]
🧰 Tools
🪛 Ruff (0.15.4)

[warning] 160-160: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/resources/facility/spec.py` around lines 157 - 160, Add validation
for facility_type into the Pydantic model by creating a field validator (e.g.
`@validator`('facility_type') or `@root_validator` in the same model class) that
checks the incoming value exists in REVERSE_REVERSE_FACILITY_TYPES and raises a
ValidationError/ValueError on failure; then remove the try/except
KeyError->ValueError mapping currently in
perform_extra_deserialization()/de_serialize() that sets obj.facility_type =
REVERSE_REVERSE_FACILITY_TYPES[self.facility_type] so invalid values are caught
during model validation and surface as 4xx validation errors instead of a
runtime ValueError from deserialization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants