Validate request body type in EMRCreateMixin and EMRUpdateMixin#3564
Validate request body type in EMRCreateMixin and EMRUpdateMixin#3564RCcoders wants to merge 7 commits intoohcnetwork:developfrom
Conversation
…vent TypeError crashes
📝 WalkthroughWalkthroughThis pull request adds input type validation across EMR viewset mixins, implements facility-aware authorization for patient identifier updates, improves error handling for facility type lookups, and introduces a security test to validate cross-facility access controls are properly enforced. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
care/emr/api/viewsets/patient.py (1)
404-407: Use the parsedrequest_data.facilityhere.This branch re-reads
request.dataafterPatientIdentifierConfigRequesthas already validated it. Keeping the lookup tied to the parsed model avoids a second source of truth, which is harmless until it suddenly is not.Proposed fix
- facility_id = request.data.get("facility") facility = None - if facility_id: - facility = get_object_or_404(Facility, external_id=facility_id) + if request_data.facility: + facility = get_object_or_404( + Facility, external_id=request_data.facility + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@care/emr/api/viewsets/patient.py` around lines 404 - 407, Replace the direct re-read of request.data with the already-validated parsed object: use request_data.facility (from PatientIdentifierConfigRequest) instead of request.data.get("facility") when populating facility_id and looking up Facility via get_object_or_404; update the facility_id/facility assignment to rely on request_data to avoid a second source of truth.care/emr/api/viewsets/base.py (1)
134-135: Please centralize the object-body guard.The same
isinstance(..., dict)check now lives in four places, with three different messages, andupdate()repeats whatclean_update_data()already enforces. It works, sure, but this is exactly the kind of duplication that drifts the next time someone fixes only one path.Proposed refactor
class EMRCreateMixin: + def validate_request_body_is_object(self, request_data): + if not isinstance(request_data, dict): + raise RestFrameworkValidationError("request body must be json object") + def get_form_submission_params(self, instance): return {"patient": instance.patient, "encounter": instance.encounter} @@ def create(self, request, *args, **kwargs): - if not isinstance(request.data, dict): - raise RestFrameworkValidationError("Invalid data format") + self.validate_request_body_is_object(request.data) return Response(self.handle_create(request.data)) @@ class EMRUpdateMixin: def clean_update_data(self, request_data, keep_fields: set | None = None): - if not isinstance(request_data, dict): - raise RestFrameworkValidationError("request body must be json object") + self.validate_request_body_is_object(request_data) ignored_fields = {"id", "external_id", "patient", "encounter"} @@ def update(self, request, *args, **kwargs): - if not isinstance(request.data, dict): - raise RestFrameworkValidationError("request body must be json object") + self.validate_request_body_is_object(request.data) instance = self.get_object() return Response(self.handle_update(instance, request.data)) @@ class EMRUpsertMixin: `@action`(detail=False, methods=["POST"]) def upsert(self, request, *args, **kwargs): - if not isinstance(request.data, dict): - raise RestFrameworkValidationError("Invalid request data") + self.validate_request_body_is_object(request.data) datapoints = request.data.get("datapoints", [])Also applies to: 201-202, 214-215, 272-273
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@care/emr/api/viewsets/base.py` around lines 134 - 135, Duplicate isinstance(request.data, dict) checks should be centralized: add a single private helper method on the class (e.g., _ensure_body_is_dict or validate_request_body) that performs isinstance(request.data, dict) and raises RestFrameworkValidationError with a single standardized message, then replace the repeated inline checks in update(), clean_update_data(), and other locations with calls to that helper so update() no longer duplicates validation already performed by clean_update_data(); ensure clean_update_data() no longer rechecks the type but assumes the helper has been called, and update() calls the helper before invoking clean_update_data().care/emr/tests/test_patient_security_api.py (1)
30-61: Add a case that actually reaches the new permission check.This test is rejected by the facility-mismatch branch, so it still passes even if the new
can_write_facility_patient_identifier_configcheck incare/emr/api/viewsets/patient.pyis removed. Please add a second scenario wherepayload["facility"] == config.facility.external_idbut the user lacks facility-scoped write access; that is the regression this PR really needs covered.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@care/emr/tests/test_patient_security_api.py` around lines 30 - 61, The test currently fails fast on facility mismatch; add a second scenario in test_update_identifier_bola_vulnerability that targets the actual permission check by creating a PatientIdentifierConfig whose facility equals payload["facility"] (e.g., create config_same_facility = PatientIdentifierConfig.objects.create(facility=self.facility_a, ...)), then POST with "config": str(config_same_facility.external_id) and "facility": str(self.facility_a.external_id) while ensuring the test user does not have facility-scoped write access, and assert response.status_code is HTTP_403_FORBIDDEN and that no PatientIdentifier exists for that config and value; this exercises the can_write_facility_patient_identifier_config branch referenced in care/emr/api/viewsets/patient.py and ensures the permission check is enforced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@care/emr/resources/facility/spec.py`:
- Around line 157-160: Add validation for facility_type into the Pydantic model
by creating a field validator (e.g. `@validator`('facility_type') or
`@root_validator` in the same model class) that checks the incoming value exists
in REVERSE_REVERSE_FACILITY_TYPES and raises a ValidationError/ValueError on
failure; then remove the try/except KeyError->ValueError mapping currently in
perform_extra_deserialization()/de_serialize() that sets obj.facility_type =
REVERSE_REVERSE_FACILITY_TYPES[self.facility_type] so invalid values are caught
during model validation and surface as 4xx validation errors instead of a
runtime ValueError from deserialization.
---
Nitpick comments:
In `@care/emr/api/viewsets/base.py`:
- Around line 134-135: Duplicate isinstance(request.data, dict) checks should be
centralized: add a single private helper method on the class (e.g.,
_ensure_body_is_dict or validate_request_body) that performs
isinstance(request.data, dict) and raises RestFrameworkValidationError with a
single standardized message, then replace the repeated inline checks in
update(), clean_update_data(), and other locations with calls to that helper so
update() no longer duplicates validation already performed by
clean_update_data(); ensure clean_update_data() no longer rechecks the type but
assumes the helper has been called, and update() calls the helper before
invoking clean_update_data().
In `@care/emr/api/viewsets/patient.py`:
- Around line 404-407: Replace the direct re-read of request.data with the
already-validated parsed object: use request_data.facility (from
PatientIdentifierConfigRequest) instead of request.data.get("facility") when
populating facility_id and looking up Facility via get_object_or_404; update the
facility_id/facility assignment to rely on request_data to avoid a second source
of truth.
In `@care/emr/tests/test_patient_security_api.py`:
- Around line 30-61: The test currently fails fast on facility mismatch; add a
second scenario in test_update_identifier_bola_vulnerability that targets the
actual permission check by creating a PatientIdentifierConfig whose facility
equals payload["facility"] (e.g., create config_same_facility =
PatientIdentifierConfig.objects.create(facility=self.facility_a, ...)), then
POST with "config": str(config_same_facility.external_id) and "facility":
str(self.facility_a.external_id) while ensuring the test user does not have
facility-scoped write access, and assert response.status_code is
HTTP_403_FORBIDDEN and that no PatientIdentifier exists for that config and
value; this exercises the can_write_facility_patient_identifier_config branch
referenced in care/emr/api/viewsets/patient.py and ensures the permission check
is enforced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 04bf096b-833c-4f88-9832-0beb0104bea0
📒 Files selected for processing (4)
care/emr/api/viewsets/base.pycare/emr/api/viewsets/patient.pycare/emr/resources/facility/spec.pycare/emr/tests/test_patient_security_api.py
| try: | ||
| obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type] | ||
| except KeyError as exc: | ||
| raise ValueError(f"Invalid facility_type: {self.facility_type}") from exc |
There was a problem hiding this comment.
Validate facility_type in the Pydantic model, not during deserialization.
This gives a nicer message, but it still lets an invalid facility_type escape as a plain ValueError from de_serialize(), which means callers still get a 500 instead of a validation response. Moving the check into a field validator keeps bad input on the normal 4xx path and keeps perform_extra_deserialization() a little less exciting than it needs to be.
Proposed fix
class FacilityCreateSpec(FacilityBaseSpec):
geo_organization: UUID4
features: list[int]
print_templates: list[PrintTemplate] = []
+
+ `@field_validator`("facility_type")
+ `@classmethod`
+ def validate_facility_type(cls, value):
+ if value not in REVERSE_REVERSE_FACILITY_TYPES:
+ raise ValueError("Invalid facility_type")
+ return value
@@
def perform_extra_deserialization(self, is_update, obj):
obj.geo_organization = Organization.objects.filter(
external_id=self.geo_organization, org_type="govt"
).first()
- try:
- obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type]
- except KeyError as exc:
- raise ValueError(f"Invalid facility_type: {self.facility_type}") from exc
+ obj.facility_type = REVERSE_REVERSE_FACILITY_TYPES[self.facility_type]🧰 Tools
🪛 Ruff (0.15.4)
[warning] 160-160: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@care/emr/resources/facility/spec.py` around lines 157 - 160, Add validation
for facility_type into the Pydantic model by creating a field validator (e.g.
`@validator`('facility_type') or `@root_validator` in the same model class) that
checks the incoming value exists in REVERSE_REVERSE_FACILITY_TYPES and raises a
ValidationError/ValueError on failure; then remove the try/except
KeyError->ValueError mapping currently in
perform_extra_deserialization()/de_serialize() that sets obj.facility_type =
REVERSE_REVERSE_FACILITY_TYPES[self.facility_type] so invalid values are caught
during model validation and surface as 4xx validation errors instead of a
runtime ValueError from deserialization.
Problem
Several viewsets rely on EMRCreateMixin and EMRUpdateMixin which directly pass request.data to Pydantic model validation.
If a request body is not a JSON object (e.g., a list or string), Python raises a TypeError during unpacking, resulting in a 500 Internal Server Error.
This can cause backend crashes across multiple endpoints.
Solution
Added validation to ensure request.data is a dictionary before processing.
If request.data is not a dict, a RestFrameworkValidationError is raised with a clear message.
Changes
Result
Prevents system-wide TypeError crashes caused by malformed request bodies.
Summary by CodeRabbit
Release Notes
Security
Bug Fixes