Skip to content

fix: (2.39) notification trigger cause server halt [DHIS2-19452]#20576

Merged
zubaira merged 21 commits into2.39from
2.39-fix-server-halt
Apr 29, 2025
Merged

fix: (2.39) notification trigger cause server halt [DHIS2-19452]#20576
zubaira merged 21 commits into2.39from
2.39-fix-server-halt

Conversation

@zubaira
Copy link
Contributor

@zubaira zubaira commented Apr 8, 2025

While testing this issue, we observed that the server occasionally halts or restarts when a user attempts to send scheduled notifications.

This PR will now be deployed to the IM environment for further testing. If everything works as expected there, we can proceed with merging it.

https://dhis2.atlassian.net/browse/DHIS2-19452

Filtering at the Database Level
Previously, all ProgramNotificationInstance records were loaded from the database and filtered in memory based on the scheduledAt date.
This PR moves that filtering to the database query itself, ensuring that only relevant notifications scheduled for the current day are fetched.

Reducing JSONB Payload Size
The ProgramNotificationTemplateSnapshot column was previously storing the full serialized template object, including redundant and unnecessary data.
This caused heavy query delays due to bloated row sizes.
The updated logic stores only the essential fields in the snapshot, significantly improving query performance and database efficiency.

@zubaira zubaira added the deploy Deploy DHIS2 instance with IM. label Apr 8, 2025
@zubaira zubaira changed the title fix: notification trigger cause server halt fix: (2.39) notification trigger cause server halt Apr 9, 2025
@zubaira zubaira changed the title fix: (2.39) notification trigger cause server halt fix: (2.39) notification trigger cause server halt [DHIS2-19452] Apr 9, 2025
@zubaira zubaira removed the deploy Deploy DHIS2 instance with IM. label Apr 10, 2025
@zubaira zubaira requested review from a team, ameenhere and teleivo April 11, 2025 21:31
return predicates;
}

public Pair<Date, Date> getStartAndEndOfDay(Date scheduledAt) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only for this branch. For higher branches, I will have a param mapper mapping scheduledAt to scheduledFrom and scheduledTo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by

Only for this branch. For higher branches

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, we handle this in the parameter mapper class. However, for version 2.39, I implemented it directly in the store. For higher versions, I plan to follow the standard approach using the parameter validation mapper class.

@teleivo teleivo requested a review from enricocolasante April 15, 2025 08:04
@@ -658,13 +661,13 @@ void testScheduledNotifications() {
return new BatchResponseStatus(Collections.emptyList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we delete this mock test which is what we generally try to do and rely on dhis-2/dhis-test-integration/src/test/java/org/hisp/dhis/program/notification/ProgramNotificationServiceIntegrationTest.java instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I will do that after creating integration tests for ProgramNotificationService

/* No assertions are made here because this test is intended to verify that
scheduled notifications are fetched and processed within the specified time limit
(performance check only) */
programNotificationService.sendScheduledNotifications(NoopJobProgress.INSTANCE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is not really needed is it? The notificationInstanceService.getAll() was the issue right? So if you add a test to the ProgramNotificationInstanceServiceTest for getAll and the method you are using now showing they do not timeout that would be enough. This would also allow you to assert on their results.

This test here could be useful but IMHO you would need to assert that the notifications you expected to be sent were actually sent.

@codecov
Copy link

codecov bot commented Apr 15, 2025

Codecov Report

Attention: Patch coverage is 96.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 57.62%. Comparing base (886690f) to head (918ed87).
Report is 439 commits behind head on 2.39.

Files with missing lines Patch % Lines
...otification/DefaultProgramNotificationService.java 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               2.39   #20576      +/-   ##
============================================
- Coverage     58.58%   57.62%   -0.96%     
+ Complexity    25817    25552     -265     
============================================
  Files          3207     3228      +21     
  Lines        121397   123292    +1895     
  Branches      14158    14371     +213     
============================================
- Hits          71118    71053      -65     
- Misses        44340    45849    +1509     
- Partials       5939     6390     +451     
Flag Coverage Δ
integration 50.27% <96.00%> (+1.68%) ⬆️
integration-h2 34.51% <0.00%> (?)
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
.../template/snapshot/NotificationTemplateMapper.java 100.00% <100.00%> (+42.10%) ⬆️
...ion/DefaultProgramNotificationInstanceService.java 48.38% <100.00%> (+1.72%) ⬆️
...ion/HibernateProgramNotificationInstanceStore.java 76.08% <100.00%> (+14.32%) ⬆️
...otification/DefaultProgramNotificationService.java 45.83% <66.66%> (-26.46%) ⬇️

... and 1307 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf6e226...918ed87. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@zubaira zubaira requested a review from teleivo April 16, 2025 06:38
@zubaira zubaira requested a review from enricocolasante April 26, 2025 06:37
@sonarqubecloud
Copy link

@zubaira zubaira merged commit 91e09ed into 2.39 Apr 29, 2025
11 checks passed
@zubaira zubaira deleted the 2.39-fix-server-halt branch April 29, 2025 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments