Stop generating all_courses_csv report synchronously #6608

@gabina

In PR #6423 we made some progress on moving heavy course CSV generation to background jobs.
I don't remember why we didn't move the all_courses_csv report from a web request to the background, but we should definitely do it, since simultaneous requests for that report may saturate the web server.

In addition, we should review the details of the CSV report generation and evaluate whether there is room to improve its performance.
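
One possible shape for moving the report out of the web request is sketched below in plain Ruby. It is only an illustration: `ReportQueue`, `request`, `fetch` and `wait_all` are hypothetical names, and a thread stands in for whatever background job system the dashboard actually uses. The idea is that the request returns immediately, and that repeated requests for the same report share one generation run instead of each occupying a web worker.

```ruby
# Hypothetical in-memory stand-in for a job queue plus report storage.
# Error handling and expiry of old reports are omitted for brevity.
class ReportQueue
  def initialize
    @status = {}   # report name => :pending or :ready
    @results = {}  # report name => generated CSV string
    @threads = []
    @lock = Mutex.new
  end

  # Returns immediately. Generation starts only if no run for this
  # report exists yet; later callers just get the current status.
  def request(name, &generator)
    @lock.synchronize do
      return @status[name] if @status.key?(name)

      @status[name] = :pending
      @threads << Thread.new do
        data = generator.call # the slow part runs outside the lock
        @lock.synchronize do
          @results[name] = data
          @status[name] = :ready
        end
      end
      :enqueued
    end
  end

  # Returns the generated report, or nil while it is still pending.
  def fetch(name)
    @lock.synchronize { @results[name] }
  end

  # Blocks until every started generation has finished.
  def wait_all
    @threads.each(&:join)
  end
end
```

With something like this, the controller action would only call `request` and respond right away, and the user would download the file through a second, cheap request once the status is `:ready`.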

To Reproduce

Go to https://outreachdashboard.wmflabs.org/usage -> generate CSV of all programs

Downtime logs

There was another downtime today (2025-12-31) at 14:31 UTC. I restarted the web server and checked the logs in /var/log/apache2/error.log. I think the all_courses_csv report may be the culprit: several requests for it were started right before the downtime, and none of them seems to have finished successfully.

App 15946 output: I, [2025-12-31T14:23:14.122684 #15946]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 5990 output: I, [2025-12-31T14:24:16.197120 #5990]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 65445 output: I, [2025-12-31T14:25:15.883540 #65445]  INFO -- : Started GET "/all_courses_csv" for 172.16.19.232 at 2025-12-31 14:25:15 +0000
App 65445 output: I, [2025-12-31T14:25:15.887356 #65445]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 68770 output: I, [2025-12-31T14:26:20.929198 #68770]  INFO -- : Started GET "/all_courses_csv" for 172.16.19.232 at 2025-12-31 14:26:20 +0000
App 68770 output: I, [2025-12-31T14:26:20.932541 #68770]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 71733 output: I, [2025-12-31T14:27:20.929673 #71733]  INFO -- : Started GET "/all_courses_csv" for 172.16.19.232 at 2025-12-31 14:27:20 +0000
App 71733 output: I, [2025-12-31T14:27:20.932936 #71733]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 71667 output: I, [2025-12-31T14:28:21.105459 #71667]  INFO -- : Started GET "/all_courses_csv" for 172.16.19.232 at 2025-12-31 14:28:21 +0000
App 71667 output: I, [2025-12-31T14:28:21.107578 #71667]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
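
To make the timing of those requests explicit, the gaps between the six `Processing by` lines above can be computed from their timestamps. This is a small sketch: `gaps_in_seconds` is a hypothetical helper, and the appended `Z` assumes the log timestamps are in UTC, as the `+0000` in the `Started GET` lines suggests.

```ruby
require 'time'

# Hypothetical helper: whole-second gaps between consecutive timestamps.
def gaps_in_seconds(timestamps)
  times = timestamps.map { |ts| Time.parse("#{ts}Z") } # assumes UTC
  times.each_cons(2).map { |earlier, later| (later - earlier).round }
end

# Timestamps copied from the "Processing by" lines above.
PROCESSING_TIMESTAMPS = %w[
  2025-12-31T14:23:14.122684
  2025-12-31T14:24:16.197120
  2025-12-31T14:25:15.887356
  2025-12-31T14:26:20.932541
  2025-12-31T14:27:20.932936
  2025-12-31T14:28:21.107578
].freeze

p gaps_in_seconds(PROCESSING_TIMESTAMPS) # prints [62, 60, 65, 60, 60]
```

The requests arrive roughly one minute apart. If each run takes as long as the timed download described later in this report, all six would have been in progress at the same time.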

In fact, I downloaded the report myself by going to https://outreachdashboard.wmflabs.org/usage -> generate CSV of all programs, and it took more than 12 minutes (758856 ms, per the log below).

App 1475 output: I, [2025-12-31T15:32:38.911051 #1475]  INFO -- : Started GET "/all_courses_csv" for 172.16.19.232 at 2025-12-31 15:32:38 +0000
App 1475 output: I, [2025-12-31T15:32:38.913637 #1475]  INFO -- : Processing by AnalyticsController#all_courses_csv as HTML
App 1475 output: I, [2025-12-31T15:32:55.461166 #1475]  INFO -- sentry: [Transport] Sending envelope with items [sessions]  to Sentry
App 1475 output: I, [2025-12-31T15:45:17.769716 #1475]  INFO -- :   Rendered text template (Duration: 0.1ms | Allocations: 26)
App 1475 output: I, [2025-12-31T15:45:17.770073 #1475]  INFO -- : Sent data all_courses-2025-12-31.csv (1.3ms)
App 1475 output: I, [2025-12-31T15:45:17.770195 #1475]  INFO -- : Completed 200 OK in 758856ms (Views: 0.7ms | ActiveRecord: 384396.5ms | Allocations: 135166079)
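
As a quick breakdown of that last line, the sketch below pulls the total and ActiveRecord durations out of the `Completed` message and converts them. `request_timing` is a hypothetical helper written for this issue, not part of the dashboard code.

```ruby
# Hypothetical helper: extract total and ActiveRecord time from a Rails
# "Completed ..." log line and express them in friendlier units.
def request_timing(line)
  match = line.match(/in (?<total>\d+)ms .*ActiveRecord: (?<db>[\d.]+)ms/)
  total_ms = match[:total].to_f
  db_ms = match[:db].to_f
  {
    total_minutes: (total_ms / 60_000).round(1),
    db_percent: (100 * db_ms / total_ms).round(1),
    non_db_minutes: ((total_ms - db_ms) / 60_000).round(1)
  }
end

line = 'Completed 200 OK in 758856ms (Views: 0.7ms | ActiveRecord: 384396.5ms | Allocations: 135166079)'
p request_timing(line)
```

This gives about 12.6 minutes in total, of which roughly 50.7% is ActiveRecord time, leaving about 6.2 minutes outside the database. Together with the 135166079 allocations, that suggests both the queries and the Ruby-side row building are worth reviewing.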
