
Commit 038811b

aagrawalrtsl and Ayushi Agrawal authored
Adding a partitioned reporting_patient_states table (#5614)
**Story card:** [sc-15583](https://app.shortcut.com/simpledotorg/story/15583/new-reporting-pipeline-for-reporting-patient-states)

## Because

Refreshing the materialized view is very slow. We want to introduce a partitioned-table approach in which each month has its own partition: a refresh drops that month's partition, regenerates its data, and attaches it back to the main table. With this approach we no longer do a full refresh of the table every day.

## This addresses

This is the first step towards redesigning our reporting pipeline. We are adding a partitioned reporting_patient_states table under a different schema. The materialized-view version continues to behave as it does today until we decide to make the partitioned table live. Dashboards still read their data from the materialized view.

When the daily rake task runs, it refreshes only the current month and the previous month of reporting_patient_states. We still want to refresh the last 15 months of data to handle delayed-sync scenarios, so each of those months is refreshed once at some point during the current month.

There is also a rake task that fully refreshes this table. It deletes everything from the table and regenerates all the data from the beginning. Ideally, this task runs only once, at the start, when the feature is deployed.

## Test instructions

Run the migrations.

rake "db:refresh_reporting_views" - for the monthly/daily refresh
rake "reporting:full_partitioned_refresh" - for a full refresh of reporting_patient_states

---------

Co-authored-by: Ayushi Agrawal <[email protected]>
1 parent d7f09d1 commit 038811b

14 files changed, +1176 -4 lines changed
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+class RefreshReportingPartitionedTableJob
+  include Sidekiq::Worker
+
+  sidekiq_options queue: :default
+
+  def perform(reporting_month, table_name)
+    Rails.logger.info "Starting refresh for '#{table_name}' for month '#{reporting_month}' at #{Time.now.utc}"
+    ActiveRecord::Base.connection.exec_query(
+      "CALL simple_reporting.add_shard_to_table('#{reporting_month}', '#{table_name}')"
+    )
+  end
+end

app/models/reports/patient_state.rb

Lines changed: 10 additions & 0 deletions
@@ -13,6 +13,16 @@ def self.materialized?
       true
     end
 
+    def self.partitioned?
+      true
+    end
+
+    def self.partitioned_refresh(refresh_month)
+      ActiveRecord::Base.connection.exec_query(
+        "CALL simple_reporting.add_shard_to_table('#{refresh_month}', 'reporting_patient_states')"
+      )
+    end
+
     def self.by_assigned_region(region_or_source)
       region = region_or_source.region
       where("assigned_#{region.region_type}_region_id" => region.id)

app/models/reports/refreshable.rb

Lines changed: 4 additions & 0 deletions
@@ -10,6 +10,10 @@ def refresh(transaction: true)
       end
     end
 
+    def partitioned?
+      false
+    end
+
     private
 
     def refresh_view

app/models/reports/view.rb

Lines changed: 8 additions & 0 deletions
@@ -33,5 +33,13 @@ def self.add_column_description(column_description, column_name)
     def self.materialized?
       raise NotImplementedError
     end
+
+    def self.get_refresh_months
+      current_date = Date.today
+      current_day = current_date.day
+      current_month = current_date.beginning_of_month
+      month_offset = (current_day / 2) + 1
+      current_day.odd? ? [current_month, current_month.prev_month] : [current_month, current_month - month_offset.month]
+    end
   end
 end
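
The schedule encoded in `get_refresh_months` is easier to see with concrete dates. Below is a minimal sketch that restates the same arithmetic using only the Ruby standard library; the method name `refresh_months_for` and the explicit `date` argument are hypothetical, and `Date#<<` stands in for the ActiveSupport helpers (`beginning_of_month`, `prev_month`, `Integer#month`) used in the diff.

```ruby
require "date"

# Hypothetical standalone restatement of get_refresh_months.
# Takes the date as an argument instead of reading Date.today,
# and uses Date#<< (subtract N months) instead of ActiveSupport.
def refresh_months_for(date)
  current_month = Date.new(date.year, date.month, 1)
  month_offset = (date.day / 2) + 1 # integer division, as in the diff

  # Odd days: current month plus the previous month.
  # Even days: current month plus one older month chosen by the day.
  older_month = date.day.odd? ? (current_month << 1) : (current_month << month_offset)
  [current_month, older_month]
end

# Print the pair of months picked on a few days of June 2024.
[1, 2, 15, 30].each do |day|
  months = refresh_months_for(Date.new(2024, 6, day))
  puts "day #{day}: #{months.map(&:to_s).join(', ')}"
end
```

Read this way, odd days of the month refresh the current and previous months, while even days 2 through 30 step through offsets 2 through 16, which gives 15 distinct older months over the course of a month. For example, on 2024-06-30 the offset is 16, so the older month is 2023-02-01.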

app/services/refresh_reporting_views.rb

Lines changed: 13 additions & 3 deletions
@@ -105,18 +105,28 @@ def all_views_refreshed?
 
   def refresh
     views.each do |name|
+      klass = name.constantize
       benchmark_and_statsd(name) do
-        klass = name.constantize
         klass.refresh
       end
+
+      if klass.partitioned?
+        benchmark_and_statsd(name, true) do
+          klass.get_refresh_months.each do |refresh_month|
+            klass.partitioned_refresh(refresh_month)
+          end
+        end
+      end
     end
   end
 
-  def benchmark_and_statsd(operation)
+  def benchmark_and_statsd(operation, partitioned_refresh = false)
     view = operation == "all" ? "all" : operation.constantize.table_name
     name = "reporting_views_refresh_duration_seconds"
     result = nil
-    Metrics.benchmark_and_gauge(name, {view: view}) do
+    options_hash = {view: view}
+    options_hash[:partitioned_refresh] = true if partitioned_refresh
+    Metrics.benchmark_and_gauge(name, options_hash) do
       result = yield
     end
     result
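
The opt-in pattern across these files (a `partitioned?` default of `false` in the shared module, overridden to `true` on the one view that has a partitioned table) can be sketched in isolation. This is a simplified illustration, not the application's code: `PlainView`, `PartitionedView`, `refresh_all`, and the string return values are hypothetical stand-ins, and it assumes the module is mixed in with `extend` so that its methods are callable on the class.

```ruby
# Hypothetical stand-in for the shared module: every view can be
# refreshed, and none is partitioned unless it says so.
module Refreshable
  def refresh
    "refreshed #{name}"
  end

  def partitioned?
    false
  end
end

class PlainView
  extend Refreshable
end

class PartitionedView
  extend Refreshable

  # A class-level definition takes precedence over the extended default.
  def self.partitioned?
    true
  end

  # Fixed months for illustration only.
  def self.get_refresh_months
    ["2024-06-01", "2024-05-01"]
  end

  def self.partitioned_refresh(month)
    "partition #{month} of #{name}"
  end
end

# Mirrors the loop in the diff: always run the normal refresh, then
# refresh individual partitions only for views that opt in.
def refresh_all(view_names)
  view_names.flat_map do |view_name|
    klass = Object.const_get(view_name)
    steps = [klass.refresh]
    if klass.partitioned?
      steps += klass.get_refresh_months.map { |month| klass.partitioned_refresh(month) }
    end
    steps
  end
end

puts refresh_all(["PlainView", "PartitionedView"])
```

Keeping the default in the shared module means existing views need no changes, and the existing materialized-view refresh still runs for the partitioned view, which matches the commit's intent to keep both versions side by side for now.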
