Skip to content

UI Performance Gain: Materialized View + Mojo::IOLoop->recurring job for History Endpoint (Minion::Backend::Pg) #140

@rshingleton

Description

@rshingleton

Summary

I've refactored the /history endpoint to use a PostgreSQL materialized view, refreshed on a recurring timer and guarded by an advisory lock. This approach has reduced history endpoint response time from over 5 seconds to under 130 ms on a dataset of ~7.5 million records (48 hours). The advisory lock ensures only one process or node runs the refresh at a time, which is essential for preforked or multi-node environments.


Timing Results

Implementation Request Status Response Size Time
Original (dynamic query) /history 200 0.8 kB 5.21 s
Materialized View (MV) /history 200 0.8 kB 127 ms

Note: An initial or rare /history request may be slower if a refresh hasn't run recently.


Current Updated Implementation

Materialized View SQL

CREATE MATERIALIZED VIEW minion_jobs_history_hourly AS
SELECT EXTRACT(EPOCH FROM ts) AS epoch,
       COALESCE(failed_jobs, 0) AS failed_jobs,
       COALESCE(finished_jobs, 0) AS finished_jobs
FROM (
  SELECT EXTRACT(DAY FROM finished) AS day,
         EXTRACT(HOUR FROM finished) AS hour,
         COUNT(*) FILTER (WHERE state = 'failed') AS failed_jobs,
         COUNT(*) FILTER (WHERE state = 'finished') AS finished_jobs
    FROM minion_jobs
   WHERE finished > NOW() - INTERVAL '24 hours'
   GROUP BY day, hour
) AS j
RIGHT OUTER JOIN (
  SELECT *
    FROM GENERATE_SERIES(
      date_trunc('hour', NOW() - INTERVAL '24 hour'),
      date_trunc('hour', NOW()),
      '1 hour'
    ) AS ts
) AS s
ON EXTRACT(HOUR FROM ts) = j.hour AND EXTRACT(DAY FROM ts) = j.day
ORDER BY epoch ASC;

Materialized View Refresh Timer (Perl, Logging Omitted)

sub _history_refresh_timer_id {
    my $self = shift;

    return $self->{_history_refresh_timer_id} if $self->{_history_refresh_timer_id};
    return undef unless Mojo::IOLoop->is_running;

    my $interval = 300; # 5 minutes
    my $hash = sha1_hex("MINION_HISTORY");
    my $lock_key = hex(substr($hash, 0, 8));

    my $callback = sub {
        my $dbh = $self->pg->db->dbh;
        my $got = $dbh->selectrow_array('SELECT pg_try_advisory_lock(?)', undef, $lock_key);
        if ($got) {
            eval {
                $self->pg->db->query('REFRESH MATERIALIZED VIEW minion_jobs_history_hourly');
            };
        }
    };

    $self->{_history_refresh_timer_id} = Mojo::IOLoop->recurring($interval => sub {
        $callback->();
    });

    $callback->(); # Run immediately on startup

    return $self->{_history_refresh_timer_id};
}

Updated history Method

sub history {
    my $self = shift;
    $self->_history_refresh_timer_id; # Starts/ensures the refresh loop

    my $daily = $self->pg->db->query(
        "SELECT epoch, failed_jobs, finished_jobs FROM minion_jobs_history_hourly ORDER BY epoch ASC"
    )->hashes->to_array;

    return { daily => $daily };
}

How It Works

  • When the history endpoint is hit, the periodic timer and advisory lock are initialized if not already present.
  • The advisory lock is not released, but has a TTL that expires it after the interval to prevent subsequent runs from drift or parallel processes.
  • Every 5 minutes (interval configurable), one process acquires the advisory lock and refreshes the materialized view—preventing redundant, overlapping refreshes across preforked or multi-node deployments.
  • All stats are served from the precomputed view, providing consistent sub-150ms UI response even at multimillion-record scale.
  • A priming or rarely used endpoint may briefly reflect refresh time, but subsequent calls are instant.

  • Any further observations or edge case cautions for advisory lock use with periodic refresh in clustered/HA settings?
  • Suggestions for a better place or way to schedule the view update?

PostgreSQL 17.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), 64-bit

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions