
Add out-of-the-box performance timing mechanism #163

@colmsnowplow

Description


SQL-runner logs each step to stdout with timestamps, which is awesome. However, for gnarly debugging or longer-term performance logging it's potentially not ideal: if the query is sitting in the database queue, or the connection is laggy, these timings will appear to indicate poor performance even when the queries themselves are fine.

Feels like we could easily create a boilerplate way to log start and end timestamps to a table in the actual database, both for a playbook as a whole and for its individual steps. For example, this is what I'm using for the model I'm currently working on (Redshift):


-- At start of whole model:
CREATE TABLE IF NOT EXISTS {{.output_schema}}.base_timings (
  run_id TIMESTAMP,
  test_id VARCHAR(64),
  step_name VARCHAR(64),
  start_tstamp TIMESTAMP,
  end_tstamp TIMESTAMP
);

DROP TABLE IF EXISTS {{.scratch_schema}}.step_timing;

CREATE TABLE {{.scratch_schema}}.step_timing (
  test_id VARCHAR(64),
  step_name VARCHAR(64),
  start_tstamp TIMESTAMP,
  end_tstamp TIMESTAMP
);

-- At the start of every step:
INSERT INTO {{.scratch_schema}}.step_timing VALUES (
  '{{.entropy}}', '01-new-events-limits', GETDATE(), NULL
);

-- At the end of every step:
UPDATE {{.scratch_schema}}.step_timing
SET end_tstamp = GETDATE()
WHERE test_id = '{{.entropy}}'
  AND step_name = '01-new-events-limits';

-- At end of whole model:
INSERT INTO {{.output_schema}}.base_timings (
  SELECT GETDATE(), * FROM {{.scratch_schema}}.step_timing
);
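With that in place, per-step durations can be read straight out of the timings table. A minimal sketch, assuming only the base_timings definition above and Redshift's DATEDIFF:

-- Per-step durations (in seconds) for each run, newest run first:
SELECT
  run_id,
  test_id,
  step_name,
  DATEDIFF(second, start_tstamp, end_tstamp) AS duration_s
FROM {{.output_schema}}.base_timings
ORDER BY run_id DESC, start_tstamp;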

Could be a nice feature if this were available as a configuration option for sql-runner.
