-
-
Notifications
You must be signed in to change notification settings - Fork 776
Description
Provide environment information
System:
OS: Linux 6.8 Ubuntu 24.04.2 LTS 24.04.2 LTS (Noble Numbat)
CPU: (4) x64 AMD EPYC-Milan Processor
Memory: 10.10 GB / 15.24 GB
Container: Yes
Shell: 5.2.21 - /bin/bash
Binaries:
Node: 18.19.1 - /usr/bin/node
npm: 9.2.0 - /usr/bin/npm
Describe the bug
I self hosted trigger for a long time. I have t many runs stored in my db so I recently decided to clean the older ones. But my script seems to took a very long time to only delete 1500 runs. When I tried to describe the query plan to delete one row it appears to me that two relations where missing some key indexes.
Find slow foreign keys
postgres=# begin;
explain (analyze,buffers,timing)
delete from "TaskRun" where id='cmawag7nafpp8o0281yebzmdz';
rollback;
BEGIN
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Delete on "TaskRun" (cost=0.42..8.44 rows=0 width=0) (actual time=3.989..3.990 rows=0 loops=1)
Buffers: shared hit=7 read=5 dirtied=1
-> Index Scan using "TaskRun_pkey" on "TaskRun" (cost=0.42..8.44 rows=1 width=6) (actual time=1.936..1.938 rows=1 loops=1)
Index Cond: (id = 'cmawag7nafpp8o0281yebzmdz'::text)
Buffers: shared hit=2 read=2
Planning Time: 0.136 ms
Trigger for constraint TaskRunAttempt_taskRunId_fkey on TaskRun: time=2.418 calls=1
Trigger for constraint TaskRunDependency_taskRunId_fkey on TaskRun: time=2.381 calls=1
Trigger for constraint BatchTaskRunItem_taskRunId_fkey on TaskRun: time=2.630 calls=1
Trigger for constraint Checkpoint_runId_fkey on TaskRun: time=0.861 calls=1
Trigger for constraint CheckpointRestoreEvent_runId_fkey on TaskRun: time=0.742 calls=1
Trigger for constraint BulkActionItem_sourceRunId_fkey on TaskRun: time=0.032 calls=1
Trigger for constraint BulkActionItem_destinationRunId_fkey on TaskRun: time=0.015 calls=1
Trigger for constraint _TaskRunToTaskRunTag_A_fkey on TaskRun: time=2.354 calls=1
Trigger for constraint ProjectAlert_taskRunId_fkey on TaskRun: time=0.015 calls=1
Trigger for constraint TaskRun_rootTaskRunId_fkey on TaskRun: time=1.790 calls=1
Trigger for constraint TaskRun_parentTaskRunId_fkey on TaskRun: time=1.418 calls=1
Trigger for constraint TaskRunDependency_dependentAttemptId_fkey on TaskRunAttempt: time=0.022 calls=1
Trigger for constraint BatchTaskRun_dependentTaskAttemptId_fkey on TaskRunAttempt: time=0.638 calls=1
Trigger for constraint Checkpoint_attemptId_fkey on TaskRunAttempt: time=0.814 calls=1
Trigger for constraint BatchTaskRunItem_taskRunAttemptId_fkey on TaskRunAttempt: time=1.489 calls=1
Trigger for constraint CheckpointRestoreEvent_attemptId_fkey on TaskRunAttempt: time=40.039 calls=1
Trigger for constraint ProjectAlert_taskRunAttemptId_fkey on TaskRunAttempt: time=0.033 calls=1
Trigger for constraint TaskRun_parentTaskRunAttemptId_fkey on TaskRunAttempt: time=336.831 calls=1
Execution Time: 398.553 ms
(25 rows)
ROLLBACK
Fixing the indexes
postgres=# -- does not block writers while it builds
CREATE INDEX CONCURRENTLY IF NOT EXISTS
"TaskRun_parentTaskRunAttemptId_idx"
ON "TaskRun" ("parentTaskRunAttemptId");
CREATE INDEX CONCURRENTLY IF NOT EXISTS
"CheckpointRestoreEvent_attemptId_idx"
ON "CheckpointRestoreEvent" ("attemptId");
CREATE INDEX
CREATE INDEX
postgres=# begin;
explain (analyze,buffers,timing)
delete from "TaskRun" where id='cmawag7nafpp8o0281yebzmdz';
rollback;
BEGIN
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Delete on "TaskRun" (cost=0.42..8.44 rows=0 width=0) (actual time=0.132..0.133 rows=0 loops=1)
Buffers: shared hit=11 dirtied=1
-> Index Scan using "TaskRun_pkey" on "TaskRun" (cost=0.42..8.44 rows=1 width=6) (actual time=0.046..0.047 rows=1 loops=1)
Index Cond: (id = 'cmawag7nafpp8o0281yebzmdz'::text)
Buffers: shared hit=4
Planning:
Buffers: shared hit=17 read=1
Planning Time: 0.301 ms
Trigger for constraint TaskRunAttempt_taskRunId_fkey on TaskRun: time=0.134 calls=1
Trigger for constraint TaskRunDependency_taskRunId_fkey on TaskRun: time=0.092 calls=1
Trigger for constraint BatchTaskRunItem_taskRunId_fkey on TaskRun: time=0.087 calls=1
Trigger for constraint Checkpoint_runId_fkey on TaskRun: time=0.138 calls=1
Trigger for constraint CheckpointRestoreEvent_runId_fkey on TaskRun: time=0.379 calls=1
Trigger for constraint BulkActionItem_sourceRunId_fkey on TaskRun: time=0.109 calls=1
Trigger for constraint BulkActionItem_destinationRunId_fkey on TaskRun: time=0.061 calls=1
Trigger for constraint _TaskRunToTaskRunTag_A_fkey on TaskRun: time=0.079 calls=1
Trigger for constraint ProjectAlert_taskRunId_fkey on TaskRun: time=0.050 calls=1
Trigger for constraint TaskRun_rootTaskRunId_fkey on TaskRun: time=0.189 calls=1
Trigger for constraint TaskRun_parentTaskRunId_fkey on TaskRun: time=0.245 calls=1
Trigger for constraint TaskRunDependency_dependentAttemptId_fkey on TaskRunAttempt: time=0.067 calls=1
Trigger for constraint BatchTaskRun_dependentTaskAttemptId_fkey on TaskRunAttempt: time=0.081 calls=1
Trigger for constraint Checkpoint_attemptId_fkey on TaskRunAttempt: time=0.059 calls=1
Trigger for constraint BatchTaskRunItem_taskRunAttemptId_fkey on TaskRunAttempt: time=0.063 calls=1
Trigger for constraint CheckpointRestoreEvent_attemptId_fkey on TaskRunAttempt: time=0.135 calls=1
Trigger for constraint ProjectAlert_taskRunAttemptId_fkey on TaskRunAttempt: time=0.043 calls=1
Trigger for constraint TaskRun_parentTaskRunAttemptId_fkey on TaskRunAttempt: time=0.154 calls=1
Execution Time: 2.353 ms
(27 rows)
ROLLBACK
It could be usefull to add those by default.
If you are okay with that I could maybe create a pr for this.
Thanks
Reproduction repo
https://trigger.dev/docs/open-source-self-hosting#self-hosting
To reproduce
Just selfhost trigger and try having more than >500k task run in your db.
Additional information
No response