feat: task logs for database operations by jason-lynch · Pull Request #36 · pgEdge/control-plane

jason-lynch · 2025-05-27T14:30:46Z

Summary

With this PR, each database operation has an associated task that can be tracked via the API. I've modified each operation endpoint to return a task alongside their normal response.

I've divided it up into more granular commits that can (hopefully!) be reviewed individually.

Details

In order to make this behavior consistent for deletes, the delete task is not deleted when a database is deleted, and it can still be fetched via API after the database is deleted. This way, API clients wont' get a 404 if they're slow to request the delete task, and they can track the delete process through to completion. The delete task will, however, get deleted if you recreate a new database with the same ID.

The backup and restore operations already had tasks associated with them, so I've introduced the concept of "parent tasks". Those existing tasks now have a parent_id that's set to the main operation's task ID.

I've also added optional timestamps to the task log API. You can enable them with a new include_timestamps parameter.

Since I was already making so many API changes (including adding a response type for the backup operation) I renamed initiate-database-backup to backup-database-node to be consistent with the other endpoints. This leaves room for a potential backup-database endpoint that backs up all nodes.

Remember to run restish api sync for each control plane server to get the latest API changes in your CLI.

PLAT-33

mmols

This is working well for me - just a few questions.

mmols · 2025-05-28T17:09:08Z

server/internal/workflows/activities/delete_db_entities.go

+		return nil, fmt.Errorf("failed to list tasks: %w", err)
+	}
+	for _, t := range tasks {
+		// We want to leave the delete task in place so that API clients can


I wonder if our default behavior should be to leave these tasks in the system, and handle task cleanup through a separate function / story where a retention is configurable.

If the user deletes a database, the current situation is that there is little context / logs other than the main Control Plane log.

Is there a technical reason to not keep older tasks around?

I did this because it felt like a strange user experience to see all of the old tasks if you recreate a database with the same identifier - especially since there's no other mechanism to remove them. That said, I might be biased because I reuse identifiers in order to make development easier. I don't know if that will be common for end-users.

If the user deletes a database, the current situation is that there is little context / logs other than the main Control Plane log.

The way this works is that the delete task is left in place, but all other tasks are removed. You're still able to list and fetch tasks for the database, but you'll only see the delete task.

If you then recreate the database with the same ID, the delete task is removed before the database is created.

Is there a technical reason to not keep older tasks around?

The only technical reason to clean these up is that they consume Etcd capacity, which is limited. But, these don't take up much space relative to the workflow records, so I mainly did this for user experience.

I thought more about this, and I've changed it so that tasks are not deleted when the database is deleted. They are still deleted on create, so it will clear out old tasks if you reuse a database identifier.

That seems like a good compromise - I tested this and it meshes with what I'd expect. I think re-using the identifier won't be common for most users (compared to omitting it and having it generated).

api/design/api.go

tsivaprasad · 2025-05-29T13:53:07Z

server/internal/workflows/activities/apply_event.go

+		return a.TaskSvc.AddLogLine(ctx, databaseID, taskID, msg)
+	}
+
+	msg := fmt.Sprintf("%s %s on host %s", verb, resource.Identifier(), a.Config.HostID)


I think we can retrieve the resource ID once and reuse it in multiple places within the function, instead of calling resource.Identifier() repeatedly. What do you think?"

Thanks! I'm refactoring this a little bit for the structured logging change, so I'll make sure to include this.

I did this as part of the change to structured logging in ea72b00.

tsivaprasad · 2025-05-29T14:37:53Z

server/internal/workflows/activities/log_task_event.go

+func (a *Activities) LogTaskEvent(ctx context.Context, input *LogTaskEventInput) (*LogTaskEventOutput, error) {
+	logger := activity.Logger(ctx).With("database_id", input.DatabaseID.String())
+	logger.Info("updating database state")
+


We could add an early return to skip processing if len(input.Messages) == 0 to avoid unnecessary work. What do you think?

Sure! That makes sense.

I've added this check to the logTaskEvent function that invokes this activity in 1950037.

PLAT-33

Also renames `initiate-database-backup` to `backup-database-node` to be consistent with other operations and to leave room for a potential `backup-database` operation. PLAT-33

PLAT-33

Modifies database operation workflows to take a task ID and update the task status during processing. PLAT-33

Adds some already-existing fields to the task APIs and an `include_timestamp` option to the task log API. PLAT-33

PLAT-33

Our task logs have a separate `timestamp` field, so disabling these timestamps will make the task logs easier to consume consistently. PLAT-33

Fixed requested changes in ea72b00 and 1950037

jason-lynch requested review from mmols and tsivaprasad May 27, 2025 14:30

Base automatically changed from feat/PLAT-55/release-process to main May 28, 2025 14:51

mmols reviewed May 28, 2025

View reviewed changes

tsivaprasad previously requested changes May 29, 2025

View reviewed changes

jason-lynch added 11 commits May 31, 2025 11:00

refactor: add workflow service to manage workflows

44da3d6

PLAT-33

refactor: move task domain logic to task package

a1ce3d2

PLAT-33

feat: add timestamp to task log lines

0cf4136

PLAT-33

feat: add log_task_event activity

8af2b9b

PLAT-33

feat: return task IDs from database operations

3752f6f

Also renames `initiate-database-backup` to `backup-database-node` to be consistent with other operations and to leave room for a potential `backup-database` operation. PLAT-33

feat: create and return tasks in workflow service

4486edd

PLAT-33

feat: update tasks in workflows

74ef439

Modifies database operation workflows to take a task ID and update the task status during processing. PLAT-33

feat: minor task API improvements

21b5d0c

Adds some already-existing fields to the task APIs and an `include_timestamp` option to the task log API. PLAT-33

docs: release notes for task API changes

673a358

PLAT-33

feat: use structured logging for task logs

ea72b00

PLAT-33

feat: disable pgbackrest timestamp logging

bba8a1b

Our task logs have a separate `timestamp` field, so disabling these timestamps will make the task logs easier to consume consistently. PLAT-33

jason-lynch force-pushed the feat/PLAT-33/workflow-task-logs-split branch from 92172f9 to bba8a1b Compare May 31, 2025 16:05

fix: short-circuit in logTaskEvent

1950037

jason-lynch requested review from mmols and tsivaprasad June 2, 2025 13:03

mmols approved these changes Jun 2, 2025

View reviewed changes

jason-lynch merged commit 37399cc into main Jun 2, 2025
2 checks passed

jason-lynch deleted the feat/PLAT-33/workflow-task-logs-split branch June 2, 2025 20:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

feat: task logs for database operations#36

feat: task logs for database operations#36
jason-lynch merged 12 commits intomainfrom
feat/PLAT-33/workflow-task-logs-split

jason-lynch commented May 27, 2025

Uh oh!

mmols left a comment

Uh oh!

mmols May 28, 2025

Uh oh!

jason-lynch May 28, 2025

Uh oh!

jason-lynch Jun 2, 2025

Uh oh!

mmols Jun 2, 2025

Uh oh!

Uh oh!

tsivaprasad May 29, 2025

Uh oh!

jason-lynch May 30, 2025

Uh oh!

jason-lynch Jun 2, 2025

Uh oh!

tsivaprasad May 29, 2025

Uh oh!

jason-lynch May 30, 2025

Uh oh!

jason-lynch Jun 2, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

jason-lynch commented May 27, 2025

Summary

Details

Uh oh!

mmols left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants