PBM-1665 keep PITR running and prioritize different node for profile backups #1263
boris-ilijic
left a comment
Looks good, but it's necessary to expand it from a use-case perspective.
cmd/pbm-agent/backup.go
Outdated
// Only set if not already present (preserve previous priorities)
if _, exists := coefficients[pl.Node]; !exists {
	coefficients[pl.Node] = 0.1 // low priority, making it a last resort
This will not work correctly for PSA. From requirements:
In case of PSA topology, prioritize Secondary.
P: 1.0; S (after deprioritize): 0.1 --> the backup will be executed on P, but it should run on S.
P is actually 1/2, but your point is valid anyway. The priority for the PITR secondary should be > 1/2. Since the number matters, are you OK if we make the score constants in pbm/prio/priority.go public?
deprioritizePITRNodes deals with coefficients, not scores, so maybe we can make that intention clear by setting the coefficient to prio.defaultScore - 0.1. That would be a good rule for protecting the PITR node from running the backup.
I would remove the fixed 0.6; it's kind of a "where's this number coming from?" :)
But all good for me, this would work now.
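To make the arithmetic in this thread concrete, here is a minimal sketch of how a coefficient could interact with the base scores. It assumes the final score is the base score multiplied by the coefficient, that a regular secondary has a base score of 1.0, and that the primary has half of that, as mentioned above. The names `node`, `score`, and `pick` are hypothetical and are not taken from the PBM code base.

```go
package main

import "fmt"

// Assumed base scores, mirroring the values mentioned in the discussion.
// These are illustrative constants, not the ones from the prio package.
const (
	defaultScore    = 1.0
	scoreForPrimary = defaultScore / 2
)

// node is a hypothetical stand-in for a backup candidate.
type node struct {
	name    string
	primary bool
}

// score multiplies the base score by the node's coefficient.
// Nodes without an explicit coefficient use 1.0.
func score(n node, coefficients map[string]float64) float64 {
	base := defaultScore
	if n.primary {
		base = scoreForPrimary
	}
	c, ok := coefficients[n.name]
	if !ok {
		c = 1.0
	}
	return base * c
}

// pick returns the name of the highest-scoring node.
func pick(nodes []node, coefficients map[string]float64) string {
	best, bestScore := "", -1.0
	for _, n := range nodes {
		if s := score(n, coefficients); s > bestScore {
			best, bestScore = n.name, s
		}
	}
	return best
}

func main() {
	// PSA topology: the arbiter holds no data, so only P and S are candidates.
	psa := []node{{name: "P", primary: true}, {name: "S"}}

	// S runs PITR and gets a coefficient of 0.1: 0.1 < 0.5, so P wins.
	fmt.Println(pick(psa, map[string]float64{"S": 0.1}))

	// With 0.6 the secondary stays ahead of the primary: 0.6 > 0.5.
	fmt.Println(pick(psa, map[string]float64{"S": 0.6}))

	// Deriving the value from the default score avoids the magic number.
	fmt.Println(pick(psa, map[string]float64{"S": defaultScore - 0.1}))
}
```

Under these assumptions, any coefficient above 1/2 keeps the PITR secondary ahead of the primary in PSA, while it still ranks below a secondary that is not running PITR.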
cmd/pbm-agent/backup.go
Outdated
// Only set if not already present (preserve previous priorities)
if _, exists := coefficients[pl.Node]; !exists {
	coefficients[pl.Node] = 0.6 // low priority, making it a last resort
@boris-ilijic this should do the trick for PSA. But I would prefer making the score constants in the priority package public and using them here.
This stop was temporary, since for profile storage the PITR monitor in the agent would restart it anyway.
Ticket: https://perconadev.atlassian.net/browse/PBM-1665
PITR was being stopped only temporarily – when a backup arrived, the main PITR slicer would be signaled, take a last slice, and return. However, the PITR monitor routine in the Agent would then check active backup logs and, in the case of profile storage, start PITR anew. This took about 15s.
This PR adjusts the behavior in two ways