Commit ff9a996
improve supervisor startup, add comments about issues with the supervisor getting stuck (#109)
* improve supervisor startup, add comments about issues with the supervisor getting stuck
## Improve supervisor startup
Previously, the supervisor's snapshot loop was not ideal for process startup:
* We started a reflector and then *immediately* stopped it upon entering the snapshot loop.
* Thus no updates accumulated before the first snapshot was created, so the
snapshot we uploaded was essentially a copy of the latest snapshot in S3.
Fix: We sleep first in the loop to allow updates to accumulate before stopping the reflector.
Furthermore, the time-to-first-new-snapshot has actually been improved, perhaps
counter-intuitively given that we are now immediately *sleeping* instead of
making a snapshot.
* Consider that previously the first fruitful iteration was delayed until the
first (duplicate) snapshot was compressed & uploaded.
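
The reordering described above can be illustrated with a minimal Go sketch. This is not the supervisor's actual code: the step names, the `steps` recorder, and the assumption that the old loop restarted the reflector and slept *after* each snapshot are all hypothetical, chosen only to show the order of operations.

```go
package main

import "fmt"

// steps records the order of operations. It is a hypothetical stand-in
// for the real reflector and snapshot machinery.
type steps struct{ log []string }

func (s *steps) do(name string) { s.log = append(s.log, name) }

// oldLoop models the assumed previous behavior: the reflector is stopped
// as soon as the loop is entered, so the first snapshot has no new updates.
func oldLoop(s *steps, iterations int) {
	s.do("start reflector")
	for i := 0; i < iterations; i++ {
		s.do("stop reflector")
		s.do("snapshot+upload")
		s.do("start reflector")
		s.do("sleep")
	}
}

// newLoop models the fix: sleep first so updates can accumulate while the
// reflector runs, then stop it and take the snapshot.
func newLoop(s *steps, iterations int) {
	s.do("start reflector")
	for i := 0; i < iterations; i++ {
		s.do("sleep")
		s.do("stop reflector")
		s.do("snapshot+upload")
		s.do("start reflector")
	}
}

// sleptBeforeFirstSnapshot reports whether the reflector had a sleep
// interval to accumulate updates before the first snapshot was taken.
func sleptBeforeFirstSnapshot(log []string) bool {
	for _, step := range log {
		switch step {
		case "sleep":
			return true
		case "snapshot+upload":
			return false
		}
	}
	return false
}

func main() {
	before, after := &steps{}, &steps{}
	oldLoop(before, 2)
	newLoop(after, 2)
	fmt.Println("old:", before.log, "fruitful first snapshot:", sleptBeforeFirstSnapshot(before.log))
	fmt.Println("new:", after.log, "fruitful first snapshot:", sleptBeforeFirstSnapshot(after.log))
}
```

Under this assumed ordering, the old loop's first useful snapshot starts only after one duplicate snapshot-and-upload plus one sleep, whereas the new loop's first snapshot starts after a single sleep.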
## Supervisor getting stuck
This PR also documents a "supervisor stuck" issue we hit:
* We noticed the supervisor not making progress during a massive LDB cleanup.
* We had a series of 31 ~1MB transactions that deleted an exorbitant number of stranded
rows from the LDB.
* To overcome the system being stuck we increased our configured sleep duration from
5 to 30 minutes, until the system got unwedged.
* update tests to wait for first sleep to happen for snapshot to complete
* fix up language in comment
* already have a metric for ledger updates applied, just were not graphing it yet

1 parent 9427c90 commit ff9a996
2 files changed, +31 -10 lines changed