Commit 2998460
Decouple pbench-move-results from server
pbench-move-results knows how to query a server to get a destination
for the tarballs that it has to move. It then checks that it can get
to the destination. If that is successful, it packages up a tarball
and calculates its md5 sum. Both files are copied to the destination. It
checks that the tarball's md5 is correct and if so, it marks the
transaction complete by renaming the md5 file at the destination.
The implementation in this commit will apply to version-002 clients:
when they query the server they will get a different destination than
the version-001 clients. See below for the server-side handling.
Version-001 agents will continue to do things as they do today
(but see below for modified server-side handling of version-001
agents).
pbench-move-results gets a new --user option. The "user = <value>"
option gets added to the metadata.log in the [run] section before the
tarball is created. Its default value comes from the env variable
PBENCH_USER (if it exists), but may be overridden on the command line.
Any prefix specified on the command line is also handled the same way:
the "prefix = <value>" option gets added to the [run] section of
the metadata log.
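The metadata.log handling described above could be sketched roughly as follows. This is only an illustration: the helper names `add_run_option` and `resolve_user` are hypothetical, and the sketch assumes metadata.log is an ini-style file with a literal `[run]` header line, which is all the text above states.

```shell
#!/bin/bash
# Hypothetical helpers, not the actual pbench-move-results code.

# add_run_option FILE KEY VALUE
# Insert "KEY = VALUE" on the line right after the [run] header of FILE.
add_run_option() {
    local file=$1 key=$2 value=$3
    # Nothing to record if the value is empty.
    [ -n "$value" ] || return 0
    # Refuse to touch a file without a [run] section.
    grep -q '^\[run\]' "$file" || return 1
    local tmp
    tmp=$(mktemp) || return 1
    # Pass the new line through the environment so awk does not
    # interpret backslashes in the value.
    OPTLINE="$key = $value" awk '
        { print }
        /^\[run\]/ { print ENVIRON["OPTLINE"] }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Rewrite in place so the permissions of FILE are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# resolve_user CLI_VALUE
# A value given on the command line wins over $PBENCH_USER.
resolve_user() {
    local cli_user=$1
    echo "${cli_user:-${PBENCH_USER:-}}"
}
```

A caller would run something like `add_run_option metadata.log user "$(resolve_user "$opt_user")"` before creating the tarball, where `opt_user` stands for whatever variable holds the parsed --user value.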
Tarballs and md5 sum files are created in a temp directory
which is copied (recursively) to the server. The md5 file is
called <resultname>.tar.xz.md5.check and is renamed to omit
the .check suffix after a successful md5 check, thus signalling
completion of the moving and checking of the tarball.
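A minimal sketch of this staging-and-rename protocol, run entirely on local directories for clarity. The function names are hypothetical; in the real flow the staging directory is copied to the server and the check runs there over ssh. It assumes GNU `tar` with xz support and GNU `md5sum`.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the .md5.check convention.

# package_result RESULT_DIR STAGE_DIR
# Create <resultname>.tar.xz and <resultname>.tar.xz.md5.check in STAGE_DIR.
package_result() {
    local result_dir=$1 stage_dir=$2
    local name
    name=$(basename "$result_dir")
    tar -C "$(dirname "$result_dir")" -cJf "$stage_dir/$name.tar.xz" "$name" || return 1
    # Record the md5 with a bare file name so it can be verified
    # from whichever directory the files end up in.
    (cd "$stage_dir" && md5sum "$name.tar.xz" > "$name.tar.xz.md5.check")
}

# confirm_tarball DEST_DIR RESULTNAME
# Verify the md5 and, only on success, drop the .check suffix.
confirm_tarball() {
    local dest_dir=$1 name=$2
    (
        cd "$dest_dir" || exit 1
        md5sum --status -c "$name.tar.xz.md5.check" &&
            mv "$name.tar.xz.md5.check" "$name.tar.xz.md5"
    )
}
```

Because the rename happens only after a successful check, the presence of a plain `.md5` file is enough for a later consumer to know that the tarball arrived intact.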
The number of ssh invocations has thus been reduced to one initial
invocation to check connectivity, plus two for each tarball: one to copy
the temp directory over and one to check the MD5 sum and rename
the file.
Duplicate detection has been eliminated from the agent side.
It is done on the server side.
pbench-server-prep-shim-002 is a new script: it will scour the
version-002 reception directory for xxx.tar.xz.md5 files (i.e. files
that the agent has already checked and renamed). For each file, it
will check the md5 sum and, if successful, it will copy the
corresponding tarball and its md5 to the archive directory and then
create the link in the TODO state directory. That will get the ball
rolling on the rest of the processing. Errors such as a missing tarball,
a missing or bad md5, or duplicate names are detected and handled
appropriately (mostly by quarantining the tarball). This takes care
of version-002 agents.
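The version-002 shim loop described above might look roughly like the sketch below. It is a simplification under stated assumptions: the function name `process_reception` is hypothetical, the archive is treated as a flat directory, the quarantine subdirectories are simply called `md5` and `duplicates`, and removing the originals after a successful copy is a guess rather than something the text confirms.

```shell
#!/bin/bash
# Hypothetical sketch of the version-002 shim loop.

# process_reception RECEPTION ARCHIVE TODO QUARANTINE
process_reception() {
    local reception=$1 archive=$2 todo=$3 qdir=$4
    local md5file tarball name
    mkdir -p "$qdir/md5" "$qdir/duplicates" || return 1
    # Only files already renamed by the agent (no .check suffix) match.
    for md5file in "$reception"/*.tar.xz.md5; do
        [ -e "$md5file" ] || continue          # glob matched nothing
        tarball=${md5file%.md5}
        name=$(basename "$tarball")
        if [ ! -f "$tarball" ]; then
            echo "Quarantined: $name: tarball missing" >&2
            mv "$md5file" "$qdir/md5/"
            continue
        fi
        if [ -e "$archive/$name" ]; then
            echo "Duplicate: $name" >&2
            mv "$tarball" "$md5file" "$qdir/duplicates/"
            continue
        fi
        if ! (cd "$reception" && md5sum --status -c "$name.md5"); then
            echo "Quarantined: $name: bad md5" >&2
            mv "$tarball" "$md5file" "$qdir/md5/"
            continue
        fi
        # Copy to the archive, then create the TODO link that starts
        # the rest of the processing. Removing the originals afterwards
        # is an assumption of this sketch.
        cp "$tarball" "$md5file" "$archive/" &&
            ln -s "$archive/$name" "$todo/$name" &&
            rm "$tarball" "$md5file"
    done
    return 0
}
```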
Version-001 agents are now also taken care of by a server shim script:
they end up in a different reception directory, and the script uses
the TODO link that version 001 agents create as an indication that
a tarball is ready to process. Most of the processing is similar to
the version 002 shim, except that this shim also has to handle an
optional prefix file.
This commit also includes changes to the example config files for the
server to accommodate both version 001 and 002 agents, and the
corresponding changes to various pbench-server-activate-* scripts
which are called on server installation to set up the appropriate
structures.
Handling --user and --prefix options server-side:
Version 002 pbench-move-results options --user and --prefix (the first
new, the second one preexisting) are handled by inserting their
values into the metadata.log file. The server side has to be modified
to handle them.
At the same time, older agents, which package the prefix option into a
separate file that is copied to the server, also need to be handled
properly. Note that older agents do not have the user option capability
at all.
The handling is done at several levels:
- The version 001 shim renames the prefix file (if there is one) to
$resultname.prefix from prefix.$resultname (the latter was an
unfortunate choice that required name surgery in later processing).
- pbench-dispatch looks for the (now renamed) prefix file and moves it
into the .prefix subdirectory, just as it did before (except for the
renaming).
- pbench-unpack-tarballs takes the prefix (either out of the metadata.log
file for version 002 agents or the prefix file for version 001 agents)
and makes a link at the appropriate subdirectory of the results/ directory.
In addition, it retrieves the user option out of the metadata.log file and (if
non-empty) uses it to make a link in a new hierarchy:
users/$user/$controller/$prefix/$resultname
Of course, the metadata log is indexed into ES, so these values (in
particular the user value) are going to be available for the dashboard
to use.
- pbench-sync-package-tarballs now uses the modified prefix form.
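Two of the steps in the list above, the prefix file rename in the version 001 shim and the per-user link made by pbench-unpack-tarballs, can be sketched as follows. Both function names are hypothetical, and the link target is passed in as a parameter because the text does not say what the link points at.

```shell
#!/bin/bash
# Hypothetical helpers, not the actual pbench server code.

# rename_prefix_file DIR RESULTNAME
# Rename prefix.$resultname to $resultname.prefix if it exists.
rename_prefix_file() {
    local dir=$1 resultname=$2
    if [ -f "$dir/prefix.$resultname" ]; then
        mv "$dir/prefix.$resultname" "$dir/$resultname.prefix"
    fi
}

# link_user_result TARGET USERS_ROOT USER CONTROLLER PREFIX RESULTNAME
# Create users/$user/$controller/$prefix/$resultname -> TARGET.
link_user_result() {
    local target=$1 users=$2 user=$3 controller=$4 prefix=$5 resultname=$6
    # No user recorded in metadata.log: nothing to do.
    [ -n "$user" ] || return 0
    local linkdir="$users/$user/$controller"
    # The prefix is optional; skip that path level when it is empty.
    if [ -n "$prefix" ]; then
        linkdir="$linkdir/$prefix"
    fi
    mkdir -p "$linkdir" || return 1
    ln -sfn "$target" "$linkdir/$resultname"
}
```

Putting the result name first in the prefix file name means the result name can be recovered by stripping a suffix, which is the same operation already used for `.tar.xz` and `.md5`.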
One problem that is *NOT* addressed by this PR at all is what the
(agent-side) pbench-edit-prefix is supposed to do. For now, we punt.
Revert inotify stuff from pbench-base.sh and pbench-sync-satellite: It
will go in as part of a different PR. Clean up header comment in
pbench-sync-satellite.
(After review) Fix error handling in the shims. There is a "quarantine"
directory and three subdirs for each version (001 and 002):
md5, duplicates and errors.
The handling goes as follows:
- Errors in quarantine are fatal.
- MD5 errors go to "md5".
- Duplicate errors go to "duplicates".
- Operational errors (mkdir/mv/ln failures) in the shims quarantine
into a different subdir, "errors". After whatever
caused any of these errors is fixed, the quarantined tarballs should
be retried by moving them into the appropriate reception area.
- A quarantine setting is added to the config file.
- create-results-dir-structure is modified to create the quarantine
directory and its subdirs.
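The quarantine layout created at installation time could be set up along these lines. The function name and the `<kind>-<version>` naming of the subdirectories are assumptions made for illustration; the text only says that each version gets md5, duplicates and errors subdirs.

```shell
#!/bin/bash
# Hypothetical sketch of the quarantine directory setup.

# create_quarantine_dirs QUARANTINE_ROOT
create_quarantine_dirs() {
    local quarantine=$1 version kind
    for version in 001 002; do
        for kind in md5 duplicates errors; do
            # Directory naming (kind-version) is assumed, not confirmed.
            mkdir -p "$quarantine/$kind-$version" || return 1
        done
    done
}
```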
Emit error messages before calling quarantine.
Status formatting: all status on a single line.
Forget about prefix stats. Exit with code $nerrs.
The quarantine function now logs some information: it makes an
assumption that it is called within a log_init/log_finish context and
logs any error to the error file of the program that called it.
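A rough sketch of a quarantine function with this behavior is shown below. It is not the actual pbench function: here errors are written to stderr, on the assumption that the caller's log_init has redirected it to the program's error file, and failure is reported through the return status so that the caller can treat it as fatal.

```shell
#!/bin/bash
# Hypothetical sketch of a logging quarantine function.

# quarantine DEST FILE...
# Move each existing FILE into DEST, logging any failure.
quarantine() {
    local dest=$1
    shift
    local f
    if ! mkdir -p "$dest"; then
        echo "quarantine: cannot create $dest" >&2
        return 1
    fi
    for f in "$@"; do
        # A missing companion file (e.g. no md5) is not an error here.
        [ -e "$f" ] || continue
        if ! mv "$f" "$dest/"; then
            echo "quarantine: failed to move $f to $dest" >&2
            return 1
        fi
    done
    return 0
}
```

A shim would call it as `quarantine "$qdir" "$tarball" "$tarball.md5" || exit 1`, after emitting its own error message.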
Fix prefix handling in pbench-move-unpacked: duplicate the handling of
the prefix in pbench-unpack-tarballs into pbench-move-unpacked.
(After further review) Fixes to the two shims after review and
discussion.
- Add more error checking.
- Simplify the counting of various error conditions to maintain
the condition
ntotal = ntbs + nquarantines + ndups + nerrs
- Avoid pushd/popd when fixing up the prefix in the -001 shim.
- Annotate the error messages that go into the status file and the error log
with "Quarantined", "Duplicate", or "Error" tags, as the case might be.
- Fix bug in -002: check $qdir for existence, not $quarantine.
1 parent 55a92a0 commit 2998460
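The counting condition mentioned in the list above can be checked with a few lines of shell. The function name `check_counts` is hypothetical; the counter names are the ones used in the condition itself.

```shell
#!/bin/bash
# Hypothetical helper: verify that every tarball seen by a shim was
# counted in exactly one bucket.

# check_counts NTOTAL NTBS NQUARANTINES NDUPS NERRS
check_counts() {
    local ntotal=$1 ntbs=$2 nquarantines=$3 ndups=$4 nerrs=$5
    [ "$ntotal" -eq $((ntbs + nquarantines + ndups + nerrs)) ]
}
```

For example, `check_counts 10 6 2 1 1` succeeds, while `check_counts 10 6 2 1 0` fails because one tarball is unaccounted for.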
File tree (14 files changed, +769 -226 lines changed):
- agent/util-scripts
- server/pbench/bin
- state/config