Rely on Sling Jobs Persisted Logs for asynchronous installation logging
Implement a polling mechanism for logs connected to the Sling Job ID.
That should work reliably even if the Job is processed on another
server.
Leverage ids to deduplicate server-side events
This closes #815
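The commit message mentions using ids to deduplicate events. A minimal sketch of that idea follows, assuming each log message carries an optional, strictly increasing integer id (the diff passes an `Optional` offset to the log consumer). The class `DeduplicatingLogConsumer` and its methods are hypothetical and not part of this change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.BiConsumer;

/** Hypothetical helper, not part of the change: drops log messages whose id was already seen. */
class DeduplicatingLogConsumer implements BiConsumer<Optional<Integer>, String> {

    private final List<String> delivered = new ArrayList<>();
    // Assumption: ids are positive and strictly increasing, so 0 means "nothing seen yet".
    private int lastSeenId = 0;

    @Override
    public void accept(Optional<Integer> id, String message) {
        if (id.isPresent()) {
            if (id.get() <= lastSeenId) {
                return; // duplicate, for example re-sent after a reconnect
            }
            lastSeenId = id.get();
        }
        // Messages without an id cannot be deduplicated and are always delivered.
        delivered.add(message);
    }

    List<String> delivered() {
        return delivered;
    }

    int lastSeenId() {
        return lastSeenId;
    }
}
```

A client that reconnects could pass `lastSeenId()` back as the offset so the server resumes from that point instead of replaying the whole log.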
  /** Attaches the log listener callback to an installation triggered previously via {@link #applyAsynchronously(InstallationOptions)}.
   *
   * @param jobId the job id returned by {@link #applyAsynchronously(InstallationOptions)}
-  * @param listener the listener to attach, receives the level and the message per each log line
+  * @param listener the listener to attach, receives the level and the message per each log message
   * @param finishListener the listener to attach, receives a boolean status indicating success or failure once the installation was finished
   * @return {@code true} if the listeners were attached successfully (i.e. an installation with the given executionId was triggered before and is still ongoing), {@code false} otherwise
   * @since 3.6.0
   * @see #applyAsynchronously(InstallationOptions)
+  * @deprecated use {@link #pollLog(String, int, BiConsumer, Duration, Duration)} instead, this one is no longer functional.
   * Polls the log of an asynchronous installation job with the given ID until the job finishes or the timeout is reached.
+  * This method is blocking. It may be called multiple times in case a previous execution returned {@code false} due to a timeout or temporary error.
+  * In that case make sure to set the {@code offset} parameter to the last known offset of the log messages.
+  * @param jobId the job id returned by {@link #applyAsynchronously(InstallationOptions)}
+  * @param offset the offset of the last log message received, or {@code 0} to start from the beginning
+  * @param logConsumer is called for each log message (may contain new lines), the first parameter is the optional offset of the log message (for deduplication), the second parameter is the log message itself
+  * @param timeOut the maximum time to wait for the job to finish, if the job does not finish within this time, the method returns {@code false}
+  * @param pollInterval the interval between polls of the log, if the job is still running, the method will wait for this duration before polling again
+  * @return {@code true} if the logs were either polled successfully or the given job id is no longer running or invalid, {@code false} if the job ran into a timeout or a temporary error
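The retry contract described in this Javadoc (call again with the last known offset after a `false` return) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `LogPoller` is a hypothetical interface that only mirrors the documented `pollLog` parameter list, `fakePoller` is a stand-in that simulates timeouts, the consumer's offset is assumed to be an `Optional<Integer>` equal to the number of messages delivered so far, and the 30 s / 500 ms durations are arbitrary.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

class PollLogRetryExample {

    /** Hypothetical stand-in mirroring the documented pollLog parameter list. */
    interface LogPoller {
        boolean pollLog(String jobId, int offset, BiConsumer<Optional<Integer>, String> logConsumer,
                Duration timeOut, Duration pollInterval);
    }

    /** Calls pollLog until it returns true or maxAttempts is used up, resuming from the last known offset. */
    static boolean pollUntilDone(LogPoller poller, String jobId, List<String> sink, int maxAttempts) {
        AtomicInteger lastOffset = new AtomicInteger(0);
        BiConsumer<Optional<Integer>, String> consumer = (offset, message) -> {
            offset.ifPresent(lastOffset::set); // remember where to resume
            sink.add(message);
        };
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (poller.pollLog(jobId, lastOffset.get(), consumer,
                    Duration.ofSeconds(30), Duration.ofMillis(500))) {
                return true;
            }
        }
        return false;
    }

    /** Fake poller for illustration: delivers at most two messages per call, then reports a timeout. */
    static LogPoller fakePoller(List<String> allMessages) {
        return (jobId, offset, logConsumer, timeOut, pollInterval) -> {
            int end = Math.min(offset + 2, allMessages.size());
            for (int i = offset; i < end; i++) {
                logConsumer.accept(Optional.of(i + 1), allMessages.get(i));
            }
            return end == allMessages.size();
        };
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        boolean done = pollUntilDone(
                fakePoller(List.of("start", "installing", "done")), "job-1", received, 5);
        System.out.println(done + " " + received); // prints: true [start, installing, done]
    }
}
```

Because the offset is carried across attempts, a second call does not replay messages that the first call already delivered.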
   * Retrieving the job relies on a query under the hood but the query index is updated asynchronously
+  * so we need to poll for the job until it is found or the timeout is reached.
+  * AEM uses an <a href="https://jackrabbit.apache.org/oak/docs/query/indexing.html#nrt-indexing">{@code npr} index</a> which should be updated after some seconds.
+  * Polling with an interval of 400ms for a maximum of 3 seconds should be sufficient.
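The poll-until-found approach described in this comment can be sketched generically. This is an illustrative variant, not the method from the change: `PollUntilFound.poll` is a hypothetical helper, it measures the deadline with `System.nanoTime()` rather than `System.currentTimeMillis()`, and it always performs at least one lookup. The 400 ms interval and 3 s maximum are taken from the comment above.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

class PollUntilFound {

    /** Calls lookup until it yields a value or the timeout elapses; returns empty on timeout or interrupt. */
    static <T> Optional<T> poll(Supplier<Optional<T>> lookup, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // timed out
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lookup: the "index" only returns the job on the third query.
        Supplier<Optional<String>> lookup =
                () -> ++calls[0] < 3 ? Optional.empty() : Optional.of("job");
        Optional<String> job = poll(lookup, Duration.ofSeconds(3), Duration.ofMillis(400));
        System.out.println(job + " after " + calls[0] + " lookups"); // prints: Optional[job] after 3 lookups
    }
}
```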
  // no async install log, most probably job is running on another instance
- messageListener.accept(InstallationLogLevel.INFO, "No async install log available, but job found with id " + jobId + ". Job running on another instance with id \"" + job.getTargetInstance() + "\".");
- messageListener.accept(InstallationLogLevel.INFO, "Please check the logs of the other instance for details.");
      while (System.currentTimeMillis() < startTime + timeOut.toMillis()) {
+         Job job = getJob(jobId).orElse(null);
+         if (job == null) {
+             if (offset == 0) {
+                 logConsumer.accept(Optional.empty(), "Job " + jobId + " not found anymore, it is completed already and history is no longer available, check the persisted log");
+             }
+             LOG.debug("Job {} not found, it is probably completed already and history is no longer available, check the persisted log", jobId);
+             return true;
+         }
+         offset = pollLog(job, offset, logConsumer);
+         if (job.getJobState() != Job.JobState.ACTIVE && job.getJobState() != Job.JobState.QUEUED) {
+             // consume remaining log lines
+             offset = pollLog(job, offset, logConsumer);
+             if (offset == 0) {
+                 logConsumer.accept(Optional.empty(), "Job " + jobId + " is not active, current state: " + job.getJobState());
+             }
+             LOG.debug("Job {} is not active or queued, current state: {}", jobId, job.getJobState());
+             return true;
+         }
+         try {
+             Thread.sleep(pollInterval.toMillis());
+         } catch (InterruptedException e) {
+             Thread.currentThread().interrupt();
+             logConsumer.accept(Optional.empty(), "Polling for job's " + jobId + " log was interrupted");
+             return false;
+         }
      }
-     return true;
+     LOG.debug("Polling for job's {} log timed out after {} seconds", jobId, timeOut.getSeconds());
+     return false;
+ }
+
+ /** As retrieving the job relies on a query under the hood but the query index is updated asynchronously
+  * we need to poll for the job until it is found or the timeout is reached.
+  * @param jobId
+  * @return the job if it is found within the timeout, otherwise an empty Optional
+  */
+ private Optional<Job> getJob(String jobId) {
+     // now poll for the job, as it might be that the search index has not been updated yet to contain the job in its new location
+     long startTime = System.currentTimeMillis();
+     while (System.currentTimeMillis() < startTime + GET_JOB_TIMEOUT_MS) {
+         Job job = jobManager.getJobById(jobId);
+         if (job != null) {
+             return Optional.of(job);
+         } else {
+             try {
+                 Thread.sleep(GET_JOB_POLLING_INTERVAL_MS);
+             } catch (InterruptedException e) {
+                 Thread.currentThread().interrupt();
+                 LOG.warn("Polling for job's {} log was interrupted", jobId, e);