Job Retry
When creating jobs you can set the `retryMax` and `retryDelay` values. Together they determine what happens when a job fails, either because a worker node stops responding or because the job process itself fails.
Every job has a property called `dateRetry` which is used to determine if the job is ready for processing after a failure has occurred. The `dateRetry` value is set when the job is retrieved from the database for processing. The retrieval query will not return jobs where the `dateRetry` value is later than the current date and time.
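The `dateRetry` check can be pictured as a simple predicate. The sketch below is only an in-memory illustration, not the actual retrieval query: the function name `isReadyForRetrieval` and the sample job objects are hypothetical.

```javascript
// Hypothetical in-memory stand-in for the dateRetry check in the retrieval query.
function isReadyForRetrieval(job, now = new Date()) {
  // A job is skipped while its dateRetry is still in the future.
  return job.dateRetry.getTime() <= now.getTime();
}

const now = new Date('2024-01-01T00:10:00Z');
const jobs = [
  { id: 'a', dateRetry: new Date('2024-01-01T00:05:00Z') }, // already due
  { id: 'b', dateRetry: new Date('2024-01-01T00:15:00Z') }, // still waiting
];

const ready = jobs.filter((job) => isReadyForRetrieval(job, now));
console.log(ready.map((job) => job.id)); // [ 'a' ]
```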
Currently the formula used to set the dateRetry value during the job retrieval process is:
`now() + job.timeout + ( job.retryDelay * job.retryCount )`

The plan is to move this to an exponential formula once RethinkDB has a power function.
As the formula shows, to disable the extra retry delay and make jobs retry as soon as possible, simply set the `retryDelay` to zero. Only the `timeout` term then remains.
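As a rough illustration of the formula, here is a minimal sketch in plain JavaScript. The function name `computeDateRetry` is hypothetical, and the code assumes `timeout` and `retryDelay` are expressed in seconds, as in the example on this page. It is not the library's own implementation.

```javascript
// Sketch of: now() + job.timeout + (job.retryDelay * job.retryCount)
// Assumption: timeout and retryDelay are in seconds.
function computeDateRetry(job, now = new Date()) {
  const delaySeconds = job.timeout + job.retryDelay * job.retryCount;
  return new Date(now.getTime() + delaySeconds * 1000);
}

const start = new Date('2024-01-01T00:00:00Z');

// One previous failure with the default values: 300 + 600 * 1 = 900 seconds.
const withDelay = computeDateRetry(
  { timeout: 300, retryDelay: 600, retryCount: 1 },
  start
);
console.log(withDelay.toISOString()); // 2024-01-01T00:15:00.000Z

// retryDelay = 0: only the timeout term remains, whatever the retryCount.
const noDelay = computeDateRetry(
  { timeout: 300, retryDelay: 0, retryCount: 2 },
  start
);
console.log(noDelay.toISOString()); // 2024-01-01T00:05:00.000Z
```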
With the default values of `retryMax` (3) and `retryDelay` (600 seconds, or 10 minutes), the following sequence of events will occur:
- Job has never been processed and has default properties.
  - `timeout = 300`
  - `retryCount = 0`
  - `retryMax = 3`
  - `retryDelay = 600`
- Job is retrieved from database setting the `dateRetry` value.
  - `dateRetry = now() + timeout`
- Job fails for some reason, worker node is still functioning.
  - `retryCount = 1`
  - `status = 'retry'`
- Job is available to be retrieved from the database after 300 seconds.
- Job is retrieved from database setting the `dateRetry` value.
  - `dateRetry = now() + timeout + (retryDelay * retryCount)`
- Job fails for some reason.
  - `retryCount = 2`
- Job is available to be retrieved from the database after 900 seconds.
- Job is retrieved from database setting the `dateRetry` value.
  - `dateRetry = now() + timeout + (retryDelay * retryCount)`
- Job fails for some reason.
  - `retryCount = 3`
- Job is available to be retrieved from the database after 1500 seconds.
- Job is retrieved from database setting the `dateRetry` value.
  - This is redundant because the job has hard failed, however it is easier to set the value than add branching logic for this one case.
- Job has hard failed and will no longer be retrieved from the database.
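
The waiting times in the sequence above can be reproduced with a small simulation. This is a sketch under the assumption that every attempt fails and that all values are in seconds. The function `simulateRetries` and its return shape are hypothetical and are not part of the queue's API.

```javascript
// Hypothetical walk-through of the retry sequence; every attempt is assumed to fail.
function simulateRetries({ timeout, retryMax, retryDelay }) {
  const retryWaits = []; // seconds until the job is available again after each retrieval
  let retryCount = 0;
  let attempts = 0;

  for (;;) {
    attempts += 1;
    // The value is computed on every retrieval, including the final redundant one.
    const wait = timeout + retryDelay * retryCount;
    if (retryCount >= retryMax) break; // this failure is the hard failure
    retryWaits.push(wait);
    retryCount += 1;
  }

  return { attempts, retryWaits };
}

const result = simulateRetries({ timeout: 300, retryMax: 3, retryDelay: 600 });
console.log(result.retryWaits); // [ 300, 900, 1500 ]
console.log(result.attempts); // 4
```

With the defaults, the job is retrieved four times in total: the initial attempt plus three retries, after which it has hard failed.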