Queue Master

Description

When creating a Queue object within rethinkdb-job-queue you can customize its operation with configuration options. One of these options is masterInterval. If this option is set to false, the Queue object will not be a Queue Master. If the masterInterval option is set to a positive integer, the Queue object will be a Queue Master. See the Queue Options document for more detail.

The value of masterInterval is a repeat time period in seconds. The default value is 310 seconds, or five minutes and ten seconds. This is ten seconds past the default job timeout value of 300 seconds. The extra 10 seconds helps detect failed jobs directly after queue startup. During long-term operation the extra 10 seconds makes no difference.
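The option rules above can be summarised in a short sketch. The helper name resolveMasterInterval and the shape of its return value are hypothetical and are not part of rethinkdb-job-queue; the sketch only models the behaviour described here, using seconds as stated in this document.

```javascript
// Hypothetical helper, not part of rethinkdb-job-queue.
// Models how the masterInterval option is interpreted, per the text above.
const DEFAULT_JOB_TIMEOUT = 300; // default job timeout, in seconds
const DEFAULT_MASTER_INTERVAL = DEFAULT_JOB_TIMEOUT + 10; // 310 seconds

function resolveMasterInterval (option) {
  // Option omitted: the default applies, so the Queue object is a master.
  if (option === undefined) {
    return { isMaster: true, interval: DEFAULT_MASTER_INTERVAL };
  }
  // Explicit false: the Queue object is not a Queue Master.
  if (option === false) {
    return { isMaster: false, interval: null };
  }
  // A positive integer enables the Queue Master with that review period.
  if (Number.isInteger(option) && option > 0) {
    return { isMaster: true, interval: option };
  }
  throw new TypeError('masterInterval must be false or a positive integer');
}
```

For example, resolveMasterInterval(false) reports that the queue is not a master, while resolveMasterInterval(60) models a master that reviews the queue every 60 seconds.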

When the time period elapses, the Queue Master will review the database Table backing the queue. This is called the Queue Review process.

It is worth noting that only one Queue Master can be enabled per Node.js process. If the Node.js process already has a Queue object that is a master, then creating more Queue Master objects will not enable multiple database reviews.

The Queue Master role is integral to rethinkdb-job-queue: it ensures that failed jobs get processed and that the database is cleaned. A Queue Master performs three tasks within the job queue during the Queue Review process:

Failed Node.js Process

Discover and retry jobs that have failed due to the Node.js process crashing or hanging.

Delayed Job Processing

Process failed jobs delayed for retry if the Queue object is idle.

Remove Finished Jobs

Remove completed, cancelled, or terminated jobs from the queue.

If you do not enable a Queue Master against a queue, these tasks will still be performed during Node.js process start as long as a handler function has been added to a Queue object. See the Queue.process document for more detail.

Failed Node.js Process

During normal queue operation, Queue objects processing jobs will detect when a job has taken too long and is operating past its timeout value. If this situation occurs the job status in the database is set to 'failed' and the job will be delayed based on the retryDelay, retryCount, and retryMax values. See Job Retry for more detail.

However, if a Node.js process fails for any reason whilst working on a job, the job will not complete and will remain in the database with an active status, leaving it orphaned.

To ensure the job is not forgotten, a Queue Master repeatedly reviews the queue's backing table at the period set by masterInterval. During each review it looks for jobs that have a status of active and are past their dateRetry value. The dateRetry value is set when the job is retrieved from the database for processing. For more detail on the dateRetry value see the Job Retry document.

The queue review process will update the job status based on the retryCount and retryMax values:

If the job's retryCount value is less than the retryMax value then the job status will be set to 'failed' and the retryCount value will be incremented. This job will now be ready for processing.
If the job's retryCount value is equal to the retryMax value then the job status will be set to 'terminated' and the job is considered finished.
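The two rules above can be expressed as a small decision function. This is a minimal sketch that models the rules on a plain object; the function name reviewActiveJob is hypothetical, and the library's own review runs against the database rather than in memory.

```javascript
// Hypothetical model of the review decision; not the library's actual code.
// `job` is a plain object with status, dateRetry, retryCount and retryMax.
function reviewActiveJob (job, now) {
  // Only active jobs that are past their dateRetry value count as orphaned.
  if (job.status !== 'active' || job.dateRetry >= now) {
    return job;
  }
  if (job.retryCount < job.retryMax) {
    // Retries remain: mark the job as failed and count the attempt.
    return { ...job, status: 'failed', retryCount: job.retryCount + 1 };
  }
  // retryCount has reached retryMax: the job is considered finished.
  return { ...job, status: 'terminated' };
}
```

A job that is still within its dateRetry value, or that is not active, is returned unchanged.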

It is possible for the job being processed to extend past its initial timeout value and be marked as failed by the Queue Master review process. To prevent this, call the Job.progress method on the Job object. When progress for a job is updated, the dateRetry value and the timeout process also get updated. Therefore calling Job.progress periodically within the job timeout period will prevent the job from erroneously being marked as failed on review.
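To illustrate why a progress update protects a long-running job, the sketch below models dateRetry as "time of last update plus the job timeout". That formula, the field names, and the helper names recordProgress and isOverdue are assumptions made for illustration; the exact calculation inside rethinkdb-job-queue may differ.

```javascript
// Hypothetical model; the library's own dateRetry calculation may differ.
// Assumes job.timeout is in seconds and job.dateRetry is a Date.
function recordProgress (job, percent, now) {
  return {
    ...job,
    progress: percent,
    // Push the retry deadline forward from the moment of the update.
    dateRetry: new Date(now.getTime() + job.timeout * 1000)
  };
}

// True when a review at `now` would treat the job as orphaned.
function isOverdue (job, now) {
  return job.status === 'active' && job.dateRetry < now;
}
```

In this model, a job with a 300 second timeout that reports progress 200 seconds into its run has its deadline moved to 500 seconds, so a review at 310 seconds no longer treats it as overdue.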

Delayed Job Processing

In a busy queue the database will be queried often on completion of jobs to find more jobs to process. This includes finding jobs with a status of waiting, timeout, or retry.

If the last job in the queue fails and the retryDelay value is not 0, the job's status will be set to 'retry' and the queue will enter an idle state.

Without something initiating the queue to process jobs, the last job will remain in the database until more jobs are added to the queue.

To prevent this situation from delaying the last job well beyond its dateRetry value, the Queue Master database review process completes by calling the queue process task. The queue process task queries the database, discovers the delayed job, and retrieves it for processing.
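The idea of finishing the review by restarting job processing can be sketched with an in-memory model. Everything here is hypothetical: the queue is a plain object, the job list is an array rather than a RethinkDB table, every job is assumed to carry a dateRetry value, and the status names are the ones mentioned in this section.

```javascript
// Hypothetical in-memory model; rethinkdb-job-queue queries RethinkDB instead.
const READY_STATUSES = ['waiting', 'timeout', 'retry']; // statuses named above

// Find the first job that is allowed to run at `now`, or null if none is.
function findReadyJob (jobs, now) {
  const ready = jobs.find(
    j => READY_STATUSES.includes(j.status) && j.dateRetry <= now
  );
  return ready || null;
}

function runReview (queue, now) {
  // The other review tasks (orphaned job detection, clean-up) would run first.
  // Final step: if the queue is idle, look for a delayed job and start it.
  if (queue.idle) {
    const job = findReadyJob(queue.jobs, now);
    if (job) {
      job.status = 'active';
      queue.idle = false;
      return job;
    }
  }
  return null;
}
```

Called before the delayed job's dateRetry value, runReview leaves the queue idle; called afterwards, it marks the job active and returns it.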

Remove Finished Jobs

Once a job has finished processing and its status is changed to either completed, cancelled, or terminated, it will no longer be an active part of the queue. The job details in the database including its log entries and other properties are just taking up space.

If you are processing thousands of jobs a day this might not be a big deal, and you may be happy to leave the job details in the database for future reference. However, if you are processing millions of jobs a day, the space taken up by the completed jobs could add up over a year or more. In that case you will want to remove completed, cancelled, or terminated jobs from the database to free up space.

Fortunately rethinkdb-job-queue has three options for cleaning up jobs once they are finished. If you set the Queue.removeFinishedJobs property to true, jobs that are completed, cancelled, or terminated will be removed from the database immediately.

If you set the Queue.removeFinishedJobs property to false, jobs will never be removed from the database no matter what their status is.

You can also set the Queue.removeFinishedJobs property to a number representing days. The default is 180 days. If the property value is a number, then at some point after a job has been completed, cancelled, or terminated it will need to be removed from the database. This is the final task for a Queue Master.

When the Queue Master reviews the database and the Queue.removeFinishedJobs property is a number, any finished jobs that have moved past their expiry day are removed.
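The three settings for removeFinishedJobs can be summarised as one predicate. This is a sketch under stated assumptions: the function name shouldRemove and the dateFinished field are hypothetical, and the expiry is modelled as the finish date plus the configured number of days.

```javascript
// Hypothetical predicate; not part of rethinkdb-job-queue.
const FINISHED_STATUSES = ['completed', 'cancelled', 'terminated'];
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// removeFinishedJobs is true, false, or a number of days.
function shouldRemove (job, removeFinishedJobs, now) {
  // Jobs that are still part of the queue are never removed.
  if (!FINISHED_STATUSES.includes(job.status)) return false;
  if (removeFinishedJobs === true) return true; // remove immediately
  if (removeFinishedJobs === false) return false; // keep forever
  // Number of days: remove once the expiry day has passed.
  const expiry = job.dateFinished.getTime() + removeFinishedJobs * MS_PER_DAY;
  return now.getTime() > expiry;
}
```

With the default of 180 days, a job that finished 181 days ago would be removed at the next review, while one that finished 179 days ago would be kept.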
