The Job API
jashkenas edited this page Sep 13, 2010 · 7 revisions
To create a job on a CloudCrowd cluster and start it processing, POST a JSON representation of the job to /jobs. The representation may include the following fields:
| action | The name of the action you’d like to run (for example, word_count or process_pdfs). |
| inputs | An array of the inputs to the job, each of which will be processed in parallel. Inputs are often URLs, but can be any valid JSON. |
| options (optional) | An arbitrary JSON object that will be passed through directly to the action. Used to configure specific actions. |
| callback_url (optional) | A URL that will be pinged with the JSON representation of the job and all of its results, as soon as the job is finished. If you don’t specify a callback_url, you’ll need to poll the job’s status to determine when it finishes. |
Here’s an example of a hypothetical job creation request:
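# Requires the rest-client gem; json ships with Ruby.
require 'rest_client'
require 'json'

# The job is serialized to JSON and sent as the "job" parameter.
RestClient.post('http://localhost:9173/jobs',
  {:job => {
    'action' => 'structural_analysis',
    'inputs' => [
      'http://www.gutenberg.org/a_midsummers_nights_dream.txt',
      'http://www.gutenberg.org/romeo_and_juliet.txt',
      'http://www.gutenberg.org/titus_andronicus.txt'
    ],
    # Passed through untouched to the action.
    'options' => {
      'limit' => 20,
      'variance' => 0.75
    }
  }.to_json}
)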
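The same request can be put together without the rest-client gem. What follows is a minimal sketch using only Ruby's standard library, not part of CloudCrowd itself. The helper names `build_job` and `create_job` and the callback URL are hypothetical, and the sketch assumes the server accepts the JSON as a form-encoded `job` parameter, mirroring the RestClient call above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: assembles the job hash described in the table above.
# `action` and `inputs` are required; `options` and `callback_url` are optional.
def build_job(action, inputs, options: nil, callback_url: nil)
  raise ArgumentError, 'action is required' if action.to_s.empty?
  raise ArgumentError, 'inputs must be an array' unless inputs.is_a?(Array)

  job = { 'action' => action, 'inputs' => inputs }
  job['options'] = options if options
  job['callback_url'] = callback_url if callback_url
  job
end

# Hypothetical helper: posts the job as a form-encoded `job` parameter
# (an assumption based on the RestClient example) and parses the JSON reply.
def create_job(server, job)
  response = Net::HTTP.post_form(URI.join(server, '/jobs'), 'job' => job.to_json)
  JSON.parse(response.body)
end

job = build_job(
  'word_count',
  ['http://www.gutenberg.org/romeo_and_juliet.txt'],
  callback_url: 'http://example.com/job_complete' # assumed endpoint in your app
)
# create_job('http://localhost:9173', job) would submit it to a running server.
```

Keeping the payload construction separate from the HTTP call makes it easy to check what will be sent before a central server is available.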
When you first create a job, ask the central server for the status of a job, or get pinged at the callback_url upon job completion, you’ll receive a JSON representation of the job. Its attributes are:
| id | The job’s integer id. Use this when requesting the status of a job at /jobs/:job_id |
| status | The job’s current status. One of: succeeded, failed, processing, splitting, merging |
| outputs | If the job is complete (either succeeded or failed), outputs will be an array of all the job’s results. Often these are URLs where the finished output can be downloaded, for import back into your application. |
| percent_complete | The percentage of the job’s work units that have already been completed. An integer between 0 and 100. This number may occasionally move backwards if your action uses split, because splitting increases the number of remaining work units. |
| work_units | The total number of work units that make up the job. This number may change over time, due to split or merge. Useful for comparing the relative size of different jobs. |
| time_taken | The number of seconds that this job has been running for, or, if complete, the number of seconds it took to process from start to finish. |
| color | A unique hexadecimal color code, which can be used directly in HTML to distinguish the job visually. |
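Without a callback_url, the status attribute is what tells you when to stop polling. The loop below is a rough sketch of that idea using only the standard library. The helper names, the five-second interval, and the injectable `fetcher` argument are assumptions made for illustration; only the /jobs/:job_id path and the status values come from the tables above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# A job is complete once its status is succeeded or failed.
def finished?(job)
  %w[succeeded failed].include?(job['status'])
end

# Fetches the current JSON representation of a job from /jobs/:job_id.
def fetch_job(server, job_id)
  JSON.parse(Net::HTTP.get(URI.join(server, "/jobs/#{job_id}")))
end

# Polls until the job is complete and returns its final representation.
# `fetcher` can be swapped out so the loop can be exercised without a server;
# `interval` is the number of seconds to wait between requests.
def wait_for_job(server, job_id, interval: 5, fetcher: method(:fetch_job))
  loop do
    job = fetcher.call(server, job_id)
    return job if finished?(job)

    sleep interval
  end
end

# Example (needs a running central server):
#   job = wait_for_job('http://localhost:9173', 11)
#   puts job['outputs'] if job['status'] == 'succeeded'
```

Because percent_complete can move backwards after a split, the sketch relies on status rather than waiting for percent_complete to reach 100.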
Here’s an example of the final JSON representation of a completed job:
{
  "id" : 11,
  "status" : "succeeded",
  "time_taken" : 62.3368,
  "percent_complete" : 100,
  "work_units" : 0,
  "color" : "2652dc",
  "outputs" : ["http://s3.amazonaws.com/process_pdfs/job_11/unit_94/pdfs.tar"]
}
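Once you have this representation, whether from polling or from a callback, it is ordinary JSON. As an illustrative sketch, the snippet below parses the example above and prints a short summary. The `summarize` helper is hypothetical and assumes a completed job, so that outputs is an array.

```ruby
require 'json'

# The completed-job JSON from the example above.
body = <<~JOB
  {
    "id" : 11,
    "status" : "succeeded",
    "time_taken" : 62.3368,
    "percent_complete" : 100,
    "work_units" : 0,
    "color" : "2652dc",
    "outputs" : ["http://s3.amazonaws.com/process_pdfs/job_11/unit_94/pdfs.tar"]
  }
JOB

job = JSON.parse(body)

# Hypothetical helper: a one-line summary of a completed job.
def summarize(job)
  "Job #{job['id']} #{job['status']} in #{job['time_taken'].round(1)}s " \
    "with #{job['outputs'].length} output(s)"
end

puts summarize(job)
# prints: Job 11 succeeded in 62.3s with 1 output(s)

# Each output is often a URL where the finished result can be downloaded.
job['outputs'].each { |url| puts "  #{url}" }
```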