Commit f14d192

Merge branch 'master' of github.com:alien4cloud/alien4cloud.github.io
2 parents 1a7e398 + dd8eb34 commit f14d192

11 files changed

+6348
-4589
lines changed

Gemfile.lock

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ GEM
7575
gemoji (~> 2.0)
7676
html-pipeline (~> 1.9)
7777
jekyll (~> 2.0)
78-
json (1.8.1)
78+
json (1.8.3)
7979
kramdown (1.3.1)
8080
liquid (2.6.1)
8181
listen (2.7.11)

documentation/2.1.0/orchestrators/yorc/index.html

Lines changed: 0 additions & 3 deletions
@@ -37,9 +37,6 @@ <h2 id="overview">Overview</h2>
3737
<li>Applications and Components are written in TOSCA (Topology and Orchestration Specification for Cloud Applications), an OASIS consortium standard language to describe a topology of cloud based web services, their components and relationships, portable across infrastructures.</li>
3838
</ul>
3939

40-
<h2 id="quickstart">Quickstart</h2>
41-
<p>To get started, you can start running a Yorc docker container, install Alien4Cloud, upload sample application from the <a href="https://github.com/ystia/forge/tree/v2.0.0/org/ystia">Forge</a> and deploy this application on one of the supported types of infrastructures as described in the following sections.</p>
42-
4340
<a class="btn btn-primary pull-right" href="http://prose.io/#alien4cloud/alien4cloud.github.io/edit/sources/documentation/2.1.0/orchestrators/yorc/index.markdown"><i class="fa fa-pencil-square-o"></i> Edit (pull request)</a>
4441
</div>
4542
</div>
Lines changed: 186 additions & 0 deletions
@@ -0,0 +1,186 @@
1+
2+
<script type="text/javascript">
3+
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
4+
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
5+
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
6+
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
7+
8+
ga('create', 'UA-73216650-1', 'auto');
9+
ga('set', {
10+
page: '/documentation/2.1.0/orchestrators/yorc/jobs.html',
11+
title: 'Working with jobs'
12+
});
13+
ga('send', 'pageview');
14+
15+
</script>
16+
17+
<div class="container-fluid">
18+
<div class="row">
19+
20+
<div class="col-sm-4 col-md-3">
21+
<div id="sidebar_menu" class="tree" role="complementary"></div>
22+
</div>
23+
<div id="content" class="col-sm-8 col-md-9">
24+
25+
<div style="height: 50px;">
26+
<h1 class="pull-left" style="margin-top: 0px;">Working with jobs</h1>
27+
<a class="btn btn-primary pull-right" href="http://prose.io/#alien4cloud/alien4cloud.github.io/edit/sources/documentation/2.1.0/orchestrators/yorc/jobs.md"><i class="fa fa-pencil-square-o"></i> Edit (pull request)</a>
28+
</div>
29+
<h2 id="whats-a-job">What’s a Job?</h2>
30+
31+
<p>Unlike a service, which is a long-running application, a Job is an application that runs to completion.</p>
32+
33+
<p>The TOSCA life-cycle (install -&gt; configure -&gt; start ... then finally stop -&gt; delete) was designed to handle services. Normative TOSCA has no concept of a Job life-cycle.
34+
However, based on our experience in HPC and with the emerging container scheduling found in Container as a Service solutions like Kubernetes, we are convinced that supporting Job scheduling is fundamental for any orchestration solution.</p>
35+
36+
<p>So we decided, in collaboration with the Alien4Cloud team, to extend TOSCA to support Jobs!</p>
37+
38+
<h2 id="extending-tosca-to-support-jobs">Extending TOSCA to support Jobs</h2>
39+
40+
<p>The first step was the life-cycle, which is the core concept in TOSCA. Based on our experience, we defined a life-cycle for Jobs.</p>
41+
42+
<p><img src="../../../../images/2.1.0/yorc/JobsRunLifeCycle.png" alt="Jobs Life Cycle" /></p>
43+
44+
<p>In TOSCA terms, we defined a new interface, <strong>tosca.interfaces.node.lifecycle.Runnable</strong>, which defines three operations:</p>
45+
46+
<ul>
47+
<li><strong>submit</strong>: Submit is the operation that <em>submits</em> a job to a Job Scheduler. At the end of the <strong>submit</strong> operation we generally get a <strong>job identifier</strong>.</li>
48+
<li><strong>run</strong>: Run is an asynchronous operation that will be called periodically to check the <strong>job status</strong>.</li>
49+
<li><strong>cancel</strong>: Cancel <em>cancels</em> a <strong>submitted job</strong>.</li>
50+
</ul>
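<p>To make the three operations concrete, here is a minimal Python sketch of the same contract. It is only an illustration, not Yorc code: the <strong>Runnable</strong> class name mirrors the TOSCA interface, while <strong>FakeJob</strong>, its <strong>checks_before_done</strong> parameter and the <strong>job-1</strong> identifier are hypothetical names introduced for this example.</p>

```python
from abc import ABC, abstractmethod


class Runnable(ABC):
    """Hypothetical Python mirror of the three Runnable operations."""

    @abstractmethod
    def submit(self) -> str:
        """Submit the job to a scheduler and return its job identifier."""

    @abstractmethod
    def run(self, job_id: str) -> str:
        """Check the job status once, without blocking, and return it."""

    @abstractmethod
    def cancel(self, job_id: str) -> None:
        """Cancel a previously submitted job."""


class FakeJob(Runnable):
    """Toy in-memory job (an assumption for illustration only).

    It reports RUNNING for a fixed number of checks, then COMPLETED.
    """

    def __init__(self, checks_before_done: int = 2) -> None:
        self.remaining = checks_before_done
        self.cancelled = False

    def submit(self) -> str:
        # A real implementation would call a scheduler here.
        return "job-1"

    def run(self, job_id: str) -> str:
        if self.remaining > 0:
            self.remaining -= 1
            return "RUNNING"
        return "COMPLETED"

    def cancel(self, job_id: str) -> None:
        # Only records the request; a real scheduler call would go here.
        self.cancelled = True
```

<p>The point of the sketch is the shape of the contract: <strong>submit</strong> yields an identifier, and that identifier is what <strong>run</strong> and <strong>cancel</strong> receive afterwards.</p>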
51+
52+
<h2 id="supported-jobs-schedulers">Supported Jobs Schedulers</h2>
53+
54+
<h3 id="slurm">Slurm</h3>
55+
56+
<p>Slurm is an HPC scheduler. Unsurprisingly, it was the first scheduler for which we provided built-in Job scheduling support. Our Slurm support makes it possible to run single jobs made of several jobs. Moreover, Yorc supports the execution of jobs as Singularity jobs. Several TOSCA types are available for each of these use cases.</p>
57+
58+
<p>Let’s see how to define a TOSCA component that runs a Slurm job.</p>
59+
60+
<p>You have to define a node type derived from the <strong>yorc.nodes.slurm.Job</strong> type. Several node properties are available to configure your Slurm job component. For example:</p>
61+
62+
<ul>
63+
<li>The <strong>credentials</strong> property can be used to provide user credentials for Slurm (used to connect to the Slurm client node).</li>
64+
<li>The <strong>name</strong> property can be used to provide a job name.</li>
65+
<li>The <strong>account</strong> property can be used to charge the resources used by this job to the specified account.</li>
66+
</ul>
67+
68+
<p>The complete list, with detailed descriptions, can be found in the Alien4Cloud catalog: after creating a Slurm location for your Yorc orchestrator, search for the <strong>Job</strong> component of type <strong>yorc.nodes.slurm.Job</strong>.</p>
69+
70+
<p>The TOSCA component must provide an implementation for the <strong>tosca.interfaces.node.lifecycle.Runnable</strong> interface.</p>
71+
72+
<p>Here is an example of a job component, in which the <strong>submit</strong> operation definition provides the submission script <strong>submit.sh</strong>.</p>
73+
74+
<div class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="l-Scalar-Plain">node_types</span><span class="p-Indicator">:</span>
  <span class="l-Scalar-Plain">org.ystia.yorc.samples.job.simple.Component</span><span class="p-Indicator">:</span>
    <span class="l-Scalar-Plain">derived_from</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">yorc.nodes.slurm.Job</span>
    <span class="l-Scalar-Plain">tags</span><span class="p-Indicator">:</span>
      <span class="l-Scalar-Plain">icon</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">/images/slurm.png</span>
    <span class="l-Scalar-Plain">artifacts</span><span class="p-Indicator">:</span>
      <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">bin</span><span class="p-Indicator">:</span>
          <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">tosca.artifacts.File</span>
          <span class="l-Scalar-Plain">file</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">bin</span>
    <span class="l-Scalar-Plain">interfaces</span><span class="p-Indicator">:</span>
      <span class="l-Scalar-Plain">tosca.interfaces.node.lifecycle.Runnable</span><span class="p-Indicator">:</span>
        <span class="l-Scalar-Plain">submit</span><span class="p-Indicator">:</span>
          <span class="l-Scalar-Plain">implementation</span><span class="p-Indicator">:</span>
            <span class="l-Scalar-Plain">file</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">bin/submit.sh</span>
            <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">yorc.artifacts.Deployment.SlurmJobBatch</span></code></pre></div>
88+
89+
<p>To run a Singularity job, provide the Docker image to be run by Singularity in the component definition.</p>
90+
91+
<div class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="l-Scalar-Plain">repositories</span><span class="p-Indicator">:</span>
  <span class="l-Scalar-Plain">docker</span><span class="p-Indicator">:</span>
    <span class="l-Scalar-Plain">url</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">https://hpda-docker-registry:5000/</span>
    <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">a4c_ignore</span>

<span class="l-Scalar-Plain">node_types</span><span class="p-Indicator">:</span>
  <span class="l-Scalar-Plain">org.ystia.yorc.samples.job.singularity.Component</span><span class="p-Indicator">:</span>
    <span class="l-Scalar-Plain">derived_from</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">yorc.nodes.slurm.SingularityJob</span>
    <span class="l-Scalar-Plain">description</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">Sample component to show how to run a job via singularity run</span>
    <span class="l-Scalar-Plain">tags</span><span class="p-Indicator">:</span>
      <span class="l-Scalar-Plain">icon</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">/images/singularity.png</span>
    <span class="l-Scalar-Plain">interfaces</span><span class="p-Indicator">:</span>
      <span class="l-Scalar-Plain">tosca.interfaces.node.lifecycle.Runnable</span><span class="p-Indicator">:</span>
        <span class="l-Scalar-Plain">submit</span><span class="p-Indicator">:</span>
          <span class="l-Scalar-Plain">implementation</span><span class="p-Indicator">:</span>
            <span class="l-Scalar-Plain">file</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">docker://godlovedc/lolcow:latest</span>
            <span class="l-Scalar-Plain">repository</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">docker</span>
            <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">yorc.artifacts.Deployment.SlurmJobImage</span></code></pre></div>
109+
110+
<h3 id="kubernetes">Kubernetes</h3>
111+
112+
<p>Over the years, Kubernetes has become the de facto standard for Containers as a Service (CaaS).</p>
113+
114+
<p>Kubernetes has a special builtin <em>Controller</em> for jobs called <em>Jobs - Run to Completion</em>.</p>
115+
116+
<h3 id="the-one-you-want">The one you want!</h3>
117+
118+
<p>Yorc also supports Jobs defined in pure TOSCA. This means that you can write your own interaction with any scheduler using YAML and Python, Shell or Ansible scripts.</p>
119+
120+
<p>All you need to do is provide an implementation for at least the <strong>submit</strong> operation of the job life-cycle. If you do not provide an implementation for the <strong>run</strong> operation, your job will run in <em>fire and forget</em> mode and you will not be able to get information about its completion. Similarly, if you do not provide an implementation for the <strong>cancel</strong> operation, your Job will simply not be cancellable.</p>
121+
122+
<p>To allow Yorc to manage your job properly, follow these conventions:</p>
123+
124+
<ul>
125+
<li>At the end of the <strong>submit</strong> operation, you should export a fact or environment variable named <strong>TOSCA_JOB_ID</strong> containing the <strong>submitted job identifier</strong>.</li>
126+
<li>Yorc automatically injects this <strong>TOSCA_JOB_ID</strong> as an input of the <strong>run</strong> and <strong>cancel</strong> operations.</li>
127+
<li>
128+
<p>The <strong>run</strong> operation should be designed to be <strong>non-blocking</strong> and <strong>called several times</strong>. Its primary role is to check the job status. It should export a fact or environment variable named <strong>TOSCA_JOB_STATUS</strong> containing one of the following values:</p>
129+
130+
<ul>
131+
<li><strong>COMPLETED</strong>: the job finished successfully.</li>
132+
<li><strong>FAILED</strong>: the job finished with an error.</li>
133+
<li><strong>RUNNING</strong>: the job is still running.</li>
134+
<li><strong>QUEUED</strong>: the job has been submitted but has not started yet.</li>
135+
</ul>
136+
137+
<p>Internally, Yorc handles the <strong>RUNNING</strong> and <strong>QUEUED</strong> statuses the same way: it calls the <strong>run</strong> operation again after a delay to refresh the status.</p>
138+
</li>
139+
<li>The <strong>run</strong> operation can also be used to retrieve logs or perform some cleanup after the job completion.</li>
140+
</ul>
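<p>The conventions above can be sketched as a small polling loop. This is a simplified model written for illustration, not Yorc's actual implementation: <strong>monitor</strong>, <strong>run_once</strong>, <strong>delay_s</strong> and <strong>max_calls</strong> are hypothetical names, and <strong>run_once</strong> stands in for one non-blocking call of the <strong>run</strong> operation that returns the value of <strong>TOSCA_JOB_STATUS</strong>.</p>

```python
import time
from typing import Callable

# Status values taken from the TOSCA_JOB_STATUS convention described above.
TERMINAL = {"COMPLETED", "FAILED"}
PENDING = {"RUNNING", "QUEUED"}


def monitor(run_once: Callable[[str], str], job_id: str,
            delay_s: float = 0.0, max_calls: int = 100) -> str:
    """Call run_once(job_id) repeatedly until a terminal status is reported.

    run_once models a single non-blocking `run` operation that receives the
    job identifier (TOSCA_JOB_ID) and returns the current status.
    """
    for _ in range(max_calls):
        status = run_once(job_id)
        if status in TERMINAL:
            return status
        if status not in PENDING:
            raise ValueError(f"unexpected TOSCA_JOB_STATUS: {status!r}")
        # RUNNING and QUEUED are handled identically: wait, then check again.
        time.sleep(delay_s)
    raise TimeoutError(f"job {job_id} still pending after {max_calls} calls")


if __name__ == "__main__":
    # Simulated scheduler answers for three successive `run` calls.
    answers = iter(["QUEUED", "RUNNING", "COMPLETED"])
    print(monitor(lambda job_id: next(answers), "42"))  # prints COMPLETED
```

<p>Because each call to <strong>run_once</strong> returns immediately, the loop itself decides when to check again, which is why the <strong>run</strong> operation has to be non-blocking and safe to call several times.</p>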
141+
142+
<p>You can find an example of a pure-TOSCA implementation of jobs in the official <em>CSARs public library</em>, which includes an implementation of a <a href="https://github.com/alien4cloud/csar-public-library/tree/develop/org/alien4cloud/spark/job-linux-sh">Spark Job</a>.</p>
143+
144+
<h2 id="specific-workflows-for-jobs">Specific workflows for Jobs</h2>
145+
146+
<p>When your application contains Jobs (meaning node templates that implement the <strong>tosca.interfaces.node.lifecycle.Runnable</strong> interface), Alien4Cloud automatically generates two workflows:</p>
147+
148+
<ul>
149+
<li><strong>run</strong>: a workflow that submits and monitors jobs</li>
150+
<li><strong>cancel</strong>: a workflow that cancels jobs</li>
151+
</ul>
152+
153+
<div class="note warning">
154+
<p>The cancel workflow is a temporary workaround. It cancels jobs but does not check whether a job has actually been submitted. The recommended way to cancel a <strong>run</strong> workflow is to cancel the associated task in Yorc using either the CLI or the REST API. We will soon provide a way to cancel workflows directly from Alien4Cloud.</p>
155+
</div>
156+
157+
<p>The <strong>run</strong> workflow orchestrates Jobs. For instance, if <strong>jobB</strong> depends on <strong>jobA</strong> through a TOSCA <strong>dependsOn</strong> or <strong>connectsTo</strong> relationship, Alien4Cloud generates a workflow that first submits <strong>jobA</strong> and waits for its completion before submitting <strong>jobB</strong>.</p>
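<p>The ordering described here amounts to a topological sort of the jobs over their dependencies. The sketch below illustrates the idea with Python's standard <strong>graphlib</strong> module (Python 3.9 or later); it is not how Alien4Cloud generates workflows, and the <strong>submission_order</strong> function and its input mapping are hypothetical names chosen for the example.</p>

```python
from graphlib import TopologicalSorter


def submission_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return job names so that each job comes after the jobs it depends on.

    depends_on maps a job name to the set of jobs that must complete first.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # jobB depends on jobA, so jobA must be submitted and completed first.
    print(submission_order({"jobB": {"jobA"}}))  # prints ['jobA', 'jobB']
```

<p>Jobs with no dependency between them have no imposed relative order in this model, so only the dependency constraints are guaranteed.</p>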
158+
159+
<h2 id="jobs-cancellation">Jobs cancellation</h2>
160+
161+
<p>The proper way to cancel Jobs that were submitted by a TOSCA workflow is to cancel the associated Yorc Task/Execution of this workflow. Yorc will then automatically call the <strong>cancel</strong> operation of every node that implements it and that has successfully executed its <strong>submit</strong> operation. Currently, these automatic cancellation steps do not appear in Alien4Cloud. We will soon work on making them visible.</p>
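<p>The selection rule for automatic cancellation can be written down as a simple filter. This is only a sketch of the rule as stated, under the assumption that each node's state can be summarised by two booleans; <strong>NodeState</strong> and <strong>nodes_to_cancel</strong> are hypothetical names, not part of Yorc.</p>

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical summary of a job node when its workflow is cancelled."""
    name: str
    implements_cancel: bool   # the node provides a `cancel` operation
    submit_succeeded: bool    # its `submit` operation completed successfully


def nodes_to_cancel(nodes: list[NodeState]) -> list[str]:
    """Names of nodes whose `cancel` operation should be called."""
    return [n.name for n in nodes
            if n.implements_cancel and n.submit_succeeded]


if __name__ == "__main__":
    states = [
        NodeState("jobA", implements_cancel=True, submit_succeeded=True),
        NodeState("jobB", implements_cancel=True, submit_succeeded=False),
        NodeState("jobC", implements_cancel=False, submit_succeeded=True),
    ]
    print(nodes_to_cancel(states))  # prints ['jobA']
```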
162+
163+
<a class="btn btn-primary pull-right" href="http://prose.io/#alien4cloud/alien4cloud.github.io/edit/sources/documentation/2.1.0/orchestrators/yorc/jobs.md"><i class="fa fa-pencil-square-o"></i> Edit (pull request)</a>
164+
</div>
165+
</div>
166+
</div>
167+
</div><!-- /container -->
168+
169+
<script>
170+
var hash = location.hash.replace( /^#/, '' );
171+
if(hash && hash!== null && hash.match(/html$/)) {
172+
} else {
173+
var newLocation = location.protocol+"//"+location.host+"#"+location.pathname;
174+
location.replace(newLocation);
175+
}
176+
</script>
177+
<script type="text/javascript" src="/js/post-layout.js"></script>
178+
<script>
179+
$(document).ready(function () {
180+
makeSideBar('DOCUMENTATION-2.1.0', 'documentation/2.1.0/orchestrators/yorc/jobs.md');
181+
});
182+
</script>
183+
184+
<script>
185+
$("div[data-gist]").gist();
186+
</script>
