articles/cyclecloud/common-issues/node-cyclecloud-connectivity.md (10 additions & 10 deletions)
@@ -7,11 +7,11 @@ ms.author: adjohnso
---
# Common Issues: Node to CycleCloud Connectivity
- Cyclecloud installs an agent on each VM that needs to be able to communicate back to the CycleCloud application in order to report status, monitoring, as well as to make API requests for auto-scaling and distributed synchronization.
+ CycleCloud installs an agent on each virtual machine that needs to communicate with the CycleCloud application. The agent reports status and monitoring data and makes API requests for autoscaling and distributed synchronization.
- It is recommended that the application server be deployed in the same VNET (virtual network) as the cluster. Where this is not feasible, connectivity may be established by doing [VNET peering](../network-connectivity.md#vnet-peering) or using a [proxy node](../network-connectivity.md#proxy-node). These error messages indicate that nodes are unable to communicate back to the CycleCloud application server.
+ We recommend deploying the application server in the same virtual network as the cluster. If this configuration isn't feasible, establish connectivity by using [virtual network peering](../network-connectivity.md#virtual-network-peering) or a [proxy node](../network-connectivity.md#proxy-node). These error messages indicate that nodes can't communicate with the CycleCloud application server.
- ## Possible Error Messages
+ ## Possible error messages
-`Timeout awaiting system boot-up`
-`Timed out connecting to CycleCloud at {https://A.B.C.D}`
-`Connection refused to CycleCloud through return-proxy tunnel at {https://A.B.C.D:37140}`
@@ -21,17 +21,17 @@ It is recommended that the application server be deployed in the same VNET (virt
## Resolution
- - If the CycleCloud server and the cluster is in the same VNET, check the network security groups for the subnets in the VNET. Cluster nodes need to be able to reach the CycleCloud server at TCP 9443 and 5672. In the other direction, Azure CycleCloud needs to be able to reach ganglia (TCP 8652) and SSH (TCP 22) ports of the cluster for system and job monitoring.
+ - If the CycleCloud server and the cluster are in the same virtual network, check the network security groups for the subnets in the virtual network. Cluster nodes need to reach the CycleCloud server at TCP 9443 and 5672. In the other direction, Azure CycleCloud needs to reach the Ganglia (TCP 8652) and SSH (TCP 22) ports on the cluster for system and job monitoring.
- - You may need to add a public IP address.
+ - You might need to add a public IP address.
- If the error message indicates a return proxy, check the [return proxy settings](../how-to/return-proxy.md).
- - After updating network or proxy settings, you can test connectivity by SSHing into the node as the cyclecloud user and using `curl -k {https://error-message-url}`.
+ - After updating network or proxy settings, test connectivity by SSHing into the node as the `cyclecloud` user and using `curl -k {https://error-message-url}` (see the sketch after this list).
- - After validating that network connectivity is fixed, you will need to terminate and restart the node.
+ - After validating that network connectivity is fixed, terminate and restart the node.
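A minimal sketch of the port rule and the connectivity test from the bullets above, assuming hypothetical resource names and placeholder addresses (`my-rg`, `cyclecloud-nsg`, and the IPs aren't real values):

```bash
# Allow the agent ports (TCP 9443 and 5672) inbound to the CycleCloud server
# in the subnet's network security group.
az network nsg rule create --resource-group my-rg --nsg-name cyclecloud-nsg \
    --name AllowCycleCloudAgent --priority 200 --direction Inbound \
    --access Allow --protocol Tcp --destination-port-ranges 9443 5672

# From the node (signed in as the cyclecloud user), test the exact URL shown
# in the error message; -k skips TLS certificate validation.
ssh cyclecloud@10.0.1.5 'curl -k https://10.0.0.4:9443'
```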
- ## More Information
+ ## More information
- [Read more about network-connectivity here](../network-connectivity.md)
- [Read more about return proxy here](../how-to/return-proxy.md)
+ [Learn more about network connectivity](../network-connectivity.md).
+ [Learn more about return proxy](../how-to/return-proxy.md).
articles/cyclecloud/htcondor.md (12 additions & 12 deletions)
@@ -8,7 +8,7 @@ ms.author: adjohnso
# HTCondor
- [HTCondor](http://research.cs.wisc.edu/htcondor/manual/latest)can easily be enabled on a CycleCloud cluster by modifying the "run_list" in the configuration section of your cluster definition. There are three basic components of an HTCondor cluster. The first is the "central manager" which provides the scheduling and management daemons. The second component of an HTCondor cluster is one or more schedulers from which jobs are submitted into the system. The final component is one or more execute nodes which are the hosts perform the computation. A simple HTCondor template may look like:
+ You can enable [HTCondor](http://research.cs.wisc.edu/htcondor/manual/latest) on a CycleCloud cluster by modifying the `run_list` in the configuration section of your cluster definition. There are three basic components of an HTCondor cluster. The first is the **central manager**, which provides the scheduling and management daemons. The second component is one or more **schedulers**, from which jobs are submitted into the system. The final component is one or more **execute nodes**, which are the hosts that perform the computation. A simple HTCondor template might look like:
```ini
[cluster htcondor]
@@ -36,15 +36,15 @@ ms.author: adjohnso
run_list = role[usc_execute]
```
- Importing and starting a cluster with definition in CycleCloud will yield a "manager" and a "scheduler" node, as well as one "execute" node. Execute nodes can be added to the cluster via the `cyclecloud add_node` command. To add 10 more execute nodes:
+ When you import and start a cluster with this definition in CycleCloud, you get a **manager** node, a **scheduler** node, and one **execute** node. You can add **execute** nodes to the cluster by using the `cyclecloud add_node` command. To add 10 more **execute** nodes, use the following command:
```azurecli-interactive
cyclecloud add_node htcondor -t execute -c 10
```
## HTCondor Autoscaling
- CycleCloud supports autoscaling for HTCondor, which means that the software will monitor the status of your queue and turn on and off nodes as needed to complete the work in an optimal amount of time/cost. You can enable autoscaling for HTCondor by adding`Autoscale=true` to your cluster definition:
+ CycleCloud supports autoscaling for HTCondor. The software monitors the status of your queue and turns nodes on and off as needed to complete the work in an optimal amount of time and cost. To enable autoscaling for HTCondor, add `Autoscale=true` to your cluster definition:
```ini
[cluster htcondor]
@@ -53,11 +53,11 @@ Autoscale = True
## HTCondor Advanced Usage
- If you know the average runtime of jobs, you can define `average_runtime` (in minutes) in your job. CycleCloud will use that to start the minimum number of nodes (for example, five 10-minute jobs will only start a single node instead of five when `average_runtime` is set to 10).
+ If you know the average runtime of jobs, define `average_runtime` (in minutes) in your job. CycleCloud uses that value to start the minimum number of nodes. For example, if five 10-minute jobs are submitted and `average_runtime` is set to 10, CycleCloud starts only one node instead of five. A sketch of this appears below.
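As a sketch of the submit-file side (the executable name is a placeholder, and `+average_runtime` follows HTCondor's `+attribute` convention for custom job attributes):

```ini
# Five jobs that each run about ten minutes on average.
executable       = run_task.sh
+average_runtime = 10
queue 5
```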
## Autoscale Nodearray
- By default, HTCondor will request cores from the nodearray called 'execute'. If a job requires a different nodearray (for example if certain jobs within a workflow have a high memory requirement), you can specify a `slot_type` attribute for the job. For example, adding `+slot_type = "highmemory"`will cause HTCondor to request a node from the "highmemory" nodearray instead of "execute" (note that this currently requires `htcondor.slot_type = "highmemory"` to be set in the nodearray's `[[[configuration]]]` section). This will not affect how HTCondor schedules the jobs, so you may want to include the `slot_type` startd attribute in the job's `requirements` or `rank` expressions. For example: `Requirements = target.slot_type = "highmemory"`.
+ By default, HTCondor requests cores from the nodearray called `execute`. If a job requires a different nodearray (for example, if certain jobs within a workflow have a high memory requirement), specify a `slot_type` attribute for the job. For example, adding `+slot_type = "highmemory"` causes HTCondor to request a node from the `highmemory` nodearray instead of `execute` (this setting currently requires `htcondor.slot_type = "highmemory"` to be set in the nodearray's `[[[configuration]]]` section). This setting doesn't affect how HTCondor schedules the jobs, so you might want to include the `slot_type` startd attribute in the job's `requirements` or `rank` expressions. For example: `Requirements = target.slot_type == "highmemory"`.
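A sketch of the nodearray half of that setup, with a hypothetical nodearray name and placeholder VM size:

```ini
# Cluster template: tag a nodearray as the "highmemory" slot type.
[[nodearray highmemory]]
    MachineType = Standard_E16s_v3    # placeholder VM size
    [[[configuration]]]
    htcondor.slot_type = "highmemory"
```

The matching job then adds `+slot_type = "highmemory"` and, to steer scheduling, `Requirements = target.slot_type == "highmemory"` as described above.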
## Submitting Jobs to HTCondor
@@ -84,7 +84,7 @@ A sample submit file might look like this:
## HTCondor Configuration Reference
- The following are the HTCondor-specific configuration options you can set to customize functionality:
+ The following HTCondor-specific configuration options customize functionality:
@@ -94,15 +94,15 @@ The following are the HTCondor-specific configuration options you can set to cus
| htcondor.condor_owner | The Linux account that owns the HTCondor scaledown scripts. Default: root |
| htcondor.condor_group | The Linux group that owns the HTCondor scaledown scripts. Default: root |
| htcondor.data_dir | The directory for logs, spool directories, execute directories, and local config file. Default: /mnt/condor_data (Linux), C:\All Services\condor_local (Windows) |
- | htcondor.ignore_hyperthreads | (Windows only) Set the number of CPUs to be half of the detected CPUs as a way to "disable" hyperthreading. If using autoscale, specify the non-hyperthread core count with the `Cores` configuration setting in the [[node]] or [[nodearray]] section. Default: false |
+ | htcondor.ignore_hyperthreads | (Windows only) Set the number of CPUs to half of the detected CPUs to "disable" hyperthreading. If using autoscale, specify the non-hyperthread core count with the `Cores` configuration setting in the [[node]] or [[nodearray]] section. Default: false |
| htcondor.install_dir | The directory that HTCondor is installed to. Default: /opt/condor (Linux), C:\condor (Windows) |
- | htcondor.job_start_count | The number of jobs a schedd will start per cycle. 0 is unlimited. Default: 20 |
+ | htcondor.job_start_count | The number of jobs a schedd starts per cycle. 0 is unlimited. Default: 20 |
| htcondor.job_start_delay | The number of seconds between each job start interval. 0 is immediate. Default: 1 |
| htcondor.max_history_log | The maximum size of the job history file in bytes. Default: 20971520 |
| htcondor.max_history_rotations | The maximum number of job history files to keep. Default: 20 |
- | htcondor.negotiator_cycle_delay | The minimum number of seconds before a new negotiator cycle may start. Default: 20 |
+ | htcondor.negotiator_cycle_delay | The minimum number of seconds before a new negotiator cycle can start. Default: 20 |
| htcondor.negotiator_interval | How often (in seconds) the condor_negotiator starts a negotiation cycle. Default: 60 |
- | htcondor.negotiator_inform_startd | If true, the negotiator informs the startd when it is matched to a job. Default: true |
+ | htcondor.negotiator_inform_startd | If true, the negotiator informs the startd when it's matched to a job. Default: true |
| htcondor.remove_stopped_nodes | If true, stopped execute nodes are removed from the CycleServer view instead of being marked as "down". |
| htcondor.running | If true, HTCondor collector and negotiator daemons run on the central manager. Otherwise, only the condor_master runs. Default: true |
| htcondor.scheduler_dual | If true, schedulers run two schedds. Default: true |
@@ -114,7 +114,7 @@ The following are the HTCondor-specific configuration options you can set to cus
## HTCondor Auto-Generated Configuration File
- HTCondor has large number of configuration settings, including user-defined attributes. CycleCloud offers the ability to create a custom configuration file using attributes defined in the cluster:
+ HTCondor has a large number of configuration settings, including user-defined attributes. CycleCloud can create a custom configuration file by using attributes defined in the cluster:
| Attribute | Description |
| --------- | ------------ |
@@ -123,6 +123,6 @@ HTCondor has large number of configuration settings, including user-defined attr
| htcondor.custom_config.settings | The attributes to write to the custom config file, such as `htcondor.custom_config.settings.max_jobs_running = 5000` |
> [!NOTE]
- > HTCondor configuration attributes containing a . cannot be specified using this method. If such attributes are needed, they should be specified in a cookbook or a file installed with cluster-init.
+ > You can't specify HTCondor configuration attributes containing a `.` using this method. If you need such attributes, specify them in a cookbook or a file installed with `cluster-init`.
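For reference, a minimal sketch of the custom-config attributes from the table above (only `max_jobs_running` comes from the documented example; placing it inside a `[[[configuration]]]` section follows the cluster-template convention):

```ini
[[[configuration]]]
# Each key under htcondor.custom_config.settings becomes a setting in the
# generated custom configuration file.
htcondor.custom_config.settings.max_jobs_running = 5000
```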
articles/cyclecloud/images.md (5 additions & 5 deletions)
@@ -8,7 +8,7 @@ ms.author: adjohnso
# Images
- Azure CycleCloud ships with support for standard operating systems. You can specify the image with `ImageName`, which may be a CycleCloud image name, an image URN, or the resource ID of a custom image:
+ Azure CycleCloud ships with support for standard operating systems. You can specify the image with `ImageName`, which can be a CycleCloud image name, an image URN, or the resource ID of a custom image:
- Alternatively, you can use `Image` which supports image labels:
+ Alternatively, use `Image`, which supports image labels:
```ini
[[node defaults]]
Image = Windows 2022 DataCenter
```
- When an exact version is not specified, CycleCloud automatically uses the latest released version of the image for the region that the node is in.
+ When you don't specify an exact version, CycleCloud automatically uses the latest released version of the image for the node's region.
> [!NOTE]
- > If you are using a custom (non-standard) image that was created with Jetpack, you can set `AwaitInstallation=true` on the node, specifying that the image supports sending status messages back to CycleCloud. This will allow for more accurate representations of the node's state within CycleCloud.
+ > If you're using a custom (nonstandard) image that you created with Jetpack, set `AwaitInstallation=true` on the node to indicate that the image supports sending status messages back to CycleCloud. This setting lets CycleCloud represent the node's state more accurately.
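A sketch of that setting on a node, with a hypothetical custom image resource ID:

```ini
[[node custom]]
    # Placeholder resource ID for a custom image built with Jetpack.
    ImageName = /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/images/my-jetpack-image
    AwaitInstallation = true
```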
CycleCloud currently includes the following images:
@@ -54,4 +54,4 @@ CycleCloud currently includes the following images:
| Windows 2022 DataCenter | Windows 2022 DataCenter | cycle.image.win2022 ||
> [!NOTE]
- > Standard images referenced in CycleCloud are the latest known versions of publicly-available operating system images hosted in Marketplace and are not created, maintained, or supported by Microsoft for the CycleCloud product.
+ > Standard images referenced in CycleCloud are the latest known versions of publicly available operating system images hosted in the Marketplace. Microsoft doesn't create, maintain, or support these images for the CycleCloud product.