Commit e6fb8ac

Merge pull request #297762 from tfitzmac/0404edit1
copy edit
2 parents f0c9e66 + 858487e commit e6fb8ac

File tree

8 files changed (+143, -172 lines)

articles/cyclecloud/common-issues/node-cyclecloud-connectivity.md

Lines changed: 10 additions & 10 deletions
@@ -7,11 +7,11 @@ ms.author: adjohnso
 ---
 # Common Issues: Node to CycleCloud Connectivity

-Cyclecloud installs an agent on each VM that needs to be able to communicate back to the CycleCloud application in order to report status, monitoring, as well as to make API requests for auto-scaling and distributed synchronization.
+CycleCloud installs an agent on each virtual machine that needs to communicate with the CycleCloud application. The agent reports status and monitoring and makes API requests for autoscaling and distributed synchronization.

-It is recommended that the application server be deployed in the same VNET (virtual network) as the cluster. Where this is not feasible, connectivity may be established by doing [VNET peering](../network-connectivity.md#vnet-peering) or using a [proxy node](../network-connectivity.md#proxy-node). These error messages indicate that nodes are unable to communicate back to the CycleCloud application server.
+We recommend deploying the application server in the same virtual network as the cluster. If this configuration isn't feasible, establish connectivity by doing [virtual network peering](../network-connectivity.md#virtual-network-peering) or using a [proxy node](../network-connectivity.md#proxy-node). These error messages indicate that nodes can't communicate with the CycleCloud application server.

-## Possible Error Messages
+## Possible error messages
 - `Timeout awaiting system boot-up`
 - `Timed out connecting to CycleCloud at {https://A.B.C.D}`
 - `Connection refused to CycleCloud through return-proxy tunnel at {https://A.B.C.D:37140}`
@@ -21,17 +21,17 @@ It is recommended that the application server be deployed in the same VNET (virt

 ## Resolution

-- If the CycleCloud server and the cluster is in the same VNET, check the network security groups for the subnets in the VNET. Cluster nodes need to be able to reach the CycleCloud server at TCP 9443 and 5672. In the other direction, Azure CycleCloud needs to be able to reach ganglia (TCP 8652) and SSH (TCP 22) ports of the cluster for system and job monitoring.
+- If the CycleCloud server and the cluster are in the same virtual network, check the network security groups for the subnets in the virtual network. Cluster nodes need to reach the CycleCloud server at TCP 9443 and 5672. In the other direction, Azure CycleCloud needs to reach ganglia (TCP 8652) and SSH (TCP 22) ports of the cluster for system and job monitoring.

-- You may need to add a public IP address.
+- You might need to add a public IP address.

 - If the error message indicates a return proxy, check the [return proxy settings](../how-to/return-proxy.md).

-- After updating network or proxy settings, you can test connectivity by SSHing into the node as the cyclecloud user and using `curl -k {https://error-message-url}`.
+- After updating network or proxy settings, test connectivity by SSHing into the node as the cyclecloud user and using `curl -k {https://error-message-url}`.

-- After validating that network connectivity is fixed, you will need to terminate and restart the node.
+- After validating that network connectivity is fixed, terminate and restart the node.

-## More Information
+## More information

-[Read more about network-connectivity here](../network-connectivity.md)
-[Read more about return proxy here](../how-to/return-proxy.md)
+[Learn more about network connectivity](../network-connectivity.md).
+[Learn more about return proxy](../how-to/return-proxy.md).
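The port checks in the resolution steps above can be sketched as a small script run from a cluster node. This is only an illustration: the server address is a placeholder you would replace with the CycleCloud server IP from your error message.

```shell
#!/usr/bin/env bash
# Sketch of the node-to-CycleCloud connectivity checks described above.
# CC_SERVER is a placeholder address, not a real server.
CC_SERVER="${CC_SERVER:-10.0.0.4}"

# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Node -> CycleCloud: web/API (TCP 9443) and AMQP (TCP 5672).
for port in 9443 5672; do
  if check_port "$CC_SERVER" "$port"; then
    echo "TCP $port: reachable"
  else
    echo "TCP $port: blocked or closed"
  fi
done

# The URL check the article suggests (-k skips certificate validation):
# curl -k "https://$CC_SERVER:9443"
```

If either port reports blocked, review the network security group rules before terminating and restarting the node.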

articles/cyclecloud/htcondor.md

Lines changed: 12 additions & 12 deletions
@@ -8,7 +8,7 @@ ms.author: adjohnso

 # HTCondor

-[HTCondor](http://research.cs.wisc.edu/htcondor/manual/latest) can easily be enabled on a CycleCloud cluster by modifying the "run_list" in the configuration section of your cluster definition. There are three basic components of an HTCondor cluster. The first is the "central manager" which provides the scheduling and management daemons. The second component of an HTCondor cluster is one or more schedulers from which jobs are submitted into the system. The final component is one or more execute nodes which are the hosts perform the computation. A simple HTCondor template may look like:
+You can enable [HTCondor](http://research.cs.wisc.edu/htcondor/manual/latest) on a CycleCloud cluster by modifying the `run_list` in the configuration section of your cluster definition. There are three basic components of an HTCondor cluster. The first is the **central manager**, which provides the scheduling and management daemons. The second component is one or more **schedulers**, from which jobs are submitted into the system. The final component is one or more **execute nodes**, which are the hosts that perform the computation. A simple HTCondor template might look like:

 ``` ini
 [cluster htcondor]
@@ -36,15 +36,15 @@
 run_list = role[usc_execute]
 ```

-Importing and starting a cluster with definition in CycleCloud will yield a "manager" and a "scheduler" node, as well as one "execute" node. Execute nodes can be added to the cluster via the `cyclecloud add_node` command. To add 10 more execute nodes:
+When you import and start a cluster with this definition in CycleCloud, you get a **manager** and a **scheduler** node, and one **execute** node. You can add **execute** nodes to the cluster by using the `cyclecloud add_node` command. To add 10 more **execute** nodes, use the following command:

 ```azurecli-interactive
 cyclecloud add_node htcondor -t execute -c 10
 ```

 ## HTCondor Autoscaling

-CycleCloud supports autoscaling for HTCondor, which means that the software will monitor the status of your queue and turn on and off nodes as needed to complete the work in an optimal amount of time/cost. You can enable autoscaling for HTCondor by adding `Autoscale=true` to your cluster definition:
+CycleCloud supports autoscaling for HTCondor. The software monitors the status of your queue and turns on and off nodes as needed to complete the work in an optimal amount of time and cost. To enable autoscaling for HTCondor, add `Autoscale=true` to your cluster definition:

 ``` ini
 [cluster htcondor]
@@ -53,11 +53,11 @@ Autoscale = True

 ## HTCondor Advanced Usage

-If you know the average runtime of jobs, you can define `average_runtime` (in minutes) in your job. CycleCloud will use that to start the minimum number of nodes (for example, five 10-minute jobs will only start a single node instead of five when `average_runtime` is set to 10).
+If you know the average runtime of jobs, define `average_runtime` (in minutes) in your job. CycleCloud uses that value to start the minimum number of nodes. For example, if five 10-minute jobs are submitted and `average_runtime` is set to 10, CycleCloud starts only one node instead of five.

 ## Autoscale Nodearray

-By default, HTCondor will request cores from the nodearray called 'execute'. If a job requires a different nodearray (for example if certain jobs within a workflow have a high memory requirement), you can specify a `slot_type` attribute for the job. For example, adding `+slot_type = "highmemory"` will cause HTCondor to request a node from the "highmemory" nodearray instead of "execute" (note that this currently requires `htcondor.slot_type = "highmemory"` to be set in the nodearray's `[[[configuration]]]` section). This will not affect how HTCondor schedules the jobs, so you may want to include the `slot_type` startd attribute in the job's `requirements` or `rank` expressions. For example: `Requirements = target.slot_type = "highmemory"`.
+By default, HTCondor requests cores from the nodearray called `execute`. If a job requires a different nodearray (for example, if certain jobs within a workflow have a high memory requirement), specify a `slot_type` attribute for the job. For example, adding `+slot_type = "highmemory"` causes HTCondor to request a node from the `highmemory` nodearray instead of `execute` (this setting currently requires `htcondor.slot_type = "highmemory"` to be set in the nodearray's `[[[configuration]]]` section). This setting doesn't affect how HTCondor schedules the jobs, so you might want to include the `slot_type` startd attribute in the job's `requirements` or `rank` expressions. For example: `Requirements = target.slot_type = "highmemory"`.

 ## Submitting Jobs to HTCondor

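The `average_runtime` and `slot_type` attributes above live in a job's submit description. A hypothetical sketch (the script name and memory request are invented for illustration; the requirements expression uses ClassAd equality `==`), written here from the shell:

```shell
# Hypothetical HTCondor submit description combining the attributes above.
cat > highmem-job.sub <<'EOF'
executable       = run_analysis.sh
request_memory   = 64000

# Ask CycleCloud for a node from the "highmemory" nodearray (requires
# htcondor.slot_type = "highmemory" in that nodearray's configuration).
+slot_type       = "highmemory"

# Also steer HTCondor's own matchmaking to those slots.
requirements     = target.slot_type == "highmemory"

# Autoscaling hint: these jobs average 30 minutes.
+average_runtime = 30

queue
EOF

# Submit as usual:
# condor_submit highmem-job.sub
```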
@@ -84,7 +84,7 @@ A sample submit file might look like this:

 ## HTCondor Configuration Reference

-The following are the HTCondor-specific configuration options you can set to customize functionality:
+The following HTCondor-specific configuration options customize functionality:

 | HTCondor-Specific Configuration Options | Description |
 | --------------------------------------- | -------------|
@@ -94,15 +94,15 @@ The following are the HTCondor-specific configuration options you can cus
 | htcondor.condor_owner | The Linux account that owns the HTCondor scaledown scripts. Default: root |
 | htcondor.condor_group | The Linux group that owns the HTCondor scaledown scripts. Default: root |
 | htcondor.data_dir | The directory for logs, spool directories, execute directories, and local config file. Default: /mnt/condor_data (Linux), C:\All Services\condor_local (Windows) |
-| htcondor.ignore_hyperthreads | (Windows only) Set the number of CPUs to be half of the detected CPUs as a way to "disable" hyperthreading. If using autoscale, specify the non-hyperthread core count with the `Cores` configuration setting in the [[node]] or [[nodearray]] section. Default: false |
+| htcondor.ignore_hyperthreads | (Windows only) Set the number of CPUs to half of the detected CPUs to "disable" hyperthreading. If using autoscale, specify the non-hyperthread core count with the `Cores` configuration setting in the [[node]] or [[nodearray]] section. Default: false |
 | htcondor.install_dir | The directory that HTCondor is installed to. Default: /opt/condor (Linux), C:\condor (Windows) |
-| htcondor.job_start_count | The number of jobs a schedd will start per cycle. 0 is unlimited. Default: 20 |
+| htcondor.job_start_count | The number of jobs a schedd starts per cycle. 0 is unlimited. Default: 20 |
 | htcondor.job_start_delay | The number of seconds between each job start interval. 0 is immediate. Default: 1 |
 | htcondor.max_history_log | The maximum size of the job history file in bytes. Default: 20971520 |
 | htcondor.max_history_rotations | The maximum number of job history files to keep. Default: 20 |
-| htcondor.negotiator_cycle_delay | The minimum number of seconds before a new negotiator cycle may start. Default: 20 |
+| htcondor.negotiator_cycle_delay | The minimum number of seconds before a new negotiator cycle can start. Default: 20 |
 | htcondor.negotiator_interval | How often (in seconds) the condor_negotiator starts a negotiation cycle. Default: 60 |
-| htcondor.negotiator_inform_startd | If true, the negotiator informs the startd when it is matched to a job. Default: true |
+| htcondor.negotiator_inform_startd | If true, the negotiator informs the startd when it matches to a job. Default: true |
 | htcondor.remove_stopped_nodes | If true, stopped execute nodes are removed from the CycleServer view instead of being marked as "down". |
 | htcondor.running | If true, HTCondor collector and negotiator daemons run on the central manager. Otherwise, only the condor_master runs. Default: true |
 | htcondor.scheduler_dual | If true, schedulers run two schedds. Default: true |
@@ -114,7 +114,7 @@ The following are the HTCondor-specific configuration options you can cus

 ## HTCondor Auto-Generated Configuration File

-HTCondor has large number of configuration settings, including user-defined attributes. CycleCloud offers the ability to create a custom configuration file using attributes defined in the cluster:
+HTCondor has a large number of configuration settings, including user-defined attributes. CycleCloud offers the ability to create a custom configuration file using attributes defined in the cluster:

 | Attribute | Description |
 | --------- | ------------ |
@@ -123,6 +123,6 @@ HTCondor has large number of configuration settings, including user-defined attr
 | htcondor.custom_config.settings | The attributes to write to the custom config file such as `htcondor.custom_config.settings.max_jobs_running = 5000`|

 > [!NOTE]
-> HTCondor configuration attributes containing a . cannot be specified using this method. If such attributes are needed, they should be specified in a cookbook or a file installed with cluster-init.
+> You can't specify HTCondor configuration attributes containing a `.` using this method. If you need such attributes, specify them in a cookbook or a file installed with `cluster-init`.

 [!INCLUDE [scheduler-integration](~/articles/cyclecloud/includes/scheduler-integration.md)]
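As a sketch of the auto-generated configuration table above, the `htcondor.custom_config.settings` attributes might appear in a cluster template like this (written via heredoc; `max_jobs_running` comes from the table, while `max_jobs_per_owner` is a hypothetical second entry):

```shell
# Sketch: custom HTCondor config attributes inside a nodearray's
# [[[configuration]]] section. Values here are illustrative only.
cat > htcondor-custom.ini <<'EOF'
[[nodearray execute]]
    [[[configuration]]]
    # Each attribute below is written into the generated custom config file.
    htcondor.custom_config.settings.max_jobs_running = 5000
    # Hypothetical additional setting, shown for illustration:
    htcondor.custom_config.settings.max_jobs_per_owner = 1000
EOF
```

Attribute names containing a `.` can't be expressed this way, per the note above.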

articles/cyclecloud/images.md

Lines changed: 5 additions & 5 deletions
@@ -8,7 +8,7 @@ ms.author: adjohnso

 # Images

-Azure CycleCloud ships with support for standard operating systems. You can specify the image with `ImageName`, which may be a CycleCloud image name, an image URN, or the resource ID of a custom image:
+Azure CycleCloud ships with support for standard operating systems. You can specify the image with `ImageName`, which can be a CycleCloud image name, an image URN, or the resource ID of a custom image:

 ``` ini
 # CycleCloud image name
@@ -24,17 +24,17 @@ ImageName = MicrosoftWindowsServer:WindowsServer:2022-datacenter-g2:latest
 ImageName = /subscriptions/xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/MyResourceGroup/providers/Microsoft.Compute/images/MyCustomImage
 ```

-Alternatively, you can use `Image` which supports image labels:
+Alternatively, use `Image` which supports image labels:

 ``` ini
 [[node defaults]]
 Image = Windows 2022 DataCenter
 ```

-When an exact version is not specified, CycleCloud automatically uses the latest released version of the image for the region that the node is in.
+When you don't specify an exact version, CycleCloud automatically uses the latest released version of the image for the region of the node.

 > [!NOTE]
-> If you are using a custom (non-standard) image that was created with Jetpack, you can set `AwaitInstallation=true` on the node, specifying that the image supports sending status messages back to CycleCloud. This will allow for more accurate representations of the node's state within CycleCloud.
+> If you're using a custom (nonstandard) image that you created with Jetpack, set `AwaitInstallation=true` on the node. This setting specifies that the image supports sending status messages back to CycleCloud. With this setting, CycleCloud can provide more accurate representations of the node's state.

 CycleCloud currently includes the following images:

@@ -54,4 +54,4 @@ CycleCloud currently includes the following images:
 | Windows 2022 DataCenter | Windows 2022 DataCenter | cycle.image.win2022 | |

 > [!NOTE]
-> Standard images referenced in CycleCloud are the latest known versions of publicly-available operating system images hosted in Marketplace and are not created, maintained, or supported by Microsoft for the CycleCloud product.
+> Standard images referenced in CycleCloud are the latest known versions of publicly available operating system images hosted in the Marketplace. Microsoft doesn't create, maintain, or support these images for the CycleCloud product.
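Combining the resource-ID form of `ImageName` with the `AwaitInstallation` note above might look like the following sketch (written via heredoc; the subscription and resource names reuse the article's own placeholders, and the node name is invented):

```shell
# Sketch: a node that boots a custom Jetpack-built image and reports
# installation status back to CycleCloud. Names are placeholders.
cat > custom-image-node.ini <<'EOF'
[[node custom]]
ImageName = /subscriptions/xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/MyResourceGroup/providers/Microsoft.Compute/images/MyCustomImage
# Only meaningful for custom images created with Jetpack; the image must be
# able to send status messages back to CycleCloud.
AwaitInstallation = true
EOF
```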
