Commit 194707e (2 parents: 315d591 + e363610)

Merge pull request #53 from osg-htc/fix/fence-languages-conservative-recreate: docs: normalize fences + add backtick check (CI example)

# perfSONAR Deployment Options

The primary motivation behind these perfSONAR deployment recommendations is test isolation: only one end-to-end test
should run on a host at a time, so that test results are not impacted by other tests. Otherwise it is much more
difficult to interpret results, which may vary due to host effects rather than network effects. perfSONAR measurement
tools are therefore much more accurate when running on dedicated hardware, and while it may be useful to run them on
other hosts such as Data Transfer Nodes, the current recommendation is to have a dedicated measurement machine. In
addition, as bandwidth testing can impact latency testing, we recommend deploying two separate nodes, each focused on a
specific set of tests. The following deployment options are currently available:

* **Bare metal** - preferred option, in one of two possible configurations:

    * Two bare metal servers, one for the latency node and one for the bandwidth node
    * One bare metal server running both the latency and bandwidth nodes together, provided that two NICs are available; please refer to the multiple NIC section for more details.

* **Virtual Machine** - if bare metal is not available, it is also possible to run perfSONAR on a VM; however, there is a set of additional requirements to fulfill:

    * A full-node VM is strongly preferred, i.e. 2 VMs (latency/bandwidth node) on a single bare metal host. Mixing perfSONAR VM(s) with others might have an impact on the measurements and is therefore not recommended.
    * The VM needs to be configured with SR-IOV access to the NIC(s) as well as pinned CPUs, to ensure that bandwidth tests are not impacted by the hypervisor switching CPUs during the test.
    * A successful full-speed local bandwidth test is highly recommended before putting the VM into production.
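The last point above can be turned into a quick pre-production check. The sketch below is an illustration, not part of the official toolkit: the NIC speed and the 90% target are assumptions, and the `iperf3` invocations (shown commented out) follow common usage.

```shell
# Pre-production sanity check (sketch): compute a loopback throughput
# target for an assumed NIC speed; the 90% figure is an assumption.
NIC_GBPS=10                        # assumed NIC capacity in Gbit/s
TARGET=$((NIC_GBPS * 90 / 100))    # expect at least 90% of line rate locally
echo "Target: at least ${TARGET} Gbit/s in a local iperf3 test"
# On the VM itself, something like:
# iperf3 -s -D                      # start a local iperf3 server (daemonized)
# iperf3 -c 127.0.0.1 -t 10 -P 4    # 10-second test with 4 parallel streams
```

If the local result falls well short of the target, there is little point in debugging wide-area paths before fixing the host itself.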

* **Container** - perfSONAR has supported containers since version 4.1 (Q1 2018), as documented at <https://docs.perfsonar.net/install_docker.html>, but a container is not typically used in the same way as a full toolkit installation.

    * A Docker perfSONAR test instance can still be used by sites that run multiple perfSONAR instances for internal testing, as this deployment model allows a testpoint to be deployed flexibly and to send its results to a local measurement archive running on the perfSONAR toolkit node.

## perfSONAR Toolkit vs Testpoint

The perfSONAR team has documented the supported installation types at
<https://docs.perfsonar.net/install_options.html>. With the release of version 5, OSG/WLCG sites have a new option:
instead of installing the full Toolkit, sites can choose to install the Testpoint bundle.

* Pros

    * Simpler deployment when a local web interface is not needed and a central measurement archive is available.
    * Less resource intensive, in both memory and I/O capacity.

* Cons

    * Measurements are not stored locally
    * No web interface for configuration or adding local tests
    * Unable to show results in MaDDash

While sites are free to choose whatever deployment method they want, we strongly recommend the use of perfSONAR's
containerized testpoint. This method was chosen as the "best practice" recommendation because of its reduced resource
requirements, fewer components and easier management.
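To make the recommendation concrete, the sketch below prints the shape of the `docker run` invocation for a testpoint container. The image name follows the perfSONAR Docker install page, but the exact tag and options should be verified there before use; host networking is shown because measurement tools need to see the real NIC.

```shell
# Sketch only: print the command that would launch a perfSONAR testpoint
# container; verify image/tag against docs.perfsonar.net before running.
IMAGE=perfsonar/testpoint   # image name per the perfSONAR Docker docs
echo "Would run: docker run -d --name perfsonar-testpoint --net=host ${IMAGE}"
# On a real host, drop the echo and run the docker command directly.
```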

### perfSONAR Hardware Requirements

Two different nodes participate in the network testing, a latency node and a bandwidth node; while both run the exact
same perfSONAR toolkit, they have very different requirements. The bandwidth node measures available (or peak)
throughput at a low test frequency and thus requires a high-capacity NIC (1/10/40/100G are supported) as well as enough
memory and CPU to support high-bandwidth testing. Our recommendation is to match the bandwidth node's NIC speed with
the one installed on the storage nodes, as this provides the best comparison when there are issues to investigate. If
you'd like to deploy a high-speed (100G) bandwidth node, please consult the [ESnet tuning
guide](https://fasterdata.es.net/host-tuning/100g-tuning/) and the [100G tuning
presentation](https://www.es.net/assets/Uploads/100G-Tuning-TechEx2016.tierney.pdf). The latency node, on the other
hand, runs low-bandwidth but high-frequency tests, sending a continuous stream of packets to measure delay and the
corresponding packet loss, packet reordering, etc. This means that while it doesn't require a high-capacity NIC (1G is
usually sufficient), it can impose a significant load on disk I/O as well as CPU, since many tests run in parallel and
need to continuously store their results in the local measurement archive. The minimum hardware requirements for the
perfSONAR toolkit are documented [here](http://docs.perfsonar.net/install_hardware_details.html). For WLCG/OSG
deployments, taking into account the amount of testing that we perform, we recommend at least the following for
perfSONAR 5.0+:

* A NIC for the bandwidth node matching the capacity of the site storage nodes (10/25/40/100G) and a 1G NIC for the latency node (for higher NIC capacities, 40/100G, please check the [ESnet tuning guide](https://fasterdata.es.net/host-tuning/100g-tuning/))

* A high clock speed CPU (3.0 GHz+); fewer cores are OK, with at least 32GB of RAM (8GB+ if using a Testpoint install)

* An NVMe or SSD disk (128GB should be sufficient) if using the full Toolkit install with OpenSearch.

<!-- anchor removed; heading provides an automatic id -->
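A rough way to check a candidate host against the memory recommendation is sketched below; it is a minimal example assuming a Linux host with `/proc/meminfo`, with the thresholds taken from this section.

```shell
# Compare installed RAM against this guide's recommended minimums:
# 32 GB for a full Toolkit node, 8 GB for a Testpoint-only node.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))
if [ "${mem_gb}" -ge 32 ]; then
    echo "RAM ${mem_gb} GB: OK for a full Toolkit node"
elif [ "${mem_gb}" -ge 8 ]; then
    echo "RAM ${mem_gb} GB: OK for a Testpoint node only"
else
    echo "RAM ${mem_gb} GB: below the recommended minimums"
fi
```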

### Multiple NIC (Network Interface Card) Guidance

Many sites would prefer **not** to deploy two servers, for cost, space and power reasons. Since perfSONAR 3.5+ there
has been a way to install both latency and bandwidth measurement services on a single node, as long as it has at least
two NICs (one per 'flavor' of measurement) and sufficient processing power and memory. A few additional steps are
required to configure a node with multiple network cards:

* Please set up source routing as described in the [official documentation](http://docs.perfsonar.net/manage_dual_xface.html).

* You'll need to register two hostnames in [OIM](installation.md)/[GOCDB](installation.md) (and have two reverse DNS entries) as you would normally for two separate nodes.

* Instead of configuring just one auto-URL as the remote URL, please add both, so you'll end up with something like this:

```bash
psconfig remote add "https://psconfig.opensciencegrid.org/pub/auto/<FQDN_latency>"
psconfig remote add "https://psconfig.opensciencegrid.org/pub/auto/<FQDN_throughput>"
...
```
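The source-routing step boils down to a routing table and rule per interface, so replies always leave via the NIC that owns the source address. The fragment below is a hypothetical illustration only: the interface names, addresses and table names are placeholders, and the authoritative recipe is the manage_dual_xface page linked above.

```shell
# Hypothetical policy-routing fragment for a dual-NIC node; all names
# and addresses below are placeholders (RFC 5737 examples), not site values.
# In /etc/iproute2/rt_tables, first define one table per interface, e.g.:
#   200 latency
#   201 throughput
ip route add 192.0.2.0/24 dev eth0 src 192.0.2.10 table latency
ip rule add from 192.0.2.10 table latency
ip route add 198.51.100.0/24 dev eth1 src 198.51.100.10 table throughput
ip rule add from 198.51.100.10 table throughput
```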
