| 1 | +# SciTags, Fireflies, and perfSONAR |
| 2 | + |
| 3 | +## What Are SciTags? |
| 4 | + |
| 5 | +[SciTags](https://www.scitags.org/) is an initiative by the HEP/WLCG networking community to improve |
| 6 | +network visibility by **tagging** network flows with metadata that identifies the experiment and activity |
| 7 | +generating them. By embedding compact identifiers — an *experiment ID* and an *activity ID* — into every |
| 8 | +packet of a tagged flow, network operators, site administrators, and researchers can: |
| 9 | + |
| 10 | +- **Attribute traffic** to specific scientific experiments (ATLAS, CMS, LHCb, ALICE, Belle II, DUNE, etc.). |
| 11 | +- **Distinguish activity types** such as production data transfers, analysis jobs, or network measurements. |
| 12 | +- **Correlate** network metrics with application-level behavior for faster root-cause analysis. |
| 13 | +- **Enable smarter traffic engineering** by providing information that routers, firewalls, and monitoring |
| 14 | + systems can act on. |
| 15 | + |
| 16 | +The tagging scheme is maintained by the |
| 17 | +[SciTags Organization](https://www.scitags.org/) and documented in the |
| 18 | +[SciTags Technical Specification](https://docs.google.com/document/d/1x9JsZ7iTj44Ta06IHdkwpv5Q2u4U2QGLWnUeN2Zf5ts/edit). |
| 19 | + |
| 20 | +--- |
| 21 | + |
| 22 | +## What Are Fireflies? |
| 23 | + |
| 24 | +A **firefly** is a lightweight UDP packet that carries metadata about a network flow. When a tagged |
| 25 | +application (or a daemon acting on its behalf) starts or stops a network transfer, it emits a firefly to a |
| 26 | +designated collector. Each firefly contains: |
| 27 | + |
| 28 | +| Field | Description | |
| 29 | +|-------|-------------| |
| 30 | +| **Experiment ID** | Numeric identifier for the experiment or project (e.g. ATLAS = 2, CMS = 3) | |
| 31 | +| **Activity ID** | Numeric identifier for the type of activity (e.g. data transfer, network test) | |
| 32 | +| **Source IP:port** | Origin of the network flow | |
| 33 | +| **Destination IP:port** | Target of the network flow | |
| 34 | +| **Protocol** | Transport protocol (TCP, UDP) | |
| 35 | +| **State** | Flow state: *start*, *ongoing*, or *end* | |
| 36 | + |
| 37 | +SciTags metadata reaches the network through two complementary mechanisms: |
| 38 | + |
| 39 | +1. **Flow announcement**: fireflies notify collectors and monitoring infrastructure that a tagged flow |
| 40 | + exists so it can be tracked from start to finish. |
| 41 | +2. **Packet marking**: on hosts that support it, the same metadata can be encoded into the IPv6 Flow Label |
| 42 | + or IPv4 DSCP/TOS field of every packet, enabling in-network identification without deep packet |
| 43 | + inspection. |
| 44 | + |
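As a rough illustration of the flow-announcement path, the sketch below packs the fields from the table above into a JSON document and sends it as one UDP datagram. The field names, the JSON encoding, and the helper names `build_firefly` and `send_firefly` are assumptions made for this example; the authoritative wire format is the one defined in the SciTags Technical Specification.

```python
import json
import socket

VALID_STATES = ("start", "ongoing", "end")


def build_firefly(experiment_id, activity_id, src, dst, protocol="tcp", state="start"):
    """Return a UTF-8 JSON payload describing one flow.

    The key names are illustrative only, not the official firefly schema.
    src and dst are (ip, port) tuples.
    """
    if state not in VALID_STATES:
        raise ValueError(f"unknown flow state: {state!r}")
    message = {
        "experiment-id": experiment_id,
        "activity-id": activity_id,
        "src-ip": src[0],
        "src-port": src[1],
        "dst-ip": dst[0],
        "dst-port": dst[1],
        "protocol": protocol,
        "state": state,
    }
    return json.dumps(message).encode("utf-8")


def send_firefly(payload, collector):
    """Send the payload to collector, a (host, port) tuple, as a single UDP datagram.

    Uses an IPv4 socket for simplicity; an IPv6 collector would need AF_INET6.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, collector)
```

A caller would emit one payload with `state="start"` when a transfer begins and another with `state="end"` when it finishes, so that a collector can bracket the flow.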
| 45 | +--- |
| 46 | + |
| 47 | +## What Is flowd-go? |
| 48 | + |
| 49 | +[**flowd-go**](https://github.com/scitags/flowd-go) is a lightweight, high-performance daemon that |
| 50 | +implements the SciTags flow-marking and firefly-sending infrastructure. It is written in Go and ships as a |
| 51 | +single statically linked binary with embedded eBPF programs for packet marking (Linux kernel 5.x or newer). |
| 52 | + |
| 53 | +### Key capabilities |
| 54 | + |
| 55 | +| Feature | Details | |
| 56 | +|---------|---------| |
| 57 | +| **Packet marking** | Uses eBPF to stamp the IPv6 Flow Label on egress packets matching tracked flows | |
| 58 | +| **Firefly emission** | Sends UDP fireflies to local or remote collectors when flows start, continue, or end | |
| 59 | +| **perfSONAR plugin** | Built-in plugin that marks *all* egress traffic with a configured experiment/activity ID — ideal for dedicated measurement hosts | |
| 60 | +| **Low overhead** | Statically compiled Go binary; no Python, no containers, no runtime dependencies beyond `libz` and `libelf` (typically already present) | |
| 61 | +| **RPM packaged** | Available from the SciTags repository for EL9 (`x86_64` and `aarch64`) | |
| 62 | + |
| 63 | +### Architecture |
| 64 | + |
| 65 | +flowd-go follows a **plugin → backend** pipeline: |
| 66 | + |
| 67 | +``` |
| 68 | + ┌──────────┐       ┌──────────┐ |
| 69 | + │ Plugins  │──────▶│ Backends │ |
| 70 | + │ (sources)│       │ (sinks)  │ |
| 71 | + └──────────┘       └──────────┘ |
| 72 | +``` |
| 73 | + |
| 74 | +- **Plugins** detect or receive flow events (API calls, eBPF socket monitoring, perfSONAR catch-all, etc.). |
| 75 | +- **Backends** act on those events: mark packets via eBPF, send fireflies, export Prometheus metrics. |
| 76 | + |
| 77 | +For perfSONAR deployments the recommended configuration uses the **perfsonar** plugin (marks all egress |
| 78 | +traffic) together with the **marker** backend (eBPF-based IPv6 Flow Label stamping). |
| 79 | + |
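The plugin → backend split can be pictured with a small conceptual model. This is not flowd-go code (flowd-go is written in Go); `FlowEvent`, `run_pipeline`, and `catch_all_plugin` are hypothetical names used only to show how sources and sinks are decoupled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class FlowEvent:
    """One flow notification passed from a plugin to the backends (hypothetical type)."""
    experiment_id: int
    activity_id: int
    state: str  # "start", "ongoing", or "end"


# A backend is modelled as any callable that consumes a FlowEvent.
Backend = Callable[[FlowEvent], None]


def run_pipeline(events: Iterable[FlowEvent], backends: List[Backend]) -> int:
    """Deliver every event to each backend in order; return the number of events seen."""
    count = 0
    for event in events:
        for backend in backends:
            backend(event)
        count += 1
    return count


def catch_all_plugin(experiment_id: int, activity_id: int) -> Iterator[FlowEvent]:
    """Stand-in for a catch-all source: yields a single 'start' event."""
    yield FlowEvent(experiment_id, activity_id, "start")
```

Because a backend only sees events, adding a new sink (for example a metrics exporter) does not require changes to any source, which is the design benefit the pipeline layout aims for.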
| 80 | +--- |
| 81 | + |
| 82 | +## Why Use flowd-go with perfSONAR? |
| 83 | + |
| 84 | +perfSONAR measurement hosts generate a significant amount of network traffic for latency, throughput, and |
| 85 | +traceroute tests. Without tagging, this traffic is indistinguishable from other flows traversing the same |
| 86 | +links. By running flowd-go alongside perfSONAR: |
| 87 | + |
| 88 | +- **Network operators** can instantly identify perfSONAR measurement traffic on their infrastructure. |
| 89 | +- **Experiment coordinators** can attribute network test results to specific projects. |
| 90 | +- **Troubleshooters** can correlate measurement anomalies with flow-level metadata in packet captures or |
| 91 | + NetFlow/sFlow records. |
| 92 | + |
| 93 | +The integration is intentionally lightweight: |
| 94 | + |
| 95 | +1. Install the `flowd-go` RPM (a single package, ~5 MB). |
| 96 | +2. Write a small YAML configuration selecting your experiment. |
| 97 | +3. Enable and start the `flowd-go` systemd service. |
| 98 | + |
| 99 | +The helper scripts in this repository automate all three steps. |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Experiment IDs |
| 104 | + |
| 105 | +The following experiment IDs are defined in the SciTags registry. Select the one that matches your site's |
| 106 | +primary experiment affiliation: |
| 107 | + |
| 108 | +| ID | Experiment | |
| 109 | +|----|------------| |
| 110 | +| 1 | Default (no specific experiment) | |
| 111 | +| 2 | ATLAS | |
| 112 | +| 3 | CMS | |
| 113 | +| 4 | LHCb | |
| 114 | +| 5 | ALICE | |
| 115 | +| 6 | Belle II | |
| 116 | +| 7 | SKA | |
| 117 | +| 8 | DUNE | |
| 118 | +| 9 | LSST / Rubin Observatory | |
| 119 | +| 10 | ILC | |
| 120 | +| 11 | Auger | |
| 121 | +| 12 | JUNO | |
| 122 | +| 13 | NOvA | |
| 123 | +| 14 | XENON | |
| 124 | + |
| 125 | +Activity ID **2** (network testing / perfSONAR) is used by default for perfSONAR deployments. |
| 126 | + |
| 127 | +--- |
| 128 | + |
| 129 | +## Configuration Reference |
| 130 | + |
| 131 | +A minimal `/etc/flowd-go/conf.yaml` for a perfSONAR host affiliated with ATLAS: |
| 132 | + |
| 133 | +```yaml |
| 134 | +plugins: |
| 135 | +  perfsonar: |
| 136 | +    activityId: 2 |
| 137 | +    experimentId: 2 |
| 138 | + |
| 139 | +backends: |
| 140 | +  marker: |
| 141 | +    targetInterfaces: [ens4f0np0, ens4f1np1] |
| 142 | +    markingStrategy: label |
| 143 | +    forceHookRemoval: true |
| 144 | +``` |
| 145 | + |
| 146 | +| Key | Description | |
| 147 | +|-----|-------------| |
| 148 | +| `plugins.perfsonar.activityId` | Activity type — use **2** for network testing | |
| 149 | +| `plugins.perfsonar.experimentId` | Experiment affiliation (see table above) | |
| 150 | +| `backends.marker.targetInterfaces` | List of NIC names whose egress traffic should be marked | |
| 151 | +| `backends.marker.markingStrategy` | `label` = IPv6 Flow Label (recommended) | |
| 152 | +| `backends.marker.forceHookRemoval` | Remove eBPF hooks cleanly on daemon stop | |
| 153 | + |
| 154 | +For the full set of options see the |
| 155 | +[flowd-go man page](https://github.com/scitags/flowd-go/blob/main/rpm/flowd-go.1.md) and the |
| 156 | +[default conf.yaml](https://github.com/scitags/flowd-go/blob/main/rpm/conf.yaml). |
| 157 | + |
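For sites that template their configuration, the minimal file above can be generated from an experiment name and a list of interfaces. The sketch below is one possible approach and is not one of the helper scripts shipped in this repository; `render_conf` is a hypothetical helper, and the ID mapping is copied from the Experiment IDs table in this document.

```python
# Experiment name -> ID, as listed in the Experiment IDs table of this document.
EXPERIMENT_IDS = {
    "default": 1, "atlas": 2, "cms": 3, "lhcb": 4, "alice": 5,
    "belle ii": 6, "ska": 7, "dune": 8, "lsst": 9, "ilc": 10,
    "auger": 11, "juno": 12, "nova": 13, "xenon": 14,
}

# Activity ID used for perfSONAR / network testing.
PERFSONAR_ACTIVITY_ID = 2


def render_conf(experiment, interfaces):
    """Return the text of a minimal conf.yaml for a perfSONAR host.

    experiment is a case-insensitive name from EXPERIMENT_IDS;
    interfaces is a non-empty list of NIC names.
    """
    key = experiment.strip().lower()
    if key not in EXPERIMENT_IDS:
        raise ValueError(f"unknown experiment: {experiment!r}")
    if not interfaces:
        raise ValueError("at least one interface is required")
    nic_list = ", ".join(interfaces)
    return (
        "plugins:\n"
        "  perfsonar:\n"
        f"    activityId: {PERFSONAR_ACTIVITY_ID}\n"
        f"    experimentId: {EXPERIMENT_IDS[key]}\n"
        "\n"
        "backends:\n"
        "  marker:\n"
        f"    targetInterfaces: [{nic_list}]\n"
        "    markingStrategy: label\n"
        "    forceHookRemoval: true\n"
    )


if __name__ == "__main__":
    # Print a configuration for an ATLAS-affiliated host with two NICs.
    print(render_conf("ATLAS", ["ens4f0np0", "ens4f1np1"]))
```

Writing the result to `/etc/flowd-go/conf.yaml` and restarting the service is left to the caller, since that step needs root privileges.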
| 158 | +--- |
| 159 | + |
| 160 | +## Verifying the Installation |
| 161 | + |
| 162 | +After starting flowd-go, verify it is running and marking traffic: |
| 163 | + |
| 164 | +```bash |
| 165 | +# Check service status |
| 166 | +systemctl status flowd-go |
| 167 | + |
| 168 | +# View recent log output |
| 169 | +journalctl -u flowd-go --no-pager -n 20 |
| 170 | + |
| 171 | +# Confirm eBPF programs are attached (should show tc/qdisc entries) |
| 172 | +tc qdisc show dev <NIC_NAME> |
| 173 | +``` |
| 174 | + |
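To confirm marking on the wire, the Flow Label can be read from a captured packet. In the IPv6 fixed header the first 32 bits hold the version (4 bits), the traffic class (8 bits), and the flow label (20 bits). The sketch below extracts the label from raw header bytes; `ipv6_flow_label` is a hypothetical helper, and how the experiment and activity IDs are laid out inside those 20 bits is defined by the SciTags Technical Specification and is not decoded here.

```python
import struct


def ipv6_flow_label(header: bytes) -> int:
    """Return the 20-bit Flow Label from the start of a raw IPv6 header."""
    if len(header) < 4:
        raise ValueError("need at least the first 4 bytes of the IPv6 header")
    # The first 32-bit word, in network byte order, is version | traffic class | flow label.
    (first_word,) = struct.unpack("!I", header[:4])
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 header")
    return first_word & 0xFFFFF


if __name__ == "__main__":
    # Synthetic 40-byte header: version 6, traffic class 0, flow label 0x12345.
    sample = struct.pack("!I", (6 << 28) | 0x12345) + bytes(36)
    print(hex(ipv6_flow_label(sample)))  # prints 0x12345
```

A non-zero label on egress packets from the configured interfaces, compared against an unmarked host, is a reasonable first sign that the marker backend is active.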
| 175 | +--- |
| 176 | + |
| 177 | +## Further Reading |
| 178 | + |
| 179 | +- [SciTags Organization](https://www.scitags.org/) |
| 180 | +- [SciTags Technical Specification](https://docs.google.com/document/d/1x9JsZ7iTj44Ta06IHdkwpv5Q2u4U2QGLWnUeN2Zf5ts/edit) |
| 181 | +- [flowd-go GitHub Repository](https://github.com/scitags/flowd-go) |
| 182 | +- [flowd-go Man Page](https://github.com/scitags/flowd-go/blob/main/rpm/flowd-go.1.md) |
| 183 | +- [perfSONAR Documentation](https://docs.perfsonar.net/) |