Developing Agent Control

Compiling and running Agent Control

As of now, Agent Control is supported on Linux (x86_64 and aarch64). The program is written in Rust, and for multiplatform compilation we leverage cargo-zigbuild and musl libc.

On-host

To compile and run locally:

  1. Install the Rust toolchain for your system and add the targets you wish to compile for, e.g. rustup target add x86_64-unknown-linux-musl aarch64-unknown-linux-musl.

  2. Install Zig with one of the supported methods.

  3. Install cargo-zigbuild with cargo install --locked cargo-zigbuild.

  4. Run cargo zigbuild --bin newrelic-agent-control --target <ARCH>-unknown-linux-musl, where <ARCH> is either x86_64 or aarch64, depending on your system.

    • On macOS, you might run into an error like the following:

      cargo zigbuild --bin newrelic-agent-control-onhost --target aarch64-unknown-linux-musl
      [...]
        = note: some arguments are omitted. use `--verbose` to show all linker arguments
        = note: error: unable to search for static library /<SOME_PATH_TO_RLIB_FILE>.rlib: ProcessFdQuotaExceeded

      This is a known issue. To address it, increase the number of file descriptors for the current shell session with:

      ulimit -n 4096
  5. The newrelic-agent-control binary will be generated at ./target/<ARCH>-unknown-linux-musl/debug/newrelic-agent-control.

  6. Prepare a local_config.yaml file in /etc/newrelic-agent-control/local-data/agent-control, for example:

    fleet_control:
      endpoint: https://opamp.service.newrelic.com/v1/opamp
      headers:
        api-key: YOUR_INGEST_KEY
    agents:
      nr-otel-collector:
        agent_type: "newrelic/com.newrelic.opentelemetry.collector:0.1.0"
  7. Place the values files in the folder /etc/newrelic-agent-control/local-data/{AGENT-ID}/, where AGENT-ID is a key under agents: in local_config.yaml. Example:

    config: |
      # the OTel collector config here
      # receivers:
      # exporters:
      # pipelines:
  8. Run the binary, which picks up the configuration file prepared in step 6, with sudo ./target/<ARCH>-unknown-linux-musl/debug/newrelic-agent-control
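
The build command in step 4 and the binary path in step 5 share the same target triple, so both can be derived from the machine name. The following is a minimal sketch, assuming the machine name comes from uname -m; the helper names ac_arch, ac_target and ac_binary_path are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository): derive the musl target
# triple and the debug binary path from a machine name such as `uname -m`.

ac_arch() {
  # Normalize common machine names to the <ARCH> values used in step 4.
  case "$1" in
    x86_64|amd64)  echo "x86_64" ;;
    aarch64|arm64) echo "aarch64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

ac_target() {
  local arch
  arch="$(ac_arch "$1")" || return 1
  echo "${arch}-unknown-linux-musl"
}

ac_binary_path() {
  local target
  target="$(ac_target "$1")" || return 1
  echo "./target/${target}/debug/newrelic-agent-control"
}

# Print the build command and the resulting binary path for this machine.
echo "cargo zigbuild --bin newrelic-agent-control --target $(ac_target "$(uname -m)")"
echo "binary: $(ac_binary_path "$(uname -m)")"
```

The script only prints the commands, so it is safe to run before any toolchain is installed.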

Cross-compilation for Windows

The steps below work for the x86_64-pc-windows-msvc target only. It is also possible to compile the project for the x86_64-pc-windows-gnu target using cargo-zigbuild.

  1. Install the Rust toolchain and add the Windows target: rustup target add x86_64-pc-windows-msvc

  2. Install cargo-xwin and its dependencies. Example for macOS:

    ❯ brew install cmake ninja llvm
    ❯ cargo install --locked cargo-xwin
  3. Compile agent-control:

    ❯ cargo xwin build --bin newrelic-agent-control --target x86_64-pc-windows-msvc --release

    ⚠️ This method doesn't work for building debug binaries (--release is required), because cargo-xwin doesn't provide msvcrtd.lib (the debug version of the C runtime library).

AC Service Windows vs Linux

On Linux, Agent Control is expected to run as a system service managed by systemd. The service file is packaged in the .deb and .rpm installers by goreleaser, and the postinstall.sh script installs and enables it.

On Windows, Agent Control is expected to run as a Windows Service. The install.ps1 script installs Agent Control and registers the service.

On both operating systems, an environment_variables.yaml file is created at installation time in the Agent Control configuration directory. It stores the environment variables that Agent Control uses when running as a service.

Kubernetes

We use minikube and tilt to launch a local cluster and deploy the Agent Control charts.

Prerequisites

  • Install the Rust toolchain for your system and add the targets you wish to compile for, e.g. rustup target add x86_64-unknown-linux-musl aarch64-unknown-linux-musl.
  • Install Zig with one of the supported methods.
  • Install cargo-zigbuild with cargo install --locked cargo-zigbuild.
  • Install minikube for local Kubernetes cluster emulation.
  • Ensure you have tilt installed for managing local development environments.
  • Add an Agent Control values file in local/agent-control-tilt.yml.

Note: Adding the 'chart_repo' setting, pointing to the New Relic charts on a local path, allows using local helm charts.
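
The prerequisites can be verified before starting the cluster. This is a minimal sketch, assuming each tool is available on PATH under its usual command name (cargo subcommands such as cargo-zigbuild are installed as cargo-<name> binaries); missing_tools is a hypothetical helper.

```shell
# Hypothetical helper: print each command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report anything from the prerequisites list that cannot be found.
missing="$(missing_tools cargo zig cargo-zigbuild minikube tilt)"
[ -z "$missing" ] || echo "missing tools: $missing" >&2

# The values file is expected relative to the repository root.
[ -f local/agent-control-tilt.yml ] || echo "missing values file: local/agent-control-tilt.yml" >&2
```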

Steps

minikube start --driver='docker'
make tilt-up

On macOS, you might run into an error like the following:

cargo zigbuild --bin newrelic-agent-control-onhost --target aarch64-unknown-linux-musl
[...]
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: error: unable to search for static library /<SOME_PATH_TO_RLIB_FILE>.rlib: ProcessFdQuotaExceeded

This is a known issue. To address it, increase the number of file descriptors for the current shell session with:

ulimit -n 4096
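
The limit only needs raising when the current soft limit is below the suggested value. A minimal sketch of that check follows; fd_limit_too_low is a hypothetical helper, and 4096 is the value suggested above.

```shell
# Hypothetical helper: succeed when the soft limit is below the wanted value.
# $1: current soft limit as printed by `ulimit -n`, $2: wanted minimum.
fd_limit_too_low() {
  [ "$1" != "unlimited" ] && [ "$1" -lt "$2" ]
}

if fd_limit_too_low "$(ulimit -n)" 4096; then
  # Raising the soft limit can fail if the hard limit is lower than 4096.
  ulimit -n 4096 || echo "could not raise the file descriptor limit" >&2
fi
```

The change applies to the current shell session only, so run it in the same shell that later runs make tilt-up or cargo zigbuild.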

Troubleshooting

See diagnose issues with agent control logging.

Disable Fleet Control

Users can disable remote management by commenting out the fleet_control section in /etc/newrelic-agent-control/local-data/agent-control/local_config.yaml (on-host):

# fleet_control:
#   endpoint: https://opamp.service.newrelic.com/v1/opamp
#   signature_validation:
#     public_key_server_url: https://publickeys.newrelic.com/r/blob-management/global/agentconfiguration/jwks.json
#   headers:
#     api-key: API_KEY_HERE
#   fleet_id: FLEET_ID_HERE
#   auth_config:
#     token_url: PLACEHOLDER
#     client_id: PLACEHOLDER
#     provider: PLACEHOLDER
#     private_key_path: PLACEHOLDER

Or by setting enabled: false under the fleet_control section of the Agent Control configuration values (k8s):

# For K8s, inside the Helm values:
agent-control-deployment:
  image:
    imagePullPolicy: Always
  config:
    fleet_control:
      enabled: false
  # ...
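
For the on-host case, a quick text check can confirm whether the fleet_control section is still active in the configuration file. This is a rough sketch, assuming fleet_control is a top-level key starting at column 0; it is a grep heuristic rather than a YAML parser, and fleet_control_configured is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the file has an uncommented top-level
# fleet_control key. Commented lines ("# fleet_control:") do not match.
fleet_control_configured() {
  grep -Eq '^fleet_control:' "$1"
}

config=/etc/newrelic-agent-control/local-data/agent-control/local_config.yaml
if [ -r "$config" ]; then
  if fleet_control_configured "$config"; then
    echo "fleet_control is configured in $config"
  else
    echo "fleet_control is absent or commented out in $config"
  fi
fi
```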

Agent Control Health

Agent Control runs a service that exposes a /status endpoint for Agent Control itself. This service performs a series of checks that determine both the HTTP status code and the body of the response:

  • Reachability of the Fleet Control endpoint (if Fleet Control is enabled at all).
  • Active agents and the health of each one, in the same form used by the OpAMP protocol, as mentioned when discussing sub-agent health.

Example response:

{
  "agent_control": {
    "healthy": true
  },
  "opamp": {
    "enabled": true,
    "endpoint": "https://opamp.service.newrelic.com/v1/opamp",
    "reachable": true
  },
  "agents": {
    "nr-otel-collector": {
      "agent_id": "nr-otel-collector",
      "agent_type": "newrelic/com.newrelic.opentelemetry.collector:0.1.0",
      "healthy": true
    },
    "nr-infra-agent": {
      "agent_id": "nr-infra-agent",
      "agent_type": "newrelic/com.newrelic.infrastructure:0.1.1",
      "healthy": false,
      "last_error": "process exited with code: exit status: 1"
    }
  }
}

To use the endpoint, enable the local server by adding the following setting to the Agent Control configuration file:

server:
    enabled: true
    # default values (change if needed)
    #host: "127.0.0.1"
    #port: 51200

For Kubernetes, the status endpoint is enabled by default. You can reach it through a Kubernetes port-forward by running the following commands in separate shells:

$ kubectl port-forward ac-agent-control-6558446569-rtwh4 -n newrelic 51200:51200
Forwarding from 127.0.0.1:51200 -> 51200
Forwarding from [::1]:51200 -> 51200

$ curl localhost:51200/status | jq
# ... contents will appear here formatted and highlighted
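
To pick out failing agents from the response, the payload can be filtered with jq. The sketch below assumes the response shape shown in the sample earlier in this section; unhealthy_agents is a hypothetical helper, and the "unknown" fallback text is an illustrative choice for agents that report no last_error.

```shell
# Hypothetical helper: read a /status payload on stdin and print one line per
# unhealthy agent, as "<agent_id><TAB><last_error>".
unhealthy_agents() {
  jq -r '.agents[] | select(.healthy == false) | [.agent_id, (.last_error // "unknown")] | @tsv'
}

# Against a live endpoint (assumes the port-forward or local server above):
#   curl -s localhost:51200/status | unhealthy_agents

# Offline demonstration with a trimmed copy of the sample payload.
sample='{"agents":{"nr-otel-collector":{"agent_id":"nr-otel-collector","healthy":true},"nr-infra-agent":{"agent_id":"nr-infra-agent","healthy":false,"last_error":"process exited with code: exit status: 1"}}}'
if command -v jq >/dev/null 2>&1; then
  printf '%s\n' "$sample" | unhealthy_agents
fi
```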

Testing

General

cargo test --workspace --exclude 'newrelic_agent_control' --all-targets

On Windows, some of the tests need elevated permissions and are ignored by default. They can be executed by adding -- --include-ignored:

cargo test --workspace --exclude 'newrelic_agent_control' --all-targets -- --include-ignored

We have Makefiles containing targets for testing; inspect them for more details. They assume bash is present on the system and are not prepared for PowerShell or cmd. On Windows, Git Bash can be used, for example. If it is installed, the following command starts the bash shell:

.'C:\Program Files\Git\bin\bash.exe'

Feature onhost

Run the tests for the Agent Control library, excluding root-required tests (on-host):

make -C agent-control test/onhost

Run the Agent Control integration tests, excluding root-required tests:

make -C agent-control test/onhost/integration

Tests that require root user

Running tests that require the root user is not straightforward: toolchain installers like rustup tend not to install the Rust toolchain globally on a system, so sudo cargo won't work. An easy way to run the root-required tests is to spin up a container where the user is root and run them there with:

make -C agent-control test/onhost/root/integration

Feature k8s

Run the basic tests, which do not require an existing Kubernetes cluster:

make -C agent-control test/k8s

Tests that require an existing Kubernetes cluster

make -C agent-control test/k8s/integration

Coverage

Generate coverage information easily by running the following make recipe from the root directory (will install cargo-llvm-cov if it's not installed already):

make coverage

By default, this generates a report in lcov format at coverage/lcov.info, which IDEs such as VS Code can read via certain extensions. To change the output format and location, use the variables COVERAGE_OUT_FORMAT and COVERAGE_OUT_FILEPATH:

COVERAGE_OUT_FORMAT=json COVERAGE_OUT_FILEPATH=jcov-info.json make coverage

Profiling

Heap Profiling

Heap profiling is supported via the dhat crate and is gated behind the dhat-heap feature flag. It can be used to detect memory leaks and analyze heap allocations.

To build with heap profiling enabled, use the release-debug profile (which inherits from release but keeps debug symbols) together with the feature flag:

cargo build -p newrelic_agent_control --profile release-debug --features dhat-heap

Running the resulting binary will:

  • Print a short summary to stderr on exit.
  • Write a dhat-heap.json file to the current working directory (or / when running as a systemd service).

Open dhat-heap.json in the DHAT Viewer to inspect the results.

Ad-hoc Profiling

Ad-hoc profiling is also supported via dhat and is gated behind the dhat-ad-hoc feature flag. Unlike heap profiling, it requires manually annotating the code you want to profile; at the time of writing, no annotations are present by default. See the dhat ad-hoc profiling documentation for details on how to instrument your code.

cargo build -p newrelic_agent_control --profile release-debug --features dhat-ad-hoc

CodeQL

CodeQL runs automatically in the GitHub pipelines. To check the results locally, install the tool and download the Rust queries:

brew install codeql
codeql pack download codeql/rust-queries

Build the source-code database:

codeql database create codeql-db \
  --language=rust \
  --command="cargo build" \
  --source-root=. \
  --overwrite

Analyze:

codeql database analyze codeql-db \
  codeql/rust-queries:codeql-suites/rust-security-extended.qls \
  --format=sarif-latest \
  --output=results.sarif

Check the results:

cat results.sarif | jq '.runs[0].results[] | {ruleId, message: .message.text, location: .locations[0].physicalLocation}'
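
When the report is long, a per-rule count gives a faster overview than the full listing. A minimal sketch, assuming the same results.sarif layout that the jq command above reads (findings under .runs[0].results); sarif_rule_counts is a hypothetical helper.

```shell
# Hypothetical helper: count findings per rule in a SARIF file,
# most frequent rule first.
sarif_rule_counts() {
  jq -r '.runs[0].results[].ruleId' "$1" | sort | uniq -c | sort -rn
}

# Usage, once results.sarif exists:
#   sarif_rule_counts results.sarif
```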

Additional information

We maintain separate directories for other documented topics under this docs directory and in other Markdown files throughout the codebase. The latter will be centralized under the docs directory over time. Feel free to check these documents, ask questions, or propose changes!