This repository was archived by the owner on Sep 2, 2025. It is now read-only.

Commit d1b3e21

Merge pull request #1557 from splunk/repo-sync
Pulling refs/heads/main into main
2 parents c31f25b + 8291a98 commit d1b3e21

File tree

5 files changed: +106 additions, -4 deletions


_includes/gdi/otel-receivers-table.rst

Lines changed: 1 addition & 0 deletions

@@ -1,4 +1,5 @@
  * :ref:`apache-receiver`
+ * :ref:`apache-spark-receiver`
  * :ref:`azureeventhub-receiver`
  * :ref:`carbon-receiver`
  * :ref:`cloudfoundry-receiver`

gdi/monitors-databases/apache-spark.rst

Lines changed: 4 additions & 4 deletions

@@ -4,11 +4,11 @@ Apache Spark
 ============

 .. meta::
-   :description: Use this Splunk Observability Cloud integration for the Apache Sparck clusters monitor. See benefits, install, configuration, and metrics
+   :description: Use this Splunk Observability Cloud integration for the Apache Spark clusters monitor. See benefits, install, configuration, and metrics

-The Splunk Distribution of OpenTelemetry Collector uses the Smart Agent receiver with the
-Apache Spark monitor type to monitor Apache Spark clusters. It does not
-support fetching metrics from Spark Structured Streaming.
+.. note:: If you're using the Splunk Distribution of the OpenTelemetry Collector and want to collect Apache Spark cluster metrics, use the native OTel component :ref:`apache-spark-receiver`.
+
+The Splunk Distribution of the OpenTelemetry Collector uses the Smart Agent receiver with the Apache Spark monitor type to monitor Apache Spark clusters. It does not support fetching metrics from Spark Structured Streaming.

 For the following cluster modes, the integration only supports HTTP
 endpoints:

gdi/opentelemetry/components.rst

Lines changed: 3 additions & 0 deletions

@@ -52,6 +52,9 @@ The Splunk Distribution of the OpenTelemetry Collector includes and supports the
    * - :ref:`apache-receiver` (``apache``)
      - Fetches stats from an Apache Web Server.
      - Metrics
+   * - :ref:`apache-spark-receiver` (``apachespark``)
+     - Fetches metrics for an Apache Spark cluster through the Apache Spark REST API.
+     - Metrics
    * - :ref:`azureeventhub-receiver` (``azureeventhub``)
      - Pulls logs from an Azure event hub.
      - Logs

gdi/opentelemetry/components/a-components-receivers.rst

Lines changed: 1 addition & 0 deletions

@@ -13,6 +13,7 @@ Collector components: Receivers
    :hidden:

    apache-receiver
+   apache-spark-receiver
    azureeventhub-receiver
    carbon-receiver
    cloudfoundry-receiver
Lines changed: 97 additions & 0 deletions

@@ -0,0 +1,97 @@
.. _apache-spark-receiver:

*******************************
Apache Spark receiver
*******************************

.. meta::
   :description: The Apache Spark receiver fetches metrics for an Apache Spark cluster through the Apache Spark REST API.

The Apache Spark receiver monitors Apache Spark clusters and the applications running on them through the collection of performance metrics like memory utilization, CPU utilization, shuffle operations, and more. The supported pipeline type is ``metrics``. See :ref:`otel-data-processing` for more information.

.. note:: Out-of-the-box dashboards and navigators aren't supported for the Apache Spark receiver yet, but are planned for a future release.

The receiver retrieves metrics through the Apache Spark REST API using the following endpoints: ``/metrics/json``, ``/api/v1/applications/[app-id]/stages``, ``/api/v1/applications/[app-id]/executors``, and ``/api/v1/applications/[app-id]/jobs``.
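As a rough illustration of the endpoint list above, the per-application paths follow one pattern with the application ID substituted in. This is a hypothetical sketch, not the receiver's actual implementation; the base URL and application ID are made up:

```go
package main

import "fmt"

// sparkEndpoints returns the REST API paths described above for one Spark
// application. Illustrative only; not code from the receiver itself.
func sparkEndpoints(base, appID string) []string {
	return []string{
		base + "/metrics/json",
		fmt.Sprintf("%s/api/v1/applications/%s/stages", base, appID),
		fmt.Sprintf("%s/api/v1/applications/%s/executors", base, appID),
		fmt.Sprintf("%s/api/v1/applications/%s/jobs", base, appID),
	}
}

func main() {
	// "app-20240101000000-0001" is a made-up application ID.
	for _, u := range sparkEndpoints("http://localhost:4040", "app-20240101000000-0001") {
		fmt.Println(u)
	}
}
```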
Prerequisites
======================

This receiver supports Apache Spark version 3.3.2 or higher.

Get started
======================

Follow these steps to configure and activate the component:

1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform:

   - :ref:`otel-install-linux`
   - :ref:`otel-install-windows`
   - :ref:`otel-install-k8s`

2. Configure the receiver as described in the next section.
3. Restart the Collector.
Sample configuration
--------------------------------

To activate the Apache Spark receiver, add ``apachespark`` to the ``receivers`` section of your configuration file:

.. code-block:: yaml

   receivers:
     apachespark:
       collection_interval: 60s
       endpoint: http://localhost:4040
       application_names:
         - PythonStatusAPIDemo
         - PythonLR
To complete the configuration, include the receiver in the ``metrics`` pipeline of the ``service`` section of your configuration file:

.. code-block:: yaml

   service:
     pipelines:
       metrics:
         receivers: [apachespark]
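Putting the two fragments together, a complete minimal configuration file might look like the following sketch. The ``signalfx`` exporter and its environment variables are assumptions for illustration; use whichever exporter your deployment already defines:

```yaml
receivers:
  apachespark:
    collection_interval: 60s
    endpoint: http://localhost:4040

exporters:
  # Assumed exporter for illustration; substitute your own.
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    realm: "${SPLUNK_REALM}"

service:
  pipelines:
    metrics:
      receivers: [apachespark]
      exporters: [signalfx]
```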
Configuration options
-----------------------

The following settings are optional:

* ``collection_interval``. ``60s`` by default. Sets the interval at which this receiver collects metrics.

  * This value must be a string readable by Golang's ``time.ParseDuration``. Learn more at Go's official documentation: :new-page:`ParseDuration function <https://pkg.go.dev/time#ParseDuration>`.

  * Valid time units are ``ns``, ``us`` (or ``µs``), ``ms``, ``s``, ``m``, and ``h``.

  * .. include:: /_includes/gdi/collector-settings-initialdelay.rst

* ``endpoint``. ``http://localhost:4040`` by default. The Apache Spark endpoint to connect to, in the form ``[http][://]{host}[:{port}]``.

* ``application_names``. An array of Spark application names to collect metrics from. If no application names are specified, metrics are collected for all Spark applications running on the cluster at the specified endpoint.
Settings
======================

The full list of settings exposed for this receiver is documented in the :new-page:`Apache Spark receiver config repo <https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/apachesparkreceiver/config.go>` in GitHub.

Metrics
======================

The following metrics, resource attributes, and attributes are available.

.. note:: The SignalFx exporter excludes some available metrics by default. Learn more about default metric filters in :ref:`list-excluded-metrics`.

.. raw:: html

   <div class="metrics-component" category="included" url="https://raw.githubusercontent.com/splunk/collector-config-tools/main/metric-metadata/apachesparkreceiver.yaml"></div>

.. include:: /_includes/activate-deactivate-native-metrics.rst

Troubleshooting
======================

.. include:: /_includes/troubleshooting-components.rst
