.. _apache-spark-receiver:

*******************************
Apache Spark receiver
*******************************

.. meta::
      :description: The Apache Spark receiver fetches metrics for an Apache Spark cluster through the Apache Spark REST API.

The Apache Spark receiver monitors Apache Spark clusters and the applications running on them by collecting performance metrics such as memory utilization, CPU utilization, shuffle operations, and more. The supported pipeline type is ``metrics``. See :ref:`otel-data-processing` for more information.

.. note:: Out-of-the-box dashboards and navigators aren't supported for the Apache Spark receiver yet, but are planned for a future release.

The receiver retrieves metrics through the Apache Spark REST API using the following endpoints: ``/metrics/json``, ``/api/v1/applications/[app-id]/stages``, ``/api/v1/applications/[app-id]/executors``, and ``/api/v1/applications/[app-id]/jobs``.

Prerequisites
======================

This receiver supports Apache Spark version 3.3.2 or higher.

Get started
======================

Follow these steps to configure and activate the component:

1. Deploy the Splunk Distribution of the OpenTelemetry Collector to your host or container platform:

   - :ref:`otel-install-linux`
   - :ref:`otel-install-windows`
   - :ref:`otel-install-k8s`

2. Configure the receiver as described in the next section.
3. Restart the Collector.

Sample configuration
--------------------------------

To activate the Apache Spark receiver, add ``apachespark`` to the ``receivers`` section of your configuration file:

.. code-block:: yaml

  receivers:
    apachespark:
      collection_interval: 60s
      endpoint: http://localhost:4040
      application_names:
      - PythonStatusAPIDemo
      - PythonLR

To complete the configuration, include the receiver in the ``metrics`` pipeline of the ``service`` section of your configuration file:

.. code-block:: yaml

  service:
    pipelines:
      metrics:
        receivers: [apachespark]

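A valid metrics pipeline also needs at least one exporter. The following sketch shows the receiver wired into a complete configuration; the ``signalfx`` exporter and the ``SPLUNK_ACCESS_TOKEN`` environment variable are assumptions based on a typical Splunk distribution setup, so substitute whichever exporter your deployment uses:

.. code-block:: yaml

  receivers:
    apachespark:
      endpoint: http://localhost:4040

  exporters:
    # Assumption: a SignalFx exporter configured with your realm and access token
    signalfx:
      access_token: ${SPLUNK_ACCESS_TOKEN}
      realm: us0

  service:
    pipelines:
      metrics:
        receivers: [apachespark]
        exporters: [signalfx]
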
 | 59 | +Configuration options  | 
 | 60 | +-----------------------  | 
 | 61 | + | 
 | 62 | +The following settings are optional:  | 
 | 63 | + | 
 | 64 | +* ``collection_interval``. ``60s`` by default. Sets the interval this receiver collects metrics on.   | 
 | 65 | +    | 
 | 66 | +  * This value must be a string readable by Golang's ``time.ParseDuration``. Learn more at Go's official documentation :new-page:`ParseDuration function <https://pkg.go.dev/time#ParseDuration>`.  | 
 | 67 | +    | 
 | 68 | +  * Valid time units are ``ns``, ``us`` (or ``µs``), ``ms``, ``s``, ``m``, ``h``.  | 
 | 69 | + | 
 | 70 | +* .. include:: /_includes/gdi/collector-settings-initialdelay.rst  | 
 | 71 | + | 
 | 72 | +* ``endpoint``. ``http://localhost:4040`` by default. Apache Spark endpoint to connect to in the form of ``[http][://]{host}[:{port}]``.  | 
 | 73 | + | 
 | 74 | +* ``application_names``. An array of Spark application names for which metrics are collected from. If no application names are specified, metrics are collected for all Spark applications running on the cluster at the specified endpoint.  | 
 | 75 | + | 
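Putting these options together, a receiver entry that polls a nondefault endpoint every five minutes for a single application might look like the following sketch; the host name and application name are placeholders:

.. code-block:: yaml

  receivers:
    apachespark:
      # 5m is a valid time.ParseDuration string
      collection_interval: 5m
      # Placeholder host: use the host and port of your Spark web UI
      endpoint: http://spark-master:4040
      application_names:
      - MySparkApp
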
Settings
======================

The full list of settings exposed for this receiver is documented in the :new-page:`Apache Spark receiver config repo <https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/apachesparkreceiver/config.go>` on GitHub.

Metrics
======================

The following metrics, resource attributes, and attributes are available.

.. note:: The SignalFx exporter excludes some available metrics by default. Learn more about default metric filters in :ref:`list-excluded-metrics`.

.. raw:: html

  <div class="metrics-component" category="included" url="https://raw.githubusercontent.com/splunk/collector-config-tools/main/metric-metadata/apachesparkreceiver.yaml"></div>

.. include:: /_includes/activate-deactivate-native-metrics.rst

Troubleshooting
======================

.. include:: /_includes/troubleshooting-components.rst