
Not found (404) when calling GET http://<databand-server>/api/v1/integrations/config?type=*** #86

@dancristi4n

Describe
The Airflow Monitor DAG gets a not found error (HTTP 404) when it calls the Databand API endpoint "http://<databand-server>/api/v1/integrations/config?type=***".
It isn't a communication issue between Airflow and Databand, because earlier calls to other API endpoints succeed.
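
To check the endpoint outside of Airflow, here is a minimal sketch that builds the same URL and reports the status code. The path comes from the traceback in this report. The helper names `build_config_url` and `probe_status` are hypothetical, and sending the token as a Bearer header is an assumption: the real dbnd client may authenticate differently. The `type` value is masked as `***` in the log, so it is left as a parameter.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Path copied from the failing request in the traceback.
CONFIG_PATH = "/api/v1/integrations/config"


def build_config_url(databand_url: str, integration_type: str) -> str:
    """Join the server base URL, the config path, and the `type` query parameter."""
    base = databand_url.rstrip("/")
    return f"{base}{CONFIG_PATH}?{urlencode({'type': integration_type})}"


def probe_status(url: str, access_token: str, timeout: float = 10.0) -> int:
    """Send one GET and return the HTTP status code, including error codes such as 404."""
    # ASSUMPTION: the token is sent as a Bearer header; dbnd's own client may differ.
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the code is still what we want to report.
        return error.code
```

Calling `probe_status(build_config_url(databand_url, integration_type), token)` from inside the Airflow container, with the values from the `dbnd_config` connection, should show whether the 404 also occurs without the monitor in between.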

To reproduce

  1. Create a requirements.txt file with content:
dbnd==1.0.14.1
dbnd-spark==1.0.14.1
dbnd-airflow==1.0.14.1
dbnd-airflow-auto-tracking==1.0.14.1
  2. Create a Dockerfile to build the Airflow image:
FROM apache/airflow:latest
ADD requirements.txt .
RUN pip install apache-airflow==${AIRFLOW_VERSION} -r requirements.txt
  3. Create directories:
mkdir -p ./dags ./logs ./plugins ./config
  4. Get the official docker-compose file: https://airflow.apache.org/docs/apache-airflow/2.8.1/docker-compose.yaml

  5. In docker-compose.yaml, comment out line 52 and uncomment line 53:

...
  # image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.8.1}
  build: .
...
  6. Create the containers:
docker compose up airflow-init
docker compose up -d
  7. In Databand, create an Airflow Syncer (Settings > Airflow Syncers > Add new Syncer)
    Connection URL: http://<docker_host>:8080
    Syncer name: Any_syncer
    Step 3 of the syncer setup gives you a JSON to use in the next step.

  8. In Airflow, under Admin > Connections, create a new connection
    connection id: dbnd_config
    connection type: HTTP
    Extra:

{
  "airflow_monitor": {
    "dag_ids": "",
    "is_sync_enabled": true,
    "syncer_name": "Any_syncer"
  },
  "core": {
    "databand_url": "http://<databand-server>",
    "databand_access_token": "eyJ0eXAiOiJKV...."
  },
  "log": {
    "preview_head_bytes": 8192,
    "preview_tail_bytes": 8192
  },
  "tracking": {
    "track_source_code": false
  }
}
  9. Create a DAG file "databand_airflow_monitor.py" in the dags folder:
from airflow_monitor.monitor_as_dag import get_monitor_dag
## This DAG is used by Databand to monitor your Airflow installation.
dag = get_monitor_dag()
  10. In Airflow, unpause the DAG "databand_airflow_monitor" and check the task logs:
ec2f5435d5c4
*** Found local files:
***   * /opt/airflow/logs/dag_id=databand_airflow_monitor/run_id=scheduled__2024-02-01T17:21:00+00:00/task_id=monitor/attempt=3.log
*** Found logs served from host http://ec2f5435d5c4:8793/log/dag_id=databand_airflow_monitor/run_id=scheduled__2024-02-01T17:21:00+00:00/task_id=monitor/attempt=3.log
[2024-02-07, 18:43:24 UTC] {dbnd_airflow_handler.py:117} INFO - Databand Tracking Started 1.0.14.1
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:24 UTC] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [queued]>
[2024-02-07, 18:43:24 UTC] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [queued]>
[2024-02-07, 18:43:24 UTC] {taskinstance.py:2170} INFO - Starting attempt 3 of 13
[2024-02-07, 18:43:24 UTC] {taskinstance.py:2191} INFO - Executing <Task(MonitorOperator): monitor> on 2024-02-01 17:21:00+00:00
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:60} INFO - Started process 358 to run task
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:87} INFO - Running: ['***', 'tasks', 'run', 'databand_***_monitor', 'monitor', 'scheduled__2024-02-01T17:21:00+00:00', '--job-id', '24596', '--raw', '--subdir', 'DAGS_FOLDER/databand_***_monitor.py', '--cfg-path', '/tmp/tmpyjarxxti']
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:88} INFO - Job 24596: Subtask monitor
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {task_command.py:423} INFO - Running <TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [running]> on host ec2f5435d5c4
[2024-02-07, 18:43:25 UTC] {taskinstance.py:2480} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='Databand' AIRFLOW_CTX_DAG_ID='databand_***_monitor' AIRFLOW_CTX_TASK_ID='monitor' AIRFLOW_CTX_EXECUTION_DATE='2024-02-01T17:21:00+00:00' AIRFLOW_CTX_TRY_NUMBER='3' AIRFLOW_CTX_DAG_RUN_ID='scheduled__2024-02-01T17:21:00+00:00' AIRFLOW_CTX_UID='ff2088ba-35ab-5eba-870b-c5b30b238094'
[2024-02-07, 18:43:25 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {tracking_store_console.py:98} INFO - Tracking monitor task at http://192.168.0.195:8080/app/jobs/databand_***_monitor/29c4661a-0e21-5fb3-a5ba-94735dbf3cc1/084620bd-cb4a-5287-a09a-eb3495a26f3f
[2024-02-07, 18:43:25 UTC] {monitor_as_dag.py:194} INFO - Running memory guard with the limit=8589934592, checking every 10 seconds
[2024-02-07, 18:43:25 UTC] {subprocess.py:63} INFO - Tmp dir root location: /tmp
[2024-02-07, 18:43:25 UTC] {subprocess.py:75} INFO - Running command: ['/usr/bin/bash', '-c', '/usr/local/bin/python -m dbnd ***-monitor-v2  --interval 10  --stop-after 10800 ']
[2024-02-07, 18:43:25 UTC] {subprocess.py:86} INFO - Output:
[2024-02-07, 18:43:25 UTC] {monitor_as_dag.py:209} INFO - Memory usage changed from: 0 mb to 208 mb
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,904] INFO - Starting Databand 1.0.14.1!
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - 	DBND_HOME=/root
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - 	DBND_SYSTEM=/root/.dbnd
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,904] INFO - Reading configuration from:
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - 	/home/***/.local/lib/python3.8/site-packages/dbnd/conf/databand-core.cfg
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - 
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,907] INFO - Running validations
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,908] INFO - All required configurations exist
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,915] INFO - All dbnd packages required for monitor exist
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,915] INFO - All dbnd packages required for tracking exist
[2024-02-07, 18:43:27 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:27,188] INFO - Reading the config from /opt/***/***.cfg
[2024-02-07, 18:43:27 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:27,364] INFO - Configured default timezone UTC
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - [2024-02-07T18:43:28.315+0000] {validations.py:98} INFO - Airflow 2.0 support is set
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:28,586] ERROR ***_monitor.shared.multiserver 360 MainThread : Unknown exception during iteration
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Traceback (most recent call last):
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/multiserver.py", line 167, in run
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     self.run_once()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/multiserver.py", line 183, in run_once
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     ] = self.integration_management_service.get_all_servers_configuration(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 241, in wrapped_f
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return self.call(f, *args, **kw)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 329, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     do = self.iter(result=result, exc_info=exc_info,
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 297, in iter
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     raise retry_exc.reraise()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 136, in reraise
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     raise self.last_attempt.result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return self.__get_result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     raise self._exception
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 333, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     result = fn(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/***_monitor/common/metric_reporter.py", line 77, in wrapped
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return f(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 241, in wrapped_f
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return self.call(f, *args, **kw)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 329, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     do = self.iter(result=result, exc_info=exc_info,
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 279, in iter
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return fut.result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     return self.__get_result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     raise self._exception
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 333, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     result = fn(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/integration_management_service.py", line 68, in get_all_servers_configuration
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     response = self._api_client.api_request(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/utils/api_client.py", line 254, in api_request
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     resp = self._request(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -   File "/home/***/.local/lib/python3.8/site-packages/dbnd/utils/api_client.py", line 152, in _request
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -     raise DatabandApiError(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - dbnd._core.errors.base.DatabandApiError: Call failed to endpoint GET http://<databand-server>/api/v1/integrations/config?type=***
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Response code: 404
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Server error: <!doctype html>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - 	<html lang=en>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - 	<title>404 Not Found</title>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - 	<h1>Not Found</h1>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - 	<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - 
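
The relevant part of the log above is the last few lines: the failing endpoint and the response code. As a rough sketch for pulling those two facts out of a saved task log, the snippet below matches the message format seen in this report. The function name `extract_api_failure` is hypothetical, and the regular expressions assume the wording of `DatabandApiError` stays as shown here.

```python
import re
from typing import Optional, Tuple

# Patterns follow the wording of the error lines in the log above.
ENDPOINT_RE = re.compile(r"Call failed to endpoint (?P<method>[A-Z]+) (?P<url>\S+)")
STATUS_RE = re.compile(r"Response code: (?P<code>\d{3})")


def extract_api_failure(log_text: str) -> Optional[Tuple[str, str, int]]:
    """Return (method, url, status) for the first failed API call, or None."""
    endpoint = ENDPOINT_RE.search(log_text)
    if endpoint is None:
        return None
    # Only look for the status code after the endpoint line.
    status = STATUS_RE.search(log_text, endpoint.end())
    if status is None:
        return None
    return endpoint["method"], endpoint["url"], int(status["code"])
```

Run against the attempt log file listed at the top of the output, this makes it easy to confirm that every retry fails on the same endpoint with the same status.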

Expected behavior
The endpoint call succeeds and execution metrics appear in the Databand UI.
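
To rule out the connection settings as the cause, the Extra JSON of the `dbnd_config` connection from the reproduction steps can be sanity-checked with a small sketch like the one below. The list of expected keys is an assumption inferred from the example Extra in this report, not from dbnd documentation, and `check_extra` is a hypothetical helper. The path check is only a hint: a base URL with a path may be legitimate in some deployments, but it would change where `/api/v1/...` is resolved.

```python
import json
from typing import List
from urllib.parse import urlsplit

# ASSUMPTION: keys the monitor needs, inferred from the example Extra in this report.
EXPECTED_KEYS = {
    "core": ["databand_url", "databand_access_token"],
    "airflow_monitor": ["syncer_name"],
}


def check_extra(extra_json: str) -> List[str]:
    """Return a list of human-readable problems found in the connection Extra."""
    problems = []
    extra = json.loads(extra_json)

    # Report every expected key that is absent or empty.
    for section, keys in EXPECTED_KEYS.items():
        values = extra.get(section, {})
        for key in keys:
            if not values.get(key):
                problems.append(f"missing or empty: {section}.{key}")

    # The API path is appended to this URL, so its shape matters.
    url = extra.get("core", {}).get("databand_url", "")
    if url:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append("core.databand_url is not an absolute http(s) URL")
        elif parts.path not in ("", "/"):
            problems.append(f"core.databand_url has a path component: {parts.path}")
    return problems
```

An empty list means the Extra has the shape used in this report; it does not prove that the server exposes the endpoint.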
