Description
Describe the bug
The Airflow Monitor DAG gets a Not Found error (HTTP 404) when it calls the Databand API endpoint "http://<databand-server>/api/v1/integrations/config?type=***".
This is not a connectivity issue between Airflow and Databand, because earlier calls to other API endpoints succeed.
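To check the failing call outside of Airflow, a minimal sketch like the one below can request the same path directly and report the status code. It assumes the server accepts the access token as a Bearer header, which is not confirmed by this report; `build_config_url` and `probe` are hypothetical helper names, and the `type` value is masked in the log, so pass whatever your setup uses.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_config_url(databand_url: str, integration_type: str) -> str:
    """Join the server base URL with the path seen in the error message."""
    base = databand_url.rstrip("/")
    query = urllib.parse.urlencode({"type": integration_type})
    return f"{base}/api/v1/integrations/config?{query}"


def probe(url: str, access_token: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    # Assumption: the token is sent as a Bearer header.
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is what we want.
        return err.code
```

Calling `probe(build_config_url(databand_url, integration_type), token)` with the values from the `dbnd_config` connection should return 404 if the problem is on the server side rather than in the Airflow setup.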
To reproduce
- Create a requirements.txt file with content:
dbnd==1.0.14.1
dbnd-spark==1.0.14.1
dbnd-airflow==1.0.14.1
dbnd-airflow-auto-tracking==1.0.14.1
- Create a Dockerfile to build the Airflow image:
FROM apache/airflow:latest
ADD requirements.txt .
RUN pip install apache-airflow==${AIRFLOW_VERSION} -r requirements.txt
- Create directories:
mkdir -p ./dags ./logs ./plugins ./config
- Get the official docker-compose file: https://airflow.apache.org/docs/apache-airflow/2.8.1/docker-compose.yaml
- Edit docker-compose.yml: comment out line 52 and uncomment line 53
...
# image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.8.1}
build: .
...
- Create the containers
docker compose up airflow-init
docker compose up -d
- In Databand, create an Airflow Syncer (Settings > Airflow Syncers > Add new Syncer):
Connection URL: http://<docker_host>:8080
Syncer name: Any_syncer
Step 3 gives you a JSON document to use in the next step.
- In Airflow, under Admin > Connections, create a new connection:
connection id: dbnd_config
connection type: HTTP
Extra:
{
"airflow_monitor": {
"dag_ids": "",
"is_sync_enabled": true,
"syncer_name": "Any_syncer"
},
"core": {
"databand_url": "http://<databand-server>",
"databand_access_token": "eyJ0eXAiOiJKV...."
},
"log": {
"preview_head_bytes": 8192,
"preview_tail_bytes": 8192
},
"tracking": {
"track_source_code": false
}
}
- Create a DAG file "databand_airflow_monitor.py" in the dags folder:
from airflow_monitor.monitor_as_dag import get_monitor_dag
## This DAG is used by Databand to monitor your Airflow installation.
dag = get_monitor_dag()
- In Airflow, unpause the DAG "databand_airflow_monitor" and check the task logs:
ec2f5435d5c4
*** Found local files:
*** * /opt/airflow/logs/dag_id=databand_airflow_monitor/run_id=scheduled__2024-02-01T17:21:00+00:00/task_id=monitor/attempt=3.log
*** Found logs served from host http://ec2f5435d5c4:8793/log/dag_id=databand_airflow_monitor/run_id=scheduled__2024-02-01T17:21:00+00:00/task_id=monitor/attempt=3.log
[2024-02-07, 18:43:24 UTC] {dbnd_airflow_handler.py:117} INFO - Databand Tracking Started 1.0.14.1
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:24 UTC] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [queued]>
[2024-02-07, 18:43:24 UTC] {taskinstance.py:1956} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [queued]>
[2024-02-07, 18:43:24 UTC] {taskinstance.py:2170} INFO - Starting attempt 3 of 13
[2024-02-07, 18:43:24 UTC] {taskinstance.py:2191} INFO - Executing <Task(MonitorOperator): monitor> on 2024-02-01 17:21:00+00:00
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:60} INFO - Started process 358 to run task
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:87} INFO - Running: ['***', 'tasks', 'run', 'databand_***_monitor', 'monitor', 'scheduled__2024-02-01T17:21:00+00:00', '--job-id', '24596', '--raw', '--subdir', 'DAGS_FOLDER/databand_***_monitor.py', '--cfg-path', '/tmp/tmpyjarxxti']
[2024-02-07, 18:43:24 UTC] {standard_task_runner.py:88} INFO - Job 24596: Subtask monitor
[2024-02-07, 18:43:24 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {task_command.py:423} INFO - Running <TaskInstance: databand_airflow_monitor.monitor scheduled__2024-02-01T17:21:00+00:00 [running]> on host ec2f5435d5c4
[2024-02-07, 18:43:25 UTC] {taskinstance.py:2480} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='Databand' AIRFLOW_CTX_DAG_ID='databand_***_monitor' AIRFLOW_CTX_TASK_ID='monitor' AIRFLOW_CTX_EXECUTION_DATE='2024-02-01T17:21:00+00:00' AIRFLOW_CTX_TRY_NUMBER='3' AIRFLOW_CTX_DAG_RUN_ID='scheduled__2024-02-01T17:21:00+00:00' AIRFLOW_CTX_UID='ff2088ba-35ab-5eba-870b-c5b30b238094'
[2024-02-07, 18:43:25 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {base.py:83} INFO - Using connection ID 'dbnd_config' for task execution.
[2024-02-07, 18:43:25 UTC] {tracking_store_console.py:98} INFO - Tracking monitor task at http://192.168.0.195:8080/app/jobs/databand_***_monitor/29c4661a-0e21-5fb3-a5ba-94735dbf3cc1/084620bd-cb4a-5287-a09a-eb3495a26f3f
[2024-02-07, 18:43:25 UTC] {monitor_as_dag.py:194} INFO - Running memory guard with the limit=8589934592, checking every 10 seconds
[2024-02-07, 18:43:25 UTC] {subprocess.py:63} INFO - Tmp dir root location: /tmp
[2024-02-07, 18:43:25 UTC] {subprocess.py:75} INFO - Running command: ['/usr/bin/bash', '-c', '/usr/local/bin/python -m dbnd ***-monitor-v2 --interval 10 --stop-after 10800 ']
[2024-02-07, 18:43:25 UTC] {subprocess.py:86} INFO - Output:
[2024-02-07, 18:43:25 UTC] {monitor_as_dag.py:209} INFO - Memory usage changed from: 0 mb to 208 mb
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,904] INFO - Starting Databand 1.0.14.1!
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - DBND_HOME=/root
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - DBND_SYSTEM=/root/.dbnd
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,904] INFO - Reading configuration from:
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - /home/***/.local/lib/python3.8/site-packages/dbnd/conf/databand-core.cfg
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO -
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,907] INFO - Running validations
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,908] INFO - All required configurations exist
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,915] INFO - All dbnd packages required for monitor exist
[2024-02-07, 18:43:26 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:26,915] INFO - All dbnd packages required for tracking exist
[2024-02-07, 18:43:27 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:27,188] INFO - Reading the config from /opt/***/***.cfg
[2024-02-07, 18:43:27 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:27,364] INFO - Configured default timezone UTC
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - [2024-02-07T18:43:28.315+0000] {validations.py:98} INFO - Airflow 2.0 support is set
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - [2024-02-07 18:43:28,586] ERROR ***_monitor.shared.multiserver 360 MainThread : Unknown exception during iteration
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Traceback (most recent call last):
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/multiserver.py", line 167, in run
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - self.run_once()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/multiserver.py", line 183, in run_once
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - ] = self.integration_management_service.get_all_servers_configuration(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 241, in wrapped_f
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return self.call(f, *args, **kw)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 329, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - do = self.iter(result=result, exc_info=exc_info,
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 297, in iter
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - raise retry_exc.reraise()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 136, in reraise
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - raise self.last_attempt.result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return self.__get_result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - raise self._exception
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 333, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - result = fn(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/***_monitor/common/metric_reporter.py", line 77, in wrapped
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return f(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 241, in wrapped_f
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return self.call(f, *args, **kw)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 329, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - do = self.iter(result=result, exc_info=exc_info,
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 279, in iter
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return fut.result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - return self.__get_result()
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - raise self._exception
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/_vendor/tenacity/__init__.py", line 333, in call
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - result = fn(*args, **kwargs)
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/***_monitor/shared/integration_management_service.py", line 68, in get_all_servers_configuration
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - response = self._api_client.api_request(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/utils/api_client.py", line 254, in api_request
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - resp = self._request(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - File "/home/***/.local/lib/python3.8/site-packages/dbnd/utils/api_client.py", line 152, in _request
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - raise DatabandApiError(
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - dbnd._core.errors.base.DatabandApiError: Call failed to endpoint GET http://<databand-server>/api/v1/integrations/config?type=***
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Response code: 404
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - Server error: <!doctype html>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - <html lang=en>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - <title>404 Not Found</title>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - <h1>Not Found</h1>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO - <p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>
[2024-02-07, 18:43:28 UTC] {subprocess.py:93} INFO -
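To rule out a malformed `dbnd_config` connection before looking at the server, the Extra JSON from the reproduction steps can be sanity-checked with a small script. This is only a sketch: `check_extra` and `REQUIRED_KEYS` are hypothetical names, and the list of required keys is an assumption based on the JSON shown in this report, not on what dbnd itself validates.

```python
import json

# Assumption: these are the keys this report relies on; dbnd may need others.
REQUIRED_KEYS = {
    "airflow_monitor": ["syncer_name", "is_sync_enabled"],
    "core": ["databand_url", "databand_access_token"],
}


def check_extra(extra_json: str) -> list:
    """Return a list of human-readable problems found in the Extra JSON."""
    try:
        extra = json.loads(extra_json)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(extra, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = extra.get(section)
        if not isinstance(values, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            # False is a valid value (e.g. is_sync_enabled), so only
            # absent or empty-string values are reported.
            if values.get(key) in (None, ""):
                problems.append(f"missing value: {section}.{key}")

    core = extra.get("core")
    url = core.get("databand_url") if isinstance(core, dict) else None
    if isinstance(url, str) and url:
        if not url.startswith(("http://", "https://")):
            problems.append("core.databand_url should start with http:// or https://")
        if "<" in url or ">" in url:
            problems.append("core.databand_url still contains a placeholder")
    return problems
```

An empty list means the Extra field has the structure used in this report. In that case the 404 is more likely related to the server endpoint than to the connection settings.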
Expected behavior
The endpoint call should succeed, and execution metrics should appear in the Databand UI.