We chose this name as we believe it reflects the essence of our capstone project with ArkXJobInTech and DXC.
NiFiPulse is a lightweight on-prem monitoring and alerting solution for Apache NiFi clusters.
It continuously tracks system and pipeline health (CPU, RAM, disk usage, file I/O, NiFi pipeline metrics, and more) and triggers alerts when thresholds are exceeded.
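To illustrate the kind of host snapshot such a collector produces, here is a minimal stdlib-only sketch. The function name and the dictionary fields are hypothetical, not the actual `nifipulse` API, and the real collector also pulls NiFi flow metrics over the NiFi REST API:

```python
import os
import shutil
import time


def sample_host_metrics(path: str = "/") -> dict:
    """Collect a minimal snapshot of host health (illustrative shape only)."""
    disk = shutil.disk_usage(path)          # total/used/free bytes for the mount
    load1, _load5, _load15 = os.getloadavg()  # 1/5/15-minute CPU load averages (Unix)
    return {
        "timestamp_utc": time.time(),
        "disk_total_bytes": disk.total,
        "disk_used_pct": 100.0 * disk.used / disk.total,
        "cpu_load_1m": load1,
    }


snapshot = sample_host_metrics()
```

A real deployment would typically use a library such as `psutil` for richer CPU/RAM/file-I/O counters; this sketch only shows the shape of a metrics sample.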
- Metrics Collection: Gathers CPU, RAM, file system, and NiFi flow stats (see `nifi_flows`).
- NiFi Integration: Connects directly to the NiFi APIs to pull processor and queue metrics.
- Alerting Engine: Sends email, Slack, or webhook alerts based on user-defined thresholds (see `Alerting`).
- Custom Dashboards: Visualizes health trends and performance over time (see `grafana`).
- On-Prem Ready: Designed for environments without external cloud dependencies (see `Docker`).
- Secure Configuration: Credentials and endpoints are managed via `.env` files.
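As a sketch of how `.env`-based configuration can be loaded, here is a stdlib-only parser. The variable names in the usage example (`DB_USER`, `NIFI_URL`) are illustrative, and the project may well use a library such as `python-dotenv` instead:

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Blank lines and lines starting with '#' are skipped.
    This is a minimal stdlib sketch, not the project's actual loader.
    """
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    os.environ.update(values)
    return values
```

With a `.env` containing, say, `NIFI_URL="https://localhost:8443/nifi-api"` (a hypothetical endpoint), calling `load_env()` makes the value available via `os.environ["NIFI_URL"]`.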
After running `docker compose up`, run:
- Run the SQL init inside the container:
  ```shell
  docker exec -it postgres sh -c 'psql -U postgres -d postgres -f /docker-entrypoint-initdb.d/SQL_Script.sql'
  ```
- Verify tables:
  - Ubuntu:
    ```shell
    docker exec -it postgres sh -c 'psql -U postgres -d metrics_db -c "\dt"'
    ```
  - Windows:
    ```shell
    docker exec -it postgres psql -U postgres -d metrics_db -c "\dt"
    ```
- Install dependencies:
  ```shell
  pip install -r requirements.txt
  ```
- Install the package in editable mode:
  ```shell
  pip install -e .
  ```
- Run the ETL:
  ```shell
  nifipulse --poll <number_of_polls_10_by_default_0_for_infinite>
  ```
- Run a quick sanity check:
  - Ubuntu:
    ```shell
    docker exec -it postgres sh -c 'psql -U postgres -d metrics_db -c "
    SELECT f.fact_id, d.timestamp_utc, i.instance_name, m.metric_name, c.component_name, f.value
    FROM fact_metrics f
    JOIN dim_date d ON d.date_id = f.date_id
    JOIN dim_instance i ON i.instance_id = f.instance_id
    JOIN dim_metric m ON m.metric_id = f.metric_id
    JOIN dim_component c ON c.component_id = f.component_id
    ORDER BY d.timestamp_utc DESC
    LIMIT 20;"'
    ```
  - Windows:
    ```shell
    docker exec -it postgres psql -U postgres -d metrics_db -c "
    SELECT f.fact_id, d.timestamp_utc, i.instance_name, m.metric_name, c.component_name, f.value
    FROM fact_metrics f
    JOIN dim_date d ON d.date_id = f.date_id
    JOIN dim_instance i ON i.instance_id = f.instance_id
    JOIN dim_metric m ON m.metric_id = f.metric_id
    JOIN dim_component c ON c.component_id = f.component_id
    ORDER BY d.timestamp_utc DESC
    LIMIT 20;"
    ```
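The `--poll` flag above controls how many collection cycles the ETL runs (default 10, 0 for infinite). As a rough sketch of those semantics only, with a hypothetical helper that is not the actual `nifipulse` entry point:

```python
import itertools
import time
from typing import Callable


def run_polls(collect: Callable[[], dict], polls: int = 10, interval_s: float = 0.0) -> int:
    """Run `collect` for `polls` cycles; polls=0 means poll forever.

    Mirrors the documented CLI behavior of `nifipulse --poll`;
    the function itself is an illustrative sketch, not the real code.
    """
    # An infinite counter models polls=0; otherwise a bounded range.
    cycles = itertools.count() if polls == 0 else range(polls)
    done = 0
    for _ in cycles:
        collect()            # one metrics-collection cycle
        done += 1
        if interval_s:
            time.sleep(interval_s)  # optional pause between cycles
    return done
```

For example, `run_polls(my_collector, polls=10)` runs exactly ten cycles, while `polls=0` loops until the process is stopped.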
Our CI runs on every push to dev and on PRs to staging/main. To reduce waste:
- CI auto-skips docs and image changes via `paths-ignore`.
- Optional: add `[skip ci]` to the commit message (push), or to the PR title/body (pull_request), for one-off skips.
Use `[skip ci]` only for:
- Docs-only edits (README, Alerting.md, state_of_art PDFs).
- Image asset updates under `images/`.
- Non-code text changes (typos, formatting).
Do NOT use `[skip ci]` when changing:
- Python code (`nifipulse/**`), tests (`tests/**`), or dependencies (`pyproject.toml`, `requirements*.txt`).
- CI/CD files (`.github/workflows/**`), Dockerfile/compose, SQL schema, or Prometheus/Grafana configs.
Examples:
- Commit:
  ```shell
  git commit -m "docs: update Alerting.md [skip ci]" && git push
  ```
- PR title: `docs: update dashboards [skip ci]`
CI details:
- Concurrency cancels superseded runs to save compute.
- Pip caching speeds up installs; lint is advisory (`continue-on-error: true`) until we enforce style.
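The CI behaviors above can be expressed in a GitHub Actions workflow roughly as follows. This is an illustrative excerpt, not the repo's actual workflow file; the job name and ignored paths are examples:

```yaml
# Illustrative excerpt of a workflow using paths-ignore and concurrency
on:
  push:
    branches: [dev]
    paths-ignore:
      - '**.md'
      - 'images/**'
  pull_request:
    branches: [staging, main]

# Cancel superseded runs for the same branch/PR to save compute
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    continue-on-error: true   # lint is advisory until style is enforced
    steps:
      - uses: actions/checkout@v4
```

Note that GitHub Actions honors `[skip ci]` in commit messages natively for push events; skipping on a PR title/body requires an explicit condition in the workflow.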
NiFi Registry config
- Run `docker exec -it nifi /bin/bash`, then `ls -l /opt/nifi/nifi-current/data/outgoing` to list simulated data files from NiFi.
- Docker setup for NiFi and Prometheus
- Prometheus job configuration file
- Grafana dashboards configuration
- Grafana datasource configuration
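A Prometheus job configuration typically points a scrape job at the NiFi metrics endpoint. The fragment below is only illustrative of that shape; the job name, target host, port, and interval are assumptions, not the repo's actual config:

```yaml
# Illustrative prometheus.yml scrape job (names, targets, and port are assumptions)
scrape_configs:
  - job_name: 'nifi'
    scrape_interval: 15s
    static_configs:
      - targets: ['nifi:9092']   # e.g. a NiFi Prometheus reporting endpoint
```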
- Run `git clone https://github.com/DXC-DP-Monitoring/NiFiPulse.git` on your local machine, in your preferred folder.
- Run `git branch` to make sure you are on `main`; if not, run `git checkout main`.
- Run `git pull origin main`; this fetches updates from the remote repo (origin) and merges them into your local `main` branch.
- To keep the history linear, use `git fetch origin` followed by `git rebase origin/main` (no merge).
This project is licensed under the Apache License 2.0. See the LICENSE file for the full license text.
Copyright (c) 2025 Amina BOUHAMRA, Fadwa EL AMRAOUI, Nawar TOUMI, Soukayna BOUCETTA
