A full-stack web application for analyzing and assessing the quality of high-frequency data collected from Industrial Internet of Things (IIoT) sensors.
This application consists of:
- FastAPI Backend: RESTful API for data processing and analytics
- React Frontend: Modern, responsive dashboard interface
- TimescaleDB Integration: PostgreSQL with TimescaleDB extension for time-series data optimization
- DQA Worker Service: Background service for data aggregation and quality assessment
- Modern Web Interface: React-based responsive dashboard with professional UI/UX
- Data Import API: Upload and import CSV sensor data files via RESTful API
  - File validation and preview
  - Automatic machine type detection
  - Background processing with progress tracking
  - Direct import to TimescaleDB raw_sensor_data table
- Data Loading: Interactive data source selection and preprocessing with real-time previews
- TimescaleDB Integration: Leverage hypertables, compression, and continuous aggregates for efficient time-series data processing
- Advanced Analytics: Comprehensive visualization analytics including:
  - Summary statistics and correlation analysis
  - Time series analysis with trend detection
  - Histogram and density plots
  - Box plots and seasonal decomposition
  - Anomaly detection using statistical methods
- Data Visualization: Interactive charts and graphs using modern charting libraries
- Missing Values Analysis: Detect and handle missing values in the raw sensor dataset
- Invalid Values Analysis: Identify and analyze invalid readings or alarms from sensors
- Data Quality Assessment: Comprehensive data quality metrics and visualizations
- RESTful API: Well-documented FastAPI backend with automatic OpenAPI documentation
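The feature list mentions anomaly detection using statistical methods without naming one. As a minimal sketch, assuming a plain z-score rule (not necessarily the method the application uses), readings far from the mean could be flagged as follows. The function name `zscore_anomalies` and the sample values are illustrative only.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of readings whose absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # constant signal: no reading stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Illustrative pressure readings with one obvious spike at index 6.
readings = [6.1, 6.0, 6.2, 5.9, 6.1, 6.0, 12.5, 6.1, 6.0, 5.9]

# A short series caps the largest possible z-score, so the example
# uses a lower threshold than the default of 3.0.
spikes = zscore_anomalies(readings, threshold=2.5)
```

On long, high-frequency series the default threshold of 3.0 is a common starting point; on very short windows a single spike inflates the standard deviation enough that a lower threshold is needed.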
┌─────────────────┐    HTTP/REST     ┌──────────────────┐    SQLAlchemy    ┌─────────────────┐
│ React Frontend  │ ◄──────────────► │ FastAPI Backend  │ ◄──────────────► │   TimescaleDB   │
│                 │                  │   (Port 8000)    │                  │   (Port 5432)   │
└─────────────────┘                  └──────────────────┘                  └─────────────────┘
                                             │ ▲                                   │ ▲
                                             │ │                                   │ │
                                             │ │                                   │ │
                                             ▼ │                                   ▼ │
                                     ┌──────────────────┐                  ┌──────────────────┐
                                     │ DQA Worker       │                  │ raw_sensor_data  │
                                     │ (Background)     │                  │ aggr_insights    │
                                     └──────────────────┘                  └──────────────────┘
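The diagram places a background DQA Worker next to the raw_sensor_data and aggr_insights tables. A rough sketch of the kind of aggregation such a worker might perform, assuming fixed one-minute buckets and a reducer chosen by name. The bucket size, the `RULES` mapping, and the `aggregate` function are hypothetical and are not taken from the worker's code.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical mapping from an aggregation rule name to a reducer.
RULES = {"min": min, "max": max, "mean": mean}

# Fixed reference point used to floor timestamps to bucket boundaries.
EPOCH = datetime(1970, 1, 1)


def aggregate(readings, rule, bucket=timedelta(minutes=1)):
    """Group (timestamp, value) pairs into fixed buckets and reduce each one."""
    reducer = RULES[rule]
    buckets = defaultdict(list)
    for ts, value in readings:
        # Floor the timestamp to the start of its bucket.
        start = EPOCH + (ts - EPOCH) // bucket * bucket
        buckets[start].append(value)
    return {start: reducer(vals) for start, vals in sorted(buckets.items())}


raw = [
    (datetime(2024, 1, 1, 0, 0, 5), 6.2),
    (datetime(2024, 1, 1, 0, 0, 40), 5.8),
    (datetime(2024, 1, 1, 0, 1, 10), 6.4),
]
result = aggregate(raw, "min")
```

In the running system this work would more plausibly be delegated to TimescaleDB itself (for example through continuous aggregates); the pure-Python version only illustrates the bucketing logic.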
- Python 3.9+ for backend development
- Node.js 18+ for frontend development
- Docker and Docker Compose for containerized services
- TimescaleDB (PostgreSQL with TimescaleDB extension) - automatically set up via Docker Compose
- Sensor Metadata in CSV format (see Data Description Requirements below)
Provide a CSV file describing the sensors. The file must contain the columns listed below, each described with its expected content:
- TAG
  - Description: A unique identifier for each sensor.
  - Example: 22PI102
- Tag Description
  - Description: A brief description of the tag or sensor, explaining what it measures or monitors.
  - Example: SEAL OIL MAIN PUMP PRESSURE
- MACHINE_GROUP
  - Description: The group or category of machinery to which the tag belongs.
  - Example: K-2201
- LOW_THRESHOLD
  - Description: The lower limit or threshold for the acceptable range of the tag's measurements.
  - Example: 6
- HIGH_THRESHOLD
  - Description: The upper limit or threshold for the acceptable range of the tag's measurements. If not applicable, it can be left empty.
  - Example: NaN
- THRESHOLD_TYPE
  - Description: Indicates the type of threshold (e.g., "Down" for lower limit thresholds). One of Up, Down, or Up/Down.
  - Example: Down
- AGGREGATION_RULE
  - Description: The rule for aggregating data points (e.g., min for minimum value).
  - Example: min
- ENGINEERING_UNITS
  - Description: The units in which the measurements are recorded.
  - Example: Kgf/cm2
- CATEGORY
  - Description: The category of the measurement, such as Pressure, Temperature, etc.
  - Example: Pressure
| TAG | Tag Description | MACHINE_GROUP | LOW_THRESHOLD | HIGH_THRESHOLD | THRESHOLD_TYPE | AGGREGATION_RULE | ENGINEERING_UNITS | CATEGORY |
|---|---|---|---|---|---|---|---|---|
| 22PI102 | SEAL OIL MAIN PUMP PRESSURE | K-2201 | 6 | NaN | Down | min | Kgf/cm2 | Pressure |
| 22PI103 | CONTROL OIL HEADER PRESSURE | K-2201 | 5 | NaN | Down |
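Before uploading, it can help to check a metadata file against the column layout shown in the table. The following is a minimal sketch using only the standard library. The function `validate_metadata` and its checks (required headers, unique non-empty TAG, a THRESHOLD_TYPE of Up, Down, or Up/Down on every row) are assumptions for illustration and may be stricter or looser than the application's own validation.

```python
import csv
import io

REQUIRED_COLUMNS = [
    "TAG", "Tag Description", "MACHINE_GROUP", "LOW_THRESHOLD", "HIGH_THRESHOLD",
    "THRESHOLD_TYPE", "AGGREGATION_RULE", "ENGINEERING_UNITS", "CATEGORY",
]
THRESHOLD_TYPES = {"Up", "Down", "Up/Down"}


def validate_metadata(csv_text):
    """Return a list of human-readable problems found in the metadata CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems, seen = [], set()
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        tag = (row["TAG"] or "").strip()
        if not tag:
            problems.append(f"line {line_no}: empty TAG")
        elif tag in seen:
            problems.append(f"line {line_no}: duplicate TAG {tag}")
        seen.add(tag)
        threshold_type = (row["THRESHOLD_TYPE"] or "").strip()
        if threshold_type not in THRESHOLD_TYPES:
            problems.append(
                f"line {line_no}: unknown THRESHOLD_TYPE {threshold_type!r}"
            )
    return problems


sample = (
    "TAG,Tag Description,MACHINE_GROUP,LOW_THRESHOLD,HIGH_THRESHOLD,"
    "THRESHOLD_TYPE,AGGREGATION_RULE,ENGINEERING_UNITS,CATEGORY\n"
    "22PI102,SEAL OIL MAIN PUMP PRESSURE,K-2201,6,NaN,Down,min,Kgf/cm2,Pressure\n"
)
problems = validate_metadata(sample)
```

An empty list means the file passed these particular checks; it says nothing about whether the threshold values themselves are sensible.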
git clone https://github.com/giorgosfatouros/IIoT-Data-Quality-Assessment.git
cd IIoT-Data-Quality-Assessment
- Clone the repository and navigate to the project directory:
git clone <repository-url>
cd fame-data-quality-assessment
- Create a .env file from the example:
cp env.example .env
# Edit .env and add your OPENAI_API_KEY
- Start all services:
./start-dev.sh
This will start:
- TimescaleDB database (port 5432)
- DQA Worker service (background aggregation)
- FastAPI Backend (port 8000)
- React Frontend (port 5173)
- Navigate to the project directory:
cd fame-data-quality-assessment
- Set up backend:
cd backend
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
- Set up frontend:
cd frontend
npm install
- Start TimescaleDB with Docker:
docker-compose up -d timescaledb
- Start backend:
cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
- Start frontend (in another terminal):
cd frontend
npm run dev
docker-compose up -d
Go to: http://www.localhost:8501
Follow the instructions within the app to upload your sensor data.
- Data Loading: Navigate to the Data Loading page to select the data table (machine) for analysis.
- Data Visualization: Use the Data Visualization page to explore the data through various visualizations.
- Missing Values Analysis: Go to the Missing Values Analysis page for insights into missing values in the original (raw) data.
- Invalid Values Analysis: The Invalid Values Analysis page helps you identify and understand invalid readings from your sensor data.
- Data Quality: Access the Data Quality page for a detailed assessment of your data's quality, including completeness, accuracy, and consistency.
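To make the missing-value and invalid-value pages more concrete, here is a small sketch of how completeness and threshold breaches could be computed for one tag. It assumes that "Down" activates LOW_THRESHOLD, "Up" activates HIGH_THRESHOLD, and "Up/Down" activates both; that reading of THRESHOLD_TYPE, along with the function names and the metric definitions, is an assumption rather than the application's documented behavior.

```python
import math


def _is_number(x):
    """True for a real reading; False for None or NaN."""
    return x is not None and not math.isnan(x)


def breaches_threshold(value, low, high, threshold_type):
    """True if value falls outside the limit(s) that threshold_type activates."""
    check_low = threshold_type in ("Down", "Up/Down") and _is_number(low)
    check_high = threshold_type in ("Up", "Up/Down") and _is_number(high)
    return (check_low and value < low) or (check_high and value > high)


def quality_summary(values, low, high, threshold_type):
    """Compute the share of present readings and the share of those that are invalid."""
    present = [v for v in values if _is_number(v)]
    invalid = [v for v in present if breaches_threshold(v, low, high, threshold_type)]
    total = len(values)
    return {
        "completeness": len(present) / total if total else 0.0,
        "invalid_ratio": len(invalid) / len(present) if present else 0.0,
    }


# Tag 22PI102 from the metadata example: lower limit 6, no upper limit, type Down.
values = [6.4, 5.2, None, 6.1, float("nan")]
summary = quality_summary(values, low=6, high=float("nan"), threshold_type="Down")
```

Here three of five readings are present (completeness 0.6) and one of those three falls below the lower limit.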
The application includes a RESTful API for uploading and importing sensor data CSV files.
The Data Import API is part of the FastAPI backend service. All services, including the DQA Worker for background data aggregation, are automatically set up when you start the application using Docker Compose:
./start-dev.sh
This will start:
- TimescaleDB database
- DQA Worker service (handles background data aggregation)
- FastAPI Backend (includes the Data Import API)
- React Frontend
For manual setup, ensure the backend dependencies are installed using uv (the project uses pyproject.toml for dependency management):
cd backend
uv pip install -e .
Upload and import data:
curl -X POST "http://localhost:8000/import/upload" \
  -F "data_file=@/path/to/sensor_data.csv" \
  -F "tags_file=@/path/to/tags.csv" \
  -F "table_name=KT2201" \
  -F "machine_type=AUTO"
Check import status:
curl "http://localhost:8000/import/status/{job_id}"
Validate files before import:
curl -X POST "http://localhost:8000/import/validate" \
  -F "data_file=@/path/to/sensor_data.csv" \
  -F "tags_file=@/path/to/tags.csv"
Test the API:
cd backend
python test_import_api.py
For complete API documentation, see Data Import Guide.
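Because imports run in the background, a client typically polls the status endpoint until the job finishes. The sketch below shows one way to do that from Python with the standard library. The response shape is an assumption: the `status` field and the terminal values `completed` and `failed` are hypothetical names, so check the actual API documentation before relying on them.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"


def fetch_status(job_id):
    """GET /import/status/{job_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/import/status/{job_id}") as resp:
        return json.load(resp)


def wait_for_import(job_id, fetch=fetch_status, interval=2.0, timeout=300.0,
                    done_states=("completed", "failed")):
    """Poll until the job reports a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(job_id)
        # "status" and the values in done_states are assumed, not documented.
        if payload.get("status") in done_states:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"import job {job_id} did not finish in {timeout}s")
        time.sleep(interval)


# Usage against a running backend (job_id comes from the upload response):
# final = wait_for_import("your-job-id")
```

Passing `fetch` as a parameter keeps the polling loop testable without a running server.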
- KT2201: K-2201/KT-2201 Machine
- K3301: K-3301/KT-3301 Machine
- K5700: K-5700 Machine
- AUTO: Automatic detection from filename or column patterns
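The AUTO option is described as detecting the machine type from the filename or column patterns. One possible way to implement the filename part is sketched here; the regular expressions and the `detect_machine_type` function are hypothetical, and only the three type names come from the list above.

```python
import re

# Hypothetical filename patterns for each supported machine type.
PATTERNS = {
    "KT2201": re.compile(r"KT?-?2201", re.IGNORECASE),
    "K3301": re.compile(r"KT?-?3301", re.IGNORECASE),
    "K5700": re.compile(r"K-?5700", re.IGNORECASE),
}


def detect_machine_type(filename):
    """Return the first machine type whose pattern appears in filename, else None."""
    for machine_type, pattern in PATTERNS.items():
        if pattern.search(filename):
            return machine_type
    return None


detected = detect_machine_type("K-2201_2023_export.csv")
```

When no pattern matches, the function returns None, leaving the caller to fall back to column-based detection or to ask for an explicit machine type.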
If you use this software in your research, please cite:
@inproceedings{fatouros2023comprehensive,
title={Comprehensive architecture for data quality assessment in industrial iot},
author={Fatouros, Georgios and Makridis, Georgios and Mavrogiorgou, Argyro and Soldatos, John and Filippakis, Michael and Kyriazis, Dimosthenis},
booktitle={2023 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)},
pages={512--517},
year={2023},
organization={IEEE}
}
The project has received funding from the European Union's HEU FAME project under Grant Agreement No. 101092639.