Flower-PROV

Federated Learning (FL) has emerged as a privacy-preserving paradigm that enables distributed model training without sharing raw data. However, ensuring traceability in FL workflows remains a challenge because FL is inherently distributed. Data is spread across multiple clients (ranging from a few to thousands) without clear consumption/production relationships within the workflow.
Tracking and understanding model evolution in this scenario requires insights into the data derivation path and key evaluation metrics (e.g., accuracy, silhouette score) for both local and aggregated models. This information is essential not only for understanding and explaining the training process but also for dynamically adjusting the FL workflow.
Since each training round can take minutes to hours, the ability to monitor and fine-tune the process in real time is critical. Poor hyperparameter configurations can waste time and computational resources, especially in existing FL frameworks, where users can often evaluate model performance only after training completes.
This project addresses these challenges by improving traceability and monitoring capabilities in FL workflows.
Flower-PROV is an extension of the open-source Flower Federated Learning (FL) framework, designed to integrate provenance tracking as a core component of FL workflows to enhance reproducibility and analysis.
Flower-PROV enables the automatic and distributed capture of:
- Retrospective provenance (r-prov): Logs details about the actual FL workflow execution.
- Prospective provenance (p-prov): Represents the FL workflow specification.
The captured provenance data includes:
✅ Participating clients
✅ Hyperparameter values
✅ Accuracy metrics
✅ Model versions and checkpoints
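To make the distinction between the two kinds of provenance concrete, here is a minimal Python sketch. It is not Flower-PROV's or DfAnalyzer's actual data model: the class names, the step name `client_training`, and the attribute names are all hypothetical, chosen only to mirror the captured items listed above. The specification (p-prov) declares what a step should record, and the execution record (r-prov) holds what one run actually recorded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProspectiveSpec:
    """p-prov: declares which attributes a workflow step is expected to record."""
    step: str
    attributes: tuple


@dataclass
class RetrospectiveRecord:
    """r-prov: the values actually observed when a step ran in one round."""
    step: str
    server_round: int
    values: dict = field(default_factory=dict)


def missing_attributes(spec, record):
    """Return the declared attributes that the execution record did not capture."""
    if spec.step != record.step:
        raise ValueError("record belongs to a different step than the specification")
    return [name for name in spec.attributes if name not in record.values]


# Hypothetical step and attribute names (assumptions, not the real schema).
spec = ProspectiveSpec(
    step="client_training",
    attributes=("client_id", "learning_rate", "accuracy", "checkpoint_id"),
)
record = RetrospectiveRecord(
    step="client_training",
    server_round=3,
    values={"client_id": 2, "learning_rate": 0.01, "accuracy": 0.71},
)
print(missing_attributes(spec, record))  # ['checkpoint_id']
```

Comparing a record against its specification in this way is one simple check that an execution captured everything the workflow definition promised.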
Beyond simply collecting provenance data, Flower-PROV actively uses it to:
- Dynamically adjust model hyperparameters during training.
- Enable clients to recover previously trained models as a starting point for local training, avoiding redundant computations.
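The two uses above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not Flower-PROV's actual policy: the stall rule (halve the learning rate when accuracy has not improved by `min_delta` over the last `patience` rounds), the function names, and the checkpoint dictionary layout are all hypothetical.

```python
def adjust_learning_rate(accuracy_history, learning_rate,
                         patience=2, min_delta=0.01, factor=0.5):
    """Shrink the learning rate when accuracy has stalled for `patience` rounds.

    Returns (new_learning_rate, adjusted_flag). The rule is an assumed example.
    """
    if len(accuracy_history) <= patience:
        return learning_rate, False  # not enough rounds to judge a stall
    best_before = max(accuracy_history[:-patience])
    recent_best = max(accuracy_history[-patience:])
    if recent_best - best_before < min_delta:
        return learning_rate * factor, True
    return learning_rate, False


def latest_checkpoint(checkpoints, client_id):
    """Return the most recent stored checkpoint for a client, or None."""
    own = [c for c in checkpoints if c["client_id"] == client_id]
    return max(own, key=lambda c: c["server_round"]) if own else None


# Accuracy per round as it might be read back from the provenance database.
print(adjust_learning_rate([0.40, 0.55, 0.555, 0.556], 0.1))  # (0.05, True)
print(adjust_learning_rate([0.40, 0.55, 0.62], 0.1))          # (0.1, False)

# Hypothetical checkpoint records; "weights" stands in for real model weights.
checkpoints = [
    {"client_id": 1, "server_round": 2, "weights": [0.1, 0.2]},
    {"client_id": 1, "server_round": 4, "weights": [0.3, 0.4]},
    {"client_id": 2, "server_round": 3, "weights": [0.5, 0.6]},
]
print(latest_checkpoint(checkpoints, 1)["server_round"])  # 4
```

A client that finds a checkpoint can load its weights as the starting point for local training; one that finds none starts from the initial model.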
The following software must be installed and configured to run Flower-PROV.
We provide a pre-built Docker image that includes the DfAnalyzer provenance library, the Python library, and the provenance database (MonetDB):
docker pull nymeria0042/dfanalyzer
We also provide a docker-compose.yaml that we will use to launch our containers.
This guide demonstrates how to run a Flower-PROV container using the CIFAR-10 dataset. We begin by splitting the dataset in a balanced manner using the dataset-splitter component.
Navigate to the dataset-splitter directory and execute the following command:
python splitter.py --dataset_splitter_config_file config/dataset_splitter.cfg
By default, this creates 5 folders containing the data that will be used by each client.
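To illustrate what a balanced split means, the sketch below deals the samples of each class round-robin across the clients, so every client receives nearly the same number of samples per class. It is an assumption about the general technique, not the code of splitter.py; the function name `balanced_split` and its parameters are hypothetical, and the toy labels stand in for the real CIFAR-10 labels.

```python
import random
from collections import defaultdict


def balanced_split(labels, num_clients=5, seed=0):
    """Assign sample indices to clients so each class is spread evenly."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    partitions = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across clients.
        for position, index in enumerate(indices):
            partitions[position % num_clients].append(index)
    return partitions


# Toy labels: 10 classes with 50 samples each (CIFAR-10 also has 10 classes).
labels = [i % 10 for i in range(500)]
parts = balanced_split(labels)
print([len(p) for p in parts])  # [100, 100, 100, 100, 100]
```

Each returned partition is a list of sample indices; writing the corresponding samples into one folder per client would give a layout like the one described above.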
Next, we can start the DfAnalyzer container, which runs in the background to capture all provenance data for the experiment:
docker compose up dfanalyzer
Once the DfAnalyzer service is running, execute the prospective provenance script, which defines the structure and parameters to be captured. Additionally, start the MongoDB service, which stores the model weights for fault tolerance.
When it’s finished, we can start the server:
docker compose up server
Start the clients (five in this demonstration):
docker compose up client1 client2 client3 client4 client5
Once the experiment is running, you can submit queries to the provenance database (MonetDB) to monitor metrics and parameter and hyperparameter configurations.
First, we connect to the provenance database, running in the DfAnalyzer container:
docker exec -it dfanalyzer mclient -u monetdb -d dataflow_analyzer
The default password is monetdb.
Then we can submit queries such as:
SELECT client_id FROM oClientTraining WHERE server_round = 5;
-- to see which clients participated in round 5
SELECT server_round, accuracy FROM oServerTrainingAggregation;
-- to see how the accuracy is evolving
SELECT server_round, dynamically_adjusted FROM oTrainingConfig;
-- to see in which rounds the dynamic adjustment was triggered
SELECT server_round, insertion_time, weights_mongo_id, checkpoint_time FROM oServerTrainingAggregation;
-- to monitor the insertion of the checkpoints
The user can also access localhost:22000 to view the provenance graph and understand each step of the FL workflow.
To monitor the metrics, the user can run the streamlit app locally:
streamlit run monitoring/Flower-PROV_Monitor.py
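For scripted monitoring outside the Streamlit app, the accuracy query shown earlier can be post-processed in Python. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for MonetDB so that it runs without any services, the rows are made-up sample data, and the helper `accuracy_deltas` is hypothetical. Only the table and column names come from the queries in this guide.

```python
import sqlite3

# SQLite stands in for MonetDB here; the inserted rows are invented samples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oServerTrainingAggregation (server_round INTEGER, accuracy REAL)"
)
conn.executemany(
    "INSERT INTO oServerTrainingAggregation VALUES (?, ?)",
    [(1, 0.41), (2, 0.55), (3, 0.54), (4, 0.61)],
)


def accuracy_deltas(connection):
    """Return (server_round, accuracy, change vs. previous round) tuples."""
    rows = connection.execute(
        "SELECT server_round, accuracy FROM oServerTrainingAggregation "
        "ORDER BY server_round"
    ).fetchall()
    deltas = []
    previous = None
    for server_round, accuracy in rows:
        # The first round has no predecessor, so its change is None.
        change = None if previous is None else round(accuracy - previous, 4)
        deltas.append((server_round, accuracy, change))
        previous = accuracy
    return deltas


for row in accuracy_deltas(conn):
    print(row)
```

A negative change flags a round in which the aggregated model regressed, which is the kind of signal that can motivate a hyperparameter adjustment.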
- Current Members
  - Camila Lopes
  - Aline Paes
  - Daniel de Oliveira
- Former Members
  - Alan Lira
  - Cristina Boeres
  - Lucia Drummond
- Beutel, D. J., et al. Flower: A Friendly Federated Learning Framework. 2020.
- Lopes, C., et al. Provenance-Based Dynamic Fine-Tuning of Cross-Silo Federated Learning. CARLA 2023.
This project is licensed under the Apache License 2.0. See the LICENSE file for more details.

