- Project Background
- Project Goal
- Key Features
- Methodology
- Results and Discussion
- Dashboard & Deployment
- Tools & Technologies
- Conclusion
- Future Work
- How to Run the Project
- Acknowledgments
Mobile money has transformed financial inclusion in Africa. Services like M-Pesa (Kenya), MTN Mobile Money (Uganda), and Airtel Money (West Africa) allow millions of people to send money, pay bills, and manage their finances without relying on traditional banks.
With over 300 million active users in Sub-Saharan Africa, mobile money platforms are now the backbone of everyday transactions.
However, this rapid growth also introduces security challenges:
- Limited regulatory oversight
- High transaction volumes
- The anonymity of mobile wallets
Together, these factors make mobile money ecosystems a prime target for fraudsters. Common fraud tactics include:
- SIM swaps
- Account takeovers
- Fraudulent transfers
Fraudulent transactions are notoriously difficult to detect because they rarely follow predictable patterns. Traditional supervised machine learning approaches require labeled fraudulent data, which is often scarce or unavailable.
To address this challenge, this project uses unsupervised learning: the model learns what normal transaction behavior looks like and flags outliers that deviate from it, a promising approach for fraud detection in data-scarce environments.
This project aims to design a scalable, real-time fraud detection system tailored to mobile money platforms in Africa.
Key objectives include:
- Develop an unsupervised anomaly detection model (Isolation Forest) to flag unusual transaction patterns.
- Provide a Streamlit dashboard for interactive visualization of anomalies.
- Deploy the model using FastAPI to enable real-time fraud detection for mobile money services.
- Data Simulation: A synthetic dataset mimicking real-world mobile money transactions in African markets.
- Unsupervised Model (Isolation Forest): Detect anomalies using transaction amount, frequency, location, and device type.
- Interactive Dashboard (Streamlit): Visual monitoring of flagged transactions and fraud patterns.
- Real-time API (FastAPI): Seamless deployment of the fraud detection engine for live monitoring and integration.
The dataset used in this project simulates 10,000 mobile money transactions to reflect real-world activity in African markets. It includes various features such as:
- User IDs & Device IDs – uniquely identify customers and their devices
- Transaction Amounts – numerical values of money transfers
- Transaction Types – send, receive, cash-in, cash-out, buy airtime, deposit, withdrawal
- User Locations – geographic regions within Africa (e.g., Nairobi, Lagos, Kampala)
- Transaction Channels – USSD, Mobile App, Web, Agent
- SIM Swap Flags – indicator for possible SIM swap fraud
- Agent IDs – identify transactions carried out through agents
Since real mobile money transaction datasets are rarely publicly available (due to privacy concerns), a synthetic dataset was generated to:
- Represent typical user behaviors
- Simulate fraudulent patterns (e.g., unusual amounts, odd times, suspicious device usage)
- Provide enough diversity for training and validating the unsupervised model
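As a rough illustration of how such a dataset could be generated, the sketch below uses only the Python standard library. The category values come from the feature list above; the field names, ID ranges, amount distribution, start date, and 2% SIM swap rate are assumptions made for the example, not the project's actual generator.

```python
import random
from datetime import datetime, timedelta

# Category values taken from the feature list above; everything numeric
# (ID ranges, amount distribution, SIM swap rate) is an illustrative assumption.
TRANSACTION_TYPES = ["send", "receive", "cash-in", "cash-out",
                     "buy airtime", "deposit", "withdrawal"]
CHANNELS = ["USSD", "Mobile App", "Web", "Agent"]
LOCATIONS = ["Nairobi", "Lagos", "Kampala"]


def simulate_transactions(n=10_000, seed=42):
    """Return a list of n synthetic transaction records (dicts)."""
    rng = random.Random(seed)
    start = datetime(2024, 1, 1)  # hypothetical start of the simulated window
    records = []
    for i in range(n):
        channel = rng.choice(CHANNELS)
        records.append({
            "transaction_id": i,
            "user_id": f"U{rng.randint(1, 2000):05d}",
            "device_id": f"D{rng.randint(1, 2500):05d}",
            # Log-normal amounts give a right-skewed distribution.
            "amount": round(rng.lognormvariate(7.5, 1.0), 2),
            "transaction_type": rng.choice(TRANSACTION_TYPES),
            "location": rng.choice(LOCATIONS),
            "channel": channel,
            "sim_swap_flag": int(rng.random() < 0.02),
            # Only agent-channel transactions carry an agent ID.
            "agent_id": f"A{rng.randint(1, 300):04d}" if channel == "Agent" else None,
            "timestamp": start + timedelta(minutes=rng.randint(0, 60 * 24 * 90)),
        })
    return records


transactions = simulate_transactions()
print(len(transactions), transactions[0]["transaction_type"])
```

Seeding the generator keeps the dataset reproducible, so the same transactions can be regenerated when re-running the analysis.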
This project was executed using Python, with analysis performed in Jupyter Notebook and deployment via Streamlit and FastAPI.
- Examined the dataset to understand distributions, patterns, and potential anomalies.
- Investigated transaction amounts, timing (hour of day, day of week), user locations, devices, and transaction types.
- Identified preliminary trends such as skewed transaction amounts and temporal patterns, which informed feature engineering and model expectations.
- Handling Missing Values:
  - Numerical columns filled with median values.
  - Categorical columns filled with mode values.
- Feature Engineering:
  - Created `log_amount` to reduce skewness in transaction amounts.
  - Extracted `hour` and `day_of_week` from transaction timestamps.
  - Added `time_of_day` (morning, afternoon, evening, night) for better interpretability.
- Feature Scaling: Applied scaling to numerical features to improve model stability and training efficiency.
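The preprocessing steps above could be sketched roughly as follows with pandas and scikit-learn. The column names (`amount`, `transaction_type`, `channel`, `timestamp`), the use of `StandardScaler`, and the `_scaled` suffix are assumptions for illustration. The night bucket follows the 10 PM to 4 AM window mentioned in the results; the other bucket boundaries are guesses.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; adjust to the real dataset schema.
NUMERIC_COLS = ["amount"]
CATEGORICAL_COLS = ["transaction_type", "channel"]


def time_of_day(hour):
    """Map an hour (0-23) to a coarse bucket; boundaries other than night are assumed."""
    if 4 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def preprocess(df):
    df = df.copy()  # leave the caller's frame untouched
    # Fill numerical gaps with the median and categorical gaps with the mode.
    for col in NUMERIC_COLS:
        df[col] = df[col].fillna(df[col].median())
    for col in CATEGORICAL_COLS:
        df[col] = df[col].fillna(df[col].mode().iloc[0])
    # log1p compresses the right tail and stays finite for zero amounts.
    df["log_amount"] = np.log1p(df["amount"])
    ts = pd.to_datetime(df["timestamp"])
    df["hour"] = ts.dt.hour
    df["day_of_week"] = ts.dt.dayofweek  # Monday = 0
    df["time_of_day"] = df["hour"].map(time_of_day)
    # Standardize model inputs into new columns so the raw values stay readable.
    to_scale = ["log_amount", "hour", "day_of_week"]
    scaled = StandardScaler().fit_transform(df[to_scale])
    for i, col in enumerate(to_scale):
        df[f"{col}_scaled"] = scaled[:, i]
    return df
```

Writing the scaled values to separate columns keeps `hour` and `day_of_week` interpretable for the dashboard while still giving the model standardized inputs.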
- Algorithm Used: Isolation Forest, an unsupervised model for anomaly detection.
- Training Details:
  - Model trained on the full dataset without labeled fraud data.
  - Contamination rate set to 5% to correspond with expected anomaly proportion.
- Anomaly Identification: Model learned “normal” patterns and flagged deviations as anomalies.
- Inspected flagged transactions to assess model effectiveness.
- Applied dimensionality reduction techniques (t-SNE, UMAP) to visualize separation between normal and anomalous transactions.
- Tuned hyperparameters such as contamination rate to optimize anomaly detection.
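The training and tuning steps described above can be illustrated with scikit-learn's `IsolationForest`. This is a minimal sketch on a randomly generated stand-in feature matrix, not the project's data; the sample sizes, `n_estimators`, and random seeds are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Stand-in features: 950 "typical" rows plus 50 rows scattered far from them.
normal = rng.normal(loc=0.0, scale=1.0, size=(950, 3))
unusual = rng.uniform(low=-8.0, high=8.0, size=(50, 3))
X = np.vstack([normal, unusual])

# contamination=0.05 mirrors the 5% rate described above.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
labels = model.fit_predict(X)        # -1 = anomaly, 1 = normal
scores = model.decision_function(X)  # negative = flagged as anomalous

print(f"flagged {np.mean(labels == -1):.1%} of rows")

# Contamination only shifts the score threshold, so the flagged count tracks it.
flag_counts = {}
for rate in (0.01, 0.05, 0.10):
    flagged = IsolationForest(contamination=rate, random_state=42).fit_predict(X)
    flag_counts[rate] = int((flagged == -1).sum())
print(flag_counts)
```

Because the contamination rate sets the threshold rather than the ranking, tuning it changes how many transactions are flagged but not which ones the model considers most unusual.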
- Streamlit Dashboard: Provides an interface to explore and monitor flagged transactions interactively.
- FastAPI Endpoint: Enables real-time fraud detection by sending new transaction data to the model and receiving predictions.
- The `amount` column is right-skewed with a mean of 3,496.41, standard deviation of 3,507.29, minimum of 0.03, and maximum of 30,221.30.
- Applying a log transformation (`log_amount`) produced a more normally distributed variable, which improves model performance in detecting anomalies.
- Most transactions fall within a “normal” range, while a small fraction (~5%) represent outliers, which the Isolation Forest model successfully flagged.
- Time of Day: Most transactions occur at night (10 PM – 4 AM), suggesting fraudsters may exploit low-monitoring periods. Morning transactions follow closely, while afternoon and evening remain low.
- Day of Week: Saturdays have the highest volume (~1,750 transactions), with weekdays relatively stable (~1,300–1,450).
Insight: Monitoring should be more vigilant during high-activity periods, especially nights and weekends.
- Most locations show average transaction amounts between 3,300 and 3,700.
- Eldoret, Mombasa, and Kisumu exhibit slightly higher transaction amounts, indicating potential risk areas.
- The majority of anomalies originate from agents rather than individual users (386 out of 500 anomalies).
- By device type: iOS leads in flagged transactions, followed by Android and Feature Phones.
- Network distribution of anomalies is fairly even: Safaricom (173), Telkom Kenya (171), Airtel (156).
- Send Money, Buy Airtime, and Deposit Cash are the transaction types most frequently flagged as anomalies.
- This suggests that fraud monitoring can prioritize these transaction types for enhanced scrutiny.
- No single network provider is disproportionately affected by the types of anomalies this model detects.
- t-SNE Visualization: Projects transactions into 2D; blue points = normal, red points = anomalies. Normal transactions form tight clusters, while anomalies appear isolated or on cluster edges.
- UMAP Visualization: Preserves local and global structure; confirms separation between normal and anomalous transactions.
Key Insight: In both the t-SNE and UMAP projections, flagged transactions sit apart from the main clusters, which is visual evidence that the transactions the Isolation Forest flags deviate from typical behavior.
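A projection like the t-SNE view described above could be produced along these lines. The sketch uses a small random stand-in dataset and scikit-learn's `TSNE`; the sample sizes and seeds are assumptions, and plotting is left out so the block runs without a display. UMAP would follow the same pattern but needs the separate `umap-learn` package.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
# Stand-in data: 190 typical rows plus 10 scattered rows, 4 features each.
X = np.vstack([rng.normal(0, 1, size=(190, 4)),
               rng.uniform(-8, 8, size=(10, 4))])
labels = IsolationForest(contamination=0.05, random_state=42).fit_predict(X)

# Perplexity must be smaller than the number of samples; 30 is the library default.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=42).fit_transform(X)

# Colour scheme follows the description above: blue = normal, red = anomaly.
colors = np.where(labels == -1, "red", "blue")
print(embedding.shape, int((colors == "red").sum()), "points flagged")
```

With matplotlib installed, `plt.scatter(embedding[:, 0], embedding[:, 1], c=colors)` would draw the two groups in the colours used above.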
- ~5% of transactions are flagged as anomalies, consistent with the contamination parameter.
- Anomalies are concentrated in nighttime hours, weekends, specific locations, transaction types, and device types.
- The model’s predictions align with observed behavioral patterns, suggesting the unsupervised approach is a viable way to surface suspicious transactions without labeled data.
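The reported counts can be cross-checked with a little arithmetic. The sketch below uses only figures quoted in this README (10,000 transactions, a 5% contamination rate, 386 agent-originated anomalies, and the per-network counts); the variable names are illustrative.

```python
TOTAL_TRANSACTIONS = 10_000
CONTAMINATION = 0.05

# 5% of 10,000 transactions gives the expected number of flagged rows.
expected_flagged = round(TOTAL_TRANSACTIONS * CONTAMINATION)

# Counts as reported in the results above.
agent_anomalies = 386
network_anomalies = {"Safaricom": 173, "Telkom Kenya": 171, "Airtel": 156}

agent_share = agent_anomalies / expected_flagged
network_shares = {name: count / expected_flagged
                  for name, count in network_anomalies.items()}

print(f"expected flagged: {expected_flagged}")
print(f"agent share of anomalies: {agent_share:.1%}")
for name, share in network_shares.items():
    print(f"{name}: {share:.1%}")
```

The three network counts add up to the 500 flagged transactions, and each network accounts for roughly a third of them, which is what "relatively even" means here.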
The FraudWatch Africa dashboard provides an interactive interface for exploring, monitoring, and predicting fraudulent transactions in real time. It is built with Streamlit for the frontend and FastAPI for backend predictions.
- Home Page – Project introduction and banner.
- Dashboard Page – KPIs, flagged anomalies, filters, and anomaly visualizations.
- Predict Transaction Page – Enter transaction details for single prediction.
- Batch Prediction Page – Upload CSV for batch fraud predictions.
- Streamlit serves as the interactive dashboard frontend.
- FastAPI powers the backend with REST API endpoints.
- Communication is seamless: the dashboard sends requests to FastAPI for anomaly predictions in real time.
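To make the request flow concrete, here is a minimal client-side sketch using only the standard library. The base URL matches the local address given in the run instructions; the `/predict` path and the payload field names are assumptions, since the actual routes are listed on the API's `/docs` page.

```python
import json
from urllib import request

# Base URL from the run instructions; the "/predict" path and the payload
# field names are assumptions, not the project's confirmed API.
BASE_URL = "http://127.0.0.1:8000"


def build_predict_request(transaction, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for one transaction."""
    body = json.dumps(transaction).encode("utf-8")
    return request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(transaction, base_url=BASE_URL, timeout=5):
    """Send the request and decode the JSON reply; needs the backend running."""
    req = build_predict_request(transaction, base_url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


example = {"amount": 2500.0, "transaction_type": "send", "channel": "USSD",
           "location": "Nairobi", "hour": 23}
req = build_predict_request(example)
print(req.get_method(), req.full_url)
```

Separating request construction from sending makes the payload easy to inspect without a running backend; calling `predict(example)` would only work once the FastAPI server is up.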
The project uses the following tools and technologies, all named in the sections above: Python, Jupyter Notebook, Isolation Forest, t-SNE, UMAP, Streamlit, and FastAPI.
This project demonstrated how unsupervised learning can be applied to the challenge of fraud detection in mobile money platforms, especially in environments where labeled fraud data is scarce.
By leveraging Isolation Forest, we successfully identified anomalous transactions that may represent fraudulent activity. The results highlighted:
- Strong potential for detecting unusual transaction behaviors in real time.
- Practical use of dashboards (Streamlit) for monitoring and decision support.
- Seamless integration with FastAPI for deployment, ensuring accessibility and scalability.
The solution emphasizes how data science can drive financial security in African markets, protecting millions of users and strengthening trust in mobile money systems.
While the current system provides a strong foundation, there are opportunities to make it more powerful and robust:
- Enhanced Models: Experiment with advanced techniques such as Autoencoders, One-Class SVM, and Graph Neural Networks for improved anomaly detection.
- Feature Engineering: Incorporate additional features like transaction velocity, device fingerprinting, and geospatial tracking to capture more complex fraud patterns.
- Scalability: Deploy the system on cloud platforms with distributed data pipelines (e.g., Apache Kafka, Spark) to handle millions of transactions in real time.
- User Feedback Loop: Integrate mechanisms for human investigators to label flagged transactions, creating feedback that strengthens the model over time.
- Cross-Border Expansion: Extend beyond Kenya to support fraud detection across multiple African mobile money markets.
- Explainability: Add interpretable AI components so stakeholders can understand why a transaction is flagged as suspicious.
This roadmap ensures the solution continues evolving into a production-grade fraud detection system that adapts to emerging threats.
If you’d like to explore the project locally, follow these steps:
1️⃣ Clone the Repository
git clone https://github.com/mauree155/FraudWatch-Africa.git my-repo
cd my-repo
2️⃣ Create a Virtual Environment
python -m venv venv
source venv/bin/activate # On Mac/Linux
venv\Scripts\activate # On Windows
3️⃣ Install Dependencies
pip install -r requirements.txt
4️⃣ Run the FastAPI Backend
uvicorn app.main:app --reload
API available at: http://127.0.0.1:8000/docs
5️⃣ Run the Streamlit Dashboard
streamlit run o_streamlit_app.py
Dashboard available at: http://localhost:8501
This project was carried out as part of the Dataverse Africa Internship Program.
Special thanks to our mentors and teammates for their guidance and collaboration.
- Maureen Akunna Okoro – Team Lead | Data Analyst / Data Scientist
- Masheida Dzimaba – Data Scientist
- Nasiru Ibrahim – Data Analyst










