This project aims to detect fraudulent insurance claims using machine learning techniques. The repository contains the code for data analysis, preprocessing, model training, and a web application for interacting with the trained model.
The primary objective is to build a model that flags fraudulent insurance claims by learning, from historical claims data, the patterns that indicate fraud.
- Exploratory Data Analysis (EDA): Understanding the data distribution, detecting anomalies, and visualizing relationships between different variables.
- Data Preprocessing: Handling missing values, encoding categorical variables, and scaling numerical features.
- Feature Selection: Identifying important features using techniques like Extra Trees Regressor.
- Model Training: Splitting the data into training and testing sets, and training machine learning models.
- Model Evaluation: Evaluating the performance of the trained models using metrics like accuracy, classification report, and confusion matrix.
- Flask Web App: A web interface for uploading data, making predictions, and visualizing results.
- Python 3.7 or higher
- Necessary Python libraries:
  - Pandas
  - Matplotlib
  - Seaborn
  - Scikit-learn
  - Flask
  - TensorFlow (if used)
  - Flask-Material
- Clone the repository:
    git clone https://github.com/yourusername/insurance-fraud-detection.git
    cd insurance-fraud-detection
- Install required libraries:
    pip install -r requirements.txt
- Place the dataset:
  Ensure the insurance_claims.csv file is in the data directory.
- Navigate to the notebook directory:
    cd notebooks
- Open the Jupyter Notebook:
    jupyter notebook
- Run the Insurance Fraud Detection.ipynb notebook:
  Execute the cells sequentially to perform data analysis, preprocessing, model training, and evaluation.
- Ensure the necessary templates are in the templates directory:
  - index.html
  - about.html
  - upload.html
  - uploaded.html
- Run the Flask application:
    python main.py
- Access the web application:
  Open your web browser and go to http://127.0.0.1:5000/.
insurance-fraud-detection/
├── data/
│ └── insurance_claims.csv
├── notebooks/
│ └── Insurance Fraud Detection.ipynb
├── templates/
│ ├── index.html
│ ├── about.html
│ ├── upload.html
│ └── uploaded.html
├── main.py
├── requirements.txt
└── README.md
The script imports various libraries, including Flask for web development, Scikit-learn for machine learning, and other utilities like Pandas for data manipulation.
    from flask import Flask, render_template, request, redirect, url_for, session, jsonify
    from werkzeug.utils import secure_filename  # needed by the upload route
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn import metrics
    from sklearn.metrics import classification_report, roc_auc_score, precision_recall_fscore_support
    import pandas as pd
    import os

    app = Flask(__name__)
    # Signs the session cookie; a hard-coded key is only suitable for local testing
    app.secret_key = '1a2b3c4d5e'

The script defines several routes to handle different parts of the web application:
- Home Route:
    @app.route('/')
    def home():
        return render_template('index.html')
- About Route:
    @app.route('/about')
    def about():
        return render_template('about.html')
The script likely includes functionality for users to upload insurance claims data for analysis. This part of the code will handle file uploads and data processing:
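    @app.route('/upload', methods=['GET', 'POST'])
    def upload_file():
        if request.method == 'POST':
            # .get() avoids a 400 Bad Request when the 'file' field is missing
            file = request.files.get('file')
            if file and file.filename:
                # secure_filename (from werkzeug.utils) strips path components
                filename = secure_filename(file.filename)
                os.makedirs('uploads', exist_ok=True)
                file.save(os.path.join('uploads', filename))
                # Process the uploaded file here
                # 'uploaded_file' must exist as a separate route (not shown)
                return redirect(url_for('uploaded_file', filename=filename))
        return render_template('upload.html')

The script includes logic to load the pre-trained model and make predictions on new data: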
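    @app.route('/predict', methods=['POST'])
    def predict():
        # Load data from the request; expected to be a flat list of
        # feature values in the same order used during training
        data = request.get_json()
        # Process and predict using the loaded model
        # (model must be loaded at startup; that code is not shown here)
        prediction = model.predict([data])
        return jsonify({'prediction': prediction.tolist()})

The overall structure of the main.py script seems to involve: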
- Setting up the Flask application.
- Defining routes for the home page, about page, file upload, and prediction.
- Handling file uploads and saving them to a specific directory.
- Loading the trained machine learning model and making predictions based on user inputs.
- Rendering HTML templates to display the results and provide an interface for user interaction.
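
The upload and prediction steps summarized above both depend on validating user input before it reaches the file system or the model. The sketch below shows two small helpers for that purpose. Neither appears in the original script: the names allowed_file, payload_to_row, ALLOWED_EXTENSIONS, and FEATURE_ORDER, as well as the two feature names, are assumptions made for illustration.

```python
import os

# Hypothetical settings; the original script does not show these names.
ALLOWED_EXTENSIONS = {"csv"}
FEATURE_ORDER = ["claim_amount", "customer_age"]  # must match the training columns


def allowed_file(filename):
    """Return True if the filename has an extension in ALLOWED_EXTENSIONS."""
    _, ext = os.path.splitext(filename)
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def payload_to_row(payload, feature_order=FEATURE_ORDER):
    """Turn a JSON object into a flat list of floats in training-column order."""
    missing = [name for name in feature_order if name not in payload]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(payload[name]) for name in feature_order]


# Example: a JSON body as it might arrive at the /predict route.
row = payload_to_row({"claim_amount": "5200.5", "customer_age": 41})
print(allowed_file("claims.csv"), row)
```

In the upload route, allowed_file could be checked before saving the file. In the prediction route, the result of payload_to_row could be passed as model.predict([row]), with a ValueError translated into a 400 response.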
To run the Flask application, execute the script using Python:
    python main.py

Ensure that the necessary templates (index.html, about.html, upload.html, uploaded.html) are present in the templates directory and that the static files (CSS, JS) are in the static directory.
This documentation describes how to set up and run the Insurance Fraud Prediction System. By following the steps outlined, you should be able to deploy the application and make predictions based on user input. If you encounter any issues, check that all dependencies are installed and that the model file is correctly placed in the models directory.
This project is licensed under the MIT License.


