breast-cancer-diagnosis-using-logistic-regression

This project implements logistic regression, a machine learning technique, to classify a given breast tissue sample as cancerous or not.

The dataset used is the Breast Cancer Wisconsin (Original) Data Set, divided into 80% training data and 20% test data. The dataset provides the following 9 features, each on a normalized scale of 1 to 10:

Clump Thickness
Uniformity of Cell Size
Uniformity of Cell Shape
Marginal Adhesion
Single Epithelial Cell Size
Bare Nuclei
Bland Chromatin
Normal Nucleoli
Mitoses

The last column denotes whether the cell is malignant (1) or benign (0).

Principal Component Analysis (PCA) was used to visualize the data. The projection suggests that the two classes are approximately linearly separable, so this can be treated as a linear classification problem (see data.jpg).
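As an illustration of the projection step, here is a minimal sketch of PCA via the singular value decomposition of the covariance matrix. It is not the repository's pca.m or projectData.m; the toy matrix, the variable names, and the choice of K = 2 are assumptions made for this example only.

```matlab
% Toy data: 5 samples, 3 features (made-up values, not the project's dataset)
X = [2 1 1; 4 3 2; 6 5 3; 8 7 5; 10 9 4];

% Center each feature; bsxfun is used so the code also runs on R2016a
mu = mean(X, 1);
Xc = bsxfun(@minus, X, mu);

% Covariance matrix of the centered data and its SVD
Sigma = (Xc' * Xc) / size(Xc, 1);
[U, S, ~] = svd(Sigma);

% Project onto the first K principal components (K = 2 for a 2-D plot)
K = 2;
Z = Xc * U(:, 1:K);

% Fraction of total variance retained by the first K components
s = diag(S);
retained = sum(s(1:K)) / sum(s);
```

Plotting the two columns of Z, colored by class label, gives the kind of 2-D view from which linear separability can be judged.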

Logistic regression then yields the hypothesis (decision boundary) used for future predictions (see hypothesis.jpg).
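The idea behind the hypothesis can be sketched as follows: a sigmoid is applied to a linear combination of the features, the parameters are fitted by minimizing the cross-entropy cost, and a sample is labeled malignant when the predicted probability is at least 0.5. This is a hedged sketch only. The toy data, the plain gradient descent loop, the learning rate, and the iteration count are assumptions; the repository's costFunction.m, sigmoid.m, and predict.m may work differently.

```matlab
% Toy data: one feature, labels 0 (benign) / 1 (malignant); values are made up
x = [1; 2; 3; 7; 8; 9];
y = [0; 0; 0; 1; 1; 1];
m = numel(y);
X = [ones(m, 1), x];               % prepend the intercept column

% Logistic (sigmoid) function
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Cross-entropy cost for a parameter vector theta
cost = @(theta) -mean(y .* log(sigmoid(X * theta)) + ...
                      (1 - y) .* log(1 - sigmoid(X * theta)));

theta = zeros(2, 1);
alpha = 0.1;                       % learning rate (assumed)
initialCost = cost(theta);

% Batch gradient descent (assumed optimizer, chosen for brevity)
for iter = 1:2000
    grad = (X' * (sigmoid(X * theta) - y)) / m;
    theta = theta - alpha * grad;
end
finalCost = cost(theta);

% Classify with a 0.5 probability threshold
predictions = sigmoid(X * theta) >= 0.5;
```

The decision boundary is the set of points where X * theta equals 0, which is a line when two features (or two principal components) are plotted.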

The accuracies achieved are as follows:

Training accuracy: 96.146789%, F1 score: 0.948403

Testing accuracy: 98.550725%, F1 score: 0.972222
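For reference, accuracy and F1 score can be computed from predicted and true labels as in the sketch below. The label vectors are made up for illustration and are unrelated to the results reported above; the variable names are likewise assumptions.

```matlab
% Made-up true labels and predictions (1 = malignant, 0 = benign)
yTrue = [1; 1; 1; 0; 0; 0; 0; 1];
yPred = [1; 1; 0; 0; 0; 1; 0; 1];

% Confusion-matrix counts
tp = sum(yPred == 1 & yTrue == 1);   % true positives
fp = sum(yPred == 1 & yTrue == 0);   % false positives
fn = sum(yPred == 0 & yTrue == 1);   % false negatives

% Accuracy as a percentage of correctly classified samples
accuracy = mean(yPred == yTrue) * 100;

% F1 score is the harmonic mean of precision and recall
precision = tp / (tp + fp);
recall    = tp / (tp + fn);
f1 = 2 * precision * recall / (precision + recall);

fprintf('Accuracy: %f, F1 score: %f\n', accuracy, f1);
```

Reporting F1 alongside accuracy is useful here because the benign and malignant classes are not equally frequent, and accuracy alone can hide poor recall on the malignant class.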

Installation

The code is written in MATLAB 9.0.0.341360 (R2016a). Run model.m, which executes the workflow step by step; the console output explains what each step does.

Author

Nirav Jain [email protected]
