Better Breast Cancer Care with AI
The Project
In this project, a novel deep learning (DL) model was built to predict whether a breast tumor is benign (non-cancerous) or malignant (cancerous). The dataset used was the Breast Cancer Wisconsin Diagnostic Dataset, which is publicly available and contains de-identified data. It has 569 cases, 357 benign & 212 malignant. Each case contains 30 features computed from digitised images of cell nuclei from a breast biopsy. The model was written in Python using JupyterLab. The data was rescaled so that the values of every feature fit within a range of 0-1. The data was split into training & testing groups using four train/test percentages (90/10, 75/25, 60/40 & 50/50). The multilayer perceptron model used in this project has an input layer, 3 hidden layers with 30 nodes each, & an output layer. ReLU was used as the activation function. The output errors were calculated & the weight values in the hidden layers were adjusted to reduce them. This process was repeated until the errors were minimised. The model was run 30 times for each train/test split so that the accuracy estimate did not depend on a single run. The mean test accuracy & standard deviation for each train/test split over the 30 trials were calculated. The mean test accuracies for train/test (%/%) groups 90/10, 75/25, 60/40 & 50/50 were 95.4%, 96.7%, 96.5% & 96.3% respectively. The standard deviations were 0.038, 0.016, 0.012 & 0.012 respectively. Pairwise statistical t-tests between mean test accuracies showed no significant difference (p>0.05) among the 4 different train/test splits. Therefore, all train/test splits performed comparably. Based on these results, we can conclude that a DL model can be a reliable, efficient, convenient and inexpensive method for breast tumor prediction. These advantages, compared to traditional diagnosis by pathologists, may lead to a quicker start of treatment and so to a better outcome.
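The pairwise comparison of the four splits can be roughly re-checked from the reported summary numbers alone. The sketch below is not the project's own code: it assumes the standard deviations are expressed as fractions (0.038 meaning 3.8 percentage points), that each split has 30 independent runs, and that an unequal-variance (Welch) t-test is an acceptable stand-in, since the write-up does not say which t-test was used. The names `results`, `welch_t` and `pairwise` are made up for illustration.

```python
from itertools import combinations
from math import sqrt

N_TRIALS = 30  # runs per split, as stated in the write-up

# Reported mean test accuracy and standard deviation per train/test split.
# Assumption: the standard deviations are fractions (0.038 = 3.8 points).
results = {
    "90/10": (0.954, 0.038),
    "75/25": (0.967, 0.016),
    "60/40": (0.965, 0.012),
    "50/50": (0.963, 0.012),
}


def welch_t(mean_a, sd_a, mean_b, sd_b, n=N_TRIALS):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each mean.
    var_a, var_b = sd_a ** 2 / n, sd_b ** 2 / n
    t = (mean_a - mean_b) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n - 1) + var_b ** 2 / (n - 1))
    return t, df


# One (t, df) pair for each of the six possible pairs of splits.
pairwise = {
    (a, b): welch_t(*results[a], *results[b])
    for a, b in combinations(results, 2)
}

for (a, b), (t, df) in pairwise.items():
    print(f"{a} vs {b}: t = {t:+.2f}, df = {df:.0f}")
```

With 30 runs per group the degrees of freedom fall between 29 and 58, where the two-sided 5% critical value of t is roughly 2.0. Every pair here gives a t value smaller than that in size, which agrees with the reported p>0.05.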
Team Comments
I chose to make this project because... Breast cancer is the most common cancer & the second leading cause of cancer death in women. I have friends & family with breast cancer. Early detection is key to successful treatment. But diagnosis by pathologists is expensive & takes weeks, causing a delay in treatment. It is also prone to misdiagnosis.
What I found difficult and how I worked it out: Choosing the right machine learning model for breast tumor prediction was the hardest step. I had to learn about various models & their pros & cons. It was time-consuming & a steep learning curve. Initially, I tried many models in my project. Finally, I chose a deep learning model for its accuracy.
Next time, I would... The dataset used here is small compared to what a DL model can handle. I would like to use a larger dataset to see how this DL model would perform. Also: 1. Use data from different geographical areas & from different socio-economic & ethnic backgrounds 2. Compare the DL model with other AI models 3. Try other algorithms