Predictive Analysis Of Breast Cancer Using Machine Learning Techniques


Rashmi Agrawal


This paper is a product of the research project “Predictive Analysis Of Breast Cancer Using Machine Learning Techniques”, carried out at Manav Rachna International Institute of Research and Studies, Faridabad, in 2018.

Introduction: This article is part of an effort to predict breast cancer, which is a serious concern for women’s health.

Problem: Breast cancer is one of the most common types of cancer among women and has long been a threat to women’s lives. Early diagnosis requires an effective prediction method that allows physicians to distinguish benign from malignant tumors. Researchers and scientists have been working to find innovative methods to predict cancer.

Objective: The objective of this paper is the predictive analysis of breast cancer using machine learning techniques such as Naïve Bayes, Linear Discriminant Analysis, K-Nearest Neighbors, and Support Vector Machines.

Methodology: Predictive data mining has become an instrument for scientists and researchers in the medical field. Predicting breast cancer at an early stage improves the prospects for treatment and cure. KDD (Knowledge Discovery in Databases) is one of the most popular data mining approaches among medical researchers; it is used to identify patterns and relationships between variables, and it helps predict the outcome of a disease from historical data.

Results: To select the best model for cancer prediction, the accuracy of each model will be estimated and the model with the highest accuracy will be selected.
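The selection step described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's code: the paper reports using R, whereas this sketch uses Python; the data points are made up rather than taken from the UCI dataset; and a small k-nearest-neighbors classifier and a majority-class baseline stand in for the four techniques named in the objective. All function and variable names are hypothetical.

```python
from collections import Counter
from math import dist


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def knn_predict(train_X, train_y, test_X, k=3):
    """Label each test point by majority vote among its k nearest training points."""
    preds = []
    for x in test_X:
        nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
        votes = Counter(label for _, label in nearest)
        preds.append(votes.most_common(1)[0][0])
    return preds


def majority_predict(train_y, test_X):
    """Baseline: always predict the most frequent training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label for _ in test_X]


def select_best(scores):
    """Return the name of the model with the highest accuracy."""
    return max(scores, key=scores.get)


# Made-up two-feature data, for illustration only (not the UCI dataset).
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9)]
train_y = ["benign", "benign", "benign", "malignant", "malignant"]
test_X = [(1.1, 0.9), (2.9, 3.0), (3.2, 3.1), (0.8, 1.0)]
test_y = ["benign", "malignant", "malignant", "benign"]

# Estimate the held-out accuracy of each candidate model, then pick the best.
scores = {
    "knn": accuracy(test_y, knn_predict(train_X, train_y, test_X, k=3)),
    "majority_baseline": accuracy(test_y, majority_predict(train_y, test_X)),
}
best = select_best(scores)
print(scores, best)
```

In practice, accuracy estimated on a single small hold-out set is noisy, so a resampling scheme such as cross-validation would usually replace the fixed split used here.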

Conclusion: This work seeks to identify the technique with the highest accuracy for breast cancer prediction.

Originality: This research was performed using R and a dataset taken from the UCI Machine Learning Repository.

Limitations: The lack of exact information provided by the data.




How to Cite
R. Agrawal, “Predictive Analysis Of Breast Cancer Using Machine Learning Techniques”, ing. Solidar, vol. 15, no. 29, pp. 1-23, Sep. 2019.
Research Articles

