Lung Cancer Detection and Analysis Using Data Mining Techniques, Principal Component Analysis and Artificial Neural Network

  • Kassimu Juma, Master Student, Northeastern University, Shenyang, China
  • Ma He, Associate Professor, Northeastern University, Shenyang, China
  • Yue Zhao, Professor, Northeastern University, Shenyang, China
Keywords: Artificial Neural Network (ANN), Feature Extraction, Lung Database, Principal Component Analysis (PCA), Region of Interest (ROI), Thresholding.


Successful diagnosis of lung cancer at an early stage increases the patient survival rate. Predicting and treating lung cancer effectively remains a challenge because lung nodules are difficult to detect, owing to their arbitrary shape, size and texture. In this paper, image processing is used for image pre-processing, image segmentation and feature extraction. An artificial neural network (ANN) is employed to learn the extracted nodule features, such as shape, size and volume, for nodule detection. Principal component analysis (PCA) is employed for multivariate data processing, where it is used to detect the complex interrelationships between diverse patient, disease and treatment variables. MATLAB was used for all lung image processing steps and for training the artificial neural network on the extracted features. XLSTART software was used for principal component analysis. The lung cancer database classifies lung images into two kinds: 1) normal images with no nodule and 2) nodule images, either benign or malignant. Using the proposed method, the accuracy obtained was 76%.
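The first stages of the pipeline described above (thresholding to obtain a region of interest, feature extraction, and PCA) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB or XLSTART implementation. The function names, the synthetic images, the threshold level of 0.5 and the four features are all assumptions chosen for illustration, and the ANN training stage is left out.

```python
import numpy as np


def threshold_roi(image, level):
    """Global thresholding: mark pixels brighter than `level` as the ROI."""
    return image > level


def roi_features(image, mask):
    """Return [area, height, width, mean intensity] of the masked region."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return np.zeros(4)  # no candidate region found
    area = float(rows.size)
    height = float(rows.max() - rows.min() + 1)
    width = float(cols.max() - cols.min() + 1)
    mean_intensity = float(image[mask].mean())
    return np.array([area, height, width, mean_intensity])


def pca(features, n_components):
    """Standardize the features and project them onto the leading components."""
    centered = features - features.mean(axis=0)
    std = centered.std(axis=0)
    std[std < 1e-12] = 1.0  # leave (near-)constant columns unscaled
    scaled = centered / std
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    scores = scaled @ vt[:n_components].T
    return scores, explained[:n_components]


def synthetic_slice(radius, size=32, seed=0):
    """Hypothetical test image: dark noisy background with one bright disc."""
    rng = np.random.default_rng(seed)
    image = rng.normal(0.2, 0.02, (size, size))
    yy, xx = np.mgrid[:size, :size]
    disc = (yy - size // 2) ** 2 + (xx - size // 2) ** 2 <= radius**2
    image[disc] = rng.normal(0.8, 0.02, int(disc.sum()))
    return image


# Build one feature row per synthetic image, then reduce to two components.
radii = [2, 3, 4, 5, 6, 7]
feature_matrix = np.array([
    roi_features(img, threshold_roi(img, 0.5))
    for img in (synthetic_slice(r, seed=r) for r in radii)
])
scores, explained = pca(feature_matrix, n_components=2)
print(scores.shape)
print(explained)
```

In a setup like this, the rows of `scores` (or the raw feature rows) would be the inputs passed to a classifier such as an ANN. Real CT data would also need pre-processing and a data-driven threshold, which this sketch does not attempt.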

