- Open Access
- Authors : Revathy R , Nithya B S , Reshma J J , Ragendhu S S, Sumithra M D
- Paper ID : IJERTV9IS060170
- Volume & Issue : Volume 09, Issue 06 (June 2020)
- Published (First Online): 10-06-2020
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Diabetic Retinopathy Detection using Machine Learning
Revathy R1, Nithya B S2, Reshma J J3, Ragendhu S S4, Sumithra M D5
1,2,3,4,5Dept of Computer Science and Engineering
1,2,3,4,5LBS Institute of Technology for Women, Thiruvananthapuram, Kerala.
Abstract: Diabetic retinopathy is a disease caused by uncontrolled chronic diabetes, and it can cause complete blindness if not treated in time. Early diagnosis and treatment are therefore essential to prevent its severe effects. Manual detection of diabetic retinopathy by an ophthalmologist takes a long time, and patients suffer during the wait. An automated system can detect diabetic retinopathy quickly, so that follow-up treatment can begin early and further damage to the eye can be avoided. This study proposes a machine learning method that extracts three features (exudates, hemorrhages and microaneurysms) and classifies images using a hybrid classifier that combines support vector machine, k-nearest neighbour, random forest, logistic regression and multilayer perceptron network classifiers. In the experiments, the hybrid approach reached an accuracy of 82%, with a precision of 0.8119, a recall of 0.8116 and an F-measure of 0.8028.
Keywords-Diabetic Retinopathy, KNN, SVM, Random Forest, Retinal Fundus Images
I. INTRODUCTION
Diabetic retinopathy is a complication of diabetes in which persistently high blood glucose damages the eye. It can cause vision loss and, in severe cases, complete blindness. Early symptoms include blurred vision, dark areas in the field of vision, eye floaters and difficulty perceiving colours. Detecting diabetic retinopathy at an early stage is extremely important to prevent complete blindness. Of an estimated 285 million people with diabetes mellitus worldwide, approximately one third have signs of diabetic retinopathy. Globally, the number of people affected by diabetic retinopathy is projected to increase from 126.6 million in 2010 to 191.0 million by 2030. Non-proliferative diabetic retinopathy (NPDR) is an early stage of the disease in which tiny red spots appear on the retina. These spots may be haemorrhages or microaneurysms, which are abnormal pouches in the blood vessel walls. The lining of these blood vessels can become damaged enough to leak fluid and fatty material, forming exudates.
Available clinical tests for diabetic retinopathy include pupil dilation, the visual acuity test and optical coherence tomography. However, they are time consuming and uncomfortable for patients. This paper focuses on automated, computer-aided detection of diabetic retinopathy using a hybrid machine learning model that extracts haemorrhages, microaneurysms and exudates as features. The classifier used in the proposed model is a hybrid combination of SVM and KNN.
II. LITERATURE REVIEW
[Farrikh Alzami, 2019] described a system for diabetic retinopathy grade classification based on fractal analysis and random forest, using the MESSIDOR dataset. Their system segmented the images and then computed fractal dimensions as features. It could not distinguish mild from severe diabetic retinopathy. [Qomariah, 2019] proposed an automated system for classifying diabetic retinopathy and normal retinal images using a convolutional neural network (CNN) and a support vector machine (SVM). The features comprised exudates, haemorrhages and microaneurysms. The authors divided the system into two parts: the first performed feature extraction with a neural network, and the second performed classification with an SVM. [Kumar, 2018] proposed a system for improved diabetic retinopathy detection that extracts the area and number of microaneurysms from colour fundus images in the DIARETDB1 dataset. The fundus images were pre-processed using green channel extraction, histogram equalization and morphological processing. Principal component analysis (PCA), contrast limited adaptive histogram equalization (CLAHE), morphological processing and averaging filtering were applied for microaneurysm detection, and classification was done by a linear SVM. [Mohamed Chetoui, 2018] proposed a system that detects diabetic retinopathy using different texture features and a machine learning classification model. Two features, haemorrhages and exudates, are extracted using local ternary pattern (LTP) and local energy-based shape histogram (LESH). An SVM is used for learning and classification on the histogram feature vectors produced by LTP and LESH.
[S Choudhury, 2016] proposed a system based on fuzzy C-means feature extraction and SVM classification of diabetic retinopathy. Blood vessels are extracted using a top-hat filter and mathematical morphology. Retinal vessel density and exudates are chosen as the features, and exudates are extracted by fuzzy C-means segmentation. A Gaussian radial basis function is used to map the training data into the SVM kernel space. [Sangwan, 2015] described a system that identifies different stages of diabetic retinopathy based on blood vessels, haemorrhages and exudates. The features are extracted through image pre-processing and fed into a neural network. SVM-based training is then applied to classify the images into three categories: mild non-proliferative, moderate non-proliferative and proliferative diabetic retinopathy. However, the system could not give the expected results if the exudate area in a fundus image exceeded the size of the optic disc.
[Morium Akter, 2014] described a system for morphology-based exudate detection from colour fundus images. The model uses grayscale conversion, histogram equalisation, thresholding, erosion, dilation, a logical AND operation and the watershed transformation. The system outputs the extent of the regions affected by exudates. [Handayani, 2013] proposed a system for classifying non-proliferative diabetic retinopathy using a soft margin SVM. Hard exudates in the retinal fundus images, segmented with mathematical morphology, are used to classify the severity level. However, the system does not include microaneurysms and haemorrhages as features. [Saravanan, 2013] proposed an automated system for red lesion detection in diabetic retinopathy based on microaneurysms, using a GMM classifier. The features are extracted using mathematical morphology, a filter-based method and a supervised learning method. The severity of candidate microaneurysms is graded into four stages. [Venkatalakshmi, 2011] described an automated system for hard exudate detection using sharp edges and colour highlights as the two features. The detection process involved colour-based classification, sharp edge detection and extraction of the optic disc. Training and testing were done using the DRIVE and DIARETDB0 datasets. The system used MATLAB 7.8 for its graphical user interface (GUI).
III. DATASET AND METHODS
Dataset
This study used the publicly available Kaggle dataset for diabetic retinopathy detection. The database was created from images taken from publicly available retinopathy detection datasets. It contains 1000 images with diabetic retinopathy and 1000 images without. From these we selected 122 images with diabetic retinopathy and 122 normal images. The selected abnormal images contain exudates, hemorrhages and microaneurysms.
The presence of diabetic retinopathy is judged from the appearance, number, spread, size and area of exudates, microaneurysms and hemorrhages, as shown in Fig. 1. Exudates are bright areas with a yellowish appearance whose colour differs only slightly from that of the optic disc. They occur when lipid leaks from ruptured blood vessels. Hemorrhages form when microaneurysms in the blood vessels rupture. Widespread exudates and hemorrhages appear in images of severe diabetic retinopathy, the last stage of the disease.
Fig 1. Normal image and an image with exudates, hemorrhages and microaneurysms
Steps
Fig 1c) Flow chart of proposed model: Input Image, Pre-processing, Image Segmentation, Feature Extraction, Classification (Normal or Abnormal)
Pre-processing
To find exudates, each image from the dataset is first converted to an HSV image. Colour space conversion translates an image represented in one colour space into another colour space, with the goal of making the translated image look as similar as possible to the original. Here the red, green and blue (RGB) channels of the image are converted to hue, saturation and value (HSV), which makes it easier to extract the yellow-coloured exudates. Edge zero padding, median filtering and adaptive histogram equalization are then applied. Fig. 2 shows images before pre-processing and Fig. 3 shows images after pre-processing.
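The colour space conversion can be illustrated with a minimal sketch using Python's standard `colorsys` module. The helper name `rgb_to_hsv_degrees` and the output scaling (hue in degrees, saturation and value on a 0 to 255 scale) are assumptions made for illustration; the paper does not state which library or scaling was used, and some libraries rescale hue (OpenCV's 8-bit HSV, for instance, stores hue as degrees divided by two).

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0-255, value 0-255)."""
    # colorsys works on floats in [0, 1], so scale the inputs down first.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 360), round(s * 255), round(v * 255)

# Pure yellow and pure blue, expressed in RGB.
yellow = rgb_to_hsv_degrees(255, 255, 0)
blue = rgb_to_hsv_degrees(0, 0, 255)
print(yellow, blue)
```

On this scale pure yellow has a hue of 60 degrees with full saturation and value, which is why a hue range around yellow is a convenient way to pick out yellowish exudates.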
Flowchart of pre-processing: Colour Space Conversion, Edge Zero Padding, Median Filtering, Adaptive Histogram Equalisation
Fig 2 a) Abnormal image before pre-processing
b) Normal image before pre-processing
Fig 3 a) Abnormal image after pre-processing
b) Normal image after pre-processing
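Edge zero padding followed by median filtering can be sketched with NumPy as below. This is an illustrative implementation, not the authors' code: the function name `median_filter_zero_padded` and the 3x3 window are assumptions, and adaptive histogram equalization is left out of the sketch.

```python
import numpy as np

def median_filter_zero_padded(channel, ksize=3):
    """Median-filter a 2-D array after padding its border with zeros."""
    pad = ksize // 2
    # Zero padding lets the window be centred on edge pixels as well.
    padded = np.pad(channel, pad, mode="constant", constant_values=0)
    out = np.empty_like(channel)
    rows, cols = channel.shape
    for r in range(rows):
        for c in range(cols):
            out[r, c] = np.median(padded[r:r + ksize, c:c + ksize])
    return out

# Uniform 5x5 patch with a single bright noise pixel in the centre.
noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255
filtered = median_filter_zero_padded(noisy)
print(filtered)
```

The isolated bright pixel is replaced by the median of its neighbourhood. With zero padding the corner pixels see more zeros than image values, so they are pulled to 0; this border effect is a known consequence of this padding choice.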
Image Segmentation
After pre-processing, exudates are segmented by smoothing, masking and a bitwise AND. Smoothing removes high spatial frequency noise from the image; the blurring is achieved by convolving the image with a low-pass filter kernel. Masking is an image processing method in which a small 'image piece' is defined and used to modify a larger image. Here the yellow-coloured ([60,255,255]) exudates and optic disc in the smoothed image are masked with blue ([0,0,255]).
Bitwise operations are used in image manipulation to extract the essential parts of an image, to build masks and to compose images. Here the input image is combined with the masked image by a bitwise AND, which eliminates everything other than the optic disc and exudates from the original image. Fig. 4 shows abnormal and normal images after exudate segmentation.
Fig: Flowchart of exudate segmentation: Smoothing, Masking, Bitwise-AND
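The masking and bitwise AND steps can be sketched in NumPy as follows. The function names and the lower and upper HSV bounds are hypothetical values chosen for this toy example; the paper does not report the thresholds it used, and smoothing is omitted here.

```python
import numpy as np

def in_range_mask(hsv, lower, upper):
    """Return a 0/255 mask of pixels whose three channels all lie in [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return np.where(inside, 255, 0).astype(np.uint8)

def apply_mask(image, mask):
    """Bitwise AND of every channel with the mask: masked-out pixels become 0."""
    return image & mask[..., np.newaxis]

# Toy 2x2 image in HSV and the same pixels in RGB:
# two yellowish pixels, one black pixel and one blue pixel.
hsv = np.array([[[60, 255, 255], [0, 0, 0]],
                [[55, 200, 250], [120, 255, 255]]], dtype=np.uint8)
original = np.array([[[255, 255, 0], [0, 0, 0]],
                     [[250, 230, 60], [0, 0, 255]]], dtype=np.uint8)

# Hypothetical bounds around the yellow hue.
mask = in_range_mask(hsv, lower=(50, 100, 100), upper=(70, 255, 255))
segmented = apply_mask(original, mask)
print(mask)
print(segmented)
```

Because 255 has all bits set, ANDing with it leaves a pixel unchanged, while ANDing with 0 clears it. Only the yellowish pixels survive in `segmented`.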
To segment hemorrhages and microaneurysms, median blurring, thresholding, image erosion and image dilation are performed. Thresholding partitions an image into foreground and background: it is a type of image segmentation that isolates objects by converting a grayscale image into a binary image. Erosion and dilation are morphological operations. Morphological erosion sets a pixel to the minimum over all pixels in its neighbourhood, and morphological dilation sets a pixel to the maximum over all pixels in its neighbourhood. Morphological opening is defined as an erosion followed by a dilation. Opening can remove small bright spots and connect small dark cracks, which tends to open up gaps between features. Fig. 5 shows abnormal and normal images after segmentation of hemorrhages and microaneurysms. The segmented images are binary images in which the white spots represent the detected lesions. These white regions are counted and used as features for classification.
Fig: Flowchart of hemorrhages and microaneurysms segmentation: Green Channel Extraction, Morphological Opening, Image Complement, Smoothing, Thresholding
Fig 4. a) Abnormal images with segmented exudates
b) Normal images without exudates
Fig.5 a) Abnormal images with hemorrhages and micro aneurysms
b) Normal images without hemorrhages and micro aneurysms
Fig 5. c) Abnormal images with segmented hemorrhages and micro aneurysms
d) Normal images after segmentation.
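Thresholding, erosion, dilation and opening, as defined in the segmentation text above, can be sketched with NumPy. The 3x3 square neighbourhood, the threshold value and the border handling (padding with 1 for erosion and 0 for dilation) are assumptions made for this illustration; green channel extraction, complementing and blurring are not shown.

```python
import numpy as np

def threshold(gray, t):
    """Binarise a grayscale image: 1 where the pixel is brighter than t, else 0."""
    return (gray > t).astype(np.uint8)

def _neighbourhood_stack(binary, ksize, pad_value):
    """Stack every shifted view of the image covered by a ksize x ksize window."""
    pad = ksize // 2
    padded = np.pad(binary, pad, mode="constant", constant_values=pad_value)
    rows, cols = binary.shape
    return np.stack([padded[dr:dr + rows, dc:dc + cols]
                     for dr in range(ksize) for dc in range(ksize)])

def erode(binary, ksize=3):
    """Each pixel becomes the minimum over its neighbourhood."""
    return _neighbourhood_stack(binary, ksize, 1).min(axis=0)

def dilate(binary, ksize=3):
    """Each pixel becomes the maximum over its neighbourhood."""
    return _neighbourhood_stack(binary, ksize, 0).max(axis=0)

def opening(binary, ksize=3):
    """Erosion followed by dilation."""
    return dilate(erode(binary, ksize), ksize)

gray = np.zeros((7, 7), dtype=np.uint8)
gray[1:4, 1:4] = 200   # a 3x3 bright region
gray[5, 5] = 200       # an isolated bright noise pixel
binary = threshold(gray, 127)
opened = opening(binary)
print(opened)
```

In this toy image the opening removes the isolated pixel, because erosion deletes it and dilation has nothing to grow back, while the 3x3 region is restored to its original extent.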
Feature Extraction
Two features are used for binary classification: the extent of exudates is the first parameter, and the extent of hemorrhages and microaneurysms is the second. Each feature is computed by counting the white pixels in the corresponding segmented image and dividing the count by the total number of pixels in the image.
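The white-pixel ratio described above can be written in a few lines of NumPy. The function names and the toy 4x4 masks are illustrative assumptions.

```python
import numpy as np

def white_pixel_ratio(binary_image):
    """Fraction of non-zero (white) pixels in a segmented binary image."""
    return np.count_nonzero(binary_image) / binary_image.size

def extract_features(exudate_mask, lesion_mask):
    """Two-element feature vector: (exudate ratio, hemorrhage/microaneurysm ratio)."""
    return (white_pixel_ratio(exudate_mask), white_pixel_ratio(lesion_mask))

# Toy 4x4 segmentation results.
exudate_mask = np.zeros((4, 4), dtype=np.uint8)
exudate_mask[0:2, 0:2] = 255       # 4 white pixels out of 16
lesion_mask = np.zeros((4, 4), dtype=np.uint8)
lesion_mask[3, 2:4] = 255          # 2 white pixels out of 16

features = extract_features(exudate_mask, lesion_mask)
print(features)
```

Normalising by the image size keeps the features comparable across images of different resolutions.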
Classification
The proposed method uses a hybrid classifier. That is, we combine five classifiers, of which support vector machine (SVM), k-nearest neighbours (KNN) and random forest are described below. Each classifier labels each of the 244 images as either normal or abnormal. The SVM classifier uses a radial basis function (RBF) kernel with degree 3. The outputs of the individual classifiers are then combined by voting, which forms the hybrid method. The training and testing sets are prepared in an 80:20 ratio; the classifiers are trained on the training set and evaluated on the testing set.
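As a small check of the 80:20 split, the sketch below shuffles the 244 sample indices and divides them. The function name, the fixed seed and the choice to round the test size up are assumptions; rounding up was chosen because it yields the 49 test samples mentioned in the results.

```python
import math
import random

def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Rounding the test size up is an assumption made for this sketch.
    n_test = math.ceil(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(244)
print(len(train_idx), len(test_idx))
```

With 244 images this gives 195 training samples and 49 test samples. A stratified split, which keeps the normal and abnormal classes balanced in both sets, would be a natural refinement.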
SVM: Support Vector Machine is a supervised machine learning algorithm that can be used for both classification and regression, although it is mostly used for classification. Each data item with n features is plotted as a point in n-dimensional space, where the value of each feature is the value of a particular coordinate. The algorithm then separates the plotted data points into classes by means of a hyperplane.
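The two ingredients mentioned for the SVM, a separating hyperplane and a radial basis function kernel, can be illustrated with a minimal NumPy sketch. The weights, bias and gamma below are made-up values, not fitted parameters, and the code does not train an SVM; it only shows how a hyperplane assigns a class and how the RBF kernel measures similarity.

```python
import numpy as np

def linear_decision(w, b, x):
    """Class 1 if x lies on the non-negative side of the hyperplane w.x + b = 0, else 0."""
    return 1 if float(np.dot(w, x)) + b >= 0 else 0

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2): 1 for identical points, near 0 for distant ones."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Hypothetical hyperplane x0 - x1 = 0 in a two-feature space.
w, b = np.array([1.0, -1.0]), 0.0
print(linear_decision(w, b, [2.0, 1.0]), linear_decision(w, b, [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [0.0, 0.0]), rbf_kernel([0.0, 0.0], [1.0, 0.0]))
```

With a kernel, the hyperplane is found in an implicit feature space, so the boundary in the original two-feature space can be curved.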
KNN: The k-nearest neighbours (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used for both classification and regression problems. A supervised machine learning algorithm relies on labelled input data to learn a function that produces an appropriate output when new unlabelled data is fed to it. KNN captures the idea of similarity, also called distance, proximity or closeness, by calculating the distance between points in feature space. This distance is used to classify the given data: a smaller distance to a data point indicates higher similarity.
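A minimal KNN classifier over the two-feature representation can be sketched as follows. The Euclidean distance, k = 3 and the toy training points are assumptions for illustration; they are not values taken from the experiments.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    points = np.asarray(train_x, dtype=float)
    distances = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy features (exudate ratio, lesion ratio); label 0 = normal, 1 = abnormal.
train_x = [(0.00, 0.00), (0.01, 0.00), (0.00, 0.01),
           (0.10, 0.05), (0.12, 0.06), (0.09, 0.04)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, (0.11, 0.05)))
print(knn_predict(train_x, train_y, (0.005, 0.005)))
```

An odd k avoids ties in a two-class problem.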
Random Forest: A random forest consists of a large number of individual decision trees. A decision tree is drawn upside down, with its root at the top. It contains condition (internal) nodes, at which the tree splits into branches (edges). The end of a branch that does not split any more is a decision (leaf). The fundamental principle behind random forest is the wisdom of crowds, i.e. a large number of relatively uncorrelated models (trees) operating as a committee will outperform any of the individual constituent models.
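The wisdom-of-crowds argument can be made concrete with a short calculation. Under the idealised assumption that the voters are fully independent and each is correct with the same probability, the chance that a strict majority is correct follows from the binomial distribution. Real trees in a forest are only partly uncorrelated, so this is an upper-bound style illustration rather than a model of the actual classifier.

```python
from math import comb

def majority_accuracy(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is correct."""
    needed = n_voters // 2 + 1
    return sum(comb(n_voters, k) * p_correct ** k * (1 - p_correct) ** (n_voters - k)
               for k in range(needed, n_voters + 1))

# One voter, a committee of three, and a committee of 101, each voter 70% accurate.
for n in (1, 3, 101):
    print(n, majority_accuracy(n, 0.7))
```

For three voters the majority is correct with probability 0.784, already above the 0.7 of a single voter, and the figure approaches 1 as the committee grows.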
Voting: Voting is the simplest method of combining the outputs of multiple machine learning algorithms. First, two or more standalone models are trained on the training dataset. A voting classifier then combines the standalone models and averages the predictions of the sub-models when new data is given to it. The predictions of the sub-models can be weighted, with the weight of each model set manually or heuristically.
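A minimal sketch of voting over binary predictions is given below, in both unweighted and weighted form. It uses hard (label) voting for simplicity, which is an assumption: the text speaks of averaging predictions, which could also mean averaging class probabilities. The prediction arrays and the weights are made-up values, and ties are resolved towards class 0.

```python
import numpy as np

def majority_vote(predictions):
    """Hard vote over an array of shape (n_classifiers, n_samples) holding 0/1 labels."""
    predictions = np.asarray(predictions)
    votes_for_one = predictions.sum(axis=0)
    return (votes_for_one * 2 > predictions.shape[0]).astype(int)

def weighted_vote(predictions, weights):
    """Weighted hard vote: class 1 wins if it holds more than half of the total weight."""
    predictions = np.asarray(predictions)
    weights = np.asarray(weights, dtype=float)
    score = weights @ predictions
    return (score > weights.sum() / 2).astype(int)

# Hypothetical outputs of three classifiers on four test images.
svm_pred = [0, 1, 1, 0]
knn_pred = [0, 1, 0, 0]
rf_pred = [1, 1, 0, 0]
stacked = [svm_pred, knn_pred, rf_pred]

print(majority_vote(stacked))
print(weighted_vote(stacked, weights=[1, 1, 3]))
```

In the weighted call the third classifier carries more than half of the total weight, so its prediction decides every sample; this shows how strongly the choice of weights can shape the ensemble.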
IV. RESULTS AND DISCUSSION
Sensitivity is defined as the ability of a test to correctly detect people with the disease.
Sensitivity = TP / (TP + FN)
Specificity is defined as the ability of a test to correctly exclude people without the disease.
Specificity = TN / (TN + FP)
A true positive (TP) is the case in which the test result is positive and the individual has the disease. A true negative (TN) is the case in which the test result is negative and the individual does not have the disease. A false positive (FP) is the case in which the test result is positive but the individual does not have the disease. A false negative (FN) is the case in which the test result is negative but the individual has the disease.
SVM gives 68% accuracy, the KNN classifier gives 76% and random forest gives 90%. After voting of the three classifiers, the accuracy on the testing set is 82%. The hybrid approach produced a precision of 0.8619, a recall of 0.8116 and an F-measure of 0.8028. That is, of the 49 test samples, 36 were predicted correctly.
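The evaluation measures can be computed from the four confusion-matrix counts as in the sketch below. The label lists are made-up toy data, not the paper's test set, and the function names are illustrative. The code does not guard against empty denominators.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(tp, tn, fp, fn):
    """Return the usual binary classification measures as a dictionary."""
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f_measure": f_measure}

# Toy ground truth and predictions for ten images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
scores = evaluate(*counts)
print(counts, scores)
```

Recall is the same quantity as sensitivity, and the F-measure is the harmonic mean of precision and recall.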
V. CONCLUSION
In the proposed method, hemorrhages, exudates and microaneurysms are detected. For exudate detection, green channel extraction, masking, smoothing and bitwise AND are performed, which results in better calculation and extraction of exudates. For detection of hemorrhages and microaneurysms, morphological operations such as opening, dilation and erosion are performed. For diabetic retinopathy detection, the microaneurysms, hemorrhages and exudates in an image are counted in order to decide the condition of the image. The features are then calculated and fed to the SVM, KNN and random forest classifiers, and the vote of the three classifiers is taken as the final prediction. The extracted features thus directly give the disease grade as normal or abnormal. Earlier detection and diagnosis of diabetic retinopathy can help protect patients from blindness and reduce the severe effects of the disease.
REFERENCES
- Farrikh Alzami, Abdussalam, Rama Arya Megantara and Ahmad Zainul Fanani, Diabetic Retinopathy Grade Classification based on Fractal Analysis and Random Forest, International Seminar on Application for Technology of Information and Communication, 2019.
- Dinial Utami Nurul Qomariah, Handayani Tjandrasa and Chastine Fatichah, Classification of Diabetic Retinopathy and Normal Retinal Images using CNN and SVM, 12th International Conference on Information and Communication Technology and System, 2019.
- Shailesh Kumar and Basant Kumar, Diabetic Retinopathy Detection by Extracting Area and Number of Microaneurysms from Colour Fundus Images, 5th International Conference on Signal Processing and Integrated Networks, 2018.
- Mohamed Chetoui, Moulay A Akhloufi, Mustapha Kardoucha, Diabetic Retinopathy Detection using Machine Learning and Texture Features, IEEE Canadian Conference on Electrical and Computer Engineering, 2018.
- S Choudhury, S Bandyopadhyay, SK Latib, DK Kole, C Giri, Fuzzy C Means based Feature Extraction and Classification of Diabetic Retinopathy using Support Vector Machines, International Conference on Communication and Signal Processing, April 2016.
- Surbhi Sangwan, Vishal Sharma, Misha Kakkar, Identification of Different Stages of Diabetic Retinopathy, International Conference on Computer and Computational Sciences, 2015.
- Morium Akter, Mohammed Shorif Uddin, Mahmudul Hasan Khan, Morphology based Exudate Detection from Colour Fundus Images in Diabetic Retinopathy, International Conference on Electrical Engineering and Information and Communication Technology, 2014.
- Handayani Tjandrasa, Ricky Eka Putra, Arya Yudhi Wijaya, Isye Arieshanti, Classification of Non-Proliferative Diabetic Retinopathy based on Hard Exudates using Soft Margin SVM, IEEE International Conference on Control System, Computing and Engineering, November 2013.
- V Saravanan, B Venkatalakshmi, Vithiya Rajendran, Automated Red Lesion Detection in Diabetic Retinopathy, IEEE Conference on Information and Communication Technologies, 2013.
- Prof B Venkatalakshmi, V Saravanan, G Jenny Niveditha, Graphical User Interface for Enhanced Retinal Image Analysis for Diagnosing Diabetic Retinopathy, 2013.
- V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, and J. Cuadros, Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, JAMA, vol. 316, no. 22, p. 2402, 2016.
- R. Gargeya and T. Leng, Automated identification of diabetic retinopathy using deep learning, Ophthalmology, vol. 124, no. 7, pp. 962-969, 2017.
- B. Graham, Kaggle diabetic retinopathy detection competition report, https://kaggle2.blob.core.windows.net/forum-messageattachments/88655/2795/competitionreport.pdf/, August 6, 2015, accessed May 20, 2018.
- C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, Rethinking the inception architecture for computer vision, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818-2826.
- M. Lin, Q. Chen, and S. Yan, Network in network, arXiv preprint arXiv:1312.4400, 2013.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, Going deeper with convolutions, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1-9.