A Three Stage Detection of Tuberculosis using Adaptive Thresholding in Chest Radiographs

DOI : 10.17577/IJERTV5IS080187

Binu Joykutty

Dept. of Electronics and Communication Engineering Amal Jyothi College of Engineering

Kanjirappally, Kerala, India

Binoshi Samuvel

Dept. of Electronics and Communication Engineering Amal Jyothi College of Engineering

Kanjirappally, Kerala, India

Abstract: Tuberculosis is a life-threatening disease with high mortality and morbidity rates, so reliable tools are needed to diagnose it in time. To address this need, we propose a scheme for detecting tuberculosis in chest X-ray images. The method works in three stages: segmentation, feature extraction and classification. The lung region is first segmented using adaptive thresholding. Feature extraction then captures the information contained in the segmented image. The resulting feature set is passed to a support vector machine, which distinguishes normal from abnormal chest images. The algorithm is evaluated using four performance criteria: accuracy, sensitivity, specificity and area under the ROC curve (AUC). Simulation results indicate that the method is effective in detecting tuberculosis.

Keywords: Feature Extraction; Learning Mechanism; Tuberculosis; Segmentation.

  1. INTRODUCTION

    The increasing mortality rate of tuberculosis (TB) is a serious health problem in many resource-challenged regions of the world. The World Health Organization (WHO) has identified tuberculosis as the second leading cause of death from an infectious disease, after HIV. It is most prevalent in Southeast Asia and parts of sub-Saharan Africa. According to the 2012 worldwide estimate, 8.5 million new cases of TB and 1.3 million deaths were reported. The disease is caused by the bacterium Mycobacterium tuberculosis. When a person with active TB coughs or sneezes, the bacteria become airborne and can be inhaled into the lungs of people nearby. TB can be cured, and its spread avoided, if it is diagnosed and treated at the right time.

    Tuberculosis develops in stages. Once the bacteria reach the lung, they start to multiply and the alveolar surfaces become infected. TB is accompanied by symptoms such as tiredness, night sweats, loss of appetite, a cough lasting three weeks or more, fever and weight loss. Several tests have been developed, but many have either low sensitivity or low specificity. The TB skin test is a common and highly sensitive test for determining whether someone has been exposed to TB, but it cannot confirm active TB. A more reliable method is the sputum test, in which cultured sputum samples are analyzed under a microscope. Although it is definitive, it is slow and depends on the patient's ability to produce sputum, which is difficult for elderly patients or patients co-infected with other diseases. Blood tests, known as interferon-gamma release assays (IGRA), are also used to identify TB, but they are rarely applied in screening because of their high cost. Chest X-ray is a relatively cheap method to screen for the presence of TB, and manifestations of TB are clearly depicted in chest X-rays.

    In this paper, our objective is to develop an automatic system that accurately detects the presence of TB from chest X-rays in three stages. The chest image is first segmented using adaptive thresholding. Features are then extracted from the segmented image, as discussed later in detail. Finally, a trained classifier labels the chest X-ray image as normal or abnormal.

    The paper is organized as follows. Section II reviews related work on existing detection systems. Section III describes the dataset used for our experimental analysis. Section IV presents our approach in detail. Experimental results are covered in Section V. Section VI concludes the paper with a brief summary of the main results and findings.

  2. RELATED WORK

    The explosive growth of digital chest radiography in medical imaging has given new impetus to screening and diagnosis, with standard chest radiography being the most complex imaging tool. Several computer-aided diagnosis (CAD) methods for TB detection have been developed, and many CAD papers have appeared in recent years. Melendez et al. [1] consider both a supervised CAD system and a multiple-instance learning (MIL) based CAD system. The supervised system addresses TB detection through pixel classification: feature vectors are associated with image pixels, and the labels of those vectors are determined from the available lesion annotations. After training, the system is expected to infer the correct labels of unseen data. This approach has limitations. Manually annotating a large X-ray dataset is tedious and time-consuming, and diffuse lesions are difficult to delineate accurately. Moreover, when new training data become available, the annotation process has to be carried out again.

    Developing a CAD system for X-ray analysis is a demanding task, and researchers have proposed solutions for its individual components. Segmentation of the lung field is one typical task that any CAD system should support for proper analysis of chest X-rays. Van Ginneken [2] compared various segmentation techniques, including active shapes, pixel classification and combinations of these, and concluded that pixel classification performs better on their test data. Dawoud [3] combined intensity information with shape priors using an iterative segmentation approach trained on the JSRT database.

    Candemir [4] presents a graph-cut based lung segmentation method. The method consists of two stages: (I) calculation of an average lung shape model, and (II) lung boundary detection based on graph cut. Segmentation is difficult because of poor contrast and distortions caused by disease-related variations in lung shape. Training masks are used to learn the lung shape model. Instead of using all training masks, a subset selected with a simple shape similarity measure is used to improve the accuracy of the lung shape model.

    Jaeger [5] presents tuberculosis detection using a combination of multiple segmentation masks. The lung field is segmented by combining an intensity mask, a statistical lung model mask and a Log Gabor mask. A set of shape, curvature and texture features is then extracted from the segmented lung field, and a support vector machine uses these features to classify X-rays as normal or abnormal. A limitation of this approach is that more spectral information needs to be captured while maintaining maximum spatial localization.

    Van Ginneken [6] showed ways to detect abnormal signs of a textural nature. The lung is divided into overlapping regions, and features are extracted from each region. Diffuse textural abnormalities are detected by taking the moments of the responses to a filter bank. In addition, differences between corresponding regions in the left and right lung fields are used as features. The extracted features are then used for the final classification, which is done by voting and weighted integration.

    Hogeweg [7] combines textural, focal and shape abnormality subsystems into one system to deal with the heterogeneous expression of abnormalities in different populations. The lungs and clavicles are segmented to restrict the analysis by the subsystems to the lung fields and to provide spatial context. The segmentation is based on supervised pixel classification and requires a set of features to be computed for each pixel, which makes the system slower.

    Karargyris [8] combines texture and shape features to classify chest X-rays into TB and non-TB cases. The algorithm uses shape features to describe the geometrical properties of the lung fields and texture features to represent image characteristics inside the lung field. Air cavity segmentation and lung anatomy segmentation are also discussed in terms of the number of iterations. Air cavity segmentation gives a larger area under the receiver operating characteristic curve, whereas lung anatomy segmentation gives better TB classification in terms of recall (sensitivity).

    Fig. 1. Flowchart representation of the proposed system. The system takes chest X-ray as input and outputs its decision whether it is normal or abnormal.

    Fig. 2. Chest X-ray and its segmented lung field after adaptive thresholding.

  3. DATASET

    The dataset used here is the standard digital image database created by the Japanese Society of Radiological Technology (JSRT). We used this dataset for both training and testing. It contains manual annotations of the lung fields. The set comprises 247 chest X-ray images, of which 154 have lung nodules and 93 have no nodules. The nodule images are classified into five groups by degree of subtlety, ranging from level 1 (extremely subtle) to level 5 (obvious). All images have a size of 2048 × 2048 pixels with a 12-bit gray-scale depth. The pixel spacing in the vertical and horizontal directions is 0.175 mm. This database is useful for research, education and demonstrations.

  4. PROPOSED SYSTEM

    This section describes the proposed system for TB detection, which consists of three stages: lung segmentation, feature extraction and classification. A flowchart of the system is given in Fig. 1, and the stages are discussed in detail below. For a given input chest X-ray (CXR), the lung region is the region of interest and is segmented using adaptive thresholding. Feature extraction takes place in the second stage. The third stage is classification, which takes the features extracted in the second stage as input, together with a training set. The classification stage decides whether the image is normal or abnormal.

    1. Adaptive Thresholding Based Lung Segmentation

      Adaptive thresholding separates the desired foreground objects from the background based on the difference in pixel intensities between regions. It is suited to images in which lighting conditions differ from one area to another. Its distinguishing characteristic is that the threshold changes dynamically over the image: the algorithm calculates a threshold for each small region, so different regions of the same image receive different thresholds. Adaptive thresholding typically takes a grayscale or color image as input and outputs a binary image representing the segmentation. A threshold has to be calculated for each pixel in the image.

      Steps for finding threshold for each region:

      • Input the image to be segmented.

      • Select a b × b region around the pixel location.

      • Calculate the weighted average of the b × b region: WA(x, y).

      • Compute the threshold for the pixel location (x, y):

        T(x, y) = WA(x, y) − c

        where c is a constant parameter whose value depends on the threshold mode used. If the pixel value is below the threshold, it is set to the background value; otherwise it is set to the foreground value.
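The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the experiments: the weights are uniform (a plain local mean), border pixels are handled by clamping coordinates to the image edge, and the function name `adaptive_threshold`, the values of b and c, and the toy image are all hypothetical.

```python
def adaptive_threshold(image, b=3, c=2.0):
    """Binarize a grayscale image (list of rows) with a local threshold.

    For each pixel, T(x, y) = WA(x, y) - c, where WA is the average of the
    b x b window centred on the pixel. Uniform weights and edge clamping
    are assumptions of this sketch.
    """
    if b % 2 == 0:
        raise ValueError("b must be odd so the window has a centre pixel")
    height, width = len(image), len(image[0])
    r = b // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # Clamp coordinates so border pixels reuse edge values.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            threshold = total / (b * b) - c
            # Below the threshold -> background (0), otherwise foreground (1).
            out[y][x] = 0 if image[y][x] < threshold else 1
    return out


# Hypothetical 4 x 4 image: a bright square on a dark border.
toy = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
mask = adaptive_threshold(toy, b=3, c=2.0)
for row in mask:
    print(row)
```

Because the threshold is recomputed for every window, the same pixel value can fall on either side of it depending on its neighbourhood, which is what distinguishes this scheme from a single global threshold.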

    2. Feature Extraction

      Feature extraction transforms the input data into a set of features that captures the relevant information. The feature set used here is an object-detection-inspired set that can capture the fine, delicate structures in chest images needed for the task. The computed histograms are shown in Fig. 3.

      Object Detection Inspired Feature Set: a set of shape, edge and texture descriptors. Each descriptor is summarized by a histogram that shows the distribution of its values across the lung field. The following features are used for detection [4]:

      • Intensity histogram: shows the distribution of intensity values across the lung field. Pixels inside the lung fields have lower values than those of the surrounding tissues but higher values than areas outside the lung.

        • Gradient magnitude histogram: associated with the directional change in the intensity of an image. Gradients are computed in both the X and Y directions to capture the full range of change; pixels with large gradient values in the direction of the gradient become candidate edge pixels.

          Fig. 3. Computed histogram for object detection inspired feature set.

        • Shape descriptor histogram: numerically represents each region or boundary in the segmented image. The lung region is decomposed into sectors, and the shape is represented by a spherical array whose value in each bin corresponds to the area of the surface that falls into the corresponding sector. A curvature descriptor is also used, which accounts for the curvature properties of the lung field.

        • Histogram of oriented gradients: a feature descriptor for object detection. The technique counts occurrences of gradient orientations in localized portions of an image, so that local object appearance and shape are described by the distribution of intensity gradients or edge directions. The image is divided into small connected regions called cells, and a histogram of gradient directions is computed over the pixels of each cell.

        • Local binary pattern: a texture descriptor that encodes the intensity differences between neighbouring pixels and responds to edges. The binary patterns are obtained by thresholding the intensity of each neighbouring pixel against that of the central pixel. It is an efficient and computationally simple texture descriptor.

      Each histogram bin is one feature, and the features of all descriptors are concatenated to form a feature vector, which serves as the input to the classifier. In our experiments, 32 bins per histogram gave satisfactory results.
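The histogram-and-concatenate idea can be illustrated with a short Python sketch. It covers only two of the descriptors listed above (intensity and a basic 8-neighbour local binary pattern) and makes several assumptions that are not taken from the paper: an 8-bit intensity range, normalized histograms, LBP codes computed over all interior pixels rather than only the lung mask, and hypothetical helper names.

```python
def histogram(values, bins=32, lo=0.0, hi=256.0):
    """Count values into equal-width bins over [lo, hi), normalized to sum 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def lbp_codes(image):
    """Basic 8-neighbour local binary pattern codes for interior pixels."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            centre = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            codes.append(code)
    return codes


def feature_vector(image, mask, bins=32):
    """Concatenate the intensity and LBP histograms (bins features each)."""
    lung_pixels = [
        image[y][x]
        for y in range(len(image))
        for x in range(len(image[0]))
        if mask[y][x]
    ]
    return histogram(lung_pixels, bins) + histogram(lbp_codes(image), bins)


# Hypothetical 3 x 3 patch with every pixel inside the mask.
patch = [[5, 9, 1], [4, 6, 7], [2, 3, 8]]
full_mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
features = feature_vector(patch, full_mask)
print(len(features), lbp_codes(patch))
```

With two descriptors and 32 bins each, the vector has 64 entries; adding the remaining descriptors would extend it in the same way, one block of bins per descriptor.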

    3. Classification

    A support vector machine (SVM) is used to classify each input image as normal or abnormal. The SVM is a non-probabilistic, supervised classifier. Its basic idea is to construct a hyperplane that separates the samples of two classes in a high-dimensional space. The algorithm finds the hyperplane with the largest minimum distance to the training examples, so the optimal separating hyperplane maximizes the margin of the training data. In general, the larger the margin, the lower the generalization error of the classifier. To detect abnormal CXRs with TB, the feature vectors of abnormal CXRs should lie at a positive distance from the separating hyperplane, while the feature vectors of normal CXRs should lie at a negative distance. The SVM hyperplane is illustrated in Fig. 4.
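The sign convention described above can be made concrete with a small sketch. It assumes a linear decision function f(x) = w·x + b, which the paper does not state (the kernel is not specified), and the weight vector and bias below are hypothetical values chosen for easy arithmetic, not trained parameters.

```python
import math


def signed_distance(x, w, b):
    """Signed distance of feature vector x from the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return score / norm


def classify(x, w, b):
    """Positive side of the hyperplane -> abnormal, otherwise normal."""
    return "abnormal" if signed_distance(x, w, b) > 0 else "normal"


# Hypothetical hyperplane in a two-dimensional feature space.
w = [3.0, 4.0]
b = -5.0
print(signed_distance([3.0, 4.0], w, b), classify([3.0, 4.0], w, b))
print(signed_distance([0.0, 0.0], w, b), classify([0.0, 0.0], w, b))
```

In training, w and b would be chosen to maximize the margin, that is, the smallest absolute value of this distance over the training examples.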

  5. EXPERIMENTAL RESULTS

    The proposed method is applied to chest X-ray images from the JSRT database. The database consists of 247 images, of which 154 are nodule images and 93 are non-nodule images. Of the 247 images, 73% are assigned to the training set and the remaining 27% to the test set.

    Each loaded image is first segmented using adaptive thresholding. Feature extraction then transforms the segmented image into feature sets, and a histogram is computed for each shape and texture descriptor. Based on the resulting feature set, the SVM classifies the image as abnormal or normal.

    A. Computing Accuracy and ROC performance of the proposed system

    The performance of the proposed system is assessed using the Receiver Operating Characteristic (ROC) curve. Classification performance is measured with four criteria: accuracy, sensitivity, specificity and area under the ROC curve (AUC).

    Accuracy is the fraction of all test samples that are classified correctly and is given in equation (1). Sensitivity indicates how often the test is right among people who have the disease and is given in equation (2). Specificity indicates how often the test is right among people who are well and is given in equation (3). The area under the curve summarizes how reliably the classifier separates the two classes and is obtained from the ROC curve.
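As an illustration of how an AUC value is obtained from an ROC curve, the sketch below sweeps a decision threshold over classifier scores and integrates the curve with the trapezoidal rule. The labels and scores are hypothetical toy values, not data from our experiments, and the function name `roc_auc` is an assumption of this sketch.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the trapezoidal rule.

    labels: 1 for abnormal, 0 for normal; scores: higher means more abnormal.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    auc = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr, fpr = tp / pos, fp / neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return auc


# Hypothetical scores for three abnormal and three normal images.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(toy_labels, toy_scores), 4))
```

An AUC of 1 corresponds to scores that separate the two classes perfectly, while 0.5 corresponds to scores that carry no class information.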

    We obtained an accuracy of 75.2% with an AUC of 88% for the proposed system. It correctly classified 70 of the 93 test samples.

    Fig. 4. Support Vector Machine Hyper plane.

    Accuracy = (TP + TN) / (Total Samples)    (1)

    Sensitivity = TP / (TP + FN)    (2)

    Specificity = TN / (TN + FP)    (3)

    where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively. Table I shows the confusion matrix of the SVM trained on the combination of features, and Table II shows the performance values obtained with the SVM. The ROC performance of the proposed system is shown in Fig. 5.
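Equations (1) to (3) can be checked against the counts reported in Table I (TP = 38, FP = 14, FN = 9, TN = 32). The short sketch below does the arithmetic; the function name `evaluate` is hypothetical.

```python
def evaluate(tp, fp, fn, tn):
    """Evaluate equations (1)-(3) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,     # equation (1)
        "sensitivity": tp / (tp + fn),     # equation (2)
        "specificity": tn / (tn + fp),     # equation (3)
        "misclassified": fp + fn,
        "total": total,
    }


metrics = evaluate(tp=38, fp=14, fn=9, tn=32)
for name, value in metrics.items():
    print(name, value)
```

The counts give 70 correct decisions out of 93 (accuracy of about 75.3%), a sensitivity of 38/47 (about 80.85%), a specificity of 32/46 (about 69.57%) and 23 misclassified samples, which agrees with the values listed in Table II up to rounding.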

    TABLE I
    CONFUSION MATRIX OF SVM CLASSIFIER

    Test Condition    Predicted Condition
                      Positive    Negative
    Positive          38 (TP)     14 (FP)
    Negative          9 (FN)      32 (TN)

    Fig. 5. ROC performance of proposed system.

    TABLE II
    PERFORMANCE EVALUATION OF PROPOSED SYSTEM

    Parameters used for performance evaluation and their obtained values

    Parameter                 Value
    Accuracy                  75.20%
    Area under the curve      87.50%
    Specificity               69.56%
    Sensitivity               80.85%
    Misclassification rate    23/93

  6. CONCLUSION

An automated system for detecting tuberculosis from chest radiographs has been developed. Detection is based on adaptive thresholding segmentation and a support vector machine. The proposed method detects TB with an accuracy of 75.2% and an AUC of 88%. As the system performs consistently on the extracted object-detection-inspired features, we conclude that further increasing the dimensionality of the feature set does not improve performance on our dataset. The method is simple and inexpensive, which makes it suitable for use in remote areas of emerging economies. The results suggest that adaptive thresholding gives a convincing segmentation performance. In future experiments, we will evaluate the system on larger datasets. We would like to conclude with a message: Stop TB worldwide and save mankind.

REFERENCES

  1. J. Melendez et al., "A Novel Multiple-Instance Learning-based Approach to Computer-Aided Detection of Tuberculosis on Chest X-Rays," IEEE Transactions on Medical Imaging, vol. 34, no. 1, January 2015, pp. 179-192.

  2. B. van Ginneken and B. ter Haar Romeny, "Automatic segmentation of lung fields in chest radiographs," Med. Phys., vol. 27, no. 10, 2000, pp. 2445-2455.

  3. A. Dawoud, "Fusing shape information in lung segmentation in chest radiographs," Image Anal. Recognit., 2010, pp. 70-78.

  4. S. Candemir, S. Jaeger, K. Palaniappan, S. Antani, and G. Thoma, "Graph-cut based automatic lung boundary detection in chest radiographs," in Proc. IEEE Healthcare Technol. Conf.: Translat. Eng. Health Med., 2012, pp. 31-34.

  5. S. Jaeger, A. Karargyris, S. Antani, and G. Thoma, "Detecting tuberculosis in radiographs using combined lung masks," in Proc. Int. Conf. IEEE Eng. Med. Biol. Soc., 2012, pp. 4978-4981.

  6. B. van Ginneken, S. Katsuragawa, B. ter Haar Romeny, K. Doi, and M. Viergever, "Automatic detection of abnormalities in chest radiographs using local texture analysis," IEEE Trans. Med. Imag., vol. 21, no. 2, Feb. 2002, pp. 139-149.

  7. L. Hogeweg et al., "Automatic Detection of Tuberculosis in Chest Radiographs Using a Combination of Textural, Focal, and Shape Abnormality Analysis," IEEE Transactions on Medical Imaging, February 2015, pp. 1-20.

  8. Alexandros Karargyris, Jenifer Siegelman, Dimitris Tzortzis, Stefan Jaeger, Sema Candemir, Zhiyun Xue, KC Santosh, Szilard Vajda, Sameer Antani, Les Folio and George R, "Combination of texture and shape features to detect pulmonary abnormalities in digital chest X-rays," Springer, June 2015, pp. 1-8.
