ABSTRACT
Lung infiltration is a non-communicable condition in which materials with a higher density than air exist in the parenchyma tissue of the lungs. Lung infiltration can be hard to detect in an X-ray scan even for a radiologist, especially at the early stages, which makes it a leading cause of death. In response, several deep learning approaches have been developed to address this problem. This paper proposes the Slide-Detect technique, a Deep Neural Network (DNN) model based on Convolutional Neural Networks (CNNs) that is trained to diagnose lung infiltration with an Area Under Curve (AUC) of up to 91.47%, an accuracy of 93.85% and relatively low computational resources.
1. INTRODUCTION
Lung infiltration [3] is the existence of materials with a higher density than air within the parenchyma tissue [22] of the lungs. These materials can vary from protein, pus, blood, surfactant and oedema to foreign cells [12]. The diagnosis of such disorders is considered quite complex because clinicians rely on detecting hard-to-locate radiological abnormalities and checking the accompanying clinical symptoms, which can be confusing and misleading. The infiltration condition is non-resolving, slow-progressing and has no accurately defined radiological features [21]. Figure 1 shows a non-small-cell lung infiltration X-ray scan①.
Despite the vast advances and effort in medical sciences, the survival rates of lung cancer are very low, making it a leading cause of death [1]. At some point, even resection is not effective unless the condition is recognized at an early stage, i.e. infiltration [10]. Aside from lung cancer, infiltration has recently been associated with asthma conditions [33].
Infiltration in lung X-ray scans [32] can be very challenging to interpret, as dense materials such as bones appear with higher pixel values than the lungs, as shown in figure 1. X-ray scans are also 2D projections, not volume scans like Magnetic Resonance Imaging (MRI). These facts make it rather difficult to distinguish the infiltration condition, with the bones of the thoracic cage surrounding the lungs, even for a radiologist.
DNN models can effectively extract the spatial features of an image. Consequently, CNNs perform very well in image classification, especially in Computer-Aided Diagnosis (CAD) [20] [6].
Although CNNs perform very well in CAD, it is hard to train a network to detect a tiny segment of infiltration in a typical X-ray scan, whose resolution is around 28xxX28xx pixels. Thus, the techniques in the literature had to use very deep CNN architectures to detect the presence of the unusual substance within the parenchyma tissue. Very deep CNN architectures introduce issues such as over-fitting, high training cost due to the vast number of parameters to tune, and slow performance due to the many matrix multiplications required to classify an image.
Deep neural networks are excellent pattern detectors; however, finding a visually undefined object (i.e. infiltration) covered by bones in large 2D images can be challenging. The Slide-Detect approach addresses this problem by concentrating the lung infiltration training dataset on the image segment that is known to be infected, thus making the infiltration pattern clearer for the DNN to detect.
The rest of the paper is organized as follows: the “Related Work” section reviews the different techniques for automating the detection of lung infiltration in X-ray scans. The “Methodology” section details the proposed Slide-Detect technique. The “Results” section presents the results of running Slide-Detect on the “ChestXray-NIHCC” dataset, along with a discussion of the results and a comparison with the state-of-the-art techniques. The “Conclusion” section concludes the work. Finally, the “Future Work” section discusses the limitations of the proposed methodology that can be addressed in future research.
2. RELATED WORK
Singhal et al. [30] tested the performance of state-of-the-art deep architectures in training a classification model for several lung conditions, including “Atelectasis” [27], “Cardiomegaly” [8], “Effusion” [9], “Infiltration” [3], “Mass” [24], “Nodule” [34], “Pneumonia” [28] and “Pneumothorax” [23]. The tested DNN architectures are: AlexNet [36], a deep CNN with 5 convolutional layers and 3 fully connected multi-layer perceptrons [7]; GoogLeNet [31], a 22-layer DNN developed by Google AI in 2015; VGGNet-16 [37], a 16-layer model that takes a (224X224) RGB image as input; and ResNet-50 [11], a 50-layer variant of the very deep residual network family. Table 1 compares the performance of these architectures. The highest AUC score was 61.27%, achieved by the ResNet-50 architecture.
Liang et al. [18] designed a CNN network with two branches called “dense networks with relative location awareness for thorax disease identification”. The first branch is a U-net [29] that masks and segments the lung and heart from the X-ray scan, then produces a relative location map of the generated masks based on the Euclidean distance between them. The other branch is a 121-layer dense net that takes the resized image (256X256 pixels) along with the extracted Euclidean distance map as input and produces its classification. The two branches are fused together. This method achieved 70.9% AUC in detecting the infiltration condition.
Ho et al. [13] proposed an ensemble feature extraction method that combines 5 techniques working in two branches. The first branch is composed of four shallow feature extraction techniques used in succession: SIFT [19], which decomposes the structural features of an image patch; GIFT [26], which extracts the orientation and scale of the different objects of an image; LBP [25], which extracts the texture features of different objects in an image; and HOG [5], which extracts the histogram-based features within an image. This shallow branch runs in parallel with a 121-layer pre-trained CNN. The outputs of the two branches are combined in a feature integration stage, whose output is fed to a supervised machine learning algorithm [17] to make the final classification. The AUC of this method for detecting lung infiltration is 70.3%.
Allaouzi et al. [2] proposed a method composed of three stages. Their Binary Relevance (BR) classifier was the fastest and achieved the highest AUC of 87%.
Chen et al. [4] proposed a technique using a MobileNet CNN [15] to classify (diagnose) lung conditions, including lung infiltration, on the “ChestXray-NIHCC” dataset [35]. They preferred MobileNet [15] because it was designed to run effectively on embedded devices. The proposed network had over 3 million parameters to optimize and resulted in an AUC of 57%.
Kavyashree et al. [15] proposed a two-stage algorithm for lung condition classification. The first stage segments each image and extracts the lungs themselves into a new image using U-net [14]. The resulting AUC was 75.1%.
The techniques proposed in the literature [30] [13] [2] [4] and [15] were mainly based on two assumptions:

One model can fit all the 14 lung diseases found in the dataset.

The exact same features can be used to diagnose all the diseases.

These assumptions led to low performance in terms of AUC, especially for the lung infiltration disease. Using features such as spatial distances or geometrical shapes does not yield the best results, because the key point in detecting lung infiltration is density. Hence, a relatively deep neural network had to be used, which contributed to considerably slow performance.
3. METHODOLOGY
This section discusses the proposed Slide-Detect technique and the whole methodology adopted for implementing and evaluating it. The approach is composed of 4 successive stages. The “Pre-Processing” section illustrates the procedure adopted to pre-process the “ChestXray-NIHCC” dataset. The “Feature Extraction” section shows the procedure of extracting the relevant features of the images in preparation for the classification process. The “DNN” section details the architecture and training of the classifier. Finally, the “Testing Procedure” section reviews the algorithm used to evaluate the performance of the Slide-Detect technique.
3.1 Pre-Processing
Different X-ray machines are shipped with different sensors, X-ray emitters, etc. Thus, the produced images have different pixel thresholds and spectrum responses. To address these problems, the images are normalized, then converted to 8-bit RGB images and saved. Algorithm 1 illustrates the image normalization procedure.
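As a minimal sketch, this normalization step could look like the following, assuming simple min-max scaling (the exact bounds and function names below are illustrative, not Algorithm 1 verbatim):

```python
import numpy as np
from PIL import Image

def normalize_to_rgb8(path: str, out_path: str) -> None:
    """Min-max normalize a raw X-ray scan, then save it as an 8-bit RGB image."""
    raw = np.asarray(Image.open(path), dtype=np.float32)
    # Rescale pixel values to [0, 255] regardless of the sensor's native range.
    scaled = (raw - raw.min()) / (raw.max() - raw.min() + 1e-8) * 255.0
    # Replicate the single grayscale channel into three channels (8-bit RGB).
    Image.fromarray(scaled.astype(np.uint8)).convert("RGB").save(out_path)
```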
After that, a series of rotation, translation, re-scaling, flip and zoom operations is applied to both the control and sample datasets, and the results are added to their corresponding datasets together with the original images to improve the classifier's capabilities. These operations are conducted using the ImageDataGenerator package of the Keras [16] platform. At this step, it is important to keep both the sample and control datasets balanced.
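A sketch of this augmentation step with Keras is shown below; the specific parameter ranges and the directory layout are assumptions, as the paper does not state them:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotations, translations, re-scaling, flips and zooms, applied identically
# to the sample and control sets so they stay balanced. Ranges are illustrative.
augmenter = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rescale=1.0 / 255,
    horizontal_flip=True,
    zoom_range=0.1,
)

# Hypothetical directory layout: one sub-folder per class (sample/control).
train_flow = augmenter.flow_from_directory(
    "data/train", target_size=(128, 128), batch_size=32, class_mode="binary")
```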
3.2 Feature Extraction
Learning and training are most effective when they are most concentrated. Thus, instead of training the DNN model to find a vague object, the training uses the bounding boxes located in the “BBox List 2017.csv” file in the dataset. Images with positive infiltration labels are cropped around the infected area to create a sample dataset, as shown in algorithm 2.
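A sketch of this sample-set cropping follows, assuming the bounding-box file provides the image name together with the box origin and size (the column names below are illustrative, not the file's actual header):

```python
import pandas as pd
from PIL import Image

boxes = pd.read_csv("BBox List 2017.csv")  # bounding boxes of labelled findings
for _, row in boxes.iterrows():
    if row["label"] != "Infiltration":     # keep only the infiltration boxes
        continue
    img = Image.open(f"images/{row['image']}")
    x, y, w, h = int(row["x"]), int(row["y"]), int(row["w"]), int(row["h"])
    # Crop around the known infected area to build the sample (positive) set.
    img.crop((x, y, x + w, y + h)).save(f"sample/{row['image']}")
```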
To create a control class dataset, the healthy labeled images are randomly cropped as shown in algorithm 3. As a final preparation step, the datasets are divided into training and testing sets, each with sample and control sets, as shown in algorithm 4.
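The control-set cropping and the split could be sketched as follows, assuming 128X128 crops to match the classifier input; the paths and the train/test ratio are assumptions, as the paper does not state them:

```python
import glob
import random
from PIL import Image
from sklearn.model_selection import train_test_split

def random_crop(img: Image.Image, size: int = 128) -> Image.Image:
    """Crop a random size x size window from a healthy scan (control class)."""
    x = random.randint(0, img.width - size)
    y = random.randint(0, img.height - size)
    return img.crop((x, y, x + size, y + size))

# Hypothetical paths and split ratio.
control_files = glob.glob("control/*.png")
train_files, test_files = train_test_split(control_files, test_size=0.2,
                                           random_state=42)
```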
3.3 DNN
A 2-step pipeline is created: the first step resizes the input image to (128X128X3); the second feeds the resulting image to a 5-layer DNN network. As shown in figure 2, the DNN is composed of 3 convolutional layers, each of which is coupled with a max-pooling layer. After that, the net is flattened into a 2-layer dense Multi-Layer Perceptron (MLP) with 0.2 dropout at the last hidden layer. The optimizer used is Adam [38].
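A Keras sketch of this architecture is given below; the filter counts, kernel sizes and dense-layer width are assumptions, since the paper specifies only the layer types, the input size, the 0.2 dropout and the Adam optimizer:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    # Three convolutional layers, each coupled with a max-pooling layer.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Flatten into a 2-layer MLP with 0.2 dropout at the last hidden layer.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),  # sample (infiltrated) vs. control
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC()])
```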
Figure 3 is a block diagram that shows how the proposed method (Slide-Detect) processes the data (scanned images). First, the entire dataset is normalized. Then, the data classes are separated. After that, the image portions are created (each class separately). Then, the transformations are applied before training the CNN classifier. During these transformations, the density is preserved, thus increasing the quality of the training data for lung infiltration diagnosis.
Figure 4 illustrates the technique used to classify new unseen images. First, the new image is normalized. Then, a sliding window is used to simulate the image cropping performed during the training phase. Finally, each output is fed to the CNN classifier to make a decision. If the classifier classifies 5 consecutive portions as positive, the image is classified as a sample image.
3.4 Testing Procedure
To test the performance of the Slide-Detect technique, all the images marked as healthy in the dataset are loaded along with all the images marked as infected. Each image is progressively scanned in (128X128) patches. Each patch is normalized, then fed into the DNN classifier. If the DNN classifier marks at least 5 patches as abnormal, then the image is classified as an infiltrated scan. Otherwise, the image is classified as healthy. Algorithm 5 demonstrates the procedure adopted to test these images.
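A minimal sketch of this testing loop is given below; the non-overlapping 128-pixel stride is an assumption, as the paper only states that images are scanned progressively in (128X128) patches:

```python
import numpy as np

def classify_scan(scan: np.ndarray, model, patch: int = 128, stride: int = 128,
                  min_positive: int = 5) -> bool:
    """Scan the image in patch x patch windows; return True (infiltrated)
    once at least `min_positive` patches are classified as abnormal."""
    positives = 0
    for y in range(0, scan.shape[0] - patch + 1, stride):
        for x in range(0, scan.shape[1] - patch + 1, stride):
            window = scan[y:y + patch, x:x + patch].astype(np.float32)
            # Per-patch min-max normalization, mirroring the pre-processing stage.
            window = (window - window.min()) / (window.max() - window.min() + 1e-8)
            window = np.repeat(window[..., np.newaxis], 3, axis=-1)  # to 3 channels
            prob = model.predict(window[np.newaxis, ...], verbose=0)[0, 0]
            if prob > 0.5:
                positives += 1
                if positives >= min_positive:
                    return True   # classified as an infiltrated scan
    return False                  # classified as healthy
```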
4. RESULTS
In this section, the results of the experiments done on the “ChestXray-NIHCC” dataset are discussed.
The “ChestXray-NIHCC” [35] is the largest publicly available chest X-ray dataset, published by the NIH Clinical Center. It contains 112,120 labeled records of 30,805 unique patients. Each record is linked with a chest X-ray image, resized to 1024X1024X3 pixels. Each record may have one or more of the following labels: “Cardiomegaly, Emphysema, Healthy, Hernia, Infiltration, Mass, Nodule, Effusion, Pneumothorax, Pleural Thickening, Consolidation, Edema, Pneumonia and Atelectasis”.
The classes are interleaved and imbalanced, as shown in figure 5. The largest class is the healthy class with 60,412 instances, followed by the infiltration class. Age is an important indicator when it comes to non-resolving conditions such as infiltration. Figure 6 shows the age distribution of the infiltration patients. The mean age is 46.198 with a standard deviation of 17.08, and the number of cases peaks in the age interval (50, 60].
Table 2 shows that the Slide-Detect technique outperformed the state-of-the-art techniques while offering relatively lower computational cost, and table 3 shows the confusion matrix of the Slide-Detect technique. The confusion matrix shows that the Slide-Detect model produces more false positives than false negatives, which is more suitable for medical applications than the contrary.
| Technique | AUC |
| --- | --- |
| MobileNet [15] | 57% |
| AlexNet [30] | 60.40% |
| GoogLeNet [30] | 60.87% |
| VGGNet-16 [30] | 58.95% |
| ResNet-50 [30] | 61.27% |
| Dense Networks with Relative Location Awareness [18] | 70.9% |
| Multiple Feature Integration [13] | 70.3% |
| Two-stream Collaborative Network [4] | 75.1% |
| Slide-Detect | 91.27% |
| | True | False |
| --- | --- | --- |
| Positive | 33027 | 9456 |
| Negative | 69331 | 104 |
Figure 7 compares Slide-Detect with the state-of-the-art techniques in terms of AUC and computational cost (the number of layers used by each model). Although ResNet-50 [30] used very deep CNN layers, it did not perform much better than shallower approaches such as MobileNet [15], AlexNet [30], GoogLeNet [30] and VGGNet-16 [30]. In addition, it was clearly surpassed by the much shallower Two-stream Collaborative Network [4]. This is a clear indication that simply increasing the CNN depth increases the computational cost but does not guarantee much better accuracy. The highest accuracies were achieved by the Two-stream Collaborative Network [4], Dense Networks with Relative Location Awareness [18] and Multiple Feature Integration [13], an indication that feature selection has more impact on solving this problem than using complex CNN networks. It is worth noting that Dense Networks with Relative Location Awareness [18] and Multiple Feature Integration [13] have by far more layers than the Two-stream Collaborative Network [4] but did not manage to achieve better accuracy. The Slide-Detect approach took advantage of both sides: it used a smaller CNN classifier and adopted a concentrated learning approach that only considered images cropped around the infection area during training, thus simplifying the model and reducing the computational cost while achieving better accuracy.
5. CONCLUSION
The Slide-Detect technique used a highly concentrated deep learning approach to train a DNN that is able to diagnose lung infiltration with an AUC of up to 91.47% and an accuracy of 93.85%, outperforming the current state-of-the-art techniques. The Slide-Detect approach is also highly efficient in terms of computational cost and memory compared to the state-of-the-art approaches, as it is composed of fewer layers. Slide-Detect thus made a significant leap by eliminating the logically irrelevant features, considering only features that are significant to the process of lung infiltration diagnosis, focusing the training process on the parts of the scans which have been labelled as positive, and using a properly sized network for the classification process.
6. FUTURE WORK
The Slide-Detect technique is designed specifically for the lung infiltration case that most of the state-of-the-art techniques failed to address, owing to its nature being quite different from that of other lung diseases. Future research can extend the approach to other diseases of a similarly distinctive nature.

A limitation of the Slide-Detect technique is that it only processes 2D chest scans; an adaptation can therefore be introduced to deal with multi-section or 3D scans. This would involve redesigning most of the algorithms used, including the normalization, the feature extraction, the cropping and the CNN classifier itself.
① Adapted from www.bestpractice.bmj.com/