COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation in COVID-19 identification using CXR images and evaluate which contents of the image influenced the most. Semantic segmentation was performed using a U-Net CNN architecture, and the classification using three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. A three-classes database was composed: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of creating a CXR image database from different sources, and the COVID-19 generalization from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented images achieved an F1-Score of 0.88 for the multi-class setup, and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. Experiments support the conclusion that even after segmentation, there is a strong bias introduced by underlying factors from different sources.
Impact of lung segmentation on the diagnosis and explanation of covid-19 in chest x-ray images
Nanni L.
;
2021
Abstract
COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation in COVID-19 identification using CXR images and evaluate which contents of the image influenced the most. Semantic segmentation was performed using a U-Net CNN architecture, and the classification using three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. A three-classes database was composed: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of creating a CXR image database from different sources, and the COVID-19 generalization from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented images achieved an F1-Score of 0.88 for the multi-class setup, and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. Experiments support the conclusion that even after segmentation, there is a strong bias introduced by underlying factors from different sources.File | Dimensione | Formato | |
---|---|---|---|
sensors-21-07116_compressed.pdf
accesso aperto
Tipologia:
Published (publisher's version)
Licenza:
Creative commons
Dimensione
2.56 MB
Formato
Adobe PDF
|
2.56 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.