Convolutional neural networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points providing additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the discrete wavelet transform and the other on the constant-Q Gabor transform. Pretrained ResNet50 networks are finetuned on each augmentation method. Combinations of these networks are evaluated and compared across four benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, a portrait dataset, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art (or comparable) performance across all four data sets. This result shows that varying data augmentation is a feasible way for building an ensemble of classifiers for image classification.
Comparison of different image data augmentation approaches
Nanni L.
;
2021
Abstract
Convolutional neural networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points providing additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the discrete wavelet transform and the other on the constant-Q Gabor transform. Pretrained ResNet50 networks are finetuned on each augmentation method. Combinations of these networks are evaluated and compared across four benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, a portrait dataset, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art (or comparable) performance across all four data sets. This result shows that varying data augmentation is a feasible way for building an ensemble of classifiers for image classification.File | Dimensione | Formato | |
---|---|---|---|
jimaging-07-00254.pdf
accesso aperto
Tipologia:
Published (publisher's version)
Licenza:
Creative commons
Dimensione
2.22 MB
Formato
Adobe PDF
|
2.22 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.