With technological innovation progressing rapidly, big data is now produced by a wide range of applications. Owing to its capability for handling complex problems, the Deep Neural Network (DNN) has become one of the fastest-growing and most exciting areas of Machine Learning (ML) in data-intensive fields. However, training such deep structures remains a challenge, owing to gradient problems and slow convergence. In big data analysis, the original dataset is usually characterized by high dimensionality, and the key features are often buried in noise, which increases computational complexity and slows down DNN training. Research indicates that the mathematical properties of activation functions may cause the gradient to explode or vanish, resulting in training failure or large errors. As networks go deeper to handle increasingly complex real-world problems, ever more trainable parameters of the DNN model must be trained with higher computational efficiency. In this thesis, we propose a new topology of Wavelet Neural Network (WNN), termed FDIDWT-MEXHACT-NN or simply FDMHNN, for fast Deep Learning (DL) in big data applications. It consists of two concatenated components. Preliminary work demonstrated that performing a Fast Fourier Transform (FFT) at the beginning of the hidden layers, for feature pre-extraction and noise reduction, improves the computational efficiency of training. Building on this, the first component of FDMHNN implements a new preprocessing method, FDIDWT, based on Correlation Fractal Dimension (CFD) theory and the Inverse Discrete Wavelet Transform (IDWT), which transforms the original dataset into a low-dimensional, feature-extracted one for lower computational complexity and faster DNN training.
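The abstract does not give the exact FDIDWT transform, but the FFT-based pre-extraction idea it builds on can be sketched roughly as follows: compute each sample's spectrum and keep only the low-frequency magnitudes as a reduced, noise-suppressed feature vector. This is a minimal, hypothetical illustration; the function name and the cutoff `k` are our own choices, not the thesis's method.

```python
import numpy as np

def fft_feature_preextract(X, k):
    """Reduce each row (sample) of X to its k lowest-frequency magnitude
    components. High-frequency bins, where noise tends to concentrate,
    are discarded, shrinking the feature dimension from X.shape[1] to k."""
    spectrum = np.fft.rfft(X, axis=1)   # real FFT along each sample
    return np.abs(spectrum)[:, :k]      # low-frequency magnitudes as features

# Example: 4 noisy samples of length 32 reduced to 8 features each.
X = np.random.randn(4, 32)
features = fft_feature_preextract(X, 8)
print(features.shape)                   # (4, 8)
```

The reduced features could then be fed to any downstream classifier in place of the raw high-dimensional samples.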
The second component of FDMHNN is a Multilayer Perceptron (MLP) model equipped with a derived Activation Function and Derivative Function pair (ADF): the Tight Frame Wavelet Activation Function (TFWAF) and the Tight Frame Wavelet Derivative Function (TFWDF) of the Mexican hat wavelet. These two functions, collectively termed TFMH, are designed to normalize and constrain the hidden-layer training data to constant energy, which stabilizes training and further speeds up convergence. The nonlinearity of the wavelet functions strengthens the learning capacity of the DNN model, while the sparsity of the derived wavelets reduces the computational complexity of training and enhances the model's robustness; TFMH also alleviates overfitting. The proposed FDMHNN model is stable, fast to construct, and fast to converge, and is evaluated through a variety of experiments: big data analysis for nonlinear system modelling and speech signal processing, and in particular feature extraction and classification of astronomical data collected from a nano-satellite simulation environment and from the real Fermi-LAT source catalog (3FGL). These experiments demonstrate that the deep FDMHNN model achieves more stable training and faster convergence than traditional deep MLPs, in binary classification tasks as well as in more complicated applications.
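The abstract does not state the tight-frame normalization used in TFWAF/TFWDF, but the underlying Mexican hat (Ricker) wavelet and its analytic derivative are standard, and using them as an activation/derivative pair can be sketched as below. The function names are our own; the un-normalized form psi(x) = (1 - x^2) exp(-x^2/2) is assumed.

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat (Ricker) wavelet, un-normalized:
    psi(x) = (1 - x^2) * exp(-x^2 / 2). Used here as a candidate
    wavelet activation function for a hidden layer."""
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

def mexican_hat_derivative(x):
    """Analytic derivative of the Mexican hat wavelet:
    psi'(x) = (x^3 - 3x) * exp(-x^2 / 2). Supplying this closed form
    to the backward pass avoids numerical differentiation."""
    return (x**3 - 3.0 * x) * np.exp(-x**2 / 2.0)
```

Because the wavelet decays as exp(-x^2/2), activations far from zero are driven toward zero, which is one way to read the abstract's claim that the sparsity of the wavelet reduces training cost.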
A New Topology of Wavelet Neural Network for Fast Deep Learning of Big Data Applications / Cao, Haitao. - (2019 Dec 01).
File | Access | Type | License | Size | Format
---|---|---|---|---|---
Cao_Haitao_thesis.pdf | Open access | Doctoral thesis | Not specified | 4.47 MB | Adobe PDF