First and zeroth order optimization methods for data science / Zeffiro, Damiano. - (2023 Mar 14).
First and zeroth order optimization methods for data science
ZEFFIRO, DAMIANO
2023
Abstract
Recent data science applications using large datasets often need scalable optimization methods with low per-iteration cost and low memory requirements. This has led to a renewed interest in gradient descent methods and in tailored variants for problems where gradient descent is impractical due, e.g., to nonsmoothness or stochasticity of the optimization objective. Applications include deep neural network training, adversarial attacks in machine learning, sparse signal recovery, and cluster detection in networks. In this thesis, we focus on the theoretical analysis of some of these methods, as well as on the formulation and numerical testing of new methods with better complexity guarantees than existing ones under suitable conditions. The problems we consider have a continuous, but not necessarily differentiable, objective, possibly subject to constraints. The main contributions concern both variants of the classic Frank-Wolfe (FW) method and direct-search schemes. In particular, we prove new support identification properties for FW variants, with an application to a cluster detection problem in networks; we introduce a technique to provably speed up the convergence of FW variants; and we extend some direct-search schemes to the stochastic nonsmooth setting, as well as to problems defined on Riemannian manifolds.
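For readers unfamiliar with the method named in the abstract, the following is a minimal illustrative sketch of the classic Frank-Wolfe iteration (linear minimization oracle plus convex combination step) on the probability simplex. It is not taken from the thesis: the function names, the quadratic objective, and the standard step size 2/(k+2) are assumptions made for illustration only.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, max_iter=100):
    """Classic Frank-Wolfe on the probability simplex (illustrative sketch).

    grad: callable returning the gradient of a smooth objective at x.
    x0:   feasible starting point (nonnegative entries summing to 1).
    """
    x = x0.copy()
    for k in range(max_iter):
        g = grad(x)
        # Linear minimization oracle: on the simplex the minimizer is a vertex,
        # i.e. the coordinate with the smallest gradient entry.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard diminishing step size gamma_k = 2 / (k + 2).
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Illustrative use (assumed data): minimize ||Ax - b||^2 over the simplex.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
grad = lambda x: 2.0 * A.T @ (A @ x - b)
x_star = frank_wolfe_simplex(grad, np.full(5, 0.2))
```

Each iterate remains a convex combination of simplex vertices, which is the source of the low per-iteration cost and sparsity properties mentioned in the abstract.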
File | Description | Type | Access | Size | Format
---|---|---|---|---|---
thesis_final.pdf | Thesis | Doctoral thesis | Open access | 5.41 MB | Adobe PDF