Android malware defense through a hybrid multi-modal approach

Conti M.
2025

Abstract

The rapid proliferation of Android apps has given rise to a dark side, where increasingly sophisticated malware poses a formidable challenge for detection. To combat this evolving threat, we present an explainable hybrid multi-modal framework. This framework leverages the power of deep learning, with a novel model fusion technique, to illuminate the hidden characteristics of malicious apps. Our approach combines models trained on attributes derived from static and dynamic analysis through a late-fusion strategy, thereby exploiting the unique strengths of each model. We thoroughly analyze individual feature categories, feature ensembles, and model fusion using traditional machine learning classifiers and deep neural networks across diverse datasets. Our hybrid fused model outperforms the others, achieving an F1-score of 99.97% on CICMaldroid2020. We use SHAP (SHapley Additive exPlanations) and t-SNE (t-distributed Stochastic Neighbor Embedding) to further analyze and interpret the best-performing model. We highlight the efficacy of our architectural design through an ablation study, revealing that our approach consistently achieves over 99% detection accuracy across multiple deep learning models. This lays the groundwork for substantial advancements in security and risk mitigation within interconnected Android OS environments.
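The late-fusion idea mentioned in the abstract can be pictured with a minimal sketch: one branch per modality (static and dynamic features), each producing its own malware probability, which are then combined after the branches have made their predictions. Everything below is an illustrative assumption — the feature dimensions, layer sizes, and the simple averaging rule are placeholders, not the paper's actual architecture or fusion operator.

```python
# Minimal late-fusion sketch (illustrative only, not the paper's exact model).
# Assumption: static features (e.g., permission/intent indicators) and dynamic
# features (e.g., system-call counts) are pre-extracted fixed-length vectors.
import torch
import torch.nn as nn


class BranchMLP(nn.Module):
    """One modality-specific branch that outputs a malware probability."""

    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)


class LateFusionDetector(nn.Module):
    """Fuses the two branches by averaging their probabilities (one simple late-fusion rule)."""

    def __init__(self, static_dim: int, dynamic_dim: int):
        super().__init__()
        self.static_branch = BranchMLP(static_dim)
        self.dynamic_branch = BranchMLP(dynamic_dim)

    def forward(self, x_static, x_dynamic):
        p_static = self.static_branch(x_static)
        p_dynamic = self.dynamic_branch(x_dynamic)
        return (p_static + p_dynamic) / 2  # fused malware probability


if __name__ == "__main__":
    # Toy batch: 4 apps, 200 static features and 50 dynamic features (made-up sizes).
    model = LateFusionDetector(static_dim=200, dynamic_dim=50)
    x_s = torch.rand(4, 200)
    x_d = torch.rand(4, 50)
    print(model(x_s, x_d).shape)  # torch.Size([4, 1])
```

Averaging per-branch probabilities is only one possible late-fusion rule; a weighted average or a small trainable fusion layer on top of the branch outputs is equally common, and the paper's specific choice is not reproduced here.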
Files in this record:

File: 1-s2.0-S1084804524002121-main.pdf
Access: Open access
Type: Published (Publisher's Version of Record)
License: Creative Commons
Size: 6.34 MB
Format: Adobe PDF

Use this identifier to cite or link to this item: https://hdl.handle.net/11577/3537869
Citations
  • PMC: not available
  • Scopus: 2
  • Web of Science: 2
  • OpenAlex: not available