Towards Emotionally Aware AI: Challenges and Opportunities in the Evolution of Multimodal Generative Models

Spanio, Matteo
2024

Abstract

The evolution of generative models in artificial intelligence (AI) has significantly expanded the capacity of machines to process and generate complex multimodal data such as text, images, audio, and video. Despite these advancements, the integration of emotional awareness remains an underexplored dimension. This paper examines the state of the art in multimodal generative AI, with a focus on existing models developed by major technology companies. It then proposes an approach to incorporate emotional awareness into AI models, which would enhance human-machine interaction by improving the interpretability and explainability of AI-generated decisions. The paper also addresses the challenges associated with building emotion-aware models, including the need for comprehensive multimodal datasets and the computational complexity of incorporating less-explored sensory modalities like olfaction and gustation. Finally, potential solutions are discussed, including the normalization of existing research data and the application of transfer learning to reduce resource demands. These steps are essential for advancing the field and unlocking the potential of emotion-aware multimodal AI in applications such as healthcare, robotics, and virtual assistants.
Proceedings of the AIxIA Doctoral Consortium 2024 co-located with the 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024)
AIxIA Doctoral Consortium

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3547316