Towards Emotionally Aware AI: Challenges and Opportunities in the Evolution of Multimodal Generative Models
Spanio, Matteo
2024
Abstract
The evolution of generative models in artificial intelligence (AI) has significantly expanded the capacity of machines to process and generate complex multimodal data such as text, images, audio, and video. Despite these advancements, the integration of emotional awareness remains an underexplored dimension. This paper examines the state of the art in multimodal generative AI, with a focus on existing models developed by major technology companies. It then proposes an approach to incorporate emotional awareness into AI models, which would enhance human-machine interaction by improving the interpretability and explainability of AI-generated decisions. The paper also addresses the challenges associated with building emotion-aware models, including the need for comprehensive multimodal datasets and the computational complexity of incorporating less-explored sensory modalities like olfaction and gustation. Finally, potential solutions are discussed, including the normalization of existing research data and the application of transfer learning to reduce resource demands. These steps are essential for advancing the field and unlocking the potential of emotion-aware multimodal AI in applications such as healthcare, robotics, and virtual assistants.