COVID-19 Outbreak through Tweeters’ Words: Monitoring Italian Social Media Communication about COVID-19 with Text Mining and Word Embeddings

Sciandra A.
2020

Abstract

In this paper we analyze Italian social media communication about COVID-19 using a Twitter dataset collected over two months. The text corpus was studied in terms of its sensitivity to the social changes affecting people's lives during this crisis. In addition, the results of a sentiment analysis performed with two lexicons were compared, and word embedding vectors were created from the available plain texts. We then tested the informative effectiveness of the word embeddings and compared them with a bag-of-words approach in terms of text classification accuracy. First results showed that these textual data have some potential for describing the different phases of the outbreak. However, a different strategy is needed for more reliable sentiment labeling, as the results produced by the two lexicons were discordant. Finally, although they gave interesting results in terms of semantic similarity, word embeddings did not show a higher predictive ability than the term frequency vectors.
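The comparison between two sentiment lexicons mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two toy lexicons, the example texts, the whitespace tokenization, and the helper names `label` and `agreement` are hypothetical and are not the lexicons, data, or procedure used in the paper.

```python
from collections import Counter

# Hypothetical toy lexicons (NOT the lexicons used in the paper).
# Each maps a word to a polarity score; they deliberately disagree on "safe".
LEXICON_A = {"good": 1, "hope": 1, "safe": 1, "fear": -1, "crisis": -1, "death": -1}
LEXICON_B = {"good": 1, "hope": 1, "crisis": -1, "lockdown": -1, "safe": -1}


def label(text, lexicon):
    """Sum word polarities and map the total to a sentiment label."""
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def agreement(texts, lex_a, lex_b):
    """Return the share of texts labeled identically and a cross-tabulation."""
    if not texts:
        raise ValueError("texts must not be empty")
    pairs = [(label(t, lex_a), label(t, lex_b)) for t in texts]
    matches = sum(a == b for a, b in pairs)
    return matches / len(pairs), Counter(pairs)


# Hypothetical example texts standing in for tweets.
TWEETS = [
    "hope we stay safe",
    "the crisis brings fear",
    "lockdown again",
    "good news today",
]

rate, table = agreement(TWEETS, LEXICON_A, LEXICON_B)
print(f"agreement rate: {rate:.2f}")
for (label_a, label_b), count in sorted(table.items()):
    print(f"A={label_a:8s} B={label_b:8s} n={count}")
```

A low agreement rate in a cross-tabulation of this kind is one simple way to see that two lexicons label the same texts differently, which is the type of discordance the abstract reports.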
2020 IEEE Symposium on Computers and Communications (ISCC)
978-1-7281-8086-1
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3466136