Efficient Multilingual Deep Learning Model for Keyword Categorization

Navarin N.
2021

Abstract

Keyword categorization is an essential tool for SEO (Search Engine Optimization), digital marketing, and online advertising. Keywords are among the most valuable sources of information for inferring users' intents and interests. An effective keyword categorization method reveals which types of content are in greatest demand and can help improve future content strategies and marketing or ad campaigns. In this paper, we present a novel deep learning model for multilingual keyword categorization. The model relies on fastText multilingual word embeddings, and its architecture is inspired by the DeepSets model. To make use of training words that are not included in the pre-trained fastText embeddings, we initialize each of them as the average of the embeddings of the words it co-occurs with. We then fine-tune these representations by allowing the network to back-propagate the error to the input. We assess the quality of our proposal on a real-world dataset provided by a Spanish company, in which keywords are categorized according to the Google Product Taxonomy (GPT). Empirical results show that our model achieves high accuracy while being extremely efficient.
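
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the toy 2-dimensional vectors, the choice to average over every co-occurrence (rather than over distinct words), and the single-layer ReLU maps are all assumptions made for illustration. Fine-tuning by back-propagation to the input is not shown.

```python
import numpy as np

def init_oov_embeddings(keywords, pretrained, dim):
    """Give each out-of-vocabulary word the mean vector of the known words
    it co-occurs with (assumption: one contribution per co-occurrence)."""
    sums, counts = {}, {}
    for kw in keywords:
        tokens = kw.lower().split()
        known = [pretrained[t] for t in tokens if t in pretrained]
        for t in tokens:
            if t in pretrained:
                continue
            sums.setdefault(t, np.zeros(dim))
            counts.setdefault(t, 0)
            for v in known:
                sums[t] += v
                counts[t] += 1
    # Words that never co-occur with a known word fall back to the zero vector.
    return {t: (sums[t] / counts[t] if counts[t] else np.zeros(dim)) for t in sums}

def deepsets_encode(vectors, W_phi, W_rho):
    """DeepSets-style encoding rho(sum_i phi(x_i)); the sum makes the result
    independent of word order."""
    phi = np.maximum(vectors @ W_phi, 0.0)    # per-word map phi
    pooled = phi.sum(axis=0)                  # permutation-invariant pooling
    return np.maximum(pooled @ W_rho, 0.0)    # set-level map rho

# Toy stand-in for pre-trained fastText vectors (hypothetical values).
pretrained = {
    "red": np.array([1.0, 0.0]),
    "shoes": np.array([0.0, 1.0]),
    "running": np.array([1.0, 1.0]),
}
keywords = ["red zapatillas", "running zapatillas shoes"]

# "zapatillas" is unknown, so it is initialized from red, running, and shoes.
oov = init_oov_embeddings(keywords, pretrained, dim=2)
vocab = {**pretrained, **oov}

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(2, 4))
W_rho = rng.normal(size=(4, 3))

X = np.stack([vocab[t] for t in keywords[1].split()])
encoded = deepsets_encode(X, W_phi, W_rho)
encoded_reversed = deepsets_encode(X[::-1], W_phi, W_rho)
```

Here `encoded` and `encoded_reversed` agree up to floating-point rounding, which is the property that makes a set-based encoder a natural fit for short keywords whose word order carries little information. In a trainable version, the rows of the embedding table would be parameters updated together with `W_phi` and `W_rho`.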
2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021 - Proceedings
978-1-7281-9048-8
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3440098