What's Old and New? Discovering Topics in the American Journal of Sociology
Stefano Sbalchiero; Arjuna Tuzzi
2018
Abstract
Text mining is an active field that responds to the growing mass of available digital texts, and several algorithms have been proposed to analyze and synthesize the vast amount of data that today represents a challenging source of information overload. Topic modeling is a collection of algorithms for discovering themes, i.e. topics, in unstructured text. Latent Dirichlet Allocation (LDA), proposed by Blei et al. (2003), was one of the first topic modeling algorithms; since then many variants and alternative algorithms have been suggested. The present study considers a topic as an indicator of the relevance of a research area in a specific time span, and its temporal evolution pattern as a way to identify paradigm changes in terms of theories, ideas, forgotten topics, evergreen subjects and newly emerging research interests. The study aims to contribute to a substantive reflection in Sociology by exploring the temporal evolution of topics in the abstracts of articles published by the American Journal of Sociology over the last century (1921-2016). Within the classical LDA perspective, the study also focuses on topics with a significant increasing or decreasing trend (Griffiths and Steyvers, 2004). The results show different shifts that involved relevant reflections on various issues, from the early debate on the “institutionalization” process of Sociology as a scientific discipline to recent developments of sociological topics that clearly indicate how sociologists have reacted to new social problems.
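The idea of an increasing or decreasing topic trend can be illustrated with a minimal sketch. It assumes that an LDA model has already produced a mean topic proportion per year, and it fits an ordinary least-squares line to each topic's series. The function names, the tolerance `tol`, and the toy proportions are hypothetical and are not taken from the study; the sketch only inspects the sign of the slope and does not perform the significance test that the abstract refers to.

```python
from statistics import mean


def linear_trend(years, proportions):
    """Return (slope, intercept) of an OLS line of proportions on years."""
    if len(years) != len(proportions) or len(years) < 2:
        raise ValueError("need two or more paired observations")
    year_bar = mean(years)
    prop_bar = mean(proportions)
    sxx = sum((y - year_bar) ** 2 for y in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((y - year_bar) * (p - prop_bar)
              for y, p in zip(years, proportions))
    slope = sxy / sxx
    return slope, prop_bar - slope * year_bar


def classify(slope, tol=1e-4):
    """Label a slope; tol is an arbitrary threshold, not a statistical test."""
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "stable"


# Toy, made-up yearly mean topic proportions (not results from the study).
years = [1950, 1960, 1970, 1980, 1990]
toy_topics = {
    "topic_a": [0.02, 0.04, 0.06, 0.08, 0.10],
    "topic_b": [0.10, 0.08, 0.06, 0.04, 0.02],
    "topic_c": [0.05, 0.05, 0.05, 0.05, 0.05],
}

trends = {
    name: classify(linear_trend(years, props)[0])
    for name, props in toy_topics.items()
}

for name, label in trends.items():
    print(name, label)
```

In a real analysis the slope would normally be accompanied by a test of its significance, and the per-year proportions would come from the fitted topic model rather than from hand-written lists.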