Improving olfactory assessment: an item response theory analysis of the American English version of the Sniffin' Sticks identification subtest

Tolomeo, Eva; Ceraudo, Leognano; Kolb, Ryann; Dalton, Pamela H; Liuzza, Marco Tullio; Parma, Valentina

doi:10.3389/fpsyg.2026.1661164

Introduction: The Sniffin' Sticks Extended Test (SSET) is one of the most widely used tools for assessing olfactory function in research and clinical settings. Despite its broad application, detailed psychometric evaluation of its items, including those within the identification subtest, remains limited. This study aimed to evaluate the reliability, validity, and item-level functioning of the SSET identification subtest using Item Response Theory (IRT), to identify potential weaknesses, and to propose possible areas for improvement. Methods: The study included 397 US-based participants (60.5% female; mean age = 44.61 years, SD = 18.17) who completed the American English version of the identification subtest of the SSET. IRT analyses were conducted using both a one-parameter logistic (1PL) model and a two-parameter logistic (2PL) model to estimate item difficulty and discrimination. A Differential Item Functioning (DIF) analysis was also performed to investigate potential sex-related biases in item responses. Results: Model comparison indicated that the 2PL model provided a better fit than the 1PL model. The 2PL analysis revealed that three items (leather, turpentine, and pineapple) exhibited low discrimination parameters, suggesting limited utility in distinguishing among different levels of olfactory ability. The DIF analysis found no evidence of differential item functioning between male and female participants. Discussion: These findings support the use of IRT to identify poorly performing items, enabling refinement of the SSET to enhance its precision and reliability across populations. Future research should explore item revisions and extend psychometric evaluations to other subtests and samples.
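
To illustrate why a low discrimination parameter limits an item's usefulness, the following is a minimal sketch of the standard 1PL and 2PL item response functions. It is not the authors' analysis code. The function names and all parameter values are hypothetical and chosen only for illustration, and the 1PL model is written here with discrimination fixed at 1, which is one common convention and an assumption about the parameterization.

```python
import math


def irf_2pl(theta, a, b):
    """2PL item response function.

    theta: latent ability (here, olfactory identification ability)
    a: item discrimination
    b: item difficulty
    Returns the probability of a correct response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """1PL item response function: 2PL with discrimination fixed at 1 (assumed convention)."""
    return irf_2pl(theta, 1.0, b)


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical items with the same difficulty but different discrimination.
    for label, a in [("low discrimination", 0.3), ("high discrimination", 1.5)]:
        p_low_ability = irf_2pl(-1.0, a, 0.0)
        p_high_ability = irf_2pl(1.0, a, 0.0)
        print(label, round(p_low_ability, 3), round(p_high_ability, 3))
```

For the low-discrimination item, the probability of a correct answer changes little between a low-ability and a high-ability respondent, so the item contributes little information about where a person falls on the ability scale. This is the sense in which items with low discrimination have limited utility.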