In many observational settings where different raters evaluate a set of items, it is common to assume that all raters draw their scores from the same underlying distribution. However, many studies have shown the relevance of individual variability in different types of rating tasks. To address this issue, the intra-class correlation coefficient (ICC) has been used as a measure of variability among raters within the Hierarchical Linear Models approach. A common distributional assumption in this setting is that the hierarchical effects are independent and identically distributed draws from a normal distribution with mean fixed to zero and unknown variance. The present work aims to overcome this strong assumption in the estimation of inter-rater agreement by placing a Dirichlet Process Mixture over the prior distribution of the hierarchical effects. A new nonparametric index, lambda, is proposed to quantify rater polarization in the presence of group heterogeneity. The model is applied to a set of simulated experiments and to real-world data. Possible future directions are discussed.

The statistical framework introduced in the previous chapter is then generalized in three respects. The first is the specification of cross-classified observations, i.e., two independent sources of redundancy are modelled. This is the case in which the same set of items is evaluated independently by different raters. The second is heteroscedasticity among raters: the assumption that the residuals are independent and identically normally distributed across all observations can be relaxed. This allows us to capture systematic differences in rating behavior among the raters. Some raters might be more consistent than others, which implies a smaller residual variance across their ratings; conversely, less consistent raters have a larger residual variance across their ratings.
The third generalization feature concerns the rating scale: the previous framework is extended to the ordinal data case. This yields a flexible model in which both ordinal and continuous rating data can be analysed under the same framework. Under this general framework, an approximate intra-class correlation coefficient (ICC_a) is proposed.

In some cases, when the objects of the evaluation are people, a "bidirectional" rating scheme is possible. Under this scheme people rate each other: a person evaluates other people and is, in turn, evaluated by them. Each person therefore has a twofold role, as a rater and as an object of rating. This is a valuable solution among peers, for instance in educational contexts in which each student is evaluated by the other students, so that students can be regarded both as examinees and as graders (i.e., raters). In this regard, the last part of the thesis proposes a model for peer grading, a system in education where each student's work is assessed by several other students. This system is widely used in massive open online courses (MOOCs) as well as in classroom settings. While peer grading substantially reduces teachers' burden in grading coursework and may also facilitate students' learning, there are concerns about the reliability of the measurement, because grading behaviours are heterogeneous among students. To address these concerns, we introduce a general statistical framework for peer grading data. The naive average score may be inaccurate due to the biases and variances of the individual grades; the proposed framework therefore provides an optimal scoring rule. Additionally, this framework provides a way to assess the performance of each student as a grader, which may be used to identify a pool of reliable graders or to generate feedback that helps students improve their grading.
Our model can also provide insights.
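The contrast drawn above between the naive average and a scoring rule that accounts for grader biases and variances can be illustrated with a small simulation. The sketch below is not the model proposed in the thesis: it is a textbook inverse-variance weighting that assumes each grader's bias and residual standard deviation are known exactly, whereas in practice they would have to be estimated. All function names, sample sizes, and distributional choices are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_peer_grades(n_students=200, k=4, seed=0):
    """Simulate peer grades with grader-specific bias and residual SD (assumed setup)."""
    rng = np.random.default_rng(seed)
    true_score = rng.normal(0.0, 1.0, n_students)  # latent quality of each work
    bias = rng.normal(0.0, 0.5, n_students)        # systematic severity/leniency per grader
    sd = rng.uniform(0.2, 1.5, n_students)         # heteroscedastic residual SD per grader

    # Each work is graded by k distinct students other than its author.
    graders = np.empty((n_students, k), dtype=int)
    for i in range(n_students):
        others = np.delete(np.arange(n_students), i)
        graders[i] = rng.choice(others, size=k, replace=False)

    noise = rng.normal(0.0, 1.0, graders.shape)
    grades = true_score[:, None] + bias[graders] + sd[graders] * noise
    return true_score, bias, sd, graders, grades


def naive_score(grades):
    """Plain average of the grades received by each work."""
    return grades.mean(axis=1)


def precision_weighted_score(grades, graders, bias, sd):
    """Bias-corrected, inverse-variance weighted average (oracle bias and SD)."""
    weights = 1.0 / sd[graders] ** 2
    corrected = grades - bias[graders]
    return (weights * corrected).sum(axis=1) / weights.sum(axis=1)


def rmse(estimate, truth):
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


true_score, bias, sd, graders, grades = simulate_peer_grades()
rmse_naive = rmse(naive_score(grades), true_score)
rmse_weighted = rmse(precision_weighted_score(grades, graders, bias, sd), true_score)
print(f"RMSE naive average:      {rmse_naive:.3f}")
print(f"RMSE precision-weighted: {rmse_weighted:.3f}")
```

Under these assumptions the weighted score recovers the latent scores more accurately than the plain average, because inconsistent graders are down-weighted and systematic severity or leniency is subtracted before averaging.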
Modelli flessibili di rating: approcci parametrici e nonparametrici / Mignemi, Giuseppe. - (2024 Mar 15).
File | Size | Format |
---|---|---|
PhD Thesis Mignemi.pdf (open access; description: PhD thesis; type: doctoral thesis) | 1.24 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.