Analysis of Marie Skłodowska-Curie Actions (MSCA) evaluations and models for predicting the success of proposals
Ilaria Rodella; Andrea Sciandra; Arjuna Tuzzi
2024
Abstract
This study analyses Evaluation Summary Reports (ESRs) of Marie Skłodowska-Curie Actions (MSCA) Individual and Postdoctoral Fellowship proposals at the University of Padua (Unipd), spanning Horizon 2020 and Horizon Europe from 2015 to 2022. The aim is to identify recurring strengths and weaknesses highlighted in the evaluation process and to recognise the most important and recurrent features of successful proposals. Nearly 400 ESRs were analysed using keyword extraction and correspondence analysis (CA) to map relationships between words and variables such as project success. Since CA did not clearly distinguish between successful and unsuccessful proposals, machine learning was applied: the CA coordinates were used as features to predict project outcomes. Comparisons were made with models using only textual features and with models employing transformers, specifically BERT contextualised embeddings. Results showed that using a Large Language Model (LLM) for text representation improved prediction accuracy compared to the other methods, but also highlighted challenges in interpretability and the need for explainable methods when individual words are no longer available as features. Overall, the study provides valuable insights for refining support services and training at Unipd, highlighting the effectiveness of LLMs in prediction while acknowledging the interpretive challenges associated with their use.
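The abstract describes a two-step pipeline: correspondence analysis of a document-term matrix, followed by a classifier trained on the CA coordinates. The following is a minimal sketch of that idea, not the authors' code; the toy texts, labels, and parameter choices are illustrative assumptions.

```python
# Illustrative sketch (assumed setup, not the authors' implementation):
# correspondence analysis (CA) of a document-term matrix, with the CA row
# coordinates used as features to predict proposal success.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def ca_row_coordinates(N, n_components=2):
    """Classical CA of a contingency table N (documents x terms):
    SVD of the standardized residuals, returning principal row coordinates."""
    P = N / N.sum()                                      # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)                  # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # principal row coordinates: F = D_r^{-1/2} U Sigma
    return (U[:, :n_components] * sigma[:n_components]) / np.sqrt(r)[:, None]


# Toy stand-ins for ESR texts and funded/not-funded outcomes (assumptions).
texts = [
    "excellent methodology clear objectives strong impact",
    "outstanding candidate excellent training strong dissemination",
    "clear work plan excellent supervision strong host",
    "weak methodology unclear objectives limited impact",
    "insufficient detail weak training limited dissemination",
    "unclear work plan weak feasibility limited resources",
]
y = np.array([1, 1, 1, 0, 0, 0])

N = CountVectorizer().fit_transform(texts).toarray().astype(float)
X_ca = ca_row_coordinates(N, n_components=2)

clf = LogisticRegression(max_iter=1000)
print("CA-coordinate model accuracy:", cross_val_score(clf, X_ca, y, cv=3).mean())

# For the transformer comparison, one common (assumed) route to contextualised
# embeddings is the sentence-transformers library, e.g.:
#   from sentence_transformers import SentenceTransformer
#   X_bert = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
#   cross_val_score(clf, X_bert, y, cv=3)
```

Unlike the CA coordinates, whose axes can be read back in terms of the words that drive them, the embedding features have no word-level interpretation, which is the interpretability trade-off the abstract points to.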
| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| JADT2024-Actes-vol2 315-324.pdf | Proceedings | Published (publisher's version) | Private access - not public (file not available for download) | 744.25 kB | Adobe PDF |