Studio computazionale su ensembl di proteine come potenziali nuovi bersagli farmacologici / Ghafouri, H. - (2026 Jun 09).
Studio computazionale su ensembl di proteine come potenziali nuovi bersagli farmacologici (Computational study of protein ensembles as potential new drug targets)
GHAFOURI, HAMIDREZA
2026
Abstract
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) play essential roles in cellular regulation, signaling, and molecular recognition. Unlike globular proteins, their biological function arises from dynamic conformational ensembles rather than from a single well-defined three-dimensional structure. Accurately representing and interpreting these ensembles is therefore a central challenge in modern structural biology. This thesis addresses this challenge by contributing to the infrastructure, methodology, and computational analysis required for ensemble-based protein structure representation, with a particular focus on intrinsically disordered systems.

The first part of the thesis focuses on infrastructure for conformational ensembles of IDPs. A major contribution is the development and extension of the Protein Ensemble Database (PED), a database for conformational ensembles of flexible and disordered proteins. Through multiple updates, PED has evolved into a resource that supports experimentally derived ensembles, curated metadata, and automated deposition and validation workflows. This work also includes the integration of high-performance computing resources via DRMAAtic to automate validation and large-scale deposition pipelines, as well as the implementation of batch deposition strategies that enabled the inclusion of large predicted ensemble datasets.

The second part of the thesis addresses the lack of standardization in ensemble determination by proposing a unified conceptual framework. This framework decomposes ensemble determination into three core components: experimental methods, computational ensemble generation, and validation and comparison. Building on this framework, the thesis also contributes a comprehensive review of state-of-the-art experimental, computational, and integrative approaches for determining conformational ensembles of IDPs. Particular emphasis is placed on uncertainty quantification, avoidance of overfitting, and the need for systematic benchmarking. These considerations culminate in the proposal of the IDP Ensemble Benchmarking Challenge (IDP-Bench), a community-driven initiative designed to enable fair, transparent, and reproducible assessment of ensemble generation methods.

The final part of the thesis introduces a computational tool for the analysis and comparison of conformational ensembles. By developing and applying ensemble-level descriptors and comparison metrics, this work demonstrates how ensemble properties can be quantified and compared. This tool provides a bridge between raw ensemble representations and biological interpretation, enabling systematic assessment of structural heterogeneity, robustness, and physical plausibility.

Overall, this thesis advances ensemble-based modeling of IDPs by integrating infrastructure development, methodological standardization, and computational analysis. By enabling more reliable characterization of conformational ensembles and their dynamic properties, this work also provides a foundation for investigating IDPs as potential drug targets, where understanding ensemble modulation, transient binding sites, and context-dependent interactions is essential for rational therapeutic design.

| File | Description | Type | Access | Size | Format |
|---|---|---|---|---|---|
| thesis_Hamidreza_Ghafouri.pdf | Thesis Hamidreza Ghafouri | Doctoral thesis | Open access | 8.84 MB | Adobe PDF |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.




