Critical assessment of protein intrinsic disorder prediction

Necci M.; Piovesan D.; Walsh I.; Vendruscolo M.; Sormanni P.; Wang C.; Sharma R.; Zhou Y.; Vranken W.; Dosztanyi Z.; Erdos G.; Meszaros B.; Sharma A.; Callebaut I.; Tamana S.; Chasapi A.; Ouzounis C.; Lambrughi M.; Maiani E.; Papaleo E.; Chemes L. B.; Alvarez L.; Palopoli N.; Benitez G. I.; Bassot C.; Elofsson A.; Salvatore M.; Hatos A.; Monzon A. M.; Bevilacqua M.; Micetic I.; Minervini G.; Paladin L.; Quaglia F.; Leonardi E.; Davey N.; Tantos A.; Davidovic R.; Veljkovic N.; Hajdu-Soltesz B.; Pajkos M.; Szaniszlo T.; Tosatto S. C. E.
2021

Abstract

Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
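For readers unfamiliar with the metric, Fmax is conventionally the maximum F1-score obtained over all decision thresholds applied to a predictor's per-residue scores. The sketch below illustrates that general definition on made-up data. The function name `f_max`, the toy scores and labels, and the rule that a score at or above the threshold counts as a disordered prediction are assumptions for illustration; this is not the CAID evaluation code, and the assessment's exact pooling and thresholding protocol may differ.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (best F1, threshold at which it is reached).

    Assumption: a residue is predicted disordered when score >= threshold,
    and labels use 1 for disordered, 0 for ordered.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    positives = sum(labels)
    best_f1, best_thr = 0.0, float("inf")

    # Every distinct score is a candidate threshold.
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = positives - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_f1, best_thr = f1, thr

    return best_f1, best_thr


# Hypothetical per-residue scores and reference annotation (not CAID data).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0]
toy_f1, toy_thr = f_max(toy_scores, toy_labels)
```

On this toy input the best trade-off is reached at a threshold of 0.3, where three true positives and one false positive give F1 = 6/7. Scanning every distinct score is quadratic in the number of residues, which is adequate for illustration; a sorted cumulative count would be the usual choice for full datasets.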

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3390525
Citations
  • PubMed Central 33
  • Scopus 185
  • Web of Science 2
  • OpenAlex N/A