Detailed description of the experiment
CASP (Critical Assessment of Structure Prediction) is a community-wide experiment to determine and advance the state of the art in modeling protein structure from amino acid sequence. Every two years, participants are invited to submit models for a set of proteins for which the experimental structures are not yet public. In the latest CASP round, CASP14, nearly 100 groups from around the world submitted more than 67,000 models on 84 modeling targets. Independent assessors then compare the models with experiment. Assessments and results are published in a special issue of the journal PROTEINS.
CASP14 (2020) saw an enormous jump in the accuracy of single protein and domain models, such that many are competitive with experiment. That advance is largely the result of the successful application of deep learning methods, particularly AlphaFold and, since that CASP, RoseTTAFold. As a consequence, computed protein structures are becoming much more widely used in a broadening range of applications. CASP has responded to this new landscape with a revised set of modeling categories. Some old categories have been dropped (refinement, contact prediction, and aspects of model accuracy estimation) and new ones have been added (RNA structures, protein-ligand complexes, protein ensembles, and accuracy estimation for protein complexes). We are also strengthening our interactions with our partners CAPRI and CAMEO. We hope that these changes will maximize the insight that CASP15 provides, particularly in new applications of deep learning.
The core of CASP remains the same: blind testing of methods with independent assessment against experiment to establish the state of the art in modeling proteins and protein complexes. CASP15 will include the following categories.
- Single Protein and Domain Modeling
As in previous CASPs, the accuracy of single proteins and, where appropriate, single protein domains will be assessed using the established metrics. Two changes will be the elimination of the distinction between template-based and template-free modeling, and an emphasis on the fine-grained accuracy of models, such as local main chain motifs and side chains. Because of the high accuracy of the new modeling methods, we expect that assessment against high-resolution experimental structures will be most informative.
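As an illustration of what an established metric looks like, the sketch below computes a simplified version of GDT_TS, a score long used in CASP for single-protein models. The text above does not name the metrics that will be used, so the choice of GDT_TS is an assumption made for the example. The sketch also assumes that the model and the experimental structure are already superimposed and that only C-alpha coordinates are compared. The real GDT_TS searches over many superpositions, so this fixed-superposition version can only give a lower bound. The function name `gdt_ts_fixed_superposition` is hypothetical.

```python
import math


def gdt_ts_fixed_superposition(model_ca, reference_ca,
                               cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) for two pre-superimposed C-alpha traces.

    Assumption: both inputs are equal-length lists of (x, y, z) tuples in
    Angstroms, residue i of the model corresponds to residue i of the
    reference, and the superposition has been done elsewhere.
    """
    if len(model_ca) != len(reference_ca) or not reference_ca:
        raise ValueError("need two non-empty traces of equal length")

    # Per-residue deviation between model and experiment.
    deviations = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in deviations) / len(deviations)
        for cutoff in cutoffs
    ]

    # GDT_TS is the mean of those fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0),
                 (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    # Toy model: residues displaced by 0.5, 1.5, 3.0 and 10.0 Angstroms.
    model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0),
             (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
    print(gdt_ts_fixed_superposition(model, reference))
```

A score of this kind is dominated by the overall fold, which is one reason the emphasis on local main chain motifs and side chains calls for finer-grained measures as well.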
- Assembly
As in recent CASPs, the ability of current methods to correctly model domain-domain, subunit-subunit, and protein-protein interactions will be assessed. We will again work in close collaboration with our CAPRI partners. Because of the promising deep learning results reported so far, substantial progress is expected.
- Accuracy Estimation
Members of the community will be invited to submit accuracy estimates for multimeric complexes and inter-subunit interfaces. There will no longer be a category for estimating the accuracy of single protein models, since it has become clear that these cannot compete with modeling-method-specific estimates. Instead, there will be increased emphasis on assessment of self-reported accuracy estimates at the atomic level. Note that the units will now be pLDDT, not Angstroms.
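For readers less familiar with pLDDT: it is a predicted value of lDDT, a superposition-free score on a 0 to 100 scale, rather than an expected positional error in Angstroms. The sketch below is a minimal C-alpha-only lDDT, meant only to show what quantity a self-reported pLDDT is trying to predict. It is a simplification and not the assessors' procedure: the standard lDDT can use all atoms and applies stereochemical checks that are omitted here. The function name `ca_lddt` and the toy coordinates are hypothetical, and the 15 Angstrom inclusion radius and the 0.5, 1, 2 and 4 Angstrom tolerances are the commonly used defaults, assumed here.

```python
import math


def ca_lddt(model_ca, reference_ca, inclusion_radius=15.0,
            thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue C-alpha lDDT on a 0-100 scale (simplified sketch).

    Assumption: inputs are equal-length lists of (x, y, z) tuples in
    Angstroms with residue i of the model matching residue i of the
    reference. No superposition is needed; only distances are compared.
    """
    n = len(reference_ca)
    if len(model_ca) != n:
        raise ValueError("model and reference must have the same length")

    scores = []
    for i in range(n):
        preserved = 0
        considered = 0
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(reference_ca[i], reference_ca[j])
            if d_ref >= inclusion_radius:
                continue  # only local contacts in the reference count
            d_model = math.dist(model_ca[i], model_ca[j])
            difference = abs(d_ref - d_model)
            # One check per tolerance threshold for this contact.
            considered += len(thresholds)
            preserved += sum(difference < t for t in thresholds)
        # Residues with no reference contacts get no score.
        scores.append(100.0 * preserved / considered if considered else None)
    return scores


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    # Toy model: the third residue is displaced by about 3 Angstroms.
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (10.6, 0.0, 0.0)]
    print(ca_lddt(model, reference))
```

Because the score depends only on preserved local distances, a residue's value reflects its own environment, which is what makes per-residue and per-atom self-estimates on this scale directly comparable across models.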
- RNA structures and complexes
There will be a pilot experiment to assess modeling accuracy for RNA structures and protein-RNA complexes. The assessment will be done in collaboration with RNA-Puzzles and Marta Szachniuk's group in Poznan.
- Protein-ligand complexes
Subject to the availability of adequate resources, there will also be a pilot experiment in this area. Deep learning is already having an impact here, and there is high interest because of the relevance to drug design.
- Data Assisted
As in recent CASPs, there will be assessment of the extent to which the accuracy of models can be increased by the provision of sparse data, particularly that provided by SAXS and mass spectroscopy/chemical crosslinking. Only targets where these low-resolution data are likely to be useful will be considered, that is, large single proteins and complexes. As previously, we will work with collaborators to obtain the necessary experimental data. Targets will initially be released without the experimental data, followed by a second round of prediction including those data.
- Protein conformational ensembles
Following the success of deep-learning methods for single structures, it is increasingly important to assess methods for predicting structure ensembles. This is a huge area, ranging from the many conformations of disordered regions to the small number of conformations that may be involved in allosteric transitions and enzyme excited states to local protein dynamics.
While it is clear that deep learning and other methods have the potential to generate ensembles in some circumstances, the difficulty is in finding cases where there are sufficiently accurate and extensive experimental data to allow rigorous assessment. One promising avenue is modeling sets of conformations in regions of cryo-EM structures where there is evidence of local conformational heterogeneity. If suitable cases arise, we will present these as a special type of sub-target, first requesting conformational ensembles that will be evaluated against the electron density map and then, in a possible second stage, providing the map for data-assisted ensemble prediction.
A second possibility is for cases where detailed NMR data have already established the structure of two or more conformations. We have a good lead for a few targets of this type. In addition to this, we are considering a non-blind experiment (a departure from normal CASP practice), where we will first ask those interested to reproduce the known conformations. We will also ask participants to identify any additional conformations that appear to be present. It may then be possible to test these against existing or new experimental data.
The following categories from CASP14 will not be included in CASP15:
- Contact and Distance prediction
- Domain-level estimates of model accuracy
Participation is open to all.
- April 4, 2022 - Start of the registration for CASP15 prediction experiment.
- April 18, 2022 - Start of the testing of server connectivity ("dry run" for server predictors).
- May 2, 2022 - Release of the first CASP15 modeling targets.
- June/July 2022 - Early bird registration for the December CASP15 conference.
- July 31, 2022 - Last date for releasing regular targets.
- August 31, 2022 - End of the modeling season.
- September 2022 - Collection of abstracts describing the methods used in CASP15.
- September-October 2022 - Evaluation of predictions.
- November 2022 - Invitations to groups with the most accurate models and the most interesting methods to give talks at the CASP15 conference.
- November 2022 - Program of the conference finalized.
- December 10-13, 2022 - CASP15 Conference. (hopefully in person!)
CASP15 registration will open on April 4, 2022.
CASP15 modeling targets will be announced through the Target List page from
the main CASP15 webpage.
The success of CASP is completely dependent on the generous help of the experimental community in providing targets. As in previous CASPs, protein crystallographers, NMR spectroscopists and cryo-EM scientists are asked to provide details of structures they expect to have made public before September 15, 2022. All types of protein structure may be good modeling targets, but membrane proteins and protein complexes are particularly needed. The last day for suggesting proteins as CASP targets is July 31, 2022.
A target submission form is available here.
Models can be submitted through the Prediction Submission form available from
this web site or by the email provided in the
CASP15 format page. Please comply with the instructions on
submission procedures and format provided there.
Server predictions will be made publicly available shortly after the closing of the prediction
window for a specific target.
Details on the target collection and release procedures are available at our
As is the practice in CASP, assessment of the results will be made by the independent assessor teams. Assessment criteria will be based on those previously developed in CASP, but assessors may add new metrics they consider appropriate. Where possible, results will also be evaluated using criteria from the previous CASP, so the effects of any changes in criteria can be appreciated.
The CASP15 Assessors are as follows:
- Single protein and domain - Daniel Rigden (University of Liverpool, UK)
- Assembly - Ezgi Karaca (Izmir Biomedicine and Genome Center, Turkey)
- Model accuracy estimation - Gabriel Studer (Biozentrum, Basel, Switzerland)
for the list of assessors in all CASPs held so far.
In accordance with CASP policy, assessors cannot take part in the relevant parts of the experiment as predictors. Participants must not contact assessors directly with queries, but rather these should be sent to the
All CASP predictions and results of numerical evaluation will be made available through
this web site shortly before the meeting.
The proceedings will be published in a scientific journal
(see publications of previous experiments).
All participants will also be required to describe their methods
in the abstracts (published locally at our web site) and encouraged to
discuss them on the
The conference to discuss results of the CASP15 experiment is planned to be held in Europe (hopefully in person!) on December 10-13, 2022.
John Moult, CASP chair and founder; IBBR, University of Maryland, USA
Krzysztof Fidelis, founder, University of California, Davis, USA
Andriy Kryshtafovych, University of California, Davis, USA
Torsten Schwede, University of Basel, Switzerland
Maya Topf, Centre for Structural Systems Biology, Hamburg, Germany
David Baker, University of Washington
Michael Feig, Michigan State University
Nick Grishin, University of Texas
Andrzej Joachimiak, Argonne National Lab
David Jones, University College, London
Chaok Seok, Seoul National University
Michael Sternberg, Imperial College, London