Results Recalculated in 2019
Retrospective view

Detailed description of the experiment


Goals

The main goal of CASP is to obtain an in-depth and objective assessment of our current abilities and inabilities in the area of protein structure prediction. To this end, participants will predict as much as possible about a set of soon to be known structures. These will be true predictions, not ‘post-dictions’ made on already known structures.

CASP10 will particularly address the following questions:

  1. Are the models produced similar to the corresponding experimental structure?
  2. Is the mapping of the target sequence onto the proposed structure (i.e. the alignment) correct?
  3. Have similar structures that a model can be based on been identified?
  4. Are comparative models more accurate than can be obtained by simply copying the best template?
  5. Has there been progress from the earlier CASPs?
  6. What methods are most effective?
  7. Where can future effort be most productively focused?

Scope

Tertiary structure predictions (TS):
  • The Template Based Modeling (TBM) category will include domains where a suitable template can be identified that covers all or nearly all of the target.
  • The Template free modeling (FM) category will include models of proteins for which no suitable template can be identified.
  • The Refinement category will include selected targets from among those released in the main modeling experiment to analyze success in refining models beyond the quality obtained by simply copying from a single template. For suitable CASP10 targets, we will select one of the best models received during the prediction season, and reissue it as a starting structure for refinement.
  • The contact-assisted structure modeling category will show how the knowledge of a few (usually 3 to 5) long-range contacts influences the ability of predictors to model the complete structure. This experiment will be carried out only for the more challenging CASP10 targets where we can get coordinates in advance and have at least two weeks for re-prediction.
  • The chemical shifts guided modeling of NMR structures will be performed on the selected CASP10 targets, for which we can get a chemical shifts table from the NMR-spectroscopists at least two weeks ahead of public release of the target.
  • The structure modeling based on molecular replacement with ab initio models and crystallographic diffraction data will be carried out for selected targets provided we get the structure factors from the crystallographers.

Other prediction categories: 

  • Detecting residue-residue contacts in proteins (RR).
  • Identifying disordered regions in target proteins (DR). 
  • Function prediction (prediction of binding sites) (FN). 
  • Quality assessment of models in general (without knowing native structures) and the reliability of predicting certain residues in particular (QA).

CASP Related activities

There will be additional activities included in or related to CASP10, which extend its scope.

Rolling CASP will run in parallel with CASP10 in May-July. We will discuss the results of the CASP ROLL experiment at the CASP10 predictors' meeting.

FORCASP: There will be discussion of predictions and methods on our FORCASP forum.

Timetable

Registration for CASP10 will start in the last week of March 2012. Testing of server connectivity ("dry run" for server predictors) will be conducted starting April 16, 2012. The first prediction targets will be released not earlier than May 1; the last prediction targets will be released not later than July 17; prediction season will end not later than July 31. Refinement experiment will end not later than August 17. Abstracts describing the methods tested in CASP10 will be collected in September. At the same time we will open registration for the meeting. The program of the meeting will be available in November. The meeting will take place on December 9-12, and approximately one month before that groups with the most accurate predictions and interesting methods will receive invitations to give talks. 

Participation

Participation is open to all. If you already have an account with the Prediction Center, you will be able to go directly to the CASP10 registration page. Please check, though, that your basic registration information is current. If it has changed - please update it through the My Personal Data link from the main Menu. If you are new to CASP and don't have an account with us, you will have to register with the Prediction Center first, and only then - for CASP10. Separate registration forms for different types of registration are available through this website. Predictors with servers are requested to register as soon as possible as we are planning on starting a "dry run" for servers in the second decade of April.

For convenience of the predictors who currently participate in CASP ROLL and plan to participate in CASP10, we automatically enroll them in CASP10 with the CASP ROLL group id, group name and PIN number. If you are a CASP ROLL participant but do not plan to take part in CASP10 - please send us a message and we will skip your automatic CASP10 enrollment. After the automatic registration, the current CASP ROLL participants will be able to access their registration data and make updates if necessary (e.g., add information that a server already enrolled in CASP ROLL will also be sending QA or DR predictions in CASP10). Please note that you will not be allowed to change your group name for CASP10; if this is imperative for you - you will have to register a new group with the desired name in CASP10 and inform us that you don't want your CASP ROLL group enrolled in CASP10. In this case you would have to handle submissions separately for the groups registered for different experiments.

Targets

For the experiment to succeed, it is essential that we obtain the help of the experimental community. As in previous CASPs, we invite protein crystallographers and NMR spectroscopists to provide details of structures they expect to have made public before September 15, 2012. The last day for suggesting proteins as CASP targets is July 16, 2012. A target submission form is available at this website.

During the prediction season, targets are being posted daily at the Target List page and, additionally, automatically pushed to the registered prediction servers. Targets in CASP10 will be split in two prediction tracks: those for 1) all groups (long deadline) and 2) servers only (short deadline). Assignment of a target to a particular track is made by the organizers and communicated to the predictors through the Target List page. Priority for inclusion in the all groups modeling track will be given to targets containing low homology domains.

Targets are planned to be released on business days only, around 9am PDT. Tarballs for QA predictions will be released at noon, PDT. Sequence and other relevant information about the targets will be posted at the Target List web page. Requests to the participating servers will be sent shortly after the target release. We plan to release not more than 3 targets per day for servers and, usually, one target per day for regular groups. All targets are assigned two expiration dates (one - for server predictors and another - for regular groups). All predictions must be received and accepted before noon, 12pm PDT on the corresponding expiration date.

We are planning to release 50-60 targets for evaluation in the all-group track (long deadline) and as many targets as we can get in the server-only track. Server groups are expected to submit their predictions for all targets. Manual groups are expected to submit their predictions for long deadline targets (all_group) in TS and RR categories, and for all targets (all_group + server_only) in DR, FN and QA categories. Those manual groups that wish to submit TS and RR predictions for server-only targets are welcome to do so, but these predictions will not be officially evaluated in CASP. Note that DR and FN predictions from manual groups on server-only targets are due on the server expiration date.

Predictions

Predictions can be submitted through the Prediction Submission form available from this web site or by the email provided at the format page. Please comply with the instructions on CASP10 submission procedures and format. Server predictions will be made publicly available shortly after the closing of prediction window for a specific target. To enable the new testing procedure for QA methods, we will be releasing server predictions in 3 stages: (1) up to 20 selected predictions spanning the whole range of model accuracy will be released 2 days after the server TS deadline; (2) best 150 server predictions (according to the ranking from the naive consensus QA method) - 4 days after the TS deadline; (3) all server predictions - 6 days after the server TS deadline (see the QA format description at the format page).

If you are currently participating in CASP ROLL, you will have to submit your predictions on the targets that were selected for both experiments only once - to CASP10. Our system will automatically copy them to CASP ROLL.

Assessment of Predictions

As in previous CASPs, independent assessors will evaluate the predictions. Assessors will be provided with the results of numerical prediction evaluations performed at the Prediction Center, and will judge the results primarily on that basis. They will be asked to focus particularly on the effectiveness of different methods. Evaluation criteria will as far as possible be similar to those used in previous CASPs, although the assessors are welcome to introduce additional measures.

There will be three assessors, focusing on the following areas of prediction:

  1. Template based modeling - Gaetano T. Montelione, Rutgers University, USA
  2. Template free modeling - BK Lee, NCI/NIH, Bethesda, USA
  3. Refinement and physics-based prediction methods - David Jones, University College London, UK
Other prediction categories (contacts, binding sites, disorder, quality assessment) will be evaluated by the selected assessors or the organizers.

In accordance with CASP policy, assessors are not directly involved in the organization of the experiment, nor can they take part in the experiment as predictors. Predictors must not contact assessors directly with queries, but rather these should be sent to the email address.

Results and Publication

All CASP predictions and results of numerical evaluation will be made available through this web site shortly before the meeting. The proceedings of the meeting will be published in a scientific journal (see publications of previous experiments). All participants will also be required to describe their methods in the abstracts (published locally at our web site) and encouraged to discuss them on the FORCASP forum. These contributions will be discussed and scored by other predictors, and this material will be taken into account in choosing some presentations at the meeting. Also, predictors presenting posters at the meeting should be prepared to give a short presentation at one of the main sessions, as some talks will be invited during the meeting based on the discussion at the poster sessions.

Meeting

The meeting to discuss results of the experiment will be held at the Hotel Serapo in Gaeta (Italy) on December 9-12, 2012 (starting at 6pm on the 9th and ending in the afternoon of the 12th). The program of the meeting will be available in mid-November, 2012. The meeting cost is expected to be in the range of 450-550 EURO per person (including registration fee, hotel accommodation and all meals for the duration of the conference). Some financial assistance will be available for the most successful predictors and students.

Organizing Committee

       John Moult, CASP president; IBBR, University of Maryland, USA
       Krzysztof Fidelis, University of California, Davis, USA
       Andriy Kryshtafovych, University of California, Davis, USA
       Torsten Schwede, University of Basel, Switzerland
       Anna Tramontano, University of Rome, Italy

Scientific Advisory Board

       David Baker, University of Washington
       Nick Grishin, University of Texas
       David Jones, University College, London
       Justin MacCallum, University of New York, Stony Brook
       Michael Sternberg, Imperial College, London

General submission rules

  • Predictions for CASP10 may be submitted in 6 separate formats:
      TS    # 3D atomic coordinates (Tertiary Structure) prediction
      AL    # ALignment to PDB entries (obsolete format, allowed for old servers only)
      RR    # Residue-Residue separation distance prediction
      DR    # Order-Disorder Regions prediction
      FN    # Binding sites prediction
      QA    # Quality assessment
    

  • One team may make a prediction of a target by submitting up to five models in the TS or AL categories, and one model in RR, DR and FN categories. In QA category predictors may submit two different models: first model - after releasing the limited number of server TS models (prediction window will be open for two days after the tarball release), and second model - after releasing a set of tentatively best 150 server TS models (see the QA format section for the timeline example).
  • Submissions for refinement, contact-assisted, chemical-shifts-assisted and MR-guided targets should be submitted in TS format.
  • Each submission file should contain prediction for only one target.
  • Each submission file should contain only one of the allowed format categories.
  • Submission files in RR, DR, FN and QA categories should contain only one model.
  • Submission files in TS/AL categories may contain either one or several models. Most of the evaluation and assessment will focus on the model labeled '1' (model index 1, see MODEL record). Each model should begin with the MODEL record, end with the END record, and contain no target residue repetitions. You may specify only one set of required header fields (PFRMAT, TARGET, AUTHOR, METHOD) above the first MODEL record in the prediction file. A multiple-model file will be split into separate files (one model per file) and each model (up to 5) will be sent separately to the verification server.
  • Submission of a duplicate model (same target, format category, group, model index) will replace previously accepted model, provided it is received before the deadline.
  • Each submission must begin with the PFRMAT, TARGET and AUTHOR records, contain the METHOD field and at least one block starting with the MODEL and ending with the END record.
  • Each submitted model is automatically verified by the format verification server. In case of successful submission no confirmation email will be sent. A unique ACCESSION CODE is composed from the number of the target, prediction format category, prediction group number, and model index.
       Example:
    
       Accession code  T0444TS005_2  has the following components:
         T0444   target number
         TS      Tertiary Structure (PFRMAT TS)
         005     prediction group 5
         2       model index 2 
    
    
    Accepted predictions can be viewed using the Model Viewer link on the CASP10 web page.
    If the submission contains an error, the regular group leader or server contact person will be immediately notified through email. If your prediction is rejected for format inconsistency, you will have the possibility to correct problems and re-send prediction(s) within the target prediction time window.
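
As an illustration of how an accession code breaks down into its parts, here is a minimal Python sketch. The regular expression is an assumption inferred from the single example above (a "T" plus four digits, a two-letter format code, a three-digit group number, an underscore and the model index); the names Accession and parse_accession are hypothetical and are not part of any Prediction Center tool.

```python
import re
from typing import NamedTuple


class Accession(NamedTuple):
    target: str    # e.g. "T0444"
    category: str  # prediction format category, e.g. "TS"
    group: int     # prediction group number
    model: int     # model index


# Assumed layout, inferred from the one example in the text.
_ACCESSION_RE = re.compile(r"^(T\d{4})(TS|AL|RR|DR|FN|QA)(\d{3})_(\d+)$")


def parse_accession(code: str) -> Accession:
    """Split an accession code into target, category, group and model index."""
    match = _ACCESSION_RE.match(code)
    if match is None:
        raise ValueError(f"not a recognised accession code: {code!r}")
    target, category, group, model = match.groups()
    return Accession(target, category, int(group), int(model))
```

For example, parse_accession("T0444TS005_2") yields target "T0444", category "TS", group 5 and model index 2.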

Submission rules for regular prediction groups (3+ week deadline in non-QA prediction categories, 2 day deadline for QA)

  • Predictions can be submitted by a group leader or a group member with submission privileges. The group leader can set the privileges (regular member or submitter) for every member of his group using the 'Review member status' option from 'My CASP10 profile' link. Members of prediction groups who intend to submit predictions should receive submission permission from the group leader first and then use the 12-digit Registration Code of the group to submit predictions for that group.
  • Models for regular deadline groups should be submitted directly by e-mail to models AT predictioncenter.org or using the CASP10 model submission facility.
  • When sending predictions by email, please remember to send them only from the email address registered with the Prediction Center (make sure we have your current email address on file by checking the "My Personal Data" link from the menu). If you temporarily cannot use the registered email address for submission, please use the submission form instead.
  • Time for returning regular group predictions is set separately for each target through the Target List form. Usually regular deadline predictors have around 3 weeks from the date of target release to return a prediction. For the most difficult targets this period is usually slightly prolonged.
  • Deadlines for predictions in QA category are the same for regular and server groups (2 days).

Submission rules for server groups (3 day deadline in non-QA prediction categories, 2 day deadline for QA)

  • CASP10 queries will be sent to the registered servers from the CASP distribution server casp-meta AT predictioncenter.org. Email servers are advised to reply to this address immediately upon receiving the query with an acceptance email with subject: "T0999 - query received by MY_SERVER". This will help us to track whether your server received a request from us so that we can timely address any connectivity issues. Please do not send your predictions to this address as they will be ignored.
  • We will be sending 3 variables to your server's submission URL (or email): the SEQUENCE, the TARGET-NAME and the REPLY-E-MAIL (where to return the results). For the quality assessment servers we will be sending the TARBALL-LOCATION variable instead of (or in addition to, if you specify so) the SEQUENCE. Names for these server-specific parameters will be taken from your server registration form.
  • Server models should be returned automatically to the address specified in the REPLY-E-MAIL field of the query. Please note that the return address should be always taken from our query and not hard-coded as we may change it during the season.
  • TS, RR, DR and FN servers are requested to return predictions within 72 hours of the target release time. No additional time for corrections will be allotted, but corrections will be accepted within the original 72 hour window. Please send your corrections manually to the address specified in the REPLY-E-MAIL field of the original query. Remember that corrections can be submitted only by a group leader or a group member with submission privileges. The group leader can set the privileges (regular member or submitter) for every member of his group using the 'Review member status' option from 'My CASP10 profile' link. Members of prediction groups who intend to submit predictions should receive submission permission from the group leader first.
  • Server models must be submitted in the body of the email as plain text. Predictions in attachments to the emails will be rejected. The subject of the email should preferably contain the target number and the group name.
  • Each submission may contain several models. If server returns more than 5 models, the models numbered 6 and higher will be ignored (or 2 and higher for RR, DR and FN categories). In QA category either model 1 or model 2 will be accepted depending on the stage of the QA request (see the General Rules above or description of the MODEL record below).
  • The submission engine will resend the query if it encounters obvious connection problems (network timeouts, 'no response' etc.). Failures that go beyond that require special attention, but we'll make every effort to notify server curators ASAP if we suspect something is not working. The facility that allows checking accepted predictions from servers is available from our website.


Format description

All submissions contain records described below. Each of these records must begin with a standard keyword. In all submissions standard keywords must begin in the first column of a record. The keyword set is as follows:
PFRMAT     Format specification code:  TS , AL , RR , DR, FN, QA 
TARGET     Target identifier from the CASP10 target table
AUTHOR     XXXX-XXXX-XXXX   Registration code of the Group Leader or Server Group Name 
SCORE      Reliability of the model (optional) 
REMARK     Comment record (may appear anywhere after the first 3 required lines, optional)
METHOD     Records describing the methods used
MODEL      Beginning of the data section for the submitted model
PARENT     Specifies structure template used to generate the TS/AL model 
TER        Terminates independent segments of structure in the TS/AL model
END        End of the submitted model

Models should be submitted in Plain Text format.
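
The layout rules stated so far (PFRMAT, TARGET and AUTHOR on the first three lines, a METHOD record, and at least one MODEL ... END block) can be checked before sending a file. The following is a rough Python sketch only: check_layout is a hypothetical helper, it does not reproduce the Prediction Center's verification server, it ignores blank lines when counting, and it does not handle the "REMARK AUTHOR" alternative allowed for servers.

```python
FORMATS = {"TS", "AL", "RR", "DR", "FN", "QA"}
HEADER = ("PFRMAT", "TARGET", "AUTHOR")


def check_layout(text):
    """Return a list of layout problems; an empty list means none were found."""
    # Split every non-blank line into whitespace-separated fields.
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    problems = []
    for pos, keyword in enumerate(HEADER):
        found = lines[pos][0] if pos < len(lines) else None
        if found != keyword:
            problems.append(f"non-blank line {pos + 1} should start with {keyword}")
    if lines and lines[0][0] == "PFRMAT":
        if len(lines[0]) < 2 or lines[0][1] not in FORMATS:
            problems.append("PFRMAT code is not one of " + ", ".join(sorted(FORMATS)))
    keywords = [fields[0] for fields in lines]
    if "METHOD" not in keywords:
        problems.append("no METHOD record")
    if keywords.count("MODEL") == 0 or keywords.count("MODEL") != keywords.count("END"):
        problems.append("MODEL and END records are missing or unbalanced")
    return problems
```

A file that passes this check can still be rejected for reasons the sketch does not cover, such as repeated target residues or an invalid registration code.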


Record PFRMAT should appear on the first line of the prediction and is used for all submissions.

   PFRMAT TS
     TS  indicates that submission contains 3D atomic coordinates
         in standard PDB format

   PFRMAT RR
     RR  indicates that submission contains a residue-residue 
         separation distance prediction

   PFRMAT AL
     AL  indicates that submission contains unambiguous alignments
         to PDB entries

   PFRMAT DR
     DR  indicates that submission contains an order-disorder regions
         prediction

   PFRMAT FN
     FN  indicates that submission contains a binding site prediction

   PFRMAT QA
     QA  indicates a model quality assessment prediction


Record TARGET should appear on the second line of the prediction and is used for all submissions.

   TARGET Txxxx
     Txxxx indicates id of the target predicted.


Record AUTHOR should appear on the third line of the prediction and is used for all submissions.

 For all groups:
   AUTHOR XXXX-XXXX-XXXX
          XXXX-XXXX-XXXX indicates the Group Registration code.
          This is the code obtained by the group leader upon registration.

          Note: Members of prediction groups who intend to submit predictions
          should receive submission permissions from the group leader and
          use the registration code of the Group for all predictions submitted by
          that group. If sending predictions by email, please send them from the
          registered emails of the group leader or group submitter.
          If you temporarily cannot use these emails for submission, please log in
          to our website and then use our web-based submission facility.

 Servers alternatively can be identified using their registered group names: 
   AUTHOR MY_SERVER_NAME     
      or 
   REMARK AUTHOR MY_SERVER_NAME
          where MY_SERVER_NAME is a name selected for the server group at registration
 


SCORE Optional. This record may be used to report a model reliability score. It will not influence the evaluation.


REMARK Optional. PDB style 'REMARK' records may be used anywhere in the submission. These records may contain any text and will in general not influence evaluation.


Records METHOD are used for all submissions.
These records describe the methods used. Predictors are urged to provide as full a description of the methods as possible, including references, data libraries used, and values of default and non-default parameters. These descriptions will be made available via the Prediction Center WEB pages as well as printed along with the other materials distributed at the meeting. Length of 100 - 500 words is suggested.


Record MODEL is used for all submissions.
Signifies the beginning of model data.

   MODEL  n  
     n          Model index n is used to indicate predictor's ranking
                according to her/his belief which TS/AL model is closest to the 
                target structure (1 <= n <= 5). Model index is included
                automatically in the ACCESSION CODE. All models with index
                higher than 5 will be discarded. 
NEW IN CASP10. Model index should be set to 1 in RR, DR and FN categories. In QA category, predictors are requested to use model index '1' for the predictions submitted at the first QA stage (i.e., for the quality estimates made on the selected set of server models released 5 days after the target release for tertiary structure prediction), and use model index '2' for the predictions submitted on a larger set of TS models at the second QA stage (i.e., for the quality estimates made on the models released 2 days after the release of the first set of models for QA prediction).


Record PARENT is required only for the submissions in the TS and AL format.
PARENT record indicates structure templates used to generate any independent segment of MODEL (see description of the TS format below). The PARENT record should be placed as the first record of any such independent segment. Only one PARENT record per structure segment is allowed. For multimeric predictions only one PARENT record per whole structure is allowed.

   PARENT N/A
     Indicates that a prediction is not directly based on any known
     structure. Note that this is the only indication in the file that the
     prediction is ab initio, so is a critical piece of information.

   PARENT 1abc_A
     Indicates that the model or the independent segment of structure is
     based on a single PDB entry 1abc chain A (use _A to indicate chain A).
     All template-based predictions should be submitted with this form 
     of the PARENT record. Note that, in order to be accepted, the code 
     must correspond to a current PDB entry.

   PARENT 1cdc 2def_g [3hij_k ...]
     Indicates that the model is based on more than one structural template. 
     Up to five PDB chains may be listed here with additional detailed information 
     included in the METHOD records. Subdomains of the target structure found 
     to correspond to different known folds may be submitted as independent 
     segments of structure with reference to only one PDB chain per segment.  
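
The three forms of the PARENT record can be read with a few lines of Python. This is only a sketch: parse_parent is a hypothetical helper, the check that a PDB code has four characters reflects the PDB identifiers in use at the time, and whether a code corresponds to a current PDB entry is not verified here.

```python
def parse_parent(record):
    """Return a list of (pdb_code, chain) pairs; an empty list means 'PARENT N/A'."""
    fields = record.split()
    if not fields or fields[0] != "PARENT":
        raise ValueError("not a PARENT record")
    templates = fields[1:]
    if templates == ["N/A"]:
        return []  # prediction not based on any known structure
    if not 1 <= len(templates) <= 5:
        raise ValueError("expected between one and five templates")
    parsed = []
    for item in templates:
        # "1abc_A" -> code "1abc", chain "A"; "1cdc" -> code "1cdc", no chain
        code, _, chain = item.partition("_")
        if len(code) != 4:
            raise ValueError(f"unexpected PDB code: {item!r}")
        parsed.append((code, chain or None))
    return parsed
```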


Record TER is used to terminate an independent segment of structure (PFRMAT TS and PFRMAT AL). Every TER record should correspond to the preceding PARENT record in the model.

   TER


An unambiguous alignment to a PDB entry used for threading predictions (PFRMAT AL).
This format is deprecated and allowed for old structure prediction servers only.

Alignment for each model or an independent structure segment should begin with a single PARENT record and terminate with a TER record (see above). The (four column) alignment data records provide: target residue one letter symbol, target residue sequence number, PDB residue one letter symbol, and PDB residue sequence number with an insertion code if necessary (see Example 3):

   aa1 n1  aa2 n2

   Note:
     - residues for which no prediction is made must be skipped
     - if a chain ID is specified in the PDB template of the target, then 
       the target residue sequence number should be composed of a chain ID 
       and residue number, e.g. A2, B44

The PDB code with chain extension of the structure the alignment is based on should be placed in the PARENT record. Only one PDB code per independent structure segment is allowed. PDB codes should refer to structures containing at least the main chain atomic coordinates (see the TS format). As in the case of coordinate submissions, when multiple independent segments of structure are used in a prediction, they will be evaluated separately with no assumption of a common frame of reference between the segments. For any given MODEL, no target residue may be repeated among all such independent structure segments. Potential multi-domain nature of targets will be addressed in the evaluation even if the prediction is made in a single frame of reference (i.e. without separation into multiple segments of structure). For such predictions segmentation should only be used to allow multiple model predictions (effectively up to 5 predictions for each such domain).
Note: The facility to translate sequence - structure alignments (AL format) into standard PDB atom records (TS format) is available as an additional AL2TS service.


Residue-Residue separation prediction (PFRMAT RR).
Data in this format are inserted between MODEL and END records of the submission file.
Format for the predicted separation distance between pairs of residues. The distance is defined as the separation between C-beta atoms (C-alpha for glycine residues).

The (five column) RR format:

   i  j  d1  d2  p

   Notes (see Example 2):
     - entire target sequence should be split over multiple lines with a
       maximum of 50 residues per line
     - for intrachain residue-residue contacts residue number indices 
       i and j should be used for distance specification (i < j), i.e. 
       only one diagonal of the separation matrix should be supplied
     - the distances d1 and d2 (real numbers) should indicate the range of 
       Cb-Cb distance predicted for the residue pair (C-alpha for glycines)
     - the real number p should range from 0.0 - 1.0 to indicate
       probability of the distance falling within the predicted range
     - residue 'contacts' (defined here - as in CASP2 - as Cb-Cb<8A) can be 
       predicted with this format as:
         i  j  0  8  p
     - any pair NOT listed is assumed to be NOT considered by predictor
     - for multichain predictions residue indices should be composed of 
       chain ID and residue number, e.g. A2, B44 (see Example 4B).
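
As a sketch of how the five-column records might be produced for intrachain predictions, the Python below builds one record per residue pair and enforces the constraints listed in the notes (i < j, p between 0.0 and 1.0). The function names rr_record and contact_record and the two-decimal formatting of p are illustrative assumptions, not requirements of the format.

```python
def rr_record(i, j, d1, d2, p):
    """Format one 'i j d1 d2 p' record for an intrachain residue pair."""
    if not i < j:
        raise ValueError("only pairs with i < j should be listed")
    if d1 > d2:
        raise ValueError("d1 and d2 should give a range with d1 <= d2")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0.0 and 1.0")
    return f"{i} {j} {d1:g} {d2:g} {p:.2f}"


def contact_record(i, j, p):
    """A 'contact' as defined above: Cb-Cb distance in the range 0 to 8 Angstrom."""
    return rr_record(i, j, 0, 8, p)
```

For instance, contact_record(12, 47, 0.85) gives the record "12 47 0 8 0.85".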


Order-disorder regions prediction (PFRMAT DR).
Data in this format are inserted between MODEL and END records of the submission file.
The (three column) format record consists of residue code, Order/Disorder prediction code, and a number specifying the associated confidence level:

   aa  OD  p
The symbols for the 2 state order/disorder prediction are 'O'=order, 'D'=disorder. Last column should indicate a probability of a residue being in the disordered region. The value of this confidence level is in the range of 0.0 - 1.0 (values 0.51 and higher designate disordered state). The entire sequence of the target should always be given. If parts cannot be predicted a probability value of 0.5 should be used (see Example 5).
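
The rule above (one record per residue, 'D' for probabilities of 0.51 and higher, 0.5 where no prediction can be made) might be applied as in the following Python sketch. The helper name dr_records is hypothetical, and labelling unpredicted residues 'O' is an assumption: the text only fixes their probability at 0.5.

```python
def dr_records(sequence, disorder_prob):
    """Build 'aa OD p' records; use None for residues that cannot be predicted."""
    if len(sequence) != len(disorder_prob):
        raise ValueError("one probability is needed for every residue")
    records = []
    for aa, p in zip(sequence, disorder_prob):
        # Unpredicted residues get 0.5; round so the code and printed value agree.
        p = 0.5 if p is None else round(p, 2)
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0.0 and 1.0")
        state = "D" if p >= 0.51 else "O"
        records.append(f"{aa} {state} {p:.2f}")
    return records
```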


Protein binding sites prediction (PFRMAT FN).
Data are inserted between MODEL and END records of the submission file (see Example 7 at the bottom of the page).
The first line of the prediction should start with the keyword

Binding site:

and any additional lines should start with the keyword

Comment:

Note that angle brackets in the format description below designate optional/additional data and should not be included in the prediction; a semicolon separates several entries on one row; a comma separates entries within the same logical block (e.g. numbers of residues within the same binding site):

Binding site: res1, res2, ...
   or
Binding site: res1 - res2, <res3 - res4>, ...
** Residues considered as binding sites are those in direct contact with heteroatoms bound in the structure of the target protein. For the purposes of binding site prediction, predictors should aim to predict residues that have any heavy atom in contact with the ligand at a distance of 0.5A plus the van der Waals radii. For example, under this definition the vast majority of single magnesium atoms are in contact with 2-4 residues per chain and ATP is usually bound by 11-18 residues per chain.
Over-prediction of binding residues will not be advantageous.

Comment: free text
** The predictors are encouraged to use this section to include the description of their predictions. Although this section will not be evaluated it might be useful for the assessor.


Quality assessment prediction (PFRMAT QA).

NEW IN CASP10. In the QA category, predictors are requested to use model index '1' for predictions submitted in the first stage (i.e., estimating the quality of the selected server models released 5 days after the initial target release), and model index '2' for predictions submitted on the second, larger set of TS models (i.e., estimating the quality of models released 7 days after the initial target release).

Timeline example.
May 1, 9am PDT - target T0644 is released for prediction in non-QA categories.
May 4, noon - the deadline for submitting tertiary structure predictions by servers.
May 6, noon - the first set of server TS predictions (up to 20 models selected primarily to test single-model methods) is sent to the registered QA servers and posted on the casp10 archive page. QA predictions (marked as MODEL 1) for this subset are accepted for two days.
May 8, noon - deadline for "stage 1" QA predictions. The second set of server TS predictions (150 models selected to test both single-model and clustering methods) is sent to the registered QA servers and posted on the casp10 archive page. QA predictions (marked as MODEL 2) for this second subset of models are accepted for two more days.
May 10, noon - deadline for "stage 2" QA predictions. All server TS predictions are posted on the casp10 archive page. No further QA predictions (from servers or manual groups) are accepted for this target.
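
The day offsets in the timeline example can be written out as simple date arithmetic. The sketch below assumes that every target follows the same offsets as the T0644 example (3, 5, 7 and 9 days after release, all at noon); the function name qa_schedule and the dictionary keys are hypothetical.

```python
from datetime import date, datetime, time, timedelta

def qa_schedule(release_day):
    """Return the noon deadlines implied by the timeline example."""
    def at_noon(offset_days):
        day = release_day + timedelta(days=offset_days)
        return datetime.combine(day, time(12, 0))
    # Assumption: offsets are taken from the single worked example above.
    return {
        "server_ts_deadline": at_noon(3),
        "stage1_models_released": at_noon(5),
        "stage1_deadline_stage2_released": at_noon(7),
        "stage2_deadline": at_noon(9),
    }
```

Calling qa_schedule with a May 1 release date (the year is arbitrary here) reproduces the May 4, May 6, May 8 and May 10 noon entries listed above.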

Data are inserted between MODEL and END records of the submission file. You may submit your quality assessment prediction in one of the two different modes:
QMODE 1 :   global model quality score (MQS - one number per model)
QMODE 2 :   MQS and error estimates on per-residue basis.

The first line of data should specify mode identifier, i.e. QMODE (see Example 8).

In both modes, the first column in each line contains model identifier (file name of the accepted 3D prediction). The second column contains reliability score for a model as a whole. The reliability score is a real number between 0.0 and 1.0 (1.0 being a perfect model). If you don't provide MQS for a model please put "X" in the corresponding place. If you don't want to additionally provide error estimates on per residue basis (QMODE 1), your data table will consist of these two columns only.

If you do additionally provide residue error estimates (QMODE 2), each consecutive column should contain the error estimate in Angstroms for the consecutive residues in the target (i.e., column 3 corresponds to residue 1 in the target, column 4 to residue 2, and so on). This way the data constitute a table (Number_of_models_for_the_target) BY (Number_of_residues_in_the_target + 1). Do not skip columns if you are not predicting error estimates for some residues; instead put "X" in the corresponding column.
Please specify in the REMARKS what you consider to be an error estimate for a residue (CA location error, geometrical center error, etc.).
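
The column layout described above can be read with a few lines of Python. This is only a sketch: the function name parse_qa_line is hypothetical, and mapping "X" to None is a choice made for this illustration.

```python
def parse_qa_line(line):
    """Split one QA data line into (model name, MQS, per-residue errors)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"need a model name and a score: {line!r}")

    def value(text):
        # "X" marks a value that the predictor chose not to provide.
        return None if text == "X" else float(text)

    model = fields[0]
    score = value(fields[1])
    if score is not None and not 0.0 <= score <= 1.0:
        raise ValueError(f"MQS must lie between 0.0 and 1.0: {score}")
    # QMODE 1 lines stop after the score, so this list is empty for them.
    errors = [value(text) for text in fields[2:]]
    return model, score, errors
```

A QMODE 1 line such as "3D-JIGSAW_TS1 0.8" gives an empty error list, whereas a QMODE 2 line gives one entry per residue column, with None wherever "X" was submitted.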

Note 1. Please be advised that a QA record line may be very long, and some editors/mailing programs may force line wraps, potentially causing unexpected parsing errors. To avoid this problem we recommend that you split long lines into shorter sublines (50-100 columns of data) yourself. Our parser will consider consecutive sublines (starting with the line containing the evaluated model name and ending with the line containing the next model name or the tag END) to be part of the same logical line.
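
One possible way to produce and re-read such sublines is sketched below in Python. Both function names are hypothetical, and the rule used to recognise a continuation line (its first token is a number or "X") is an assumption for this illustration; the actual CASP parser may identify model names differently.

```python
def wrap_qa_record(model, values, per_line=50):
    """Split one logical QA record into sublines of at most per_line values."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        # Only the first subline carries the model name.
        prefix = [model] if start == 0 else []
        lines.append(" ".join(prefix + chunk))
    return lines

def is_data_token(token):
    """True for 'X' or anything that parses as a number."""
    if token == "X":
        return True
    try:
        float(token)
    except ValueError:
        return False
    return True

def join_sublines(lines):
    """Merge continuation sublines back into one token list per model."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # Assumption: a line starting with a data token continues the
        # previous record; anything else starts a new model.
        if records and is_data_token(fields[0]):
            records[-1].extend(fields)
        else:
            records.append(fields)
    return records
```

Here values is the list of already formatted strings for one model (the MQS first, then the per-residue estimates). A model whose name is itself numeric would defeat the continuation test, which is why that rule is only an assumption.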

Note 2. Please, be advised that model quality predictions in CASP are evaluated by comparing submitted estimates of global reliability and per-residue accuracy of structural models with the values obtained from the LGA superpositions of the corresponding models with experimental structures. Therefore, perfect global model scores in QMODE1 (QA1) should ideally correspond to the GDT_TS scores, and predicted per-residue distances in QMODE2 should ideally reproduce those extracted from the optimal model-target superpositions.
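
For orientation, GDT_TS is commonly defined as the average of the fractions of residues whose CA atoms lie within 1, 2, 4 and 8 Angstroms of their positions in the experimental structure. The Python sketch below computes that average from a single list of per-residue distances. It is a simplification: LGA searches over many superpositions for each cutoff, so this function (hypothetical name gdt_ts_like) only approximates the score used in the assessment. The result is on a 0.0 - 1.0 scale so that it can be compared with a global model quality score.

```python
def gdt_ts_like(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average fraction of residues within each distance cutoff (0.0 - 1.0)."""
    if not distances:
        raise ValueError("at least one per-residue distance is required")
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(fractions)
```

For distances of 0.5, 1.5, 3.0 and 9.0 Angstroms the four fractions are 0.25, 0.5, 0.75 and 0.75, giving 0.5625. Residues missing from a model would have to be counted as falling outside every cutoff, which this sketch leaves to the caller.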




END record is used for all predictions and indicates the end of a single model submission.


Example 1. Atomic coordinates (Tertiary Structure)

The primary CASP10 format used for tertiary structure prediction

(A) An example of prediction.

PFRMAT TS
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1 
PARENT 1abc 1def_A
ATOM      1  N   GLU     1      10.982  -9.774   1.377  1.00  0.50
ATOM      2  CA  GLU     1       9.623  -9.833   1.984  1.00  0.50
ATOM      3  C   GLU     1       8.913 -11.104   1.521  1.00  0.50
ATOM      4  O   GLU     1       9.187 -11.630   0.461  1.00  0.50
ATOM      5  CB  GLU     1       8.814  -8.614   1.546  1.00  0.50
ATOM      6  CG  GLU     1       7.372  -8.754   2.039  1.00  0.50
ATOM      7  CD  GLU     1       7.339  -8.625   3.562  1.00  0.50
ATOM      8  OE1 GLU     1       8.370  -8.307   4.131  1.00  0.50
ATOM      9  OE2 GLU     1       6.284  -8.846   4.132  1.00  0.50
ATOM     10  N   THR     2       7.998 -11.599   2.304  1.00  1.60
ATOM     11  CA  THR     2       7.266 -12.832   1.907  1.00  1.60
ATOM     12  C   THR     2       6.096 -12.456   1.005  1.00  1.60
ATOM     13  O   THR     2       5.008 -12.217   1.466  1.00  1.60
ATOM     14  CB  THR     2       6.731 -13.533   3.157  1.00  1.60
ATOM     15  OG1 THR     2       7.662 -13.379   4.220  1.00  1.60
ATOM     16  CG2 THR     2       6.526 -15.019   2.864  1.00  1.60
ATOM     17  N   VAL     3       6.308 -12.396  -0.278  1.00  1.70
ATOM     18  CA  VAL     3       5.190 -12.030  -1.187  1.00  1.70
ATOM     19  C   VAL     3       3.954 -12.870  -0.844  1.00  1.70
ATOM     20  O   VAL     3       2.834 -12.471  -1.090  1.00  1.70
ATOM     21  CB  VAL     3       5.608 -12.274  -2.641  1.00  1.70
ATOM     22  CG1 VAL     3       5.542 -13.771  -2.959  1.00  1.70
ATOM     23  CG2 VAL     3       4.664 -11.514  -3.573  1.00  1.70
ATOM     24  N   GLU     4       4.146 -14.029  -0.272  1.00  1.70
ATOM     25  CA  GLU     4       2.976 -14.882   0.086  1.00  1.60
ATOM     26  C   GLU     4       2.153 -14.190   1.175  1.00  1.50
ATOM     27  O   GLU     4       0.942 -14.141   1.109  1.00  1.40
ATOM     28  CB  GLU     4       3.465 -16.238   0.597  1.00  1.30
ATOM     29  CG  GLU     4       2.336 -17.264   0.479  1.00  1.20
ATOM     30  CD  GLU     4       2.929 -18.671   0.391  1.00  1.10
ATOM     31  OE1 GLU     4       4.056 -18.846   0.823  1.00  1.00
ATOM     32  OE2 GLU     4       2.246 -19.551  -0.108  1.00  0.90
TER
END
(B) A model consisting of 2 independent structure segments (could be a target modeled from two PDB domains, where relative orientation is unknown; could be 2 fragments predicted by ab initio methods - ab initio example shown). In a single MODEL no residue should appear twice among all such segments.
PFRMAT TS
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
PARENT N/A
ATOM      1  N   GLU     1      10.982  -9.774   1.377  1.00  0.50
ATOM      2  CA  GLU     1       9.623  -9.833   1.984  1.00  0.50
ATOM      3  C   GLU     1       8.913 -11.104   1.521  1.00  0.50
ATOM      4  O   GLU     1       9.187 -11.630   0.461  1.00  0.50
ATOM      5  CB  GLU     1       8.814  -8.614   1.546  1.00  0.50
ATOM      6  CG  GLU     1       7.372  -8.754   2.039  1.00  0.50
ATOM      7  CD  GLU     1       7.339  -8.625   3.562  1.00  0.50
ATOM      8  OE1 GLU     1       8.370  -8.307   4.131  1.00  0.50
ATOM      9  OE2 GLU     1       6.284  -8.846   4.132  1.00  0.50
ATOM     10  N   THR     2       7.998 -11.599   2.304  1.00  1.60
ATOM     11  CA  THR     2       7.266 -12.832   1.907  1.00  1.60
ATOM     12  C   THR     2       6.096 -12.456   1.005  1.00  1.60
ATOM     13  O   THR     2       5.008 -12.217   1.466  1.00  1.60
ATOM     14  CB  THR     2       6.731 -13.533   3.157  1.00  1.60
ATOM     15  OG1 THR     2       7.662 -13.379   4.220  1.00  1.60
ATOM     16  CG2 THR     2       6.526 -15.019   2.864  1.00  1.60
ATOM     24  N   GLU     4       4.146 -14.029  -0.272  1.00  1.70
ATOM     25  CA  GLU     4       2.976 -14.882   0.086  1.00  1.60
ATOM     26  C   GLU     4       2.153 -14.190   1.175  1.00  1.50
ATOM     27  O   GLU     4       0.942 -14.141   1.109  1.00  1.40
ATOM     28  CB  GLU     4       3.465 -16.238   0.597  1.00  1.30
ATOM     29  CG  GLU     4       2.336 -17.264   0.479  1.00  1.20
ATOM     30  CD  GLU     4       2.929 -18.671   0.391  1.00  1.10
ATOM     31  OE1 GLU     4       4.056 -18.846   0.823  1.00  1.00
ATOM     32  OE2 GLU     4       2.246 -19.551  -0.108  1.00  0.90
TER
PARENT N/A
ATOM     17  N   VAL     3       6.308 -12.396  -0.278  1.00  1.70
ATOM     18  CA  VAL     3       5.190 -12.030  -1.187  1.00  1.70
ATOM     19  C   VAL     3       3.954 -12.870  -0.844  1.00  1.70
ATOM     20  O   VAL     3       2.834 -12.471  -1.090  1.00  1.70
ATOM     21  CB  VAL     3       5.608 -12.274  -2.641  1.00  1.70
ATOM     22  CG1 VAL     3       5.542 -13.771  -2.959  1.00  1.70
ATOM     23  CG2 VAL     3       4.664 -11.514  -3.573  1.00  1.70
TER
END
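
The rule that no residue may appear twice among the segments of one MODEL can be checked before submission. The Python sketch below assumes that ATOM lines follow the standard PDB fixed-column layout (chain identifier in column 22, residue number in columns 23-26) and that segments are delimited by TER records; the function name duplicated_residues is hypothetical.

```python
def duplicated_residues(lines):
    """Return (chain, residue number) pairs found in more than one segment."""
    first_segment = {}  # (chain, resnum) -> segment index where first seen
    duplicates = set()
    segment = 0
    for line in lines:
        record = line[:6].strip()
        if record == "TER":
            # A TER record closes the current segment.
            segment += 1
        elif record == "ATOM":
            # Assumption: standard PDB columns for chain and residue number.
            key = (line[21].strip(), int(line[22:26]))
            if first_segment.setdefault(key, segment) != segment:
                duplicates.add(key)
    return sorted(duplicates)
```

For the two-segment model in (B) above, where residues 1, 2 and 4 form the first segment and residue 3 the second, the function returns an empty list.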


Example 2. Residue-Residue contact prediction

The flexibility offered by the new format allows algorithms parameterized to predict any distance range to be used. Below is an example of how to use the new residue-residue separation distance format to submit a prediction of residue contacts defined as Cb-Cb distances < 8 A.
PFRMAT RR
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
HLEGSIGILLKKHEIVFDGC       # <- entire target sequence (up to 50 
HDFGRTYIWQMSD              #    residues per line)
1  9  0  8  0.70        
1 10  0  8  0.70           # <- indices of residues: i and j (integers), 
1 12  0  8  0.60           # <- the range of Cb-Cb distance predicted
1 14  0  8  0.20           #    for the residue pair: d1 and d2 (real),
1 15  0  8  0.10           # <- probability of the distance between 
1 17  0  8  0.30           #    Cb atoms being within the specified
1 19  0  8  0.50           #    range: p (real)
2  8  0  8  0.90
3  7  0  8  0.70
3 12  0  8  0.40
3 14  0  8  0.70
3 15  0  8  0.30
4  6  0  8  0.90
7 14  0  8  0.30
9 14  0  8  0.50
END
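
As an informal aid, the body of an RR prediction such as the one above can be read with the Python sketch below. The function name parse_rr_body is hypothetical, and stripping '#' annotations reflects how the example on this page is annotated rather than a documented feature of the format.

```python
def parse_rr_body(lines):
    """Return (target sequence, list of (i, j, d1, d2, p)) from RR data lines."""
    sequence_parts = []
    contacts = []
    for raw in lines:
        # Remove the explanatory '#' notes used in the example.
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue
        if len(fields) == 1 and fields[0].isalpha():
            # A single alphabetic token is a chunk of the target sequence.
            sequence_parts.append(fields[0])
        elif len(fields) == 5:
            i, j = int(fields[0]), int(fields[1])
            d1, d2, p = (float(text) for text in fields[2:])
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range: {raw!r}")
            contacts.append((i, j, d1, d2, p))
        else:
            raise ValueError(f"unrecognised RR line: {raw!r}")
    return "".join(sequence_parts), contacts
```

The header records (PFRMAT, TARGET, and so on) and the closing END are expected to be removed by the caller before the data lines are passed in.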


Example 3. An alternative alignment format for Threading/Fold Recognition predictions

Alignments will be converted into 3D structures.

(A) Format to express unambiguous alignments to PDB entries 'mabc_A' and 'nefg'.

PFRMAT AL
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
PARENT mabc_A
M  21    V  11 
P  22    D  12  
N  23    A  12A 
F  24    F  12B 
A  25    L  13  
P  32    D  22  
N  33    A  23 
F  34    F  24 
A  35    L  25  
TER
PARENT nefg
E  75    T  73   
T  76    T  74   
V  77    A  75  
D  78    D  76  
G  79    D  77  
R  80    R  78  
TER
END
(B) Format to express unambiguous alignments to PDB entry 'mabc_D'. An example of how to use the AL format to submit a prediction of the target with a chain name of 'A'.
PFRMAT AL
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
PARENT mabc_D
M  A21    V  11 
P  A22    D  12  
N  A23    A  12A 
F  A24    F  12B 
A  A25    L  13  
P  A32    D  22  
N  A33    A  23 
F  A34    F  24 
A  A35    L  25  
TER
END
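
The four-column alignment lines shown in (A) and (B) can be split into their parts with the Python sketch below. The function name parse_al_line is hypothetical. Reading a trailing letter on the parent residue number (as in '12A') as a PDB insertion code, and a leading letter on the target residue number (as in 'A21') as a chain name, follows the examples above but is otherwise an assumption.

```python
import re

# Target residue number, optionally prefixed by a chain name (e.g. 'A21').
_TARGET = re.compile(r"([A-Za-z]?)(\d+)")
# Parent residue number, optionally followed by an insertion code ('12A').
_PARENT = re.compile(r"(-?\d+)([A-Za-z]?)")

def parse_al_line(line):
    """Return (t_aa, t_chain, t_num, p_aa, p_num, p_icode) for one AL line."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 columns: {line!r}")
    t_aa, t_pos, p_aa, p_pos = fields
    t_match = _TARGET.fullmatch(t_pos)
    p_match = _PARENT.fullmatch(p_pos)
    if t_match is None or p_match is None:
        raise ValueError(f"bad residue number in: {line!r}")
    return (t_aa, t_match.group(1), int(t_match.group(2)),
            p_aa, int(p_match.group(1)), p_match.group(2))
```

For the line "N  A23    A  12A" this gives target residue N at position 23 of chain A, aligned to parent residue A at position 12 with insertion code A.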


Example 4. Multichain predictions

(A) An example of 3D atomic coordinates model prediction.
PFRMAT TS
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1 
PARENT N/A
ATOM      1  N   GLU A   1      22.576  19.032  -5.026  1.00  0.00
ATOM      2  CA  GLU A   1      22.879  20.313  -4.321  1.00  0.00
ATOM      3  CB  GLU A   1      22.285  21.478  -5.449  1.00  0.00
ATOM      4  CG  GLU A   1      23.018  21.946  -6.707  1.00  0.00
ATOM      5  CD  GLU A   1      24.351  22.625  -6.434  1.00  0.00
ATOM      6  OE1 GLU A   1      25.379  21.908  -6.380  1.00  0.00
ATOM      7  OE2 GLU A   1      24.381  23.879  -6.291  1.00  0.00
ATOM      8  O   GLU A   1      22.237  20.962  -2.117  1.00  0.00
ATOM      9  C   GLU A   1      21.857  20.684  -3.261  1.00  0.00
ATOM     10  N   VAL A   2      20.585  20.675  -3.601  1.00  0.00
ATOM     11  CA  VAL A   2      19.530  21.006  -2.624  1.00  0.00
ATOM     12  CB  VAL A   2      18.277  21.590  -3.319  1.00  0.00
ATOM     13  CG1 VAL A   2      17.182  21.859  -2.270  1.00  0.00
ATOM     14  CG2 VAL A   2      18.656  22.833  -4.079  1.00  0.00
ATOM     15  O   VAL A   2      18.770  18.750  -2.603  1.00  0.00
ATOM     16  C   VAL A   2      19.096  19.721  -1.933  1.00  0.00
ATOM     17  N   HIS A   3      19.115  19.700  -0.603  1.00  0.00
ATOM     18  CA  HIS A   3      18.780  18.489   0.122  1.00  0.00
ATOM     19  CB  HIS A   3      19.559  18.445   1.410  1.00  0.00
ATOM     20  CG  HIS A   3      21.015  18.684   1.224  1.00  0.00
ATOM     21  CD2 HIS A   3      21.767  19.803   1.367  1.00  0.00
ATOM     22  ND1 HIS A   3      21.851  17.721   0.702  1.00  0.00
ATOM     23  CE1 HIS A   3      23.072  18.220   0.589  1.00  0.00
ATOM     24  NE2 HIS A   3      23.048  19.478   0.985  1.00  0.00
ATOM     25  O   HIS A   3      16.777  19.181   1.220  1.00  0.00
ATOM     26  C   HIS A   3      17.296  18.417   0.409  1.00  0.00
REMARK 
REMARK Predictors should NOT use TER separator between chains 
REMARK
ATOM   1321  N   GLU B   1     -22.603 -17.981  -4.847  1.00  0.00
ATOM   1322  CA  GLU B   1     -22.889 -19.285  -4.180  1.00  0.00
ATOM   1323  CB  GLU B   1     -22.342 -20.410  -5.372  1.00  0.00
ATOM   1324  CG  GLU B   1     -23.122 -20.828  -6.619  1.00  0.00
ATOM   1325  CD  GLU B   1     -24.447 -21.511  -6.324  1.00  0.00
ATOM   1326  OE1 GLU B   1     -25.468 -20.792  -6.207  1.00  0.00
ATOM   1327  OE2 GLU B   1     -24.479 -22.769  -6.227  1.00  0.00
ATOM   1328  O   GLU B   1     -22.172 -20.020  -2.026  1.00  0.00
ATOM   1329  C   GLU B   1     -21.830 -19.701  -3.172  1.00  0.00
ATOM   1330  N   VAL B   2     -20.572 -19.685  -3.557  1.00  0.00
ATOM   1331  CA  VAL B   2     -19.485 -20.056  -2.630  1.00  0.00
ATOM   1332  CB  VAL B   2     -18.260 -20.619  -3.392  1.00  0.00
ATOM   1333  CG1 VAL B   2     -17.131 -20.932  -2.393  1.00  0.00
ATOM   1334  CG2 VAL B   2     -18.674 -21.832  -4.184  1.00  0.00
ATOM   1335  O   VAL B   2     -18.711 -17.807  -2.553  1.00  0.00
ATOM   1336  C   VAL B   2     -19.020 -18.800  -1.909  1.00  0.00
ATOM   1337  N   HIS B   3     -18.990 -18.829  -0.580  1.00  0.00
ATOM   1338  CA  HIS B   3     -18.623 -17.648   0.178  1.00  0.00
ATOM   1339  CB  HIS B   3     -19.356 -17.649   1.494  1.00  0.00
ATOM   1340  CG  HIS B   3     -20.819 -17.875   1.353  1.00  0.00
ATOM   1341  CD2 HIS B   3     -21.571 -18.995   1.480  1.00  0.00
ATOM   1342  ND1 HIS B   3     -21.667 -16.890   0.896  1.00  0.00
ATOM   1343  CE1 HIS B   3     -22.894 -17.378   0.809  1.00  0.00
ATOM   1344  NE2 HIS B   3     -22.864 -18.650   1.156  1.00  0.00
ATOM   1345  O   HIS B   3     -16.586 -18.389   1.177  1.00  0.00
ATOM   1346  C   HIS B   3     -17.129 -17.592   0.414  1.00  0.00
TER
END
(B) An example of how to use the RR format to submit a prediction of interchain (chains A and B) residue-residue contacts defined as Cb-Cb distances < 8 A.
PFRMAT RR
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
HLEGSIGILLKKHEIVFDGC         # <- entire target sequence (up to 50 
HDFGRTYIWQMSD                #    residues per line)
A1 B9   0  8  0.70        
A1 B10  0  8  0.70           # <- indices of residues: Ai and Bj, 
A1 B12  0  8  0.60           # <- the range of Cb-Cb distance predicted
A1 B14  0  8  0.20           #    for the residue pair: d1 and d2 (real),
A1 B15  0  8  0.10           # <- probability of the distance between 
A1 B17  0  8  0.30           #    Cb atoms being within the specified
A1 B19  0  8  0.50           #    range: p (real)
A2 B8   0  8  0.90
A3 B7   0  8  0.70
A3 B12  0  8  0.40
A3 B14  0  8  0.70
A3 B15  0  8  0.30
A4 B6   0  8  0.90
A7 B14  0  8  0.30
A9 B14  0  8  0.50
END
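
In the interchain variant, each residue index carries a chain letter. The Python sketch below separates the chain from the number for one contact line; the function names are hypothetical, and accepting an index without a chain letter is an extra convenience assumed here so that the same helper also reads single-chain lines.

```python
import re

# Optional one-letter chain name followed by the residue number.
_INDEX = re.compile(r"([A-Za-z]?)(\d+)")

def split_chain_index(token):
    """Split 'A12' into ('A', 12); a bare '12' gives ('', 12)."""
    match = _INDEX.fullmatch(token)
    if match is None:
        raise ValueError(f"bad residue index: {token!r}")
    return match.group(1), int(match.group(2))

def parse_interchain_contact(line):
    """Return ((chain_i, i), (chain_j, j), d1, d2, p) for one contact line."""
    # Remove the explanatory '#' notes used in the example.
    fields = line.split("#", 1)[0].split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns: {line!r}")
    first = split_chain_index(fields[0])
    second = split_chain_index(fields[1])
    d1, d2, p = (float(text) for text in fields[2:])
    return first, second, d1, d2, p
```

The first data line of the example, "A1 B9   0  8  0.70", is read as residue 1 of chain A and residue 9 of chain B, with a distance range of 0 to 8 and a probability of 0.7.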


Example 5. Order-disorder regions prediction

Example of order-disorder regions prediction.

PFRMAT DR
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
METHOD Description of methods used
MODEL  1
H D 0.70           # <- residue code,
L D 0.80           # <- order/disorder assignment code,
E D 0.80           # <- the number specifying the associated
G D 0.60           #    confidence level: 0.5 - residue not predicted 
S D 0.90           #                     >0.5 - disordered region 
I O 0.50           #                     <0.5 - ordered region
G O 0.40
I O 0.40
L O 0.30
L O 0.50
K O 0.50
K O 0.30
H O 0.20
E O 0.20
I O 0.40
V O 0.45
F D 0.60
D D 0.90
G D 0.60
C D 0.80
END


Example 7. Binding site prediction

PFRMAT FN
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Predictor remarks
METHOD Description of methods used
MODEL  1
Binding site: 50-54, 76-79 ; 81, 82, 93-95
Comment: my comment
END


Example 8. Quality assessment prediction

(A) Global Model Quality Score

PFRMAT QA
TARGET T9999
AUTHOR 1234-5678-9000
METHOD Description of methods used
MODEL 1
QMODE 1
3D-JIGSAW_TS1 0.8 
FORTE1_AL1.pdb 0.7 
END
(B) Residue-based Quality Assessment (fragment of the table). Note that this case includes case (A), so there is no need to submit QMODE 1 predictions in addition to QMODE 2.

PFRMAT QA
TARGET T9999
AUTHOR 1234-5678-9000
REMARK Error estimate is CA-CA distance in Angstroms
METHOD Description of methods used
MODEL 1
QMODE 2
3D-JIGSAW_TS1 0.8 10.0 6.5 5.0 2.0 1.0 ... 
FORTE1_AL1.pdb 0.7 8.0 5.5 4.5 X X ... 
END
Protein Structure Prediction Center
Sponsored by the US National Institute of General Medical Sciences (NIH/NIGMS)
© 2007-2019, University of California, Davis