  1. MS2DB++ Usage
  2. Introduction to Combination Rules
  3. Dempster rule
  4. Yager rule
  5. Campos-Cavalcante rule
  6. Shafer rule

MS2DB++ Usage

The disulfide bond determination process, available under the section Find Disulfide Connectivity, is divided into four separate steps:

  1. Method Selection and Data Entry
  2. Reliability Assignment
  3. Combination Strategies
  4. Global Connectivity

1. Method Selection and Data Entry

In this initial stage, the user can:

  • select one or more of the available S-S bond determination frameworks
  • enter the protein's FASTA sequence
  • specify a region where S-S bonds are not expected to occur (optional)
  • upload the MS/MS files, choose the protease used during digestion, choose the number of missed cleavage sites, and optionally set the Initial match threshold (if the MS2DB+ framework is selected)
  • enter the S-S connectivity information (if external frameworks are selected)
Once all the data has been entered, click the NEXT button to move to the next step.

2. Reliability Assignment

In this step, the putative bonds determined by each selected framework are listed. A reliability score is provided for each disulfide bond identified. Initially, all reliability scores are set to 1.0 (the maximum value). Optionally, the user can decrease these scores based on their own experience. Factors to consider when decreasing the reliability scores include:

  • technical knowledge of the protein sequence/structure
  • disulfide bond determination framework
  • analysis of the MS/MS data involved in the bond identification
Links to the MS/MS file(s) involved in the disulfide linkage identified by the MS2DB+ framework are provided. The MS/MS files can be accessed by clicking on the corresponding disulfide bond.

The next step (available when the user clicks on NEXT) is to select the combination rules based on the Dempster-Shafer theory (DST).

3. Combination Strategies

In this step, the user can choose from different combination rules to determine the global disulfide connectivity pattern. Each combination rule has its own strengths, so comparing the results obtained with different rules can support the user's analysis and improve the quality of the results. At least one combination rule must be selected; any or all of them may be chosen. See the subsequent sections of this page for a more detailed description of each combination rule.

4. Global Connectivity

In this final stage, the global (consistent) disulfide connectivity patterns obtained with each combination rule selected in the previous step are presented in both graphical and text formats. A confidence score is also assigned to each disulfide bond found.

All results displayed are available for download in TXT and XML formats. The intermediate scores calculated by the different frameworks and used during the information fusion are also available for download in XML format at the bottom of the page.
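
To illustrate what a "consistent" connectivity means, the short Python sketch below picks, from a set of scored candidate bonds, the highest-scoring subset in which no cysteine takes part in more than one bond. The text does not describe how MS2DB++ performs this selection, so the exhaustive search, the function name `best_consistent_connectivity`, and the example scores are all assumptions made for illustration only.

```python
from itertools import combinations

def best_consistent_connectivity(bond_scores):
    """Return the highest-scoring set of bonds in which no cysteine is reused.

    bond_scores maps a bond, written as a (cysteine_i, cysteine_j) tuple of
    sequence positions, to its combined score. Exhaustive search: only
    practical for a small number of candidate bonds.
    """
    bonds = list(bond_scores)
    best, best_total = (), 0.0
    for size in range(1, len(bonds) + 1):
        for subset in combinations(bonds, size):
            cysteines = [c for bond in subset for c in bond]
            if len(cysteines) != len(set(cysteines)):
                continue  # a cysteine appears in two bonds: inconsistent
            total = sum(bond_scores[b] for b in subset)
            if total > best_total:
                best, best_total = subset, total
    return set(best), best_total

# Hypothetical combined scores for four candidate bonds.
scores = {(10, 25): 0.9, (10, 40): 0.6, (25, 40): 0.3, (40, 60): 0.8}
pattern, total = best_consistent_connectivity(scores)
print(pattern, total)
```

In this example the bonds (10, 40) and (25, 40) are discarded because each shares a cysteine with a higher-scoring bond.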

Introduction to Combination Rules

Each disulfide bond determination method contributes some evidence (a score) towards the final decision about the existence of a disulfide (S-S) bond between a specific pair of cysteines. Coherent combination rules are therefore required to combine the scores produced by the different disulfide connectivity determination methods.

MS2DB++ lets users apply up to five different disulfide linkage determination methods:

  • MS/MS, developed in our MS2DB+ application (based on MS/MS data);
  • SVM, a predictive technique using a support vector machine classifier;
  • CSP, a predictive technique based on cysteine separation profiles;
  • Two custom methods, where users can provide the bonding patterns determined by other methods.

Different combination rules were developed to fuse the results obtained by the different disulfide determination methods. Note that no combination rule is inherently correct or incorrect: each rule has its own supporting theory and may be the optimal choice for a specific set of proteins analyzed by a specific set of methods.

Overall, MS2DB++ allows users to analyze the same data from different perspectives in order to obtain the best results. The combination rules are listed and briefly explained below.

Dempster rule

According to the Dempster rule and considering only two disulfide bond determination methods (for simplicity), the score of a disulfide bond pattern A is:
  • the sum of the products of the scores of bonding patterns B and C, obtained respectively by disulfide bond determination methods m1 and m2, over all pairs whose intersection equals A (where A is not the empty set),
  • divided by
  • the sum of the products of the scores of bonding patterns B and C, obtained respectively by methods m1 and m2, over all pairs whose intersection is not the empty set.

The formula used to combine disulfide bond scores according to the Dempster rule is presented below. This combination formula is commutative and associative; thus, other disulfide bond determination methods (and their scores) can easily be added.

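$$m_{12}(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{\sum_{B \cap C \neq \emptyset} m_1(B)\, m_2(C)}, \qquad A \neq \emptyset$$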
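
The Dempster rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the MS2DB++ implementation: the function name `dempster_combine`, the representation of a bonding pattern as a `frozenset` of bond labels, and the example scores are all assumptions.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two score assignments with the Dempster rule.

    m1 and m2 map bonding patterns (frozensets of bond labels) to scores.
    """
    combined = {}
    agreement = 0.0  # sum of products over pairs with a non-empty intersection
    for (b, score_b), (c, score_c) in product(m1.items(), m2.items()):
        a = b & c
        if not a:
            continue  # conflicting pair: excluded from both sums
        weight = score_b * score_c
        combined[a] = combined.get(a, 0.0) + weight
        agreement += weight
    if agreement == 0.0:
        raise ValueError("the two methods conflict completely; the rule is undefined")
    return {a: score / agreement for a, score in combined.items()}

# Hypothetical scores from two methods over two candidate bonds.
bond_12 = frozenset({"C1-C2"})
bond_13 = frozenset({"C1-C3"})
either = bond_12 | bond_13

m1 = {bond_12: 0.6, either: 0.4}
m2 = {bond_12: 0.5, bond_13: 0.3, either: 0.2}
result = dempster_combine(m1, m2)
for pattern, score in result.items():
    print(sorted(pattern), round(score, 4))
```

Because conflicting pairs are dropped and the remainder is renormalized, the combined scores sum to 1 whenever each input does.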

Yager rule

According to the Yager rule and considering only two disulfide bond determination methods (for simplicity), the score of a disulfide bond pattern A is:
  • the sum of the products of the scores of bonding patterns B and C, obtained respectively by disulfide bond determination methods m1 and m2, over all pairs whose intersection equals A.

The formula used to combine disulfide bond scores according to the Yager rule is presented below. This combination formula is also commutative and associative; thus, other disulfide bond determination methods (and their scores) can easily be added.

$$m_{12}(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C)$$
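
A minimal Python sketch of the rule as it is worded above follows. The text does not say what happens to the score that falls on an empty intersection (the conflict between the two methods), so this sketch simply reports it under the empty pattern; that choice, the function name `yager_combine`, and the example scores are assumptions.

```python
from itertools import product

def yager_combine(m1, m2):
    """Combine two score assignments without normalization.

    m1 and m2 map bonding patterns (frozensets of bond labels) to scores.
    The conflicting score is kept under frozenset() (an assumption).
    """
    combined = {}
    for (b, score_b), (c, score_c) in product(m1.items(), m2.items()):
        a = b & c
        combined[a] = combined.get(a, 0.0) + score_b * score_c
    return combined

# Hypothetical scores from two methods over two candidate bonds.
bond_12 = frozenset({"C1-C2"})
bond_13 = frozenset({"C1-C3"})
either = bond_12 | bond_13

m1 = {bond_12: 0.6, either: 0.4}
m2 = {bond_12: 0.5, bond_13: 0.3, either: 0.2}
result = yager_combine(m1, m2)
for pattern, score in result.items():
    print(sorted(pattern), round(score, 4))
```

Unlike the Dempster rule, no division is applied, so the scores of the non-empty patterns are lowered by the amount of conflict between the methods.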

Campos-Cavalcante rule

According to the Campos-Cavalcante rule and considering only two disulfide bond determination methods (for simplicity), the score of a disulfide bond pattern A is:
  • the sum of the products of the scores of bonding patterns B and C, obtained respectively by disulfide bond determination methods m1 and m2, over all pairs whose intersection equals A (where A is not the empty set),
  • multiplied by
  • I, which is 1 divided by the sum of the products of the scores of bonding patterns B and C, obtained respectively by methods m1 and m2, over all pairs whose intersection is not the empty set,
  • divided by
  • 1 plus the logarithm (base 10) of I.

The formula used to combine disulfide bond scores according to the Campos-Cavalcante rule is presented below. This combination formula is commutative and associative; thus, other disulfide bond determination methods (and their scores) can easily be added.

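$$m_{12}(A) = \frac{I \sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 + \log_{10} I}, \qquad I = \frac{1}{\sum_{B \cap C \neq \emptyset} m_1(B)\, m_2(C)}, \qquad A \neq \emptyset$$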
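
The steps listed above can be sketched in Python as follows. This is an illustrative sketch rather than the MS2DB++ code: the function name `campos_cavalcante_combine`, the `frozenset` representation of bonding patterns, and the example scores are assumptions, and each method's scores are assumed to sum to 1 so that I is at least 1.

```python
import math
from itertools import product

def campos_cavalcante_combine(m1, m2):
    """Combine two score assignments following the Campos-Cavalcante steps.

    m1 and m2 map bonding patterns (frozensets of bond labels) to scores.
    """
    combined = {}
    agreement = 0.0  # sum of products over pairs with a non-empty intersection
    for (b, score_b), (c, score_c) in product(m1.items(), m2.items()):
        a = b & c
        if not a:
            continue  # conflicting pair: excluded from both sums
        weight = score_b * score_c
        combined[a] = combined.get(a, 0.0) + weight
        agreement += weight
    if agreement == 0.0:
        raise ValueError("the two methods conflict completely; I is undefined")
    i = 1.0 / agreement                   # the factor I from the text
    scale = i / (1.0 + math.log10(i))     # multiply by I, divide by 1 + log10(I)
    return {a: score * scale for a, score in combined.items()}

# Hypothetical scores from two methods over two candidate bonds.
bond_12 = frozenset({"C1-C2"})
bond_13 = frozenset({"C1-C3"})
either = bond_12 | bond_13

m1 = {bond_12: 0.6, either: 0.4}
m2 = {bond_12: 0.5, bond_13: 0.3, either: 0.2}
result = campos_cavalcante_combine(m1, m2)
for pattern, score in result.items():
    print(sorted(pattern), round(score, 4))
```

When the methods do not conflict, I equals 1, its logarithm is 0, and the result coincides with the Dempster rule; with conflict, every score is reduced by the factor 1 + log10(I).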

Shafer rule

This combination rule allows users to weight the scores obtained by each disulfide bond determination method differently. It is especially useful when the bonding patterns found conflict, or when a particular method is known to perform poorly for a specific motif (e.g., poor fragmentation in tandem MS analysis, or a specific amino acid sequence or bonding arrangement).

When using this rule, an expert user may assign reliability values to the bonding scores obtained by the different methods. Each reliability value is then multiplied by its respective bonding pattern score. By default, MS2DB++ assigns maximum reliability (alpha = 1) to all scores.

In this rule, the score of a bonding pattern A is calculated as the average, over the different disulfide bond determination methods, of the score each method assigns to A multiplied by its respective reliability factor alpha. The formula is presented below.

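$$m(A) = \frac{1}{n} \sum_{i=1}^{n} \alpha_i\, m_i(A)$$

where n is the number of methods, m_i(A) is the score that method i assigns to bonding pattern A, and \alpha_i is the reliability value of that score.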
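
The reliability-weighted average described above can be sketched in Python as shown below. Two points are assumptions, since the text does not settle them: a method that does not report a pattern is treated as scoring it 0, and a single reliability value is used per method. The function name `shafer_combine` and the example values are also hypothetical.

```python
def shafer_combine(assignments, reliabilities=None):
    """Average the reliability-weighted scores of each bonding pattern.

    assignments   -- one dict per method, mapping a bonding pattern to its score
    reliabilities -- one alpha value per method; defaults to 1.0 for all
    """
    if not assignments:
        raise ValueError("at least one method is required")
    if reliabilities is None:
        reliabilities = [1.0] * len(assignments)
    if len(reliabilities) != len(assignments):
        raise ValueError("one reliability value is needed per method")
    n = len(assignments)
    patterns = set().union(*assignments)  # every pattern reported by any method
    return {
        a: sum(alpha * m.get(a, 0.0) for m, alpha in zip(assignments, reliabilities)) / n
        for a in patterns
    }

# Hypothetical scores: the second method is trusted less (alpha = 0.6).
bond_12 = frozenset({"C1-C2"})
bond_13 = frozenset({"C1-C3"})

m1 = {bond_12: 0.9}
m2 = {bond_12: 0.5, bond_13: 0.5}
result = shafer_combine([m1, m2], reliabilities=[1.0, 0.6])
for pattern, score in result.items():
    print(sorted(pattern), round(score, 4))
```

Lowering a method's alpha shrinks its contribution to every pattern it reports, which is how a less trusted method is kept from dominating the combined score.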