sidebar-main
title
griff
banner-button_0 banner-button_Layer-7 banner_button_03 banner-button_Layer-4 banner-button_05 banner-button_Layer-5 banner-button_07
banner-button_08 banner-button_09 banner-button_10
Bioinformatics Unit banner
   homeoffresearchoffconfoffprogonmemoffpuboffvacoff
   tabfoot tabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bgtabfoot-bg

multirelieflogo


Input

Algorithm

Algorithm

    Finding functional specific sites is a major problem in protein sequence analysis. Multi-RELIEF is a feature weighting algorithm for multiclass classification problems and can be therefore used to identify functional specific sites in a multiple sequence alignment.

Method

    Based on RELIEF1, a well known machine learning technique for feature weighting in two-class problems, multi-RELIEF can be used also in multiclass problems such as multiple sequence alignments. The principle behind the algorithm is to measure the quality of an attribute, e.g. a site, in merging neighboured sequences within the same class and separating neighboured sequences within different classes. The neighbour of a sequence is defined as any other sequence with minimal distance from it according to the 1-norm:


Example:

Sequence 1 A A B C
Sequence 2 A B C D
Sequence 3 A A B D
Sequence 4 A B B B

In the example, the nearest neighbour of Sequence 1 is Sequence 3 with distance 1. Sequences 2 and 4 are having distance 3 and 2 respectively. Using this definition RELIEF performs the following steps:

For each iteration of the algorithm an instance R is randomly selected and the two neighbours, hit(R) and miss(R), are calculated. hit(R) is the neighbour of R having the same class as R. miss(R) is the neighbour of R having the opposite class. The next step is to update the weight vector initialized to zero according to this formula,

,

where diff(a,b) is defined as 0 if a equals b and 1 otherwise. Normalization the differences with m ensures that the resulting weights range from -1 to 1 after m iterations of the algorithm. The multi-RELIEF algorithm extends this approach to multiclass problems by sub-sampling from pairs of classes. For every iteration of multi-RELIEF two classes are randomly selected and the RELIEF algorithm performed on them. The final weights for each attribute are calculated the following way:

If there had been positive weight assignments for attribute a:

                                                    compute average of all positive weight assignments

If there had been negative weight assignments for attribute a (no single positive weight assignment):

compute average of all negative weight assignments

If there had been just zero weight assignments set the weight of attribute a to zero:

Boosting multi-RELIEF's estimates with 3d contacts

    In order to possibly improve the results of multi-RELIEF, a heuristic to boost the weights of a site using information about its neighbours has been developed. The idea is fairly simple but nevertheless effective:

Every sited gets updated by the weights of its neighbouring sites. Thus, sites with high ranked neighbours are boosted.

Performance of multi-RELIEF

    The mulit-RELIEF algorithm shows good overal performance in comparison with other recent methods for selecting functional specific residues.

 

The multi-RELIEF web server developed and Copyright (c) by Bernhard Kamleitner, K. Anton Feenstra, Kai Ye and Elena Marchiori

(c) IBIVU 2024. If you are experiencing problems with the site, please contact the webmaster.