Belozersky Institute

GeneBee

Russian EMBnet Node

RNA secondary structure prediction



ALGORITHM

If no significant multiple alignment that includes the sequence of interest is available, the secondary structure is built using the energy model alone (only the potential energy of the system is minimized). This method very often gives unreliable results.
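To make single-sequence prediction concrete, here is a minimal sketch that uses a much simpler stand-in for energy minimization: the classic Nussinov dynamic program, which maximizes the number of base pairs. It is not the GeneBee energy model. The function name, the pairing rules (Watson-Crick plus G-U wobble) and the minimum loop size of 3 are all assumptions made for illustration.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    This is a stand-in for energy minimization, not a real energy model.
    min_loop is the assumed minimum number of unpaired bases in a hairpin.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Option 1: position i or position j stays unpaired.
            score = max(best[i + 1][j], best[i][j - 1])
            # Option 2: i pairs with j.
            if (seq[i], seq[j]) in PAIRS:
                score = max(score, best[i + 1][j - 1] + 1)
            # Option 3: split into two independent substructures.
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


print(nussinov_max_pairs("GGGAAAACCC"))
```

Pair counting ignores stacking and loop energies entirely, so it illustrates only the general shape of a single-sequence optimization.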

If a multiple alignment is given, information on conserved positions in it, and on compensatory changes at some of those positions, is used: stems that include such positions are given a higher chance of being included in the resulting secondary structure.
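One way to picture how an alignment supports a base pair is the sketch below, which classifies a pair of alignment columns. The function names, the three labels and the pairing rules (Watson-Crick plus G-U wobble) are assumptions for illustration; the criteria actually used by the program are not given here.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def normalize(base):
    """Upper-case a base and treat DNA 'T' as RNA 'U'."""
    base = base.upper()
    return "U" if base == "T" else base


def classify_column_pair(alignment, i, j):
    """Classify alignment columns i and j (0-based).

    'conserved'    : every sequence carries the same base pair;
    'compensatory' : every sequence pairs, but the base pair differs
                     between sequences;
    'unsupported'  : at least one sequence cannot pair (gaps included).
    """
    observed = set()
    for seq in alignment:
        pair = (normalize(seq[i]), normalize(seq[j]))
        if pair not in PAIRS:
            return "unsupported"
        observed.add(pair)
    return "conserved" if len(observed) == 1 else "compensatory"


# Toy alignment, invented for this example.
alignment = [
    "GGACUUCGGUCC",
    "GGGCUUCGGCCC",
    "GGACUUCGGUCC",
]
print(classify_column_pair(alignment, 0, 11))  # G-C in every row
print(classify_column_pair(alignment, 2, 9))   # A-U in two rows, G-C in one
```

A compensatory change is the stronger signal: both positions mutated, yet the ability to pair was kept.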

The parameter dialog window looks the same whether you work with a single sequence or with an alignment. The following parameters are set:

a sequence (in one-letter code), for example:

tggcacaagc gccgcaaaac cgggggcaag agaaagccct
accacaagaa gcggaagtat gagttggggc gcccagctgc
caacaccaag ttggcccccg ccgcatccac acagtccgtg
tgcggggagg taacaagaaa taccgtgccc tgaggttgga
tggaggagca gttccagcag ggcaagcttc ttggtgagaa
ggcgtgcatc gcttcaaggc cgggacagtg tggccgagca
gatggctatg tgctagaggg caaagagttg gagttctatc
ttaggaaaat caaggcccgc

or an alignment, which should be represented as follows:

CLWRNA     AACCTGGTTGATCCTGCCAGTAGTCATATGCTTGTCTCAGAGATTAAGCCATGCATGTC
EHIRRNA    AACCTGGTTGATCCTGCCAGTAGTCATATGCTTGTCTCAGAGATTAAGCCATGCATGTC
FSLRRNA    AACCTGGTTGATCCTGCCAGTAGTCATATGCTTGTCTCAGAGATTAAGCCATGCATGTC

CLWRNA     TAAGTACATACCTTA---CGGTGAAACCGCGAATGGCTCATTAAATCAGCTATGGTTCC
EHIRRNA    TAAGTACATACCTTCA--CGGTGAAACCGCGAATGGCTCATTAAATCAGCTATGGTTCC
FSLRRNA    TAAGTACAAACCTTTAAACGGTGAAACCGCGAATGGCTCATTAAATCAGCTATGGTTCC

CLWRNA     TTGGATCGTACATACTACATGGATAACTGTAGTAATTCTAGAGCTAATACAT
EHIRRNA    TTGGATCGTACATTGTACATGGATAACTGTACTAATTCTAGAGCTAATACAT
FSLRRNA    TTAGATCGTACATACTACATGGATAACTGTAGTAATTCTAGAGCTAATACAT

Note that, for an alignment, the RNA secondary structure is predicted by default for the first sequence of the alignment. A different sequence can be selected by its number with the "Treated sequence" parameter (sophisticated version of the query form).
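The block layout above (a name, whitespace, then a chunk of aligned residues, with blank lines between blocks) can be read with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, the sample rows are shortened from the example, and the 1-based numbering of the treated sequence is an assumption.

```python
def parse_alignment(text):
    """Join the per-block chunks of each named sequence, keeping input order."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines that separate blocks
        name, chunk = parts
        rows[name] = rows.get(name, "") + chunk
    return rows


def treated_sequence(rows, number=1):
    """Return the gap-free sequence with the given number (assumed 1-based)."""
    names = list(rows)
    return rows[names[number - 1]].replace("-", "")


# Shortened sample in the same layout as the example alignment.
example_text = """
CLWRNA     AACCTGG
EHIRRNA    AACCTGG

CLWRNA     CTTA---CGG
EHIRRNA    CTTCA--CGG
"""

rows = parse_alignment(example_text)
print(rows["CLWRNA"])             # aligned row, gaps kept
print(treated_sequence(rows))     # default: first sequence, gaps removed
print(treated_sequence(rows, 2))  # second sequence
```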

The algorithm is as follows. First, all possible ways of pairing different segments of the sequence (or of the alignment, treated as a rigid block) are examined.
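As a rough illustration of this first step, the sketch below enumerates candidate helices in a single sequence by brute force. It is not the program's actual search procedure: the function name, the pairing rules (Watson-Crick plus G-U wobble), the minimum helix length and the minimum loop size are all assumptions.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(seq, min_len=3, min_loop=3):
    """List maximal runs of stacked complementary pairs.

    Each helix is (i, j, length): positions i+k and j-k (0-based) are paired
    for k = 0 .. length-1. At least min_loop bases separate paired positions.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    n = len(seq)

    def can_pair(a, b):
        return b - a > min_loop and (seq[a], seq[b]) in PAIRS

    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(i, j):
                continue
            # Skip pairs that merely continue a helix starting further out.
            if i > 0 and j + 1 < n and can_pair(i - 1, j + 1):
                continue
            length = 0
            while can_pair(i + length, j - length):
                length += 1
            if length >= min_len:
                helices.append((i, j, length))
    return helices


print(find_helices("GGGAAAACCC"))
```

For an alignment, the same scan could be run over columns instead of bases, accepting a column pair only when enough sequences can pair there.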

In the next step, locally optimal secondary structures are built from the helices found (the helices are joined by hierarchical cluster analysis). In particular, significant pseudoknots can be found at this step.

Now the final structure can be constructed. This is done by optimizing not the real energy but a model energy of the structure. The model energy includes contributions from conserved and complementary pairs, each with a corresponding coefficient. After the final calculations, the pairs included in the final structure are highlighted on the stack map. The graphical model of the RNA structure is then built.
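The idea of a model energy built from weighted pair contributions can be sketched as a simple weighted sum. Everything specific in the sketch is an assumption: the coefficient values are invented, the class names are hypothetical, and the real model presumably includes further terms.

```python
# Hypothetical coefficients; the actual values are not documented here.
COEFFICIENTS = {"compensatory": 3.0, "conserved": 2.0, "sequence_only": 1.0}


def model_energy(pair_classes, coefficients=COEFFICIENTS):
    """Return a model energy (lower is better) for a candidate structure.

    pair_classes holds one class label per base pair in the structure;
    each pair lowers the energy by the coefficient of its class.
    """
    return -sum(coefficients[label] for label in pair_classes)


# Two invented candidate structures, described only by their pair classes.
candidate_a = ["compensatory", "conserved", "conserved"]
candidate_b = ["sequence_only"] * 4

print(model_energy(candidate_a))
print(model_energy(candidate_b))
print(min([candidate_a, candidate_b], key=model_energy) is candidate_a)
```

With these weights, three alignment-supported pairs outscore four pairs found only in the treated sequence, which shows how the coefficients can shift the result toward alignment-supported stems.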

The output window is divided into two parts: a lower graphical "STRUCTURE" frame and an upper text frame with a detailed description of the local stack zones on the sequence. The "STRUCTURE" frame shows the secondary structure of the selected sequence. Complementary pairs in its hairpin zones are shown in yellow (or cyan), white, or green. These colors indicate, respectively, compensatory changes at the given pair of complementary alignment positions, conservation of that pair, or a complementary pair that exists only in the treated sequence.