Help text for RNA secondary structure prediction with KNetFold
Background
KNetFold is a program for predicting the consensus RNA secondary structure of a given alignment of nucleotide sequences. It uses a hierarchical network of k-nearest neighbor classifiers to compute a "base pair" or "no base pair" prediction for each pair of alignment positions. We evaluated the accuracy of the KNetFold algorithm on a set of 49 RNA sequence alignments obtained from the RFAM database. For this test set, the method performs better than the programs PFOLD and RNAalifold, and it is able to predict pseudoknots. More detailed information can be found in our 2006 publication, listed under Reference below.
Quick start
For a quick start, use the form pre-filled with the example tRNA alignment and click "submit" after providing your email address. To compute a prediction for your own sequences, paste a set of aligned RNA sequences (in FASTA format) into the sequence text area of the form below. Providing an email address is optional but recommended, because computing the results can take more than an hour, depending on the length of the submitted alignment. The options are explained below.
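Before pasting an alignment into the form, it can help to check locally that it really is an alignment, meaning that every sequence has the same length once gaps are included. The following is a minimal sketch of such a check, not part of KNetFold itself. The function name `read_aligned_fasta`, the use of `-` as the gap character, and the two example sequences are assumptions made for illustration only.

```python
def read_aligned_fasta(text):
    """Parse FASTA text into (name, sequence) pairs and verify equal lengths.

    Hypothetical helper for illustration; it is not part of the KNetFold server.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    if not records:
        raise ValueError("no sequences found")

    # In an alignment, all rows (gaps included) must have the same length.
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences are not aligned: lengths {sorted(lengths)}")
    return records


# Made-up two-sequence alignment, assuming '-' marks a gap.
example = """>seq1
GCGGAUUUAGCUC-AGU
>seq2
GCGGAU-UAGCUCAAGU
"""
records = read_aligned_fasta(example)
print(len(records), "sequences of aligned length", len(records[0][1]))
```

If the check raises a `ValueError`, the input is either not in FASTA format or the sequences have not been aligned yet.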
Retrieving results
Please note: if you did not provide an email address, you have to store the job id or keep the page generated upon query submission open in your browser. There are three ways to obtain the computed results:
- If you provided an email address, the web server sends you an email with the link to your results when they become available.
- After you submit your query, the generated page contains a link to your results. If they are not yet available, the page says that the job is still pending. Remember: KNetFold is not "fast"; it might take more than half an hour before the compute job is finished.
- If you stored the job id (typically one would paste the id into a text editor), you can obtain the results by entering the job id into the result retriever page.
Filter options
We offer two schemes for mapping the matrix that represents a contact prediction onto one unique secondary structure. The "winner takes all" filter was used for computing the results in our 2006 publication. It is fast and works well, but in some instances it leads to the prediction of implausible pseudoknots. For this reason we now also offer a type of distance geometry algorithm that filters out sterically impossible pseudoknots.
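To illustrate the general idea of reducing a contact matrix to one unique structure, here is a minimal sketch of a greedy "winner takes all" selection. It is one plausible reading of the term and not necessarily the filter implemented in KNetFold. The function name `winner_takes_all`, the score matrix layout (a square list of lists whose upper triangle holds the pair scores), and the `threshold` parameter are all assumptions.

```python
def winner_takes_all(scores, threshold=0.5):
    """Greedily pick base pairs from a score matrix (illustrative sketch).

    scores[i][j] with i < j is the predicted score for pairing positions
    i and j. The highest-scoring pair wins; any later pair that reuses an
    already paired position is discarded.
    """
    n = len(scores)
    candidates = [
        (scores[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if scores[i][j] >= threshold
    ]
    # Best score first; ties are broken by position to keep the result deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    partner = [None] * n
    for _, i, j in candidates:
        if partner[i] is None and partner[j] is None:
            partner[i], partner[j] = j, i

    # Report each pair once as (i, j) with i < j.
    return sorted((i, p) for i, p in enumerate(partner) if p is not None and i < p)
```

Because this selection only enforces that each position pairs at most once, crossing pairs (pseudoknots) can survive it, including ones that are not sterically plausible. That is the kind of case the second filter option is meant to address.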
Minimum stem length
This option requires all stems of a predicted secondary structure to have a certain minimum length. When evaluating the accuracy of KNetFold, we found that the prediction accuracy is slightly higher if the stem length is not restricted (minimum stem length of 1). However, the RFAM alignments used for the evaluation are generally of high quality. Alignments consisting of only a few sequences can lead to spurious predicted single base pairs. Because of this, the default minimum stem length is 2.
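The effect of a minimum stem length can be sketched as a filter over a list of predicted base pairs. The example below is an illustration under stated assumptions and not the KNetFold implementation: a stem is taken to be a maximal run of directly stacked pairs (i, j), (i+1, j-1), and so on, and the function name `filter_short_stems` is hypothetical.

```python
def filter_short_stems(pairs, min_stem_length=2):
    """Drop all stems shorter than min_stem_length (illustrative sketch).

    pairs is an iterable of (i, j) base pairs with i < j. A stem is assumed
    to be a maximal run of stacked pairs (i, j), (i + 1, j - 1), ...
    """
    pair_set = set(pairs)
    kept = []
    for i, j in sorted(pair_set):
        # Start only at the outermost pair of a stem.
        if (i - 1, j + 1) in pair_set:
            continue
        # Walk inward along the stack to collect the whole stem.
        stem = []
        a, b = i, j
        while (a, b) in pair_set:
            stem.append((a, b))
            a, b = a + 1, b - 1
        if len(stem) >= min_stem_length:
            kept.extend(stem)
    return sorted(kept)


# A three-pair stem plus one isolated pair (made-up positions).
predicted = [(0, 10), (1, 9), (2, 8), (12, 20)]
print(filter_short_stems(predicted, min_stem_length=2))
```

With a minimum stem length of 1 nothing is removed, which corresponds to the unrestricted setting described above; with the default of 2 the isolated pair is dropped.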
Reference
E. Bindewald, B.A. Shapiro: RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers. RNA 12(3):342-352 (2006).