About Rfam

The Rfam database is a collection of structural RNA families, including non-coding RNA genes as well as cis-regulatory elements. Each family is represented by a multiple sequence alignment and a covariance model (CM).

You can use the Rfam website to obtain information about an individual family, or browse the families and genome annotations. Alternatively, you can download all of the Rfam data from the FTP site. Find out more about the project by exploring the latest Rfam references.

For each family, Rfam provides:

Summary page

Textual background information on the RNA family, which we obtain from the online encyclopedia Wikipedia.

Sequences

Information about sequences in the family, including the sequence ID, bit score, whether the sequence belongs to the SEED or FULL alignment, sequence start and end coordinates, sequence description, and species name.
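As an illustration only, the fields listed above could be held in a small record type. The class name, field names, and the example values below are hypothetical and do not reflect an official Rfam schema. The strand rule (start greater than end means reverse strand) is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceHit:
    """One row of a family's sequence listing (illustrative field names)."""
    sequence_id: str
    bit_score: float
    alignment: str      # "SEED" or "FULL"
    start: int
    end: int
    description: str
    species: str

    @property
    def strand(self) -> str:
        # Assumption for this sketch: start > end marks a reverse-strand hit.
        return "-" if self.start > self.end else "+"

    @property
    def length(self) -> int:
        # Coordinates are treated as 1-based and inclusive.
        return abs(self.end - self.start) + 1


# Hypothetical example values, not taken from a real family.
hit = SequenceHit("ABC123.1", 55.2, "SEED", 100, 40, "example RNA", "Homo sapiens")
print(hit.strand, hit.length)
```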

SEED alignment

A curated alignment containing a representative set of sequences together with a consensus secondary structure annotation. Rfam structures are based on expert-reported secondary structure annotations where available; otherwise, predictive methods such as RNAfold, RNAalifold, or related tools are applied.
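A consensus secondary structure is commonly written as a bracket string, with one character per alignment column. The sketch below assumes a simplified notation in which matching bracket characters mark paired columns and every other character is unpaired. Pseudoknot annotations are not handled, and the function name `base_pairs` is introduced here for illustration.

```python
def base_pairs(structure: str) -> list[tuple[int, int]]:
    """Return 0-based (open, close) column pairs from a bracket string."""
    openers = {"(": ")", "<": ">", "[": "]", "{": "}"}
    closers = {close: open_ for open_, close in openers.items()}
    stack: list[tuple[str, int]] = []
    pairs: list[tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch in openers:
            stack.append((ch, i))
        elif ch in closers:
            # The closing bracket must match the most recent opening bracket.
            if not stack or stack[-1][0] != closers[ch]:
                raise ValueError(f"unbalanced bracket at column {i}")
            _, j = stack.pop()
            pairs.append((j, i))
        # Any other character is treated as an unpaired column.
    if stack:
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)


print(base_pairs("<<..>>"))
```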

Secondary structure

Rfam provides two secondary structure representations: the Rfam structure and the R-scape optimized structure. The Rfam structure relies on expert-reported secondary structure where available, whereas the R-scape optimized structure is inferred to maximize statistically-supported covarying base pairs. Additionally, multiple measures of sequence and structural conservation are available through the Visualization Type menu.

Species

Two species distribution views are provided: Sunburst and Tree. The Sunburst view provides an interactive visualization, while the Tree view shows the distribution of the RNA across species in the FULL alignment.

Trees

Phylogenetic trees are available for both the SEED and FULL alignments. Each tree can also be downloaded in Newick format.
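A downloaded Newick string can be inspected with very little code. The following minimal sketch lists the leaf labels of a tree, assuming unquoted labels and no embedded comments. The function name `leaf_names` and the example tree are hypothetical, and a dedicated phylogenetics library would be a better choice for real analyses.

```python
import re


def leaf_names(newick: str) -> list[str]:
    """Return leaf labels from a Newick string with unquoted labels."""
    # A leaf label directly follows "(" or ","; internal node labels follow ")".
    return re.findall(r"(?<=[(,])\s*([^\s():,;]+)", newick)


# Hypothetical tree: leaves A, B, C and one labelled internal node D.
print(leaf_names("(A:0.1,(B:0.2,C:0.3)D:0.4);"))
```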

Structures

Mappings between PDB structures and Rfam annotations.

Motif matches

RNA motifs within the SEED alignment are listed. These include up to 34 conserved RNA structural motifs, such as GNRA and UNCG tetraloops, kink-turns, sarcin–ricin motifs, T-loops, and U-loops.

Database references

Main literature references for the family, together with links to additional cross-references, are provided. These references correspond to the publications associated with the family at the time of deposition.

Curation

This section is divided into two subsections: alignment source information and model information. The alignment source information includes the SEED alignment reference, structure source reference, RNA type, authors, and the number of sequences in the SEED and FULL alignments. The model information includes the build, calibration, and search commands, as well as the gathering, trusted, and noise cutoffs. A direct link to download the family covariance model is also provided.
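The relationship between the three cutoffs can be sketched in a few lines. The sketch assumes the conventional reading, which this page does not define: the gathering cutoff is the bit score threshold for including a hit in the family, the trusted cutoff is the lowest score among included hits, and the noise cutoff is the highest score among excluded hits. The function name and the scores are hypothetical.

```python
from typing import Optional


def derive_cutoffs(
    scores: list[float], gathering: float
) -> tuple[Optional[float], Optional[float]]:
    """Return (trusted, noise) cutoffs implied by a gathering threshold."""
    included = [s for s in scores if s >= gathering]
    excluded = [s for s in scores if s < gathering]
    # Trusted: lowest-scoring hit at or above the gathering threshold.
    trusted = min(included) if included else None
    # Noise: highest-scoring hit below the gathering threshold.
    noise = max(excluded) if excluded else None
    return trusted, noise


# Hypothetical bit scores with a gathering cutoff of 40.0.
print(derive_cutoffs([80.0, 45.5, 39.9, 20.1], 40.0))
```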

Publications

Publications indexed in Europe PMC that are linked to an Rfam family by its Rfam ID, family name, or family description.