DSSR-derived secondary structure in BPSEQ format

As of v1.1.3-2014jun18, DSSR additionally outputs RNA secondary structures in BPSEQ format. A sample file for PDB entry 1msy is shown below.

1msy [GUAA tetra loop] in 3d and 2d representations

Filename: dssr-2ndstrs.bpseq
Organism: DSSR-derived secondary structure [1msy]
Accession Number: DSSR v1.1.4-2014aug09 (xiangjun@x3dna.org)
Citation: Please cite 3DNA/DSSR (see http://x3dna.org)
    1 U     0 # name=A.U2647
    2 G    26 # name=A.G2648, pairedNt=A.U2672
    3 C    25 # name=A.C2649, pairedNt=A.G2671
    4 U    24 # name=A.U2650, pairedNt=A.A2670
    5 C    23 # name=A.C2651, pairedNt=A.G2669
    6 C    22 # name=A.C2652, pairedNt=A.G2668
    7 U     0 # name=A.U2653
    8 A     0 # name=A.A2654
    9 G     0 # name=A.G2655
   10 U     0 # name=A.U2656
   11 A     0 # name=A.A2657
   12 C    17 # name=A.C2658, pairedNt=A.G2663
   13 G     0 # name=A.G2659
   14 U     0 # name=A.U2660
   15 A     0 # name=A.A2661
   16 A     0 # name=A.A2662
   17 G    12 # name=A.G2663, pairedNt=A.C2658
   18 G     0 # name=A.G2664
   19 A     0 # name=A.A2665
   20 C     0 # name=A.C2666
   21 C     0 # name=A.C2667
   22 G     6 # name=A.G2668, pairedNt=A.C2652
   23 G     5 # name=A.G2669, pairedNt=A.C2651
   24 A     4 # name=A.A2670, pairedNt=A.U2650
   25 G     3 # name=A.G2671, pairedNt=A.C2649
   26 U     2 # name=A.U2672, pairedNt=A.G2648
   27 G     0 # name=A.G2673

According to online sources, the BPSEQ format originated from the Comparative RNA Web (CRW) site developed by the Gutell lab. CRW files contain four header lines giving the file name, organism, accession number, and a general remark. After the header there is one line per base in the molecule, with three columns: the position of the base (starting from 1), the one-letter base name (A, C, G, U, etc.), and the position of the base it is paired with, or zero (0) if the base is unpaired. In the DSSR-derived sample file above, detailed information about each base and its pairing partner (if any) follows the # symbol.
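The layout described above (four header lines, then three whitespace-separated columns with an optional # annotation) is simple enough to read with a few lines of Python. The following is only a minimal sketch, not part of DSSR: the names parse_bpseq, check_symmetry and BpseqRecord are hypothetical, the small hairpin in SAMPLE is made up for illustration, and the parser assumes that exactly four header lines are present.

```python
from typing import List, NamedTuple, Tuple


class BpseqRecord(NamedTuple):
    index: int    # 1-based position of the base
    base: str     # one-letter base name
    partner: int  # position of the paired base, 0 if unpaired
    comment: str  # text after '#', empty if absent


def parse_bpseq(text: str, header_lines: int = 4) -> Tuple[List[str], List[BpseqRecord]]:
    """Split BPSEQ text into header lines and per-base records.

    Assumes a fixed number of header lines (four, as in CRW files).
    """
    lines = text.splitlines()
    header = lines[:header_lines]
    records = []
    for line in lines[header_lines:]:
        data, _, comment = line.partition("#")
        fields = data.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got: {line!r}")
        records.append(
            BpseqRecord(int(fields[0]), fields[1], int(fields[2]), comment.strip())
        )
    return header, records


def check_symmetry(records: List[BpseqRecord]) -> None:
    """Raise ValueError unless every pair i->j is matched by j->i."""
    partner_of = {r.index: r.partner for r in records}
    for i, j in partner_of.items():
        if j != 0 and partner_of.get(j) != i:
            raise ValueError(f"base {i} pairs with {j}, but {j} does not pair with {i}")


# Hypothetical 8-nt hairpin, for illustration only (not DSSR output).
SAMPLE = """\
Filename: example.bpseq
Organism: hypothetical hairpin
Accession Number: none
Citation: none
    1 G     8 # name=A.G1, pairedNt=A.C8
    2 C     7 # name=A.C2, pairedNt=A.G7
    3 G     0 # name=A.G3
    4 A     0
    5 A     0
    6 A     0
    7 G     2
    8 C     1
"""

header, records = parse_bpseq(SAMPLE)
check_symmetry(records)
# List each base pair once, as (lower index, higher index).
print([(r.index, r.partner) for r in records if r.partner > r.index])
```

Because each pair appears twice in a BPSEQ file (once per partner), the symmetry check is a cheap way to catch a truncated or hand-edited file.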

Compared to dot-bracket notation (dbn) and connect-table (.ct) format, BPSEQ is simpler but less expressive. Nevertheless, the format is well supported by bioinformatics tools for RNA secondary structures. It seems only fitting that DSSR now produces secondary structures in .bpseq (with default file name dssr-2ndstrs.bpseq), in addition to .dbn and .ct. Technically, adding the BPSEQ output to DSSR was trivial given the infrastructure already in place.
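To illustrate how closely the formats are related, the third BPSEQ column can be turned into a dot-bracket string directly. The sketch below is not DSSR's own code: partners_to_dbn is a hypothetical helper, and it assumes a fully nested (pseudoknot-free) pairing, raising an error otherwise. The partner list is copied from the third column of the 1msy sample above.

```python
from typing import List


def partners_to_dbn(partners: List[int]) -> str:
    """Convert BPSEQ partner indices (1-based, 0 = unpaired) to dot-bracket.

    Only nested pairings are handled; a crossing pair (pseudoknot)
    raises ValueError instead of producing a misleading string.
    """
    symbols = []
    open_stack = []  # positions of opening bases still awaiting their partner
    for i, j in enumerate(partners, start=1):
        if j == 0:
            symbols.append(".")
        elif j > i:
            open_stack.append(i)
            symbols.append("(")
        else:
            # A closing base must match the most recently opened one.
            if not open_stack or open_stack.pop() != j:
                raise ValueError(f"pair ({j}, {i}) is not properly nested")
            symbols.append(")")
    if open_stack:
        raise ValueError(f"unclosed pairs starting at {open_stack}")
    return "".join(symbols)


# Third column of the 1msy sample file, positions 1 to 27.
PARTNERS_1MSY = [
    0, 26, 25, 24, 23, 22, 0, 0, 0, 0, 0, 17, 0, 0,
    0, 0, 12, 0, 0, 0, 0, 6, 5, 4, 3, 2, 0,
]

print(partners_to_dbn(PARTNERS_1MSY))
```

The stack makes the nesting assumption explicit: representing crossing pairs in dot-bracket form would need additional bracket types, which this sketch deliberately leaves out.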

Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu