One of DSSR’s noteworthy features is the auto-detection of helical junctions in nucleic acid structures, be it RNA, DNA, or chimeric DNA/RNA, consisting of one or multiple chains. Helical junctions are created at the interface of three or more stems composed of canonical pairs (Watson-Crick A-T/U and G-C, or wobble G-U). A three-way junction model is illustrated below (copied from Figure 1 of the Bindewald et al. RNAJunction paper). Note that the three chains are each continuous (i.e., consecutive nts are covalently connected) and, together with the three inner bps, form a loop in the middle. Here, the three-way junction is of type [3×2×3]: the three closing bps contribute 3×2 nts and the connecting strands contribute 3+2+3 nts, so the loop is composed of a total of 3×2+3+2+3 = 14 nts.
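The loop-size arithmetic above can be written out as a short sketch. The helper names `parse_type` and `junction_size` are made up for illustration and are not part of DSSR; the only assumption is the counting rule stated in the text, i.e., two nts per closing bp plus the connecting nts of each strand.

```python
def parse_type(label):
    """Turn a junction label such as '[3x2x3]' or '[3×2×3]' into [3, 2, 3]."""
    inner = label.strip().strip("[]").replace("×", "x")
    return [int(part) for part in inner.split("x")]


def junction_size(connecting_nts):
    """Total nts in the junction loop.

    Each of the n stems contributes one closing bp (2 nts), and each
    connecting strand contributes its own nts.
    """
    n_way = len(connecting_nts)
    if n_way < 3:
        raise ValueError("a helical junction needs at least three stems")
    return 2 * n_way + sum(connecting_nts)


print(junction_size(parse_type("[3×2×3]")))  # 3*2 + 3+2+3 = 14
```

The same rule reproduces the totals reported for the examples that follow: 8 nts for [0x0x0x0], 10 nts for [0x0x1x1], and 16 nts for [2x1x5x0].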
DSSR automatically detects all existing helical junctions in a nucleic acid structure, as illustrated by the following examples.
1l6b [all DNA Holliday junction structure of d(CCGGTACm5CGG)]
This is a simple four-way junction of type [0×0×0×0], where all bases are paired, leaving no connecting nts. The relevant portion of the DSSR output is:

    List of 1 junction(s)
       1 4-way junctions: 8 nts; [0x0x0x0]; linked by [#1, #2, #4, #3]
         1:A.DA6+1:A.DC7+2:B.DG14+2:B.DT15+2:A.DA6+2:A.DC7+1:B.DG14+1:B.DT15 [ACGTACGT]
           0 nts junction ; 1:A.DA6-->1:A.DC7 [AC]
           0 nts junction ; 2:B.DG14-->2:B.DT15 [GT]
           0 nts junction ; 2:A.DA6-->2:A.DC7 [AC]
           0 nts junction ; 1:B.DG14-->1:B.DT15 [GT]
Technically, note the following points:
- The four-way junction is derived from biological assembly 1 (PDB file 1l6b.pdb1), which contains two copies of the asymmetric unit, delineated by MODEL/ENDMDL records. By default, DSSR/3DNA works on one structure at a time, corresponding to the first structure/model in a given PDB or mmCIF file. To take the biological assembly as a whole, and to avoid confusion with MODEL/ENDMDL-delineated NMR entries, the ENDMDL record of the first model is commented out in the file (1l6b.pdb1), as below:

      #ENDMDL
      MODEL        2
- With the modified PDB file 1l6b.pdb1, the DSSR command can be run as x3dna-dssr -i=1l6b.pdb1, with the output going to stdout.
- The simplified schematic block PNG image was generated by using the command below to create the Raster3D .r3d file (1l6b.r3d), which was then ray-traced using PyMOL.
blocview -r 1l6b.r3d 1l6b.pdb1
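The manual edit of the biological assembly file described in the first point could be scripted. What follows is a minimal sketch, not the author's actual procedure: the function name `comment_out_first_endmdl` is hypothetical, and it assumes that prefixing the first ENDMDL record with `#` is all that is needed, as in the snippet shown above.

```python
def comment_out_first_endmdl(pdb_text):
    """Return pdb_text with the first ENDMDL record prefixed by '#'.

    Only the first ENDMDL is touched, so the first two MODEL blocks are
    read as a single structure while later records stay unchanged.
    """
    lines = pdb_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("ENDMDL"):
            lines[i] = "#" + line
            break
    return "".join(lines)


# Toy input standing in for a two-model biological assembly file.
demo = (
    "MODEL        1\n"
    "ATOM      1  P    DC A   1\n"
    "ENDMDL\n"
    "MODEL        2\n"
    "ATOM      1  P    DC A   1\n"
    "ENDMDL\n"
)
print(comment_out_first_endmdl(demo))
```

For a real file, the text would be read from and written back to disk (for example, 1l6b.pdb1) before running DSSR on it.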
1egk [a four-way DNA/RNA junction]
This four-way junction consists of both DNA and RNA chains. Here the helical junction may not be obvious from looking at the 3D image alone.

    List of 1 junction(s)
       1 4-way junctions: 10 nts; [0x0x1x1]; linked by [#3, #-1, #4, #5]
         B.DC37+B.DT38+B.DA45+B.DC46+C.G109+C.A110+C.U111+D.DA130+D.DG131+D.DG132 [CTACGAUAGG]
           0 nts junction ; B.DC37-->B.DT38 [CT]
           0 nts junction ; B.DA45-->B.DC46 [AC]
           1 nts junction C.A110 [A]; C.G109-->C.U111 [GAU]
           1 nts junction D.DG131 [G]; D.DA130-->D.DG132 [AGG]
1ehz [yeast phenylalanine tRNA]
As shown below, DSSR correctly detects the classic L-shaped 3D structure and the cloverleaf 2D structure of a tRNA.
    List of 1 junction(s)
       1 4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]
         A.U7+A.U8+A.A9+A.2MG10+A.C25+A.M2G26+A.C27+A.G43+A.A44+A.G45+A.7MG46+A.U47+A.C48+A.5MC49+A.G65+A.A66 [UUAgCgCGAGgUCcGA]
           2 nts junction A.U8+A.A9 [UA]; A.U7-->A.2MG10 [UUAg]
           1 nts junction A.M2G26 [g]; A.C25-->A.C27 [CgC]
           5 nts junction A.A44+A.G45+A.7MG46+A.U47+A.C48 [AGgUC]; A.G43-->A.5MC49 [GAGgUCc]
           0 nts junction ; A.G65-->A.A66 [GA]
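For downstream processing, the junction summary line can be picked apart with a regular expression. This is a rough sketch based only on the text layout of the examples in this post; the pattern and the function name `parse_junction_header` are assumptions for illustration, not an interface provided by DSSR.

```python
import re

# Matches e.g. "4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]"
HEADER = re.compile(
    r"(\d+)-way junctions?: (\d+) nts; \[([0-9x]+)\]; linked by \[([^\]]*)\]"
)


def parse_junction_header(text):
    """Extract the junction order, size, type, and stem indices from a summary line."""
    match = HEADER.search(text)
    if match is None:
        return None
    return {
        "n_way": int(match.group(1)),
        "total_nts": int(match.group(2)),
        "connecting_nts": [int(x) for x in match.group(3).split("x")],
        # Stem indices may be negative, as in "#-1" for 1egk.
        "stems": [int(s) for s in re.findall(r"#(-?\d+)", match.group(4))],
    }


trna = "1 4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]"
print(parse_junction_header(trna))
```

A quick sanity check on the parsed values is that `total_nts` equals two nts per stem plus the sum of `connecting_nts`, which holds for all three junctions shown above.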
2fk6 [RNAse Z/tRNA(Thr) complex]
In a recent paper, "Predicting Helical Topologies in RNA Junctions as Tree Graphs" by Laing et al., this PDB entry was selected in Table 1 as containing a three-way junction. However, DSSR does not detect any junction in this structure, even though the program does find co-axial stacks. It turns out that the PDB entry 2fk6 does not possess the anti-codon stem/loop, so nts C25 and G46 are not covalently connected. While three-way junctions may be defined differently, the DSSR result follows the above-mentioned chain-continuity requirement.
Overall, DSSR can consistently find all helical junctions in a given nucleic acid structure. Try DSSR on a ribosomal structure, and you may well appreciate what it reveals. Moreover, it is straightforward to apply the program to all RNA/DNA-containing entries in the PDB via a script.
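As a rough illustration of such a script, the sketch below loops over a list of structure files and captures the DSSR output for each. It relies only on the `x3dna-dssr -i=FILE` invocation shown earlier, with output on stdout; the function names, the `runner` parameter, and the directory name in the usage comment are all hypothetical.

```python
import subprocess
from pathlib import Path


def dssr_command(structure_path, exe="x3dna-dssr"):
    """Build the command line used earlier in this post: x3dna-dssr -i=FILE."""
    return [exe, f"-i={structure_path}"]


def run_dssr_on_all(paths, runner=subprocess.run):
    """Run DSSR on each file and collect its stdout, keyed by file path.

    `runner` defaults to subprocess.run; it is a parameter only so the loop
    can be exercised without DSSR installed.
    """
    results = {}
    for path in paths:
        done = runner(dssr_command(path), capture_output=True, text=True)
        results[str(path)] = done.stdout
    return results


# Usage (assumes x3dna-dssr is on PATH and a local directory of PDB files):
# outputs = run_dssr_on_all(sorted(Path("pdb_entries").glob("*.pdb")))
```

Each captured output can then be searched for the "List of ... junction(s)" section to tabulate junctions across entries.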