X3DNA-DSSR Homepage -- Nucleic Acid Structures

DSSR-enabled RNA cover image March 2026

DSSR-enabled RNA cover image February 2026

Cover image provided by X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

As the developer of DSSR, I am thrilled to see its application in cutting-edge research across multiple disciplines. Below is a list of five recent publications that highlight how DSSR (or its predecessor, 3DNA) has been utilized, underscoring its versatility and significance in structural bioinformatics.

In the Geng et al. (2025) Nucleic Acids Research (NAR) paper, titled 'Revealing hidden protonated conformational states in RNA dynamic ensembles', DSSR is simply cited as follows:

All bp geometries, hydrogen-bond, backbone, stacking, and sugar dihedral angles were calculated using X3DNA-DSSR [77].

In the preprint by Gordan et al. (2025), titled 'High-throughput characterization of transcription factors that modulate UV damage formation and repair at single-nucleotide resolution', DSSR is cited as follows:

Step base stacking, base pair shift, base pair slide, interbase angle, pseudorotation angle, and sugar puckering classifications of nucleobases were computed using X3DNA-DSSR (v2.5.0)⁷⁵. Base stacking was defined as the overlapping polygon area in Å² when projecting the dipyrimidine base ring atoms (excluding exocyclic atoms) into the mean base pair plane⁷⁶. The sugar ring pseudorotation phase angle of each pyrimidine was also calculated using X3DNA-DSSR as described by Altona, C. & Sundaralingam, M.⁷⁷ Interbase angle was defined as sqrt(propeller²+buckle²) per the X3DNA-DSSR documentation.
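The interbase-angle definition quoted above is a one-line computation. Here is a minimal Python sketch of it; the function name and the input values are made up for illustration, and in practice propeller and buckle would be taken from DSSR output.

```python
import math


def interbase_angle(propeller_deg: float, buckle_deg: float) -> float:
    """Return sqrt(propeller^2 + buckle^2), in degrees."""
    return math.hypot(propeller_deg, buckle_deg)


# Hypothetical propeller and buckle values (degrees), for illustration only.
print(interbase_angle(-12.0, 5.0))
```

Because both terms are squared, the signs of propeller and buckle do not affect the result.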

Figure 6: TF Binding Induces Structural Distortion Favorable to UV Dimerization is highly informative, particularly panel (a), which illustrates the ensemble of structural parameters that predispose dipyrimidines to the formation of cyclobutane pyrimidine dimers (CPDs) or 6-4 pyrimidine-pyrimidone photoproducts (6-4 PPs). DSSR is designed as an integrated software tool, offering a comprehensive suite of structural parameters not found in any other single tool I am aware of. Even so, the innovative use of DSSR by Gordan et al. exceeded my expectations and demonstrates its versatility.

In the preprint by Kubaney et al. (2025) from the Baker group, titled 'RNA sequence design and protein-DNA specificity prediction with NA-MPNN', DSSR is cited as follows:

On the pseudoknot subset, we evaluate additional structure‐ and reactivity‐based metrics. DSSR v2.3.2⁴¹ is used to extract the ground‐truth secondary structure from the native crystal structures. For each designed sequence, RibonanzaNet predicts 2A3 reactivity profiles, from which we compute predicted OpenKnot scores (see https://github.com/eternagame/OpenKnotScore)³¹ using the predicted reactivity together with the DSSR ground truth.

In a recent NSMB paper from the Baker group, titled 'Computational design of sequence-specific DNA-binding proteins', 3DNA is cited as follows:

RIF docking of scaffolds onto DNA targets (DBP design step 1) Structures of B-DNA for each target (Supplementary Table 2) were generated by (1) using the DNA portion of PDB 1BC8 (ref. 60), PDB 1YO5 (ref. 61), PDB 1L3L (ref. 51) or PDB 2O4A (ref. 62) or (2) using the software X3DNA⁶³, followed by a constrained Rosetta relax of the DNA structure.

Please note that 3DNA has been replaced by DSSR. The functionality for constructing B-DNA models, previously provided by 3DNA, is now directly available in DSSR via its fiber and rebuild modules.

In the preprint by Si et al. (2025), titled 'End-to-End Single-Stranded DNA Sequence Design with All-Atom Structure Reconstruction', DSSR is cited as follows:

Since ViennaRNA and NUPACK require secondary structures as input, we used DSSR³⁵ to extract secondary structures from the corresponding ssDNA three-dimensional structures.

The above use cases are merely a sample of how DSSR is utilized in the scientific literature. It is reasonable to state that DSSR has emerged as a de facto standard tool within the field of nucleic acid structural bioinformatics. Overall, DSSR is a mature, robust, and efficient software product that is actively developed and maintained. I am committed to making DSSR synonymous with quality and value. Its unmatched functionality, usability, and support save users significant time and effort compared to alternative solutions.

DSSR is available free of charge for academic users. Additionally, it has been integrated into other high-profile bioinformatics resources, including NAKB, PDB-redo, and N•ESPript.

References

Geng A, Roy R, Ganser L, Li L, Al-Hashimi HM. Revealing hidden protonated conformational states in RNA dynamic ensembles. Nucleic Acids Research. 2025;53:gkaf1366. https://doi.org/10.1093/nar/gkaf1366.
Gordan R, Wasserman H, Chi B, Bohm K, Duan M, Sahay H, et al. High-throughput characterization of transcription factors that modulate UV damage formation and repair at single-nucleotide resolution. 2025. https://doi.org/10.21203/rs.3.rs-8197218/v1.
Kubaney A, Favor A, McHugh L, Mitra R, Pecoraro R, Dauparas J, et al. RNA sequence design and protein–DNA specificity prediction with NA-MPNN. 2025. https://doi.org/10.1101/2025.10.03.679414.
Glasscock CJ, Pecoraro RJ, McHugh R, Doyle LA, Chen W, Boivin O, et al. Computational design of sequence-specific DNA-binding proteins. Nat Struct Mol Biol. 2025;32:2252–61. https://doi.org/10.1038/s41594-025-01669-4.
Si Y, Xu Y, Chen L. End-to-end single-stranded DNA sequence design with all-atom structure reconstruction. 2025. https://doi.org/10.64898/2025.12.05.692525.

Single- and double-stranded Zp

From early on, 3DNA has calculated the Zp parameter to distinguish A- and B-DNA double-helical steps. First introduced in the paper A-form conformational motifs in ligand-bound DNA structures (see figure below), Zp is the mean projection of the two phosphorus atoms onto the z-axis of the dimer ‘middle frame’. Zp is greater than 1.5 Å for A-DNA and less than 0.5 Å for B-DNA. As noted in the 3DNA NAR paper, other parameters such as slide should also be examined to confirm conformational assignments based on Zp.
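The two Zp thresholds can be written as a small classifier. This is only a sketch in Python: the function name and the "intermediate" label for values between the two cutoffs are my own choices for illustration, and, as noted above, slide and other parameters should be checked before trusting the assignment.

```python
def classify_step_by_zp(zp: float) -> str:
    """Label a double-helical step from its Zp value (in Å)."""
    if zp > 1.5:
        return "A-like"
    if zp < 0.5:
        return "B-like"
    return "intermediate"  # hypothetical label for the 0.5-1.5 Å range


# Made-up Zp values, for illustration only.
for zp in (2.2, -0.4, 1.0):
    print(zp, classify_step_by_zp(zp))
```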

As of v2.1, 3DNA provides a single-stranded variant of the Zp parameter (ssZp) as a more robust substitute for the Richardson phosphorus-glycosidic bond distance parameter (Dp) to characterize sugar puckers. See the post Sugar pucker correlates with phosphorus-base distance for more details. In 3DNA/DSSR, ssZp is defined as the z-coordinate of the 3′ phosphorus atom expressed in the standard reference frame of the preceding base; it is positive when phosphorus lies on the +z-axis side (base in anti conformation) and negative if phosphorus is on the –z-axis side (base in syn conformation). Note that, by definition, Dp is always positive.

As in the previous post, here I am using G175 and U176 of PDB entry 1jj2 (the large ribosomal subunit of Haloarcula marismortui) as examples to illustrate how the ssZp parameters are calculated. The GpU forms a dinucleotide platform, where the sugar of G175 adopts a C2′-endo conformation, and that of U176 C3′-endo. For verification, here is the PDB data file for fragment 1jj2-G175-U176-A177.pdb (note A177 is included for its phosphorus atom). Run the following 3DNA commands:

find_pair -s 1jj2-G175-U176-A177.pdb stdout
frame_mol -1 ref_frames.dat 1jj2-G175-U176-A177.pdb ref-G175.pdb
frame_mol -2 ref_frames.dat 1jj2-G175-U176-A177.pdb ref-U176.pdb

File ref-G175.pdb contains the following line:

ATOM     24  P     U 0 176      -5.624   6.937   1.918  1.00 24.19           P

The z-coordinate of the phosphorus atom of U176 (which is 3′ to G175) is 1.918 Å, which is the ssZp for G175. It is less than 2.9 Å, corresponding to the C2′-endo sugar conformation of G175.

Similarly, file ref-U176.pdb contains the following line:

ATOM     44  P     A 0 177      -3.841   6.592   4.377  1.00 25.91           P

So the ssZp for U176 is 4.377, which is greater than 2.9 Å, corresponding to the C3′-endo sugar conformation of U176.
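The two readings above can be reproduced with a few lines of Python. This sketch assumes standard fixed-column PDB records (z in columns 47-54) and uses the 2.9 Å threshold from this post; the function names are hypothetical and nothing here is part of 3DNA itself.

```python
PUCKER_CUTOFF = 2.9  # Å, the ssZp threshold quoted in this post


def z_coordinate(atom_line: str) -> float:
    """Read the z-coordinate from a fixed-column PDB ATOM record (columns 47-54)."""
    return float(atom_line[46:54])


def pucker_from_sszp(sszp: float) -> str:
    """Map an ssZp value (Å) to a sugar-pucker label."""
    return "C3'-endo" if sszp > PUCKER_CUTOFF else "C2'-endo"


# Phosphorus records copied from ref-G175.pdb and ref-U176.pdb above.
p_of_u176 = "ATOM     24  P     U 0 176      -5.624   6.937   1.918  1.00 24.19           P"
p_of_a177 = "ATOM     44  P     A 0 177      -3.841   6.592   4.377  1.00 25.91           P"

for nucleotide, line in (("G175", p_of_u176), ("U176", p_of_a177)):
    sszp = z_coordinate(line)
    print(nucleotide, sszp, pucker_from_sszp(sszp))
```

The phosphorus of the 3′ neighbor is read from the file that was reoriented into the frame of the nucleotide being classified, which is why the U176 record gives the ssZp of G175.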

To sum up, the double-stranded Zp as originally available from 3DNA can be used for discriminating A- and B-DNA double-helical steps: Zp > 1.5 Å for A-DNA, and Zp < 0.5 Å for B-DNA. The newly introduced single-stranded Zp (ssZp) is intended for characterizing sugar puckers: ssZp > 2.9 Å for C3′-endo, and ssZp < 2.9 Å for C2′-endo. Since A-DNA has predominantly C3′-endo sugars and B-DNA has C2′-endo sugars, the ssZp parameter should be helpful in classifying a dinucleotide as A- or B-like. A survey of ssZp in well-defined A- and B-DNA structures (as performed for double-stranded Zp) should prove useful.

Realizing the potential for confusion between the double-stranded and single-stranded Zp, I am considering renaming the single-stranded Zp to ssZp in future releases of 3DNA and DSSR. Do you have any comments or suggestions? Please let me know by leaving a comment!


Modified nucleotides in the PDB

In addition to the five canonical bases (A, C, G, T, and U), nucleic acid structures in the PDB contain numerous modified variants (natural or engineered), with modifications in the nucleobase, the sugar, or the phosphate. For instance, the 76-nt (nucleotide) long yeast phenylalanine tRNA (1ehz) contains 14 modified bases: 2MG10, H2U16, H2U17, M2G26, OMC32, OMG34, YYG37, PSU39, 5MC40, 7MG46, 5MC49, 5MU54, PSU55, and 1MA58. Among modified nucleotides, the most prevalent and best-known example is pseudouridine. Note that in the PDB, each residue (including modified nts) is named with an identifier of up to three letters, e.g., PSU for pseudouridine. For a comprehensive list (with chemical and structural information) of small molecules, including modified nts, please refer to the Ligand Expo website hosted by the RCSB PDB.

Given the widespread occurrences of modified bases in nucleic acid structures, any practical structural bioinformatics software should be able to treat them effectively, as with the canonical bases. In 3DNA, from the very beginning, modified bases are mapped to standard counterparts, e.g. 5‐iodouracil (5IU) to uracil (U) and 1‐methyladenine (1MA) to adenine (A), allowing for easy analysis of unusual DNA and RNA structures (see the NAR03 reference). Specifically, in the 3DNA distribution the file baselist.dat contains the mappings explicitly.

As of v2.1, 3DNA automatically maps modified bases that are not listed in the file baselist.dat. Even so, I keep the list up to date with new DNA/RNA entries released by the PDB. The process is automated with a Ruby script that calls find_pair -s on each nucleic-acid-containing structure to output unknown bases. As an extreme case, the baselist.dat file below comprises only canonical bases:

  A   A
  C   C
  G   G
  T   T
  U   U
 DA   A
 DC   C
 DG   G
 DT   T

With the above minimal mapping list, running the command find_pair -s on 1ehz.pdb identifies all 14 modified bases. A sample case for 2MG is shown below:

Match '2MG' to 'g' for residue 2MG   10  on chain A [#10]
    check it & consider to add line '2MG     g' to file <baselist.dat>
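Messages of this form are easy to harvest from a batch run. The following Python sketch is not the Ruby script mentioned above; it only assumes that each unknown base is reported on a line matching the sample message, and the function name is hypothetical.

```python
import re

# Pattern modeled on the sample message shown above.
MATCH_LINE = re.compile(r"Match '(?P<residue>[^']+)' to '(?P<base>[A-Za-z])'")


def unknown_bases(find_pair_output: str) -> dict[str, str]:
    """Collect {residue name: one-letter code} from find_pair -s messages."""
    found = {}
    for match in MATCH_LINE.finditer(find_pair_output):
        found[match.group("residue").strip()] = match.group("base")
    return found


sample = """Match '2MG' to 'g' for residue 2MG   10  on chain A [#10]
    check it & consider to add line '2MG     g' to file <baselist.dat>
"""
print(unknown_bases(sample))
```

Merging the dictionaries from many structures gives a candidate list of new entries, which can then be checked by hand before being added to baselist.dat.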

By parsing the output of a batch run on all DNA/RNA-containing entries in the PDB as of October 18, 2013, I identified a total of 596 modified bases. The top portion of the list is shown below:

02I     a
08Q     c
08T     a
0AD     g
 0C     c
0DC     c
0DG     g
0DT     t
 0G     g
0KL     u
0KX     c
0KZ     t

An explicit list of base mappings makes the correspondence transparent and helps avoid ambiguity as to which canonical base a modified nt maps to. DSSR uses the same list internally. Hopefully, the information will also be useful to other related projects.
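For other projects that want to reuse such a list, a two-column mapping is simple to load. The Python sketch below assumes the whitespace-separated layout shown in the excerpts above and that a lowercase code marks a modified base, as those excerpts suggest; the function names are hypothetical and the excerpt is a small subset of the real file.

```python
EXCERPT = """\
  A   A
 DA   A
02I     a
0KZ     t
"""


def parse_baselist(text: str) -> dict[str, str]:
    """Map residue names to one-letter codes from two-column lines."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            mapping[fields[0]] = fields[1]
    return mapping


def describe(residue: str, mapping: dict[str, str]) -> str:
    """Report the canonical counterpart and whether the residue is modified."""
    code = mapping[residue]
    kind = "modified" if code.islower() else "canonical"
    return f"{residue} -> {code.upper()} ({kind})"


mapping = parse_baselist(EXCERPT)
print(describe("DA", mapping))
print(describe("02I", mapping))
```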


UNR- and GNRA-type U-turns

As of beta-r20-on-20130830, DSSR is able to detect two types of U-turns (see the figure below): the UNR-type (left), originally identified by Quigley and Rich [1976] in yeast phenylalanine tRNA, and the GNRA-type (right), later established by Jucker and Pardi [1995] in GNRA tetraloops. See the Gutell et al. paper Predicting U-turns in Ribosomal RNA with Comparative Sequence Analysis for a more extensive account of U-turns.

As its name implies, a U-turn is characterized by a reversal of the RNA backbone direction within a few nucleotides. Among other factors, the U-turn is stabilized by two key H-bonding interactions, illustrated in dotted lines in the figure below.


UNR-type (1ehz)	GNRA-type (1msy)

Applying DSSR to 1jj2 (the crystal structure of the Haloarcula marismortui large ribosomal subunit) led to the identification of over 30 cases. In addition to the well-documented UNR- and GNRA-type U-turns, the program also finds other variants. An example is shown below, where the U-turn is within a GCA triloop instead of a GNRA tetraloop. Here, the N1 (not N2) atom of G1809 forms an H-bond with OP2 of G1812. The G1809 N2 atom is H-bonded to G1812 O5′ to further stabilize the U-turn.

An examination of the chemical structure of the nitrogenous bases (see figure below) shows clearly other possibilities to connect RNA base donors to the phosphate oxygen acceptors. DSSR allows for the exploration of such variations, and more.


Restraint optimization of DNA backbone geometry using PHENIX

3DNA can build DNA/RNA structures with precise base geometry but an approximate sugar-phosphate backbone. In the 2003 3DNA-NAR paper, Table 3 of the section “Structures built with sugar–phosphate backbone” lists the “root mean square deviation (in Å) between rebuilt 3DNA models and experimental DNA structures” for three representative DNA structures (in A-form, B-form, and a protein-DNA complex). It was noted that “the RMSD of reconstructed versus observed base positions is virtually zero and that for both base and backbone coordinates is <0.85 Å, even for the 146 bp nucleosomal DNA structure.”

The backbone geometry is approximate because 3DNA uses a fixed sugar-phosphate conformation (in A-DNA, B-DNA or RNA) that is attached to the corresponding bases in the model building process. The most noticeable effect is the long O3′(i)···P(i+1) bond that connects consecutive nucleotides along a chain. The imprecise structure was intended as a starting point for other objectives (e.g., all-atom molecular dynamics simulations) that are out of the design scope of 3DNA. Nevertheless, over the years, I have been concerned with the overlong O3′—P distance issue. I tried but failed to find a satisfying third-party (command-line driven) tool that can perform restraint optimization of the sugar-phosphate backbone geometry while keeping base atoms fixed.
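To see which linkages are affected in a rebuilt model, one can list the O3′(i)···P(i+1) distances directly from the PDB file. The Python sketch below assumes fixed-column ATOM/HETATM records, integer residue numbers without insertion codes, and consecutive numbering along each chain. A regular O3′–P bond is about 1.6 Å; the 2.0 Å cutoff is only an illustrative choice of mine, not a 3DNA or PHENIX setting.

```python
import math

OVERLONG = 2.0  # Å, illustrative cutoff (an assumption)


def atom_records(pdb_text):
    """Yield (chain, residue number, atom name, (x, y, z)) from ATOM/HETATM lines."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line[21], int(line[22:26]), line[12:16].strip(), xyz


def o3_p_distances(pdb_text):
    """Return {(chain, i): distance} between O3'(i) and P(i+1) in each chain."""
    coords = {(c, n, a): xyz for c, n, a, xyz in atom_records(pdb_text)}
    result = {}
    for (chain, num, atom), o3 in coords.items():
        p = coords.get((chain, num + 1, "P"))
        if atom == "O3'" and p is not None:
            result[(chain, num)] = math.dist(o3, p)
    return result


def overlong_links(pdb_text, cutoff=OVERLONG):
    """Keep only the linkages longer than the cutoff."""
    return {k: d for k, d in o3_p_distances(pdb_text).items() if d > cutoff}
```

Calling overlong_links on the text of a rebuilt model and again on an optimized one gives a quick before-and-after check of the backbone connectivity.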

The problem was finally solved after I attended the 43rd Mid-Atlantic Macromolecular Crystallography Meeting held at Duke University a few months ago. At the meeting, I had the opportunity to talk to several members of the PHENIX team. In particular, Jeff Headd revised the geometry_minimization component of PHENIX to do the trick. Here is the email reply from Jeff, using a 3DNA-generated DNA duplex (355d-3dna.pdb) as an example (see full details below):

Here’s a first go at refining just the backbone atoms of you input DNA model. You’ll need the most recently nightly build of Phenix (dev-1395 would work) and then run:

phenix.geometry_minimization 355d-3dna.pdb min.params

using the attached min.params file.

What I specify in the params file is to only move the backbone atoms, which I’ve done with a selection. You can modify the atoms that are allowed to move to your liking.

The only other change was to allow longer distance linkages, as some of the backbone linkages start quite far apart.

The content of file min.params is:

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"

To make the story complete, given below is the step-by-step procedure, using 355d, a B-DNA dodecamer at 1.4 Å resolution as an example. The corresponding PDB file is named 355d.pdb.

find_pair 355d.pdb stdout | analyze stdin
x3dna_utils cp_std bdna
rebuild -atomic bp_step.par 355d-3dna.pdb
# the rebuilt structure is called '355d-3dna.pdb'

# with Phenix dev-1395 and above
phenix.geometry_minimization 355d-3dna.pdb min.params
# the optimized structure is called '355d-3dna_minimized.pdb'

# to verify:
find_pair 355d-3dna.pdb stdout | analyze stdin
find_pair 355d-3dna_minimized.pdb stdout | analyze stdin
# check files '355d-3dna.out' and '355d-3dna_minimized.out'

The three key files mentioned above are provided here for your verification:

Finally, the following figure illustrates the B-DNA dodecamer duplex in experimental (left), 3DNA-generated (middle) and PHENIX-optimized (right) coordinates. Note that disconnected O3′—P linkages (marked by red dots for two cases, see bottom of the middle image) due to overlong distances in 3DNA-rebuilt structure are fixed following the restraint PHENIX optimization.

355d-experimental	3DNA-rebuilt	PHENIX-optimized

Note added on 2016-11-11: In the min.params file, the selection is in one long line. For illustration purposes, the selection section (see below) is split into several short lines in the blog post. However, PHENIX requires trailing backslashes (\) to combine the split lines into a single grammatical unit. I was not aware of this strict rule, and failed to add the trailing backslashes in the original post. Thanks to Oleg Sobolev from the PHENIX team for bringing this omission to my attention. Note that the min.params file itself did not have this problem, and thus no change has been made to it.

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"
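One way to sidestep the line-continuation issue altogether is to generate the file with the selection on a single line. The Python sketch below does that from the atom names listed above; the function name is hypothetical, and the output is meant to match the content shown in this post rather than to document PHENIX syntax in general.

```python
BACKBONE_ATOMS = ["P", "OP1", "OP2", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'", "C2'"]


def min_params(atoms=BACKBONE_ATOMS, cutoff=7.0) -> str:
    """Build the min.params text with the whole selection on one line."""
    # Atom names are padded to the four-character PDB field, e.g. " P  ".
    selection = " or ".join(f'name " {atom:<3}"' for atom in atoms)
    return (
        "pdb_interpretation {\n"
        f"  link_distance_cutoff = {cutoff}\n"
        "}\n"
        f"selection = {selection}\n"
    )


print(min_params())
```

Editing the list of atom names is then enough to change which atoms are allowed to move.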


Detection of helical junctions in nucleic acid structures

One of DSSR’s noteworthy features is the auto-detection of helical junctions in nucleic acid structures, be it RNA, DNA, or chimeric DNA/RNA, consisting of one or multiple chains. Helical junctions are created at the interface of three or more stems composed of canonical pairs (Watson-Crick A–T/U and G–C, or wobble G–U). A three-way junction model is illustrated below (copied from Figure 1 of the Bindewald et al. RNAJunction paper). Note that the three chains are each continuous (i.e., consecutive nts are covalently connected) and, together with the three inner bps, form a loop in the middle. Here, the three-way junction is of type [3×2×3], and the loop is composed of a total of 3×2+3+2+3 = 14 nts.
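The nucleotide count generalizes to any junction type: each of the n closing base pairs contributes two nts, and the connecting segments contribute the numbers in the bracket. A small Python sketch of this arithmetic (the function name is hypothetical):

```python
def junction_size(connector_lengths):
    """Total nts in a junction loop: two per closing bp plus the connecting nts."""
    n_way = len(connector_lengths)
    return 2 * n_way + sum(connector_lengths)


print(junction_size([3, 2, 3]))  # the [3x2x3] three-way junction: 3*2 + 3 + 2 + 3
```

The same formula agrees with the nts counts DSSR reports for the four-way junctions in the examples that follow: 8 for [0x0x0x0], 10 for [0x0x1x1], and 16 for [2x1x5x0].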

DSSR automatically detects all existing helical junctions in a nucleic acid structure, as illustrated by the following examples.

1l6b [all DNA Holliday junction structure of d(CCGGTACm5CGG)]

This is a simple four-way junction of type [0×0×0×0], where all bases are paired, leaving no connecting nts. The related portion of DSSR output is:

List of 1 junction(s)
   1 4-way junctions: 8 nts; [0x0x0x0]; linked by [#1, #2, #4, #3]
       1:A.DA6+1:A.DC7+2:B.DG14+2:B.DT15+2:A.DA6+2:A.DC7+1:B.DG14+1:B.DT15 [ACGTACGT]
       0 nts junction ; 1:A.DA6-->1:A.DC7 [AC]
       0 nts junction ; 2:B.DG14-->2:B.DT15 [GT]
       0 nts junction ; 2:A.DA6-->2:A.DC7 [AC]
       0 nts junction ; 1:B.DG14-->1:B.DT15 [GT]

Technically, note the following points:

The four-way junction is derived from biological assembly 1 (PDB file 1l6b.pdb1), which contains two copies of the asymmetric unit, delineated by MODEL/ENDMDL. By default, DSSR/3DNA works on one structure at a time, corresponding to the first structure/model in a given PDB or mmCIF file. To take the biological assembly as a whole, and to avoid confusion with MODEL/ENDMDL-delineated NMR entries, the ENDMDL record of the first model is commented out in the file (1l6b.pdb1), as below:

#ENDMDL                                                                          
MODEL        2

With the modified PDB file 1l6b.pdb1, the DSSR command can be run as x3dna-dssr -i=1l6b.pdb1, with the output going to stdout.

The simplified schematic block png image was generated with the command below to create the Raster3D .r3d file (1l6b.r3d), which was then ray-traced using PyMOL.

blocview -r 1l6b.r3d 1l6b.pdb1
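The manual edit of the biological-assembly file described in the first point can be scripted. Here is a minimal Python sketch; the function name is hypothetical, and it simply prefixes the first ENDMDL record with '#' without attempting to preserve the 80-column record width.

```python
def comment_out_first_endmdl(pdb_text: str) -> str:
    """Prefix the first ENDMDL record with '#' so two models read as one structure."""
    lines = pdb_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("ENDMDL"):
            lines[i] = "#" + line
            break  # only the first model's ENDMDL is touched
    return "".join(lines)


# A skeletal two-model file, for illustration only.
sample = "MODEL        1\nTER\nENDMDL\nMODEL        2\nTER\nENDMDL\n"
print(comment_out_first_endmdl(sample))
```

This should be applied only to biological-assembly files; for NMR ensembles the models are alternative structures and must stay separate.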

1egk [a four-way DNA/RNA junction]

This four-way junction consists of both DNA and RNA chains. Here the helical junction may not be that obvious by directly looking at the 3D image.

List of 1 junction(s)
   1 4-way junctions: 10 nts; [0x0x1x1]; linked by [#3, #-1, #4, #5]
       B.DC37+B.DT38+B.DA45+B.DC46+C.G109+C.A110+C.U111+D.DA130+D.DG131+D.DG132 [CTACGAUAGG]
       0 nts junction ; B.DC37-->B.DT38 [CT]
       0 nts junction ; B.DA45-->B.DC46 [AC]
       1 nts junction C.A110 [A]; C.G109-->C.U111 [GAU]
       1 nts junction D.DG131 [G]; D.DA130-->D.DG132 [AGG]

1ehz [yeast phenylalanine tRNA]

As shown below, DSSR correctly detects the classic L-shaped 3D structure and the cloverleaf 2D structure of a tRNA.

List of 1 junction(s)
   1 4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]
       A.U7+A.U8+A.A9+A.2MG10+A.C25+A.M2G26+A.C27+A.G43+A.A44+A.G45+A.7MG46+A.U47+A.C48+A.5MC49+A.G65+A.A66 [UUAgCgCGAGgUCcGA]
       2 nts junction A.U8+A.A9 [UA]; A.U7-->A.2MG10 [UUAg]
       1 nts junction A.M2G26 [g]; A.C25-->A.C27 [CgC]
       5 nts junction A.A44+A.G45+A.7MG46+A.U47+A.C48 [AGgUC]; A.G43-->A.5MC49 [GAGgUCc]
       0 nts junction ; A.G65-->A.A66 [GA]

2fk6 [RNAse Z/tRNA(Thr) complex]

In the recent paper Predicting Helical Topologies in RNA Junctions as Tree Graphs by Laing et al., this PDB entry is listed in Table 1 as containing a three-way junction. However, DSSR does not detect any junction in this structure, even though the program does find coaxial stacks. It turns out that the PDB entry 2fk6 lacks the anticodon stem/loop, so nts C25 and G46 are not covalently connected. While three-way junctions may be defined differently, the DSSR result follows the above-mentioned chain-continuity requirement.

Overall, DSSR can consistently find all helical junctions in a given nucleic acid structure. Try DSSR on a ribosomal structure, and you may well appreciate what it reveals. Moreover, it is straightforward to apply the program to all RNA/DNA-containing entries in the PDB via a script.
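As a rough idea of what such a script could look like, here is a Python sketch that runs DSSR over every .pdb file in a folder. It uses only the -i= and -o= options that appear in the DSSR commands on this site; the folder layout and function names are hypothetical, and x3dna-dssr is assumed to be on the command PATH.

```python
import subprocess
from pathlib import Path


def dssr_command(structure: Path, out_dir: Path) -> list[str]:
    """Build the DSSR command line for one structure file."""
    output = out_dir / (structure.stem + ".out")
    return ["x3dna-dssr", f"-i={structure}", f"-o={output}"]


def run_batch(in_dir: Path, out_dir: Path) -> None:
    """Run DSSR on each .pdb file in in_dir, writing one .out file per entry."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for structure in sorted(in_dir.glob("*.pdb")):
        # check=False: keep going even if one entry fails.
        subprocess.run(dssr_command(structure, out_dir), check=False)
```

The junction section of each output file can then be located by searching for the "List of ... junction(s)" header shown in the examples above.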


Drawing an RNA secondary structure from its 3D coordinates

Given the primary sequence of an RNA molecule, there are numerous methods for predicting its secondary (2D) structures. To judge their accuracy, three-dimensional (3D) RNA structures solved experimentally by X-ray or NMR as deposited in the PDB are often used as benchmarks. DSSR is a handy tool to derive an RNA 2D structure from its 3D coordinates in PDB or mmCIF format. The 2D structure is specified in the dot-bracket notation (dbn), which can be fed directly into drawing programs such as VARNA for interactive display and easy generation of publication quality 2D diagrams.

Over the past few months, I’ve been asked a few times about how the diagrams in the DSSR post were created. The answer is really simple, and has already been mentioned above and in that post. Here are two concrete examples to show how the process works.

1zc5 (structure of the RNA signal essential for translational frame shifting in HIV-1)

This is the structure used in the VARNA paper. With the PDB file named 1zc5.pdb, the DSSR program can be run like this:

x3dna-dssr -i=1zc5.pdb

The output is sent to stdout by default, with the following three lines towards the end:

>1zc5-A #1 RNA with 41 nts
GGCGAUCUGGCCUUCCUACAAGGGAAGGCCAGGGAAUUGCC
(((((((((((((((((....)))))))))))...))))))

Simply copy and paste the last two lines (the sequence and the 2D structure in dbn) into the Seq: and Str: fields of the VARNA demo page, and the diagram will be updated automatically, as shown in the screenshot:

1ehz (crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution)

This example (1ehz.pdb) is used to illustrate tRNA’s classic cloverleaf 2D structure. The related command and result are:

x3dna-dssr -i=1ehz.pdb -o=1ehz.out

# the output is sent to file '1ehz.out'
# towards its end are the following 3 lines

>1ehz-A #1 RNA with 76 nts
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGuPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....

I’ve used a local copy of the Java Web Start version of VARNA (VARNA-WebStart.jnlp) to generate the following 2D diagram. Here, in addition to customizing the title, I have set the numbering period to 5 nts, adopted the simple base-pair style, and manually adjusted the T arm (upper right corner) to make the long line connecting G19 and C56 less obtrusive. In VARNA, right-click to see the context menu.

Note that the G19—C56 pair creates a pseudo-knot (specified by the matching [] pair in the dbn notation above) in tRNA. I was not aware of this salient feature from previous knowledge of relevant literature. It was indeed a surprise when I first saw it in the 2D diagram.
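The pairing information in a dbn string, including the pseudoknot brackets, can be recovered with one stack per bracket type. The Python sketch below handles only the () and [] symbols that appear in the output above; the function name is hypothetical.

```python
def dbn_pairs(dbn: str) -> list[tuple[int, int]]:
    """Return 1-based (i, j) pairs from a dot-bracket string using () and []."""
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, symbol in enumerate(dbn, start=1):
        if symbol in stacks:
            stacks[symbol].append(pos)  # remember the opening position
        elif symbol in closers:
            pairs.append((stacks[closers[symbol]].pop(), pos))
    return sorted(pairs)


# The dbn line from the 1ehz output above.
trna = "(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))...."
pairs = dbn_pairs(trna)
print(len(pairs), (19, 56) in pairs)
```

Keeping a separate stack for each bracket type is what lets the [ at position 19 pair with the ] at position 56 even though regular stems open and close in between.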

As illustrated above, DSSR serves well as a bridge from RNA 3D to 2D structures. Give DSSR a try, and you will find the program actually has much more to offer!


DSSR identifies kink-turns!

As of the beta-r14-on-20130626 release, DSSR has the functionality to identify kink-turns and reverse k-turns given an RNA structure in PDB format.

The k-turn motif was first described by Klein et al. (2001) in the paper The kink-turn: a new RNA secondary structure motif, based on analyses of the H. marismortui large ribosomal subunit. It turns out to be a widespread structural motif, now with a dedicated k-turn database hosted by the Lilley laboratory.

Geometrically, a k-turn is composed of an asymmetric internal loop, with a sharp kink between the two framing helices and characteristic loop features (including at least one sheared G-A pair and A-minor interactions). Overall, the k-turn is a complicated motif, and I am not aware of any published method or available software for its auto-detection.

Previous releases of DSSR have built up all the necessary components to detect key features of a k-turn. Over the past few weeks, I have been focusing on connecting the dots to implement an algorithm for its auto-identification. As of beta-r14-on-20130626, DSSR can locate ‘simple’ k-turns or reverse k-turns from an RNA structure in PDB format. I understand the subtleties and variations of k-turns, and will refine the algorithm in future releases of DSSR.

Without putting k-turns under its umbrella, DSSR would be incomplete in its functionality. Hopefully, detection of k-turns will help DSSR gain more attention from the RNA structure community.


DSSR, what is it and why bother?

Over the past six months or so¹, I’ve been focusing mostly on developing DSSR, a new addition to the 3DNA suite of programs. So what is DSSR, specifically? Why did I bother to create it? How would it be relevant to the nucleic acid structure community?

Literally, DSSR stands for Defining the (Secondary) Structures of RNA². Starting from an RNA structure in PDB format, DSSR employs a set of simple criteria to identify all existent base pairs (bp): both canonical Watson–Crick (WC) pairs and non-canonical pairs with at least one H-bond, made up of normal or modified bases, regardless of tautomeric or protonation state. The classification is based on the six standard rigid-body bp parameters (shear, stretch, stagger, propeller, buckle, and opening), which together rigorously quantify the spatial disposition of any two interacting bases. Moreover, the program characterizes each bp by commonly used names (WC, reverse WC, Hoogsteen, reverse Hoogsteen, wobble, sheared, imino, Calcutta, and dinucleotide platform), the Saenger classification scheme of 28 types, and the Leontis-Westhof nomenclature of 12 basic geometric classes. DSSR also checks for non-pairing interactions (H-bonds or base stacking).

DSSR detects triplets and even higher-order base associations by searching horizontally in the plane of the associated bp for further H-bonding interactions. The program determines helical regions by exploring each bp’s neighborhood vertically for base-stacking interactions, regardless of backbone connection (e.g., coaxial stacking of helices or pseudo helices). Moreover, each helix/stem is characterized by a least-squares fitted helical axis to allow for easy quantification of relative helical geometry. DSSR calculates commonly used backbone (including the virtual η/θ) torsion angles, classifies the main chain backbone into BI/BII conformation and the sugar into C2′/C3′-endo like pucker, identifies A-minor interactions (types I and II), ribose zippers, G quartets, hairpin loops, kissing loops, bulges, internal loops and multi-branch loops (junctions). It also detects the existence of pseudo-knots, and outputs RNA secondary structure in the dot-bracket notation.

Experienced 3DNA users may notice that some of the above outlined functionality (e.g., calculation of torsion angles, identification of all pairs, higher order base associations, and helices) has existed for over a decade. Over the years, I have written several posts (see What can 3DNA do for RNA structures?, and links therein) to advocate 3DNA’s applications in RNA structural analysis. Nevertheless, 3DNA has never been widely used in the RNA structure community, for various possible reasons: (1) the misconception that 3DNA is only for DNA (but not RNA); (2) the basic functionality is split into two programs (find_pair and analyze), which need to be run several times with different options (default find_pair, and with -s, or -p). Thus even though 3DNA is applicable to RNA structures, it is unnecessarily complicated and confusing (especially to new 3DNA users); (3) 3DNA is command-line driven, consisting of many C programs and scripts, with different styles in specifying options. It has the ‘reputation’ of being powerful, but cryptic and hard to use.

I’ve created DSSR from scratch with these factors in mind, drawing on my extensive experience in supporting 3DNA, an increased knowledge of RNA structures, and refined C programming skills. Implemented in ANSI C as a stand-alone command-line program, DSSR is self-contained. Its executables (on Mac OS X, Linux and Windows) have zero runtime dependencies. No setup is necessary; simply put the program into a folder of your choice (preferably one on your command PATH), and it should work. DSSR has sensible default settings and an intuitive output, making it directly accessible to a much broader audience than 3DNA per se. Since its initial release on March 3, 2013, I’ve yet to hear of any installation or usage problem. So far, all reported bugs have been verified and fixed promptly. The latest beta release has been checked against all nucleic-acid-containing entries in the PDB, without any known issues.

Overall, DSSR consolidates, refines, and significantly extends 3DNA’s functionality for RNA structural analysis. There is more to DSSR than its simple interface suggests. Piecewise, DSSR may appear to offer nothing new, yet taken as a whole, it has unique features not available anywhere else. Its value will be gradually appreciated as DSSR becomes more widely used by the community. Want to know if your structure contains any Hoogsteen pair, sheared G•A pair, or a dinucleotide platform? DSSR can check it for you, easily.

DSSR-beta already possesses all the basic functionality and has been well tested to serve as a handy tool for RNA structural analysis. I stand firmly behind DSSR, and strive to continuously improve the program. Give it a try, and report any issues you have on the 3DNA Forum. As always, I respond quickly and concretely to all questions posted there. I hope you enjoy using DSSR as much as I enjoy creating and supporting it!

¹ This post was published on March 29, 2013, shortly after the beta releases of DSSR [note added on March 15, 2014].

² DSSR also works for DNA, or DNA-protein complexes, as far as the basic functionality is concerned. Moreover, the acronym could have two other possible interpretations, as will become obvious when the program gains wider recognition.


Named base pairs

In the field of nucleic acid structures, especially in the ‘RNA world’, we often hear about named base pairs (bp). Among those, the Watson-Crick (WC) A–U and G–C bps (see figure below) are by far the most common.

Watson-Crick base pairs

Reversed WC (rWC) base pairs

Closely related to the WC bps are the so-called reversed WC (rWC) bps, where the relative orientations of the glycosidic bonds are reversed: instead of being on the same side of the bases as in the WC bps shown above, they are on opposite sides in the rWC bps shown below. According to the Leontis-Westhof (LW) bp classification scheme, the rWC bps belong to trans WC/WC. Following Saenger’s numbering, the rWC A+U bp corresponds to XXI, and the rWC G+C bp to XXII.

In the figures below, the name of each type of bp and its LW & Saenger designations (separated by ‘;’) are noted under the corresponding image. All images are generated with 3DNA; for easy comparison, each bp is oriented in the reference frame of the leading base.


Reversed WC A+U pair	Reversed WC G+C pair
trans WC/WC; XXI	trans WC/WC; XXII

Hoogsteen and reversed Hoogsteen base pairs

The next most famous one is the Hoogsteen A+U bp, which also has a reverse variant, i.e., the rHoogsteen A–U bp (see figure below). Now the major groove edge of A, termed the Hoogsteen edge by LW, is used for pairing with U.


Hoogsteen A+U pair	Reversed Hoogsteen A–U pair
cis Hoogsteen/WC; XXIII	trans Hoogsteen/WC; XXIV

The G–U Wobble base pair

First proposed by Crick in 1966 to account for the degeneracy in codon–anticodon pairing, the Wobble bp is an essential component (in addition to the WC bps) in forming double helical RNA secondary structures.

Wobble G–U pair

cis WC/WC; XXVIII

The sheared G–A base pair

Sheared G–A is a commonly found non-WC bp in both DNA and RNA structures. Notably, tandem sheared G–A bps introduce a distinct stacking geometry. Here G uses its minor groove edge, termed the sugar edge by LW, to pair with the Hoogsteen edge of A.

Sheared G–A pair

trans Sugar/Hoogsteen; XI

Dinucleotide platforms

Dinucleotide platforms are formed via side-by-side pairing of adjacent bases; the most common are GpU and ApA. Here the sugar (minor-groove) edge of the 5′ base interacts with the Hoogsteen (major-groove) edge of the 3′ base. Since there is only one base-base H-bond in dinucleotide platforms, no Saenger classification is available. In 3DNA output, the GpU dinucleotide platform is designated as G+U, and ApA as A+A.


GpU dinucleotide platform	ApA dinucleotide platform
cis Sugar/Hoogsteen; n/a	cis Sugar/Hoogsteen; n/a
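The definition of a dinucleotide platform given above (adjacent bases on the same strand, with the sugar edge of the 5′ base facing the Hoogsteen edge of the 3′ base) translates into a simple test. The sketch below is only an illustration: the `Pair` record and its fields are hypothetical, the LW label is assumed to be spelled as in the captions above, and adjacency is approximated by consecutive residue numbers, which ignores insertion codes and numbering gaps.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Hypothetical pair record; the first base is taken as the 5' partner."""
    chain1: str
    resnum1: int
    chain2: str
    resnum2: int
    lw: str  # Leontis-Westhof label, e.g. "cis Sugar/Hoogsteen"


def is_dinucleotide_platform(pair):
    """True if the two bases are sequence neighbors paired cis Sugar/Hoogsteen.

    Assumption: consecutive residue numbers on one chain mean the two
    nucleotides are covalently adjacent (5' base first, 3' base second).
    """
    same_strand = pair.chain1 == pair.chain2
    adjacent = pair.resnum2 - pair.resnum1 == 1
    return same_strand and adjacent and pair.lw == "cis Sugar/Hoogsteen"


if __name__ == "__main__":
    # Made-up records for illustration only.
    examples = [
        Pair("A", 9, "A", 10, "cis Sugar/Hoogsteen"),     # neighbors, platform-like
        Pair("A", 7, "A", 15, "trans Sugar/Hoogsteen"),   # sheared-like, not adjacent
    ]
    for p in examples:
        print(p, is_dinucleotide_platform(p))
```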

Other named base pairs

There exist other named bps in RNA literature, e.g., G⋅A imino, A⋅C reverse Hoogsteen, G⋅U reverse Wobble, etc. In my experience, they are (much) less commonly used than the ones illustrated above.
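For quick reference, the LW and Saenger designations quoted in the figure captions of this post can be collected into a small lookup table. The sketch below does that in Python; the dictionary keys and the `describe` helper are made up for this example, and only the designations stated in the captions above are included (ASCII hyphens stand in for the en dashes).

```python
# (LW class, Saenger number or None) for each named pair illustrated above.
NAMED_PAIRS = {
    "Reversed WC A+U": ("trans WC/WC", "XXI"),
    "Reversed WC G+C": ("trans WC/WC", "XXII"),
    "Hoogsteen A+U": ("cis Hoogsteen/WC", "XXIII"),
    "Reversed Hoogsteen A-U": ("trans Hoogsteen/WC", "XXIV"),
    "Wobble G-U": ("cis WC/WC", "XXVIII"),
    "Sheared G-A": ("trans Sugar/Hoogsteen", "XI"),
    "GpU dinucleotide platform": ("cis Sugar/Hoogsteen", None),
    "ApA dinucleotide platform": ("cis Sugar/Hoogsteen", None),
}


def describe(name):
    """Format one entry in the 'LW; Saenger' style used under the figures."""
    lw, saenger = NAMED_PAIRS[name]
    return f"{name}: {lw}; {saenger if saenger is not None else 'n/a'}"


if __name__ == "__main__":
    for pair_name in NAMED_PAIRS:
        print(describe(pair_name))
```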


Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu

X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids(An NIGMS National Resource supported by NIH grant R24GM153869)



