Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. [Nucleic Acids Res 48: e74](https://doi.org/10.1093/nar/gkaa426)).
See the 2020 paper titled "DSSR-enabled innovative schematics of 3D nucleic acid structures with PyMOL" in Nucleic Acids Research and the corresponding Supplemental PDF for details. Many thanks to Drs. Wilma Olson and Cathy Lawson for their help in the preparation of the illustrations.
Details on how to reproduce the cover images are available on the 3DNA Forum.

Structure of a group II intron ribonucleoprotein in the pre-ligation state (PDB id: 8T2R; Xu L, Liu T, Chung K, Pyle AM. 2023. Structural insights into intron catalysis and dynamics during splicing. Nature 624: 682–688). The pre-ligation complex of the Agathobacter rectalis group II intron reverse transcriptase/maturase with intron and 5′-exon RNAs makes it possible to construct a picture of the splicing active site. The intron is depicted by a green ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the 5′-exon is shown by white spheres and the protein by a gold ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

Complex of terminal uridylyltransferase 7 (TUT7) with pre-miRNA and Lin28A (PDB id: 8OPT; Yi G, Ye M, Carrique L, El-Sagheer A, Brown T, Norbury CJ, Zhang P, Gilbert RJ. 2024. Structural basis for activity switching in polymerases determining the fate of let-7 pre-miRNAs. Nat Struct Mol Biol 31: 1426–1438). The RNA-binding pluripotency factor LIN28A invades and melts the RNA and affects the mechanism of action of the TUT7 enzyme. The RNA backbone is depicted by a red ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; TUT7 is represented by a gold ribbon and LIN28A by a white ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

Cryo-EM structure of the pre-B complex (PDB id: 8QP8; Zhang Z, Kumar V, Dybkov O, Will CL, Zhong J, Ludwig SE, Urlaub H, Kastner B, Stark H, Lührmann R. 2024. Structural insights into the cross-exon to cross-intron spliceosome switch. Nature 630: 1012–1019). The pre-B complex is thought to be critical in the regulation of splicing reactions. Its structure suggests how the cross-exon and cross-intron spliceosome assembly pathways converge. The U4, U5, and U6 snRNA backbones are depicted respectively by blue, green, and red ribbons, with bases and Watson-Crick base pairs shown as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the proteins are represented by gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

Structure of the Hendra henipavirus (HeV) nucleoprotein (N) protein-RNA double-ring assembly (PDB id: 8C4H; Passchier TC, White JB, Maskell DP, Byrne MJ, Ranson NA, Edwards TA, Barr JN. 2024. The cryoEM structure of the Hendra henipavirus nucleoprotein reveals insights into paramyxoviral nucleocapsid architectures. Sci Rep 14: 14099). The HeV N protein adopts a bi-lobed fold, where the N- and C-terminal globular domains are bisected by an RNA binding cleft. Neighboring N proteins assemble laterally and completely encapsidate the viral genomic and antigenomic RNAs. The two RNAs are depicted by green and red ribbons. The U bases of the poly(U) model are shown as cyan blocks. Proteins are represented as semitransparent gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

Structure of the helicase and C-terminal domains of Dicer-related helicase-1 (DRH-1) bound to dsRNA (PDB id: 8T5S; Consalvo CD, Aderounmu AM, Donelick HM, Aruscavage PJ, Eckert DM, Shen PS, Bass BL. 2024. Caenorhabditis elegans Dicer acts with the RIG-I-like helicase DRH-1 and RDE-4 to cleave dsRNA. eLife 13: RP93979). Cryo-EM structures of Dicer-1 in complex with DRH-1, RNAi deficient-4 (RDE-4), and dsRNA provide mechanistic insights into how these three proteins cooperate in antiviral defense. The dsRNA backbone is depicted by green and red ribbons. The U-A pairs of the poly(A)·poly(U) model are shown as long rectangular cyan blocks, with minor-groove edges colored white. The ADP ligand is represented by a red block and the protein by a gold ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).
Moreover, the following 30 [12(2021) + 12(2022) + 6(2023)] cover images of the RNA Journal were generated by the NAKB (nakb.org).
Cover image provided by the Nucleic Acid Database (NDB)/Nucleic Acid Knowledgebase (NAKB; nakb.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

H-bonding interactions are crucial for defining RNA secondary and tertiary structures. DSSR/3DNA contains a geometrically based algorithm for identifying H-bonds in nucleic-acid or protein structures given in .pdb or .cif format. Over the years, the method has been continuously refined, and it has served its purpose quite well. As of v1.1.1-2014apr11, this functionality is directly available from DSSR through the --get-hbonds option.
The output for 1msy, which contains a GUAA tetraloop mutant of the Sarcin/Ricin domain from E. coli 23S rRNA, is listed below. The first line gives the header (# H-bonds in '1msy.pdb' identified by DSSR ...). The second line provides the total number of H-bonds (40) identified in the structure. Afterwards, each line consists of 8 space-delimited columns used to characterize a specific H-bond. Using the first one (#1) as an example, the meaning of each of the 8 columns is:
- The serial number (15), as denoted in the .pdb or .cif file, of the first atom of the H-bond.
- The serial number (578) of the second H-bond atom.
- The H-bond index (#1), from 1 to the total number of H-bonds.
- A one-letter symbol showing the atom-pair type (p) of the H-bond. It is ‘p’ for a donor-acceptor atom pair; ‘o’ for a donor/acceptor (such as the 2′-hydroxyl oxygen) with any other atom; ‘x’ for a donor-donor or acceptor-acceptor pair (as in #17); ‘?’ if the donor/acceptor status is unknown for any H-bond atom.
- Distance in Å between donor/acceptor atoms (2.768).
- Elemental symbols of the two atoms involved in the H-bond (O:N).
- Identifier of the first H-bond atom (O4@A.U2647).
- Identifier of the second H-bond atom (N1@A.G2673).
Command: x3dna-dssr -i=1msy.pdb --get-hbonds -o=1msy-hbonds.txt
# H-bonds in '1msy.pdb' identified by 3DNA version 3 (xiangjun@x3dna.org)
40
15 578 #1 p 2.768 O:N O4@A.U2647 N1@A.G2673
35 555 #2 p 2.776 O:N O6@A.G2648 N3@A.U2672
36 554 #3 p 2.826 N:O N1@A.G2648 O2@A.U2672
55 537 #4 p 2.965 O:N O2@A.C2649 N2@A.G2671
56 535 #5 p 2.836 N:N N3@A.C2649 N1@A.G2671
58 534 #6 p 2.769 N:O N4@A.C2649 O6@A.G2671
76 513 #7 p 2.806 N:N N3@A.U2650 N1@A.A2670
78 512 #8 p 3.129 O:N O4@A.U2650 N6@A.A2670
95 492 #9 p 2.703 O:N O2@A.C2651 N2@A.G2669
96 490 #10 p 2.853 N:N N3@A.C2651 N1@A.G2669
98 489 #11 p 2.987 N:O N4@A.C2651 O6@A.G2669
115 466 #12 p 2.817 O:N O2@A.C2652 N2@A.G2668
116 464 #13 p 2.907 N:N N3@A.C2652 N1@A.G2668
118 463 #14 p 2.897 N:O N4@A.C2652 O6@A.G2668
123 151 #15 o 2.622 O:O OP2@A.U2653 O2'@A.A2654
135 443 #16 p 2.898 O:N O2@A.U2653 N4@A.C2667
147 192 #17 x 3.054 O:O O4'@A.A2654 O4'@A.U2656
158 408 #18 p 2.960 N:O N6@A.A2654 OP2@A.C2666
173 188 #19 o 2.923 O:O O2'@A.G2655 OP2@A.U2656
173 378 #20 o 3.093 O:O O2'@A.G2655 O6@A.G2664
173 379 #21 o 3.343 O:N O2'@A.G2655 N1@A.G2664
181 386 #22 p 2.768 N:O N1@A.G2655 OP2@A.A2665
183 203 #23 p 2.754 N:O N2@A.G2655 O4@A.U2656
183 387 #24 p 2.887 N:O N2@A.G2655 O5'@A.A2665
188 379 #25 p 3.044 O:N OP2@A.U2656 N1@A.G2664
188 381 #26 p 2.944 O:N OP2@A.U2656 N2@A.G2664
200 401 #27 p 3.122 O:N O2@A.U2656 N6@A.A2665
201 398 #28 p 2.759 N:N N3@A.U2656 N7@A.A2665
220 381 #29 p 3.035 N:N N7@A.A2657 N2@A.G2664
223 371 #30 o 2.963 N:O N6@A.A2657 O2'@A.G2664
223 382 #31 p 3.039 N:N N6@A.A2657 N3@A.G2664
242 358 #32 p 2.821 O:N O2@A.C2658 N2@A.G2663
243 356 #33 p 2.890 N:N N3@A.C2658 N1@A.G2663
245 355 #34 p 2.887 N:O N4@A.C2658 O6@A.G2663
258 305 #35 o 2.604 O:N O2'@A.G2659 N7@A.A2661
258 308 #36 o 3.264 O:N O2'@A.G2659 N6@A.A2661
268 315 #37 p 2.973 N:O N2@A.G2659 OP2@A.A2662
268 327 #38 p 2.864 N:N N2@A.G2659 N7@A.A2662
371 390 #39 o 2.751 O:O O2'@A.G2664 O4'@A.A2665
550 566 #40 o 3.372 O:O O2'@A.U2672 O4'@A.G2673
In its default settings, DSSR detects 117 H-bonds for 1ehz (yeast phenylalanine tRNA), and 5,809 for 1jj2 (the H. marismortui large ribosomal subunit). Note that the program can identify H-bonds not only in RNA and DNA, but also in proteins and their complexes. By default, however, DSSR only reports H-bonds within nucleic acids. As shown above, it is trivial to run DSSR with the --get-hbonds option to get all H-bonds in a given structure, and the plain text output is straightforward to work with.
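The eight-column layout described above can be read with a few lines of Python. The following is only a minimal sketch: the `HBond` class and the `parse_hbonds` function are hypothetical names, not part of DSSR, and the sample text is an abridged copy of the 1msy listing with the count on the second line edited to match the three retained records.

```python
from dataclasses import dataclass


@dataclass
class HBond:
    serial1: int     # serial number of the first atom
    serial2: int     # serial number of the second atom
    index: int       # H-bond index, taken from the '#N' column
    pair_type: str   # 'p', 'o', 'x' or '?'
    distance: float  # distance in angstroms
    elements: str    # element symbols, e.g. 'O:N'
    atom1: str       # identifier of the first atom, e.g. 'O4@A.U2647'
    atom2: str       # identifier of the second atom


def parse_hbonds(text):
    """Parse --get-hbonds style output into a list of HBond records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    expected = int(lines[1])  # line 1 is the header, line 2 the count
    hbonds = []
    for ln in lines[2:]:
        s1, s2, idx, ptype, dist, elems, a1, a2 = ln.split()
        hbonds.append(HBond(int(s1), int(s2), int(idx.lstrip("#")),
                            ptype, float(dist), elems, a1, a2))
    if len(hbonds) != expected:
        raise ValueError(f"expected {expected} H-bonds, found {len(hbonds)}")
    return hbonds


# Abridged sample (assumption: count edited from 40 to 3 for illustration).
sample = """\
# H-bonds in '1msy.pdb' identified by 3DNA version 3 (xiangjun@x3dna.org)
3
15 578 #1 p 2.768 O:N O4@A.U2647 N1@A.G2673
123 151 #15 o 2.622 O:O OP2@A.U2653 O2'@A.A2654
147 192 #17 x 3.054 O:O O4'@A.A2654 O4'@A.U2656
"""

if __name__ == "__main__":
    for hb in parse_hbonds(sample):
        print(hb.index, hb.pair_type, hb.distance, hb.atom1, hb.atom2)
```

Splitting on whitespace keeps the sketch independent of how the columns are aligned in the actual file; filtering on `pair_type == "p"` would then select only the donor-acceptor pairs.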
While there exist dedicated tools for finding H-bonds, such as HBPLUS or HBexplore, DSSR may well be sufficient to fulfill most practical needs. If you notice any weird behaviors with this H-bond finding functionality, please let me know. I strive to address reported issues promptly, to the extent practical. At the very least, I should be able to explain why the program is working the way it does.

From the very first release up until recently, the DSSR distribution had included two executables for Windows: one version was compiled on MinGW/MSYS, and the other on Cygwin. The executables are supposed to be run under the corresponding shells of the two environments respectively.
Since DSSR is a simple self-contained command-line tool, the MinGW/MSYS version also works directly under the Command Prompt of native Windows. So Windows users had the following three options to use DSSR:
- Download the MinGW/MSYS version to run it under the Command Prompt of native Windows. No need to install MinGW/MSYS.
- Download the MinGW/MSYS version to run it under the MinGW/MSYS environment, which must be installed separately.
- Download the Cygwin version to run it under the Cygwin environment, which must be installed separately.
Over time, I have observed some confusion among DSSR users as to which of the two executables to use on Windows. Luckily, I recently noticed by chance that the DSSR executable compiled under MinGW/MSYS runs just fine in the Cygwin shell. So as of v1.1.0-2014apr09, the DSSR distribution contains only one executable for Windows: compiled under MinGW/MSYS on 32-bit Windows XP, the same DSSR executable runs under the Command Prompt of native Windows, MinGW/MSYS, and Cygwin, on either 32-bit or 64-bit Windows (XP, Vista, 7, or 8).
One size fits all: I no longer need to provide two compiled versions of DSSR for Windows, and users have just one executable to download (no more room for confusion).
Note added on 2024-11-25: DSSR is distributed by the CTV (Columbia Technology Ventures). See https://x3dna.org

In addition to VARNA, the draw program in the RNAstructure package from the Mathews Laboratory can also be used to depict DSSR-derived RNA secondary structures in connect table (.ct) format. The draw program produces images in PostScript (or svg) format, in different styles from those generated by VARNA. Given below are a couple of examples of how to connect DSSR with draw.
The secondary structure of the PDB entry 1msy in DSSR-derived .ct file is as below:
27 DSSR-derived secondary structure in '1msy'
1 U 0 2 0 2647
2 G 1 3 26 2648
3 C 2 4 25 2649
4 U 3 5 24 2650
5 C 4 6 23 2651
6 C 5 7 22 2652
7 U 6 8 0 2653
8 A 7 9 0 2654
9 G 8 10 0 2655
10 U 9 11 0 2656
11 A 10 12 0 2657
12 C 11 13 17 2658
13 G 12 14 0 2659
14 U 13 15 0 2660
15 A 14 16 0 2661
16 A 15 17 0 2662
17 G 16 18 12 2663
18 G 17 19 0 2664
19 A 18 20 0 2665
20 C 19 21 0 2666
21 C 20 22 0 2667
22 G 21 23 6 2668
23 G 22 24 5 2669
24 A 23 25 4 2670
25 G 24 26 3 2671
26 U 25 27 2 2672
27 G 26 0 0 2673
Let the DSSR-derived .ct file for 1msy be named 1msy.ct; the following two draw commands will then produce the secondary structure in PostScript (1msy.eps) and svg (1msy.svg) formats, respectively.
draw 1msy.ct 1msy.eps
draw 1msy.ct 1msy.svg --svg -n 1
![1msy [GUAA tetra loop] 2nd structure produced with the RNAstructure 'draw' program](http://forum.x3dna.org/images/1msy.svg)
The PDB entry 1ehz (yeast phenylalanine tRNA) has a pseudoknot, so the draw program will create a ‘circularized’ structure as shown below:
![1ehz [yeast phenylalanine tRNA] 2nd structure produced with the RNAstructure 'draw' program](http://forum.x3dna.org/images/1ehz.svg)
Note the following two caveats:

As of v1.0.3-2014mar09, DSSR has a decent user manual in PDF! Currently 45 pages long, the DSSR manual contains everything a typical user needs to know to get started using the program effectively. The contents of the manual are listed below.
Table of Contents
List of Figures
Introduction
Download and installation
Usages
Command-line help
Default run on PDB entry 1msy – detailed explanations
Summary section
List of base pairs
List of multiplets
List of helices
List of stems
List of lone canonical pairs
List of various loops
List of single-stranded fragments
Secondary structure in dot-bracket notation
List of backbone torsion angles and suite names
Default run on PDB entry 1ehz (tRNAPhe) – summary notes
Brief summary
Specific features
Default run on PDB entry 1jj2 – four auto-checked motifs
Kissing loops
A-minor (types I and II) motifs
Ribose zippers
Kink turns
The --more option
Extra parameters for base pairs
Extra parameters for helices/stems
The --non-pair option
The --u-turn option
The --po4 option
The --long-idstr option
Frequently asked questions
How to cite DSSR?
Does DSSR work for DNA?
Does DSSR detect RNA tertiary interactions?
Revision history
Acknowledgements
References
With the User Manual available, I feel confident to claim that DSSR is now mature, stable, ready for real world applications. While only time would tell, I have no doubt that DSSR will become an essential tool in RNA structural bioinformatics.

From early on, DSSR-derived nucleic acid secondary structures have been written in the compact dot-bracket notation (.dbn) with pseudoknot information. To better connect DSSR to the 2D world, I recently looked into the connect (.ct) format, which was first introduced by Zuker’s mfold program. Over time, the .ct format has become one of the most commonly used RNA secondary structure formats, and it is more expressive than the .dbn format (see below).
As of v1.0, for each analyzed structure, DSSR produces two secondary structure files with default names dssr-2ndstrs.dbn and dssr-2ndstrs.ct, in .dbn and .ct formats, respectively. Using the 27-nucleotide (nt) RNA fragment 1msy as an example, the DSSR-derived secondary structures in .dbn and .ct formats are shown below:
![1msy [GUAA tetra loop] in 3d and 2d representations](http://forum.x3dna.org/images/1msy-3d-2d.png)
In dot-bracket notation (.dbn) [dssr-2ndstrs.dbn]
------------------------------------------------------
>1msy nts=27 DSSR-derived secondary structure
UGCUCCUAGUACGUAAGGACCGGAGUG
.(((((.....(....)....))))).
------------------------------------------------------
In connect format (.ct) [dssr-2ndstrs.ct]
------------------------------------------------------
27 DSSR-derived secondary structure in '1msy'
1 U 0 2 0 2647 # name=A.U2647
2 G 1 3 26 2648 # name=A.G2648, pairedNt=A.U2672
3 C 2 4 25 2649 # name=A.C2649, pairedNt=A.G2671
4 U 3 5 24 2650 # name=A.U2650, pairedNt=A.A2670
5 C 4 6 23 2651 # name=A.C2651, pairedNt=A.G2669
6 C 5 7 22 2652 # name=A.C2652, pairedNt=A.G2668
7 U 6 8 0 2653 # name=A.U2653
8 A 7 9 0 2654 # name=A.A2654
9 G 8 10 0 2655 # name=A.G2655
10 U 9 11 0 2656 # name=A.U2656
11 A 10 12 0 2657 # name=A.A2657
12 C 11 13 17 2658 # name=A.C2658, pairedNt=A.G2663
13 G 12 14 0 2659 # name=A.G2659
14 U 13 15 0 2660 # name=A.U2660
15 A 14 16 0 2661 # name=A.A2661
16 A 15 17 0 2662 # name=A.A2662
17 G 16 18 12 2663 # name=A.G2663, pairedNt=A.C2658
18 G 17 19 0 2664 # name=A.G2664
19 A 18 20 0 2665 # name=A.A2665
20 C 19 21 0 2666 # name=A.C2666
21 C 20 22 0 2667 # name=A.C2667
22 G 21 23 6 2668 # name=A.G2668, pairedNt=A.C2652
23 G 22 24 5 2669 # name=A.G2669, pairedNt=A.C2651
24 A 23 25 4 2670 # name=A.A2670, pairedNt=A.U2650
25 G 24 26 3 2671 # name=A.G2671, pairedNt=A.C2649
26 U 25 27 2 2672 # name=A.U2672, pairedNt=A.G2648
27 G 26 0 0 2673 # name=A.G2673
------------------------------------------------------
Presumably, the .ct format is very simple, and examining a sample file as shown above would give one a pretty good sense of what each column is about. While there exist many oversimplified descriptions of the .ct format on the web, the most detailed and accurate explanation is from the mfold manual:
The “ct” file (connect table) contains the sequence and base pair information, and is meant to be an input file for a structure drawing program. In addition to containing base pair information, it also lists the 5′ and 3′ neighbor of each base, allowing for the representation of circular RNA or multiple molecules. The ct file also lists the historical base numbering in the original sequence, as bases and base pairs are numbered according from 1 to the size of the folded segment. A portion of a ct file is displayed in Figure 12.
Figure 12: The ct file for the second and final folding of S. cerevisiae Phe-tRNA at 37°, with default parameters. The first record displays the fragment size (76), ΔG and sequence name. The i-th subsequent record contains, in order, i, r_i, the index of the 5′-connecting base, the index of the 3′-connecting base, the index of the paired base and the historical numbering of the i-th base in the original sequence. The 5′, 3′ and base pair indices are 0 when there is no connection or base pair.
Specifically, the 3rd, 4th, and 6th columns in the .ct format convey specific information; by design, they are not redundant with the information contained in the 1st column. Note that in the above ‘1msy’ example, the 6th column gives the nt sequence numbers (as in the PDB data file) instead of the serial numbers (as in the 1st column). The DSSR-produced .ct files also contain extra information after ‘#’, in comma-separated key=value format.
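To make the column layout concrete, here is a minimal Python sketch that reads the 1msy .ct records listed above, including the key=value comments after ‘#’, and rebuilds the dot-bracket string from the 5th (paired-base) column. The function names `parse_ct` and `to_dot_bracket` are hypothetical, and the conversion assumes a structure without pseudoknots, since it only emits round brackets.

```python
CT_1MSY = """\
27 DSSR-derived secondary structure in '1msy'
1 U 0 2 0 2647 # name=A.U2647
2 G 1 3 26 2648 # name=A.G2648, pairedNt=A.U2672
3 C 2 4 25 2649 # name=A.C2649, pairedNt=A.G2671
4 U 3 5 24 2650 # name=A.U2650, pairedNt=A.A2670
5 C 4 6 23 2651 # name=A.C2651, pairedNt=A.G2669
6 C 5 7 22 2652 # name=A.C2652, pairedNt=A.G2668
7 U 6 8 0 2653 # name=A.U2653
8 A 7 9 0 2654 # name=A.A2654
9 G 8 10 0 2655 # name=A.G2655
10 U 9 11 0 2656 # name=A.U2656
11 A 10 12 0 2657 # name=A.A2657
12 C 11 13 17 2658 # name=A.C2658, pairedNt=A.G2663
13 G 12 14 0 2659 # name=A.G2659
14 U 13 15 0 2660 # name=A.U2660
15 A 14 16 0 2661 # name=A.A2661
16 A 15 17 0 2662 # name=A.A2662
17 G 16 18 12 2663 # name=A.G2663, pairedNt=A.C2658
18 G 17 19 0 2664 # name=A.G2664
19 A 18 20 0 2665 # name=A.A2665
20 C 19 21 0 2666 # name=A.C2666
21 C 20 22 0 2667 # name=A.C2667
22 G 21 23 6 2668 # name=A.G2668, pairedNt=A.C2652
23 G 22 24 5 2669 # name=A.G2669, pairedNt=A.C2651
24 A 23 25 4 2670 # name=A.A2670, pairedNt=A.U2650
25 G 24 26 3 2671 # name=A.G2671, pairedNt=A.C2649
26 U 25 27 2 2672 # name=A.U2672, pairedNt=A.G2648
27 G 26 0 0 2673 # name=A.G2673
"""


def parse_ct(text):
    """Return one dict per nucleotide record of a .ct file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    count = int(lines[0].split()[0])  # first field of the header line
    records = []
    for ln in lines[1:count + 1]:
        body, _, comment = ln.partition("#")
        idx, base, prev_i, next_i, pair, number = body.split()
        extra = {}
        for item in comment.split(","):
            if "=" in item:
                key, value = item.split("=", 1)
                extra[key.strip()] = value.strip()
        records.append({"index": int(idx), "base": base,
                        "prev": int(prev_i), "next": int(next_i),
                        "pair": int(pair), "number": number,
                        "extra": extra})
    return records


def to_dot_bracket(records):
    """Build a dot-bracket string (no pseudoknot handling)."""
    symbols = []
    for rec in records:
        if rec["pair"] == 0:
            symbols.append(".")
        else:
            symbols.append("(" if rec["pair"] > rec["index"] else ")")
    return "".join(symbols)


if __name__ == "__main__":
    recs = parse_ct(CT_1MSY)
    print("".join(r["base"] for r in recs))
    print(to_dot_bracket(recs))  # .(((((.....(....)....))))).
```

The 6th column is kept as a string on purpose, because the original numbering in a PDB file is not guaranteed to be a plain integer.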
As an example of the usefulness of the 3rd and 4th columns, have a look at the DSSR-derived .ct file for the Dickerson DNA dodecamer duplex with sequence CGCGAATTCGCG:
24 DSSR-derived secondary structure in '355d'
1 C 0 2 24 1 # name=A.DC1, pairedNt=B.DG24
2 G 1 3 23 2 # name=A.DG2, pairedNt=B.DC23
3 C 2 4 22 3 # name=A.DC3, pairedNt=B.DG22
4 G 3 5 21 4 # name=A.DG4, pairedNt=B.DC21
5 A 4 6 20 5 # name=A.DA5, pairedNt=B.DT20
6 A 5 7 19 6 # name=A.DA6, pairedNt=B.DT19
7 T 6 8 18 7 # name=A.DT7, pairedNt=B.DA18
8 T 7 9 17 8 # name=A.DT8, pairedNt=B.DA17
9 C 8 10 16 9 # name=A.DC9, pairedNt=B.DG16
10 G 9 11 15 10 # name=A.DG10, pairedNt=B.DC15
11 C 10 12 14 11 # name=A.DC11, pairedNt=B.DG14
12 G 11 0 13 12 # name=A.DG12, pairedNt=B.DC13
13 C 0 14 12 13 # name=B.DC13, pairedNt=A.DG12
14 G 13 15 11 14 # name=B.DG14, pairedNt=A.DC11
15 C 14 16 10 15 # name=B.DC15, pairedNt=A.DG10
16 G 15 17 9 16 # name=B.DG16, pairedNt=A.DC9
17 A 16 18 8 17 # name=B.DA17, pairedNt=A.DT8
18 A 17 19 7 18 # name=B.DA18, pairedNt=A.DT7
19 T 18 20 6 19 # name=B.DT19, pairedNt=A.DA6
20 T 19 21 5 20 # name=B.DT20, pairedNt=A.DA5
21 C 20 22 4 21 # name=B.DC21, pairedNt=A.DG4
22 G 21 23 3 22 # name=B.DG22, pairedNt=A.DC3
23 C 22 24 2 23 # name=B.DC23, pairedNt=A.DG2
24 G 23 0 1 24 # name=B.DG24, pairedNt=A.DC1
Note the 0 at the 4th column for A.DG12 which is at the 3′ end of chain A, and the 0 at 3rd column for B.DC13 which is at the 5′ end of chain B.
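The zeros in the 3rd and 4th columns are enough to recover the individual strands. The sketch below illustrates this on the 355d records (the ‘#’ comments are omitted for brevity); `split_strands` is a hypothetical helper name, and the logic simply assumes that a 0 in the 3rd column opens a strand and a 0 in the 4th column closes it.

```python
CT_355D = """\
24 DSSR-derived secondary structure in '355d'
1 C 0 2 24 1
2 G 1 3 23 2
3 C 2 4 22 3
4 G 3 5 21 4
5 A 4 6 20 5
6 A 5 7 19 6
7 T 6 8 18 7
8 T 7 9 17 8
9 C 8 10 16 9
10 G 9 11 15 10
11 C 10 12 14 11
12 G 11 0 13 12
13 C 0 14 12 13
14 G 13 15 11 14
15 C 14 16 10 15
16 G 15 17 9 16
17 A 16 18 8 17
18 A 17 19 7 18
19 T 18 20 6 19
20 T 19 21 5 20
21 C 20 22 4 21
22 G 21 23 3 22
23 C 22 24 2 23
24 G 23 0 1 24
"""


def split_strands(ct_text):
    """Group record indices into strands using the 5'/3' neighbor columns."""
    lines = [ln for ln in ct_text.splitlines() if ln.strip()]
    count = int(lines[0].split()[0])
    strands, current = [], []
    for ln in lines[1:count + 1]:
        fields = ln.split("#")[0].split()
        idx, prev_i, next_i = int(fields[0]), int(fields[2]), int(fields[3])
        if prev_i == 0 and current:
            # A new 5' end while a strand is still open: close the old one.
            strands.append(current)
            current = []
        current.append(idx)
        if next_i == 0:
            # No 3' neighbor: this nucleotide ends the strand.
            strands.append(current)
            current = []
    if current:
        # Records without a terminating 0 (e.g. a circular molecule).
        strands.append(current)
    return strands


if __name__ == "__main__":
    for strand in split_strands(CT_355D):
        print(strand[0], "-", strand[-1])
```

For 355d this yields two strands, records 1 to 12 and 13 to 24, matching chains A and B; a dot-bracket string alone would need an extra separator character to carry the same information.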

From early on, 3DNA has calculated the Zp parameter to separate A- and B-DNA double helical steps. First introduced in the paper A-form conformational motifs in ligand-bound DNA structures (see figure below), Zp is the mean projection of the two phosphorus atoms onto the z-axis of the dimer ‘middle frame’. Zp is greater than 1.5 Å for A-DNA, and it is less than 0.5 Å for B-DNA. As noted in the 3DNA NAR paper, other parameters such as slide should also be examined to confirm conformational assignments based on Zp.

As of v2.1, 3DNA has introduced the single-stranded variant for the Zp parameter (ssZp) as a more robust substitute for the Richardson phosphorus-glycosidic bond distance parameter (Dp) to characterize sugar puckers. See post Sugar pucker correlates with phosphorus-base distance for more details. In 3DNA/DSSR, ssZp is defined as the z-coordinate of the 3′ phosphorus atom expressed in the standard reference frame of the preceding base; it is positive when phosphorus lies on the +z-axis side (base in anti conformation) and negative if phosphorus is on the –z-axis side (base in syn conformation). Note that by definition, Dp should always be positive.
As in the previous post, here I am using G175 and U176 of PDB entry 1jj2 (the large ribosomal subunit of Haloarcula marismortui) as examples to illustrate how the ssZp parameters are calculated. The GpU forms a dinucleotide platform, where the sugar of G175 adopts a C2′-endo conformation, and that of U176 C3′-endo. For verification, here is the PDB data file for fragment 1jj2-G175-U176-A177.pdb (note A177 is included for its phosphorus atom). Run the following 3DNA commands:
find_pair -s 1jj2-G175-U176-A177.pdb stdout
frame_mol -1 ref_frames.dat 1jj2-G175-U176-A177.pdb ref-G175.pdb
frame_mol -2 ref_frames.dat 1jj2-G175-U176-A177.pdb ref-U176.pdb
File ref-G175.pdb contains the following line:
ATOM 24 P U 0 176 -5.624 6.937 1.918 1.00 24.19 P
The z-coordinate of the phosphorus atom of U176 (which is 3′ to G175) is 1.918, which is the ssZp for G175. It is less than 2.9 Å, corresponding to the C2′-endo sugar conformation of G175.
Similarly, file ref-U176.pdb contains the following line:
ATOM 44 P A 0 177 -3.841 6.592 4.377 1.00 25.91 P
So the ssZp for U176 is 4.377, which is greater than 2.9 Å, corresponding to the C3′-endo sugar conformation of U176.
To sum up, the double-stranded Zp as originally available from 3DNA can be used for discriminating A- and B-DNA double-helical steps: Zp > 1.5 Å for A-DNA, and Zp < 0.5 Å for B-DNA. The newly introduced single-stranded Zp is intended for characterizing sugar puckers: ssZp > 2.9 Å for C3′-endo, and ssZp < 2.9 Å for C2′-endo. Since A-DNA has predominantly C3′-endo sugar conformation and B-DNA has C2′-endo sugar, the ssZp parameter would be helpful in classifying a dinucleotide into A- or B-like conformation. A survey of ssZp in well-defined A- and B-DNA structures (as performed for double-stranded Zp) should prove useful.
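The thresholds summarized above translate directly into two small classification rules. The Python sketch below is only an illustration: the function names are hypothetical, the ‘intermediate’ label for Zp values between 0.5 and 1.5 Å is my own placeholder, and the treatment of a value exactly at the 2.9 Å cutoff is an assumption, since the text does not specify it.

```python
def classify_step_by_zp(zp):
    """Classify a double-helical step from its Zp value (in angstroms)."""
    if zp > 1.5:
        return "A-DNA"
    if zp < 0.5:
        return "B-DNA"
    # Assumption: values in between are not assigned to either form here.
    return "intermediate"


def classify_pucker_by_sszp(sszp, cutoff=2.9):
    """Classify a sugar pucker from the single-stranded Zp (in angstroms)."""
    # Assumption: a value exactly at the cutoff is reported as C2'-endo.
    return "C3'-endo" if sszp > cutoff else "C2'-endo"


if __name__ == "__main__":
    # ssZp values taken from the 1jj2 example: G175 and U176.
    for name, value in [("G175", 1.918), ("U176", 4.377)]:
        print(name, value, classify_pucker_by_sszp(value))
```

With the two values from the 1jj2 fragment, G175 (1.918) comes out as C2′-endo and U176 (4.377) as C3′-endo, in line with the assignments given in the text. As noted earlier for Zp, other parameters such as slide should still be checked before trusting a conformational assignment.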
Realizing the potential for confusion between double-stranded Zp and single-stranded Zp, I am considering renaming single-stranded Zp as ssZp in future releases of 3DNA and DSSR. Do you have any comments or suggestions? Please let me know by leaving a comment!

Recently I was surprised by some cases of nucleotides with missing atoms in PDB entry 1pns. The story started like this: 3DNA/DSSR maps various nucleotide names to one-letter codes, based on the data file baselist.dat (see post Modified nucleotides in the PDB). In the meantime, 3DNA/DSSR internally assigns a nucleotide as either purine or pyrimidine, by virtue of the coordinates of its base atoms. By definition, purines should only include A/a/G/g/I/i, and pyrimidines C/c/T/t/U/u/P/p. However, no consistency check had been implemented in DSSR until just now.
I first noticed the inconsistency between residue name and atom coordinates for nucleotide A6 on chain U (hereafter referred to as U.A6) in 1pns. The nucleotide has standard name ‘ A’, obviously a purine. However, somehow DSSR classified it as a pyrimidine based on atomic coordinates. Upon further check of the PDB data file, I found the following remarks:
REMARK 470 MISSING ATOM
REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS(M=MODEL NUMBER;
REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;
REMARK 470 I=INSERTION CODE):
REMARK 470 M RES CSSEQI ATOMS
REMARK 470 A U 6 N9 C8 N7
REMARK 470 G U 8 N9 C8 N7
REMARK 470 A U 12 N9 C8 N7
REMARK 470 A U 13 N9 C8 N7
REMARK 470 A U 14 N9 C8 N7
The atomic coordinates for U.A6 are as below:
ATOM 34447 P A U 6 81.861 37.210 78.651 1.00378.87 P
ATOM 34448 OP1 A U 6 80.631 37.121 77.831 1.00378.87 O
ATOM 34449 OP2 A U 6 81.665 37.221 80.119 1.00378.87 O
ATOM 34450 O5' A U 6 82.707 38.495 78.212 1.00378.87 O
ATOM 34451 C5' A U 6 83.948 38.777 78.887 1.00378.87 C
ATOM 34452 C4' A U 6 84.600 40.000 78.276 1.00378.87 C
ATOM 34453 O4' A U 6 84.975 39.698 76.901 1.00378.87 O
ATOM 34454 C3' A U 6 83.714 41.239 78.153 1.00378.87 C
ATOM 34455 O3' A U 6 83.654 41.968 79.369 1.00378.87 O
ATOM 34456 C2' A U 6 84.403 42.015 77.020 1.00378.87 C
ATOM 34457 O2' A U 6 85.564 42.655 77.474 1.00378.87 O
ATOM 34458 C1' A U 6 84.834 40.864 76.105 1.00378.87 C
ATOM 34459 C5 A U 6 82.033 39.296 74.209 1.00378.87 C
ATOM 34460 C6 A U 6 82.941 39.553 75.166 1.00378.87 C
ATOM 34461 N6 A U 6 81.170 39.949 72.090 1.00378.87 N
ATOM 34462 N1 A U 6 83.830 40.588 75.041 1.00378.87 N
ATOM 34463 C2 A U 6 83.843 41.410 73.939 1.00378.87 C
ATOM 34464 N3 A U 6 82.899 41.124 72.974 1.00378.87 N
ATOM 34465 C4 A U 6 81.968 40.108 73.016 1.00378.87 C
There are no atom records for N7, C8, and N9. So far, so good. However, surprise came when I visualized U.A6 in Jmol, as shown in the following image. Note here atom N1 is connected to C1′ as in pyrimidines, and N6 is bonded to C4!

The same issue also exists for U.G8 (see figure below), U.A12, U.A13, and U.A14.

It is beyond my imagination to understand why such weird cases exist in the PDB, even given the lousy resolution (8.7 Å) of 1pns.
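A consistency check of the kind described in this post can be sketched in a few lines of Python. This is not how DSSR does it: DSSR works from atomic coordinates, whereas the simplified version below only looks at atom names and treats a residue as a purine when N7, C8, and N9 are all present. All names in the code are hypothetical.

```python
PURINE_CODES = set("AaGgIi")
PYRIMIDINE_CODES = set("CcTtUuPp")
PURINE_ONLY_ATOMS = {"N7", "C8", "N9"}  # imidazole-ring atoms absent in pyrimidines


def class_from_code(code):
    """Purine/pyrimidine class implied by the one-letter code."""
    if code in PURINE_CODES:
        return "purine"
    if code in PYRIMIDINE_CODES:
        return "pyrimidine"
    raise ValueError(f"unknown one-letter code: {code!r}")


def class_from_atoms(atom_names):
    """Simplified assumption: purine only if N7, C8 and N9 are all present."""
    return "purine" if PURINE_ONLY_ATOMS <= set(atom_names) else "pyrimidine"


def is_consistent(code, atom_names):
    """True when the residue name and the atom list agree on the class."""
    return class_from_code(code) == class_from_atoms(atom_names)


# Atom names of U.A6 in 1pns, as listed in the ATOM records above.
UA6_ATOMS = ["P", "OP1", "OP2", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'",
             "C2'", "O2'", "C1'", "C5", "C6", "N6", "N1", "C2", "N3", "C4"]

if __name__ == "__main__":
    print(class_from_code("A"), class_from_atoms(UA6_ATOMS),
          is_consistent("A", UA6_ATOMS))
```

For U.A6 the residue name says purine while the atom list looks like a pyrimidine, so the check flags the nucleotide as inconsistent.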

I recently upgraded my Macs to OS X Mavericks to check if 3DNA/DSSR works in the new operating system. I am glad to report that both run without a hitch, as expected.
Since OS X Mavericks is free from the Mac App Store, it will quickly become the de facto version virtually all Mac users would use. I also noticed that Ruby on Mavericks has been upgraded to ruby 2.0.0p247 (2013-06-27 revision 41674), a major step forward from the now retiring Ruby 1.8.7 distributed in previous versions of Mac OS X.
As a rule, I make sure that 3DNA/DSSR executes properly on major releases of the commonly used operating systems: Mac, Windows, and Linux.

Although I have not used DOS for ages, I am glad to find that the DSSR version compiled for MinGW/MSYS on Windows works perfectly under this operating system (see screenshot below). The DSSR DOS command-line interface functions exactly the same as for Linux, Mac OS X, MinGW/MSYS, and Cygwin. Among other possible usages, it allows for batch files to take advantage of DSSR.
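The same kind of batch processing can be scripted from Python on any of these platforms. The sketch below is an illustration under stated assumptions: it presumes that the `x3dna-dssr` executable is on the search path, reuses only the `-i=`, `--get-hbonds`, and `-o=` options shown in the H-bond post, and the helper names and the `-hbonds.txt` output naming are my own choices.

```python
import subprocess
from pathlib import Path


def hbonds_command(pdb_file, exe="x3dna-dssr"):
    """Build the DSSR command line for one PDB file (nothing is run here)."""
    pdb = Path(pdb_file)
    # Assumed naming scheme: 1msy.pdb -> 1msy-hbonds.txt in the same folder.
    out = pdb.with_name(pdb.stem + "-hbonds.txt")
    return [exe, f"-i={pdb}", "--get-hbonds", f"-o={out}"]


def run_all(directory="."):
    """Run DSSR on every .pdb file found in the given directory."""
    for pdb in sorted(Path(directory).glob("*.pdb")):
        # check=True raises CalledProcessError if DSSR exits with an error.
        subprocess.run(hbonds_command(pdb), check=True)


# Example usage (requires DSSR to be installed):
#     run_all("structures")
```

Passing the command as a list rather than a single string avoids shell quoting issues, which matters for file names containing spaces on Windows.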

Implementing DSSR in strict ANSI C as a self-contained, zero-dependency command-line program pays off enormously: it simplifies code maintenance and ensures that the program is applicable wherever a C compiler exists. The easy web interface to DSSR makes the program universally accessible.