[Job] Staff Associate II (Computational Structural Biology) at Columbia University
The X3DNA-DSSR resource is at the forefront of structural bioinformatics, developing advanced tools for analyzing and modeling nucleic acid structures. We are seeking a highly motivated Staff Associate II to join our team and contribute to our next-generation analysis and visualization engine.
To see our resource in action, please visit wDSSR, our new web interface for dissecting and modeling 3D nucleic acid structures: https://web.x3dna-dssr.org/.
We are looking for a candidate with a strong scientific background in structural biology or bioinformatics and a desire to contribute to peer-reviewed publications through community-driven data analysis. We value individuals who are eager to learn, adapt to new technical challenges, and support the global research community.
For the full job description and to submit your application, please visit the official Columbia University posting:
https://apply.interfolio.com/183705
Announcing wDSSR: The Next-Generation Web Interface to X3DNA-DSSR
Dear 3DNA/DSSR Community,
We are thrilled to announce the official launch of wDSSR (https://web.x3dna-dssr.org/), the powerful new web interface to the X3DNA-DSSR analytical engine.
Developed by Drs. Shuxiang Li and Xiang-Jun Lu and supported by NIH grant R24GM153869, wDSSR represents a major leap forward from our highly popular 2019 Web 3DNA 2.0 framework. While Web 3DNA 2.0 has faithfully served the community for the analysis, visualization, and modeling of 3D nucleic acid structures, wDSSR was built from the ground up to take full advantage of modern web technologies and the latest DSSR backend capabilities.
A Modern, Streamlined Scientific Workflow
We have completely overhauled the user interface to provide a clean, intuitive, and task-driven experience. The core modeling and analysis tools are now seamlessly organized into a logical, single-word scientific workflow: Analyze, Rebuild, Model, Circularize, Mutate, Assemble, and Visualize.
Spotlight Feature: The "Assemble" Module
One of the most exciting upgrades is the newly renamed Assemble tab (formerly "Composite"). This advanced composite model builder allows you to effortlessly construct complex, higher-order models by linking any combination of nucleic acid duplexes or protein-DNA/RNA complexes. You can quickly connect up to six distinct target structures, ranging from simple linked A-DNA and B-DNA duplexes to large, protein-decorated structural assemblies.
Immediate Global Adoption
Although wDSSR has just launched, we are incredibly humbled to share that it is already seeing rapid worldwide adoption! According to recent network infrastructure data, the new interface is actively being used by researchers across North America, South America, Europe, and Asia. Within just a few days, we have recorded active sessions from prestigious institutions around the globe, including:
- The Weizmann Institute of Science in Israel
- Katholieke Universiteit Leuven in Belgium
- Queen's University in Canada
- Universidad Nacional Autonoma de Mexico (UNAM) in Mexico
- Emory University and the Wadsworth Centers Laboratories and Research in the United States
- Jawaharlal Nehru University and the China Education and Research Network in Asia
How to Cite
While a dedicated paper for wDSSR is currently in preparation, researchers should cite the server using its URL (https://web.x3dna-dssr.org/) alongside the 2019 Web 3DNA 2.0 paper and the foundational 2015 DSSR paper. Full details and funding acknowledgements can be found on our newly consolidated About page.
We invite you all to try out the new wDSSR platform! As always, your feedback is invaluable to us, and we encourage you to share your thoughts, questions, and structural models via the newly updated Questions & Feedback link in the wDSSR footer.
Happy modeling!
In the field of nucleic acid structures, especially in the ‘RNA world’, we often hear of named base pairs (bps). Among those, the Watson-Crick (WC) A–U and G–C bps (see figure below) are by far the most common.

Closely related to the WC bps are the so-called reversed WC (rWC) bps, where the relative orientations of the two glycosidic bonds are reversed; instead of being on the same side of the bases as in the WC bps shown above, they are now on opposite sides in rWC bps, as shown below. According to the Leontis-Westhof (LW) bp classification scheme, the rWC bps belong to trans WC/WC. Following Saenger’s numbering, the rWC A+U bp corresponds to XXI, and the rWC G+C bp to XXII.
In the figures below, the name of each type of bp and its LW & Saenger designations (separated by ‘;’) are noted under the corresponding image. All images are generated with 3DNA; for easy comparison, each bp is oriented in the reference frame of the leading base.
| Reversed WC A+U pair | Reversed WC G+C pair |
| trans WC/WC; XXI | trans WC/WC; XXII |
The next most famous one is the Hoogsteen A+U bp, which also has a reverse variant, i.e., the rHoogsteen A–U bp (see figure below). Now the major groove edge of A, termed the Hoogsteen edge by LW, is used for pairing with U.
| Hoogsteen A+U pair | Reversed Hoogsteen A–U pair |
| cis Hoogsteen/WC; XXIII | trans Hoogsteen/WC; XXIV |
First proposed by Crick in 1966 to account for the degeneracy in codon–anticodon pairing, the Wobble bp is an essential component (in addition to the WC bps) in forming double helical RNA secondary structures.
| Wobble G–U pair |
| cis WC/WC; XXVIII |
The sheared G–A base pair
Sheared G–A is a commonly found non-WC bp in both DNA and RNA structures. Notably, tandem sheared G–A bps introduce a distinct stacking geometry. Here G uses its minor groove edge, termed the sugar edge by LW, to pair with the Hoogsteen edge of A.
| Sheared G–A pair |
| trans Sugar/Hoogsteen; XI |
Dinucleotide platforms
Dinucleotide platforms are formed via side-by-side pairing of adjacent bases; the most common of which are GpU and ApA. Here the sugar (minor-groove) edge of the 5′ base interacts with the Hoogsteen (major-groove) edge of the 3′ base. Since there is only one base-base H-bond in dinucleotide platforms, no Saenger classification is available. In 3DNA output, the GpU dinucleotide platform is designated as G+U, and ApA as A+A.
| GpU dinucleotide platform | ApA dinucleotide platform |
| cis Sugar/Hoogsteen; n/a | cis Sugar/Hoogsteen; n/a |
Other named base pairs
There exist other named bps in the RNA literature, e.g., G⋅A imino, A⋅C reverse Hoogsteen, G⋅U reverse Wobble, etc. In my experience, they are (much) less commonly used than the ones illustrated above.

A glycosidic bond “is a type of covalent bond that joins a carbohydrate (sugar) molecule to another group, which may or may not be another carbohydrate.” In nucleic acid structures, the other group is a nucleobase, and the predominant type is the N-glycosidic bond, where the purine (A/G) N9 or pyrimidine (C/T/U) N1 atom connects to the C1′ atom of the five-membered (deoxy)ribose sugar ring. Another well-known type is the C-glycosidic bond in pseudouridine, the most common modified base in RNA structures, where the C5 atom instead of N1 is linked to the C1′ atom of the sugar ring.

Recently, I performed a survey of all nucleic-acid-containing structures in the PDB/NDB database to see how many types of glycosidic bonds there are. As always, I noticed some inconsistencies in the data, e.g., nucleotides with a disconnected base and sugar, or a base labeled as U but with a pseudoU-type C-glycosidic bond. Shown below are a few unusual types of glycosidic bonds in otherwise seemingly “normal” structures:
- The residue GN7 (number 28 on chain A) in PDB entry 1gn7 contains an N7-glycosylated guanine.

- The residue UPG (number 501 on chain A) in PDB entry 1y6f has the sugar C1C (instead of C1′) atom connected to N1 of U.

- The residue XAE (number 11 on chain B) in PDB entry 2icz contains a benzo-homologous adenine.

- The residue F5H (number 206 on chain B) in PDB entry 3v06 has N1 of U connected to C2′ of a six-membered sugar ring.

The unusual glycosidic bond has implications in 3DNA calculated parameters, for example the chi torsion angle. Identifying such cases would help refine 3DNA to provide sensible parameters and to avoid possible misinterpretations.
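To make the point about the chi torsion angle concrete, here is a minimal Python sketch of how chi could be computed for the standard N-glycosidic bond, using the conventional atom quadruples O4′-C1′-N9-C4 (purines) and O4′-C1′-N1-C2 (pyrimidines). This is not 3DNA's own code: the function names, the dictionary-of-coordinates input, and the 2.0 Å cutoff used to flag a disconnected base/sugar are all assumptions made for illustration. Residues such as those listed above would need case-by-case handling beyond this sketch.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, between -180 and 180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Conventional chi definitions for the standard N-glycosidic bond.
CHI_ATOMS = {
    "purine": ("O4'", "C1'", "N9", "C4"),
    "pyrimidine": ("O4'", "C1'", "N1", "C2"),
}

def chi(atoms, base_kind, max_bond=2.0):
    """Return chi in degrees, or None if the standard definition does not apply.

    atoms     -- dict mapping atom name to an (x, y, z) tuple (hypothetical input)
    max_bond  -- assumed cutoff (in angstroms) for a connected C1'-N bond
    """
    names = CHI_ATOMS[base_kind]
    if any(name not in atoms for name in names):
        return None  # an atom of the standard quadruple is missing
    if math.dist(atoms[names[1]], atoms[names[2]]) > max_bond:
        return None  # base and sugar are not connected via the expected atoms
    return dihedral(*(atoms[name] for name in names))
```

Returning `None` rather than a number is a deliberate choice here: a torsion computed over atoms that are not actually bonded would look like a valid angle but be meaningless.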

As of today (2012-09-16), the number of 3DNA forum registrations has reached 500! A quick browse of the ‘Statistics Center’ shows that over 80% of the registrations (400+) are after March 2012, when the new 3DNA homepage/forum were launched.
The sharp increase in registrations is mostly due to the streamlined, web-based way of distributing the 3DNA software package. As far as I know, the number of 3DNA registrations/downloads in the past six months is significantly higher than that of 3DNA v2.0 over more than three years. Equally important, I have been able to fix every reported bug, address each feature request, and update the 3DNA v2.1 distribution promptly.
I also feel confident in declaring that, up to now, the 3DNA Forum has been spam free (at least to the extent I am aware). To this end, I’ve taken the following three measures:
- Installation of the SMF “Mod Stop Spammer”; as of this writing, it shows “3920 Spammers blocked up until today”.
- Use of 3DNA-related verification questions. With the current setting, a user must correctly answer three of the ‘simple’ yet effective verification questions. Early on, I deliberately decided not to use CAPTCHA as an anti-spam measure, based on my past experience.
- Continuous monitoring of (new) registrations, with immediate action taken against any suspicious one. Thanks to the effectiveness of the above two steps, I have so far had to handle only a few spam registrations manually. Nevertheless, it does illustrate the fact that no automatic method is perfect, and expert inspection is required to ensure the desired results.
Overall, the new simplified way to distribute the 3DNA software package is working as intended; now users can easily access all distributed versions of 3DNA, and I can focus on support and further development of the software.

From v1.5 or even earlier on, 3DNA provides an automatic classification of a dinucleotide step into A-, B- or TA-DNA conformation. Figure 5 of the 2003 3DNA Nucleic Acids Research paper (NAR03) shows three sets of scatter plots — helical inclination and x‐displacement, dimer step Roll and Slide, and the projected phosphorus z coordinates Zp and Zp(h) — to differentiate the A-, B- and TA-DNA dinucleotide steps.

Among the criteria tested, the most discriminative ones are the projected phosphorus z coordinates, Zp in the middle step frame (see figure below), and Zp(h) defined similarly but in the middle helical frame.

Over the years, I have received many questions regarding the datasets used in generating Figure 5 of NAR03. Back in August 2006, a user asked for IDs of the TA-DNA structures (see DNA standards/statistics using 3DNA). In April 2007, another user requested the same TA-DNA dataset. Early this year, a user asked for 3DNA’s A-DNA definition. More recently, yet another user asked about the DNA set used for the analysis presented in Figure 5 of the NAR 2003 paper.
I am glad to see that nearly a decade after the NAR03 publication, the user community is still interested in the details of the work. So I decided to dig into my archive for the original data files and scripts used to generate Figure 5 of NAR03. It was not an easy journey: just releasing the data files and scripts is not enough, since I also wanted to verify that they work together as intended in today’s computing environment. Luckily, I was finally able to get to the bottom of the issues. The details are in the post Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper. The tarball file named 3DNA-NAR03-Fig5.tar.gz is available by clicking the link.

As noted in the post Rectangular block expressed in MDL molfile format, I added the -mol option (in v2.1) to convert 3DNA’s native alchemy format to the better-supported MDL molfile format, to make the characteristic schematic representations more widely accessible. Along the same line, I have recently further augmented alc2img with the -pdb option to transform alchemy to the PDB format.
While the macromolecular PDB format is certainly not convenient for specifying linkage details of small molecules, it is nevertheless the best documented, and far more widely supported than molfile or alchemy in currently available molecular viewers. For example, the PDB format is consistently supported in Jmol, PyMOL, RasMol, DeepView, and UCSF Chimera. Moreover, the PDB format does have the CONECT section to provide information on atomic connectivity:
The CONECT records specify connectivity between atoms for which coordinates are supplied. The connectivity is described using the atom serial number as shown in the entry. CONECT records are mandatory for HET groups (excluding water) and for other bonds not specified in the standard residue connectivity table.
The alc2img -pdb option takes advantage of the CONECT records and specifies all ‘bond’ linkages explicitly. The usage is very simple — take the standard base-pair rectangular block file (‘Block_BP.alc’) as an example, the conversion can be performed as below:
alc2img -pdb Block_BP.alc Block_BP.pdb
Content of ‘Block_BP.alc’
12 ATOMS, 12 BONDS
1 N -2.2500 5.0000 0.2500
2 N -2.2500 -5.0000 0.2500
3 N -2.2500 -5.0000 -0.2500
4 N -2.2500 5.0000 -0.2500
5 C 2.2500 5.0000 0.2500
6 C 2.2500 -5.0000 0.2500
7 C 2.2500 -5.0000 -0.2500
8 C 2.2500 5.0000 -0.2500
9 C -2.2500 5.0000 0.2500
10 C -2.2500 -5.0000 0.2500
11 C -2.2500 -5.0000 -0.2500
12 C -2.2500 5.0000 -0.2500
1 1 2
2 2 3
3 3 4
4 4 1
5 5 6
6 6 7
7 7 8
8 5 8
9 9 5
10 10 6
11 11 7
12 12 8
Content of ‘Block_BP.pdb’
REMARK 3DNA v2.1 (c) 2012 Dr. Xiang-Jun Lu (http://x3dna.org)
HETATM 1 N ALC A 1 -2.250 5.000 0.250 1.00 1.00 N
HETATM 2 N ALC A 1 -2.250 -5.000 0.250 1.00 1.00 N
HETATM 3 N ALC A 1 -2.250 -5.000 -0.250 1.00 1.00 N
HETATM 4 N ALC A 1 -2.250 5.000 -0.250 1.00 1.00 N
HETATM 5 C ALC A 1 2.250 5.000 0.250 1.00 1.00 C
HETATM 6 C ALC A 1 2.250 -5.000 0.250 1.00 1.00 C
HETATM 7 C ALC A 1 2.250 -5.000 -0.250 1.00 1.00 C
HETATM 8 C ALC A 1 2.250 5.000 -0.250 1.00 1.00 C
HETATM 9 C ALC A 1 -2.250 5.000 0.250 1.00 1.00 C
HETATM 10 C ALC A 1 -2.250 -5.000 0.250 1.00 1.00 C
HETATM 11 C ALC A 1 -2.250 -5.000 -0.250 1.00 1.00 C
HETATM 12 C ALC A 1 -2.250 5.000 -0.250 1.00 1.00 C
CONECT 1 2 4
CONECT 2 1 3
CONECT 3 2 4
CONECT 4 1 3
CONECT 5 6 8 9
CONECT 6 5 7 10
CONECT 7 6 8 11
CONECT 8 5 7 12
CONECT 9 5
CONECT 10 6
CONECT 11 7
CONECT 12 8
END
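The mapping from alchemy bonds to CONECT records shown above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual alc2img implementation: the function names are hypothetical, the parser assumes the simplified alchemy layout listed above (serial, element, x, y, z for atoms; index plus two atom serials for bonds), and the residue name, chain id, occupancy and B-factor values are simply copied from the sample output.

```python
from collections import defaultdict

BLOCK_BP_ALC = """\
12 ATOMS, 12 BONDS
1 N -2.2500 5.0000 0.2500
2 N -2.2500 -5.0000 0.2500
3 N -2.2500 -5.0000 -0.2500
4 N -2.2500 5.0000 -0.2500
5 C 2.2500 5.0000 0.2500
6 C 2.2500 -5.0000 0.2500
7 C 2.2500 -5.0000 -0.2500
8 C 2.2500 5.0000 -0.2500
9 C -2.2500 5.0000 0.2500
10 C -2.2500 -5.0000 0.2500
11 C -2.2500 -5.0000 -0.2500
12 C -2.2500 5.0000 -0.2500
1 1 2
2 2 3
3 3 4
4 4 1
5 5 6
6 6 7
7 7 8
8 5 8
9 9 5
10 10 6
11 11 7
12 12 8
"""

def parse_alchemy(text):
    """Return (atoms, bonds) from the simplified alchemy text shown above."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    n_atoms, n_bonds = int(rows[0][0]), int(rows[0][2])
    atoms = [(int(r[0]), r[1], float(r[2]), float(r[3]), float(r[4]))
             for r in rows[1:1 + n_atoms]]
    # Only the two atom serials are used; any trailing fields are ignored.
    bonds = [(int(r[1]), int(r[2]))
             for r in rows[1 + n_atoms:1 + n_atoms + n_bonds]]
    return atoms, bonds

def to_pdb_lines(atoms, bonds, res_name="ALC", chain="A", res_seq=1):
    """Build fixed-column HETATM records plus one CONECT record per atom."""
    lines = []
    for serial, elem, x, y, z in atoms:
        name = f" {elem:<3s}"  # one-letter element names start in column 14
        lines.append(
            f"HETATM{serial:5d} {name} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{1.0:6.2f}          {elem:>2s}")
    neighbours = defaultdict(set)
    for a, b in bonds:  # each bond is listed from both of its atoms
        neighbours[a].add(b)
        neighbours[b].add(a)
    for serial in sorted(neighbours):
        partners = "".join(f"{n:5d}" for n in sorted(neighbours[serial]))
        lines.append(f"CONECT{serial:5d}{partners}")
    lines.append("END")
    return lines

pdb_lines = to_pdb_lines(*parse_alchemy(BLOCK_BP_ALC))
```

Printing `"\n".join(pdb_lines)` gives records with the same connectivity as the CONECT section listed above; the REMARK header line is not reproduced.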

From a pure structural perspective, the designation of the two strands in an anti-parallel DNA duplex is sort of arbitrary. Thus, for a given PDB file, let’s assume that the atomic coordinates of chain A (strand I) come before those of chain B (strand II). We can swap the order of the two chains as they appear in the PDB file, i.e., list first the atomic coordinates of chain B and then those of chain A.
Structurally, the two settings correspond to exactly the same DNA molecule. As far as 3DNA goes, however, the different orderings do make a difference in the calculated parameters. Using the Dickerson B-DNA dodecamer CGCGAATTCGCG solved at high resolution (PDB entry 355d) as an example, running 3DNA find_pair and analyze on ‘355d.pdb’ gives the results (abbreviated) below:
find_pair 355d.pdb 355d.bps
# contents of file '355d.bps':
------------------------------------------------------------------
355d.pdb
355d.out
2 # duplex
12 # number of base-pairs
1 1 # explicit bp numbering/hetero atoms
1 24 0 # 1 | ....>A:...1_:[.DC]C-----G[.DG]:..24_:B<....
2 23 0 # 2 | ....>A:...2_:[.DG]G-----C[.DC]:..23_:B<....
3 22 0 # 3 | ....>A:...3_:[.DC]C-----G[.DG]:..22_:B<....
4 21 0 # 4 | ....>A:...4_:[.DG]G-----C[.DC]:..21_:B<....
5 20 0 # 5 | ....>A:...5_:[.DA]A-----T[.DT]:..20_:B<....
6 19 0 # 6 | ....>A:...6_:[.DA]A-----T[.DT]:..19_:B<....
7 18 0 # 7 | ....>A:...7_:[.DT]T-----A[.DA]:..18_:B<....
8 17 0 # 8 | ....>A:...8_:[.DT]T-----A[.DA]:..17_:B<....
9 16 0 # 9 | ....>A:...9_:[.DC]C-----G[.DG]:..16_:B<....
10 15 0 # 10 | ....>A:..10_:[.DG]G-----C[.DC]:..15_:B<....
11 14 0 # 11 | ....>A:..11_:[.DC]C-----G[.DG]:..14_:B<....
12 13 0 # 12 | ....>A:..12_:[.DG]G-----C[.DC]:..13_:B<....
------------------------------------------------------------------
analyze 355d.bps
# generate output file '355d.out', with base-pair step parameters:
****************************************************************************
step Shift Slide Rise Tilt Roll Twist
1 CG/CG 0.09 0.04 3.20 -3.22 8.52 32.73
2 GC/GC 0.50 0.67 3.69 2.85 -9.06 43.88
3 CG/CG -0.14 0.59 3.00 0.97 11.30 25.11
4 GA/TC -0.45 -0.14 3.39 -1.59 1.37 37.50
5 AA/TT 0.17 -0.33 3.30 -0.33 0.46 37.52
6 AT/AT -0.01 -0.60 3.22 -0.31 -2.67 32.40
7 TT/AA -0.08 -0.40 3.22 1.68 -0.97 33.74
8 TC/GA -0.27 -0.23 3.47 0.68 -1.69 42.14
9 CG/CG 0.70 0.78 3.07 -3.66 4.18 26.58
10 GC/GC -1.31 0.36 3.37 -2.85 -9.37 41.60
11 CG/CG -0.31 0.21 3.17 -0.68 6.69 33.31
****************************************************************************
Reversing the order of chains A and B in ‘355d.pdb’ as ‘355d-reversed.pdb’ and repeating the above procedure, we have the following results:
find_pair 355d-reversed.pdb 355d-reversed.bps
# contents of file '355d-reversed.bps':
------------------------------------------------------------------
355d-reversed.pdb
355d-reversed.out
2 # duplex
12 # number of base-pairs
1 1 # explicit bp numbering/hetero atoms
1 24 0 # 1 | ....>B:..13_:[.DC]C-----G[.DG]:..12_:A<....
2 23 0 # 2 | ....>B:..14_:[.DG]G-----C[.DC]:..11_:A<....
3 22 0 # 3 | ....>B:..15_:[.DC]C-----G[.DG]:..10_:A<....
4 21 0 # 4 | ....>B:..16_:[.DG]G-----C[.DC]:...9_:A<....
5 20 0 # 5 | ....>B:..17_:[.DA]A-----T[.DT]:...8_:A<....
6 19 0 # 6 | ....>B:..18_:[.DA]A-----T[.DT]:...7_:A<....
7 18 0 # 7 | ....>B:..19_:[.DT]T-----A[.DA]:...6_:A<....
8 17 0 # 8 | ....>B:..20_:[.DT]T-----A[.DA]:...5_:A<....
9 16 0 # 9 | ....>B:..21_:[.DC]C-----G[.DG]:...4_:A<....
10 15 0 # 10 | ....>B:..22_:[.DG]G-----C[.DC]:...3_:A<....
11 14 0 # 11 | ....>B:..23_:[.DC]C-----G[.DG]:...2_:A<....
12 13 0 # 12 | ....>B:..24_:[.DG]G-----C[.DC]:...1_:A<....
------------------------------------------------------------------
analyze 355d-reversed.bps
# generate output file '355d-reversed.out', with base-pair step parameters:
****************************************************************************
step Shift Slide Rise Tilt Roll Twist
1 CG/CG 0.31 0.21 3.17 0.68 6.69 33.31
2 GC/GC 1.31 0.36 3.37 2.85 -9.37 41.60
3 CG/CG -0.70 0.78 3.07 3.66 4.18 26.58
4 GA/TC 0.27 -0.23 3.47 -0.68 -1.69 42.14
5 AA/TT 0.08 -0.40 3.22 -1.68 -0.97 33.74
6 AT/AT 0.01 -0.60 3.22 0.31 -2.67 32.40
7 TT/AA -0.17 -0.33 3.30 0.33 0.46 37.52
8 TC/GA 0.45 -0.14 3.39 1.59 1.37 37.50
9 CG/CG 0.14 0.59 3.00 -0.97 11.30 25.11
10 GC/GC -0.50 0.67 3.69 -2.85 -9.06 43.88
11 CG/CG -0.09 0.04 3.20 3.22 8.52 32.73
****************************************************************************
Comparing the base-pair step parameters between ‘355d.out’ and ‘355d-reversed.out’, one would notice that while slide/rise/roll/twist simply switch order, shift/tilt (the x-axis parameters) also flip their signs. On the other hand, the nucleotide serial numbers specifying base pairs (the left two columns) are identical in ‘355d.bps’ and ‘355d-reversed.bps’.
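The relationship between the two tables can be written down as a small Python sketch: reverse the order of the steps, swap the two halves of each step label, and negate Shift and Tilt. The function name and the tuple layout are mine, chosen for illustration; the numbers are copied from the ‘355d.out’ listing above.

```python
# (step, Shift, Slide, Rise, Tilt, Roll, Twist) as listed in '355d.out'
STEPS_355D = [
    ("CG/CG", 0.09, 0.04, 3.20, -3.22, 8.52, 32.73),
    ("GC/GC", 0.50, 0.67, 3.69, 2.85, -9.06, 43.88),
    ("CG/CG", -0.14, 0.59, 3.00, 0.97, 11.30, 25.11),
    ("GA/TC", -0.45, -0.14, 3.39, -1.59, 1.37, 37.50),
    ("AA/TT", 0.17, -0.33, 3.30, -0.33, 0.46, 37.52),
    ("AT/AT", -0.01, -0.60, 3.22, -0.31, -2.67, 32.40),
    ("TT/AA", -0.08, -0.40, 3.22, 1.68, -0.97, 33.74),
    ("TC/GA", -0.27, -0.23, 3.47, 0.68, -1.69, 42.14),
    ("CG/CG", 0.70, 0.78, 3.07, -3.66, 4.18, 26.58),
    ("GC/GC", -1.31, 0.36, 3.37, -2.85, -9.37, 41.60),
    ("CG/CG", -0.31, 0.21, 3.17, -0.68, 6.69, 33.31),
]

def swap_strands(steps):
    """Step parameters as seen from the complementary strand.

    The steps are traversed in the opposite order, each label XY/ZW
    becomes ZW/XY, and the x-axis parameters (Shift, Tilt) change sign.
    """
    swapped = []
    for label, shift, slide, rise, tilt, roll, twist in reversed(steps):
        new_label = "/".join(reversed(label.split("/")))
        swapped.append((new_label, -shift, slide, rise, -tilt, roll, twist))
    return swapped

for row in swap_strands(STEPS_355D):
    print("%s %6.2f %6.2f %6.2f %6.2f %6.2f %6.2f" % row)
```

Applying `swap_strands` twice returns the original list, which is a quick sanity check that the two descriptions carry the same information.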
Apart from explicitly swapping the two strands in the PDB data file, one can simply switch around the nucleotide serial numbers generated with find_pair in order to analyze a DNA duplex based on its complementary sequence instead of the primary one. For example, starting from the same PDB file ‘355d.pdb’, we change ‘355d.bps’ to ‘355d-cs.bps’ as below,
------------------------------------------------------------------
355d.pdb
355d-cs.out
2 # duplex
12 # number of base-pairs
1 1 # explicit bp numbering/hetero atoms
13 12
14 11
15 10
16 9
17 8
18 7
19 6
20 5
21 4
22 3
23 2
24 1
------------------------------------------------------------------
Running analyze 355d-cs.bps, one would get exactly the same parameters in the output file ‘355d-cs.out’ as in ‘355d-reversed.out’.
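The renumbering used in ‘355d-cs.bps’ follows a simple rule, sketched below in Python: swap the two serial numbers in each pair and reverse the order of the pairs. The helper name is hypothetical, and the sketch only produces the pair lines, not the header lines of the .bps file.

```python
def complementary_pairs(pairs):
    """Re-express base pairs with the complementary strand as the leading one."""
    return [(j, i) for i, j in reversed(pairs)]

# Pairs as numbered by find_pair for 355d: 1-24, 2-23, ..., 12-13
pairs_355d = [(i, 25 - i) for i in range(1, 13)]
cs_pairs = complementary_pairs(pairs_355d)

for i, j in cs_pairs:
    print(i, j)
```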

Ever since the 2003 publication of the initial 3DNA Nucleic Acids Research paper (NAR03), the schematic diagrams of base-pair parameters (see figure below) have become quite popular. Over the years, we have received numerous requests for permission to use the figure, or a portion thereof; as an example, the figure has been adopted into a structural biology textbook. In the 2008 3DNA Nature Protocols paper (NP08), we devoted the very first protocol to “create a schematic image for propeller of 45°”.

Figure legend taken from Figure 1 of NAR03: Pictorial definitions of rigid body parameters used to describe the geometry of complementary (or non‐complementary) base pairs and sequential base pair steps (19). The base pair reference frame (lower left) is constructed such that the x‐axis points away from the (shaded) minor groove edge of a base or base pair and the y‐axis points toward the sequence strand (I). The relative position and orientation of successive base pair planes are described with respect to both a dimer reference frame (upper right) and a local helical frame (lower right). Images illustrate positive values of the designated parameters. For illustration purposes, helical twist (Ωh) is the same as Twist (ω), formerly denoted by Ω (19,20) and helical rise (h) is the same as Rise (Dz).
I recall spending around two weeks to produce the above figure. Content-wise, the figure was constructed in only a short while; it was the little details that took me most of the time.
Over time, I’ve witnessed numerous versions of such schematic images in publications related to DNA/RNA structures. While looking similar, the schematics differ subtly in the magnitude, orientation and relative scale of illustrated parameters. To the best of my knowledge, only 3DNA provides a pragmatic approach to generate the base-pair schematic diagrams consistently.
To make the schematics more readily accessible, I’ve reproduced a high-resolution image (in PNG format) for each of the 14 parameters shown above. You are welcome to mix and match the diagrams as necessary. If you use any of them in your publications, please cite the 3DNA NAR03 and/or NP08 paper(s).
Note that in the schematic diagrams below, the shaded edge (facing the viewer) denotes the minor-groove side of a base or base pair.
| Shear (Sx) | Stretch (Sy) | Stagger (Sz) |
| Buckle (κ) | Propeller (π) | Opening (σ) |
| Shift (Dx) | Slide (Dy) | Rise (Dz) |
| Tilt (τ) | Roll (ρ) | Twist (ω) |
| x-displacement (dx) | y-displacement (dy) | Helical Rise (h): as for Rise above (for illustration purposes) |
| Inclination (η) | Tip (θ) | Helical Twist (Ωh): as for Twist above (for illustration purposes) |

As of v2.1, I’ve switched from Perl to Ruby as the scripting language for 3DNA. Consequently, the Perl scripts in previous versions of 3DNA (v1.5 and v2.0) are now obsolete. I’ll only correct bugs in existing Perl scripts, but will not add any new features.
For back reference, the scripts are still available from a separate directory $X3DNA/perl_scripts, with the following contents:
OP_Mxyz* dcmnfile* nmr_strs*
README del_ms* pdb_frag*
block_atom* expand_ids* x3dna2charmm_pdb*
blocview.pl* manalyze* x3dna_r3d2png*
bp_mutation* mstack2img* x3dna_setup.pl*
cp_std* nmr_ensemble* x3dna_utils.pm
Among them, x3dna_setup.pl and blocview.pl have corresponding Ruby versions: x3dna_setup and blocview. Actually, the .pl file extension (for Perl) was added to avoid confusion with the new Ruby scripts.
Some of the functionalities have been incorporated into the Ruby script x3dna_utils:
------------------------------------------------------------------------
A miscellaneous collection of 3DNA utilities
Usage: x3dna_utils [-h|-v] sub-command [-h] [options]
where sub-command must be one of:
block_atom -- generate a base block schematic representation
cp_std -- select standard PDB datasets for analyze/rebuild
dcmnfile -- remove fixed-name files generated with 3DNA
x3dna_r3d2png -- convert .r3d to image with Raster3D or PyMOL
------------------------------------------------------------------------
--version, -v: Print version and exit
--help, -h: Show this message
Along the same line, ensemble-related functionalities (for NMR or molecular dynamics simulations) have been consolidated and extended into the new Ruby script x3dna_ensemble:
------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
where sub-command must be one of:
analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
block_image -- generate a base block schematic image
extract -- extract structural parameters after running 'analyze'
reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
--version, -v: Print version and exit
--help, -h: Show this message
Conceivably, C programs in 3DNA can also be consolidated. For backward compatibility, however, all existing C programs will be kept — and refined as necessary — in the current 3DNA v2.x series. As of v3.x, I’ll completely re-organize 3DNA incorporating my years of experience in programming languages and knowledge of macromolecular structures.

In 3DNA, each base pair (bp) is specified by the identity of its two comprising nucleotides (nts), and their interactions. Some examples are shown below based on the PDB entry 1ehz (the crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution), with the shorthand form on the right:
....>A:...1_:[..G]G-----C[..C]:..72_:A<.... G-C
....>A:...4_:[..G]G-*---U[..U]:..69_:A<.... G-U
....>A:...9_:[..A]A-**+-A[..A]:..23_:A<.... A+A
....>A:..15_:[..G]G-**+-C[..C]:..48_:A<.... G+C
....>A:..26_:[M2G]g-**--A[..A]:..44_:A<.... g-A
Specification of a nucleotide
The nt specification string consists of 6 fields and follows the pattern below, with the number of characters in each field inside the parentheses:
modelNum(4)>chainId(1):ntNum(4)insCode(1):[ntName(3)]baseName(1)
- modelNum(4): the model number is up to 4 digits, right-justified, with each leading space replaced by a dot. If no model number is available, as is the case for 1ehz (and virtually all other x-ray crystal structures in the PDB), it is written as .... (4 dots).
- chainId(1): the chain id is 1-char long, with space replaced by underscore.
- ntNum(4): the nt residue number, handled as for the model number.
- insCode(1): insertion code, handled as for the chain id.
- ntName(3): the nt residue name is up to 3-char long, right-justified, with each leading space replaced by a dot.
- baseName(1): the base name is 1-char long, mapped from ntName(3) following $X3DNA/config/baselist.dat. Note that modified nucleotides are put in lower case to distinguish them from the canonical ones; for example, M2G is mapped to g.
For the complementary base in a bp, the order of the 6 fields is reversed — see examples above. To see the full list of nts in a PDB data file, run: find_pair -s 1ehz.pdb stdout (here using 1ehz as an example).
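For readers who want to post-process 3DNA output, the six-field pattern described above lends itself to a regular expression. The following Python sketch is my own reading of the format, not code from 3DNA; the class and function names are hypothetical, and it assumes the fixed field widths given above (allowing a minus sign in the residue number as an extra assumption).

```python
import re
from typing import NamedTuple, Optional

class Nucleotide(NamedTuple):
    model: Optional[int]  # None when written as '....'
    chain: str            # '_' is mapped back to a space
    number: int
    ins_code: str         # '_' is mapped back to a space
    name: str             # residue name without the leading dots
    base: str             # one-letter base name
    modified: bool        # lower-case base names mark modified nucleotides

# modelNum(4)>chainId(1):ntNum(4)insCode(1):[ntName(3)]baseName(1)
_NT = re.compile(r"^([.\d]{4})>(.):([.\d-]{4})(.):\[(.{3})\](.)$")
# Same six fields in reversed order, as used for the complementary base
_NT_REV = re.compile(r"^(.)\[(.{3})\]:([.\d-]{4})(.):(.)<([.\d]{4})$")

def _build(model, chain, number, ins, name, base):
    model = model.lstrip(".")
    return Nucleotide(
        model=int(model) if model else None,
        chain=" " if chain == "_" else chain,
        number=int(number.lstrip(".")),
        ins_code=" " if ins == "_" else ins,
        name=name.lstrip("."),
        base=base,
        modified=base.islower(),
    )

def parse_nt(spec):
    """Parse a leading-base specification such as '....>A:...1_:[..G]G'."""
    match = _NT.match(spec)
    if match is None:
        raise ValueError(f"not a nucleotide specification: {spec!r}")
    return _build(*match.groups())

def parse_nt_reversed(spec):
    """Parse a complementary-base specification such as 'C[..C]:..72_:A<....'."""
    match = _NT_REV.match(spec)
    if match is None:
        raise ValueError(f"not a reversed nucleotide specification: {spec!r}")
    base, name, number, ins, chain, model = match.groups()
    return _build(model, chain, number, ins, name, base)
```

For example, `parse_nt("....>A:..26_:[M2G]g")` yields residue number 26 on chain A with name M2G, flagged as modified because its base name is in lower case.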
Specification of a base pair
The pattern of a bp is M-xyz-N, where M and N are 1-char base names (as in aforesaid field #6), and the three characters xyz have the following meaning:
z — the sign of the dot product of the z-axes of the M and N base reference frames. It is positive (+) if the two z-axes point in similar directions, as in Hoogsteen or reverse Watson-Crick bps. Conversely, it is negative (-) when the two z-axes point in opposite directions, as in the canonical Watson-Crick and Wobble bps. See figure below:

y — it is - if M and N are in a so-called Watson-Crick geometry (the two y-axes of the M and N base reference frames are anti-parallel, so are the two z-axes, whilst the two x-axes are parallel), e.g., the G-U Wobble pair; otherwise, *.
x — it is - for Watson-Crick bps, otherwise, *.
By design, Watson-Crick bps would be of the pattern M-----N, Wobble bps M-*---N, and non-canonical bps M-**+-N or M-**--N. Thus by browsing through the 3DNA output, users can readily identify these three bp types.
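The three patterns above are easy to tell apart programmatically. Below is a small Python sketch, assuming bp lines laid out exactly as in the 1ehz examples earlier in this post; the function name and the category labels are mine, not part of 3DNA. It also derives the shorthand form (e.g., G-C or A+A) from the z character.

```python
import re

# Base M, then '-', x, y, z, '-', then base N, anchored by the ']' and '['
# that close and open the two residue-name fields.
_BP = re.compile(r"\](.)-([-*])([-*])([+-])-(.)\[")

def classify_bp(line):
    """Return (kind, shorthand) for one base-pair line from 3DNA output."""
    match = _BP.search(line)
    if match is None:
        raise ValueError(f"no base-pair pattern found in: {line!r}")
    m_base, x, y, z, n_base = match.groups()
    xyz = x + y + z
    if xyz == "---":
        kind = "Watson-Crick"
    elif xyz == "*--":
        kind = "Wobble"
    else:
        kind = "non-canonical"
    return kind, f"{m_base}{z}{n_base}"

EXAMPLES_1EHZ = [
    "....>A:...1_:[..G]G-----C[..C]:..72_:A<....",
    "....>A:...4_:[..G]G-*---U[..U]:..69_:A<....",
    "....>A:...9_:[..A]A-**+-A[..A]:..23_:A<....",
    "....>A:..15_:[..G]G-**+-C[..C]:..48_:A<....",
    "....>A:..26_:[M2G]g-**--A[..A]:..44_:A<....",
]

for line in EXAMPLES_1EHZ:
    print(classify_bp(line))
```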
The shortened form is represented as MzN; following the aforementioned notation, it can be either M-N or M+N. The relative direction of the two z-axes is critical in affecting 3DNA-calculated bp (and step) parameters, as detailed in the 2003 3DNA NAR paper:
To calculate the six complementary base pair parameters of an M–N pair (Shear, Stretch, Stagger, Buckle, Propeller and Opening), where the two z‐axes run in opposite directions, the reference frame of the complementary base N is rotated about the x2‐axis by 180°, i.e. reversing the y2‐ and z2‐axes in Figure 2a. Under this convention, if the base pair is reckoned as an N–M pair, rather than an M–N pair, the x‐axis parameters (Shear and Buckle) reverse their signs. For an M+N pair, e.g. the Hoogsteen A+U in Figure 2b, the x2‐, y2‐ and z2‐axes do not change sign; thus all six parameters for an N+M pair are of opposite sign(s) from those for an M+N pair.
The M-N and M+N bp designation is unique to 3DNA. In combination with the corresponding 6 bp parameters (shear, stretch, stagger, buckle, propeller, and opening), 3DNA provides a rigorous description of all possible bps. This contrasts with, and complements, the conventional Saenger scheme and the 3-edge based Leontis/Westhof notation.
The 3DNA M-N vs M+N bp designation is base-centric, without regard to the sugar-phosphate backbone. The chi (χ) torsion angle, which characterizes the relative base/sugar orientation, can be in either the anti or the syn conformation; thus similar backbones can accommodate either M-N or M+N.
