It gives me great pleasure to announce that the 3DNA/DSSR project is now funded by the NIH R24GM153869 grant, titled "X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids". I am deeply grateful for the opportunity to continue working on a project that has basically defined who I am. It was a tough time during the funding gap over the past few years. Nevertheless, I have experienced and learned a lot, and witnessed miracles enabled by enthusiastic users.

Since late 2020 when I lost my R01 grant, DSSR has been licensed by the Columbia Technology Ventures (CTV). I appreciate the numerous users (including big pharma) who purchased a DSSR Pro License or a DSSR Basic paid License. Thanks to the NIH R24GM153869 grant, we are pleased to provide DSSR Basic free of charge to the academic community. Academic Users may submit a license request for DSSR Basic or DSSR Pro by clicking "Express Licensing" on the CTV landing page. Commercial users may inquire about pricing and licensing terms by emailing techtransfer@columbia.edu, copying xiangjun@x3dna.org.

The current version of DSSR is v2.4.5-2024sep24, which contains miscellaneous bug fixes (e.g., chain id with > 4 chars) and minor improvements. This release synchronizes with the new R24 funding, which will bring the project to the next level. All existing users are encouraged to upgrade their installation.

Lots of exciting things will happen for the project. The first thing is to make DSSR freely accessible to the academic community. In the past couple of weeks, CTV has already issued quite a few DSSR Basic Academic licenses to users from all over the world. So the demand is high, and it will become stronger as more academic users become aware of DSSR. I'm closely monitoring the 3DNA Forum, and am always ready to answer users' questions.

I am committed to making DSSR a brand that stands for quality and value. By virtue of its unmatched functionality, usability, and support, DSSR saves users a substantial amount of time and effort when compared to other options. My track record throughout the years has unambiguously demonstrated my dedication to this solid software product.


DSSR Basic contains all features described in the three DSSR-related papers, and includes the originally separate SNAP program (still unpublished) for analyzing DNA/RNA-protein complexes. The Pro version integrates the classic 3DNA functionality, plus advanced modeling routines, with email/Zoom/phone support.

---

Restraint optimization of DNA backbone geometry using PHENIX

3DNA can build DNA/RNA structures with a precise base but approximate sugar-phosphate backbone geometry. In the 2003 3DNA-NAR paper, Table 3 of the section “Structures built with sugar–phosphate backbone” lists “root mean square deviation (in Å) between rebuilt 3DNA models and experimental DNA structures” for three representative DNA structures (in A-form, B-form, and a protein-DNA complex). It was noted that the RMSD of reconstructed versus observed base positions is virtually zero and that for both base and backbone coordinates is <0.85 Å, even for the 146 bp nucleosomal DNA structure.

The backbone geometry is approximate because 3DNA uses a fixed sugar-phosphate conformation (in A-DNA, B-DNA or RNA) that is attached to the corresponding bases in the model building process. The most noticeable effect is the long O3′(i)···P(i+1) bond that connects consecutive nucleotides along a chain. The imprecise structure was intended as a starting point for other objectives (e.g., all-atom molecular dynamics simulations) that are out of the design scope of 3DNA. Nevertheless, over the years, I have been concerned with the overlong O3′—P distance issue. I tried but failed to find a satisfying third-party (command-line driven) tool that can perform restraint optimization of the sugar-phosphate backbone geometry while keeping base atoms fixed.

The problem was finally solved after I attended the 43rd Mid-Atlantic Macromolecular Crystallography Meeting held at Duke University a few months ago. At the meeting, I had the opportunity to talk to several members of the PHENIX team. In particular, Jeff Headd revised the geometry_minimization component of PHENIX to do the trick. Here is the email reply from Jeff, using a 3DNA-generated DNA duplex (355d-3dna.pdb) as an example (see full details below):

Here’s a first go at refining just the backbone atoms of your input DNA model. You’ll need the most recent nightly build of Phenix (dev-1395 would work) and then run:

phenix.geometry_minimization 355d-3dna.pdb min.params

using the attached min.params file.

What I specify in the params file is to only move the backbone atoms, which I’ve done with a selection. You can modify the atoms that are allowed to move to your liking.

The only other change was to allow longer distance linkages, as some of the backbone linkages start quite far apart.

The content of file min.params is:

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"

To make the story complete, given below is the step-by-step procedure, using 355d, a B-DNA dodecamer at 1.4 Å resolution as an example. The corresponding PDB file is named 355d.pdb.

find_pair 355d.pdb stdout | analyze stdin
x3dna_utils cp_std bdna
rebuild -atomic bp_step.par 355d-3dna.pdb
# the rebuilt structure is called '355d-3dna.pdb'

# with Phenix dev-1395 and above
phenix.geometry_minimization 355d-3dna.pdb min.params
# the optimized structure is called '355d-3dna_minimized.pdb'

# to verify:
find_pair 355d-3dna.pdb stdout | analyze stdin
find_pair 355d-3dna_minimized.pdb stdout | analyze stdin
# check files '355d-3dna.out' and '355d-3dna_minimized.out'
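
As a complementary check on the O3′(i)···P(i+1) linkages themselves, the distances can be measured directly from the coordinate files. The following Python sketch is not part of 3DNA or PHENIX; the function name o3p_distances is hypothetical, and the code assumes a single-model PDB file with standard fixed-column ATOM/HETATM records and atom names O3' and P. A covalent O3′–P bond is about 1.6 Å long, so markedly larger values flag the overlong linkages discussed above.

```python
import math


def o3p_distances(pdb_lines):
    """Return (residue_i, residue_i+1, distance) for each O3'(i)...P(i+1) linkage.

    Residues are taken in file order; a linkage is only reported when both
    residues are on the same chain. Hypothetical helper, not a 3DNA routine.
    """
    residues = []  # residue keys (chain, resSeq+iCode) in order of appearance
    atoms = {}     # residue key -> {"O3'": (x, y, z), "P": (x, y, z)}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        if name not in ("O3'", "P"):
            continue
        key = (line[21], line[22:27])
        if key not in atoms:
            atoms[key] = {}
            residues.append(key)
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms[key][name] = xyz

    linkages = []
    for prev, nxt in zip(residues, residues[1:]):
        if prev[0] != nxt[0]:
            continue  # do not link across chains
        if "O3'" in atoms[prev] and "P" in atoms[nxt]:
            dist = math.dist(atoms[prev]["O3'"], atoms[nxt]["P"])
            linkages.append((prev, nxt, dist))
    return linkages
```

Running it on the lines of '355d-3dna.pdb' and of '355d-3dna_minimized.pdb' and comparing the largest distance in each list gives a quick before/after view of the optimization.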

The three key files mentioned above are provided here for your verification:

Finally, the following figure illustrates the B-DNA dodecamer duplex in experimental (left), 3DNA-generated (middle) and PHENIX-optimized (right) coordinates. Note that the disconnected O3′–P linkages in the 3DNA-rebuilt structure (two cases are marked by red dots, see bottom of the middle image), which result from overlong distances, are fixed by the PHENIX restraint optimization.

[Figure: 355d experimental structure (left), 3DNA-rebuilt structure (middle), PHENIX-optimized structure (right)]
---

Note added on 2016-11-11: In the min.params file, the selection is in one long line. For illustration purposes, the selection section (see below) is split into several short lines in the blog post. However, PHENIX requires ending backslashes (\) to combine the split lines into a single grammatical unit. I was not aware of this strict rule, and failed to add the ending backslashes in the original post. Thanks to Oleg Sobolev from the PHENIX team for bringing this omission to my attention. Note that the content of the min.params file itself did not have a problem, and thus no change was made to it.

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"
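
One way to sidestep the line-continuation issue altogether is to generate min.params programmatically, with the selection on a single line. The Python sketch below is only an illustration: the names BACKBONE_ATOMS, pdb_atom_field and build_min_params are hypothetical, and the padding rule (one leading space, name left-justified in three columns) is an assumption that holds for the ten backbone atom names listed above, not for PDB atom names in general.

```python
# Backbone atoms allowed to move, as listed in the min.params file above.
BACKBONE_ATOMS = ["P", "OP1", "OP2", "O5'", "C5'", "C4'",
                  "O4'", "C3'", "O3'", "C2'"]


def pdb_atom_field(name):
    """Pad an atom name to the 4-character PDB field, e.g. 'P' -> ' P  '.

    Assumption: names of at most three characters with a one-letter element.
    """
    return f" {name:<3}"


def build_min_params(atoms=BACKBONE_ATOMS, link_distance_cutoff=7.0):
    """Return the text of a min.params file with a one-line selection."""
    selection = " or ".join(f'name "{pdb_atom_field(a)}"' for a in atoms)
    return (
        "pdb_interpretation {\n"
        f"  link_distance_cutoff = {link_distance_cutoff}\n"
        "}\n"
        f"selection = {selection}\n"
    )
```

Writing the returned text to a file named min.params should reproduce the settings shown above without any trailing backslashes; the list of atoms can be edited to change which atoms are allowed to move.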


---

Detection of helical junctions in nucleic acid structures

One of DSSR’s noteworthy features is the auto-detection of helical junctions in nucleic acid structures, be it RNA, DNA, or chimeric DNA/RNA, consisting of one or multiple chains. Helical junctions are created at the interface of three or more stems composed of canonical pairs (Watson-Crick A–T/U and G–C, or wobble G–U). A three-way junction model is illustrated below (copied from Figure 1 of the Bindewald et al. RNAJunction paper). Note that the three chains are each continuous (i.e., consecutive nts are covalently connected) and, together with the three inner bps, form a loop in the middle. Here, the three-way junction is of type [3×2×3], and the loop is composed of a total of 3×2+3+2+3 = 14 nts.

definition of a three-way junction
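
The nucleotide count of a junction loop follows directly from its type: each branch contributes the two nts of its closing base pair, plus the connecting nts listed in the brackets. The short Python sketch below just restates that arithmetic; the function name junction_nts is hypothetical and the code is not taken from DSSR.

```python
def junction_nts(junction_type):
    """Total nts in a junction loop, e.g. '[3x2x3]' -> 3*2 + 3 + 2 + 3 = 14.

    Accepts either 'x' or the multiplication sign as separator.
    """
    inner = junction_type.strip().strip("[]").replace("×", "x")
    connecting = [int(n) for n in inner.split("x")]
    # Two nts per closing base pair, one closing pair per branch.
    return 2 * len(connecting) + sum(connecting)
```

The same formula is consistent with the counts in the DSSR output excerpts in this post: 8 nts for [0x0x0x0], 10 nts for [0x0x1x1], and 16 nts for [2x1x5x0].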

DSSR automatically detects all existing helical junctions in a nucleic acid structure, as illustrated by the following examples.

1l6b [all DNA Holliday junction structure of d(CCGGTACm5CGG)]

This is a simple four-way junction of type [0×0×0×0], where all bases are paired, leaving no connecting nts. The related portion of DSSR output is:

List of 1 junction(s)
   1 4-way junctions: 8 nts; [0x0x0x0]; linked by [#1, #2, #4, #3]
       1:A.DA6+1:A.DC7+2:B.DG14+2:B.DT15+2:A.DA6+2:A.DC7+1:B.DG14+1:B.DT15 [ACGTACGT]
       0 nts junction ; 1:A.DA6-->1:A.DC7 [AC]
       0 nts junction ; 2:B.DG14-->2:B.DT15 [GT]
       0 nts junction ; 2:A.DA6-->2:A.DC7 [AC]
       0 nts junction ; 1:B.DG14-->1:B.DT15 [GT]

1L6B: all DNA Holliday junction

Technically, note the following points:

  • The four-way junction is derived from the biological assembly 1 (PDB file 1l6b.pdb1), which contains two copies of the asymmetric unit, delineated by MODEL/ENDMDL. By default, DSSR/3DNA works on one structure at a time, corresponding to the first structure/model in a given PDB or mmCIF file. To take the biological assembly as a whole, and to avoid confusion with MODEL/ENDMDL delineated NMR entries, the ENDMDL record of the first model is commented out in the file (1l6b.pdb1), as below:

#ENDMDL                                                                          
MODEL        2                                                                  
  • With the modified PDB file 1l6b.pdb1, the DSSR command can be run as x3dna-dssr -i=1l6b.pdb1, with the output going to stdout.
  • The simplified schematic block png image was generated with the command below to create the Raster3D .r3d file (1l6b.r3d), which was then ray-traced using PyMOL.
blocview -r 1l6b.r3d 1l6b.pdb1
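
The manual edit described in the first point above (commenting out the ENDMDL record of the first model) can also be scripted. The Python sketch below is a hypothetical helper, not part of DSSR/3DNA; it assumes the file is read as a list of lines and that prefixing the record with '#' is sufficient, as in the excerpt shown.

```python
def comment_out_first_endmdl(lines):
    """Return a copy of `lines` with the first ENDMDL record turned into '#ENDMDL'.

    Later ENDMDL records are left untouched, so that the first two models
    are read together as one structure.
    """
    out = []
    done = False
    for line in lines:
        if not done and line.startswith("ENDMDL"):
            line = "#" + line
            done = True
        out.append(line)
    return out
```

A typical use would be to read 1l6b.pdb1 with readlines(), pass the list through the function, and write the result back before running x3dna-dssr on it.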

1egk [a four-way DNA/RNA junction]

This four-way junction consists of both DNA and RNA chains. Here the helical junction may not be that obvious by directly looking at the 3D image.

List of 1 junction(s)
   1 4-way junctions: 10 nts; [0x0x1x1]; linked by [#3, #-1, #4, #5]
       B.DC37+B.DT38+B.DA45+B.DC46+C.G109+C.A110+C.U111+D.DA130+D.DG131+D.DG132 [CTACGAUAGG]
       0 nts junction ; B.DC37-->B.DT38 [CT]
       0 nts junction ; B.DA45-->B.DC46 [AC]
       1 nts junction C.A110 [A]; C.G109-->C.U111 [GAU]
       1 nts junction D.DG131 [G]; D.DA130-->D.DG132 [AGG]

1EGK: four-way DNA/RNA junction

1ehz [yeast phenylalanine tRNA]

As shown below, DSSR correctly detects the classic L-shaped 3D structure and the cloverleaf 2D structure of a tRNA.

List of 1 junction(s)
   1 4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]
       A.U7+A.U8+A.A9+A.2MG10+A.C25+A.M2G26+A.C27+A.G43+A.A44+A.G45+A.7MG46+A.U47+A.C48+A.5MC49+A.G65+A.A66 [UUAgCgCGAGgUCcGA]
       2 nts junction A.U8+A.A9 [UA]; A.U7-->A.2MG10 [UUAg]
       1 nts junction A.M2G26 [g]; A.C25-->A.C27 [CgC]
       5 nts junction A.A44+A.G45+A.7MG46+A.U47+A.C48 [AGgUC]; A.G43-->A.5MC49 [GAGgUCc]
       0 nts junction ; A.G65-->A.A66 [GA]

1EHZ: yeast phenylalanine tRNA

2fk6 [RNAse Z/tRNA(Thr) complex]

In a recent paper, “Predicting Helical Topologies in RNA Junctions as Tree Graphs” by Laing et al., this PDB entry was selected in Table 1 as containing a three-way junction. However, DSSR fails to detect any junction in this structure, even though the program does find co-axial stacks. It turns out that the PDB entry 2fk6 does not possess the anti-codon stem/loop, thus nts C25 and G46 are not covalently connected. While three-way junctions may be defined differently, the DSSR result follows the above-mentioned chain-continuity requirement.

2FK6: RNAse Z/tRNA(Thr) complex with chain break

Overall, DSSR can consistently find all helical junctions in a given nucleic acid structure. Try DSSR on a ribosomal structure, and you may well appreciate what it reveals. Moreover, it is straightforward to apply the program to all RNA/DNA-containing entries in the PDB via a script.
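
For such scripted surveys, the junction summary lines can be pulled out of the DSSR text output with a small parser. The Python sketch below is an assumption-laden illustration: the regular expression is modelled only on the output excerpts shown in this post (e.g. "1 4-way junctions: 16 nts; [2x1x5x0]; ..."), the names JUNCTION_RE and parse_junctions are hypothetical, and other DSSR versions may format these lines differently.

```python
import re

# Matches summary lines such as:
#    1 4-way junctions: 16 nts; [2x1x5x0]; linked by [#1, #2, #3, #4]
JUNCTION_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)-way junctions?:\s+(\d+)\s+nts;\s+\[([0-9x]+)\]"
)


def parse_junctions(dssr_output):
    """Extract junction summaries from DSSR text output (format assumed)."""
    found = []
    for line in dssr_output.splitlines():
        match = JUNCTION_RE.match(line)
        if match:
            found.append({
                "index": int(match.group(1)),
                "ways": int(match.group(2)),
                "nts": int(match.group(3)),
                "type": [int(n) for n in match.group(4).split("x")],
            })
    return found
```

An empty list is returned when no junction line is present, which is what one would expect for an entry like 2fk6.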


---

Drawing an RNA secondary structure from its 3D coordinates

Given the primary sequence of an RNA molecule, there are numerous methods for predicting its secondary (2D) structures. To judge their accuracy, three-dimensional (3D) RNA structures solved experimentally by X-ray or NMR as deposited in the PDB are often used as benchmarks. DSSR is a handy tool to derive an RNA 2D structure from its 3D coordinates in PDB or mmCIF format. The 2D structure is specified in the dot-bracket notation (dbn), which can be fed directly into drawing programs such as VARNA for interactive display and easy generation of publication quality 2D diagrams.

Over the past few months, I’ve been asked a few times about the details of how the diagrams in the DSSR post were created. The answer is really simple, and has already been mentioned above and in the post. Here are two concrete examples to show how the process works.

1zc5 (structure of the RNA signal essential for translational frame shifting in HIV-1)

This is the structure used in the VARNA paper. With the PDB file named 1zc5.pdb, the DSSR program can be run like this:

x3dna-dssr -i=1zc5.pdb

The output is sent to stdout by default, with the following three lines towards the end:

>1zc5-A #1 RNA with 41 nts
GGCGAUCUGGCCUUCCUACAAGGGAAGGCCAGGGAAUUGCC
(((((((((((((((((....)))))))))))...))))))

Simply copy and paste the last two lines (sequence and the 2D structure in dbn notation) into the Seq: and Str: fields of the VARNA demo page, and the diagram will be updated automatically, as shown in the screenshot:

1zc5-dssr-varna

1ehz (crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution)

This example (1ehz.pdb) is used to illustrate tRNA’s classic cloverleaf 2D structure. The related command and result are:

x3dna-dssr -i=1ehz.pdb -o=1ehz.out

# the output is sent to file '1ehz.out'
# towards its end are the following 3 lines

>1ehz-A #1 RNA with 76 nts
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGuPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....

I’ve used a local copy of the Java Web Start version of VARNA (VARNA-WebStart.jnlp) to generate the following 2D diagram. Here, in addition to the customized title, I have set the number period to 5 nts, adopted the simple base-pair style, and manually adjusted the T arm (upper right corner) to make the long line connecting G19 and C56 a bit more unobtrusive. Right-click to see the context menu.

Note that the G19—C56 pair creates a pseudo-knot (specified by the matching [] pair in the dbn notation above) in tRNA. I was not aware of this salient feature from previous knowledge of relevant literature. It was indeed a surprise when I first saw it in the 2D diagram.

1ehz-dssr-varna
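
To see which nucleotides the brackets in a dbn string actually pair, the notation can be decoded with one stack per bracket type. The Python sketch below is a generic illustration (the function name dbn_to_pairs is hypothetical, and this is not DSSR's own code); it assumes the common convention that ( ), [ ], { } and < > each mark an independently nested set of pairs.

```python
def dbn_to_pairs(dbn):
    """Decode a dot-bracket string into a sorted list of 1-based (i, j) pairs."""
    openers = {")": "(", "]": "[", "}": "{", ">": "<"}
    stacks = {opener: [] for opener in openers.values()}
    pairs = []
    for pos, ch in enumerate(dbn, start=1):
        if ch in stacks:
            stacks[ch].append(pos)          # remember the opening position
        elif ch in openers:
            stack = stacks[openers[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        # any other character (e.g. '.') is treated as unpaired
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket(s)")
    return sorted(pairs)
```

Applied to the 1ehz string above, the matching [ ] brackets decode to the pair (19, 56), i.e. the pseudo-knot-forming pair between G19 and C56.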

As illustrated above, DSSR serves well as a bridge from RNA 3D to 2D structures. Give DSSR a try, and you will find that the program actually has much more to offer!


---

DSSR identifies kink-turns!

As of the beta-r14-on-20130626 release, DSSR has the functionality to identify kink-turns and reverse k-turns given an RNA structure in PDB format.

The k-turn motif was first described by Klein et al. (2001) in the paper The kink-turn: a new RNA secondary structure motif, based on analyses of the H. marismortui large ribosomal unit. It turns out to be a widespread structural motif, now with a dedicated k-turn database hosted by the Lilley laboratory.

Geometrically, a k-turn is composed of an asymmetric internal loop, with a sharp kink between the two framing helices and characteristic loop features (including at least one sheared G-A pair and A-minor interactions). Overall, the k-turn is a complicated motif, and I am not aware of any published method or available software for its auto-detection.

Previous releases of DSSR have built up all the necessary components to detect key features of a k-turn. Over the past few weeks, I have been focusing on connecting the dots to implement an algorithm for its auto-identification. As of beta-r14-on-20130626, DSSR can locate ‘simple’ k-turns or reverse k-turns from an RNA structure in PDB format. I understand the subtleties and variations of k-turns, and will refine the algorithm in future releases of DSSR.

Without putting k-turns under its umbrella, DSSR appears incomplete in its functionality. Hopefully, detection of k-turns will help DSSR gain more attention from the RNA structure community.


---

DSSR, what is it and why bother?

Over the past six months or so [1], I’ve been focusing mostly on developing DSSR, a new addition to the 3DNA suite of programs. So what is DSSR, specifically? Why did I bother to create it? How would it be relevant to the nucleic acid structure community?

Literally, DSSR stands for Defining the (Secondary) Structures of RNA [2]. Starting from an RNA structure in PDB format, DSSR employs a set of simple criteria to identify all existent base pairs (bp): both canonical Watson–Crick (WC) pairs and non-canonical pairs with at least one H-bond, made up of normal or modified bases, regardless of tautomeric or protonation state. The classification is based on the six standard rigid-body bp parameters (shear, stretch, stagger, propeller, buckle, and opening), which together rigorously quantify the spatial disposition of any two interacting bases. Moreover, the program characterizes each bp by commonly used names (WC, reverse WC, Hoogsteen, reverse Hoogsteen, wobble, sheared, imino, Calcutta, and dinucleotide platform), the Saenger classification scheme of 28 types, and the Leontis-Westhof nomenclature of 12 basic geometric classes. DSSR also checks for non-pairing interactions (H-bonds or base stacking).

DSSR detects triplets and even higher-order base associations by searching horizontally in the plane of the associated bp for further H-bonding interactions. The program determines helical regions by exploring each bp’s neighborhood vertically for base-stacking interactions, regardless of backbone connection (e.g., coaxial stacking of helices or pseudo helices). Moreover, each helix/stem is characterized by a least-squares fitted helical axis to allow for easy quantification of relative helical geometry. DSSR calculates commonly used backbone (including the virtual η/θ) torsion angles, classifies the main chain backbone into BI/BII conformation and the sugar into C2’/C3’-endo like pucker, identifies A-minor interactions (types I and II), ribose zippers, G quartets, hairpin loops, kissing loops, bulges, internal loops and multi-branch loops (junctions). It also detects the existence of pseudo-knots, and outputs RNA secondary structure in the dot-bracket notation.

Experienced 3DNA users may notice that some of the above-outlined functionality (e.g., calculation of torsion angles, identification of all pairs, higher order base associations, and helices) has existed for over a decade. Over the years, I have written several posts (see What can 3DNA do for RNA structures?, and links therein) to advocate 3DNA’s applications in RNA structural analysis. Nevertheless, 3DNA has never been widely used in the RNA structure community, for various possible reasons: (1) the misconception that 3DNA is only for DNA (but not RNA); (2) the basic functionality is split into two programs (find_pair and analyze), and needs to be run several times with different options (default find_pair, and with -s, or -p). Thus even though 3DNA is applicable to RNA structures, it is unnecessarily complicated and confusing (especially to new 3DNA users); (3) 3DNA is command-line driven, consisting of many C programs and scripts, with different styles in specifying options. It has the ‘reputation’ of being powerful, but cryptic and hard to use.

I’ve created DSSR from scratch to take these factors into consideration, by drawing on my extensive experience in supporting 3DNA, an increased knowledge of RNA structures, and refined C programming skills. Implemented in ANSI C as a stand-alone command-line program, DSSR is self-contained. Its executables (on MacOS X, Linux and Windows) have zero runtime dependencies. No setup is necessary; simply put the program into a folder of your choice (preferably one on your command PATH), and it should work. DSSR has sensible default settings and an intuitive output, making it directly accessible to a much broader audience than 3DNA per se. Since its initial release on March 3, 2013, I’ve yet to hear of any installation or usage problem. So far, all reported bugs have been verified and fixed promptly. The latest beta release has been checked against all nucleic-acid-containing entries in the PDB, without any known issues.

Overall, DSSR consolidates, refines, and significantly extends 3DNA’s functionality for RNA structural analysis. There is more to DSSR than its simple interface suggests. Piecewise, DSSR may appear to offer nothing new, yet taken together, it has unique features not available anywhere else. Its value will be gradually appreciated as DSSR becomes more widely used by the community. Want to know if your structure contains any Hoogsteen pair, sheared G•A pair, or a dinucleotide platform? DSSR can check it for you, easily.

DSSR-beta already possesses all the basic functionality and has been well tested to serve as a handy tool for RNA structural analysis. I stand firmly behind DSSR, and strive to continuously improve the program. Give it a try, and report back on the 3DNA Forum any issues you have. As always, I respond quickly and concretely to all questions posted there. I hope you enjoy using DSSR as much as I enjoy creating and supporting it!

[1] This post was published on March 29, 2013, shortly after the beta releases of DSSR [note added on March 15, 2014].

[2] DSSR also works for DNA, or DNA-protein complexes, as far as the basic functionality is concerned. Moreover, the acronym could have two other possible interpretations, as would be obvious when the program gains a wider recognition.


---

Named base pairs

In the field of nucleic acid structures, especially in the ‘RNA world’, we often hear of named base pairs (bp). Among those, the Watson-Crick (WC) A–U and G–C bps (see figure below) are by far the most common.

Watson-Crick base pairs

Reversed WC (rWC) base pairs

Closely related to the WC bps are the so-called reversed WC (rWC) bps, where the relative orientations of the glycosidic bonds are reversed; instead of being on the same side of the bases as in WC bps shown above, they are now on opposite sides in rWC bps as shown below. According to the Leontis-Westhof (LW) bp classification scheme, the rWC bps belong to trans WC/WC. Following Saenger’s numbering, the rWC A+U bp corresponds to XXI, and the rWC G+C bp XXII.

In the figures below, the name of each type of bp and its LW & Saenger designations (separated by ‘;’) are noted under the corresponding image. All images are generated with 3DNA; for easy comparison, each bp is oriented in the reference frame of the leading base.

Reversed WC A+U pair: trans WC/WC; XXI
Reversed WC G+C pair: trans WC/WC; XXII

Hoogsteen and reversed Hoogsteen base pairs

The next most famous one is the Hoogsteen A+U bp, which also has a reverse variant, i.e., the rHoogsteen A–U bp (see figure below). Now the major groove edge of A, termed the Hoogsteen edge by LW, is used for pairing with U.

Hoogsteen A+U pair: cis Hoogsteen/WC; XXIII
Reversed Hoogsteen A–U pair: trans Hoogsteen/WC; XXIV

The G–U Wobble base pair

First proposed by Crick in 1966 to account for the degeneracy in codon–anticodon pairing, the Wobble bp is an essential component (in addition to the WC bps) in forming double helical RNA secondary structures.

Wobble G–U pair: cis WC/WC; XXVIII

The sheared G–A base pair

Sheared G–A is a commonly found non-WC bp in both DNA and RNA structures. Noticeably, tandem sheared G–A bps introduce distinct stacking geometry. Here G uses its minor groove edge, termed the sugar edge by LW, to pair with the Hoogsteen edge of A.

Sheared G–A pair: trans Sugar/Hoogsteen; XI

Dinucleotide platforms

Dinucleotide platforms are formed via side-by-side pairing of adjacent bases; the most common of which are GpU and ApA. Here the sugar (minor-groove) edge of the 5′ base interacts with the Hoogsteen (major-groove) edge of the 3′ base. Since there is only one base-base H-bond in dinucleotide platforms, no Saenger classification is available. In 3DNA output, the GpU dinucleotide platform is designated as G+U, and ApA as A+A.

GpU dinucleotide platform: cis Sugar/Hoogsteen; n/a
ApA dinucleotide platform: cis Sugar/Hoogsteen; n/a

Other named base pairs

There exist other named bps in RNA literature, e.g., G⋅A imino, A⋅C reverse Hoogsteen, G⋅U reverse Wobble etc. In my experience, they are (much) less commonly used than the ones illustrated above.


---

Classification of dinucleotide steps into A- and B- and TA-DNA

Since v1.5 or even earlier, 3DNA has provided an automatic classification of a dinucleotide step into A-, B- or TA-DNA conformation. Figure 5 of the 2003 3DNA Nucleic Acids Research paper (NAR03) shows three sets of scatter plots (helical inclination and x‐displacement, dimer step Roll and Slide, and the projected phosphorus z coordinates Zp and Zp(h)) to differentiate the A-, B- and TA-DNA dinucleotide steps.

Classification of A-, B- and TA-DNA dinucleotide steps

Among the criteria tested, the most discriminative ones are the projected phosphorus z coordinates, Zp in the middle step frame (see figure below), and Zp(h) defined similarly but in the middle helical frame.

definition of the Zp parameter

Over the years, I have received many questions regarding the datasets used in generating Figure 5 of NAR03. Back in August 2006, a user asked for IDs of the TA-DNA structures (see DNA standards/statistics using 3DNA). In April 2007, another user requested the same TA-DNA dataset. Early this year, a user asked for 3DNA’s A-DNA definition. More recently, yet another user asked about the DNA set used for the analysis presented in Figure 5 of the NAR 2003 paper.

I am glad to see that nearly a decade after the NAR03 publication, the user community is still interested in the details of the work. So I decided to dig into my archive for the original data files and scripts used to generate Figure 5 of NAR03. It was not an easy journey: just releasing the data files and scripts is not enough; I wanted to verify that they work together as intended in today’s computing environment. Luckily, I was finally able to get to the bottom of the issues. The details are in the post Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper. The tarball file named 3DNA-NAR03-Fig5.tar.gz is available by clicking the link.


---

Rectangular block expressed in PDB format

As noted in the post Rectangular block expressed in MDL molfile format, I added the -mol option (in v2.1) to convert 3DNA’s native alchemy to the better-supported MDL molfile format, to make the characteristic schematic representations more widely accessible. Along the same line, I have recently further augmented alc2img with the -pdb option to transform alchemy to the PDB format.

While the macromolecular PDB format is certainly not convenient for specifying linkage details of small molecules, it is nevertheless better documented and far more widely supported than molfile or alchemy in currently available molecular viewers. For example, the PDB format is consistently supported in Jmol, PyMOL, RasMol, DeepView, and UCSF Chimera. Moreover, the PDB format does have the CONECT section to provide information on atomic connectivity:

The CONECT records specify connectivity between atoms for which coordinates are supplied. The connectivity is described using the atom serial number as shown in the entry. CONECT records are mandatory for HET groups (excluding water) and for other bonds not specified in the standard residue connectivity table.

The alc2img -pdb option takes advantage of the CONECT records and specifies all ‘bond’ linkages explicitly. The usage is very simple — take the standard base-pair rectangular block file (‘Block_BP.alc’) as an example, the conversion can be performed as below:

alc2img -pdb Block_BP.alc Block_BP.pdb

Content of ‘Block_BP.alc’

   12 ATOMS,    12 BONDS
    1 N      -2.2500   5.0000   0.2500
    2 N      -2.2500  -5.0000   0.2500
    3 N      -2.2500  -5.0000  -0.2500
    4 N      -2.2500   5.0000  -0.2500
    5 C       2.2500   5.0000   0.2500
    6 C       2.2500  -5.0000   0.2500
    7 C       2.2500  -5.0000  -0.2500
    8 C       2.2500   5.0000  -0.2500
    9 C      -2.2500   5.0000   0.2500
   10 C      -2.2500  -5.0000   0.2500
   11 C      -2.2500  -5.0000  -0.2500
   12 C      -2.2500   5.0000  -0.2500
    1     1     2
    2     2     3
    3     3     4
    4     4     1
    5     5     6
    6     6     7
    7     7     8
    8     5     8
    9     9     5
   10    10     6
   11    11     7
   12    12     8

Content of ‘Block_BP.pdb’

REMARK    3DNA v2.1 (c) 2012 Dr. Xiang-Jun Lu (http://x3dna.org)
HETATM    1  N   ALC A   1      -2.250   5.000   0.250  1.00  1.00           N  
HETATM    2  N   ALC A   1      -2.250  -5.000   0.250  1.00  1.00           N  
HETATM    3  N   ALC A   1      -2.250  -5.000  -0.250  1.00  1.00           N  
HETATM    4  N   ALC A   1      -2.250   5.000  -0.250  1.00  1.00           N  
HETATM    5  C   ALC A   1       2.250   5.000   0.250  1.00  1.00           C  
HETATM    6  C   ALC A   1       2.250  -5.000   0.250  1.00  1.00           C  
HETATM    7  C   ALC A   1       2.250  -5.000  -0.250  1.00  1.00           C  
HETATM    8  C   ALC A   1       2.250   5.000  -0.250  1.00  1.00           C  
HETATM    9  C   ALC A   1      -2.250   5.000   0.250  1.00  1.00           C  
HETATM   10  C   ALC A   1      -2.250  -5.000   0.250  1.00  1.00           C  
HETATM   11  C   ALC A   1      -2.250  -5.000  -0.250  1.00  1.00           C  
HETATM   12  C   ALC A   1      -2.250   5.000  -0.250  1.00  1.00           C  
CONECT    1    2    4                                                  
CONECT    2    1    3                                                  
CONECT    3    2    4                                                  
CONECT    4    1    3                                                  
CONECT    5    6    8    9                                             
CONECT    6    5    7   10                                             
CONECT    7    6    8   11                                             
CONECT    8    5    7   12                                             
CONECT    9    5                                                       
CONECT   10    6                                                       
CONECT   11    7                                                       
CONECT   12    8                                                       
END
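
To make the mapping between the two listings explicit, the Python sketch below converts alchemy-style text into HETATM and CONECT records. It is only an illustration of the idea, not the implementation behind alc2img -pdb: the function name alc_to_pdb is hypothetical, the parser assumes the simple whitespace-separated layout of ‘Block_BP.alc’ shown above, and the fixed residue name (ALC), chain (A), residue number (1), occupancy and B-factor values are copied from the ‘Block_BP.pdb’ listing. The REMARK header is omitted.

```python
def alc_to_pdb(alc_text):
    """Convert alchemy-format text into a list of PDB HETATM/CONECT lines."""
    lines = [ln for ln in alc_text.splitlines() if ln.strip()]
    header = lines[0].split()            # e.g. ['12', 'ATOMS,', '12', 'BONDS']
    n_atoms, n_bonds = int(header[0]), int(header[2])

    atoms = []
    for ln in lines[1:1 + n_atoms]:
        serial, elem, x, y, z = ln.split()[:5]
        atoms.append((int(serial), elem, float(x), float(y), float(z)))

    # Each bond 'idx a b' is recorded on both atoms, as CONECT requires.
    neighbors = {serial: set() for serial, *_ in atoms}
    for ln in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        _, a, b = (int(tok) for tok in ln.split()[:3])
        neighbors[a].add(b)
        neighbors[b].add(a)

    out = []
    for serial, elem, x, y, z in atoms:
        out.append(
            f"HETATM{serial:5d}  {elem:<3s} ALC A   1    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00  1.00          {elem:>2s}"
        )
    for serial, *_ in atoms:
        bonded = [serial] + sorted(neighbors[serial])
        out.append("CONECT" + "".join(f"{n:5d}" for n in bonded))
    out.append("END")
    return out
```

For the bonds listed in ‘Block_BP.alc’, this bookkeeping gives atom 5 the partners 6, 8 and 9, in line with the "CONECT    5    6    8    9" record in the listing above.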


---

Schematic diagrams of base-pair parameters

Ever since the 2003 publication of the initial 3DNA Nucleic Acids Research paper (NAR03), the schematic diagrams of base-pair parameters (see figure below) have become quite popular. Over the years, we have received numerous requests for permission to use the figure, or a portion thereof; as an example, the figure has been adopted into a structural biology textbook. In the 2008 3DNA Nature Protocols paper (NP08), we devoted the very first protocol to “create a schematic image for propeller of 45°”.

Schematic diagram of rigid body parameters

Figure legend taken from Figure 1 of NAR03: Pictorial definitions of rigid body parameters used to describe the geometry of complementary (or non‐complementary) base pairs and sequential base pair steps (19). The base pair reference frame (lower left) is constructed such that the x‐axis points away from the (shaded) minor groove edge of a base or base pair and the y‐axis points toward the sequence strand (I). The relative position and orientation of successive base pair planes are described with respect to both a dimer reference frame (upper right) and a local helical frame (lower right). Images illustrate positive values of the designated parameters. For illustration purposes, helical twist (Ωh) is the same as Twist (ω), formerly denoted by Ω (19,20) and helical rise (h) is the same as Rise (Dz).

I recall spending around two weeks to produce the above figure. Content-wise, the figure was constructed in only a short while; it was the little details that took me most of the time.

Over time, I’ve witnessed numerous versions of such schematic images in publications related to DNA/RNA structures. While looking similar, the schematics differ subtly in the magnitude, orientation and relative scale of illustrated parameters. To the best of my knowledge, only 3DNA provides a pragmatic approach to generate the base-pair schematic diagrams consistently.

To make the schematics more readily accessible, I’ve reproduced a high resolution image (in png format) for each of the 14 parameters shown above. You are welcome to mix and match the diagrams as necessary. If you use any of them in your publications, please cite the 3DNA NAR03 and/or NP08 paper(s).

Note that in the schematic diagrams below, the shaded edge (facing the viewer) denotes the minor-groove side of a base or base pair.



Shear (Sx), Stretch (Sy), Stagger (Sz)
Buckle (κ), Propeller (π), Opening (σ)
Shift (Dx), Slide (Dy), Rise (Dz)
Tilt (τ), Roll (ρ), Twist (ω)
x-displacement (dx), y-displacement (dy), Helical Rise (h): as for Rise above (for illustration purposes)
Inclination (η), Tip (θ), Helical Twist (Ωh): as for Twist above (for illustration purposes)


---


Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu