[Job] Staff Associate II (Computational Structural Biology) at Columbia University

The X3DNA-DSSR resource is at the forefront of structural bioinformatics, developing advanced tools for analyzing and modeling nucleic acid structures. We are seeking a highly motivated Staff Associate II to join our team and contribute to our next-generation analysis and visualization engine.

To see our resource in action, please visit wDSSR, our new web interface for dissecting and modeling 3D nucleic acid structures: https://web.x3dna-dssr.org/.

We are looking for a candidate with a strong scientific background in structural biology or bioinformatics and a desire to contribute to peer-reviewed publications through community-driven data analysis. We value individuals who are eager to learn, adapt to new technical challenges, and support the global research community.

For the full job description and to submit your application, please visit the official Columbia University posting: https://apply.interfolio.com/183705

---

Announcing wDSSR: The Next-Generation Web Interface to X3DNA-DSSR

Dear 3DNA/DSSR Community,

We are thrilled to announce the official launch of wDSSR (https://web.x3dna-dssr.org/), the powerful new web interface to the X3DNA-DSSR analytical engine.

Developed by Drs. Shuxiang Li and Xiang-Jun Lu and supported by NIH grant R24GM153869, wDSSR represents a major leap forward from our highly popular 2019 Web 3DNA 2.0 framework. While Web 3DNA 2.0 has faithfully served the community for the analysis, visualization, and modeling of 3D nucleic acid structures, wDSSR was built from the ground up to take full advantage of modern web technologies and the latest DSSR backend capabilities.

A Modern, Streamlined Scientific Workflow

We have completely overhauled the user interface to provide a clean, intuitive, and task-driven experience. The core modeling and analysis tools are now seamlessly organized into a logical, single-word scientific workflow: Analyze, Rebuild, Model, Circularize, Mutate, Assemble, and Visualize.

Spotlight Feature: The "Assemble" Module

One of the most exciting upgrades is the newly renamed Assemble tab (formerly "Composite"). This advanced composite model builder allows you to effortlessly construct complex, higher-order models by linking any combination of nucleic acid duplexes or protein-DNA/RNA complexes. You can quickly connect up to six distinct target structures, ranging from simple linked A-DNA and B-DNA duplexes to large, protein-decorated structural assemblies.

Immediate Global Adoption

Although wDSSR has just launched, we are incredibly humbled to share that it is already seeing rapid worldwide adoption! According to recent network infrastructure data, the new interface is actively being used by researchers across North America, South America, Europe, and Asia. Within just a few days, we have recorded active sessions from prestigious institutions around the globe, including:

  • The Weizmann Institute of Science in Israel
  • Katholieke Universiteit Leuven in Belgium
  • Queen's University in Canada
  • Universidad Nacional Autonoma de Mexico (UNAM) in Mexico
  • Emory University and the Wadsworth Centers Laboratories and Research in the United States
  • Jawaharlal Nehru University and the China Education and Research Network in Asia

How to Cite

While a dedicated paper for wDSSR is currently in preparation, researchers should cite the server using its URL (https://web.x3dna-dssr.org/) alongside the 2019 Web 3DNA 2.0 paper and the foundational 2015 DSSR paper. Full details and funding acknowledgements can be found on our newly consolidated About page.

We invite you all to try out the new wDSSR platform! As always, your feedback is invaluable to us, and we encourage you to share your thoughts, questions, and structural models via the newly updated Questions & Feedback link in the wDSSR footer.

Happy modeling!

---

3DNA/DSSR runs just fine under Mac OS X Mavericks

I recently upgraded my Macs to OS X Mavericks to check if 3DNA/DSSR works in the new operating system. I am glad to report that both run without a hitch, as expected.

Since OS X Mavericks is free from the Mac App Store, it will quickly become the de facto version for virtually all Mac users. I also noticed that Ruby on Mavericks has been upgraded to ruby 2.0.0p247 (2013-06-27 revision 41674), a major step forward from the now-retiring Ruby 1.8.7 distributed with previous versions of Mac OS X.

As a rule, I make sure that 3DNA/DSSR executes properly in major releases of the commonly used operating systems: Mac, Windows, and Linux.


---

DSSR works perfectly under DOS (in native Windows)

Although I have not used DOS for ages, I am glad to find that the DSSR version compiled for MinGW/MSYS on Windows works perfectly under this operating system (see screenshot below). The DSSR DOS command-line interface functions exactly as it does on Linux, Mac OS X, MinGW/MSYS, and Cygwin. Among other possible uses, it allows batch files to take advantage of DSSR.

Screenshot of a DSSR run in DOS

Implementing DSSR in strict ANSI C as a self-contained and zero-dependent command-line program pays off enormously: it simplifies code maintenance and ensures that the program is applicable wherever a C compiler exists. The easy web interface to DSSR makes the program universally accessible.


---

DSSR command-line processing

Aside from its extensive functionality for RNA structural analyses, DSSR also introduces a consistent and flexible way to process command-line options. Here, each option can be specified via a --key[=value] pair (or -key[=value] or key[=value]; i.e., two/one/zero preceding dashes are all accepted), key can be in either lower, UPPER or MiXed case, and value is optional for Boolean switches. Furthermore, options can be put in any order; if the same key is repeated more than once, the value specified last overwrites corresponding previous settings.

As always, the rules are best illustrated with concrete examples. Some typical use-cases are given below:

#1 analyze PDB entry '1msy', with default output to stdout
x3dna-dssr --input=1msy.pdb

#2 same as #1, with output directed to file '1msy.out'
x3dna-dssr --input=1msy.pdb --output=1msy.out

#3-6, same as #2
x3dna-dssr --output=1msy.out --input=1msy.pdb
x3dna-dssr --OUTPUT=1msy.out --Input=1msy.pdb
x3dna-dssr -output=1msy.out input=1msy.pdb
x3dna-dssr output=1msy.out --input=1msy.pdb

#7 the value '1ehz.pdb' overwrites '1msy.pdb'
x3dna-dssr --input=1msy.pdb input=1ehz.pdb

#8-12 with the switch --more set to true
x3dna-dssr -input=1msy.pdb --more
x3dna-dssr -input=1msy.pdb --more=true
x3dna-dssr -input=1msy.pdb --more=yes
x3dna-dssr -input=1msy.pdb --more=on
x3dna-dssr -input=1msy.pdb --more=1

#13 same as without specifying --more,
#      or with values set to false/no/0
x3dna-dssr -input=1msy.pdb --more=off

#14 shorthand forms for --input and --output
x3dna-dssr -i=1msy.pdb -o=1msy.out

#15 it can also be more verbose
x3dna-dssr --input-pdb-file=1msy.pdb

#16-18 within a key, the separators dash (-) and underscore (_)
#      are treated the same, and can be omitted
x3dna-dssr -i=1msy.pdb -non-pair
x3dna-dssr -i=1msy.pdb -non_pair
x3dna-dssr -i=1msy.pdb -nonpair

By allowing 2/1/0 dashes to precede each key, and a dash, an underscore, or nothing to separate words within the key, DSSR gives users great flexibility in specifying command-line options to fit their preferred style. Not surprisingly, new programs to be added to 3DNA, as well as the version 3 release of the software, will all follow the same convention.
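The normalization rules listed above can be sketched in a few lines of Python. This is only an illustration of the stated rules (leading dashes ignored, case ignored, dash/underscore separators ignored, last value wins, Boolean words recognized); it is not DSSR's actual ANSI C implementation. The function names are hypothetical, and the sketch deliberately does not attempt the shorthand or verbose forms of examples #14 and #15, since the matching rule behind them is not spelled out here.

```python
TRUE_WORDS = {"true", "yes", "on", "1"}
FALSE_WORDS = {"false", "no", "off", "0"}


def normalize_key(key):
    """Strip leading dashes, fold case, and drop '-'/'_' word separators."""
    return key.lstrip("-").lower().replace("-", "").replace("_", "")


def parse_options(args):
    """Turn a list of [--]key[=value] strings into a dict (hypothetical helper).

    A key without '=value' is treated as a Boolean switch set to True.
    Assumption: any value spelled like a Boolean word is read as a Boolean.
    """
    options = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        name = normalize_key(key)
        if not sep or value.lower() in TRUE_WORDS:
            options[name] = True
        elif value.lower() in FALSE_WORDS:
            options[name] = False
        else:
            options[name] = value
        # A repeated key simply overwrites the earlier setting.
    return options
```

For instance, parse_options(["--input=1msy.pdb", "input=1ehz.pdb"]) keeps only the last value, mirroring example #7.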


---

Modified nucleotides in the PDB

In addition to the five canonical bases (A, C, G, T, and U), nucleic acid structures in the PDB contain numerous modified variants (natural or engineered) in the nucleobase, the sugar, or the phosphate. For instance, the 76-nt (nucleotide) long yeast phenylalanine tRNA (1ehz) contains 14 modified bases: 2MG10, H2U16, H2U17, M2G26, OMC32, OMG34, YYG37, PSU39, 5MC40, 7MG46, 5MC49, 5MU54, PSU55, and 1MA58. Among these, the most prevalent and best-known example is pseudouridine. Note that in the PDB, each residue (including modified nts) is named with an identifier of up to three letters, e.g., PSU for pseudouridine. For a comprehensive list (with chemical and structural information) of small molecules, including modified nts, please refer to the Ligand Expo website hosted by the RCSB PDB.

Given the widespread occurrences of modified bases in nucleic acid structures, any practical structural bioinformatics software should be able to treat them effectively, as with the canonical bases. In 3DNA, from the very beginning, modified bases are mapped to standard counterparts, e.g. 5‐iodouracil (5IU) to uracil (U) and 1‐methyladenine (1MA) to adenine (A), allowing for easy analysis of unusual DNA and RNA structures (see the NAR03 reference). Specifically, in the 3DNA distribution the file baselist.dat contains the mappings explicitly.

As of v2.1, 3DNA automatically maps any new modified base that is not listed in the file baselist.dat. Even so, I have continuously updated the list in line with new DNA/RNA entries released by the PDB. The process is automated with a Ruby script that calls find_pair -s on each nucleic-acid-containing structure to output unknown bases. As an extreme case, the baselist.dat file below comprises only canonical bases:

  A   A
  C   C
  G   G
  T   T
  U   U
 DA   A
 DC   C
 DG   G
 DT   T

With the above minimum mapping list, running the command find_pair -s on 1ehz.pdb identifies all the 14 modified bases. A sample case for 2MG is shown below:

Match '2MG' to 'g' for residue 2MG   10  on chain A [#10]
    check it & consider to add line '2MG     g' to file <baselist.dat>
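The batch step described above is done with a Ruby script that is not shown here. As a rough stand-in, the Python sketch below extracts residue-to-base pairs from messages of the form shown in the sample and formats them like baselist.dat entries. It assumes the messages have already been captured into a string; the function names and the regular expression are my own, based only on the two sample lines.

```python
import re

# Pattern derived from the sample message: Match '2MG' to 'g' for residue ...
MATCH_RE = re.compile(r"Match '(?P<residue>[^']+)' to '(?P<base>[A-Za-z])'")


def collect_mappings(log_text):
    """Return {residue name: one-letter base} for every 'Match' message found."""
    mappings = {}
    for line in log_text.splitlines():
        found = MATCH_RE.search(line)
        if found:
            mappings[found.group("residue").strip()] = found.group("base")
    return mappings


def format_baselist(mappings):
    """Format mappings as right-aligned three-character names plus the base."""
    return [f"{res:>3}     {base}" for res, base in sorted(mappings.items())]
```

Running collect_mappings over the concatenated output of many structures and passing the result to format_baselist yields lines that can be reviewed before they are added to the file.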

By parsing the output of a batch run on all DNA/RNA-containing entries in the PDB as of October 18, 2013, I identified a total of 596 modified bases. The top portion is as below:

02I     a
08Q     c
08T     a
0AD     g
 0C     c
0DC     c
0DG     g
0DT     t
 0G     g
0KL     u
0KX     c
0KZ     t

An explicit list of base mappings makes the correspondence transparent, and helps avoid ambiguity as to which canonical base a modified nt matches. DSSR uses the same list internally. I hope the information will also be useful to other related projects.
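For other projects that want to reuse such a list, a minimal Python sketch of reading the two-column format and translating residue names to one-letter codes might look as follows. It assumes each useful line holds exactly two whitespace-separated fields, as in the listings above; lines of any other shape are skipped. The function names and the "?" placeholder for unlisted residues are illustrative choices, not part of 3DNA or DSSR.

```python
def parse_baselist(text):
    """Parse 'NAME  base' lines into a dict; skip lines without two fields."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            table[fields[0]] = fields[1]
    return table


def to_sequence(residue_names, table, unknown="?"):
    """Map residue names to one-letter codes; unlisted residues become '?'."""
    return "".join(table.get(name.strip(), unknown) for name in residue_names)
```

In the listings above, modified residues map to lowercase letters, so the resulting string keeps track of which positions are modified.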


---

Different names for the methyl group in DNA and RNA structures

Recently I was a bit surprised to find that the methyl group is named differently in the PDB: C7 in DT8 (thymine) of B-DNA 355d, CM5 in 5MC40 (5-methylated C) of tRNA 1ehz, and C5M in 5MU54 (5-methylated U, i.e., T) of the same tRNA 1ehz. See the three figures below for details.

I know that the previously named C5M of thymine in DNA has been renamed C7 as a result of the 2007 remediation effort (PDB v3). However, after browsing through the wwPDB Remediation website and carefully reading the article “Remediation of the protein data bank archive”, I could not find an explanation for the obvious inconsistency of CM5 (5MC40) vs C5M (5MU54) in the nomenclature of the 5-methyl group in the same tRNA entry 1ehz, except for the following note:

As with the Chemical Component Dictionary, names for standard amino acids and nucleotides follow IUPAC recommendations (10) with the exception of the well-established convention for C-terminal atoms OXT and HXT. These nomenclature changes have been applied to standard polymeric chemical components only.

5-methyl is named C7 in DT8 of the DNA entry 355d

5-methyl is named CM5 in 5MC40 of the RNA entry 1ehz

5-methyl is named C5M in 5MU54 of the RNA entry 1ehz

Am I missing something obvious? If you have any further information, please leave a comment. Whatever the case, for those who care about the small methyl group in nucleic acid structures, it helps (or at least won’t hurt) to be aware of the naming discrepancy.
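For code that needs to locate the 5-methyl carbon regardless of its name, one option is a small lookup table keyed by residue name. The Python sketch below covers only the three cases noted in this post (DT, 5MC, and 5MU) and relies on the standard fixed-column PDB layout for atom name, residue name, and residue number. The function name is hypothetical, and other modified residues would need their own entries.

```python
# Residue name -> name of the 5-methyl carbon, as observed in 355d and 1ehz.
METHYL_NAMES = {"DT": "C7", "5MC": "CM5", "5MU": "C5M"}


def methyl_atoms(pdb_lines):
    """Return (residue name, residue number, atom name) for each 5-methyl carbon."""
    found = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()   # PDB columns 13-16
        res_name = line[17:20].strip()    # PDB columns 18-20
        res_seq = int(line[22:26])        # PDB columns 23-26
        if METHYL_NAMES.get(res_name) == atom_name:
            found.append((res_name, res_seq, atom_name))
    return found
```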


---

Compiling ViennaRNA on Mac OS X

Recently, I upgraded my local ViennaRNA package installation from v2.0.7 to v2.1.3 on my Mac. Following the Quickstart section of the INSTALL file, I ran ./configure successfully, but make aborted with error messages. Since I previously had a working copy of the software, the failure had to be a configuration issue specific to this new version. After a few iterations of checking the error messages and reading through the INSTALL file, I came up with the following settings:

./configure --disable-openmp --without-perl
make
sudo make install

Apart from some warning messages, the above make command ran successfully.

This post serves mainly as a note for my own reference. Hopefully, the information may prove useful to others who try to install the versatile ViennaRNA package on a Mac OS X machine.


---

Web-interface to DSSR

I’ve come up with a preliminary web-interface to DSSR, currently accessible at URL http://web.x3dna.org/dssr. The DSSR web-interface has been tested on Safari, Firefox, Chrome, and IE, with satisfactory results. A screenshot of the home page is given below, using 1msy as an example:

Screenshot of the Web-DSSR homepage

After clicking the Submit button, users will be presented with the result page of a DSSR run. The beginning portion of the above example is as follows:

Screenshot of a DSSR-run

Note that the DSSR web-interface is being provided via a shared web hosting service, thus it has limited resources. Specifically, the uploaded file cannot be larger than two megabytes (2MB), and the process could be slow. Additionally, the file must have an extension of .pdb or .cif. To take full advantage of what DSSR has to offer, please install and run the software locally.
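Before uploading, the two constraints above can be checked locally. The following Python sketch is only a convenience for users and says nothing about how the server itself validates files. It assumes that "2MB" means 2 × 1024 × 1024 bytes and that the extension check is case-insensitive; both are guesses, and the function name is made up for illustration.

```python
import os

MAX_UPLOAD_BYTES = 2 * 1024 * 1024      # assumption: "2MB" read as 2 MiB
ALLOWED_EXTENSIONS = {".pdb", ".cif"}   # from the note above


def upload_problems(filename, size_bytes):
    """Return a list of reasons why a file would likely be rejected (empty if none)."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()  # assumption: case-insensitive
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"extension {ext or '(none)'} is not .pdb or .cif")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"{size_bytes} bytes exceeds the {MAX_UPLOAD_BYTES}-byte limit")
    return problems
```

A typical call would be upload_problems(path, os.path.getsize(path)) for a file on disk.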

By design, DSSR is self-contained and command-line driven, with zero dependence on third-party libraries. Such features make it straightforward to build a GUI- or web-interface to DSSR, or integrate the program into other structural bioinformatics tools. As the need arises, I will refine the DSSR web-interface to better serve the community. The current simple, yet exploratory, web interface should make DSSR accessible to a much wider audience.


---

UNR- and GNRA-type U-turns

As of beta-r20-on-20130830, DSSR is able to detect two types of U-turns (see the figure below): the UNR-type (left), originally identified by Quigley and Rich [1976] in yeast phenylalanine tRNA, and the GNRA-type (right), later established by Jucker and Pardi [1995] in GNRA tetraloops. See the Gutell et al. paper “Predicting U-turns in Ribosomal RNA with Comparative Sequence Analysis” for a more extensive account of U-turns.

As its name implies, a U-turn is characterized by a reversal of the RNA backbone direction within a few nucleotides. Among other factors, the U-turn is stabilized by two key H-bonding interactions, illustrated in dotted lines in the figure below.

UNR-type U-turn in yeast phenylalanine tRNA (1ehz, left); GNRA-type U-turn in a GUAA tetraloop (1msy, right)

Applying DSSR to 1jj2 (the crystal structure of the Haloarcula marismortui large ribosomal subunit) led to the identification of over 30 cases. In addition to the well-documented UNR- and GNRA-type U-turns, the program also finds other variants. An example is shown below, where the U-turn is within a GCA triloop instead of a GNRA tetraloop. Here, the N1 (not N2) atom of G1809 forms an H-bond with OP2 of G1812. The G1809 N2 atom is H-bonded to G1812 O5′ to further stabilize the U-turn.

U-turn formed by the GCA-triloop (1jj2)

An examination of the chemical structures of the nitrogenous bases (see figure below) clearly shows other ways to connect RNA base donors to the phosphate oxygen acceptors. DSSR allows for the exploration of such variations, and more.

Chemical diagram of RNA bases


---

Restraint optimization of DNA backbone geometry using PHENIX

3DNA can build DNA/RNA structures with precise base geometry but only approximate sugar-phosphate backbone geometry. In the 2003 3DNA-NAR paper, Table 3 of the section “Structures built with sugar–phosphate backbone” lists “root mean square deviation (in Å) between rebuilt 3DNA models and experimental DNA structures” for three representative DNA structures (in A-form, B-form, and a protein-DNA complex). It was noted that “the RMSD of reconstructed versus observed base positions is virtually zero and that for both base and backbone coordinates is <0.85 Å, even for the 146 bp nucleosomal DNA structure.”

The backbone geometry is approximate because 3DNA uses a fixed sugar-phosphate conformation (in A-DNA, B-DNA or RNA) that is attached to the corresponding bases in the model building process. The most noticeable effect is the long O3′(i)···P(i+1) bond that connects consecutive nucleotides along a chain. The imprecise structure was intended as a starting point for other objectives (e.g., all-atom molecular dynamics simulations) that are out of the design scope of 3DNA. Nevertheless, over the years, I have been concerned with the overlong O3′—P distance issue. I tried but failed to find a satisfying third-party (command-line driven) tool that can perform restraint optimization of the sugar-phosphate backbone geometry while keeping base atoms fixed.
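To see how far apart the O3′(i) and P(i+1) atoms are in a rebuilt model, the distances can be measured directly from the PDB coordinates. The Python sketch below is one simple way to do so and is not part of 3DNA. It assumes standard fixed-column ATOM/HETATM records with the atom names O3' and P, treats residues as consecutive in file order within a chain, and ignores insertion codes and alternate locations. The 2.0 Å cutoff for flagging a linkage is an arbitrary illustrative choice (a normal O3′ to P bond is about 1.6 Å).

```python
import math


def o3p_distances(pdb_lines):
    """Return (chain, resSeq_i, resSeq_next, distance) for consecutive residues."""
    residues = {}  # (chain, resSeq) -> {atom name: (x, y, z)}, kept in file order
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        key = (line[21], int(line[22:26]))
        atoms = residues.setdefault(key, {})
        if name in ("O3'", "P"):
            atoms[name] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))

    keys = list(residues)
    pairs = []
    for prev, nxt in zip(keys, keys[1:]):
        if prev[0] != nxt[0]:
            continue  # do not link across chains
        o3 = residues[prev].get("O3'")
        p = residues[nxt].get("P")
        if o3 is not None and p is not None:
            pairs.append((prev[0], prev[1], nxt[1], math.dist(o3, p)))
    return pairs


def overlong(pairs, cutoff=2.0):
    """Keep only linkages longer than the (assumed) cutoff in angstroms."""
    return [pair for pair in pairs if pair[3] > cutoff]
```

Applying the two functions to a rebuilt model before and after optimization gives a quick numerical check of whether the overlong linkages have been repaired.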

The problem was finally solved after I attended the 43rd Mid-Atlantic Macromolecular Crystallography Meeting held at Duke University a few months ago. At the meeting, I had the opportunity to talk to several members of the PHENIX team. In particular, Jeff Headd revised the geometry_minimization component of PHENIX to do the trick. Here is the email reply from Jeff, using a 3DNA-generated DNA duplex (355d-3dna.pdb) as an example (see full details below):

Here’s a first go at refining just the backbone atoms of you input DNA model. You’ll need the most recently nightly build of Phenix (dev-1395 would work) and then run:

phenix.geometry_minimization 355d-3dna.pdb min.params

using the attached min.params file.

What I specify in the params file is to only move the backbone atoms, which I’ve done with a selection. You can modify the atoms that are allowed to move to your liking.

The only other change was to allow longer distance linkages, as some of the backbone linkages start quite far apart.

The content of file min.params is:

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"

To make the story complete, the step-by-step procedure is given below, using 355d (a B-DNA dodecamer at 1.4 Å resolution) as an example. The corresponding PDB file is named 355d.pdb.

find_pair 355d.pdb stdout | analyze stdin
x3dna_utils cp_std bdna
rebuild -atomic bp_step.par 355d-3dna.pdb
# the rebuilt structure is called '355d-3dna.pdb'

# with Phenix dev-1395 and above
phenix.geometry_minimization 355d-3dna.pdb min.params
# the optimized structure is called '355d-3dna_minimized.pdb'

# to verify:
find_pair 355d-3dna.pdb stdout | analyze stdin
find_pair 355d-3dna_minimized.pdb stdout | analyze stdin
# check files '355d-3dna.out' and '355d-3dna_minimized.out'

The three key files mentioned above are provided here for your verification:

Finally, the following figure illustrates the B-DNA dodecamer duplex in experimental (left), 3DNA-generated (middle) and PHENIX-optimized (right) coordinates. Note that disconnected O3′—P linkages (marked by red dots for two cases, see bottom of the middle image) due to overlong distances in 3DNA-rebuilt structure are fixed following the restraint PHENIX optimization.

355d: experimental structure (left), 3DNA-rebuilt structure (middle), and PHENIX-optimized structure (right)
---

Note added on 2016-11-11: In the min.params file, the selection is on one long line. For illustration purposes, the selection section (see below) is split into several short lines in the blog post. However, PHENIX requires ending backslashes (\) to combine the split lines into a single grammatical unit. I was not aware of this strict rule, and failed to add the ending \s in the original post. Thanks to Oleg Sobolev from the PHENIX team for bringing this omission to my attention. Note that the content of the min.params file itself did not have a problem, and thus no change was made to it.

pdb_interpretation {
  link_distance_cutoff = 7.0
}
selection = name " P  " or name " OP1" or name " OP2" or \
            name " O5'" or name " C5'" or name " C4'" or \
            name " O4'" or name " C3'" or name " O3'" or \
            name " C2'"
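One way to sidestep the line-continuation issue is to generate the parameter file with the selection on a single line, as in the actual min.params file. The Python sketch below builds that text from a list of backbone atom names. The padding rule (a leading blank, then the name left-justified in three columns) simply reproduces the quoted names shown above, and the function names are hypothetical.

```python
BACKBONE_ATOMS = ["P", "OP1", "OP2", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'", "C2'"]


def pdb_atom_field(name):
    """Pad a name of up to three characters to the four-column form used above."""
    return " " + name.ljust(3)


def min_params(atoms=BACKBONE_ATOMS, link_distance_cutoff=7.0):
    """Return the parameter text with the whole selection on one line."""
    selection = " or ".join(f'name "{pdb_atom_field(atom)}"' for atom in atoms)
    return (
        "pdb_interpretation {\n"
        f"  link_distance_cutoff = {link_distance_cutoff}\n"
        "}\n"
        f"selection = {selection}\n"
    )
```

Writing the returned string to a file named min.params gives an input equivalent to the listing above, with no backslashes needed.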


---


Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu