A video overview of DSSR

DSSR (Dissecting the Spatial Structure of RNA) is an integrated software tool for the analysis/annotation, model building, and schematic visualization of 3D nucleic acid structures (see the figures below and the video overview). It is built upon the well-known, tested, and trusted 3DNA suite of programs. DSSR has been made possible by the developer’s extensive user-support experience, detail-oriented software engineering skills, and expert domain knowledge accumulated over two decades. It streamlines tasks in RNA/DNA structural bioinformatics, and outperforms its ‘competitors’ by far in terms of functionality, usability, and support.

Wide citations. DSSR has been widely cited in scientific literature, including: (i) “Selective small-molecule inhibition of an RNA structural element” (Nature, 2015; Merck Research Laboratories), (ii) “The structure of the yeast mitochondrial ribosome” (Science, 2017), (iii) “RNA force field with accuracy comparable to state-of-the-art protein force fields” (PNAS, 2018; D. E. Shaw Research), (iv) “Predicting site-binding modes of ions and water to nucleic acids using molecular solvation theory” (JACS, 2019), (v) “RIC-seq for global in situ profiling of RNA-RNA spatial interactions” (Nature, 2020), and (vi) “DNA mismatches reveal conformational penalties in protein-DNA recognition” (Nature, 2020).

Broad integrations. To make DSSR as widely accessible as possible, I have initiated collaborations with the principal developers of Jmol and PyMOL. The DSSR-Jmol and DSSR-PyMOL integrations bring unparalleled search capabilities (e.g., ‘select junctions’ for all multi-branch loops) and innovative visualization styles into 3D nucleic acid structures. DSSR has also been adopted into numerous other structural bioinformatics resources, including: (i) URS, (ii) RiboSketch, (iii) RNApdbee, (iv) forgi, (v) RNAvista, (vi) VeriNA3d, (vii) RNAMake, (viii) ElTetrado, (ix) DNAproDB, (x) LocalSTAR3D, (xi) IPANEMAP, and (xii) RNANet.

Advanced features. DSSR may be licensed from Columbia University. DSSR Pro is the commercial version. It has more functionalities than DSSR basic (the free academic version), including: (i) homology modeling via in silico base mutations, a feature employed by Merck scientists, (ii) easy generation of regular helical models, including circular or super-helical DNA (see figures below), (iii) creation of customized structures with user-specified base sequences and rigid-body parameters, (iv) efficient processing of molecular dynamics (MD) trajectories, (v) detailed characterization of DNA-protein or RNA-protein spatial interactions, and (vi) template-based modeling of DNA-protein complexes (see figures below). DSSR Pro supersedes 3DNA. It integrates the disparate analysis and modeling programs of 3DNA under one umbrella, and offers new advanced features, through a convenient interface. For example, with the mutate module of DSSR Pro, one can automatically perform the following tasks: (i) mutate all bases to Us, (ii) mutate bases in hairpin loops to Gs, and (iii) mutate G–C Watson-Crick pairs to C–G, and A–U to U–A. Moreover, DSSR Pro includes an in-depth user manual and one-year technical support from the developer.

Quality control. DSSR is a solid software product that excels in RNA structural bioinformatics. It is written in strict ANSI C, as a single command-line program. It is self-contained, with zero runtime dependencies on third-party libraries. The binary executables for macOS, Linux, and Windows are just ~2 MB. DSSR has been extensively tested using all nucleic-acid-containing structures in the PDB. It is also routinely checked with Valgrind to avoid memory leaks. DSSR requires no setup or configuration: it simply works.


Theoretical models of G-quadruplexes, created using DSSR Pro.



Template-based modeling of DNA-protein complexes using DSSR Pro.
Here are two chromatin-like models using PDB entry 4xzq as the template.



Circular DNA duplexes modeled using DSSR Pro.




DNA super helices modeled using DSSR Pro.



Innovative cartoon-block schematics enabled by the DSSR-PyMOL integration for six representative PDB entries. Watson-Crick pairs are shown as long blocks with minor-groove edges in black (A, B), G-tetrads represented as square blocks and the metal ion as a sphere (C), the ligand rendered as balls-and-sticks (D), and proteins depicted as purple cartoons (E, F). Color code for base blocks: A, red; C, yellow; G, green; T, blue; U, cyan; G-tetrad, green; WC-pairs, per base in the leading strand. Visit http://skmatic.x3dna.org.
Recommended in Faculty Opinions: “simple and effective”, “Good for Teaching”.
Employed by the NDB to create cover images of the RNA Journal.

---

Cover images of the RNA Journal in 2020

Following my previous post 3DNA/blocview-PyMOL images in covers of the RNA journal in 2019, here is an update for 2020. The cover images of the January to July issues have all been generated with the help of 3DNA and provided by the NDB:

RNA is displayed as a red ribbon; block bases use NDB colors: A—red, C—yellow, G—green, U—cyan. The image was generated using 3DNA/blocview and PyMol software. Cover image provided by the Nucleic Acid Database (ndbserver.rutgers.edu).

Here is the composite figure of the seven cover images, with the brand new DSSR-PyMOL schematics for comparison.

3DNA/blocview-PyMOL and DSSR-PyMOL cartoon-block schematics in the covers of the RNA journal in 2020

Details of the seven structures illustrated in the cover images are described below:

  1. January 2020 Pumilio homolog PUF domain in complex with RNA (PDB id: 5yki; Zhao YY, Mao MW, Zhang WJ, Wang J, Li HT, Yang Y, Wang Z, Wu JW. 2018. Expanding RNA binding specificity and affinity of engineered PUF domains. Nucleic Acids Res 46: 4771–4782). Engineered nine-repeat PUF domain binds to its RNA target specifically and with high binding affinity.
  2. February 2020 Aprataxin RNA–DNA deadenylase product complex (PDB id: 6cvo; Tumbale P, Schellenberg MJ, Mueller GA, Fairweather E, Watson M, Little JN, Krahn J, Waddell I, London RE, Williams RS. 2018. Mechanism of APTX nicked DNA sensing and pleiotropic inactivation in neurodegenerative disease. EMBO J 37: e98875). Human aprataxin RNA–DNA deadenylase protects genome integrity and corrects abortive DNA ligation arising during ribonucleotide excision repair and base excision DNA repair.
  3. March 2020 PreQ1 riboswitch (PDB id: 6e1w; Connelly CM, Numata T, Boer RE, Moon MH, Sinniah RS, Barchi JJ, Ferre-D’Amare AR, Schneekloth Jr JS. 2019. Synthetic ligands for PreQ1 riboswitches provide structural and mechanistic insights into targeting RNA tertiary structure. Nat Commun 10: 1501). Class I PreQ1 riboswitch regulates downstream gene expression in response to its cognate ligand PreQ1 (7-aminomethyl-7-deazaguanine).
  4. April 2020 Hatchet ribozyme (PDB id: 6jq6; Zheng L, Falschlunger C, Huang K, Mairhofer E, Yuan S, Wang J, Patel DJ, Micura R, Ren A. 2019. Hatchet ribozyme structure and implications for cleavage mechanism. Proc Natl Acad Sci 116: 10783–10791). This crystal structure of the hatchet ribozyme product features a compact symmetric dimer.
  5. May 2020 Adenovirus virus-associated RNA (PDB id: 6ol3; Hood IV, Gordon JM, Bou-Nader C, Henderson FE, Bahmanjah S, Zhang J. 2019. Crystal structure of an adenovirus virus-associated RNA. Nat Commun 10: 2871). Acutely bent viral RNA fragment is a protein kinase R inhibitor and features an unusually structured apical loop, a wobble-enriched, coaxially stacked apical and tetra-stems, and a central domain pseudoknot that resembles codon-anticodon interactions.
  6. June 2020 Archeoglobus fulgidus L7Ae bound to cognate K-turn (PDB id: 6hct; Huang L, Ashraf S, Lilley DMJ. 2019. The role of RNA structure in translational regulation by L7Ae protein in archaea. RNA 25: 60–69). 50S archaeal ribosome protein L7Ae binds to a K-turn structure in the 5′-leader of the mRNA of its structural gene to regulate translation.
  7. July 2020 Spinach RNA aptamer/Fab complex (PDB id: 6b14; Koirala D, Shelke SA, Dupont M, Ruiz S, DasGupta S, Bailey LJ, Benner SA, Piccirilli JA. 2018. Affinity maturation of a portable Fab-RNA module for chaperone-assisted RNA crystallography. Nucleic Acids Res 46: 2624–2635). Novel Fab-RNA module can serve as an affinity tag for RNA purification and imaging and as a chaperone for RNA crystallography.


---

Paper on DSSR-PyMOL schematics

The paper, titled DSSR-enabled innovative schematics of 3D nucleic acid structures with PyMOL, has just been published in Nucleic Acids Research (online on May 22, 2020). Here is the abstract:

Sophisticated analysis and simplified visualization are crucial for understanding complicated structures of biomacromolecules. DSSR (Dissecting the Spatial Structure of RNA) is an integrated computational tool that has streamlined the analysis and annotation of 3D nucleic acid structures. The program creates schematic block representations in diverse styles that can be seamlessly integrated into PyMOL and complement its other popular visualization options. In addition to portraying individual base blocks, DSSR can draw Watson-Crick pairs as long blocks and highlight the minor-groove edges. Notably, DSSR can dramatically simplify the depiction of G-quadruplexes by automatically detecting G-tetrads and treating them as large square blocks. The DSSR-enabled innovative schematics with PyMOL are aesthetically pleasing and highly informative: the base identity, pairing geometry, stacking interactions, double-helical stems, and G-quadruplexes are immediately obvious. These features can be accessed via four interfaces: the command-line interface, the DSSR plugin for PyMOL, the web application, and the web application programming interface. The supplemental PDF serves as a practical guide, with complete and reproducible examples. Thus, even beginners or occasional users can get started quickly, especially via the web application at http://skmatic.x3dna.org.

A brief history on DNA/RNA schematics as implemented in SCHNAaP/SCHNArP, 3DNA, and now in DSSR:

The idea of representing bases and WC-pairs as rectangular blocks came from the pioneering work of Calladine et al. (27,28) The block schematics were first implemented in the pair of SCHNAaP/SCHNArP programs (29,30) for rigorous analysis and reversible rebuilding of double-helical nucleic acid structures. The algorithms that underpinned SCHNAaP/SCHNArP laid the foundation of ‘analyze’ and ‘rebuild’, two core components of the 3DNA suite of programs (31–33). 3DNA also takes advantage of the standard base reference frame (34), and comprises quite a few other related programs. One of them is ‘blocview’, a script which calls several 3DNA utility programs to generate individual base blocks and set the view, MolScript (35) to produce backbone ribbons, and Raster3D (36) to render the composite image. The 3DNA ‘blocview’ schematics catch characteristic attributes of nucleic acid structures. They have gradually become popular and been adopted into the RCSB PDB (1) and the NDB (37), and then propagated into other bioinformatics resources (e.g., the ‘RNA Structure Atlas’ website hosted by the Leontis-Zirbel RNA group).

DSSR supersedes ‘blocview’ by eliminating all the internal and external dependencies of the 3DNA utility program. DSSR produces block representations, not only of individual bases but also WC-pairs and G-tetrads, that can be fed directly into PyMOL. The DSSR-PyMOL integration is easier to use, has more features, and produces better schematics than the original 3DNA-blocview approach.


DSSR-PyMOL schematic for PDB entry 6ol3: schematic image auto-generated via the DSSR-PyMOL integration.
3DNA-blocview-PyMOL schematic for PDB entry 6ol3: cover image of the May 2020 issue of the RNA Journal, “generated using 3DNA/blocview and PyMol software by the Nucleic Acid Database”.

Indeed, the base block schematics have continuously evolved for over two decades, as appreciated in the acknowledgements:

I would like to thank Christopher A. Hunter, Christopher R. Calladine, Helen M. Berman, Catherine L. Lawson, Zukang Feng, Wilma K. Olson and Harmen J. Bussemaker for their helpful input on the block schematic during its continuous evolution for over two decades. I appreciate Thomas Holder (PyMOL Principal Developer, Schrödinger, Inc.) for writing the DSSR plugin for PyMOL, and for providing insightful comments on the manuscript and the web application interface. I also thank Jessalyn Lu and Yin Yin Lu for proofreading the manuscript, and the user community for feedback.

Notably, the supplemental PDF has been diligently written to serve as a practical guide, with complete and reproducible examples. In fact, the paper concludes with the following two sentences:

Finally, all results reported here are completely reproduceable (see the supplemental PDF). Any questions related to this work are welcome and will be openly addressed on the 3DNA Forum (http://forum.x3dna.org).


---

Context-aware in silico base mutations enabled by DSSR 2.0

As of version 2.0 (to be released soon), DSSR has a new, context-sensitive module for in silico base mutations. Powered by the DSSR analysis engine, the module allows users to perform base mutations with unprecedented flexibility and convenience. Here are some examples:

  • Mutate all bases in hairpin loops to a specific base (e.g., G)
  • Mutate all non-stem bases to a specific base (e.g., U)
  • Mutate bases 2-12 to a specific base (e.g., A) regardless of context
  • Mutate bases 1-10 in a given structure to a new sequence (e.g., AUAUAUAUAU)
  • Mutate all bases of the same type to another (e.g., A to G)
  • Mutate all bases of the same type to another (e.g., C to U) except for some nucleotides
  • Mutate all G-C Watson-Crick (WC) pairs to C-G WC pairs, and A-U to U-A
  • Mutate all G-tetrads in G-quadruplexes to non-G-tetrads (e.g., U-tetrads)

By default, the mutation preserves both the geometry of the sugar-phosphate backbone and the base reference frame (position and orientation). As a result, re-analyzing the mutated model gives the same base-pair and step parameters as those of the original structure.
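To make the idea of context-sensitive mutation concrete, here is a toy sketch that works only at the sequence level. It is not DSSR's implementation and does not touch 3D coordinates: it assumes the secondary structure is already available as a dot-bracket string without pseudoknots, and the function names (`pair_table`, `mutate_hairpins`, `swap_wc_pairs`) and the example sequence are hypothetical. It mimics two of the rules listed above: mutating hairpin-loop bases to G, and swapping G-C/A-U Watson-Crick pairs to C-G/U-A.

```python
# Toy, sequence-level illustration only; DSSR itself works on 3D coordinates.
WC_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def pair_table(dbn):
    """Map each paired index to its partner, from a pseudoknot-free dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(dbn):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def hairpin_loop_positions(dbn):
    """Indices of unpaired bases enclosed by a single closing pair (hairpin loops)."""
    loops = set()
    for i, j in pair_table(dbn).items():
        if i < j and all(dbn[k] == "." for k in range(i + 1, j)):
            loops.update(range(i + 1, j))
    return loops


def mutate_hairpins(seq, dbn, new_base="G"):
    """Replace every hairpin-loop base with new_base; leave all other bases alone."""
    loops = hairpin_loop_positions(dbn)
    return "".join(new_base if i in loops else b for i, b in enumerate(seq))


def swap_wc_pairs(seq, dbn):
    """Swap the two bases of each Watson-Crick pair (G-C -> C-G, A-U -> U-A)."""
    bases = list(seq)
    for i, j in pair_table(dbn).items():
        if i < j and (seq[i], seq[j]) in WC_PAIRS:
            bases[i], bases[j] = seq[j], seq[i]
    return "".join(bases)


seq = "GGACUUCGGUCC"  # made-up hairpin: 4-bp stem closing a UUCG loop
dbn = "((((....))))"
print(mutate_hairpins(seq, dbn))  # GGACGGGGGUCC
print(swap_wc_pairs(seq, dbn))    # CCUGUUCGCAGG
```

In both cases the structural context (loop versus stem, paired versus unpaired) decides which positions change, which is the point of a context-aware mutation module.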

Over the years, the 3DNA mutate_bases program has been cited in both the scientific literature and patents.

The DSSR mutation module has made the mutate_bases program distributed in 3DNA v2.x completely obsolete. In addition to serving as a drop-in replacement for mutate_bases, the DSSR approach offers many more features and greater versatility: it is simply better.


---

May's article on "The Best Ways to Study DNA and Protein Interactions"

In late March, I was approached by Mike May. He was then writing an article for Biocompare about DNA-protein interactions and asked me to answer a few questions on “What features of 3DNA be used in studying DNA-protein interactions?” and “Please provide 1-2 examples.” Initially, I was a bit surprised by the contact, so I visited his online profile and Amazon Author Page and read a couple of his previous publications. Impressed by his track record, I answered his requests, and our subsequent communications were as smooth and professional as I could have ever imagined.

The paper The Best Ways to Study DNA and Protein Interactions has now been published, and is freely accessible. It includes the following content:

3DNA creator and maintainer Xiang-Jun Lu mentioned a couple of ways that the software has been used. For example, he noted that “3DNA can analyze all DNA-protein complexes in the Protein Data Bank—PDB—in an automatic, consistent, and robust manner,” and other bioinformatic resources have adopted this feature of 3DNA. He added that scientists have used 3DNA to “understand the structural basis on how transcription factors recognize methylated DNA.” Moreover, 3DNA is continuously developed. A new feature of 3DNA is the automatic identification and comprehensive characterization of G-quadruplexes, a noncanonical DNA structure formed from guanine-rich base sequences.

The bioinformatics resource I used as an example is the paper DNAproDB: an expanded database and web-based tool for structural analysis of DNA–protein complexes by the Rohs lab. The phrase “to understand the structural basis on how transcription factors recognize methylated DNA” refers to the article Toward a mechanistic understanding of DNA methylation readout by transcription factors by the Bussemaker lab. Both works employed DSSR and SNAP, two sophisticated programs that I have created and maintained over the past ten years and that have largely superseded the original 3DNA suite of programs.

The image I provided is a DSSR-PyMOL schematic based on PDB entry 6LDM. The 6LDM picture features a G-quadruplex, for which DSSR comes with an unmatched set of features (including automatic identification and comprehensive annotations). See the http://g4.x3dna.org/ page for survey results, curated using DSSR, of all G-quadruplexes from the PDB.

This image of a protein-DNA complex (PDB entry 6LDM) shows the protein (purple), the DNA G-quadruplex (green) and thymine (blue). The image was created using the 3DNA-DSSR program and PyMOL. Image courtesy of Xiang-Jun Lu.

DSSR-PyMOL schematic for PDB entry 6ldm


---

SARS-CoV-2, RNA G-Quadruplexes and 3DNA

I recently noticed a bioRxiv preprint, titled Role of RNA Guanine Quadruplexes in Favoring the Dimerization of SARS Unique Domain in Coronaviruses, by a European team of scientists from France, Italy, and Spain. The abstract is as follows. Figure 1 shows a schematic representation of mRNA with a G-quadruplex structure, functioning in a healthy cell and in a cell infected by coronavirus.

Coronaviruses may produce severe acute respiratory syndrome (SARS). As a matter of fact, a new SARS-type virus, SARS-CoV-2, is responsible of a global pandemic in 2020 with unprecedented sanitary and economic consequences for most countries. In the present contribution we study, by all-atom equilibrium and enhanced sampling molecular dynamics simulations, the interaction between the SARS Unique Domain and RNA guanine quadruplexes, a process involved in eluding the defensive response of the host thus favoring viral infection of human cells. The results obtained evidence two stable binding modes with guanine quadruplexes, driven either by electrostatic (dimeric mode) or by dispersion (monomeric mode) interactions, are proposed being the dimeric mode the preferred one, according to the analysis of the corresponding free energy surfaces. The effect of these binding modes in stabilizing the protein dimer was also assessed, being related to its biological role in assisting SARS viruses to bypass the host protective response. This work also constitutes a first step of the possible rational design of efficient therapeutic agents aiming at perturbing the interaction between SARS Unique Domain and guanine quadruplexes, hence enhancing the host defenses against the virus.


Figure 1) Schematic representation of the mRNA function in a) a healthy cell and b) an infected cell by coronavirus. Panel b) showcases the influence of viral SUD binding to G4 sequences of mRNA that encodes crucial proteins for the apoptosis/cell survival regulation and other signaling paths.

In the manuscript, the software tools employed in this MD study are described as below:

… Both protein and RNA have been described with the amber force field including the bsc1 corrections, and the MD simulations have been performed in the constant pressure and temperature ensemble (NPT) at 300K and 1 atm. All MD simulations have been performed using the NAMD code and analyzed via VMD, the G4 structure has also been analyzed with the 3DNA suite.

I am glad that 3DNA has played a role in the analysis of G-quadruplexes in this timely contribution. In particular, I would like to draw the community's attention to 3DNA-DSSR, which has a brand-new module dedicated to the automatic identification and comprehensive characterization of G-quadruplexes. The DSSR-annotated G-quadruplexes from the PDB should be of great interest to a wide audience, especially experimentalists. As a concrete example, the authors noted that “The crystal structure … of the oligonucleotide (pdb 1J8G) have been chosen coherently with the experimental work performed by Tan et al”. Follow the link to see results of DSSR-derived G-quadruplex features in PDB entry 1J8G and you are guaranteed to see features not available elsewhere.

Note added on July 9, 2020: This paper has been published in J. Phys. Chem. Lett. 2020, 11, 5661−5667.


---

SNAP for the analysis of TF-DNA complexes containing 5-methyl-cytosines

The Kribelbauer et al. article, Toward a mechanistic understanding of DNA methylation readout by transcription factors, has recently been published in the Journal of Molecular Biology (JMB). I am honored to be on the author list, and I learned a lot during the process. For the project, I added the --methyl-C (short-form: --5mc) option to SNAP (v1.0.6-2019sep30) for the automatic identification and annotation of DNA-transcription factor (TF) complexes containing 5-methyl-cytosine (5mC). The results are presented in a dynamic table, easily accessible at URL http://snap-5mc.x3dna.org, and summarized in Fig. 1 “Structural basis of how TFs recognize methylated DNA” (see below) of the JMB paper.

Fig. 1. Structural basis of how TFs recognize methylated DNA

Details on the SNAP-enabled curation of TF-DNA complexes containing 5mC from atomic coordinates in the Protein Data Bank (PDB) are available in a tutorial page at http://snap-5mc.x3dna.org/tutorial. In essence, the process can be easily understood via a concrete example with PDB id 4m9e, as shown below.

x3dna-snap --methyl-C --type=base -i=4m9e.pdb -o=4m9e-5mC.out

Here the --methyl-C option is specific for 5mC-DNA, and --type=base ensures that at least one DNA base atom is contacting protein amino acid(s). If these conditions are fulfilled, SNAP produces two additional 5mC-related files, apart from the normal output file (i.e., 4m9e-5mC.out, as specified in the example):

  • 4m9e-5mC.txt — a simple text file with the following contents:
4m9e:B.5CM5: stacking-with-A.ARG443 is-WC-paired is-in-duplex [+]:GcG/cGC
4m9e:C.5CM5: other-contacts is-WC-paired is-in-duplex [-]:cGT/AcG
  • 4m9e-5mC.pdb — a corresponding PDB file, potentially multi-model, two as in this case. Moreover, the cluster of interacting residues (DNA nucleotides and protein amino acids) is oriented in the standard base reference frame of 5mC, allowing for easy comparison and direct overlap of multiple clusters.
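For downstream scripting, the two lines of 4m9e-5mC.txt shown above can be split into fields with a few lines of Python. This is only a sketch: the field layout is inferred from these two sample lines rather than from a documented format, and the names `MethylCRecord` and `parse_5mc_line` are hypothetical. The residue identifier (e.g., 5CM5) is deliberately kept whole, since separating the residue name from its number is ambiguous when the name itself contains digits.

```python
from typing import NamedTuple, Tuple


class MethylCRecord(NamedTuple):
    pdb_id: str
    chain: str
    residue: str                  # residue name + number, kept whole (e.g., "5CM5")
    annotations: Tuple[str, ...]  # e.g., "is-WC-paired", "is-in-duplex"
    strand: str                   # "+" or "-", taken from the bracketed tag
    context: str                  # sequence context such as "GcG/cGC"


def parse_5mc_line(line: str) -> MethylCRecord:
    """Split one line of the 5mC text file; layout assumed from the samples above."""
    head, rest = line.strip().split(": ", 1)
    pdb_id, nt_id = head.split(":", 1)
    chain, residue = nt_id.split(".", 1)
    *annotations, last = rest.split()
    strand, context = last.split(":", 1)
    return MethylCRecord(pdb_id, chain, residue, tuple(annotations),
                         strand.strip("[]"), context)


sample = """\
4m9e:B.5CM5: stacking-with-A.ARG443 is-WC-paired is-in-duplex [+]:GcG/cGC
4m9e:C.5CM5: other-contacts is-WC-paired is-in-duplex [-]:cGT/AcG
"""
records = [parse_5mc_line(ln) for ln in sample.splitlines() if ln.strip()]
for rec in records:
    print(rec.chain, rec.residue, rec.strand, rec.context, rec.annotations[0])
```

Keeping the annotations as an open-ended tuple means the parser would not break if a line carried more or fewer tags than the three seen here.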

In practice, SNAP needs to take care of many details for the automatic identification and annotation of 5mC-DNA-TF complexes directly from PDB entries. For example, 5mC in DNA is designated 5CM and the 5-methyl carbon atom is named C5A in the PDB (see the blogpost 5CM and 5MC, two forms of 5-methylcytosine in the PDB). Moreover, the --type=base option is employed to ensure that base atoms (regardless of sugar-phosphate atoms) of 5mC are directly involved in interactions with amino acids.
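As a rough illustration of the naming detail mentioned above, the sketch below scans PDB-format text for residues named 5CM and reports whether the methyl carbon C5A is present in each. It is not how SNAP does it; it simply relies on the fixed-column layout of ATOM/HETATM records (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26). The function name `find_5mc_residues` is hypothetical, and the sample records are synthetic and truncated after the residue number.

```python
def find_5mc_residues(pdb_text, resname="5CM", methyl_atom="C5A"):
    """Return {(chain, resSeq): has_methyl_atom} for every residue named resname."""
    found = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        if line[17:20].strip() != resname:          # columns 18-20: residue name
            continue
        key = (line[21], int(line[22:26]))          # column 22: chain; 23-26: resSeq
        has_methyl = line[12:16].strip() == methyl_atom  # columns 13-16: atom name
        found[key] = found.get(key, False) or has_methyl
    return found


# Synthetic records, truncated after the residue number (coordinates omitted).
sample = "\n".join([
    "HETATM   80  N1  5CM B   5",
    "HETATM   83  C5A 5CM B   5",
    "ATOM     99  N1   DG B   6",
    "HETATM  310  N1  5CM C   5",
])
print(find_5mc_residues(sample))
```

In this made-up sample, residue B.5 carries the C5A atom while C.5 does not, so a script like this could flag 5CM residues with an incomplete methyl group before further analysis.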

It is also worth noting the combined use of DSSR for the generation of molecular images (rendered with PyMOL), as shown below. Here the DSSR options --block-file=fill-hbond (fill to fill base rings and hbond to draw hydrogen bonds) and --cartoon-block=sticks-label are used. The 3DNA DSSR/SNAP combo is a unique and powerful toolset for structural bioinformatics, as demonstrated in DNAproDB from the Rohs lab (see my blogpost SNAP and DSSR in DNAproDB). The JMB paper represents yet another example. I can only expect to see more combined DSSR/SNAP applications in the future.

DSSR-PyMOL image for PDB id: 4m9e


---

3DNA-DSSR is linked in the G4-society website

A couple of months ago, I came across the homepage of the newly established G4 Society on G-quadruplexes (G4s). I checked the “Online tools” section and found a few links to G4 databases and sequence-based prediction programs (e.g., G4Hunter). No tools, however, were listed for G4 identification and characterization from 3D atomic coordinates such as those deposited in the Protein Data Bank (PDB). So I filled out the contact form and provided a brief description of 3DNA-DSSR, including a link to the website of G4s auto-curated with DSSR from the PDB.

I’ve recently visited the G4-society website again. I am pleased to see that 3DNA-DSSR is now listed under Online tools as a “program for detections/annotations of G4 from atomic coordinates in PDB or PDBx/mmCIF format”. The G4 module of 3DNA-DSSR has been created to streamline the identification and annotation of 3D structures of G4s. The collection of G4s in the PDB, available at G4.x3dna.org, is updated weekly. It represents a unique resource for the G4 community. Hopefully, its value will be more widely appreciated thanks to the link from the G4-society website.

At the G4-society homepage, I noticed the following two items in the “News” section (on December 13, 2019):

The Quadruplex Meeting Report

Meeting report: Seventh International Meeting on Quadruplex Nucleic Acids (Changchun, P.R. China, September 6–9, 2019) written by Jean-Louis Mergny. Reading through the report, I noticed the following:

Jonathan B. Chaires (U. Louisville, KY, USA) provided an overview and historical perspective of the quadruplex field in his inaugural lecture. As of August 2019, the quadruplex field gathers 8467 articles and 253,174 citations in the Science Citation Index. Over 200 G4 structures are available in the PDB.

I do not know how the survey of G4s in the PDB was performed. Based on my data, the number of G4 structures in the PDB was already over 300 as of August 2019, and it stood at 329 as of December 11, 2019. Importantly, the PDB-G4 website compiled using 3DNA-DSSR contains not only citation information but also detailed annotations and schematic images not available elsewhere. Here are a few recent examples:

  • PDB id: 6ge1 — “Unraveling the structural basis for the exceptional stability of RNA G-quadruplexes capped by a uridine tetrad at the 3’ terminus.” by Andralojc et al. in RNA (2019).
  • PDB id: 6gh0 — “Two-quartet kit* G-quadruplex is formed via double-stranded pre-folded structure.” by Kotar et al. in Nucleic Acids Res. (2019).
  • PDB id: 6e8u — “Structure and functional reselection of the Mango-III fluorogenic RNA aptamer.” by Trachman et al. in Nat. Chem. Biol. (2019).
  • PDB id: 6ac7 — “Structure of a (3+1) hybrid G-quadruplex in the PARP1 promoter.” by Sengar et al. in Nucleic Acids Res. (2019).

The Important Paper

A guide to computational methods for G-quadruplex prediction by Emilia Puig Lombardi and Arturo Londoño-Vallejo in Nucleic Acids Res. (2019), which presents an updated overview of G4 prediction algorithms. I am impressed by the large number of sequence-based G4 prediction software tools, including the most recent G4-iM Grinder. Nevertheless, as noted by the authors in the concluding remarks, “All computational G-quadruplex prediction approaches have their drawbacks and limitations despite the recent advances in the field and the introduction of validation steps based on experimental data.”

The G4 module in 3DNA-DSSR belongs to a completely different category of software tool. It does not ‘predict’ G4 propensity/stability from a base sequence, but identifies and annotates G4s in a 3D atomic coordinate file. It complements sequence-based prediction tools by providing insights into 3D G4 structures, which can help refine folding rules and improve the performance of prediction tools. To my knowledge, 3D G4 structures contain features that are not captured by any of the sequence-based prediction tools.

While reading the review article, I found Fig. 1 informative (see below). The right side of Fig. 1A shows a “cartoon representation of the Oxytricha telomeric DNA G4 crystal structure (PDB accession 1JPQ (112))” using PyMOL. In comparison, the cartoon-block image auto-generated via 3DNA-DSSR and PyMOL for PDB id: 1jpq is shown at the bottom. The DSSR-PyMOL version is obviously different from, and presumably simpler and more informative than, the one illustrated in Fig. 1A.

Figure 1. From guanines to G-quadruplexes

3DNA-DSSR cartoon-block schematic for PDB entry 1jpq, rendered with PyMOL


---

3DNA/blocview-PyMOL images in covers of the RNA journal

I recently performed a quick survey of the cover images of the RNA journal in 2019. I was pleased to find that 9 out of the 12 cover images were provided by the Nucleic Acid Database, which employed 3DNA/blocview and PyMOL, as noted below:

The RNA backbone is displayed as a red ribbon; bases are shown as blocks with NDB coloring: A—red, C—yellow, G—green, U—cyan; geneticin ligands are shown in spacefill with element colors: C—white, N—blue, O—red. The image was generated using 3DNA/blocview and PyMol software.

Details of the 9 cover images are listed below:

  1. January 2019 Rhodobacter sphaeroides Argonaute with guide RNA/target DNA duplex containing noncanonical A-G pair (PDB code: 6d9k)
  2. April 2019 Group I self-splicing intron P4-P6 domain mutant U131A (PDB code: 6d8l)
  3. May 2019 Crystal structure of T. thermophilus 50S ribosomal protein L1 in complex with helices H76, H77, and H78 of 23S RNA (PDB code: 5npm)
  4. June 2019 Crystal structure of ykoY-mntP riboswitch chimera bound to cadmium (PDB code: 6cc3)
  5. July 2019 G96A mutant of the PRPP riboswitch from T. mathranii bound to ppGpp (PDB code: 6ck4)
  6. August 2019 Crystal structure of the metY SAM V riboswitch (PDB code: 6fz0)
  7. October 2019 Crystal structure of protease factor Xa bound to RNA aptamer 11F7t and rivaroxaban (PDB code: 5vof)
  8. November 2019 Drosophila melanogaster nucleosome remodeling complex (PDB code: 6f4g)
  9. December 2019 Crystal structure of the Homo Sapiens cytoplasmic ribosomal decoding site in complex with Geneticin (PDB code: 5xz1)

Here is the composite figure of the 9 cover images.

3DNA/blocview-PyMOL cartoon-block schematics in the covers of the RNA journal in 2019


---

Web API to 3DNA

I’ve created a web API to DSSR, SNAP, and fiber models. The overall help message is available via http://api.x3dna.org. Each program is accessed individually as shown below.

Help message on x3dna-dssr (DSSR): http://api.x3dna.org/dssr/help

Usage with 'http' (HTTPie):
    http -f http://api.x3dna.org/dssr [options] url=|model@
    http http://api.x3dna.org/dssr/help   -- display this help message

Options:
    json=true-or-FALSE(default)    [e.g., json=true # JSON output]
    pair=true-or-FALSE(default)    [e.g., pair=1    # base-pair only]
    hbond=true-or-FALSE(default)   [e.g., hbond=t   # H-bonding info]
    more=true-or-FALSE(default)    [e.g., more=y    # further details]

Required parameter:
    url=URL-to-coordinate-file [e.g., url=https://files.rcsb.org/download/1ehz.pdb.gz]
    model@coordinate-file      [e.g., model@1ehz.cif]
    # Only one must be specified. 'url' precedes 'model' when both are specified.
    # The coordinate file must be in PDB or PDBx/mmCIF format, optionally gzipped.

Examples:
    http -f http://api.x3dna.org/dssr url=https://files.rcsb.org/download/1ehz.pdb.gz
    http -f http://api.x3dna.org/dssr model@1ehz.cif pair=1
    # with 'curl'
    curl http://api.x3dna.org/dssr -F 'url=https://files.rcsb.org/download/1ehz.pdb.gz'
    curl http://api.x3dna.org/dssr -F 'model=@1msy.pdb' -F 'pair=1'

Note:
    The web API has an upper limit on coordinate file size (gzipped): < 6 MB
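The same request can be issued from Python using only the standard library. The sketch below mirrors the `http -f ... url=...` form of the examples above by sending the fields as a URL-encoded form POST; it assumes the endpoint accepts that encoding just as it does from HTTPie. The helper names `build_dssr_request` and `run_dssr` are hypothetical, and uploading a local file (the `model@` form) would need multipart encoding, which is not covered here.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

DSSR_API = "http://api.x3dna.org/dssr"


def build_dssr_request(coord_url, **options):
    """Build a form-encoded POST for the DSSR web API (no network access here).

    coord_url -- URL of a PDB or PDBx/mmCIF file, sent as the 'url' field
    options   -- option fields from the help message, e.g. json="true", pair=1
    """
    fields = {"url": coord_url}
    fields.update({name: str(value) for name, value in options.items()})
    return Request(DSSR_API, data=urlencode(fields).encode("ascii"), method="POST")


def run_dssr(coord_url, timeout=60, **options):
    """Send the request and return the response body as text (needs network)."""
    with urlopen(build_dssr_request(coord_url, **options), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_dssr_request("https://files.rcsb.org/download/1ehz.pdb.gz",
                         json="true", pair=1)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
# To actually run the analysis (requires network access):
# print(run_dssr("https://files.rcsb.org/download/1ehz.pdb.gz", json="true"))
```

Separating request construction from sending keeps the form fields easy to inspect and test without contacting the server.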

Help message on x3dna-snap (SNAP): http://api.x3dna.org/snap/help

Usage with 'http' (HTTPie):
    http -f http://api.x3dna.org/snap [options] url=|model@
    http http://api.x3dna.org/snap/help   -- display this help message

Options:
    json=true-or-FALSE(default)    [e.g., json=true # JSON output]
    hbond=true-or-FALSE(default)   [e.g., hbond=t   # H-bonding info]

Required parameter:
    url=URL-to-coordinate-file [e.g., url=https://files.rcsb.org/download/1oct.pdb.gz]
    model@coordinate-file      [e.g., model@1oct.cif]
    # Only one must be specified. 'url' precedes 'model' when both are specified.
    # The coordinate file must be in PDB or PDBx/mmCIF format, optionally gzipped.

Examples:
    http -f http://api.x3dna.org/snap url=https://files.rcsb.org/download/1oct.pdb.gz
    http -f http://api.x3dna.org/snap model@1oct.cif json=1
    # with 'curl'
    curl http://api.x3dna.org/snap -F 'url=https://files.rcsb.org/download/1oct.pdb.gz'
    curl http://api.x3dna.org/snap -F 'model=@1oct.cif' -F 'json=1'

Note:
    The web API has an upper limit on coordinate file size (gzipped): < 6 MB

Help message on 56 fiber models: http://api.x3dna.org/fiber/help

Usage with 'http' (HTTPie):
    http http://api.x3dna.org/fiber/help    # display this help message
    http http://api.x3dna.org/fiber/list    # show a list of available fiber models (56 in total)
    http http://api.x3dna.org/fiber/str_id  # build model 'str_id' in the range of [1, 56]
    http http://api.x3dna.org/fiber/name    # generate a model with common names as shown below:
              A-DNA, B-dna, C_DNA, D-DNA, ZDNA, RNA, RNAduplex, PaulingTriplex, G4
              Case does not matter, and the separator can be '-' or '_' or omitted.
              So a-dna, A-dNA, a_DNA, or ADNA is valid for building an A-DNA model.

Options (via query strings, or form fields):
    seq=base-sequence # A, C, G, T, U for generic model
    repeat=number     # number of repeats of the sequence
    cif=1             # output file in mmCIF format

Examples with 'http' (HTTPie):
    http http://api.x3dna.org/fiber/1       # model no. 1 (i.e., calf thymus A-DNA model)
    http -f http://api.x3dna.org/fiber/1 seq=A3TTT repeat=2  # specific sequence, repeated twice
    http http://api.x3dna.org/fiber/rna     # single-stranded RNA model
    http http://api.x3dna.org/fiber/rna-ds  # double-stranded RNA model
    http http://api.x3dna.org/fiber/pauling # the triplex model of Pauling & Corey
    http http://api.x3dna.org/fiber/g4      # G-quadruplex model
    # with 'curl'
    curl http://api.x3dna.org/fiber/1
    curl http://api.x3dna.org/fiber/1 -d 'seq=A3TTT' -d 'repeat=2'
    curl http://api.x3dna.org/fiber/rna
    curl http://api.x3dna.org/fiber/rna-ds
    curl http://api.x3dna.org/fiber/pauling
    curl http://api.x3dna.org/fiber/g4

Note:
    The web API has two upper limits: repeats < 1,000, and nucleotides < 10,000.
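Since the fiber options can be passed as query strings, a small helper can assemble the URLs and check the documented limits before any request is made. This is a sketch under assumptions: the function name `fiber_url` is hypothetical, the checks only cover the model range [1, 56] and the repeat limit stated in the help message, and the nucleotide limit is left to the server.

```python
from urllib.parse import urlencode

FIBER_API = "http://api.x3dna.org/fiber"


def fiber_url(model, seq=None, repeat=None, cif=False):
    """Return the fiber-model URL for a numeric id (1-56) or a common name."""
    if isinstance(model, int) and not 1 <= model <= 56:
        raise ValueError("numeric model id must be in the range [1, 56]")
    if repeat is not None and not 1 <= repeat < 1000:
        raise ValueError("repeat must be at least 1 and below 1,000")
    query = {}
    if seq is not None:
        query["seq"] = seq
    if repeat is not None:
        query["repeat"] = repeat
    if cif:
        query["cif"] = 1  # request mmCIF output
    url = f"{FIBER_API}/{model}"
    return f"{url}?{urlencode(query)}" if query else url


print(fiber_url(1, seq="A3TTT", repeat=2))
print(fiber_url("rna-ds"))
print(fiber_url("g4", cif=True))
```

The resulting URLs can be fetched with any HTTP client, including the `http` and `curl` commands shown above.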


---


Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu