Get hydrogen bonds with DSSR

H-bonding interactions are crucial for defining RNA secondary and tertiary structures. DSSR/3DNA contains a geometrically based algorithm for identifying H-bonds in nucleic-acid or protein structures given in .pdb or .cif format. Over the years, the method has been continuously refined, and it has served its purpose quite well. As of v1.1.1-2014apr11, this functionality is directly available from DSSR thorough the --get-hbonds option.

The output for 1msy, which contains a GUAA tetraloop mutant of Sarcin/Ricin domain from E. Coli 23 S rRNA, is listed below. The first line gives the header (# H-bonds in '1msy.pdb' identified by DSSR ...). The second line provides the total number of H-bonds (40) identified in the structure. Afterwards, each line consists of 8 space-delimited columns used to characterize a specific H-bond. Using the first one (#1) as an example, the meaning of each of the 8 columns is:

The serial number (15), as denoted in the .pdb or .cif file, of the first atom of the H-bond.
The serial number (578) of the second H-bond atom.
The H-bond index (#1), from 1 to the total number of H-bonds.
A one-letter symbol showing the atom-pair type (p) of the H-bond. It is ‘p’ for a donor-acceptor atom pair; ‘o’ for a donor/acceptor (such as the 2′-hydorxyl oxygen) with any other atom; ‘x’ for a donor-donor or acceptor-acceptor pair (as in #17); ‘?’ if the donor/acceptor status is unknown for any H-bond atom.
Distance in Å between donor/acceptor atoms (2.768).
Elemental symbols of the two atoms involved in the H-bond (O/N).
Identifier of the first H-bond atom (O4@A.U2647).
Identifier of the second H-bond atom (N1@A.G2673).

Command: x3dna-dssr -i=1msy.pdb --get-hbonds –o=1msy-hbonds.txt

# H-bonds in '1msy.pdb' identified by 3DNA version 3 (xiangjun@x3dna.org)
40
   15   578  #1     p    2.768 O:N O4@A.U2647 N1@A.G2673
   35   555  #2     p    2.776 O:N O6@A.G2648 N3@A.U2672
   36   554  #3     p    2.826 N:O N1@A.G2648 O2@A.U2672
   55   537  #4     p    2.965 O:N O2@A.C2649 N2@A.G2671
   56   535  #5     p    2.836 N:N N3@A.C2649 N1@A.G2671
   58   534  #6     p    2.769 N:O N4@A.C2649 O6@A.G2671
   76   513  #7     p    2.806 N:N N3@A.U2650 N1@A.A2670
   78   512  #8     p    3.129 O:N O4@A.U2650 N6@A.A2670
   95   492  #9     p    2.703 O:N O2@A.C2651 N2@A.G2669
   96   490  #10    p    2.853 N:N N3@A.C2651 N1@A.G2669
   98   489  #11    p    2.987 N:O N4@A.C2651 O6@A.G2669
  115   466  #12    p    2.817 O:N O2@A.C2652 N2@A.G2668
  116   464  #13    p    2.907 N:N N3@A.C2652 N1@A.G2668
  118   463  #14    p    2.897 N:O N4@A.C2652 O6@A.G2668
  123   151  #15    o    2.622 O:O OP2@A.U2653 O2'@A.A2654
  135   443  #16    p    2.898 O:N O2@A.U2653 N4@A.C2667
  147   192  #17    x    3.054 O:O O4'@A.A2654 O4'@A.U2656
  158   408  #18    p    2.960 N:O N6@A.A2654 OP2@A.C2666
  173   188  #19    o    2.923 O:O O2'@A.G2655 OP2@A.U2656
  173   378  #20    o    3.093 O:O O2'@A.G2655 O6@A.G2664
  173   379  #21    o    3.343 O:N O2'@A.G2655 N1@A.G2664
  181   386  #22    p    2.768 N:O N1@A.G2655 OP2@A.A2665
  183   203  #23    p    2.754 N:O N2@A.G2655 O4@A.U2656
  183   387  #24    p    2.887 N:O N2@A.G2655 O5'@A.A2665
  188   379  #25    p    3.044 O:N OP2@A.U2656 N1@A.G2664
  188   381  #26    p    2.944 O:N OP2@A.U2656 N2@A.G2664
  200   401  #27    p    3.122 O:N O2@A.U2656 N6@A.A2665
  201   398  #28    p    2.759 N:N N3@A.U2656 N7@A.A2665
  220   381  #29    p    3.035 N:N N7@A.A2657 N2@A.G2664
  223   371  #30    o    2.963 N:O N6@A.A2657 O2'@A.G2664
  223   382  #31    p    3.039 N:N N6@A.A2657 N3@A.G2664
  242   358  #32    p    2.821 O:N O2@A.C2658 N2@A.G2663
  243   356  #33    p    2.890 N:N N3@A.C2658 N1@A.G2663
  245   355  #34    p    2.887 N:O N4@A.C2658 O6@A.G2663
  258   305  #35    o    2.604 O:N O2'@A.G2659 N7@A.A2661
  258   308  #36    o    3.264 O:N O2'@A.G2659 N6@A.A2661
  268   315  #37    p    2.973 N:O N2@A.G2659 OP2@A.A2662
  268   327  #38    p    2.864 N:N N2@A.G2659 N7@A.A2662
  371   390  #39    o    2.751 O:O O2'@A.G2664 O4'@A.A2665
  550   566  #40    o    3.372 O:O O2'@A.U2672 O4'@A.G2673

In its default settings, DSSR detects 117 H-bonds for 1ehz (yeast phenylalanine tRNA), and 5,809 for 1jj2 (the H. marismortui large ribosomal subunit). Note that the program can identify H-bonds not only in RNA and DNA, but also in proteins, or their complexes. By default, however, DSSR only reports H-bonds within nucleic acids. As shown above, it is trivial to run DSSR with the --get-hbonds option to get all H-bonds in a given structure, and the plain text output is straightforward to work on.

While there exist dedicated tools for finding H-bonds, such as HBPLUS or HBexplore, DSSR may well be sufficient to fulfill most practical needs. If you notice any weird behaviors with this H-bond finding functionality, please let me know. I strive to address reported issues promptly, to the extent practical. At the very least, I should be able to explain why the program is working the way it does.

Comment

Why can hydrogen bonding form for a donor-donor or acceptor-acceptor pair for the ‘x’ type? Does solvent participate? Can you recommend any literature supporting it?

— Di · 2017-12-28 20:35 · #

Type ‘x’ simply means it fulfills the DSSR’s geometric criteria for an H-bond, but is questionable with regard to its donor-acceptor type. It may be a true H-bond as for the C+C pair in an I-motif, or an error in the analyzed structure.

See the thread General questions of H-bond section in DSSR for further details.

— Xiangjun Lu · 2017-12-28 21:11 · #

« DSSR-derived secondary structure in .ct format · DSSR-derived secondary structure in BPSEQ format »

Thank you for printing this article from http://x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu

Name	Remember
Email
Website
Message	Textile help

X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids(An NIGMS National Resource supported by NIH grant R24GM153869)

Get hydrogen bonds with DSSR

Comment

X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids
(An NIGMS National Resource supported by NIH grant R24GM153869)