An abasic site is a location in DNA or RNA where a purine or pyrimidine base is missing. It is also termed an AP site (i.e., apurinic/apyrimidinic site) in biochemistry and molecular genetics. The abasic site can be formed either spontaneously (e.g., depurination) or due to DNA damage (occurring as intermediates in base excision repair). According to Wikipedia, “It has been estimated that under physiological conditions 10,000 apurinic sites and 500 apyrimidinic may be generated in a cell daily.”
In DSSR and 3DNA v2.x, nucleotides are recognized using standard atom names and base planarity. Thus, abasic sites are not taken as nucleotides (by default), simply because they do not have base atoms. DSSR introduced the --abasic option to account for abasic sites, a feature useful for detecting loops with backbone connectivity.
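To make the idea concrete, here is a minimal Python sketch of how a residue might be flagged as abasic from its atom names alone: a sugar-phosphate backbone is present, but no base ring atoms are. This is not DSSR's actual algorithm; the function name `classify_residue`, the two atom sets, and the `min_backbone` threshold are assumptions chosen for illustration, and base planarity is not checked at all.

```python
# Hypothetical atom sets for illustration; DSSR's real criteria are not shown here.
BACKBONE_ATOMS = {"P", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'", "C2'", "C1'"}
BASE_RING_ATOMS = {"N1", "C2", "N3", "C4", "C5", "C6"}


def classify_residue(atom_names, min_backbone=7):
    """Return 'nucleotide', 'abasic', or 'other' based on atom names only."""
    names = set(atom_names)
    has_backbone = len(names & BACKBONE_ATOMS) >= min_backbone
    n_ring = len(names & BASE_RING_ATOMS)
    if n_ring == len(BASE_RING_ATOMS):
        # All six ring atoms shared by purines and pyrimidines are present.
        return "nucleotide"
    if has_backbone and n_ring == 0:
        # Sugar-phosphate backbone without any base ring atom.
        return "abasic"
    return "other"


# Atom names of C.HPD18 in 1l2c (no base atoms).
hpd18 = ["P", "O5'", "O1P", "O2P", "C5'", "O4'", "O3'",
         "C1'", "O1'", "C3'", "C4'", "C2'"]
# A complete uridine, for comparison.
uridine = ["P", "OP1", "OP2", "O5'", "C5'", "C4'", "O4'", "C3'", "O3'",
           "C2'", "O2'", "C1'", "N1", "C2", "O2", "N3", "C4", "O4", "C5", "C6"]

print(classify_residue(hpd18))    # abasic
print(classify_residue(uridine))  # nucleotide
```

Note that sugar atoms such as C2' and C4' are distinct strings from the base atoms C2 and C4, so the two sets never overlap.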
For example, by default, DSSR identifies one internal loop (no. 1 in the list below) in PDB entry 1l2c. With the --abasic option, two internal loops (including the one with the abasic site C.HPD18, no. 2) are detected.
List of 2 internal loops
   1 symmetric internal loop: nts=6; [1,1]; linked by [#-1,#1]
     summary: [2] 1 1 [B.1 C.24 B.3 C.22] 1 4
     nts=6 GTATAC B.DG1,B.DT2,B.DA3,C.DT22,C.DA23,C.DC24
       nts=1 T B.DT2
       nts=1 A C.DA23
   2 symmetric internal loop: nts=6; [1,1]; linked by [#1,#2]
     summary: [2] 1 1 [B.6 C.19 B.8 C.17] 4 5
     nts=6 CTTA?G B.DC6,B.DT7,B.DT8,C.DA17,C.HPD18,C.DG19
       nts=1 T B.DT7
       nts=1 ? C.HPD18
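When post-processing such a listing, it can be handy to pull out the residues whose one-letter code is "?". The Python sketch below parses the `nts=` lines of the text output shown above. It assumes each such line has the form `nts=<count> <sequence> <comma-separated ids>`, as in this example; the helper names `parse_nts_line` and `unknown_residues` are hypothetical and not part of DSSR.

```python
import re

# Matches lines such as: "nts=6 CTTA?G B.DC6,B.DT7,..."
NTS_LINE = re.compile(r"^\s*nts=(\d+)\s+(\S+)\s+(\S+)\s*$")


def parse_nts_line(line):
    """Return (count, sequence, [nt ids]) or None if the line does not match."""
    match = NTS_LINE.match(line)
    if match is None:
        return None
    count, sequence, ids = match.groups()
    return int(count), sequence, ids.split(",")


def unknown_residues(line):
    """List the nt ids whose one-letter code is '?' on an nts= line."""
    parsed = parse_nts_line(line)
    if parsed is None:
        return []
    _, sequence, nts = parsed
    return [nt for code, nt in zip(sequence, nts) if code == "?"]


loop1 = "     nts=6 GTATAC B.DG1,B.DT2,B.DA3,C.DT22,C.DA23,C.DC24"
loop2 = "     nts=6 CTTA?G B.DC6,B.DT7,B.DT8,C.DA17,C.HPD18,C.DG19"

print(unknown_residues(loop1))  # []
print(unknown_residues(loop2))  # ['C.HPD18']
```

Header lines such as "1 symmetric internal loop: nts=6; ..." do not start with `nts=` and are therefore skipped.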
Note that C.HPD18 in 1l2c is a non-standard residue, as shown in the HETATM records below. Since the identity of C.HPD18 cannot be deduced from the atomic records, its one-letter code is designated as ?.
HETATM  346  P   HPD C  18     -14.637  52.299  29.949  1.00 49.12           P
HETATM  347  O5' HPD C  18     -14.658  52.173  28.359  1.00 48.28           O
HETATM  348  O1P HPD C  18     -15.167  51.040  30.537  1.00 49.35           O
HETATM  349  O2P HPD C  18     -13.303  52.798  30.369  1.00 46.43           O
HETATM  350  C5' HPD C  18     -15.703  51.469  27.687  1.00 45.70           C
HETATM  351  O4' HPD C  18     -16.364  50.501  25.561  1.00 44.15           O
HETATM  352  O3' HPD C  18     -13.990  51.738  24.335  1.00 45.75           O
HETATM  353  C1' HPD C  18     -16.105  54.187  25.684  1.00 52.47           C
HETATM  354  O1' HPD C  18     -17.309  54.085  26.496  1.00 56.16           O
HETATM  355  C3' HPD C  18     -14.756  52.250  25.426  1.00 46.23           C
HETATM  356  C4' HPD C  18     -15.263  51.093  26.291  1.00 45.72           C
HETATM  357  C2' HPD C  18     -16.030  52.889  24.898  1.00 49.05           C
In contrast, R.U-8 in PDB entry 4ifd also lacks base atoms, but its residue name is the standard U, so DSSR labels it properly as U.
ATOM  26418  P     U R  -8     139.362  21.962 129.430  1.00208.29           P
ATOM  26419  OP1   U R  -8     140.062  20.821 130.074  1.00207.30           O
ATOM  26420  OP2   U R  -8     140.113  23.208 129.129  1.00208.44           O1+
ATOM  26421  O5'   U R  -8     138.712  21.439 128.071  1.00157.60           O
ATOM  26422  C5'   U R  -8     139.507  20.790 127.087  1.00155.47           C
ATOM  26423  C4'   U R  -8     138.843  20.804 125.731  1.00152.27           C
ATOM  26424  O4'   U R  -8     138.538  22.172 125.352  1.00149.29           O
ATOM  26425  C3'   U R  -8     139.677  20.275 124.572  1.00152.70           C
ATOM  26426  O3'   U R  -8     139.670  18.859 124.478  1.00155.04           O
ATOM  26427  C2'   U R  -8     139.053  20.969 123.369  1.00150.26           C
ATOM  26428  O2'   U R  -8     137.849  20.322 122.984  1.00146.83           O
ATOM  26429  C1'   U R  -8     138.700  22.334 123.958  1.00147.35           C
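The contrast between the two cases can be sketched in Python: with no base atoms to inspect, the only clue to identity is the residue name, so a standard name maps to its letter and anything else falls back to "?". The code reads the residue fields from the fixed columns of PDB ATOM/HETATM records. The table `STANDARD_CODES` and the helpers `parse_record` and `label_residues` are assumptions for illustration, not DSSR internals, and the id format `chain.nameNumber` simply mimics the labels used above.

```python
# Hypothetical lookup table: standard DNA/RNA residue names to one-letter codes.
STANDARD_CODES = {"A": "A", "C": "C", "G": "G", "U": "U",
                  "DA": "A", "DC": "C", "DG": "G", "DT": "T"}


def parse_record(line):
    """Extract (chain, residue name, residue number) from PDB fixed columns."""
    res_name = line[17:20].strip()   # columns 18-20
    chain = line[21]                 # column 22
    res_num = int(line[22:26])       # columns 23-26
    return chain, res_name, res_num


def label_residues(lines):
    """Map each residue id (e.g. 'C.HPD18') to a one-letter code or '?'."""
    labels = {}
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain, res_name, res_num = parse_record(line)
        res_id = f"{chain}.{res_name}{res_num}"
        labels[res_id] = STANDARD_CODES.get(res_name, "?")
    return labels


RECORDS = [
    "HETATM  346  P   HPD C  18     -14.637  52.299  29.949  1.00 49.12           P",
    "HETATM  353  C1' HPD C  18     -16.105  54.187  25.684  1.00 52.47           C",
    "ATOM  26418  P     U R  -8     139.362  21.962 129.430  1.00208.29           P",
    "ATOM  26429  C1'   U R  -8     138.700  22.334 123.958  1.00147.35           C",
]

print(label_residues(RECORDS))  # {'C.HPD18': '?', 'R.U-8': 'U'}
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields can run together, as the occupancy and B-factor do in the 4ifd records (1.00208.29).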
This is yet another little detail that DSSR takes care of. It is the close attention to many such subtle points that makes DSSR different. Overall, DSSR represents my view of what a scientific software program could be (or should be).