Recently, I came across, and was surprised by, the different assignment of HETATM vs. ATOM records for modified nucleotides in the PDB vs. PDBx/mmCIF formats. As always, the issue is best illustrated with a concrete example. Here is what I observed in PDB entry 1ehz, the crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution.
DSSR identifies 14 modified nucleotides (of 11 types) in 1ehz as shown below:
List of 11 types of 14 modified nucleotides
      nt    count  list
   1 1MA-a    1    A.1MA58
   2 2MG-g    1    A.2MG10
   3 5MC-c    2    A.5MC40,A.5MC49
   4 5MU-t    1    A.5MU54
   5 7MG-g    1    A.7MG46
   6 H2U-u    2    A.H2U16,A.H2U17
   7 M2G-g    1    A.M2G26
   8 OMC-c    1    A.OMC32
   9 OMG-g    1    A.OMG34
  10 PSU-P    2    A.PSU39,A.PSU55
  11 YYG-g    1    A.YYG37
In the file 1ehz.pdb downloaded from the RCSB PDB, all 14 modified nucleotides are assigned HETATM records, whereas in 1ehz.cif the corresponding records are ATOM. Here is the excerpt for 1MA58 in PDB format:
HETATM 1252  P   1MA A  58      73.770  67.765  34.057  1.00 30.65           P
HETATM 1253  OP1 1MA A  58      72.638  67.886  33.105  1.00 32.84           O
HETATM 1254  OP2 1MA A  58      73.621  68.229  35.450  1.00 29.49           O
HETATM 1255  O5' 1MA A  58      74.315  66.273  34.254  1.00 28.81           O
HETATM 1256  C5' 1MA A  58      74.592  65.439  33.080  1.00 29.42           C
HETATM 1257  C4' 1MA A  58      74.279  63.972  33.383  1.00 33.42           C
HETATM 1258  O4' 1MA A  58      74.880  63.685  34.667  1.00 32.36           O
HETATM 1259  C3' 1MA A  58      72.789  63.573  33.509  1.00 35.13           C
HETATM 1260  O3' 1MA A  58      72.625  62.168  33.250  1.00 36.80           O
HETATM 1261  C2' 1MA A  58      72.560  63.667  35.012  1.00 34.80           C
HETATM 1262  O2' 1MA A  58      71.525  62.828  35.506  1.00 36.27           O
HETATM 1263  C1' 1MA A  58      73.908  63.150  35.551  1.00 33.62           C
HETATM 1264  N9  1MA A  58      74.284  63.494  36.930  1.00 30.36           N
HETATM 1265  C8  1MA A  58      73.887  64.574  37.688  1.00 34.55           C
HETATM 1266  N7  1MA A  58      74.415  64.610  38.899  1.00 33.32           N
HETATM 1267  C5  1MA A  58      75.204  63.469  38.953  1.00 33.37           C
HETATM 1268  C6  1MA A  58      76.031  62.941  39.948  1.00 33.58           C
HETATM 1269  N6  1MA A  58      76.184  63.488  41.134  1.00 41.19           N
HETATM 1270  N1  1MA A  58      76.708  61.803  39.669  1.00 34.48           N
HETATM 1271  CM1 1MA A  58      77.649  61.222  40.626  1.00 31.43           C
HETATM 1272  C2  1MA A  58      76.527  61.216  38.479  1.00 28.43           C
HETATM 1273  N3  1MA A  58      75.793  61.624  37.453  1.00 31.67           N
HETATM 1274  C4  1MA A  58      75.142  62.771  37.747  1.00 33.02           C
The corresponding section in PDBx/mmCIF format is:
ATOM 1252 P P     . 1MA A 1 58 ? 73.770 67.765 34.057 1.00 30.65 ? ? ? ? ? ? 58 1MA A P     1
ATOM 1253 O OP1   . 1MA A 1 58 ? 72.638 67.886 33.105 1.00 32.84 ? ? ? ? ? ? 58 1MA A OP1   1
ATOM 1254 O OP2   . 1MA A 1 58 ? 73.621 68.229 35.450 1.00 29.49 ? ? ? ? ? ? 58 1MA A OP2   1
ATOM 1255 O "O5'" . 1MA A 1 58 ? 74.315 66.273 34.254 1.00 28.81 ? ? ? ? ? ? 58 1MA A "O5'" 1
ATOM 1256 C "C5'" . 1MA A 1 58 ? 74.592 65.439 33.080 1.00 29.42 ? ? ? ? ? ? 58 1MA A "C5'" 1
ATOM 1257 C "C4'" . 1MA A 1 58 ? 74.279 63.972 33.383 1.00 33.42 ? ? ? ? ? ? 58 1MA A "C4'" 1
ATOM 1258 O "O4'" . 1MA A 1 58 ? 74.880 63.685 34.667 1.00 32.36 ? ? ? ? ? ? 58 1MA A "O4'" 1
ATOM 1259 C "C3'" . 1MA A 1 58 ? 72.789 63.573 33.509 1.00 35.13 ? ? ? ? ? ? 58 1MA A "C3'" 1
ATOM 1260 O "O3'" . 1MA A 1 58 ? 72.625 62.168 33.250 1.00 36.80 ? ? ? ? ? ? 58 1MA A "O3'" 1
ATOM 1261 C "C2'" . 1MA A 1 58 ? 72.560 63.667 35.012 1.00 34.80 ? ? ? ? ? ? 58 1MA A "C2'" 1
ATOM 1262 O "O2'" . 1MA A 1 58 ? 71.525 62.828 35.506 1.00 36.27 ? ? ? ? ? ? 58 1MA A "O2'" 1
ATOM 1263 C "C1'" . 1MA A 1 58 ? 73.908 63.150 35.551 1.00 33.62 ? ? ? ? ? ? 58 1MA A "C1'" 1
ATOM 1264 N N9    . 1MA A 1 58 ? 74.284 63.494 36.930 1.00 30.36 ? ? ? ? ? ? 58 1MA A N9    1
ATOM 1265 C C8    . 1MA A 1 58 ? 73.887 64.574 37.688 1.00 34.55 ? ? ? ? ? ? 58 1MA A C8    1
ATOM 1266 N N7    . 1MA A 1 58 ? 74.415 64.610 38.899 1.00 33.32 ? ? ? ? ? ? 58 1MA A N7    1
ATOM 1267 C C5    . 1MA A 1 58 ? 75.204 63.469 38.953 1.00 33.37 ? ? ? ? ? ? 58 1MA A C5    1
ATOM 1268 C C6    . 1MA A 1 58 ? 76.031 62.941 39.948 1.00 33.58 ? ? ? ? ? ? 58 1MA A C6    1
ATOM 1269 N N6    . 1MA A 1 58 ? 76.184 63.488 41.134 1.00 41.19 ? ? ? ? ? ? 58 1MA A N6    1
ATOM 1270 N N1    . 1MA A 1 58 ? 76.708 61.803 39.669 1.00 34.48 ? ? ? ? ? ? 58 1MA A N1    1
ATOM 1271 C CM1   . 1MA A 1 58 ? 77.649 61.222 40.626 1.00 31.43 ? ? ? ? ? ? 58 1MA A CM1   1
ATOM 1272 C C2    . 1MA A 1 58 ? 76.527 61.216 38.479 1.00 28.43 ? ? ? ? ? ? 58 1MA A C2    1
ATOM 1273 N N3    . 1MA A 1 58 ? 75.793 61.624 37.453 1.00 31.67 ? ? ? ? ? ? 58 1MA A N3    1
ATOM 1274 C C4    . 1MA A 1 58 ? 75.142 62.771 37.747 1.00 33.02 ? ? ? ? ? ? 58 1MA A C4    1
While I have not tested exhaustively, it appears that PDBx/mmCIF files use a different definition of what constitutes a HETATM residue. It is worth noting that results from 3DNA and DSSR/SNAP are not affected by the conflicting assignments.
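For readers who want to check other entries, here is a minimal Python sketch that tallies ATOM vs. HETATM records for a given residue name in each format. The function names `pdb_record_types` and `mmcif_record_types` are hypothetical helpers, not part of 3DNA or DSSR. The mmCIF helper assumes the atom_site column order seen in the 1ehz excerpt (record type first, residue name sixth); a robust parser would read the loop_ header instead. The sample lines are copied from the 1MA58 excerpts above.

```python
from collections import Counter


def pdb_record_types(lines, resname):
    """Count ATOM/HETATM records for one residue name in PDB-format lines."""
    counts = Counter()
    for line in lines:
        record = line[:6].strip()
        # Fixed-column PDB format: residue name sits in columns 18-20 (1-based).
        if record in ("ATOM", "HETATM") and line[17:20].strip() == resname:
            counts[record] += 1
    return counts


def mmcif_record_types(lines, resname):
    """Count ATOM/HETATM rows for one residue name in mmCIF atom_site rows.

    Assumption: columns are ordered as in the 1ehz excerpt, i.e. the record
    type is the first field and the residue name is the sixth.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > 5 and fields[0] in ("ATOM", "HETATM") and fields[5] == resname:
            counts[fields[0]] += 1
    return counts


# Two records per format, taken from the 1MA58 excerpts.
pdb_sample = [
    "HETATM 1252  P   1MA A  58      73.770  67.765  34.057  1.00 30.65           P",
    "HETATM 1253  OP1 1MA A  58      72.638  67.886  33.105  1.00 32.84           O",
]
cif_sample = [
    'ATOM 1252 P P . 1MA A 1 58 ? 73.770 67.765 34.057 1.00 30.65 ? ? ? ? ? ? 58 1MA A P 1',
    'ATOM 1255 O "O5\'" . 1MA A 1 58 ? 74.315 66.273 34.254 1.00 28.81 ? ? ? ? ? ? 58 1MA A "O5\'" 1',
]

print(dict(pdb_record_types(pdb_sample, "1MA")))    # {'HETATM': 2}
print(dict(mmcif_record_types(cif_sample, "1MA")))  # {'ATOM': 2}
```

To apply this to whole files, pass an open file handle (for example `open("1ehz.pdb")`) in place of the sample lists, since both helpers simply iterate over lines.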