Ikenna E. Ndukwe‡ a, Yu-hong Lam b, Sunil K. Pandey c, Bengt E. Haug c, Annette Bayer d, Edward C. Sherer a, Kirill A. Blinov e, R. Thomas Williamson§ a, Johan Isaksson d, Mikhail Reibarkh a, Yizhou Liu¶ *a and Gary E. Martin|| *a
a Analytical Research & Development, (Rahway), Merck & Co. Inc., Kenilworth, NJ, USA. E-mail: yizhou.liu@gmail.com
b Computational and Structural Chemistry, Merck & Co., Inc., Rahway, NJ 07065, USA
c Department of Chemistry and Centre for Pharmacy, University of Bergen, Allégaten 41, NO-5020 Bergen, Norway
d Department of Chemistry, UiT the Arctic University of Tromsø, NO-9037 Tromsø, Norway
e MestReLab Research S. L., Santiago de Compostela, A Coruna, 15706, Spain
First published on 21st September 2020
Structural features of proton-deficient heteroaromatic natural products, such as the breitfussins, can severely complicate their characterization by NMR spectroscopy. For the breitfussins in particular, the constitution of the five-membered oxazole central ring cannot be unequivocally established via conventional NMR methods when the 4′-position is halogenated. The difficulty is exacerbated by 4′-iodination, as the accuracy with which theoretical NMR parameters are determined relies extensively on the computational treatment of the relativistic effects of the iodine atom. The present study demonstrates that the structure of a 4′-iodo breitfussin analog can be unequivocally established by anisotropic NMR methods, by adopting a reduced singular value decomposition (SVD) protocol that leverages the planar structures exhibited by its conformers.
Recent advances in our ability to measure and utilize Residual Chemical Shift Anisotropy (RCSA) data have further augmented the utility of Residual Dipolar Couplings (RDC) and anisotropic NMR methods for defining the constitution and configuration of small molecules.15–22 In particular, anisotropic NMR methods generate additional experimental constraints that provide orthogonal validation of structural proposals in a manner that is not prone to investigator bias.7,17,23,24 The simultaneous use of both RDC and RCSA data is highly desirable; the former establishes the relative orientations of different C–H bond vectors, while the latter affords orientation information on the chemical shielding tensors of both protonated and non-protonated carbons. This strategy generally provides a more robust and discriminative analysis of structural proposals than that provided by using either RDC or RCSA data alone. Although these anisotropic NMR data in conjunction with DFT calculations have been successfully applied to several highly complex natural products,13,24,25 molecules like the breitfussins present still further challenges. First, iodo-substitution introduces significant relativistic effects that potentially decrease the accuracy of the theoretically calculated molecular geometries and chemical shielding tensors. Second, the nearly planar lowest energy conformation limits the out-of-plane orientational sampling of anisotropic NMR data and consequently their utility in conventional singular value decomposition (SVD) analysis. Finally, the two rotatable bonds (bold red bonds in Fig. 1) connecting the three aromatic moieties lead to conformational flexibility that complicates the interpretation of the anisotropic NMR data.
However, we demonstrate that the structure of a breitfussin A analog can be unequivocally determined solely based on RDC and RCSA data through implementation of the single-tensor SVD method by making use of the unique alignment properties of its planar structures. To the best of our knowledge, this approach has not yet been used for natural product structural characterization, and applications will likely be limited to predominantly planar conformations.
Fig. 1 Four plausible constitutional isomers of a breitfussin A analog (see ESI† for the complete structural ensemble based on positional isomerism of the central aromatic ring). Red bonds denote rotatable bonds.
Although the so-called heavy-atom–light-atom (HALA) relativistic effects36 are primarily observed at the carbon nucleus directly attached to the heavy atom, their impact on conformer energies can be significant (Table 1). Geometry optimization of 1–4 at three levels of theory that are well validated for the modeling of organic molecules26–35 revealed significant variations in the Boltzmann populations, which reflect differences in the computed conformer energies.39 The broad implication of this observation is that DFT-derived chemical shifts and other NMR parameters will likely be incorrectly weighted and could thus lead to unreliable comparisons with the experimental data. Specifically, the calculated Boltzmann population of conformer 1a, the major conformer of the correct constitutional isomer (vide infra), ranges from 79.1% to 48.5% depending on the DFT functional and basis set used. The population of this conformer was, however, experimentally determined to be approximately 45% from ROESY measurements (see ESI† for details). Consequently, geometries and energies obtained with the contracted basis set for iodine and bromine (TZP-DKH33–35) were used in all later comparisons, based on the assumption that the more accurate energy obtained for 1a indicates better suitability of this basis set for theoretical calculations on the other isomers as well.
Isomers | Conformers | B3LYP/BS1 a,b (%) | M06-2X/BS2 a,b (%) | M06-2X/BS3 a (%) |
---|---|---|---|---|
1 | 1a | 79.1 (78.8) | 61.8 (56.0) | 48.5 |
1 | 1b | 3.0 (3.1) | 19.6 (26.5) | 33.8 |
1 | 1c | 17.9 (18.0) | 14.0 (11.9) | 10.1 |
1 | 1d | 0 | 4.6 (5.7) | 7.7 |
2 | 2a | 66 (67.1) | 42.5 (46.3) | 62.6 |
2 | 2b | 20.3 (20.7) | 27.5 (24.0) | 16.8 |
2 | 2c | 10.2 (9.2) | 18.8 (20.4) | 16.8 |
2 | 2d | 3.4 (3.1) | 11.2 (9.4) | 3.8 |
3 | 3a | 76.7 (76.8) | 61.0 (53.3) | 39.3 |
3 | 3b | 2.4 (2.5) | 20.4 (27.6) | 43.9 |
3 | 3c | 20.9 (20.7) | 13.6 (12.2) | 9.1 |
3 | 3d | 0 | 5.0 (6.9) | 7.7 |
4 | 4a | 65.5 (65.2) | 61 (59.4) | 35.4 |
4 | 4b | 29.3 (29.1) | 19.4 (18.6) | 28.3 |
4 | 4c | 3.5 (3.9) | 14.9 (16.9) | 22.8 |
4 | 4d | 1.7 (1.8) | 4.7 (5.1) | 13.4 |

a BS1: 6-31G* basis set on C, H, N, O and Br; MIDI!31 basis set on I. BS2: 6-31+G** basis on C, H, N, O and Br; DZDZVP32 basis on I. BS3: 6-31+G** basis on C, H, N and O; TZP-DKH33–35 basis on Br and I.
b Bracketed values were derived using electronic energies computed with the Gaussian implementation of the Douglas-Kroll-Hess (DKH) Hamiltonian.
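The sensitivity of the populations in Table 1 to small energy differences can be illustrated with a minimal sketch of the Boltzmann weighting. The function name, the temperature of 298.15 K and the relative energies below are hypothetical placeholders chosen for illustration; they are not values from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Convert relative conformer energies (kcal/mol) to percent populations."""
    e_min = min(rel_energies_kcal)
    # Shift by the lowest energy so the largest weight is exactly 1.
    weights = [math.exp(-(e - e_min) / (R_KCAL * temperature))
               for e in rel_energies_kcal]
    total = sum(weights)
    return [100.0 * w / total for w in weights]

# Hypothetical relative energies for four conformers (a-d), NOT from the paper.
populations = boltzmann_populations([0.0, 0.2, 0.9, 1.1])
print([round(p, 1) for p in populations])
```

Because the weights are exponential in the energy, differences of a few tenths of a kcal/mol between levels of theory are enough to reorder the conformer populations.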
The population-averaged ¹³C chemical shifts calculated from DFT (mPW1PW91/TZP-DKH for iodine/bromine and mPW1PW91/6-311+G(2d,p) for other atoms) for 1–4 were compared to the experimental chemical shift values (chemical shift comparisons of the isoxazole analogs can be found in the ESI†). The mean absolute error (MAE), excluding the halogen-bearing carbons, favours 1 (1.91 ppm) over 2, 3 and 4 (3.83, 2.95, and 4.21 ppm, respectively). Bar charts of the absolute chemical shift errors for 1–4 are shown in Fig. 2. Although 2, 3 and 4 have larger average errors than 1, unequivocal distinction of the isomers (especially between 1 and 3) is limited by possible errors in the DFT-computed chemical shift values (see Fig. S13† for chemical shift analysis of the isoxazole analogs). Clearly, further structural verification by orthogonal means, such as the utilization of RDC and RCSA data, is strongly justified.
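The comparison described above amounts to a population-weighted average followed by an MAE over the retained carbons. The sketch below shows that arithmetic under stated assumptions: the function names are illustrative, and the shifts and populations are made-up numbers, not data from this work.

```python
def population_average(shifts_by_conformer, populations):
    """Population-weighted average shift for each carbon."""
    total = sum(populations)
    n_carbons = len(shifts_by_conformer[0])
    return [
        sum(p * s[i] for p, s in zip(populations, shifts_by_conformer)) / total
        for i in range(n_carbons)
    ]

def mean_absolute_error(calculated, experimental, exclude=()):
    """MAE in ppm, skipping carbon indices listed in `exclude`."""
    errors = [abs(c - e)
              for i, (c, e) in enumerate(zip(calculated, experimental))
              if i not in exclude]
    return sum(errors) / len(errors)

# Hypothetical shifts (ppm) for two conformers and three carbons.
calc = population_average([[100.0, 120.0, 80.0], [102.0, 118.0, 90.0]], [75.0, 25.0])
# Index 2 stands in for a halogen-bearing carbon and is excluded.
mae = mean_absolute_error(calc, [101.0, 119.0, 60.0], exclude={2})
```

Excluding the halogen-bearing positions keeps the carbons most affected by relativistic effects from dominating the error.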
As shown in Table 1, the oxazole constitutional isomers can adopt at least two major conformations. Consequently, any meaningful interpretation of experimental RDC and RCSA data must account for this rotational exchange via comparisons with theoretical averages. The single-tensor singular-value decomposition (SVD) method was utilized to differentiate these isomers.40–42 First, the coordinates and chemical shielding (CS) tensors of all conformers were superimposed to achieve the smallest RMSD for atomic positions through the principal axis frame (PAF) of their mass-weighted gyration tensors (Fig. 3). As Azurmendi, et al.43 and Almond, et al.44,45 have shown for biomolecules aligned in plane-like media, the principal axes of the mass-weighted gyration tensor, or the closely related moment of inertia tensor, coincide with those of the alignment tensor. For small molecules aligned in polymeric gels, this relationship cannot be assumed. As a result, we utilized the gyration tensor PAF only to provide a common frame for all conformations associated with each isomer, thus setting the stage for single-tensor SVD analysis, following the proposal of Burnell and de Lange.42 As the gyration tensor has 4-fold symmetry, structural superposition was conducted by considering four possible orientations of each conformer relative to a reference conformer, and choosing the orientation that gave the lowest RMSD for pair-wise atomic positions. The Saupe order matrix in this common frame was assumed to be identical for all conformations and was determined by SVD using five free variables, specifically Syy, Szz, Sxy, Syz, and Sxz, which were then used to back-calculate the theoretical averages of RDC and RCSA for each isomer.46 The population of each conformation was optimized by the Nelder-Mead simplex procedure to minimize the Q-factor. This standard approach, the results of which are summarized in Table 2, is hereafter referred to as “method A”.
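The five-parameter, single-tensor fit can be sketched as follows for RDC-type data. This is an illustration under several assumptions rather than the implementation used here: couplings are taken as already normalized by their dipolar constants, so each observable is r·S·r for a unit vector r; the Q-factor is taken as the root of the summed squared residuals divided by the summed squared observations; RCSA rows (which would be built from the CS tensors) and the simplex optimization of the populations are omitted; and all names and the synthetic data are hypothetical.

```python
import numpy as np

def design_rows(unit_vectors):
    """One row per unit vector; columns multiply (Syy, Szz, Sxy, Sxz, Syz)."""
    x, y, z = np.asarray(unit_vectors, dtype=float).T
    # Sxx is eliminated through the traceless condition Sxx = -(Syy + Szz).
    return np.column_stack([y*y - x*x, z*z - x*x, 2*x*y, 2*x*z, 2*y*z])

def fit_saupe_svd(rows_by_conformer, populations, observed):
    """Fit one Saupe matrix to population-averaged data; return (parameters, Q)."""
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()
    observed = np.asarray(observed, dtype=float)
    # Population-weighted average of the per-conformer design matrices.
    A = np.tensordot(p, np.asarray(rows_by_conformer, dtype=float), axes=1)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > 1e-10 * s.max()  # discard numerically null directions
    params = Vt[keep].T @ ((U[:, keep].T @ observed) / s[keep])
    calc = A @ params
    q = np.sqrt(np.sum((observed - calc) ** 2) / np.sum(observed ** 2))
    return params, q

# Synthetic, noise-free data for two conformers with eight vectors each.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 8, 3))
vectors /= np.linalg.norm(vectors, axis=2, keepdims=True)
rows = np.array([design_rows(v) for v in vectors])
true_params = np.array([1e-4, -3e-4, 2e-4, 5e-5, -1e-4])
observed = (0.7 * rows[0] + 0.3 * rows[1]) @ true_params
params, q = fit_saupe_svd(rows, [0.7, 0.3], observed)
```

With randomly oriented vectors the five parameters are well determined; when all vectors lie close to one plane, as for the conformers considered here, the columns that carry the out-of-plane information become poorly conditioned.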
With the exception of 3, which exhibited a considerably higher Q-value of 0.154, the isomers yielded very low Q-values: 0.050, 0.078 and 0.067 for 1, 2 and 4, respectively. This lack of differentiation clearly indicates that the available experimental data were insufficient for method A to unambiguously identify the correct isomer, and that some degree of overfitting had likely occurred. It is worth pointing out that isomer 3, the closest isomer to 1 based on chemical shift MAE (2.95 vs. 1.91 ppm, Fig. 2), gives the largest Q-value of 0.154. This complementarity of chemical shifts and anisotropic NMR data underscores the value of utilizing both approaches for structure elucidation of challenging molecules (similar observations have been made in a previously published study19). Due to conjugation, the conformers of the breitfussin A isomers, 1–4, are either nearly planar or have biaryl dihedral angles of <50° (see Fig. 3, S14 and S15† for details). Consequently, the out-of-plane orientation sampling in the anisotropic NMR data is minimal. Although RCSA data for the aromatic carbons provide information on the orientation of the plane norm, this information is highly redundant for different carbons in a nearly flat conformation. To circumvent the potential problem of over-fitting by SVD analysis, we sought to impose additional constraints that leverage the planarity of the conformers of these isomers.
Fig. 3 Conformations of isomers 1–4 (a–d) and their alignment tensor principal axes. Different conformations from the same isomer were initially superimposed through their mass-weighted gyration tensor PAFs and then separated vertically for better visualization (see alternative views in Fig. S14 and S15;† dihedral angles are collected in Table S5†). The alignment tensor principal axes determined from methods A and B are coloured in red and green, respectively. The plane norm is indicated with “n”.
Isomers | Conformers | Population by DFT (%) | Optimized population a (%) | Q-factor (ensemble) a, population is a variable | Optimized population b (%) | Q-factor (ensemble) b, population is a variable | Q-factor (ensemble) b, population fixed to DFT values |
---|---|---|---|---|---|---|---|
1 | 1a | 48.5 | 16 | 0.050 | 67 | 0.053 | 0.150 |
1 | 1b | 33.8 | 82.2 | | 31.4 | | |
1 | 1c | 10.1 | 1.8 | | 1.6 | | |
1 | 1d | 7.7 | 0 | | 0 | | |
2 | 2a | 62.6 | 0 | 0.078 | 0 | 0.095 | 0.744 |
2 | 2b | 16.8 | 0 | | 0 | | |
2 | 2c | 16.8 | 0 | | 43.1 | | |
2 | 2d | 3.8 | 100 | | 56.9 | | |
3 | 3a | 39.3 | 0 | 0.154 | 0 | 0.232 | 0.752 |
3 | 3b | 43.9 | 3.5 | | 0 | | |
3 | 3c | 9.1 | 67.1 | | 21 | | |
3 | 3d | 7.7 | 29.3 | | 79 | | |
4 | 4a | 35.4 | 0 | 0.067 | 0 | 0.126 | 0.802 |
4 | 4b | 28.3 | 0 | | 0 | | |
4 | 4c | 22.8 | 92.3 | | 100 | | |
4 | 4d | 13.4 | 7.8 | | 0 | | |

a Results obtained with method A. b Results obtained with method B.
Below we demonstrate that for a conformation of reflection symmetry, the mirror norm must be a principal axis of the alignment tensor when an achiral medium is used. To see this, we construct a molecular frame (MF) in which the mirror norm is along the z-axis; the x- and y-axes are within the mirror plane. The magnetic field B0 has a zenith angle θ and azimuthal angle φ in this MF. To show that z is a principal axis of the alignment tensor, we prove that the off-diagonal Saupe order matrix elements Sxz and Syz are zero. For instance, Sxz can be determined using the following equation:47
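A standard form of this relation is sketched below. It assumes that P(θ, φ) denotes the normalized probability density for finding B0 along the direction (θ, φ) in the MF; this symbol is introduced here for illustration and is not defined in the text above.

```latex
S_{xz} \;=\; \frac{3}{2}\,\bigl\langle \sin\theta\cos\theta\cos\varphi \bigr\rangle
\;=\; \frac{3}{2}\int_{0}^{2\pi}\!\!\int_{0}^{\pi}
P(\theta,\varphi)\,\sin\theta\cos\theta\cos\varphi\;\sin\theta\,\mathrm{d}\theta\,\mathrm{d}\varphi
```

Under these assumptions the argument runs as follows. Reflection through the mirror plane maps θ to π − θ and leaves φ unchanged. Because the conformation coincides with its mirror image and the medium is achiral, P(π − θ, φ) = P(θ, φ). The factor cos θ changes sign under this operation while sin θ and cos φ do not, so the integrand is antisymmetric about θ = π/2 and the integral vanishes. The same reasoning, with cos φ replaced by sin φ, gives Syz = 0.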
In method B, we impose the conclusion from the preceding paragraph on the SVD analysis. If all conformations are transformed to a frame in which the plane norm is a Cartesian axis, e.g. the z-axis, then the off-diagonal elements of the Saupe order matrix, Syz and Sxz, must be zero. Consequently, only three parameters, namely Syy, Sxx, and Sxy, need to be determined by SVD instead of five, because this reduced SVD analysis only needs to determine the orientation of the two in-plane principal axes and the order parameters associated with them. In practice, the implementation of this concept is quite simple and is based on the gyration tensor mentioned earlier. The approximate plane norm of a nearly flat conformation can be identified as the direction associated with the smallest principal moment of gyration (λ₁²). For a perfectly flat structure, λ₁² is zero. Therefore, the coordinates and CS tensors superimposed through the gyration tensor, as previously used in method A, can be directly used in method B, except that only three Saupe order matrix elements are used for SVD. The principal moments are listed in Table 3. The plane norm is associated with λ₁², which is either zero or over five times smaller than λ₂² in all cases. It is also evident from Table 3 that different conformers of each constitutional isomer have very similar principal moments of gyration, and therefore the single-tensor approach is likely viable with the neutral poly-HEMA/MMA medium, in which alignment takes place mostly through steric interactions.
Isomers | Conformers | λ₁² | λ₂² | λ₃² |
---|---|---|---|---|
1 | 1a | 0.0 | 6.1 | 13.3 |
1 | 1b | 0.6 | 3.9 | 14.8 |
1 | 1c | 0.6 | 4.1 | 14.6 |
1 | 1d | 0.1 | 6.1 | 13.1 |
2 | 2a | 0.0 | 4.6 | 17.3 |
2 | 2b | 0.0 | 3.1 | 17.7 |
2 | 2c | 0.0 | 3.0 | 17.8 |
2 | 2d | 0.0 | 4.4 | 17.5 |
3 | 3a | 0.2 | 5.9 | 13.6 |
3 | 3b | 0.7 | 3.8 | 15.0 |
3 | 3c | 0.7 | 3.6 | 15.1 |
3 | 3d | 0.1 | 5.7 | 13.9 |
4 | 4a | 0.0 | 4.6 | 16.7 |
4 | 4b | 0.0 | 3.3 | 18.0 |
4 | 4c | 0.0 | 4.4 | 16.8 |
4 | 4d | 0.0 | 3.1 | 18.2 |
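The two ingredients of method B, locating the plane norm from the gyration tensor and building a three-column design matrix, can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the five-atom planar fragment is invented test geometry, and the observables are again taken as normalized quantities of the form r·S·r.

```python
import numpy as np

def to_plane_frame(coords, masses):
    """Rotate into the mass-weighted gyration-tensor PAF, plane norm along z."""
    r = np.asarray(coords, dtype=float)
    m = np.asarray(masses, dtype=float)
    d = r - (m[:, None] * r).sum(axis=0) / m.sum()   # centre-of-mass coordinates
    G = np.einsum("i,ij,ik->jk", m, d, d) / m.sum()  # gyration tensor
    moments, vecs = np.linalg.eigh(G)                # eigenvalues ascending
    # Rows of R are the new x, y, z axes: largest, middle, smallest moment.
    R = np.vstack([vecs[:, 2], vecs[:, 1], vecs[:, 0]])
    if np.linalg.det(R) < 0:
        R[0] *= -1.0                                 # keep a right-handed frame
    return d @ R.T, moments

def reduced_design_rows(unit_vectors):
    """Columns multiply (Sxx, Syy, Sxy); Sxz = Syz = 0 and Szz = -(Sxx + Syy)."""
    x, y, z = np.asarray(unit_vectors, dtype=float).T
    return np.column_stack([x*x - z*z, y*y - z*z, 2*x*y])

# Hypothetical planar fragment, tilted away from the xy-plane.
flat = np.array([[0, 0, 0], [1.4, 0, 0], [2.8, 0.5, 0], [4.2, 0, 0], [2.1, 1.2, 0]], float)
a, b = np.radians(30.0), np.radians(40.0)
Rx = np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
Rz = np.array([[np.cos(b), -np.sin(b), 0], [np.sin(b), np.cos(b), 0], [0, 0, 1]])
new_coords, moments = to_plane_frame(flat @ (Rz @ Rx).T, np.full(5, 12.0))
```

The reduced rows can be passed to the same SVD solve used for the five-parameter fit; only the number of columns changes, which is what removes the poorly determined out-of-plane terms.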
The results from method B are also summarized in Table 2. First, a reduced SVD analysis with variable conformational populations was performed. Clearly, 1 can be easily identified as the best match with a Q-factor of 0.053, with the second-best match, 2, having a considerably larger Q-factor of 0.095 (Table 2). The correlation plots showing the agreement between the experimental RDC and RCSA data and the corresponding theoretically averaged values calculated using method B are displayed in Fig. 4. The correct structure, 1, is now clearly differentiated from its isomers. In Fig. 3, we display the alignment tensor PAFs from methods A (red) and B (green) side-by-side with the stacked conformers of the respective isomers, 1–4 (alternative top and side views of the various conformations of 1–4 are shown in Fig. S14 and S15†). The principal axis in method B that corresponds to the plane norm, n, is also indicated. Clearly, none of the principal axes from method A aligns with the plane norm or superimposes with the principal axes determined by method B, suggesting that method A generated physically unrealistic alignment tensor parameters for all isomers, which led to artificially low Q-factors.
It is also remarkable that, amongst the four structure candidates 1–4, only the optimized conformational distribution of 1, computed with method B, agrees reasonably well with the DFT-computed Boltzmann distribution. In contrast, the optimized conformational distributions of 2, 3, and 4, also derived by SVD analysis with method B, favour conformations of higher DFT energies (see Table 2). This observation is further supported by a second reduced SVD analysis in which the conformer populations were fixed to the DFT-derived values shown in Table 1 (M06-2X/BS3). As shown in Table 2, 1 is clearly distinguished as the correct isomer with the lowest Q-factor of 0.150, compared to the significantly higher Q-factors of 0.744, 0.752, and 0.802 for 2, 3, and 4, respectively. The enhanced isomeric differentiation using fixed Boltzmann populations underscores the benefit of obtaining accurate theoretical molecular energies in the ensemble-based analysis of flexible molecules.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0sc03664a
‡ Current address: Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Road, Athens, GA 30602.
§ Current address: Department of Chemistry and Biochemistry, University of North Carolina Wilmington, Wilmington, NC 28403.
¶ Current address: Analytical Research and Development, Pfizer Worldwide Research and Development, 445 Eastern Point Road, Groton, CT, 06340, USA.
|| Current address: Department of Chemistry and Biochemistry, Seton Hall University, South Orange, NJ 07079, USA.
This journal is © The Royal Society of Chemistry 2020