Feng Wang
Department of Chemistry and Biotechnology, School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Melbourne, Victoria 3122, Australia. E-mail: fwang@swin.edu.au
First published on 24th February 2023
Molecular spectroscopy measures transitions between discrete molecular energy levels, which are governed by quantum mechanics. Structural information about a molecule is encoded in its spectra and can be decoded only with quantum mechanics, which makes computational molecular spectroscopy essential. This perspective discusses the evolving role of computational molecular spectroscopy through several joint theoretical and experimental studies from the past decades, covering rotational (microwave), vibrational and electronic (valence and core) spectroscopy of molecules. With the development of high-resolution, computerized synchrotron-sourced spectroscopy, spectral measurements and computational molecular spectroscopy need to be integrated for materials development. Contemporary computational molecular spectroscopy therefore does more than support interpretation; it leads innovation. The future development of molecular spectroscopy lies in identifying the niches where experimental and computational molecular spectroscopy can be integrated. It also requires engineering molecular spectroscopic databases that function according to universal approaches to computing, such as those of a Turing machine, so that they can be realized in a chemically and/or spectroscopically programmable manner (digital twinning research).
Molecular spectroscopy is measurable quantum mechanics. Spectroscopy measures transitions between discrete energy states, and quantum mechanics calculates those states by solving the Schrödinger equation. Under the Born–Oppenheimer approximation, the molecular Hamiltonian contains rotational, vibrational and electronic terms (and their interactions), and molecular spectra arise from transitions between the rotational, vibrational and electronic states of the molecule. Rotational spectroscopy (in the microwave (MW) and millimeter-wave region) provides accurate and reliable information on the structures of polar molecules in the gas phase,1 and is most accurate in the region near the bottom of the potential energy well. Fig. 1 gives the composite spectrum of the $J_{K_a,K_c} = 2_{0,2} - 1_{0,1}$ rotational transition of the 20Ne–14N2 van der Waals (VdW) complex.3 Rotational (MW) spectroscopy is useful for establishing the approximate dimensions of a compound from its three rotational constants (A, B and C), the energy barriers to internal rotation (for example, between conformers of a flexible molecule), and the hyperfine structure of molecules. Much of the understanding of weak molecular interactions such as those in VdW complexes,3,4 of interstellar molecules and of hydrogen bonding has been established through rotational spectroscopy. “Molecular rotational spectra”5 by the Nobel Laureate H. W. Kroto details all aspects of the art of rotational spectroscopy. The analysis of rotational spectra, however, is tedious, requires significant computing and computational molecular spectroscopy (CMS) support, and often needs to be combined with other techniques for structure determination. The 1998 Nobel Prize in Chemistry was awarded to two computational chemists, Walter Kohn “for his development of the density-functional theory” and John A. Pople “for his development of computational methods in quantum chemistry”.
Fig. 1 A composite spectrum of the $J_{K_a,K_c} = 2_{0,2} - 1_{0,1}$ rotational transition of the 20Ne–14N2 van der Waals complex, showing complicated nuclear hyperfine structure due to the 14N nuclei.3 The missing rotational transitions were recovered from new measurements guided by CMS.
Molecular rotation and vibration are often coupled. Transitions involving changes in both vibrational and rotational states are called rovibrational (or ro-vibrational) transitions. Vibrations are relative motions of the atomic nuclei and are studied by both infrared (IR) and Raman spectroscopy. The theory of rotational energy levels was developed to account for observations of vibration–rotation spectra of gases in IR (400–4000 cm−1) or far-IR (<400 cm−1) spectroscopy (the latter lies in the region of terahertz (THz) spectroscopy6). If Coriolis coupling is very small, rotation and vibration can be treated as separable to a first approximation, so that the energy of rotation is simply added to the energy of vibration.7 Vibrational spectroscopy provides information about the presence or absence of particular functional groups, and the IR molecular fingerprint can be used to identify samples.8 Fig. 2 compares the theoretically calculated IR spectra of the N2–Ar VdW complex with measurements, which in turn tests the accuracy of the potential energy surface.9 IR spectroscopy is applicable to almost all materials, and the measurements require only a small amount of sample and are easy, fast and inexpensive. However, IR spectroscopy is best suited to polar compounds.9–11 It does not provide the molecular formula and usually needs to be employed in conjunction with other techniques.12
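To a first approximation, the separability of rotation and vibration described above can be written, for a diatomic molecule, in the standard rigid-rotor and harmonic-oscillator form. This is a textbook sketch rather than the treatment used in the cited work; the symbols $\tilde{\nu}_e$ (harmonic wavenumber), $B$ (rotational constant in cm−1), $I$ (moment of inertia), $v$ and $J$ (vibrational and rotational quantum numbers) are introduced here for illustration only:

```latex
\[
\frac{E(v, J)}{hc} \;\approx\; \tilde{\nu}_e\left(v + \tfrac{1}{2}\right) + B\,J(J+1),
\qquad
B = \frac{h}{8\pi^{2} c\, I}
\]
```

Effects such as centrifugal distortion, anharmonicity and vibration–rotation (including Coriolis) coupling enter as higher-order corrections to this additive form.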
Fig. 2 Comparisons between experiment13 and simulated 77 K mid-IR spectra of N2–Ar van der Waals complex generated using the MMSV mod9 and CPV14 potential energy surfaces. The measured IR spectrum was employed to validate model development.
Transitions between electronic states are measured using electron spectroscopy. Fig. 3 is a scheme of electronic processes and their measurement by various electron spectroscopies. This perspective concentrates on the ionization and excitation spectroscopy of molecules, as they relate closely to the Fukui frontier molecular orbital theory and chemical bonding.15 If an electron is removed from the valence orbitals of a molecule (<30 eV), photoelectron spectroscopy (PES) is the appropriate technique for measuring the binding energy spectrum, whereas electron momentum spectroscopy (EMS) is more appropriate if information on the shape of the Dyson orbitals is also needed.16 Combining PES and EMS with CMS can reveal a comprehensive picture of the ionization states of a molecule. Such a combination offers complementary, state-of-the-art advantages: accurate binding energy spectra and the related Dyson orbitals, with conclusive structural information.17 If, on the other hand, a valence electron does not leave the molecule but is excited into virtual (unoccupied) orbitals, the excitation is studied using UV-Vis spectroscopy (190–1000 nm) as well as fluorescence (emission) spectroscopy, and may be calculated using time-dependent density functional theory (TD-DFT) in CMS. UV-Vis spectroscopy, also called optical spectroscopy, is sensitive to the environment, which allows optical reporting by chromophores with applications in solar cell and drug research.
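The spectral windows quoted above are given in different units (nm for UV-Vis, cm−1 for IR, eV for ionization). As a rough aid for comparing them, the following minimal sketch converts wavelengths and wavenumbers to photon energies using E = hc/λ. The function names are hypothetical helpers introduced here, not part of any cited software.

```python
# Convert photon wavelengths (nm) and wavenumbers (cm^-1) to energies (eV)
# using E = h * c / lambda. Constants are the exact SI defining values.
PLANCK_J_S = 6.62607015e-34        # Planck constant, J s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light, m/s
ELECTRON_VOLT_J = 1.602176634e-19  # 1 eV expressed in J


def wavelength_nm_to_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a wavelength given in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_m * ELECTRON_VOLT_J)


def wavenumber_cm_to_ev(wavenumber_cm: float) -> float:
    """Photon energy in eV for a wavenumber given in cm^-1."""
    wavenumber_m = wavenumber_cm * 100.0  # cm^-1 -> m^-1
    return PLANCK_J_S * LIGHT_SPEED_M_S * wavenumber_m / ELECTRON_VOLT_J


if __name__ == "__main__":
    for nm in (190.0, 1000.0):
        print(f"{nm:7.1f} nm   -> {wavelength_nm_to_ev(nm):.2f} eV")
    for cm in (400.0, 4000.0):
        print(f"{cm:7.1f} cm-1 -> {wavenumber_cm_to_ev(cm):.3f} eV")
```

On this scale the 190–1000 nm UV-Vis window corresponds to roughly 6.5–1.2 eV, and the 400–4000 cm−1 IR window to roughly 0.05–0.5 eV, both well below the valence ionization range.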
The 1901 Nobel Prize in Physics was awarded to Wilhelm Conrad Röntgen for his discovery of X-rays, the “magic rays”. Ionization or excitation of an electron from the core orbitals of a compound requires a high-energy source such as a synchrotron. Core ionization is measured using, for example, X-ray photoelectron spectroscopy (XPS) in the energy range > ∼180 eV for B1s (H and He do not have core electrons). Core excitation is measured using X-ray absorption spectroscopy (XAS) or near edge X-ray absorption spectroscopy (NEXAS); these correspond approximately to processes C and D in Fig. 3. XPS, which is also called electron spectroscopy for chemical analysis (ESCA), probes the local electronic structure of an atom outside its nucleus. As a result, XPS also determines the chemical state of elements, including the nature of chemical bonding, providing local information on a particular region of a molecule. Hence, XPS has wide applications to organic molecules such as drugs and to inorganic compounds such as catalysts and other materials. Together with the information on the valence energy levels obtained from the related spectroscopies, XPS provides a localized picture of the atoms in a material for studying chemical bonding and oxidation.18
Another routine laboratory technique is NMR spectroscopy, which has become a tool of choice for probing chemical structures because it is fast, non-destructive and non-invasive. A small variation in NMR frequency (i.e., the chemical shift) results from a variation in the molecular electron distribution caused by the chemical environment, so that NMR can be used to assign previously unknown molecular structures. An important advantage of NMR is that it provides atom-specific information, which is useful for studying chemical bonding and the impact of the local chemical environment. For these reasons, when combined with CMS, NMR spectroscopy can be applied to stereochemistry, such as conformation and hydrogen bonding (HB) interactions. This perspective discusses structure and property determination using rotational, vibrational, electronic and nuclear magnetic resonance (NMR) spectroscopy, their integration with CMS, and the changing role of CMS in the process.
Accuracy matters. It is critical to obtain accurate information on the constitution, stereochemistry and conformation of drug candidates or bioactive compounds.2 Spectroscopic methods have been among the major techniques for studying flexible drug-like natural products.19 With the development of powerful high-resolution spectroscopic techniques, such as computerized synchrotron-sourced spectrometers, more accurate information is becoming available. Such information can reveal pitfalls arising from incorrect structural elucidation and, sometimes, misleading structures of bioactive compounds, which are often flexible.2 In some cases, more accurate measurements or information may change earlier conclusions. For example, XPS studies revealed that intramolecular hydrogen bonding affects core electrons20,21 and that significant differences appear in the core electrons of purine and pyrimidine DNA bases, even though adenine and purine share the same molecular skeleton.22–24 Hence, there is a constant need to follow structural revisions and updates. In a recent review, Shen et al. surveyed the literature from the past decade (2010–2020) and identified over 200 cases of misassigned natural products.2 Among the errors presented, NMR misassignments are the most common (23%).2 Fig. 4 reports the numbers of revised and new natural products in the past decade. The advantages of computer-assisted and theory-supported structural revision (broadly, CMS), or of a combination of methodologies (computer-aided calculation, NMR spectroscopy, empirical rules, X-ray crystallography and biosynthetic studies), have been recognized.25
Fig. 4 The numbers of revised and new natural products in the past decade (2010–2020).2 Here MNPs refer to molecular natural products.
Joint CMS and experimental measurement, for example with synchrotron-sourced spectroscopy, is critical, as the two act like the two wheels of a bicycle. Often, spectroscopic measurements validate the quantum mechanical models used in CMS, which can in turn provide insight beyond the measured spectra of the compounds, so that the result achieves 1 + 1 > 2. The following sections present some examples that combine experimental measurements with CMS to achieve deeper insight, in the order of (ro-vibrational) spectroscopy, electron spectroscopy, NMR spectroscopy and optical (UV-Vis) spectroscopy.
When CMS and FTIR measurements work together, the applications of vibrational spectroscopy can be extended. One example is the use of the measured far-IR spectra of N2–Ar to test theoretical models of the potential energy surface, as shown in Fig. 2,9 and the theoretically developed dipole moment function of the VdW complex.10 The nitrogen molecule (N2) does not possess a permanent dipole moment, so the IR-active transitions of the complex arise from the small induction and dispersion effects of the Ar atom interacting with the N2 diatom.10 Another case of CMS-led innovation is the decade-long, theory-guided (synchrotron-sourced) spectroscopic study of ferrocene (Fc).27–37 The theoretical IR spectral discovery of the more stable eclipsed Fc conformer (D5h)27 shed light on Fc studies and led to, and suggested, several new measurements of Fc and its derivatives.38 The seminal theoretical IR study of Fc conclusively revealed that the eclipsed (D5h) and staggered (D5d) conformers exhibit unique IR fingerprints in the region of 400–500 cm−1.27 That is, the IR band of the staggered (D5d) conformer does not split measurably (the splitting of δ = 2 cm−1 is too small to be measured), whereas the splitting of the band in the same region for the eclipsed (D5h) conformer (δ = 17 cm−1) is measurable in almost all available IR measurements. These include the remarkable IR measurement of Lippincott and Nelson half a century ago, in which the IR spectra were, unfortunately, assigned to the less stable staggered (D5d) Fc conformer39 because quantum mechanical support at spectroscopic accuracy was lacking. It was further discovered that any modification of the unsubstituted staggered Fc ((C5H5)2Fe) results in such IR band splitting.32–37 The Fc IR spectral fingerprint is so far the only technique that conclusively differentiates the eclipsed and staggered Fc conformers.27 The Fc IR fingerprint band27 in Fig. 5 further revealed that any vibration involving the central Fe atom of Fc is sensitive to the conformation, eclipsed or staggered. This discovery has led to a decade-long study of Fc and its derivatives.28–37
Fig. 5 The fingerprint IR spectral bands of the eclipsed (D5h) and staggered (D5d) ferrocene.27 The vibrations involving the central Fe atom in Fc are sensitive to the conformation, eclipsed or staggered.
When CMS is integrated with measurements, IR spectroscopy can also reveal inter- and intramolecular hydrogen bonding through IR frequency red shifts40 and blue shifts.41 It was found that the A–H stretch in an A–H⋯B complex red-shifts (Δν) by between 20 and 2500 cm−1 relative to the free molecule without such hydrogen bonding. The correlated hydrogen-bond length (rH⋯B) ranges between 0.28 and 0.12 nm.40 Hydrogen bonding with an IR frequency blue shift is also called improper hydrogen bonding.41,42 When a molecule is involved in intermolecular bonding and the electric field from the hydrogen acceptor is sufficiently strong at the intermolecular equilibrium distance, the chemical bonds are strengthened and shortened, and the stretching vibrational frequency blue-shifts (upshifts).42 Both red shifts and blue shifts have usually been studied in relation to intermolecular hydrogen bonding, in which the monomers without such hydrogen bonding serve as the reference for the shift. However, red shifts and blue shifts can also be employed to study intramolecular hydrogen bonding, with the isomers/conformers of a (larger) molecule without local hydrogen bonding serving as the reference.43,44 Fig. 6 reports the intramolecular hydrogen-bonding blue shift due to the conformations of the anti-cancer drug zidovudine.44
Fig. 6 Hydrogen bonding blue shift of the simulated IR spectra for zidovudine (AZT) conformers, showing the blue-shifted C–H stretch frequency ν(C(4′)–H) of AZT-B.44
In addition, if the vibrational spectroscopic technique can detect differences in the attenuation of left and right circularly polarized light passing through a sample, it extends to circular dichroism spectroscopy (in the infrared and near-infrared ranges). This is called vibrational circular dichroism (VCD) spectroscopy, which has applications to stereoisomers such as chiral molecules. As VCD signals are very weak, the technique often requires CMS support.45 VCD and Raman optical activity (ROA) spectroscopy have applications to bioactive compounds such as amino acids.46 Recent developments towards multiple dimensions and finer spectral regions, such as two-dimensional IR spectroscopy (2D IR)47 and terahertz (THz) technology based on radiation between the MW and IR regions,6 require more substantial computational molecular spectroscopy support, including modelling and digital computing technology.
Fig. 7 Comparison of the simulated (dashed lines) and the recent synchrotron-sourced measured48 (solid lines) binding energy spectra of NBD (blue) and QC (orange). The dashed spectra are calculated using the same method for NBD49,50 and for QC.
Electron momentum spectroscopy (EMS) is an (e, 2e) reaction under binary encounter collision conditions.16 EMS effectively probes the removal of a valence (frontier orbital) electron from a molecule, providing images of the spherically averaged Dyson orbital electron momentum density distribution, i.e., the triple differential cross sections (TDCS) corresponding to the ionization process.16 Hence, EMS can produce a series of binding energy spectra (PES) at different azimuthal angles ϕ, for example at ϕ = 0° (momentum p ≈ 0.0 au) and at ϕ ≠ 0°, say ϕ = 10° (p ≈ 0.92 au),49 in a way analogous to following the Sun's path in spherical coordinates: the azimuthal angle ϕ is congruent with the solar azimuth.16 As a result, EMS measures both binding energies and Dyson orbitals (in momentum space).17,51 By combining the information from PES, EMS and CMS, one may determine the frontier Dyson orbitals conclusively.17 The combination of PES/EMS and CMS helps to elucidate the norbornadiene (NBD) ⇆ quadricyclane (QC) isomerization in energy storage, which will be discussed elsewhere.
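The link between the azimuthal angle ϕ and the target electron momentum p quoted above can be illustrated with the kinematic relation commonly used for symmetric non-coplanar (e, 2e) experiments. The sketch below is an illustration under stated assumptions, not the procedure of the cited studies: the polar angle of 45°, the impact energy of 1500 eV plus the binding energy, and the 10 eV binding energy are hypothetical settings chosen here, since the source does not state them.

```python
import math

HARTREE_EV = 27.211386  # 1 hartree in eV (rounded CODATA value)


def target_momentum_au(phi_deg, impact_energy_ev=1500.0,
                       binding_energy_ev=10.0, theta_deg=45.0):
    """Target electron momentum (au) in symmetric non-coplanar (e, 2e) kinematics.

    Assumes the two outgoing electrons share impact_energy_ev equally and the
    incident electron carries impact_energy_ev + binding_energy_ev.
    All default values are illustrative assumptions, not taken from the source.
    """
    # Momenta in atomic units: p = sqrt(2 E), with E in hartree.
    p0 = math.sqrt(2.0 * (impact_energy_ev + binding_energy_ev) / HARTREE_EV)
    p1 = math.sqrt(2.0 * (impact_energy_ev / 2.0) / HARTREE_EV)
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    # p = sqrt[(2 p1 cos(theta) - p0)^2 + (2 p1 sin(theta) sin(phi/2))^2]
    parallel = 2.0 * p1 * math.cos(theta) - p0
    perpendicular = 2.0 * p1 * math.sin(theta) * math.sin(phi / 2.0)
    return math.hypot(parallel, perpendicular)


if __name__ == "__main__":
    for phi in (0.0, 10.0):
        print(f"phi = {phi:4.1f} deg -> p = {target_momentum_au(phi):.2f} au")
```

With these assumed settings the relation gives p close to zero at ϕ = 0° and about 0.92 au at ϕ = 10°, in line with the values quoted in the text; other impact energies would shift the momentum scale.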
EMS was developed from atomic applications.16 An atom has a single spherical centre, whereas a molecule has multiple centres, and their properties can be very different. Upon ionisation, the Dyson orbital momentum distributions of an s-orbital and a p-orbital in an atom are observed as a half-bell shape and a bell shape, respectively, as for the s- and p-Dyson orbitals of neon.52 A theoretical EMS study of diborane (B2H6) discovered that the TDCS of the s-dominant Dyson orbitals 1b2u and 2b1u are bell-shaped rather than half-bell-shaped as expected for s-orbitals.53 Fig. 8 gives the quantum mechanically calculated TDCS of three s-dominant Dyson orbitals of diborane, 1b2u, 2b1u and 2ag,53 which exhibit a half-bell shape (2ag) and bell shapes (1b2u and 2b1u). This discovery revealed that the shape of a Dyson orbital TDCS, half-bell or bell, is determined not by the s or p character of the orbital but by its nodal points/planes. The TDCS of a Dyson orbital may exhibit a half-bell shape if the orbital has no nodal plane, like the s-orbitals of an atom, and a bell shape if the orbital has a single nodal plane, like the p-orbital of an atom or the anti-bonding s-dominant (σ*) orbitals of a diatomic molecule or of a polyatomic molecule with high point-group symmetry such as diborane.53
Fig. 8 Dyson orbital TDCS (left) of 1b2u, 2b1u and 2ag of diborane in the ground electronic state.53 All three Dyson orbitals are dominated by s-electrons, but their shapes differ because of the number of nodal planes.
The unique ability of EMS to probe the Dyson orbitals of a molecule contributes significantly to the study of anisotropic properties related to molecular orbitals. Quantum mechanical methods, such as the variational methods, often focus on energy and treat anisotropic properties, including orbitals, as less important. Most post-Hartree–Fock (HF) methods, such as MP2 and CCSD(T), do not describe the orbitals (or wavefunctions) of a molecule at the same level of accuracy as the energy: they usually produce post-HF (more accurate) energies but HF wavefunctions (orbitals). Density functional theory (DFT), in this regard, can provide more accurate information through the density of the Dyson orbitals (assuming that Kohn–Sham orbitals approximate Dyson orbitals). The EMS technique, in turn, provides information on both the energy and the (Dyson) wavefunction of a molecule.16 It is, therefore, a more appropriate technique for testing quantum mechanical methods in conjunction with CMS.17 In addition, when combined with CMS, EMS is a unique method for studying anisotropic properties such as the conformation and pseudorotation of tetrahydrofuran (THF).54–56
Bioactive compounds can be unstable at higher temperatures (vaporisation) or experience chemical reactions during measurements. It is critical to combine XPS measurements with CMS. The analysis for the XPS measurements of a dipeptide, phenylalanyl–phenylalanine (PhePhe)62 demonstrated the importance of support and guidance from CMS. The quantum mechanically calculated XPS for the dimer (dipeptide PhePhe) and monomer (phenylalanine) are shown in Fig. 9(a) and the actual gas phase measured XPS spectra are given in Fig. 9(b).
Fig. 9 (a) The CMS simulated C1s, N1s and O1s spectra of phenylalanine monomer and linear phenylalanyl–phenylalanine (dipeptide) and (b) the actually measured XPS spectra.62
The simulated XPS spectra for the PhePhe dipeptide in Fig. 9(a) exhibit multiple bands in the C1s spectrum, two bands with an approximately equal (1:1) ratio in the N1s spectrum and two bands with a 1:2 ratio in the O1s spectrum. If the sample were instead the phenylalanine monomer, the XPS would exhibit multiple bands in the C1s spectrum, a single band in the N1s spectrum and two bands with an equal (1:1) ratio in the O1s spectrum. However, the measured C1s, N1s and O1s XPS spectra in Fig. 9(b) match neither the phenylalanine monomer nor the PhePhe dimer (the dipeptide). It was discovered later that the PhePhe dipeptide sample underwent dehydration in the vacuum chamber of the spectrometer during vaporisation, resulting in the cyclo-PhePhe dipeptide, c(PhePhe).
The above proposed dehydration process of PhePhe was confirmed by joint CMS and measurement,62 as given in Fig. 10.
Fig. 10 The measured and simulated C1s, N1s and O1s spectra of the cyclo-dipeptide c(phenylalanyl–phenylalanine).62
This presents a challenge: spectroscopic results and databases need to be reproducible and verifiable in the future, whereas information is constantly improved and updated as techniques and our knowledge develop. Some early conclusions, based on the information available at the time and limited by it, need updating, revision or even overturning. Information provided by high-resolution XPS of molecules in the gas phase challenges the concept in chemistry that core electrons do not participate in chemical bonding, and changes the concept that hydrogen bonding is a valence effect, since hydrogen atoms have no core electrons. Recent joint CMS and XPS studies reveal that intramolecular hydrogen bonding (O⋯H) has a deep impact on the core electrons (O1s) of molecules.63 By combining the measured O1s XPS (for the hydrogen acceptor) and 1H-NMR (for the hydrogen donor) with quantum mechanics, the intramolecular hydrogen bonding (O⋯H) of molecules can be studied locally.64 In addition, XPS probes the core electron ionization of a molecule, which is localised on specific atoms. Although the mechanisms are different, C1s (and/or N1s) XPS can be combined with 13C-NMR (or N-NMR) chemical shifts for more comprehensive information on local chemical bonding. Compared with XPS, C-NMR chemical shifts are less difficult to measure and to calculate quantum mechanically. For this reason, computational XPS and NMR calculations can be employed to study conformers and intramolecular hydrogen bonding, which helps to determine conformer distributions.65,66 In the NMR study of gallic acid conformers, the CMS-calculated C-NMR chemical shifts of the theoretically possible conformers indicate that C-NMR chemical shifts can be used to study the intramolecular hydrogen bonding of conformers, as shown in Fig. 11.66 The conformers are formed by rotations of the hydroxyl and carbonyl groups about the C–O and C–C bonds. As indicated by the colour scheme in this figure, intramolecular hydrogen bonding interactions are revealed by chemical shift differences, with Δδ ≠ 0 for the carbon pair δ(C(7)) and δ(C(3)) and Δδ′ ≠ 0 for the carbon pair δ(C(4)) and δ(C(6)). These carbons do not lie on the axis connecting C(1)–C(2)–C(5)–O(D), shown as the vertical dashed line on the gallic acid structure in Fig. 11.
Fig. 11 Comparison of calculated 13C NMR spectra of gallic acid conformers66 with NMR experiment in dimethyl sulfoxide (DMSO) solvent.67 Note that the vertical dashed line on the structure on the right represents the molecular axis of the gallic acid.
The calculated and measured C-NMR spectra in Fig. 11 reveal that the carbon chemical shifts of the gallic acid conformers differ, reflecting their unique chemical environments. Comparison of the CMS-simulated C-NMR spectra with the measurement indicates approximately which gallic acid conformer may be dominant, for example through the root-mean-square deviation (RMSD). However, the identical carbon chemical shifts for δ(C(7)) and δ(C(3)), and for δ(C(4)) and δ(C(6)), reported in the C-NMR measurement67 are not real.66 They arise because NMR spectroscopy operates on a slower time scale than the rotation (ring flipping about the axis in Fig. 11). As suggested by Bryan in interpreting the NMR time scale, the phenol ring flipping is faster than the time scale set by the chemical shift difference between the two NMR resonances, which for carbon NMR is approximately 1.5 milliseconds (1 ms = 10−3 s).68 The phenol ring flipping is approximately on the femtosecond time scale (1 fs = 10−15 s).69 As a result, the chemical shifts of atoms on the aromatic ring that are symmetric about the C(1)–C(2)–C(5)–O(D) axis are averaged in NMR measurements, as also found in a study of the active pharmaceutical ingredient resveratrol.70 Finally, care must be taken in assessing the accuracy of calculated NMR chemical shifts against NMR measurements. The RMSD is not always an appropriate indicator of the accuracy of the calculations. For example, the C-NMR measurement of gallic acid is unable to resolve the “equivalent” carbons. Table 1 compares the calculated and measured C-NMR chemical shifts in the same solvent, dimethyl sulfoxide (DMSO). In the measurement, the chemical shifts of C(3) and C(7) are the same, both 109.14 ppm, as are those of C(4) and C(6), both 144.94 ppm. To resolve them, low-temperature dynamic nuclear polarization (DNP)-enhanced solid-state NMR supported by CMS is required.71
Atomic sites | C-NMR (calc)/ppm (a) | C-NMR (expt)/ppm |
---|---|---|
C(7) | 108.58 (b) | 109.14 |
C(3) | 110.91 (b) | 109.14 |
C(2) | 120.05 | 120.81 |
C(5) | 138.23 | 137.77 |
C(4) | 144.15 (c) | 144.94 |
C(6) | 146.12 (c) | 144.94 |
C(1) | 167.74 | 167.39 |

(a) Calculations using the B3PW91/6-311++G(d,p) method for the most stable conformer of gallic acid. (b) The averaged chemical shift of C(3) and C(7) is 109.75 ppm, which is close to the measured value of 109.14 ppm of Rajan et al.67 (c) The averaged chemical shift of C(4) and C(6) is 145.14 ppm, which is close to the measured value of 144.94 ppm of Rajan et al.67
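The point about RMSD and averaged "equivalent" carbons can be checked directly against the values in Table 1. The following minimal sketch computes the RMSD between calculated and measured shifts twice: once with the raw calculated values, and once after averaging the calculated shifts of the carbon pairs that the measurement cannot resolve. The helper names are hypothetical and the averaging step is a simple arithmetic mean chosen here to mimic fast ring flipping, not a procedure taken from the cited work.

```python
import math

# Calculated and measured 13C chemical shifts (ppm) from Table 1.
CALC = {"C(7)": 108.58, "C(3)": 110.91, "C(2)": 120.05, "C(5)": 138.23,
        "C(4)": 144.15, "C(6)": 146.12, "C(1)": 167.74}
EXPT = {"C(7)": 109.14, "C(3)": 109.14, "C(2)": 120.81, "C(5)": 137.77,
        "C(4)": 144.94, "C(6)": 144.94, "C(1)": 167.39}

# Carbon pairs that the measurement does not resolve (averaged by ring flipping).
AVERAGED_PAIRS = [("C(3)", "C(7)"), ("C(4)", "C(6)")]


def rmsd(calc, expt):
    """Root-mean-square deviation (ppm) over the sites present in expt."""
    squared = [(calc[site] - expt[site]) ** 2 for site in expt]
    return math.sqrt(sum(squared) / len(squared))


def average_pairs(shifts, pairs):
    """Return a copy of shifts with each listed pair replaced by its mean."""
    averaged = dict(shifts)
    for first, second in pairs:
        mean = (shifts[first] + shifts[second]) / 2.0
        averaged[first] = averaged[second] = mean
    return averaged


if __name__ == "__main__":
    print(f"RMSD, raw calculated shifts:      {rmsd(CALC, EXPT):.2f} ppm")
    averaged = average_pairs(CALC, AVERAGED_PAIRS)
    print(f"RMSD, pair-averaged calculations: {rmsd(averaged, EXPT):.2f} ppm")
```

With the Table 1 values the RMSD falls from about 0.95 ppm to about 0.50 ppm once the pairs are averaged, which illustrates why a raw RMSD against a motionally averaged measurement can understate the quality of the calculation.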
Photophysical studies have recently received much attention, since optical spectra are very sensitive to changes in the microenvironment.73 A quarter of cancers are linked to mutation or over-expression of protein tyrosine kinases such as the epidermal growth factor receptor (EGFR). The functioning of many oncogenic proteins depends on kinase-catalysed phosphorylation; hence, blocking tyrosine kinase activity in tumour cells has been a promising strategy for halting tumour growth.74 As environment-sensitive fluorophores, quinazoline derivatives are a special class of chromophores that could allow a deeper understanding of the biological binding and function of candidate EGFR tyrosine kinase inhibitors (TKIs). The TKIs show different biological activities when various active groups are installed on the quinazoline core by synthetic methods (click chemistry), with potential applications in biology, pesticides and medicine.75 Of the quinazoline derivatives, anilinoquinazolines are the most developed class of drugs that inhibit EGFR kinase intracellularly,76 and drug candidates in this class have already reached various phases of clinical trials, such as Gefitinib, Erlotinib, Lapatinib, Afatinib, Vandetanib, Icotinib, Dacomitinib (PF-00299804), PD150335 and AG-1478.
Anilinoquinazolines can show changes in electronic configuration upon binding to target proteins, and hence can act as biological markers for screening small-molecule inhibitors.77 While these studies are very encouraging, some cancers appear to develop resistance to long-term TKI treatment. Understanding the spatial and temporal distribution of a TKI is therefore of paramount importance for revealing whether and how these drugs bind to the target of interest. An important first step is to determine whether the inhibitors have spectral signatures that might assist in identifying the relevant targets and interactions.77 It was discovered that the measured UV-Vis spectra of the anilinoquinazoline TKI AG-1478 are sensitive to the solvent.77 A further CMS study revealed that the measured optical spectra of AG-1478 contain contributions from more than one conformer and that a twist of the –NH– bridge of AG-1478 contributes to the changes in the optical spectra.78 Fig. 12 reports the measured UV-Vis spectra of AG-1478 in various solvents77 and the conformer contributions to the optical spectrum in methanol identified by CMS.78
Fig. 12 (a) The measured UV-Vis spectra of AG-1478 in various solvents (left).77 (b) Comparison of the measured and CMS-calculated spectra of AG-1478 in methanol (right).78
Computational molecular spectroscopy (CMS) plays a significant role in drug conformation searches. Habgood et al. summarised three major challenges in drug development:79 (a) development of a good method for generating ensembles of a molecule's bioactive conformation; (b) rational analysis and modification of a pre-established bioactive conformation; and (c) approximation of real solution-phase conformational ensembles in tandem with spectroscopic data such as IR, NMR and UV-Vis spectra. Further CMS studies of anilinoquinazoline TKIs with high potency (PD150335)80 reveal that the optically reportable conformation of the TKI is often more potent than its planar global minimum structure. There appears to be a correlation between the flexibility and the potency of a TKI, as flexible TKI conformers are able to fit and dock in the TK domain of EGFR. Studying the flexibility of a drug requires a conformational search on the potential energy surface1 and molecular dynamics (MD) simulation.81 Conformational sampling of drugs remains a major hurdle for effective drug design in CMS approaches,1,76,82 and is a prerequisite for data-hungry machine learning algorithms such as intelligent computing in drug development. A further challenge for the flexible anilinoquinazoline TKIs is that a large number of preferred conformers need to be considered. A robust, comprehensive conformational search that delivers high-quality results with computational efficiency is therefore required, and the PreQMCom system was developed for this purpose.82 This computer script reduces the time needed for otherwise manual quantum mechanical calculations from years to weeks and produces hundreds of conformers of AG-1478 in dimethyl sulfoxide (DMSO) solvent; the low-energy conformer clusters of the AG-1478 TKI are given in Table 2.
Cluster (conformer) | Strain energy (kcal mol−1) | Dipole moment (Debye) | Cluster size | Weight (b) (%) |
---|---|---|---|---|
1 | 0.000 (a) | 8.4504 | 10 | 67.646 |
2 | 0.594 | 3.6481 | 11 | 26.819 |
3 | 1.975 | 8.1783 | 17 | 3.871 |
4 | 2.570 | 5.3047 | 11 | 0.900 |
5 | 2.933 | 11.1002 | 9 | 0.395 |
6 | 3.516 | 5.812 | 19 | 0.306 |
7 | 4.567 | 3.3103 | 12 | 0.032 |
8 | 4.631 | 4.6733 | 10 | 0.024 |
9 | 5.373 | 9.2452 | 5 | 0.003 |
10 | 5.970 | 4.3828 | 5 | 0.001 |
11 | 6.884 | 4.1492 | 6 | 0.000 |
12 | 6.898 | 4.4065 | 14 | 0.001 |

(a) The calculated total energy of global minimum AG-1478 conformer is −1392.755766 Eh. (b) Estimation of the probability distribution of a given conformer is in %.
As indicated in Table 2, at each strain energy (energy above the global minimum energy conformer) cut-off, several AG-1478 conformers are degenerate or near-degenerate. The AG-1478 TKI is dominated by the lowest six clusters of conformers according to their Boltzmann weights (last column of Table 2) at a given temperature.82 The simulated UV-Vis spectra of these six clusters, obtained using the PreQMCom system,82 are given in Fig. 13.
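The Boltzmann weighting mentioned above can be illustrated with a minimal sketch. It assumes that each cluster is weighted by its size (used here as a degeneracy factor) times the Boltzmann factor of its strain energy, and that the temperature is 298.15 K; neither assumption is stated in the text, and the function name `boltzmann_weights` is purely illustrative. The sketch is therefore not expected to reproduce the last column of Table 2 exactly, only its general trend.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (matches the strain energy units of Table 2)
R_KCAL = 0.0019872

def boltzmann_weights(strain_energies, cluster_sizes, temperature=298.15):
    """Return percentage populations for conformer clusters.

    Assumption (not from the source): each cluster contributes
    g_i * exp(-E_i / RT), with g_i taken as the cluster size.
    """
    rt = R_KCAL * temperature
    factors = [g * math.exp(-e / rt)
               for e, g in zip(strain_energies, cluster_sizes)]
    total = sum(factors)
    return [100.0 * f / total for f in factors]

# Strain energies (kcal/mol) and cluster sizes copied from Table 2
strain = [0.000, 0.594, 1.975, 2.570, 2.933, 3.516,
          4.567, 4.631, 5.373, 5.970, 6.884, 6.898]
sizes = [10, 11, 17, 11, 9, 19, 12, 10, 5, 5, 6, 14]

weights = boltzmann_weights(strain, sizes)
for i, w in enumerate(weights, start=1):
    print(f"cluster {i:2d}: {w:7.3f} %")
```

Under these assumptions the lowest six clusters carry nearly all of the population, which is consistent with the statement that they dominate the AG-1478 ensemble.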
Fig. 13 Calculated UV-Vis spectra of the lowest six preferred AG-1478 conformer clusters in DMSO solution.82 The combined spectrum (red) shows the split bands observed in the measurement.77
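One plausible way to build a combined spectrum such as the one in Fig. 13 is to sum the broadened spectra of the individual clusters, each scaled by its population. The sketch below assumes Gaussian broadening applied directly on a wavelength grid and stick spectra given as (wavelength, oscillator strength) pairs; the function names, the broadening width and the input layout are all hypothetical and are not taken from the PreQMCom system.

```python
import math

def gaussian(x, center, fwhm):
    """Unit-height Gaussian line shape with the given full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def cluster_spectrum(grid, lines, fwhm):
    """Broaden one cluster's stick spectrum.

    lines: list of (wavelength_nm, oscillator_strength) pairs (hypothetical layout).
    """
    return [sum(f * gaussian(x, lam, fwhm) for lam, f in lines) for x in grid]

def combined_spectrum(grid, clusters, weights, fwhm=20.0):
    """Population-weighted sum of the broadened cluster spectra.

    weights may be given in any units (e.g. %); they are normalised here.
    """
    total_weight = sum(weights)
    combined = [0.0] * len(grid)
    for lines, w in zip(clusters, weights):
        spectrum = cluster_spectrum(grid, lines, fwhm)
        for i, value in enumerate(spectrum):
            combined[i] += (w / total_weight) * value
    return combined
```

In practice `clusters` would hold the calculated excitation wavelengths and oscillator strengths of each conformer cluster, and `weights` the populations from the last column of Table 2. Broadening is often applied on an energy scale rather than a wavelength scale; the wavelength version is used here only for brevity.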
Solar energy is the most promising renewable energy source for large-scale global electricity production, particularly in Australia, which ships sunshine to the world as a clean energy export.83 Photovoltaic (PV) technology is one of the fastest growing renewable energy technologies in the world. The third generation of PVs includes organic solar cells and dye-sensitized solar cells (DSSCs), with a maximum reported efficiency of 14.3% for DSSCs.84 Each component of the solar cell device strongly influences the cost, stability and efficiency of DSSCs. Thus, in the past decade, almost all research efforts have focused on modifying each component for practical applications. The sensitizers in DSSCs play a crucial role in achieving high solar-to-electricity conversion efficiency. Among the factors that determine the performance of DSSCs, two are critical: (a) a wide absorption range from the visible to the near-infrared region, and (b) appropriate energy levels of the molecular frontier orbitals, which influence the thermodynamics of electron injection (from the excited state of the dye to the conduction band of the semiconductor) and dye regeneration (from the redox mediator to the dye).84 CMS has been an essential tool in organic solar cell research for determining the rationale behind, and guiding further modification and improvement of, high-efficiency organic dyes. As a result, CMS UV-Vis spectroscopy represents the target properties for new organic dye development,85–87 as shown in Fig. 14.
Fig. 14 Design of highly efficient organic dyes using π-spacers in A–π–D dyes toward machine learning.86,87 This strategy can also be applied to the acceptor (A) and donor (D) of the dyes.
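The frontier orbital criterion (b) for DSSC sensitizers can be expressed as a simple screening check. The sketch below assumes a one-electron orbital picture in which the dye LUMO must lie above the semiconductor conduction band edge for electron injection, and the dye HOMO must lie below the redox mediator level for dye regeneration, with all energies in eV on a common (vacuum) scale. The function name and the dictionary keys are hypothetical, and the reference levels must be supplied by the user since they depend on the semiconductor and electrolyte.

```python
def dye_level_alignment(homo_ev, lumo_ev, cb_edge_ev, redox_ev, min_offset_ev=0.0):
    """Check frontier orbital alignment of a dye in a DSSC (simplified picture).

    All energies are in eV on the same absolute scale (assumption).
    Returns the injection and regeneration offsets and a suitability flag.
    """
    # Positive value: the LUMO lies above the conduction band edge (injection downhill)
    injection_offset = lumo_ev - cb_edge_ev
    # Positive value: the HOMO lies below the redox level (regeneration downhill)
    regeneration_offset = redox_ev - homo_ev
    return {
        "injection_offset_ev": injection_offset,
        "regeneration_offset_ev": regeneration_offset,
        "suitable": (injection_offset > min_offset_ev
                     and regeneration_offset > min_offset_ev),
    }
```

Calculated HOMO and LUMO energies of candidate dyes can be passed through such a check before their UV-Vis spectra are simulated. Excited-state oxidation potentials would give a more rigorous criterion than orbital energies; the orbital version is used here only to illustrate the idea.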
It has been common practice in organic DSSC development that, once a high-performance dye is identified (often experimentally synthesized and confirmed), CMS combined with quantum mechanics is critical to work out the rationale in the structure–property relationship and to guide the synthesis of new dyes in the class using click chemistry. This organic synthesis strategy led to the 2022 Nobel Prize in Chemistry, awarded to Carolyn R. Bertozzi, Morten Meldal and K. Barry Sharpless for their development of click chemistry and bioorthogonal chemistry. In computational molecular spectroscopy, one can turn click chemistry from clicking “molecular Legos” in synthesis into clicking a “computer mouse” in digital chemistry. Currently, the design of high-performance dyes (similar to drug design) still relies on two main approaches: (a) conceptual ideas based on researchers’ chemical intuition and (b) exploratory experiments based on trial and error, which require the optimisation of numerous synthetic and spectroscopic parameters. The situation is starting to change. Hopefully, combining the data available from both experiment and computation with machine learning techniques will accelerate development in this direction.87–89
Theory does not always need to agree with experiment. It took a long time for experiment to reach the theoretical ionization energy of the hydrogen atom of 13.6 eV, for example. In addition to differences in methodology, such as experimental error bars and quantum mechanical approximations, experiment is a top-down process (data and fit) whereas theory is a bottom-up process (concepts and prediction). The gap becomes smaller and smaller but will not disappear. Molecular spectroscopy measures properties in the vibrational ground state (r0), whereas calculated properties refer to the bottom of the potential energy well (re). The measurement conditions often cannot be reproduced exactly in theory. Specific spectroscopic measurements also have their own limitations, such as the time scale of NMR spectroscopy: the 1H-NMR chemical shifts of a methyl group (–CH3) often result in a single averaged band although the three protons are in different chemical environments.
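The theoretical value of 13.6 eV quoted above for the hydrogen atom follows from the non-relativistic Bohr/Schrödinger expression E = μe⁴/(8ε₀²h²). The short sketch below evaluates it with CODATA 2018 constants, first for an infinitely heavy nucleus and then with the electron–proton reduced mass; relativistic and QED corrections are deliberately ignored, and the function name is illustrative only.

```python
# CODATA 2018 values in SI units
ELECTRON_MASS = 9.1093837015e-31    # kg
PROTON_MASS = 1.67262192369e-27     # kg
ELEMENTARY_CHARGE = 1.602176634e-19 # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F m^-1
PLANCK = 6.62607015e-34             # J s

def ground_state_binding_energy_ev(mass):
    """Binding energy (eV) of a hydrogen-like 1s state for a given effective mass.

    Uses E = m e^4 / (8 eps0^2 h^2); non-relativistic, point nucleus.
    """
    joules = (mass * ELEMENTARY_CHARGE ** 4
              / (8.0 * VACUUM_PERMITTIVITY ** 2 * PLANCK ** 2))
    return joules / ELEMENTARY_CHARGE

# Infinite nuclear mass (Rydberg energy)
e_infinite = ground_state_binding_energy_ev(ELECTRON_MASS)

# Reduced mass of the electron-proton system
reduced_mass = ELECTRON_MASS * PROTON_MASS / (ELECTRON_MASS + PROTON_MASS)
e_hydrogen = ground_state_binding_energy_ev(reduced_mass)

print(f"infinite nuclear mass: {e_infinite:.4f} eV")
print(f"with reduced mass:     {e_hydrogen:.4f} eV")
```

The two values differ only in the third decimal place, which gives a sense of how small the remaining corrections are relative to the 13.6 eV figure.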
Computational molecular spectroscopy should not be separated from experimental spectroscopic measurement. Rather, it is important to identify the niche to advance molecular spectroscopy from the development of quantum mechanics in a digital era. A spectroscopic technique usually concentrates on a particular property of a molecule rather than providing the full picture. One needs to avoid the trap of the “six blind men and the elephant”, given the multi-dimensional and multi-faceted nature of structural information. Nature is replete with examples where data are handled and stored with high efficiency, low energy cost and high-density information encoding. Computational molecular spectroscopy takes advantage of quantum mechanics and recent developments in digital technology to move into the next stage of databases, digitalization and machine learning, which requires that complex spectroscopic data be managed in a scalable, reproducible and future-proofed environment. Combined with molecular spectroscopic data science that functions according to the universal approaches of computing, such as those in a Turing machine, this might be realized in a chemically and/or spectroscopically programmable manner (digital twinning research) in the future.
This journal is © the Owner Societies 2023