Peter
Sagmeister
ab,
Christine
Schiller
ab,
Peter
Weiss
b,
Klara
Silber
ab,
Sebastian
Knoll
c,
Martin
Horn
c,
Christopher A.
Hone
ab,
Jason D.
Williams
*ab and
C. Oliver
Kappe
*ab
aCenter for Continuous Flow Synthesis and Processing (CC FLOW), Research Center Pharmaceutical Engineering GmbH (RCPE), Inffeldgasse 13, 8010 Graz, Austria. E-mail: jason.williams@rcpe.at
bInstitute of Chemistry, NAWI Graz, University of Graz, Heinrichstrasse 28, 8010 Graz, Austria. E-mail: oliver.kappe@uni-graz.at
cInstitute of Automation and Control, Graz University of Technology, Inffeldgasse 21b, 8010 Graz, Austria
First published on 28th July 2023
As organic chemistry becomes an increasingly data-rich field, there is a need for methods to rapidly build and parameterize models for further development. We demonstrate the parameterization of kinetic models for a catalytic reaction using three different experimental approaches: 1) steady state experiments; 2) dynamic experiments altering residence time only; 3) multi-ramp experiments, where all variables are altered simultaneously. The best agreement in a range of validation experiments was achieved using the model parameterized in the multi-ramp experiment, which also required the shortest experimental time. Further validation was then performed against a self-optimization experiment, demonstrating this as an effective method for developing empirically accurate kinetic models. The validated model could then be used for further in silico optimization and for guiding scale-up studies.
By its nature, flow chemistry lends itself well to automation and data-rich experimentation.5 The simple manipulation of flow rates allows control to be asserted over reaction parameters, such as reagent concentrations/stoichiometries. This has allowed straightforward incorporation of closed-loop experimentation, such as self-optimization.6 However, other challenges present themselves, such as the placement of process analytical technology (PAT) instruments,7 problematic reaction mixtures and comparatively high material consumption.
Flow chemistry is conceptually different to its batch counterpart and requires an alternative mindset when designing experiments. Most significantly, the distribution of species in flow is constant over time, but varies in space (Fig. 1a). This is fundamentally different to batch reactions, where (assuming perfect mixing) the distribution of species is constant over space, but varies in time. As a result, a PAT instrument measuring at one position in a flow reactor will only measure a single concentration (assuming the reaction is at steady state), rather than a full reaction time-course. However, with this fundamental difference come new opportunities for reaction design.8
By changing input flow rates, the reaction stream can be treated as a series of pseudo batch reactors, so long as there is minimal influence of axial dispersion.9 Accordingly, when well-designed, such experiments can provide a vast quantity of information for reaction characterization in a short time period, without user intervention. This concept has been exploited by numerous groups in recent years, since the first report in a flow chemistry setting in 2011.10
The group of Jensen reported the first organic chemistry-focused examples9,11 including a recent contribution examining complex non-linear ramps of three variables.12 Wyvratt, McMullen and co-workers have presented the most refined industrial examples, using a combination of FTIR and offline HPLC analysis to simultaneously ramp two parameters.13 They recently also used this approach to optimize a complex objective function for a process and to gain temperature-independent kinetic understanding.14
Bourne used dynamic flow experiments, examining changes in residence time within a single experiment, with a focus on kinetic model generation.15 Recently, Hii reported kinetic model determination for two related substrates at a single temperature by simultaneously varying reagent ratio and total flow rate.16 Numerous other contributions have been made in various fields such as polymer chemistry,17 surfactant synthesis18 and related areas.19
Following on from our earlier foray into the field of dynamic experimentation, which was focused on data processing,20 we endeavored to increase the level of complexity in such experiments. Reports thus far have utilized a maximum of three simultaneous parameter changes (Fig. 1b). In this work, we demonstrate the design of dynamic experiments to rapidly cover a large design space (5 independent variables), and collect the necessary data to build temperature-dependent kinetic models for a catalyzed reaction with two regioisomeric products (Fig. 1c). More complex utilization of this data, by means of flowsheet modeling, is then discussed in the second part of this publication.21
In order to rapidly perform the reaction in a flexible manner, the flow setup (Fig. 2) consisted of four separate pumps (Knauer, Azura P 4.1S) to deliver three concentrated reagent solutions and a solvent stream. A small volume (2 mL) tubular reactor (Ehrfeld, Hastelloy Capillary Reactor) was used to minimize axial dispersion whilst heating to temperatures up to 140 °C. A back pressure regulator (Zaiput, BPR-10) was set to 15 bar, enabling the reaction to take place at temperatures above the boiling point of the solvent and reagents.24
For analysis of the reaction, two different spectroscopic PAT instruments were proposed: FTIR (Mettler Toledo, ReactIR 15) and NMR25 (Magritek, Spinsolve Ultra 43 MHz). Inline FTIR26 analysis was performed using a pressure-resistant flow cell (SiComp, 50 μL flow cell volume, up to 35 bar), meaning that analysis could be performed before the BPR. Hence, the effect of axial diffusion after the reactor itself was minimal and the reaction progress could be followed closely, with good time resolution for following input changes.
The NMR flow cell (800 μL volume) was not pressure resistant, so the instrument was positioned after the BPR. Furthermore, the NMR measurements are known to be influenced by the flow rate through the cell.25 Therefore, if this instrument were to be used in an inline manner, the range of accessible residence times would be very low (maximum total flow rate ∼1.5 mL min−1). To obviate this problem, a peristaltic pump (Ismatec, ISM834C) was used to sub-sample the stream at a constant flow rate (0.5 mL min−1) for online NMR analysis.
Although the inline FTIR provided a picture of the reaction response to input parameters, it was more challenging to distinguish the different reaction components, due to the similarity between the two isomeric products 4 and 5. However, based on training data (16 concentration levels, with regioisomer ratio determined by benchtop NMR analysis at steady state) a suitable Partial Least Squares (PLS) regression model27 was built, to quantify species 1, 4 and 5 (see ESI† for details). DIPEA 2 and acrylonitrile 3 were difficult to accurately quantify in this model, so were left out.
Conversely, in the online NMR spectra it was possible to partially resolve the two sets of aromatic signals belonging to the two regioisomers 4 and 5. Based on this it was possible to build an Indirect Hard Model (IHM)24,27b to quantify all five species present in the reaction. As previously noted, this technique can be calibrated using only a single experimental point, greatly decreasing the experimental calibration burden. The quantifications from this single point calibration were in good agreement with the FTIR model (Fig. 3). A minor discrepancy in quantity was observed toward the end of the experiment, where starting material 1 was underestimated by the FTIR model. This is thought to be due to the lack of characteristic FTIR signals, which led to poorer quantification. The good agreement of these two measurement techniques (which are at different points along the reactor path) also suggests that the reaction is suitably quenched by simply cooling to room temperature. Although some extent of reaction may occur at room temperature, its influence on the experimental results is expected to be minimal.
In order to rapidly parameterize a predictive kinetic model, a simplified approach was taken wherein the two reactions (producing major product 4 and side product 5, Fig. 4.) were modeled using the Arrhenius equation, considering all three reaction components (1, 2 and 3). Since DIPEA 2 was not required in stoichiometric quantities, it was treated as a catalytic reagent, which was not consumed in the reaction.
(1) |
(2) |
Fig. 4 Reaction scheme used for determination of predictive reaction model using Arrhenius kinetics. |
At this point it should be stated that the fitted kinetic equation does not aim to provide an accurate representation of the elementary reaction steps in this reaction. To do so would require more complex considerations and ideally data regarding intermediate species, which were not observable here. To simplify the process of building a predictive model, the reactions were treated as single step trimolecular reactions, where non-integer reactant orders between 0.1 and 2 were permitted. Whilst the likelihood of this reaction proceeding via such a reaction pathway is extremely low, the purpose of the model is simply to provide predictions that can be used in future reaction development.
Overall, ten parameters were fitted simultaneously using the software COPASI:28 activation energy, pre-exponential factor and 3× reaction orders for both the major product 4 and regioisomer 5 formation (see eqn (1) and (2)). The measured flow reactor steady state concentrations were averaged over five NMR measurements and treated as individual reaction time points. The fitted model described the experimentally-observed concentrations of species 1, 3, 4 and 5 well (Fig. 5). The identified activation energy parameters were within a reasonable range for both major product 4 and regioisomer 5 (Ea1 = 35.1 kJ mol−1; Ea2 = 38.4 kJ mol−1). Interestingly, the fitted reaction orders (a, b, c) were below 1 for two of the three reactants in each rate equation, but not for the same reactant (b1 = 1.47, a2 = 1.12).
For this dataset, it is likely that the model is overfitted, due to the high number of fitted parameters (10), compared to the relatively small number of data points (16 data points from 4 different reaction conditions). Although this certainly represents a limitation of such a model, employing an overfitted model for comparison purposes within this study was not considered problematic. Furthermore, the reaction model was later tested with a wide range of reaction conditions, where it proved to perform to a satisfactory level. Experimental error may also arise in this approach, due to the slight fluctuations in steady state and the averaging used to take a single value.
To fit a similar predictive model as in the steady state experiments, a MatLab script was written in which the reaction input was discretized into segments of ∼10 s (aligned with NMR measurements). These segments were approximated as individual batch reactors, and their composition tracked by a standard batch kinetics approach, using ordinary differential equations, as they progressed through the reactor.19 This approach allowed each segment to experience different temperatures whilst travelling through the reactor, making it compatible with temperature ramps. The composition at the measurement point was then predicted and compared to the measured values. Dispersion in the system was simulated using the lsim tool with a continuous-time transfer function.
The model was parameterized using a bounded Nelder–Mead optimizer in MatLab (fminsearchbnd), aiming to minimize the normalized root mean square error (RMSE) of the four measured components (with equal weighting). Due to the large number of data points and parameters to fit, the parameterization was relatively slow (duration ∼16 h), but provided a low average RMSE of 5.17% for this experiment. As can be observed in the concentration plot (Fig. 6c) there is generally excellent agreement between the predicted and measured values for the components. Where disparity is observed, it appears to be characterized by a shift in time, rather than a difference in the overall prediction. This implies that the main downside of this model is its accounting for the reactor dynamics, rather than the kinetic model itself.
Experiment | Steady state RMSE [1]a | Dynamic RMSE [1]a | Multi-ramp RMSE [7]a |
---|---|---|---|
RMSE is calculated as a relative value, normalized versus the maximum observed concentration for each species. The value presented is the mean RMSE for all species 1, 3, 4 and 5.a The experiment set that was used to parameterize each model is shown in square brackets and highlighted in bold. | |||
1 | 6.30% | 5.17% | 7.77% |
2 | 10.08% | 9.24% | 10.10% |
3 | 8.09% | 8.60% | 7.15% |
4 | 11.83% | 12.74% | 8.74% |
5 | 10.16% | 10.92% | 7.60% |
6 | 9.14% | 9.13% | 6.58% |
7 | 6.27% | 6.74% | 5.38% |
Mean | 8.83% | 8.93% | 7.62% |
When comparing the performance of the differently parameterized models, the “multi-ramp” variant (trained using experiment 7) was superior for all validation experiments (with the exception of experiment 2). It should be expected that the model's own parameterization experiment provides the lowest RMSE, therefore the observed performance in experiment 1 is unsurprising.
There are numerous potential sources of error that likely contribute to these RMSE values. First and foremost, the model does not take into account reactor dynamics, meaning that the trace for measured vs. predicted results are often misaligned. Additionally, small variations in reaction solution concentrations and pump flow rates may contribute to overall errors. Finally, the chemometric model (IHM) inherently has error associated, particularly at low concentrations, which has an influence in both model fitting experiments and validation experiments.
All five variables that were ramped in the initial experiments were included as optimizable variables: 1) reaction concentration; 2) residence time; 3) reaction temperature; 4) acrylonitrile 3 loading; 5) DIPEA 2 loading. The experiments were, however, carried out as steady state reactions, by setting initial parameters and waiting for steady state to be reached (i.e. not as dynamic experiments). The self-optimization was initiated with a Latin hypercube (LHC) set of 10 experiments (2n, where n is the number of optimizable variables). Thereafter, 28 optimization experiments were performed, providing a dataset of 38 experiments.
After the initial LHC experiments, the TSEMO algorithm rapidly began to define a pareto front at complete conversion (Fig. 8a), which was relatively straightforward to achieve. Maximizing the space–time yield was then explored, where higher values were achieved at incomplete conversion. A maximum space–time yield of 7.37 kg L−1 h−1 was observed, where the reaction operated at only 82% conversion.
The measured values from these steady state experiments were then compared with the predictions from the simple kinetic model, parameterized using MatLab. Gratifyingly, there was excellent agreement between the predicted and measured values for triazole 1, acrylonitrile 3 and main product 4. The relative RMSE for these were 3.8%, 5.7% and 5.2%, respectively.
The measured and predicted values for main product 4 (Fig. 8b) are of particular interest, showing good parity between all points, aside from two outliers (consecutive experiments 31 and 32, where product is significantly over-predicted).
The relative RMSE for the side product 5 was far poorer than the other three components, at 27.9%. However, this was likely, in part, due to the lower concentrations involved (up to 0.13 M, c.f. ∼1.4 M for main product 4). The absolute RMSE is 38 mM (c.f. 72 mM for main product 4), which will be influenced by compound error in the chemometric model, pump flow rates and other factors.
With this validated model in hand, it should be noted that further in silico optimization can be performed, without experimental effort. For example, the same TSEMO multi-objective algorithm can be carried out with a far larger experiment set, allowing the pareto front to be fully defined in a short time period (see ESI†).
As previously discussed, the reactor dynamics are poorly accounted for using this modeling approach. Therefore, more complex corrections or alternative methods may be advantageous. In the accompanying manuscript, a flowsheet modeling approach with dispersion correction is presented and discussed in detail for the same reaction.21
Footnote |
† Electronic supplementary information (ESI) available: Additional experimental details and results. See DOI: https://doi.org/10.1039/d3re00243h |
This journal is © The Royal Society of Chemistry 2023 |