Felix Thelen, Lars Banko, Rico Zehl, Sabrina Baha and Alfred Ludwig*
Chair for Materials Discovery and Interfaces, Institute for Materials, Ruhr University Bochum, Universitätsstraße 150, 44780 Bochum, Germany. E-mail: alfred.ludwig@rub.de
First published on 19th September 2023
High-throughput experimentation enables efficient search space exploration for the discovery and optimization of new materials. However, large search spaces, e.g. of compositionally complex materials, require a significant reduction of characterization times. Here, an autonomous measurement algorithm was developed which leverages active learning with a Gaussian process model, iteratively scanning a materials library based on the highest prediction uncertainty. The algorithm is applied to a four-point probe electrical resistance measurement device, frequently used to obtain indications of regions of interest in materials libraries. Ten libraries with different complexities of composition and property trends were analyzed to validate the model. By stopping the process before the entire library is characterized and predicting the remaining areas, the measurement efficiency can be improved drastically. As robustness is essential for autonomous measurements, intrinsic outlier handling is built into the model, and a dynamic stopping criterion based on the mean predicted covariance is proposed. A measurement time reduction of about 70–90% was observed while still ensuring an accuracy above 90%.
High-throughput experiments usually consist of three main stages, starting with the combinatorial fabrication of hundreds of well-defined chemical compositions in the form of thin-film materials libraries.3 These can either have a continuous compositional gradient, e.g., generated by co-deposition magnetron sputtering,4 or can be ordered discretely, e.g., by inkjet printing techniques.5,6 An example of a co-sputtered materials library is shown in Fig. 1. After fabrication, the libraries are characterized by multiple techniques, ideally in parallel or by automated serial methods. These include, first, identification of the chemical compositions and their crystallographic structure, e.g., by energy-dispersive X-ray analysis (EDX) and X-ray diffraction (XRD), respectively. Second, functional properties are investigated based on the use cases of the fabricated materials, for example by electrical resistance or band gap measurements.1 Most high-throughput characterization instruments therefore consist of an automated positioning system which moves a sensor system over the materials library. After characterization, the large amounts of data generated along these steps are used to plan follow-up experiments. Many characterization techniques remain, however, rather time-consuming compared to the synthesis process; e.g., performing XRD measurements for hundreds of measurement areas on a single library can take 12–14 hours.7
Especially for the last stage, the application of machine learning and data mining under the paradigm of materials informatics8,9 has contributed significantly to navigating, exploring, and exploiting the high-dimensional materials search space more efficiently. In order to decrease the time needed for high-throughput characterization, active learning together with Gaussian process regression can be leveraged to autonomously determine materials properties across libraries. Instead of measuring all of the typically hundreds of measurement areas of a library consecutively at fixed coordinates, the algorithm decides the measurement sequence by building and updating a Gaussian process model during the procedure. Once the model's prediction is accurate enough, the process can be terminated, decreasing the total measurement time drastically: related work10,11 indicates a 10-fold time reduction. An essential factor for autonomous characterization is the robustness of the model, as it needs to be applicable to a wide variety of materials, and measurement procedures can be affected by systematic measurement errors.
To investigate the possibilities and limitations of this approach, the algorithm is tested on a custom-built high-throughput test-stand12 measuring the electrical resistance of materials libraries using the four-point probe method. The electrical resistivity in alloys is dependent on the crystal structure and is further influenced by all defects in the materials as electrons are scattered at lattice defects like voids, impurities, dislocations, and grain boundaries.7 Therefore, a mapping of the electrical resistance of a library can indicate different phase zones/regions and their boundaries1,12 and is thus a useful descriptor for finding areas of interest.
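For a collinear four-point probe on a thin film, the sheet resistance follows from the measured voltage-to-current ratio and a geometric correction factor. The snippet below is a minimal illustration of this standard relation, not the authors' implementation, and assumes the common thin-film approximation (film much thinner than the probe spacing, measurement far from the sample edges):

```python
import math

def sheet_resistance(voltage_v: float, current_a: float) -> float:
    """Sheet resistance of a thin film from a collinear four-point probe
    measurement, using the standard geometric correction factor
    pi / ln(2) ~= 4.532 (thin-film, infinite-sheet approximation)."""
    return (math.pi / math.log(2)) * voltage_v / current_a

# Example: 1 mV measured at a 1 mA source current
rs = sheet_resistance(1e-3, 1e-3)  # ~= 4.53 ohm per square
```

Dividing by the (x–y-dependent) film thickness would yield the resistivity, which is why thickness information matters for interpreting such maps.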
Ten libraries comprising a variety of metallic materials systems, fabricated with different methods such as co- and multilayer sputtering, were measured and analyzed to validate the performance of the developed algorithm.
After optimization of the hyperparameters, the predictions of a Gaussian process, given by the posterior mean and covariance, can be used to determine which additional training data instance can result in the highest model improvement. As the covariance is a measure of uncertainty of the Gaussian process model, selecting the instance with the highest covariance reduces the overall uncertainty efficiently.24
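The selection rule described above can be condensed into a small active-learning loop: fit the model on the measured areas, predict the posterior over all candidates, and measure where the posterior standard deviation is largest. The sketch below uses scikit-learn rather than the authors' code, and the `measure` function is a hypothetical stand-in for the four-point probe instrument:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def measure(X):
    """Hypothetical instrument stand-in: a smooth resistance trend
    over the normalized x-y coordinates of a materials library."""
    return np.sin(3 * X[:, 0]) + 0.5 * np.cos(2 * X[:, 1])

rng = np.random.default_rng(0)
candidates = rng.uniform(0, 1, size=(342, 2))  # unmeasured areas

# Initialize with a few measured areas, then always measure the
# candidate with the highest posterior standard deviation.
measured_idx = [int(i) for i in rng.choice(len(candidates), size=9,
                                           replace=False)]
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    X_train = candidates[measured_idx]
    gp.fit(X_train, measure(X_train))
    _, std = gp.predict(candidates, return_std=True)
    std[measured_idx] = -np.inf          # exclude already measured areas
    measured_idx.append(int(np.argmax(std)))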
This learning approach is especially useful in scenarios in which labels are expensive or time-consuming to generate. Therefore, active learning fits the conditions of materials discovery with its mostly elaborate measurement techniques.25 Examples of applications of active learning for materials discovery can be found in ref. 26–28.
Closely related to active learning is Bayesian optimization, which is in comparison more frequently applied in the field of materials discovery. In contrast to active learning, instead of learning an underlying function as efficiently as possible, Bayesian optimization aims to maximize a function globally.19 As materials discovery most often aims to identify materials with optimized properties while reducing the number of experiments, Bayesian optimization is applied frequently in the literature.10,25,29–34
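The difference between the two strategies can be made concrete with a toy posterior over a handful of candidate areas: active learning queries the most uncertain candidate, while a Bayesian optimization acquisition such as the upper confidence bound trades off predicted value against uncertainty. The numbers below are purely illustrative:

```python
import numpy as np

# Toy posterior over five candidate areas (predicted mean and std).
mu = np.array([0.2, 0.9, 0.5, 0.7, 0.1])
sigma = np.array([0.30, 0.05, 0.20, 0.10, 0.40])

# Active learning: query where the model is most uncertain.
next_al = int(np.argmax(sigma))                # -> area 4

# Bayesian optimization (upper confidence bound): query where a high
# value is most plausible, balancing mean and uncertainty.
kappa = 2.0
next_bo = int(np.argmax(mu + kappa * sigma))   # -> area 1
```

The two rules can pick different areas on the same posterior, which is why active learning suits property mapping while Bayesian optimization suits property maximization.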
| Material system | Sputter method | Substrate | T deposition | T annealing |
|---|---|---|---|---|
| Co–Fe–Mo–Ni–V | Co-sputtering | Si + SiO2 | 25 °C | — |
| Co–Fe–Mo–Ni–W–Cu | Co-sputtering | Si + SiO2 | 25 °C | — |
| Co–Cr–Fe–Mo–Ni | Co-sputtering | Si + SiO2 | 25 °C | — |
| Cr–Fe–Mn–Mo–Ni | Co-sputtering | Si + SiO2 | 25 °C | — |
| Co–Cr–Fe–Mn–Mo | Co-sputtering | Si + SiO2 | 25 °C | — |
| Ni–Al | Co-sputtering | Si + SiO2 | 25 °C | — |
| Co–Cr–W 1 | Multilayer | Al2O3 | 150 °C | 900 °C |
| Co–Cr–W 2 | Multilayer | Al2O3 | 150 °C | 750 °C |
| Co–Cr–W 3 | Multilayer | Al2O3 | 25 °C | 600 °C |
| Co–Cr–Mo | Multilayer | Al2O3 | 25 °C | 900 °C |
To increase the robustness of the algorithm, modifications to the standard Gaussian process were tested, ranging from the incorporation of substrate information into the training data to including the measurement variance in the model. Furthermore, a kernel test was performed by comparing the performance of the Gaussian process with various kernel functions.
Fig. 3 shows two iterations of the autonomous measurement using the example of the Co–Cr–Fe–Mn–Mo library. After measuring nine areas for initialization, the algorithm first selects areas at the edge of the library before concentrating on the inner parts. While parts of the library are still predicted incorrectly after five iterations, the ground truth and the prediction are visually almost identical after 15 iterations.
Depending on the acceleration voltage and the materials, the electron beam penetrates to different depths of up to several micrometers. Therefore, not only the deposited elements but also the substrate material can be included in the analysis. This can support the autonomous resistance measurements, as the substrate information generally correlates with the film thickness, which in turn influences the electrical resistance.
The performance of a standard Gaussian process with an SE kernel was evaluated to test the influence of the selection of constituents. One model was trained only on the compositional information of the deposited elements, and another on the composition data including the substrate contents. The (normalized) x- and y-coordinates were added to the training data as well, to help the Gaussian process model the thickness as a hidden, x–y-dependent property. Input and output standardization were used to improve numerical stability. The mean function of the Gaussian process was set to zero, as the model needs to be applicable to a variety of materials systems and no physical equation for the resistance distribution is available. The results of the first 250 training iterations are shown in Fig. 4. For more iterations, the Gaussian process tends to memorize the added training data, generally referred to as overfitting; subsequent iterations are therefore neglected.
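The described training setup (compositions plus normalized coordinates as inputs, standardized inputs and outputs, a zero prior mean, and one length scale per input dimension) can be sketched as below. The data are synthetic and the scikit-learn pipeline is an assumption for illustration, not the authors' implementation:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data for one library: per measurement area, the EDX
# composition (at%) and the normalized x-y stage coordinates, plus a
# toy resistance trend as the target.
rng = np.random.default_rng(1)
composition = rng.dirichlet(np.ones(5), size=50) * 100   # 5 constituents
xy = rng.uniform(0, 1, size=(50, 2))
X = np.hstack([composition, xy])                         # 7 input features
y = composition[:, 0] * 0.1 + xy[:, 0]

# Standardized inputs, zero prior mean (sklearn's default once the
# target is normalized), and an anisotropic SE kernel with one length
# scale per input so uninformative features can be down-weighted
# (automatic relevance determination).
model = make_pipeline(
    StandardScaler(),
    GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(7)),
                             normalize_y=True),
)
model.fit(X, y)
```

The anisotropic length scales are what later allow the model to nearly ignore the substrate feature when it carries little information.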
For most tested libraries, an accuracy higher than 90% after 50 iterations was observed using the standard implementation of the Gaussian process. The highest performance was achieved for the measurement of libraries which generally show unidirectional resistance gradients (the first five in Table 1).
Including the substrate information in the training data was mostly found to either not affect the performance or to slightly improve the accuracy and robustness of the prediction, since the electrical resistance depends on both composition and thickness. Only for one of the ten tested libraries (Co–Cr–Mo) did the inclusion of substrate information yield a substantially improved result. For this material, the resistance is therefore mainly dependent on the thickness rather than on the composition.
For the libraries Co–Cr–W 1 and Ni–Al, the Gaussian process shows a decrease in performance when trained on the substrate information, indicating that the resistance mainly depends on the composition. However, automatic relevance determination enables the Gaussian process to assign less weight to the substrate information, still allowing a sufficient fit. Additional noise introduced into the training data by the substrate information does not visibly affect the performance. Consequently, since including thickness information via the substrate content was shown to enable a more robust prediction, as much information as is available should be added to the training data.
This enables the Gaussian process to automatically weight the measurement results based on their reliability without modifying its architecture significantly. With this modification, the model is capable of dealing with homoscedastic as well as heteroscedastic aleatoric uncertainty, which originate in the measurement setup from dirty or rough surfaces and from the touchdown of the contact pins, respectively.
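One simple way to realize this weighting is to place each area's measurement variance on the diagonal of the kernel matrix; in scikit-learn this corresponds to passing an array-valued `alpha`. The sketch below is an assumption about how such a scheme can be wired up, not the authors' implementation, with an artificial outlier mimicking a short-circuited touchdown:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(40, 2))          # normalized x-y coordinates
y_true = np.sin(3 * X[:, 0]) + X[:, 1]

# Each touchdown yields repeated resistance readings; their mean is the
# target and their variance quantifies the reliability of that area.
y_var = np.full(40, 1e-4)
y = y_true + rng.normal(0.0, np.sqrt(y_var))
y[7] += 50.0                                  # simulated short-circuit outlier
y_var[7] = 1e4                                # ...flagged by a huge variance

# Per-sample noise variance on the kernel diagonal down-weights the
# outlier instead of letting it distort the entire fit.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=y_var,
                              normalize_y=True).fit(X, y)
```

Because the outlier's variance dwarfs the kernel entries, the posterior at that area is driven by its neighbors rather than by the corrupted reading.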
Fig. 5(a) compares the standard Gaussian process to the one trained with the measurement variance over the first 250 iterations. A full visualization can be found in the ESI.† Without outliers, both implementations show almost identical results; the mean deviation in accuracy across all tested libraries is 0.2%. This small improvement originates from the ability of the algorithm to detect minor measurement errors caused by variations in the orientation of the pins during each individual measurement. In order to investigate the performance at higher measurement noise, an accidental short-circuit of the pins was simulated by adding randomly generated noise in the range of 0.8–1.2 MΩ to three measurement areas across all libraries. The resulting resistance distributions can be found in the ESI.† In this simulation, it is assumed that one out of three touchdowns yields ten resistance measurement results with a large variance. The resulting performance of the vanilla Gaussian process and the one based on the measurement variance is shown in Fig. 5(b).
Fig. 5 Performance of the standard Gaussian process and the one with information about the measurement variance σm2. Both algorithms show comparable performance when trained on data with low noise levels (a), but when three artificial random outliers are added (b), the prediction of the standard Gaussian process fails as soon as an area with an outlier is reached.‡
While the standard Gaussian process fails to predict the distribution as soon as an outlier is measured, the active learning algorithm with integrated measurement variance continues the optimization once an outlier is reached, as it is capable of automatically weighting the output training data by its reliability.
Since the algorithm needs to be suitable for a large variety of materials and libraries, sufficient adaptability and stability are the most important factors in choosing the kernel. Each library was autonomously measured with each kernel, and the kernels were ranked by their performance. The results are summarized in Table 2.
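Such a kernel comparison can be sketched by fitting one Gaussian process per kernel family and comparing a goodness-of-fit proxy such as the log marginal likelihood. The data below are synthetic and this illustrates only the procedure, not the authors' ranking method:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(60, 2))
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 1])   # toy resistance map

# The four kernel families compared in the text: squared exponential,
# rational quadratic, and the Matern 3/2 and 5/2 variants.
kernels = {
    "SE":  RBF(),
    "RQ":  RationalQuadratic(),
    "M32": Matern(nu=1.5),
    "M52": Matern(nu=2.5),
}

# Fit one GP per kernel and record the log marginal likelihood of the
# optimized model as a simple comparison score.
scores = {}
for name, kernel in kernels.items():
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    scores[name] = gp.log_marginal_likelihood_value_
```

In practice the comparison would use held-out prediction accuracy per library, as in the rating of Table 2, rather than a single training-set score.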
Rated kernel performance (SE, RQ, M32, M52) and number of iterations until the measurement was stopped (optimal vs. stopping criterion):

| Material system | SE | RQ | M32 | M52 | Optimal | Criterion |
|---|---|---|---|---|---|---|
| Co–Fe–Mo–Ni–V | 3 | 0 | 3 | 4 | 20 | 41 |
| Co–Fe–Mo–Ni–W–Cu | 3 | 2 | 1 | 4 | 10 | 41 |
| Co–Cr–Fe–Mo–Ni | 2 | 0 | 3 | 4 | 16 | 46 |
| Cr–Fe–Mn–Mo–Ni | 2 | 0 | 3 | 4 | 35 | 47 |
| Co–Cr–Fe–Mn–Mo | 2 | 1 | 4 | 3 | 30 | 41 |
| Ni–Al | 3 | 3 | 4 | 2 | 80 | 88 |
| Co–Cr–W 1 | 3 | 4 | 2 | 2 | 40 | 41 |
| Co–Cr–W 2 | 3 | 3 | 4 | 4 | 50 | 52 |
| Co–Cr–W 3 | 3 | 2 | 2 | 4 | 80 | 100 |
| Co–Cr–Mo | 3 | 3 | 2 | 4 | 35 | 49 |
| Mean | 2.7 | 1.8 | 2.8 | 3.5 | | |
The accuracy improvement over the iterations can be found in the ESI.† Except for the RQ kernel, the performances of the different kernels were found to be very similar across all materials libraries. While no single kernel performed best for each of the ten libraries tested, the Matérn kernels were found to have slightly better prediction accuracy. Reasons for this are their larger set of hyperparameters and the resulting greater flexibility. The rational quadratic kernel, on the other hand, was the only kernel unable to approximate all libraries, failing entirely on four occasions. For future uses of the algorithm, the Matérn52 kernel was chosen.
In order to overcome this, a dynamic stopping criterion based on the predicted uncertainty of the Gaussian process is proposed. Simply defining an uncertainty threshold below which the process is terminated is not applicable either, as each measured library will have a different range of uncertainties depending on the noise level of the measurement and potential outliers. Therefore, the uncertainty over the training iterations needs to be observed relative to the initial uncertainty. The stopping logic is shown in Fig. 6 using the example of the Co–Cr–Fe–Mo–Ni library. The (unknown) accuracy of the optimization process, the normalized mean covariance predicted by the Gaussian process, and the numerically determined gradient of the normalized mean covariance are plotted over the training iterations.
After initialization of the Gaussian process, 30 areas are measured independently of the performance, ensuring a basic approximation of the dataset. Afterwards, the normalized predicted covariance of each iteration is observed. If the covariance of the current iteration is smaller than the initial covariance, the numeric derivative of the normalized mean covariance is calculated, and its progression is observed over the next ten iterations. This criterion is driven by the notion that, in order to terminate the process, the model at least needs to have an uncertainty lower than that of the initial iteration. If the model continues to improve its fit over the next ten iterations (indicated by a steady decrease in the mean covariance), the measurement process is stopped. This is determined by observing the gradient of the mean covariance, specifically by ensuring that it stays below the empirically found threshold of 1% per iteration. Otherwise, if the gradient is positive, the observation is reset and at least ten additional measurements are taken before the next termination is possible.
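The stopping logic described above can be condensed into a small helper operating on the history of predicted mean covariances. The function below is a sketch under the stated parameters (30 initial areas, a ten-iteration observation window, a 1% per-iteration gradient threshold); the names and exact conditions are illustrative assumptions, not the authors' code:

```python
import numpy as np

def should_stop(mean_cov_history, n_init=30, window=10, grad_threshold=0.01):
    """Dynamic stopping criterion sketch. Stopping requires:
    (1) at least n_init measured areas,
    (2) the current normalized mean covariance to be below the initial one,
    (3) a steady, settling decrease over the last `window` iterations,
        i.e. a non-positive numeric gradient whose magnitude stays below
        the threshold (here: 1% per iteration)."""
    cov = np.asarray(mean_cov_history, dtype=float)
    cov = cov / cov[0]                       # normalize to initial uncertainty
    if len(cov) < max(n_init, window + 1):
        return False
    if cov[-1] >= 1.0:
        return False
    grad = np.diff(cov[-(window + 1):])      # numeric derivative, last window
    return bool(np.all((grad <= 0) & (np.abs(grad) < grad_threshold)))
```

A rising covariance in the window makes the gradient positive, so the check fails and at least another window of measurements is taken, matching the reset behavior described above.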
Visualizations of the stopping criterion applied to the other libraries can be found in the ESI.† Table 2 compares the number of iterations determined via the shown stopping criterion with the optimal stopping decision, based on observing the accuracy of the algorithm until it reaches 90% as well as a visual representation of the prediction. In most cases, the developed stopping criterion overestimates the number of measurements to perform by a factor of 1.5–4. Although this could be fine-tuned by changing the fixed number of initial iterations or the number of iterations over which the mean covariance is supposed to decrease, this behavior is beneficial for this early implementation of the algorithm: in order to apply autonomous measurements to real-world everyday scientific workflows, enough trust in this technology needs to be established, which is why higher safety margins are useful during early adoption. However, for most tested libraries, the autonomous measurement could ideally be stopped after 6–16% of the normally measured areas of a library without a significant loss in quality. This applies especially to the tested co-sputtered HEA libraries, which feature uniform resistance gradients with fewer resistance variations. Poorer predictions were obtained for the libraries Ni–Al and Co–Cr–W 3. Reasons for this could be information missing from the training data, e.g., on surface oxidation or phase formation. For further studies, the incorporation of visual or crystal structure information could help improve the prediction in those cases. An analysis of the visual information of a library could also improve the selection of initial measurement areas.
In order to gain additional insight into and trust in the autonomous measurement procedure, the performance of the method can be evaluated in a long-term study while still measuring the entire library, thereby avoiding the risk of less accurate or even wrong experiment results. Despite the high efficiency improvement achieved, the autonomous measurement only decreases the absolute measurement duration by about 30–40 minutes, due to the already fast four-point probe measurement procedure. This is still important when a multitude of libraries needs to be measured as fast as possible and the implementation is part of a (semi)autonomous experimentation campaign. However, applying the approach to materials characterization devices demanding much more time can result in even higher absolute efficiency improvements. An example is temperature-dependent resistance measurements, where temperature cycling is inherently slow, taking 20–50 hours36 depending on the number of temperature steps and the temperature interval. In addition, the widely used EDX and XRD measurement techniques can profit from active learning optimization as well. Further progress in these areas depends on manufacturers, who need to provide application programming interfaces (APIs) for their highly specialized devices, which would enable intervening in the measurement processes via custom-made software.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3dd00125c
‡ A detailed version of the training accuracy and the noiseless and noisy training data can be found in the ESI.
This journal is © The Royal Society of Chemistry 2023