Jae Min Jeong‡a, Moonsoo Ra‡b, Jinha Jeong*b and Woong Lee*ac
aDept. of Materials Convergence and System Engineering, Changwon National University, 20 Changwondaehak-ro, Changwon-si, Gyeongsangnam-do 51140, Republic of Korea. E-mail: woonglee@changwon.ac.kr
bLightVision Inc., 20 Seongsuil-ro 12-gil, Seongdong-gu, Seoul 04793, Republic of Korea. E-mail: trizmaster@gmail.com
cSchool of Materials Science and Engineering, Changwon National University, 20 Changwondaehak-ro, Changwon-si, Gyeongsangnam-do 51140, Republic of Korea
First published on 10th June 2024
A deep convolutional neural network (DCNN) architecture ResNet has been tested to verify its ability to handle selected area electron diffraction pattern (SADP) datasets carrying information on lattice defects including strains, thermal lattice vibrations, point defects, dislocations, and twin boundaries. The disordered states of the crystal lattices in the presence of these defects were predicted by ab initio molecular dynamics simulations, first principles geometry optimizations, and lattice manipulation operations in an effort to establish a possible dataset augmentation strategy for the improvement of classification performance of the ResNet. Using the disordered lattice information originating from the defects, test dataset SADPs were generated by simulating electron diffraction in transmission electron microscopy. The ResNet, pre-trained using SADPs from defect-free materials, showed decreasing but acceptable classification accuracies with increasing degrees of lattice disorder regarding the lattice vibrations and point defects. When tested using the diffraction patterns for strained lattices, the ResNet responded to the changing lattice symmetry when strain levels are relatively high suggesting that it has capability to discern different symmetries induced by large strains. However, the ResNet failed to recognize lattice structure when dislocations and twin boundaries were considered. It is suggested that DCNN architectures be trained over various scenarios including changes in the image feature characteristics in the diffraction patterns related to defects in future developments for improved general classification performances.
The application of artificial intelligence, deep learning (DL) in particular, to the classification of crystal structures using SADPs is at an early stage of development. Ziletti et al.11 first employed a convolutional neural network (CNN) to classify space groups using SADP-like simulated diffraction patterns. They defined a space group descriptor by superposing diffraction patterns obtained and colour-indexed for the three major crystallographic axes, namely the a-, b-, and c-axes. For the selected space groups 139, 141, 166, 194, 221, 225, 227, and 229, training and test accuracies of 100% were reported. Tiong et al.12 constructed three colour-indexed geometric meshes by connecting diffraction spots on the diffraction patterns for the three major crystallographic axes; these meshes served as space group descriptors to be processed in a parallel multistream DenseNet. They achieved a classification accuracy of 80.1% over 72 space groups. The first attempt to mimic the human perception process for diffraction patterns was made in the previous work by the authors of this study.13 In that work, a CNN architecture, ResNet,14 was trained and validated using simulated TEM SADPs obtained for various combinations of acceleration voltages, camera lengths, and 16 zone axes ranging from [001] to [233] as they appear in standard diffraction patterns. A labelling scheme based on ‘how the SADPs appear’ and a classification algorithm based on the ensemble of inference probabilities were developed, and a validation accuracy of 92.6% was obtained for the space groups 213, 221, 225, 227 and 229. Further development of this scheme was reported recently by Chen et al.15 They adopted a vector map representation of diffraction patterns obtained from four-dimensional scanning transmission electron microscopy to train and validate a CNN architecture called PointNet. It was possible to identify all 7 crystal systems, from triclinic to cubic, instead of the space groups, with an accuracy of 94%.
While these works have demonstrated the potential of DL architectures for the analysis and classification of SADPs, many issues remain to be addressed in further developments. For instance, the laws of thermodynamics state that no material is defect free, and therefore DL architectures should be robust against noisy SADPs from specimens containing crystalline defects. Indeed, Ziletti et al. tested their CNN architecture against defects using a test dataset comprising simulated diffraction patterns for materials with vacancy concentrations of up to 25% and for materials with randomly displaced lattice atoms corresponding to changes of a few % in interatomic distances.11 Tiong et al. also considered the effect of vacancies by using diffraction patterns from materials containing vacancy concentrations of up to 20% as test datasets.12 These works demonstrated that their DL architectures, trained and validated using datasets for defect-free or perfect crystals, were robust to defect-related noisy diffraction patterns. Defects were considered from another perspective in a recent work by Chen et al., where the SADPs were artificially modified by displacing and deleting diffraction spots and by adding redundant spots. It was reported that the test accuracy gradually decreased with increasing noise levels.15
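The spot-level modifications mentioned above (displacing, deleting, and adding diffraction spots) can be illustrated with a minimal sketch. This is not the implementation used in the cited work; the function name `perturb_spots`, its parameters, and the toy 5 × 5 spot array are all hypothetical and chosen only for illustration.

```python
import random


def perturb_spots(spots, shift_sigma=0.0, drop_prob=0.0, n_extra=0,
                  extent=1.0, seed=None):
    """Return a noisy copy of a list of (x, y, intensity) diffraction spots."""
    rng = random.Random(seed)
    noisy = []
    for x, y, intensity in spots:
        # Delete a spot with probability drop_prob.
        if rng.random() < drop_prob:
            continue
        # Displace the surviving spot by Gaussian jitter.
        noisy.append((x + rng.gauss(0.0, shift_sigma),
                      y + rng.gauss(0.0, shift_sigma),
                      intensity))
    # Add redundant spots at random positions inside the pattern.
    for _ in range(n_extra):
        noisy.append((rng.uniform(-extent, extent),
                      rng.uniform(-extent, extent),
                      rng.random()))
    return noisy


# A 5 x 5 square array of unit-intensity spots as a stand-in pattern.
clean = [(h * 0.2, k * 0.2, 1.0) for h in range(-2, 3) for k in range(-2, 3)]
noisy = perturb_spots(clean, shift_sigma=0.01, drop_prob=0.1, n_extra=3, seed=1)
print(len(clean), len(noisy))
```

Such purely geometric noise is easy to generate, but it is not tied to any physical defect model, which is the limitation the following paragraph raises.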
Despite these attempts, it is still necessary to verify further whether DL architectures can classify noisy SADPs containing defect information, since there are crystalline defects other than vacancies and lattice disordering. For example, thermal lattice vibrations, lattice strains, impurities, dislocations, etc. can be considered.16 Noisy SADPs can be generated by adjusting the positions and brightness of the diffraction spots to represent defects, but such modifications would be more realistic if they were based on the actual changes in lattice structure caused by the presence of defects.17 Moreover, the previous work by the authors of this study has not been tested with noisy diffraction patterns. That work showed that the ResNet architecture could classify the space groups of materials using SADPs in their pristine form,13 whereas the other works pre-processed the SADPs by colour-indexing and superposition,11 constructing geometric meshes over them,12 or vectorizing the positions and intensities of the diffraction spots.15 From a practical viewpoint, it is desirable that the crystal structure classification task be carried out using SADPs as obtained directly from TEM. In this respect, it is necessary to test the ResNet architecture adopted in the previous study for tolerance to noise in SADPs due to crystalline defects. In this study, the capability of this ResNet 101 architecture to process noisy SADPs is addressed by considering defects in single crystal systems such as lattice strains, lattice vibrations, vacancies, impurities, dislocations, and twin boundaries. Changes in the lattice structures caused by these defects were first evaluated using theoretical calculations, ab initio simulations and lattice simulations to generate noisy SADPs. Based on the experiments using ‘realistic’ noisy SADPs, future directions for dataset augmentation to improve the crystal structure classification performance of DL architectures are discussed.
Type of defect | Calculation method | Output |
---|---|---|
Lattice strain | Solution of constitutive equation following Hooke's law for anisotropic materials | Lattice parameters for deformed unit cell |
Thermal vibration | Ab initio molecular dynamics (AIMD) simulations over 3 × 3 × 3 supercells | Disordered positions of lattice atoms |
Point defect | First principles geometry optimizations over 2 × 2 × 2 and 3 × 3 × 3 supercells | Displaced positions of lattice atoms |
Edge dislocation | Lattice manipulations to insert extra half planes above a slip plane | Displaced positions of lattice atoms |
Twin boundary | Lattice manipulations to invert stacking sequences with respect to a twin plane | Mirrored positions of lattice atoms with respect to the twin plane |
Fig. 1 Schematic showing the data flow structure to verify the classification performance of the ResNet architecture pre-trained with SADPs for pristine defect-free materials. |
Fig. 2 Changes in the classification accuracy of the ResNet 101 architecture with increasing magnitude of (a) the normal strain and (b) the shear strain. |
These changes in the test accuracy with varying strain can be understood by referring to the test SADPs for the model system of Al shown in Fig. 3 for the BD aligned with the [001] zone axis, i.e. BD = [001]. When the axial strain along the a-direction, εxx, is small (2.0% in this case), the SADP in Fig. 3a is hardly distinguishable from the SADP for εxx = 0, in that the four-fold symmetry with respect to the [001] axis appears to be maintained to human eyes. Once the strain is increased (5.0% in this case), elongation between the diffraction spots is noticed along the b*-direction (vertical direction in the figure) in Fig. 3b. Concerning the shear strain, it is noticed in Fig. 3c and d that the angle between the a*- and b*-directions becomes noticeably smaller than 90° as the shear strain (γxy) increases from 0.015 to 0.035 rad. In a strict sense, cooperative displacements or rotations of lattice planes along specific crystallographic directions accompany changes in the lattice symmetry. However, small changes in symmetry are hardly reflected in the SADPs. Thus, human materials scientists can find the proper symmetry of the material under investigation even when it is strained. Likewise, the ResNet architecture assigned the proper symmetry to the strained SADPs when the strains were not large. In fact, a normal strain of 1% would be reflected in the SADPs as a shift of the diffraction spot by 1 pixel in every 100 pixels, while the SADPs for the training and validation datasets had dimensions of 256 × 256 pixels. In this respect, it is expected that strains of small magnitude will not make significant changes to the pixel information handled by the ResNet architecture.
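The pixel estimate above can be checked with a short calculation. The sketch below assumes a simple geometric picture in which a reciprocal-lattice spacing scales as 1/(1 + ε) when the corresponding real-space spacing is stretched by a strain ε; the function name `spot_shift_pixels` is hypothetical, and the pattern centre is assumed to sit in the middle of the 256 × 256 image.

```python
def spot_shift_pixels(radius_px: float, strain: float) -> float:
    """Shift (in pixels) of a spot located radius_px from the pattern centre.

    The real-space spacing grows by (1 + strain), so the reciprocal-space
    distance of the spot from the centre shrinks by the same factor.
    """
    return radius_px - radius_px / (1.0 + strain)


image_size = 256
max_radius = image_size / 2  # farthest spot along an image axis

for strain in (0.01, 0.02, 0.05):
    print(f"strain={strain:.2f}  "
          f"shift at 100 px={spot_shift_pixels(100, strain):.2f}  "
          f"shift at edge={spot_shift_pixels(max_radius, strain):.2f}")
```

At 1% strain the shift stays close to one pixel even at the image edge, whereas at 5% it reaches several pixels, which is consistent with the strained patterns becoming visibly distorted only at the higher strain levels.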
Decreasing accuracies at higher strains, on the other hand, may suggest that the ResNet 101 architecture adopted in the previous study encounters problems when treating SADPs from strained specimens. However, the ResNet 101 architecture was trained only to classify 5 space groups in the cubic system in the previous study.13 The training datasets thus lack representations of the space groups that share similarities with the strained test SADPs. Consequently, it is expected that the ResNet architecture would try to assign the strained SADPs to one of these 5 space groups, leading to a notable decrease in classification accuracy at higher strains. Any lattice with an elongation along the a-direction is represented by the lattice parameters a ≠ b = c and α = β = γ = 90°, which corresponds to a tetragonal system. It is therefore expected that the ResNet architecture herein would have distinguished between cubic and tetragonal symmetry, had it been trained with the diffraction dataset for the tetragonal system as well. In a similar manner, a shear strain with respect to the c-axis results in a deformed lattice with the lattice parameters a = b = c and α = β = 90° ≠ γ, which corresponds to a special case of the triclinic or monoclinic system. Hence, the ResNet architecture would have assigned the SADPs for higher shear strains to monoclinic or triclinic systems, like that reported by Ziletti et al.11
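The lattice parameters quoted above follow from applying a small homogeneous strain to the cubic cell vectors. The sketch below is an illustration under stated assumptions, not the constitutive-equation solution used in the study: it applies the deformation I + ε directly, ignores Poisson contraction, splits the engineering shear γxy symmetrically (εxy = γxy/2), and uses a normalized cell edge of 1. The function names are hypothetical.

```python
import math


def _angle_deg(u, v):
    """Angle between two 3D vectors in degrees."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = (math.sqrt(sum(ui * ui for ui in u))
            * math.sqrt(sum(vi * vi for vi in v)))
    return math.degrees(math.acos(dot / norm))


def deform_cubic_cell(a0, eps_xx=0.0, gamma_xy=0.0):
    """Return (a, b, c, alpha, beta, gamma) of a strained cubic cell."""
    eps_xy = gamma_xy / 2.0
    # Small-strain deformation matrix I + strain tensor (no rigid rotation).
    deformation = [[1.0 + eps_xx, eps_xy, 0.0],
                   [eps_xy, 1.0, 0.0],
                   [0.0, 0.0, 1.0]]
    cell = [(a0, 0.0, 0.0), (0.0, a0, 0.0), (0.0, 0.0, a0)]
    a_vec, b_vec, c_vec = [
        tuple(sum(deformation[i][j] * v[j] for j in range(3)) for i in range(3))
        for v in cell
    ]
    lengths = [math.sqrt(sum(x * x for x in v)) for v in (a_vec, b_vec, c_vec)]
    return (*lengths,
            _angle_deg(b_vec, c_vec),   # alpha
            _angle_deg(a_vec, c_vec),   # beta
            _angle_deg(a_vec, b_vec))   # gamma


print(deform_cubic_cell(1.0, eps_xx=0.05))     # a differs from b = c, angles 90
print(deform_cubic_cell(1.0, gamma_xy=0.035))  # gamma falls below 90 degrees
```

In the shear case a and b remain equal to each other and differ from c only at second order in the strain, so a = b = c holds to first order, as stated in the text.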
Although lattice structures were predicted to be disordered due to thermal vibrations causing diffuse scattering of the electron beams as revealed in the resulting simulated SADPs, the classification capability of the ResNet architecture was not significantly influenced by the changes in the intensities of the diffraction signals. It exhibited the classification accuracy of 94.57 and 90.13% for the Al superlattices with simulated thermal disordering at 100 and 300 K, respectively. These accuracies are not much different from that for the diffraction patterns obtained at 0 K which was 98.65% demonstrating the robustness of the ResNet architecture in classifying the diffraction data having thermal noises originating from disordering. Gradual decrease in the accuracy with increasing temperature is the result of an increasing degree of disorder16 which resulted in more noises in the diffraction signals.
If lattice atoms are randomly displaced in direction, the crystal structure will have P1 symmetry (space group 1). However, such random displacements can be averaged over many atoms. Further, these displacements are only very small fractions of lattice lengths as mentioned above and shown in Fig. 4c and d. Hence, random changes in the atomic positions due to thermal vibration will not affect the lattice symmetry reflected in the SADPs. They only cause small changes in the intensities (brightness) of the diffraction spots5,20 that could be handled by the ResNet architecture without significant loss of accuracies. This is similar to the human perception process in which the geometric arrays of the diffraction spots are first recognized to assign the SADP of concern to an appropriate space group and then the diffuse diffraction spots are ascribed to the presence of some defects.5,20,21
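The averaging argument can be illustrated with a one-dimensional toy model. This is a sketch under simplifying assumptions (a single row of identical atoms, kinematic scattering, Gaussian random displacements given as a fraction of the lattice spacing) and is unrelated to the AIMD workflow of the study; the function name `bragg_intensity_ratio` is hypothetical. It evaluates the scattered intensity at a Bragg position of the undistorted lattice, relative to that of the perfect lattice.

```python
import cmath
import math
import random


def bragg_intensity_ratio(n_atoms, sigma, h=1, seed=0):
    """Intensity at Bragg order h for a 1D row of randomly displaced atoms,
    normalised by the intensity of the perfect row (n_atoms ** 2)."""
    rng = random.Random(seed)
    amplitude = 0j
    for n in range(n_atoms):
        u = rng.gauss(0.0, sigma)  # displacement in units of the lattice spacing
        amplitude += cmath.exp(2j * math.pi * h * (n + u))
    return abs(amplitude) ** 2 / n_atoms ** 2


sigma = 0.05  # hypothetical r.m.s. displacement, 5% of the lattice spacing
ratio = bragg_intensity_ratio(2000, sigma)
expected = math.exp(-(2 * math.pi * sigma) ** 2)  # Gaussian-average estimate
print(f"simulated ratio = {ratio:.3f}, Gaussian estimate = {expected:.3f}")
```

The Bragg position itself is fixed by the average lattice, so only the spot intensity drops in this model; the spot does not move. This mirrors the statement above that random displacements change brightness rather than the symmetry seen in the pattern.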
The classification capability of the ResNet architecture with respect to the point defects was verified with rather high defect concentrations of 1/32 (3.1 at%, 2 × 2 × 2 supercell) and 1/108 (0.93 at%, 3 × 3 × 3 supercell) for Al and 1/54 (1.8 at%, 3 × 3 × 3 supercell) and 1/128 (0.78 at%, 4 × 4 × 4 supercell) for Fe, respectively. Lattice deformations caused by the point defects of these concentrations, as predicted from the first principles geometry optimization, are visualized as projections of lattice points onto the x–y plane (viewing direction is [001]) in Fig. 6. It is seen that the lattice distortions are higher for higher defect concentrations while the vacancy tends to deform lattices more than substitutional impurities, which are more pronounced in Fe than in Al lattices. For instance, in 2 × 2 × 2 Al superlattice, among 31 lattice atoms, 8 atoms around a vacancy showed the largest displacement of the atomic position of 0.036 Å in the first principles geometry optimization while the remaining 23 atoms showed almost zero displacements. In 3 × 3 × 3 Fe superlattice, among 53 lattice atoms, 10 atoms around the vacancy were displaced by 0.14 Å, 26 atoms in the surrounding regions were displaced by 0.045 to 0.12 Å and the remaining 17 atoms showed almost zero displacements. In these deformed superlattices, locations of the defects were chosen arbitrarily, but these deformed lattices are repeated along the x, y, and z directions 10 times to create virtual specimens resembling nanoparticles to generate SADPs in the TEM SAD simulations. Hence, in these virtual TEM specimens, the defects themselves form another ordered array affecting the diffraction process, which would provide severer conditions with respect to causing ‘noises’ in the SADPs than defects with completely random distributions.
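The defect concentrations quoted above follow from the number of lattice sites in each supercell, counting 4 atoms per conventional FCC cell (Al) and 2 per conventional BCC cell (Fe). The short sketch below reproduces that arithmetic; the helper names are hypothetical.

```python
ATOMS_PER_CONVENTIONAL_CELL = {"fcc": 4, "bcc": 2}


def supercell_sites(lattice: str, n: int) -> int:
    """Number of lattice sites in an n x n x n supercell."""
    return ATOMS_PER_CONVENTIONAL_CELL[lattice] * n ** 3


def single_defect_at_percent(lattice: str, n: int) -> float:
    """Concentration (at%) when one site of the supercell hosts a defect."""
    return 100.0 / supercell_sites(lattice, n)


def atoms_in_virtual_specimen(lattice: str, n: int, repeats: int = 10) -> int:
    """Atoms left after tiling a one-vacancy supercell repeats^3 times."""
    return (supercell_sites(lattice, n) - 1) * repeats ** 3


for element, lattice, n in [("Al", "fcc", 2), ("Al", "fcc", 3),
                            ("Fe", "bcc", 3), ("Fe", "bcc", 4)]:
    sites = supercell_sites(lattice, n)
    print(f"{element} {n}x{n}x{n}: 1/{sites} = "
          f"{single_defect_at_percent(lattice, n):.3f} at%")
```

Because each supercell is repeated 10 times along x, y, and z, the single defect recurs at the same relative position in all 1000 copies, which is why the defects form an ordered array in the virtual specimen.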
Fig. 7 shows the simulated SADPs from Fe with point defects for the [001] zone axis. In these diffraction patterns, the relative intensities of the diffraction spots for higher index planes (smaller dots around the inner brighter ones) become weaker with the introduction of defects to the lattice, which is more prominent in the case of the vacancy. Such changes in the diffraction patterns subsequently result in the changes in the classification accuracy summarized in Table 2 for the vacancies and substitutional impurities. As expected, the presence of point defects at low concentrations does not have a noticeable effect on the classification accuracy. Once the defect concentrations increase, the accuracy becomes lower. Especially when the defect is a vacancy, which caused large lattice distortion in a substantial volume of the Fe supercell (Fig. 6d), the decrease in accuracy is somewhat noticeable, although 83.24% still seems acceptable. This indicates that large non-uniform deformation of the lattice structure and the resulting diffuse scattering of electrons, as reflected in the degraded quality of the SADPs, have a detrimental effect on the classification accuracy, although the accuracy is still acceptable when the defect concentration is within practical ranges.
Fig. 7 SADPs obtained from (a) pristine Fe lattice, (b) Fe 3 × 3 × 3 supercell with one vacancy and (c) Fe 3 × 3 × 3 supercell with one Au atom substituting one lattice Fe atom. |
Type of defect | Defect concentration in supercells | Classification accuracy |
---|---|---|
Vacancy | Al 3 × 3 × 3 supercell: 1/108; Fe 4 × 4 × 4 supercell: 1/128 | 97.79% |
Vacancy | Al 2 × 2 × 2 supercell: 1/32; Fe 3 × 3 × 3 supercell: 1/54 | 83.24% |
Substitutional impurity (Si in Al/Au in Fe) | Al 3 × 3 × 3 supercell: 1/108; Fe 4 × 4 × 4 supercell: 1/128 | 97.55% |
Substitutional impurity (Si in Al/Au in Fe) | Al 2 × 2 × 2 supercell: 1/32; Fe 3 × 3 × 3 supercell: 1/54 | 94.90% |
In some materials systems, foreign atoms smaller than the hosting lattice atoms may occupy the interstitial sites. Since the interstitial sites do not provide enough volume to accommodate the impurity atom, neighbouring lattice near the impurity is distorted. Depending on the size of the impurity, the extent of distortions may differ. In the case of C impurity taking a tetrahedral site in Al lattice, the first principles geometry optimization for a 3 × 3 × 3 supercell (one C atom in 108 Al atoms) predicted that the lattice distortion is limited only to the first neighbouring lattices as shown in Fig. 8a. This highly localized lattice deformation did not have a noticeable effect on the diffraction process when the SADP in Fig. 8b is compared with that for a defect-free Al in Fig. 5a. The classification accuracy for the SADPs for this system was 95.48%, close to the accuracy of 98.65% for the SADPs from a pristine Al single crystal.
Fig. 8 (a) Predicted atomic positions when a C atom occupies a tetrahedral interstitial site in an Al 3 × 3 × 3 supercell and (b) corresponding simulated SADP for BD = [001]. |
Unlike the cases of lattice strain, thermal vibration, and point defects, the ResNet architecture failed to classify these SADPs, showing a classification accuracy of only 7.97%. The image features included in the SADPs from specimens containing dislocations are diffuse disks with streaks. These features are obviously different from the diffraction spots included in the SADPs used for training the ResNet, viz. SADPs from defect-free specimens.13 A human materials scientist will resort to knowledge and experience to assign the SADPs in Fig. 9 to the space group 225 for the zone axes of [112], [111], and [110], respectively, from the overall arrangements of the diffraction spots. Specific features such as diffuse disks and streaks will then be ascribed to the dislocation array.22 On the other hand, the ResNet architecture tries to match the image to the closest model (function) that has been established during the training process. If the features in the test image are substantially different from those in the training image dataset, such that the ‘noises’ are essentially new features, then the ResNet will fail to classify the image properly.
Failure of the ResNet in classifying SADPs is further expected when a TEM specimen contains a twin boundary, which forms a mirror plane between two crystals having the same crystal structure but inverted stacking sequences. Hence, the SADPs from a specimen having a twin boundary will appear as a superposition of two diffraction patterns reflected with respect to each other if the beam direction is normal to both the twinning direction and the twinning plane normal. SADPs from specimens containing one twin boundary are shown in Fig. 10a and b for the model systems of Al and Fe, respectively. In Al, the twin plane is (111) and the twinning direction is [112],24 and Fig. 10a was obtained for BD = [110]. As for Fe, the twin plane is (121) and the twinning direction is [111],20 and Fig. 10b was obtained for BD = [101]. These SADPs differ substantially from the standard diffraction patterns, as exemplified by the double or quadruple diffraction spots with large separations, which are the traces of twins. For the SADPs from the twinned crystals, the ResNet architecture showed a classification accuracy of 33.80%. In the case of SADPs from materials with twin boundaries, difficulty arises even for a human materials scientist unless the overlapping of two inverted patterns is noticed by analysis or by intuition based on experience. Since an SADP from a material containing a twin boundary is essentially a superposition of two reflected SADPs, it can be treated as an SADP of a different kind containing new features, like SADPs from textured structures. In the latter case, several essentially identical SADPs are superposed with small rotations around an axis normal to the centre of the incident electron beam. Such a feature was not included in the training dataset,13 and therefore the SADPs from a twin system would be treated as an unknown type rather than as noisy patterns. It would then be natural to expect that the ResNet is unable to classify such an SADP, as in the case of dislocations.
Fig. 10 Simulated diffraction patterns for (a) Al (BD = [110]) and (b) Fe (BD = [101]) single crystals with one twin boundary. |
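The description of a twinned SADP as a superposition of two mutually reflected patterns can be sketched geometrically. The example below is a toy illustration only: it handles spot positions but no intensities or double diffraction, it uses a hypothetical oblique 2D reciprocal net rather than the actual Al or Fe patterns, and the function names are assumptions. Spots are mirrored across a line through the pattern centre (the trace of the twin plane) and merged with the originals.

```python
import math


def mirror_spots(spots, axis_angle_deg):
    """Reflect (x, y) spots across a line through the origin at the given angle."""
    t = math.radians(axis_angle_deg)
    dx, dy = math.cos(t), math.sin(t)
    mirrored = []
    for x, y in spots:
        proj = x * dx + y * dy  # component along the mirror line
        mirrored.append((2 * proj * dx - x, 2 * proj * dy - y))
    return mirrored


def twinned_pattern(spots, axis_angle_deg, tol=1e-6):
    """Superpose a pattern with its mirror image, merging coincident spots."""
    combined = list(spots)
    for mx, my in mirror_spots(spots, axis_angle_deg):
        if all(math.hypot(mx - x, my - y) > tol for x, y in combined):
            combined.append((mx, my))
    return combined


# Hypothetical oblique net with basis vectors (1, 0) and (0.3, 1).
matrix = [(h + 0.3 * k, float(k)) for h in range(-2, 3) for k in range(-2, 3)]
twin = twinned_pattern(matrix, axis_angle_deg=0.0)
print(len(matrix), len(twin))
```

Only the row of spots lying on the mirror line is shared by both patterns; every other spot gains a partner, producing the doubled spots that are absent from patterns of untwinned crystals.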
Considering the theoretical foundations of deep learning, the ResNet architecture's robustness against defects demonstrated in this study, be it strong as in the case of lattice disorder and point defects or weak as in the case of dislocations and twin boundaries, may hinge on two key factors. One is the dimensions of the SADPs utilized during training and the other is composition of the dataset, which includes the data augmentation technique.29,30 Concerning the dataset dimension, it was found in a preliminary investigation that the ResNet became more sensitive to even smaller defect-related noises in the dataset if the simulated SADPs for training have resolution exceeding 256 × 256 pixels. This susceptibility arises from the detailed changes present in the simulated SADPs due to the defects themselves. On the contrary, decreasing the image resolution would result in the loss of details in the lattice structure information, leading to degradation of general classification capability even over the pristine (clean) SADPs.
Meanwhile, most of the limitations in the ResNet architecture's performance encountered in this study stem from the dataset characteristics. Despite the exceptional performance of recent deep learning applications, their success often depends on vast quantities of training data that comprehensively cover various real-world scenarios. For tasks such as SADP classification, on the other hand, acquiring datasets that encompass nearly all real-world cases is an almost impossible task. Therefore, employing appropriate data augmentation techniques and/or more comprehensive SADP simulations becomes crucial in overcoming such limitations. As demonstrated in this study as well as in previous studies elsewhere,11,12 the DL architectures are robust against noisy SADPs originating from low dimensional defects since the ‘noises’ in this case are small changes in the brightness of the diffraction spots. On the other hand, as seen in this study, defects of higher dimensions introduce new features in the SADPs that are not included in the ‘clean’ training SADP datasets, which leads to a substantial decrease in the classification accuracies. Hence, one possible strategy of dataset augmentation for improved classification accuracy over the noisy SADPs would be the preparation of training and validation datasets having these new features due to dislocations and twin boundaries. At this stage, however, it is a challenging task to prepare noisy SADPs for these defects for numerous materials systems in large quantities. One possible alternative method to tackle this problem would be the use of image translation strategies such as cycle-GAN,31 by which the test SADPs containing high dimensional defect information can be transformed into those resembling the ‘clean’ training dataset. Considering these factors for enhancing the performance of deep learning models would stand as a key direction for future developments.
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3ra08939h |
‡ These authors equally contributed as the first authors. |
This journal is © The Royal Society of Chemistry 2024 |