Jungtaek Kim a, Mingxuan Li a, Yirong Li a, Andrés Gómez b, Oliver Hinder a and Paul W. Leu *a
aUniversity of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA. E-mail: pleu@pitt.edu
bUniversity of Southern California, Los Angeles, California 90089, USA
First published on 15th December 2023
The design of optical devices is a complex and time-consuming process. To simplify this process, we present a novel framework of multi-fidelity multi-objective Bayesian optimization with warm starts, called Multi-BOWS. This approach automatically discovers new nanophotonic structures by managing multiple competing objectives and utilizing multi-fidelity evaluations during the design process. We employ our Multi-BOWS method to design an optical device specifically for transparent electromagnetic shielding, a challenge that demands balancing visible light transparency and effective protection against electromagnetic waves. Our approach leverages the understanding that simulations with a coarser mesh grid are faster, albeit less accurate than those using a denser mesh grid. Unlike the earlier multi-fidelity multi-objective method, Multi-BOWS begins with faster, less accurate evaluations, which we refer to as “warm-starting,” before shifting to a dense mesh grid to increase accuracy. As a result, Multi-BOWS demonstrates 3.2–89.9% larger normalized area under the Pareto frontier, which measures a balance between transparency and shielding effectiveness, than low-fidelity only and high-fidelity only techniques for the nanophotonic structures studied in this work. Moreover, our method outperforms an existing multi-fidelity method by obtaining 0.5–10.3% larger normalized area under the Pareto frontier for the structures of interest.
Designing an optical device requires defining a parametric design space for these devices and identifying specific objective functions to optimize. However, the design process of an optical device can be complex due to the need to balance several distinct competing objectives over many parameters. For instance, in lens design, factors such as resolution, wavelength range, and field angles need to be considered.4 Antireflection coating design requires the minimization of reflection at multiple wavelengths and angles.5,6 For light-emitting diodes, considerations include efficiency, color rendering, lifetime, and thermal management.7
We can use one of several electrodynamic methods, such as rigorous coupled-wave analysis,8 finite element method,9 or finite-difference time-domain method,10 to simulate an optical device. Interestingly, these simulation methods involve different levels of fidelity, such as mesh resolution, frequency domain decomposition, and time step. Different nanophotonic structures can be evaluated at lower fidelity, which is less expensive but more prone to noise, or at higher fidelity, which is costlier but yields more accurate results. Both low-fidelity and high-fidelity evaluations are valuable due to their unique properties related to accuracy and time efficiency.
To efficiently design an optimal optical device, we propose a framework of multi-fidelity multi-objective Bayesian optimization with warm starts, called Multi-BOWS. This framework combines multiple objectives and multi-fidelity evaluations in the design process of optical devices with electrodynamic simulations. We utilize Bayesian optimization,11–13 a sample-efficient technique for black-box optimization, which has been shown to effectively automate structure discovery.14–24 This automatic discovery process allows us to investigate a high-dimensional search space of disparate nanophotonic structures, reducing human intervention in the design process. Specifically, we use the Pareto frontier of low-fidelity evaluations to kickstart the high-fidelity Bayesian optimization by providing better initial points and thereby accelerating the optimization process.
We demonstrate the effectiveness of our Multi-BOWS method in the specific context of designing optical devices for transparent electromagnetic shielding. This requires a structure with high visible transparency and efficient electromagnetic shielding. Our findings show that Multi-BOWS outperforms several approaches that use low-fidelity evaluations only, high-fidelity evaluations only, or a multi-fidelity approach that uses a mix of both.25 In particular, our method achieves 3.2–89.9% larger normalized area under the Pareto frontier (AUPF) than the low-fidelity only and high-fidelity only techniques for the nanophotonic structures investigated in our work. Moreover, it achieves 0.5–10.3% larger AUPF than the earlier multi-fidelity method for the structures studied in this work.
Electromagnetic shielding is crucial for safeguarding electronic devices and circuits by mitigating electromagnetic interference.26–31 This has been a major research focus for a variety of applications such as protecting RFID chips from radio-frequency interference and shielding medical implants from electromagnetic waves. Besides reducing interference, some applications such as consumer electronics, automotive and aviation, medical devices, and building windows need to fulfill additional design objectives like visible transparency. The simultaneous consideration of several different factors complicates the task of identifying the most effective structure.
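Formally, suppose that we have an objective for transparency, denoted as ftr, and another for shielding effectiveness (SE), denoted as fse. An optimal structure for transparency, $\mathbf{x}^{\star}_{\mathrm{tr}}$, and one for SE, $\mathbf{x}^{\star}_{\mathrm{se}}$, can be defined by solving the following equations over the search space $\mathcal{X}$ of structure parameters:
$\mathbf{x}^{\star}_{\mathrm{tr}} = \arg\max_{\mathbf{x} \in \mathcal{X}} f_{\mathrm{tr}}(\mathbf{x}),$ | (1) |
$\mathbf{x}^{\star}_{\mathrm{se}} = \arg\max_{\mathbf{x} \in \mathcal{X}} f_{\mathrm{se}}(\mathbf{x}).$ | (2) |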
In addition to the aforementioned complexities of multi-objective optimization, a specific expression of ftr cannot be explicitly obtained for many structures and requires electrodynamic simulations. Simulating a nanophotonic structure to evaluate ftr is a time-consuming task because an accurate evaluation requires a dense mesh grid. These challenges make a compelling case for employing a black-box optimization technique for a costly function. Notably, Bayesian optimization is a sample-efficient black-box optimization strategy that stands as a suitable candidate to tackle this problem.11–13 Furthermore, by utilizing the nature of mesh-based simulations, we can evaluate less expensive, albeit noisier functions using a coarse mesh grid, rather than more expensive but more accurate functions using a dense mesh grid. Hence, we can define a low-fidelity multi-objective function [flowtr, flowse] and a high-fidelity multi-objective function [fhightr, fhighse]. These functions, with varying degrees of accuracy, help us to select an optimal structure with respect to both objectives from a multi-fidelity optimization standpoint. This approach allows us to strike a balance between evaluation accuracy and time.
To overcome this challenge, it is necessary to define three key elements:
• structure representation: this is the process of expressing a structure of interest as a specific type of input, such as discrete variables,
• evaluation function: this is used to assess the performance of a particular structure, and
• decision-making policy: this is a strategy used to identify potential optimal structures based on previously evaluated structures and their corresponding evaluations.
In this paper, we carefully design structure representations, taking into account the structures described in this section and the feasibility of structures. The evaluation function is then defined based on the structure representation to measure specific properties. Our framework considers the multiple objectives of transparency and SE. Moreover, this evaluation function of transparency is inherently black-box, as it cannot be explicitly expressed as a function. Lastly, the decision-making process incorporates both the structure representation and the evaluation function, to sequentially recommend optimal structure candidates.
On the other hand, topology optimization can be employed in the design of photonic structures, leveraging gradient information with respect to these structures.42–44 Previous research has shown that combining adjoint methods with topology optimization is a powerful approach for tackling inverse design problems in photonics.45–48 These methods rely on gradients and typically use gradient-based optimization techniques to find solutions. However, these approaches may be limited when objective functions are complex and it is important to find a global optimum as opposed to local optima.
The strengths of Bayesian optimization have been validated in attractive real-world problems, including optimizing chemical reactions,18 battery charging protocols,16 automatic chemical design,17 and automated machine learning.52,53 Building on this work, Bayesian optimization is particularly well-suited for optimizing nanophotonic structures where a structure representation and evaluation functions are already provided. Specifically, it excels in optimizing objectives when categorical and discrete variables are present.51,58,59
Suppose that we do not know an objective function f and can only evaluate a d-dimensional query point $\mathbf{x} \in \mathcal{X}$ from f, where $\mathcal{X} \subset \mathbb{R}^d$ is a d-dimensional search space, i.e., typically a hypercube. Bayesian optimization sequentially optimizes f by selecting a solution candidate at each iteration. Initially, we construct a surrogate function, often using probabilistic regression, based on the points already evaluated and their evaluations. Gaussian process regression is a popular surrogate function in the Bayesian optimization community,60 though other models such as random forests,61 tree-based surrogate models,59 and Bayesian neural networks62 can also be used. For our problem, we utilize a Gaussian process-based surrogate model. Using the surrogate function, we define an acquisition function a to select the next query point. Various acquisition functions exist, including the probability of improvement,11 expected improvement,63 Gaussian process upper confidence bound,64 and a portfolio of existing acquisition functions.65 This work uses the expected improvement, aligning with numerous studies that attest to its robustness.18,49,50,66
Recent research in Bayesian optimization has explored multi-fidelity methods, which seek a balance in evaluations across varying levels of fidelity.67–69 In parallel, multi-objective Bayesian optimization has been developed to optimize multiple objectives simultaneously.70–72 More recent efforts have sought to combine these two concepts into multi-fidelity multi-objective Bayesian optimization, either by introducing continuous fidelity levels as an optimizable parameter73,74 or by maximizing information gain per unit cost of resources.25
(a) three-layer structure,
(b) matched-period double-sided nanocone structure,
(c) unmatched-period double-sided nanocone structure, and
(d) meta-structure.
The three-layer and matched-period double-sided nanocone structures have been previously explored.32 In this paper, we introduce two new structures: (c) the unmatched-period double-sided nanocone structure and (d) the meta-structure.
Fig. 2 depicts how the parameters of a structure are defined. The structure's parameters include the silver-layer thickness ts, upper-layer thickness tu, lower-layer thickness tl, heights for upper and lower cones hu, hl, bottom and top radii for upper cones rub, rut, bottom and top radii for lower cones rlb, rlt, pitches for upper and lower cones au, al, and the number of upper and lower cones nu, nl. Depending on the specific structure, some parameters may not be applicable. For instance, in a three-layer structure, the parameters related to the upper and lower cones are disregarded, and only ts, tu, and tl are utilized. For a matched-period double-sided nanocone structure, nu and nl are disregarded and au is equal to al.
Table 1 provides a detailed description of the parameter ranges and constraints. All parameters except for au and al are discretized to integers. Several constraints are applied as follows: rut < rub, rlt > rlb, 2rub ≤ au, 2rlt ≤ al, and nuau = nlal. For easier management of the constraints, rut < rub and rlt > rlb, we introduce new variables qru and qrl as follows: qru = rut/rub and qrl = rlb/rlt where qru and qrl are both variables ∈ [0, 1]. Moreover, the next acquired point is only sampled over the region of the parameter space that is known to be feasible. If the proposed solution of an acquisition function violates a constraint, then it is instead evaluated at the boundary of that constraint.
Parameter | Symbol | Range | Constraints |
---|---|---|---|
Silver-layer thickness | ts | {3, 4, …, 20} | none |
Upper-layer thickness | tu | {5, 6, …, 100} | none |
Lower-layer thickness | tl | {5, 6, …, 100} | none |
Height of upper cones | hu | {50, 51, …, 400} | none |
Height of lower cones | hl | {50, 51, …, 400} | none |
Pitch for upper cones | au | [20, 400] | nuau = nlal, 2rub ≤ au |
Pitch for lower cones | al | [20, 400] | nuau = nlal, 2rlt ≤ al |
Bottom radius of upper cones | rub | {10, 11, …, 100} | rut < rub, 2rub ≤ au |
Top radius of upper cones | rut | {1, 2, …, 99} | rut < rub |
Bottom radius of lower cones | rlb | {1, 2, …, 99} | rlt > rlb |
Top radius of lower cones | rlt | {10, 11, …, 100} | rlt > rlb, 2rlt ≤ al |
The number of upper cones | nu | {1, 2, …, 10} | nuau = nlal |
The number of lower cones | nl | {1, 2, …, 10} | nuau = nlal |
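As a rough illustration of how the constraints in Table 1 could be checked in code, the following sketch lists the violated constraints of a candidate and maps a ratio in [0, 1] back to an integer top radius. The names `ConeParams`, `violated_constraints`, and `top_radius_from_ratio` are hypothetical, and the rounding and clamping rule is an assumption; the source only states that qru = rut/rub and qrl = rlb/rlt lie in [0, 1].

```python
from dataclasses import dataclass

@dataclass
class ConeParams:
    """Cone-related parameters from Table 1 (lengths in the table's units)."""
    r_ub: int    # bottom radius of upper cones
    r_ut: int    # top radius of upper cones
    r_lb: int    # bottom radius of lower cones
    r_lt: int    # top radius of lower cones
    a_u: float   # pitch for upper cones
    a_l: float   # pitch for lower cones
    n_u: int     # number of upper cones
    n_l: int     # number of lower cones

def violated_constraints(p, tol=1e-9):
    """Return the names of the Table 1 constraints that p does not satisfy."""
    checks = {
        "rut < rub": p.r_ut < p.r_ub,
        "rlt > rlb": p.r_lt > p.r_lb,
        "2rub <= au": 2 * p.r_ub <= p.a_u + tol,
        "2rlt <= al": 2 * p.r_lt <= p.a_l + tol,
        "nu*au = nl*al": abs(p.n_u * p.a_u - p.n_l * p.a_l) <= tol,
    }
    return [name for name, ok in checks.items() if not ok]

def top_radius_from_ratio(r_ub, q_ru):
    """Map q_ru in [0, 1] to an integer r_ut with 1 <= r_ut < r_ub.

    Rounding and clamping are assumptions made for this sketch.
    """
    return min(max(round(q_ru * r_ub), 1), r_ub - 1)

feasible = ConeParams(r_ub=50, r_ut=20, r_lb=20, r_lt=50,
                      a_u=100.0, a_l=200.0, n_u=2, n_l=1)
too_tight = ConeParams(r_ub=50, r_ut=20, r_lb=20, r_lt=50,
                       a_u=90.0, a_l=200.0, n_u=2, n_l=1)
print(violated_constraints(feasible))
print(violated_constraints(too_tight))
```

Working with the ratio variables keeps the two strict radius inequalities satisfied by construction, so only the pitch-related constraints remain to be enforced at acquisition time.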
Besides the three basic structures – three-layer, matched-period double-sided nanocone, and unmatched-period double-sided nanocone structures – each defined by a specific set of parameters, we introduce a new type called a meta-structure. This is a generalized structure and it is introduced to optimize the structure from the perspective of automatic structure discovery. To accommodate the meta-structure, we include an extra parameter – structure selection parameter – that allows the selection of one structure among various structures. In our study, we consider five types of structures: three-layer, single-sided (upper) nanocone, single-sided (lower) nanocone, matched-period double-sided nanocone, and unmatched-period double-sided nanocone structures. It is worth noting that the types of structures can be easily expanded by altering the potential choices for the structure selection parameter.
Before explaining the details of Multi-BOWS, we enumerate the high-level procedure of our algorithm:
(i) selection of initial points for low-fidelity multi-objective Bayesian optimization,
(ii) execution of low-fidelity multi-objective Bayesian optimization, constrained by a time budget allocated for the low-fidelity Bayesian optimization,
(iii) identification of the Pareto frontier (i.e., optimal solutions) from low-fidelity evaluations,
(iv) warm-starting of high-fidelity multi-objective Bayesian optimization using the identified Pareto frontier,
(v) execution of high-fidelity multi-objective Bayesian optimization, constrained by a time budget allocated for this high-fidelity Bayesian optimization, and
(vi) identification of the Pareto frontier from high-fidelity evaluations.
This procedure is visually outlined in Fig. 3. Steps (i) and (iv) act as initialization steps, Steps (ii) and (v) are considered as optimization phases, and Steps (iii) and (vi) focus on identifying the Pareto frontiers.
Firstly, a certain number of initial points are randomly sampled in Step (i), using uniform distributions or low-discrepancy sequences like the Sobol’ sequence.77 However, unlike Step (i), Step (iv) uses the Pareto frontier from low-fidelity evaluations as initial points for high-fidelity multi-objective Bayesian optimization. If the number of points on the Pareto frontier exceeds the predefined number of initial points, we randomly select the required number of points from the Pareto frontier. Steps (ii) and (v), similar to the standard Bayesian optimization algorithm, sequentially determine the query points based on the allocated time budgets.
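Steps (i) to (vi) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' implementation: the budgets are counted in evaluations rather than wall-clock time, the acquisition step is replaced by a random proposal passed in as `propose`, and the toy objectives `f_low` and `f_high` are hypothetical stand-ins for coarse-mesh and dense-mesh simulations.

```python
import math
import random

def dominates(w, v):
    """True if w is at least as good as v in every objective and better in one."""
    return all(a >= b for a, b in zip(w, v)) and any(a > b for a, b in zip(w, v))

def pareto_front(points, values):
    """Keep the (point, value) pairs that no other evaluation dominates (maximization)."""
    return [(x, v) for x, v in zip(points, values)
            if not any(dominates(w, v) for w in values)]

def run_bo(evaluate, initial_points, n_queries, propose, rng):
    """Evaluate the initial points, then query n_queries proposed points in sequence."""
    points = list(initial_points)
    values = [evaluate(x) for x in points]
    for _ in range(n_queries):
        x = propose(points, values, rng)
        points.append(x)
        values.append(evaluate(x))
    return points, values

def multi_bows(f_low, f_high, sample, propose, n_init, n_low, n_high, rng):
    init = [sample(rng) for _ in range(n_init)]               # step (i)
    pts, vals = run_bo(f_low, init, n_low, propose, rng)      # step (ii)
    warm = [x for x, _ in pareto_front(pts, vals)]            # step (iii)
    if len(warm) > n_init:                                    # step (iv)
        warm = rng.sample(warm, n_init)
    pts, vals = run_bo(f_high, warm, n_high, propose, rng)    # step (v)
    return pareto_front(pts, vals)                            # step (vi)

# Hypothetical toy problem: one design variable in [0, 1], two competing objectives.
def f_high(x):
    return (x, 1.0 - x * x)

def f_low(x):
    # Cheaper proxy: same trend as f_high with a deterministic distortion.
    return (x + 0.05 * math.sin(40.0 * x), 1.0 - x * x)

def random_proposal(points, values, rng):
    # Placeholder for maximizing an acquisition function.
    return rng.random()

rng = random.Random(0)
front = multi_bows(f_low, f_high, lambda r: r.random(), random_proposal,
                   n_init=10, n_low=40, n_high=20, rng=rng)
print(len(front))
```

The only coupling between the two phases is the list `warm`: the low-fidelity surrogate is discarded, and only the low-fidelity Pareto-optimal inputs are carried over and re-evaluated at high fidelity.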
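In order to determine the next point, we first create a Gaussian process regression model to serve as a surrogate function. Given a set of data points $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_n]$ and their corresponding responses $\mathbf{y} = [y_1, \ldots, y_n]^\top$, a posterior predictive distribution over the function value $f(\mathbf{x})$ at a query point $\mathbf{x}$ is defined by the following:
$f(\mathbf{x}) \mid \mathbf{X}, \mathbf{y} \sim \mathcal{N}\left(\mu(\mathbf{x} \mid \mathbf{X}, \mathbf{y}), \sigma^2(\mathbf{x} \mid \mathbf{X}, \mathbf{y})\right),$ | (3) |
$\mu(\mathbf{x} \mid \mathbf{X}, \mathbf{y}) = \mathbf{k}(\mathbf{x}, \mathbf{X}) \left(\mathbf{K}(\mathbf{X}, \mathbf{X}) + \sigma_n^2 \mathbf{I}\right)^{-1} \mathbf{y},$ | (4) |
$\sigma^2(\mathbf{x} \mid \mathbf{X}, \mathbf{y}) = k(\mathbf{x}, \mathbf{x}) - \mathbf{k}(\mathbf{x}, \mathbf{X}) \left(\mathbf{K}(\mathbf{X}, \mathbf{X}) + \sigma_n^2 \mathbf{I}\right)^{-1} \mathbf{k}(\mathbf{x}, \mathbf{X})^\top.$ | (5) |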
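The posterior mean and variance in eqn (4) and (5) can be computed with a Cholesky factorization instead of an explicit matrix inverse. The sketch below assumes a Matérn 5/2 kernel with fixed, hand-picked hyperparameters (`lengthscale`, `signal`, and `noise` are illustrative values, not fitted ones) and uses NumPy; the function names are hypothetical.

```python
import numpy as np

def matern52(A, B, lengthscale=1.0, signal=1.0):
    """Matern 5/2 kernel matrix between the rows of A and the rows of B."""
    dist = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))
    s = np.sqrt(5.0) * dist / lengthscale
    return signal**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gp_posterior(x_query, X, y, noise=1e-3):
    """Posterior mean and variance at x_query, following eqn (4) and (5)."""
    K = matern52(X, X) + noise**2 * np.eye(len(X))   # K(X, X) + sigma_n^2 I
    k_star = matern52(x_query, X)                    # k(x, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = k_star @ alpha                              # eqn (4)
    v = np.linalg.solve(L, k_star.T)
    prior_var = np.diag(matern52(x_query, x_query))  # k(x, x)
    var = prior_var - (v**2).sum(axis=0)             # eqn (5)
    return mu, np.maximum(var, 0.0)                  # clip tiny negative round-off

X = np.array([[0.0], [0.5], [1.0]])
y = np.sin(3.0 * X[:, 0])
mu_train, var_train = gp_posterior(X, X, y)
mu_far, var_far = gp_posterior(np.array([[10.0]]), X, y)
print(mu_train, var_train, var_far)
```

With a small noise level, the posterior mean nearly interpolates the observed responses and the variance collapses at the training inputs, while far from the data the variance returns to the prior value.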
By using four surrogate functions, one for each objective at each fidelity level, we are able to define the corresponding acquisition functions: alowtr, alowse, ahightr, and ahighse. This assumes the use of the expected improvement acquisition function:63
![]() | (6) |
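For reference, the widely used closed form of expected improvement under a Gaussian posterior can be written as below. This sketch assumes a maximization convention, where `best` is the largest response observed so far; the sign convention of eqn (6) may differ, and the function name is hypothetical.

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement for maximization.

    mu and sigma are the posterior mean and standard deviation at a query
    point, and best is the largest response observed so far (an assumption
    of this sketch).
    """
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

print(expected_improvement(0.0, 1.0, 0.0))
```

The first term rewards points whose predicted mean already exceeds the incumbent, and the second rewards points with large predictive uncertainty, which is how the acquisition function trades off exploitation against exploration.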
To handle multiple objective functions, we use a random scalarization technique:78
alow(x∣X,y) = alowtr(x∣X,y) + 10λlowalowse(x∣X,y), | (7) |
ahigh(x∣X,y) = ahightr(x∣X,y) + 10λhighahighse(x∣X,y), | (8) |
![]() | (9) |
![]() | (10) |
![]() | (11) |
We execute a low-fidelity multi-objective function [flowtr, flowse] and a high-fidelity multi-objective function [fhightr, fhighse] using uniform mesh sizes of 40 nm and 2 nm, respectively. The meshes are overridden at the silver and titanium oxide interfaces in order to capture the effect of small thickness. These mesh sizes are selected to ensure an appropriate simulation time. As expected, the evaluations of [fhightr, fhighse] are slower but more accurate than those of [flowtr, flowse]. Notably, the evaluations of flowtr can be larger than 1, which is physically impossible. Due to the lower accuracy of low-fidelity evaluations, we do not report the results of low-fidelity evaluations in this section. Instead, we evaluate the final Pareto frontier acquired by low-fidelity Bayesian optimization using [fhightr, fhighse]. Moreover, to compare Bayesian optimization algorithms, we normalize the evaluations of fhighse with min–max scaling. This way, the AUPF is confined to the range [0, 1]. The AUPF is computed as follows:
![]() | (12) |
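One plausible way to compute such a normalized area, offered here only as an assumption since the exact form of eqn (12) is not reproduced in this text, is the staircase area dominated by the frontier with respect to the origin after min–max scaling of SE. The function names and the SE bounds in the example are hypothetical.

```python
def min_max(values, lo, hi):
    """Scale values to [0, 1] given fixed lower and upper bounds."""
    return [(v - lo) / (hi - lo) for v in values]

def area_under_front(front):
    """Staircase area dominated by (transparency, normalized SE) points.

    Both coordinates are assumed to lie in [0, 1] and to be maximized.
    Dominated points do not change the result.
    """
    area, best_se = 0.0, 0.0
    for t, s in sorted(front, key=lambda p: p[0], reverse=True):
        if s > best_se:
            area += t * (s - best_se)
            best_se = s
    return area

transparency = [0.9, 0.6]
se_raw = [20.0, 60.0]                       # hypothetical SE values
se_norm = min_max(se_raw, lo=0.0, hi=80.0)  # hypothetical scaling bounds
area = area_under_front(list(zip(transparency, se_norm)))
print(area)
```

Because transparency is at most 1 and the scaled SE is at most 1, an area defined this way cannot exceed 1, which is consistent with the stated range of the AUPF.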
In our Multi-BOWS approach, we employ Gaussian process regression with the Matérn 5/2 kernel as a surrogate function.60 We choose the expected improvement policy as an acquisition function,63 and this function is optimized using multi-started L-BFGS-B, following prior work.79 For the time budget, we allocate 20% to Tlow and the remaining 80% to Thigh. The low-fidelity only and high-fidelity only multi-objective Bayesian optimization methods each initialize with 10 points, while the multi-fidelity multi-objective Bayesian optimization starts with a total of 10 points, of which 8 are evaluated by the low-fidelity function and the other 2 by the high-fidelity function. Moreover, for the low-fidelity Bayesian optimization of Multi-BOWS, we start with 8 initial points. If the size of the Pareto frontier of low-fidelity evaluations exceeds 10, we randomly select 10 points from the Pareto frontier of the low-fidelity evaluations. For the existing methods, we employ the official implementation of the recent work.25‡
We investigate the following four structures: the three-layer structure, matched-period double-sided nanocone structure, unmatched-period double-sided nanocone structure, and meta-structure. The AUPF is calculated for each structure with four algorithms: low-fidelity only, high-fidelity only, and multi-fidelity multi-objective Bayesian optimization, as well as Multi-BOWS. Using the quantitative results in Table 2, we compare the four algorithms by computing X/Y, where X and Y are the AUPF results of two algorithms. We obtain the uncertainty of each ratio by assuming the uncorrelated non-central normal ratio for a ratio distribution.
Structure | AUPF, low-fidelity only (single-fidelity) | AUPF, high-fidelity only (single-fidelity) | AUPF, multi-fidelity (multi-fidelity) | AUPF, Multi-BOWS (multi-fidelity) |
---|---|---|---|---|
Three-layer | 0.4529 ± 0.0529 | 0.7939 ± 0.0058 | 0.7795 ± 0.0090 | 0.8600 ± 0.0014 |
Matched-period | 0.7231 ± 0.0204 | 0.8344 ± 0.0049 | 0.8392 ± 0.0049 | 0.8579 ± 0.0022 |
Unmatched-period | 0.7088 ± 0.0273 | 0.7727 ± 0.0072 | 0.7941 ± 0.0100 | 0.8141 ± 0.0039 |
Meta-structure | 0.7157 ± 0.0104 | 0.8286 ± 0.0054 | 0.8509 ± 0.0035 | 0.8551 ± 0.0021 |
We find that the Multi-BOWS approach discovers superior structures more rapidly than the other methods and identifies structures that exhibit higher SE and visible transmittance, as presented in Fig. 4 and Table 2. Our method delivers an AUPF that is 89.9 ± 22.2% and 8.3 ± 0.8% larger in the three-layer structure, 18.6 ± 3.4% and 2.8 ± 0.7% larger in the matched-period double-sided nanocone structure, 14.9 ± 4.5% and 5.4 ± 1.1% larger in the unmatched-period double-sided nanocone structure, and 19.5 ± 1.8% and 3.2 ± 0.7% larger in the meta-structure compared to the low-fidelity only and high-fidelity only methods, respectively. Interestingly, the earlier multi-fidelity multi-objective Bayesian optimization technique tends to outperform the low-fidelity only and high-fidelity only methods, with one exception: for the three-layer structure, the high-fidelity only method outperforms the multi-fidelity method. Furthermore, our Multi-BOWS shows 10.3 ± 1.3%, 2.2 ± 0.7%, 2.5 ± 1.4%, and 0.5 ± 0.5% larger AUPF than the existing multi-fidelity algorithm for the three-layer, matched-period, unmatched-period, and meta-structures, respectively.
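The relative gains quoted above can be reproduced from the means and uncertainties in Table 2. The sketch below assumes a first-order (delta-method) approximation to the spread of a ratio of two uncorrelated normal variables; this is one common reading of the ratio-distribution assumption, not a confirmed description of the authors' calculation, and the function name is hypothetical.

```python
import math

def relative_gain(x_mean, x_std, y_mean, y_std):
    """Percentage gain of X over Y and its first-order uncertainty.

    Treats X and Y as uncorrelated normal variables and propagates the
    relative errors in quadrature (an assumption of this sketch).
    """
    ratio = x_mean / y_mean
    ratio_std = ratio * math.sqrt((x_std / x_mean) ** 2 + (y_std / y_mean) ** 2)
    return 100.0 * (ratio - 1.0), 100.0 * ratio_std

# Three-layer structure: Multi-BOWS versus low-fidelity only (values from Table 2).
gain, err = relative_gain(0.8600, 0.0014, 0.4529, 0.0529)
print(f"{gain:.1f} +/- {err:.1f} %")  # prints "89.9 +/- 22.2 %"
```

The large uncertainty in this particular comparison comes almost entirely from the low-fidelity only result, whose relative spread is far larger than that of Multi-BOWS.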
It is important to note that the number of initial points is identical across all experiments as previously mentioned. Moreover, the number of evaluations varies significantly across nanophotonic structures because the simulation time is dependent on the size of the simulation cell. For example, the low-fidelity only Bayesian optimization evaluates 462.5000 ± 6.8007 structures for the three-layer structure, 1218.2000 ± 16.1728 structures for the matched-period double-sided nanocone structure, 2648.6000 ± 98.2692 structures for the unmatched-period double-sided nanocone structure, and 2948.0000 ± 16.7571 structures for the meta-structure, and the high-fidelity only Bayesian optimization method evaluates 397.6000 ± 1.7436 structures for the three-layer structure, 230.3000 ± 42.5888 structures for the matched-period double-sided nanocone structure, 43.1000 ± 12.3810 structures for the unmatched-period double-sided nanocone structure, and 205.7000 ± 59.3701 structures for the meta-structure.
Two main messages follow from these results. Firstly, the use of multi-fidelity evaluations improves Bayesian optimization's ability to find high-quality structures. Secondly, while the performance gain in higher-dimensional problems is smaller than in lower-dimensional problems, our Multi-BOWS approach remains effective for all the structures compared to the other methods.
We observe interesting characteristics in the structures identified by our method, as shown in Fig. 5. The structures with nanocones – both matched-period and unmatched-period double-sided nanocone structures – exhibit greater visible transparency than the three-layer structures in the region of high transparency. The lines plotted in the bottom right box of Fig. 5 show better performance in the region of high transparency. In particular, the results for the unmatched-period double-sided nanocone structure are comparable to or better than the results for the matched-period double-sided nanocone structure, even though the number of evaluations for the unmatched-period structure is smaller than that for the matched-period structure. Additionally, Fig. 5 shows that the meta-structure favors high-transmission structures in the region of high SE, thus achieving similar performance to the three-layer structure. This implies that the meta-structure allows the optimization algorithm to actively seek diverse structures without thorough domain knowledge in optical device design. By leveraging this feature, we can systematically address the problem of optical device design by defining a more generic search space and employing a Bayesian optimization strategy, such as our Multi-BOWS framework.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3dd00177f
‡ It is available at https://github.com/belakaria/mf-osemo.
This journal is © The Royal Society of Chemistry 2024