Machine learning for knowledge acquisition and accelerated inverse-design for non-Hermitian systems

Ahmed, Waqas W.; Farhat, Mohamed; Staliunas, Kestutis; Zhang, Xiangliang; Wu, Ying

doi:10.1038/s42005-022-01121-9

Article
Open access
Published: 03 January 2023

Communications Physics volume 6, Article number: 2 (2023)

Abstract

Non-Hermitian systems offer new platforms for unusual physical properties that can be flexibly manipulated by redistribution of the real and imaginary parts of refractive indices, whose presence breaks conventional wave propagation symmetries, leading to asymmetric reflection and symmetric transmission with respect to the wave propagation direction. Here, we use supervised and unsupervised learning techniques for knowledge acquisition in non-Hermitian systems which accelerate the inverse design process. In particular, we construct a deep learning model that relates the transmission and asymmetric reflection in non-conservative settings and propose sub-manifold learning to recognize non-Hermitian features from transmission spectra. The developed deep learning framework determines the feasibility of a desired spectral response for a given structure and uncovers the role of effective gain-loss parameters to tailor the spectral response. These findings offer a route for intelligent inverse design and contribute to the understanding of physical mechanism in general non-Hermitian systems.

Introduction

Non-Hermitian systems, first studied in quantum mechanics^1,2, have attracted growing interest in wave physics, especially in optics, as a promising platform for exploring physical phenomena^3,4,5,6,7 that are impossible in Hermitian systems. Hermitian systems exhibit symmetric transmission and reflection with respect to the direction of the incident wave. This symmetry stems from reciprocity and energy conservation. In contrast, gain and loss in non-Hermitian systems break spatial symmetry and reveal unusual properties such as unidirectional invisibility^8,9, lasing and coherent perfect absorption¹⁰, asymmetric chirality¹¹, and many others^{12,13,14,15,16}. Non-Hermitian systems extend photonic design to the complex plane of the dielectric permittivity, exploiting the delicate interplay between refractive index and gain-loss modulation¹⁷. Several platforms are readily available for designing non-Hermitian photonic systems, including waveguides^5,6, microcavities^18,19, metamaterials²⁰, fiber optics²¹, plasmonic meta-atoms²², and microwave systems²³. Recent developments in non-Hermitian physics open up an array of applications and technologies, including optical sensors²⁴, laser-absorbers¹⁰, micro-lasers^25,26, metalenses²⁷, and telemetry²⁸, to name a few. Even in simple cases, where the scatterers are elastic (conservative) and only one-dimensional (1D), scattering problems may not be solvable analytically. Some general relations can nevertheless be derived. For instance, the reflection and transmission coefficients of conservative systems obey $|t|^{2}+|r|^{2}=1$. For non-conservative systems with balanced gain and loss, another general relation exists, $r_{L}r_{R}^{*}=1-|t|^{2}$, relating the left and right reflected waves. In the most general case, however, no such relation is known.
The lowest-order Born approximation, valid for weak scatterers, predicts the left and right reflection and the transmission spectra as $r_{L,R}^{0}(k)=\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,\mathrm{e}^{\pm 2\mathrm{i}kx}\,\mathrm{d}x$ and $t^{0}(k)=1+\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,\mathrm{d}x$, where the medium is air for x < a and x > b. These integrals depend on the general scattering function, ε(x), and give the reflection, $r_{L,R}^{0}$, for left- and right-incident plane waves $\mathrm{e}^{\pm \mathrm{i}kx}$. For real-valued ε(x), the above expressions give $r_{L}^{0}=-(r_{R}^{0})^{*}$, so the two reflections have equal magnitude. Yet, when higher-order Born approximations are calculated, the reflection and transmission spectra become interdependent due to the recursive relation of the reradiated electric fields, i.e., $E_{n}(\zeta)=\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,E_{n-1}(x)\,\mathrm{e}^{-\mathrm{i}k|x-\zeta|}\,\mathrm{d}x$, where $\zeta$ lies in the range $a\le \zeta \le b$. This implies that the transmission and the reflection spectra might be related even for arbitrary scattering functions ε(x). The analysis of the Born approximation chain reveals that this hypothetical dependence is absent for very weak scattering and vanishes proportionally to $|\varepsilon(x)|^{2}$. Nevertheless, the analysis, even in this 1D case, cannot produce analytically tractable results or a general relation between the r_L, r_R, and t spectra. Whether r_L, r_R, and t are mutually related therefore remains an open question. In addition, the amount of effective gain-loss plays a crucial role in designing the specific functionalities of non-Hermitian structures, which require careful tuning of the design parameters.
The existing modeling methods in conventional photonics rely on exhaustive tuning of material (and geometrical) parameters via brute-force search and optimization for on-demand wave control^29,30, which is computationally expensive. Therefore, intelligent models that capture the underlying physics of wave-matter interaction through latent data structures are desired, both to reveal the relations between different physical quantities and to automate the design of technological devices.

In recent years, machine-learning (ML) techniques have been successfully applied to forward and inverse design in different physical settings^{31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46}. ML-based models are often regarded as black boxes because they do not reveal the physical behavior of the designed structures. In the literature, less effort has been devoted to gaining scientific insight into photonic structures using ML^47,48,49,50. Such scientific knowledge is essential to exclude suboptimal designs before fabrication or to discover new physics. Although tremendous progress has been made in ML-enabled methods for solving different physical problems, most of the reported studies address Hermitian (or conservative) systems, leaving ML models for non-Hermitian problems yet to be developed. Here, we focus on non-Hermitian systems and exploit ML for knowledge extraction that streamlines the inverse design process. By analyzing large amounts of spectral data, ML provides a hint toward answering the general question about the relation between asymmetric reflections and transmission in non-conservative systems. We develop powerful design tools that intelligently learn the bidirectional mapping between non-Hermitian material parameters and scattering spectra, considerably reducing the computational requirements. The forward process from design material parameters to response space is well defined, but the inverse problem (from response space to material parameters) is ambiguous due to the degenerate solution space⁵¹. In fact, an identical response can be yielded by various parameter sets. Therefore, solving the inverse design problem for desired spectral properties is very challenging in practice due to the existence of non-unique solutions. To this end, we first identify the sub-manifolds of non-Hermitian features (gain, loss, and balanced gain-loss cases), which play an important role in inverse design.
We use unsupervised learning based on Principal Component Analysis⁵² (PCA) that reduces the dimensionality of the transmission spectra over a set of training data and discovers the clustering among the data representing gain, lossy, and balanced non-Hermitian cases, as depicted in Fig. 1a. The largest variance component helps to understand the underlying features in the data and provides insights about the role of effective media in designing non-Hermitian structures. Further, sub-manifold learning is utilized to accelerate the inverse design process. The inverse design process is divided into training and inference phases as shown in Fig. 1b. The training phase involves the modeling of the forward neural network, $\hat{f}(D)$ as a function of the design space, D, where $\hat{f}$ is a universal approximator function for forward simulator. Inference phase retrieves the design parameters, D, for on-demand spectra, which involves the computation of feasible starting design parameters based on manifold learning and the gradient of the pretrained forward model with respect to the design space $\partial \hat{f}/\partial D$ while fixing all the weights and biases of the network. K-dimensional (KD) tree algorithm⁵³ is applied to find the nearest neighboring point for the latent transmission spectra within a feasible region that works as a promising starting point, D, to achieve the optimal design using adaptive gradient approach. In addition, we reconstruct the transmission spectra from the asymmetric reflection that uncovers one-to-one and one-to-many mapping among reflection and transmission in forward and reverse directions.

**Fig. 1: Conceptual illustration of the design process for machine learning assisted non-Hermitian photonics structures.**

In this paper, we propose a machine learning approach for knowledge discovery and solve the inverse problem in non-Hermitian photonics. As a proof of concept and without loss of generality, we apply our design strategy to study multilayered non-Hermitian structures where the complex optical materials exhibit highly asymmetric optical response due to simultaneous index and gain-loss modulation. The developed deep learning frameworks uncover the relationship between the transmission and the asymmetric reflection, recognize the non-Hermitian features in spectral data, and solve the inverse problem in general non-Hermitian systems.

Results and discussion

To demonstrate the effectiveness of our design approach, we study a non-Hermitian structure with periodically distributed five-layer supercells, where each layer of the unit cell may be parametrized by its thickness and material, as shown in Fig. 1. We model such a structure with the transfer matrix method (TMM), where the material properties of the ith layer are represented by the complex refractive index $n_{i}=n_{Ri}+\mathrm{i}n_{Ii}$, with n_R and n_I the real and the imaginary parts of the refractive index [see “Methods” section for details]. The background medium is air. The complex refractive index distribution of the supercell dictates its scattering properties. Data-driven approaches generally require a large amount of data (from thousands to millions of sets) to discover the hidden features and the intrinsic relations between the input and the output. In the data generation process, we assume the same thickness for each layer and consider the material parameters as design variables $\mathcal{D}=[n_{R1},n_{R2},n_{R3},n_{R4},n_{R5},n_{I1},n_{I2},n_{I3},n_{I4},n_{I5}]$ that determine the scattering response, i.e., the transmission and reflection properties for left and right incident waves. Because of reciprocity, such a structure provides symmetric transmission T = T_R = T_L and, in general (e.g., if mirror symmetry is broken), asymmetric reflection, denoted by R_L and R_R for left and right incident waves, respectively. For illustration, the spectral response of interest is set in the normalized frequency range $0.2\le \omega a/2\pi c\le 0.8$, where each data set contains ten design parameters and 100 discrete points for each transmission and reflection spectrum, $S=[s_{1},s_{2},s_{3},s_{4},\ldots,s_{100}]$. For data generation, the real and imaginary parts of the refractive index are restricted to the ranges [1, 1.4] and [−0.2, 0.2], respectively.
In our study, we randomly generate 50,000 data samples of forward simulations with TMM. Among these, 80% of the samples are used as a training set, 10% for validation, and the remaining 10% for final testing. The training set is used for knowledge acquisition and training the feed forward networks; the validation set serves as a check to avoid overfitting, and the testing set evaluates the performance of the network. Here, we assume only material properties as the design space for the network training since refractive index and gain-loss modulations are crucial to non-Hermitian physics. Yet, the geometrical parameters (thickness) can also be easily incorporated, since the proposed feed forward neural networks can handle discrete data structures, either in terms of material or geometrical parameters.
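The sampling and splitting described above can be sketched as follows. This is only an illustration: the text states that samples are generated randomly within the given ranges, so the uniform distribution, the seed, and the names `sample_design` and `split_dataset` are assumptions, and the TMM evaluation of each design is omitted here.

```python
import random

N_LAYERS = 5
N_R_RANGE = (1.0, 1.4)    # real part of the refractive index
N_I_RANGE = (-0.2, 0.2)   # imaginary part (gain or loss)

def sample_design(rng):
    """Draw one design vector [n_R1..n_R5, n_I1..n_I5] (uniform sampling is assumed)."""
    real = [rng.uniform(*N_R_RANGE) for _ in range(N_LAYERS)]
    imag = [rng.uniform(*N_I_RANGE) for _ in range(N_LAYERS)]
    return real + imag

def split_dataset(samples, rng, fractions=(0.8, 0.1, 0.1)):
    """Shuffle a copy of the samples and split it into train/validation/test parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

rng = random.Random(0)  # fixed seed only for reproducibility of this sketch
designs = [sample_design(rng) for _ in range(50_000)]
train, val, test = split_dataset(designs, rng)
```

Each design vector would then be passed to the TMM solver to obtain the corresponding transmission and reflection spectra.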

Interplay of reflection and transmission in non-Hermitian structures

Conservation laws, as fundamental physical principles, have conventionally been derived in a model-driven way and more recently rediscovered with data-driven approaches^54,55,56. Typically, the elements of the transfer matrix yield a conservation relation that connects transmission and reflection for a multilayer configuration. In Hermitian systems, the generalized conservation relation is simply expressed as T + R = 1, where left and right reflections are necessarily identical due to mirror symmetry, i.e., R_L = R_R = R. Following this relation, parity-time (PT)-symmetric systems with balanced gain and loss⁴³ obey the relation $\sqrt{R_{L}R_{R}}=\left|1-T\right|$. In non-Hermitian systems, energy conservation is not valid due to the existence of gain and loss and, therefore, the intrinsic correlation between transmission and reflection in a general non-conservative system is yet to be established. Here, we exploit deep learning to unveil the relation between the transmission and reflection responses of non-Hermitian structures. We uncover a one-to-one mapping from reflection (R_L, R_R) to transmission T (i.e., a unique transmission response exists for any given reflection of an arbitrary structure) and a one-to-many mapping in the reverse direction. The one-to-one relation is valuable for reconstructing the transmission from given reflections and is analytically accessible through the training process, which adjusts the weights of a neural network (NN) forming a series of nested rectified linear unit functions. The trained NN provides a universal function approximator $f(\mathcal{R},T,\Theta)$, where the function f maps the reflection $\mathcal{R}$ to the transmission T with the trained weight parameters Θ.

We implement the function f by a fully connected NN to learn the mapping between the left reflection, R_L (or right reflection, R_R), and transmission, where the reflection is fed as input and transmission as output to the network [see Fig. 2a]. The training process optimizes the weights Θ by minimizing the mean absolute loss $\mathcal{L}=\frac{1}{N}\sum_{i}\left|T_{i}-\hat{T}_{i}\right|$, where T_i and $\hat{T}_{i}$ are the ground truth and the predicted transmission responses, respectively. The architecture of the designed network consists of four fully connected layers with 500–500–400–300 neurons, as depicted in Fig. 2a (see details in Supplementary Note 1). The performance of the trained network is quantified with the relative spectral error defined as $e_{s}=\sum_{i}\left|T_{i}-\hat{T}_{i}\right|/T_{i}$, where T_i and $\hat{T}_{i}$ denote the discretized target and predicted spectral response, respectively. In Fig. 2b, we show the distribution of the spectral error over the entire testing set in histogram form, which indicates a high prediction accuracy (over 97%) of the network with an average error of 2.89% (dashed red line). The prediction performance of the trained model is demonstrated by three representative examples in Fig. 2c, where the red and the black dashed curves, corresponding to the TMM simulation and the designed network prediction, respectively, show excellent agreement.
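The two metrics defined above can be written as a short sketch. The function names are hypothetical, and the `average` flag is an assumption: the formula for the relative spectral error is written as a plain sum, while the reported percentage errors suggest a normalization over the spectral points, so both variants are offered.

```python
def mean_absolute_loss(target, predicted):
    """Mean absolute loss: (1/N) * sum_i |T_i - T_hat_i|."""
    return sum(abs(t - p) for t, p in zip(target, predicted)) / len(target)

def relative_spectral_error(target, predicted, average=True):
    """Relative spectral error: sum_i |T_i - T_hat_i| / T_i.

    With average=True the sum is divided by the number of spectral points
    (an assumed normalization); with average=False the plain sum is returned.
    Target values must be nonzero.
    """
    total = sum(abs(t - p) / t for t, p in zip(target, predicted))
    return total / len(target) if average else total

# Toy spectra with three frequency points (illustrative values only).
T_true = [0.5, 1.0, 2.0]
T_pred = [0.45, 1.1, 2.0]
mae = mean_absolute_loss(T_true, T_pred)
err = relative_spectral_error(T_true, T_pred)
```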

**Fig. 2: Designed neural network for left reflection to transmission mapping.**

Note that neural networks can also be designed to determine the transmission spectra from the given right reflection (see Supplementary Note 2), or from both reflections (see Supplementary Note 3). Our findings suggest that only one component of the reflection, either left or right, is sufficient to build the transmission profile of the considered non-Hermitian system.

Sub-manifold learning for features recognition in non-Hermitian structures

The spectral response of a photonic structure depends to a great extent on the operational frequency. A multi-dimensional frequency response presents a large number of features, which grows very fast with the number of discrete frequencies. Due to extra degrees of freedom introduced by gain-loss modulation in non-Hermitian systems, the spectral response of photonic structures becomes even more complex. Consequently, it becomes extremely challenging to recognize the non-Hermitian structure from the spectral response and infer some knowledge for deep physical insight. Therefore, effective procedures to extract the most relevant information contained in the system response are highly desired. Different approaches have proven to be useful for extraction of the valuable information in data, such as PCA⁵², multidimensional binary search tree⁵³, neural autoencoder⁵⁷, to name a few^58,59. Such approaches are based on dimensionality reduction to visualize patterns, similarities, and differences in data with minimum loss of information. In this study, we exploit the unsupervised dimensionality reduction algorithm using PCA⁵⁹ that transforms the discretized transmission spectra from a high-dimensional space to a low-dimensional latent space and models the sub-manifolds for different non-Hermitian structures. These manifolds, in the form of convex hull, are used to investigate the feasibility of having a desired optical response from a certain class of non-Hermitian structure. When mapped into the respective two-dimensional (2D) feature space, these responses form a distribution of points, which characterize the correlation among the multi-frequency responses. The latent representation exploits the variance of features as well as the covariance between features to find major trends in spectral data. PCA organizes the large dimensions of each spectrum in terms of a respective feature vector, existing in higher-dimensional feature space. 
The procedure includes the rotation of the coordinate axes of the feature space such that the first axis carries the maximum possible data dispersion (as quantified by the statistical variance), the second axis the second maximum dispersion, and so on⁵². This principle is illustrated for a transmission spectrum with the transmission amplitude at three different frequencies in Fig. 1. In practice, we deal with n frequencies and represent one transmission spectrum as an n-dimensional vector. Since the n frequency features are not independent, a large set of transmission spectra, when all represented as n-dimensional vectors, lies on a low-dimensional manifold embedded in the n-dimensional space. Therefore, PCA can be employed to reduce the n-dimensional space to a lower-dimensional space (e.g., a 2D space spanned by the first two principal components) and to visualize interesting patterns of gain, lossy, and balanced media [see “Methods” section for details]. In our study, PCA is applied to the training data of transmission spectra, and the first two components encoding the latent representation in 2D are shown in Fig. 3a. However, the number of principal components required for dimensionality reduction depends on the nature of the spectral data being examined. The dominant spectral features of a spectrum with many resonance peaks spread randomly over a broad frequency range may require more principal components for the latent space representation. A convex hull is plotted to show the boundary of the feasible responses of the given non-Hermitian structure [see Fig. 3b]. Note that the convex hull has been formed with the Quickhull algorithm to bound the transmission patterns in the 2D latent space⁶⁰. We identify three distinct manifolds corresponding to the effective gain, lossy, and balanced gain-loss classes for the latent spectra. An overlapping area appears when the relative amounts of gain and loss in the system are nearly equal.
Around the center of the reduced space, the gain and loss become exactly balanced, which maximizes the unidirectionality. The difference in the area of color-coded sub-manifold implies the capability of the multilayer configuration to construct a wide range of feasible spectral responses with different non-Hermitian media. Figure 3c–e presents representative transmission spectra belonging to three distinct manifolds, respectively. Sub-manifold learning can be used to forecast the feasibility of a response using a specific structure class. There are different physical scenarios associated with the identified feasible regions. For example, the gain region is ideal for lasing and designing amplifiers with stacked index-gain modulations. The balanced region can be applied for simultaneous lasing and coherent perfect absorption, sensing, and unidirectional invisibility, while coherent perfect absorbers can be designed using the lossy region.

**Fig. 3: Principal component analysis for the transmission spectra generated from a five-layer periodic non-Hermitian structure.**
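The feasibility check in the 2D latent space can be illustrated with a small sketch. It builds a convex hull around latent points of one class and tests whether a target point falls inside it. Note that the text uses the Quickhull algorithm; the sketch below substitutes Andrew's monotone chain, which returns the same hull for 2D points, and the function names and toy coordinates are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def is_feasible(hull, point, tol=1e-12):
    """True if the point lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], point) >= -tol for i in range(n))

# Toy latent coordinates (first two principal components) of one structure class.
latent_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(latent_points)
inside = is_feasible(hull, (1.0, 1.5))   # target spectrum within the class manifold
outside = is_feasible(hull, (3.0, 1.0))  # target spectrum outside the class manifold
```

A target whose latent point falls outside every class hull would be flagged as infeasible for the given structure before any inverse design is attempted.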

Forward neural network

We develop a forward neural network to determine the spectral response for a given structure. It takes the design parameters $\mathcal{D}$ as input and the spectral response $S$ as output to build the mapping $S=f(\mathcal{D})$, where f is the complex nonlinear function constructed by the forward neural network. The one-to-one mapping from design parameters to spectral response is a regression problem that can be modeled by an NN, as depicted in Fig. 4a. The network is trained with the mean absolute spectral loss defined as $\mathcal{L}=\frac{1}{N}\sum_{i}\left|S_{i}-\hat{S}_{i}\right|$, where S_i and $\hat{S}_{i}$ are the ground truth and the predicted spectral response, respectively. The designed architecture of the forward NN has 500–500–400–400–300–300 neurons in six layers, for transmission and asymmetric reflections [see Fig. 4a]. To assess the performance of the designed network, we define the relative spectral error on the testing data sets as $e=\sum_{i}\left|S_{i}-\hat{S}_{i}\right|/S_{i}$, where S_i and $\hat{S}_{i}$ correspond to the target and predicted spectral responses. The average spectral errors (below 4%) for transmission and left/right reflection of the designed forward networks are presented in Fig. 4b. The representative predicted responses agree well with the TMM simulations, as shown in Fig. 4c, d (see more details in Supplementary Note 4).

Inverse network

Next, we move toward the inverse design problem after obtaining a well-trained forward NN model. Most inverse design problems are ambiguous due to the “one-to-many” mapping, i.e., different non-unique solutions exist for the same target response. Consequently, a single discriminative network cannot achieve the optimal solution in inverse design. To mitigate this problem, auxiliary training approaches and optimization strategies have been incorporated in the inverse design process^{31,32,37,38,43,61,62,63,64}. In our work, we propose sub-manifold learning combined with the neural adjoint method to solve the inverse problem. To design a structure that produces a desired spectral response, the first step is to find the corresponding target point in the latent space by reducing the dimensionality of the desired response using PCA. Next, the KD-tree algorithm is applied to find the nearest neighbor of this target point in the latent space within the feasible sub-manifolds. The identified nearest neighbor acts as the starting point for inverse design with the neural adjoint approach. The neural adjoint method determines the optimal solution by computing the gradient of the pretrained forward model $\hat{f}(D)$ with respect to the design parameters while keeping all weights and biases of the network fixed. The pretrained forward network $\hat{f}(D)$ provides a closed-form differentiable expression, and thus the calculation of $\partial \hat{f}/\partial D$ is straightforward. The gradient of a loss function $\mathcal{L}$ with respect to the input design parameters is then used to move iteratively along the loss surface toward the optimal solution. The inverse model can be denoted as $\hat{f}^{-1}(S,D_{0})$, where D₀ is the initial structure obtained from sub-manifold learning and S is the desired spectrum.
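The nearest-neighbor seeding step can be sketched with a minimal KD-tree over latent points, each carrying the design that produced it. This is an illustrative pure-Python version, not the implementation used in the study; the names `build_kdtree` and `nearest`, the random toy data, and the use of an integer as a stand-in for a stored design vector are all assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(items, depth=0):
    """items: list of (latent_coords, design) pairs; returns a nested-dict tree."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {"item": items[mid], "axis": axis,
            "left": build_kdtree(items[:mid], depth + 1),
            "right": build_kdtree(items[mid + 1:], depth + 1)}

def nearest(node, target, best=None):
    """Return the stored (latent_coords, design) pair closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["item"][0], target) < sq_dist(best[0], target):
        best = node["item"]
    diff = target[node["axis"]] - node["item"][0][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far branch only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best[0], target):
        best = nearest(far, target, best)
    return best

# Toy latent space: 200 random 2D points, each tagged with a hypothetical design id.
rng = random.Random(1)
latent_items = [((rng.random(), rng.random()), idx) for idx in range(200)]
tree = build_kdtree(latent_items)
target_point = (0.3, 0.7)                     # latent point of the desired spectrum
seed_latent, seed_design = nearest(tree, target_point)
```

The design attached to the returned neighbor would serve as the starting point D₀ for the gradient-based refinement.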

Let S be the desired spectrum and $\hat{D}_{i}$ the current best estimate of the design parameters, where the index i denotes the iteration of the adaptive gradient-based estimation procedure. To compute $\hat{D}_{i+1}$, we first calculate exponentially decaying moving averages of the gradient of the loss function with respect to the input design parameters, and of its square, in the following way:

$$\begin{aligned} m_{i} &= \beta_{1}m_{i-1}+\left(1-\beta_{1}\right)\frac{\partial \mathcal{L}\left(\hat{f}(\hat{D}_{i}),S\right)}{\partial \hat{D}_{i}},\\ v_{i} &= \beta_{2}v_{i-1}+\left(1-\beta_{2}\right)\left[\frac{\partial \mathcal{L}\left(\hat{f}(\hat{D}_{i}),S\right)}{\partial \hat{D}_{i}}\right]^{2},\end{aligned}$$

(1)

where $\mathcal{L}=\frac{1}{N}\sum_{j=1}^{N}\left(\hat{f}_{j}(\hat{D})-S_{j}\right)^{2}$ is the mean squared loss over the N spectral points, and β₁ and β₂ are the exponential decay rates of the first-moment and second-moment estimates, respectively. Next, we correct the bias in the first- and second-moment estimates as:

$$\hat{m}_{i}=\frac{m_{i}}{1-\beta_{1}^{i}},\qquad \hat{v}_{i}=\frac{v_{i}}{1-\beta_{2}^{i}}.$$

(2)

Finally, we can update the design parameters based on the calculated moving averages with a step size α:

$$\hat{D}_{i+1}=\hat{D}_{i}-\alpha \frac{\hat{m}_{i}}{\sqrt{\hat{v}_{i}}+\epsilon},$$

(3)

where α sets the learning rate of the adaptive scheme⁶⁵ and ϵ is a small constant that prevents division by zero. The major problem with existing neural adjoint methods is the selection of the initial seed. We mitigate it with sub-manifold learning, which eventually leads to an accurate solution in inverse design.
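Equations (1)–(3) can be traced in a short sketch that updates only the design vector while the forward model stays frozen. To keep the example self-contained, the pretrained network is replaced by a fixed linear map with an analytic gradient, which is purely a stand-in assumption; the hyperparameter values and all function names are likewise illustrative rather than taken from the study.

```python
import math

def forward(W, b, D):
    """Stand-in for the frozen forward model f_hat(D): a fixed linear map."""
    return [sum(w * d for w, d in zip(row, D)) + bj for row, bj in zip(W, b)]

def mse_and_grad(W, b, D, S):
    """Mean squared loss and its gradient with respect to the design vector D."""
    residual = [f - s for f, s in zip(forward(W, b, D), S)]
    n = len(S)
    loss = sum(r * r for r in residual) / n
    grad = [2.0 / n * sum(W[j][k] * residual[j] for j in range(n))
            for k in range(len(D))]
    return loss, grad

def adam_inverse_design(W, b, S, D0, alpha=0.01, beta1=0.9, beta2=0.999,
                        eps=1e-8, steps=2000):
    """Iterate Eqs. (1)-(3) on the inputs; W and b are never modified."""
    D = list(D0)
    m = [0.0] * len(D)
    v = [0.0] * len(D)
    for i in range(1, steps + 1):
        _, g = mse_and_grad(W, b, D, S)
        m = [beta1 * mk + (1 - beta1) * gk for mk, gk in zip(m, g)]        # Eq. (1)
        v = [beta2 * vk + (1 - beta2) * gk * gk for vk, gk in zip(v, g)]   # Eq. (1)
        m_hat = [mk / (1 - beta1 ** i) for mk in m]                        # Eq. (2)
        v_hat = [vk / (1 - beta2 ** i) for vk in v]                        # Eq. (2)
        D = [dk - alpha * mh / (math.sqrt(vh) + eps)                       # Eq. (3)
             for dk, mh, vh in zip(D, m_hat, v_hat)]
    return D

# Toy problem: two design parameters, three spectral points.
W = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
S_target = forward(W, b, [1.2, -0.1])   # spectrum of a known reference design
D_seed = [1.0, 0.0]                     # seed, e.g. from the nearest-neighbor search
D_opt = adam_inverse_design(W, b, S_target, D_seed)
initial_loss, _ = mse_and_grad(W, b, D_seed, S_target)
final_loss, _ = mse_and_grad(W, b, D_opt, S_target)
```

With a trained network in place of the linear map, the same loop applies once the gradient is obtained by backpropagation to the inputs.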

The results for the inverse design are summarized in Fig. 5. We consider three representative examples of desired responses whose transmission spectra reside within the sub-manifolds of lossy, gain, and balanced media, indicated by b–d in Fig. 5a. The nearest neighbor of the corresponding point in the latent space serves as the initial structure, which is evolved to achieve the optimal target spectrum. The predicted spectra coincide with the targets, which indicates the strong capability of our method to design spectral responses absent from the training set, as shown in Fig. 5b–d (see details in Supplementary Note 5). Figure 5c shows perfect transmission and asymmetric reflection that do not require any strict space and time symmetry conditions, as in the case of PT-symmetric structures. Therefore, the proposed approach can be used to design a more general class of planar reflectionless structures with non-Hermitian materials that can be realized with locally isotropic and non-magnetic materials. In addition, the method can be applied to design high-quality resonators based on non-Hermitian photonic structures by using more spectral points in the desired frequency range for the training data, without affecting the prediction time.

**Fig. 5: Inverse design approach based on sub-manifold learning and neural adjoint method.**

Conclusion

In conclusion, we propose unsupervised and supervised learning for modeling the optical response of non-Hermitian photonic structures that can derive valuable insights about the relationships of asymmetric reflections and transmission in non-conservative systems. The study reveals that either the left or the right reflection is sufficient to determine the corresponding transmission profile. As demonstration, we study a multilayered periodic structure and develop a machine learning framework for knowledge acquisition and retrieval of the design parameters for a given spectral response. In particular, we propose a machine learning model to determine the transmission from a given asymmetric reflection without design parameters that uncover the one-to-one mapping from the reflection to the transmission spectra. The dimensionality reduction based on PCA is applied to learn the sub-manifolds of lossy, gain, and balanced structures via the transmission response. The learned sub-manifolds in 2D latent space are useful to determine the feasibility of the response with a given class of structures and find the best initial seed for inverse-design with neural adaptive gradient method. In this proposal, a single feed-forward neural network is trained rather than the complex training procedures used in common machine learning-based inverse methods that tend to yield suboptimal results. The PCA and KD-tree algorithm provide accurate initialization for neural adjoint method to find an approximate globally optimal solution within a short period of time (one minute). The approach is used to inversely design the unidirectional properties with multilayered structures that do not require any strict symmetry as in PT-symmetric structures. Our methodology integrates dimensionality reduction to automate the design and, thus, provides a versatile platform to learn new physical insights in non-Hermitian photonics structures. 
The inverse design method is not restricted to optical problems and can be applied directly to find accurate solutions to ill-posed problems in other physical systems such as acoustics, elasticity, or plasmonics. It is worth mentioning that we use feed-forward neural networks to represent discrete data structures (material or geometry parameters) in modeling the response of multilayered structures. However, the idea can be extended to complex coupled systems beyond multilayered systems by replacing the feed-forward neural network with different network architectures. Graph neural networks can be used to model photonic systems with interacting objects (e.g., coupled ring resonators), while recurrent neural networks can be used to model dynamical photonic systems.

Methods

Transfer matrix method

TMM is one of the most widely used methods to model light propagation in multilayered structures. In this study, we consider a one-dimensional periodic structure formed by five alternating layers with coherent interfaces. We assume that all layers are isotropic and nonmagnetic, with different dielectric materials. The electric field is represented as a superposition of the left- and right-propagating waves with wavevector $k=\omega/c$ as $E^{-}(z)=E_{L}^{+}\mathrm{e}^{\mathrm{i}kz}+E_{L}^{-}\mathrm{e}^{-\mathrm{i}kz}$ and $E^{+}(z)=E_{R}^{-}\mathrm{e}^{\mathrm{i}kz}+E_{R}^{+}\mathrm{e}^{-\mathrm{i}kz}$ for the left $(z<-L/2)$ and the right $(z>L/2)$ side of the structure, respectively. The continuity conditions for the tangential electric field components at the outer interfaces of the layered structure determine the transfer matrix, M, which is the product of matching and propagation matrices. Mathematically, the transfer matrix relates the field amplitudes of the left- and right-propagating waves in the following manner:

$$\left(\begin{array}{c}{E}_{R}^{-}\\ {E}_{R}^{+}\end{array}\right)=M\left(\begin{array}{c}{E}_{L}^{+}\\ {E}_{L}^{-}\end{array}\right),M=\left(\begin{array}{cc}{M}_{11} & {M}_{12}\\ {M}_{21} & {M}_{22}\end{array}\right).$$

(4)

The transmission coefficients $t_{R,L}$ and reflection coefficients $r_{R,L}$ (along with the transmittance ${T}_{R,L}={\left|{t}_{R,L}\right|}^{2}$ and reflectance ${R}_{R,L}={\left|{r}_{R,L}\right|}^{2}$) for left (L) and right (R) incidence can be computed from the boundary conditions and are related to the elements of the transfer matrix as:

$$\begin{array}{cc}{t}_{R}=\frac{1}{{M}_{22}}, & {t}_{L}=\frac{{M}_{11}{M}_{22}-{M}_{12}{M}_{21}}{{M}_{22}},\\ {r}_{R}=\frac{{M}_{12}}{{M}_{22}}, & {r}_{L}=-\frac{{M}_{21}}{{M}_{22}}.\end{array}$$

(5)
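
As an illustration of Eqs. (4) and (5), the following minimal NumPy sketch builds M as a product of matching and propagation matrices and evaluates the four scattering coefficients. Several points are assumptions made for the example rather than details taken from this work: normal incidence, air on both sides, an $\exp(-{\rm{i}}\omega t)$ time convention (so a positive imaginary index means loss), a phase reference at the outer interfaces, and purely illustrative refractive indices and thicknesses. The function names are hypothetical.

```python
import numpy as np

def interface(n_a, n_b):
    """Matching matrix: (forward, backward) amplitudes in medium a -> medium b."""
    return np.array([[n_b + n_a, n_b - n_a],
                     [n_b - n_a, n_b + n_a]], dtype=complex) / (2.0 * n_b)

def propagation(n, d, k0):
    """Propagation matrix across a layer of index n and thickness d."""
    phase = k0 * n * d
    return np.array([[np.exp(1j * phase), 0.0],
                     [0.0, np.exp(-1j * phase)]], dtype=complex)

def transfer_matrix(indices, thicknesses, k0, n_out=1.0):
    """Total M relating left-side amplitudes to right-side amplitudes, as in Eq. (4)."""
    M = np.eye(2, dtype=complex)
    n_prev = n_out
    for n, d in zip(indices, thicknesses):
        M = propagation(n, d, k0) @ interface(n_prev, n) @ M
        n_prev = n
    return interface(n_prev, n_out) @ M

def scattering(M):
    """Transmission and reflection coefficients from Eq. (5)."""
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    return {"t_R": 1.0 / M[1, 1], "t_L": det / M[1, 1],
            "r_R": M[0, 1] / M[1, 1], "r_L": -M[1, 0] / M[1, 1]}

k0 = 2.0 * np.pi / 1.0        # vacuum wavenumber for unit wavelength (illustrative)
d = [0.2] * 5                 # five layers of equal thickness (illustrative)

# Lossless (Hermitian) stack versus a stack with loss on the left and gain on the right.
lossless = scattering(transfer_matrix([2.0, 1.5, 2.0, 1.5, 2.0], d, k0))
nonherm = scattering(transfer_matrix([2.0 + 0.2j, 1.5, 2.0, 1.5, 2.0 - 0.2j], d, k0))

for name, s in (("lossless", lossless), ("non-Hermitian", nonherm)):
    print(name, abs(s["t_L"])**2, abs(s["r_L"])**2, abs(s["r_R"])**2)
```

With identical media on both sides, det M = 1 for reciprocal layers, so Eq. (5) gives the same transmission from both directions while the reflections can differ once gain and loss are distributed asymmetrically. A spectrum is obtained by repeating the calculation over a range of `k0`.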

Principal component analysis

PCA is a powerful unsupervised method for dimensionality reduction that transforms the data to a lower-dimensional space to identify the intrinsic patterns and correlations in the data while retaining most of the variance of the original data. Consider m data points in an n-dimensional spectral space S = [s₁, s₂, …, s_n], where s_i represents the ith feature and n is the total number of features. The response data can be written as an m × n matrix, $R\in {{\mathbb{R}}}^{m\times n}$, from which k successive orthogonal components (the principal components) are computed along the directions of maximum variance. The eigenvectors of the data covariance matrix with the largest eigenvalues are used to analyze a large amount of high-dimensional transmission data. In this work, we implement PCA with the Python scikit-learn library, which uses singular value decomposition to calculate the principal components.
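
The SVD route to the principal components can be sketched directly in NumPy. This is a minimal illustration under stated assumptions, not the scikit-learn call used in this work: the helper name `pca_svd` is hypothetical, and the data matrix `X` is synthetic (two hidden factors mixed into 50 spectral features with a little noise) rather than real transmission spectra.

```python
import numpy as np

def pca_svd(X, k):
    """Project the rows of X (m samples x n features) onto the first k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                             # orthonormal principal directions
    ratio = s[:k] ** 2 / np.sum(s ** 2)             # fraction of variance explained
    Z = Xc @ components.T                           # latent coordinates
    return Z, components, ratio, mean

# Synthetic stand-in for the response matrix (assumption): m = 200 spectra, n = 50 features.
rng = np.random.default_rng(0)
m, n = 200, 50
hidden = rng.normal(size=(m, 2))
mixing = rng.normal(size=(2, n))
X = hidden @ mixing + 0.01 * rng.normal(size=(m, n))

Z, components, ratio, mean = pca_svd(X, k=2)
print(Z.shape, ratio)
```

Because the synthetic data are essentially two-dimensional, the first two components capture almost all of the variance; for real spectra the explained-variance ratio indicates how much information a 2D latent space retains.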

Data availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

Code availability

The codes that support the findings of this study are available from the corresponding authors upon reasonable request.

Acknowledgements

The work described here is supported by King Abdullah University of Science and Technology (KAUST) Artificial Intelligence Initiative Fund and KAUST Baseline Research Fund No. BAS/1/1626-01-01. K.S. acknowledges funding from European Social Fund (project No 09.3.3-LMT-K712-17- 0016) under grant agreement with the Research Council of Lithuania (LMTLT), and from the Spanish Ministerio de Ciencia e Innovación under grant No.385 (PID2019-109175GB-C21).

Author information

Authors and Affiliations

Division of Computer, Electrical and Mathematical Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Waqas W. Ahmed, Mohamed Farhat, Xiangliang Zhang & Ying Wu
Departamento de Fisica, Universitat Politècnica de Catalunya (UPC), Rambla Sant Nebridi 22, Terrassa, Barcelona, 08222, Spain
Kestutis Staliunas
Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, Barcelona, 08010, Spain
Kestutis Staliunas
Faculty of Physics, Laser Research Center, Vilnius University, Saulėtekio Ave. 10, Vilnius, 10223, Lithuania
Kestutis Staliunas
Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
Xiangliang Zhang
Division of Physical Science and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Ying Wu

Authors

Waqas W. Ahmed
Mohamed Farhat
Kestutis Staliunas
Xiangliang Zhang
Ying Wu

Contributions

W.W.A., M.F., and Y.W. conceived the idea. W.W.A. and M.F. conducted the analysis and simulations. K.S., X.Z., and Y.W. contributed to interpreting the results. All authors were involved in discussing the idea and methods and in writing and revising the manuscript. X.Z. and Y.W. supervised the overall project.

Corresponding authors

Correspondence to Xiangliang Zhang or Ying Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Physics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ahmed, W.W., Farhat, M., Staliunas, K. et al. Machine learning for knowledge acquisition and accelerated inverse-design for non-Hermitian systems. Commun Phys 6, 2 (2023). https://doi.org/10.1038/s42005-022-01121-9

Received: 25 May 2022
Accepted: 16 December 2022
Published: 03 January 2023
DOI: https://doi.org/10.1038/s42005-022-01121-9
