Introduction

Non-Hermitian systems, first studied in quantum mechanics1,2, have been attracting growing interest in wave physics, especially in optics, where they offer a promising platform to explore new physical phenomena3,4,5,6,7 that are impossible in Hermitian systems. Hermitian systems exhibit symmetric transmission and reflection with respect to the direction of the incident wave. This symmetry stems from reciprocity and energy conservation principles. Yet, gain-loss in non-Hermitian systems breaks space symmetry and reveals unusual properties such as unidirectional invisibility8,9, lasing and coherent perfect absorption10, asymmetric chirality11, and many others12,13,14,15,16. Non-Hermitian systems extend photonic design to the complex plane of dielectric permittivity, examining the delicate relationship between the refractive index and gain-loss modulation17. Several platforms are readily available for designing non-Hermitian photonic systems, including waveguides5,6, microcavities18,19, metamaterials20, fiber optics21, plasmonic meta-atoms22, and microwave systems23. New developments in non-Hermitian physics open up an array of future applications and technologies, including optical sensors24, laser-absorbers10, micro-lasers25,26, metalenses27, and telemetry28, to name a few. Even in simple cases, where the scatterers are elastic (conservative) and only one-dimensional (1D), scattering problems may not be solvable analytically. Some general relations may nevertheless be derived. For instance, the reflection and the transmission for conservative systems obey \({\left|t\right|}^{2}+{\left|r\right|}^{2}=1\). For non-conservative but balanced gain-loss systems, another general relation exists, i.e., \({r}_{L}{r}_{R}^{* }=1-{\left|t\right|}^{2}\), relating the left and the right reflections. However, in the most general case, such relations do not exist. 
The lowest-order Born approximation, valid for weak scatterers, predicts the left and right reflection and transmission spectra as \(r_{L,R}^{0}(k)=\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,\mathrm{e}^{\pm 2\mathrm{i}kx}\,\mathrm{d}x\) and \(t^{0}(k)=1+\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,\mathrm{d}x\), where the medium is air for x < a and x > b. These integrals depend on the scattering profile, ε(x), and give the reflection, \(r_{L,R}^{0}\), for left- and right-incident plane waves \(\mathrm{e}^{\pm \mathrm{i}kx}\). Obviously, for real-valued ε(x), the above relation gives \({r}_{L}={r}_{R}^{* }\). Yet, when higher-order Born approximations are calculated, the reflection and transmission spectra become interdependent due to the recursive relation of the reradiated electric fields, i.e., \(E_{n}(\zeta)=\frac{\mathrm{i}k}{2}\int_{a}^{b}\varepsilon(x)\,E_{n-1}(x)\,\mathrm{e}^{-\mathrm{i}k\left|x-\zeta\right|}\,\mathrm{d}x\), where \(\zeta\) lies in the range \(a\le \zeta \le b\). This implies that the transmission and the reflection spectra might be related even for arbitrary scattering functions ε(x). Analysis of the Born approximation chain reveals that this hypothetical dependence is absent for very weak scattering and vanishes proportionally to \({\left|\varepsilon (x)\right|}^{2}\). Nevertheless, the analysis, even in this 1D case, cannot produce analytically tractable results or a general relation between the rL, rR, and t spectra. Thus, whether rL, rR, and t are mutually related remains an open question. In addition, the amount of effective gain-loss plays a crucial role in designing the specific functionalities of non-Hermitian structures, which requires careful tuning of the design parameters. 
Existing modeling methods in conventional photonics rely on exhaustive tuning of material (and geometrical) parameters via brute-force search and optimization for on-demand wave control29,30, which is computationally expensive. Therefore, intelligent models that capture the underlying physics of wave-matter interaction through the latent data structures are desired, both to reveal the relation between different physical quantities and to automate the design of technological devices.
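
As an illustration of the lowest-order Born expressions quoted above, the following minimal sketch evaluates \(r_{L,R}^{0}\) and \(t^{0}\) numerically for a sampled profile ε(x). The function name `born_first_order`, the grid, and the two example profiles are assumptions introduced here for illustration only; they are not taken from the study. The sketch only checks two consequences of the integrals as written: a real profile gives equal reflection magnitudes from both sides, whereas a complex profile of the form \(\mathrm{e}^{2\mathrm{i}\pi x}\) suppresses one of them.

```python
import numpy as np

def born_first_order(eps, x, k):
    """Lowest-order Born estimates (r_L, r_R, t) for a profile eps sampled on a uniform grid x."""
    dx = x[1] - x[0]
    weights = np.full(x.shape, dx)      # trapezoidal quadrature weights
    weights[0] = weights[-1] = dx / 2
    r_left = 0.5j * k * np.sum(weights * eps * np.exp(+2j * k * x))
    r_right = 0.5j * k * np.sum(weights * eps * np.exp(-2j * k * x))
    t = 1.0 + 0.5j * k * np.sum(weights * eps)
    return r_left, r_right, t

# Illustrative (assumed) scatterer occupying -0.5 <= x <= 0.5, probed at k = pi.
x = np.linspace(-0.5, 0.5, 2001)
k = np.pi
eps_real = 0.05 * np.cos(2 * np.pi * x) + 0.02 * x   # real, not mirror symmetric
eps_complex = 0.05 * np.exp(2j * np.pi * x)          # index plus gain-loss modulation

rl, rr, _ = born_first_order(eps_real, x, k)
print(abs(rl), abs(rr))      # equal magnitudes for a real profile
rl, rr, _ = born_first_order(eps_complex, x, k)
print(abs(rl), abs(rr))      # strongly different magnitudes
```

This first-order estimate is only meaningful for weak scatterers; it does not capture the higher-order interdependence discussed in the text.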

In recent years, machine-learning (ML) techniques have been successfully applied to forward and inverse design in different physical settings31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46. ML-based models are often regarded as black boxes because they do not reveal the physical behavior of the designed structures. In the literature, less effort has been devoted to gaining scientific insight into photonic structures using ML47,48,49,50. Such scientific knowledge is essential to exclude suboptimal designs for fabrication or to discover new physics. Although tremendous progress has been made in ML-enabled methods for solving different physical problems, most of the reported studies address Hermitian (or conservative) systems, leaving ML models for non-Hermitian problems yet to be developed. Here, we focus on non-Hermitian systems and exploit ML for knowledge extraction that streamlines the inverse design process. By analyzing large amounts of spectral data, ML provides a hint toward answering the general question about the relation between asymmetric reflections and transmission in non-conservative systems. We develop powerful design tools that intelligently learn the bidirectional mapping between non-Hermitian material parameters and scattering spectra, considerably reducing the computational requirements. The forward process from design material parameters to response space is well defined, but the inverse problem (from response space to material parameters) is ambiguous due to the degenerate solution space51. In fact, an identical response can be produced by various parameter sets. Therefore, solving the inverse design problem for the desired spectral properties is very challenging for practical realizations due to the existence of non-unique solutions. To this end, we first identify the sub-manifolds of non-Hermitian features (gain, loss, balanced gain-loss cases), which play an important role in inverse design. 
We use unsupervised learning based on Principal Component Analysis52 (PCA) that reduces the dimensionality of the transmission spectra over a set of training data and discovers the clustering among the data representing gain, lossy, and balanced non-Hermitian cases, as depicted in Fig. 1a. The largest variance component helps to understand the underlying features in the data and provides insights about the role of effective media in designing non-Hermitian structures. Further, sub-manifold learning is utilized to accelerate the inverse design process. The inverse design process is divided into training and inference phases as shown in Fig. 1b. The training phase involves the modeling of the forward neural network, \(\hat{f}(D)\) as a function of the design space, D, where \(\hat{f}\) is a universal approximator function for forward simulator. Inference phase retrieves the design parameters, D, for on-demand spectra, which involves the computation of feasible starting design parameters based on manifold learning and the gradient of the pretrained forward model with respect to the design space \(\partial \hat{f}/\partial D\) while fixing all the weights and biases of the network. K-dimensional (KD) tree algorithm53 is applied to find the nearest neighboring point for the latent transmission spectra within a feasible region that works as a promising starting point, D, to achieve the optimal design using adaptive gradient approach. In addition, we reconstruct the transmission spectra from the asymmetric reflection that uncovers one-to-one and one-to-many mapping among reflection and transmission in forward and reverse directions.

Fig. 1: Conceptual illustration of the design process for machine learning assisted non-Hermitian photonics structures.

a Dimensionality reduction based on Principal Component Analysis (PCA) reveals the lower-dimensional sub-spaces where gain, lossy, and balanced system reside. A high dimensional transmission spectrum generated from a periodic five-layer photonic structure, of complex refractive index \({n}_{i}={n}_{{Ri}}+i{n}_{{Ii}}\), where nR and nI are the real and the imaginary parts, whose unit-cell is shown, is mapped to a point. The latent space for different spectra (left reflection and transmission RL and TL and right reflection and transmission RR and TR) forms feasible regions of gain, lossy, and balanced systems spanned by the first two principal components (PC1 and PC2). b For the inverse design, the dimensionality of the desired response (transmission) is reduced using PCA to observe the feasibility of the response with effective gain, lossy, and balanced media. Using a trained neural network (NN) that relates the design space with the response space, we search for the optimum solution with adaptive gradient descent method. The nearest neighbor of the target response within the possible region is used as a starting point to optimize the final structure. \(\hat{D}\) denotes the input whereas \(\hat{S}\) denotes the output.

In this paper, we propose a machine learning approach for knowledge discovery and solve the inverse problem in non-Hermitian photonics. As a proof of concept and without loss of generality, we apply our design strategy to study multilayered non-Hermitian structures where the complex optical materials exhibit highly asymmetric optical response due to simultaneous index and gain-loss modulation. The developed deep learning frameworks uncover the relationship between the transmission and the asymmetric reflection, recognize the non-Hermitian features in spectral data, and solve the inverse problem in general non-Hermitian systems.

Results and discussion

To demonstrate the effectiveness of our design approach, we study a non-Hermitian structure with periodically distributed five-layer supercells, where each layer of the unit cell may be parametrized by its thickness and material, as shown in Fig. 1. We model this structure with the transfer matrix method (TMM), where the material properties of the ith layer are represented by the complex refractive index \(n_{i}=n_{Ri}+\mathrm{i}n_{Ii}\), where nR and nI are the real and the imaginary parts of the refractive index [see “Methods” section for details]. The background medium is air. The complex refractive index distribution of the supercell dictates its scattering properties. Data-driven approaches generally require a huge amount of data (from thousands to millions of sets) to discover the hidden features and the intrinsic relations between the input and the output. In the data generation process, we assume the same thickness for each layer and consider the material parameters as design variables \(\mathcal{D}=[n_{R1},n_{R2},n_{R3},n_{R4},n_{R5},n_{I1},n_{I2},n_{I3},n_{I4},n_{I5}]\) that determine the scattering response, i.e., the transmission and reflection properties for left and right incident waves. Because of reciprocity, such a structure provides symmetric transmission T = TR = TL and, in general (e.g., if mirror symmetry is broken), asymmetric reflection, denoted by RL and RR for the left and right directions of the incident wave, respectively. For illustration, the spectral response of interest is set in the normalized frequency range \(0.2\le \omega a/2\pi c\le 0.8\), where each data set contains ten design parameters and 100 discrete points for each transmission and reflection spectrum, \(S=[s_{1},s_{2},s_{3},s_{4},\ldots ,s_{100}]\). For data generation, the real and imaginary parts of the refractive index are restricted to the ranges [1, 1.4] and [−0.2, 0.2], respectively. 
In our study, we randomly generate 50,000 data samples of forward simulations with TMM. Among these, 80% of the samples are used as a training set, 10% for validation, and the remaining 10% for final testing. The training set is used for knowledge acquisition and training the feed forward networks; the validation set serves as a check to avoid overfitting, and the testing set evaluates the performance of the network. Here, we assume only material properties as the design space for the network training since refractive index and gain-loss modulations are crucial to non-Hermitian physics. Yet, the geometrical parameters (thickness) can also be easily incorporated, since the proposed feed forward neural networks can handle discrete data structures, either in terms of material or geometrical parameters.
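
The data-generation step described above can be sketched as follows. This is a minimal, assumed implementation and not the authors' code: it uses the standard characteristic-matrix form of the TMM at normal incidence with the \(\mathrm{e}^{\mathrm{i}kz}\) convention (so a positive imaginary index means loss), takes the unit-cell length as a = 1 with five equal layers, and treats the number of periods `n_periods` as a hypothetical parameter because the text does not state it. The helper names `tmm_spectra`, `sample_designs`, and `split_80_10_10` are likewise illustrative.

```python
import numpy as np

def _amplitudes(stack, k, d):
    """Reflection and transmission amplitudes of a layer stack embedded in air."""
    M = np.eye(2, dtype=complex)
    for n in stack:
        delta = n * k * d                       # complex phase thickness of one layer
        layer = np.array([[np.cos(delta), -1j * np.sin(delta) / n],
                          [-1j * n * np.sin(delta), np.cos(delta)]])
        M = M @ layer
    b, c = M @ np.array([1.0, 1.0])             # exit medium: air (n = 1)
    return (b - c) / (b + c), 2.0 / (b + c)

def tmm_spectra(n_layers, freqs, n_periods=1):
    """Return (T, R_L, R_R) at normalized frequencies omega*a/(2*pi*c)."""
    n_layers = np.asarray(n_layers, dtype=complex)
    d = 1.0 / len(n_layers)                     # equal layer thicknesses, a = 1
    stack = np.tile(n_layers, n_periods)
    T, RL, RR = [], [], []
    for f in freqs:
        k = 2 * np.pi * f                       # vacuum wavenumber in units of 1/a
        r_l, t = _amplitudes(stack, k, d)
        r_r, _ = _amplitudes(stack[::-1], k, d) # incidence from the other side
        T.append(abs(t) ** 2)
        RL.append(abs(r_l) ** 2)
        RR.append(abs(r_r) ** 2)
    return np.array(T), np.array(RL), np.array(RR)

def sample_designs(rng, n_samples):
    """Random designs ordered as [n_R1..n_R5, n_I1..n_I5] within the stated ranges."""
    n_real = rng.uniform(1.0, 1.4, (n_samples, 5))
    n_imag = rng.uniform(-0.2, 0.2, (n_samples, 5))
    return np.hstack([n_real, n_imag])

def split_80_10_10(rng, n_samples):
    """Shuffled index sets for training, validation, and testing."""
    order = rng.permutation(n_samples)
    n_train, n_val = int(0.8 * n_samples), int(0.1 * n_samples)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

rng = np.random.default_rng(0)
designs = sample_designs(rng, 1000)
freqs = np.linspace(0.2, 0.8, 100)
T, RL, RR = tmm_spectra(designs[0, :5] + 1j * designs[0, 5:], freqs)
```

For a purely real index profile the sketch reproduces T + R = 1, and for complex profiles the transmission is identical from both sides while the two reflections generally differ, in line with the reciprocity argument in the text.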

Interplay of reflection and transmission in non-Hermitian structures

Conservation laws, as fundamental physical principles, have conventionally been derived in a model-driven way and have more recently been rediscovered with data-driven approaches54,55,56. Typically, the elements of the transfer matrix result in a conservation relation that connects transmission and reflection for a multilayer configuration. In Hermitian systems, the generalized conservation relation is simply expressed as T + R = 1, where the left and right reflections are necessarily identical due to mirror symmetry, i.e., RL = RR = R. Following this relation, parity-time (PT)-symmetric systems with balanced gain and loss43 obey the relation \(\sqrt{R_{L}R_{R}}=\left|1-T\right|\). In non-Hermitian systems, energy conservation is not valid due to the existence of gain and loss and, therefore, the intrinsic correlation between transmission and reflection in a general non-conservative system is yet to be established. Here, we exploit deep learning to unveil the relation between the transmission and reflection responses of non-Hermitian structures. We uncover a one-to-one mapping from reflection (RL, RR) to transmission T (i.e., a unique transmission response exists for any given reflection of an arbitrary structure) and a one-to-many mapping in the reverse direction. The one-to-one relation is valuable for reconstructing the transmission from given reflections and is analytically accessible through the training process, which adjusts the weights in a neural network (NN) forming a series of nested rectified linear unit functions. The trained NN provides a universal function approximator \(f(\mathcal{R},T,\Theta )\), where the function f maps the reflection \(\mathcal{R}\) to the transmission T with the trained weight parameters Θ.

We implement the function f by a fully connected NN to learn the mapping between the left reflection, RL (or right reflection, RR), and the transmission, where the reflection is fed as input and the transmission as output to the network [see Fig. 2a]. The training process optimizes the weights Θ by minimizing the mean absolute loss \(\mathcal{L}=\frac{1}{N}\sum_{i}\left|T_{i}-\hat{T}_{i}\right|\), where Ti and \(\hat{T}_{i}\) are the ground truth and the predicted transmission responses, respectively. The architecture of the designed network consists of four fully connected layers with 500–500–400–300 neurons, as depicted in Fig. 2a (see details in Supplementary Note 1). The performance of the trained network is quantified with the relative spectral error defined as \(e_{s}=\sum_{i}\left|T_{i}-\hat{T}_{i}\right|/T_{i}\), where Ti and \(\hat{T}_{i}\) denote the discretized target and predicted spectral response, respectively. In Fig. 2b, we show the distribution of the spectral error over the entire testing set in histogram form, which indicates a high prediction accuracy (over 97%) of the network with an average error of 2.89% (dashed red line). The prediction performance of the trained model is demonstrated by three representative examples in Fig. 2c, where the red and black curves, corresponding to the TMM simulation and the network prediction, respectively, show excellent agreement.
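
The two figures of merit used above can be written compactly as below. This is a hedged sketch: the function names are illustrative, and the relative spectral error is normalized here as a mean over the frequency points, which is an assumption because the expression in the text does not state the normalization explicitly.

```python
import numpy as np

def mean_absolute_loss(t_true, t_pred):
    """Training loss: mean of |T_i - T_hat_i| over the spectral points."""
    t_true, t_pred = np.asarray(t_true, float), np.asarray(t_pred, float)
    return np.mean(np.abs(t_true - t_pred))

def relative_spectral_error(t_true, t_pred):
    """Pointwise |T_i - T_hat_i| / T_i, averaged over frequency (assumed normalization)."""
    t_true, t_pred = np.asarray(t_true, float), np.asarray(t_pred, float)
    return np.mean(np.abs(t_true - t_pred) / t_true)

# Toy spectrum with two frequency points.
print(mean_absolute_loss([0.5, 1.0], [0.45, 1.1]))
print(relative_spectral_error([0.5, 1.0], [0.45, 1.1]))
```

Note that the relative error is sensitive to frequency points where the target transmission approaches zero.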

Fig. 2: Designed neural network for left reflection to transmission mapping.

a Architecture of the optimized network, showing the left reflection RL as function of frequency ω as input and the predicted transmission \(\hat{T}\) as function of ω as output. b Histogram of the spectral prediction error. c Three representative examples for generation of transmission response T (using Transfer Matrix Method: TMM and Machine Learning: ML) from three different given reflection spectra RL versus normalized frequency ωa/2πc, with a the dimension of the scatterer and c the speed of light in free space. The solid red and dotted black line represent the transmission calculated from TMM and ML method, respectively. More examples are shown in Supplementary Fig. 1.

Note that neural networks can also be designed to determine the transmission spectra from the given right reflection (see Supplementary Note 2), or from both reflections (see Supplementary Note 3). Our findings suggest that only one component of the reflection, either left or right, is sufficient to build the transmission profile of the considered non-Hermitian system.

Sub-manifold learning for features recognition in non-Hermitian structures

The spectral response of a photonic structure depends to a great extent on the operational frequency. A multi-dimensional frequency response presents a large number of features, which grows very fast with the number of discrete frequencies. Due to extra degrees of freedom introduced by gain-loss modulation in non-Hermitian systems, the spectral response of photonic structures becomes even more complex. Consequently, it becomes extremely challenging to recognize the non-Hermitian structure from the spectral response and infer some knowledge for deep physical insight. Therefore, effective procedures to extract the most relevant information contained in the system response are highly desired. Different approaches have proven to be useful for extraction of the valuable information in data, such as PCA52, multidimensional binary search tree53, neural autoencoder57, to name a few58,59. Such approaches are based on dimensionality reduction to visualize patterns, similarities, and differences in data with minimum loss of information. In this study, we exploit the unsupervised dimensionality reduction algorithm using PCA59 that transforms the discretized transmission spectra from a high-dimensional space to a low-dimensional latent space and models the sub-manifolds for different non-Hermitian structures. These manifolds, in the form of convex hull, are used to investigate the feasibility of having a desired optical response from a certain class of non-Hermitian structure. When mapped into the respective two-dimensional (2D) feature space, these responses form a distribution of points, which characterize the correlation among the multi-frequency responses. The latent representation exploits the variance of features as well as the covariance between features to find major trends in spectral data. PCA organizes the large dimensions of each spectrum in terms of a respective feature vector, existing in higher-dimensional feature space. 
The procedure includes the rotation of the coordinate axes of the feature space such that the first axis results with the maximum possible data dispersion (as quantified by the statistical variance), the second axis with the second maximum dispersion, and so on52. This principle is illustrated for a transmission spectrum with the transmission amplitude at three different frequencies in Fig. 1. In practice, we deal with n frequencies and represent one transmission spectrum as an n-dimensional vector. Since the n frequency features are not independent, a large set of the transmission spectra when all represented as n-dimensional vectors live on a low-dimensional manifold embedded in the n-dimensional space. Therefore, PCA can be employed to reduce the n-dimensional space to a lower-dimensional space (e.g., a 2D space spanned by the first two principal components), and visualize interesting patterns of gain, lossy, and balanced media [see Method section for details]. In our study, PCA is applied over the training data of transmission spectra, and the first two components encoding the latent representation in 2D are shown in Fig. 3a. However, the number of principal components required for dimensionality reduction depends on the nature of the spectral data being examined. The dominant spectral features of a spectrum with many resonance peaks spread randomly over the broad frequency range may require more principal components for latent space representation. A convex hull is plotted to show the boundary of possible feasible response from the given non-Hermitian structure [see Fig. 3b]. Note that the convex hull has been formed with Quickhull algorithm to bound the transmission patterns in 2D latent space60. We identify three distinct manifolds corresponding to the effective gain, lossy, and balanced gain-loss class for the latent spectra. An overlapping area appears when the relative amount of gain and loss in the system are nearly equal. 
Around the center of the reduced space, the gain and loss become exactly balanced, which maximizes the unidirectionality. The difference in the area of color-coded sub-manifold implies the capability of the multilayer configuration to construct a wide range of feasible spectral responses with different non-Hermitian media. Figure 3c–e presents representative transmission spectra belonging to three distinct manifolds, respectively. Sub-manifold learning can be used to forecast the feasibility of a response using a specific structure class. There are different physical scenarios associated with the identified feasible regions. For example, the gain region is ideal for lasing and designing amplifiers with stacked index-gain modulations. The balanced region can be applied for simultaneous lasing and coherent perfect absorption, sensing, and unidirectional invisibility, while coherent perfect absorbers can be designed using the lossy region.
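
The latent-space construction described above can be sketched in a few lines. This is an illustrative outline under stated assumptions: PCA is computed with a plain singular value decomposition, and the convex hull is built with Andrew's monotone-chain algorithm as a dependency-free stand-in for the Quickhull algorithm named in the text. All function names are hypothetical, and the random input only exercises the code; it does not represent real spectra.

```python
import numpy as np

def fit_pca(spectra, n_components=2):
    """Return the mean spectrum and the leading principal axes (rows)."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(spectra, mean, components):
    """Map spectra (n_samples x n_frequencies) to latent coordinates."""
    return (np.atleast_2d(spectra) - mean) @ components.T

def convex_hull(points):
    """Hull vertices in counter-clockwise order (monotone chain, not Quickhull)."""
    pts = sorted(map(tuple, points))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return np.array(half(pts) + half(reversed(pts)))

def inside_hull(hull, point, tol=1e-12):
    """True if the point lies inside or on a counter-clockwise convex polygon."""
    edge = np.roll(hull, -1, axis=0) - hull
    rel = np.asarray(point) - hull
    return bool(np.all(edge[:, 0] * rel[:, 1] - edge[:, 1] * rel[:, 0] >= -tol))

rng = np.random.default_rng(0)
spectra = rng.random((200, 100))            # placeholder for 100-point spectra
mean, comps = fit_pca(spectra)
latent = project(spectra, mean, comps)      # (200, 2) points spanned by PC1, PC2
hull = convex_hull(latent)                  # boundary of the feasible region
target = project(spectra[0], mean, comps)[0]
print(inside_hull(hull, target))            # feasibility check for a target spectrum
```

In the setting of the text, one hull would be built per class (gain, lossy, balanced), and a desired spectrum would be tested against each hull to decide which class can realize it.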

Fig. 3: Principal component analysis for the transmission spectra generated from a five-layer periodic non-Hermitian structure.

a Representation of the first two principal components (PC1 and PC2) of the transmission spectra. b The corresponding convex hulls of the feasible regions for the non-Hermitian structures of Fig. 1 in the reduced 2D space. The red, blue, and yellow regions correspond to effective gain, lossy, and balanced media, respectively. The responses generated from effective gain and lossy media reside top-right and bottom-left of the center of the latent space, respectively. An overlapping region is found when the gain and loss are balanced within the threshold \(10^{-6}\). Within this overlapping area, PT-symmetry with perfect transmission exists around the center of the latent space. c–e Representative examples of the transmission spectra versus normalized frequency ωa/2πc, with a the dimension of the scatterer and c the speed of light in free space, residing in the gain, balanced, and lossy regions of the reduced space, respectively.

Forward neural network

We develop a forward neural network to determine the spectral response for a given structure. It takes the design parameters \(\mathcal{D}\) as input and the spectral response \(S\) as output to build the mapping \(S=f(\mathcal{D})\), where f is the complex nonlinear function constructed by the forward neural network. The one-to-one mapping from design parameters to spectral response is a regression problem that can be modeled by an NN, as depicted in Fig. 4a. The network is trained with the mean absolute spectral loss defined as \(\mathcal{L}=\frac{1}{N}\sum_{i}\left|S_{i}-\hat{S}_{i}\right|\), where Si and \(\hat{S}_{i}\) are the ground truth and the predicted spectral response, respectively. The designed architecture for the forward NN has 500–500–400–400–300–300 neurons in six layers, for the transmission and the asymmetric reflections [see Fig. 4a]. In order to assess the performance of the designed network, we define the relative spectral error on the testing data sets as \(e=\sum_{i}\left|S_{i}-\hat{S}_{i}\right|/S_{i}\), where Si and \(\hat{S}_{i}\) correspond to the target and predicted spectral responses. The average spectral errors (below 4%) for the transmission and left/right reflection of the designed forward networks are presented in Fig. 4b. The representative predicted responses agree well with TMM simulations, as shown in Fig. 4c, d (see more details in Supplementary Note 4).
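
A forward pass through a network with the stated layer widths can be sketched as follows. Only the layer sizes come from the text; the ReLU hidden activations, the linear output layer, the He-style random initialization, and the names `init_params` and `forward` are assumptions made for this illustration. The weights here are untrained, so the output is not a physical spectrum.

```python
import numpy as np

# 10 design parameters -> six hidden layers -> 100 spectral points
LAYER_SIZES = [10, 500, 500, 400, 400, 300, 300, 100]

def init_params(rng, sizes=LAYER_SIZES):
    """Random (untrained) weights and zero biases for each fully connected layer."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, design):
    """Map a batch of design vectors to predicted spectra."""
    h = np.atleast_2d(design)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)      # ReLU hidden layers (assumed)
    w, b = params[-1]
    return h @ w + b                        # linear output layer (assumed)

rng = np.random.default_rng(0)
params = init_params(rng)
batch = rng.uniform(-0.2, 1.4, (3, 10))     # three placeholder design vectors
print(forward(params, batch).shape)         # (3, 100)
```

In practice one such network would be trained per response (transmission, left reflection, right reflection) with the mean absolute loss defined above, typically in an automatic-differentiation framework.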

Fig. 4: Designed feed forward neural network for non-Hermitian structures.

a Architecture of the 6-layer neural network with input D and output \(\hat{S}\), with the transmission \(\hat{T}\), left reflection \({\hat{R}}_{L}\), and right reflection \({\hat{R}}_{R}\) versus frequency ω. b Average spectral prediction error for transmission and left/right reflection. c, d Representative examples of the predicted spectral response (using Transfer Matrix Method: TMM and Machine Learning: ML) for the designed networks (i) transmission, (ii) left reflection, and (iii) right reflection versus normalized frequency ωa/2πc, with a the dimension of the scatterer and c the speed of light in free space.

Inverse network

Next, we move toward the inverse design problem after obtaining a well-trained forward NN model. Most inverse design problems are ambiguous due to the “one-to-many” mapping, i.e., different non-unique solutions exist for the same target response. Consequently, a single discriminative network cannot achieve the optimal solution in the inverse design. To mitigate this problem, auxiliary training approaches and optimization strategies have been incorporated in the inverse design process31,32,37,38,43,61,62,63,64. In our work, we propose sub-manifold learning combined with the neural adjoint method to solve the inverse problem. To design a structure that results in a desired spectral response, the first step is to find the corresponding target point in the latent space by reducing the dimensionality of the desired response using PCA. Next, the KD-tree algorithm is applied to find the nearest neighboring point to the target point in the latent space within the feasible sub-manifolds. The identified nearest neighbor acts as the starting point for inverse design using the neural adjoint approach. The neural adjoint method determines the optimal solution by computing the gradient of the pretrained forward model \(\hat{f}(D)\) with respect to the design parameters while keeping all weights and biases of the network fixed. The pretrained forward network \(\hat{f}(D)\) provides a closed-form differentiable expression, and thus the calculation of \(\partial \hat{f}/\partial D\) is straightforward. The gradient of a loss function \(\mathcal{L}\) with respect to the input design parameters is evaluated to move iteratively along the loss surface toward the optimal solution. The inverse model can be denoted as \(\hat{f}^{-1}(S,D_{0})\), where D0 is the initial structure obtained from sub-manifold learning and S is the desired spectrum.
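
The seed-selection step described above can be illustrated with a brute-force nearest-neighbor search. This is a deliberate simplification: it stands in for the KD-tree query named in the text, which returns the same neighbor but scales better to large training sets. The function name `nearest_seed` and the optional `feasible` mask (marking training points that belong to the chosen sub-manifold) are assumptions for illustration.

```python
import numpy as np

def nearest_seed(latent_train, designs_train, latent_target, feasible=None):
    """Return the training design whose latent point is closest to the target.

    Brute-force search used here as a stand-in for a KD-tree query.
    """
    latent_train = np.asarray(latent_train, dtype=float)
    candidates = np.arange(len(latent_train))
    if feasible is not None:                 # restrict to one feasible sub-manifold
        candidates = candidates[np.asarray(feasible, dtype=bool)]
    offsets = latent_train[candidates] - np.asarray(latent_target, dtype=float)
    best = candidates[np.argmin(np.linalg.norm(offsets, axis=1))]
    return np.asarray(designs_train)[best], int(best)

# Toy example: three training points in the 2D latent space.
latent = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
designs = [[10.0], [20.0], [30.0]]
seed, index = nearest_seed(latent, designs, [0.9, 1.2])
print(index)   # 1
```

The returned design then serves as D0 for the gradient-based refinement.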

Let S be our desired spectrum, and Di be our current best estimate of the design parameters, where the index i indicates the iteration of the adaptive gradient-based estimation procedure. To compute Di+1, we first calculate exponentially decaying moving averages of the gradient of the loss function with respect to the input design parameters, and of its square, in the following way:

$$\begin{aligned} m_{i} &=\beta_{1}m_{i-1}+\left(1-\beta_{1}\right)\frac{\partial \mathcal{L}\left(\hat{f}(\hat{D}_{i}),S\right)}{\partial \hat{D}_{i}},\\ v_{i} &=\beta_{2}v_{i-1}+\left(1-\beta_{2}\right)\left[\frac{\partial \mathcal{L}\left(\hat{f}(\hat{D}_{i}),S\right)}{\partial \hat{D}_{i}}\right]^{2},\end{aligned}$$
(1)

where \(\mathcal{L}=\frac{1}{N}\sum_{j=1}^{N}\left(\hat{f}(\hat{D})_{j}-S_{j}\right)^{2}\) is the mean squared loss function, with the sum running over the N spectral points, and β1, β2 are the exponential decay rates for the first- and second-moment estimates, respectively. Next, we correct the bias in the first- and second-moment estimates as:

$${\hat{m}}_{i}=\frac{{m}_{i}}{1-{\beta }_{1}^{i}},{\hat{v}}_{i}=\frac{{v}_{i}}{1-{\beta }_{2}^{i}}.$$
(2)

Finally, we can update the design parameters based on the calculated moving averages with a step size α:

$${\hat{D}}_{i+1}={\hat{D}}_{i}-\alpha \frac{{\hat{m}}_{i}}{\sqrt{{\hat{v}}_{i}}+\epsilon },$$
(3)

where α sets the adaptive learning rate65 and ϵ is a small constant that prevents division by zero. The major problem with existing neural adjoint methods is the selection of the initial seed; we mitigate it with sub-manifold learning, which eventually leads to an accurate solution in inverse design.
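
Equations (1)–(3) can be traced in code with a small frozen model. The sketch below is illustrative only: the one-hidden-layer tanh network is a hypothetical stand-in for the pretrained forward network, its input gradient is written out by hand, and the hyperparameter values are common defaults chosen here rather than values reported in the study.

```python
import numpy as np

def make_frozen_model(rng, n_in=10, n_hidden=32, n_out=100):
    """Hypothetical stand-in for the pretrained forward network (weights stay fixed)."""
    w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
    return w1, np.zeros(n_hidden), w2, np.zeros(n_out)

def predict(model, design):
    w1, b1, w2, b2 = model
    return np.tanh(design @ w1 + b1) @ w2 + b2

def loss_and_grad(model, design, target):
    """Mean squared loss and its gradient with respect to the design vector only."""
    w1, b1, w2, b2 = model
    hidden = np.tanh(design @ w1 + b1)
    residual = hidden @ w2 + b2 - target
    loss = np.mean(residual ** 2)
    d_out = 2.0 * residual / residual.size
    d_hidden = (d_out @ w2.T) * (1.0 - hidden ** 2)
    return loss, d_hidden @ w1.T

def neural_adjoint(model, target, seed, steps=500, alpha=0.01,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """Iterate Eqs. (1)-(3) on the design vector, starting from the seed."""
    design = np.array(seed, dtype=float)
    m = np.zeros_like(design)
    v = np.zeros_like(design)
    for i in range(1, steps + 1):
        _, grad = loss_and_grad(model, design, target)
        m = beta1 * m + (1 - beta1) * grad            # Eq. (1), first moment
        v = beta2 * v + (1 - beta2) * grad ** 2       # Eq. (1), second moment
        m_hat = m / (1 - beta1 ** i)                  # Eq. (2), bias correction
        v_hat = v / (1 - beta2 ** i)
        design -= alpha * m_hat / (np.sqrt(v_hat) + eps)   # Eq. (3), update
    return design

rng = np.random.default_rng(0)
model = make_frozen_model(rng)
target = predict(model, rng.uniform(-0.2, 0.2, 10))   # synthetic target spectrum
seed = rng.uniform(-0.2, 0.2, 10)                     # placeholder for the latent-space seed
design = neural_adjoint(model, target, seed)
print(loss_and_grad(model, seed, target)[0], loss_and_grad(model, design, target)[0])
```

With an automatic-differentiation framework, the hand-written gradient would be replaced by a call that differentiates the loss with respect to the input while the network weights remain frozen.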

The results for the inverse design are summarized in Fig. 5. We consider three representative examples of desired responses whose transmission spectra reside within the sub-manifolds of lossy, gain, and balanced media, indicated by b–d in Fig. 5a. The nearest neighbor of the corresponding point in the latent space serves as the initial structure, which is then evolved to achieve the optimal target spectrum. The predicted spectra coincide with the targets, which indicates the strong capability of our method to design spectral responses absent from the training set, as shown in Fig. 5b–d (see details in Supplementary Note 5). Figure 5c shows perfect transmission and asymmetric reflection that do not require any strict space and time symmetry conditions, as in the case of PT-symmetric structures. Therefore, the proposed approach can be used to design a more general class of planar reflectionless structures with non-Hermitian materials that can be realized with locally isotropic and non-magnetic materials. In addition, the method can be applied to design high-quality resonators based on non-Hermitian photonic structures by using more spectral points in the desired frequency range for the training data, without affecting the prediction time.

Fig. 5: Inverse design approach based on sub-manifold learning and neural adjoint method.

a Convex hulls of the feasible regions for non-Hermitian structures for the training transmission data in latent space. b–d Desired spectral responses (using Transfer Matrix Method: TMM and Machine Learning: ML) versus normalized frequency ωa/2πc, with a the dimension of the scatterer and c the speed of light in free space, designed with the adaptive gradient descent method. The initial seed is obtained within the feasible sub-manifolds of lossy, gain, and balanced non-Hermitian systems.

Conclusion

In conclusion, we propose unsupervised and supervised learning for modeling the optical response of non-Hermitian photonic structures, which yields valuable insights into the relationship between asymmetric reflections and transmission in non-conservative systems. The study reveals that either the left or the right reflection is sufficient to determine the corresponding transmission profile. As a demonstration, we study a multilayered periodic structure and develop a machine learning framework for knowledge acquisition and retrieval of the design parameters for a given spectral response. In particular, we propose a machine learning model that determines the transmission from a given asymmetric reflection without knowledge of the design parameters, which uncovers the one-to-one mapping from the reflection to the transmission spectra. Dimensionality reduction based on PCA is applied to learn the sub-manifolds of lossy, gain, and balanced structures via the transmission response. The learned sub-manifolds in the 2D latent space are useful to determine the feasibility of a response with a given class of structures and to find the best initial seed for inverse design with the neural adaptive gradient method. In this proposal, a single feed-forward neural network is trained, rather than relying on the complex training procedures used in common machine learning-based inverse methods, which tend to yield suboptimal results. The PCA and KD-tree algorithms provide accurate initialization for the neural adjoint method to find an approximate globally optimal solution within a short period of time (one minute). The approach is used to inversely design unidirectional properties with multilayered structures that do not require any strict symmetry, as in PT-symmetric structures. Our methodology integrates dimensionality reduction to automate the design and thus provides a versatile platform to learn new physical insights in non-Hermitian photonic structures. 
The inverse design method is not restricted to optical problems, and it can be applied directly to find accurate solutions to ill-posed problems in other physical systems such as acoustics, elasticity, or plasmonics. It is worth to mention that we use the feed forward neural networks to represent discrete data structures (material or geometry parameters) in modeling the response of multilayered structures. However, the idea can be extended for complex coupled systems beyond multilayered systems by replacing the feed forward neural network with different network architectures. Graphical neural networks can be used to model photonic systems with interacting objects (e.g., coupled ring resonators), while recurrent neural networks are used to model dynamical photonics systems.

Methods

Transfer matrix method

TMM is one of the most widely used methods to model light propagation in multilayered structures. In this study, we consider a one-dimensional periodic structure formed by five alternating layers with coherent interfaces. We assume that all layers are isotropic and nonmagnetic, with different dielectric materials. The electric field is represented as a superposition of the left- and right-propagating waves with wavenumber \(k=\omega /c\), as \({E}^{-}\left(z\right)={E}_{L}^{+}{{\rm{e}}}^{{\rm{i}}kz}+{E}_{L}^{-}{{\rm{e}}}^{-{\rm{i}}kz}\) and \({E}^{+}\left(z\right)={E}_{R}^{-}{{\rm{e}}}^{{\rm{i}}kz}+{E}_{R}^{+}{{\rm{e}}}^{-{\rm{i}}kz}\) for the left \(\left(z \, < \, -L/2\right)\) and the right \(\left(z \, > \, L/2\right)\) side of the structure, respectively. The continuity conditions for the tangential electric field components at the outer interfaces of the layered structure determine the transfer matrix, M, which is the product of matching and propagation matrices. Mathematically, the transfer matrix relates the field amplitudes of the left- and right-propagating waves in the following manner:

$$\left(\begin{array}{c}{E}_{R}^{-}\\ {E}_{R}^{+}\end{array}\right)=M\left(\begin{array}{c}{E}_{L}^{+}\\ {E}_{L}^{-}\end{array}\right),M=\left(\begin{array}{cc}{M}_{11} & {M}_{12}\\ {M}_{21} & {M}_{22}\end{array}\right).$$
(4)

The transmission coefficients \({t}_{R,L}\) and reflection coefficients \({r}_{R,L}\) (along with the transmittance \({T}_{R,L}={\left|{t}_{R,L}\right|}^{2}\) and reflectance \({R}_{R,L}={\left|{r}_{R,L}\right|}^{2}\)) for left (L) and right (R) incidence can be computed from the boundary conditions and are related to the elements of the transfer matrix as:

$$\begin{array}{cc}{t}_{R}=\frac{1}{{M}_{22}}, & {t}_{L}=\frac{{M}_{11}{M}_{22}-{M}_{12}{M}_{21}}{{M}_{22}},\\ {r}_{R}=\frac{{M}_{12}}{{M}_{22}}, & {r}_{L}=-\frac{{M}_{21}}{{M}_{22}}.\end{array}$$
(5)
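
As an illustration of how Eq. (5) can be evaluated numerically, the following is a minimal sketch rather than the implementation used in this work. It assumes normal incidence, air on both sides of the structure, complex layer permittivities, and amplitudes referenced to the outer interfaces (so phases may differ from the \(z=\pm L/2\) convention by a constant factor). The function names and the layer values are hypothetical.

```python
import numpy as np

def interface_matrix(n_left, n_right):
    """Matching matrix: continuity of E and dE/dz across an interface (nonmagnetic media)."""
    return np.array([[n_right + n_left, n_right - n_left],
                     [n_right - n_left, n_right + n_left]], dtype=complex) / (2 * n_right)

def propagation_matrix(n, k0, d):
    """Propagation matrix: phase accumulated across a layer of index n and thickness d."""
    phase = n * k0 * d
    return np.array([[np.exp(1j * phase), 0],
                     [0, np.exp(-1j * phase)]], dtype=complex)

def transfer_matrix(eps_layers, thicknesses, k0):
    """Total transfer matrix M of a layer stack embedded in air (assumed n = 1 outside)."""
    M = np.eye(2, dtype=complex)
    n_prev = 1.0
    for eps, d in zip(eps_layers, thicknesses):
        n = np.sqrt(complex(eps))
        M = propagation_matrix(n, k0, d) @ interface_matrix(n_prev, n) @ M
        n_prev = n
    return interface_matrix(n_prev, 1.0) @ M

def scattering_coefficients(M):
    """Left/right transmission and reflection coefficients from M, following Eq. (5)."""
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    t_R = 1 / M[1, 1]
    t_L = det / M[1, 1]
    r_R = M[0, 1] / M[1, 1]
    r_L = -M[1, 0] / M[1, 1]
    return t_L, t_R, r_L, r_R

# Hypothetical five-layer stack with gain and loss (illustrative values only).
eps_layers = [2.0 + 0.3j, 4.0, 2.0 - 0.3j, 4.0, 2.0]
thicknesses = [0.2] * 5
k0 = 2 * np.pi * 0.3
t_L, t_R, r_L, r_R = scattering_coefficients(transfer_matrix(eps_layers, thicknesses, k0))
```

Because each matching matrix has determinant equal to the ratio of the indices on its two sides, the determinant of M is unity when the outer media are identical, so this sketch returns \({t}_{L}={t}_{R}\) while \({r}_{L}\) and \({r}_{R}\) are in general different for an asymmetric gain-loss profile.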

Principal component analysis

PCA is a powerful unsupervised method for dimensionality reduction that transforms the data to a lower-dimensional space to identify the intrinsic patterns and correlations in the data while retaining most of the variance of the original data. Consider m data points in an n-dimensional spectral space \(S=[{s}_{1},{s}_{2},\ldots ,{s}_{n}]\), where \({s}_{i}\) represents the ith feature and n is the total number of features. The response data matrix can be written as \({R}_{m\times n}\), from which k successive orthogonal components (also called principal components) are computed, each along the direction of maximum remaining variance. The largest eigenvalues of the covariance matrix of the response data, along with the corresponding eigenvectors, are used to analyze the large amount of high-dimensional transmission data. In this work, we implement PCA with the Python scikit-learn (sklearn) library, which uses singular value decomposition to calculate the principal components.
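
The projection onto principal components can be sketched as follows. This is a minimal NumPy illustration of SVD-based PCA on mean-centered data, not the code used in this work; the toy "spectra" are synthetic and hypothetical, and the function name `pca_project` is an assumption. With scikit-learn, `sklearn.decomposition.PCA(n_components=2).fit_transform(R)` is expected to give the same 2D latent coordinates up to the sign of each component.

```python
import numpy as np

def pca_project(R, k=2):
    """Project the rows of R (m samples x n features) onto the first k principal components."""
    R_centered = R - R.mean(axis=0)            # PCA operates on mean-centered data
    U, S, Vt = np.linalg.svd(R_centered, full_matrices=False)
    components = Vt[:k]                        # principal directions, shape (k, n)
    latent = R_centered @ components.T         # latent coordinates, shape (m, k)
    explained_ratio = (S ** 2 / np.sum(S ** 2))[:k]
    return latent, components, explained_ratio

# Hypothetical toy data: m spectra sampled at n frequencies, built from
# two smooth shapes with random weights plus small noise.
rng = np.random.default_rng(0)
m, n = 200, 64
freq = np.linspace(0.0, 1.0, n)
basis = np.vstack([np.sin(2 * np.pi * freq), np.cos(4 * np.pi * freq)])
R = rng.normal(size=(m, 2)) @ basis + 0.01 * rng.normal(size=(m, n))

latent, components, explained_ratio = pca_project(R, k=2)
```

In this toy case the data are essentially two-dimensional, so two components capture nearly all of the variance; for real transmission spectra the explained-variance ratio indicates how much information a 2D latent space retains.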