
The planted matching problem: sharp threshold and infinite-order phase transition

Published in: Probability Theory and Related Fields

Abstract

We study the problem of reconstructing a perfect matching \(M^*\) hidden in a randomly weighted \(n\times n\) bipartite graph. The edge set includes every node pair in \(M^*\) and each of the \(n(n-1)\) node pairs not in \(M^*\) independently with probability d/n. The weight of each edge e is independently drawn from the distribution \({\mathcal {P}}\) if \(e \in M^*\) and from \({\mathcal {Q}}\) if \(e \notin M^*\). We show that if \(\sqrt{d} B({\mathcal {P}},{\mathcal {Q}}) \le 1\), where \(B({\mathcal {P}},{\mathcal {Q}})\) stands for the Bhattacharyya coefficient, the reconstruction error (average fraction of misclassified edges) of the maximum likelihood estimator of \(M^*\) converges to 0 as \(n\rightarrow \infty \). Conversely, if \(\sqrt{d} B({\mathcal {P}},{\mathcal {Q}}) \ge 1+\epsilon \) for an arbitrarily small constant \(\epsilon >0\), the reconstruction error for any estimator is shown to be bounded away from 0 for both the sparse (fixed d) and dense (growing d) regimes, resolving the conjecture in Moharrami et al. (Ann Appl Probab 31(6):2663–2720, 2021. https://doi.org/10.1214/20-AAP1660) and Semerjian et al. (Phys Rev E 102:022304, 2020. https://doi.org/10.1103/PhysRevE.102.022304). Furthermore, in the special case of the complete exponentially weighted graph with \(d=n\), \({\mathcal {P}}=\exp (\lambda )\), and \({\mathcal {Q}}=\exp (1/n)\), for which the sharp threshold simplifies to \(\lambda =4\), we prove that when \(\lambda = 4-\epsilon \), the optimal reconstruction error is \(\exp \left( - \varTheta (1/\sqrt{\epsilon }) \right) \), confirming the conjectured infinite-order phase transition in Semerjian et al. (2020).
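For the exponential special case above, the claimed simplification of the threshold to \(\lambda = 4\) can be checked numerically. The following sketch (ours, not part of the paper) uses the closed form \(B(\exp (\lambda ),\exp (\mu )) = 2\sqrt{\lambda \mu }/(\lambda +\mu )\):

```python
import math

def bhattacharyya_exp(lam: float, mu: float) -> float:
    # B(Exp(lam), Exp(mu)) = integral of sqrt(lam e^{-lam x} * mu e^{-mu x}) dx
    #                      = 2 sqrt(lam * mu) / (lam + mu)
    return 2.0 * math.sqrt(lam * mu) / (lam + mu)

n = 10**6
lam = 4.0                                     # the sharp threshold when d = n
val = math.sqrt(n) * bhattacharyya_exp(lam, 1.0 / n)
# as n grows, sqrt(d) * B tends to 2/sqrt(lam), which equals 1 exactly at lam = 4
print(val)
```

With \(d = n\) and \({\mathcal {Q}}=\exp (1/n)\), the product \(\sqrt{d}B\) approaches \(2/\sqrt{\lambda }\), so the condition \(\sqrt{d}B = 1\) is met precisely at \(\lambda = 4\).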


Data availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Notes

  1. If \(\rho (0)=0\), then for each vertex, among its incident edges, the planted edge weight has the smallest magnitude with high probability, in which case almost perfect recovery is trivially achievable.

  2. To be precise, the conjecture given in [25, eqs. (45) and (40)] is stated under the aforementioned unipartite version of the planted matching problem. Nevertheless, our proof techniques as well as the sharp thresholds in Theorems 1–3 continue to hold for the unipartite version.

  3. In statistical physics parlance, a phase transition is called first-order if the order parameter (in this case, the average reconstruction error) is discontinuous at the threshold, and of p-th order for \(p\ge 2\) if its \((p-2)\)-th derivative is continuous but its \((p-1)\)-th derivative is discontinuous [25].

  4. Indeed, suppose that \(\sqrt{d}B({\mathcal {P}},{\mathcal {Q}})=c>1+\epsilon \). Consider the model parametrized with \((d',{\mathcal {P}},{\mathcal {Q}})\) where \(d'=d(1+\epsilon )^2/c^2<d\) so that \(\sqrt{d'}B({\mathcal {P}},{\mathcal {Q}})=1+\epsilon \). From an observed graph G generated from the \((d',{\mathcal {P}},{\mathcal {Q}})\) model, one can “densify” G by adding edges independently with edge weight drawn from \({\mathcal {Q}}\) to arrive at an instance of the \((d,{\mathcal {P}},{\mathcal {Q}})\) model. Therefore, the lower bound on the average reconstruction error carries over to the \((d,{\mathcal {P}},{\mathcal {Q}})\) model.

  5. Strictly speaking, m needs to be rounded to the nearest integer. To lighten the notation, from here on we will not explicitly specify the rounding step, as it only affects constant factors.

  6. Indeed, construct \(\mathbb {V}\) greedily until no more subsets can be added. Then for any T with \(|T|=n/2\), there exists some \(S\in \mathbb {V}\) such that \(|S\triangle T|<n/3\). Since for each fixed S, the number of T such that \(|S\triangle T|<n/3\) is at most \(\sum _{i=0}^{n/3}\left( {\begin{array}{c}n\\ i\end{array}}\right) \), the lower bound on \(|\mathbb {V}|\) follows.

  7. Indeed, using the density of sum of exponentials (see (87) in Appendix B), the probability that \(\textsf{wt}_{{\textsf{r}}}(P)/\ell \) and \(\textsf{wt}_{{\textsf{r}}}(P)/(\ell -1)\) are both close to a given value x is proportional to \(x^{2\ell -3} e^{-\ell x(\lambda + \frac{\ell -1}{n \ell })}\), which, for large n and \(\ell \), is approximately maximized at \(x = \frac{2}{\lambda }\).

  8. We emphasize that it is crucial to keep the polynomial terms in (76) so that in (79) we can get the upper bound \(4 (2\ell )^{k-1}\), which in turn yields the desired \(\frac{\ell ^2}{n}\) factor in (81).

References

  1. Aldous, D.: The \(\zeta (2)\) limit in the random assignment problem. Random Struct. Algorithms 18(4), 381–418 (2001). https://doi.org/10.1002/rsa.1015


  2. Alon, N., Spencer, J.H.: The Probabilistic Method. Wiley-Interscience Series in Discrete Mathematics and Optimization, 3rd edn (2008)

  3. Bagaria, V., Ding, J., Tse, D., Wu, Y., Xu, J.: Hidden Hamiltonian cycle recovery via linear programming. Oper. Res. 68(1), 53–70 (2020)

  4. Bhattacharyya, A.: On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc. 35, 99–109 (1943)


  5. Chertkov, M., Kroc, L., Krzakala, F., Vergassola, M., Zdeborová, L.: Inference in particle tracking experiments by passing messages between images. PNAS 107(17), 7663–7668 (2010). https://doi.org/10.1073/pnas.0910994107


  6. Decelle, A., Krzakala, F., Moore, C., Zdeborova, L.: Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84, 066106 (2011)


  7. Ding, J.: Scaling window for mean-field percolation of averages. Ann. Probab. 41(6), 4407–4427 (2013)


  8. Ding, J., Goswami, S.: Percolation of averages in the stochastic mean field model: the near-supercritical regime. Electron. J. Probab. 20, 1–21 (2015)


  9. Ding, J., Sun, N., Wilson, D.B.: Supercritical minimum mean-weight cycles. Trans. Am. Math. Soc. (2015) (to appear). arXiv:1504.00918

  10. Ding, J., Wu, Y., Xu, J., Yang, D.: Consistent recovery threshold of hidden nearest neighbor graphs. In: Proceedings of Conference on Learning Theory (COLT) (2020)

  11. Durrett, R.: Random Graph Dynamics. Cambridge University Press, Cambridge (2007)

  12. Frieze, A., Karoński, M.: Introduction to Random Graphs. Cambridge University Press, Cambridge (2016)


  13. Hajek, B., Wu, Y., Xu, J.: Information limits for recovering a hidden community. IEEE Trans. Inf. Theory 63(8), 4729–4745 (2017)


  14. Karp, R.M.: An upper bound on the expected cost of an optimal assignment. In: Johnson, D.S., Nishizeki, T., Nozaki, A., Wilf, H.S. (eds.) Discrete Algorithms and Complexity, pp. 1–4. Academic Press, Cambridge (1987). https://doi.org/10.1016/B978-0-12-386870-1.50006-X


  15. Krivelevich, M.: Long paths and Hamiltonicity in random graphs. Random Graphs Geom. Asymptot. Struct. 84, 1 (2016)


  16. Kunisky, D., Niles-Weed, J.: Strong recovery of geometric planted matchings. In: Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 834–876. SIAM (2022)

  17. Lesieur, T., Krzakala, F., Zdeborová, L.: MMSE of probabilistic low-rank matrix estimation: universality with respect to the output channel. In: 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 680–687 (2015). https://doi.org/10.1109/ALLERTON.2015.7447070

  18. Linusson, S., Wästlund, J.: A proof of Parisi’s conjecture on the random assignment problem. Probab. Theory Relat. Fields 128(3), 419–440 (2004). https://doi.org/10.1007/s00440-003-0308-9


  19. Mézard, M., Parisi, G.: On the solution of the random link matching problems. J. Phys. France 48(9), 1451–1459 (1987). https://doi.org/10.1051/jphys:019870048090145100


  20. Marinari, E., Semerjian, G.: On the number of circuits in random graphs. J. Stat. Mech. Theory Exp. 2006(06), P06019 (2006)


  21. Moharrami, M., Moore, C., Xu, J.: The planted matching problem: phase transitions and exact results. Ann. Appl. Probab. 31(6), 2663–2720 (2021). https://doi.org/10.1214/20-AAP1660

  22. Moore, C.: The computer science and physics of community detection: landscapes, phase transitions, and hardness. (2017). arXiv:1702.00467

  23. Nair, C., Prabhakar, B., Sharma, M.: Proofs of the Parisi and Coppersmith–Sorkin random assignment conjectures. Random Struct. Algorithms 27(4), 413–444 (2005). https://doi.org/10.1002/rsa.20084


  24. Robbins, H.: A remark on Stirling’s formula. Am. Math. Mon. 62(1), 26–29 (1955)


  25. Semerjian, G., Sicuro, G., Zdeborová, L.: Recovery thresholds in the sparse planted matching problem. Phys. Rev. E 102, 022304 (2020). https://doi.org/10.1103/PhysRevE.102.022304


  26. Sicuro, G., Zdeborová, L.: The planted \( k \)-factor problem (2020). arXiv:2010.13700

  27. Walkup, D.W.: On the expected value of a random assignment problem. SIAM J. Comput. 8(3), 440–442 (1979). https://doi.org/10.1137/0208036


  28. Wang, H., Wu, Y., Xu, J., Yolou, I.: Random graph matching in geometric models: the case of complete graphs. In: Conference on Learning Theory, pp. 3441–3488. PMLR (2022)

  29. Wästlund, J.: An easy proof of the \(\zeta (2)\) limit in the random assignment problem. Electron. Commun. Probab. 14, 261–269 (2009). https://doi.org/10.1214/ECP.v14-1475


  30. Wu, Y., Xu, J.: Statistical problems with planted structures: information-theoretical and computational limits. In: Eldar, Y., Rodrigues, M. (eds.) Information-Theoretic Methods in Data Science. Cambridge University Press, Cambridge (2020)



Acknowledgements

JX would like to thank Cristopher Moore for many inspiring discussions on the posterior sampling. JX is also grateful to Guilhem Semerjian, Gabriele Sicuro, and Lenka Zdeborová for sharing the early draft of [25]. This work was done in part while the authors were participating in a program at the Simons Institute for the Theory of Computing.

Author information


Corresponding author

Correspondence to Dana Yang.

Additional information


Jian Ding is supported by the NSF Grants DMS-1757479 and DMS-1953848. Yihong Wu is supported in part by the NSF Grant CCF-1900507, an NSF CAREER award CCF-1651588, and an Alfred Sloan fellowship. Jiaming Xu is supported by NSF grants IIS-1838124, CCF-1850743, CCF-1856424 and an NSF CAREER award CCF-2144593.

Appendices

Appendix A Large deviation estimates

Lemma 13

Let \({\mathcal {P}},{\mathcal {Q}}\) be two probability distributions such that \({\mathcal {P}}\ll {\mathcal {Q}}\). Let \(X_i\)’s and \(Y_i\)’s be two independent sequences of random variables, where \(X_i\)’s are i.i.d. copies of \( \log ({\mathcal {P}}/{\mathcal {Q}})\) under distribution \({\mathcal {P}}\) and \(Y_i \)’s are i.i.d. copies of \(\log ({\mathcal {P}}/{\mathcal {Q}})\) under distribution \({\mathcal {Q}}\). For all \(x\ge 0\) and positive integer \(\ell \), we have

$$\begin{aligned} \mathbb {P}\left\{ \sum _{i=1}^\ell (Y_i - X_i) \ge x\ell \right\} \le \exp \left( -\ell (\alpha +x/2) \right) , \end{aligned}$$
(83)

where \(\alpha = -2\log B({\mathcal {P}},{\mathcal {Q}})\) as defined in (14). If, in addition, \({\mathcal {Q}}\ll {\mathcal {P}}\), then for all \(0\le x\le D({\mathcal {P}}\Vert {\mathcal {Q}})+D({\mathcal {Q}}\Vert {\mathcal {P}})\) and positive integer \(\ell \), we have

$$\begin{aligned} \mathbb {P}\left\{ \sum _{i=1}^\ell (Y_i - X_i) \ge x\ell \right\} \ge \exp \left( -\ell (\alpha +x+o_\ell (1))\right) . \end{aligned}$$
(84)

Proof

The proof of (83) follows from standard large deviation analysis (cf. [3, Appendix B]). Let F denote the Legendre transform of the log moment generating function of \(Y_1-X_1\), i.e.,

$$\begin{aligned} F(x) = \sup _{\theta \ge 0} \left\{ \theta x - \psi _{\mathcal {P}}(-\theta ) -\psi _{\mathcal {Q}}(\theta ) \right\} , \end{aligned}$$

where \(\psi _{\mathcal {P}}(\theta )=\log \mathbb {E}\!\left[ e^{\theta X_1} \right] \) and \(\psi _{\mathcal {Q}}(\theta )=\log \mathbb {E}\!\left[ e^{\theta Y_1} \right] \). Then from the Chernoff bound we have the following large deviation inequality:

$$\begin{aligned} \mathbb {P}\left\{ \sum _{i=1}^\ell (Y_i - X_i) \ge x\ell \right\} \le \exp \left( -\ell F(x) \right) . \end{aligned}$$
(85)

Note the following facts:

  1. 1.

    \(F(0) = - \psi _{\mathcal {P}}( -1/2) - \psi _{\mathcal {Q}}( 1/2)= - 2 \log \int \sqrt{{\mathcal {P}}{\mathcal {Q}}}=\alpha \);

  2. 2.

    \(F(x) \ge F(0) +x/2\), for all \(x\ge 0\).

Combining these facts with (85) yields (83).
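As a numerical illustration of (83) (ours, not from the paper), one can instantiate \({\mathcal {P}}= N(\mu ,1)\) and \({\mathcal {Q}}= N(0,1)\). Then \(\log ({\mathcal {P}}/{\mathcal {Q}})(w) = \mu w - \mu ^2/2\) is Gaussian under both distributions, \(B({\mathcal {P}},{\mathcal {Q}}) = e^{-\mu ^2/8}\), so \(\alpha = \mu ^2/4\), and the left-hand side of (83) is available in closed form:

```python
import math

def lhs_exact(mu: float, ell: int, x: float) -> float:
    # With P = N(mu,1), Q = N(0,1): log(dP/dQ)(w) = mu*w - mu^2/2, so
    # X_i ~ N(+mu^2/2, mu^2) under P and Y_i ~ N(-mu^2/2, mu^2) under Q,
    # hence sum_i (Y_i - X_i) ~ N(-ell*mu^2, 2*ell*mu^2).
    z = math.sqrt(ell / 2.0) * (x + mu**2) / mu   # standardized threshold
    return 0.5 * math.erfc(z / math.sqrt(2.0))    # exact Gaussian upper tail

def chernoff_bound(mu: float, ell: int, x: float) -> float:
    alpha = mu**2 / 4.0    # alpha = -2 log B(P,Q), since B = exp(-mu^2/8) here
    return math.exp(-ell * (alpha + x / 2.0))

# the bound (83) holds for every x >= 0
for x in (0.0, 0.2, 0.5):
    assert lhs_exact(1.0, 20, x) <= chernoff_bound(1.0, 20, x)
```

The gap between the exact tail and the bound reflects the slack \(F(x) \ge F(0) + x/2\) in fact 2 above.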

Next we prove (84). By Cramér’s theorem, we have

$$\begin{aligned} \mathbb {P}\left\{ \sum _{i=1}^\ell (Y_i - X_i) \ge x\ell \right\} = \exp \left( -F(x)\ell + o(\ell ) \right) , \end{aligned}$$
(86)

where \(o(\ell )/\ell \) converges to 0 as \(\ell \) grows to infinity. Second, we have

$$\begin{aligned} F(x) \le \alpha + x, \quad \forall 0 \le x \le D({\mathcal {P}}\Vert {\mathcal {Q}}) + D({\mathcal {Q}}\Vert {\mathcal {P}}). \end{aligned}$$

To see this, note that \(\psi _{\mathcal {P}}(\theta ) = \psi _{\mathcal {Q}}(1+\theta )\). Thus, the optimal \(\theta \) is given by

$$\begin{aligned} x + \psi '_{\mathcal {Q}}(1-\theta ) - \psi _{\mathcal {Q}}'(\theta ) =0. \end{aligned}$$

Note that \(\psi _{\mathcal {Q}}'(0)=-D({\mathcal {Q}}\Vert {\mathcal {P}})\) and \(\psi _{\mathcal {Q}}'(1)=D({\mathcal {P}}\Vert {\mathcal {Q}})\). Moreover, since \(\psi _{\mathcal {Q}}(\theta )\) is convex, it follows that \(\psi _{\mathcal {Q}}'(\theta )\) is non-decreasing in \(\theta \). Thus the optimal \(\theta \) must lie in [1/2, 1] when \(0 \le x \le D({\mathcal {P}}\Vert {\mathcal {Q}}) + D({\mathcal {Q}}\Vert {\mathcal {P}})\). Hence,

$$\begin{aligned} F(x)= & {} \sup _{\theta \in [1/2, 1] } \left\{ \theta x - \psi _{\mathcal {P}}(-\theta ) -\psi _{\mathcal {Q}}(\theta ) \right\} \le x + \sup _{\theta \in [1/2, 1] } \left\{ - \psi _{\mathcal {P}}(-\theta ) -\psi _{\mathcal {Q}}(\theta ) \right\} \\= & {} x+F(0) = x+ \alpha . \end{aligned}$$

Combine with (86) to finish the proof of (84). \(\square \)

Appendix B Erlang distribution and Chernoff bounds

The sum of \(\ell \) i.i.d. \(\exp (\lambda )\) random variables has an Erlang distribution with parameters \(\ell \) and \(\lambda \), denoted by \(\text {Erlang}(\ell ,\lambda )\), whose density is given by

$$\begin{aligned} f(x) = \frac{\lambda ^{\ell } x^{\ell -1} e^{-\lambda x}}{(\ell -1) !}, \quad x \ge 0. \end{aligned}$$
(87)

Theorem 8

Let \(X_i {{\mathop {\sim }\limits ^{\text {i.i.d.}}}}\exp (1)\). Then

$$\begin{aligned}&\mathbb {P}\left\{ \sum _{i=1}^n X_i \ge n \xi \right\} \le \exp \left( -n \left( \xi - \log (\xi ) -1 \right) \right) , \quad \forall \xi >1 \\&\mathbb {P}\left\{ \sum _{i=1}^n X_i \le n \xi \right\} \le \exp \left( -n \left( \xi - \log (\xi ) -1 \right) \right) , \quad \forall \xi <1 \end{aligned}$$
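The upper-tail bound of Theorem 8 can be sanity-checked by Monte Carlo; the sketch below (ours, not part of the paper) compares an empirical tail probability with the Chernoff bound:

```python
import math
import random

random.seed(0)

def erlang_tail_mc(n: int, xi: float, trials: int = 100_000) -> float:
    # Monte Carlo estimate of P{ X_1 + ... + X_n >= n*xi }, X_i i.i.d. Exp(1);
    # the sum is Erlang(n, 1), cf. the density (87)
    hits = sum(
        sum(random.expovariate(1.0) for _ in range(n)) >= n * xi
        for _ in range(trials)
    )
    return hits / trials

n, xi = 20, 1.5                                    # xi > 1: upper-tail regime
bound = math.exp(-n * (xi - math.log(xi) - 1.0))   # Chernoff bound of Theorem 8
est = erlang_tail_mc(n, xi)
assert est <= bound
```

The exponent \(\xi - \log \xi - 1\) is the Legendre transform of the log moment generating function of \(\exp (1)\), so the bound is tight on the exponential scale.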

Appendix C Exp-minus-one random bridge

Let \(X_1, X_2, \ldots , X_\ell {{\mathop {\sim }\limits ^{\text {i.i.d.}}}}\exp (\mu )\) and let \(X=\sum _{i=1}^\ell X_i\). Recall from Lemma 8 the exp-minus-one \(\ell \)-bridge R is defined as

$$\begin{aligned} R_j=\sum _{i=1}^j \left( \frac{X_i}{X} \ell -1 \right) , \quad 0 \le j \le \ell . \end{aligned}$$

Define \(\textsf{dev}(R)=\max _{0 \le j \le \ell } \left| R_j \right| \). The following result, adapted from [8, Lemma 3.3] (which is a slight extension of [7, Lemma 3.2]), bounds the probability that \(\textsf{dev}(R) \le A\), conditional on the total weight \(\sum _{i=1}^\ell X_i\) and on the values of \(X_i\) for indices i in a union of intervals, in terms of the unconditional probability \(p_\ell \triangleq \mathbb {P}\left\{ \textsf{dev}(R) \le A \right\} \). This result is crucial for the second moment computation in Sect. 7.3.
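For intuition, the bridge R and the unconditional probability \(p_\ell \) can be simulated directly. In the sketch below (ours), the parameter choices \(\ell = 100\) and \(A = 10\) are purely illustrative; they satisfy the constraint \(1 \le A\rho \le \sqrt{\ell }\) of Lemma 14 below with \(\rho = 1\):

```python
import random

random.seed(1)

def dev_of_bridge(ell: int, mu: float = 1.0) -> float:
    # Sample R_j = sum_{i<=j} (X_i*ell/X - 1) with X_i i.i.d. Exp(mu) and
    # X = sum_i X_i, and return dev(R) = max_{0<=j<=ell} |R_j|.
    # By construction R_0 = R_ell = 0 (a "bridge"), and dev(R) does not
    # depend on mu since the scale cancels in X_i/X.
    xs = [random.expovariate(mu) for _ in range(ell)]
    total = sum(xs)
    r, dev = 0.0, 0.0
    for x in xs:
        r += x * ell / total - 1.0
        dev = max(dev, abs(r))
    return dev

# crude Monte Carlo estimate of p_ell = P{ dev(R) <= A }
ell, A, trials = 100, 10.0, 5000
p_hat = sum(dev_of_bridge(ell) <= A for _ in range(trials)) / trials
```

Since the partial sums fluctuate on the scale \(\sqrt{\ell }\), taking \(A\) of order \(\sqrt{\ell }\) keeps \(p_\ell \) of constant order.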

Lemma 14

Let \(1/4 \le \rho \le 1\) and \(1 \le A \rho \le \sqrt{\ell }\). Consider the integer intervals \([a_1, b_1], \ldots , [a_k, b_k]\) such that \(1 \le a_1 \le b_1 \le \cdots \le a_k \le b_k \le \ell \) and \(m=\sum _{i=1}^k (b_i-a_i +1)\). Write \(J= \cup _{i=1}^k [a_i, b_i]\). Then

$$\begin{aligned} \mathbb {P}\left\{ \textsf{dev}(R) \le A \mid \sum _{i=1}^\ell X_i = \rho \ell , \{ X_j= x_j, j \in J \} \right\} \le c_1 A \sqrt{m \wedge (\ell -m) } p_\ell 10^{100 k A} e^{c_0 m/A^2}, \end{aligned}$$

where \(c_0, c_1>0\) are two universal constants.

Appendix D Minimum-weight matching for the exponential model

In this section, we prove the positive part (12) in Theorem 3. Namely, in the complete graph case with exponentially distributed weights, when \(\lambda =4-\epsilon \), the minimum-weight matching \({\widehat{M}}_{\textsf{ML}}\) (linear assignment), which corresponds to maximum likelihood estimation, misclassifies at most an \(O\left( \frac{1}{\epsilon ^3} e^{-\frac{2\pi }{\sqrt{\epsilon }}} \right) \) fraction of edges on average. Together with the negative part (11) in Theorem 3, this shows that the minimum-weight matching achieves the optimal error exponent of order \(1/\sqrt{\epsilon }\). Prior work [21] provides an exact characterization of the asymptotic error of \({\widehat{M}}_{\textsf{ML}}\) in terms of a system of ordinary differential equations when \(\lambda <4\). Our proof proceeds by analyzing this system of ODEs when \(\lambda =4-\epsilon \) for small \(\epsilon \), and is inspired by the heuristic arguments in [25, Section VI].

Proof of Theorem 3: positive part

First, it has been shown in [21, Theorem 2] that

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}\!\left[ \ell \left( M^*,{\widehat{M}}_{\textsf{ML}}\right) \right] =4 \int _0^\infty \left( 1-U(x) V(x) \right) \left( 1-\left( 1- U(x) \right) W(x) \right) V(x) W(x) \,dx, \end{aligned}$$

where \((U,V,W)\) is the unique solution to the following system of equations

$$\begin{aligned} \begin{aligned}&\frac{\textrm{d}U}{\textrm{d}x} = - \lambda U(1-U ) + (1- UV) \left( 1- (1-U) W \right) \\&\frac{\textrm{d}V}{\textrm{d}x} = \lambda V (1-U) \\&\frac{\textrm{d}W}{\textrm{d}x} = - \lambda W U \end{aligned} \end{aligned}$$
(88)

with initial condition

$$\begin{aligned}&U(0) = \frac{1}{2}, \quad V(0)=W(0)=\delta , \quad \delta \in (0,1) , \end{aligned}$$
(89)

and \(\delta \) is the unique value in (0, 1) such that \(U(x), V(x) \rightarrow 1\) as \(x \rightarrow +\infty \).

Furthermore, it has been shown in [21, Section B] that \(UV<1,\) \((1-U) W<1,\) \(0<U,V,W<1\), and

$$\begin{aligned} V(x)&=\delta \exp \left( \lambda \int _{0}^x \left( 1- U(y) \right) \textrm{d}y\right) , \end{aligned}$$
(90)
$$\begin{aligned} W(x)&= V(x) \,e^{-\lambda x}. \end{aligned}$$
(91)

Therefore, we have that

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}\!\left[ \ell \left( M^*,{\widehat{M}}_{\textsf{ML}}\right) \right] \le 4 \int _0^\infty V^2(x) e^{-\lambda x} \,dx. \end{aligned}$$

Let \(x_0=\inf \{x\ge 0: U(x) < \frac{1}{2}\}\) denote the first time that U(x) crosses 1/2. Since \(U(y) \ge \frac{1}{2}\) for all \(0 \le y \le x_0\), it follows from (90) that \(V(x) \le \delta e^{\lambda x/2}\) for all \(0 \le x \le x_0\). Hence for any \(0 \le \tau \le x_0\),

$$\begin{aligned} \int _0^\infty V^2(x) e^{-\lambda x} \,dx&= \int _0^{\tau } V^2(x) e^{-\lambda x} \,dx + \int _{\tau }^{\infty } V^2(x) e^{-\lambda x} \,dx \\&\le \delta ^2 \tau + \int _{\tau }^{\infty } e^{-\lambda x} \,dx \\&= \delta ^2 \tau + \frac{1}{\lambda } e^{-\lambda \tau }. \end{aligned}$$

We claim that \(\lambda x_0 \ge 2 \log \frac{\epsilon }{8\delta }\). Suppose not. Then \(e^{\lambda x_0/2} < \frac{\epsilon }{8\delta }\), in particular, \(x_0<+\infty .\) By continuity, we have \(U(x_0)=\frac{1}{2}\), \(U'(x_0) \le 0\). But we have

$$\begin{aligned} U'(x_0) =&~ -\lambda U(x_0) \left( 1-U(x_0) \right) + \left( 1-U(x_0) V(x_0) \right) \left( 1-\left( 1-U(x_0) \right) W(x_0) \right) \\ =&~ -\lambda /4 + \left( 1- \frac{1}{2} V(x_0) \right) \left( 1- \frac{1}{2} W(x_0) \right) \\ \ge&~ -\lambda /4 + 1 - V(x_0) \\ \ge&~ \epsilon /4 - \delta e^{\lambda x_0/2}> \epsilon /8 > 0, \end{aligned}$$

which is the needed contradiction. Therefore, by setting \(\tau = \frac{2}{\lambda } \log \frac{\epsilon }{8\delta }\), we get that

$$\begin{aligned} \int _0^\infty V^2(x) e^{-\lambda x} \,dx \le \frac{2 \delta ^2}{\lambda } \log \frac{\epsilon }{8\delta } + \frac{1}{\lambda } \left( \frac{8\delta }{\epsilon } \right) ^2 \le \frac{c}{\epsilon ^3} \exp \left( - \frac{2\pi }{\sqrt{\epsilon }} \right) , \end{aligned}$$

where \(c>0\) is a universal constant and the last inequality follows from the claim that

$$\begin{aligned} \delta \le \frac{c'}{\sqrt{ \epsilon } } \exp \left( - \frac{\pi }{\sqrt{\epsilon }} \right) , \end{aligned}$$
(92)

where \(c'>0\) is a universal constant.

It remains to prove (92). In view of \(V<1\) and (90), we have that

$$\begin{aligned} \delta \exp \left( \lambda \int _{0}^\infty \left( 1- U(y) \right) \textrm{d}y\right) \le 1. \end{aligned}$$
(93)

To proceed, we derive an upper bound to U(x). Let \({\widetilde{U}}(x)\) denote the unique solution of the following ODE:

$$\begin{aligned} \frac{\textrm{d}\widetilde{U}}{\textrm{d}x} = - \lambda {\widetilde{U}}(1-{\widetilde{U}}) + 1, \quad {\widetilde{U}}(0) = \frac{1}{2}. \end{aligned}$$
(94)

We claim that \(U(x) \le {\widetilde{U}}(x)\) for all \(x \ge 0\). To show this, let \(f={\widetilde{U}}-U\). Then we have (a) \(f(0)=0\); (b) \(f'(0)>0\); (c) \(f'(x)>0\) whenever \(f(x)=0\). Thus \(f(\delta _0)>0\) for some small \(\delta _0\). Let \(x_1 = \inf \{x \ge 0: f(x)<0 \}\). Suppose for the sake of contradiction that \(x_1<\infty \). Then \(x_1\) is the first time that f crosses zero. Thus \(f'(x_1) \le 0\). But by continuity we have \(f(x_1)=0\) and hence \(f'(x_1)>0\), which is a contradiction.

Solving the ODE (94), we get that

$$\begin{aligned} {\widetilde{U}}(x) = \frac{1}{2} + \frac{1}{2} \sqrt{ \frac{4-\lambda }{\lambda } } \tan \left( \frac{x}{2} \sqrt{\lambda (4-\lambda ) } \right) , \quad 0 \le x < \frac{\pi }{\sqrt{\lambda (4-\lambda ) }}. \end{aligned}$$

Therefore, for some \(0 \le a<\frac{\pi }{\sqrt{\lambda (4-\lambda ) }}\) to be determined,

$$\begin{aligned} \int _{0}^\infty \left( 1- U(y) \right) \textrm{d}y&\ge \int _{0}^{a} \left( 1- U(y) \right) \textrm{d}y\\&\ge \int _{0}^{a} \left( 1- {\widetilde{U}}(y) \right) \textrm{d}y\\&= \frac{a}{2} + \frac{1}{\lambda } \log \cos \left( \frac{a}{2} \sqrt{\lambda (4-\lambda ) } \right) , \end{aligned}$$

where the last equality holds because \(\int _{0}^x \tan (y) \textrm{d}y= - \log \cos (x)\) for \(0 \le x <\pi /2\). Combining the last displayed equation with (93) yields that

$$\begin{aligned} \delta \le \frac{1}{ \cos \left( \frac{a}{2} \sqrt{\lambda (4-\lambda ) } \right) } \exp \left( - \lambda \frac{a}{2} \right) . \end{aligned}$$

Recall that \(\lambda =4-\epsilon \). Choose \(a = \frac{ \pi -\sqrt{\epsilon } }{ \sqrt{\lambda (4-\lambda )} }\). Then we get

$$\begin{aligned} \delta \le \frac{1}{\sin \left( \frac{ \sqrt{ \epsilon } }{2} \right) } \exp \left( - \sqrt{\lambda } \frac{\pi - \sqrt{\epsilon } }{2 \sqrt{\epsilon } } \right) \le \frac{c'}{\sqrt{ \epsilon } } \exp \left( - \frac{\pi }{\sqrt{\epsilon }} \right) , \end{aligned}$$

for a universal constant \(c'>0\). \(\square \)
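As a side check (ours, not part of the proof), the closed-form solution of the comparison ODE (94) can be verified against a direct numerical integration:

```python
import math

def u_tilde_closed(x: float, lam: float) -> float:
    # closed-form solution of dU/dx = -lam*U*(1-U) + 1 with U(0) = 1/2,
    # valid for 0 <= x < pi / sqrt(lam*(4-lam)) when 0 < lam < 4
    s = math.sqrt(lam * (4.0 - lam))
    return 0.5 + 0.5 * math.sqrt((4.0 - lam) / lam) * math.tan(0.5 * s * x)

def u_tilde_rk4(x_end: float, lam: float, steps: int = 10_000) -> float:
    # classical fourth-order Runge-Kutta integration of the same ODE
    f = lambda u: -lam * u * (1.0 - u) + 1.0
    h, u = x_end / steps, 0.5
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return u

lam = 3.9                                           # i.e. epsilon = 0.1
x = 0.5 * math.pi / math.sqrt(lam * (4.0 - lam))    # halfway to the blow-up time
assert abs(u_tilde_rk4(x, lam) - u_tilde_closed(x, lam)) < 1e-8
```

The blow-up of \({\widetilde{U}}\) at \(x = \pi /\sqrt{\lambda (4-\lambda )}\) is exactly what drives the \(\exp (-\varTheta (1/\sqrt{\epsilon }))\) bound on \(\delta \) in (92).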

Appendix E Reduction arguments

1.1 E.1 Reduction from dense to sparse model

In this subsection, we prove Theorem 2 for the dense model by reducing it to the sparse model. Suppose Theorem 2 holds under the sparse model. Recall that in the dense model, the planted weight density is p(x) and the null weight density is \(q(x)=\frac{1}{d} \rho (\frac{x}{d})\), where p and \(\rho \) are fixed probability densities on \({\mathbb {R}}\) and \(d \rightarrow \infty \). In this case the impossibility condition (7) simplifies to:

$$\begin{aligned} \int _{-\infty }^\infty \sqrt{ p(x) \rho (0) } \textrm{d}x \ge 1+\epsilon . \end{aligned}$$
(95)

Define a sparse model with parameter \((d',{\mathcal {P}}',{\mathcal {Q}}')\), where

$$\begin{aligned} d'=\varGamma \rho (0)\left( 1-\frac{\epsilon }{2}\right) ^2, \quad {\mathcal {P}}'={\mathcal {P}}, \quad {\mathcal {Q}}'=\text {Unif}[-\varGamma /2,\varGamma /2], \end{aligned}$$

for some positive constant \(\varGamma \) to be specified. Note that \(d',{\mathcal {P}}',{\mathcal {Q}}'\) are all independent of n. Next we show that, given a graph \(G'\) drawn from the \((d',{\mathcal {P}}',{\mathcal {Q}}')\) model with planted matching \(M^*\), there exists a (randomized) mapping f such that \(G\triangleq f(G')\) is distributed according to the \((d,{\mathcal {P}},{\mathcal {Q}})\) model with the same planted matching \(M^*\). Crucially, such a mapping needs to be agnostic to the latent \(M^*\).

To construct the mapping f, we first express \({\mathcal {Q}}\) as a mixture of \({\mathcal {Q}}'\) and some other distribution \(\overline{{\mathcal {Q}}}\), that is, \({\mathcal {Q}}=t {\mathcal {Q}}'+(1-t)\overline{{\mathcal {Q}}}\), where \(t=\frac{d'}{d}=\frac{\varGamma }{d}\rho (0)\left( 1-\frac{\epsilon }{2}\right) ^2\) and \(\overline{{\mathcal {Q}}}\) has density \(\overline{q}(x) \triangleq \frac{ q(x)- tq'(x) }{1-t}\). We claim that \(\overline{q}\) is a well-defined density for large d. Indeed, note that \(q(x)=\frac{1}{d}\rho (\frac{x}{d})\), \(q'(x)={\textbf{1}_{\left\{ {|x|\le \varGamma /2}\right\} }}/\varGamma \) and \(d\rightarrow \infty \). By continuity of \(\rho \) at 0, \(\rho (0)(1-\frac{\epsilon }{2})^2\le \rho (\frac{x}{d})\) holds for all \(|x| \le \varGamma /2\), provided that d is sufficiently large. Thus \(tq'(x) \le q(x)\) for all x.

Next, given \(G'\), we generate a denser graph \(G=f(G')\) as follows. For each edge e in \(G'\), we leave its edge weight unchanged. For each edge e not in \(G'\), we connect it in G independently with probability r/n and draw its edge weight \(W_e \sim \overline{{\mathcal {Q}}}\), where \(r\triangleq \frac{d-d'}{1-d'/n}\). The choice of r is such that for each \(e\in [n]\times [n]'\backslash M^*\), the probability that e is an unplanted edge in G (namely, \(e\in E(G)\backslash M^*\)) equals \((1-\frac{d'}{n})\frac{r}{n}+\frac{d'}{n}=\frac{d}{n}\). Furthermore, for each unplanted edge e in G, it is an unplanted edge in \(G'\) with probability \(\frac{d'/n}{d/n}=\frac{d'}{d}=t\). As a result, conditioned on e being an unplanted edge in G, the distribution of \(W_e\) is \(t {\mathcal {Q}}' + (1-t) \overline{{\mathcal {Q}}}= {\mathcal {Q}}\). Also, by construction, conditioned on the true matching \(M^*\) of \(G'\), the edge weights of G are independent. In other words, \(G=f(G')\) is distributed according to the dense \((d,{\mathcal {P}},{\mathcal {Q}})\) model with \(M^*\) being the planted matching. Thus, it remains to establish impossibility for the sparse \((d',{\mathcal {P}}',{\mathcal {Q}}')\) model by verifying the condition of Theorem 2. Indeed,

$$\begin{aligned} \sqrt{d'} B({\mathcal {P}}',{\mathcal {Q}}')=&\sqrt{td}\int _{-\varGamma /2}^{\varGamma /2}\sqrt{p(x)\frac{1}{\varGamma }}dx = \left( 1-\frac{\epsilon }{2}\right) \sqrt{\rho (0)}\int _{-\varGamma /2}^{\varGamma /2}\sqrt{p(x)}dx\ge 1+\frac{\epsilon }{4}, \end{aligned}$$

where the last inequality holds as a consequence of the condition (95) and by choosing a large enough \(\varGamma \). This completes the proof of Theorem 2 for dense models.
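The mixture decomposition underlying the map f can be checked numerically. In the sketch below (ours), \(\rho \) is taken to be the standard normal density purely for illustration:

```python
import math

def rho(x: float) -> float:
    # toy choice for the null weight shape: standard normal density
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

d, gamma, eps = 10_000.0, 20.0, 0.1
t = (gamma / d) * rho(0.0) * (1.0 - eps / 2.0) ** 2   # t = d'/d

def q(x: float) -> float:            # dense-model null density q(x) = rho(x/d)/d
    return rho(x / d) / d

def q_prime(x: float) -> float:      # Q' = Unif[-gamma/2, gamma/2]
    return 1.0 / gamma if abs(x) <= gamma / 2.0 else 0.0

def q_bar(x: float) -> float:        # residual density: q = t*q' + (1-t)*q_bar
    return (q(x) - t * q_prime(x)) / (1.0 - t)

xs = [-gamma / 2.0 + k * gamma / 200.0 for k in range(201)]
# q_bar is nonnegative (this is where "d sufficiently large" is needed) ...
assert all(q_bar(x) >= 0.0 for x in xs)
# ... and the mixture identity recovers q pointwise
assert all(abs(t * q_prime(x) + (1.0 - t) * q_bar(x) - q(x)) < 1e-15 for x in xs)
```

The first assertion encodes the condition \(tq'(x) \le q(x)\), which holds here because \(\rho (x/d)\) is within a factor \((1-\epsilon /2)^2\) of \(\rho (0)\) on the window \(|x|\le \varGamma /2\) once d is large.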

1.2 E.2 Reduction for general weight distributions: positive result

In Sect. 2, we proved Theorem 4 under the assumption \({\mathcal {P}}\ll {\mathcal {Q}}\). Here, we prove the theorem for general \({\mathcal {P}},{\mathcal {Q}}\). Note that our main positive result Theorem 1 follows directly from Theorem 4, per the argument in Sect. 2.

Recall that in (4), f and g denote the densities of \({\mathcal {P}}\) and \({\mathcal {Q}}\) with respect to a common dominating measure \(\mu \), respectively. Under this general model, the maximum likelihood estimator \({\widehat{M}}_{\textsf{ML}}\) takes the form

$$\begin{aligned} {\widehat{M}}_{\textsf{ML}}\in \arg \max _{M\in {\mathcal {M}}} \prod _{e\in M}f(W_e)\prod _{e\notin M}g(W_e). \end{aligned}$$

Let \(p={\mathcal {P}}\{g>0\}\). If \(p=0\), then \({\mathcal {P}}\) and \({\mathcal {Q}}\) are mutually singular. In this case, it is easy to see that \({\widehat{M}}_{\textsf{ML}}=M^*\) with probability one. Next, suppose \(p>0.\) Let \({\mathcal {P}}'\) denote the distribution \({\mathcal {P}}\) conditioned on the support of \({\mathcal {Q}}\), with density \(f'=f{\textbf{1}_{\left\{ {g>0}\right\} }}/p\). Then we have \({\mathcal {P}}'\ll {\mathcal {Q}}\).

Let \(V=\{i\in [n]:g(W_{i,i'})>0\}\). Therefore, \(g(W_{i,i'})=0\) for all \(i\in V^c\). We first observe that if a perfect matching \(M\in {\mathcal {M}}\) does not contain all the red edges \(\{(i,i'): i\in V^c\}\), then it has a zero likelihood. Thus, \({\widehat{M}}_{\textsf{ML}}\) reduces to:

$$\begin{aligned} {\widehat{M}}_{\textsf{ML}}=\left\{ (i,i'):i\in V^c\right\} \cup \widehat{M}_{\textsf{ML},V}, \end{aligned}$$
(96)

where, with \({\mathcal {M}}_V\) denoting the set of perfect matchings on \(V\times V'\),

$$\begin{aligned} \widehat{M}_{\textsf{ML},V}\in \arg \max _{M\in {\mathcal {M}}_V} \prod _{e\in M}f(W_e)\prod _{e\notin M}g(W_e). \end{aligned}$$

In other words, \(\widehat{M}_{\textsf{ML},V}\) is the maximum likelihood estimator over the subgraph \(G'=G[V\times V']\). Moreover, conditional on the set V, all the red edge weights on \(G'\) are i.i.d. with distribution \({\mathcal {P}}'\), and all the blue edge weights are i.i.d. with distribution \({\mathcal {Q}}\). Denote by \(M^*_V=\{(i,i'):i\in V\}\) the true matching on \(G'\), and let \(n'=|V|\). We can therefore apply Theorem 4 to \(G'\) with \(B({\mathcal {P}}',{\mathcal {Q}})=B({\mathcal {P}},{\mathcal {Q}})/\sqrt{p}\). Under the condition \(\sqrt{n}B({\mathcal {P}},{\mathcal {Q}})\le 1+\epsilon \), we have

$$\begin{aligned} \sqrt{n'}B({\mathcal {P}}',{\mathcal {Q}})=\sqrt{n}B({\mathcal {P}},{\mathcal {Q}})\sqrt{\frac{n'}{np}}\le (1+\epsilon )\sqrt{\frac{n'}{np}}. \end{aligned}$$

Theorem 4 yields

$$\begin{aligned} \mathbb {E}\!\left[ |\widehat{M}_{\textsf{ML},V}\triangle M^*_V| \;\Biggr |\; V \right] \le Cn'\max \left\{ \log (1+\epsilon '),\sqrt{\frac{\log n'}{n'}}\right\} , \end{aligned}$$

where \(\epsilon '\) is such that

$$\begin{aligned} 2\log (1+\epsilon ')= 2\log (1+\epsilon )+\log \frac{n'}{np}. \end{aligned}$$

Thus,

$$\begin{aligned} \nonumber \mathbb {E}\!\left[ |\widehat{M}_{\textsf{ML},V}\triangle M^*_V| \;\Biggr |\; V \right] \le \,&\, C\max \left\{ n'\log (1+\epsilon )+\frac{n'}{2}\log \frac{n'}{np}, \sqrt{n'\log n'}\right\} \\ \le \,&\, C\max \left\{ n\log (1+\epsilon )+\frac{n'}{2}\log \frac{n'}{np}, \sqrt{n\log n}\right\} , \end{aligned}$$
(97)

where the last inequality follows from \(n'=|V|\le n\). Next, we average over V to obtain that

$$\begin{aligned} \mathbb {E}\!\left[ |\widehat{M}_{\textsf{ML},V}\triangle M^*_V| \right] \le \,&C\mathbb {E}\!\left[ \max \left\{ n\log (1+\epsilon )+\frac{n'}{2}\log \frac{n'}{np}, \sqrt{n\log n}\right\} \right] \\ \le \,&C\left[ n\log (1+\epsilon )+\mathbb {E}\!\left[ \frac{n'}{2}\log \frac{n'}{np} \right] +\sqrt{n\log n}\right] . \end{aligned}$$

Under our model, \(W_{i,i'}\sim {\mathcal {P}}\) for all i. Therefore \(n'= |V|\sim \textrm{Binom}(n,p)\). To bound \(\mathbb {E}\!\left[ n'\log n' \right] \), note that for any \(u>0\) and \(x > 0\), \(\log (x/u) \le x/u -1\), so that \(x \log x \le x^2/u + x \log (u/e)\) (which extends to \(x=0\) by continuity). Thus,

$$\begin{aligned} \mathbb {E}\!\left[ n' \log n' \right] \le \mathbb {E}\!\left[ (n')^2 \right] / u + \mathbb {E}\!\left[ n' \right] \log (u/e) \le \mathbb {E}\!\left[ n' \right] \log \frac{\mathbb {E}\!\left[ (n')^2 \right] }{\mathbb {E}\!\left[ n' \right] } \le np \log (n p + 1), \end{aligned}$$

where the first inequality holds by optimally choosing \(u=\mathbb {E}\!\left[ (n')^2 \right] /\mathbb {E}\!\left[ n' \right] .\) Hence, \(\mathbb {E}\!\left[ n'\log \frac{n'}{np} \right] \le np \log (1 + 1/(np) ) \le 1\) and consequently,

$$\begin{aligned} \mathbb {E}\!\left[ |\widehat{M}_{\textsf{ML},V}\triangle M^*_V| \right]\le & {} C \left[ n\log (1+\epsilon )+ 1/2+ \sqrt{n\log n}\right] \\\le & {} C_1 \max \left\{ n\log (1+\epsilon ), \sqrt{n\log n}\right\} \end{aligned}$$

for some universal constant \(C_1\). This finishes the proof of Theorem 4 for general \({\mathcal {P}},{\mathcal {Q}}\).
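The binomial moment bound used above is easy to confirm numerically. The sketch below (with illustrative values of n and p, not taken from the paper) evaluates \(\mathbb {E}[n'\log (n'/(np))]\) exactly from the binomial mass function and checks it against \(np\log (1+1/(np))\le 1\):

```python
import math

def binom_pmf(n, p, k):
    # Exact Binomial(n, p) probability mass at k.
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))

def expected_n_log_ratio(n, p):
    # E[ n' * log(n'/(n*p)) ] for n' ~ Binom(n, p); the k = 0 term vanishes.
    return sum(binom_pmf(n, p, k) * k * math.log(k / (n * p))
               for k in range(1, n + 1))

for n, p in [(50, 0.2), (200, 0.3), (1000, 0.7)]:
    lhs = expected_n_log_ratio(n, p)
    rhs = n * p * math.log(1 + 1 / (n * p))
    # matches E[n' log(n'/(np))] <= np log(1 + 1/(np)) <= 1
    assert lhs <= rhs <= 1
```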

E.3 Reduction for general weight distributions: negative results

In Appendix E.1 we have already reduced the impossibility result Theorem 2 from the dense model to the sparse model. In this section, we show that for the sparse model \((d,{\mathcal {P}},{\mathcal {Q}})\), we can assume WLOG that \({\mathcal {P}}\ll {\mathcal {Q}}\) and \({\mathcal {Q}}\ll {\mathcal {P}}\). In other words, assuming Theorem 2 holds under the sparse model with \({\mathcal {P}}\) and \({\mathcal {Q}}\) mutually absolutely continuous, we prove Theorem 2 for general (fixed) \({\mathcal {P}}\), \({\mathcal {Q}}\).

Recall that \(\mu \) is a common dominating measure of \({\mathcal {P}},{\mathcal {Q}}\), and f, g are the densities of \({\mathcal {P}},{\mathcal {Q}}\) respectively. As in the reduction argument for the positive result, we first define the distribution \({\mathcal {P}}'\) with density \(f{\textbf{1}_{\left\{ {g>0}\right\} }}/p\), and the distribution \({\mathcal {Q}}'\) with density \(g{\textbf{1}_{\left\{ {f>0}\right\} }}/q\), where \(p={\mathcal {P}}\{g>0\}\) and \(q={\mathcal {Q}}\{f>0\}\) are the normalizing constants. Note that since \(B({\mathcal {P}},{\mathcal {Q}})>0\), both p, q are strictly positive, so that \({\mathcal {P}}'\) and \({\mathcal {Q}}'\) are well-defined. Then we have \({\mathcal {P}}'\ll {\mathcal {Q}}'\) and \({\mathcal {Q}}'\ll {\mathcal {P}}'\).

We first reduce the original \((d,{\mathcal {P}},{\mathcal {Q}})\) model to the \((d', {\mathcal {P}},{\mathcal {Q}}')\) model, where \(d'=dq\). Note that for edges e that either do not appear in G (in which case recall that we set \(W_e =\star \) for a special symbol \(\star \) signifying that e is not in G), or are such that \(f(W_e)=0\), we have \((f/g)(W_e)=0\), so such edges carry zero posterior probability of belonging to \(M^*\). Moreover, under \({\mathcal {Q}}\) the event \(\{f(W_e)>0\}\) has probability q, and conditional on this event the weight follows \({\mathcal {Q}}'\); hence a blue edge with f-positive weight appears with probability \(dq/n=d'/n\). Thus the posterior distribution of \(M^*\) under the \((d,{\mathcal {P}},{\mathcal {Q}})\) model is identical to the posterior distribution under the \((d', {\mathcal {P}},{\mathcal {Q}}')\) model.

Therefore, we only need to show that the conclusion of Theorem 2 holds for graph G with weights \((W_e)\) that follow the \((d', {\mathcal {P}},{\mathcal {Q}}')\) model. Let \(V=\{i\in [n]:g(W_{i,i'})>0\}\). We can bound the optimal overlap as follows:

$$\begin{aligned} \sup _{\widehat{M}}\mathbb {E}\left( \sum _{i\le n}{\textbf{1}_{\left\{ {(i,i') \in \widehat{M}}\right\} }}\right) \le \,&\sup _{\widehat{M}}\left[ \mathbb {E}\left( |V^c|\right) +\mathbb {E}\left( \sum _{i\in V}{\textbf{1}_{\left\{ {(i,i') \in \widehat{M}}\right\} }}\right) \right] \nonumber \\ \le \,&n{(1-p)}+\mathbb {E}\left[ \sup _{\widehat{M}}\mathbb {E}\left( \sum _{i\in V}{\textbf{1}_{\left\{ {(i,i') \in \widehat{M}}\right\} }}\;\bigg |\;V\right) \right] . \end{aligned}$$
(98)

Conditional on V, the subgraph \(\widetilde{G}=G[V\times V']\) follows the \((d'|V|/n, {\mathcal {P}}',{\mathcal {Q}}')\) model. To see that, note that the red edges in \(\widetilde{G}\) are distributed i.i.d. \({\mathcal {P}}'\); the blue edges follow distribution \({\mathcal {Q}}'\), and appear independently with probability \(d'/n = (d'|V|/n)/|V|\). To apply the impossibility result under the \((d'|V|/n, {\mathcal {P}}',{\mathcal {Q}}')\) model, we need to establish a lower bound on |V|. Let

$$\begin{aligned} {\mathcal {A}}=\left\{ |V|\ge \left( 1-\frac{\epsilon }{2}\right) ^2pn\right\} . \end{aligned}$$

Recall that under the sparse model, \({\mathcal {P}},{\mathcal {Q}}\) are fixed distributions that do not depend on n. Therefore p does not depend on n, and since \(|V|\sim \textrm{Binom}(n,p)\), we have \(\mathbb {P}({\mathcal {A}})=1-o(1)\).
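Concretely (a standard Chernoff–Hoeffding lower-tail estimate, spelled out here for completeness), with \(t=1-(1-\epsilon /2)^2\in (0,1)\),

$$\begin{aligned} \mathbb {P}\left( {\mathcal {A}}^c\right) =\mathbb {P}\left\{ \textrm{Binom}(n,p)<(1-t)np\right\} \le \exp \left( -\frac{t^2np}{2}\right) =\exp (-\varOmega (n)), \end{aligned}$$

since p and \(\epsilon \) are constants independent of n.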

Furthermore, note that

$$\begin{aligned} B({\mathcal {P}}',{\mathcal {Q}}')=\frac{\int \sqrt{ fg } \textrm{d}\mu }{\sqrt{pq}} = \frac{B({\mathcal {P}},{\mathcal {Q}})}{\sqrt{pq}}. \end{aligned}$$
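Spelling out this computation from the definitions of \({\mathcal {P}}'\) and \({\mathcal {Q}}'\):

$$\begin{aligned} B({\mathcal {P}}',{\mathcal {Q}}')=\int \sqrt{\frac{f{\textbf{1}_{\left\{ {g>0}\right\} }}}{p}\cdot \frac{g{\textbf{1}_{\left\{ {f>0}\right\} }}}{q}}\,\textrm{d}\mu =\frac{1}{\sqrt{pq}}\int _{\{fg>0\}}\sqrt{fg}\,\textrm{d}\mu =\frac{1}{\sqrt{pq}}\int \sqrt{fg}\,\textrm{d}\mu , \end{aligned}$$

where the last equality holds because \(\sqrt{fg}\) vanishes off the set \(\{fg>0\}\).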

Therefore on \({\mathcal {A}}\), we have

$$\begin{aligned} \sqrt{\frac{d'|V|}{n} } B \left( {\mathcal {P}}', {\mathcal {Q}}' \right) =\sqrt{\frac{d|V|}{np} } B \left( {\mathcal {P}}, {\mathcal {Q}}\right) \ge (1+\epsilon )(1-\epsilon /2)\ge 1+\epsilon /3 \end{aligned}$$

for \(\epsilon \le 1/3\). Therefore, we can invoke the conclusion of Theorem 2 under the \((d'|V|/n, {\mathcal {P}}',{\mathcal {Q}}')\) model to deduce that on \({\mathcal {A}}\), for some constant \(c>0\) and large enough n,

$$\begin{aligned} \sup _{\widehat{M}}\mathbb {E}\left( \sum _{i\in V}{\textbf{1}_{\left\{ {(i,i') \in \widehat{M}}\right\} }} \;\bigg |\;V\right) \le (1-c)|V|, \end{aligned}$$

where the supremum is over all (possibly random) mappings from \(G[V\times V']\) to \({\mathcal {M}}_V\), the set of perfect matchings on \(V\times V'\). Conditional on V, the edges in \(G[V\times V']\) are independent of those in \(G\setminus G[V\times V']\). Thus, we can replace the range of the supremum with all (random) mappings from G to \({\mathcal {M}}_V\) without changing its value, allowing us to continue upper bounding (98) with

$$\begin{aligned} n(1-p) + \mathbb {E}\left[ (1-c)|V|\textbf{1}_{\mathcal {A}}\right] + n\mathbb {P}\left( {\mathcal {A}}^c\right)\le & {} \left( (1-p) + (1-c)p+o(1)\right) n \\= & {} \left( 1-cp+o(1)\right) n. \end{aligned}$$

In other words, the fraction of misclassified edges is lower bounded by \(cp+o(1)\). Under the sparse model, cp is a positive constant that only depends on \(d,{\mathcal {P}},{\mathcal {Q}}\). We have proved Theorem 2 under the \((d',{\mathcal {P}},{\mathcal {Q}}')\) model. The conclusion of Theorem 2 under the \((d,{\mathcal {P}},{\mathcal {Q}})\) model immediately follows.

Appendix F Finite-order phase transition under the unweighted model

In this section, we prove that in the unweighted case \({\mathcal {P}}={\mathcal {Q}}\), under the impossibility condition (7), which translates to \(\sqrt{d}\ge 1+\epsilon \), the minimal reconstruction error is \(\varOmega (\epsilon ^8)\) for small \(\epsilon \). As in the proof of Theorem 2, we can assume WLOG that \(\sqrt{d}=1+\epsilon \). As in the proof under general weight distributions, this result is again proven via the two-stage cycle-finding scheme described in Algorithm 1. Recall from Algorithm 1 that we reserve a set V of \(\gamma n\) left vertices from [n]. We will first construct many disjoint alternating paths on the subgraph \(G_1=G[V^c\times (V^c)']\), and then use the reserved vertices to connect the paths into alternating cycles.

F.1 Path construction

The sets \(L_k\) and \(R_k\) and the alternating paths will again be constructed using two-sided trees. However, unlike the model with general weight distribution, we do not need to keep track of the weights of the paths, hence the construction is much simpler. For example, the leaf node selection step is no longer necessary. Moreover, in the previous sections we needed the paths to be long enough, so that the weights on the paths dominate the weights of the sprinkling edges. In the unweighted case, that restriction is also lifted. Therefore we can define \(L_k\) (resp. \(R_k\)) to be all the left (resp. right) vertices in the left (resp. right) subtree, instead of only the selected leaf nodes. The detailed construction is given in Algorithm 4 below.

[Algorithm 4 is presented as a figure in the published version and is not reproduced here.]

Note that Algorithm 4 explores \(\gamma n\) pairs of vertices in total. Next, we show that with high probability, the size \(K_1\) of the set \({\mathcal {K}}_1\) is at least \(c_3n/\ell \) for some constant \(c_3=\varOmega (\epsilon ^3)\). That is, a constant proportion of two-sided trees contain exactly \(2\ell \) pairs of vertices in both the left and the right tree.

By construction, for each k, \(|L_k|\) can only be strictly smaller than \(2\ell \) if the breadth-first search cannot find more vertices to explore, namely, the branching process dies. Since the number \(|{\mathcal {U}}|\) of unused vertices to explore is at least \((1-2\gamma )n\), we have

$$\begin{aligned} \mathbb {P}\left\{ |L_k|<2\ell \right\} \le \,&\text { Probability of extinction for a branching process}\\&\text {with offspring distribution }\textrm{Binom}((1-2\gamma )n, d/n)=: 1-c_4. \end{aligned}$$

By choosing \(\gamma =\epsilon /2\), the mean of the offspring distribution is \((1-2\gamma )d =(1-\epsilon )(1+\epsilon )^2\ge 1+\epsilon /2\) for \(\epsilon \) smaller than some universal constant \(\epsilon _0\). Therefore, by standard branching process theory (see e.g. [12, Theorem 25.1]), \(1-c_4\) is the unique solution \(\rho <1\) of \(\phi (\rho )=\rho \), where \(\phi (\rho )=\mathbb {E}\!\left[ \rho ^X \right] \) with \(X \sim \textrm{Binom}((1-2\gamma )n, d/n)\). In particular, the survival probability \(c_4\) is strictly positive, and it can be further shown that \(c_4\ge c_5 \epsilon \) for some universal constant \(c_5\) and all small enough \(\epsilon \).
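Since \(\textrm{Binom}((1-2\gamma )n, d/n)\) is well approximated by a Poisson distribution with the same mean, the linear scaling \(c_4=\varTheta (\epsilon )\) can be checked numerically from the fixed-point characterization. The sketch below (an illustration under the Poisson approximation, not part of the proof) iterates the probability generating function from 0, which converges to the smallest fixed point, i.e. the extinction probability:

```python
import math

def extinction_prob(m, iters=100000):
    # Smallest fixed point of the Poisson(m) pgf phi(rho) = exp(m*(rho - 1));
    # iterating from rho = 0 converges monotonically to the extinction probability.
    rho = 0.0
    for _ in range(iters):
        rho = math.exp(m * (rho - 1.0))
    return rho

for eps in [0.2, 0.1, 0.05, 0.02]:
    m = 1 + eps / 2                  # offspring mean (1 - 2*gamma)*d, up to O(eps^2)
    c4 = 1 - extinction_prob(m)      # survival probability
    # survival probability scales linearly in eps, consistent with c4 >= c5 * eps
    assert 0.5 * eps < c4 < 2 * eps
```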

Note that the argument would be simplified if the events \(\{|L_k|=2\ell \}\) and \(\{|R_k|=2\ell \}\) were independent. This, however, is not true since when constructing \(R_k\), the number of unexplored vertices depends on \(|L_k|\). To resolve this technicality, note that there are always at least \((1-2\gamma )n\) unused vertices, and that the construction never reuses vertices. Therefore we can couple the construction of \(L_k\), \(R_k\) with two independent branching processes, each with offspring distribution \(\textrm{Binom}((1-2\gamma )n, d/n)\), such that

$$\begin{aligned} \mathbb {P}\left\{ |L_k|=|R_k|=2\ell \right\} \ge \mathbb {P}\left\{ \text {both branching processes survive}\right\} = c_4^2. \end{aligned}$$

Since the algorithm uses \(\gamma n\) pairs of vertices in total, and at most \(2\ell \) pairs are used for each k, we have \(K\triangleq \text {the total number of two-sided trees}\ge \gamma n/(2\ell )\). Thus

$$\begin{aligned} \mathbb {P}\left\{ \sum _{k\le K}{\textbf{1}_{\left\{ {|L_k|=|R_k|=2\ell }\right\} }}<\frac{c_4^2 \gamma n}{4\ell }\right\} \le \,&\mathbb {P}\left\{ \textrm{Binom}(K, c_4^2)<\frac{c_4^2 \gamma n}{4\ell }\right\} \\ \le \,&\exp \left( -\frac{c_4^4\gamma }{4\ell }n\right) \end{aligned}$$

by Hoeffding’s inequality. We have shown that with probability \(1-\exp (-\varOmega (n))\), \(K_1\ge c_3n/\ell \), with constant \(c_3\triangleq c_4^2 \gamma /4 \ge c_5^2 \epsilon ^3/8\).
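For completeness, the Hoeffding step reads as follows: since \(K\ge \gamma n/(2\ell )\), we have \(c_4^2\gamma n/(4\ell )\le Kc_4^2/2\), and the one-sided Hoeffding bound gives

$$\begin{aligned} \mathbb {P}\left\{ \textrm{Binom}(K,c_4^2)<\frac{Kc_4^2}{2}\right\} \le \exp \left( -2K\left( \frac{c_4^2}{2}\right) ^2\right) =\exp \left( -\frac{Kc_4^4}{2}\right) \le \exp \left( -\frac{c_4^4\gamma }{4\ell }n\right) . \end{aligned}$$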

F.2 Sprinkling stage

In this subsection, we apply Algorithm 3 and Theorem 6 to show the existence of exponentially many alternating cycles in G. Since there are no weights, we set the thresholds in Algorithm 3 as \(\tau _\textsf{red}=\tau _\textsf{blue}=0\). Since \({\mathcal {P}}={\mathcal {Q}}\), these thresholds yield \(V^*=V\), and \(G_2\) contains all edges in G except those in \(V^c\times (V^c)'\). The parameters in Theorem 6 are specified as follows. Since \(|V|=\gamma n\), we have \(\beta =\gamma =\epsilon /2\). The blue edge probability in \(G_2\) is d/n, so that \(\eta =d\). In the path construction stage, we showed that the set \({\mathcal {K}}_1\) is of size at least \(c_5^2\epsilon ^3 n/(8 \ell )\). By passing to a subset, we can assume WLOG that \(K_1=|{\mathcal {K}}_1|=c_5^2\epsilon ^3 n/(8\ell )=c_6 \epsilon ^3 n/\ell \) for some universal constant \(c_6\).

Let \(\ell =c_7/\epsilon ^5\) for some universal constant \(c_7\) that will be chosen later. Next, we check the assumptions of Theorem 6. Note that \(s=2\ell \) and

$$\begin{aligned} b=\frac{\beta s \eta }{4}=\frac{c_7 (1+\epsilon )^2}{4\epsilon ^4}\ge 4 \end{aligned}$$

for all \(\epsilon <\epsilon _0\) for small enough \(\epsilon _0\);

$$\begin{aligned} K_1=\frac{c_6 \epsilon ^3 n}{ \ell }\ge 8400 \end{aligned}$$

for large enough n;

$$\begin{aligned} \kappa =\frac{2K_1 s\eta }{n}=4c_6\epsilon ^3 (1+\epsilon )^2\le \frac{1}{16^2} \end{aligned}$$

for small enough \(\epsilon _0\);

$$\begin{aligned} d_\textrm{super}=\frac{K_1 b^2\eta }{32n}=\frac{c_6c_7 (1+\epsilon )^6}{512} \ge 256\log (32 e) \end{aligned}$$

for \(c_7\) chosen large enough. We have checked that all the assumptions of Theorem 6 are satisfied. Therefore, with high probability, Algorithm 1 yields at least \(\exp (K_2/20)=\exp (c_6\epsilon ^8 n/(320 c_7))\) alternating cycles of length at least \(3K_2/4=3c_6\epsilon ^8 n/(64 c_7)\). Each alternating cycle corresponds to a perfect matching in G. By taking \(\delta = 3K_2/(8n)=3c_6\epsilon ^8/(128 c_7)\), all these perfect matchings are in \({\mathcal {M}}_\textsf{bad}\). Note that under the unweighted model, all the perfect matchings that appear in G carry the same posterior mass. Thus we have shown that with high probability, \(\mu _W({\mathcal {M}}_\textsf{bad})/\mu _W(M^*)\ge \exp (c_6\epsilon ^8 n/(320 c_7))\). By Lemma 1, \(\mu _W({\mathcal {M}}_\textsf{good})/\mu _W(M^*) \le 2 e^{7\epsilon \delta n}\) with high probability. Since \(7\epsilon \delta n=21c_6\epsilon ^9 n/(128 c_7)\) is dominated by \(c_6\epsilon ^8 n/(320 c_7)\) for small enough \(\epsilon \), the posterior mass of \({\mathcal {M}}_\textsf{bad}\) overwhelms that of \({\mathcal {M}}_\textsf{good}\), and we conclude that \(\mathbb {E}[\ell (\widetilde{M},M^*)]\gtrsim \delta =\varOmega (\epsilon ^8)\).
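The parameter bookkeeping above can be double-checked numerically. In the sketch below, the values of \(\epsilon \), \(c_6\), and \(c_7\) are placeholders chosen only to verify the closed-form algebra for b, \(\kappa \), and \(d_\textrm{super}\) against their definitions; they are not the actual constants of the proof:

```python
# Verify the closed forms for b, kappa, d_super against their definitions
# (illustrative constants, not the paper's).
eps, c6, c7, n = 0.01, 1.0, 1.0, 10 ** 9

d     = (1 + eps) ** 2      # sqrt(d) = 1 + eps
gamma = eps / 2
beta  = gamma               # |V| = gamma * n
eta   = d                   # blue edge probability in G2 is d/n
ell   = c7 / eps ** 5
s     = 2 * ell
K1    = c6 * eps ** 3 * n / ell

b       = beta * s * eta / 4
kappa   = 2 * K1 * s * eta / n
d_super = K1 * b ** 2 * eta / (32 * n)

assert abs(b - c7 * (1 + eps) ** 2 / (4 * eps ** 4)) < 1e-6 * b
assert abs(kappa - 4 * c6 * eps ** 3 * (1 + eps) ** 2) < 1e-9
assert abs(d_super - c6 * c7 * (1 + eps) ** 6 / 512) < 1e-6 * d_super
```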


Cite this article

Ding, J., Wu, Y., Xu, J. et al. The planted matching problem: sharp threshold and infinite-order phase transition. Probab. Theory Relat. Fields 187, 1–71 (2023). https://doi.org/10.1007/s00440-023-01208-6
