Lecture Notes in Computer Science
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
1776
Berlin · Heidelberg · New York · Barcelona · Hong Kong · London · Milan · Paris · Singapore · Tokyo
Gastón H. Gonnet, Daniel Panario, Alfredo Viola (Eds.)
LATIN 2000:
Theoretical Informatics
4th Latin American Symposium
Punta del Este, Uruguay, April 10-14, 2000
Proceedings
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors
Gastón H. Gonnet
Informatik, ETH
8092 Zürich, Switzerland
E-mail: gonnet@inf.ethz.ch
Daniel Panario
University of Toronto
Department of Computer Science
10 King's College Road, Toronto, Ontario, Canada
E-mail: daniel@cs.toronto.edu
Alfredo Viola
Universidad de la República, Facultad de Ingeniería
Instituto de Computación, Pedeciba Informática
Casilla de Correo 16120, Distrito 6
Montevideo, Uruguay
Cataloging-in-Publication Data applied for
Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Theoretical informatics : proceedings / LATIN 2000, 4th Latin American
Symposium, Punta del Este, Uruguay, April 10 - 14, 2000 Gastón H.
Gonnet ... (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Hong
Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 2000
(Lecture notes in computer science ; Vol. 1776)
ISBN 3-540-67306-7
CR Subject Classification (1991): F.2, G.2, G.1, F.1, F.3, C.2, E.3
ISSN 0302-9743
ISBN 3-540-67306-7 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.
Springer-Verlag is a company in the BertelsmannSpringer publishing group
© Springer-Verlag Berlin Heidelberg 2000
Printed in Germany
Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Danny Lewis
Printed on acid-free paper
SPIN: 10719839
Preface
This volume contains the proceedings of the LATIN 2000 International Conference
(Latin American Theoretical INformatics), to be held in Punta del Este, Uruguay, April
10-14, 2000.
This is the fourth event in the series following São Paulo, Brazil (1992), Valparaíso,
Chile (1995), and Campinas, Brazil (1998). LATIN has established itself as a fully
refereed conference for theoretical computer science research in Latin America. It has
also strengthened the ties between local and international scientific communities. We
believe that this volume reflects the breadth and depth of this interaction.
We received 87 submissions, from 178 different authors in 26 different countries.
Each paper was assigned to three program committee members. The Program Committee
selected 42 papers based on approximately 260 referee reports. In addition to these
contributed presentations, the conference included six invited talks.
The assistance of many organizations and individuals was essential for the success
of this meeting. We would like to thank all of our sponsors and supporting organizations.
Ricardo Baeza-Yates, Claudio Lucchesi, Arnaldo Moura, and Imre Simon provided insightful advice and shared with us their experiences as organizers of previous LATIN
meetings. Joaquín Goyoaga and Patricia Corbo helped in the earliest stages of the organization in various ways, including finding Uruguayan sources of financial support.
SeCIU (Servicio Central de Informática Universitario, Universidad de la República) provided us with the necessary communication infrastructure. The meeting of the program
committee was hosted by the Instituto de Matemática e Estatística, Universidade de São
Paulo, which also provided us with the Intranet site for discussions among PC members.
We thank the researchers of the Institute for their collaboration, and in particular, Arnaldo
Mandel for the Intranet setup. Finally, we thank Springer-Verlag for their commitment
in publishing this and previous LATIN proceedings in the Lecture Notes in Computer
Science series.
We are encouraged by the positive reception and interest that LATIN 2000 has created
in the community, partly indicated by a record number of submissions.
January 2000
Gastón Gonnet
Daniel Panario
Alfredo Viola
The Conference
Invited Speakers
Allan Borodin (Canada)
Philippe Flajolet (France)
Joachim von zur Gathen (Germany)
Yoshiharu Kohayakawa (Brazil)
Andrew Odlyzko (USA)
Prabhakar Raghavan (USA)
Program Committee
Ricardo Baeza-Yates (Chile)
Béla Bollobás (USA)
Felipe Cucker (Hong Kong)
Josep Díaz (Spain)
Esteban Feuerstein (Argentina)
Celina M. de Figueiredo (Brazil)
Gastón Gonnet (Switzerland, Chair)
Jozef Gruska (Czech Republic)
Joos Heintz (Argentina/Spain)
Gérard Huet (France)
Marcos Kiwi (Chile)
Ming Li (Canada)
Cláudio L. Lucchesi (Brazil)
Ron Mullin (Canada)
Ian Munro (Canada)
Daniel Panario (Canada)
Dominique Perrin (France)
Patricio Poblete (Chile)
Bruce Reed (France)
Bruce Richmond (Canada)
Vojtech Rödl (USA)
Imre Simon (Brazil)
Neil Sloane (USA)
Endre Szemerédi (USA)
Alfredo Viola (Uruguay)
Yoshiko Wakabayashi (Brazil)
Siang Wun Song (Brazil)
Nivio Ziviani (Brazil)
Organizing Committee
Ed Coffman Jr.
Cristina Cornes
Javier Molina
Laura Molina
Lucia Moura
Daniel Panario (Co-Chair)
Alberto Pardo
Luis Sierra
Alfredo Viola (Co-Chair)
Local Arrangements
The local arrangements for the conference were handled by IDEAS S.R.L.
Organizing Institutions
Instituto de Computación (Universidad de la República Oriental del Uruguay)
Pedeciba Informática
Sponsors and Supporting Organizations
CLEI (Centro Latinoamericano de Estudios en Informática)
CSIC (Comisión Sectorial de Investigación Científica, Universidad de la República)
CONICYT (Consejo Nacional de Investigaciones Científicas y Técnicas)
UNESCO
Universidad ORT del Uruguay
Tecnología Informática
Referees
Carme Alvarez
Andre Arnold
Juan Carlos Augusto
Valmir C. Barbosa
Alejandro Bassi
Gabriel Baum
Denis Bechet
Leopoldo Bertossi
Ralf Borndoerfer
Claudson Bornstein
Richard Brent
Véronique Bruyère
Héctor Cancela
Rodney Canfield
Jianer Chen
Christian Choffrut
José Coelho de Pina Jr.
Ed Coffman Jr.
Don Coppersmith
Cristina Cornes
Bruno Courcelle
Gustavo Crispino
Maxime Crochemore
Diana Cukierman
Joe Culberson
Ricardo Dahab
Célia Picinin de Mello
Erik Demaine
Nachum Dershowitz
Luc Devroye
Volker Diekert
Luis Dissett
Juan V. Echagüe
David Eppstein
Martín Farach-Colton
Paulo Feofiloff
Henning Fernau
Cristina G. Fernandes
W. Fernández de la Vega
Carlos E. Ferreira
Marcelo Frías
Zoltán Füredi
Joaquim Gabarro
Juan Garay
Mark Giesbrecht
Eduardo Giménez
Bernard Gittenberger
Raúl Gouet
Qian-Ping Gu
Marisa Gutiérrez
Ryan Hayward
Chính T. Hoàng
Delia Kesner
Ayman Khalfalah
Yoshiharu Kohayakawa
Teresa Krick
Eyal Kushilevitz
Antonín Kučera
Imre Leader
Hanno Lefmann
Sebastian Leipert
Stefano Leonardi
Sachin Lodha
F. Javier López
Hosam M. Mahmoud
A. Marchetti-Spaccamela
Claude Marché
Arnaldo Mandel
Martín Matamala
Guillermo Matera
Jacques Mazoyer
Alberto Mendelzon
Ugo Montanari
François Morain
Petra Mutzel
Rajagopal Nagarajan
Gonzalo Navarro
Marden Neubert
Cyril Nicaud
Takao Nishizeki
Johan Nordlander
Alfredo Olivero
Alberto Pardo
Jordi Petit
Wojciech Plandowski
Libor Polák
Pavel Pudlák
Davood Rafiei
Ivan Rapaport
Mauricio G.C. Resende
Celso C. Ribeiro
Alexander Rosa
Salvador Roura
Andrzej Ruciński
Juan Sabia
Philippe Schnoebelen
Maria Serna
Oriol Serra
Jiří Sgall
Guillermo R. Simari
José Soares
Pablo Solernó
Doug Stinson
Jorge Stolfi
Leen Stougie
Jayme Szwarcfiter
Prasad Tetali
Dimitrios Thilikos
Soledad Torres
Luca Trevisan
Vilmar Trevisan
Andrew Turpin
Kristina Vušković
Lusheng Wang
Sue Whitesides
Thomas Wilke
Hugh Williams
David Williamson
Fatos Xhafa
Daniel Yankelevich
Sheng Yu
Louxin Zhang
Binhai Zhu
Table of Contents
Random Structures and Algorithms
Algorithmic Aspects of Regularity (Invited Paper) . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Y. Kohayakawa, V. Rödl
Small Maximal Matchings in Random Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Michele Zito
Some Remarks on Sparsely Connected Isomorphism-Free Labeled Graphs . . . . . . 28
Vlady Ravelomanana, Loÿs Thimonier
Analysis of Edge Deletion Processes on Faulty Random Regular Graphs . . . . . . . . 38
Andreas Goerdt, Mike Molloy
Equivalent Conditions for Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Y. Kohayakawa, V. Rödl, J. Skokan
Algorithms I
Cube Packing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
F.K. Miyazawa, Y. Wakabayashi
Approximation Algorithms for Flexible Job Shop Problems . . . . . . . . . . . . . . . . . . 68
Klaus Jansen, Monaldo Mastrolilli, Roberto Solis-Oba
Emerging Behavior as Binary Search Trees Are Symmetrically Updated . . . . . . . . 78
Stephen Taylor
The LCA Problem Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Michael A. Bender, Martín Farach-Colton
Combinatorial Designs
Optimal and Pessimal Orderings of Steiner Triple Systems in Disk Arrays . . . . . . 95
Myra B. Cohen, Charles J. Colbourn
Rank Inequalities for Packing Designs and Sparse Triple Systems . . . . . . . . . . . . . 105
Lucia Moura
The Anti-Oberwolfach Solution: Pancyclic 2-Factorizations of Complete Graphs . 115
Brett Stevens
Web Graph, Graph Theory I
Graph Structure of the Web: A Survey (Invited Paper) . . . . . . . . . . . . . . . . . . . . . . . 123
Prabhakar Raghavan
Polynomial Time Recognition of Clique-Width ≤3 Graphs . . . . . . . . . . . . . . . . . . 126
Derek G. Corneil, Michel Habib, Jean-Marc Lanlignel, Bruce Reed, Udi Rotics
On Dart-Free Perfectly Contractile Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Cláudia Linhares Sales, Frédéric Maffray
Graph Theory II
Edge Colouring Reduced Indifference Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Celina M.H. de Figueiredo, Célia Picinin de Mello, Carmen Ortiz
Two Conjectures on the Chromatic Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
David Avis, Caterina De Simone, Paolo Nobili
Finding Skew Partitions Efficiently . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Celina M.H. de Figueiredo, Sulamita Klein, Yoshiharu Kohayakawa,
Bruce A. Reed
Competitive Analysis, Complexity
On the Competitive Theory and Practice of Portfolio Selection (Invited Paper) . . . 173
Allan Borodin, Ran El-Yaniv, Vincent Gogan
Almost k-Wise Independence and Hard Boolean Functions . . . . . . . . . . . . . . . . . . 197
Valentine Kabanets
Improved Upper Bounds on the Simultaneous Messages Complexity of the
Generalized Addressing Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Andris Ambainis, Satyanarayana V. Lokam
Algorithms II
Multi-parameter Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
David Fernández-Baca
Linear Time Recognition of Optimal L-Restricted Prefix Codes . . . . . . . . . . . . . . . 227
Ruy Luiz Milidiú, Eduardo Sany Laber
Uniform Multi-hop All-to-All Optical Routings in Rings . . . . . . . . . . . . . . . . . . . . 237
Jaroslav Opatrny
A Fully Dynamic Algorithm for Distributed Shortest Paths . . . . . . . . . . . . . . . . . . . 247
Serafino Cicerone, Gabriele Di Stefano, Daniele Frigioni, Umberto Nanni
Computational Number Theory, Cryptography
Integer Factorization and Discrete Logarithms (Invited Paper) . . . . . . . . . . . . . . . . 258
Andrew Odlyzko
Communication Complexity and Fourier Coefficients of the Diffie–Hellman Key . 259
Igor E. Shparlinski
Quintic Reciprocity and Primality Test for Numbers of the Form M = A5^n ± ω_n . . 269
Pedro Berrizbeitia, Mauricio Odreman Vera, Juan Tena Ayuso
Determining the Optimal Contrast for Secret Sharing Schemes in Visual
Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
Matthias Krause, Hans Ulrich Simon
Analysis of Algorithms I
Average-Case Analysis of Rectangle Packings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
E.G. Coffman, Jr., George S. Lueker, Joel Spencer, Peter M. Winkler
Heights in Generalized Tries and PATRICIA Tries . . . . . . . . . . . . . . . . . . . . . . . . . 298
Charles Knessl, Wojciech Szpankowski
On the Complexity of Routing Permutations on Trees by Arc-Disjoint Paths . . . . . 308
D. Barth, S. Corteel, A. Denise, D. Gardy, M. Valencia-Pabon
Algebraic Algorithms
Subresultants Revisited (Invited Paper) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
Joachim von zur Gathen, Thomas Lücking
A Unifying Framework for the Analysis of a Class of Euclidean Algorithms . . . . . 343
Brigitte Vallée
Worst-Case Complexity of the Optimal LLL Algorithm . . . . . . . . . . . . . . . . . . . . . 355
Ali Akhavi
Computability
Iteration Algebras Are Not Finitely Axiomatizable . . . . . . . . . . . . . . . . . . . . . . . . . 367
Stephen L. Bloom, Zoltán Ésik
Undecidable Problems in Unreliable Computations . . . . . . . . . . . . . . . . . . . . . . . . . 377
Richard Mayr
Automata, Formal Languages
Equations in Free Semigroups with Anti-involution and Their Relation to
Equations in Free Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Claudio Gutiérrez
Squaring Transducers: An Efficient Procedure for Deciding Functionality and
Sequentiality of Transducers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Marie-Pierre Béal, Olivier Carton, Christopher Prieur, Jacques Sakarovitch
Unambiguous Büchi Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
Olivier Carton, Max Michel
Linear Time Language Recognition on Cellular Automata with Restricted
Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Thomas Worsch
Logic, Programming Theory
From Semantics to Spatial Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Luis R. Sierra Abbate, Pedro R. D’Argenio, Juan V. Echagüe
On the Expressivity and Complexity of Quantitative Branching-Time Temporal
Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
F. Laroussinie, Ph. Schnoebelen, M. Turuani
A Theory of Operational Equivalence for Interaction Nets . . . . . . . . . . . . . . . . . . . 447
Maribel Fernández, Ian Mackie
Analysis of Algorithms II
Run Statistics for Geometrically Distributed Random Variables . . . . . . . . . . . . . . . 457
Peter Grabner, Arnold Knopfmacher, Helmut Prodinger
Generalized Covariances of Multi-dimensional Brownian Excursion Local Times . 463
Guy Louchard
Combinatorics of Geometrically Distributed Random Variables: Length of
Ascending Runs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
Helmut Prodinger
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
Algorithmic Aspects of Regularity
Y. Kohayakawa¹⋆ and V. Rödl²
¹ Instituto de Matemática e Estatística, Universidade de São Paulo,
Rua do Matão 1010, 05508–900 São Paulo, Brazil
yoshi@ime.usp.br
² Department of Mathematics and Computer Science,
Emory University, Atlanta, GA 30322, USA
rodl@mathcs.emory.edu
Abstract. Szemerédi’s celebrated regularity lemma proved to be a fundamental result in graph theory. Roughly speaking, his lemma states
that any graph may be approximated by a union of a bounded number
of bipartite graphs, each of which is ‘pseudorandom’. As later proved
by Alon, Duke, Lefmann, Rödl, and Yuster, there is a fast deterministic
algorithm for finding such an approximation, and therefore many of the
existential results based on the regularity lemma could be turned into
constructive results. In this survey, we discuss some recent developments
concerning the algorithmic aspects of the regularity lemma.
1 Introduction
In the course of proving his well known density theorem for arithmetic progressions [47], Szemerédi discovered a fundamental result in graph theory. This
result became known as his regularity lemma [48]. For an excellent survey on this
lemma, see Komlós and Simonovits [35]. Roughly speaking, Szemerédi’s lemma
states that any graph may be approximated by a union of a bounded number of
bipartite graphs, each of which is ‘pseudorandom’.
Szemerédi’s proof did not provide an efficient algorithm for finding such an
approximation, but it was later proved by Alon, Duke, Lefmann, Rödl, and
Yuster [1,2] that such an algorithm does exist. Given the wide applicability of
the regularity lemma, the result of [1,2] had many consequences. The reader is
referred to [1,2,14,35] for the first applications of the algorithmic version of the
regularity lemma. For more recent applications, see [5,6,7,12,17,32,52].
If the input graph G has n vertices, the algorithm of [1,2] runs in time O(M(n)), where M(n) = O(n^2.376) is the time needed to multiply two n by n matrices with {0, 1}-entries over the integers. In [34], an improvement of this is given: it is shown that there is an algorithm for the regularity lemma that runs in time O(n²) for graphs of order n.
⋆ Partially supported by FAPESP (Proc. 96/04505–2), by CNPq (Proc. 300334/93–1), and by MCT/FINEP (PRONEX project 107/97).
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 1–17, 2000.
© Springer-Verlag Berlin Heidelberg 2000

If one allows randomization, one may do a great deal better, as demonstrated by Frieze and Kannan. In fact, they show in [21,22] that there is a randomized algorithm for the regularity lemma that runs in time O(n) for n-vertex graphs.
Quite surprisingly, they in fact show that one may obtain an implicit description
of the required output in constant time. The key technique here is sampling.
The regularity lemma has been generalized for hypergraphs in a few different ways; see, e.g., [8,18,19,21,22,42]. One of these generalizations admits constructive versions, both deterministic [13] and randomized [21,22]. Again, the
consequences of the existence of such algorithms are important. For instance,
Frieze and Kannan [21,22] prove that all ‘dense’ MaxSNP-hard problems admit
PTAS, by making use of such algorithms. For other applications of an algorithmic hypergraph regularity lemma, see Czygrinow [11].
Let us discuss the organization of this survey. In Section 2, we state the
regularity lemma for graphs and hypergraphs. In Section 3, we discuss a few
independent lemmas, each of which allows one to produce algorithms for the
regularity lemma. In Section 4.1, we state the main result of [34]. Some results
of Frieze and Kannan are discussed in Section 4.2. In Section 5, we discuss a
recent result on property testing on graphs, due to Alon, Fischer, Krivelevich,
and Szegedy [3,4]. The main result in [3,4] is based on a new variant of the
regularity lemma. We close with some final remarks.
2 The Regularity Lemma
In this section we state the regularity lemma and briefly discuss the original
proof of Szemerédi. In order to be concise, we shall state a hypergraph version
that is a straightforward extension of the classical lemma. We remark that this
extension was first considered and applied by Prömel and Steger [42].
2.1 The Statement of the Lemma
Given a set V and a non-negative integer r, we write [V]^r for the collection of subsets of V of cardinality r. An r-uniform hypergraph, or r-graph, on the vertex set V is simply a collection of r-sets H ⊂ [V]^r. The elements of H are the hyperedges of H.
Let U1, . . . , Ur ⊂ V be pairwise disjoint, non-empty subsets of vertices. The density dH(U1, . . . , Ur) of this r-tuple with respect to H is

    dH(U1, . . . , Ur) = e(U1, . . . , Ur) / (|U1| · · · |Ur|),    (1)
where e(U1, . . . , Ur) is the number of hyperedges e ∈ H with |e ∩ Ui| = 1 for all 1 ≤ i ≤ r. We say that the r-tuple (U1, . . . , Ur) is ε-regular with respect to H if, for all choices of subsets Ui′ ⊂ Ui with |Ui′| ≥ ε|Ui| for all 1 ≤ i ≤ r, we have

    |dH(U1′, . . . , Ur′) − dH(U1, . . . , Ur)| ≤ ε.    (2)

If for any such Ui′ (1 ≤ i ≤ r) we have

    |dH(U1′, . . . , Ur′) − α| ≤ δ,    (3)
we say that the r-tuple (U1, . . . , Ur) is (α, δ)-regular. Finally, we say that a partition V = V0 ∪ · · · ∪ Vk of the vertex set V of H is ε-regular with respect to H if

(i) |V0| < ε|V|,
(ii) |V1| = · · · = |Vk|,
(iii) at least (1 − ε)·(k choose r) of the r-tuples (Vi1, . . . , Vir) with 1 ≤ i1 < · · · < ir ≤ k are ε-regular with respect to H.

Often, V0 is called the exceptional class of the partition. For convenience, we say that a partition (V0, V1, . . . , Vk) is (ε, k)-equitable if it satisfies (i) and (ii) above. The hypergraph version of Szemerédi's lemma reads as follows.
Theorem 1 For any integers r ≥ 2 and k0 ≥ 1 and real number ε > 0, there
are integers K = K(r, k0 , ε) and N = N (r, k0 , ε) such that any r-graph H on a
vertex set of cardinality at least N admits an (ε, k)-equitable ε-regular partition
with k0 ≤ k ≤ K.
Szemerédi [48] considered the case r = 2, that is, the case of graphs. However,
the proof in [48] generalizes in a straightforward manner to a proof of Theorem 1
above; see Prömel and Steger [42], where the authors prove and apply this result.
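To make definition (1) concrete, the density of an r-tuple can be computed by brute force over all transversal r-sets. The following Python sketch is our own illustration (the function name and the toy 3-graph are not from the paper):

```python
from itertools import product

def density(hyperedges, parts):
    """d_H(U1, ..., Ur) of equation (1): the fraction of transversal
    r-tuples (one vertex from each part) that are hyperedges of H."""
    crossing = sum(1 for e in product(*parts) if frozenset(e) in hyperedges)
    total = 1
    for U in parts:
        total *= len(U)
    return crossing / total

# Toy 3-graph on {0, ..., 5} with two transversal hyperedges.
H = {frozenset({0, 2, 4}), frozenset({1, 3, 5})}
parts = [{0, 1}, {2, 3}, {4, 5}]
print(density(H, parts))  # 2 of the 2*2*2 = 8 transversal triples -> 0.25
```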
2.2 Brief Outline of Proofs
Let H be an r-graph on the vertex set V with |V| = n, and let Π = (V0, V1, . . . , Vk) be an (ε, k)-equitable partition of V. A crucial definition in Szemerédi's proof of his lemma is the concept of the index ind(Π) of Π, given by

    ind(Π) = (k choose r)⁻¹ Σ d(Vi1, . . . , Vir)²,    (4)

where the sum is taken over all r-tuples 1 ≤ i1 < · · · < ir ≤ k. Clearly, we always have 0 ≤ ind(Π) ≤ 1. For convenience, if Π′ = (W0, W1, . . . , Wℓ) is an (ε′, ℓ)-equitable partition of V, we say that Π′ is a refinement of Π if, for any 1 ≤ j ≤ ℓ, there is 1 ≤ i ≤ k for which we have Wj ⊂ Vi. In words, Π′ refines Π if any non-exceptional class of Π′ is contained in some non-exceptional class of Π. Now, the key lemma in the proof of Theorem 1 is the following (see [42]).
Lemma 2 For any integer r ≥ 2 and real number ε > 0, there exist integers k0 = k0(r, ε) and n0 = n0(r, ε) and a positive number ϑ = ϑ(r, ε) > 0 for which the following holds. Suppose we have an r-graph H on a vertex set V, with n = |V| ≥ n0, and Π = (V0, V1, . . . , Vk) is an (ε, k)-equitable partition of V. Then
(i) either Π is ε-regular with respect to H,
(ii) or else there is a refinement Π′ = (W0, W1, . . . , Wℓ) of Π such that
(a) |W0| ≤ |V0| + n/4^k,
(b) |W1| = · · · = |Wℓ|,
(c) ℓ = k·4^(k^(r−1)),
(d) ind(Π′) ≥ ind(Π) + ϑ.
Theorem 1 follows easily from Lemma 2: it suffices to recall that the index
can never be larger than 1, and hence if we successively apply Lemma 2, starting
from an arbitrary partition Π, we must arrive at an ε-regular partition after a
bounded number of steps, because (d ) of alternative (ii ) guarantees that the
indices of the partitions that we generate always increase by a fixed positive
amount.
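For the graph case r = 2, the index (4) is simply the mean squared density over pairs of non-exceptional classes, and is easy to compute directly. A small Python illustration of our own (the helper names are hypothetical):

```python
from itertools import combinations

def pair_density(edges, U, W):
    # d(U, W): the fraction of pairs in U x W that are edges.
    crossing = sum(1 for u in U for w in W if frozenset({u, w}) in edges)
    return crossing / (len(U) * len(W))

def index(edges, classes):
    """ind(Pi) for a graph (r = 2): the mean of d(V_i, V_j)^2 over all
    pairs of non-exceptional classes, as in equation (4); always in [0, 1]."""
    pairs = list(combinations(classes, 2))
    return sum(pair_density(edges, U, W) ** 2 for U, W in pairs) / len(pairs)

# Toy graph on classes {0,1}, {2,3}, {4,5}.
E = {frozenset(p) for p in [(0, 2), (0, 3), (1, 2), (1, 3), (0, 4), (2, 4)]}
classes = [{0, 1}, {2, 3}, {4, 5}]
print(index(E, classes))  # (1.0² + 0.25² + 0.25²)/3 = 0.375
```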
Ideally, to turn this proof into an efficient algorithm, given a partition Π, we
would like to have (I) an efficient procedure to check whether (i ) applies, and
(II) if (i ) fails, an efficient procedure for finding Π ′ as specified in (a)–(d ).
It turns out that if we actually have, at hand, witnesses for the failure of ε-regularity of more than ε·(k choose r) of the r-tuples (Vi1, . . . , Vir), where 1 ≤ i1 < · · · < ir ≤ k, then Π′ may be found easily (see [42] for details). Here, by a witness for the ε-irregularity of an r-tuple (U1, . . . , Ur) we mean an r-tuple (U1′, . . . , Ur′) with Ui′ ⊂ Ui and |Ui′| ≥ ε|Ui| for all 1 ≤ i ≤ r for which (2) fails. We are therefore led to the following decision problem:
Problem 3 Given an r-graph H, an r-tuple (U1 , . . . , Ur ) of non-empty, pairwise
disjoint sets of vertices of H, and a real number ε > 0, decide whether this r-tuple
is ε-regular with respect to H.
In case the answer to Problem 3 is negative for a given instance, we would
like to have a witness for the ε-irregularity of the given r-tuple.
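Problem 3 can always be decided by exhaustive search over all admissible pairs of subsets, at exponential cost; the hardness result of Section 3.1 suggests that, in the worst case, something of this sort is unavoidable. A brute-force Python sketch of our own for the graph case r = 2:

```python
from itertools import chain, combinations
from math import ceil

def density(edges, U, W):
    # d_B(U, W) for a bipartite graph given as a set of frozenset edges.
    return sum(1 for u in U for w in W if frozenset({u, w}) in edges) / (len(U) * len(W))

def large_subsets(S, m):
    # All subsets of S of size at least m, smallest first.
    S = sorted(S)
    return chain.from_iterable(combinations(S, t) for t in range(m, len(S) + 1))

def irregularity_witness(edges, U, W, eps):
    """Exhaustive check of eps-regularity of the pair (U, W): returns None
    if the pair is eps-regular, else a witness (U', W') violating (2).
    Exponential in |U| + |W|."""
    d = density(edges, U, W)
    for U1 in large_subsets(U, ceil(eps * len(U))):
        for W1 in large_subsets(W, ceil(eps * len(W))):
            if abs(density(edges, U1, W1) - d) > eps:
                return set(U1), set(W1)
    return None

U, W = {0, 1, 2}, {3, 4, 5}
E = {frozenset({0, w}) for w in W}               # only vertex 0 has neighbours
print(irregularity_witness(E, U, W, 0.3))        # a witness is found
complete = {frozenset({u, w}) for u in U for w in W}
print(irregularity_witness(complete, U, W, 0.3))  # None: all densities agree
```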
3 Conditions for Regularity

3.1 A Hardness Result
It turns out that Problem 3 is hard, as proved by Alon, Duke, Lefmann, Rödl,
and Yuster [1,2].
Theorem 4 Problem 3 is coNP-complete for r = 2.
Let us remark in passing that Theorem 4 is proved in [1,2] for the case in
which ε = 1/2; for a proof for arbitrary 0 < ε ≤ 1/2, see Taraz [49].
Theorem 4 is certainly discouraging. Fortunately, however, there is a way
around. We discuss the graph and hypergraph cases separately.
The graph case. In the case r = 2, that is, in the case of graphs, one has
the following lemma. Below R+ denotes the set of positive reals. Moreover, a
bipartite graph B = (U, W ; E) with vertex classes U and W and edge set E
is said to be ε-regular if (U, W ) is an ε-regular pair with respect to B. Thus,
a witness to the ε-irregularity of B is a pair (U ′ , W ′ ) with U ′ ⊂ U , W ′ ⊂ W ,
|U ′ |, |W ′ | ≥ εn, and |dB (U ′ , W ′ ) − dB (U, W )| > ε (see the paragraph before
Problem 3).
Lemma 5 There is a polynomial-time algorithm A and a function ε′A : R+ →
R+ such that the following holds. When A receives as input a bipartite graph B =
(U, W ; E) with |U | = |W | = n and a real number ε > 0, it either correctly asserts
that B is ε-regular, or else it returns a witness for the ε′A (ε)-irregularity of B.
We remark that Lemma 5 implicitly says that ε′ = ε′A(ε) ≤ ε, for otherwise A would not be able to handle an input graph B that is not ε-regular but is ε′-regular. In fact, one usually has ε′ ≪ ε.
Note that Lemma 5 leaves open what the behaviour of A should be when B
is ε-regular but is not ε′ -regular. Despite this fact, Lemma 5 does indeed imply
the existence of a polynomial-time algorithm for finding ε-regular partitions of
graphs. We leave the proof of this assertion as an exercise for the reader.
In Sections 3.2, 3.3, and 3.4 we state some independent results that imply
Lemma 5, thus completing the general description of a few distinct ways one
may prove the algorithmic version of the regularity lemma for graphs.
The hypergraph case. In the case of r-graphs (r ≥ 3), we do not know a
result similar to Lemma 5. The algorithmic version of Theorem 1 for r ≥ 3 is
instead proved by introducing a modified concept of index and then by proving
a somewhat more complicated version of Lemma 5. For lack of space, we shall
not go into details; the interested reader is referred to Czygrinow and Rödl [13].
In fact, in the remainder of this survey we shall mostly concentrate on graphs.
Even though several applications are known for the algorithmic version of
Theorem 1 for r ≥ 3 (see, e.g., [11,12,13,21,22]), it should be mentioned that the
most powerful version of the regularity lemma for hypergraphs is not the one
presented above. Indeed, the regularity lemma for hypergraphs proved in [18]
seems to have deeper combinatorial consequences.
3.2 The Pair Condition for Regularity
As mentioned in Section 3.1, Lemma 5 may be proved in a few different ways.
One technique is presented in this section. The second, which is in fact a generalization of the methods discussed here, is based on Lemmas 9 and 10. Finally,
the third method, presented in Section 3.4, is based on a criterion for regularity
given in Lemma 11.
Let us turn to the original approach of [1,2]. The central idea here may be
summarized in two lemmas, Lemmas 6 and 7 below; we follow the formulation
given in [14] (see also [9,20,50,51] and the proof of the upper bound in Theorem 15.2 in [16], due to J. H. Lindsey). Below, d(x, x′ ) denotes the joint degree
or codegree of x and x′ , that is, the number of common neighbours of x and x′ .
Lemma 6 Let a constant 0 < ε < 1 be given and let B = (U, W; E) be a bipartite graph with |U| ≥ 2/ε. Let ̺ = dB(U, W) and let D be the collection of all pairs {x, x′} of vertices of U for which

(i) d(x), d(x′) > (̺ − ε)|W|,
(ii) d(x, x′) < (̺ + ε)²|W|.

Then if |D| > (1/2)(1 − 5ε)|U|², the pair (U, W) is (̺, (16ε)^(1/5))-regular.
Lemma 7 Let B = (U, W; E) be a graph with (U, W) a (̺, ε)-regular pair and with density d(U, W) = ̺. Assume that ̺|W| ≥ 1 and 0 < ε < 1. Then

(i) all but at most 2ε|U| vertices x ∈ U satisfy (̺ − ε)|W| < d(x) < (̺ + ε)|W|,
(ii) all but at most 2ε|U|² pairs {x, x′} of vertices of U satisfy d(x, x′) < (̺ + ε)²|W|.
It is not difficult to see that Lemmas 6 and 7 imply Lemma 5. Indeed, the main computational task that algorithm A from Lemma 5 has to perform is to compute the codegrees of all pairs of vertices (x, x′) ∈ U × U. Clearly, this may be done in time O(n³). Observing that this task may be encoded as the squaring of a certain natural {0, 1}-matrix over the integers, one sees that there is an algorithm A as in Lemma 5 with time complexity O(M(n)), where M(n) = O(n^2.376) is the time required to carry out such a multiplication (see [10]).
Before we proceed, let us observe that the pleasant fact here is the following.
Although the definition of ε-regularity for a pair (U, W ) involves a quantification
over exponentially many pairs (U ′ , W ′ ), we may essentially check the validity of
this definition by examining all pairs (x, x′ ) ∈ U × U , of which there are only
quadratically many. We refer to the criterion for regularity given by Lemmas 6
and 7 as the pair condition for regularity.
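The codegree computation at the heart of the pair condition amounts to a single matrix squaring. The following numpy sketch is our own illustration of the test of Lemma 6, with a random bipartite graph standing in for the input (it is a sketch, not the algorithm of [1,2] itself):

```python
import numpy as np

def pairs_in_D(A, eps):
    """Count the pairs {x, x'} of row-vertices satisfying conditions (i) and
    (ii) of Lemma 6. The codegrees d(x, x') are the off-diagonal entries of
    A A^T, obtained by one matrix squaring -- the O(M(n)) step."""
    n = A.shape[0]
    rho = A.sum() / (n * n)          # density of B
    deg = A.sum(axis=1)              # degrees d(x) into W
    codeg = A @ A.T                  # codeg[x, y] = d(x, y)
    good = 0
    for x in range(n):
        for y in range(x + 1, n):
            if (deg[x] > (rho - eps) * n and deg[y] > (rho - eps) * n
                    and codeg[x, y] < (rho + eps) ** 2 * n):
                good += 1
    return good, rho

rng = np.random.default_rng(0)
n, eps = 200, 0.1
A = (rng.random((n, n)) < 0.5).astype(int)   # random bipartite graph
good, rho = pairs_in_D(A, eps)
# Lemma 6: if good > (1/2)(1 - 5*eps) n^2, the pair is (rho, (16 eps)^(1/5))-regular.
print(good, good > 0.5 * (1 - 5 * eps) * n * n)
```

For a dense random graph, essentially all pairs land in D, so the regularity certificate of Lemma 6 applies.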
3.3 An Optimized Pair Condition for Regularity
Here we state an improved version of the pair condition of Section 3.2. The key idea is that it suffices to control the codegrees of a small, suitably chosen set of pairs of vertices to guarantee the regularity of a bipartite graph; that is, we need not examine all pairs (x, x′) ∈ U × U. As it turns out, it suffices to consider the pairs that form the edge set of a linear-sized expander, which reduces the number of pairs to examine from n² to O(n). This implies that there is an algorithm A as in Lemma 5 with time complexity O(n²).
We start with an auxiliary definition.
Definition 8 Let 0 < ̺ ≤ 1 and A > 0 be given. We say that a graph J on n vertices is (̺, A)-uniform if for any pair of disjoint sets U, W ⊂ V(J) such that 1 ≤ |U| ≤ |W| ≤ r|U|, where r = ̺n, we have

    |eJ(U, W) − ̺|U||W|| ≤ A·√(r|U||W|).
Thus, a (̺, A)-uniform graph is a graph with density ∼ ̺ in which the edges
are distributed in a random-like manner. One may check that the usual binomial
random graph G(n, p) is (p, 20)-uniform (see, e.g., [31]). More relevant to us is the
fact that there are efficiently constructible graphs J that are (̺, O(1))-uniform,
and have arbitrarily large but constant average degree. Here we have in mind
the celebrated graphs of Margulis [41] and Lubotzky, Phillips, and Sarnak [39]
(see also [40,46]). For these graphs, we have A = 2.
Let us now introduce the variant of the pair condition for regularity that is of interest. Let an n by n bipartite graph B = (U, W; E) be given, and suppose that J is a graph on U. Write e(J) for the number of edges in J and let p = e(B)/n² be the density of B. We say that B satisfies property P(J, δ) if

    Σ_{{x,y} ∈ E(J)} |d(x, y) − p²n| ≤ δ p²n e(J).    (5)
Below, we shall only be interested in the case in which J is (̺, A)-uniform for
some constant A and ̺ ≍ 1/n. The results analogous to Lemmas 6 and 7 involving property P are as follows.
Lemma 9 For every ε > 0, there exist r0 = r0(ε), n0 = n0(ε) ≥ 1, and δ = δ(ε) > 0 for which the following holds. Suppose n ≥ n0, the graph J is a (̺, A)-uniform graph with ̺ = r/n ≥ r0/n, and B is a bipartite graph as above. Then,
Lemma 10 For every δ > 0, there exist r1 = r1(δ), n1 = n1(δ) ≥ 1, and ε′ = ε′(δ) > 0 for which the following holds. Suppose n ≥ n1, the graph J is a (̺, A)-uniform graph with ̺ = r/n ≥ r1/n, and B is a bipartite graph as above. Then, if B does not satisfy property P(J, δ), then B is not ε′-regular. Furthermore, in this case, we can find a pair of sets of vertices (U′, W′) witnessing the ε′-irregularity of B in time O(n²).
Lemmas 9 and 10 show that Lemma 5 holds for an algorithm A with time
complexity O(n2 ).
The proof of Lemma 9 is similar in spirit to the proof of Lemma 6, but of
course one has to make heavy use of the (̺, A)-uniformity of J. The sparse version
of the regularity lemma (see, e.g., [33]) may be used to prove the ε′ -irregularity
of the graph B in Lemma 10. However, proving that a suitable witness may
be found in quadratic time requires a different approach. The reader is referred
to [34] for details.
3.4 Singular Values and Regularity
We now present an approach due to Frieze and Kannan [23]. Let an m by n real
matrix A be given. The first singular value σ1 (A) of A is
σ1 (A) = sup{|x⊤ Ay| : ||x|| = ||y|| = 1}.
(6)
8
Y. Kohayakawa and V. Rödl
Above, we use || || to denote the 2-norm. In what follows, for a matrix W =
(wi,j ), we let ||W||∞ = max |wi,j |. Moreover, if I and J are subsets of the index
sets of the rows and columns of W, we let
W(I, J) = ∑_{i∈I, j∈J} w_{i,j} = χ_I⊤ W χ_J,   (7)
where χI and χJ are the {0, 1}-characteristic vectors for I and J.
Let B = (U, W ; E) be a bipartite graph with density p = e(B)/|U ||W |, and
let A = (ai,j )i∈U,j∈W be the natural {0, 1}-adjacency matrix associated with B.
Put W = A − pJ, where J is the n × n matrix with all entries equal to 1. It is
immediate to check that the following holds:
(*) B is ε-regular if and only if |W(U ′ , W ′ )| ≤ ε|U ′ ||W ′ | for all U ′ ⊂ U
and W ′ ⊂ W with |U ′ | ≥ ε|U | and |W ′ | ≥ ε|W |.
The regularity condition of Frieze and Kannan [23] is as follows.
Lemma 11 Let W be a matrix whose entries are indexed by U × W , where |U | =
|W | = n, and suppose that ||W||∞ ≤ 1. Let γ > 0 be given. Then the following
assertions hold:
(i ) If there exist U ′ ⊂ U and W ′ ⊂ W such that |U ′ |, |W ′ | ≥ γn and
|W(U ′ , W ′ )| ≥ γ|U ′ ||W ′ |,
then σ1 (W) ≥ γ 3 n.
(ii ) If σ1 (W) ≥ γn, then there exist U ′ ⊂ U and W ′ ⊂ W such that |U ′ |, |W ′ | ≥
γ ′ n and |W(U ′ , W ′ )| ≥ γ ′ |U ′ ||W ′ |, where γ ′ = γ 3 /108. Furthermore, U ′
and W ′ may be constructed in polynomial time.
Now, in view of (*) and the fact that singular values may be computed in
polynomial time (see, e.g., [29]), Lemma 11 provides a proof for Lemma 5.
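As a rough illustration of this approach, the first singular value in (6) can be estimated by power iteration on W⊤W; the sketch below is a minimal pure-Python version (not the method of [23] or [29], and not a production SVD), together with the matrix W = A − pJ used in (*).

```python
import random

def sigma1(M, iters=200, seed=0):
    """Estimate the first singular value of M (a list of rows) by
    power iteration on M^T M; a sketch, not a production-quality SVD."""
    rng = random.Random(seed)
    m, n = len(M), len(M[0])
    y = [rng.random() for _ in range(n)]
    for _ in range(iters):
        # x = M y, then y = M^T x, then renormalize y.
        x = [sum(M[i][j] * y[j] for j in range(n)) for i in range(m)]
        y = [sum(M[i][j] * x[i] for i in range(m)) for j in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0:
            return 0.0
        y = [v / norm for v in y]
    x = [sum(M[i][j] * y[j] for j in range(n)) for i in range(m)]
    return sum(v * v for v in x) ** 0.5  # ||M y|| approximates sigma_1

def regularity_matrix(A, p):
    """W = A - pJ for the 0/1 adjacency matrix A of a bipartite graph."""
    return [[a - p for a in row] for row in A]
```

One would then compare sigma1(W) against γn as in Lemma 11 to decide regularity.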
4 The Algorithmic Versions of Regularity
In this section, we discuss the constructive versions of Theorem 1 for the graph case, that is, r = 2. We discuss deterministic and randomized algorithms separately.
4.1 Deterministic Algorithms
Our deterministic result, Theorem 12 below, asserts the existence of an algorithm
for finding regular partitions that is asymptotically faster than the algorithm due
to Alon, Duke, Lefmann, Rödl, and Yuster [1,2].
Theorem 12 There is an algorithm A that takes as input an integer k0 ≥ 1,
an ε > 0, and a graph G on n vertices and returns an ε-regular (ε, k)-equitable
partition for G with k0 ≤ k ≤ K, where K is as in Theorem 1. Algorithm A runs
in time ≤ Cn2 , where C = C(ε, k0 ) depends only on ε and k0 .
Theorem 12 follows from the considerations in Sections 3.1 and 3.3 (see [34]
for details).
It is easy to verify that there is a constant ε′ = ε′(ε, k0) such that if e(G) < ε′n², then any (ε, k0)-equitable partition is ε-regular. Clearly, as the time required to read the input is Ω(e(G)) and, as observed above, we may assume that e(G) ≥ ε′n², algorithm A in Theorem 12 is optimal, apart from the value of the constant C = C(ε, k0).
A typical application of the algorithmic regularity lemma of [1,2] asserts the
existence of a fast algorithm for a graph problem. As it turns out, the running
time of such an algorithm is often dominated by the time required for finding
a regular partition for the input graph, and hence the algorithm has time complexity O(M (n)). In view of Theorem 12, the existence of quadratic algorithms
for these problems may be asserted. Some examples are given in [34].
4.2 Randomized Algorithms
We have been discussing deterministic algorithms so far. If we allow randomization, as proved by Frieze and Kannan, a great deal more may be achieved in
terms of efficiency [21,22]. The model we adopt is as follows. We assume that
sampling a random vertex from G as well as checking an entry of the adjacency
matrix of G both have unit cost.
Theorem 13 There is a randomized algorithm AFK that takes as input an integer k0 ≥ 1, an ε > 0, a δ > 0, and a graph G on n vertices and returns, with
probability ≥ 1 − δ, an ε-regular (ε, k)-equitable partition for G with k0 ≤ k ≤ K,
where K is as in Theorem 1. Moreover, the following assertions hold:
(i ) Algorithm AFK runs in time ≤ Cn, where C = C(ε, δ, k0 ) depends only
on ε, δ, and k0 .
(ii ) In fact, AFK first outputs a collection of vertices of G of cardinality ≤ C ′ ,
where C ′ = C ′ (ε, δ, k0 ) depends only on ε, δ, and k0 , and the above ε-regular
partition for G may be constructed in linear time from this collection of
vertices.
The fact that one is able to construct a regular partition for a graph on n
vertices in randomized time O(n) is quite remarkable; this is further evidence of
the power of randomization. However, even more remarkable is what (ii ) implies:
given ε > 0, δ > 0, and k0 ≥ 1, there is a uniform bound C ′ = C ′ (ε, δ, k0 ) on the
number of randomly chosen vertices that will implicitly define for us a suitable
ε-regular partition of the given graph, no matter how large this input graph is.
Let us try to give a feel for how one may proceed to prove Theorem 13.
We shall discuss a lemma that is central in the Frieze–Kannan approach. The
idea is that, if a bipartite graph B = (U, W ; E) is not ε-regular, then this may
be detected, with high probability, by sampling a bounded number of vertices
of B. The effectiveness of sampling for detecting “dense spots” in graphs appears
already in Goldreich, Goldwasser, and Ron [25,26], where many constant time
algorithms are developed (we shall discuss such matters in Section 5 below).
Let us state the key technical lemma in the proof of Theorem 13. Let an
n by n bipartite graph B = (U, W ; E) be given. Our aim is to check whether
there exist U ′ ⊂ U and W ′ ⊂ W such that |U ′ |, |W ′ | ≥ εn and
|dB (U ′ , W ′ ) − dB (U, W )| > ε.
Note that this last inequality is equivalent to |e(U′, W′) − p|U′||W′|| > ε|U′||W′|, where p = e(B)/n² is the density of B. In fact, if
e(U′, W′) − p|U′||W′| > γn²
holds and γ ≥ ε, then (U ′ , W ′ ) must be a witness for the ε-irregularity of B.
We are now ready to state the result of Frieze and Kannan that allows one
to prove a randomized version of Lemma 5. The reader may wish to compare
this result with Lemmas 7 and 10.
Lemma 14 There is a randomized algorithm A that behaves as follows. Let B = (U, W ; E) be as above and let γ and δ be positive reals. Suppose there exist U′ ⊂ U and W′ ⊂ W for which e(U′, W′) > p|U′||W′| + γn² holds. Then, on input B, γ > 0, and δ > 0, algorithm A determines, with probability ≥ 1 − δ, implicitly defined sets U′′ ⊂ U and W′′ ⊂ W with

e(U′′, W′′) > p|U′′||W′′| + (1/16) γn².

The running time of A is bounded by some constant C = C(γ, δ) that depends only on γ and δ.
A few comments concerning Lemma 14 are in order. As before, the model
here allows for the selection of random vertices of B as well as checking whether
two given vertices are adjacent in constant time. In order to define the sets U ′′
and W ′′ , algorithm A returns two sets Z1 ⊂ U and Z2 ⊂ W , both of cardinality
bounded by some constant depending only on γ and δ > 0. Then, U ′′ is simply
the set of vertices u ∈ U for which e(u, Z2 ) ≥ p|Z2 |. The set W ′′ is defined
analogously.
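A sketch of this sampling step (with hypothetical names and an arbitrary sample size; the actual constants come from the analysis in [21]) might look as follows:

```python
import random

def implicit_witness_sets(adj, U, W, p, sample_size, seed=0):
    """Sketch of the Frieze-Kannan step: pick bounded random samples
    Z1 in U and Z2 in W, then define U'' as the set of vertices of U
    with at least p|Z2| neighbours in Z2, and W'' analogously from Z1.
    The sample size and all names here are illustrative only.

    adj: dict mapping each u in U to its neighbourhood in W (a set).
    """
    rng = random.Random(seed)
    Z1 = rng.sample(list(U), min(sample_size, len(U)))
    Z2 = rng.sample(list(W), min(sample_size, len(W)))
    # U'' and W'' are defined by a constant amount of sampled data,
    # so membership of any single vertex is decidable in O(1) time.
    U2 = {u for u in U if len(adj[u] & set(Z2)) >= p * len(Z2)}
    W2 = {w for w in W if sum(1 for u in Z1 if w in adj[u]) >= p * len(Z1)}
    return U2, W2
```

Only Z1 and Z2 (of bounded size) need to be returned; U′′ and W′′ are then implicitly defined, exactly as in the comment after Lemma 14.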
Algorithm A of Lemma 14 is extremely simple, and its elegant proof of correctness is based on the linearity of expectation and on well-known large deviation inequalities (see Frieze and Kannan [21]).
We close this section by mentioning a result complementary to Lemma 14 (see
[15] for a slightly weaker statement). Suppose B = (U, W ; E) is ε-regular. Then
if U ′ ⊂ U and W ′ ⊂ W are randomly chosen sets of vertices with |U ′ | =
|W ′ | ≥ M0 = M0 (ε′ , δ), then the bipartite graph B ′ = (U ′ , W ′ ; E ′ ) induced
by B on (U ′ , W ′ ) is ε′ -regular with probability ≥ 1 − δ, as long as ε ≤ ε0 (ε′ , δ).
Here again the striking fact is that a bounded sample of vertices forms a good
enough picture of the whole graph.
5 Property Testing
The topic of this section is in nature different from the topics discussed so far. We
have been considering how to produce algorithms for finding regular partitions of
graphs. In this section, we discuss how non-constructive versions of the regularity
lemma may be used to prove the correctness of certain algorithms. We shall
discuss a recent result due to Alon, Fischer, Krivelevich, and Szegedy [3,4]. These authors develop a new variant of the regularity lemma and use it to prove a far-reaching result concerning the testability of certain graph properties.
5.1 Definitions and the Testability Result
The general notion of property testing was introduced by Rubinfeld and Sudan [45], but in the context of combinatorial testing it is the work of Goldreich and his co-authors [24,25,26,27,28] that is most relevant to us.
Let G_n be the collection of all graphs on a fixed n-vertex set, say [n] = {1, . . . , n}. Put G = ⋃_{n≥1} G_n. A property of graphs is simply a subset P ⊂ G that is closed under isomorphisms. There is a natural notion of distance in each G_n, the normalized Hamming distance: the distance d(G, H) = d_n(G, H) between two graphs G and H ∈ G_n is |E(G) △ E(H)| \binom{n}{2}^{-1}, where E(G) △ E(H) denotes the symmetric difference of the edge sets of G and H.
We say that a graph G is ε-far from having property P if

d(G, P) = min_{H∈P} d(G, H) ≥ ε,

that is, at least ε\binom{n}{2} edges have to be added to or removed from G to turn it into a graph that satisfies P.
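In code, the normalized Hamming distance and the ε-far condition are straightforward; the toy sketch below (our own names, with the property given as an explicit finite list of edge sets, which is only feasible for tiny examples) is meant purely to fix the definitions.

```python
def normalized_distance(E_G, E_H, n):
    """d(G, H) = |E(G) symdiff E(H)| / C(n, 2) for graphs on [n];
    edges are represented as frozensets of their two endpoints."""
    sym = E_G ^ E_H  # symmetric difference of the edge sets
    return len(sym) / (n * (n - 1) / 2)

def is_eps_far(E_G, property_members, n, eps):
    """G is eps-far from P if min over H in P of d(G, H) >= eps.
    Brute-force over an explicit (tiny) list of graphs in P; a toy
    illustration only, since real properties are infinite families."""
    return min(normalized_distance(E_G, E_H, n)
               for E_H in property_members) >= eps
```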
An ε-test for a graph property P is a randomized algorithm A that receives as
input a graph G and behaves as follows: if G has P then with probability ≥ 2/3
we have A(G) = 1, and if G is ε-far from having P then with probability ≥ 2/3
we have A(G) = 0. The graph G is given to A through an oracle; we assume
that A is able to generate random vertices from G and it may query the oracle
whether two vertices that have been generated are adjacent.
We say that a graph property P is testable if, for all ε > 0, it admits an ε-test
that makes at most Q queries to the oracle, where Q = Q(ε) is a constant that
depends only on ε. Note that, in particular, we require the number of queries to
be independent of the order of the input graph.
Goldreich, Goldwasser, and Ron [25,26], besides showing that there exist NP
graph properties that are not testable, proved that a large class of interesting
graph properties are testable, including the property of being k-colourable, of
having a clique with ≥ ̺n vertices, and of having a cut with ≥ ̺n2 edges, where n
is the order of the input graph. The regularity lemma is not used in [25,26]. The
fact that k-colourability is testable had in fact been proved implicitly in [15],
where regularity is used.
We are now ready to turn to the result of Alon, Fischer, Krivelevich, and
Szegedy [3,4]. Let us consider properties from the first order theory of graphs.
Thus, we are concerned with properties that may be expressed through quantification of vertices, Boolean connectives, equality, and adjacency. Of particular
interest are the properties that may be expressed in the form
∃x1 , . . . , xr ∀y1 , . . . , ys A(x1 , . . . , xr , y1 , . . . , ys ),
where A is a quantifier-free first order expression. Let us call such properties of
type ∃∀. Similarly, we define properties of type ∀∃. The main result of [3,4] is as
follows.
Theorem 15 All first order properties of graphs that may be expressed with at
most one quantifier as well as all properties that are of type ∃∀ are testable.
Furthermore, there exist properties of type ∀∃ that are not testable.
The first part of the proof of the positive result in Theorem 15 involves the
reduction, up to testability, of properties of type ∃∀ to a certain generalized
colourability property. A new variant of the regularity lemma is then used to
handle this generalized colouring problem.
5.2 A Variant of the Regularity Lemma
In this section we shall state a variant of the regularity lemma proved in [3,4].
Let us say that a partition Π = (V_i)_{i=1}^{k} of a set V is an equipartition of V if all the sets V_i (1 ≤ i ≤ k) differ by at most 1 in size. In this section, we shall not have exceptional classes in our partitions. Below, we shall have an equipartition of V

Π′ = {V_{i,j} : 1 ≤ i ≤ k, 1 ≤ j ≤ ℓ}

that is a refinement of a given partition Π = (V_i)_{i=1}^{k}. In this notation, we understand that, for all i, all the V_{i,j} (1 ≤ j ≤ ℓ) are contained in V_i.
Theorem 16 For every integer k0 and every function 0 < ε(r) < 1 defined on
the positive integers, there are constants K = K(k0 , ε) and N = N (k0 , ε) with
the following property. If G is any graph with at least N vertices, then there
exist equipartitions Π = (V_i)_{1≤i≤k} and Π′ = (V_{i,j})_{1≤i≤k, 1≤j≤ℓ} of V = V(G) such that the following hold:

(i ) |Π| = k ≥ k0 and |Π′| = kℓ ≤ K;
(ii ) at least (1 − ε(0))\binom{k}{2} of the pairs (V_i, V_{i′}) with 1 ≤ i < i′ ≤ k are ε(0)-regular;
(iii ) for all 1 ≤ i < i′ ≤ k, we have that at least (1 − ε(k))ℓ² of the pairs (V_{i,j}, V_{i′,j′}) with j, j′ ∈ [ℓ] are ε(k)-regular;
(iv ) for at least (1 − ε(0))\binom{k}{2} of the pairs 1 ≤ i < i′ ≤ k, we have that for at least (1 − ε(0))ℓ² of the pairs j, j′ ∈ [ℓ] we have

|d_G(V_i, V_{i′}) − d_G(V_{i,j}, V_{i′,j′})| ≤ ε(0).
Suppose we have partitions Π and Π′ as in Theorem 16 above and that ε(k) ≪ 1/k. It is not difficult to see that then, for many ‘choice’ functions j : [k] → [ℓ], the partition Π̃ = (V_{i,j(i)})_{1≤i≤k} is an equipartition of an induced subgraph of G such that the following hold:

(a) all the pairs (V_{i,j(i)}, V_{i′,j(i′)}) are ε(k)-regular,
(b) for at least (1 − ε(0))\binom{k}{2} of the pairs 1 ≤ i < i′ ≤ k, we have

|d_G(V_i, V_{i′}) − d_G(V_{i,j(i)}, V_{i′,j(i′)})| ≤ ε(0).
In a certain sense, this consequence of Theorem 16 lets us ignore the irregular
pairs in the partition Π, at the expense of dropping down from the Vi to smaller
sets Vi,j(i) (still all of cardinality Ω(n)), and having most but not necessarily all
densities dG (Vi,j(i) , Vi′ ,j(i′ ) ) under tight control.
Let us remark in passing that, naturally, one may ask whether Theorem 1 may be strengthened by requiring that there be no irregular pairs at all.
This question was already raised by Szemerédi in [48]. As observed by Lovász,
Seymour, Trotter, and the authors of [2] (see p. 82 in [2]), such an extension of
Theorem 1 does not exist. As noted above, Theorem 16 presents a way around
this difficulty.
Theorem 16 and its corollary mentioned above are the main ingredients in
the proof of the following result (see [3,4] for details).
Theorem 17 For every ε > 0 and h ≥ 1, there is δ = δ(ε, h) > 0 for which
the following holds. Let H be an arbitrary graph on h vertices and let P = Forb_ind(H) be the property of not containing H as an induced subgraph. If an
n-vertex graph G is ε-far from P, then G contains δnh induced copies of H.
The case in which H is a complete graph follows from the original regularity lemma, but the general case requires the corollary to Theorem 16 discussed above. Note that Theorem 17 immediately implies that the property of membership in Forb_ind(H) (in other words, the property of not containing an induced copy of H) is a testable property for any graph H.
The proof of Theorem 15 requires a generalization of Theorem 17 related to
the colouring problem alluded to at the end of Section 5.1. We refer the reader
to [3,4]. We close by remarking that Theorem 16 has an algorithmic version,
although we stress that this is not required in the proof of Theorem 15.
6 Concluding Remarks
We have not discussed a few recent, important results that relate to the regularity
lemma. We single out three topics that the reader may wish to pursue.
6.1 Constants
The constants involved in Theorem 1 are extremely large. The proof in [48] gives that K in Theorem 1 is bounded from above by a tower of 2s of height ε⁻⁵.
A recent result of Gowers [30] in fact shows that this cannot be essentially improved. Indeed, it is proved in [30] that there are graphs for which any ε-regular partition must have at least G(ε⁻ᶜ) parts, where c > 0 is some absolute constant and G(x) is a tower of 2s of height ⌊x⌋.
The size of K is very often not too relevant in applications, but in certain cases essentially better results may be obtained if one is able to avoid the appearance of such huge constants. In view of Gowers’s result, this can only be accomplished by modifying the regularity lemma. One early instance in which this was carried out appears in [14]; a more recent example is [21].
6.2 Approximation Schemes for Dense Problems
Frieze and Kannan have developed variants of the regularity lemma for graphs
and hypergraphs and discuss several applications in [21,22]. The applications
are mostly algorithmic and focus on ‘dense’ problems, such as the design of a
PTAS for the max-cut problem for dense graphs. The algorithmic versions of
their variants of the regularity lemma play a central rôle in this approach.
For more applications of algorithmic regularity to ‘dense’ problems, the reader is referred to [11,12,13,32].
6.3 The Blow-Up Lemma
We close with an important lemma due to Komlós, Sárközy, and Szemerédi [37],
the so-called blow-up lemma. (For an alternative proof of this lemma, see [44].)
In typical applications of the regularity lemma, once a suitably regular partition of some given graph G is found, one proceeds by embedding some ‘target
graph’ H of bounded degree into G. Until recently, the embedding techniques
could only handle graphs H with many fewer vertices than G. The blow-up
lemma is a novel tool that allows one to embed target graphs H that even have
the same number of vertices as G. The combined use of the regularity lemma and
the blow-up lemma is a powerful new machinery in graph theory. The reader is
referred to Komlós [36] for a discussion on the applications of the blow-up lemma.
On the algorithmic side, the situation is good: Komlós, Sárközy, and Szemerédi [38] have also proved an algorithmic version of the blow-up lemma (see
Rödl, Ruciński, and Wagner [43] for an alternative proof).
References
1. N. Alon, R. A. Duke, H. Lefmann, V. Rödl, and R. Yuster, The algorithmic aspects
of the regularity lemma (extended abstract), 33rd Annual Symposium on Foundations of Computer Science (Pittsburgh, Pennsylvania), IEEE Comput. Soc. Press,
1992, pp. 473–481.
2. N. Alon, R. A. Duke, H. Lefmann, V. Rödl, and R. Yuster, The algorithmic aspects of the regularity lemma, Journal of Algorithms 16 (1994), no. 1, 80–109.
3. N. Alon, E. Fischer, M. Krivelevich, and M. Szegedy, Efficient testing of large
graphs, submitted, 22pp., 1999.
4. N. Alon, E. Fischer, M. Krivelevich, and M. Szegedy, Efficient testing of large graphs (extended abstract), 40th Annual Symposium on Foundations of Computer Science (New York City, NY), IEEE Comput. Soc. Press, 1999, pp. 656–666.
5. N. Alon and E. Fischer, Refining the graph density condition for the existence of almost K-factors, Ars Combinatoria 52 (1999), 296–308.
6. N. Alon and R. Yuster, Almost H-factors in dense graphs, Graphs and Combinatorics 8 (1992), no. 2, 95–102.
7. N. Alon and R. Yuster, H-factors in dense graphs, Journal of Combinatorial Theory, Series B 66 (1996), no. 2, 269–282.
8. F. R. K. Chung, Regularity lemmas for hypergraphs and quasi-randomness, Random Structures and Algorithms 2 (1991), no. 1, 241–252.
9. F. R. K. Chung, R. L. Graham, and R. M. Wilson, Quasi-random graphs, Combinatorica 9 (1989), no. 4, 345–362.
10. D. Coppersmith and S. Winograd, Matrix multiplication via arithmetic progressions, Journal of Symbolic Computation 9 (1990), no. 3, 251–280.
11. A. Czygrinow, Partitioning problems in dense hypergraphs, submitted, 1999.
12. A. Czygrinow, S. Poljak, and V. Rödl, Constructive quasi-Ramsey numbers and tournament ranking, SIAM Journal on Discrete Mathematics 12 (1999), no. 1, 48–63.
13. A. Czygrinow and V. Rödl, An algorithmic regularity lemma for hypergraphs, submitted, 1999.
14. R. A. Duke, H. Lefmann, and V. Rödl, A fast approximation algorithm for computing the frequencies of subgraphs in a given graph, SIAM Journal on Computing 24 (1995), no. 3, 598–620.
15. R. A. Duke and V. Rödl, On graphs with small subgraphs of large chromatic number, Graphs and Combinatorics 1 (1985), no. 1, 91–96.
16. P. Erdős and J. Spencer, Probabilistic methods in combinatorics, Akadémiai Kiadó, Budapest, 1974, 106pp.
17. E. Fischer, Cycle factors in dense graphs, Discrete Mathematics 197/198 (1999), 309–323, 16th British Combinatorial Conference (London, 1997).
18. P. Frankl and V. Rödl, Extremal problems on set systems, Random Structures and Algorithms, to appear.
19. P. Frankl and V. Rödl, The uniformity lemma for hypergraphs, Graphs and Combinatorics 8 (1992), no. 4, 309–312.
20. P. Frankl, V. Rödl, and R. M. Wilson, The number of submatrices of a given type in a Hadamard matrix and related results, Journal of Combinatorial Theory, Series B 44 (1988), no. 3, 317–328.
21. A. Frieze and R. Kannan, The regularity lemma and approximation schemes for dense problems, 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), IEEE Comput. Soc. Press, Los Alamitos, CA, 1996, pp. 12–20.
22. A. Frieze and R. Kannan, Quick approximation to matrices and applications, Combinatorica 19 (1999), no. 2, 175–220.
23. A. Frieze and R. Kannan, A simple algorithm for constructing Szemerédi’s regularity partition, Electronic Journal of Combinatorics 6 (1999), no. 1, Research Paper 17, 7 pp. (electronic).
24. O. Goldreich, Combinatorial property testing (a survey), Randomization methods in algorithm design (Princeton, NJ, 1997), Amer. Math. Soc., Providence, RI, 1999, pp. 45–59.
25. O. Goldreich, S. Goldwasser, and D. Ron, Property testing and its connection to
learning and approximation, 37th Annual Symposium on Foundations of Computer
Science (Burlington, VT, 1996), IEEE Comput. Soc. Press, Los Alamitos, CA,
1996, pp. 339–348.
26. O. Goldreich, S. Goldwasser, and D. Ron, Property testing and its connection to learning and approximation, Journal of the Association for Computing Machinery 45 (1998), no. 4, 653–750.
27. O. Goldreich and D. Ron, Property testing in bounded degree graphs, 29th ACM
Symposium on Theory of Computing (El Paso, Texas), 1997, pp. 406–419.
28. O. Goldreich and D. Ron, A sublinear bipartiteness tester for bounded degree graphs, Combinatorica 19 (1999), no. 3, 335–373.
29. G. H. Golub and C. F. van Loan, Matrix computations, Johns Hopkins University
Press, London, 1989.
30. W. T. Gowers, Lower bounds of tower type for Szemerédi’s uniformity lemma,
Geometric and Functional Analysis 7 (1997), no. 2, 322–337.
31. P. E. Haxell, Y. Kohayakawa, and T. Luczak, The induced size-Ramsey number of
cycles, Combinatorics, Probability, and Computing 4 (1995), no. 3, 217–239.
32. P. E. Haxell and V. Rödl, Integer and fractional packings in dense graphs, submitted, 1999.
33. Y. Kohayakawa, Szemerédi’s regularity lemma for sparse graphs, Foundations of
Computational Mathematics (Berlin, Heidelberg) (F. Cucker and M. Shub, eds.),
Springer-Verlag, January 1997, pp. 216–230.
34. Y. Kohayakawa, V. Rödl, and L. Thoma, An optimal deterministic algorithm for
Szemerédi’s regularity lemma, submitted, 2000.
35. J. Komlós and M. Simonovits, Szemerédi’s regularity lemma and its applications
in graph theory, Combinatorics—Paul Erdős is eighty, vol. 2 (Keszthely, 1993)
(D. Miklós, V. T. Sós, and T. Szőnyi, eds.), Bolyai Society Mathematical Studies,
vol. 2, János Bolyai Mathematical Society, Budapest, 1996, pp. 295–352.
36. J. Komlós, The blow-up lemma, Combinatorics, Probability and Computing 8
(1999), no. 1-2, 161–176, Recent trends in combinatorics (Mátraháza, 1995).
37. J. Komlós, G. N. Sárközy, and E. Szemerédi, Blow-up lemma, Combinatorica 17
(1997), no. 1, 109–123.
38. J. Komlós, G. N. Sárközy, and E. Szemerédi, An algorithmic version of the blow-up lemma, Random Structures and Algorithms 12 (1998), no. 3, 297–312.
39. A. Lubotzky, R. Phillips, and P. Sarnak, Ramanujan graphs, Combinatorica 8
(1988), 261–277.
40. A. Lubotzky, Discrete groups, expanding graphs and invariant measures, Birkhäuser Verlag, Basel, 1994, with an appendix by Jonathan D. Rogawski.
41. G. A. Margulis, Explicit group-theoretic constructions of combinatorial schemes and
their applications in the construction of expanders and concentrators, Problemy
Peredachi Informatsii 24 (1988), no. 1, 51–60.
42. H. J. Prömel and A. Steger, Excluding induced subgraphs III. A general asymptotic,
Random Structures and Algorithms 3 (1992), no. 1, 19–31.
43. V. Rödl, A. Ruciński, and M. Wagner, An algorithmic embedding of graphs via perfect matchings, Randomization and approximation techniques in computer science,
Lecture Notes in Computer Science, vol. 1518, 1998, pp. 25–34.
44. V. Rödl and A. Ruciński, Perfect matchings in ε-regular graphs and the blow-up
lemma, Combinatorica 19 (1999), no. 3, 437–452.
45. R. Rubinfeld and M. Sudan, Robust characterizations of polynomials with applications to program testing, SIAM Journal on Computing 25 (1996), no. 2, 252–271.
46. P. Sarnak, Some applications of modular forms, Cambridge University Press, Cambridge, 1990.
47. E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arithmetica 27 (1975), 199–245, collection of articles in memory of Juriĭ Vladimirovič Linnik.
48. E. Szemerédi, Regular partitions of graphs, Problèmes Combinatoires et Théorie des Graphes (Colloq. Internat. CNRS, Univ. Orsay, Orsay, 1976) (Paris), Colloques Internationaux CNRS n. 260, 1978, pp. 399–401.
49. A. R. Taraz, Szemerédis Regularitätslemma, April 1995, Diplomarbeit, Universität
Bonn, 83pp.
50. A. G. Thomason, Pseudorandom graphs, Random graphs ’85 (Poznań, 1985),
North-Holland Math. Stud., vol. 144, North-Holland, Amsterdam–New York, 1987,
pp. 307–331.
51. A. G. Thomason, Random graphs, strongly regular graphs and pseudorandom graphs, Surveys in Combinatorics 1987 (C. Whitehead, ed.), London Mathematical Society Lecture Note Series, vol. 123, Cambridge University Press, Cambridge–New York, 1987, pp. 173–195.
52. A. Tiskin, Bulk-synchronous parallel multiplication of Boolean matrices, Automata,
languages and programming, Lecture Notes in Computer Science, vol. 1443, 1998,
pp. 494–506.
Small Maximal Matchings in Random Graphs
Michele Zito⋆
Department of Computer Science, University of Liverpool, Liverpool L69 7ZF, UK
Abstract. We look at the minimal size of a maximal matching in general, bipartite and d-regular random graphs. We prove that the ratio between the sizes of any two maximal matchings approaches one in dense random graphs and random bipartite graphs. Weaker bounds hold for sparse random graphs and random d-regular graphs. We also describe an algorithm that with high probability finds a matching of size strictly less than n/2 in a cubic graph. The result is based on approximating the algorithm dynamics by a system of linear differential equations.
1 Introduction
A matching in a graph is a set of disjoint edges. Several optimisation problems are
definable in terms of matchings. If G is a graph, M is a matching in G, we count the number of edges in M, and the goal is to maximise this value, then the corresponding problem is that of finding a maximum cardinality matching in G. This problem has a glorious history and an important place among combinatorial problems [2,5,8]. However,
few other matching problems share its nice combinatorial properties. If G = (V, E) is a graph, a matching M ⊆ E is maximal if for every e ∈ E \ M , M ∪ {e} is not a matching; V(M) = {v : ∃u {u, v} ∈ M}. Let β(G) denote the minimum cardinality of a maximal matching in G. The minimum maximal matching problem is that of finding a maximal matching in G with β(G) edges. The problem is NP-hard [10]. The size of any maximal matching is at most 2β(G) [6] in general graphs and at most (2 − 1/d)β(G) [11] in regular graphs of degree d. Some negative results are known about the approximability of β(G) [11].
In this paper we abandon the pessimistic point of view of worst-case algorithmic
analysis by assuming that each input graph G occurs with a given probability. Nothing
seems to be known about the most likely value of β(G) or the effectiveness of any
approximation heuristics in this setting. In Section 2 we prove that the most likely value
of β(G) can be estimated quite precisely, for instance, if G is chosen at random among
all graphs with a given number of vertices. Similar results are proved in Section 3 for
dense random bipartite graphs. Also, simple algorithms exist which, with high probability
(w.h.p.), that is with probability approaching one as n = |V (G)| tends to infinity, return
matchings of size β(G) + o(n). Lower bounds on β(G), improving the ones presented
above, are proved also in the case when higher probability is given to graphs with few
edges. Most of the bounds on β(G) are obtained by exploiting a simple relation between
maximal matchings and independent sets. In Section 4 we investigate the possibility of
applying a similar reasoning if G is a random d-regular graph. After showing a number
⋆ Supported by EPSRC grant GR/L/77089.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 18–27, 2000.
© Springer-Verlag Berlin Heidelberg 2000
of lower bounds on β(G) for several values of d, we present an algorithm that finds a
maximal matching in a d-regular graph. We prove that with high probability it returns a
matching of size asymptotically less than n/2 if G is a random cubic graph.
In what follows G(n, p) (G(K_{n,n}, p)) denotes the usual model of random (bipartite) graphs as defined in [1]. Also G(n, d-reg) denotes the following model for random d-regular graphs [9, Section 4]. Let n urns be given, each containing d balls (with dn even):
a set of dn/2 pairs of balls (called a configuration) is chosen at random among those
containing neither pairs with two balls from the same urn nor couples of pairs with balls
coming from just two urns. To get a random G ∈ G(n, d-reg) let {i, j} ∈ E(G) if and
only if there is a pair with one ball belonging to urn i and the other belonging to urn j. If
G is a random graph model, G ∈ G means that G is selected with a probability defined
by G. The random variable X = Xk (G) counts the number of maximal matchings of
size k in G. The meaning of the sentences “almost always (a.a.)”, “for almost every (a.e.)
graph” is defined in [1, Ch. II].
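The urn-and-balls description above translates directly into a rejection sampler; the following sketch (our own naming) pairs the dn balls uniformly at random and rejects any configuration containing a loop or a multiple edge.

```python
import random

def random_d_regular(n, d, seed=0, max_tries=1000):
    """Sample G(n, d-reg) via the configuration (pairing) model: put d
    balls in each of n urns, pair the dn balls uniformly at random, and
    reject configurations with a pair inside one urn (a loop) or two
    pairs between the same two urns (a multiple edge)."""
    assert n * d % 2 == 0, "dn must be even"
    rng = random.Random(seed)
    balls = [v for v in range(n) for _ in range(d)]  # ball -> urn label
    for _ in range(max_tries):
        rng.shuffle(balls)  # pairing consecutive balls = uniform pairing
        edges = set()
        ok = True
        for i in range(0, n * d, 2):
            u, v = balls[i], balls[i + 1]
            if u == v or (min(u, v), max(u, v)) in edges:
                ok = False
                break
            edges.add((min(u, v), max(u, v)))
        if ok:
            return edges
    raise RuntimeError("rejection sampling failed; increase max_tries")
```

For constant d, the acceptance probability is bounded away from zero, so the expected number of attempts is constant.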
2 General Random Graphs
Let q = 1 − p. If U is a random indicator, Pr[U ] will denote Pr[U = 1].
Theorem 1. If G ∈ G(n, p) then E(X) =
n (2k)!
k!
2k
p k
2
n−2k
2
q(
).
Proof. Let M_i be a set of k independent edges, assume that G is a random graph sampled
according to the model G(n, p) and let X^i_{p,k} be the random indicator equal to one if M_i
is a maximal matching in G. E(X^i_{p,k}) = Pr[X^i_{p,k}] = p^k q^{\binom{n-2k}{2}}. Then by linearity of
expectation

E(X) = \sum_{|M_i|=k} E(X^i_{p,k}) = |\{M_i : |M_i| = k\}| \cdot p^k q^{\binom{n-2k}{2}}

The number of matchings of size k is equal to the possible ways of choosing 2k vertices
out of n times the number of ways of connecting them by k independent edges divided
by the number of orderings of these chosen edges. ⊓⊔
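The counting argument at the end of the proof can be checked by brute force for small n; the sketch below (ours, not from the paper) compares an exhaustive count of size-k matchings of K_n against the closed form \binom{n}{2k}\frac{(2k)!}{2^k k!}.

```python
from itertools import combinations
from math import comb, factorial

def count_matchings_bruteforce(n, k):
    """Count the sets of k pairwise disjoint edges in the complete graph K_n."""
    edges = list(combinations(range(n), 2))
    return sum(
        1
        for M in combinations(edges, k)
        if len({v for e in M for v in e}) == 2 * k   # endpoints all distinct
    )

def count_matchings_closed_form(n, k):
    # choose 2k endpoints, pair them up, divide out the order of the k edges
    return comb(n, 2 * k) * factorial(2 * k) // (2 ** k * factorial(k))
```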
A lower bound on β(G) is obtained by bounding E(X) and then using the Markov
inequality to prove that Pr[X > 0] approaches zero as the number of vertices in the
graph becomes large. Assuming 2k = n − 2ω,

E(X) \le \frac{n!}{(2\omega)!\,2^{\frac{n}{2}-\omega}\left(\frac{n}{2}-\omega\right)!}\;p^{\frac{n}{2}-\omega}\,q^{2\omega^2-\omega}

and this goes to zero only if ω = Ω(\sqrt{n}). However a different argument gives a
considerably better result.
Theorem 2. β(G) > \frac{n}{2} - \frac{\log n}{\log(1/q)} for a.e. G ∈ G(n, p) with p constant.

Proof. If M is a maximal matching in G then V \ V(M) is an independent set. Let
Z = Z_{p,2ω} be the random variable counting independent sets of size 2ω = \frac{2\log n}{\log(1/q)} in a
random graph G. If X counts maximal matchings of size k = \frac{n}{2} − ω,

Pr[X > 0] = Pr[X > 0 | Z > 0] Pr[Z > 0] + Pr[X > 0 | Z = 0] Pr[Z = 0]
≤ Pr[X > 0 | Z > 0] Pr[Z > 0] + 0 · 1 ≤ Pr[Z > 0] → 0

The last result follows from a theorem in [4] on the independence number of dense
random graphs. Thus β(G) > \frac{n}{2} - \frac{\log n}{\log(1/q)} for a.e. G ∈ G(n, p). ⊓⊔
The argument before Theorem 2 is weak because even if E(Z_{p,2ω}) is small, E(X)
might be very large. The random graph G might have very few independent sets of size
2ω but many maximal matchings of size \frac{n}{2} − ω.

Results in [4] also have algorithmic consequences. Grimmett and McDiarmid considered
the simple greedy heuristic which repeatedly places a vertex v in the independent
set I if there is no u ∈ I with {u, v} ∈ E(G), and removes it from G. It is easily proved
that |I| ∼ \frac{\log n}{\log(1/q)}.
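In the lazy-edge view of G(n, p) the heuristic needs no explicit graph: when v is examined, each of the |I| potential edges towards I is present independently with probability p, so v joins I with probability q^{|I|}. A sketch (ours, with hypothetical function names):

```python
import random

def greedy_independent_set_size(n, p, rng):
    """Grimmett-McDiarmid greedy on G(n, p), lazy-edge formulation:
    vertex v joins I iff none of the |I| coin flips (one per potential
    edge {v, u} with u in I) comes up 'edge present'."""
    size_I = 0
    for _ in range(n):
        if all(rng.random() >= p for _ in range(size_I)):
            size_I += 1
    return size_I
```

With p constant the returned size concentrates around log n / log(1/q), half the independence number.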
Theorem 3. β(G) < \frac{n}{2} - \frac{\log n}{2\log(1/q)} for a.e. G ∈ G(n, p) with p constant.
Proof. Let IS be an algorithm that first finds a maximal independent set I in G using
the algorithm above and then looks for a perfect matching in the remaining graph. With
probability approaching one, |I| ≥ (1 − δ)\frac{\log n}{\log(1/q)} for all δ > 0. Also, IS does not
expose any edge in G − I. Hence G − I is a completely random graph on about n − |I|
vertices, each edge in it being chosen with constant probability p. Results in [3] imply
that a.a. such graphs contain a matching with at most one unmatched vertex. ⊓⊔
Independent sets are useful also for sparse graphs. If p = c/n, a lower bound on β(G)
can be obtained again by studying α(G), the size of a largest independent set of vertices
in G.

Theorem 4. β(G) > \frac{n}{2} - \frac{n\log c}{c} for a.e. G ∈ G(n, c/n), with c > 2.27.

Proof. α(G) < \frac{2n\log c}{c} for a.e. G ∈ G(n, c/n) for c > 2.27 [1, Theorem XI.22]. The result
follows by an argument similar to that of Theorem 2. ⊓⊔
If p = c/n for c sufficiently small, the exact expression for E(X) in Theorem 1 gives
an improved lower bound on β(G). Roughly, if c is sufficiently small and U is a large
independent set in G then the graph induced by V \ U very rarely contains a perfect
matching.

Theorem 5. β(G) > \frac{n}{3} for a.e. G ∈ G(n, c/n), with c ∈ (2.27, 16.99].

Proof. Let k = \frac{n}{2} - \frac{dn}{c}. If d ∈ \left(\frac{c}{6}, \frac{c}{2}\right) then k < n/3. Hence \frac{n!}{(n-2k)!\,k!} \le n^{2k}/k! and

E(X) \le O(1)\cdot\sqrt{\frac{c}{\pi(c-2d)n}}\left(\frac{c^2 e}{(c-2d)n}\right)^{\frac{n}{2}-\frac{dn}{c}} e^{-\frac{2d^2 n}{c}+d}

which goes to zero for every d in the given range. The best choice of d is the smallest,
and the theorem follows by noticing that \frac{\log c}{c} < \frac{1}{6} if c > 16.9989. ⊓⊔
3 Bipartite Graphs
The results in the last section can be extended to the case when G ∈ G(Kn,n, p). Again
β(G) is closely related to a graph parameter whose value, at least in dense random
graphs, can be estimated rather well. Given a bipartite graph G = (V1, V2, E) with
|V1| = |V2| = n, a split independent set in G is a set of 2ω independent vertices S with
|S ∩ Vi| = ω. Let σ(G) be the size of a largest split independent set in G. If M is a
maximal matching in a bipartite graph G then V \ V(M) is a split independent set.
Theorem 6. If G ∈ G(Kn,n, p) then
1. E(X) = \binom{n}{k}^2 k!\,p^k q^{(n-k)^2}.
2. If Z = Z_{p,n−k} is the random variable counting split independent sets of size n − k
and Y = Y_{p,k} is the random variable counting perfect matchings in H ∈ G(Kk,k, p)
then E(X) = E(Z) · E(Y).

Proof. Let M_i be a set of k independent edges and G ∈ G(Kn,n, p), and let X^i_{p,k} be the
random indicator equal to one if M_i is a maximal matching in G. E(X^i_{p,k}) = Pr[X^i_{p,k}] = p^k q^{(n-k)^2}. Then

E(X) = \sum_{|M_i|=k} E(X^i_{p,k}) = |\{M_i : |M_i| = k\}| \cdot p^k q^{(n-k)^2}

The number of matchings of size k is given by the possible ways of choosing k vertices
out of n on each side times the number of permutations on k elements. ⊓⊔
If p is constant, it is fairly easy to bound the first two moments of Z and get good
estimates on the value of σ(G).
Theorem 7. σ(G) ∼ \frac{4\log n}{\log 1/q} for a.e. G ∈ G(Kn,n, p) with p constant.
Proof. The expected number of split independent sets of size 2ω is \binom{n}{\omega}^2 q^{\omega^2}. Hence,
by the Markov inequality and Stirling’s approximation to the factorial,

Pr[Z > 0] < \left(\frac{n^{\omega}}{\omega!}\right)^2 q^{\omega^2}

and the right side tends to zero as n grows if 2ω = 2\left\lceil\frac{2\log n}{\log 1/q}\right\rceil.

Let 2ω = 2\left\lfloor\frac{2(1-ǫ)\log n}{\log 1/q}\right\rfloor for any ǫ > 0. The event “Z = 0” is equivalent to
“σ(G) < 2ω” because if there is no split independent set of size 2ω then the largest of
such sets can only have less than 2ω elements. By the Chebyshev inequality Pr[Z = 0] ≤
Var(Z)/E(Z)^2. Also Var(Z) = E(Z^2) − E(Z)^2. There are s_ω = \binom{n}{\omega}^2 ways of choosing
ω vertices from two disjoint sets of n vertices. If Z^i is the random indicator set to one
if S^i is a split independent set in G then Z = \sum_i Z^i and E(Z^2) = \sum_{i,j} Pr[Z^i ∧ Z^j] =
\sum_{i,j} Pr[Z^j] Pr[Z^i | Z^j] where the sums are over all i, j ∈ {1, . . . , s_ω}. Finally,
by symmetry, Pr[Z^i | Z^j] does not actually depend on j but only on the amount of
intersection between S^i and S^j. Thus, if S^1 = {1, . . . , 2ω},

E(Z^2) = \sum_j Pr[Z^j] \sum_i Pr[Z^i | Z^1] = E(Z) · E(Z|Z^1).

Thus to prove that Pr[Z = 0] converges to zero it is enough to show that the ratio
E(Z|Z^1)/E(Z) converges to one. By definition of conditional expectation

E(Z|Z^1) = \sum_{0\le l_1,l_2\le\omega} \binom{\omega}{l_1}\binom{\omega}{l_2}\binom{n-\omega}{\omega-l_1}\binom{n-\omega}{\omega-l_2}\, q^{\omega^2-l_1 l_2}

Define T_{ij} (the generic term in E(Z|Z^1)/E(Z)) by

T_{ij}\binom{n}{\omega}^2 = \binom{\omega}{i}\binom{\omega}{j}\binom{n-\omega}{\omega-i}\binom{n-\omega}{\omega-j}\, q^{-ij}

Tedious algebraic manipulations prove that T_{00} \le 1 - \frac{2\omega^2}{n-\omega+1} + \frac{\omega^3(2\omega-1)}{(n-\omega+1)^2} and, for
sufficiently large n, T_{ij} \le \frac{\omega^2}{n-\omega+1} for i + j = 1, and T_{ij} \le T_{10} for all i, j ∈ {1, . . . , ω}.
From these results it follows that

Pr\left[\sigma < 2\left\lfloor\frac{2(1-ǫ)\log n}{\log 1/q}\right\rfloor\right] \le T_{00} + T_{10} + T_{01} + \omega^2 T_{10} - 1
\le 1 - \frac{2\omega^2}{n} + \frac{\omega^3(2\omega-1)}{(n-\omega+1)^2} + \frac{2\omega^2}{n} + \frac{\omega^4}{n} - 1 = \frac{\omega^3(2\omega-1)}{(n-\omega+1)^2} + \frac{\omega^4}{n}

which tends to zero since ω = O(log n). ⊓⊔

Theorem 8. β(G) > n - \frac{2\log n}{\log 1/q} for a.e. G ∈ G(Kn,n, p) with p constant.
The similarities between the properties of independent sets in random graphs and
those of split independent sets in random bipartite graphs have some algorithmic
implications. A simple greedy heuristic almost always produces a solution whose cardinality
can be predicted quite tightly. Let I be the independent set to be output. Consider the
process that visits the vertices of a random bipartite graph G(V1, V2, E) in some fixed
order. If V_i = {v_1^i, . . . , v_n^i}, then the algorithm will look at the pair (v_j^1, v_j^2) during step
j. If {v_j^1, v_j^2} ∉ E and if there is no edge between v_j^i and any of the vertices which are
already in I then v_j^1 and v_j^2 are inserted in I. Let σ_g(G) = |I|.
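The pair process admits the same lazy-edge simulation as in the general case: at the kth candidate pair, 2k − 1 potential edges (the pair edge plus the cross edges to the k − 1 pairs already in I) must all be absent. A sketch (ours, with hypothetical names):

```python
import random

def greedy_split_independent_set_size(n, p, rng):
    """Pair-greedy heuristic on G(K_{n,n}, p): step j admits the pair
    (v_j^1, v_j^2) iff 1 + 2*(pairs already in I) lazily flipped edges
    are all absent, an event of probability (1-p)^(2k-1)."""
    pairs = 0
    for _ in range(n):
        if all(rng.random() >= p for _ in range(1 + 2 * pairs)):
            pairs += 1
    return 2 * pairs        # |I| counts vertices, two per admitted pair
```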
Theorem 9. σ_g(G) ∼ \frac{\log n}{\log 1/q} for a.e. G ∈ G(Kn,n, p) with p constant.
Proof. Suppose that 2(k − 1) vertices are already in I. The algorithm above will add two
vertices v1 and v2 as the kth pair if {v1, v2} ∉ E and there is no edge between either v1
or v2 and any of the vertices which are already in I. The two events are independent in the
given model and their joint probability is (1 − p) · (1 − p)^{2(k−1)} = (1 − p)^{2k−1}. Let W_k
(for k ∈ IN^+) be the random variable equal to the number of pairs considered before the
kth pair is added to I. W_k has geometric distribution with parameter P_k = (1 − p)^{2k−1}.
Moreover the variables W_1, W_2, . . . are all independent. Let Y_ω = \sum_{k=1}^{\omega} W_k. The
event “Y_ω < n” is implied by “σ_g(G) > 2ω”: if the split independent set returned by
the greedy algorithm contains more than 2ω vertices that means that the algorithm finds
ω independent pairs in strictly less than n trials. Also if Y_ω < n then certainly each of
the W_k cannot be larger than n. Hence

Pr[Y_ω < n] \le Pr[\cap_{k=1}^{\omega}\{W_k \le n\}] = \prod_{k=1}^{\omega}\{1 - [1 - (1-p)^{2k-1}]^n\}

Let ω = \left\lceil\frac{(1+ǫ)\log n}{2\log 1/q}\right\rceil and, given ǫ > 0 and r ∈ IN, choose m > r/ǫ. For sufficiently
large n, ω − m > 0. Hence

Pr[Y_ω < n] \le \prod_{k=\omega-m}^{\omega}\{1 - [1 - (1-p)^{2k+1}]^n\}

that is at most \{1 - [1 - (1-p)^{2(\omega-m)+1}]^n\}^m. Since (1 − x)^n ≥ 1 − nx, we also have

Pr[Y_ω < n] \le \{n(1-p)^{2(\omega-m)+1}\}^m = o(n^{-r}).

The event “Y_ω > n” is equivalent to “σ_g(G) < 2ω”. Let ω = \left\lfloor\frac{(1-ǫ)\log n}{2\log 1/q}\right\rfloor. If Y_ω > n
then there must be at least one k for which W_k > n/ω. Hence Pr[Y_ω > n] \le
Pr[\cup_{k=1}^{\omega}\{W_k > n/ω\}] and this is at most

\sum_{k=1}^{\omega} Pr[W_k > n/ω] \le \omega[1 - (1-p)^{2\omega-1}]^{\lfloor n/\omega\rfloor}.
By the choice of ω, (1 − p)^{2ω−1} > \frac{n^{-(1-ǫ)}}{1-p}. Hence

Pr[Y_ω > n] \le \omega\left[1 - \frac{n^{-(1-ǫ)}}{1-p}\right]^{\lfloor n/\omega\rfloor} \le \omega\exp\left\{-\frac{n^{-(1-ǫ)}}{1-p}\left\lfloor\frac{n}{\omega}\right\rfloor\right\}

Finally Pr[Y_ω > n] \le \omega\exp\left\{-\frac{n^{ǫ}}{(1-p)\omega} + o(1)\right\} since \lfloor n/\omega\rfloor > n/ω − 1, and the
result follows from the choice of ω. ⊓⊔
The greedy algorithm analysed in Theorem 9 does not expose any edge between
two vertices that are not selected to be in I. Therefore G − I is a random graph. Classical
results ensure the existence of a perfect matching in G − I, and polynomial time
algorithms exist which find one such matching. We have proved the following.

Theorem 10. β(G) < n - \frac{\log n}{2\log 1/q} for a.e. G ∈ G(Kn,n, p) with p constant.
4 Regular Graphs
In this section we look at the size of the smallest maximal matchings in random regular
graphs. Again known upper bounds on the independence number of such graphs imply,
in nearly all interesting cases, good lower bounds on β(G).
Theorem 11. For each d ≥ 3 there exists a constant γ(d) such that β(G) ≥ γ(d)n for
a.e. G ∈ G(n, d-reg).
Proof. It is convenient to use the configuration model described in the introduction. Two
pairs of balls in a configuration are independent if each ball is chosen from a distinct
urn. A matching in a configuration is a set of independent pairs. The expected number
of maximal matchings of size k in a random configuration is

\frac{n!\; d^{2k}\; [2k(d-1)]!\; (dn/2)!\; 2^{d(n-2k)}}{k!\,(n-2k)!\,[k(2d-1)-nd/2]!\,(dn)!}

If k = γn, using Stirling’s approximation to the factorial, this is at most

f(\gamma, d)^n = \left[\frac{[\gamma(d-1)]^{2\gamma(d-1)}\; 2^{\frac{d}{2}-2\gamma}\; d^{2\gamma-\frac{d}{2}}}{\gamma^{\gamma}\,(1-2\gamma)^{1-2\gamma}\,[\gamma(2d-1)-d/2]^{\gamma(2d-1)-d/2}}\right]^n

For every d there exists a unique γ_1(d) ∈ \left(\frac{d}{2(2d-1)}, \frac{1}{2}\right) for which f(γ, d) ≥ 1, for
γ ∈ (γ_1(d), 0.5). Since the probability that a random configuration corresponds to a
d-regular graph is bounded away from zero (see for example [1, Chap 2]), the probability
that a random d-regular graph has a maximal matching of size γn is at most f(γ, d)^n.
If d > 6 a better bound is obtained by using γ(d) = (1 − α_3(d))/2 where α_3(d) is the
smallest value in (0, 1/2) such that α(G) < α_3(d)n for a.a. G ∈ G(n, d-reg) [7]. ⊓⊔
The relationship between independent sets and maximal matchings can be further
exploited also in the case where G ∈ G(n, d-reg), but random regular graphs are rather
sparse graphs and the approach used in the previous sections cannot be easily applied in
this context. However, a simple greedy algorithm which finds a large independent set in a
d-regular graph can be modified and incorporated in a longer procedure that finds a small
maximal matching in a random regular graph. Consider the following algorithm A.
Input: Random d-regular graph with n vertices
(1) M ← ∅;
(2) while there is a vertex of degree d do
        choose v in V u.a.r. among the vertices of degree d;
        M ← M ∪ {v, u1};   /* Assume N(v) = {u1, . . . , u_{deg_G v}} */
        V ← V \ {v};
(3) for j = 1 to d − 1 do
        choose v u.a.r. among the vertices of degree d − j in V(M);
        V ← V \ {v};
(4) find a maximal matching M′ in what is left of G;
(5) make M ∪ M′ into a maximal matching for G.
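The flavour of the procedure can be conveyed by a small executable sketch (ours). It implements step (2) literally, but where the paper applies the edge-dominating-set transformation of [10] it simply completes the recorded edges greedily; it therefore returns a valid maximal matching without claiming the size bound of Theorem 12.

```python
import random

def algorithm_A_sketch(edges, d, rng):
    """Simplified reading of algorithm A.  Step (2) repeatedly picks a
    vertex v still of full degree d, records the edge to one neighbour,
    and deletes v; the recorded edges are then completed greedily to a
    maximal matching of the input graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    live = {v: set(nb) for v, nb in adj.items()}
    F = []                                   # edges recorded in step (2)
    while True:
        full = sorted(v for v in live if len(live[v]) == d)
        if not full:
            break
        v = rng.choice(full)
        u = min(live[v])                     # the neighbour u1
        F.append((v, u))
        for w in live[v]:                    # delete v from the graph
            live[w].discard(v)
        live[v] = set()
    matched, M = set(), []
    for u, v in F + list(edges):             # prefer the recorded edges
        if u not in matched and v not in matched:
            M.append((u, v))
            matched.update((u, v))
    return M
```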
Step (2) essentially mimics one of the algorithms presented in [9], with the only difference
that instead of selecting an independent set of vertices, the process selects a set of edges.
Step (4) can clearly be performed in polynomial time. In general the set M ∪ M′ is an
edge dominating set (each edge in G is adjacent to some edge in M ∪ M′) but it is not
necessarily a matching. However, by [10], any edge dominating set F can be transformed
in polynomial time into a maximal matching M of G with |M| ≤ |F|. Let D_i = {v :
deg_G v = i}. In the remaining part of this section we will analyse the evolution of |D_i|
for 0 ≤ i ≤ d, as the algorithm goes through steps (2) and (3). Step (3) is performed in a
number of iterations. For j ≥ 0, let V_i^j(t) be the size of D_i at stage t of iteration j, with
the convention that iteration 0 refers to the execution of step (2).
Step (2) for d-regular graphs. Theorem 4 in [9] implies that step (2) proceeds for
asymptotically x_1 n stages, where x_1 = \frac{1}{2} - \frac{1}{2}(d-1)^{-\frac{2}{d-2}}, adding an edge to M at every stage.
Let V_i^{j+}(t) = |D_i ∩ V(M)| at stage t of iteration j and set V_i^{j-}(t) = V_i^j(t) − V_i^{j+}(t).
Let ∆V_i^{0\,sign}(t) denote the expected change of V_i^{0\,sign}(t) (with sign ∈ {“”, “+”, “−”})
moving from stage t to t + 1 of step (2), conditioned on the history of the algorithm’s
execution up to stage t. Let v be the chosen vertex of degree d. We assume a given
fixed ordering among the vertices adjacent to v. The edge {v, u_1} is added to M and
edges {v, u_l} (for l = 2, . . . , deg_G v) are removed from G. Vertex v becomes of degree
zero and the expected reduction in the number of vertices of degree i that are (not) in
V(M) is \frac{iV_i^{0+}(t)}{n-2t} (resp. \frac{iV_i^{0-}(t)}{n-2t}), that is, the probability that a vertex in D_i ∩ V(M)
(resp. D_i ∩ (V \ V(M))) is hit over d trials. The “loss” of a vertex of degree i implies
the “gain” of a vertex of degree i − 1. Moreover if u_1 ∈ D_{i+1} ∩ (V \ V(M)) at stage t,
then u_1 ∈ D_i ∩ V(M) at stage t + 1. Let δ_{r,s} = 1 if r = s and zero otherwise. In what
follows i ∈ {1, . . . , d − 1}. Also V_d^{0-}(t) = V_d^0(t). We have
∆V_d^0(t) = -1 - \frac{d\,V_d^0(t)}{n-2t}

∆V_i^{0+}(t) = -\frac{i\,V_i^{0+}(t)}{n-2t} + (1 - δ_{d-1,i})\frac{(i+1)V_{i+1}^{0+}(t)}{n-2t} + \frac{(i+1)V_{i+1}^{0-}(t)}{nd-2dt}

∆V_i^{0-}(t) = -\frac{i\,V_i^{0-}(t)}{n-2t} + \frac{(i+1)(d-1)V_{i+1}^{0-}(t)}{nd-2dt}

∆V_0^0(t) = 1 + \frac{V_1^{0+}(t)}{n-2t} + \frac{V_1^{0-}(t)}{n-2t}
Setting x = t/n, V_i^{0\,sign}(t) = n\,v_i^{0\,sign}(t/n), we can consider the following system of
differential equations:

v_d^{0\,\prime}(x) = -1 - \frac{d\,v_d^0(x)}{1-2x}                                    v_d^0(0) = 1

v_i^{0+\,\prime}(x) = -\frac{i\,v_i^{0+}(x)}{1-2x} + (1 - δ_{d-1,i})\frac{(i+1)v_{i+1}^{0+}(x)}{1-2x} + \frac{(i+1)v_{i+1}^{0-}(x)}{d(1-2x)}      v_i^{0+}(0) = 0

v_i^{0-\,\prime}(x) = -\frac{i\,v_i^{0-}(x)}{1-2x} + \frac{(i+1)(d-1)v_{i+1}^{0-}(x)}{d(1-2x)}                v_i^{0-}(0) = 0

v_0^{0\,\prime}(x) = 1 + \frac{v_1^{0+}(x)}{1-2x} + \frac{v_1^{0-}(x)}{1-2x}                                   v_0^0(0) = 0

In each case |V_i^{0\,sign}(t + 1) − V_i^{0\,sign}(t)| is bounded by a constant. Also, the system
of differential equations above is sufficiently well-behaved, so that the hypotheses of
Theorem 1 in [9] are fulfilled and thus for large n, V_i^{0\,sign}(t) ∼ n\,v_i^{0\,sign}(t/n) where
v_i^{0\,sign}(x) are the solutions of the system above.
Lemma 1. For each d ∈ IN^+ and for each i ∈ {1, . . . , d}, there is a number A_i, two
sequences of real numbers \{B_{i,0}^j\}_{j=1,\ldots,\lceil\frac{d-i+1}{2}\rceil}, \{B_{i,1}^j\}_{j=1,\ldots,\lceil\frac{d-i}{2}\rceil}, and a number C_i
such that the system of differential equations above admits the following unique solutions:

v_i^{0-}(x) = A_i(1-2x) + (1-2x)^{\frac{i}{2}}\left(C_i\log(1-2x) + \sum_{j=0}^{\lceil\frac{d-i+1}{2}\rceil} B_{i,0}^j x^j\right) + (1-2x)^{\frac{i+1}{2}}\sum_{j=0}^{\lceil\frac{d-i}{2}\rceil} B_{i,1}^j x^j

v_i^{0+}(x) = v_i^{0-}(x)\left[\left(\frac{d}{d-1}\right)^{d-i} - 1\right]

v_0^0(x) = f_0(x) - f_0(0)

where f_0(x) = x + \left(\frac{d}{d-1}\right)^{d-1}\int\frac{v_1^{0-}(x)}{1-2x}\,dx.
Proof. We sketch the proof of the first two results (which can be formally carried out by
induction on d − i). For i = d,

v_d^0(x) = v_d^{0-}(x) = -\frac{1}{d-2}(1-2x) + \frac{d-1}{d-2}(1-2x)^{d/2}.

Assuming the result holds for v_{i+1}^{0-}(x) and letting D_i = (i+1)\frac{d-1}{d}, we have

v_i^{0-}(x) = D_i(1-2x)^{\frac{i}{2}}\int_0^x \frac{v_{i+1}^{0-}(s)}{(1-2s)^{\frac{i}{2}+1}}\,ds

and the result follows by integration (in particular, the logarithmic terms are present only
if i ≤ 2).

Let I_{i+1}^{0-}(x) = \int_0^x \frac{v_{i+1}^{0-}(s)}{(1-2s)^{\frac{i}{2}+1}}\,ds; then v_i^{0-}(x) = D_i(1-2x)^{\frac{i}{2}} I_{i+1}^{0-}(x). Therefore

v_i^{0+}(x) = (1-2x)^{\frac{i}{2}}\left[(i+1)\int_0^x \frac{v_{i+1}^{0+}(s)}{(1-2s)^{1+\frac{i}{2}}}\,ds + \frac{i+1}{d}\,I_{i+1}^{0-}(x)\right]
= (i+1)\left[\left(\frac{d}{d-1}\right)^{d-i-1} - 1\right](1-2x)^{\frac{i}{2}} I_{i+1}^{0-}(x) + \frac{v_i^{0-}(x)}{d-1}
= (i+1)\left[\left(\frac{d}{d-1}\right)^{d-i-1} - 1\right]\frac{d\,v_i^{0-}(x)}{(i+1)(d-1)} + \frac{v_i^{0-}(x)}{d-1}
= \left[\left(\frac{d}{d-1}\right)^{d-i} - 1\right] v_i^{0-}(x)
The third result follows by replacing the expression for v_i^{0+} in

v_0^{0\,\prime}(x) = 1 + \frac{v_1^{0-}(x)}{1-2x} + \frac{v_1^{0+}(x)}{1-2x}   ⊓⊔
Lemma 2. Let x_1 be the smallest root of v_d^0(x) = 0. After step (2) is completed the size
of M is asymptotically x_1 n for a.e. G ∈ G(n, d-reg). ⊓⊔
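As a quick numerical sanity check (ours, not in the paper), the first equation of the system can be integrated with Euler steps; for small d its zero can be checked against the greedy stopping point 1/2 − (1/2)(d−1)^{−2/(d−2)} quoted above for step (2).

```python
def x1_root(d, h=1e-6):
    """Euler-integrate v' = -1 - d*v/(1-2x), v(0) = 1, and return the
    first x at which v reaches zero (the end of step (2), by Lemma 2)."""
    x, v = 0.0, 1.0
    while v > 0.0:
        v += h * (-1.0 - d * v / (1.0 - 2.0 * x))
        x += h
    return x
```

For d = 3 this gives x_1 ≈ 3/8, and for d = 4 it gives x_1 ≈ 1/3.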
Step (3.j) for cubic graphs. During this step the algorithm chooses a random vertex in
D_{3-j} ∩ V(M) and removes it from G (all edges incident to it will not be added to M).
Let c_j(3-j)n/2 be the number of edges left at the beginning of iteration j. If iteration
j − 1 ended at stage x_j n, the parameter c_j satisfies the recurrence

c_j = \frac{(4-j)c_{j-1}}{3-j} - \frac{2(4-j)x_j}{3-j} = \left(1 + \frac{1}{3-j}\right)(c_{j-1} - 2x_j)

(with c_0 = 1) where x_1 has been defined above and x_2 and x_3 will be defined later. For
all i ∈ {1, 2, 3} the expected decrease in the number of vertices of degree i in V(M)
(resp. not in V(M)) is \frac{iV_i^{j+}(t)}{c_j n-2t} (resp. \frac{iV_i^{j-}(t)}{c_j n-2t}). The following set of equations describes
the expected change in the various V_i^{j\,sign}(t). In what follows i ∈ {1, . . . , 3 − j}. Notice
that V_i^{j+}(t) = 0 for all i > 3 − j during iteration j, so there are only 3 − j equations
involving V_i^{j+}(t) but there are always two involving V_i^{j-}(t).
∆V_0^j(t) = 1 + \frac{V_1^{j+}(t)}{c_j n-2t} + \frac{V_1^{j-}(t)}{c_j n-2t}

∆V_i^{j+}(t) = -δ_{3-j,i} - \frac{i\,V_i^{j+}(t)}{c_j n-2t} + \frac{(i+1)V_{i+1}^{j+}(t)}{c_j n-2t}

∆V_i^{j-}(t) = -\frac{i\,V_i^{j-}(t)}{c_j n-2t} + (1 - δ_{2,i})\frac{(i+1)V_{i+1}^{j-}(t)}{c_j n-2t}

leading to the d.e.’s

v_0^{j\,\prime}(x) = 1 + \frac{v_1^{j-}(x)}{c_j-2x} + \frac{v_1^{j+}(x)}{c_j-2x}                          v_0^j(0) = 0

v_i^{j+\,\prime}(x) = -δ_{3-j,i} - \frac{i\,v_i^{j+}(x)}{c_j-2x} + \frac{(i+1)v_{i+1}^{j+}(x)}{c_j-2x}      v_i^{j+}(0) = v_i^{(j-1)+}(x_j)

v_i^{j-\,\prime}(x) = -\frac{i\,v_i^{j-}(x)}{c_j-2x} + (1 - δ_{2,i})\frac{(i+1)v_{i+1}^{j-}(x)}{c_j-2x}     v_i^{j-}(0) = v_i^{(j-1)-}(x_j)
Theorem 12. Let xj be the smallest positive root of v4−j (x) = 0, for j ∈ {1, 2, 3}.
For a.e. G ∈ G(n, 3-reg) algorithm A returns a maximal matching of size at most
v 2− (x )+v 2− (x )
βu (G) ∼ n x1 + 1 3 2 2 3
Proof. The result follows again by applying Theorem 1 in [9] to the random variables
Vi1 sign (t). Notice that all functions vi1 sign (x) have a simple expression which can be
derived by direct integration and, in particular,
i
2
h
v 1+ (x )
2v 0 (x )
x3 = c22 1 − 4 1 − 1 c2 2
x2 = c21 1 − exp − 2c1 1
⊓
⊔
5 Conclusions
In this paper we presented a number of results about the minimal size of a maximal
matching in several types of random graphs. If the graph G is dense, with high probability β(G) is concentrated around |V (G)|/2 (both in the general and bipartite case).
Moreover simple algorithms return an asymptotically optimal matching. We also gave
simple combinatorial lower bounds on β(G) if G ∈ G(n, c/n). Finally we presented
combinatorial bounds on β(G) if G ∈ G(n, d-reg) and an algorithm that finds a maximal
matching of size asymptotically less than |V (G)|/2 in G. The complete analysis was
presented for the case when G ∈ G(n, 3-reg). In that case the bound in Theorem 11 and
the algorithmic result in Theorem 12 imply that 0.3158n < β(G) < 0.47563n. Results
similar to Theorem 12 can be proved for random d-regular graphs, although some extra
care is needed to keep track of the evolving degree sequence. Our algorithmic results
exploit a relationship between independent sets and maximal matchings. In all cases
the given minimisation problem is reduced to a maximisation one, and the analysis is
completed by exploiting a number of techniques available to deal with the maximisation
problem. The weakness of our results for sparse graphs and for regular graphs leaves the
open problem of finding a more direct approach which might produce better results.
References
1. B. Bollobás. Random Graphs. Academic Press, 1985.
2. J. Edmonds. Paths, Trees and Flowers. Canadian Journal of Math., 15:449–467, 1965.
3. P. Erdős and A. Rényi. On the Existence of a Factor of Degree One of a Connected Random
Graph. Acta Mathematica Academiae Scientiarum Hungaricae, 17(3–4):359–368, 1966.
4. G. R. Grimmett and C. J. H. McDiarmid. On Colouring Random Graphs. Mathematical
Proceedings of the Cambridge Philosophical Society, 77:313–324, 1975.
5. J. Hopcroft and R. Karp. An n5/2 Algorithm for Maximal Matching in Bipartite Graphs.
SIAM Journal on Computing, 2:225–231, 1973.
6. B. Korte and D. Hausmann. An Analysis of the Greedy Heuristic for Independence Systems.
Annals of Discrete Mathematics, 2:65–74, 1978.
7. B. D. McKay. Independent Sets in Regular Graphs of High Girth. Ars Combinatoria,
23A:179–185, 1987.
8. S. Micali and V. V. Vazirani. An O(v 1/2 e) Algorithm for Finding Maximum Matching in
General Graphs. In Proceedings of the 21st Annual Symposium on Foundations of Computer
Science, pages 17–27, New York, 1980.
9. N. C. Wormald. Differential Equations for Random Processes and Random Graphs. Annals
of Applied Probability, 5:1217–1235, 1995.
10. M. Yannakakis and F. Gavril. Edge Dominating Sets in Graphs. SIAM Journal on Applied
Mathematics, 38(3):364–372, June 1980.
11. M. Zito. Randomised Techniques in Combinatorial Algorithmics. PhD thesis, Department of
Computer Science, University of Warwick, 1999.
Some Remarks on Sparsely Connected
Isomorphism-Free Labeled Graphs
Vlady Ravelomanana1 and Loÿs Thimonier1
LaRIA
5, Rue du Moulin Neuf
80000 Amiens, France
{thimon,vlady}@laria.u-picardie.fr
Abstract. Given a set ξ = {H1, H2, · · ·} of connected non-acyclic graphs, a
ξ-free graph is one which does not contain any member of ξ as an induced subgraph.
Our first purpose in this paper is to perform an investigation into the limiting
distribution of labeled graphs and multigraphs (graphs with possible self-loops and
multiple edges), with n vertices and approximately \frac{1}{2}n edges, in which all sparse
connected components are ξ-free. Next, we prove that for any finite collection ξ of
multicyclic graphs almost all connected graphs with n vertices and n + o(n^{1/3})
edges are ξ-free. The same result holds for multigraphs.
1 Introduction
We consider here labeled graphs, i.e., graphs with labeled vertices, undirected edges and
without self-loops or multiple edges, as well as labeled multigraphs, which are labeled
graphs with self-loops and/or multiple edges. An (n, q) graph (resp. multigraph) is one
having n vertices and q edges.
On the one hand, classical papers, e.g. [7], [8], [11] and [13], provide algorithms and
analyses of algorithms that deal with random graph or multigraph generation, estimating
relevant characteristics of their evolution. Starting with an initially empty graph on n
vertices, we enrich it by successively adding edges. As the random graph evolves, it displays
a phase transition similar to the typical phenomena observed with percolation processes.
On the other hand, various authors such as Wright [19], [21] or Bender, Canfield and
McKay [3], [4] studied exact enumeration or asymptotic properties of labeled connected
graphs.
In recent years, a lot of research has been devoted to graphs that exclude certain graphs
as induced subgraphs. Let H be a connected graph and let F be a family of graphs
none of which contains a subgraph isomorphic to H. In this case, we say that the family
F is H-free. The most commonly forbidden subgraphs are the triangle, ..., C_n, K_n, K_{p,q}
graphs, or any combination of them. We refer to all connected graphs with n vertices
and (n + 1) edges as bicyclic graphs, and in general (q + 1)-cyclic graphs are connected
(n, n + q) graphs. Also in this case, we say that such a graph is a q-excess graph. In
general, we call multicyclic a connected graph which is not acyclic. The same
nomenclature holds for multigraphs. Denote by ξ = {H1, H2, H3, ...} a set of connected
multicyclic graphs. A ξ-free graph is one which does not contain any member H_i of ξ as
an induced subgraph. Throughout this
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 28–37, 2000.
c Springer-Verlag Berlin Heidelberg 2000
paper, each H_i is a connected multicyclic graph. Our goal in this paper is to extend the
study of random (n, m(n)) graphs to random ξ-free (n, m(n)) graphs when the number
of edges, added one at a time and at random, reaches m(n) ≈ \frac{1}{2}n, and to compute the
asymptotic number of ξ-free connected graphs when ξ is finite. To do this, we will rely
strongly on the result of [21]; in particular, we will investigate ξ-free connected (n, n+k)
graphs when k = o(n^{1/3}). Note that similar work can be done with multigraphs.
This paper is organized as follows. In Section 2, we recall some useful definitions
of the notions we will encounter throughout the rest of this document. In Section 3, we
work with the example of the enumeration of bicyclic graphs. The enumeration
of these graphs was discovered, as far as we know, independently by Bagaev [1] and
by Wright [19]. The purpose of this example is two-fold. First, it brings a simple new
combinatorial point of view to the relationship between the generating functions of some
integer partitions, on one hand, and graphs or multigraphs, on the other hand. Next, this
example gives us ideas, regarding the simplest complex components, of what will happen
if we force our graphs to contain some specific configurations (especially the form of
the generating function). Section 4 is devoted to the computation of the probability of
random graphs without isomorphs in the general case. In Section 5, we give an asymptotic
formula for the number of connected graphs with n vertices and n + k edges as n → ∞ and
k → ∞ but k = o(n^{1/3}), and prove that almost all (n, n + o(n^{1/3})) connected graphs are
ξ-free when ξ is finite.
2 Definitions
Generating functions, powerful tools in all combinatorial approaches, will be used throughout.
If F(z) is a power series, we write [z^n] F(z) for the coefficient of z^n in F(z).
We say that F(z) is the exponential generating function (EGF for brief) for a collection
F of labeled objects if n! [z^n] F(z) is the number of objects in F that
have n elements (see for instance [18] or [12]). The bivariate EGF for labeled rooted
trees satisfies
T(w, z) = z\,\exp(w\,T(w, z)) = \sum_{n>0}(wn)^{n-1}\frac{z^n}{n!},     (1)
where the variable w is the variable for edges and z is the variable for vertices. Without
ambiguity, one can also associate a given configuration of labeled graph or multigraph
with its EGF. For instance, a triangle can be labeled in only one way. Thus,
C_3 → C_3(w, z) = \frac{1}{3!}\,w^3 z^3.     (2)
We will denote by W_k, resp. \widehat{W}_k, the EGF for labeled multicyclic connected multigraphs,
resp. graphs, with k edges more than vertices. These EGF have been computed in [19]
and in [13]. Furthermore, we will denote by W_{k,H} and \widehat{W}_{k,H} the EGF of multicyclic
H-free multigraphs and graphs with k edges more than vertices. In these notations, the
second index corresponds to the forbidden configuration(s). Recall that a smooth graph
or multigraph is one with all vertices of degree ≥ 2 (see [20]). Throughout the rest of this
paper, the “widehat” notation will be used for EGF of graphs and the “underline” notation
corresponds to the smoothness of the species. For example, \underline{\widehat{W}}_k, resp. \underline{W}_k, are the EGF
for connected (n, n + k) smooth graphs and smooth multigraphs respectively.
3 The Link between the EGF of Bicyclic Graphs and Integer
Partitions
After the different proofs for trees (see [14] and [9]), Rényi [16] found the formula to
enumerate unicyclic graphs, which can be expressed in terms of the generating function
of rooted labeled trees:

\widehat{V}(z) = \frac{1}{2}\ln\frac{1}{1-T(z)} - \frac{T(z)}{2} - \frac{T(z)^2}{4}.     (3)
It may be noted that in some connected graphs, as well as multigraphs, the number of
edges exceeding the number of vertices can be seen as a useful enumeration parameter.
The term bicyclic graphs first appeared in the seminal paper of Flajolet et al. [11],
followed a few years later by the extensive one of Janson et al. [13], and refers to
all connected graphs with (n + 1) edges and n vertices. Wright [19] found recurrence
formulae, well adapted to formal calculation, to compute the number of all connected
graphs with k edges more than their proper number of vertices for general k. Our aim in
this section is to show that the problem of the enumeration of bicyclic graphs can also
be solved with techniques involving integer partitions.
There exist two types of graphs which are connected and have (n + 1) edges, as
shown by the figures below.

Fig. 1. Examples of bicyclic components

Fig. 2. Smooth bicyclic components without symmetry
Wright [19] showed with his reduction method that the EGF of all multicyclic graphs,
namely bicyclic graphs, can be expressed in term of the EGF of labeled rooted trees.
In order to count the number of ways to label a graph, we can repeatedly prune it by
suppressing recursively any vertex of degree 1. We then remove as many vertices as
edges. As these structures present many symmetries, our experiences suggest so far that
we ought to look at our previously described object without symmetry and without the
possible rooted subtrees.
There are \binom{n}{p}\binom{n-p}{q}\frac{(p-1)!}{2}\frac{(q-1)!}{2}\,p\,q\,(n-p-q)! = \frac{n!}{4} ways to label the graph represented
by figure 2a (p ≠ q) and 2\binom{n}{r}\,r\,\frac{(r-1)!}{2}\frac{(n-r)!}{2} = \frac{n!}{2} ways for the graph of
figure 2b. Note that the results are independent of the sizes of the subcycles. One can
obtain all smooth bicyclic graphs after considering possible symmetry criteria. In 2a,
if the subcycles have the same length, p = q, a factor 1/2 must be considered and we
have n!/8 ways to label the graph. Similarly, the graph of 2b can have the 3 arcs with the
same number of vertices. In this case, a factor 1/6 is introduced. If only two arcs have
the same number of vertices, we need a symmetry factor 1/2. Thus, the enumeration
of smooth bicyclic graphs can be viewed as a specific problem of integer partitioning into
2 or 3 parts, following the dictates of the basic graphs of figure 3.
There are
(a)
(b)
(c)
(f)
(d)
(e)
(g)
Fig. 3. The different basic smooth bicyclic graphs
With the same notations as in [6], denote by P_i(t), respectively Q_i(t), the generating
functions of the number of partitions of an integer in i parts, respectively in i different
parts. Let \underline{\widehat{W}}_1(z) be the univariate EGF for smooth bicyclic graphs; then we have
\underline{\widehat{W}}_1(t) = f(P_2(t), P_3(t), Q_2(t), Q_3(t)). A bit of algebra leads to

\underline{\widehat{W}}_1(z) = \frac{z^4(6-z)}{24(1-z)^3}.     (4)
In this formula, the denominator \frac{1}{(1-z)^3} reflects the fact that there are at most 3 arcs, or 3
degrees of freedom, in the integer partitions of the vertices of a bicyclic graph. The same remark
holds for the denominators \frac{1}{(1-T(z))^{3k}} in Wright’s formulae [19] for all (k + 1)-cyclic
connected labeled graphs. The EGF of labeled rooted trees, T(z), is introduced here
when re-expanding the reduced vertices of some smooth graph. The main consequence
of the relation between integer partitions and these EGF is that in any bicyclic graph
containing an induced q-gon as subgraph, the EGF is of the form \frac{\text{Polynomial in } T(z)}{(1-T(z))^2}.
The form of these EGF is important for the study of the asymptotic behaviour of random
graphs or multigraphs. The key point of the study of their characteristics is the analytical
properties of the tree polynomials t_n(y), defined as follows:

\frac{1}{(1-T(z))^y} = \sum_{n\ge0} t_n(y)\frac{z^n}{n!},     (5)
where t_n(y) is a polynomial of degree n in y. Knuth and Pittel [10] studied their properties.
For fixed y and n → ∞, we have

t_n(y) = \frac{\sqrt{2\pi}\,n^{n-1/2+y/2}}{2^{y/2}\,\Gamma(y/2)} + O(n^{n-1+y/2}).     (6)
This equation tells us that in the EGF of bicyclic graphs

$$\widehat{W}_1(z) = \frac{T(z)^4\,(6 - T(z))}{24\,(1 - T(z))^3} = \frac{5}{24}\,\frac{1}{(1 - T(z))^3} - \frac{19}{24}\,\frac{1}{(1 - T(z))^2} + \ldots \qquad (7)$$

only the coefficient $\frac{5}{24}$ of $t_n(3)$ is asymptotically significant. Thus in [13, Theorem 5], the authors proved that only the leading coefficients of $t_n(3k)$ are used to compute the probability of random graphs or multigraphs. As already said, these coefficients change only slightly in the study of random graphs or multigraphs without forbidden configurations. Denoting respectively by $V_{C_3}$ and $\widehat{V}_{C_3}$ the EGF for unicyclic multigraphs and graphs without triangle ($C_3$), we have
$$V_{C_3}(z) = \frac{1}{2}\,\ln\frac{1}{1 - T(z)} - \frac{T(z)^3}{6} , \qquad (8)$$

and

$$\widehat{V}_{C_3}(z) = \frac{1}{2}\,\ln\frac{1}{1 - T(z)} - \frac{T(z)}{2} - \frac{T(z)^2}{4} - \frac{T(z)^3}{6} . \qquad (9)$$
For bicyclic components without triangle, we have respectively for multigraphs and graphs

$$W_{1,C_3}(z) = \frac{T(z)\,(3 + 2T(z))}{24\,(1 - T(z))^3} \qquad\text{and}\qquad \widehat{W}_{1,C_3}(z) = \frac{T(z)^5\,(2 + 6T(z) - 3T(z)^2)}{24\,(1 - T(z))^3} . \qquad (10)$$
The decompositions of formulae (10), using the tree polynomials described by (5), lead respectively to

$$W_{1,C_3}(z) = \sum_{n\ge0}\left(\frac{5}{24}\,t_n(3) - \frac{7}{24}\,t_n(2) + \frac{1}{12}\,t_n(1)\right)\frac{z^n}{n!} , \qquad (11)$$

$$\widehat{W}_{1,C_3}(z) = \sum_{n\ge0}\left(\frac{5}{24}\,t_n(3) - \frac{25}{24}\,t_n(2) + \frac{47}{24}\,t_n(1) - \frac{35}{24}\,t_n(0) - \frac{5}{24}\,t_n(-1) + \frac{25}{24}\,t_n(-2) - \frac{5}{8}\,t_n(-3) + \frac{1}{8}\,t_n(-4)\right)\frac{z^n}{n!} . \qquad (12)$$
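The partial-fraction coefficients in (11) and (12) can be checked mechanically: substitute $T = 1 - u$ into the numerators of (10) and read off the coefficients of the powers of $u$ (each divided by 24). A small sketch with exact integer polynomial arithmetic (plain Python; the helper names are ours):

```python
def poly_mul(a, b):
    # multiply two polynomials given as coefficient lists (index = power)
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def in_powers_of_u(coeffs_T):
    """Given P(T) as a coefficient list [p0, p1, ...], return the
    coefficients of P(1 - u) as a polynomial in u."""
    result = [0]
    power = [1]                      # (1 - u)^i, built incrementally
    for c in coeffs_T:
        padded = result + [0] * (len(power) - len(result))
        for j, y in enumerate(power):
            padded[j] += c * y
        result = padded
        power = poly_mul(power, [1, -1])
    return result

# Numerator of the multigraph EGF in (10): T(3 + 2T) = 3T + 2T^2
print(in_powers_of_u([0, 3, 2]))                     # [5, -7, 2]
# Numerator of the graph EGF in (10): T^5 (2 + 6T - 3T^2)
print(in_powers_of_u([0, 0, 0, 0, 0, 2, 6, -3]))
```

The first output gives the weights $5/24$, $-7/24$, $2/24 = 1/12$ of (11); the second reproduces the eight weights of (12).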
Lemma 1. If $\xi = \{C_k,\, k \in \Omega\}$ where $\Omega$ is a finite set of integers greater than or equal to 3, the probability that a random graph or multigraph with $n$ vertices and $\frac{1}{2}n$ edges has only acyclic, unicyclic, and bicyclic components, all $C_k$-free, $k \in \Omega$, is

$$\sqrt{\frac{2}{3}}\,\cosh\!\left(\sqrt{\frac{5}{18}}\right) e^{-\sum_{k\in\Omega} \frac{1}{2k}} + O(n^{-1/3}) . \qquad (13)$$

⊓⊔
Some Remarks on Sparsely Connected Isomorphism-Free Labeled Graphs
33
Proof. This is a corollary of [13, eq (11.7)] using the formulae (8), (9), (11) and (12). Incidentally, random graphs and multigraphs have the same asymptotic behavior, as shown by the proof of [13, Theorem 4]. As graphs are multigraphs without cycles of length 1 and 2, the forbidden cycles of length 1 and 2 bring a factor $e^{-3/4}$, which is cancelled by a factor $e^{+3/4}$ coming from the ratio between the weighting functions that convert the EGF of graphs and multigraphs into probabilities:

$$\binom{\binom{n}{2}}{m}\,\frac{2^m\, m!}{n^{2m}} = \exp\left(-\frac{m}{n} - \frac{m^2}{n^2} + O\!\left(\frac{m}{n^2}\right) + O\!\left(\frac{m^3}{n^4}\right)\right) , \quad m \le \frac{n}{2} . \qquad (14)$$
The situation changes radically when cycles of length greater than or equal to 3 are forbidden. Equations (8), (9), the "significant coefficient" $\frac{5}{24}$ of $t_n(3)$ in (11) and in (12), and the demonstration of [13, Lemma 3] show us that the term $-\frac{T(z)^k}{2k}$, introduced in (8) and (9) for each forbidden $k$-gon, simply changes the result by a factor of $e^{-1/2k} + O(n^{-1/3})$.

The example of the forbidden $k$-gon suggests itself for a generalization.
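Formula (13) is easy to evaluate numerically: the factor in front of the exponential is the constant $\sqrt{2/3}\,\cosh\sqrt{5/18}$ from [13], and each forbidden $k$-gon contributes a factor $e^{-1/2k}$. A quick check (the choice $\Omega = \{3, 4\}$ below is only an illustration):

```python
import math

# limiting probability with nothing forbidden (the constant of [13])
base = math.sqrt(2 / 3) * math.cosh(math.sqrt(5 / 18))
print(base)

# forbidding triangles and squares multiplies the limit by e^(-1/6) * e^(-1/8)
omega = {3, 4}
limit = base * math.exp(-sum(1 / (2 * k) for k in omega))
print(limit)
```

The base constant evaluates to about 0.9325, and forbidding $C_3$ and $C_4$ lowers the limiting probability to about 0.70.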
4 Random Isomorphism-Free Graphs
The probabilistic results on random H-free graphs/multigraphs can be obtained when
looking at the form of the decompositions of their EGF into tree polynomials.
Lemma 2. Let $H$ be a connected $(n, n+p)$ graph or multigraph. Let $\widehat{R}_{q,H}(w,z)$, resp. $R_{q,H}(w,z)$, be the bivariate EGF of all connected $q$-excess graphs, resp. multigraphs, containing at least one subgraph isomorphic to $H$. Then $\widehat{R}_{q,H}(w,z)$ is of the form

$$\widehat{R}_{q,H}(w,z) = w^{q}\,\frac{P(T(wz))}{(1 - T(wz))^{k}} , \qquad (15)$$

where $k < 3q$ and $P$ is a polynomial. A similar formula holds for multigraphs. ⊓⊔
Proof. The more cycles $H$ has, the smaller the degree of the denominator of the EGF of multicyclic graphs or multigraphs containing a subgraph isomorphic to $H$ becomes. This follows from the fact that the EGF of $(q+1)$-cyclic graphs or multigraphs are simply combinations of integer partition functions with up to $3q$ parts. If we force our structures to contain some specific multicyclic subgraphs, some parts are fixed and we strictly diminish the number of parts of integers needed to reconstruct our graphs or multigraphs.
Lemma 3. Let $H$ be a connected multicyclic graph or multigraph with $n$ vertices and $(n+p)$ edges, $p > 0$, and let $W_{q,H}$, respectively $\widehat{W}_{q,H}$, be the generating functions of connected multicyclic $H$-free multigraphs, respectively graphs, with $q$ edges $(q \ge p)$ more than vertices. If

$$W_q(z) = \frac{c_q}{(1 - T(z))^{3q}} + \sum_{i\ge1}\frac{d_i}{(1 - T(z))^{3q-i}}$$

is Wright's EGF of $q$-excess multigraphs rewritten with tree polynomials, then

$$W_{q,H}(z) = \frac{c_q}{(1 - T(z))^{3q}} + \sum_{i\ge1}\frac{d_i'}{(1 - T(z))^{3q-i}} .$$

In these formulae the leading coefficient $c_q$, defined in [13, equation (8.6)], is the same for $W_q$ and $W_{q,H}$. An analogous result holds for graphs, with the leading coefficient $\hat{c}_q$ and the EGF $\widehat{W}_q$ and $\widehat{W}_{q,H}$. ⊓⊔
Proof. One can write $W_q(z) = \frac{P_q(T(z))}{(1 - T(z))^{3q}}$ where $P_q$ is a polynomial. Then we can express $P_q(x)$ in terms of successive powers of $(1-x)^i$, and $c_q$ equals simply $P_q(1)$. We have

$$W_{q,H}(w,z) = \frac{w^q\, c_q}{(1 - T(wz))^{3q}} + w^q \sum_{i<3q}\sum_{n\ge0} d_i'\; t_n(i)\,\frac{w^n z^n}{n!}$$

because $W_q(w,z) = W_{q,H}(w,z) + R_{q,H}(w,z)$, where $R_{q,H}(w,z)$ is the bivariate EGF of multicyclic connected graphs with $q$ edges more than vertices containing at least one subgraph isomorphic to $H$. As shown by (15), the degree of the denominator of $R_{q,H}(w,z)$ is strictly less than $3q$.
We are now ready to state the following result.

Theorem 1. Let $\xi = \{H_1, H_2, H_3, \ldots, H_m\}$ be a finite collection of multicyclic connected graphs or multigraphs. Then the probability that a random graph with $n$ vertices and $\frac{1}{2}n + O(n^{-1/3})$ edges has $r_1$ bicyclic components, $r_2$ tricyclic components, ..., $r_k$ $(k+1)$-cyclic components, all components $\{H_1, H_2, H_3, \ldots, H_m\}$-free, and no components of higher cyclic order is

$$\sqrt{\frac{2}{3}}\left(\frac{4}{3}\right)^{r}\frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_k^{r_k}}{r_k!}\,\frac{r!}{(2r)!}\;\exp\left(-\sum_{p\in\Omega}\frac{1}{2p}\right) + O(n^{-1/3}) \qquad (16)$$

where $\Omega = \{p \ge 3,\; \exists i \in [1,m] \text{ such that } H_i \text{ is a } p\text{-gon}\}$. ⊓⊔
Theorem 2 below shows that a necessary and sufficient condition to change a coefficient $c_i$ of (16) is that $\xi$ must contain all graphs contractible to a certain $i$-excess graph $H_i$.
Theorem 2. Let $H$ be a $k$-excess multicyclic graph (resp. multigraph) with $k > 0$. Suppose that $H$ has $n$ vertices and $n + k$ edges, and that $c(H)\,n!$ is the number of ways to label $H$ (for example $c(K_4) = 1/24$). Denote by $\xi_k(H)$ the set of all $k$-excess graphs contractible to $H$. Then the probability that a random graph (resp. multigraph) with $n$ vertices and $m(n) = \frac{1}{2}n + O(n^{-1/3})$ edges has $r_1$ bicyclic, $r_2$ tricyclic, ..., $r_p$ $(p+1)$-cyclic components, all without component isomorphic to any member of the set $\xi_k(H)$, is

$$\sqrt{\frac{2}{3}}\left(\frac{4}{3}\right)^{r}\frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_{k-1}^{r_{k-1}}}{r_{k-1}!}\,\frac{(c_k - c(H))^{r_k}}{r_k!}\,\frac{c_{k+1}^{r_{k+1}}}{r_{k+1}!}\cdots\frac{c_p^{r_p}}{r_p!}\,\frac{r!}{(2r)!} + O(n^{-1/3}) . \qquad (17)$$

⊓⊔
Proof. The EGF of $\xi_k(H)$ is simply

$$\xi_k(H)(w,z) = w^k\, c(H)\,\frac{T(wz)^n}{(1 - T(wz))^{3k}} . \qquad (18)$$

Thus, if in (16) we want to avoid all graphs contractible to $H$, we have to subtract (18) from the EGF of connected $k$-excess graphs. Lemma 3 shows us that the other coefficients, i.e., $c_i$ for all $i > k$, remain unchanged.
5 Asymptotic Numbers
Denote by $c(n, n+k)$ the number of connected $(n, n+k)$ graphs. Similarly, let $c_\xi(n, n+k)$ be the number of connected $(n, n+k)$ $\xi$-free graphs. Wright [21, Theorem 4] states the following result:

Theorem 3. If $k = o(n^{1/3})$, but $k \to \infty$ as $n \to \infty$, then

$$c(n, n+k) = d_k\,(3\pi)^{1/2}\,(e/12k)^{k/2}\,n^{\,n+\frac{1}{2}(3k-1)}\,\left(1 + O(k^{-1}) + O(k^{3/2}/n^{1/2})\right) . \qquad (19)$$

⊓⊔

Note that later Voblyi [17] proved $d_k \to \frac{1}{2\pi}$ as $k \to \infty$. We prove here that a very similar result holds for $c_\xi(n, n+k)$, i.e., for $\xi$-free connected graphs, when $\xi$ is finite and $k = o(n^{1/3})$.
If $X(z)$ and $Y(z)$ are two EGF, we write $X \ge Y$ iff $\forall n,\ [z^n]\,X(z) \ge [z^n]\,Y(z)$. Denote by $\widehat{W}_k$, $k \ge 0$, the set of $k$-excess graphs and by $\widehat{W}_k(w,z)$ their bivariate exponential generating function. Then $\widehat{W}_k(w,z)$, $k \ge 1$, has the following form

$$\widehat{W}_k(w,z) = \frac{b_k}{(1 - T(wz))^{3k}} - \frac{c_k}{(1 - T(wz))^{3k-1}} + \sum_{s\le3k-2}\frac{c_{k,s}}{(1 - T(wz))^{s}} \qquad (20)$$

and $\widehat{W}_0(w,z) = \widehat{V}(w,z)$ as in eq. (3).
Furthermore, denote by $\widehat{W}_{k,\xi}$ the set of connected $k$-excess $\xi$-free graphs.

Lemma 4. If $\xi$ is finite, $\widehat{W}_{k,\xi}(w,z)$, $k > 0$, has the following form

$$\widehat{W}_{k,\xi}(w,z) = \frac{b_k}{(1 - T(wz))^{3k}} - \frac{(c_k + \alpha_k)}{(1 - T(wz))^{3k-1}} + \sum_{s\le3k-2}\frac{(c_{k,s} + \alpha_{k,s})}{(1 - T(wz))^{s}} \qquad (21)$$

where $\alpha_k = 0$ if $\xi$ does not contain a $p$-gon. ⊓⊔
Proof. Denote respectively by $S_{k,\xi}$ and $J_{k,\xi}$ the EGF for $k$-excess graphs containing exactly one occurrence of a member of $\xi$, and for $k$-excess graphs with several occurrences of members of $\xi$ that are necessarily juxtaposed, i.e., the deletion of one edge destroys every occurrence of every member of $\xi$. For example, if $C_3$ and $C_4$ are in $\xi$, a "house" is a juxtaposition of them. Then, recall that $\widehat{W}_{k,\xi}$ satisfies the recurrence (see also [15])

$$V_w\,\widehat{W}_{k+1,\xi} + O(S_{k+1,\xi}) + O(J_{k+1,\xi}) = \frac{V_z^2 - V_z}{2}\,\widehat{W}_{k,\xi} - V_w\,\widehat{W}_{k,\xi} + \sum_{p+q=k}\frac{(V_z\,\widehat{W}_{p,\xi})\,(V_z\,\widehat{W}_{q,\xi})}{1 + \delta_{p,q}} \qquad (22)$$
(see [12]). Lemma 4 follows from the fact that Sk,ξ and Jk,ξ are of
where Vx = x ∂x
the form described by lemma 2. We have, for k > 0 the formula for smooth (n, n + k)
graphs with exactly one triangle
X
1
sk,i
1 3
Vz z
.
(23)
Vz W\
Sk,C3 (z) =
k−1,C3 (z) +
1−z
6
(1 − z)i
i≤3k−2
Thus,

$$S_{k,C_3}(z) = \frac{3}{2}\,(k-1)\,\frac{b_{k-1}}{(1 - T(z))^{3k-1}} + \sum_{x<3k-1}\frac{s_{k,x}}{(1 - T(z))^{x}} \qquad (24)$$

and in this case $\alpha_k$ of (21) equals $\frac{3}{2}(k-1)\,b_{k-1}$. ⊓⊔

Lemma 5.

$$\frac{b_k}{(1 - T(z))^{3k}} - \frac{(c_k + \alpha_k)}{(1 - T(z))^{3k-1}} \;\le\; \widehat{W}_{k,\xi}(z) \;\le\; \frac{b_k}{(1 - T(z))^{3k}} . \qquad (25)$$
Proof. We only have to prove $\widehat{W}_{k,\xi}(z) - \frac{b_k}{(1 - T(z))^{3k}} + \frac{(c_k + \alpha_k)}{(1 - T(z))^{3k-1}} \ge 0$, since Wright [21] shows $\widehat{W}_k(z) \le \frac{b_k}{(1 - T(z))^{3k}}$ and, a fortiori, we have $\widehat{W}_{k,\xi}(z) \le \frac{b_k}{(1 - T(z))^{3k}}$. Substituting (21) in (22) leads to

$$2(k+1)\,b_{k+1} = 3k(k+1)\,b_k + 3\sum_{t=1}^{k-1} t\,(k-t)\,b_t\,b_{k-t} \qquad (26)$$

and

$$2(3k+2)(c_{k+1} + \alpha_{k+1}) = 8(k+1)\,b_{k+1} + 3k\,b_k + (3k+2)(3k-1)(c_k + \alpha_k) + 6\sum_{t=1}^{k-1} t\,(3k - 3t - 1)\,b_t\,(c_{k-t} + \alpha_{k-t}) . \qquad (27)$$

Then we also have, as in [21, Lemma 5], $k\,b_k \le c_k + \alpha_k \le \frac{(c_1 + \alpha_1)\,k\,b_k}{b_1}$. Still using arguments similar to those of [21], the equivalents of [21, Lemmas 6, 7, 8, 9, 10] can also be obtained here (with the coefficients $c_k + \alpha_k$ instead of $c_k$) to prove Lemma 5 by induction on $k$. ⊓⊔
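Recurrence (26) is easy to run with exact rationals. A minimal sketch, assuming the starting value $b_1 = 5/24$ (the leading bicyclic coefficient visible in (7)); the function name is ours:

```python
from fractions import Fraction

def wright_b(kmax):
    """Leading coefficients b_k of (20), computed from recurrence (26),
    starting from b_1 = 5/24 (an assumption taken from formula (7))."""
    b = {1: Fraction(5, 24)}
    for k in range(1, kmax):
        s = sum(t * (k - t) * b[t] * b[k - t] for t in range(1, k))
        b[k + 1] = (3 * k * (k + 1) * b[k] + 3 * s) / (2 * (k + 1))
    return b

b = wright_b(4)
print(b[2], b[3])   # 5/16 1105/1152
```

These first values grow quickly, consistent with the $k$-dependence of the asymptotic factor $(e/12k)^{k/2}$ in (19).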
Theorem 4. Given a finite collection $\xi$ of multicyclic graphs, almost all connected $(n, n+k)$ graphs are $\xi$-free when $k = o(n^{1/3})$. ⊓⊔

Proof. By Lemma 5, [21, eq (5.2), (5.3), Theorem 2] and (26).
References
1. Bagaev, G.N.: Random graphs with degree of connectedness equal 2. Discrete Analysis 22
(1973) 3–14, (in Russian).
2. Bagaev, G.N., Voblyi, V.A.: The shrinking-and-expanding method for the graph enumeration.
Discrete Mathematics and Applications 8 (1998) 493–498.
3. Bender, E.A., Canfield, E.R., McKay, B.D.: The asymptotic number of labeled connected
graphs with a given number of vertices and edges. Random Structures and Algorithms 1
(1990) 127–169.
4. Bender, E.A., Canfield, E.R., McKay, B.D.: Asymptotic Properties of Labeled Connected
Graphs. Random Structures and Algorithms 3, No. 2 (1992) 183–202.
5. Cayley, A.: A Theorem on Trees. Quart. J. Math. Oxford Ser. 23 (1889) 376–378.
6. Comtet, L.: Analyse Combinatoire. Presses Universitaires de France (1970).
7. Erdös, P., Rényi, A.: On random graphs I. Publ. Math. Debrecen 6 (1959) 290–297.
8. Erdös, P., Rényi, A.: On the evolution of random graphs. Magyar Tud. Akad. Mat. Kut. Int. Közl. 5 (1960) 17–61.
9. Knuth, D.E.: The Art of Computer Programming, Vol. 1, "Fundamental Algorithms". 2nd Edition, Addison-Wesley, Reading (1973).
10. Knuth, D.E., Pittel, B.: A recurrence related to trees. Proc. Am. Math. Soc. 105 (1989) 335–349.
11. Flajolet, P., Knuth, D.E., Pittel, B.: The First Cycles in an Evolving Graph. Discrete Mathematics 75 (1989) 167–215.
12. Flajolet, P., Zimmermann, P., Van Cutsem, B.: A calculus for the random generation of labelled combinatorial structures. Theoretical Computer Science 132 (1994) 1–35.
13. Janson, S., Knuth, D.E., Łuczak, T., Pittel, B.: The Birth of the Giant Component. Random Structures and Algorithms 4 (1993) 233–358.
14. Moon, J.W.: Various proofs of Cayley's formula for counting trees. In: Harary, F. (ed.): A Seminar on Graph Theory. New York (1967) 70–78.
15. Ravelomanana, V., Thimonier, L.: Enumeration and random generation of the first multicyclic isomorphism-free labeled graphs. Submitted (1999).
16. Rényi, A.: On connected graphs I. Publ. Math. Inst. Hungarian Acad. Sci. 4 (1959) 385–388.
17. Voblyi, V.A.: Wright and Stepanov-Wright coefficients. Math. Notes 42 (1987) 969–974.
18. Wilf, H.S.: Generatingfunctionology. Academic Press, New York (1990).
19. Wright, E.M.: The Number of Connected Sparsely Edged Graphs. Journal of Graph Theory 1 (1977) 317–330.
20. Wright, E.M.: The Number of Connected Sparsely Edged Graphs. II. Smooth graphs and blocks. Journal of Graph Theory 2 (1978) 299–305.
21. Wright, E.M.: The Number of Connected Sparsely Edged Graphs. III. Asymptotic results. Journal of Graph Theory 4 (1980) 393–407.
Analysis of Edge Deletion Processes on Faulty Random Regular Graphs

Andreas Goerdt⋆ 1 and Mike Molloy 2

1 Fakultät für Informatik, TU Chemnitz, 09111 Chemnitz, Germany
goerdt@informatik.tu-chemnitz.de
2 Department of Computer Science, University of Toronto, Toronto, Canada
molloy@cs.toronto.edu
Abstract. Random regular graphs are, at least theoretically, popular communication networks. The reason for this is that they combine low (that is, constant) degree with the good expansion properties crucial for efficient communication and load balancing. When any kind of communication network gets large, one is faced with the question of fault tolerance of this network. Here we consider the question: are the expansion properties of random regular graphs preserved when each edge becomes faulty independently with a given fault probability? We improve previous results on this problem: expansion properties are shown to be preserved for much higher fault probabilities and lower degrees than was known before. Our proofs are much simpler than related proofs in this area.
Introduction
A natural question in the theory of fault tolerance of communication networks
reads: Is it possible to simulate the non-faulty network on the faulty one with a
well determined slowdown? Here one assumes that the network proceeds in synchronous steps and in each step each processor (= node of the network) performs
some local computation and some communication steps. Ideally one would like to
simulate the non-faulty network in such a way that the simulation is slower only
by a constant factor showing that the time is essentially unchanged. Whereas
such efficient simulations are known for networks with unbounded degree, like
the hypercube, it is still an important question whether they exist for bounded
degree networks like the butterfly [3]. Note that all of this paper refers to random
faults, that is each component (normally edge or node) gets faulty independently
with a given fault probability and the results only hold with high probability
meaning with probability going to 1 when the network gets large.
Random regular graphs with given degree d ≥ 3 are well known to be expander graphs (with high probability) [2]: There is a constant C (< 1) such
that each subset X of nodes has ≥ C · |X| neighbours adjacent to X but not
belonging to X (provided X contains at most half of all vertices). If we ever were
⋆ Author's work in part performed at the University of Toronto, supported by a grant obtained through Alasdair Urquhart.

G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 38–47, 2000.
© Springer-Verlag Berlin Heidelberg 2000
to simulate computation on a random regular graph with slowdown only by a
constant factor on the faulty graph we would need a linear size expander inside
the faulty graph.
The investigation of random regular graphs with edge faults starts with the
paper [9]. In the succeeding paper [10] attention is drawn to the preservation
of expansion properties. Some sufficient conditions are given. In work by the
first author [4] a threshold result for the existence of a linear size component is
proved. In [5] we give a sufficient condition on fault probability and degree such
that we can find a linear size expander efficiently – a question not treated in the
initial work on expansion [10]. Crucial to our result is the notion of a k-core: the k-core of a given graph is the (unique) maximal subgraph where each node has degree at least k. In [5] we first observe that the 3-core of a faulty random regular graph is an expander (this follows simply from randomness properties of the 3-core). Second, we present a simple edge deletion algorithm which is shown to find a 3-core of linear size when d ≥ 42 and each edge is non-faulty with probability at least 20/d.
The present paper improves considerably on these results: we give a precise threshold on the fault probability for the existence of a linear size k-core for any d > k ≥ 3, thus improving the previous bounds for the existence of an expanding subgraph. For example, when the degree is as low as 4 and each edge is faulty with probability < 1/9, we have a linear size 3-core and thus an expanding subgraph.
Our proof uses a proof technique originally developed for [7]. It is technically quite simple. This is in sharp contrast to the previous proofs of the weaker results mentioned above, which rely on technically advanced probability theoretic tools. The technique applies to a wide range of similar problems (see [8]). It was inspired by the original (more involved) proof of the k-core threshold for $G_{n,p}$ given in [6].
1 Outline
We will study random regular graphs with edge faults by focussing on the configuration model (cf. [1]). It is well known that properties which hold a.s. (almost surely) for a uniformly random d-regular configuration also hold a.s. for a uniformly random d-regular simple graph. For the configuration model, we consider n disjoint d-element sets called classes; the elements of these classes are called copies. A configuration is a partition of the set of all copies into 2-element sets, which are the edges. Identifying classes with vertices, configurations determine multigraphs, and standard graph theoretic terminology can be applied to configurations. More details can be found in [1]. We fix the degree d and the probability p for the rest of this paper and consider probability spaces Con(n, d, p) of random configurations where each edge is present with probability p or absent with fault probability f = 1 − p. We call this space the space of faulty configurations. An element of this space is best considered as being generated by the following probabilistic experiment consisting of two stages:
(1) Draw randomly a configuration $\Phi = (W, E)$ where $W = W_1 \,\dot\cup\, \ldots \,\dot\cup\, W_n$ and $|W_i| = d$. (2) Delete each edge of $\Phi$, along with its end-copies, independently with fault probability $f$.

The probability of a fixed faulty configuration with $k$ edges is $(n \cdot d - 2k)!! \cdot (1-p)^{(n \cdot d/2) - k} \cdot p^k$. Given $k$, each set of $k$ edges is equally likely to occur.
The degree of a class $W$ with respect to a faulty configuration $\Phi$, $Deg_\Phi(W)$, is the number of copies of $W$ which were not deleted. Note that edges $\{x, y\}$ with $x, y \in W$ contribute two to the degree. The degree of a copy $x$, $Deg_\Phi(x)$, is the degree of the class to which $x$ belongs. The $k$-core of a faulty configuration is the maximal subconfiguration of the faulty configuration in which each class has degree $\ge k$. We call classes of degree less than $k$ light, whereas classes of degree at least $k$ are heavy. By $Bin(m, \lambda)$ we denote the binomial distribution with parameters $m$ and success probability $\lambda$.
We now give an overview of the proof of the following theorem, which is the main result of this paper. For $d > k \ge 3$ we consider the real valued function $L(\lambda) = \lambda / \Pr[Bin(d-1, \lambda) \ge k-1]$, which we define for $0 < \lambda \le 1$. $L(1) = 1$ and $L(\lambda)$ goes to infinity as $\lambda$ approaches 0. Moreover, $L(\lambda)$ has a unique minimum for $1 \ge \lambda > 0$. Let $r(k, d) = \min\{L(\lambda) \mid 1 \ge \lambda > 0\}$. For example, we have that $r(3, 4) = 8/9$. The definition of $r(k, d)$ is, no doubt, mysterious at this point, but we will see that it has a very natural motivation.
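The threshold $r(k, d)$ is easy to evaluate numerically. A minimal sketch (the grid search and helper names are our own illustration, not part of the paper):

```python
import math

def binom_tail(m, q, j):
    """Pr[Bin(m, q) >= j]."""
    return sum(math.comb(m, i) * q**i * (1 - q)**(m - i) for i in range(j, m + 1))

def r(k, d, steps=20000):
    """min of L(lambda) = lambda / Pr[Bin(d-1, lambda) >= k-1] over (0, 1]."""
    return min(
        (i / steps) / binom_tail(d - 1, i / steps, k - 1)
        for i in range(1, steps + 1)
    )

print(r(3, 4))   # approx 8/9 = 0.888...
```

For $k = 3$, $d = 4$ one has $L(\lambda) = 1/(3\lambda - 2\lambda^2)$, minimized at $\lambda = 3/4$, which recovers $r(3,4) = 8/9$ in closed form.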
Theorem 1. (a) If p > r(k, d) then a random Φ ∈ Con(n, d, p) has a k-core
of linear size with high probability.
(b) If p < r(k, d) then a random Φ ∈ Con(n, d, p) has only the empty k-core
with high probability.
Theorem 1 implies that the analogous result holds for the space of faulty
random regular graphs (obtained as: first draw a graph, second delete the faulty
edges). The following algorithm which can easily be executed in the faulty network itself is at the heart of our argument.
Algorithm 2 The Global Algorithm
Input: A faulty configuration Φ. Output: the k-core of Φ.
while Φ has light classes do
Φ := the modification of Φ where all light classes are deleted.
od. Output Φ.
Specifically, when we delete a class, W , we delete (i) all copies within W , (ii)
all copies of other classes which are paired with copies of W , (iii) W itself. Note
that it is possible for W itself to still be undeleted but to contain no copies as
they were all deleted as a result of neighbouring classes being deleted, or faulty
edges. In this case, of course, W is light and so it will be deleted on the next
iteration. At the end of the algorithm Φ has only classes of degree ≥ k, which
form the k-core. The following notion will be used later on: A class W of the
faulty configuration Φ survives j (j ≥ 0) rounds of the global algorithm with
degree t iff W has not yet been deleted and has degree t after the j’th execution
Analysis of Edge Deletion Processes on Faulty Random Regular Graphs
41
of the while-loop of the algorithm with input Φ. A class simply survives if it has
not yet been deleted.
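On plain simple graphs, the global algorithm specializes to the standard iterated peeling of vertices of degree below k. A small sketch (graph, helper names, and the test graph are our own choices):

```python
def k_core(n, edges, k):
    """Return the vertex set of the k-core of a simple graph: repeatedly
    delete vertices of degree < k (the 'light' ones) until none remain."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    light = [v for v in alive if len(adj[v]) < k]
    while light:
        v = light.pop()
        if v not in alive:
            continue
        alive.discard(v)
        for u in adj[v]:
            adj[u].discard(v)
            if u in alive and len(adj[u]) < k:
                light.append(u)
    return alive

# K4 with one pendant vertex attached: the 3-core is exactly the K4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
print(sorted(k_core(5, edges, 3)))   # [0, 1, 2, 3]
```

Note that the order in which light vertices are removed does not matter: the k-core is the unique maximal subgraph of minimum degree ≥ k, which is why the "global" formulation deleting all light classes at once is equivalent.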
In section 2 we analyze this algorithm when run for $j-1$ executions of the loop, where we set $j = j(n) = \sqrt{\log_d n}$ throughout. We prove that the number of classes surviving $j-1$ rounds with degree $t \ge k$ is linear in $n$ with high probability when $p > r(k, d)$, whereas the number of light classes is $o(n)$. (Initially this number is linear in $n$.) An extra argument presented in section 4 will show how to get rid of these few light classes, leaving us with a linear size $k$-core provided $p > r(k, d)$. On the other hand, if $p < r(k, d)$ then we show that the expected number of classes surviving $j-1$ rounds with any degree is $o(n)$ and that we no longer have enough classes to form a $k$-core. This is shown in section 3.
2 Reduction of the Number of Light Classes
For $d \ge t \ge 0$, and for a particular integer $j$, we let

$$X_t : Con(n, d, p) \to \mathbb{N} \qquad (1)$$

be the number of classes surviving $j-1$ rounds of the global algorithm with degree equal to $t$. As usual we can represent $X = X_t$ as a sum of indicator random variables

$$X = X_{W_1} + \cdots + X_{W_n} , \qquad (2)$$

where $X_W$ assumes the value 1 when the class $W$ survives $j-1$ rounds with degree equal to $t$, and 0 when this is not the case. Then $EX = n \cdot E[X_W] = n \cdot \Pr[W \text{ survives with degree } t]$ for $W$ arbitrary. We determine
$\Pr[W \text{ survives with degree } t]$ approximately, that is, an interval of width $o(1)$ which includes the probability. The probability of the event "W survives $j-1$ rounds with degree t" turns out to depend only on the $j$-environment of $W$, defined as follows: for a class $W$, the $j$-environment of $W$, $j\text{-}Env_\Phi(W)$, is the subconfiguration of $\Phi$ which has as classes the classes whose distance from $W$ is at most $j$. Here distance means the number of edges in a shortest path. The edges of $j\text{-}Env_\Phi(W)$ are those induced from $\Phi$.
The proof of the following lemma follows with standard conditioning techniques, observing that the $j$-environment of a class $W$ in a random configuration can be generated by a natural probabilistic breadth first generation process (cf. [4] for details). Here it is important that $j$ goes to infinity only slowly.

Lemma 1. Let $W$ be a fixed class; then $\Pr[j\text{-}Env_\Phi(W) \text{ is a tree}] \ge 1 - o(1)$.
Note that the lemma does not mean that almost always the $j$-environment of all classes is a tree. The definition of $j$-environment extends to faulty configurations in the obvious manner. Focussing on a $j$-environment which is a tree is very convenient since, in a faulty configuration, it can be thought of as a branching process whereby the number of children of the root is distributed as $Bin(d, p)$, and the number of children of each non-root as $Bin(d-1, p)$.
The following algorithm approximates the effect the global algorithm has on a fixed class $W$, provided the $j$-environment of $W$ is a tree.

Algorithm 3 The Local Algorithm
Input: A (sub-)configuration Γ which is the j-environment of a class W in a faulty configuration; Γ is a tree with root W.
Φ := Γ
for i = j − 1 downto 0 do
Modify Φ as follows: delete all light classes in depth i of the tree Φ.
od.
The output is "W survives with degree t" if W is not deleted and has final degree t. If W is deleted then the output is "W does not survive".
Note that it is not possible for W to survive with degree less than k. By
round l of the algorithm we mean an execution of the loop with i = j − l where
1 ≤ l ≤ j. A class in depth i where 0 ≤ i ≤ j survives with degree t iff it is not
deleted and has degree t after round j − i of the algorithm. Note that classes
in depth j are never deleted and so they are considered to always survive. The
next lemma states in which respect the local algorithm approximates the global
one. The straightforward formal proof is omitted in this abridged version.
Lemma 2. Let j ≥ 1. For each class W and each faulty configuration Φ where
j − EnvΦ (W ) is a tree we have: After j − 1 rounds of the global algorithm with
Φ the class W survives with degree t ≥ k. ⇔ After running the local algorithm
with j − EnvΦ (W ) the class W survives with degree t ≥ k.
Note that W either survives j − 1 rounds of the global algorithm and the
whole local algorithm with the same degree t ≥ k or does not survive the local
algorithm in which case it does or does not survive j − 1 global rounds, but does
certainly not survive j global rounds.
We condition the following considerations on the almost sure event that for $j = j(n)$ the $j$-environment of the class $W$ in the underlying fault free configuration is a tree (cf. Lemma 1). We denote this environment in a random faulty configuration by $\Gamma$. We turn our attention to the calculation of the survival probability under the local algorithm.

For $i$ with $0 \le i \le j-1$, let $\phi_i$ be the probability that a class in level (= depth) $j-i$ of $\Gamma$ survives the local algorithm applied to $\Gamma$. As the $j$-environment in the underlying fault-free configuration is a tree, the survival events of the children of a given class are independent. Therefore:

$$\phi_0 = 1 \quad\text{and}\quad \phi_i = \Pr[Bin(d-1,\, p \cdot \phi_{i-1}) \ge k-1] . \qquad (3)$$

And furthermore, considering now the root $W$ of the $j$-environment, we get for $t \ge k$, by analogous considerations:

$$\Pr[W \text{ survives the local algorithm with degree } t] = \Pr[Bin(d,\, p \cdot \phi_{j-1}) = t] .$$
We have that the sequence of the $\phi_i$'s is monotonically decreasing and lies in the interval $[0, 1]$. Hence $\phi = \phi(p) = \lim_{i\to\infty} \phi_i$ is well defined, and as all functions involved are continuous we get: $\phi = \Pr[Bin(d-1,\, p \cdot \phi) \ge k-1]$. (Note that this equation does not define $\phi$: it is always satisfied by $\phi = 0$.)

Two further notations for subsequent usage: $\lambda_{t,i} = \lambda_{t,i}(p) = \Pr[Bin(d,\, p \cdot \phi_{i-1}) = t]$ for $i \ge 1$. Again the $\lambda_{t,i}$'s are monotonically decreasing and between 0 and 1, and $\lambda_t = \lambda_t(p) = \lim_{i\to\infty} \lambda_{t,i}$ exists. Hence for our fixed class $W$, considering $j \to \infty$, we get:

$$\Pr[W \text{ survives the local algorithm with degree } t] = \lambda_{t,j} = \lambda_t + o(1) . \qquad (4)$$
Here is where our formula for $r(k, d)$ comes from:

Lemma 3. $\phi > 0$ iff $p > r(k, d)$.

Proof. First let $\phi > 0$. As stated above, we have $\phi = \Pr[Bin(d-1, p\phi) \ge k-1]$. Therefore $\Pr[Bin(d-1, p\phi) \ge k-1] > 0$ and, setting $\lambda = p \cdot \phi$, we get $\lambda/p = \Pr[Bin(d-1, \lambda) \ge k-1]$, so $p = \lambda/\Pr[Bin(d-1, \lambda) \ge k-1] = L(\lambda)$, and the result follows.

Now let $p > r(k, d)$. Let $\lambda_0$ be such that $r(k, d) = L(\lambda_0)$. We show by induction on $i$ that $p \cdot \phi_i \ge \lambda_0$. For the induction base we get: $p \cdot \phi_0 = p > r(k, d) \ge \lambda_0$, where the last estimate holds because the denominator in the definition of $L(\lambda_0)$ is always $\le 1$. For the induction step we get:

$$p \cdot \phi_{i+1} = p \cdot \Pr[Bin(d-1,\, p \cdot \phi_i) \ge k-1] \ge p \cdot \Pr[Bin(d-1, \lambda_0) \ge k-1] > \lambda_0 ,$$

where the next-to-last estimate uses the induction hypothesis and the last one follows from the assumption. ⊓⊔
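The fixed-point iteration behind Lemma 3 can be watched numerically. For $d = 4$, $k = 3$ the threshold is $r(3, 4) = 8/9 \approx 0.889$, and iterating (3) exhibits the dichotomy (the concrete values of $p$ below are only illustrative):

```python
import math

def binom_tail(m, q, j):
    """Pr[Bin(m, q) >= j]."""
    return sum(math.comb(m, i) * q**i * (1 - q)**(m - i) for i in range(j, m + 1))

def phi_limit(d, k, p, iterations=200):
    """Iterate phi_0 = 1, phi_i = Pr[Bin(d-1, p*phi_{i-1}) >= k-1], eq. (3)."""
    phi = 1.0
    for _ in range(iterations):
        phi = binom_tail(d - 1, p * phi, k - 1)
    return phi

print(phi_limit(4, 3, 0.95))   # supercritical: p > 8/9, limit stays well above 0
print(phi_limit(4, 3, 0.80))   # subcritical:   p < 8/9, limit collapses to 0
```

In the subcritical case the collapse is fast: once $\phi$ is small, $\Pr[Bin(d-1, p\phi) \ge k-1] = \Theta(\phi^{k-1})$, so the iteration converges to 0 superlinearly.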
We now return to the analysis of the global algorithm. The next corollary follows directly from Lemma 1, Lemma 2, and (4).

Corollary 1. Let $W$ be a fixed class, $t \ge k$, and let $j = j(n) = \sqrt{\log_d n}$. In the space of faulty configurations we have (cf. (2)):

$$\Pr[X_W = 1] = \Pr[W \text{ survives } j(n)-1 \text{ global rounds with degree } t] = \lambda_t + o(1) .$$
Next, the announced concentration result:

Theorem 4. Let $t \ge k$, let $X = X_t$ be the random variable defined as in (1), and let $\lambda = \lambda_t$; then we have:
(1) $EX = \lambda \cdot n + o(n)$. (2) Almost surely $|X - \lambda \cdot n| = o(n)$.
Proof. (1) The claim follows from the representation of $X$ as a sum of indicator random variables (cf. (2)) together with Corollary 1.

(2) We show that $VX = o(n^2)$. This implies the claim by an application of Chebyshev's inequality. We have $X = X_{W_1} + X_{W_2} + \ldots + X_{W_n}$ (cf. (2)). This and (1) of the present theorem imply $VX = E[X^2] - (EX)^2 = E[X^2] - (\lambda^2 n^2 + o(n) \cdot n)$. Moreover, $E[X^2] = EX + n(n-1) \cdot E[X_U \cdot X_W] = \lambda n + o(n) + n(n-1) \cdot E[X_U \cdot X_W]$, where $U$ and $W$ are two arbitrary distinct classes. We need to show that $E[X^2] = \lambda^2 n^2 + o(n^2)$. This follows from $E[X_U \cdot X_W] = \lambda^2 + o(1)$, showing that the events $X_U = 1$ and $X_W = 1$ are asymptotically independent. This follows by conditioning on the event that the $j$-environments of $U$ and $W$ are disjoint trees and analyzing the breadth first generation procedure for the $j$-environment of a given class. Again we need that $j$ goes only slowly to infinity. ⊓⊔
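Theorem 1 and the concentration statement are easy to probe by simulation: draw a random configuration, delete edges independently, and peel light classes. A rough experiment (the configuration builder and all parameters are our own choices, not from the paper):

```python
import random

def faulty_k_core_size(n, d, p, k, seed=0):
    """Sample Con(n, d, p): pair the n*d copies uniformly, keep each edge
    with probability p, then peel light classes to get the k-core size."""
    rng = random.Random(seed)
    copies = [v for v in range(n) for _ in range(d)]
    rng.shuffle(copies)
    deg = [0] * n
    adj = [[] for _ in range(n)]
    for i in range(0, n * d, 2):
        if rng.random() < p:                  # edge survives the faults
            u, v = copies[i], copies[i + 1]
            adj[u].append(v)
            adj[v].append(u)
            deg[u] += 1
            deg[v] += 1
    alive = [True] * n
    stack = [v for v in range(n) if deg[v] < k]
    while stack:
        v = stack.pop()
        if not alive[v]:
            continue
        alive[v] = False
        for u in adj[v]:
            if alive[u]:
                deg[u] -= 1
                if deg[u] == k - 1:
                    stack.append(u)
    return sum(alive)

n = 3000
print(faulty_k_core_size(n, 4, 0.95, 3) / n)   # p > r(3,4) = 8/9: linear 3-core
print(faulty_k_core_size(n, 4, 0.80, 3) / n)   # p < 8/9: (essentially) empty core
```

At $p = 0.95$ the surviving fraction should be close to $\Pr[Bin(d, p\phi) \ge k]$, in line with the $\lambda_t$ of Theorem 4.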
3 When There is no k-Core
The proof of Theorem 1(b) is now quite simple. First we need the following fact:
Lemma 4. A.s. a random member of Con(n, d, p) has no k-core on o(n) vertices.
Proof. The lemma follows from the fact that a random member of $Con(n, d)$ a.s. has no subconfiguration with average degree at least 3 on at most $\epsilon n$ vertices, where $\epsilon = \epsilon(d)$ is a small positive constant. Consider any $s \le \epsilon n$. The number of choices for $s$ classes, $1.5s$ edges from amongst those classes, and copies for the endpoints of each edge, is at most:

$$\binom{n}{s}\binom{s^2/2}{1.5s}\,d^{3s} .$$

Setting $M(t) = t!/(2^{t/2}\,(t/2)!)$ to be the number of ways of pairing $t$ copies, we have that for any such collection, the probability that those pairs lie in our random member of $Con(n, d)$ is

$$M(dn - 3s)/M(dn) < \left(\frac{e}{n}\right)^{1.5s} .$$

Therefore, the expected number of such subconfigurations is at most:

$$\binom{n}{s}\binom{s^2/2}{1.5s}\,d^{3s}\left(\frac{e}{n}\right)^{1.5s} < \left(\frac{en}{s}\right)^{s}\left(\frac{e(s^2/2)}{1.5s}\right)^{1.5s} d^{3s}\left(\frac{e}{n}\right)^{1.5s} \le \left(\frac{20\,d^6\,s}{n}\right)^{.5s} = f(s) .$$

Therefore, if $\epsilon = 1/40d^6$ then the expected number of such subconfigurations is less than $\sum_{s=2}^{\epsilon n} f(s)$, which is easily verified to be $o(1)$. ⊓⊔
Now, by Lemma 3, we have $\phi = 0$ for $p < r(k, d)$. Therefore, as $j$ goes to infinity, the expected number of classes surviving $j$ rounds with degree at least $k$ is $o(n)$, and so almost surely this number is $o(n)$. With the last lemma we get Theorem 1(b).
4 When There is a k-Core
In this section we prove Theorem 1(a), so we assume that $p > r(k, d)$. We start by showing that almost surely very few light classes survive the first $j(n) - 1$ iterations:

Lemma 5. In $Con(n, d, p)$ almost surely: the number of light classes after $j(n) - 1 = \sqrt{\log_d n} - 1$ rounds of the global algorithm is reduced to $o(n)$.

Proof. The proof follows with Theorem 4 applied to $j - 2$ and $j - 1$ (which both go to infinity). ⊓⊔
In order to eliminate the light classes still present after $j(n)-1$ global rounds, we need to know something about the distribution of the configurations after $j(n)-1$ rounds. As usual in similar situations, the uniform distribution needs to be preserved. For $\bar n = (n_0, n_1, n_2, \ldots, n_d)$, where the sum of the $n_i$ is at most $n$, we let $Con(\bar n)$ be the space of all configurations with $n_i$ classes consisting of $i$ copies. Each configuration is equally likely. The following lemma is proved in [5].

Lemma 6. Conditioning the space $Con(n, d, p)$ on those configurations which give a configuration in $Con(\bar n)$ after $i$ global rounds, each configuration from $Con(\bar n)$ has the same probability to occur after $i$ global rounds.
After running the global algorithm for $j(n) - 1$ rounds we get, by Lemma 5, a configuration uniformly distributed in $\mathrm{Con}(\bar n)$, where $n_1 + n_2 + \cdots + n_{k-1} = o(n)$ and $|n_t - \lambda_t \cdot n| \le o(n)$ for $t \ge k$, with high probability. A probabilistic analysis of the following algorithm, which eliminates the light classes one by one, shows that we obtain a linear size k-core with high probability.
Algorithm 5
Input: A faulty configuration Φ.
Output: The k-core of Φ.
while there exist light classes in Φ do
    choose uniformly at random a light class W from all light classes,
    and delete W and the edges incident with W
od.
The classes of degree ≥ k are the k-core of Φ.
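On ordinary simple graphs the peeling in Algorithm 5 can be sketched as follows. This is our own simplified illustration (graphs instead of configurations, and an adjacency-dict representation of our choosing), not the authors' implementation.

```python
import random

def k_core(adj, k, rng=random.Random(0)):
    """Peel a graph down to its k-core in the spirit of Algorithm 5.

    adj: dict mapping each vertex to the set of its neighbours
    (simple graph, no self-loops). In each round a uniformly random
    light vertex (degree < k) is deleted with its incident edges.
    """
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    light = [v for v in adj if len(adj[v]) < k]
    while light:
        w = rng.choice(light)                     # uniform random light vertex
        for u in adj.pop(w):                      # delete w and incident edges
            adj[u].discard(w)
        light = [v for v in adj if len(adj[v]) < k]
    return adj                                    # survivors all have degree >= k
```

On a 5-cycle with one pendant vertex attached, the pendant vertex is the only light vertex for k = 2 and the algorithm returns the cycle; on a path, repeated peeling empties the graph.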
In order to perform a probabilistic analysis of this algorithm it is again important that the uniform distribution is preserved. A similar result is Proposition 1 in [6] (for the case of graphs instead of configurations).
Lemma 7. If we apply the algorithm above to a uniformly random $\Phi \in \mathrm{Con}(\bar n)$ ($\bar n$ fixed) for a given number of iterations we get: conditional on the event (in $\mathrm{Con}(\bar n)$) that the configuration obtained, $\Psi$, is in $\mathrm{Con}(n'_0, n'_1, n'_2, \ldots, n'_d)$, the configuration $\Psi$ is a uniformly random configuration from this space.
46
A. Goerdt, M. Molloy
Lemma 8. We consider probability spaces $\mathrm{Con}(\bar n)$ where the number of heavy vertices is $\ge \delta \cdot n$. In one round of Algorithm 5 one light class disappears and we get $\le k - 1$ new light classes. Let $Y : \mathrm{Con}(\bar n) \to \mathbb{N}$ be the number of new light classes after one round of Algorithm 5. Let $\nu = \sum_i i \cdot n_i$ and $\pi = (k \cdot n_k)/\nu$. Thus $\pi$ is the probability of picking a copy of degree $k$ when picking uniformly at random from all copies belonging to edges. Then:
(a) $\Pr[Y = l] = \Pr[\mathrm{Bin}(\deg(W), \pi) = l] + o(1)$.
(b) $EY \le (k - 1) \cdot \pi + o(1)$.
The straightforward proof of this lemma is omitted due to lack of space. Our
next step is to bound π.
Lemma 9. π ≤ (1 − ǫ)/(k − 1) for some ǫ > 0.
Proof. We will prove that when $p = r(k, d)$ then $\pi = 1/(k - 1)$. Since $\pi$ is easily shown to be decreasing in $p$, this proves our lemma. Recall that $r(k, d)$ is defined to be the minimum of the function $L(\lambda)$. Therefore, at $L(\lambda) = r(k, d)$, we have $L'(\lambda) = 0$. Differentiating $L$, we get:
$$\sum_{i=k-1}^{d-1} \binom{d-1}{i} \lambda^i (1-\lambda)^{d-1-i} = \sum_{i=k-1}^{d-1} \binom{d-1}{i} \lambda^i (1-\lambda)^{d-2-i}\,(i - (d-1)\lambda). \qquad (5)$$
A simple inductive proof shows that the RHS of (5) is equal to
$$(k-1)\binom{d-1}{k-1}\lambda^{k-1}(1-\lambda)^{d-k}. \qquad (6)$$
Indeed, it is trivially true for $k = d$, and if it is true for $k = r + 1$ then for $k = r$ the RHS is equal to
$$\binom{d-1}{r-1}\lambda^{r-1}(1-\lambda)^{d-1-r}\,(r - 1 - (d-1)\lambda) + r\binom{d-1}{r}\lambda^{r}(1-\lambda)^{d-1-r} = (r-1)\binom{d-1}{r-1}\lambda^{r-1}(1-\lambda)^{d-r}.$$
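The inductive identity equating the RHS of (5) with (6) can be checked exactly with rational arithmetic. The following sketch is our own sanity check, not part of the paper; it verifies the identity for small $d$ and $k$ at an arbitrary rational point $\lambda$.

```python
from fractions import Fraction
from math import comb

def rhs5(d, k, lam):
    # RHS of (5): sum_{i=k-1}^{d-1} C(d-1,i) lam^i (1-lam)^(d-2-i) (i - (d-1)lam)
    # (the i = d-1 term has exponent -1, which Fraction handles exactly)
    return sum(comb(d - 1, i) * lam**i * (1 - lam)**(d - 2 - i) * (i - (d - 1) * lam)
               for i in range(k - 1, d))

def closed_form6(d, k, lam):
    # (6): (k-1) C(d-1,k-1) lam^(k-1) (1-lam)^(d-k)
    return (k - 1) * comb(d - 1, k - 1) * lam**(k - 1) * (1 - lam)**(d - k)

lam = Fraction(3, 7)          # arbitrary rational test point with 0 < lam < 1
for d in range(3, 9):
    for k in range(2, d + 1):
        assert rhs5(d, k, lam) == closed_form6(d, k, lam)
```

Since the check uses exact fractions, equality here is genuine algebraic agreement at the chosen point, not floating-point coincidence.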
Setting $j = i + 1$ and multiplying by $\lambda d$, the LHS of (5) comes to
$$\sum_{j=k}^{d} d\binom{d-1}{j-1} \lambda^j (1-\lambda)^{d-j} = \sum_{j=k}^{d} j\binom{d}{j} \lambda^j (1-\lambda)^{d-j},$$
and (6) comes to
$$d(k-1)\binom{d-1}{k-1}\lambda^{k}(1-\lambda)^{d-k} = k(k-1)\binom{d}{k}\lambda^{k}(1-\lambda)^{d-k}.$$
Now, since
$$\pi = \frac{k\binom{d}{k}\lambda^{k}(1-\lambda)^{d-k}}{\sum_{j=k}^{d} j\binom{d}{j}\lambda^j(1-\lambda)^{d-j}} + o(1),$$
this establishes our lemma. ⊓⊔
Lemma 10. Algorithm 5 stops after o(n) rounds of the while loop with a linear
size k-core with high probability (with respect to Con(n̄)).
Proof. We define $Y_i$ to be the number of light classes remaining after $i$ steps of Algorithm 5. By assumption, $Y_0 = o(n)$. Furthermore, by Lemmas 8 and 9, we have $EY_1 \le Y_0 - 1 + (k-1)\pi < Y_0 - \epsilon$. Furthermore, it is not hard to verify that, since there are $\Theta(n)$ classes of degree $k$, then so long as $i = o(n)$ we have
$$EY_{i+1} \le Y_i - \tfrac{1}{2}\epsilon,$$
and in particular, the probability that at least $\ell$ new light vertices are formed during step $i$ is less than the probability that the binomial variable $\mathrm{Bin}(k-1, \pi)$ is at least $\ell$.

Therefore, for any $t = o(n)$, the sequence $Y_0, Y_1, \ldots, Y_t$ is statistically dominated by the random walk defined by
$$Z_0 = Y_0; \qquad Z_{i+1} = Z_i - 1 + \mathrm{Bin}\!\left(k-1,\; \frac{1 - \frac{1}{2}\epsilon}{k-1}\right).$$
Since $Z_i$ has a drift of $-\frac{1}{2}\epsilon$, it is easy to verify that with high probability $Z_t = 0$ for some $t = o(n)$, and thus with high probability $Y_t = 0$ as well. ⊓⊔
If Yt = 0 then we are left with a k-core of linear size.
Clearly Lemma 10 implies Theorem 1(a).
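The drift argument behind Lemma 10 is easy to see experimentally. The sketch below is our own illustration; the parameters z0, k, and eps are arbitrary choices, not values from the paper. It simulates the dominating walk $Z_i$ and returns the first time it reaches 0.

```python
import random

def dominating_walk(z0, k, eps, rng=random.Random(1), max_steps=200_000):
    """Run Z_{i+1} = Z_i - 1 + Bin(k-1, (1 - eps/2)/(k-1)) until Z_t <= 0.

    The increment has mean -eps/2, so the walk hits 0 after roughly
    z0/(eps/2) steps. Returns the hitting time, or None if max_steps
    is exhausted.
    """
    p = (1 - eps / 2) / (k - 1)
    z = z0
    for t in range(1, max_steps + 1):
        # Bin(k-1, p) sampled as a sum of k-1 Bernoulli trials
        z += -1 + sum(rng.random() < p for _ in range(k - 1))
        if z <= 0:
            return t
    return None

t = dominating_walk(z0=50, k=3, eps=0.2)
```

With z0 = 50 and drift −0.1 per step, the expected hitting time is around 500 steps, far below the generous step budget.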
References
1. Béla Bollobás. Random Graphs. Academic Press, 1985.
2. —. The isoperimetric number of random regular graphs. European Journal of Combinatorics 9 (1988), 241–244.
3. Richard Cole, Bruce Maggs, Ramesh Sitaraman. Routing on butterfly networks with random faults. In Proceedings FOCS 1995. IEEE, 558–570.
4. Andreas Goerdt. The giant component threshold for random regular graphs with edge faults. In Proceedings MFCS 1997. LNCS 1295, 279–288.
5. —. Random regular graphs with edge faults: expansion through cores. In Proceedings ISAAC 1998. LNCS 1533, 219–228.
6. Boris Pittel, Joel Spencer, Nicholas Wormald. Sudden emergence of a giant k-core in a random graph. Journal of Combinatorial Theory B 67 (1996), 111–151.
7. Mike Molloy and Boris Pittel. Subgraphs with average degree 3 in a random graph. In preparation.
8. Mike Molloy and Nick Wormald. In preparation.
9. S. Nikoletseas, K. Palem, P. Spirakis, M. Yung. Vertex disjoint paths and multiconnectivity in random graphs: secure network computing. In Proceedings ICALP 1994. LNCS 820, 508–519.
10. Paul Spirakis and S. Nikoletseas. Expansion properties of random regular graphs with edge faults. In Proceedings STACS 1995. LNCS 900, 421–432.
Equivalent Conditions for Regularity
(Extended Abstract)
Y. Kohayakawa¹⋆, V. Rödl², and J. Skokan²

¹ Instituto de Matemática e Estatística, Universidade de São Paulo,
Rua do Matão 1010, 05508–900 São Paulo, Brazil
yoshi@ime.usp.br
² Department of Mathematics and Computer Science,
Emory University, Atlanta, GA 30322, USA
{rodl,jozef}@mathcs.emory.edu
Abstract. Haviland and Thomason and Chung and Graham were the first to investigate systematically some properties of quasi-random hypergraphs. In particular, in a series of articles, Chung and Graham considered several quite disparate properties of random-like hypergraphs of density 1/2 and proved that they are in fact equivalent. The central concept in their work turned out to be the so-called deviation of a hypergraph. Chung and Graham proved that having small deviation is equivalent to a variety of other properties that describe quasi-randomness. In this note, we consider the concept of discrepancy for k-uniform hypergraphs with an arbitrary constant density d (0 < d < 1) and prove that the condition of having asymptotically vanishing discrepancy is equivalent to several other quasi-random properties of H, similar to the ones introduced by Chung and Graham. In particular, we give a proof of the fact that having the correct ‘spectrum’ of the s-vertex subhypergraphs is equivalent to quasi-randomness for any s ≥ 2k. Our work can be viewed as an extension of the results of Chung and Graham to the case of an arbitrary constant density. Our methods, however, are based on different ideas.
1 Introduction and the Main Result
The usefulness of random structures in theoretical computer science and in discrete
mathematics is well known. An important, closely related question is the following:
which, if any, of the almost sure properties of such structures suffice for a deterministic object to be as useful or relevant?
Our main concern here is to address the above question in the context of hypergraphs.
We shall continue the study of quasi-random hypergraphs along the lines initiated by
Haviland and Thomason [7,8] and especially by Chung [2], and Chung and Graham [3,4].
One of the central concepts concerning hypergraph quasi-randomness, the so-called hypergraph discrepancy, was investigated by Babai, Nisan, and Szegedy [1], who found
a connection between communication complexity and hypergraph discrepancy. This
⋆ Partially supported by FAPESP (Proc. 96/04505–2), by CNPq (Proc. 300334/93–1), and by MCT/FINEP (PRONEX project 107/97).
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 48–57, 2000.
© Springer-Verlag Berlin Heidelberg 2000
connection was further studied by Chung and Tetali [5]. Here, we carry out the investigation very much along the lines of Chung and Graham [3,4], except that we focus on
hypergraphs of arbitrary constant density, making use of different techniques.
In the remainder of this introduction, we carefully discuss a result of Chung and
Graham [3] and state our main result, Theorem 3 below.
1.1 The Result of Chung and Graham
We need to start with some definitions. For a set $V$ and an integer $k \ge 2$, let $[V]^k$ denote the system of all $k$-element subsets of $V$. A subset $G \subset [V]^k$ is called a k-uniform hypergraph. If $k = 2$, we have a graph. We sometimes use the notation $G = (V(G), E(G))$. If there is no danger of confusion, we shall identify the hypergraphs with their edge sets. Throughout this paper, the integer $k$ is assumed to be a fixed constant.
For any $l$-uniform hypergraph $G$ and $k \ge l$, let $K_k(G)$ be the set of all $k$-element sets that span a clique $K_k^{(l)}$ on $k$ vertices. We also denote by $K_k(2)$ the complete $k$-partite $k$-uniform hypergraph whose every partite set contains precisely two vertices. We refer to $K_k(2)$ as the generalized octahedron, or, simply, the octahedron.
We also consider a function $\mu_H : [V]^k \to \{-1, 1\}$ such that, for all $e \in [V]^k$, we have
$$\mu_H(e) = \begin{cases} -1, & \text{if } e \in H, \\ \phantom{-}1, & \text{if } e \notin H. \end{cases}$$
Let $[k] = \{1, 2, \ldots, k\}$, let $V^{2k}$ denote the set of all $2k$-tuples $(v_1, v_2, \ldots, v_{2k})$, where $v_i \in V$ ($1 \le i \le 2k$), and let $\Pi_H^{(k)} : V^{2k} \to \{-1, 1\}$ be given by
$$\Pi_H^{(k)}(u_1, \ldots, u_k, v_1, \ldots, v_k) = \prod_{\varepsilon} \mu_H(\varepsilon_1, \ldots, \varepsilon_k),$$
where the product is over all vectors $\varepsilon = (\varepsilon_i)_{i=1}^k$ with $\varepsilon_i \in \{u_i, v_i\}$ for all $i$, and we understand $\mu_H$ to be 1 on arguments with repeated entries.
Following Chung and Graham (see, e.g., [4]), we define the deviation $\mathrm{dev}(H)$ of $H$ by
$$\mathrm{dev}(H) = \frac{1}{m^{2k}} \sum_{u_i, v_i \in V,\; i \in [k]} \Pi_H^{(k)}(u_1, \ldots, u_k, v_1, \ldots, v_k).$$
For two hypergraphs $G$ and $H$, we denote by $\binom{H}{G}$ the set of all induced subhypergraphs of $H$ that are isomorphic to $G$. We also write $\binom{H}{G}_w$ for the number of weak (i.e., not necessarily induced) subhypergraphs of $H$ that are isomorphic to $G$. Furthermore, we need the notion of the link of a vertex.

Definition 1 Let $H$ be a $k$-uniform hypergraph and $x \in V(H)$. We shall call the $(k-1)$-uniform hypergraph
$$H(x) = \{e \setminus \{x\} : e \in H,\ x \in e\}$$
the link of the vertex $x$ in $H$. For a subset $W \subset V(H)$, the joint $W$-link is $H(W) = \bigcap_{x \in W} H(x)$. For simplicity, if $W = \{x_1, \ldots, x_k\}$, we write $H(x_1, \ldots, x_k)$.
Observe that if H is k-partite, then H(x) is (k − 1)-partite for every x ∈ V . Furthermore, if k = 2, then H(x) may be identified with the ordinary graph neighbourhood of
x. Moreover, H(x, x′ ) may be thought of as the ‘joint neighbourhood’ of x and x′ .
In [3], Chung and Graham proved that if the density of an $m$-vertex $k$-uniform hypergraph $H$ is 1/2, i.e., $|H| = (1/2 + o(1))\binom{m}{k}$, where $o(1) \to 0$ as $m \to \infty$, then the following statements are equivalent:

(Q1(s)) for all $k$-uniform hypergraphs $G$ on $s \ge 2k$ vertices and automorphism group $\mathrm{Aut}(G)$,
$$\binom{H}{G} = (1 + o(1))\, 2^{-\binom{s}{k}} \frac{s!}{|\mathrm{Aut}(G)|} \binom{m}{s},$$
(Q2) for all $k$-uniform hypergraphs $G$ on $2k$ vertices and automorphism group $\mathrm{Aut}(G)$, we have
$$\binom{H}{G} = (1 + o(1))\, 2^{-\binom{2k}{k}} \frac{(2k)!}{|\mathrm{Aut}(G)|} \binom{m}{2k},$$
(Q3) $\mathrm{dev}(H) = o(1)$,
(Q4) for almost all choices of vertices $x, y \in V$, the $(k-1)$-uniform hypergraph $\overline{H(x) \triangle H(y)}$, that is, the complement $[V]^{k-1} \setminus (H(x) \triangle H(y))$ of the symmetric difference of $H(x)$ and $H(y)$, satisfies Q2 with $k$ replaced by $k - 1$,
(Q5) for $1 \le r \le 2k - 1$ and almost all $x, y \in V$,
$$\binom{H(x, y)}{K_r^{(k-1)}} = (1 + o(1))\, 2^{-\binom{r}{k-1}} \binom{m}{r}.$$
The equivalence of these properties is to be understood in the following sense. If we
have two properties P = P (o(1)) and P ′ = P ′ (o(1)), then “P ⇒ P ′ ” means that
for every ε > 0 there is a δ > 0 so that any k-uniform hypergraph H on m vertices
satisfying P (δ) must also satisfy P ′ (ε), provided m > M0 (ε).
In [3] Chung and Graham stated that “it would be profitable to explore quasi-randomness extended to simulating random k-uniform hypergraphs $G_p(n)$ for $p \ne 1/2$,
or, more generally, for p = p(n), especially along the lines carried out so fruitfully by
Thomason [13,14].” Our present aim is to explore quasi-randomness from this point of
view. In this paper, we concentrate on the case in which p is an arbitrary constant. In
certain crucial parts, our methods are different from the ones of Chung and Graham.
Indeed, it seems to us that the fact that the density of H is 1/2 is essential in certain
proofs in [3] (especially those involving the concept of deviation).
1.2 Discrepancy and Subgraph Counting
The following concept was proposed by Frankl and Rödl and later investigated by Chung [2] and Chung and Graham [3,4]. For an $m$-vertex $k$-uniform hypergraph $H$ with vertex set $V$, we define the density $d(H)$ and the discrepancy $\mathrm{disc}_{1/2}(H)$ of $H$ by letting $d(H) = |H|\binom{m}{k}^{-1}$ and
$$\mathrm{disc}_{1/2}(H) = \frac{1}{m^k} \max_{G \subset [V]^{k-1}} \bigl|\, |H \cap K_k(G)| - |\bar H \cap K_k(G)| \,\bigr|, \qquad (1)$$
where the maximum is taken over all (k − 1)-uniform hypergraphs G with vertex set V ,
and H̄ is the complement [V ]k \ H of H.
To accommodate arbitrary densities, we extend the latter concept as follows.
Definition 2 Let $H$ be a $k$-uniform hypergraph with vertex set $V$, where $|V| = m$. We define the discrepancy $\mathrm{disc}(H)$ of $H$ as follows:
$$\mathrm{disc}(H) = \frac{1}{m^k} \max_{G \subset [V]^{k-1}} \bigl|\, |H \cap K_k(G)| - d(H)\,|K_k(G)| \,\bigr|, \qquad (2)$$
where the maximum is taken over all (k − 1)-uniform hypergraphs G with vertex set V .
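For $k = 2$, Definition 2 can be evaluated by brute force: a $(k-1)$-uniform hypergraph $G$ on $V$ is then just a set $S$ of vertices, and $K_2(G)$ consists of the pairs inside $S$. The following sketch is our own toy implementation (exponential in $|V|$, so for tiny examples only), not code from the paper.

```python
from itertools import combinations

def disc_graph(vertices, edges):
    """Brute-force disc(H) from Definition 2 in the graph case k = 2:
        disc(H) = (1/m^2) * max_S | e(S) - d(H) * C(|S|, 2) |,
    the maximum running over all vertex subsets S."""
    vertices = list(vertices)
    m = len(vertices)
    edge_set = {frozenset(e) for e in edges}
    d = len(edge_set) / (m * (m - 1) / 2)         # density d(H)
    best = 0.0
    for r in range(m + 1):
        for S in combinations(vertices, r):
            pairs = list(combinations(S, 2))      # K_2(G) for G = S
            e_S = sum(frozenset(p) in edge_set for p in pairs)
            best = max(best, abs(e_S - d * len(pairs)))
    return best / m**2
```

For the complete graph the discrepancy is 0, while for the star $K_{1,4}$ on 5 vertices the maximum in (2) is attained by the set of leaves, giving $2.4/25 = 0.096$.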
Observe that if $d(H) = 1/2$, then $\mathrm{disc}(H) = (1/2)\,\mathrm{disc}_{1/2}(H)$, so both notions are equivalent. Following some initial considerations by Frankl and Rödl, Chung and Graham investigated the relation between discrepancy and deviation. In fact, Chung [2] succeeded in proving the following inequalities closely connecting these quantities:
(i) $\mathrm{dev}(H) < 4^k (\mathrm{disc}_{1/2}(H))^{1/2^k}$,
(ii) $\mathrm{disc}_{1/2}(H) < (\mathrm{dev}(H))^{1/2^k}$.
For simplicity, we state the inequalities for the density 1/2 case. For the general case,
see Section 5 of [2].
Before we proceed, we need to introduce a new concept. If the vertex set of a hypergraph is totally ordered, we say that we have an ordered hypergraph. Given two ordered hypergraphs $G_\le$ and $H_{\le'}$, where $\le$ and $\le'$ denote the orderings on the vertex sets of $G = G_\le$ and $H = H_{\le'}$, we say that a function $f : V(G) \to V(H)$ is an embedding of ordered hypergraphs if (i) it is an injection, (ii) it respects the orderings, i.e., $f(x) \le' f(y)$ whenever $x \le y$, and (iii) $f(g) \in H$ if and only if $g \in G$, where $f(g)$ is the set formed by the images of all the vertices in $g$. Furthermore, if $G = G_\le$ and $H = H_{\le'}$, we write $\binom{H}{G}_{\mathrm{ord}}$ for the number of such embeddings.
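For tiny hypergraphs the quantity $\binom{H}{G}_{\mathrm{ord}}$ can be computed directly. The sketch below is our own brute force (vertex lists are assumed to be given in their linear order); it enumerates increasing injections and checks condition (iii).

```python
from itertools import combinations

def count_ordered(H_vertices, H_edges, G_vertices, G_edges, k):
    """Count ordered embeddings of the k-uniform G into H.

    An ordered embedding is an order-preserving injection f such that a
    k-set g of V(G) maps to an edge of H exactly when g is an edge of G.
    """
    H_edge_set = {frozenset(e) for e in H_edges}
    G_edge_set = {frozenset(e) for e in G_edges}
    s, count = len(G_vertices), 0
    for image in combinations(H_vertices, s):        # all increasing injections
        f = dict(zip(G_vertices, image))
        if all((frozenset(f[v] for v in g) in H_edge_set) == (frozenset(g) in G_edge_set)
               for g in combinations(G_vertices, k)):
            count += 1
    return count
```

For example, embedding a single edge on two ordered vertices into the complete graph $K_3^{(2)}$ yields 3 embeddings, one per increasing pair.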
As our main result, we shall prove the following extension of Chung and Graham’s
result.
Theorem 3 Let $H = (V, E)$ be a $k$-uniform hypergraph of density $0 < d < 1$. Then the following statements are equivalent:

(P1) $\mathrm{disc}(H) = o(1)$,
(P2) $\mathrm{disc}(H(x)) = o(1)$ for all but $o(m)$ vertices $x \in V$ and $\mathrm{disc}(H(x, y)) = o(1)$ for all but $o(m^2)$ pairs $x, y \in V$,
(P3) $\mathrm{disc}(H(x, y)) = o(1)$ for all but $o(m^2)$ pairs $x, y \in V$,
(P4) the number of octahedra $K_k(2)$ in $H$ is asymptotically minimized among all $k$-uniform hypergraphs of density $d$; indeed,
$$\binom{H}{K_k(2)}_w = (1 + o(1))\, \frac{m^{2k}}{2^k k!}\, d^{2^k},$$
(P5) for any $s \ge 2k$ and any $k$-uniform hypergraph $G$ on $s$ vertices with $e(G)$ edges and automorphism group $\mathrm{Aut}(G)$,
$$\binom{H}{G} = (1 + o(1))\, d^{e(G)} (1 - d)^{\binom{s}{k} - e(G)}\, \frac{s!}{|\mathrm{Aut}(G)|} \binom{m}{s},$$
(P5′) for any ordering $H_\le$ of $H$ and for any fixed integer $s \ge 2k$, any ordered $k$-uniform hypergraph $G_\le$ on $s$ vertices with $e(G)$ edges is such that
$$\binom{H}{G}_{\mathrm{ord}} = (1 + o(1))\, d^{e(G)} (1 - d)^{\binom{s}{k} - e(G)} \binom{m}{s},$$
(P6) for all $k$-uniform hypergraphs $G$ on $2k$ vertices with $e(G)$ edges and automorphism group $\mathrm{Aut}(G)$,
$$\binom{H}{G} = (1 + o(1))\, d^{e(G)} (1 - d)^{\binom{2k}{k} - e(G)}\, \frac{(2k)!}{|\mathrm{Aut}(G)|} \binom{m}{2k},$$
(P6′) for any ordering $H_\le$ of $H$, any ordered $k$-uniform hypergraph $G_\le$ on $2k$ vertices with $e(G)$ edges is such that
$$\binom{H}{G}_{\mathrm{ord}} = (1 + o(1))\, d^{e(G)} (1 - d)^{\binom{2k}{k} - e(G)} \binom{m}{2k}.$$
Some of the implications in Theorem 3 are fairly easy or are by now quite standard.
There are, however, two implications that appear to be more difficult.
The proof of Chung and Graham that $\mathrm{dev}_{1/2}(H) = o(1)$ implies P5 (the ‘subgraph
counting formula’) is based on an approach that has its roots in a seminal paper of
Wilson [15]. This beautiful proof seems to make non-trivial use of the fact that d(H) =
1/2. Our proof of the implication that small discrepancy implies the subgraph counting
formula (P1 ⇒ P5′ ) is based on a different technique, which works well in the arbitrary
constant density case (see Section 2.2).
Our second contribution, which is somewhat more technical in nature, lies in a novel
approach for the proof of the implication P2 ⇒ P1 . Our proof is based on a variant of
the Regularity Lemma of Szemerédi [12] for hypergraphs [6] (see Section 2.1).
2 Main Steps in the Proof of Theorem 3
2.1 The First Part
The first part of the proof of Theorem 3 consists of proving that properties P1 , . . . , P4 are
mutually equivalent. As it turns out, the proof becomes more transparent if we restrict
ourselves to k-partite hypergraphs. In the next paragraph, we introduce some definitions
that will allow us to state the k-partite version of P1 , . . . , P4 (see Theorem 15). We close
this section introducing the main tool in the proof of Theorem 15, namely, we state a
version of the Regularity Lemma for hypergraphs (see Lemma 20).
Definitions for Partite Hypergraphs. For simplicity, we first introduce the term cylinder to mean partite hypergraphs.
Definition 4 Let $k \ge l \ge 2$ be two integers. We shall refer to any $k$-partite $l$-uniform hypergraph $H = (V_1 \cup \ldots \cup V_k, E)$ as a k-partite l-cylinder or (k, l)-cylinder. If $l = k - 1$, we shall often write $H_i$ for the subhypergraph of $H$ induced on $\bigcup_{j \ne i} V_j$. Clearly, $H = \bigcup_{i=1}^k H_i$. We shall also denote by $K_k^{(l)}(V_1, \ldots, V_k)$ the complete $(k, l)$-cylinder with vertex partition $V_1 \cup \ldots \cup V_k$.
Definition 5 For a $(k, l)$-cylinder $H$, we shall denote by $K_j(H)$, $l \le j \le k$, the $(k, j)$-cylinder whose edges are precisely those $j$-element subsets of $V(H)$ that span cliques of order $j$ in $H$.
When we deal with cylinders, we have to measure density according to their natural
vertex partitions.
Definition 6 Let $H$ be a $(k, k)$-cylinder with $k$-partition $V = V_1 \cup \ldots \cup V_k$. We define the k-partite density, or simply the density, $d(H)$ of $H$ by
$$d(H) = \frac{|H|}{|V_1| \cdots |V_k|}.$$
To be precise, we should have a distinguished piece of notation for the notion of
k-partite density. However, the context will always make clear which notion we mean
when we talk about the density of a (k, k)-cylinder.
We should also be careful when we talk about the discrepancy of a cylinder.
Definition 7 Let $H$ be a $(k, k)$-cylinder with vertex set $V = V_1 \cup \ldots \cup V_k$. We define the discrepancy $\mathrm{disc}(H)$ of $H$ as follows:
$$\mathrm{disc}(H) = \frac{1}{|V_1| \cdots |V_k|} \max_{G} \bigl|\, |H \cap K_k(G)| - d(H)\,|K_k(G)| \,\bigr|, \qquad (3)$$
where the maximum is taken over all $(k, k-1)$-cylinders $G$ with vertex set $V = V_1 \cup \ldots \cup V_k$.
We now introduce a simple but important concept concerning the “regularity” of a
(k, k)-cylinder.
Definition 8 Let $H$ be a $(k, k)$-cylinder with $k$-partition $V = V_1 \cup \ldots \cup V_k$ and let $\delta < \alpha$ be two positive real numbers. We say that $H$ is $(\alpha, \delta)$-regular if the following condition is satisfied: if $G$ is any $(k, k-1)$-cylinder such that $|K_k(G)| \ge \delta |V_1| \cdots |V_k|$, then
$$(\alpha - \delta)|K_k(G)| \le |H \cap K_k(G)| \le (\alpha + \delta)|K_k(G)|. \qquad (4)$$

Lemma 9 Let $H$ be an $(\alpha, \delta)$-regular $(k, k)$-cylinder. Then $\mathrm{disc}(H) \le 2\delta$.

Lemma 10 Suppose $H$ is a $(k, k)$-cylinder with $k$-partition $V = V_1 \cup \ldots \cup V_k$. Put $\alpha = d(H)$ and assume that $\mathrm{disc}(H) \le \delta$. Then $H$ is $(\alpha, \delta^{1/2})$-regular.
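For $k = 2$, a $(k, k-1)$-cylinder $G$ is determined by a pair of subsets $S \subseteq V_1$, $T \subseteq V_2$ with $K_2(G) = S \times T$, so Definition 8 reduces to a form of $\varepsilon$-regularity for bipartite graphs. The brute-force checker below is our own illustration (exponential in the part sizes, feasible only for tiny parts), not an algorithm from the paper.

```python
from itertools import combinations

def is_regular_pair(V1, V2, edges, alpha, delta):
    """Check Definition 8 for k = 2: H is (alpha, delta)-regular iff every
    S, T with |S||T| >= delta |V1||V2| satisfies
        (alpha - delta)|S||T| <= e(S, T) <= (alpha + delta)|S||T|."""
    edge_set = {(u, v) for (u, v) in edges}
    threshold = delta * len(V1) * len(V2)

    def subsets(V):
        for r in range(len(V) + 1):
            yield from combinations(V, r)

    for S in subsets(V1):
        for T in subsets(V2):
            size = len(S) * len(T)
            if size < threshold:          # pair too small to be tested
                continue
            e = sum((u, v) in edge_set for u in S for v in T)
            if not (alpha - delta) * size <= e <= (alpha + delta) * size:
                return False
    return True
```

The complete bipartite graph is (1, δ)-regular for every δ > 0, but fails the check for a density parameter well below 1, as expected.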
The k-Partite Result. Suppose H is a k-uniform hypergraph and let H′ be a ‘typical’
k-partite spanning subhypergraph of H. In this section, we relate the discrepancies of H
and H′ .
Definition 11 Let $H = (V, E)$ be a $k$-uniform hypergraph with $m$ vertices and let $P = (V_i)_1^k$ be a partition of $V$. We denote by $H_P$ the $(k, k)$-cylinder consisting of the edges $h \in H$ satisfying $|h \cap V_i| = 1$ for all $1 \le i \le k$.
The following lemma holds.
Lemma 12 For any partition $P = (V_i)_1^k$ of $V$, we have
(i) $\mathrm{disc}(H) \ge |d(H_P) - d(H)|\,|V_1| \cdots |V_k| / m^k$,
(ii) $\mathrm{disc}(H_P) \le 2\,\mathrm{disc}(H)\, m^k / (|V_1| \cdots |V_k|)$.

An immediate consequence of the previous lemma is the following.

Lemma 13 If $\mathrm{disc}(H) = o(1)$, then $\mathrm{disc}(H_P) = o(1)$ for $(1 - o(1))k^m$ partitions $P = (V_i)_1^k$ of $V$.

With some more effort, one may prove a converse to Lemma 13.

Lemma 14 Suppose there exists a real number $\gamma > 0$ such that $\mathrm{disc}(H_P) = o(1)$ for $\gamma k^m$ partitions $P = (V_i)_1^k$ of $V$. Then $\mathrm{disc}(H) = o(1)$.
We now state the k-partite version of a part of our main result, Theorem 3.
Theorem 15 Suppose $V = V_1 \cup \ldots \cup V_k$, $|V_1| = \ldots = |V_k| = n$, and let $H = (V, E)$ be a $(k, k)$-cylinder with $|H| = dn^k$. Then the following four conditions are equivalent:
(C1) $H$ is $(d, o(1))$-regular;
(C2) $H(x)$ is $(d, o(1))$-regular for all but $o(n)$ vertices $x \in V_k$ and $H(x, y)$ is $(d^2, o(1))$-regular for all but $o(n^2)$ pairs $x, y \in V_k$;
(C3) $H(x, y)$ is $(d^2, o(1))$-regular for all but $o(n^2)$ pairs $x, y \in V_k$;
(C4) the number of copies of $K_k(2)$ in $H$ is asymptotically minimized among all such $(k, k)$-cylinders of density $d$, and equals $(1 + o(1))\,n^{2k} d^{2^k}/2^k$.
Remark 1. The condition |V1 | = . . . = |Vk | = n in the result above has the sole purpose
of making the statement more transparent. The immediate generalization of Theorem 15
for V1 , . . . , Vk of arbitrary sizes holds.
Remark 2. The fact that the minimal number of octahedra in a $(k, k)$-cylinder is asymptotically $(1 + o(1))\,n^{2k} d^{2^k}/2^k$ is not difficult to deduce from a standard application of the Cauchy–Schwarz inequality for counting “cherries” (paths of length 2) in bipartite graphs.
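In the graph case $k = 2$ the octahedron $K_2(2)$ is the four-cycle, and the cherry count gives an exact formula: each pair of vertices in $V_1$ with $c$ common neighbours contributes $\binom{c}{2}$ four-cycles. A small sketch of our own, for intuition only:

```python
from itertools import combinations
from math import comb

def count_c4(V1, V2, edges):
    """Count copies of K_2(2) (four-cycles) in a bipartite graph via
    cherries: sum over pairs u < w in V1 of C(codeg(u, w), 2)."""
    nbr = {u: {v for (x, v) in edges if x == u} for u in V1}
    return sum(comb(len(nbr[u] & nbr[w]), 2) for u, w in combinations(V1, 2))
```

For the complete bipartite graph with $|V_1| = |V_2| = 4$ (density $d = 1$) this yields $\binom{4}{2}^2 = 36$, in line with the asymptotic value $n^4 d^4/4$ for large $n$.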
We leave the derivation of the equivalence of properties P1 , . . . , P4 from Theorem 15
to the full paper.
A Regularity Lemma. The hardest part in the proof of Theorem 15 is the implication
C2 ⇒ C1 . In this paragraph, we discuss the main tool used in the proof of this implication.
It turns out that, in what follows, the notation is simplified if we consider (k + 1)-partite
hypergraphs.
Throughout this paragraph, we let $G$ be a fixed $(k+1, k)$-cylinder with vertex set $V(G) = V_1 \cup \ldots \cup V_{k+1}$. Recall that $G = \bigcup_{i=1}^{k+1} G_i$, where $G_i$ is the corresponding $(k, k)$-cylinder induced on $\bigcup_{j \ne i} V_j$. In this section, we shall focus on “regularizing” the $(k, k)$-cylinders $G_1, \ldots, G_k$, ignoring $G_{k+1}$. Alternatively, we may assume that $G_{k+1} = \emptyset$.
Definition 16 Let $F = \bigcup_{i=1}^k F_i$ be a $(k, k-1)$-cylinder with vertex set $V_1 \cup \ldots \cup V_k$. For a vertex $x \in V_{k+1}$, we define the G-link $G_F(x)$ of $x$ with respect to $F$ to be the $(k, k-1)$-cylinder $G_F(x) = G(x) \cap F$.

Definition 17 Let $W \subset V_{k+1}$ and let $F = \bigcup_{i=1}^k F_i$ be as above. We shall say that the pair $(F, W)$ is $(\varepsilon, d)$-regular if
$$\left| \frac{|K_k(G_F(x))|}{|K_k(F)|} - d \right| < \varepsilon \qquad (5)$$
for all but at most $\varepsilon|W|$ vertices $x \in W$, and
$$\left| \frac{|K_k(G_F(x)) \cap K_k(G_F(y))|}{|K_k(F)|} - d^2 \right| < \varepsilon \qquad (6)$$
for all but at most $\varepsilon|W|^2$ pairs $x, y \in W$.
Definition 18 Let $t$ be a positive integer and let $V_{k+1} = W_1 \cup \ldots \cup W_t$ be an arbitrary partition of $V_{k+1}$. For every $i \in [k]$, consider a $t$-partition $P_i^{(t)} = \{E_1^{(i)}, \ldots, E_t^{(i)}\}$ of $V_1 \times \ldots \times V_{i-1} \times V_{i+1} \times \ldots \times V_k = \bigcup_{\alpha=1}^t E_\alpha^{(i)}$. Put $P^{(t)} = (P_1^{(t)}, \ldots, P_k^{(t)})$. We shall write $\mathcal{E}(P^{(t)})$ for the collection of all $(k, k-1)$-cylinders $E$ of the form $E_{\alpha_1}^{(1)} \cup \ldots \cup E_{\alpha_k}^{(k)}$, where $E_{\alpha_i}^{(i)} \in P_i^{(t)}$ for all $1 \le i \le k$.

Clearly, with the notation as above, we have $|\mathcal{E}(P^{(t)})| = t^k$. Moreover, observe that each of the $t^{k+1}$ pairs $(E, W_i)$, where $E \in \mathcal{E}(P^{(t)})$ and $1 \le i \le t$, may be classified as $\varepsilon$-regular or $\varepsilon$-irregular (i.e., not $\varepsilon$-regular), according to Definition 17. Also, notice that each $v = (v_1, \ldots, v_{k+1}) \in V_1 \times \ldots \times V_{k+1}$ is ‘covered’ by exactly one such pair, that is, $v \in K_k(E) \times W_i$ for a unique pair $(E, W_i)$.

Definition 19 Let $P^{(t)} = (P_i^{(t)})_1^k$ and $(W_i)_1^t$ be as in Definition 18. We shall say that the system of partitions $P_1^{(t)}, \ldots, P_k^{(t)}, \{W_1, \ldots, W_t\}$ is $\varepsilon$-regular if the number of $(k+1)$-tuples $(v_1, \ldots, v_{k+1}) \in V_1 \times \ldots \times V_{k+1}$ that are not covered by the family of $\varepsilon$-regular pairs $(E, W_i)$ with $E \in \mathcal{E}(P^{(t)})$ and $1 \le i \le t$ is at most $\varepsilon|V_1| \cdots |V_{k+1}|$.
The main tool in the proof of C2 ⇒ C1 is the following result (see [9] for the details).
Lemma 20 For every $\varepsilon > 0$ and $t_0 \ge 1$, there exist integers $n_0$ and $T_0$ such that every $(k+1, k)$-cylinder $G = \bigcup_{i=1}^{k+1} G_i$ with vertex set $V_1 \cup \ldots \cup V_{k+1}$, where $|V_i| \ge n_0$ for all $1 \le i \le k+1$, admits an $\varepsilon$-regular system of partitions $\{P_1^{(t)}, \ldots, P_k^{(t)}, \{W_1, \ldots, W_t\}\}$ with $t_0 < t < T_0$.
2.2 The Subgraph Counting Formula
In this section, we shall state the main result that may be used to prove the implication
P1 ⇒ P5′ . To this end, we need to introduce some notation. Throughout this section,
s ≥ 2k is some fixed integer.
If H and G are, respectively, k-uniform and ℓ-uniform (k ≥ ℓ), then we say that H
is supported on G if H ⊂ Kk (G).
Suppose we have pairwise disjoint sets $W_1, \ldots, W_s$, with $|W_i| = n$ for all $i$. Suppose further that we have a sequence $G^{(2)}, \ldots, G^{(k)}$ of $s$-partite cylinders on $W_1 \cup \ldots \cup W_s$, with $G^{(i)}$ an $(s, i)$-cylinder and, moreover, such that $G^{(i)}$ is supported on $G^{(i-1)}$ for all $3 \le i \le k$. Suppose also that, for all $2 \le i \le k$ and for all $1 \le j_1 < \ldots < j_i \le s$, the $(i, i)$-cylinder $G[j_1, \ldots, j_i] = G^{(i)}[W_{j_1} \cup \ldots \cup W_{j_i}]$ is $(\gamma_i, \delta)$-regular with respect to $G^{(i-1)}[j_1, \ldots, j_i] = G^{(i-1)}[W_{j_1} \cup \ldots \cup W_{j_i}]$, that is, whenever $G \subset G^{(i-1)}[j_1, \ldots, j_i]$ is such that $|K_i(G)| \ge \delta |K_i(G^{(i-1)}[j_1, \ldots, j_i])|$, we have
$$(\gamma_i - \delta)|K_i(G)| \le |G[j_1, \ldots, j_i] \cap K_i(G)| \le (\gamma_i + \delta)|K_i(G)|.$$
Finally, let us say that a copy of $K_s^{(k)}$ in $W_1 \cup \ldots \cup W_s$ is transversal if $|V(K_s^{(k)}) \cap W_i| = 1$ for all $1 \le i \le s$.
Our main result concerning counting subhypergraphs is then the following.
Theorem 21 For any $\varepsilon > 0$ and any $\gamma_2, \ldots, \gamma_k > 0$, there is $\delta_0 > 0$ such that if $\delta < \delta_0$, then the number of transversal $K_s^{(k)}$ in $G^{(k)}$ is $(1 + O(\varepsilon))\,\gamma_k^{\binom{s}{k}} \cdots \gamma_2^{\binom{s}{2}}\, n^s$.

Theorem 21 above is an instance of certain counting lemmas developed by Rödl and Skokan for such complexes $G = (G^{(i)})_{2 \le i \le k}$ (see, e.g., [11]).
3 Concluding Remarks
We hope that the discussion above on our proof approach for Theorem 3 gives some
idea about our methods and techniques. Unfortunately, because of space limitations and
because we discuss the motivation behind our work in detail, we are unable to give more
details. We refer the interested reader to [9].
It is also our hope that the reader will have seen that many interesting questions
remain. Probably, the most challenging of them concerns developing an applicable theory
of sparse quasi-random hypergraphs. Here, we have in mind such lemmas for sparse
quasi-random graphs as the ones in [10].
References
1. L. Babai, N. Nisan, and M. Szegedy, Multiparty protocols, pseudorandom generators for
logspace, and time-space trade-offs, J. Comput. System Sci. 45 (1992), no. 2, 204–232,
Twenty-first Symposium on the Theory of Computing (Seattle, WA, 1989).
2. F.R.K. Chung, Quasi-random classes of hypergraphs, Random Structures and Algorithms 1
(1990), no. 4, 363–382.
3. F.R.K. Chung and R.L. Graham, Quasi-random hypergraphs, Random Structures and Algorithms 1 (1990), no. 1, 105–124.
4. ——, Quasi-random set systems, Journal of the American Mathematical Society 4 (1991), no. 1, 151–196.
5. F.R.K. Chung and P. Tetali, Communication complexity and quasi randomness, SIAM J.
Discrete Math. 6 (1993), no. 1, 110–123.
6. P. Frankl and V. Rödl, The uniformity lemma for hypergraphs, Graphs and Combinatorics 8
(1992), no. 4, 309–312.
7. J. Haviland and A.G. Thomason, Pseudo-random hypergraphs, Discrete Math. 75 (1989),
no. 1–3, 255–278, Graph theory and combinatorics (Cambridge, 1988).
8. ——, On testing the “pseudo-randomness” of a hypergraph, Discrete Math. 103 (1992), no. 3, 321–327.
9. Y. Kohayakawa, V. Rödl, and J. Skokan, Equivalent conditions for regularity, in preparation,
1999.
10. Y. Kohayakawa, V. Rödl, and E. Szemerédi, The size-Ramsey number of graphs of bounded
degree, in preparation, 1999.
11. V. Rödl and J. Skokan, Uniformity of set systems, in preparation, 1999.
12. E. Szemerédi, Regular partitions of graphs, Problèmes Combinatoires et Théorie des Graphes
(Colloq. Internat. CNRS, Univ. Orsay, Orsay, 1976) (Paris), Colloques Internationaux CNRS
n. 260, 1978, pp. 399–401.
13. A.G. Thomason, Pseudorandom graphs, Random graphs ’85 (Poznań, 1985), North-Holland
Math. Stud., vol. 144, North-Holland, Amsterdam-New York, 1987, pp. 307–331.
14. ——, Random graphs, strongly regular graphs and pseudorandom graphs, Surveys in Combinatorics 1987 (C. Whitehead, ed.), London Mathematical Society Lecture Note Series, vol. 123, Cambridge University Press, Cambridge–New York, 1987, pp. 173–195.
15. R.M. Wilson, Cyclotomy and difference families in elementary abelian groups, J. Number
Theory 4 (1972), 17–47.
Cube Packing
F.K. Miyazawa¹⋆ and Y. Wakabayashi²⋆

¹ Instituto de Computação — Universidade Estadual de Campinas
Caixa Postal 6176 — 13083-970 — Campinas–SP — Brazil
fkm@dcc.unicamp.br
² Instituto de Matemática e Estatística — Universidade de São Paulo
Rua do Matão, 1010 — 05508–900 — São Paulo–SP — Brazil
yw@ime.usp.br
Abstract. The Cube Packing Problem (CPP) is defined as follows. Find a packing of a given list of (small) cubes into a minimum number of (larger) identical cubes. We show first that the approach introduced by Coppersmith and Raghavan for general online algorithms for packing problems leads to an online algorithm for CPP with asymptotic performance bound 3.954. Then we describe two offline approximation algorithms for CPP: one with asymptotic performance bound 3.466 and the other with 2.669. A parametric version of this problem is defined, and results on online and offline algorithms are presented. We did not find in the literature offline algorithms with asymptotic performance bounds as good as 2.669.
1 Introduction
The Cube Packing Problem (CPP) is defined as follows. Given a list L of n cubes
(of different dimensions) and identical cubes, called bins, find a packing of the
cubes of L into a minimum number of bins. The packings we consider are all
orthogonal. That is, with respect to a fixed side of the bin, the sides of the cubes
must be parallel or orthogonal to it.
CPP is a special case of the Three-dimensional Bin Packing Problem (3BP).
In this problem the list L consists of rectangular boxes and the bins are also
rectangular boxes. Here, we may assume that the bins are cubes, since otherwise
we can scale the bins and the boxes in L correspondingly.
In 1989, Coppersmith and Raghavan [6] presented an online algorithm for
3BP, with asymptotic performance bound 6.25. Then, in 1992, Li and Cheng
[11] presented an algorithm with asymptotic performance bound close to 4.93.
Improving the latter result, Csirik and van Vliet [7], and also Li and Cheng
[10] designed algorithms for 3BP with asymptotic performance bound 4.84 (the
best bound known for this problem). Since CPP is a special case of 3BP, these
⋆ This work has been partially supported by Project ProNEx 107/97 (MCT/FINEP), FAPESP (Proc. 96/4505–2), and CNPq individual research grants (Proc. 300301/98-7 and Proc. 304527/89-0).
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 58–67, 2000.
© Springer-Verlag Berlin Heidelberg 2000
algorithms can be used to solve it. Our aim is to show that algorithms with
better asymptotic performance bounds can be designed.
Results of this kind have already been obtained for the 2-dimensional case, more precisely, for the Square Packing Problem (SPP). In this problem we are given a list of squares and we are asked to pack them into a minimum number of square bins. In [6], Coppersmith and Raghavan observe that their technique
leads to an online algorithm for SPP with asymptotic performance bound 2.6875.
They also proved that any online algorithm for packing d-dimensional squares,
d ≥ 2, must have asymptotic performance bound at least 4/3. Ferreira, Miyazawa
and Wakabayashi [9] presented an offline algorithm for SPP with asymptotic
performance bound 1.988. For the more general version of the 2-dimensional
case, where the items of L are rectangles (instead of squares), Chung, Garey and
Johnson [2] designed an algorithm with asymptotic performance bound 2.125.
For more results on packing problems the reader is referred to [1,3,4,5,8].
The remainder of this paper is organized as follows. In Section 2 we present
some notation and definitions. In Section 3 we describe an online algorithm
for CPP that uses an approach introduced by Coppersmith and Raghavan [6],
showing that its asymptotic performance bound is at most 3.954. In Section 4
we present an offline algorithm with asymptotic performance bound 3.466. We
mention a parametric version for these algorithms and derive asymptotic performance bounds. In Section 5 we present an improved version of the offline
algorithm described in Section 4. We show that this algorithm has asymptotic
performance bound 2.669. Finally, in Section 6 we present some concluding remarks.
2 Notation and Definitions
The reader is referred to [14] for the basic concepts and terms related to packing.
Without loss of generality, we assume that the bins have unit dimensions, since
otherwise we can scale the cubes of the instance to fulfill this condition.
A rectangular box b with length x, width y and height z is denoted by a
triplet b = (x, y, z). Thus, a cube is simply a triplet of the form (x, x, x). The
size of a cube c = (x, x, x), denoted by s(c), is x. Here we assume that every
cube in the input list L has size at most 1. The volume of a list L, denoted by
V (L), is the sum of the volumes of the items in L.
For a given list L and algorithm A, we denote by A(L) the number of bins
used when algorithm A is applied to list L, and by OPT(L) the optimum number
of bins for a packing of L. We say that an algorithm A has an asymptotic
performance bound α if there exists a constant β such that
A(L) ≤ α · OPT(L) + β,
for every input list L.
If β = 0 then we say that α is an absolute performance bound for algorithm A.
If P is a packing, then we denote by #(P) the number of bins used in P.
An algorithm to pack a list of items L = (c1 , . . . , cn ) is said to be online
if it packs the items in the order given by the list L, without knowledge of the
subsequent items on the list. An algorithm that is not online is said to be offline.
60
F. K. Miyazawa and Y. Wakabayashi
We consider here a parametric version of CPP, denoted by CPPm , where m
is a natural number. In this problem, the instance L consists of cubes with size
at most 1/m. Thus CPP1 and CPP are the same problem.
3 The Online Algorithm of Coppersmith and Raghavan
In 1989, Coppersmith and Raghavan [6] introduced an online algorithm for the
multidimensional bin packing problem. In this section we describe a specialized
version of this algorithm for CPP. Our aim is to derive an asymptotic performance bound for this algorithm (not explicitly given in the above paper).
The main idea of the algorithm is to round up the dimensions of the items
in L using a rounding set S = {1 = s0 , s1 , . . . , si , . . .}, si > si+1 . The first step
consists in rounding up each item size to the nearest value in S. The rounding
set S for CPP is S := S1 ∪ S2 ∪ S3 , where
S1 = {1}, S2 = {1/2, 1/4, . . . , 1/2k , . . .},
S3 = {1/3, 1/6, . . . , 1/(3 · 2k ), . . .}.
Let x̄ be the value obtained by rounding up x to the nearest value in S. Given a cube c = (x, x, x), define c̄ as the cube c̄ := (x̄, x̄, x̄). Let L̄ be the list obtained from L by rounding up the sizes of the cubes to the values in the rounding set S. The idea is to pack the cubes of the list L̄ instead of L, so that the packing of each cube c̄ ∈ L̄ represents the packing of c ∈ L. The packing of L̄ is generated into bins belonging to three different groups: G1, G2 and G3. Each group Gi contains only bins of dimensions (x, x, 1), x ∈ Si, i = 1, 2, 3. A bin of dimension (x, x, 1) will have only cubes c̄ = (x, x, x) packed into it. We say that a cube c̄ = (x, x, x) is of type i if x ∈ Si, i = 1, 2, 3.
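To make the rounding step concrete, the following is a small sketch of rounding a size up to the nearest value of S = S1 ∪ S2 ∪ S3. It is our own illustration, not code from the paper, and the function name is ours.

```python
def round_up(x):
    """Round a size x (0 < x <= 1) up to the nearest value in
    S = {1} ∪ {1/2^k} ∪ {1/(3·2^k)}, k = 0, 1, 2, ...
    """
    assert 0 < x <= 1
    best = 1.0
    k = 0
    while True:
        found = False
        for s in (1 / 2 ** k, 1 / (3 * 2 ** k)):
            if s >= x:              # candidate in S that is an upper bound for x
                best = min(best, s)
                found = True
        if not found:               # all remaining values of S are below x
            return best
        k += 1
```

Below 1/2, consecutive values of S differ by a factor of at most 3/2, so x̄ ≤ (3/2)x for cubes of type 2 and 3; this is where the 8/27 volume ratio used in the analysis below comes from.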
To pack the next unpacked cube c̄ ∈ L̄ with size x ∈ Si, we proceed as follows.
1. Let B ∈ Gi be the first bin B = (x, x, 1) such that Σ_{b∈B} s(b) + x ≤ 1 (if such a bin B exists).
2. If there is a bin B as in step 1, pack c̄ in a Next Fit manner into B.
3. Otherwise,
   a) take the first empty bin C = (y, y, 1), y ∈ Si, with y > x and y as small as possible. If there is no such bin C, take a new bin (1, 1, 1), replace it by i² bins of dimensions (1/i, 1/i, 1), and let C = (y, y, 1) be the first of these i² bins.
   b) If y > x, then replace C by four bins of dimensions (y/2, y/2, 1). Continue in this manner, replacing one of these new bins by four bins, until there is a bin C of dimensions (y/2^m, y/2^m, 1) with y/2^m = x.
   c) Pack c̄ in a Next Fit manner into the first bin C.
4. Update the group Gi.
Let us now analyse the asymptotic performance of the algorithm we have
described. Consider P the packing of L̄ generated by this algorithm, L̄i the set of all cubes of type i in L̄, and Pi the set of bins of P having only cubes of type i. Now, let us consider the bins B = (x, x, 1), x ∈ Si, and compute the volume occupied by the cubes of L̄ that were packed into these bins. All bins in the group G1 are completely filled. Thus, #(P1) = V(L̄1). For the bins in the groups G2 and G3 the unoccupied volume is at most 1 for each group. Therefore, we have #(Pi) ≤ V(L̄i) + 1, i = 2, 3.
Now, let us consider the volume we increased because of the rounding process. Each cube c ∈ L1 has volume at least 1/8 of the volume of c̄, and each cube c ∈ L2 ∪ L3 has volume at least 8/27 of the volume of c̄. Hence, we have the following inequalities:

#(P1) ≤ (1/(1/8)) V(L1) = 8 V(L1), and #(P2 ∪ P3) ≤ (1/(8/27)) V(L2 ∪ L3) + 2 = (27/8) V(L2 ∪ L3) + 2.
Let n1 := #(P1) and n23 := #(P2 ∪ P3) − 2. Thus, using the inequalities above and the fact that the volume of the cubes in L is a lower bound for the optimum packing, we have OPT(L) ≥ V(L) ≥ (1/8) n1 + (8/27) n23.
Since OPT(L) ≥ n1, it follows that OPT(L) ≥ max{n1, (1/8) n1 + (8/27) n23}. Now, using the fact that #(P) = #(P1) + #(P2 ∪ P3) = n1 + n23 + 2, we have

#(P) ≤ α · OPT(L) + 2,

where α = (n1 + n23)/max{n1, (1/8) n1 + (8/27) n23}. Analysing the two possible cases for the denominator, we obtain α ≤ 3.954.
The approach used above can also be used to develop online algorithms for the parametric version CPPm. In this case we partition the packing into two parts. One part is an optimum packing with all bins, except perhaps the last (say n′ bins), filled with m³ cubes of volume at least (1/(m + 1))³ each. The other part is a packing with all bins, except perhaps a fixed number of them (say n′′ bins), having an occupied volume of at least ((m + 1)/(m + 2))³.
It is not difficult to show that the asymptotic performance bound αm of CPPm is bounded by (n′ + n′′)/max{n′, (m/(m + 1))³ n′ + ((m + 1)/(m + 2))³ n′′}. For m = 2 and m = 3 these values are at most 2.668039 and 2.129151, respectively.
4 An Offline Algorithm
Before we present our first offline algorithm for CPP, let us describe the algorithm NFDH (Next Fit Decreasing Height), which is used as a subroutine.
NFDH first sorts the cubes of L in nonincreasing order of their size, say
c1 , c2 , . . . , cn . The first cube c1 is packed in the position (0, 0, 0), the next one
is packed in the position (s(c1 ), 0, 0) and so on, side by side, until a cube that
does not fit in this layer is found. At this moment the next cube ck is packed in
the position (0, s(c1 ), 0). The process continues in this way, layer by layer, until
a cube that does not fit in the first level is found. Then the algorithm packs this
cube in a new level at height s(c1 ). When a cube cannot be packed in a bin, it
is packed in a new bin. The algorithm proceeds in this way until all cubes of L
have been packed.
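The layer/level/bin structure just described can be sketched as follows. This is our own illustration of NFDH for cubes, counting only the number of unit bins used; it is not the authors' code.

```python
def nfdh_bins(sizes):
    """Count unit bins used by NFDH on a list of cube sizes (each <= 1)."""
    s = sorted(sizes, reverse=True)   # Next Fit Decreasing: largest first
    i, n, bins = 0, len(s), 0
    while i < n:
        bins += 1                     # open a new unit bin
        z = 0.0                       # height of the current level's floor
        while i < n and z + s[i] <= 1:
            level_top = z + s[i]      # the first cube fixes the level height
            y = 0.0                   # depth already used by closed layers
            while i < n and y + s[i] <= 1:
                layer_depth = s[i]    # the first cube fixes the layer depth
                x = 0.0
                while i < n and x + s[i] <= 1:
                    x += s[i]         # place cubes side by side; sortedness
                    i += 1            # guarantees they fit in depth and height
                y += layer_depth
            z = level_top
    return bins
```

For example, eight cubes of side 0.5 fill a single bin exactly, while cubes of side 0.6 occupy one bin each.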
The following results will be used in the sequel. The proof of Lemma 1 is left to the reader. Theorem 2 follows immediately from Lemma 1.

Theorem 1 (Meir and Moser [12]). Any list L of k-dimensional cubes, with sizes x1 ≥ x2 ≥ · · · ≥ xn, can be packed by algorithm NFDH into only one k-dimensional rectangular parallelepiped of dimensions a1 × a2 × · · · × ak if aj > x1 (j = 1, . . . , k) and x1^k + (a1 − x1)(a2 − x1) · · · (ak − x1) ≥ V(L).

Lemma 1. For any list of cubes L = (c1, . . . , cn) such that x(ci) ≤ 1/m, the following holds for the packing of L into unit bins:

NFDH(L) ≤ ((m + 1)/m)³ V(L) + 2.

Theorem 2. For any list of cubes L = (b1, . . . , bn) such that x(bi) ≤ 1/m, the following holds for the packing of L into unit bins:

NFDH(L) ≤ ((m + 1)/m)³ OPT(L) + 2.
Before presenting the first offline algorithm, called CUBE, let us introduce a definition and the main ideas behind it.
If a packing P of a list L satisfies the inequality #(P) ≤ V(L)/v + C, where v and C are constants, then we say that v is a volume guarantee of the packing P (for the list L). Algorithm CUBE uses an approach, which we call critical set combination (see [13]), based on the following observation.
Recall that in the analysis of the performance of the algorithm presented in Section 3 we considered the packing divided into two parts: one optimum packing, of the list L1, with a volume guarantee of 1/8, and the other part, of the list L23 = L2 ∪ L3, with a volume guarantee of 8/27. If we consider this partition of L, the volume we can guarantee in each bin is the best possible, as we can have cubes in L1 with volume very close to 1/8, and cubes in L23 for which we have a packing with volume occupation in each bin very close to 8/27. In the critical set combination approach, we first define some subsets of cubes in L1 and L23 with small volumes as the critical sets. Then we combine the cubes in these critical sets, obtaining a partial packing that is part of an optimum packing and has volume occupation in each bin better than 1/8. That is, sets of cubes that would lead to small volume occupation are set aside and combined appropriately so that the resulting packing has a better volume guarantee.
Theorem 3. For any list L of cubes for CPP, we have
CUBE(L) ≤ 3.466 · OPT(L) + 4.
Proof. First, consider the packing PAB. Since each bin of PAB, except perhaps the last, contains one cube of LA and seven cubes of LB, we have

#(PAB) ≤ (1/(83/216)) V(LAB) + 1 = (216/83) V(LAB) + 1,   (1)

where LAB is the set of cubes of L packed in PAB.
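The constant 83/216 is simply the minimum volume present in one such bin: one LA cube of side greater than 1/2 plus seven LB cubes of side greater than 1/3. A quick rational check (our own, not from the paper):

```python
from fractions import Fraction

# one cube of side > 1/2 and seven cubes of side > 1/3 per bin of P_AB
per_bin = Fraction(1, 2) ** 3 + 7 * Fraction(1, 3) ** 3
assert per_bin == Fraction(83, 216)
```

Note that 83/216 ≈ 0.384 is well above the guarantee 1/8 of the critical set alone, which is the whole point of combining the critical sets.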
Algorithm CUBE
// To pack a list of cubes L into unit bins B = (1, 1, 1).
1. Let p = 0.354014, and let LA and LB be sublists of L defined as follows:
   LA ← {c ∈ L : 1/2 < s(c) ≤ 1 − p},   LB ← {c ∈ L : 1/3 < s(c) ≤ p}.
2. Generate a partial packing PAB of LA ∪ LB such that PAB is the union of packings PAB^1, . . . , PAB^k, where each PAB^i is a packing generated for one bin, consisting of one cube of LA and seven cubes of LB, except perhaps the last (which may have fewer cubes of LB). [The packing PAB will contain all cubes of LA or all cubes of LB.] Update the list L by removing the cubes packed in PAB.
3. P′ ← NFDH(L);
4. Return P′ ∪ PAB.
end algorithm.
Now consider a partition of P′ into three partial packings P1, P2 and P3, defined as follows. The packing P1 has the bins of P′ with at least one cube of size greater than 1/2. The packing P2 has the bins of P′ \ P1 with at least one cube of size greater than 1/3. The packing P3 has the remaining bins, i.e., the bins in P′ \ (P1 ∪ P2). Let Li be the set of cubes packed in Pi, i = 1, 2, 3.
Since all cubes of L3 have size at most 1/3, and they are packed in P3 with algorithm NFDH, by Lemma 1 we have

#(P3) ≤ (64/27) V(L3) + 2.   (2)
Case 1. LB is totally packed in PAB.
In this case, every cube of L1 has volume at least 1/8. Therefore

#(P1) ≤ 8 V(L1).   (3)
Now, since every cube of L2 has size at least p and each bin of packing P2 has at least 8 cubes of L2, we have

#(P2) ≤ (1/(8p³)) V(L2) + 1.   (4)

Since 8p³ = min{8p³, 83/216, 27/64}, using (1), (2) and (4), and setting Paux := PAB ∪ P3 ∪ P2, we have

#(Paux) ≤ (1/(8p³)) V(Laux) + 4.   (5)
Clearly, P1 is an optimum packing of L1, and hence

#(P1) ≤ OPT(L).   (6)
Defining h1 := #(P1) and h2 := #(Paux) − 4, and using inequalities (3), (5) and (6), we have

CUBE(L) ≤ α′ · OPT(L) + 4,

where α′ = (h1 + h2)/max{h1, (1/8) h1 + 8p³ h2} ≤ 3.466.
Case 2. LA is totally packed in PAB.
In this case, the volume guarantee for the cubes in L1 is better than the one obtained in Case 1. Each cube of L1 has size at least 1 − p. Thus, #(P1) ≤ (1/(1 − p)³) V(L1). For the packing P2, we obtain a volume guarantee of at least 8/27, and the same holds for the packings P3 and PAB. Thus, for Paux as above, #(Paux) ≤ (27/8) V(Laux) + 4.
Since #(P1) ≤ OPT(L), combining the previous inequalities and proceeding as in Case 1, we have

CUBE(L) ≤ α′′ · OPT(L) + 4,

where α′′ = (h1 + h2)/max{h1, (1 − p)³ h1 + (8/27) h2} ≤ 3.466.
The proof of the theorem follows from the results obtained in Case 1 and Case 2. We observe that the value of p was obtained by imposing equality of the values of α′ and α′′.
Algorithm CUBE can also be generalized to the parametric problem CPPm. The idea is the same as the one used in algorithm CUBE. The input list is first subdivided into two parts, P1 and P2. Part P1 consists of those cubes with size in (1/(m + 1), 1/m], and part P2 consists of the remaining cubes. The critical cubes in each part are defined using an appropriate value of p = p(m), and then combined. The analysis is also divided into two parts, according to which critical set is totally packed in the combined packing. It is not difficult to derive the bounds α(CUBEm) that can be obtained for the corresponding algorithms. For m = 2 and m = 3 the values of α(CUBEm) are at most 2.42362852 (p = 0.26355815) and 1.98710756 (p = 0.20916664), respectively.
5 An Improved Algorithm for CPP
We present in this section an algorithm for the cube packing problem that is an improvement of algorithm CUBE, described in the previous section. For that, we consider another restricted version of CPP, denoted by CPPk, where k is an integer greater than 2. In this problem the instance is a list L consisting of cubes of size greater than 1/k. We use in the sequel the following result for CPP3.
Lemma 2. There is a polynomial time algorithm to solve CPP3 .
Proof. Let L1 = {c ∈ L : s(c) > 1/2} and L2 = L \ L1. Without loss of generality, consider L1 = (c1, . . . , ck). Pack each cube ci ∈ L1 in a unit bin Ci at the corner (0, 0, 0). Note that it is possible to pack seven cubes with size at most 1 − s(ci) in each bin Ci. Now, for each bin Ci, consider seven other smaller bins Ci^(j), j = 1, . . . , 7, each with size 1 − s(ci). Consider a bipartite graph G with vertex set X ∪ Y, where X is the set of the small bins and Y is precisely L2. In G there is an edge from a cube c ∈ L2 to a bin Ci^(j) ∈ X if and only if c can be packed into Ci^(j). Clearly, a maximum matching in G corresponds to a maximum packing of the cubes of L2 into the bins occupied by the cubes of L1. Denote by P12 the packing of L1 combined with the cubes of L2 packed with the matching strategy. The optimum packing of L can be obtained by adding to the packing P12 the bins packed with the remaining cubes of L2 (if any), each with 8 cubes, except perhaps the last.
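The matching argument can be sketched as follows. This is our own illustration: a simple augmenting-path routine stands in for any polynomial-time maximum-matching algorithm, and the function names are ours.

```python
import math

def max_matching(adj, n_right):
    """Kuhn's augmenting-path algorithm for bipartite matching.
    adj[u] lists the right-side vertices reachable from left vertex u."""
    match_r = [-1] * n_right
    def augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_r[v] == -1 or augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False
    return sum(augment(u, [False] * n_right) for u in range(len(adj)))

def cpp3_bins(large, small):
    """Optimal number of unit bins for cube sizes in (1/3, 1]:
    large = sizes > 1/2, small = sizes in (1/3, 1/2]."""
    slots = [1 - s for s in large for _ in range(7)]   # 7 slots per large cube
    adj = [[j for j, t in enumerate(slots) if c <= t] for c in small]
    matched = max_matching(adj, len(slots))
    leftover = len(small) - matched                    # packed 8 per extra bin
    return len(large) + math.ceil(leftover / 8)
```

For instance, one cube of side 0.6 leaves slots of side 0.4, so seven cubes of side 0.4 share its bin; with a cube of side 0.7 they do not fit and need a bin of their own.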
We say that a cube c is of type G, resp. M, if s(c) ∈ (1/2, 1], resp. s(c) ∈ (1/3, 1/2].
Lemma 3. It is possible to generate an optimum packing of an instance of CPP3
such that each bin, except perhaps one, has one of the following configurations:
(a) C1: configuration consisting of 1 cube of type G and 7 cubes of type M ;
(b) C2: configuration consisting of exactly 1 cube of type G; and
(c) C3: configuration consisting of 8 cubes of type M .
Lemma 2 shows the existence of a polynomial time optimum algorithm for
CPP3 . In fact, it is not difficult to design a greedy-like algorithm to solve CPP3 in
time O(n log n). Such an algorithm is given in [9] for SPP3 (defined analogously
with respect to SPP).
We are now ready to present the improved algorithm for the cube packing
problem, which we call ICUBE (Improved CUBE).
Algorithm ICUBE
// To pack a list of cubes L into unit bins B = (1, 1, 1).
1. Let L′1 ← {q ∈ L : 1/3 < s(q) ≤ 1}.
2. Generate an optimum packing P′1 of L′1 (in polynomial time), with bins as in Lemma 3; that is, solve CPP3 with input list L′1.
3. Let PA be the set of bins B ∈ P′1 having configuration C2 with a cube q ∈ B with s(q) ≤ 2/3; let LA be the set of cubes packed in PA.
4. Let LB ← {q ∈ L : 0 < s(q) ≤ 1/3}.
5. Generate a packing PAB filling the bins in PA with cubes of LB (see below).
6. Let L1 be the set of all packed cubes, and P1 the packing generated for L1.
7. Let P2 be the packing of the unpacked cubes of LB generated by NFDH.
8. Return the packing P1 ∪ P2.
end algorithm
To generate the packing PAB in step 5 of algorithm ICUBE, we first partition the list LB into 5 lists, LB,3, LB,4, LB,5, LB,6, LB,7, defined as follows: LB,i = {c ∈ LB : 1/(i + 1) < s(c) ≤ 1/i}, i = 3, . . . , 6, and LB,7 = {c ∈ LB : s(c) ≤ 1/7}. Then we combine the cubes in each of these lists with the packing PA generated in step 3.
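This partition can be written directly; the sketch below is our own illustration and the names are ours.

```python
def partition_LB(LB):
    """Split the sizes of LB (all <= 1/3) into L_{B,3}, ..., L_{B,7}."""
    parts = {i: [] for i in range(3, 8)}
    for s in LB:
        if s <= 1 / 7:
            parts[7].append(s)
        else:
            for i in range(3, 7):          # find i with 1/(i+1) < s <= 1/i
                if 1 / (i + 1) < s <= 1 / i:
                    parts[i].append(s)
                    break
    return parts
```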
Now consider the packing of cubes of LB,3 into bins of PA. Since we can pack 19 cubes of LB,3 into each of these bins, we generate such a packing until all cubes of LB,3 have been packed, or until there are no more bins in PA. We generate similar packings combining the remaining bins of PA with the lists LB,4, LB,5 and LB,6. To pack the cubes of LB,7 into bins of PA, we consider the empty space of the bin divided into three smaller bins of dimensions (1, 1, 1/3), (1, 1/3, 1/3) and (1/3, 1/3, 1/3). Then we use NFDH to pack the cubes of LB,7 into these smaller bins. We continue the packing of LB,7 using other bins of PA until there are no more unpacked cubes of LB,7, or all bins of PA have been considered.
Theorem 4. For any instance L of CPP, we have
ICUBE(L) ≤ 2.669 · OPT(L) + 7.
Proof. (Sketch) Let C′1, C′2 and C′3 be the sets of bins used in P′1 with configurations C1, C2 and C3, respectively. Considering the volume guarantees of C′1, C′2 and C′3, we have #(C′1) ≤ (1/(1/8 + 7/27)) V(C′1) + 1, #(C′2) ≤ 8 V(C′2) + 1, and #(C′3) ≤ (27/8) V(C′3) + 1.
We call LA the set of cubes packed in C′2, and consider it a critical set (LA := {q ∈ L : 1/2 < s(q) ≤ 2/3}). The bins of C′2 are additionally filled with the cubes in L \ L′1, defined as LB, until possibly all cubes of LB have been packed (LB := {q ∈ L : 0 < s(q) ≤ 1/3}). We have two cases to analyse.
Case 1: All cubes of LB have been packed in PAB.
The analysis of this case is simple and will be omitted.
Case 2: There are cubes of LB not packed in PAB.
Note that the volume occupation in each bin with configuration C1 or C3 is at least 8/27. For the bins with configuration C2, we have a volume occupation of 1/8. In step 5, the bins with configuration C2 are additionally filled with cubes of LB, generating a combined packing PAB.
In this case, all cubes of LA have been packed with cubes of LB. Thus, each bin of PAB has a volume occupation of at least 8/27. The reader can verify this fact by adding up the volumes of the cubes in LA and the cubes of LB,i, i = 3, . . . , 6. For bins combining cubes of LA with LB,7, we use Theorem 1 to guarantee this minimum volume occupation for the resulting packed bins. Therefore, we have an optimum packing of L1 with volume guarantee at least 8/27. Thus we have #(P1) ≤ OPT(L), and #(P1) ≤ (27/8) V(L) + 6.
The packing P2 is generated by algorithm NFDH for a list of cubes with size not greater than 1/3. Therefore, by Lemma 1, we have #(P2) ≤ (64/27) V(L) + 2.
Now, proceeding as in the proof of Theorem 3, we obtain

ICUBE(L) ≤ α · OPT(L) + 8,

where α = 1945/729 ≤ 2.669.
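Where 1945/729 comes from can be checked with rational arithmetic (our own check): balancing an optimum part with volume guarantee 8/27 against an NFDH part with guarantee 27/64, exactly as the two cases were balanced in the proof of Theorem 3.

```python
from fractions import Fraction

# Balance h1 = (8/27) h1 + (27/64) h2, giving h2/h1 = (19/27)·(64/27) = 1216/729;
# the ratio (h1 + h2)/h1 at that point is the asymptotic bound.
alpha = 1 + (1 - Fraction(8, 27)) / Fraction(27, 64)
assert alpha == Fraction(1945, 729)
print(float(alpha))  # ≈ 2.6680, i.e. the stated 2.669
```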
6 Concluding Remarks
We have described an online algorithm for CPP that is a specialization of an approach introduced by Coppersmith and Raghavan [6] for a more general setting.
Our motivation in doing so was to obtain the asymptotic performance bound
(3.954) of this algorithm, so that we could compare it with the bounds of the
offline algorithms presented here.
We have shown a simple offline algorithm for CPP with asymptotic performance bound 3.466. Then we have designed another offline algorithm that is an
improvement of this algorithm, with asymptotic performance bound 2.669. This
result can be generalized to k-dimensional cube packing, for k > 3, by making
use of Theorem 1 and generalizing the techniques used in this paper. Both
algorithms can be implemented to run in time O(n log n), where n is the number
of cubes in the list L. We have also shown that if the instance consists of cubes
with size greater than 1/3, there is a polynomial-time exact algorithm.
We did not find in the literature offline algorithms for CPP with asymptotic
performance bound as good as 2.669.
References
1. B. S. Baker, A. R. Calderbank, E. G. Coffman Jr., and J. C. Lagarias. Approximation algorithms for maximizing the number of squares packed into a rectangle.
SIAM J. Algebraic Discrete Methods, 4(3):383–397, 1983.
2. F. R. K. Chung, M. R. Garey, and D. S. Johnson. On packing two-dimensional
bins. SIAM J. Algebraic Discrete Methods, 3:66–76, 1982.
3. E. G. Coffman, Jr., M. R. Garey, and D. S. Johnson. Approximation algorithms
for bin packing – an updated survey. In G. Ausiello et al. (eds.) Algorithms design
for computer system design, 49–106. Springer-Verlag, New York, 1984.
4. E. G. Coffman, Jr., M. R. Garey, and D. S. Johnson. Approximation algorithms
for bin packing – a survey. In D. Hochbaum (ed.) Approximation algorithms for
NP-hard problems, 46–93, PWS, 1997.
5. E. G. Coffman Jr. and J. C. Lagarias. Algorithms for packing squares: A probabilistic analysis. SIAM J. Comput., 18(1):166–185, 1989.
6. D. Coppersmith and P. Raghavan. Multidimensional on-line bin packing: algorithms and worst-case analysis. Oper. Res. Lett., 8(1):17–20, 1989.
7. J. Csirik and A. van Vliet. An on-line algorithm for multidimensional bin packing.
Operations Research Letters, 13:149–158, 1993.
8. H. Dyckhoff, G. Scheithauer, and J. Terno. Cutting and packing. In F. Maffioli,
M. Dell’Amico and S. Martello (eds.) Annotated Bibliographies in Combinatorial
Optimization, Chapter 22, 393–412. John Wiley, 1997.
9. C. E. Ferreira, F. K. Miyazawa, and Y. Wakabayashi. Packing squares into squares.
Pesquisa Operacional, a special volume on cutting and packing. To appear.
10. K. Li and K-H. Cheng. A generalized harmonic algorithm for on-line multidimensional bin packing. TR UH-CS-90-2, University of Houston, January 1990.
11. K. Li and K-H. Cheng. Generalized first-fit algorithms in two and three dimensions.
Int. J. Found. Comput. Sci., 1(2):131–150, 1992.
12. A. Meir and L. Moser. On packing of squares and cubes. J. Combinatorial Theory
Ser. A, 5:116–127, 1968.
13. F. K. Miyazawa and Y. Wakabayashi. Approximation algorithms for the orthogonal
z-oriented three-dimensional packing problem. To appear in SIAM J. Computing.
14. F. K. Miyazawa and Y. Wakabayashi. An algorithm for the three-dimensional
packing problem with asymptotic performance analysis. Algorithmica, 18(1):122–
144, 1997.
Approximation Algorithms for Flexible Job
Shop Problems
Klaus Jansen¹, Monaldo Mastrolilli², and Roberto Solis-Oba³
¹ Institut für Informatik und Praktische Mathematik, Universität zu Kiel, Germany, kj@informatik.uni-kiel.de †
² IDSIA Lugano, Switzerland, monaldo@idsia.ch ‡
³ Department of Computer Science, The University of Western Ontario, Canada, solis@brown.csd.uwo.ca §
Abstract. The Flexible Job Shop Problem is a generalization of the
classical job shop scheduling problem in which for every operation there
is a group of machines that can process it. The problem is to assign
operations to machines and to order the operations on the machines, so
that the operations can be processed in the smallest amount of time. We
present a linear time approximation scheme for the non-preemptive version of the problem when the number m of machines and the maximum
number µ of operations per job are fixed. We also study the preemptive
version of the problem when m and µ are fixed, and present a linear time
(2 + ε)-approximation algorithm for the problem with migration.
1 Introduction
The job shop scheduling problem is a classical problem in Operations Research
[10] in which it is desired to process a set J = {J1 , . . . , Jn } of n jobs on a group
M = {1, . . . , m} of m machines in the smallest amount of time. Every job Jj
consists of a sequence of µ operations O1j , O2j , . . . , Oµj which must be processed
in the given order. Every operation Oij has assigned a unique machine mij ∈ M
which must process the operation without interruption during pij units of time,
and a machine can process at most one operation at a time.
In this paper we study a generalization of the job shop scheduling problem
called the flexible job shop problem [1], which models a wide variety of problems encountered in real manufacturing systems [1,13]. In the flexible job shop
problem an operation Oij can be processed by any machine from a given group
Mij ⊆ M . The processing time of operation Oij on machine k ∈ Mij is pkij . The
goal is to choose for each operation Oij an eligible machine and a starting time
so that the maximum completion time Cmax over all jobs is minimized. Cmax is
called the makespan or the length of the schedule.
† This research was done while the author was at IDSIA Lugano, Switzerland.
‡ This author was supported by the Swiss National Science Foundation project 21-55778.98.
§ This research was done while the author was at MPII Saarbrücken, Germany.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 68–77, 2000.
c Springer-Verlag Berlin Heidelberg 2000
Approximation Algorithms for Flexible Job Shop Problems
69
The flexible job shop problem is more complex than the job shop problem
because of the need to assign operations to machines. Following the three-field
α|β|γ notation suggested by Vaessens [13] and based on that of [4], we denote
our problem as m1m|chain, op ≤ µ|Cmax . In the first field m specifies that the
number of machines is a constant, 1 specifies that any operation requires at most
one machine to be processed, and the second m gives an upper bound on the
number of machines that can process an operation. The second field states the
precedence constraints and the maximum number of operations per job, while
the third field specifies the objective function. The following special cases of the
problem are already NP-hard (see [13] for a survey): 2 1 2 | chain, n = 3 | Cmax, 3 1 2 | chain, n = 2 | Cmax, and 2 1 2 | chain, op ≤ 2 | Cmax.
The job shop scheduling problem has been extensively studied. The problem
is known to be strongly NP-hard even if each job has at most three operations
and there are only two machines [10]. Williamson et al. [14] proved that when
the number of machines, jobs, and operations per job are part of the input
there does not exist a polynomial time approximation algorithm with worst case bound smaller than 5/4 unless P = NP. On the other hand, the preemptive
version of the job shop scheduling problem is NP-complete in the strong sense
even when m = 3 and µ = 3 [3]. Jansen et al. [8] have designed a linear time
approximation scheme for the case when m and µ are fixed. When m and µ are
part of the input, the best known result [2] is an approximation algorithm with worst case bound O([log(mµ) log(min{mµ, pmax})/ log log(mµ)]²), where pmax is the largest processing time among all operations.
Scheduling jobs with chain precedence constraints on unrelated parallel machines is equivalent to the flexible job shop problem. For the first problem, Shmoys et al. [12] have designed a polynomial-time randomized algorithm that, with high probability, finds a schedule of length at most O((log² n / log log n) C∗max), where C∗max is the optimal makespan.
In this work we study the preemptive and non-preemptive versions of the
flexible job shop scheduling problem when the number of machines m and the
number of operations per job µ are fixed. We generalize the techniques described
in [8] for the job shop scheduling problem and design a linear time approximation
scheme for the flexible job shop problem. In addition, each job Jj has a delivery
time qj . If in a schedule Jj completes its processing at time Cj , then its delivery
completion time is equal to Cj + qj . The problem now is to find a schedule
that minimizes the maximum delivery completion time Lmax . We notice that by
using the same techniques we can also handle the case in which each job Jj has
a release time rj when it becomes available for processing and the objective is
to minimize the makespan.
Our techniques allow us also to design a linear time approximation scheme
for the preemptive version of the flexible job shop problem without migration.
No migration means that each operation must be processed by a unique machine.
So if an operation is preempted, its processing can only be resumed on the same
machine on which it was being processed before the preemption. Due to space
limitations we do not describe this algorithm here. We also study the preemptive
70
K. Jansen, M. Mastrolilli, R. Solis-Oba
flexible job shop problem with migration, and present a (2 + ε)-approximation
algorithm for it. The last algorithm handles release and delivery times, and both
of them produce solutions with only a constant number of preemptions.
2 The Non-preemptive Flexible Job Shop Problem
Consider an instance of the flexible job shop problem with release and delivery times. Let L∗max be the length of an optimum schedule. For every job Jj, let Pj = Σ_{i=1}^{µ} [min_{s ∈ Mij} p^s_ij] denote its minimum processing time, and let P = Σ_{Jj ∈ J} Pj. Let rj be the release time of job Jj and qj be its delivery time. We define tj = rj + Pj + qj for all jobs Jj, and we let tmax = maxj tj.
Lemma 1. max{P/m, tmax} ≤ L∗max ≤ P + tmax.   (1)

We divide all processing, release, and delivery times by max{P/m, tmax}, and thus by Lemma 1,

1 ≤ L∗max ≤ m + 1, and tmax ≤ 1.   (2)
We observe that Lemma 1 holds also for the preemptive version of the problem, with or without migration. Here we present an algorithm for the non-preemptive flexible job shop problem that works for the case when all release
times are zero. The algorithm works as follows. First we show how to transform an instance of the flexible job shop problem into another instance without
delivery times. Then we define a set of time intervals and assign operations to
the intervals so that operations from the same job that are assigned to different
intervals appear in the correct order, and the total length of the intervals is no
larger than the length of an optimum schedule. We perform this step by first
fixing the position of the operations of a constant number of jobs (which we call
the long jobs), and then using linear programming to determine the position of
the remaining operations.
Next we use an algorithm by Sevastianov [11] to find a feasible schedule
for the operations within each interval. Sevastianov's algorithm finds for each interval a schedule of length equal to the length of the interval plus mµ³ pmax, where pmax is the largest processing time of any operation in the interval. In order
to keep this enlargement small, we remove from each interval a subset V of jobs
with large operations before running Sevastianov’s algorithm. Those operations
are scheduled at the beginning of the solution, and by choosing carefully the set
of long jobs we can show that the total length of the operations in V is very
small compared to the overall length of the schedule.
2.1 Getting Rid of the Delivery Times
We use a technique by Hall and Shmoys [6] to transform an instance of the
flexible job shop problem into another with only a constant number of different
Approximation Algorithms for Flexible Job Shop Problems
delivery times. Let qmax be the maximum delivery time and let ε > 0 be a
constant value. The idea is to round each delivery time down to the nearest
multiple of (ε/2)qmax, to get at most 1 + 2/ε distinct delivery times. Next, apply
a (1 + ε/2)-approximation algorithm for the flexible job shop problem that can
handle 1 + 2/ε distinct delivery times (this algorithm is described below). Finally,
add (ε/2)qmax to the completion time of each job; this increases the length of the
solution by (ε/2)qmax. The resulting schedule is feasible for the original instance,
so this is a (1 + ε)-approximation algorithm for the original problem. In the
remainder of this section we restrict our attention to the problem in which the
delivery times q1 , ..., qn take only χ ≤ 1 + 2/ε distinct values, which we denote
by δ1 > ... > δχ .
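The rounding step can be sketched in a few lines (an illustrative fragment, not code from the paper; the helper name and toy data are ours):

```python
def round_delivery_times(q, eps):
    """Round each delivery time down to the nearest multiple of (eps/2)*qmax.

    The rounded times take at most 1 + 2/eps distinct values, and each job
    loses at most (eps/2)*qmax of its delivery time."""
    qmax = max(q)
    step = (eps / 2) * qmax
    return [step * (qj // step) for qj in q]

q = [3.0, 5.5, 9.9, 10.0]
rounded = round_delivery_times(q, eps=0.5)       # step = 2.5
assert rounded == [2.5, 5.0, 7.5, 10.0]
assert len(set(rounded)) <= 1 + 2 / 0.5          # at most 1 + 2/eps values
```

Adding step back to every completion time then compensates the rounding, which is where the extra (ε/2)qmax in the schedule length comes from.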
The delivery time of a job can be interpreted as an additional delivery operation
that must be processed on a non-bottleneck machine after the last operation
of the job. A non-bottleneck machine is a machine that can process simultaneously
any number of operations. Moreover, every feasible schedule for the jobs
J can be transformed into another feasible schedule, in which all delivery operations
finish at the same time, without increasing the length of the schedule: simply
shift the delivery operations to the end of the schedule. Therefore, we only need
to consider a set D = {d1 , ..., dχ } of χ different delivery operations, where di has
processing time δi .
2.2 Relative Schedules
Assume that the jobs are indexed so that P1 ≥ P2 ≥ ... ≥ Pn . Let L ⊂ J be the
set formed by the first k jobs, i.e., the k jobs with longest minimum processing
time, where k is a constant to be defined later. We call L the set of long jobs. An
operation from a long job is called a long operation, regardless of its processing
time. Let S = J \ L be the set of short jobs.
Consider any feasible schedule for the jobs in J . This schedule assigns a machine to every operation and it also defines a relative ordering for the starting
and finishing times of the operations. A relative schedule R for L is an assignment of machines to long operations and a relative ordering for the starting and
finishing times of the long operations and the delivery operations, such that there
is a feasible schedule for J that respects R. This means that for every relative
schedule R there is a feasible schedule for J that assigns the same machines as
R to the long operations and that schedules the long and the delivery operations
in the same relative order as R. Since there is a constant number of long jobs,
there is also only a constant number of different relative schedules.
Lemma 2. The number of relative schedules is at most mµk (2(µk + χ))!.
If we build all relative schedules, one of them must be equal to the relative
schedule defined by some optimum solution. Since it is possible to build all
relative schedules in constant time, we may assume without loss of generality
that we know how to find a relative schedule R such that some optimum schedule
for J respects R.
K. Jansen, M. Mastrolilli, R. Solis-Oba
Fix a relative schedule R as described above. The ordering of the starting and finishing times of the operations divides the time line into intervals
that we call snapshots. We can view a relative schedule as a sequence of snapshots M (1), M (2), . . . , M (g), where M (1) is the unbounded snapshot whose right
boundary is the starting time of the first operation according to R, and M (g)
is the snapshot bounded on the right by the finishing time of the delivery operations. The number of snapshots g is at most 2µk + χ + 1 because the starting
and finishing times of every operation might bound a snapshot.
2.3 Scheduling the Small Jobs
Given a relative schedule R as described above, to obtain a solution for the
flexible job shop problem we need to schedule the small operations within the
snapshots defined by R. We do this in two steps. First we use a linear program
LP (R) to assign small operations to snapshots and machines, and second, we
find a feasible schedule for the small operations within every snapshot.
To formulate the linear program we need first to define some variables. For
each snapshot M (ℓ) we use a variable tℓ to denote its length. For each Jj ∈ S we
define a set of decision variables xj,(i1 ,...,iµ ),(s1 ,...,sµ ) with the following meaning:
xj,(i1 ,...,iµ ),(s1 ,...,sµ ) = f iff, for all q = 1, . . . , µ, an f fraction of the q-th operation
of job Jj is scheduled in the iq -th snapshot on machine sq .
Let αj be the snapshot where the delivery operation of job Jj starts. For
every variable xj,(i1 ,...,iµ ),(s1 ,...,sµ ) we need 1 ≤ i1 ≤ i2 ≤ · · · ≤ iµ < αj to ensure
that the operations of Jj are scheduled in the proper order. Let Aj = {(i, s) |
i = (i1 , . . . , iµ ), 1 ≤ i1 ≤ . . . ≤ iµ < αj , s = (s1 , . . . , sµ ), sq ∈ Mqj , and no long
operation is scheduled by R in snapshot iq on machine sq , for all q = 1, . . . , µ}.
The load Lℓ,h on machine h in snapshot M (ℓ) is the total processing time of
the operations from small jobs assigned to h during M (ℓ), i.e.,
L_{ℓ,h} = Σ_{Jj ∈ S} Σ_{(i,s) ∈ Aj} Σ_{q = 1,...,µ : iq = ℓ, sq = h} x_jis p^{sq}_{qj} ,   (3)
where iq and sq are the q-th components of tuples i and s respectively.
For every long operation Oij let αij and βij be the indices of the first and last
snapshots where the operation is scheduled. Let pij be the processing time of
long operation Oij according to the machine assignment defined by the relative
schedule R. We are ready to describe the linear program LP (R) that assigns
small operations to snapshots.
Minimize Σ_{ℓ=1}^{g} tℓ

s.t. (1) Σ_{ℓ=αij}^{βij} tℓ = pij ,  for all Jj ∈ L, i = 1, . . . , µ,
(2) Σ_{ℓ=αj}^{g} tℓ = δj ,  for all delivery operations dj ,
(3) Σ_{(i,s) ∈ Aj} xjis = 1 ,  for all Jj ∈ S,
(4) Lℓ,h ≤ tℓ ,  for all ℓ = 1, . . . , g, h = 1, . . . , m,
(5) tℓ ≥ 0 ,  for all ℓ = 1, . . . , g,
(6) xjis ≥ 0 ,  for all Jj ∈ S, (i, s) ∈ Aj .
Lemma 3. An optimum solution of LP (R) has value no larger than the length
of an optimum schedule S ∗ that respects the relative schedule R.
One can solve LP (R) optimally in polynomial time and get only a constant
number of jobs with fractional assignments, since a basic feasible solution of
LP (R) has at most kµ + n − k + mg + χ variables with positive value. By
constraint (3) every small job has at least one positive variable associated with
it, and so there are at most mg + kµ + χ jobs with fractional assignments. We
show later how to get rid of any constant number of fractional assignments by
only slightly increasing the length of the solution.
The drawback of this approach is that solving the linear program might take
a very long time. Since we want an approximate solution to the flexible
job shop problem, it is not necessary to find an optimum solution for LP (R);
an approximate solution suffices.
2.4 Approximate Solution of the Linear Program
A convex block-angular resource sharing problem has the form:

min { λ | Σ_{k=1}^{K} f_i^k(x^k) ≤ λ, for all i = 1, . . . , N, and x^k ∈ B^k , k = 1, . . . , K }

where f_i^k : B^k → ℜ+ are N non-negative continuous convex functions, and
B^k are disjoint convex compact nonempty sets called blocks, 1 ≤ k ≤ K. The
Potential Price Directive Decomposition Method of Grigoriadis and Khachiyan
[5] can find a (1 + ρ)-approximate solution to this problem for any ρ > 0. This
algorithm needs O(N (ρ^{−2} ln ρ^{−1} + ln N )(N ln ln(N/ρ) + KF )) time, where F is
the time needed to find a ρ-approximate solution to the following problem on any
block B^k , for some vector (p1 , . . . , pN ) ∈ ℜ^N : min { Σ_{i=1}^{N} p_i f_i^k(x^k) | x^k ∈ B^k }.
We can write LP (R) as a convex block-angular resource sharing problem as
follows. First we guess the value s of an optimum solution for LP (R), and add
the constraint Σ_{ℓ=1}^{g} tℓ ≤ s to the linear program. Note that s ≤ m + 1. Then
we replace constraint (4) by constraint (4′ ), where λ is a non-negative value:

(4′)  Lℓ,h − tℓ + m + 1 ≤ λ,  for all ℓ = 1, . . . , g, h = 1, . . . , m.

This new linear program, which we denote LP (R, s, λ), has the above block-angular
structure. The blocks Bj = {xjis | constraints (3) and (6) hold} are
(mg)^µ -dimensional simplices. The block B|S|+1 = {tℓ | Σ_{ℓ=1}^{g} tℓ ≤ s and
constraints (1), (2), and (5) hold} also has constant dimension. Let fℓ,h = Lℓ,h − tℓ +
m + 1. Since tℓ ≤ s ≤ m + 1, these functions are non-negative. Each block Bi has
constant dimension, and so the above block optimization problem can be solved
in constant time. Therefore the algorithm of [5] finds a (1 + ρ)-approximate solution
for LP (R, s, λ) in O(n) time for any value ρ > 0. This gives a feasible
solution of LP (R, s, m + 1 + ρ′ ) for ρ = ρ′ /(m + 1).
Let L∗max be the length of an optimum schedule and assume that R is a
relative schedule for L in an optimum schedule. We can use binary search on the
interval [1, m + 1] to find a value s ≤ (1 + ε/8)L∗max such that LP (R, s, m + 1 + ρ′ )
has a solution for ρ′ = ε/(8g). This search can be performed in O(log((1/ε) log m))
iterations by performing the binary search only on the following values,

(1 + ε/8), (1 + ε/8)², . . . , (1 + ε/8)^{b−1}, m + 1   (4)

where b is the smallest integer such that (1 + ε/8)^b ≥ m + 1. Thus, b ≤ ln(m +
1)/ ln(1 + ε/8) + 1 = O((1/ε) log m), since ln(1 + ε/8) ≥ (ε/8)/(1 + ε/8). To see that this
search yields the desired value for s, note that there exists a nonnegative integer
i ≤ b such that L∗max ∈ [(1 + ε/8)^i , (1 + ε/8)^{i+1} ] and therefore with the above search
we find a value s ≤ (1 + ε/8)L∗max for which LP (R, s, m + 1 + ρ′ ) has a feasible
solution. Linear program LP (R, s, m + 1 + ρ′ ) assumes that the length of each
snapshot is increased by ρ′ , and therefore the total length of the solution is at most
(1 + ε/8)L∗max + gρ′ ≤ (1 + ε/4)L∗max .

Lemma 4. A solution for LP (R, s, m + 1 + ρ′ ), with s ≤ (1 + ε/8)L∗max and
ρ′ = ε/(8g), of value at most (1 + ε/4)L∗max can be found in linear time.
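The geometric binary search for s can be sketched as follows (`lp_feasible` is a hypothetical oracle standing in for the feasibility test of LP (R, s, m + 1 + ρ′ ), assumed monotone in s; the value m + 1 is always assumed feasible, as guaranteed by Lemma 1):

```python
import math

def find_s(m, eps, lp_feasible):
    """Binary search over the geometric grid (1+eps/8)^i, returning the
    smallest feasible candidate; m+1 is assumed feasible (Lemma 1)."""
    base = 1 + eps / 8
    b = math.ceil(math.log(m + 1, base))         # smallest b with base**b >= m+1
    candidates = [base ** i for i in range(1, b)] + [m + 1]
    lo, hi = 0, len(candidates) - 1
    while lo < hi:                               # O(log((1/eps) log m)) probes
        mid = (lo + hi) // 2
        if lp_feasible(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]

# toy oracle: feasible exactly when s >= L*_max
L_star = 2.0
s = find_s(m=8, eps=0.4, lp_feasible=lambda x: x >= L_star)
assert L_star <= s <= (1 + 0.4 / 8) * L_star
```

Because consecutive candidates differ by a factor of 1 + ε/8, the smallest feasible candidate overshoots L∗max by at most that factor.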
Using a technique similar to that in [8], we can modify any feasible solution for
LP (R, s, m + 1 + ρ′ ) to get a new feasible solution in which all but a constant
number of variables xjis have integer values. Moreover we can do this rounding
step in linear time.
Lemma 5. A solution for LP (R, s, m+1+ρ′ ) can be transformed in linear time
into another solution for LP (R, s, m + 1 + ρ′ ) in which the set F of jobs that
still have fractional assignments after the rounding procedure has size |F| ≤ mg.
2.5 Generating a Feasible Schedule
To get a feasible schedule from the solution of the linear program we need to
remove all jobs in F that received fractional assignments. These jobs are scheduled
sequentially at the beginning of the schedule.
For every operation of the small jobs, consider its processing time according
to the machine selected for it by the solution of the linear program. Let V be the
set formed by the small jobs containing at least one operation with processing
time larger than τ = ε/(8µ³mg). Note that |V| ≤ m(m + 1)/τ = 8µ³m²(m + 1)g/ε. We remove
from the snapshots all jobs in V and place them sequentially at the beginning of
the schedule.
Let O(ℓ) be the set of operations from small jobs that remain in snapshot
M (ℓ). Let pmax (ℓ) be the maximum processing time among the operations in
O(ℓ). Every snapshot M (ℓ) defines an instance of the job shop problem, since
the solution of the linear program assigns a unique machine to every operation.
Hence we can use Sevastianov’s algorithm [11] to find in O(n2 µ2 m2 ) time a
feasible schedule for the operations O(ℓ); this schedule has length at most t̄ℓ =
tℓ + ρ′ + µ3 mpmax (ℓ). We must increase the length of every snapshot M (ℓ) to
t̄ℓ to accommodate the schedule produced by Sevastianov’s algorithm. Summing
up all these enlargements, we get:
Lemma 6.

Σ_{ℓ=1}^{g} µ³m pmax (ℓ) ≤ µ³mgτ = ε/8 ≤ (ε/8)L∗max .   (5)
The total length of the snapshots M (αij ), . . . , M (βij ) that contain a long operation
Oij might be larger than pij . This creates some idle times on machine mij .
We start each operation Oij of the long jobs L at the beginning of the enlarged
snapshot M (αij ). The resulting schedule is clearly feasible. Let P (J ′ ) = Σ_{Jj ∈ J′} Pj
be the total processing time of all jobs in some set J ′ ⊆ J when the operations
of those jobs are assigned to the machines with the lowest processing times.

Lemma 7. A feasible schedule for the jobs J of length at most (1 + (3/8)ε)L∗max +
P (F ∪ V) can be found in O(n²) time.
We can choose the number k of long jobs so that P (F ∪ V) ≤ (ε/8)L∗max .

Lemma 8. [7] Let {d1 , d2 , . . . , dn } be positive values with Σ_{j=1}^{n} dj ≤ m. Let q
be a nonnegative integer, α > 0, and n ≥ (q + 1)^⌈1/α⌉ . Then there exists an integer k
such that dk+1 + . . . + dk+qk ≤ αm and k ≤ (q + 1)^⌈1/α⌉ .

Let us choose α = ε/(8m) and q = (8µ³m(m + 1)/ε + 1) m(2µ + χ + 1). By Lemma
5, |F ∪ V| ≤ mg + 8µ³m²(m + 1)g/ε ≤ qk. By Lemma 8 it is possible to choose a
value k ≤ (q + 1)^⌈1/α⌉ so that the total processing time of the jobs in F ∪ V is
at most αm = ε/8 ≤ (ε/8)L∗max . This value of k can clearly be computed in constant time.
We select the set L of long jobs as the set consisting of the k jobs with largest
minimum processing times Pj = Σ_{i=1}^{µ} [min_{s ∈ M_ij} p^s_ij ].

Lemma 9.

P (F ∪ V) ≤ (ε/8)L∗max .   (6)
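The selection of k promised by Lemma 8 can be realized by probing the geometric sequence k = 1, q + 1, (q + 1)², . . . : the probed windows are disjoint and the total weight is at most m, so fewer than 1/α probes can fail. A sketch (the helper name and toy data are ours, not the paper's):

```python
import math

def choose_k(d, q, alpha, m):
    """Probe k = 1, (q+1), (q+1)^2, ... and return the first k whose window
    d_{k+1}, ..., d_{k+qk} (a 0-based slice below) has weight <= alpha*m.
    The windows are disjoint, so fewer than 1/alpha probes can fail."""
    k = 1
    while True:
        if sum(d[k:k + q * k]) <= alpha * m:
            return k
        k *= q + 1

# toy data: three heavy jobs followed by many light ones, total weight <= m
d = [0.5] * 3 + [0.01] * 200
k = choose_k(d, q=2, alpha=0.1, m=4)
assert sum(d[k:k + 2 * k]) <= 0.1 * 4
assert k <= (2 + 1) ** math.ceil(1 / 0.1)
```

Since only ⌈1/α⌉ candidate values of k are ever examined, the selection runs in constant time for fixed q and α.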
Theorem 1. For any fixed m and µ, there is an algorithm for the flexible job
shop scheduling problem that computes, for any value ε > 0, a feasible schedule
of length at most (1 + ε)L∗max in O(n) time.
Proof. By Lemmas 7 and 9, the above algorithm finds in O(n²) time a schedule of
length at most (1 + ε/2)L∗max . This algorithm can handle 1 + 2/ε distinct delivery
times. By the discussion at the beginning of Section 2.1 it is easy to modify the
algorithm so that it handles arbitrary delivery times and yields a schedule of
length at most (1 + ε)L∗max . For every fixed m, µ, and ε, all computations can
be carried out in O(n) time, with the exception of the algorithm of Sevastianov,
which runs in O(n²) time. The latter can be sped up to linear time by “gluing”
pairs of small jobs together as described in [8]. ⊓⊔
3 Preemptive Flexible Job Shop Problem with Migration
Let t be a nonnegative variable, with t ≥ δ1 , that denotes the length of a schedule
(with delivery times). Consider χ time intervals defined as follows: [0, t − δ1 ], [t −
δ1 , t − δ2 ], ..., [t − δχ−1 , t − δχ ], where δ1 > . . . > δχ are the delivery times. First we
ignore the release times and select the machines and the time intervals in which
the operations of every job are going to be processed. To do this we define a
linear program LP (similar to that of Sect. 2.3) that minimizes the value of t.
The optimum solution of the LP has value no larger than L∗max . Again, by using
the Logarithmic Potential Price Directive Decomposition Method [5] and the
rounding technique of [8], we can compute in linear time a (1 + 4ε )-approximate
solution S̃ of the LP such that the size of the set F of jobs that receive fractional
assignments is bounded by mχ.
Let P denote the set of jobs from J \F for which at least one operation
has processing time greater than εt̃/(4χµ³m(1 + ε/8)), where t̃ is the value of S̃. Let
L = F ∪ P and S = J \L. According to S̃, find a feasible schedule σS for the jobs
from S applying Sevastianov’s algorithm. We use the algorithm of Sevastianov to
find a schedule for the operations assigned to each time interval. The maximum
delivery completion time (when release times are ignored) of σS is at most (1 +
ε)L∗max . By adding release times the length of σS is at most (2 + ε)L∗max , since
the maximum release time cannot be more than L∗max . Again, the algorithm of
Sevastianov can be sped up to take O(n) time, and computing the schedule σS
takes linear time.
Now, we ignore the delivery times for jobs from L (they are considered later).
We note that the cardinality of the set L is bounded by O(µ³m²/ε²). As we did for the
delivery times, the release times can be interpreted as additional operations of
jobs that have to be processed on a non-bottleneck machine. Because of this
interpretation, we can add to the set OL of operations from L a set R of release
operations O0j with processing times rj . Each job Jj ∈ L has to perform its
release operation O0j on a non-bottleneck machine at the beginning. A relative
order R is an ordered sequence of the starting and finishing times of all operations
from OL ∪ R, such that there is a feasible schedule for L that respects R. The
ordering of the starting and finishing times of the operations divides the time line
into intervals. We observe that the number of intervals g is bounded by O(µ⁴m²/ε²).
Note that a relative order is defined without assigning operations of long jobs to
machines.
For every relative order R we define a linear program to assign (fractions
of) operations to machines and intervals that respects R. We build all relative
orders R and solve the corresponding linear programs in constant time. At the
end we select a relative order R∗ with the smallest solution value. We can
show that the value of this solution is no larger than the length of an optimum
preemptive schedule for L. This solution is in general not a feasible schedule for
L since the order of fractions of operations within an interval could be incorrect.
However the set of operations assigned to each interval gives an instance of the
preemptive open shop problem, which can be solved exactly in constant time [9].
This can be done without increasing the length of each interval, and with only
a constant number of preemptions in total (at most O(m²) preemptions per
interval). By adding the delivery times the length of the computed schedule σL
is at most 2L∗max . The output schedule is obtained by appending σS after σL .
Theorem 2. For any fixed m, µ, and ε > 0, there is a (2 + ε)-linear-time
approximation algorithm for the preemptive flexible job shop scheduling problem
with migration.
References
1. P. Brandimarte, Routing and scheduling in a flexible job shop by tabu search,
Annals of Operations Research, 22, 158-183, 1993.
2. L.A. Goldberg, M. Paterson, A. Srinivasan, and E. Sweedyk, Better approximation
guarantees for job-shop scheduling, Proceedings of the 8th Symposium on Discrete
Algorithms (SODA 97), 599-608.
3. T. Gonzalez and S. Sahni, Flowshop and jobshop schedules: complexity and approximation, Operations Research 26 (1978), 36-52.
4. R.L. Graham, E.L. Lawler, J.K. Lenstra, and A.H.G. Rinnooy Kan, Optimization and approximation in deterministic sequencing and scheduling, Ann. Discrete
Math. 5 (1979), 287-326.
5. M.D. Grigoriadis and L.G. Khachiyan, Coordination complexity of parallel price-directive decomposition, Mathematics of Operations Research 21 (1996), 321-340.
6. L.A. Hall and D.B. Shmoys, Approximation algorithms for constrained scheduling
problems, Proceedings of the IEEE 30th Annual Symposium on Foundations of
Computer Science (FOCS 89), 134-139.
7. K. Jansen and L. Porkolab, Linear-time approximation schemes for scheduling
malleable parallel tasks, Proceedings of the 10th Annual ACM-SIAM Symposium
on Discrete Algorithms, (SODA 99), 490-498.
8. K. Jansen, R. Solis-Oba and M.I. Sviridenko, A linear time approximation scheme
for the job shop scheduling problem, Proceedings of the Second International Workshop on Approximation Algorithms (APPROX 99), 177-188.
9. E. L. Lawler, J. Labetoulle, On Preemptive Scheduling of Unrelated Parallel Processors by Linear Programming, Journal of the ACM , vol. 25, no. 4, pp. 612–619,
October 1978.
10. E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys, Sequencing
and scheduling: Algorithms and complexity, in: Handbook in Operations Research
and Management Science, Vol. 4, North-Holland, 1993, 445-522.
11. S.V. Sevastianov, Bounding algorithms for the routing problem with arbitrary
paths and alternative servers, Cybernetics 22 (1986), 773-780.
12. D.B. Shmoys, C. Stein, and J. Wein, Improved approximation algorithms for shop
scheduling problems, SIAM Journal on Computing 23 (1994), 617-632.
13. R.J.M. Vaessens, Generalized job shop scheduling: complexity and local search,
Ph.D. thesis (1995), Eindhoven University of Technology.
14. D. Williamson, L. Hall, J. Hoogeveen, C. Hurkens, J. Lenstra, S. Sevastianov,
and D. Shmoys, Short shop schedules, Operations Research 45 (1997), 288-294.
Emerging Behavior as Binary Search Trees Are
Symmetrically Updated
Stephen Taylor
College of the Holy Cross, Worcester MA 01610-2395, USA,
staylor@holycross.edu
Abstract. When repeated updates are made to a binary search tree,
the expected search cost tends to improve, as observed by Knott. For
the case in which the updates use an asymmetric deletion algorithm, the
Knott effect is swamped by the behavior discovered by Eppinger. The
Knott effect applies also to updates using symmetric deletion algorithms,
and it remains unexplained, along with several other trends in the tree
distribution. It is believed that updates using symmetric deletion do not
cause search cost to deteriorate, but the evidence is all experimental.
The contribution of this paper is to model separately several different
trends which may contribute to or detract from the Knott effect.
1 Background
A binary search tree (BST) is a tree structure with a key value stored in each
node. For each node, the key value is an upper bound on the values of keys
in the left subtree, and a lower bound on keys in the right subtree. If there
are no duplicate keys, a search of the tree for any given key value involves
examining nodes in a single path from the root. An insertion into the tree is
made by searching for a candidate key, then placing it as a child of the last node
reached in the search, so that an inserted key is always a leaf. Deletions are more
complicated, and use one of the algorithms described in Section 2.2.
When a BST with n nodes is grown by random insertions (RI) with no
deletions, the average search cost is O(log n), or equivalently, the total pathlength
from the root to every node in the tree (this is the internal pathlength, or IPL)
is O(n log n).
An update consists of deleting the node with some particular key-value, and
inserting another, either with the same key, or a different key.
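A minimal sketch of these definitions (the class and function names are ours; keys are assumed distinct):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(t, key):
    """Leaf insertion: search for the key, then attach it as a child of the
    last node reached, so an inserted key is always a leaf."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return t

def ipl(t, depth=0):
    """Internal pathlength: total distance from the root to every node."""
    if t is None:
        return 0
    return depth + ipl(t.left, depth + 1) + ipl(t.right, depth + 1)

t = None
for key in [4, 2, 6, 1, 3, 5, 7]:     # this order builds a balanced tree
    t = insert(t, key)
assert ipl(t) == 0 + 2 * 1 + 4 * 2    # one node at depth 0, two at 1, four at 2
```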
Culberson [CM90] refers to the leftward-only descendants of the root as the
backbone, and the distances between key-values of the backbone as intervals.
[Bri86] calls the backbone and the corresponding rightward-only descendants of
the root the shell. The length of the shell is the pathlength from smallest to
largest key. Shell intervals are defined by the key-values of the shell nodes.
1.1 Related Work
When repeated updates are made to a binary search tree, the expected search
cost tends to improve. This Knott effect was first reported in [Kno75]. It turns out
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 78–87, 2000.
c Springer-Verlag Berlin Heidelberg 2000
that for updates using asymmetric [Hib62] deletion, the Knott effect is swamped
by the Eppinger effect. In [Epp83] Eppinger observed that after O(n2 ) updates,
the tree of size n has expected search time greater than O(ln n). There is a
striking early improvement and search costs drop to about 92% of initial values,
and then, after about n2 /2 iterations, the cost begins to rise. It levels out after
about n2 iterations. For a tree with 128 nodes, the final search cost levels out
to be about the same as for a tree built with insertions only; smaller trees, as
conjectured by Knott, fare better; but larger trees do worse. For trees of 2048
nodes, the asymptotic search cost is about 50% greater than for an RI BST.
Culberson [CM90] has given a model which explains this for updates in which
the item removed is always the item re-inserted, which he calls the Exact Fit
Domain (EFD) model. Culberson’s model is based
on directed random walks,
√
and finds that the expected search cost is O( n). We call similar undirected
random walks in trees updated with symmetrical deletion the Culberson effect,
although the time scales and results are different.
The Knott effect remains unexplained; and it is not the only unexplained
behavior. Simulations reported in Evans and Culberson [EC94] for two symmetric update algorithms show a reduced average pathlength, as predicted by the
Knott effect, but also that pathlengths from the root to the largest and smallest
leaves (and perhaps to some other, unmeasured subset of nodes) were 1.2 to 1.3
times longer than would be expected in a random binary search tree. We call
this the Evans effect.
[JK78] demonstrate analytically that Knott’s conjecture is correct for trees
with three nodes. [BY89] does the same for trees with four nodes. [Mes91] analyzes an update using symmetric deletion for the tree of three nodes.
Martinez and Roura [MR98] provide randomized algorithms which maintain
the distribution of trees after update to be the same as a RI binary search tree.
Their algorithms are not susceptible to the breakdown caused by sorted input,
nor to the Eppinger or Culberson effects. However, they are also immune to the
Knott effect, and thus miss any improvements in search times it might provide.
There are several open questions about update with symmetric deletion.
1. Does the Knott conjecture hold if it is revised for symmetric deletion?
2. If so, why? Is there a model which explains why stirring up the tree should
result in shorter internal path lengths? The effect is apparent even in Hibbard
deletions, before it is overwhelmed by the skewing of the tree.
3. Is there a long-term degeneration in the tree? If so, it must be over a much
longer term than the Hibbard deletion degeneration, because Eppinger’s and
Culberson’s simulations did not detect it.
2 Update Methodology

2.1 Exact Fit Domain Model
Culberson [CM89,CM90] proposed the unrealistic Exact Fit Domain (EFD)
model to simplify analysis. The assumption is there are only n keys possible
80
S. Taylor
and no duplicates in the tree, so that when an update occurs, the new key must
be the same as the one which was just deleted. This has the effect of localizing
the effects of update operations, making them easier to analyze. We assert without mathematical justification that the EFD model gives qualitatively similar
results to a more realistic random update model. Since repeated insertions result
in relatively well-understood behavior if they are not in the neighborhood of a
deletion, we claim that the EFD model simply telescopes the effect of separated
deletions and insertions in the same area. Time scales for emerging behavior may
be changed by the EFD model, but perhaps not other measurements. In support
of this suggestion, note the graphs of Fig. 1. (Grafting deletion is defined below.)
[Figure: (a) Comparing Shell Sizes; (b) Comparing IPL — shell size and IPL plotted against the number of updates (10³ to 10⁶) for EFD grafting, EFD one-level, grafting, and one-level deletions.]

Fig. 1. Simulations with and without EFD.
2.2 Deletion Algorithms
Hibbard’s Asymmetrical Deletion Algorithm When Hibbard formulated
his deletion algorithm for binary trees in [Hib62], he was aware of the asymmetry.
His algorithm has two steps:
1. If the right subtree of the node to be deleted is not empty, replace the key of
the deleted node with its successor, the left-most node in the right subtree.
Then delete the successor.
2. If the right subtree of the node to be deleted is empty, replace the node with
its left subtree.
There are two different asymmetries: the deleted node is preferentially replaced
from the right subtree, and the case in which the right subtree is empty has
no matching simple case for an empty left subtree.
Emerging Behavior as Binary Search Trees Are Symmetrically Updated
81
Hibbard proved that his asymmetric deletion did not change the distribution
of tree shapes. He assumed that this meant that a symmetric algorithm was
unnecessary.
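Hibbard's two-step deletion can be sketched as follows (a self-contained illustration; the class and helper names are ours):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(t, key):                      # standard leaf insertion
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return t

def hibbard_delete(t, key):
    if t is None:
        return None
    if key < t.key:
        t.left = hibbard_delete(t.left, key)
    elif key > t.key:
        t.right = hibbard_delete(t.right, key)
    elif t.right is None:
        return t.left                    # step 2: graft the left subtree
    else:                                # step 1: copy in the successor's key...
        s = t.right
        while s.left is not None:
            s = s.left
        t.key = s.key
        t.right = hibbard_delete(t.right, s.key)   # ...then delete the successor
    return t

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

t = None
for k in [5, 2, 8, 1, 4, 7, 9]:
    t = insert(t, k)
t = hibbard_delete(t, 5)                 # the successor 7 replaces the root key
assert inorder(t) == [1, 2, 4, 7, 8, 9] and t.key == 7
```

Both asymmetries are visible here: the replacement key always comes from the right subtree, and only an empty right subtree triggers the grafting shortcut.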
Symmetrical Grafting Deletion This algorithm is a combination of the Hibbard
deletion algorithm and its mirror image. The right-favored and left-favored
versions of the deletion are used with equal probability, so the algorithm is
symmetrical. In our simulations, we use simple alternation rather than a random
number generator to decide which to use, but check for empty subtrees before
considering the successor or predecessor key. We call this grafting deletion because
the subtree is grafted into the place of the deleted node when possible.
Most published symmetric deletion algorithms are variants on grafting deletion.
Simulations, for example Fig. 2, show that one property of grafting deletion is
that zero-size subtrees rapidly become less common than in RI BSTs.
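The combination of the two mirror-image deletions can be sketched as follows (a self-contained illustration with our own names; alternation is driven by an explicit flag, as in the simulations):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return t

def grafting_delete(t, key, favor_right):
    if t is None:
        return None
    if key < t.key:
        t.left = grafting_delete(t.left, key, favor_right)
    elif key > t.key:
        t.right = grafting_delete(t.right, key, favor_right)
    elif t.left is None:              # an empty subtree: graft the other one
        return t.right
    elif t.right is None:
        return t.left
    elif favor_right:                 # Hibbard's step: replace by the successor
        s = t.right
        while s.left is not None:
            s = s.left
        t.key = s.key
        t.right = grafting_delete(t.right, s.key, favor_right)
    else:                             # mirror image: replace by the predecessor
        p = t.left
        while p.right is not None:
            p = p.right
        t.key = p.key
        t.left = grafting_delete(t.left, p.key, favor_right)
    return t

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

t = None
for k in [5, 2, 8, 1, 4, 7, 9]:
    t = insert(t, k)
t = grafting_delete(t, 5, True)       # right-favored turn
t = grafting_delete(t, 2, False)      # left-favored turn (simple alternation)
assert inorder(t) == [1, 4, 7, 8, 9]
```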
[Figure: fraction of zero-size left subtrees plotted against the number of updates (10³ to 10⁶), for grafting and non-grafting deletions.]

Fig. 2. Zero-size left-subtrees of root in 256 node BST.
Symmetrical Non-Grafting Deletion This is a symmetric deletion algorithm
which lacks an optimization for empty subtrees. The algorithm replaces a deleted
node with its successor or predecessor in the tree; when the favored one does not
exist, the other is used instead. If there is neither predecessor
nor successor (the node to be deleted is a leaf), the node is simply removed.
The algorithm alternates between favoring predecessors and successors, so
it is symmetrical. Because it lacks an optimization to reduce the height of the
tree by grafting a subtree nearer the root when the other subtree is empty, we
might expect that it would produce a distribution of binary search trees which
includes rather more zero-sized subtrees than algorithms which include such an
optimization. (These include the asymmetrical Hibbard deletion algorithm.)
This algorithm is easier to analyze using a Markov chain, because the state-space of trees of size n can be described by a single variable, the size of the left
subtree. In the particularly easy case of the Exact Fit Domain, in which the
replacement key in an update is always the same as the key deleted, the size of
the subtree can change only if the root is deleted, and only by one.
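A sketch of this deletion (our own names; the cascade of replacements continues until a leaf is removed, since no subtree is ever grafted):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    return t

def ng_delete(t, key, favor_succ):
    if t is None:
        return None
    if key < t.key:
        t.left = ng_delete(t.left, key, favor_succ)
    elif key > t.key:
        t.right = ng_delete(t.right, key, favor_succ)
    elif t.left is None and t.right is None:
        return None                       # a leaf is simply removed
    else:
        # use the favored neighbour if it exists, otherwise the other one
        use_succ = t.right is not None and (favor_succ or t.left is None)
        if use_succ:
            s = t.right
            while s.left is not None:
                s = s.left
            t.key = s.key
            t.right = ng_delete(t.right, s.key, favor_succ)
        else:
            p = t.left
            while p.right is not None:
                p = p.right
            t.key = p.key
            t.left = ng_delete(t.left, p.key, favor_succ)
    return t

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

t = None
for k in [5, 2, 8, 1, 4, 7, 9]:
    t = insert(t, k)
t = ng_delete(t, 5, True)
assert inorder(t) == [1, 2, 4, 7, 8, 9]
```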
Assume the BST has n nodes, and let πk,t be the probability that the left subtree
has k nodes after t updates. When the root is deleted, for time t > 0 we have for
each t the n simultaneous equations (here we use Iverson notation [GKP89]:
[P ](term) evaluates to term if P is true, otherwise to zero):

πk,t = (1 − 1/n) πk,t−1 + [k > 0] (1/(2n)) πk−1,t−1 + [k < n] (1/(2n)) πk+1,t−1   (1)

and assuming that there is a steady state, we can rewrite this as

πk,∞ = (1 − 1/n) πk,∞ + [k > 0] (1/(2n)) πk−1,∞ + [k < n] (1/(2n)) πk+1,∞   (2)

With the additional equation Σk πk,∞ = 1 we can solve the system to find

π0,∞ = πn−1,∞ = 1/(2(n − 1))   (3)

πk,∞ = [0 < k < n − 1] · 1/(n − 1)   (4)
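The steady state can be checked numerically. In the sketch below (our own code) we make the boundary behaviour of the walk explicit: when one subtree is empty, the replacement key can only come from the other side, so the walk moves inward with the full root-deletion probability 1/n. Power-iterating this chain reproduces the distribution of (3)-(4):

```python
n = 8
pi = [1.0 / n] * n                     # left-subtree size k ranges over 0..n-1
for _ in range(20000):                 # power-iterate the transition matrix
    new = [0.0] * n
    for k, p in enumerate(pi):
        new[k] += (1 - 1 / n) * p              # root not deleted
        if k == 0:
            new[1] += p / n                    # no predecessor: move right
        elif k == n - 1:
            new[k - 1] += p / n                # no successor: move left
        else:
            new[k - 1] += p / (2 * n)
            new[k + 1] += p / (2 * n)
    pi = new

expected = [1 / (2 * (n - 1))] + [1 / (n - 1)] * (n - 2) + [1 / (2 * (n - 1))]
assert all(abs(a - b) < 1e-9 for a, b in zip(pi, expected))
```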
3 Emerging Behavior

3.1 The Knott Effect
The Knott effect is the observed tendency of a binary search tree to become
more compact. That is, after a number of deletion and insertion operations are
performed on a random binary search tree, the resulting trees have smaller IPL
and therefore smaller search times. Knuth speculates that this effect may be due
to the tendency of (grafting) delete operations to remove empty subtrees.
An RI BST has one of the two worst possible keys at the root of any subtree with probability 2/|size of subtree|. As a result of updates it evolves toward a steady state in which the probability of zero-sized subtrees is smaller. For the case of update with non-grafting deletion, in the steady state, every subtree size except zero and the largest possible is equally probable, and those two sizes are half as likely as the others, as we have shown in eq. (3) and (4).
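As a numerical sanity check of this steady state, one can iterate the chain directly. In the sketch below, the boundary behavior (an empty subtree forces the other replacement choice, so the boundary sizes move inward with probability 1/n) is our reading of the update process, not spelled out in the text:

```python
def steady_state(n, iters=20000):
    # Left-subtree-size chain for Exact Fit Domain updates: the root is
    # deleted with probability 1/n; interior sizes move +-1 with equal
    # probability, boundary sizes are forced inward (our assumption: a
    # missing predecessor/successor forces the other replacement choice).
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [0.0] * n
        for k in range(n):
            new[k] += (1 - 1.0 / n) * pi[k]
            if k == 0:
                new[1] += pi[0] / n
            elif k == n - 1:
                new[n - 2] += pi[n - 1] / n
            else:
                new[k - 1] += pi[k] / (2 * n)
                new[k + 1] += pi[k] / (2 * n)
        pi = new
    return pi
```

For n = 5 this converges to (1/8, 1/4, 1/4, 1/4, 1/8), agreeing with eq. (3) and (4).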
If we make the assumption that subtrees have the same distribution as the root, this leads naturally to a recurrence for the IPL of such a tree.

$$f_n = \begin{cases} 0 & n = 0 \text{ or } n = 1 \\ n - 1 + \dfrac{f_{n-1}}{n-1} + 2\sum_{i=1}^{n-2} \dfrac{f_i}{n-1} & n > 1 \end{cases} \qquad (5)$$
Emerging Behavior as Binary Search Trees Are Symmetrically Updated
83

Fig. 3. Internal Pathlength declines as tree updated. (Plot: IPL against number of updates, for grafting deletions and one-level deletions.)
$$f_n = \begin{cases} 0 & n = 0 \text{ or } n = 1 \\ f_{n-1} + \dfrac{f_{n-2}}{n-1} + \dfrac{2n-3}{n-1} & n > 1 \end{cases} \qquad (6)$$
This can be evaluated numerically and compared with the corresponding recurrence for an RI BST. The comparison shows that for very large values of n, f_n grows quite close to IPL_n; only for small to intermediate values do they diverge.
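Recurrence (6) is cheap to evaluate numerically; a minimal sketch follows (the RI BST comparison value 2(n+1)H_n − 4n is the standard expected-IPL formula for a randomly built BST, an assumption not derived in this paper):

```python
from math import fsum

def steady_ipl(n_max):
    # f_n from recurrence (6): f_n = f_{n-1} + f_{n-2}/(n-1) + (2n-3)/(n-1)
    f = [0.0, 0.0]  # f_0 = f_1 = 0
    for n in range(2, n_max + 1):
        f.append(f[n - 1] + f[n - 2] / (n - 1) + (2 * n - 3) / (n - 1))
    return f

def ri_ipl(n):
    # standard expected IPL of a randomly inserted BST: 2(n+1)H_n - 4n
    return 2 * (n + 1) * fsum(1.0 / k for k in range(1, n + 1)) - 4 * n
```

Direct evaluation gives f_2 = 1 and f_3 = 5/2, matching hand computation from (5).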
3.2
The Evans Effect
The Evans effect is reported in [EC94] for search trees updated with symmetric
deletion algorithms. They report shells which are 1.2 to 1.3 times as long as those
of a RI BST. Presumably there are subtree shells to which the effect would also
apply, but clearly not to every path from the root, since the same simulations
also showed a Knott effect reduction in average pathlength. Figure (4) shows the
Evans effect. Note that the Evans effect doesn’t hold for non-grafting deletions;
the shell size (which is after all, the sum of two paths) follows the IPL down. For
grafting deletions, the shell size gradually rises. This suggests that the Evans
effect might be due to the grafting of subshell backbone unto the shell.
We can easily compute the expected size of the initial size of the shell in a
RI BST. By symmetry, the size of the shell should be twice the length of the
backbone, and this turns out to have a simple recurrence.
Fig. 4. Changes in shell size as tree updated. (Plot: shell size against number of updates, for grafting deletions and one-level deletions.)
A tree with one node has a backbone length of zero. A tree with n nodes has a left subtree of size k, 0 ≤ k < n, with probability 1/n, and so

$$b_n = \begin{cases} 0 & n = 1 \\ \sum_{i=1}^{n-1} \dfrac{1+b_i}{n} & n > 1, \end{cases} \qquad (7)$$

which has the solution $b_n = H_n - 1 \approx \gamma - 1 + \ln n$. The expected size of an RI BST shell is then

$$E(\text{shell}) = 2\gamma - 2 + 2\ln n \approx 2\ln n - 0.845568670 \qquad (8)$$
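Both recurrence (7) and its closed form can be checked numerically; a minimal sketch:

```python
from math import fsum

def backbone(n_max):
    # b_n = sum_{i=1}^{n-1} (1 + b_i) / n, with b_1 = 0   (recurrence (7))
    b = [0.0, 0.0]  # b[0] unused; b[1] = 0
    for n in range(2, n_max + 1):
        b.append(fsum(1 + b[i] for i in range(1, n)) / n)
    return b

def harmonic(n):
    return fsum(1.0 / k for k in range(1, n + 1))
```

The values b_n agree with H_n − 1 for every n checked.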
The root of an n-node tree which has evolved to a steady state with one-level deletion will have its left or right subtree empty with probability $\frac{1}{2(n-1)}$, and left subtrees of size k, 0 < k < n − 1, with probability $\frac{1}{n-1}$. This leads to a recurrence for the length of the backbone:

$$c_n = \begin{cases} 0 & n = 1 \\ \dfrac{1+c_{n-1}}{2(n-1)} + \sum_{i=1}^{n-2} \dfrac{1+c_i}{n-1} & n > 1, \end{cases} \qquad (9)$$
This doesn't solve so quickly or easily, but the equivalent form

$$c_{n+1} = \left(1 - \frac{1}{2n}\right) c_n + \frac{c_{n-1}}{2n} + \frac{1}{n} \qquad (10)$$

suggests that c_n is smaller but asymptotically grows at almost the same rate as b_n.
Similarly, according to our observation of multi-level delete, the left subtree evolves to be almost never empty. So a recurrence for the backbone for such trees can neglect the zero case.

$$d_n = \begin{cases} 0 & n = 1 \\ \sum_{i=1}^{n-2} \dfrac{1+d_i}{n-2} & n > 1, \end{cases} \qquad (11)$$
3.3 The Culberson Effect
The Culberson Effect, as explained in [CM89] and [CM90] is the tendency of
interval endpoints along the shell of the binary search tree to engage in a random
walk as time passes. Collisions between endpoints cause adjacent intervals to
combine, so that the subsequent expected position and size of the resulting
coalesced interval differs from the expected position and size of either of the two
original intervals.
In Culberson’s formulation for asymmetric update, the random walk is directed; in the case of symmetric deletion, the random walk is undirected, and
therefore the effect is more subtle. Figure (5a) shows interval sizes near the root
as they evolve with one-level-deletion. Figure (5b) illustrates them for grafting
deletion.

Fig. 5. Subtree sizes on shell grow as tree updated. ((a) non-grafting updates; (b) grafting updates. Each panel plots the key rank of the root, the root's left child, the leftmost grandchild, and the leftmost great grandchild against number of updates.)

As each node is deleted (which may occur on any update with a probability of 1/n) the key value for that node will take on the value of either the predecessor or the successor node, that is, move either right or left (grafting deletion may cause the key value to skip over several intermediate keys). After the
deletion, one end of the interval defined by the node has moved. The expected
position of the node has not changed, but the expected distance from the initial
position is one step.
An interval is identified by its relative position in the shell, not its bounds. Thus, we can speak of the movement of an interval when deletion of one of its endpoint nodes causes a new key value to appear at one end of the interval.
This isn’t quite Brownian motion, since colliding intervals coalesce instead of
rebounding, but after ‘long enough’ one might expect the intervals nearer the
root in the shell to be pushing outward.
There are several special cases. When a shell node deletion occurs through
key replacement by the key of its successor or predecessor, which will usually
not be a shell node, the interval has moved right (left.) However, if the successor
(predecessor) is a shell node with an empty left (right) subtree, there is now one
fewer node on the shell, and an interval has disappeared.
Following Culberson, we find that the endpoint of an interval moves either left
or right when it is (symmetrically) updated; that is, the key value which defines
the endpoint of an interval changes either up or down as the backbone node
which holds it is updated. If there were no interval collisions, the expected value
of the key would stay constant, while its variance would increase linearly with the
number of updates. Since the probability that an update to the tree will involve an endpoint is 1/n, the expected excursion of a key value is $O(\sqrt{\text{updates}/n})$.
But the size of an interval is bounded below by one. The interval would cease
to exist if its endpoints collided. So the expected size of (remaining) intervals
will increase as the tree is updated. This effect is clearly visible in fig. (5). It is
much more dramatic for non-grafting deletion, perhaps because in order for a
collision to take place the lower endpoint must have a zero-sized subtree, and
the grafting deletion algorithm prunes the population of zero-sized subtrees.
The Culberson effect should slightly increase IPL.
4 Recapitulation
Two unrealistic frameworks, Exact Fit Domain update and non-grafting deletion, are being used to begin understanding three effects in the evolution of binary search trees under update.
Tentatively it appears that the Knott effect may be less significant for a
large BST; the effect of fewer zero-size subtrees is predicted to disappear with
non-grafting deletion for trees of 100000 or more nodes.
Simulations show that the Culberson effect is still increasing after $n^{11/4}$ deletions. The fact that non-grafting deletion has a stronger Culberson effect needs
to be accounted for in modeling. Notice that zero-length subtrees, which figure in
the disappearance of intervals in the Culberson model, become quite rare when
grafting deletes are used, but are relatively common with non-grafting deletes.
What are the implications of the Evans effect for total IPL? Shell paths seem
to be shorter than average paths in the tree, even after Evans stretching, and we
would expect that shell lengths would be a less important contributor to IPL in
a larger BST.
Emerging Behavior as Binary Search Trees Are Symmetrically Updated
87
References
BY89. Ricardo A. Baeza-Yates. A Trivial Algorithm Whose Analysis Is Not: A Continuation. BIT, 29(3):278–394, 1989.
Bri86. Keith Brinck. On deletion in threaded binary trees. Journal of Algorithms, 7(3):395–411, September 1986.
CM89. Joseph Culberson and J. Ian Munro. Explaining the behavior of binary trees under prolonged updates: A model and simulations. The Computer Journal, 32(1), 1989.
CM90. Joseph Culberson and J. Ian Munro. Analysis of the standard deletion algorithms in exact fit domain binary search trees. Algorithmica, 5(3):295–311, 1990.
EC94. Patricia A. Evans and Joseph Culberson. Asymmetry in binary search tree update algorithms. Technical Report TR 94-09, University of Alberta Department of Computer Science, Edmonton, Alberta, Canada, May 1994.
Epp83. Jeffrey L. Eppinger. An empirical study of insertion and deletion in binary trees. CACM, 26(9):663–669, September 1983.
GBY91. Gaston H. Gonnet and Ricardo Baeza-Yates. Handbook of Algorithms and Data Structures. Addison-Wesley, 2nd edition, 1991.
GKP89. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics. Addison-Wesley, 1989.
Hib62. Thomas N. Hibbard. Some combinatorial properties of certain trees with applications to searching and sorting. JACM, 9:13–28, 1962.
JK78. Arne T. Jonassen and Donald E. Knuth. A trivial algorithm whose analysis isn't. Journal of Computer and System Sciences, 16:301–322, 1978.
Kno75. Gary D. Knott. Deletion in Binary Storage Trees. PhD thesis, Stanford University, 1975. Available as Tech. Rep. STAN-CS-75-491.
Knu97. Donald E. Knuth. The Art of Computer Programming: Volume 3 / Sorting and Searching. Addison-Wesley, 2nd edition, 1997.
MR98. Conrado Martinez and Salvador Roura. Randomized binary search trees. JACM, 45(2):228–323, March 1998.
Mes91. Xavier Messeguer. Dynamic behaviour in updating process over BST of size two with probabilistic deletion algorithms. IPL, 38:89–100, April 1991.
The LCA Problem Revisited

Michael A. Bender¹⋆ and Martín Farach-Colton²⋆⋆

¹ Department of Computer Science, State University of New York at Stony Brook, Stony Brook, NY 11794-4400, USA. Email: bender@cs.sunysb.edu.
² Department of Computer Science, Rutgers University, Piscataway, NJ 08855, USA. Email: farach@cs.rutgers.edu.
Abstract. We present a very simple algorithm for the Least Common
Ancestors problem. We thus dispel the frequently held notion that optimal LCA computation is unwieldy and unimplementable. Interestingly,
this algorithm is a sequentialization of a previously known PRAM algorithm.
1 Introduction
One of the most fundamental algorithmic problems on trees is how to find the
Least Common Ancestor (LCA) of a pair of nodes. The LCA of nodes u and v
in a tree is the shared ancestor of u and v that is located farthest from the root.
More formally, the LCA Problem is stated as follows: Given a rooted tree T ,
how can T be preprocessed to answer LCA queries quickly for any pair of nodes.
Thus, one must optimize both the preprocessing time and the query time.
The LCA problem has been studied intensively both because it is inherently
beautiful algorithmically and because fast algorithms for the LCA problem can
be used to solve other algorithmic problems.
In [HT84], Harel and Tarjan showed the surprising result that LCA queries
can be answered in constant time after only linear preprocessing of the tree
T . This classic paper is often cited because linear preprocessing is necessary
to achieve optimal algorithms in many applications. However, it is well understood that the actual algorithm presented is far too complicated to implement
effectively. In [SV88], Schieber and Vishkin introduced a new LCA algorithm.
Although their algorithm is vastly simpler than Harel and Tarjan’s—indeed,
this was the point of this new algorithm—it is far from simple and still not
particularly implementable.
The folk wisdom of algorithm designers holds that the LCA problem still
has no implementable optimal solution. Thus, according to hearsay, it is better
to have a solution to a problem that does not rely on LCA precomputation if
possible. We argue in this paper that this folk wisdom is wrong.
In this paper, we present not only a simplified LCA algorithm, we present
a simple LCA algorithm! We devise this algorithm by reëngineering an existing
⋆ Supported in part by ISX Corporation and Hughes Research Laboratories.
⋆⋆ Supported in part by NSF Career Development Award CCR-9501942, NATO Grant CRG 960215, NSF/NIH Grant BIR 94-12594-03-CONF.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 88–94, 2000.
c Springer-Verlag Berlin Heidelberg 2000
complicated LCA algorithm: in [BBG+89] a PRAM algorithm was presented that answers queries in O(α(n)) time and preprocesses with linear work.
work. Although at first glance, this algorithm is not a promising candidate for
implementation, it turns out that almost all of the complications are PRAM
induced: when the PRAM complications are excised from this algorithm so that
it is lean, mean, and sequential, we are left with an extremely simple algorithm.
In this paper, we present this reëngineered algorithm. Our point is not to
present a new algorithm. Indeed, we have already noted that this algorithm has
appeared as a PRAM algorithm before. The point is to change the folk wisdom so
that researchers are free to use the full power and elegance of LCA computation
when it is appropriate.
The remainder of the paper is organized as follows. In Section 2, we provide
some definitions and initial lemmas. In Section 3, we present a relatively slow
algorithm for LCA preprocessing. In Section 4, we show how to speed up the
algorithm so that it runs within the desired time bounds. Finally, in Section 5,
we answer some algorithmic questions that arise in the paper but that are not
directly related to solving the LCA problem.
2 Definitions
We begin by defining the Least Common Ancestor (LCA) Problem formally.
Problem 1. The Least Common Ancestor (LCA) problem:
Structure to Preprocess: A rooted tree T having n nodes.
Query: For nodes u and v of tree T , query lcaT (u, v) returns the least common
ancestor of u and v in T , that is, it returns the node furthest from the root
that is an ancestor of both u and v. (When the context is clear, we drop the
subscript T on the lca.)
The Range Minimum Query (RMQ) Problem, which seems quite different
from the LCA problem, is, in fact, intimately linked.
Problem 2. The Range Minimum Query (RMQ) problem:
Structure to Preprocess: A length n array A of numbers.
Query: For indices i and j between 1 and n, query rmq_A(i, j) returns the index of the smallest element in the subarray A[i . . . j]. (When the context is clear, we drop the subscript A on the rmq.)
In order to simplify the description of algorithms that have both preprocessing and query complexity, we introduce the following notation. If an algorithm has preprocessing time f(n) and query time g(n), we will say that the algorithm has complexity ⟨f(n), g(n)⟩.
Our solutions to the LCA problem are derived from solutions to the RMQ
problem. Thus, before proceeding, we reduce the LCA problem to the RMQ
problem. The following simple lemma establishes this reduction.
Lemma 1. If there is an ⟨f(n), g(n)⟩-time solution for RMQ, then there is an ⟨f(2n − 1) + O(n), g(2n − 1) + O(1)⟩-time solution for LCA.
As we will see, the O(n) term in the preprocessing comes from the time needed
to create the soon-to-be-presented length 2n − 1 array, and the O(1) term in the
query comes from the time needed to convert the RMQ answer on this array to
an LCA answer in the tree.
Proof: Let T be the input tree. The reduction relies on one key observation:
Observation 2 The LCA of nodes u and v is the shallowest node encountered
between the visits to u and to v during a depth first search traversal of T .
Therefore, the reduction proceeds as follows.
1. Let array E[1, . . . , 2n − 1] store the nodes visited in an Euler Tour of the tree T.¹ That is, E[i] is the label of the ith node visited in the Euler tour of T.
2. Let the level of a node be its distance from the root. Compute the Level
Array L[1, . . . , 2n − 1], where L[i] is the level of node E[i] of the Euler Tour.
3. Let the representative of a node in an Euler tour be the index of the first occurrence of the node in the tour²; formally, the representative of i is argmin_j {E[j] = i}. Compute the Representative Array R[1, . . . , n], where R[i] is the index of the representative of node i.
Each of these three steps takes O(n) time, yielding O(n) total time. To
compute lcaT (x, y), we note the following:
– The nodes in the Euler Tour between the first visits to u and to v are
E[R[u], . . . , R[v]] (or E[R[v], . . . , R[u]]).
– The shallowest node in this subtour is at index rmqL (R[u], R[v]), since L[i]
stores the level of the node at E[i], and the RMQ will thus report the position
of the node with minimum level. (Recall Observation 2.)
– The node at this position is E[rmqL (R[u], R[v])], which is thus the output
of lcaT (u, v).
Thus, we can complete our reduction by preprocessing Level Array L for RMQ.
As promised, L is an array of size 2n − 1, and building it takes time O(n). Thus,
the total preprocessing is f (2n − 1) + O(n). To calculate the query time observe
that an LCA query in this reduction uses one RMQ query in L and three array
references at O(1) time each. The query thus takes time g(2n − 1) + O(1), and
we have completed the proof of the reduction.
¹ The Euler Tour of T is the sequence of nodes we obtain if we write down the label of each node each time it is visited during a DFS. The array of the Euler tour has length 2n − 1 because we start at the root and subsequently output a node each time we traverse an edge. We traverse each of the n − 1 edges twice, once in each direction.
² In fact, any occurrence of i will suffice to make the algorithm work, but we consider the first occurrence for the sake of concreteness.
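The three arrays of the reduction can be sketched directly. The dict-of-children tree encoding below is our choice of representation, and the naive scan stands in for whatever RMQ structure is used:

```python
def euler_arrays(children, root):
    # E: Euler tour labels; L: levels; R: first-occurrence (representative)
    E, L, R = [], [], {}
    def dfs(u, depth):
        R.setdefault(u, len(E))
        E.append(u)
        L.append(depth)
        for c in children.get(u, []):
            dfs(c, depth + 1)
            E.append(u)      # output u again after returning from each child
            L.append(depth)
    dfs(root, 0)
    return E, L, R

def lca(E, L, R, u, v):
    i, j = sorted((R[u], R[v]))
    k = min(range(i, j + 1), key=L.__getitem__)  # naive rmq_L(i, j)
    return E[k]
```

For the 5-node tree {0: [1, 4], 1: [2, 3]} rooted at 0, the tour has length 2n − 1 = 9, and lca(2, 3) = 1 while lca(3, 4) = 0.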
From now on, we focus only on RMQ solutions. We consider solutions to the
general RMQ problem as well as to an important restricted case suggested by
the array L. In array L from the above reduction adjacent elements differ by +1
or −1. We obtain this ±1 restriction because, for any two adjacent elements in
an Euler tour, one is always the parent of the other, and so their levels differ by
exactly one. Thus, we consider the ±1-RMQ problem as a special case.
2.1 A Naïve Solution for RMQ
We first observe that RMQ has a solution with complexity ⟨O(n²), O(1)⟩: build a table storing answers to all of the n² possible queries. To achieve O(n²) preprocessing rather than the O(n³) naive preprocessing, we apply a trivial dynamic program. Notice that answering an RMQ query now requires just one array lookup.
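The trivial dynamic program can be sketched as follows (0-indexed here, unlike the paper's 1-indexed arrays):

```python
def rmq_table(A):
    # M[i][j] = index of the minimum of A[i..j]; extending a range by one
    # element costs O(1), so filling the table is O(n^2) rather than O(n^3)
    n = len(A)
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = i
        for j in range(i + 1, n):
            M[i][j] = M[i][j - 1] if A[M[i][j - 1]] <= A[j] else j
    return M
```

A query is then the single lookup M[i][j].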
3 A Faster RMQ Algorithm

We will improve the ⟨O(n²), O(1)⟩-time brute-force table algorithm for (general) RMQ. The idea is to precompute each query whose length is a power of two. That is, for every i between 1 and n and every j between 1 and log n, we find the minimum element in the block starting at i and having length 2^j, that is, we compute M[i, j] = argmin_{k=i...i+2^j−1} {A[k]}. Table M therefore has size O(n log n), and we fill it in time O(n log n) by using dynamic programming. Specifically, we find the minimum in a block of size 2^j by comparing the two minima of its two constituent blocks of size 2^{j−1}. More formally, M[i, j] = M[i, j − 1] if A[M[i, j − 1]] ≤ A[M[i + 2^{j−1}, j − 1]], and M[i, j] = M[i + 2^{j−1}, j − 1] otherwise.
How do we use these blocks to compute an arbitrary rmq(i, j)? We select
two overlapping blocks that entirely cover the subrange: let 2k be the size of the
largest block that fits into the range from i to j, that is let k = ⌊log(j − i)⌋.
Then rmq(i, j) can be computed by comparing the minima of the following two
blocks: i to i + 2k − 1 (M (i, k)) and j − 2k + 1 to j (M (j − 2k + 1, k)). These
values have already been computed, so we can find the RMQ in constant time.
This gives the Sparse Table (ST) algorithm for RMQ, with complexity ⟨O(n log n), O(1)⟩. Notice that the total computation to answer an RMQ query is three additions, four array references and a minimum, in addition to two other operations: a log and a floor. These can be seen together as the problem of finding the most significant bit of a word. Notice that we must have one such operation in our algorithm, since Harel and Tarjan [HT84] showed that LCA computation has a lower bound of Ω(log log n) on a pointer machine. Furthermore, the most-significant-bit operation has a very fast table lookup solution.
Below, we will use the ST algorithm to build an even faster algorithm for the ±1RMQ problem.
4 An ⟨O(n), O(1)⟩-Time Algorithm for ±1RMQ
Suppose we have an array A with the ±1 restriction. We will use a table-lookup technique to precompute answers on small subarrays, thus removing the log factor from the preprocessing. To this end, partition A into blocks of size (log n)/2. Define an array A′[1, . . . , 2n/ log n], where A′[i] is the minimum element in the ith block of A. Define an equal size array B, where B[i] is a position in the ith block in which value A′[i] occurs. Recall that RMQ queries return the position of the minimum and that the LCA to RMQ reduction uses the position of the minimum, rather than the minimum itself. Thus we will use array B to keep track of where the minima in A′ came from.
The ST algorithm runs on array A′ in time ⟨O(n), O(1)⟩. Having preprocessed A′ for RMQ, consider how we answer any query rmq(i, j) in A. The indices i and j might be in the same block, so we have to preprocess each block to answer RMQ queries. If i < j are in different blocks, then we can answer the query rmq(i, j) as follows. First compute the values:
1. The minimum from i forward to the end of its block.
2. The minimum of all the blocks in between i's block and j's block.
3. The minimum from the beginning of j's block to j.
The query will return the position of the minimum of the three values computed.
The second minimum is found in constant time by an RMQ on A′ , which has
been preprocessed using the ST algorithm. But, we need to know how to answer
range minimum queries inside blocks to compute the first and third minima, and
thus to finish off the algorithm. Thus, the in-block queries are needed whether i
and j are in the same block or not.
Therefore, we focus now only on in-block RMQs. If we simply performed RMQ preprocessing on each block, we would spend too much time in preprocessing. If two blocks were identical, then we could share their preprocessing. However, it is too much to hope for that blocks would be so repeated. The following observation establishes a much stronger shared-preprocessing property.
Observation 3 If two arrays X[1, . . . , k] and Y [1, . . . , k] differ by some fixed
value at each position, that is, there is a c such that X[i] = Y [i] + c for every i,
then all RMQ answers will be the same for X and Y . In this case, we can use
the same preprocessing for both arrays.
Thus, we can normalize a block by subtracting its initial offset from every
element. We now use the ±1 property to show that there are very few kinds of
normalized blocks.
Lemma 4. There are O(√n) kinds of normalized blocks.

Proof: Adjacent elements in normalized blocks differ by +1 or −1. Thus, normalized blocks are specified by a ±1 vector of length (1/2 · log n) − 1. There are 2^{(1/2·log n)−1} = O(√n) such vectors.
We are now basically done. We create O(√n) tables, one for each possible normalized block. In each table, we put all ((log n)/2)² = O(log² n) answers to all in-block queries. This gives a total of O(√n log² n) total preprocessing of normalized block tables, and O(1) query time. Finally, compute, for each block in A, which normalized block table it should use for its RMQ queries. Thus, each in-block RMQ query takes a single table lookup.
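Observation 3 is what makes the table sharing work; a sketch that keys each block's answer table by its difference vector (the naive in-block tables stand in for whatever scheme is used):

```python
def signature(block):
    # normalized form of a +-1 block: its vector of consecutive differences
    return tuple(b - a for a, b in zip(block, block[1:]))

def shared_tables(blocks):
    tables = {}
    for blk in blocks:
        sig = signature(blk)
        if sig not in tables:            # preprocess each signature once
            n = len(blk)
            T = [[0] * n for _ in range(n)]
            for i in range(n):
                T[i][i] = i
                for j in range(i + 1, n):
                    T[i][j] = T[i][j - 1] if blk[T[i][j - 1]] <= blk[j] else j
            tables[sig] = T
    return tables
```

For example, the blocks [0, 1, 0, −1] and [5, 6, 5, 4] share a single table, since they differ by a fixed offset.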
Overall, the total space and preprocessing used for normalized block tables
and A′ tables is O(n) and the total query time is O(1). We show a complete
example below.
4.1 Wrapping Up
We started out by showing a reduction from the LCA problem to the RMQ problem, but with the key observation that the reduction actually leads to a ±1RMQ problem.
We gave a trivial ⟨O(n²), O(1)⟩-time table-lookup algorithm for RMQ, and showed how to sparsify the table to get a ⟨O(n log n), O(1)⟩-time table-lookup algorithm. We used this latter algorithm on a smaller summary array A′ and needed only to process small blocks to finish the algorithm. Finally, we noticed that most of these blocks are the same, from the point of view of the RMQ problem, by using the ±1 assumption given by the original reduction.
5 A Fast Algorithm for RMQ
We have a ⟨O(n), O(1)⟩ solution for ±1RMQ. Now we show that the general RMQ can be solved in the same complexity. We do this by reducing the RMQ problem to the LCA problem! Thus, to solve a general RMQ problem, one would convert it to an LCA problem and then back to a ±1RMQ problem.
The following lemma establishes the reduction from RMQ to LCA.
Lemma 5. If there is a ⟨O(n), O(1)⟩ solution for LCA, then there is a ⟨O(n), O(1)⟩ solution for RMQ.
We will show that the O(n) term in the preprocessing comes from the time needed to build the Cartesian Tree of A, and the O(1) term in the query comes from the time needed to convert the LCA answer on this tree to an RMQ answer on A.
Proof: Let A[1, . . . , n] be the input array.
The Cartesian Tree of an array is defined as follows. The root of a Cartesian
Tree is the minimum element of the array, and the root is labeled with the
position of this minimum. Removing the root element splits the array into two
pieces. The left and right children of the root are the recursively constructed
Cartesian trees of the left and right subarrays, respectively.
A Cartesian Tree can be built in linear time as follows. Suppose Ci is the
Cartesian tree of A[1, . . . , i]. To build Ci+1 , we notice that node i + 1 will belong
to the rightmost path of Ci+1 , so we climb up the rightmost path of Ci until
finding the position where i+1 belongs. Each comparison either adds an element
to the rightmost path or removes one, and each node can only join the rightmost
path and leave it once. Thus the total time to build Cn is O(n).
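The climbing construction can be sketched with an explicit stack (0-indexed; representing the tree as a parent array is our choice, and ties are broken toward the earlier element):

```python
def cartesian_parents(A):
    # parent[i] = parent index of node i in the Cartesian tree, -1 for root.
    # The stack holds the current rightmost path; each index is pushed and
    # popped at most once, so the construction is O(n).
    parent = [-1] * len(A)
    stack = []
    for i in range(len(A)):
        last = -1
        while stack and A[stack[-1]] > A[i]:
            last = stack.pop()       # i climbs above these nodes
        if stack:
            parent[i] = stack[-1]
        if last != -1:
            parent[last] = i         # popped subtree becomes i's left child
        stack.append(i)
    return parent
```

For A = [3, 1, 4, 1, 5] the root is index 1 (the first minimum), and the least common ancestor of indices 2 and 4 is index 3, whose value 1 is indeed the minimum of A[2..4].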
The reduction is as follows.
– Let C be the Cartesian Tree of A. Recall that we label the node of C corresponding to A[i] with the index i.
Claim. rmqA (i, j) = lcaC (i, j).
Proof: Consider the least common ancestor, k, of i and j in the Cartesian Tree C.
In the recursive description of a Cartesian tree, k is the first node that separates
i and j. Thus, in the array A, element A[k] is between elements A[i] and A[j].
Furthermore, A[k] must be the smallest such element in the subarray A[i, . . . , j] since otherwise, there would be a smaller element k′ in A[i, . . . , j] that would be an ancestor of k in C, and i and j would already have been separated by k′.
More concisely, since k is the first element to split i and j, it is between them
because it splits them, and it is minimal because it is the first element to do so.
Thus it is the RMQ.
We see that we can complete our reduction by preprocessing the Cartesian
Tree C for LCA. Tree C takes time O(n) to build, and because C is an n node
tree, LCA preprocessing takes O(n) time, for a total of O(n) time. The query
then takes O(1), and we have completed the proof of the reduction.
References
BBG+ 89. O. Berkman, D. Breslauer, Z. Galil, B. Schieber, and U. Vishkin. Highly
parallelizable problems. In Proc. of the 21st Ann. ACM Symp. on Theory
of Computing, pages 309–319, 1989.
HT84.
D. Harel and R. E. Tarjan. Fast algorithms for finding nearest common
ancestors. SIAM J. Comput., 13(2):338–355, 1984.
SV88.
B. Schieber and U. Vishkin. On finding lowest common ancestors: Simplification and parallelization. SIAM J. Comput., 17:1253–1262, 1988.
Optimal and Pessimal Orderings of Steiner
Triple Systems in Disk Arrays
Myra B. Cohen and Charles J. Colbourn
Computer Science, University of Vermont, Burlington, VT 05405, USA.
{mcohen,colbourn}@cs.uvm.edu
http://www.cs.uvm.edu/˜{mcohen,colbourn}
Abstract. Steiner triple systems are well studied combinatorial designs
that have been shown to possess properties desirable for the construction of multiple erasure codes in RAID architectures. The ordering of
the columns in the parity check matrices of these codes affects system
performance. Combinatorial problems involved in the generation of good
and bad column orderings are defined, and examined for small numbers
of accesses to consecutive data blocks in the disk array.
1 Background
A Steiner triple system is an ordered pair (S, T ) where S is a finite set of points
or symbols and T is a set of 3-element subsets of S called triples, such that
each pair of distinct elements of S occurs together in exactly one triple of T .
The order of a Steiner triple system (S, T ) is the size of the set S, denoted
|S|. A Steiner triple system of order v is often written as STS(v). An STS(v)
exists if and only if v ≡ 1, 3 (mod 6) (see [5], for example). We can relax the
requirement that every pair occurs exactly once as follows. Let (V, B) be a set
V of elements together with a collection B of 3-element subsets of V , so that no
pair of elements of V occurs as a subset of more than one B ∈ B. Such a pair
(V, B) is an (n, ℓ)-configuration when n = |V | and ℓ = |B|, and every element of
V is in at least one of the sets in B.
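As a concrete instance, the defining property is easy to check mechanically. The sketch below uses the Fano plane, the unique STS(7) (a standard example, not taken from this paper):

```python
from itertools import combinations

# the Fano plane: an STS(7) on points 1..7 (standard example)
FANO = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
        (2, 5, 7), (3, 4, 7), (3, 5, 6)]

def is_steiner_triple_system(points, triples):
    # every pair of distinct points must occur in exactly one triple
    seen = {pair: 0 for pair in combinations(sorted(points), 2)}
    for t in triples:
        for pair in combinations(sorted(t), 2):
            seen[pair] += 1
    return all(c == 1 for c in seen.values())
```

Dropping any triple leaves three pairs uncovered, so the check fails, as expected.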
Let C be a configuration (V, B). We examine the following combinatorial
problems. When does there exist a Steiner triple system (S, T ) of order v in which
the triples can be ordered T0 , . . . , Tb−1 , so that every ℓ consecutive triples form
a configuration isomorphic to C? Such an ordering is a C-ordering of the Steiner
triple system. When we treat the first triple as following the last (and hence
cyclically order the triples), and then enforce the same condition, the ordering
is a C-cyclic ordering. The presence of configurations in Steiner triple systems
has been studied in much detail; see [5] for an extensive survey. Apparently,
the presence or absence of configurations among consecutive triples in a triple
ordering of an STS has not been previously examined.
Our interest in these problems arises from an application in the design of
erasure codes for disk arrays. Prior to examining the combinatorial problems
posed, we explore the disk array application. As processor speeds have increased
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 95–104, 2000.
c Springer-Verlag Berlin Heidelberg 2000
rapidly in recent years, one method of bridging the Input-Output (I/O) performance gap has been to use redundant arrays of independent disks (RAID)
[9]. Individual data reads and writes are striped across multiple disks, thereby
creating I/O parallelism. Encoding redundant information onto additional disks
allows reconstruction of lost information in the presence of disk failures. This
creates disk arrays with high throughput and good reliability. However, an array
of disks has a substantially greater probability of a disk failure than does an individual disk [8,9]. Indeed, Hellerstein et al. [8] have shown that an array of 1000 disks which protects against one error, even with periodic daily or weekly repairs, has lower reliability than an individual disk. Most
systems that are available currently handle only one or two disk failures [15].
As arrays grow in size, the need for greater redundancy without a reduction in
performance becomes important.
A catastrophic disk failure is an erasure. When a disk fails, all of the information is lost or erased. Codes that can correct for n erasures are called n-erasure
correcting codes. The minimum number of additional disks that must be accessed
for each write in an n-erasure code, the update penalty, has been shown to be n
[1,8]. Chee, Colbourn, and Ling [1] have shown that Steiner triple systems possess properties that make them desirable 3-erasure correcting codes with minimal
update penalties. The correspondence between Steiner triple systems and parity
check matrices is that used by Hellerstein et al. [3,8]. Codewords in a binary
linear code are viewed as vectors of information and check bits. The code can
then be defined in terms of a c × (k + c) parity check matrix, H = [P |I] where
k is the number of information disks, I is the c × c identity matrix and P is a
c × k matrix that determines the equations of the check disks. The columns of
P are indexed by the k information disks. The columns of I and the rows of
H are indexed by the c check disks. A set of disk failures is recoverable if and
only if the corresponding set of equations in its parity check matrix is linearly
independent [1,8]. A set of binary columns is linearly independent over GF(2)
if and only if no non-empty subset of those columns sums, modulo two, to the
zero vector [8].
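This recoverability test can be made concrete in a few lines. The sketch below (Python; the bitmask encoding of columns is our own illustrative choice, not prescribed by [8]) checks linear independence over GF(2) by testing whether any non-empty subset of columns XORs to the zero vector.

```python
from itertools import combinations

def recoverable(columns):
    """True iff the given parity-check columns (ints read as GF(2) bitmasks)
    are linearly independent, i.e. no non-empty subset XORs to zero."""
    cols = list(columns)
    for r in range(1, len(cols) + 1):
        for subset in combinations(cols, r):
            acc = 0
            for c in subset:
                acc ^= c  # vector sum modulo two
            if acc == 0:
                return False  # dependent: this erasure pattern is unrecoverable
    return True

# Columns of an identity sub-matrix are independent...
print(recoverable([0b001, 0b010, 0b100]))  # True
# ...but a set whose members XOR to zero is not.
print(recoverable([0b011, 0b101, 0b110]))  # False
```

For the small column sets arising here this brute force is adequate; for larger sets one would use Gaussian elimination over GF(2) instead.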
[Matrix omitted: a 13 × 39 binary parity check matrix H = [P | I], whose first 26 columns (information disks) are the incidence vectors of the triples of an STS(13) and whose last 13 columns (check disks) form the identity.]
Fig. 1. Steiner (13) Parity Check Matrix: The shaded disks are check disks.
Figure 1 shows a parity check matrix for an STS(13). Cohen and Colbourn
[3] examine the ordering of columns in the parity check matrices. This departs
from the standard theory of error correcting codes where the order of columns
in a parity check matrix is unimportant [16].
One particular class of these codes, anti-Pasch Steiner triple systems, has
been shown to correct for all 4-erasures except for bad erasures [1]. A bad erasure
is one that involves an information disk and all three of its check disks.
Fig. 2. Pasch Configuration
Figure 2 shows six elements {a, b, c, d, e, f } and four triples {a, b, c},{a, d, e},
{f, b, d} and {f, c, e}. These form a (6,4)-configuration called a Pasch configuration or quadrilateral [14]. The points represent the check disks (rows of the
parity check matrix). Each triple represents an information disk (column of the
parity check matrix). If we convert this diagram to a (portion of a) parity check
matrix we find that if all four information disks fail there is an irrecoverable
loss of information. The resulting four columns in the parity check matrix are
linearly dependent and therefore cannot be reconstructed. Anti-Pasch Steiner
triple systems yield codes which avoid this configuration. The existence of anti-Pasch STS(v) for all v ≡ 1 or 3 (mod 6), except when v = 7 or 13, was recently
settled [7,14].
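This dependence can be checked directly. In the sketch below (Python; the bit-per-check-disk encoding is our own choice), each of the four Pasch triples of Figure 2 is encoded as the incidence vector of its check disks; since every element lies in exactly two of the four triples, the columns sum to zero modulo two.

```python
# Check disks a..f of Figure 2 mapped to bits 0..5; each information disk's
# column is the incidence vector of its triple.
a, b, c, d, e, f = (1 << i for i in range(6))

pasch = [a | b | c, a | d | e, f | b | d, f | c | e]

# Every element lies in exactly two of the four triples, so the columns
# XOR (sum modulo two) to the zero vector: the 4-erasure is unrecoverable.
acc = 0
for col in pasch:
    acc ^= col
print(bin(acc))  # 0b0
```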
Cohen and Colbourn [3] examined some of the issues pertaining to encoding
Steiner triple systems in a disk array. In a multiple erasure correcting disk array,
there may be an overlap among the check disks accessed for consecutive information disks in reads and writes. The number of disks needed in an individual write
can therefore be minimized by ordering the columns of this matrix. Using the
assumption that the most expensive part of reading or writing in a disk array is
the physical read or write to the disk, this overlap can have a significant effect on
performance. Cohen and Colbourn [3] describe a write to a triple erasure code
as follows. First the information disks are read, followed by all of their associated
check disks. In the case when check disks overlap, the physical read only takes
place once. All of the new parity is computed and then this new parity and the
new information is written back to the disks. Once again, the shared check disks
are only physically written to one time. Theoretically, the update penalty is the
same for all reads and writes in an array. But when more than one information
disk in an array shares a common check disk, two disk accesses are saved: one
read and one write. This finally leads to the questions posed at the outset. In
particular, can one ordering be found that optimizes writes of various sizes in
such an array?
To derive some preliminary results about ordering, we have implemented a computer simulation [3,4]. RaidSim [9,12,13] is a simulation program
written at the University of California at Berkeley [12]. Holland [9] extends it
to include declustered parity and online reconstruction. RaidSim
models disk reads and writes and simulates the passage of time. The modified
version from [9] is the starting point for our experiments. RaidSim is extended to
include mappings for Steiner triple systems and to tolerate multiple disk failures
and to detect the existence of unrecoverable four and five erasures [4].
The performance experiments are run with a simulated user concurrency
level of 500. Six Steiner triple systems of order 15 are used in these experiments.
These are the systems numbered 1, 2, 20, 38, 67 and 80 in [5]. There are 80 non-isomorphic systems of order 15. The number of Pasch configurations in an STS(15)
ranges from 105 in system 1 down to zero in system 80.
2 Pessimal Ordering
A worst triple ordering is one in which consecutive triples are all disjoint. Indeed,
if the reads and writes involve at most ℓ consecutive data blocks, a worst triple
(or column) ordering is one in which each set of ℓ consecutive triples has all
triples disjoint. Let Dℓ be the (3ℓ, ℓ)-configuration consisting of ℓ disjoint triples.
A pessimal ordering of an STS is a Dℓ-ordering. It is easily seen that a Dℓ+1-ordering is also a Dℓ-ordering.
The unique STS(7) has no D2 -ordering since every two of its triples intersect.
The unique STS(9) has no D2 -ordering, as follows. Consider a triple T . There are
exactly two triples, T ′ and T ′′ disjoint from T (and indeed T ′ and T ′′ are disjoint
from each other as well). Without loss of generality, suppose that T ′ precedes
T and T ′′ follows T in the putative ordering. Then no triple can precede T ′ or
follow T ′′ . These two small cases are, in a sense, misleading. Both STS(13)s and
all eighty STS(15)s admit not only a D2 -ordering, but also a D3 -cyclic ordering.
This is easily established using a simple backtracking algorithm.
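A minimal version of such a backtracking search is sketched below (Python). The STS(13) used here is the standard cyclic system generated from the base blocks {0, 1, 4} and {0, 2, 7} modulo 13; it stands in for the systems tested in the text.

```python
def disjoint(t1, t2):
    return not (set(t1) & set(t2))

def find_d2_ordering(triples):
    """Backtracking search for an ordering in which every two consecutive
    triples are disjoint (a D2-ordering); returns None if none exists."""
    n = len(triples)
    order, used = [], [False] * n
    def extend():
        if len(order) == n:
            return True
        for i in range(n):
            if used[i]:
                continue
            if order and not disjoint(triples[order[-1]], triples[i]):
                continue
            used[i] = True
            order.append(i)
            if extend():
                return True
            order.pop()
            used[i] = False
        return False
    return [triples[i] for i in order] if extend() else None

# The unique STS(7): every two of its triples intersect, so no D2-ordering.
fano = [(0, 1, 3), (1, 2, 4), (2, 3, 5), (3, 4, 6), (0, 4, 5), (1, 5, 6), (0, 2, 6)]
print(find_d2_ordering(fano))  # None

# A cyclic STS(13) generated from base blocks {0, 1, 4} and {0, 2, 7} mod 13.
sts13 = [tuple(sorted((x + i) % 13 for x in base))
         for base in [(0, 1, 4), (0, 2, 7)] for i in range(13)]
print(find_d2_ordering(sts13) is not None)  # True
```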
We establish a general result:
Theorem 1. For v ≡ 1, 3 (mod 6) and v ≥ 9ℓ − 6, there exists an STS(v) with
a Dℓ-ordering.
Proof. When v ≡ 3 (mod 6), there is a Steiner triple system (S, T ) of order v
in which the triples can be partitioned into (v − 1)/2 classes R1 , . . . , R(v−1)/2 , so
that within each class all triples are disjoint. Each class Ri contains v/3 triples.
(This is a Kirkman triple system; see [5].) When v ≡ 1 (mod 6) and v ≥ 19,
there is a Steiner triple system (S, T ) of order v in which the triples can be
partitioned into (v + 1)/2 classes R1 , . . . , R(v+1)/2 , so that within each class all
triples are disjoint. R1 contains (v − 1)/6 triples, and each other class contains
(v − 1)/3. (This is a Hanani triple system; see [5].)
Our orderings place all triples of Ri before all triples of Ri+1 for each 1 ≤ i <
s, where s is the number of classes. We must order the triples of each class Ri . To do this, we first order the triples
of R1 arbitrarily. Let us then suppose that R1 , . . . , Ri−1 have been ordered. To
select the jth triple in the ordering of Ri for 1 ≤ j < ℓ, we choose a triple which
has not already been chosen in Ri , and which does not intersect any of the last
ℓ − j triples in Ri−1 . Such a triple exists, since j − 1 triples have been chosen,
and at most 3(ℓ − j) triples of Ri intersect any of the last ℓ − j triples of Ri−1 ,
but 3(ℓ − j) + j − 1 < 3ℓ − 2 ≤ ⌊v/3⌋ for all j ≥ 1.
A similar proof yields Dℓ -cyclic orderings when v is larger. What is striking
about the computational results for order 15 is not that an ordering for some
system can be found, but that every system has a D3 -cyclic ordering. This
suggests the possibility that for v sufficiently large, every STS(v) admits a Dℓ-cyclic ordering. To verify this, form the t-intersection graph Gt of a triple system
(S, T ) by including a vertex for each triple in T , and making two vertices adjacent
if the corresponding triples share t elements. A D2 -cyclic ordering of (S, T ) is
equivalent to a Hamilton cycle in G0 . But more is true. A Dℓ -cyclic ordering of
(S, T ) is equivalent to the (ℓ − 1)st power of a Hamilton cycle in G0 . Komlós,
Sárközy, and Szemerédi [11] establish that, for any ε > 0, any sufficiently
large n-vertex graph G of minimum degree at least (k/(k + 1) + ε)n contains
the kth power of a Hamilton cycle. Now G0 has v(v − 1)/6 vertices and degree
(v(v −10)+21)/6, and so G0 meets the required conditions. Thus when ℓ is fixed,
every sufficiently large STS(v) admits a Dℓ -ordering. We give a direct proof of
this, which does not rely upon probabilistic methods.
Theorem 2. For ℓ a positive integer and v ≥ 81(ℓ−1)+1, every STS(v) admits
a Dℓ -ordering.
Proof. Let (S, T ) be an STS(v). Form the 1-intersection graph G1 of (S, T ).
G1 is regular of degree 3(v − 1)/2, and therefore has a proper vertex coloring
in s ≤ 3(v − 1)/2 colors. Form a partition of T , defining classes R1 , . . . , Rs of
triples by placing a triple in the class Ri when the corresponding vertex of G1
has the ith color. Let us suppose without loss of generality that |Ri | ≤ |Ri+1 | for
1 ≤ i < s. Now if 3|R1 | < |Rs |, there must be a triple of Rs intersecting no triple
of R1 . When this occurs, move such a triple from Rs to R1 . This can be repeated
until 3|R1 | ≥ |Rs |. Since Σ_{i=1}^{s} |Ri | = v(v − 1)/6, we find that |Rs | ≥ ⌈v/9⌉ and
thus |R1 | ≥ ⌈v/27⌉. But then |R1 | > 3ℓ − 3, and we can apply precisely the
method in the proof of Theorem 1 to produce the ordering required.
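The first step of this proof, partitioning T into classes of pairwise disjoint triples via a proper coloring of G1, can be sketched as follows (Python; a greedy first-fit coloring is used, which may need up to 3(v − 1)/2 + 1 colors rather than the 3(v − 1)/2 assumed in the proof, and the balancing and ordering steps are omitted):

```python
def color_classes(triples):
    """Greedy first-fit coloring of the 1-intersection graph G1: two triples
    are adjacent when they share exactly one element, so each color class
    consists of pairwise disjoint triples."""
    classes = []
    for t in triples:
        for cls in classes:
            # t may join cls only if it shares no single point with a member
            if all(len(set(t) & set(u)) != 1 for u in cls):
                cls.append(t)
                break
        else:
            classes.append([t])
    return classes

# The cyclic STS(13) from base blocks {0, 1, 4} and {0, 2, 7} modulo 13.
sts13 = [tuple(sorted((x + i) % 13 for x in base))
         for base in [(0, 1, 4), (0, 2, 7)] for i in range(13)]
classes = color_classes(sts13)
# In an STS two distinct triples share 0 or 1 points; the coloring rules out
# exactly 1, so triples within a class are pairwise disjoint.
assert all(not (set(t) & set(u))
           for cls in classes for t in cls for u in cls if t != u)
```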
The bound on v can almost certainly be improved upon. Indeed for ℓ = 3,
we expect that every STS(v) with v > 13 has a D3 -cyclic ordering.
3 Optimal Ordering
Optimal orderings pose more challenging problems. We wish to minimize rather
than maximize the number of check disks associated with ℓ consecutive triples.
We begin by considering small values of ℓ. When ℓ = 2, define the configuration
I2 to be the unique (5,2)-configuration, which consists of two intersecting triples.
An optimal ordering is an I2 -ordering. Horák and Rosa [10] essentially proved
the following:
Theorem 3. Every STS(v) admits an I2 -cyclic ordering.
Proof. The 1-intersection graph G1 of the STS has a hamiltonian cycle [10].
Let T be the unique (6,3)-configuration, as depicted in Figure 3.
Fig. 3. Optimal Ordering on Three Blocks
An optimal ordering when ℓ = 3 is a T -ordering. The unique STS(7) has a
T -cyclic ordering: 013, 124, 026, 156, 235, 346, 045. The unique STS(9) also has
a T -cyclic ordering: 012, 036, 138, 048, 147, 057, 345, 237, 246, 678, 258, 156.
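Both orderings can be checked mechanically. In a partial Steiner triple system, three triples span exactly six elements precisely when they form the configuration T, so it suffices to test the union size of every cyclic window of three triples (a Python sketch):

```python
def is_T_cyclic_ordering(triples):
    """In a partial Steiner triple system, three triples span exactly six
    elements iff they form the (6,3)-configuration T, so checking the union
    size of each cyclic window of three suffices."""
    n = len(triples)
    return all(len(set(triples[i]) | set(triples[(i + 1) % n])
                   | set(triples[(i + 2) % n])) == 6
               for i in range(n))

# The orderings given in the text for the unique STS(7) and STS(9).
sts7 = [(0,1,3), (1,2,4), (0,2,6), (1,5,6), (2,3,5), (3,4,6), (0,4,5)]
sts9 = [(0,1,2), (0,3,6), (1,3,8), (0,4,8), (1,4,7), (0,5,7),
        (3,4,5), (2,3,7), (2,4,6), (6,7,8), (2,5,8), (1,5,6)]
print(is_T_cyclic_ordering(sts7), is_T_cyclic_ordering(sts9))  # True True
```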
We might therefore anticipate that every STS(v) has a T -cyclic ordering, but
this does not happen. To establish this, we require a few definitions. A proper
subsystem of a Steiner triple system (S, T ) is a pair (S ′ , T ′ ) with S ′ ⊂ S and
T ′ ⊂ T , |S ′ | > 3, and (S ′ , T ′ ) itself a Steiner triple system. A Steiner space is a
Steiner triple system with the property that every three elements which do not
appear together in a triple are contained in a proper subsystem.
Theorem 4. No Steiner space admits a T -ordering. Hence, whenever we have
v ≡ 1, 3 (mod 6), v ≥ 15, and v ∉ {19, 21, 25, 33, 37, 43, 51, 67, 69, 145}, there
is a Steiner triple system admitting no T -ordering.
Proof. Suppose that (S, T ) is a Steiner space which has a T -ordering. Then
consider two consecutive triples under this ordering. These are contained within
a proper subsystem. Any triple preceding or following two consecutive triples of
a subsystem must also lie in the subsystem. But this forces all triples of T to lie
in the subsystem, which is a contradiction.
The conditions on v reflect the current knowledge about the existence of
Steiner spaces (see [5]).
A much weaker condition suffices to establish that there is no T -ordering.
A T -ordering cannot have any two consecutive triples which appear together
in a proper subsystem. By the same token, a T -ordering cannot have any two
triples which appear together in a proper subsystem and are separated by only
one triple in the ordering. Hence the strong condition on subsystems enforced in
Steiner spaces can be relaxed. Of course, our interest is in producing Steiner triple
systems that do admit T -orderings. Both STS(13)s admit T -orderings but not
cyclic T -orderings. Of the 80 STS(15)s, only fourteen admit cyclic T -orderings;
they are numbers 20, 22, 38, 39, 44, 48, 50, 51, 52, 53, 65, 67, 75, and 76. However,
73 of the systems (those numbered 8–80) admit a T -ordering. System 1 is the
projective triple system and hence is a Steiner space (see [5]). However, the six
systems numbered 2–7 also do not appear to admit a T -ordering. These results
have all been obtained with a simple backtracking algorithm.
General constructions for larger orders appear to be difficult to establish.
However, we expect that for every order v ≥ 15 there exists a system having a
T -cyclic ordering. For example, let Ti0 = {i, 5 + i, 11 + i}, Ti1 = {i, 2 + i, 9 + i},
and Ti2 = {1 + i, 2 + i, 5 + i}, with arithmetic modulo 19. Then an STS(19)
with a T -cyclic ordering exists with the triple Tij in position 27i + j mod 57 for
0 ≤ i < 19 and 0 ≤ j < 3. A similar solution for v = 25 is obtained by setting
Ti0 = {i, 1 + i, 6 + i}, Ti1 = {6 + i, 8 + i, 16 + i}, Ti2 = {1 + i, 8 + i, 22 + i}, and
Ti3 = {3 + i, 6 + i, 22 + i}, arithmetic modulo 100. Place triple Tij in position
32i + j modulo 100. While these small designs indicate that specific systems
admitting an ordering can be easily found, we have not found a general pattern
for larger orders.
When ℓ = 4, four triples must involve at least six distinct elements. Indeed,
the only (6,4)-configuration is the Pasch configuration. It therefore appears that
the best systems from an ordering standpoint (when ℓ = 4) are precisely those
which are poor from the standpoint of erasure correction. However, in our performance experiments, ordering plays a larger role than does the erasure correction
capability [4,3]. Hence it is sensible to examine STSs which admit orderings with
Pasch configurations placed consecutively. Unfortunately, this does not work in
general:
Lemma 1. No STS(v) for v > 3 is Pasch-orderable.
Proof. Any three triples of a Pasch configuration lie in a unique Pasch configuration. Hence four consecutive triples forming a Pasch configuration for some
triple ordering can neither be preceded nor followed by a triple which forms a
second Pasch configuration.
It therefore appears that an optimal ordering has exactly ⌈((v(v − 1)/6) − 3)/2⌉ of
the sets of four consecutive triples inducing a Pasch configuration; these alternate
with sets of four consecutive triples forming a (7,4)-configuration. We have not
explored this possibility.
A general theory for all values of ℓ would be worthwhile, but appears to be
substantially more difficult than for pessimal orderings.
4 Conclusions
It is natural to ask whether the orderings found here have a real impact on disk
array performance. Figures 4–6 show the results of performance experiments
using various orderings. The desired orderings will provide the lowest response
times. The ‘good’ ordering is a T -ordering when one exists, and otherwise is an
ordering found in an effort to maximize the number of consecutive T -configurations; it is labeled A in these figures. The ‘bad’ ordering is a D3 -ordering and is
labeled B. The ordering labeled C is one obtained from a random triple ordering.
The most significant difference in performance arises in a workload of straight
writes. This is as expected because this is where decreasing the actual update
penalty has the greatest impact. Although the read workload shows no apparent
differences during fault-free mode, it does start to differentiate when multiple
failures occur.
[Plot omitted: average response time (ms) versus number of failures (0–5) for orderings A, B, and C under a straight write workload; Steiner 15 systems combined, ~630 experiments per data point.]
Fig. 4. Ordering Results - Straight Write Workload
The structure of optimal orderings for ℓ ≥ 4 is an open and interesting question. Minimizing disk access through ordering means that the update penalty
is only an upper bound on the number of accesses in any write. By keeping
the number of check disk accesses consistently lower, performance gains can be
achieved. An interesting question is the generalization for reads and writes of
different sizes: Should an array be configured specifically for a particular size
when optimization is desired? One more issue in optimization of writes in triple
erasure codes is that of the large or stripe write [9]. At some point, if we have a
large write in an array, all of the check disks are accessed. There is a threshold
beyond which it is less expensive to read all of the information disks, compute
[Plot omitted: average response time (ms) versus number of failures (0–5) for orderings A, B, and C under a straight read workload; Steiner 15 systems combined, ~630 experiments per data point.]
Fig. 5. Ordering Results - Straight Read Workload
[Plot omitted: average response time (ms) versus number of failures (0–5) for orderings A, B, and C under a mixed workload; Steiner 15 systems combined, ~630 experiments per data point.]
Fig. 6. Ordering Results - Mixed Workload
the new parity and then write out all of the new information disks and all of
the check disks. When using an STS(15), a threshold occurs beyond the halfway
point. An STS(15) has 35 information and 15 check disks. If 23 disks are to be
written they must use at least 14 check disks. In the method of writing described
above, 46 information accesses and 28 check disk accesses yield a total of 74 physical disk accesses. A large write instead has 35 reads, followed by 15 + 23 = 38
writes, for a total of 73 physical accesses. This threshold for all STSs determines
to some extent how to optimize disk writes.
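Under this access-count model (read-modify-write for a small write, versus reading all information disks and rewriting everything for a large write), the comparison can be reproduced as follows (Python; the function names and cost formulas are our reading of the text):

```python
# An STS(15) array: 35 information disks and 15 check disks, as in the text.
INFO, CHECK = 35, 15

def small_write(w, c):
    """Read-modify-write: read and rewrite the w information blocks and
    their c distinct check disks."""
    return 2 * (w + c)

def large_write(w):
    """Read every information disk, then write the w new blocks and all
    check disks."""
    return INFO + (w + CHECK)

# Writing 23 blocks that touch at least 14 check disks:
print(small_write(23, 14))  # 74
print(large_write(23))      # 73 -- the large write is already cheaper
```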
Steiner triple systems provide an interesting option for redundancy in large
disk arrays. They have the unexpected property of lowering the expected update
penalty when ordered optimally.
Acknowledgments
Research of the authors is supported by the Army Research Office (U.S.A.) under
grant number DAAG55-98-1-0272 (Colbourn). Thanks to Sanjoy Baruah, Ron
Gould, Alan Ling, and Alex Rosa for helpful comments.
References
1. Yeow Meng Chee, Charles J. Colbourn and Alan C. H. Ling. Asymptotically optimal
erasure-resilient codes for large disk arrays. Discrete Applied Mathematics, to appear.
2. Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz and David A.
Patterson. RAID: High-performance, reliable secondary storage. ACM Computing
Surveys 26 (1994) 145–185.
3. M.B. Cohen and C.J. Colbourn. Steiner triple systems as multiple erasure codes in
large disk arrays. Submitted.
4. Myra B. Cohen. Performance analysis of triple erasure codes in large disk arrays.
Master’s thesis, University of Vermont, 1999.
5. Charles J. Colbourn and Alexander Rosa. Triple Systems. Oxford University Press,
Oxford, 1999.
6. Garth A. Gibson. Redundant Disk Arrays, Reliable Parallel Secondary Storage. MIT
Press, 1992.
7. M.J. Grannell, T.S. Griggs, and C.A. Whitehead. The resolution of the anti-Pasch
conjecture. Journal of Combinatorial Designs, to appear.
8. Lisa Hellerstein, Garth A. Gibson, Richard M. Karp, Randy H. Katz and David A.
Patterson. Coding techniques for handling failures in large disk arrays. Algorithmica
12 (1994) 182–208.
9. Mark C. Holland. On-Line Data Reconstruction In Redundant Disk Arrays. PhD
thesis, Carnegie Mellon University, 1994.
10. Peter Horák and Alexander Rosa. Decomposing Steiner triple systems into small
configurations. Ars Combinatoria 26 (1988) 91–105.
11. János Komlós, Gábor Sárközy and Endre Szemerédi. On the Pósa-Seymour conjecture. Journal of Graph Theory 29 (1998) 167–176.
12. Edward K. Lee. Software and performance issues in the implementation of a RAID
prototype. Technical Report CSD-90-573, University of California at Berkeley, 1990.
13. Edward K. Lee. Performance Modeling and Analysis of Disk Arrays. PhD thesis,
University of California at Berkeley, 1993.
14. A.C.H. Ling, C.J. Colbourn, M.J. Grannell and T.S. Griggs. Construction techniques for anti-Pasch Steiner triple systems. Journal of the London Mathematical
Society, to appear.
15. Paul Massiglia. The RAID Book, A Storage System Technology Handbook, 6th Edition. The RAID Advisory Board, 1997.
16. Scott A. Vanstone and Paul C. van Oorschot. An Introduction to Error Correcting
Codes with Applications. Kluwer Academic Publishers, 1989.
Rank Inequalities for Packing Designs and
Sparse Triple Systems
Lucia Moura⋆
Department of Computer Science, University of Toronto,
and The Fields Institute for Research in Mathematical Sciences
lucia@cs.toronto.edu
Abstract. Combinatorial designs find numerous applications in computer science, and are closely related to problems in coding theory. Packing designs correspond to codes with constant weight; 4-sparse partial
Steiner triple systems (4-sparse PSTSs) correspond to erasure-resilient
codes able to correct all (except for “bad ones”) 4-erasures, which are
useful in handling failures in large disk arrays [4,10]. The study of polytopes associated with combinatorial problems has proven to be important for both algorithms and theory. However, research on polytopes for
problems in combinatorial design and coding theories has been pursued
only recently [14,15,17,20,21]. In this article, polytopes associated with
t-(v, k, λ) packing designs and sparse PSTSs are studied. The subpacking
and sparseness inequalities are introduced. These can be regarded as rank
inequalities for the independence systems associated with these designs.
Conditions under which subpacking inequalities define facets are studied.
Sparseness inequalities are proven to induce facets for the sparse PSTS
polytope; some extremal families of PSTS known as Erdös configurations
play a central role in this proof. The incorporation of these inequalities
in polyhedral algorithms and their use for deriving upper bounds on the
packing numbers are suggested. A sample of 4-sparse P ST S(v), v ≤ 16,
obtained by such an algorithm is shown; an upper bound on the size of
m-sparse PSTSs is presented.
1 Introduction
In this article, polytopes associated with problems in combinatorial design and
coding theories are investigated. We start by defining the problems in which
we are interested, and then describe their polytopes
and motivations for this
research. Throughout the paper, we denote by (V choose k) the family of sets {B ⊆ V :
|B| = k}. Let v ≥ k ≥ t. A t–(v, k, λ) design is a pair (V, B) where V is a v-set
and B is a collection of k-subsets of V called blocks such that every t-subset of
V is contained in exactly λ blocks of B. Design theorists are concerned with the
existence of these designs. A t–(v, k, λ) packing design is defined by replacing the
condition “in exactly λ blocks” in the above definition by “in at most λ blocks”.
The objective is to determine the packing number, denoted by Dλ (v, k, t), which
⋆ Supported by Natural Sciences and Engineering Research Council of Canada PDF
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 105–114, 2000.
© Springer-Verlag Berlin Heidelberg 2000
is the maximum number of blocks in a t–(v, k, λ) packing design. The existence
of a t–(v, k, λ) design can be decided by checking whether the packing number
Dλ (v, k, t) is equal to λ (v choose t) / (k choose t). Thus, the determination of the packing number
is the most general problem and we will concentrate on it. Designs play a central
role in the theory of error-correcting codes, and, in particular, t-(v, k, 1) packing
designs correspond to constant weight codes of weight k, length v and minimum
distance 2(k − t + 1). For surveys on packing designs see [18,19].
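This block count is easy to compute exactly (a Python sketch using rational arithmetic; a non-integral count already rules out the existence of the design):

```python
from fractions import Fraction
from math import comb

def design_block_count(v, k, t, lam=1):
    """Number of blocks a t-(v, k, lam) design must have:
    lam * C(v, t) / C(k, t)."""
    return Fraction(lam * comb(v, t), comb(k, t))

# An STS(v), i.e. a 2-(v, 3, 1) design, needs v(v - 1)/6 blocks:
print(design_block_count(7, 3, 2))   # 7
print(design_block_count(13, 3, 2))  # 26
# A non-integral count rules the design out: there is no STS(8).
print(design_block_count(8, 3, 2))   # 28/3
```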
Determining the packing number is a hard problem in general, although the
problem has been solved for specific sets of parameters. For instance, the existence of Steiner Triple Systems (i.e. 2–(v, 3, 1) designs), and the packing number
for Partial Steiner Triple Systems (PSTS) (i.e. 2–(v, 3, 1) packing designs) have
been settled. On the other hand, the study of triple systems is an active area of
research with plenty of open problems. Interesting problems arise in the study
of STSs and PSTSs avoiding prescribed sub-configurations (see the survey [8]).
Let us denote by ST S(v) a Steiner triple system (P ST S(v) for a partial one)
on v points. A (p, l)-configuration in a (partial) Steiner triple system is a set of
l blocks (of the (partial) Steiner triple system) spanning p elements. Let m ≥ 4.
An ST S(v) is said to be m-sparse if it avoids every (l + 2, l)-configuration for
4 ≤ l ≤ m. Erdös (see [12]) conjectured that for all m ≥ 4 there exists an integer vm such that for every admissible v ≥ vm there exists an m-sparse ST S(v).
Again the objective is to determine the sparse packing number, denoted by
D(m, v), which is the maximum number of blocks in an m-sparse P ST S(v).
The 4-sparse PSTSs are the same as the anti-Pasch ones, since the Pasch configuration is the only
(6, 4)-configuration. A 4-sparse (or anti-Pasch) ST S(v) is known to exist for all
v ≡ 3 (mod 6) [2]. For the remaining case, i.e. the case v ≡ 1 (mod 6), there
are constructions and partial results. Anti-mitre Steiner triple systems were first
studied in [6]. The 5-sparse Steiner triple systems are the systems that are both
anti-Pasch and anti-mitre. Although there are some results on 5-sparse STSs
[6,13], the problem is far from settled. In spite of Erdös’ conjecture, no m-sparse
Steiner triple system is known for m ≥ 6. The study of m-sparse PSTSs gives
rise to interesting extremal problems in hypergraph theory; in addition, these
designs have applications in computer science. For instance, the 4-sparse (or
anti-Pasch) PSTSs correspond to erasure-resilient codes that tolerate all (except bad) 4-erasures, which are useful in applications for handling failures in
large disk arrays [4,10].
Let D be the set of all packing designs of the same kind and with the same
parameters (for instance, the set of all 2-(10, 3, 1) packing designs or the set of
all 5-sparse P ST S(10)). Let P (D) be the polytope in ℝ^(v choose k) given by the convex
hull of the incidence vectors of the packing designs in D. Thus, determining the
packing number associated with D amounts to solving the following optimization problem:

    maximize   Σ_{B ∈ (V choose k)} x_B
    subject to x ∈ P (D).
If we had a description of P (D) in terms of linear inequalities, this problem
could be solved via linear programming. Unfortunately, we are unlikely to find
complete descriptions of polytopes for hard combinatorial problems. On the other
hand, some very effective computational methods use partial descriptions of a
problem’s polytope [3]. Therefore, it is of great interest to find classes of facets for
these polytopes. It is also important to design efficient separation algorithms for
a class of facets. Given a point outside a polytope and a class of valid inequalities
for the polytope, a separation algorithm determines an inequality that is violated
by the point or decides one does not exist. This is fundamental in branch-and-cut
or other polyhedral algorithms that work with partial descriptions of polytopes.
Polytopes for general t–(v, k, λ) packing designs were first discussed in [14];
their clique facets have been determined for all packings with λ = 1 and k − t ∈
{1, 2} for all t and v [16]. A polyhedral algorithm for t–(v, k, 1) packings and
designs was proposed and tested in [17]. A related work that employs incidence
matrix formulations for 2-(v, k, λ) design polytopes can be found in [20].
In this paper, we present two new classes of inequalities: the subpacking
and the sparseness inequalities. They are types of rank inequalities when one
regards the packing designs as independence systems, as discussed in Section 2.
In Section 3, we focus on the subpacking inequalities, which are valid inequalities for both t–(v, k, λ) packing designs and sparse PSTSs. We study conditions
under which these inequalities induce facets for the packing design polytope.
In Section 4, we discuss sparseness inequalities. Given m ≥ 4, the l-sparseness
inequalities, 2 ≤ l ≤ m, are valid for the m-sparse PSTS polytope, and proven
to always be facet inducing. In Section 5, we show the results of our branch-and-cut algorithm for determining the sparse packing number for 4-sparse P ST S(v)
with v ≤ 16. The algorithm follows the lines of the one described in [17], but employs sparse facets. With these 4-sparse packing numbers in hand, we develop a
simple bound that uses the previous packing number and Chvátal-Gomory type
of cuts to give an upper bound on the next packing numbers. Further research
is discussed in Section 6.
2 Independence Systems, Packing Designs and their Polytopes
In this section, we define some terminology about independence systems and
collect some results we use from the independence system literature. We closely
follow the notation in [11]. Along the section, we translate the concepts to the
context of combinatorial designs.
Let N = {v1 , v2 , . . . , vn } be a finite set. An independence system on N is a
family I of subsets of N closed under inclusion, i.e. satisfying the property: J ∈ I
and I ⊆ J implies I ∈ I, for all J ∈ I. Any set in I is called independent and any
set outside I is called dependent. Any minimal (with respect to set inclusion)
dependent set is called a circuit, and an independence system is characterized
by its family of circuits, which we denote by C. The independence number of I,
denoted by α(I), is the maximum size of an independent set in I. Given a subset
S of N , the rank of S is defined by r(S) = max{|I| : I ∈ I and I ⊆ S}. Note
that α(I) = r(N ).
If the circuits in C have size 2, G = (N, C) forms a graph with N as the
node set and C as the edge set, and I is the set of independent or stable sets of G.
Remark 1. (Packing designs) Given t, v, k, λ, let I be the family of all
t–(v, k, λ) packing designs on the same v-set V, and let N = (V choose k); then
I is clearly an independence system on N. The packing number is the independence
number. Each circuit in C corresponds to a subset of (V choose k) of cardinality
λ + 1 whose k-sets contain a common t-subset of V. For λ = 1, C is simply formed
by the pairs of k-sets of V which intersect in at least t points, and the
underlying graph is obvious.
Following the definition in [9], an Erdös configuration of order n, n ≥ 1, in
a (partial) Steiner triple system is any (n + 2, n)-configuration, which contains
no (l + 2, l)-configuration, 1 < l < n. In fact, this is equivalent to requiring that
4 ≤ l < n, since there cannot be any (4, 2)- or (5, 3)-configurations in a PSTS.
Remark 2. (Sparse PSTSs) Let I be the independence system of the 2–(v, 3, 1)
packing designs on the same v-set V. Let C be its collection of circuits, namely,
the family of all pairs of triples of V whose intersection has cardinality 2. Adding
m-sparseness requirements to I amounts to removing from I the packing designs
that are not m-sparse, and adding extra circuits to C. The circuits to be added
to C are precisely the Erdös configurations of order l, for all 4 ≤ l ≤ m.
Before we discuss valid inequalities for the independence system polytope,
we recall some definitions. A polyhedron P ⊆ ℝ^n is the set of points satisfying a
finite set of linear inequalities. A polytope is a bounded polyhedron. A polyhedron
P ⊆ ℝ^n is of dimension k, denoted by dim P = k, if the maximum number of
affinely independent points in P is k + 1. We say that P is full dimensional if
dim P = n. Let d ∈ ℝ^n and d_0 ∈ ℝ. An inequality d^T x ≤ d_0 is said to be valid
for P if it is satisfied by all points of P. A subset F ⊆ P is called a face of P if
there exists a valid inequality d^T x ≤ d_0 such that F = P ∩ {x ∈ ℝ^n : d^T x = d_0};
the inequality is said to represent or to induce the face F. A facet is a face of
P with dimension (dim P) − 1. If P is full dimensional (which can be assumed
w.l.o.g. for independence systems), then each facet is determined by a unique (up
to multiplication by a positive number) valid inequality. Moreover, the minimal
system of inequalities representing P is given by the inequalities inducing its
facets.
Consider again an independence system I on N. The rank inequality associated
with a subset S of N is defined by

    ∑_{i∈S} x_i ≤ r(S),                                        (1)
and is obviously a valid inequality for the independence system polytope P (I).
Necessary or sufficient conditions for a rank inequality to induce a facet have
been discussed in [11]. We recall some definitions. A subset S of N is said to be
closed if r(S ∪ {i}) ≥ r(S) + 1 for all i ∈ N \ S. S is said to be nonseparable if
r(S) < r(T) + r(S \ T) for all nonempty proper subsets T of S.
Rank Inequalities for Packing Designs and Sparse Triple Systems
109
A necessary condition for (1) to induce a facet is that S be closed and nonseparable. This was observed by Laurent [11], and was stated by Balas and Zemel
[1] for independent sets in graphs. A sufficient condition for (1) to induce a facet
is given in the next theorem. Let I be an independence system on N and let
S be a subset of N . Let C be the family of circuits of I and let CS denote its
restriction to S. The critical graph of I on S, denoted by GS (I), is defined as
having S as its nodeset and with edges defined as follows: i1 , i2 ∈ S are adjacent
if and only if the removal of all circuits of CS containing {i1 , i2 } increases the
rank of S.
Theorem 1. (Laurent [11], Chvátal [5] for graphs) Let S ⊆ N . If S is closed
and the critical graph GS (I) is connected, then the rank inequality (1) associated
with S induces a facet of the polytope P (I).
Proposition 1. (Laurent [11], Cornuéjols and Sassano [7]) The following are
equivalent:
1. The rank inequality (1) induces a facet of P(I).
2. S is closed and the rank inequality (1) induces a facet of P(I_S).
3 Subpacking Inequalities for t–(v, k, λ) Packings

Let us denote by P_{t,v,k,λ} the polytope associated with the t–(v, k, λ) packing
designs on the same v-set V, and by I_{t,v,k,λ} the corresponding independence
system on N = (V choose k). Let S ⊆ V. Then, it is clear that
r((S choose k)) = D_λ(|S|, k, t), and the rank inequality associated with
(S choose k) is given by

    ∑_{B ∈ (S choose k)} x_B ≤ D_λ(|S|, k, t).                 (2)
We call this the subpacking inequality associated with S; it is clearly
valid for P_{t,v,k,λ}. In this section, we investigate conditions for this
inequality to be facet inducing. The next proposition gives a sufficient
condition for a subpacking inequality not to induce a facet.
Proposition 2. If there exists a t–(v, k, λ) design, then

    ∑_{B ∈ (V choose k)} x_B ≤ D_λ(v, k, t)                    (3)

does not induce a facet of P_{t,v,k,λ}.

Proof. Since there exists a t–(v, k, λ) design, it follows that
D_λ(v, k, t) = λ (v choose t) / (k choose t). Then, equation (3) can be obtained
by adding the clique facets: ∑_{B ⊇ T} x_B ≤ λ, for all T ⊆ V, |T| = t.
Thus, (3) cannot induce a facet.  □
The next proposition addresses the extendibility of facet inducing subpacking
inequalities from P_{t,|S|,k,λ} to P_{t,v,k,λ}, v ≥ |S|.
Proposition 3. Let S ⊆ V. Then, the following are equivalent:
1. the subpacking inequality (2) induces a facet of P_{t,v,k,λ};
2. the subpacking inequality (2) induces a facet of P_{t,|S|,k,λ}; and for all
B′ ∈ (V choose k) \ (S choose k) there exists a t–(|S|, k, λ) packing design
(S, B) with |B| = D_λ(|S|, k, t) such that (S, B ∪ {B′}) is a t–(v, k, λ)
packing design.

Proof. The last condition in 2 is equivalent to (S choose k) being closed for the
independence system I_{t,v,k,λ}; thus, the equivalence comes directly from
Proposition 1.  □
For the particular case of k = t + 1 and λ = 1, facet inducing subpacking
inequalities are always extendible.
Proposition 4. (Guaranteed extendibility of a class of subpacking facets) Let
k = t + 1 and λ = 1. Then, the subpacking inequality

    ∑_{B ∈ (S choose t+1)} x_B ≤ D_1(|S|, t + 1, t)            (4)

associated with S ⊆ V induces a facet for P_{t,v,t+1,1} if and only if it induces
a facet for P_{t,|S|,t+1,1}.
The proof of Proposition 4 involves showing that the second condition in item 2
of Proposition 3 holds for k = t + 1 and λ = 1.
Theorem 2. (Facet defining subpacking inequalities for PSTSs) Let S ⊆ V,
|S| ≤ 10. The subpacking inequality associated with S induces a facet of
P_{2,v,3,1}, v ≥ |S|, if and only if |S| ∈ {4, 5, 10}.

Sketch of the proof. Since there exist STS(v) for all v ≡ 1, 3 (mod 6),
Proposition 2 covers the cases |S| ∈ {7, 9}. It remains to deal with the cases
|S| ∈ {4, 5, 6, 8, 10} (see Table 1). Subpacking inequalities with |S| = 4 are
facet-inducing clique inequalities. For the cases |S| ∈ {5, 10}, we show that the
corresponding critical graphs are connected, which (by Theorem 1) is sufficient
to show the inequalities induce facets of P_{2,|S|,3,1}. Proposition 4 guarantees
they also define facets of P_{2,v,3,1}. For the cases |S| ∈ {6, 8}, we show that
the corresponding subpacking inequalities can be written as (non-trivial) linear
combinations of other valid inequalities, which implies that they do not induce
facets.  □
Remark 3. (Separation of subpacking inequalities) For a constant C, subpacking
inequalities with |S| ≤ C can be separated in polynomial time. This is the case,
since there are exactly ∑_{s=4}^{C} (v choose s) ∈ O(v^C) inequalities to check,
which is a polynomial in the number of variables of the problem, which is
(v choose k).
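The enumeration argument of Remark 3 translates directly into a separation routine. The sketch below (our own code and naming, not from the paper) handles the case t = 2, k = 3, λ = 1: it scans all subsets S with 4 ≤ |S| ≤ C and compares the left-hand side of (2) against the packing numbers D_1(|S|, 3, 2).

```python
from itertools import combinations

D1 = {4: 1, 5: 2, 6: 4}   # D_1(s, 3, 2) for small s (cf. Table 1)

def separate_subpacking(v, x, C=6):
    """Given fractional values x[B] for each sorted 3-subset B of
    {0,...,v-1}, return a violated subpacking inequality as a triple
    (S, lhs, rhs) with |S| <= C, or None if all are satisfied."""
    for s in range(4, C + 1):
        for S in combinations(range(v), s):
            lhs = sum(x[B] for B in combinations(S, 3))
            if lhs > D1[s] + 1e-9:
                return S, lhs, D1[s]
    return None

# The uniform fractional point x = 0.3 violates every |S| = 4 clique
# inequality, since 4 * 0.3 > 1:
x = {B: 0.3 for B in combinations(range(6), 3)}
print(separate_subpacking(6, x))
```

In a cutting-plane loop, the returned inequality would be added to the LP relaxation and the relaxation re-solved.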
4 Sparseness Facets for Sparse PSTSs

Let us denote by P_{m,v} the polytope associated with the m-sparse PSTS(v) on the
same v-set V, and by I_{m,v} the corresponding independence system. The main
contribution of this section is a class of facet inducing inequalities for
P_{m,v}, which we call sparseness inequalities, given by Theorem 3.
Table 1. Summary of facet inducing subpacking inequalities of P_{2,v,3,1} for |S| ≤ 10.

    |S|               D_1(|S|,3,2)      ∑_{B∈(S choose 3)} x_B ≤ D_1(|S|,3,2)   Reference
                      or r((S choose 3))  is facet inducing, v ≥ |S|
    4                 1                 Yes                                     maximal clique [17]
    5                 2                 Yes                                     Theorem 2
    6                 4                 No                                      Theorem 2
    7                 7                 No                                      ∃ STS(7) + Proposition 2
    8                 8                 No                                      Theorem 2
    9                 12                No                                      ∃ STS(9) + Proposition 2
    10                13                Yes                                     Theorem 2
    ≡ 1, 3 (mod 6)    (|S|² − |S|)/6    No                                      ∃ STS(|S|) + Proposition 2
Lemma 1. (Lefmann et al. [12, Lemma 2.3])
Let l and r be positive integers. Then any (l + 2, l + r)-configuration in a
Steiner triple system contains an (l + 2, l)-configuration.
The proofs of the next two lemmas are omitted in this extended abstract.
Lemma 2. (Construction of an Erdös configuration, for all n ≥ 4)
Consider the following recursive definition (we write 𝓔_n for the family and
E_n for its distinguished last triple):

    𝓔_4 = {E_1 = {1, 2, 5}, E_2 = {3, 4, 5}, E_3 = {1, 3, 6}, E_4 = {2, 4, 6}},
    𝓔_5 = 𝓔_4 \ {E_4} ∪ {{2, 4, 7}, E_5 = {5, 6, 7}},
    𝓔_{n+1} = 𝓔_n \ {E_n} ∪ {E_n \ {n + 2} ∪ {n + 3}}
              ∪ {E_{n+1} = {n + 2, n + 3, 1 + ((n − 1) mod 4)}},   n ≥ 5.

Then, for all n ≥ 4, 𝓔_n is an Erdös configuration of order n.
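The recursion of Lemma 2 is easy to animate. The sketch below (our own code; the family is represented as a list whose last entry plays the role of E_n) builds the configuration and verifies by brute force that it is an (n + 2, n)-configuration containing no (l + 2, l)-configuration for 1 < l < n.

```python
from itertools import combinations

def erdos_config(n):
    """Triples of the configuration of Lemma 2 for n >= 4; the last
    list entry is the distinguished triple E_n of the recursion."""
    fam = [{1, 2, 5}, {3, 4, 5}, {1, 3, 6}, {2, 4, 6}]        # order 4
    if n >= 5:
        fam = fam[:-1] + [{2, 4, 7}, {5, 6, 7}]               # order 5
    for m in range(5, n):                                     # order m -> m+1
        last = fam[-1]
        fam = fam[:-1] + [(last - {m + 2}) | {m + 3},
                          {m + 2, m + 3, 1 + ((m - 1) % 4)}]
    return fam

def is_erdos(fam, n):
    """(n+2, n)-configuration with no (l+2, l)-subconfiguration,
    1 < l < n (the l = 2 case also enforces the packing condition)."""
    if len(fam) != n or len(set().union(*fam)) != n + 2:
        return False
    return all(len(set().union(*sub)) > l + 2
               for l in range(2, n)
               for sub in combinations(fam, l))

print([is_erdos(erdos_config(n), n) for n in range(4, 9)])
```

For n = 4 this is exactly the Pasch configuration; each step replaces the last triple by two new ones, adding one point and one triple.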
Lemma 3. Let v − 2 ≥ l ≥ 4 and let T be an (l + 2)-subset of V. Let
R ∈ (V choose 3) \ (T choose 3). Then, there exists an Erdös configuration 𝓢 of
order l on the points of T and a triple S ∈ 𝓢 such that 𝓢 \ {S} ∪ {R} is an
l-sparse PSTS(v).
Theorem 3. (m-sparseness facets) Let m ≥ 4. Then, for any 2 ≤ l ≤ m and
any (l + 2)-subset T of V, the inequality

    s(T) :  ∑_{B ∈ (T choose 3)} x_B ≤ l − 1

defines a facet for P_{m,v}.
Proof. Inequalities s(T) with l ∈ {2, 3} are facet inducing for P_{2,v,3,1}
(see Table 1), and even though the inclusion P_{m,v} ⊆ P_{2,v,3,1} is in general
proper, it is easy to show they remain facet inducing for P_{m,v}. Thus, we
concentrate on l ≥ 4.
The validity of s(T) comes from the definition of l-sparse PSTSs, i.e. the fact
that r((T choose 3)) = l − 1 for I_{m,v}. Lemma 3 implies that (T choose 3) is
closed. Thus, by Theorem 1, it is sufficient to show that the critical graph
G_{(T choose 3)}(I_{m,v}) is connected.
Let 𝓔 be an Erdös configuration of order l on the points of T. There must be
two triples in 𝓔 whose intersection is a single point; call those triples B_1 and
B_2. We claim 𝓔 \ {B_1} and 𝓔 \ {B_2} are m-sparse 2–(v, 3, 1) packings. Indeed,
|𝓔 \ {B_i}| = |𝓔| − 1 = l − 1, and since 𝓔 was (l − 1)-sparse, so is 𝓔 \ {B_i},
i = 1, 2. Thus, there exists an edge in the critical graph G_{(T choose 3)}(I_{m,v})
connecting triples B_1 and B_2. By permuting T, we can show this is true for any
pair of triples which intersect in one point. That is, there exists an edge in
G_{(T choose 3)}(I_{m,v}) connecting C_1 and C_2, for any C_1, C_2 ∈ (T choose 3)
with |C_1 ∩ C_2| = 1. It is easy to check that this graph is connected.  □
Remark 4. The following is an integer programming formulation for the
optimization problem associated with P_{m,v}, in which all the inequalities are
facet inducing (see Theorem 3). Note that the second type of inequalities can be
omitted from the integer programming formulation, since for integral points they
are implied by the first type of inequalities (the first type guarantees that x
is a PSTS).

    maximize    ∑_{B ∈ (V choose 3)} x_B
    subject to  ∑_{B ∈ (T choose 3)} x_B ≤ 1,      for all T ⊆ V, |T| = 4,
                ∑_{B ∈ (T choose 3)} x_B ≤ 2,      for all T ⊆ V, |T| = 5,
                ∑_{B ∈ (T choose 3)} x_B ≤ 3,      for all T ⊆ V, |T| = 6,
                  ...
                ∑_{B ∈ (T choose 3)} x_B ≤ m − 1,  for all T ⊆ V, |T| = m + 2,
                x ∈ {0, 1}^{(V choose 3)}
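The feasibility conditions behind this formulation can be checked directly. The following sketch (our own code, feasible only for tiny v) tests m-sparseness exactly as the constraints require and recovers, by plain backtracking, the first entries of Table 2.

```python
from itertools import combinations

def is_m_sparse_packing(triples, m):
    """2-(v,3,1) packing (pairwise intersections < 2) containing no
    (l+2, l)-configuration for 4 <= l <= m."""
    if any(len(set(A) & set(B)) >= 2 for A, B in combinations(triples, 2)):
        return False
    sets = [set(T) for T in triples]
    return all(len(set().union(*sub)) > l + 2
               for l in range(4, m + 1)
               for sub in combinations(sets, l))

def sparse_packing_number(v, m):
    """D(m, v) by exhaustive backtracking over all triples."""
    blocks = list(combinations(range(v), 3))
    best = 0
    def extend(chosen, start):
        nonlocal best
        best = max(best, len(chosen))
        for i in range(start, len(blocks)):
            if is_m_sparse_packing(chosen + [blocks[i]], m):
                extend(chosen + [blocks[i]], i + 1)
    extend([], 0)
    return best

# A Pasch configuration is a (6,4)-configuration, hence not 4-sparse:
pasch = [(1, 2, 5), (3, 4, 5), (1, 3, 6), (2, 4, 6)]
print(is_m_sparse_packing(pasch, 4))                             # -> False
print(sparse_packing_number(6, 4), sparse_packing_number(7, 4))  # -> 3 5
```

The values 3 and 5 match the v = 6 and v = 7 rows of Table 2; for larger v the search space explodes, which is why the paper resorts to branch-and-cut with these facets as cuts.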
Remark 5. (Separation of m-sparse facets) For constant m ≥ 4, l-sparse facets,
l ≤ m, can be separated in polynomial time. This is the case, since there are
exactly ∑_{l=4}^{m+2} (v choose l) ∈ O(v^{m+2}) inequalities to check, which is
a polynomial in the number of variables of the problem, which is (v choose 3).
5 Using Facets for Lower and Upper Bounds

In this section, we illustrate some interesting uses of valid inequalities for
packing design problems. Recall that D(m, v) denotes the maximum size of an
m-sparse PSTS(v). We show an upper bound on D(m, v) based on valid subpacking
inequalities for m-sparse PSTSs. We also display the results of an algorithm
that uses 4-sparse facets to determine D(4, v).
Proposition 5. (Upper bound for the m-sparse number) Let m ≥ 4. Then,

    D(m, v) ≤ U(m, v) := ⌊ D(m, v − 1) · v / (v − 3) ⌋.
Table 2. The anti-Pasch (4-sparse) PSTS number for small v.

    v     D(4, v)*   D_1(v,3,2)**   U(4, v)**
    6     3          4              4
    7     5          7              5
    8     8          8              8
    9     12         12             12
    10    12         13             17
    11    15         17             16
    12    19         20             20
    13    ≥ 24       26             24
    14    28         28             30
    15    35         35             35
    16    37         37             43

    *  exact results from the branch-and-cut algorithm
    ** upper bounds from known packing numbers and from Proposition 5

To the best of our knowledge, the values of D(4, v) for v ∈ [10, 13] are new
results.
Proof. There are v rank inequalities of the form
∑_{B ∈ (T choose 3)} x_B ≤ D(m, v − 1), one for each T ∈ (V choose v−1). Each
triple appears in v − 3 of these inequalities. Thus, adding these inequalities
yields ∑_{B ∈ (V choose 3)} x_B ≤ D(m, v − 1) · v / (v − 3). Since the left-hand
side is integral, we take the floor function on the right-hand side. The
inequality is valid in particular for x being the incidence vector of a maximum
m-sparse PSTS(v), in which case the left-hand side is equal to D(m, v).  □
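Proposition 5 is a one-line computation. With the D(4, v) column of Table 2 as input, the bound reproduces the U(4, v) column (a sketch; the dictionary below just transcribes the table):

```python
# D(4, v) for v = 6..15, transcribed from Table 2.
D4 = {6: 3, 7: 5, 8: 8, 9: 12, 10: 12,
      11: 15, 12: 19, 13: 24, 14: 28, 15: 35}

def U(D_prev, v):
    """U(m, v) = floor(D(m, v-1) * v / (v - 3))  (Proposition 5)."""
    return (D_prev * v) // (v - 3)

# Reproduces the U(4, v) column of Table 2 for v = 7..16.
print({v: U(D4[v - 1], v) for v in range(7, 17)})
```

Note how the bound is tight exactly at v = 7, 8, 9, 12, 13, 15 — in particular at v = 13, which is what lets the authors conclude D(4, 13) = 24 from a solution of size 24.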
In Table 2, we show values for D(4, v) obtained by our algorithm. To the general algorithm in [17], we added 4-sparse inequalities. Due to their large number,
the 4-sparse inequalities are not included in the original integer programming
formulation, but are added whenever violated. For v = 13, it was not possible to
solve the problem to optimality but a solution of size 24 was obtained; since this
matches one of the upper bounds, we conclude D(4, 13) = 24. All other cases
were solved to optimality.
6 Conclusions and Further Work
In this article, we derive and study new classes of valid and facet inducing
inequalities for the packing design and m-sparse PSTS polytopes. We also
exemplify how this knowledge can be used in algorithms to construct designs as
well as to derive upper bounds on packing numbers. A number of extensions of this
work could be pursued. For instance, we are currently investigating how to
generalize the results of Table 1 in order to determine the facet inducing
subpacking inequalities of PSTSs for all |S|. We are also working on the design
of separation algorithms for m-sparse facets that would be more efficient than
the naive one which checks all inequalities (see the complexity in Remark 5).
Other directions for further research are the study of other rank inequalities
and the investigation of new upper bounds along the lines suggested in Section 5.
In an expanded version of
this article, we intend to include the proofs that were omitted in this extended
abstract as well as some of the extensions mentioned above.
References
1. E. Balas and E. Zemel. Critical cutsets of graphs and canonical facets of set-packing
polytopes. Math. Oper. Res., 2:15–19, 1977.
2. A.E. Brouwer. Steiner triple systems without forbidden configurations. Technical
Report ZW104/77, Mathematisch Centrum Amsterdam, 1977.
3. A. Caprara and M. Fischetti. Branch-and-cut algorithms. In Dell'Amico et al., eds.,
Annotated Bibliographies in Combinatorial Optimization, John Wiley & Sons, 1997.
4. Y.M. Chee, C.J. Colbourn, and A.C.H. Ling. Asymptotically optimal erasure-resilient codes for large disk arrays. Discrete Appl. Math., to appear.
5. V. Chvátal. On certain polytopes associated with graphs. J. Combin. Theory
Ser. B, 18:138–154, 1975.
6. C.J. Colbourn, E. Mendelsohn, A. Rosa, and J. Širáň. Anti-mitre Steiner triple
systems. Graphs Combin., 10:215–224, 1994.
7. G. Cornuéjols and A. Sassano. On the 0,1 facets of the set covering polytope. Math.
Programming, 43:45–55, 1989.
8. M.J. Grannell and T.S. Griggs. Configurations in Steiner triple systems. Combinatorial Designs and their applications, 103–126, Chapman & Hall/CRC Res. Notes
Math. 403, Chapman & Hall/CRC, 1999.
9. M.J. Grannell, T.S. Griggs, and E. Mendelsohn. A small basis for four-line configurations in Steiner triple systems. J. Combin. Des., 3(1):51–59, 1995.
10. L. Hellerstein, G.A. Gibson, R.M. Karp, R.H. Katz, and D.A. Patterson. Coding techniques for handling failures in large disk arrays. Algorithmica, 12:182–208, 1994.
11. M. Laurent. A generalization of antiwebs to independence systems and their canonical facets. Math. Programming, 45:97–108, 1989.
12. H. Lefmann, K.T. Phelps, and V. Rödl. Extremal problems for triple systems. J.
Combin. Des., 1:379–394, 1993.
13. A.C.H. Ling. A direct product construction for 5-sparse triple systems. J. Combin.
Des., 5:444–447, 1997.
14. L. Moura. Polyhedral methods in design theory. In Wallis, ed., Computational and
Constructive Design Theory, Math. Appl. 368:227–254, 1996.
15. L. Moura. Polyhedral Aspects of Combinatorial Designs. PhD thesis, University
of Toronto, 1999.
16. L. Moura. Maximal s-wise t-intersecting families of sets: kernels, generating sets,
and enumeration. J. Combin. Theory Ser. A, 87:52–73, 1999.
17. L. Moura. A polyhedral algorithm for packings and designs. Algorithms–ESA'99,
Proceedings of the 7th Annual European Symposium held in Prague, 1999, Lecture
Notes in Computer Science 1643, Springer-Verlag, Berlin, 1999.
18. W.H. Mills and R.C. Mullin. Coverings and packings. In Dinitz and Stinson, eds.,
Contemporary Design Theory: a collection of surveys, 371–399. Wiley, 1992.
19. D.R. Stinson. Packings. In Colbourn and Dinitz, eds., The CRC handbook of
combinatorial designs, 409–413, CRC Press, 1996.
20. D. Wengrzik. Schnittebenenverfahren für Blockdesign-Probleme. Master’s thesis,
Universität Berlin, 1995.
21. E. Zehendner. Methoden der Polyedertheorie zur Herleitung von oberen Schranken
für die Mächtigkeit von Blockcodes. Doctoral thesis, Universität Augsburg, 1986.
The Anti-Oberwolfach Solution: Pancyclic
2-Factorizations of Complete Graphs
Brett Stevens
Department of Mathematics and Statistics
Simon Fraser University
Burnaby BC V5A 1S6
brett@math.sfu.ca
Abstract. We pose and completely solve the existence of pancyclic
2-factorizations of complete graphs and complete bipartite graphs. Such
2-factorizations exist for all such graphs, except a few small cases which
we have proved are impossible. The solution method is simple but powerful.
The pancyclic problem is intended to showcase the power this method offers
for solving a wide range of 2-factorization problems. Indeed, these methods
go a long way towards being able to produce arbitrary 2-factorizations with
one or two cycles per factor.
1 Introduction
Suppose that there is a conference being held at Punta del Este, Uruguay. There
are 2n + 1 people attending the conference and it is to be held over n days. Each
evening there is a dinner which everyone attends. To accommodate the many
different sizes of conferences, the Las Dunas Hotel has many different sizes of
tables. In fact, they have every table size from a small triangular table to large
round tables seating 2n + 1 people. When this was noticed, the organizers, being
knowledgeable in combinatorics, asked themselves if a seating arrangement could
be made for each evening such that every person sat next to every other person
exactly once over the course of the conference and each size table was used at
least once.
Such a schedule, really a decomposition of K2n+1 into spanning graphs all
with degree 2 (collections of cycles), would be an example of a 2-factorization of
K2n+1 . Due to their usefulness in solving scheduling problems, 2-factorizations
have been well studied. The Oberwolfach problem asks for a 2-factorization in
which each subgraph in the decomposition has the same pattern of cycles and
much work has been done toward its solution [2,7]. This corresponds to the hotel using the exact same set of tables each night. Often other graphs besides
odd complete graphs are investigated. Complete graphs of even order with a
perfect matching removed so the graph has even degree have received much attention [1]. In such solutions each person would miss sitting next to exactly one
other during the conference. Oberwolfach questions have also been posed and
solved for complete bipartite graphs [7]. The problem posed in the introductory
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 115–122, 2000.
c Springer-Verlag Berlin Heidelberg 2000
116
B. Stevens
paragraph asks that every size of cycle appear and so is called the pancyclic
2-factorization problem; since it forces such different cycle sizes, the title
'anti-Oberwolfach problem' emphasizes this contrast. There are analogous
formulations for an even number of people with a complete matching removed
(coauthor avoiding, to prevent conflict) and for bipartite graphs as well (the
seating arrangements alternating computer scientist and mathematician to foster
cross-disciplinary communication).
The Conference Organizers soon noted that tables of size 2n and 2n − 1, although
available, were forbidden, since the remaining people would be forced to sit at
tables of size 1 or 2, which did not exist, and this would preclude every pair
being neighbors exactly once. After realizing this and doing a preliminary count,
the organizers then asked themselves for a schedule that would include the first
evening with everyone seated around one large table of size 2n + 1, an evening
with a size three table paired with a size 2n − 2 table, an evening with a size
four table paired with a size 2n − 3 table and so forth up to an evening with size
n table paired with a size n + 1 table. There was one evening remaining and the
organizers thought it would be nice to have everyone seated again at one table
for the final dinner together.
If the solution methods for the Oberwolfach problem can be paired with methods
for the anti-Oberwolfach problem, then it is conceivable that general
2-factorization problems can be tackled with great power. This would enable us
to answer many different and new scheduling and tournament problems. Indeed, the
pancyclic question is recreational in nature, but we use it as a convenient
context in which to present powerful and very serious construction methods that
can contribute to a broader class of 2-factorizations.
Another primary motivation for this problem is recent papers investigating the
possible numbers of cycles in cycle decompositions of complete graphs [3] and in
2-factorizations [4,5]. For each n, the numbers of cycles that appear in an
anti-Oberwolfach solution are admissible, so the question was asked whether this
specific structure was possible. We show that the answer to all versions of the
problem, complete odd graphs, complete even graphs minus a complete matching,
and complete bipartite graphs, is affirmative, except for small cases which we
have proved impossible. The solution method is very similar to Piotrowski's
approach to 2-factorization problems: we modify pairs of Hamiltonian cycles into
pairs of 2-factors with the desired cycle structures.
In this paper we offer first some definitions and discussion of 2-factorizations,
formalizing the notions discussed above. Then we solve the standard and bipartite formulations of the anti-Oberwolfach problem. We end with a discussion of
the solution method, possible extensions of the problem, and the power these
methods provide for constructing very general classes of 2-factorizations.
2 Definitions and Discussion
Definition 1 A k-factorization of a graph G is a decomposition of G into spanning subgraphs, all regular of degree k. Each such subgraph is called a k-factor.
The Anti-Oberwolfach Solution
117
We are interested in a special class of 2-factorizations, but also use 1-factors
(perfect matchings) on occasion.
Definition 2 A pancyclic 2-factorization of a graph G of order n is a
2-factorization of G where a cycle of each admissible size, 3, 4, . . . , n − 4,
n − 3, n, appears at least once in some 2-factor.
There is a similar definition for the bipartite graphs in which no odd cycles can
appear:
Definition 3 A pancyclic 2-factorization of a bipartite graph G of even order
n is a 2-factorization of G where a cycle of each admissible size, 4, 6, . . . ,
n − 4, n, appears at least once in some 2-factor.
We ask whether such 2-factorizations exist for complete odd graphs K2n+1 , complete even graphs, with a 1-factor removed to make the degree even, K2n −
nK2 , and complete bipartite graphs, some with a 1-factor removed, K2n,2n and
K2n+1,2n+1 − (2n + 1)K2 .
In every case, counting shows that all the 2-factors that are not Hamiltonian
(an n-cycle) must be of the form: an i-cycle and an (n − i)-cycle. We define
here a notation to refer to the different structures of 2-factors:

Definition 4 An i, (n − i)-factor is a 2-factor of an order-n graph G that is
the disjoint union of an i-cycle and an (n − i)-cycle.
In each case the solution is similar. For each graph in question, G, we present
a 2-factorization, {F_0, F_1, . . . , F_k}, and a cyclic permutation σ of a
subset of the vertices of G so that F_i = σ^i(F_0) and σ^{k+1} is the identity.
We decompose the union of consecutive pairs of 2-factors, F_i ∪ F_{i+1}, into two
other 2-factors with desired cycle structures by swapping pairs of edges. The
cyclic automorphism group guarantees that any two unions of two consecutive
2-factors are isomorphic. Thus we can formulate general statements about the
decomposition of the complete graphs into these unions and the possible
decompositions of these unions. A few cases require more sophisticated
manipulation. In certain cases we swap only one pair of edges; in others, we use
up to four such swaps. These methods demonstrate the power of Piotrowski's
approach of decomposing pairs of Hamiltonian cycles from a Walecki decomposition
into the desired 2-factors.
3 Main Results
We demonstrate the solution method in more detail for the case K2n+1 . In the
other cases the solution method is essentially the same with minor adjustments
and a slight loss of economy.
3.1 The Solution for K2n+1
Walecki’s decomposition of K2n+1 into Hamiltonian 2-factors give us the starting
point of our construction.
Lemma 1. There exists a decomposition of K2n+1 into Hamiltonian 2-factors
that are cyclic developments of each other.
The first of the Hamiltonian 2-factors is shown in Figure 1. Each of the remaining
n − 1 2-factors is a cyclic rotation of the first.
Fig. 1. A Walecki 2-factor of K2n+1.
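One standard way to realize Lemma 1 in code is sketched below (our own construction and naming, under the usual description of Walecki's decomposition): the vertices are Z_{2n} plus one fixed vertex, the base 2-factor is the zigzag Hamiltonian cycle, and the remaining 2-factors are its rotations. The assertions check that the n cycles are Hamiltonian and partition the edges of K_{2n+1}.

```python
INF = "oo"          # the vertex fixed by the rotation

def walecki_cycles(n):
    """n Hamiltonian cycles of K_{2n+1}: a zigzag base cycle through
    Z_{2n} and INF, developed cyclically by x -> x+1 (mod 2n)."""
    zigzag = [0]
    for k in range(1, n):
        zigzag += [k, 2 * n - k]      # 0, 1, 2n-1, 2, 2n-2, ...
    zigzag.append(n)
    return [[INF] + [(x + i) % (2 * n) for x in zigzag] for i in range(n)]

def edge_set(cycle):
    """Undirected edges of a closed cycle given as a vertex list."""
    return {frozenset(e) for e in zip(cycle, cycle[1:] + cycle[:1])}

n = 4
cycles = walecki_cycles(n)
edges = [e for c in cycles for e in edge_set(c)]
assert all(len(set(c)) == 2 * n + 1 for c in cycles)       # Hamiltonian
assert len(edges) == len(set(edges)) == (2 * n + 1) * n    # edge partition
print("ok")
```

The swapping lemmas below then operate on unions of consecutive cycles from this list.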
The union of two consecutive Hamilton cycles from the Walecki decomposition is isomorphic to the graph given in Figure 2. This graph can be decomposed
Fig. 2. The union of two consecutive Walecki 2-factors.
into two other Hamiltonian 2-factors that are not identical to the original Walecki
2-factors. These are shown in Figure 3. It is these two Hamiltonian 2-factors that
can be decomposed into 2-factors with various cycle structures.
Lemma 2. The graph in Figure 2 can be decomposed into two 2-factors such
that the first is a 2i + 1, 2(n − i)-factor and the second is a 2j + 1,
2(n − j)-factor, for any 1 ≤ i ≠ j ≤ n − 2. Alternatively, the second can remain
a Hamiltonian 2-factor, with no restriction on i.
Fig. 3. Decomposition of the union of two consecutive Walecki 2-factors into two other Hamiltonian 2-factors.
Proof. The first decomposition is achieved by swapping four edges between the
two graphs of Figure 3 and is shown in Figure 4.
Fig. 4. Decomposition into a 2i + 1, 2(n − i)-factor and a 2j + 1, 2(n − j)-factor.
The second decomposition is achieved by swapping only two edges between
the two graphs of Figure 3 and is shown in Figure 5.
In both figures the sets of swapped edges are shown as dashed or dotted lines.
The flexibility of the parameters i and j together with the decomposition of
K2n+1 into n cyclically derived Hamiltonian 2-factors gives the main result.
Theorem 1 There exists a pancyclic 2-factorization of K2n+1 for all n ≥ 1.
Fig. 5. Decomposition into a 2i + 1, 2(n − i)-factor and a Hamiltonian 2-factor.
3.2 The Remaining Cases: K2n − nK2, K2n,2n and K2n+1,2n+1 − (2n + 1)K2
Decomposing graphs with odd degree into 2-factors is impossible, since each
2-factor accounts for two edges at each vertex. In these cases it is customary
to remove a 1-factor and attempt to decompose the resulting graph, which has
even degree. The existence of pancyclic 2-factorizations of K2n − nK2, K2n,2n
and K2n+1,2n+1 − (2n + 1)K2 is achieved in the same manner as that of K2n+1.
We decompose the graph into 2-factors (usually Hamiltonian) that are cyclic
developments of each other, so that all unions of pairs of consecutive 2-factors
are isomorphic. The union of two consecutive 2-factors has a structure very
similar to that in the case K2n+1, and it can be broken into smaller cycles in
almost exactly the same manner. There are two minor, though notable,
differences. When decomposing K2n − nK2, it is necessary in one fourth of the
cases to reinsert the removed 1-factor and remove another, to be able to
construct odd numbers of 2-factors with different parities. In the bipartite
case it is sometimes necessary to apply the edge swapping additional times,
since the original decomposition into 2-factors may not have produced
Hamiltonian 2-factors. Again, in each case the complete solution is achievable.
Theorem 2 There exists a pancyclic 2-factorization of K2n −nK2 for all n ≥ 1.
Theorem 3 There exists a pancyclic 2-factorization of Kn,n for all even n > 4
and Kn,n − nK2 for all odd n > 1. The cases n = 1, 2, 4 are all impossible.
In all cases the union of the edges from each set of four swapped edges forms
an induced 4-cycle, and the remainder of the graphs are paths connecting pairs
of vertices from the 4-cycle. This induced 4-cycle and the connecting paths are
the underlying structure of the construction. Consideration of this structure allows
the swapping to be formalized and made rigorous, so that the proofs can rest
on a foundation other than nice figures. Unfortunately the statements of these
swapping lemmas are lengthy and technical and space does not permit their
inclusion.
4 Conclusion
As a demonstration of a powerful method for a wide range of 2-factorization
problems, of a similar type to Piotrowski's Oberwolfach constructions, we have
solved the pancyclic 2-factorization problem for four infinite families of
complete or nearly complete graphs: K2n+1, K2n − nK2, K2n,2n and
K2n+1,2n+1 − (2n + 1)K2. In each case, pancyclic 2-factorizations exist for all
n except for a very few small n where the solution is shown not to exist.
Moreover, in each case the solution method is similar. We start with a
2-factorization of the graph in question with a cyclic automorphism group. The
union of consecutive pairs of the 2-factors is shown to be decomposable into two
2-factors with a wide range of cycle structures by judicious swapping of the two
pairs of opposite edges of induced 4-cycles. This flexibility of decomposition
and the automorphism group allow the desired solution to be constructed.
The plethora of induced 4-cycles in the union of consecutive 2-factors from
the various 2-factorizations allow us not only to construct the various solutions
in many different ways, but to go far beyond the problem solved here. In K2n+1
it seems that the swapping lemmas can only produce one odd cycle per factor
and at most two in K2n − nK2 . Beyond this restriction there is a great deal of
flexibility in the application of the swapping lemmas. The use of these methods
to solve the pancyclic 2-factorization problem indicates the strength and range of
the swapping lemmas. We propose that the methods outlined in this article might
be powerful for constructing Oberwolfach solutions, and other 2-factorization
and scheduling problems. One very interesting problem is the construction of 2-factorizations with prescribed lists of cycle types for each 2-factor. If the list can
only contain 2-factors with one or two cycles, then the methods presented here
nearly complete the problem. The only obstacle towards the solution of these
problems is the construction of pairs of 2-factors with the same cycle type, the
Oberwolfach aspect of the question. P. Gvozdjak is currently working on such
constructions.
There are other pancyclic decomposition questions that can be asked. The
author and H. Verrall are currently working on the directed analogue of the
anti-Oberwolfach problem. Other obvious pancyclic problems can be formulated
for higher λ for both directed and undirected graphs; 2-path covering pancyclic
decompositions, both resolvable and non-resolvable. In each of these cases we
gain the flexibility to ask for different numbers within each size class of cycles,
possibly admitting digons and losing other restrictions enforced by the tightness
of the case solved here.
5 Acknowledgments
I would like to thank Profs. E. Mendelsohn and A. Rosa for suggesting this problem and for their support and encouragement. I was supported by NSF Graduate
Fellowship GER9452870, the Department of Mathematics at the University of
Toronto, a PIMS postdoctoral fellowship and the Department of Mathematics
and Statistics at Simon Fraser University.
References
1. B. Alspach and S. Marshall. Even cycle decompositions of complete graphs minus
a 1-factor. Journal of Combinatorial Designs, 2(6):441–458, 1994.
2. B. Alspach, P. Schellenberg, D. Stinson, and D. Wagner. The Oberwolfach problem
and factors of uniform odd length cycles. J. Combin. Theory. Ser. A, 52:20–43,
1989.
3. E. J. Billington and D. E. Bryant. The possible number of cycles in cycle systems.
Ars Combin., 52:65–70, 1999.
4. I. J. Dejter, F. Franek, E. Mendelsohn, and A. Rosa. Triangles in 2-factorizations.
J. Graph Theory, 26(2):83–94, 1997.
5. M. J. Grannell and A. Rosa. Cycles in 2-factorizations. J. Combin. Math. Combin.
Comput., 29:41–64, 1999.
6. E. Lucas. Récréations Mathématiques, volume 2. Gauthier-Villars, Paris, 1883.
7. W. Piotrowski. The solution of the bipartite analogue of the Oberwolfach problem.
Discrete Math., 97:339–356, 1991.
Graph Structure of the Web: A Survey
Prabhakar Raghavan
IBM Almaden Research Center K53/B1, 650 Harry Road, San Jose CA 95120
pragh@almaden.ibm.com
1 Summary
The subject of this survey is the directed graph induced by the hyperlinks between Web pages; we refer to this as the Web graph. Nodes represent static
html pages and hyperlinks represent directed edges between them. Recent estimates [5] suggest that there are several hundred million nodes in the Web graph;
this quantity is growing by several percent each month. The average node has
roughly seven hyperlinks (directed edges) to other pages, making for a total of
several billion hyperlinks in all.
There are several reasons for studying the Web graph. The structure of this
graph has already led to improved Web search [7,10,11,12,25,34], more accurate
topic-classification algorithms [13] and has inspired algorithms for enumerating
emergent cyber-communities [28]. Beyond the intrinsic interest of the structure
of the Web graph, measurements of the graph and of the behavior of users
as they traverse the graph, are of growing commercial interest. These in turn
raise a number of intriguing problems in graph theory and the segmentation
of Markov chains. For instance, Charikar et al. [14] suggest that analysis of
surfing patterns in the Web graph could be exploited for targeted advertising
and recommendations. Fagin et al. [18] consider the limiting distributions of
Markov chains (modeling users browsing the Web) that occasionally undo their
last step.
In this lecture we will cover the following themes from our recent work:
– How the structure of the Web graph has been exploited to improve the
quality of Web search.
– How the Web harbors an unusually large number of certain clique-like subgraphs, and the efficient enumeration of these subgraphs for the purpose of
discovering communities of interest groups in the Web.
– A number of measurements of degree sequences, connectivity, component
sizes and diameter on the Web. The salient observations include:
1. In-degrees on the Web follow an inverse power-law distribution.
2. About one quarter of the nodes of the Web graph lie in a giant strongly
connected component; the remaining nodes lie in components that give
some insights into the evolution of the Web graph.
3. The Web is not well-modeled by traditional random graph models such
as Gn,p .
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 123–125, 2000.
© Springer-Verlag Berlin Heidelberg 2000
– A new class of random graph models for evolving graphs. In particular, some
characteristics observed in the Web graph are modeled by random graphs in
which the destinations of some edges are created by probabilistically copying
from other edges at random. This raises the prospect of the study of a new
class of random graphs, one that also arises in other contexts such as the
graph of telephone calls [3].
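The copying process can be sketched in a few lines. The sketch below is an illustrative simplification — one out-edge per new node and a single mixing parameter p are assumptions, not the exact model of the cited work — but it shows why copying concentrates in-degree on already-popular nodes.

```python
import random
from collections import Counter

def copying_model(n, p=0.5, seed=0):
    """Grow a directed graph one node at a time; each new node gets a
    single out-edge.  With probability p the destination is a uniformly
    random earlier node; otherwise it copies the destination of a
    uniformly random existing edge, which skews the in-degree
    distribution toward a few heavily-linked nodes."""
    rng = random.Random(seed)
    edges = [(1, 0)]                     # seed graph: node 1 links to node 0
    for v in range(2, n):
        if rng.random() < p:
            dest = rng.randrange(v)      # fresh uniform choice
        else:
            dest = rng.choice(edges)[1]  # copy an existing destination
        edges.append((v, dest))
    return edges

# In-degree counts are heavily skewed toward a handful of nodes:
indeg = Counter(d for _, d in copying_model(5000))
print(indeg.most_common(3))
```

Plotting the in-degree frequencies on log-log axes would make the heavy tail visible; a pure uniform-choice model (p = 1) produces a much flatter distribution.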
Pointers to these algorithms and observations, as well as related work, may
be found in the bibliography below.
Acknowledgements
The work covered in this lecture is the result of several joint pieces of work with
a number of co-authors. I thank the following colleagues for collaborating on
these pieces of work: Andrei Broder, Soumen Chakrabarti, Byron Dom, David
Gibson, Jon Kleinberg, S. Ravi Kumar, Farzin Maghoul, Sridhar Rajagopalan,
Raymie Stata, Andrew Tomkins and Janet Wiener.
References
1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener. The Lorel Query
language for semistructured data. Intl. J. on Digital Libraries, 1(1):68-88, 1997.
2. R. Agrawal and R. Srikanth. Fast algorithms for mining association rules. Proc.
VLDB, 1994.
3. W. Aiello, F. Chung and L. Lu. A random graph model for massive graphs. To
appear in the Proceedings of the ACM Symposium on Theory of Computing, 2000.
4. G. O. Arocena, A. O. Mendelzon, G. A. Mihaila. Applications of a Web query
language. Proc. 6th WWW Conf., 1997.
5. K. Bharat and A. Broder. A technique for measuring the relative size and overlap
of public Web search engines. Proc. 7th WWW Conf., 1998.
6. K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in a
hyperlinked environment. Proc. ACM SIGIR, 1998.
7. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine.
Proc. 7th WWW Conf., 1998. See also http://www.google.com.
8. A.Z. Broder, S.R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata,
A. Tomkins and J. Wiener. Graph structure in the web: experiments and models.
Submitted for publication.
9. B. Bollobás. Random Graphs, Academic Press, 1985.
10. J. Carrière and R. Kazman. WebQuery: Searching and visualizing the Web through
connectivity. Proc. 6th WWW Conf., 1997.
11. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan and S. Rajagopalan.
Automatic resource compilation by analyzing hyperlink structure and associated
text. Proc. 7th WWW Conf., 1998.
12. S. Chakrabarti, B. Dom, S. R. Kumar, P. Raghavan, S. Rajagopalan, and
A. Tomkins. Experiments in topic distillation. SIGIR workshop on hypertext IR,
1998.
13. S. Chakrabarti and B. Dom and P. Indyk. Enhanced hypertext classification using
hyperlinks. Proc. ACM SIGMOD, 1998.
14. M. Charikar, S. R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. On
targeting Markov segments. Proc. ACM Symposium on Theory of Computing, 1999.
15. H. T. Davis. The Analysis of Economic Time Series. Principia press, 1941.
16. R. Downey, M. Fellows. Parametrized Computational Feasibility. In Feasible Mathematics II, P. Clote and J. Remmel, eds., Birkhauser, 1994.
17. L. Egghe, R. Rousseau, Introduction to Informetrics, Elsevier, 1990.
18. R. Fagin, A. Karlin, J. Kleinberg, P. Raghavan, S. Rajagopalan, R. Rubinfeld,
M. Sudan, A. Tomkins. Random walks with “back buttons”. To appear in the
Proceedings of the ACM Symposium on Theory of Computing, 2000.
19. D. Florescu, A. Levy and A. Mendelzon. Database techniques for the World Wide
Web: A survey. SIGMOD Record, 27(3): 59-74, 1998.
20. E. Garfield. Citation analysis as a tool in journal evaluation. Science, 178:471–479,
1972.
21. N. Gilbert. A simulation of the structure of academic science. Sociological Research
Online, 2(2), 1997.
22. G. Golub, C. F. Van Loan. Matrix Computations, Johns Hopkins University Press,
1989.
23. M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams.
AMS-DIMACS series, special issue on computing on very large datasets, 1998.
24. M. M. Kessler. Bibliographic coupling between scientific papers. American Documentation, 14:10–25, 1963.
25. J. Kleinberg. Authoritative sources in a hyperlinked environment, J. of the ACM,
1999, to appear. Also appears as IBM Research Report RJ 10076(91892) May 1997.
26. J. Kleinberg, S. Ravi Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. The
Web as a graph: measurements, models and methods. Invited paper in Proceedings of the International Conference on Combinatorics and Computing, COCOON,
1999. Springer-Verlag Lecture Notes in Computer Science.
27. D. Konopnicki and O. Shmueli. Information gathering on the World Wide Web:
the W3QL query language and the W3QS system. Trans. on Database Systems,
1998.
28. S. R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. Trawling emerging
cyber-communities automatically. Proc. 8th WWW Conf., 1999.
29. L. V. S. Lakshmanan, F. Sadri, and I. N. Subramanian. A declarative approach to
querying and restructuring the World Wide Web. Post-ICDE Workshop on RIDE,
1996.
30. R. Larson. Bibliometrics of the World Wide Web: An exploratory analysis of the
intellectual structure of cyberspace. Ann. Meeting of the American Soc. Info. Sci.,
1996.
31. A. J. Lotka. The frequency distribution of scientific productivity. J. of the Washington Acad. of Sci., 16:317, 1926.
32. A. Mendelzon, G. Mihaila, and T. Milo. Querying the World Wide Web, J. of
Digital Libraries 1(1):68–88, 1997.
33. A. Mendelzon and P. Wood. Finding regular simple paths in graph databases.
SIAM J. Comp., 24(6):1235-1258, 1995.
34. E. Spertus. ParaSite: Mining structural information on the Web. Proc. 6th WWW
Conf., 1997.
35. G. K. Zipf. Human behavior and the principle of least effort. New York: Hafner,
1949.
Polynomial Time Recognition of
Clique-Width ≤ 3 Graphs
(Extended Abstract)
Derek G. Corneil1, Michel Habib2, Jean-Marc Lanlignel2, Bruce Reed3, and Udi Rotics1
1 Department of Computer Science, University of Toronto, Toronto, Canada †
2 LIRMM, CNRS and University Montpellier II, Montpellier, France ‡
3 Equipe de Combinatoire, CNRS, University Paris VI, Paris, France §
Abstract. The Clique-width of a graph is an invariant which measures
the complexity of the graph's structure. A graph of bounded tree-width
is also of bounded Clique-width (but not the converse). For graphs G of
bounded Clique-width, given the bounded width decomposition of G, every optimization, enumeration or evaluation problem that can be defined
by a Monadic Second Order Logic formula using quantifiers on vertices
but not on edges, can be solved in polynomial time.
This is reminiscent of the situation for graphs of bounded tree-width,
where the same statement holds even if quantifiers are also allowed on
edges. Thus, graphs of bounded Clique-width are a larger class than
graphs of bounded tree-width, on which we can resolve fewer, but still
many, optimization problems efficiently.
In this paper we present the first polynomial time algorithm (O(n^2 m))
to recognize graphs of Clique-width at most 3.
1 Introduction
The notion of the Clique-width of graphs was first introduced by Courcelle,
Engelfriet and Rozenberg in [CER93]. The clique-width of a graph G, denoted
by cwd(G), is defined as the minimum number of labels needed to construct G,
using the four graph operations: creation of a new vertex v with label i (denoted i(v)), disjoint union (⊕), connecting vertices with specified labels (η) and renaming labels (ρ). Note that ⊕ is the disjoint union of two labeled graphs; each vertex of the new graph retains the label it had previously. ηi,j (i ≠ j), called the "join" operation, causes all edges (that are not already present) to be created between every vertex of label i and every vertex of label j. ρi→j causes all vertices of label i to assume label j. As an example of these notions see the graph in Fig. 2 together with its 3-expression in Fig. 3 and the parse tree associated with the expression in Fig. 4. A detailed study of clique-width is presented in [CO99].
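The four operations can be sketched directly. The following is an illustrative implementation, not code from any of the cited papers; for simplicity the operations mutate the graph in place. The example builds the path a−b−c−d (a P4) using only three labels, witnessing cwd(P4) ≤ 3.

```python
class LabeledGraph:
    """Labeled graph built with the four clique-width operations
    (an illustrative sketch; operations mutate self and return it)."""
    def __init__(self, vertex, label):          # i(v): new vertex with label i
        self.labels = {vertex: label}
        self.edges = set()

    def union(self, other):                     # ⊕: disjoint union
        assert not (self.labels.keys() & other.labels.keys())
        self.labels.update(other.labels)
        self.edges |= other.edges
        return self

    def join(self, i, j):                       # η_{i,j} (i != j): add all i-j edges
        for u in self.labels:
            for v in self.labels:
                if self.labels[u] == i and self.labels[v] == j:
                    self.edges.add(frozenset((u, v)))
        return self

    def relabel(self, i, j):                    # ρ_{i→j}: every label i becomes j
        for v, lab in self.labels.items():
            if lab == i:
                self.labels[v] = j
        return self

# Build the P4 a-b-c-d with labels {1, 2, 3}:
g = LabeledGraph('a', 1)
g.union(LabeledGraph('b', 2)).join(1, 2)        # edge ab
g.relabel(1, 3)                                 # retire a's label
g.union(LabeledGraph('c', 1)).join(1, 2)        # edge bc
g.relabel(2, 3).relabel(1, 2)                   # retire b, promote c
g.union(LabeledGraph('d', 1)).join(1, 2)        # edge cd
print(g.edges == {frozenset('ab'), frozenset('bc'), frozenset('cd')})  # True
```

Label 3 acts as a "finished" label: once a vertex will receive no further edges, it is parked there so the two active labels can be reused.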
† email: {dgc,rotics}@cs.toronto.edu
‡ email: {habib,lanligne}@lirmm.fr
§ email: reed@moka.ccr.jussieu.fr
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 126–134, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Also a study of the clique-width of graphs with few P4's (a P4 is a path on four vertices) and on perfect graph classes is presented in [MR99,GR99]. For example,
distance hereditary graphs and P4 -sparse graphs have bounded clique-width (3
and 4 respectively) whereas unit interval graphs, split graphs and permutation
graphs all have unbounded clique-width.
The motivation for studying clique-width is analogous to that of tree-width.
In particular, given a parse tree which shows how to construct a graph G using
k labels and the operations above, every decision, optimization, enumeration or
evaluation problem on G which can be defined by a Monadic Second Order Logic
formula ψ, using quantifiers on vertices but not on edges, can be solved in time
ck · O(n + m) where ck is a constant which depends only on ψ and k, where n
and m denote the number of vertices and edges of the input graph, respectively.
For details, see [CMRa,CMRb].
Furthermore clique-width is “more powerful” than tree-width in the sense
that if a class of graphs is of bounded tree-width then it is also of bounded
clique-width [CO99]. (In particular, for every graph G, cwd(G) ≤ 2^(twd(G)+1) + 1,
where twd(G) denotes the tree-width of G).
One of the central open questions concerning clique-width is determining the
complexity of recognizing graphs of clique-width at most k, for fixed k. It is
easy to see that graphs of clique-width 1 are graphs with no edges. The graphs
of clique-width at most 2 are precisely the cographs (i.e., graphs without P4 )
[CO99]. In this paper we present a polynomial time algorithm (O(n^2 m)) to
determine if a graph has clique-width at most 3. For graphs of Clique-width ≤3
the algorithm also constructs the 3-expression which defines the graph.
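The clique-width ≤ 2 characterization above is easy to test by brute force on small graphs. A sketch follows — an exponential search for illustration only, not one of the linear-time cograph recognition algorithms.

```python
from itertools import combinations, permutations

def has_induced_p4(adj):
    """Brute-force search for an induced P4 (a path on four vertices
    with no other edges among them) in a graph given as an adjacency
    dict.  P4-free graphs are exactly the cographs, i.e., the graphs
    of clique-width at most 2."""
    for quad in combinations(adj, 4):
        for a, b, c, d in permutations(quad):
            path_edges = [(a, b), (b, c), (c, d)]
            non_edges = [(a, c), (a, d), (b, d)]
            if all(v in adj[u] for u, v in path_edges) and \
               all(v not in adj[u] for u, v in non_edges):
                return True
    return False

p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}        # the path itself
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}  # a 4-cycle: P4-free
print(has_induced_p4(p4), has_induced_p4(c4))  # True False
```

So `not has_induced_p4(g)` certifies cwd(g) ≤ 2 for small inputs; no comparably simple certificate exists for clique-width 3, which is what motivates the algorithm of this paper.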
An implementation that achieves the O(n^2 m) bound would be quite complicated, because a linear modular decomposition algorithm is needed. However, the other steps of the algorithm present no major difficulty: we use only standard data structures and the Ma-Spinrad split decomposition algorithm. So if we fall back on an easier modular decomposition algorithm (see Sec. 6), there is a slightly slower (O(n^2 m log n)) but easily implementable version of the algorithm.
Unfortunately, there does not seem to be a succinct forbidden subgraph characterization of graphs with clique-width at most 3, similar to the P4-free characterization of graphs with clique-width at most 2. In fact every cycle Cn with n ≥ 7 has clique-width 4, thereby exhibiting an infinite set of minimal forbidden induced subgraphs for Clique-width ≤3 [MR99].
2 Background
We first introduce some notation and terminology. The graphs we consider in
this paper are undirected and loop-free. For a graph G we denote by V (G)
(resp. E(G)) the set of vertices (resp. edges) of G. For X ⊆ V (G), we denote
by G[X] the subgraph of G induced by X. We denote by G \ X the subgraph
of G induced by V (G) \ X. We say that vertex v is universal to X if v is adjacent
to all vertices in X \ {v} and that v is universal in G if v is universal to V (G).
On the other hand v misses X if v misses (i.e., is not adjacent to) all vertices
in X \ {v}. We denote by N (v) the neighborhood of v in G, i.e., the set of
vertices in G adjacent to v. We denote by N [v] the closed neighborhood of v,
i.e., N [v] = N (v) ∪ {v}.
A labeled graph is a graph with integer labels associated with its vertices,
such that each vertex has exactly one label. We denote by hG : V1 , . . . , Vp i the
labeled graph G with labels in {1, . . . , p} where Vi denotes the set of vertices
of G having label i (some of these sets may be empty).
The definition of the Clique-width of graphs (see the introduction) extends
naturally to labeled graphs. The Clique-width of a labeled graph hG : V1 , . . . , Vp i
denoted by cwd(G : V1 , . . . , Vp ) is the minimum number of labels needed to
construct G such that all vertices of Vi have label i (at the end of the construction
process), using the four operations i(v), η, ρ and ⊕ (see the introduction for the
definition of these operations). Note that, for instance, the cycle with 4 vertices
(the C4 ) is of Clique-width ≤3, but the C4 labeled 1−1−2−2 consecutively
around the circle is not.
We say that a graph is 2-labeled if exactly two of the label sets are non-empty,
and 3-labeled if all three of them are non-empty.
Without loss of generality, we may assume that our given graphs are prime,
in the sense that they have no modules. (A module of G is an induced subgraph
H, 1 < |H| < |G|, such that each vertex in G \ H is either universal to H
or misses H). This assumption follows from the easily verifiable observation
(see [CMRa]) that for every graph G which is not a cograph (i.e., is of clique-width > 2), and has a module H, cwd(G) = max{cwd(H), cwd(G \ (H − x))},
where x is any vertex of H.
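The module condition is easy to test directly. A sketch over vertex sets (the paper states it for induced subgraphs); the P3 example is illustrative.

```python
def is_module(adj, h):
    """A vertex set h is a module (1 < |h| < |V|) if every vertex
    outside h is adjacent either to all of h or to none of h."""
    if not (1 < len(h) < len(adj)):
        return False
    h = set(h)
    for v in adj:
        if v in h:
            continue
        hits = len(adj[v] & h)
        if hits not in (0, len(h)):
            return False              # v "splits" h: not a module
    return True

# In the path 0-1-2, the endpoints {0, 2} form a module (vertex 1
# sees both), whereas {0, 1} does not (vertex 2 sees 1 but not 0):
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
print(is_module(p3, {0, 2}), is_module(p3, {0, 1}))  # True False
```

Prime graphs are exactly those where this test fails for every candidate set, which is what the reduction above exploits.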
Given a connected 3-labeled graph, the last operation in a parse tree which
constructs it must have been a join. In principle this yields three possibilities.
However, if two different join operations are possible, the graph has a module:
for example, if both η1,2 and η1,3 are possible, the vertices of label 2 and 3 form
a module. So since we are only considering prime graphs we can determine the
last operation of the parse tree.
Unfortunately we cannot continue this process on the subproblems as deleting
the join edges may disconnect the graph and leave us with 2-labeled subproblems.
In fact it turns out that solving Clique-width restricted to 3-labeled graphs is
no easier than solving it for 2-labeled graphs.
In contrast, if we attempt to find in this top-down way the parse tree for a 2-labeled prime graph, then we never produce a subproblem with only 1 non-empty
label set, because its vertices would form a module (as the reader may verify).
This fact motivates the following definition: for a partition A ∪ B of V (H), let 2–LAB(H : A, B) denote the problem of determining whether cwd(H : A, B) ≤ 3
(and finding a corresponding decomposition tree). Since A and B form a disjoint
partition of V (H) we will also denote it as 2–LAB(H : A, −). If we can find a
polynomial time algorithm to solve this problem, then our problem reduces to
finding a small set S of possible 2-labelings such that at least one of them is
of Clique-width ≤3 iff G is of Clique-width ≤3. We first discuss how we solve
2–LAB, then discuss how to use it to solve the general Clique-width ≤3 problem.
3 Labeled Graphs
The 2–LAB problem is easier to solve than the general Clique-width ≤3 problem, because there are fewer possibilities. The last operation must have been a
relabeling, and before that edges were created. With our top-down approach, we
have to introduce a third label set, that is to split one of the two sets A or B in
such a way that all edges are present between two of the three new sets (Fig. 1
shows the two possibilities when the set of vertices labeled with 1 is split); and
there are four ways to do this in general. Each of these four ways corresponds to
one of the ways of introducing the third label set, namely: consider the vertices
of A that are universal to B; consider the vertices of B that are universal to A;
consider the co-connected components of both A and B (these are the connected
components of the complement of the set).
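Two of these candidate computations are straightforward. The sketch below, under the assumption that the graph is an adjacency dict, only enumerates the candidate third label sets — it is not the full 2–LAB recursion, and the toy graph is illustrative.

```python
def universal_to(adj, X, Y):
    """Vertices of X adjacent to every vertex of Y: two of the four
    candidate third label sets (taking X = A or X = B)."""
    return {x for x in X if set(Y) <= adj[x]}

def co_components(adj, S):
    """Connected components of the complement of the subgraph induced
    by S: the other two candidates (taking S = A or S = B)."""
    unseen = set(S)
    comps = []
    while unseen:
        comp, stack = set(), [unseen.pop()]
        while stack:
            v = stack.pop()
            comp.add(v)
            new = {u for u in unseen if u not in adj[v]}  # complement edges
            unseen -= new
            stack.extend(new)
        comps.append(comp)
    return comps

# Toy 2-labeled graph: A = {a, b}, B = {c, d}; edges a-c, a-d, b-c
adj = {'a': {'c', 'd'}, 'b': {'c'}, 'c': {'a', 'b'}, 'd': {'a'}}
A, B = {'a', 'b'}, {'c', 'd'}
print(universal_to(adj, A, B))   # {'a'}: only a sees all of B
print(co_components(adj, A))     # a and b are non-adjacent, so one co-component
```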
Fig. 1. 2–LAB procedure main idea: one of the two label sets is split into two, so that all edges are present between two of the three resulting sets.
If there is only one possible way of relabeling the vertices and undoing a join,
then we have a unique way of splitting our problem into smaller problems; we
do so and continue, restricting our attention to these simpler subproblems.
The difficulty arises when there is more than one possible join that could
be undone. As mentioned in the previous section, if all three label sets are non-empty, then this possibility will not arise because it would imply the existence of
a module. In the general case this situation may arise, but again it implies very
strong structural properties of the graph, which allow us to restrict our attention
to just one of the final possible joins. The proof that we need consider just one
final join even when there is more than one possibility will be described in the
journal version of the paper (or see [WWW]).
We then remove the edges (adding a join node to the decomposition tree),
which disconnects the graph, and we can apply again the above algorithm, until
either we have a full decomposition tree, or we know that the input graph with
the initial labeling of A and B is not of Clique-width ≤3.
4 Algorithm Outline
We now know how to determine if the Clique-width of hG : A, Bi is ≤ 3 for any
partition A ∪ B of V (G). Our problem thus reduces to finding a small set S of
possible 2-labelings such that at least one of them is of Clique-width ≤3 iff G is
of Clique-width ≤3.
We are interested in the last join operation in a parse tree corresponding
to a 3-expression which defines G (if such an expression exists); without loss
of generality we can assume that this is a η1,2 . The first case is when there
is only one vertex x of label 1. In this case the parse tree is a solution of 2–
LAB(G : {x}, −). More generally, if the graph obtained by deleting the edges
from the last join has a 3-labeled connected component, then it turns out that
there is a simple procedure for finding the corresponding 2–LAB problems.
Do all graphs with clique-width at most 3 have a parse tree which ends in
this way? Unfortunately the answer is no. The graph in Fig. 2 has clique-width 3
but it is easy to show that there is no parse tree formed in the manner described
above.
Fig. 2. An example graph on the vertices 1, . . . , 8.
t = η1,2 (l ⊕ r)
l = ρ1→3 ◦ η1,3 (η1,2 (1(8) ⊕ 2(7)) ⊕ η2,3 (3(1) ⊕ 2(2)))
r = ρ2→3 ◦ η2,3 (η1,2 (1(6) ⊕ 2(5)) ⊕ η1,3 (1(3) ⊕ 3(4)))
Fig. 3. A 3-expression t for the graph of Fig. 2.
Thus, we need to consider other final joins, when the graph obtained by
deleting the edges from the last join has no 3-labeled connected component.
This leads to the notion of cuts (often called joins in the literature) first studied
by Cunningham [Cun82]. A cut is a disjoint partition of V (G) into (X : Y )
where |X|, |Y | > 1 together with the identification of subsets X̃ ⊆ X, Ỹ ⊆ Y ,
called the boundary sets, where E(G) ∩ (X × Y ) = X̃ × Ỹ . Note that since we
assume our graphs are module free, X̃ ⊂ X and Ỹ ⊂ Y . For the graph in Fig. 2,
X = {1, 2, 7, 8}, X̃ = {2, 7}, Y = {3, 4, 5, 6} and Ỹ = {3, 6}.
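The boundary condition of a cut can be checked directly. A sketch; the adjacency list below assumes the graph of Fig. 2 is the 8-cycle 1−2−· · ·−8−1 with chords 2−6 and 3−7, which is the graph the 3-expression of Fig. 3 generates.

```python
def is_cut(adj, X, Y, Xb, Yb):
    """Cunningham's cut condition: the edges between X and Y are
    exactly the complete bipartite graph between the boundary sets
    Xb ⊆ X and Yb ⊆ Y, with |X|, |Y| > 1."""
    X, Y, Xb, Yb = map(set, (X, Y, Xb, Yb))
    if len(X) < 2 or len(Y) < 2 or not (Xb <= X and Yb <= Y):
        return False
    crossing = {(u, v) for u in X for v in adj[u] if v in Y}
    return crossing == {(u, v) for u in Xb for v in Yb}

# The 8-vertex graph: cycle 1-2-...-8-1 plus chords 2-6 and 3-7
adj8 = {i: set() for i in range(1, 9)}
for u, v in [(1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,1),(2,6),(3,7)]:
    adj8[u].add(v)
    adj8[v].add(u)

print(is_cut(adj8, {1, 2, 7, 8}, {3, 4, 5, 6}, {2, 7}, {3, 6}))  # True
```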
Fig. 4. The parse tree corresponding to the 3-expression of Fig. 3.
Note how the partition of V (G) by a cut (X : Y ) is reflected in a parse
tree of G, where G1 = G[X] and G2 = G[Y ]. This suggests the possibility of
an algorithm to examine every cut of G and to try to find a parse tree that
reflects that cut. There are a number of problems with this approach. First, the
number of cuts may grow exponentially with n. (In particular, consider the graph
consisting of (n−1)/2 P3 s that all share a common endpoint.) Fortunately, as we
will see later, we only need consider at most O(n) cuts. Secondly, we would need a
polynomial time algorithm to find such a set of O(n) cuts. In fact the algorithm
by Ma and Spinrad [MS94] does this for us. (Another approach for finding a
polynomial size set of cuts is described in [WWW].) For any of these cuts (say
cut (X : Y )) we can see if it corresponds to an appropriate decomposition by
solving 2–LAB(G : X̃ ∪ Ỹ , −).
We now present the formal specifications of our algorithm.
5 Formal Description
Our algorithm has the following outline: (Note that although the algorithm is
described as a recognition algorithm, it can easily be modified to produce a
3-expression which defines the input graph, if such an expression exists.)
Given graph J, use Modular Decomposition to find the prime graphs J1, . . . , Jk associated with J.
for i := 1 to k
    if ¬CWD3(Ji) then STOP (cwd(J) > 3)
STOP (cwd(J) ≤ 3)
function CWD3(G)
{This function is true iff prime graph G has cwd(G) ≤ 3.}
{First see if there is a parse tree with a final join with a 3-labeled connected component.}
for each x ∈ V (G)
    if 2-LAB(G : {x}, −) or 2-LAB(G : {z ∈ N (x) | N [z] ⊈ N [x]}, −) then
        return true
{Since there is no such parse tree, determine if there is a parse tree for which the final join corresponds to a cut.}
Produce a set of cuts {(X1 : Y1), (X2 : Y2), . . . , (Xl : Yl)} so that if there is a parse tree whose final join corresponds to a cut, there is one whose final join corresponds to a cut in this set (using e.g. Ma-Spinrad).
for i := 1 to l
    if 2-LAB(G : X̃i ∪ Ỹi, −) then
        return true
return false

6 Correctness and Complexity Issues
In this section we give more details regarding the sufficiency, for our purposes, of
the set of cuts determined by the Ma-Spinrad algorithm and then briefly discuss
the complexity of our algorithm.
As mentioned in Sect. 4, the number of cuts in a graph may grow exponentially with the size of the graph. We prove, however, that if none of the cuts
identified by the Ma-Spinrad algorithm show that cwd(G) ≤ 3 then no cut can
establish cwd(G) ≤ 3. In order to prove this we first introduce some notation.
We say that cut (X : Y ) is connected if both G[X] and G[Y ] are connected, 1-disconnected if exactly one of G[X] and G[Y ] is disconnected and 2-disconnected
if both G[X] and G[Y ] are disconnected. We say that two cuts (X : Y ) and
(W : Z) cross iff X ∩ W ≠ ∅, X ∩ Z ≠ ∅, Y ∩ W ≠ ∅ and Y ∩ Z ≠ ∅. We denote
by CT the set of cuts produced by the Ma-Spinrad algorithm. Recall that for
every cut (X : Y ) our algorithm calls 2-LAB(G : X̃ ∪ Ỹ , −) to check whether
this cut can establish cwd(G) ≤ 3. We denote it as the call to 2-LAB on behalf
of cut (X : Y ). Suppose all the calls to 2-LAB on behalf of the cuts in CT failed
and there is a cut (X : Y ) not in CT such that the call to 2-LAB on behalf
of (X : Y ) succeeds. We show that there is a cut (W : Z) in CT which crosses
(X : Y ). Furthermore we show that if (X : Y ) is connected then X̃ ∪ Ỹ = W̃ ∪ Z̃.
Thus the call to 2-LAB on behalf of cut (X : Y ) is the same as the call to 2-LAB
on behalf of cut (W : Z), a contradiction. Thus (X : Y ) is not connected. We
show that if (X : Y ) is 2-disconnected then (X : Y ) must be in CT , again a
contradiction. Thus (X : Y ) must be 1-disconnected. In this case we also reach a
contradiction, as described in the full version of the paper.
We now turn briefly to complexity issues. As shown in [CH94] modular decomposition can be performed in linear time. The Ma-Spinrad algorithm can be
implemented in O(n^2) time. Function 2-LAB is invoked O(n) times. As shown
in the journal version of the paper, the complexity of 2-LAB is O(mn); thus the
overall complexity of our algorithm is O(n^2 m).
There is one case in the 2–LAB procedure where we use a modular decomposition tree; thus, to achieve the best complexity, a linear modular decomposition algorithm is needed there. Up to now, no such algorithm is known that is also easy to implement. However, if a practical algorithm is sought, one can use an O(n + m log n) algorithm [HPV99]. The complexity of 2–LAB is then O(mn log n), and the overall complexity would be O(mn^2 log n).
7 Concluding Remarks
Having shown that the clique-width at most 3 problem is in P , the key open
problem is to determine whether the fixed clique-width problem is in P for
constants larger than 3. Even extending our algorithm to the case k = 4 is a nontrivial
and open problem. Although, to the best of our knowledge, it has not been
established yet, one fully expects the general clique-width decision problem to
be NP-complete.
Acknowledgments
D.G. Corneil and U. Rotics wish to thank the Natural Science and Engineering
Research Council of Canada for financial assistance.
References
[CER93] B. Courcelle, J. Engelfriet, and G. Rozenberg. Handle-rewriting hypergraph grammars. J. Comput. System Sci., 46:218-270, 1993.
[CH94] A. Cournier and M. Habib. A new linear algorithm for modular decomposition. Lecture Notes in Computer Science, 787:68-84, 1994.
[CMRa] B. Courcelle, J.A. Makowsky, and U. Rotics. Linear time solvable optimization problems on certain structured graph families, extended abstract. Graph Theoretic Concepts in Computer Science, 24th International Workshop, WG'98, volume 1517 of Lecture Notes in Computer Science, pages 1-16. Springer Verlag, 1998. Full paper to appear in Theory of Computing Systems.
[CMRb] B. Courcelle, J.A. Makowsky, and U. Rotics. On the fixed parameter complexity of graph enumeration problems definable in monadic second order logic. To appear in Disc. Appl. Math.
[CO99] B. Courcelle and S. Olariu. Upper bounds to the clique-width of graphs. To appear in Disc. Appl. Math. (http://dept-info.labri.u-bordeaux.fr/∼courcell/ActSci.html), 1999.
[Cun82] W.H. Cunningham. Decomposition of directed graphs. SIAM J. Algebraic Discrete Methods, 3:214-228, 1982.
[GR99] M. C. Golumbic and U. Rotics. On the clique-width of perfect graph classes (extended abstract). To appear in WG99, 1999.
[HPV99] M. Habib, C. Paul and L. Viennot. Partition refinement techniques: an interesting algorithmic tool kit. International Journal of Foundations of Computer Science, volume 10, 1999, 2:147–170.
[MR99] J.A. Makowsky and U. Rotics. On the clique-width of graphs with few P4's. To appear in the International Journal of Foundations of Computer Science (IJFCS), 1999.
[MS94] T. Ma and J. Spinrad. An O(n^2) algorithm for undirected split decomposition. Journal of Algorithms, 16:145-160, 1994.
[WWW] A complete description of the 2–LAB procedure, and also a complete description in French of a different approach to the entire algorithm. http://www.lirmm.fr/∼lanligne
On Dart-Free Perfectly Contractile Graphs⋆
Extended Abstract
Cláudia Linhares Sales1 and Frédéric Maffray2
1 DC-LIA, Bloco 910, CEP 60455-760, Campus do Pici, UFC, Fortaleza-CE, Brazil
2 CNRS, Laboratoire Leibniz, 46 avenue Félix Viallet, 38031 Grenoble Cedex, France
Abstract. The dart is the five-vertex graph with degrees 4, 3, 2, 2, 1.
An even pair is a pair of vertices such that every chordless path between
them has even length. A graph is perfectly contractile if every induced
subgraph has a sequence of even-pair contractions that leads to a clique.
We show that a recent conjecture on the forbidden structures for perfectly contractile graphs is satisfied in the case of dart-free graphs. Our
proof yields a polynomial-time algorithm to recognize dart-free perfectly
contractile graphs.
Keywords: Perfect graphs, even pairs, dart-free graphs, claw-free graphs
1 Introduction
A graph G is perfect [1] if every induced subgraph H of G has its chromatic
number χ(H) equal to the maximum size ω(H) of the cliques of H. One of
the most attractive properties of perfect graphs is that some problems that are
hard in general, such as optimal vertex-coloring and computing the maximum clique number,
can be solved in polynomial time in perfect graphs, thanks to the algorithm
of Grötschel, Lovász and Schrijver [7]. However, that algorithm, based on the
ellipsoid method, is quite impractical. So, an interesting open problem is to find
a combinatorially “simple” polynomial-time algorithm to color perfect graphs.
In such an algorithm, one may reasonably expect that some special structures
of perfect graphs will play an important role. An even pair in a graph G is
a pair of non-adjacent vertices such that every chordless path of G between
them has an even number of edges. The contraction of a pair of vertices x, y
in a graph G is the process of removing x and y and introducing a new vertex
adjacent to every neighbor of x or y in G. Fonlupt and Uhry [6] proved that
contracting an even pair in a perfect graph yields a new perfect graph with the
same maximum clique number. In consequence, a natural idea for coloring a
perfect graph G is, whenever it is possible, to find an even pair in G, to contract
it, and to repeat this procedure until a graph G′ that is easy to color is obtained.
By the result of Fonlupt and Uhry, that final graph G′ has the same maximum
clique size as G and (since it is perfect) the same chromatic number.
⋆ This research was partially supported by the cooperation between CAPES (Brazil) and COFECUB (France), project number 213/97. The first author is partially supported by CNPq-Brazil grant number 301330/97.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 135–144, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Each
vertex of G′ represents a stable set of G, so one can easily obtain an optimal
coloring of G from any optimal coloring of G′ . For many classical perfect graphs
one may expect the final graph to be a clique. Thus one may wonder whether
every perfect graph admits a sequence of even-pair contractions that leads to
a clique. Unfortunately, the answer to this question is negative (the smallest
counterexample is the complement of a 6-cycle).
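These two operations are easy to state concretely. The sketch below is our own naive Python illustration (not from the paper): it enumerates chordless paths by brute-force backtracking, so it is suitable only for small example graphs; recognizing even pairs in general graphs is much harder.

```python
def chordless_paths(adj, s, t):
    """Yield every chordless (induced) path from s to t, as a vertex list.
    Brute-force backtracking -- exponential, for small examples only."""
    def extend(path):
        u = path[-1]
        if u == t:
            yield list(path)
            return
        for v in sorted(adj[u]):
            # v must be new and non-adjacent to all path vertices except u
            if v not in path and all(v not in adj[w] for w in path[:-1]):
                path.append(v)
                yield from extend(path)
                path.pop()
    yield from extend([s])

def is_even_pair(adj, x, y):
    """x, y non-adjacent and every chordless x-y path has even length."""
    return y not in adj[x] and all(
        (len(p) - 1) % 2 == 0 for p in chordless_paths(adj, x, y))

def contract(adj, x, y):
    """Remove x and y; add one new vertex adjacent to N(x) ∪ N(y)."""
    z = max(adj) + 1
    new = {v: {w for w in nb if w not in (x, y)}
           for v, nb in adj.items() if v not in (x, y)}
    new[z] = (adj[x] | adj[y]) - {x, y}
    for v in new[z]:
        new[v].add(z)
    return new
```

For example, two opposite vertices of a 4-cycle form an even pair, while no pair of vertices of a 5-cycle (the smallest odd hole) does.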
Bertschi [2] proposes to call a graph G even contractile if it admits a sequence
of even-pair contractions leading to a clique, and perfectly contractile if every
induced subgraph of G is even contractile. The class of perfectly contractile
graphs contains many known classes of perfect graphs, such as Meyniel graphs,
weakly triangulated graphs, and perfectly orderable graphs, see [4].
Everett and Reed [5] have proposed a conjecture characterizing perfectly
contractile graphs. In order to present it, we need some technical definitions. A
hole is a chordless cycle of length at least five, and an antihole is the complement of
a hole. A hole or antihole is even (resp. odd) if it has an even (odd) number of
vertices. We denote by C̄6 the complement of a hole on six vertices.
Definition 1 (Stretcher). A stretcher is any graph that can be obtained by
subdividing the three edges of C̄6 that do not lie in a triangle in such a way
that the three chordless paths between the two triangles have the same parity. A
stretcher is odd (resp. even) if the three paths are odd (even) (see figure 1).
[Figure 1: An even stretcher (left) and an odd stretcher (right): two triangles joined by three chordless paths of the same parity.]
Conjecture 1 (Perfectly Contractile Graph Conjecture [5]). A graph is perfectly
contractile if and only if it contains no odd hole, no antihole, and no odd
stretcher.
Note that there is no even pair in an odd hole or in an antihole, but odd
stretchers may have even pairs. So, the ‘only if’ part of the conjecture is established if we can check that every sequence of even-pair contractions in an odd
stretcher leads to a graph that is not a clique; this is less obvious but was done
formally in [9].
The above conjecture has already been proved for planar graphs [9], for claw-free graphs [8] and for bull-free graphs [3].
Here, we are interested in the dart-free perfectly contractile graphs. Recall
that the dart is the graph on five vertices with degree sequence (4, 3, 2, 2, 1); in
other words, a dart is obtained from a 4-clique by removing one edge and adding
a new vertex adjacent to exactly one of the remaining vertices of degree three.
We will call tips of the dart its two vertices of degree two.
[Figure: A dart.]
A graph is dart-free if it does not contain a dart as an induced subgraph.
Dart-free graphs form a large class of interest in the realm of perfect graphs as
it contains all diamond-free graphs and all claw-free graphs. Dart-free graphs
were introduced by Chvátal, and Sun [11] proved that the Strong Perfect Graph
Conjecture is true for this class, that is, a dart-free graph is perfect if and only if it
contains no odd hole and no odd antihole. Chvátal, Fonlupt, Sun and Zemirline
[12] devised a polynomial-time algorithm to recognize dart-free graphs. On the
other hand, the problem of coloring the vertices of a dart-free perfect graph in
polynomial time using only simple combinatorial arguments remains open.
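Since the dart is the unique five-vertex graph with degree sequence (4, 3, 2, 2, 1), spotting an induced dart can be sketched by brute force over 5-subsets. This is our own hypothetical illustration; the polynomial-time recognition algorithm of [12] is far more refined.

```python
from itertools import combinations

def has_induced_dart(adj):
    """Check every 5-subset of vertices: the dart is the only 5-vertex
    graph with degree sequence (4, 3, 2, 2, 1), so comparing sorted
    degree sequences of the induced subgraph suffices."""
    for S in combinations(adj, 5):
        S = set(S)
        if sorted(len(adj[v] & S) for v in S) == [1, 2, 2, 3, 4]:
            return True
    return False
```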
We will prove that Everett and Reed’s conjecture on perfectly contractile
graphs is also true for dart-free graphs, that is:
Theorem 1 (Main Theorem). A dart-free graph is perfectly contractile if and
only if it contains no odd hole, no antihole, and no odd stretcher.
Moreover, we will present a polynomial-time combinatorial algorithm to color
optimally the perfectly contractile dart-free graphs. In order to prove our main
theorem, we will use the decomposition structure found by Chvátal, Fonlupt,
Sun and Zemirline [12]. It is presented in the next section.
We finish this section with some terminology and notation. We denote by
N (x) the subset of vertices of G to which x is adjacent. The complement of a
graph G is denoted by Ḡ. If {x, y} is an even pair of a graph G, the graph obtained
by the contraction of x and y is denoted by G/xy. It will be convenient here to
call two vertices x, y of a graph twins when they are adjacent and N (x) ∪ {x} =
N (y) ∪ {y} (the usual definition of twins does not necessarily require them to be
adjacent). A claw is a graph isomorphic to the complete bipartite graph K1,3 .
A double-claw is a graph with vertices u1 , u2 , u3 , v1 , v2 and edges v1 v2 and ui vj (1 ≤ i ≤ 3, 1 ≤ j ≤ 2).
[Figure: A claw (left) and a double-claw (right).]
Two twins x, y are called double-claw twins if they
are the vertices v1 , v2 in a double-claw as above. The join of two vertex-disjoint
graphs G1 = (V1 , E1 ) and G2 = (V2 , E2 ) is the graph G with vertex-set V1 ∪ V2
and edge-set E1 ∪ E2 ∪ F , where F is the set of all pairs made of one vertex of G1
and one vertex of G2 .
2 Decomposition of Dart-Free Perfect Graphs
We present here the main results from [12] and adopt the same terminology. We
call dart-free the algorithm from [12] to recognize dart-free perfect graphs.
When a graph G has a pair of twins x, y, Lovász’s famous Replication Lemma
[10] ensures that G is perfect if and only if G − x (or G − y) is perfect. So, the
initial step of algorithm dart-free is to remove one vertex among every pair of
twins in the graph. Dart-free graphs without twins have some special properties.
Definition 2 (Friendly graph [12]). A graph G is friendly if the neighborhood
N (x) of every vertex x of G that is the center of a claw induces vertex-disjoint
cliques.
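Definition 2 can be tested directly: a graph induces vertex-disjoint cliques exactly when it has no induced P3, so it suffices to find the claw centres and check their neighbourhoods for induced P3's. A naive Python sketch of ours (quadratic-per-vertex brute force, for illustration only):

```python
from itertools import combinations

def is_friendly(adj):
    """For every claw centre x (a vertex with three pairwise non-adjacent
    neighbours), G[N(x)] must be P3-free, i.e. vertex-disjoint cliques."""
    for x, nb in adj.items():
        claw_centre = any(
            b not in adj[a] and c not in adj[a] and c not in adj[b]
            for a, b, c in combinations(sorted(nb), 3))
        if not claw_centre:
            continue
        for v in nb:
            for u, w in combinations(sorted(adj[v] & nb), 2):
                if w not in adj[u]:       # induced P3: u - v - w
                    return False
    return True
```

Consistently with Theorem B, a claw itself is friendly, while a dart is not.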
Theorem A ([12]) Let G be a dart-free graph without twins. If G and Ḡ are
connected, then G is friendly.
Theorem B ([12]) A graph G is friendly if and only if it contains no dart and
no pair of double-claw twins.
Let G be a dart-free graph. Let W be the subset of all vertices of G that have
at least one twin, and let T be a subset of W such that every pair of twins of
G has at least one of them in T . Using Theorem A, one can find in polynomial
time a family F of pairwise vertex-disjoint friendly graphs such that: (a) the
elements of F are induced subgraphs of G − T , and (b) G is perfect if and only
if every element of F is perfect. This family can be constructed as follows: first,
put G − T in F; then, as long as there exists an element H of F such that either
H or H̄ is disconnected, replace in F the graph H by its connected components
(if H is disconnected) or by the complements of the connected components of H̄
(if H̄ is disconnected). In consequence, the problem of deciding whether a dart-free graph is perfect is reduced to deciding whether a friendly graph is perfect
or not. For this purpose, friendly graphs are decomposed further.
Definition 3 (Bat [12]). A bat is any graph that can be formed by a chordless
path a1 a2 · · · am (m ≥ 6) and an additional vertex z adjacent to a1 , ai , ai+1 and
am for some i with 3 ≤ i ≤ m − 3 and to no other vertex of the path. A graph G
is bat-free if it does not contain any bat as an induced subgraph.
Given a graph G and a vertex z, a z-edge is any edge whose endpoints are
both adjacent to z. The graph obtained from G by removing
vertex z and all z-edges is denoted by G ∗ z.
Definition 4 (Rosette [12]). A graph G is said to have a rosette centered at
a vertex z of G if G ∗ z is disconnected and the neighborhood of z consists of
vertex-disjoint cliques.
Theorem C ([12]) Every friendly graph G containing no odd hole either is
bat-free, or has a clique-cutset, or has a rosette.
Definition 5 (Separator [12]). A separator S is a cutset with at most two
vertices such that, if S has two non-adjacent vertices, each component of G − S
has at least two vertices.
Theorem D ([12]) Every bat-free friendly graph G containing no odd hole either is bipartite, or is claw-free, or has a separator.
A decomposition of G along special cutsets can be defined as follows:
– Clique-cutset decomposition: Let C be a clique-cutset of G and let B1 , . . .,
Bk be the connected components of G − C. The graph G is decomposed
into the pieces of G with respect to C, which are the induced subgraphs
Gi = G[Bi ∪ C] (i = 1, . . . , k).
– Rosette decomposition: Consider a rosette centered at a vertex z of G, and
let B1 , . . . Bk (k ≥ 2) be the connected components of G ∗ z. The graph
G is decomposed into k + 1 graphs G1 , . . . , Gk , H defined as follows. For
i = 1, . . . , k, the graph Gi is G[Bi ∪ {z}]. The graph H is formed from
G[N (z)] by adding vertices w1 , . . . , wk and edges from wi to all of N (z) ∩ Bi
(i = 1, . . . , k).
– Separator decomposition: When S is a separator of size one or two with its
two vertices adjacent, S is a clique-cutset and the decomposition is as above.
When S = {u, v} is a separator of G with u, v non-adjacent, let B1 , . . . , Bk
be the components of G − S, and let P be a chordless path between u and
v in G. The graph G is decomposed into k graphs G1 , . . . , Gk defined as
follows. If P is even, Gi is obtained from G[Bi ∪ S] by adding one vertex wi
with edges to u and v. If P is odd, set Gi = G[Bi ∪ S] + uv.
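As an illustration of the clique-cutset step, here is a brute-force search of ours for a clique whose removal disconnects the graph. It is exponential and meant only for small examples; polynomial-time algorithms exist (e.g. Tarjan's clique-separator decomposition).

```python
from itertools import combinations

def find_clique_cutset(adj):
    """Return a clique whose removal disconnects the graph, or None.
    Exhaustive search over vertex subsets -- illustration only."""
    def connected(verts):
        if not verts:
            return True
        seen, stack = set(), [next(iter(verts))]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v] & verts)
        return seen == verts
    vs = list(adj)
    for k in range(1, len(vs) - 1):
        for C in combinations(vs, k):
            is_clique = all(v in adj[u] for u, v in combinations(C, 2))
            if is_clique and not connected(set(vs) - set(C)):
                return set(C)
    return None
```

For example, in the "bowtie" (two triangles sharing a vertex) the shared vertex is a clique cutset, whereas a complete graph has none.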
Algorithm Bat-free builds a decomposition tree T of a friendly graph G. At
the initial step, G is the root and the only node of the tree. At the general step,
let G′ be any node of T . If G′ can be decomposed by one of the special cutsets
(clique or rosette), then add in T , as children of G′ , the graphs into which it is
decomposed. More precisely, the clique-cutset decomposition is applied first, if
possible; the rosette decomposition is applied only if the clique-cutset decomposition cannot be applied. Since each leaf H of T is friendly and has no clique-cutset
and no rosettes, Theorem C ensures that either H is bat-free or G was not perfect. So, the second phase of the algorithm examines the leaves of T : each leaf H
of T must either be bipartite, or be claw-free, or contain a separator, or else G is
not perfect, by Theorem D. If H contains a separator, a separator decomposition
is applied. When no separator decomposition is possible, G is perfect if and only
if all the remaining leaves of T are either bipartite or claw-free.
3 Dart-Free Perfectly Contractile Graphs
This section is dedicated to the proof of Theorem 1. In this proof, we will use the
decomposition of dart-free perfect graphs obtained by the bat-free algorithm.
We organize this section following the steps of the decomposition. First, we
examine the friendly graphs.
Theorem 2. A friendly graph G is perfectly contractile if and only if it contains
no odd stretcher, no antihole and no odd hole.
Proof. As observed at the beginning, no perfectly contractile graph can contain
an odd hole, an antihole, or an odd stretcher. Conversely, suppose that G has
no odd stretcher, no antihole and no odd hole as induced subgraph, and let us
prove by induction on the number of vertices of G that G is perfectly contractile.
The fact is trivially true when G has at most six vertices. In the general case,
by Theorem C, G either is bat-free or has a clique-cutset or a rosette. We are
going to check each of these possibilities. The following lemmas prove Theorem 2
respectively when G has a clique-cutset and when G has a rosette. Their proofs
are omitted and will appear in the full version of the paper.
Lemma 1. Let G be a friendly graph, with no odd hole, no odd stretcher and
no antihole. If G has a clique cutset, then G is perfectly contractile.
Lemma 2. Let G be a friendly graph with no odd hole, no antihole and no odd
stretcher. If G has a rosette centered at some vertex z, then:
(i) Every piece of G with respect to z is perfectly contractile; and
(ii) G is perfectly contractile if every piece of G has a sequence of even-pair
contractions leading to a clique such that each graph g in this sequence either is
dart-free or contains a dart whose tips form an even pair of g.
Lemma 3. Let G be a friendly graph with no odd hole, no antihole and no odd
stretcher. If G is bat-free, then G is perfectly contractile.
For this lemma, we prove that: (a) If G has a separator S, then G is perfectly
contractile; (b) If G is bipartite or claw-free, G is perfectly contractile. These
facts imply Lemma 3.
The following lemmas will ensure the existence of a sequence of even-pair
contractions whose intermediate graphs are dart-free.
Lemma 4. Every bipartite graph admits a sequence of even-pair contractions
that leads to a clique and whose intermediate graphs are dart-free.
Lemma 5. Every claw-free graph admits a sequence of even-pair contractions
that leads to a clique and whose intermediate graphs either are dart-free or have
a dart whose tips form an even pair.
Lemma 4 is trivial; the proof of Lemma 5 is based on the study of even pairs in
claw-free graphs that was done in [8].
Lemmas 4 and 5 imply that every friendly graph that contains no odd hole,
no antihole and no odd stretcher admits a sequence of even-pair contractions
whose intermediate graphs are dart-free. Therefore, Lemmas 1, 2 and 3 together
imply Theorem 2.
3.1 An Algorithm
Now we give the outline of an even-pair contraction algorithm for a friendly
graph G without odd holes, antiholes or odd stretchers. The algorithm has two
main steps: constructing the decomposition tree, then contracting even pairs in
a bottom-up way along the tree.
In the first step, the algorithm uses a queue Q that initially contains only G.
While Q is not empty, a graph G′ of Q is dequeued and the following sequence of
steps is executed at the same time that a decomposition tree T is being built:
1. If G′ has a clique-cutset C, put the pieces of G′ with respect to C in Q;
repeat the first step.
2. If G′ has a rosette centered at z, put in Q all the pieces of G′ with respect
to the rosette, except H; repeat the first step.
3. If G′ has a separator {a, b} and {a, b} forms an even pair, contract a and b,
put G′ /ab in Q; repeat the first step. If {a, b} forms an odd pair, put G′ + ab
in Q; repeat the first step.
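The queue-driven first phase can be sketched generically. In the sketch below (ours, not the paper's), the three cutset-finding helpers are hypothetical callables standing in for the tests described above: each returns the list of pieces produced by its decomposition, or None when the corresponding cutset does not exist.

```python
from collections import deque

def first_phase(G, by_clique_cutset, by_rosette, by_separator):
    """Dequeue a graph, try the decompositions in order of priority,
    enqueue the pieces; graphs that decompose no further become the
    leaves examined in the second phase."""
    Q, leaves = deque([G]), []
    while Q:
        Gp = Q.popleft()
        for rule in (by_clique_cutset, by_rosette, by_separator):
            pieces = rule(Gp)
            if pieces:
                Q.extend(pieces)
                break
        else:
            leaves.append(Gp)
    return leaves
```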
The second step examines the tree decomposition T in a bottom-up way. For
each leaf G′ of T , we have:
1. If G′ is a bipartite graph, then a sequence of even-pair contractions that
turns G′ into a K2 is easily obtained.
2. If G′ is a claw-free graph, then a sequence of even-pair contractions that
turns G′ into a clique can be obtained by applying the algorithm described
in [8].
Now, since every leaf is a clique, we can glue the leaves by the cutsets that
produced them, following the tree decomposition. Three cases appear:
1. Suppose that G1 , . . . , Gk are the pieces produced by a separator {a, b}. Since
G1 , . . . , Gk are cliques, glueing G1 , . . . , Gk by {a, b} (more exactly, by vertices
that correspond to a and b) will produce a graph with a clique-cutset and
such that every piece is a clique. This is a special kind of triangulated graph,
in which a sequence of even-pair contractions leading to a clique can easily
be obtained.
2. Suppose that G1 , . . . , Gk are the pieces produced by a rosette and let G′ be
the graph obtained by glueing all these pieces (which are cliques) according
to the rosette. Let G′′ be the graph obtained from G′ by removing (i) every
vertex that sees all the other vertices, and (ii) every vertex whose neighbours
form a clique; it is easy to check that any sequence of even-pair contractions
for G′′ yields a sequence of even-pair contractions for G′ . Moreover, we can
prove that G′′ is friendly and bat-free; so it must either have a separator
or be a bipartite graph or a claw-free graph. Thus, an additional step of
decomposition and contraction will give us the desired sequence of even-pair
contractions for G′′ .
3. Suppose that G1 , . . . , Gk are the pieces produced by a clique-cutset. Again,
the graph G′ obtained by glueing these pieces along the clique-cutset is a
triangulated graph for which a sequence of even-pair contractions that turns
it into a clique can be easily obtained.
Finally, we can obtain a sequence of even-pair contractions for G by concatenating all the sequences mentioned above.
Proof of Theorem 1
Let G be a dart-free graph with no odd hole, no antihole and no odd stretcher.
If G has no twins and G and Ḡ are connected, then G is friendly and so, by
Theorem 2, perfectly contractile. If Ḡ is disconnected then we can show that G
has a very special structure which is easy to treat separately. If G is disconnected,
it is sufficient to argue for each component of G. Hence we are reduced to the
case where G is connected and is obtained from a friendly graph by replication
(making twins). We conduct this proof by induction and along three steps. First,
we modify slightly the construction of family F described in section 2. As we
have seen, F was obtained from a dart-free graph without twins. Unfortunately,
twins cannot be bypassed easily in the question of perfect contractibility (Note:
it would follow from Everett and Reed’s conjecture that replication preserves
perfect contractibility; but no proof of this fact is known). However, by Theorem
B, we need only remove double-claw twins from a dart-free graph G to be able
to construct a family of friendly graphs from G. It is not hard to see that if
a dart-free graph G such that G and Ḡ are connected contains a double claw,
then it must contain double-claw twins (see the proof of Theorem A in [12]). So
Theorem A can be reformulated as follows:
Theorem E ([12]) Let G be a dart-free graph without double-claw twins. If G
and Ḡ are connected, then G is friendly.
Therefore, instead of removing all the twins of a dart-free graph G, we can afford
to remove only the double-claw twins, as follows: initialize G′ = G and T = ∅; as
long as G′ has a pair of double-claw twins x, y, set G′ = G′ − x and T = T + x.
Observe that if G contains no odd stretcher, no antihole and no odd hole, then
so does G′ . Now let the family F ′ be obtained from G′ just like F was obtained
in the previous section.
The second step is to examine each friendly graph F of F ′ , to add back
its twins and to prove that it is perfectly contractile. Since G′ contains no odd
stretcher, no antihole and no odd hole, so does F ; hence, by Theorem 2, F is
perfectly contractile. Denote by TF the set of double-claw twins of F , and by
F + TF the graph obtained by adding back the twins in F . Since F is friendly,
we can consider the tree decomposition of F . Five cases appear:
(1) F contains a clique-cutset. Then F + TF contains a clique-cutset C.
Every piece of F + TF with respect to C is an induced subgraph of G, and so
it is perfectly contractile. Moreover, clearly, each even pair of F + TF is an even
pair of the whole graph.
(2) F contains a rosette (centered at a vertex z). Suppose first that z has
no twins in TF . Then F + TF also contains a rosette centered at z, and the
proof works as usual. Now suppose that z has a set of twins T (z) ⊆ TF . We
can generalize the rosette in the following way: remove the vertices z + T (z) and all
the z-edges, and construct the pieces G1 , . . . , Gk as before, except that z + T (z)
(instead of z alone) lies in every piece. Each piece is an induced subgraph of
G, so it is perfectly contractile. Moreover we can prove that each even pair in a
piece is an even pair of the whole graph. The desired result can then be obtained
as in the twin-free case above.
(3) F has a separator {a, b}. Let A = {a, a1 , . . . , al } and B = {b, b1 , . . . , br }
(r ≥ l ≥ 0) be the sets of twins of a and b respectively. If {a, b} is an even pair,
then we do the following sequence of contractions: {a, b}, {a1 , b1 }, . . . , {al , bl }.
A lemma (whose proof is omitted here) ensures that this is a valid sequence of
even-pair contractions. The result is a graph with a clique-cutset C ∪ R, where
C consists of the l contracted vertices and R is made of the r − l remaining
vertices of B. For each piece G1 of this graph with respect to C, the graph
G1 − R is isomorphic to a piece of F/ab with vertex ab replicated l times. This
fact, together with the fact that all the pieces of F with respect to {a, b} are
perfectly contractile, and with the induction hypothesis, implies that F + TF is
perfectly contractile. If {a, b} is an odd pair, a different type of modification of
the construction of the pieces is introduced; we skip this subcase for the sake of
brevity.
(4) F is bipartite. Then the vertices of F + TF can be divided into two sides
such that every connected component of each side is a clique and every clique
from one side sees all or none of the vertices of any clique on the other side. It
is easy to check directly that such a graph is perfectly contractile.
(5) F is claw-free. Then F + TF is claw-free. So F + TF , as F , is perfectly
contractile.
Lemma 6. If F is a perfectly contractile friendly graph, then F +TF is perfectly
contractile.
The third and last step of the proof of Theorem 1 is the following lemma:
Lemma 7. A dart-free graph G′ without double-claw twins is perfectly contractile if and only if every friendly graph H of F ′ is perfectly contractile.
Finally, given a dart-free graph G that contains no odd hole, no antihole
and no odd stretcher, we obtain a graph G′ that contains no double-claw twins and
is decomposable into friendly graphs. By Theorem 2 these graphs are perfectly
contractile, and by Lemma 6, adding the twins back to these graphs preserves
their perfect contractibility. So, the modified family F ′ is a set of perfectly
contractile graphs. By Lemma 7, G is perfectly contractile, and the proof of
Theorem 1 is now complete.
4 Conclusion
The many positive results gathered in the past few years about Conjecture 1
(see [4]) motivate us to believe strongly in its validity and to continue our study
of this conjecture.
References
1. C. Berge. Les problèmes de coloration en théorie des graphes. Publ. Inst. Stat.
Univ. Paris, 9:123–160, 1960.
2. M. Bertschi. Perfectly contractile graphs. J. Comb. Theory, Series B, 50:222–230,
1990.
3. C.M.H. de Figueiredo, F. Maffray, and O. Porto. On the structure of bull-free
perfect graphs. Graphs and Combinatorics, 13:31–55, 1997.
4. H. Everett, C.M.H. de Figueiredo, C. Linhares Sales, F. Maffray, O. Porto, and
B. Reed. Path parity and perfection. Disc. Math., 165/166:233–252, 1997.
5. H. Everett and B.A. Reed. Problem session on path parity. DIMACS Workshop
on Perfect Graphs, Princeton, NJ, June 1993.
6. J. Fonlupt and J.P. Uhry. Transformations which preserve perfectness and h-perfectness of graphs. “Bonn Workshop on Combinatorial Optimization 1981”,
Ann. Disc. Math., 16:83–95, 1982.
7. M. Grötschel, L. Lovász, and A. Schrijver. Polynomial algorithms for perfect
graphs. “Topics on Perfect Graphs”, Ann. Disc. Math., 21:325–356, 1984.
8. C. Linhares-Sales and F. Maffray. Even pairs in claw-free perfect graphs. J. Comb.
Theory, Series B, 74:169–191, 1998.
9. C. Linhares-Sales, F. Maffray, and B.A. Reed. On planar perfectly contractile
graphs. Graphs and Combinatorics, 13:167–187, 1997.
10. L. Lovász. Normal hypergraphs and perfect graphs. Disc. Math., 2:253–267, 1972.
11. L. Sun. Two classes of perfect graphs. J. Comb. Theory, Series B, 53:273–292,
1991.
12. V. Chvátal, J. Fonlupt, L. Sun, and A. Zemirline. Recognizing dart-free perfect
graphs. Tech. Report 92778, Institut für Ökonometrie und Operations Research,
Rheinische Friedrich Wilhelms Universität, Bonn, Germany, 1992.
Edge Colouring Reduced Indifference Graphs
Celina M.H. de Figueiredo¹, Célia Picinin de Mello², and Carmen Ortiz³
¹ Instituto de Matemática, Universidade Federal do Rio de Janeiro, Brazil. celina@cos.ufrj.br
² Instituto de Computação, Universidade Estadual de Campinas, Brazil. celia@dcc.unicamp.br
³ Escuela de Ingeniería Industrial, Universidad Adolfo Ibañez, Chile. cortiz@uai.cl
Abstract. The chromatic index problem – finding the minimum number
of colours required for colouring the edges of a graph – is still unsolved
for indifference graphs, whose vertices can be linearly ordered so that
the vertices contained in the same maximal clique are consecutive in
this order. Two adjacent vertices are twins if they belong to the same
maximal cliques. A graph is reduced if it contains no pair of twin vertices. A graph is overfull if the total number of edges is greater than the
product of the maximum degree and ⌊n/2⌋, where n is the number of vertices. We give a structural characterization for neighbourhood-overfull
indifference graphs proving that a reduced indifference graph cannot be
neighbourhood-overfull. We show that the chromatic index for all reduced indifference graphs is the maximum degree.
1 Introduction
In this paper, G denotes a simple, undirected, finite, connected graph. The sets
V (G) and E(G) are the vertex and edge sets of G. Denote |V (G)| by n and
|E(G)| by m. A graph with just one vertex is called trivial. A clique is a set of
vertices pairwise adjacent in G. A maximal clique of G is a clique not properly
contained in any other clique. A subgraph of G is a graph H with V (H) ⊆ V (G)
and E(H) ⊆ E(G). For X ⊆ V (G), denote by G[X] the subgraph induced by X,
that is, V (G[X]) = X and E(G[X]) consists of those edges of E(G) having both
ends in X. For Y ⊆ E(G), the subgraph induced by Y is the subgraph of G whose
vertex set is the set of endpoints of edges in Y and whose edge set is Y ; this
subgraph is denoted by G[Y ]. The notation G \ Y denotes the subgraph of G
with V (G \ Y ) = V (G) and E(G \ Y ) = E(G) \ Y . A graph G is H-free if G
does not contain an isomorphic copy of H as an induced subgraph. Denote by
Cn the chordless cycle on n vertices and by 2K2 the complement of the chordless
cycle C4 . A matching M of G is a set of pairwise non-adjacent edges of G. A
matching M of G covers a set of vertices X of G when each vertex of X is
incident to some edge of M . The graph G[M ] is also called a matching.
For each vertex v of a graph G, the adjacency AdjG (v) of v is the set of
vertices that are adjacent to v. The degree of a vertex v is deg(v) = | AdjG (v)|.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 145–153, 2000.
© Springer-Verlag Berlin Heidelberg 2000
The maximum degree of a graph G is then ∆(G) = maxv∈V (G) deg(v). We use
the simplified notation ∆ when there is no ambiguity. We call ∆-vertex a vertex
with maximum degree. The set N [v] denotes the neighbourhood of v, that is,
N [v] = AdjG (v) ∪ {v}. A subgraph induced by the neighbourhood of a vertex
is simply called a neighbourhood. We call ∆-neighbourhood the neighbourhood
of a ∆-vertex. Two vertices v and w are twins when N [v] = N [w]. Equivalently,
two vertices are twins when they belong to the same set of maximal cliques. A
graph is reduced if it contains no pair of twin vertices. The reduced graph G′ of
a graph G is the graph obtained from G by collapsing each set of twins into a
single vertex and removing possible resulting parallel edges and loops.
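Computing the reduced graph is straightforward: group vertices by their closed neighbourhood N[v] (twins share it) and keep one representative per class. A small Python sketch of ours:

```python
def reduced_graph(adj):
    """Collapse each class of twins (vertices with equal closed
    neighbourhood N[v]) into one representative vertex; the resulting
    parallel edges and loops disappear automatically."""
    closed = {v: frozenset(adj[v] | {v}) for v in adj}
    classes = {}
    for v, nb in closed.items():
        classes.setdefault(nb, []).append(v)
    rep = {v: min(cls) for cls in classes.values() for v in cls}
    new = {r: set() for r in set(rep.values())}
    for v, nbrs in adj.items():
        for w in nbrs:
            if rep[v] != rep[w]:
                new[rep[v]].add(rep[w])
    return new
```

For instance, a complete graph collapses to a single vertex (all its vertices are twins), while a path is already reduced.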
The chromatic index χ′ (G) of a graph G is the minimum number of colours
needed to colour the edges of G such that no adjacent edges get the same colour.
A celebrated theorem by Vizing [12, 10] states that χ′ (G) is always ∆ or ∆ + 1.
Graphs with χ′ (G) = ∆ are said to be in Class 1; graphs with χ′ (G) = ∆ + 1
are said to be in Class 2. A graph G satisfying the inequality m > ∆(G)⌊n/2⌋
is said to be an overfull graph [8]. A graph G is subgraph-overfull [8] when it has
an overfull subgraph H with ∆(H) = ∆(G). When the overfull subgraph H can
be chosen to be a neighbourhood, we say that G is neighbourhood-overfull [4].
Overfull, subgraph-overfull, and neighbourhood-overfull graphs are in Class 2.
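The overfull condition is a one-line arithmetic check; this is our illustration of the inequality above:

```python
def is_overfull(adj):
    """Overfull: m > Δ(G) * floor(n/2); possible only when n is odd,
    since for even n every graph has m <= Δ(G) * n/2."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    delta = max(len(nb) for nb in adj.values())
    return m > delta * (n // 2)
```

For example, K5 (n = 5, m = 10, Δ = 4) is overfull since 10 > 4 · 2, whereas K4 (m = 6 = 3 · 2) is not.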
It is well known that the recognition problem for the set of graphs in Class 1
is NP-complete [9]. The problem remains NP-complete for several classes, including comparability graphs [1]. On the other hand, the problem remains unsolved
for indifference graphs: graphs whose vertices can be linearly ordered so that the
vertices contained in the same maximal clique are consecutive in this order [11].
We call such an order an indifference order. Given an indifference graph, for
each maximal clique A, we call maximal edge an edge whose endpoints are the
first and the last vertices of A with respect to an indifference order. Indifference
graphs form an important subclass of interval graphs: they are also called unitary
interval graphs or proper interval graphs. The reduced graph of an indifference
graph is an indifference graph with a unique indifference order (except for its
reverse). This uniqueness property was used to describe solutions for the recognition problem and for the isomorphism problem for the class of indifference
graphs [2].
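A linear order is an indifference order exactly when it has the classical "umbrella" property of proper interval orders: if u < v < w in the order and uw is an edge, then uv and vw are edges too. A Python sketch of ours for this test:

```python
def is_indifference_order(order, adj):
    """Umbrella test: every edge must dominate all vertices lying
    between its endpoints in the order."""
    pos = {v: i for i, v in enumerate(order)}
    for u in order:
        for w in adj[u]:
            if pos[u] < pos[w]:
                for v in order[pos[u] + 1:pos[w]]:
                    if v not in adj[u] or v not in adj[w]:
                        return False
    return True
```

A path ordered along itself passes the test; no ordering of a claw does, consistent with claws not being indifference graphs.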
It has been shown that every odd maximum degree indifference graph is
in Class 1 [4] and that every subgraph-overfull indifference graph is in fact
neighbourhood-overfull [3]. It has been conjectured that every Class 2 indifference graph is neighbourhood-overfull [4, 3]. Note that the validity of this conjecture implies that the edge-colouring problem for indifference graphs is in P.
The goal of this paper is to investigate this conjecture by giving another
positive evidence for its validity. We describe a structural characterization for
neighbourhood-overfull indifference graphs. This structural characterization implies that no reduced indifference graph is neighbourhood-overfull. We prove
that all reduced indifference graphs are in Class 1 by exhibiting an edge colouring with ∆ colours for every indifference graph with no twin ∆-vertices. In order
to construct such an edge colouring with ∆ colours, we decompose an arbitrary
Edge Colouring Reduced Indifference Graphs
147
indifference graph with no twin ∆-vertices into two indifference graphs: a matching covering all ∆-vertices and an odd maximum degree indifference graph.
The characterization for neighbourhood-overfull indifference graphs is described in Section 2. The decomposition and the edge colouring of indifference
graphs with no twin ∆-vertices is in Section 3. Our conclusions are in Section 4.
2 Neighbourhood-Overfull Indifference Graphs
In this section we study the overfull ∆-neighbourhoods of an indifference graph.
Since it is known that every odd maximum degree indifference graph is in
Class 1 [4], an odd maximum degree indifference graph contains no overfull
∆-neighbourhoods. We consider the case of even maximum degree indifference
graphs. A nontrivial complete graph with even maximum degree ∆ is always
an overfull ∆-neighbourhood. We characterize the structure of an overfull ∆-neighbourhood obtained from a complete graph by removal of a set of edges.
Theorem 1. Let K∆+1 be a complete graph with even maximum degree ∆. Let
F = K∆+1 \ R, where R is a nonempty subset of edges of K∆+1 . Then, the
graph F is an overfull indifference graph with maximum degree ∆ if and only if
H = G[R] is a 2K2 -free bipartite graph with at most ∆/2 − 1 edges.
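Before turning to the proof, the counting part of Theorem 1 can be checked mechanically on small instances. The following Python sketch is our illustration, not part of the paper; it builds F = K∆+1 \ R and tests only the overfullness and maximum-degree conditions (the indifference and 2K2-free conditions are checked by hand in the examples), using the standard criterion that a graph is overfull iff |E| > ∆⌊|V|/2⌋:

```python
# Sanity check for Theorem 1 (illustration only): with Delta = 4 we have
# K_{Delta+1} = K5, and we remove a set R of edges.

from itertools import combinations

def is_overfull(n, m, delta):
    """A graph is overfull iff |E| > Delta * floor(|V|/2)."""
    return m > delta * (n // 2)

def check_F(delta, removed):
    """Build F = K_{delta+1} minus `removed`; return (max degree, overfull?)."""
    n = delta + 1
    edges = set(frozenset(e) for e in combinations(range(n), 2))
    edges -= set(frozenset(e) for e in removed)
    deg = {v: sum(1 for e in edges if v in e) for v in range(n)}
    return max(deg.values()), is_overfull(n, len(edges), delta)

# R = one edge: H = G[R] is a single edge (trivially 2K2-free bipartite) and
# |R| = 1 <= Delta/2 - 1, so F must be overfull with maximum degree 4.
print(check_F(4, [(0, 1)]))                 # (4, True)
# R = two disjoint edges: H = G[R] is a 2K2, and indeed F is not overfull.
print(check_F(4, [(0, 1), (2, 3)]))         # (4, False)
```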
The proof of Theorem 1 is divided into two lemmas.
Lemma 1. Let K∆+1 be a complete graph with even maximum degree ∆. Let R
be a nonempty subset of edges of K∆+1 . If F = K∆+1 \ R is an overfull indifference graph with maximum degree ∆, then H = G[R] is a 2K2 -free bipartite
graph and |R| < ∆/2.
Proof. Let R be a nonempty subset of edges of K∆+1 , a complete graph with
even maximum degree ∆. Let F = K∆+1 \ R be an overfull indifference graph
with maximum degree ∆. Note that ∆ > 2.
Because F is an overfull graph, |V (F )| is odd and there are at most ∆/2 − 1
missing edges joining vertices of F . Hence |R| < ∆/2.
Suppose, by contradiction, that the graph H = G[R] contains a 2K2 as an
induced subgraph. Then F contains a chordless cycle C4 as an induced subgraph,
a contradiction to F being an indifference graph. Since H is 2K2 -free,
H contains no chordless cycle Ck , k ≥ 6.
We show that H contains neither a C5 nor a C3 as an induced
subgraph. Assume the contrary. If H contains a C5 as an induced subgraph, then
F contains a C5 as an induced subgraph, since C5 is a self-complementary graph,
a contradiction to F being an indifference graph. If H contains a C3 as an induced
subgraph, then F contains a K1,3 as an induced subgraph, since, by hypothesis,
F has at least one vertex of degree ∆, a contradiction to F being an indifference
⊓
⊔
graph. Therefore, H is a bipartite graph, without 2K2 , and |R| < ∆/2.
148
C.M.H. de Figueiredo et al.
Lemma 2. Let K∆+1 be a complete graph with even maximum degree ∆. If
H = G[R] is a 2K2 -free bipartite graph induced by a nonempty set R of edges
of size |R| < ∆/2, then F = K∆+1 \ R is an overfull indifference graph with
maximum degree ∆.
Proof. By definition of F and because |R| < ∆/2, we have that F is an overfull
graph and it has vertices of degree ∆. We shall prove that F is an indifference
graph by exhibiting an indifference order on the vertex set V (F ) of F .
Since H = G[R] is a 2K2 -free bipartite graph, H is connected, with a unique
bipartition of its vertex set into sets X and Y . Now, label the vertices x1 , x2 , . . . , xk
of X and the vertices y1 , y2 , . . . , yℓ of Y according to a degree ordering: labels
correspond to vertices in nonincreasing vertex degree order, i.e., deg(x1 ) ≥
deg(x2 ) ≥ · · · ≥ deg(xk ) and deg(y1 ) ≥ deg(y2 ) ≥ · · · ≥ deg(yℓ ), respectively.
This degree ordering induces the following properties on
the adjacency of each vertex of X and Y :
– The adjacency of a vertex of H defines an interval on the degree order, i.e.,
AdjH (xi ) = {yj : 1 ≤ p ≤ j ≤ p + q ≤ ℓ} and AdjH (yj ) = {xi : 1 ≤ r ≤ i ≤
r + s ≤ k}.
Indeed, let a be a vertex of H such that AdjH (a) is not an interval. Then
AdjH (a) has at least two vertices b and d, and there is a vertex c between
b and d in the degree order such that ac ∉ R. Without loss of generality,
suppose that deg(b) ≥ deg(c) ≥ deg(d). Since deg(c) ≥ deg(d), there is a
vertex e such that e is adjacent to c but is not adjacent to d. It follows that
the edges ec and ad induce a 2K2 in H, a contradiction.
– The adjacency-sets of the vertices of H are ordered with respect to set inclusion according to the following containment property: AdjH (x1 ) ⊇ AdjH (x2 )
⊇ · · · ⊇ AdjH (xk ) and AdjH (y1 ) ⊇ AdjH (y2 ) ⊇ · · · ⊇ AdjH (yℓ ).
For, suppose there are a and b in X with deg(a) ≥ deg(b) and AdjH (a) ⊉
AdjH (b). Hence, there are vertices c and d such that c is adjacent to a but
not to b, and d is adjacent to b but not to a. The edges ac and bd induce a
2K2 in H, a contradiction.
– x1 y1 is a dominating edge of H, i.e., every vertex of H is adjacent to x1 or
to y1 .
This is a direct consequence of the two properties above.
When H is not a complete bipartite graph, let i and j be the smallest indices
of vertices of X and Y , respectively, such that xi yj is not an edge of H. Note
that, because x1 y1 is a dominating edge of H, we have i and j greater than 1.
Define the following partition of V (H):
A := {x1 , x2 , . . . , xi−1 };
S := {xi , xi+1 , . . . , xk };
B := {y1 , y2 , . . . , yj−1 };
T := {yj , yj+1 , . . . , yℓ }.
Note that S and T can be empty sets and that the graph induced by A ∪ B
is a complete bipartite subgraph of H.
Now we describe a total ordering on V (F ) as follows. We shall prove that
this total ordering gives the desired indifference order.
– First, list the vertices of X = A ∪ S as x1 , x2 , . . . , xk ;
– Next, list all the ∆-vertices of F , v1 , v2 , . . . , vs ;
– Finally, list the vertices of Y = T ∪ B as yℓ , yℓ−1 , . . . , y1 .
The ordering within the sets A, S, D, T , B, where D denotes the set of the
∆-vertices of F , is induced by the ordering of V (F ).
By the containment property of the adjacency-sets of the vertices of H and
because each adjacency defines an interval, the consecutive vertices of the same
degree are twins in F . Hence, it is enough to show that the ordering induced on
the reduced graph F ′ of F is an indifference order.
For simplicity, we use the same notation for vertices of F and F ′ , i.e., we
call vertices in F ′ corresponding to vertices of X by x′1 , x′2 , . . . , x′k′ , and we
call vertices corresponding to vertices of Y by y1′ , y2′ , . . . , yℓ′ ′ . Note that the set
D contains only one representative vertex in F ′ , and we denote this unique
representative vertex by v1′ .
By definition of F ′ , x′1 v1′ , x′2 yℓ′ ′ , . . . , x′i−1 yj+1′ , x′i yj′ , . . . , x′k′ y2′ , v1′ y1′ are edges
of F ′ . Since vertex v1′ is a representative vertex of a ∆-vertex of F , it is also a
∆-vertex of F ′ . Thus v1′ is adjacent to each vertex of F ′ . Each edge listed above,
distinct from x′1 v1′ and v1′ y1′ , has form x′p yq′ . We want to show that x′p is adjacent
to all vertices from x′p+1 up to yq′ with respect to the order. For suppose, without
loss of generality, that x′p is not adjacent to some vertex z between x′p and yq′
with respect to the order. Now by the definition of the graphs F and H, every
edge of K∆+1 not in F belongs to graph H. Since H is a bipartite graph, with
bipartition of its vertex set into sets X and Y , we have in F all edges linking
vertices in X, and so we have z ≠ xs , p ≤ s ≤ k. Vertex z is also distinct from
ys , q ≤ s ≤ ℓ, by the properties of the adjacency in H. Hence, x′p is adjacent to
all vertices from x′p+1 up to yq′ with respect to the order. It follows that each
edge listed above defines a maximal clique of F ′ . Hence, this ordering satisfies
the property that vertices belonging to the same maximal clique are consecutive
and we conclude that this ordering on V (F ′ ) is the desired indifference order.
This conclusion completes the proofs of both Lemma 2 and Theorem 1. ⊓⊔
Corollary 1. Let G be an indifference graph. A ∆-neighbourhood of G with at
most ∆/2 vertices of maximum degree is not neighbourhood-overfull.
Proof. Let F be a ∆-neighbourhood of G with at most ∆/2 vertices of degree ∆.
If ∆ is odd, then F is not neighbourhood-overfull. If ∆ is even, then we use the
notation of Lemma 1 and Lemma 2. The hypothesis implies |X|+|Y | ≥ (∆/2)+1.
Since vertex x1 misses every vertex of Y and since vertex y1 misses every vertex
of X, there are at least |X| + |Y | − 1 missing edges having as endpoints x1 or y1 .
Hence, there are at least |X| + |Y | − 1 ≥ ∆/2 missing edges in F , and F cannot
be neighbourhood-overfull. ⊓⊔
Corollary 2. An indifference graph with no twin ∆-vertices is not neighbourhood-overfull.
Proof. Let G be an indifference graph with no twin ∆-vertices. The hypothesis
implies that every ∆-neighbourhood F of G contains precisely one vertex of
degree ∆. Now Corollary 1 says F is not neighbourhood-overfull and therefore
G itself cannot be neighbourhood-overfull. ⊓⊔
Corollary 3. A reduced indifference graph is not neighbourhood-overfull. ⊓⊔

3 Reduced Indifference Graphs
We have established in Corollary 2 of Section 2 that an indifference graph with
no twin ∆-vertices is not neighbourhood-overfull, a necessary condition for an
indifference graph with no twin ∆-vertices to be in Class 1. In this section, we
prove that every indifference graph with no twin ∆-vertices is in Class 1. We
exhibit a ∆-edge colouring for an even maximum degree indifference graph with
no twin ∆-vertices. Since every odd maximum degree indifference graph is in
Class 1, this result implies that all indifference graphs with no twin ∆-vertices,
and in particular that all reduced indifference graphs are in Class 1.
Let E1 , . . . , Ek be a partition of the edge set of a graph G. It is clear that if
the subgraphs G[Ei ], 1 ≤ i ≤ k, satisfy ∆(G) = Σi ∆(G[Ei ]) and if, for each i,
G[Ei ] is in Class 1, then G is also in Class 1. We apply this decomposition
technique to our given indifference graph with even maximum degree and no
twin ∆-vertices.
We partition the edge set of an indifference graph G with even maximum
degree ∆ and with no twin ∆-vertices into two sets E1 and E2 , such that
G1 = G[E1 ] is an odd maximum degree indifference graph and G2 = G[E2 ]
is a matching.
Let G be an indifference graph and v1 , v2 , . . . , vn an indifference order for G.
By definition, an edge vi vj is maximal if there does not exist another edge vk vℓ
with k ≤ i and j ≤ ℓ. Note that an edge vi vj is maximal if and only if the edges
vi−1 vj and vi vj+1 do not exist. In addition, every maximal edge vi vj defines
a maximal clique having vi as its first vertex and vj as its last vertex. Thus,
every vertex is incident to zero, one, or two maximal edges. Moreover, given an
indifference graph with an indifference order and an edge that is maximal with
respect to this order, the removal of this edge gives a smaller indifference graph:
the original indifference order is an indifference order for the smaller indifference
graph.
Based on Lemma 3 below, we shall formulate an algorithm for choosing a
matching of an indifference graph with no twin ∆-vertices that covers every
∆-vertex of G.
Lemma 3. Let G be a nontrivial graph. If G is an indifference graph with no
twin ∆-vertices, then every ∆-vertex of G is incident to precisely two maximal
edges.
Proof. Let G be a nontrivial indifference graph without twin ∆-vertices and let
v be a ∆-vertex of G. Consider v1 , v2 , . . . , vn an indifference order for G. Because
v is a ∆-vertex of G and G is not a clique, we have v = vj , with j ≠ 1, n. Let vi
and vk be the leftmost and the rightmost vertices with respect to the indifference
order that are adjacent to vj , respectively. Suppose that vi vj is not a maximal
edge. Then vi−1 vj or vi vj+1 is an edge in G. The existence of vi−1 vj contradicts
vi being the leftmost neighbour of vj . Because vj and vj+1 are not twins, the
existence of vi vj+1 implies deg(vj+1 ) ≥ ∆ + 1, a contradiction. Analogously, we
have that vj vk is also a maximal edge. ⊓⊔
We now describe an algorithm for choosing a set of maximal edges that covers
all ∆-vertices of G.
Input: an indifference graph G with no twin ∆-vertices, together with an indifference order
v1 , . . . , vn of G.
Output: a set of edges M that covers all ∆-vertices of G.
1. For each ∆-vertex of G, say vj , in the indifference order, put in a set E the
edge vi vj , where vi is its leftmost neighbour with respect to the indifference
order. Each component of the graph G[E] is a path. (Each component H of
G[E] has ∆(H) ≤ 2 and none of the components is a cycle, by the maximality
of the chosen edges.)
2. For each path component P of G[E], number its edges with consecutive
integers starting from 1. If a path component Pi contains an odd number
of edges, then form a matching Mi of G[E] choosing the edges numbered by
odd integers. If a path component Pj contains an even number of edges, then
form a matching Mj choosing the edges numbered by even integers.
3. The desired set of edges M is the union ∪k Mk .
We claim that the matching M defined above covers all ∆-vertices of G. For,
if a path component of G[E] contains an odd number of edges, then M covers all
of its vertices. If a path component of G[E] contains an even number of edges,
then the only vertex not covered by M is the first vertex of this path component.
However, by definition of G[E], this vertex is not a ∆-vertex of G.
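The algorithm above is short enough to implement directly. The following Python sketch is our illustration, not part of the paper; the encoding of G as an indifference order plus adjacency sets, and all identifier names, are assumptions of the example:

```python
# Direct implementation of the three-step matching algorithm (illustration).

def cover_delta_vertices(order, adj):
    """Return a matching M of maximal edges covering every Delta-vertex."""
    pos = {v: i for i, v in enumerate(order)}
    delta = max(len(adj[v]) for v in order)
    # Step 1: for each Delta-vertex, take the edge to its leftmost neighbour.
    E = set()
    for v in order:
        if len(adj[v]) == delta:
            u = min(adj[v], key=pos.get)
            E.add(frozenset((u, v)))
    # Adjacency inside G[E]; by Lemma 3 every component is a path.
    nbr = {v: set() for v in order}
    for e in E:
        a, b = tuple(e)
        nbr[a].add(b)
        nbr[b].add(a)
    # Step 2: walk each path from its leftmost endpoint, numbering edges
    # 1, 2, ...; keep odd-numbered edges if the path has an odd number of
    # edges, even-numbered ones otherwise.  Step 3: take the union.
    M, seen = set(), set()
    for start in order:
        if start in seen or len(nbr[start]) != 1:
            continue
        seen.add(start)
        path_edges, prev, cur = [], None, start
        while True:
            nxt = [w for w in nbr[cur] if w != prev]
            if not nxt:
                break
            path_edges.append(frozenset((cur, nxt[0])))
            prev, cur = cur, nxt[0]
            seen.add(cur)
        offset = 0 if len(path_edges) % 2 == 1 else 1
        M.update(path_edges[offset::2])
    return M

# The path P4 with indifference order 0,1,2,3: Delta = 2, Delta-vertices 1, 2.
print(cover_delta_vertices([0, 1, 2, 3],
                           {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}))
```

In the P4 example, G[E] is the path 0–1–2 with an even number of edges, so the even-numbered edge {1, 2} is chosen, covering both ∆-vertices.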
Theorem 2. If G is an indifference graph with no twin ∆-vertices, then G is
in Class 1.
Proof. Let G be an indifference graph with no twin ∆-vertices. If G has odd
maximum degree, then G is in Class 1 [4].
Suppose that G is an even maximum degree graph. Let v1 , . . . , vn be an
indifference order of G. Use the algorithm described above to find a matching M
for G that covers all ∆-vertices of G. The graph G \ M is an indifference graph
with odd maximum degree because the vertex sets of G and G \ M are the
same and the indifference order of G is also an indifference order for G \ M .
Moreover, since M is a matching that covers all ∆-vertices of G, we have that
∆(G \ M ) = ∆ − 1 is odd. Hence, the edges of G \ M can be coloured with
∆ − 1 colours and one additional colour is needed to colour the edges in the
matching M . This implies that G is in Class 1. ⊓⊔
Corollary 4. All reduced indifference graphs are in Class 1. ⊓⊔

4 Conclusions
We believe our work makes a contribution to the problem of edge-colouring
indifference graphs in three respects.
First, our results on the colouring of indifference graphs show that, in all
cases we have studied, neighbourhood-overfullness is equivalent to being Class 2,
which gives positive evidence to the conjecture that for any indifference graph
neighbourhood-overfullness is equivalent to being Class 2. It would be interesting
to extend these results to larger classes. We established recently [5] that every
odd maximum degree dually chordal graph is Class 1. This result shows that our
techniques are extendable to other classes of graphs.
Second, our results apply to a subclass of indifference graphs defined recently
in the context of clique graphs. A graph G is a minimum indifference graph if
G is a reduced indifference graph and, for some indifference order of G, every
vertex of G is the first or the last element of a maximal clique of G [6]. Given two
distinct minimum indifference graphs, their clique graphs are also distinct [6].
This property is not true for general indifference graphs [7]. Note that our results apply to minimum indifference graphs: no minimum indifference graph is
neighbourhood-overfull, every minimum indifference graph is in Class 1, and we
can edge-colour any minimum indifference graph with ∆ colours.
Third, and perhaps more important, the decomposition techniques we use to
show these results are new and proved to be simple but powerful tools.
Acknowledgments. We are grateful to João Meidanis for many insightful and
inspiring discussions on edge colouring. We thank Marisa Gutierrez for introducing us to the class of minimum indifference graphs. This work was partially
supported by Pronex/FINEP, CNPq, CAPES, FAPERJ and FAPESP, Brazilian
research agencies. This work was done while the first author was visiting IASI,
the Istituto di Analisi dei Sistemi ed Informatica, with financial support from
CAPES grant AEX0147/99-0. The second author is visiting IASI on leave from
IC/Unicamp with financial support from FAPESP grant 98/13454-8.
References
1. L. Cai and J. A. Ellis. NP-completeness of edge-colouring some restricted graphs.
Discrete Appl. Math., 30:15–27, 1991.
2. C. M. H. de Figueiredo, J. Meidanis, and C. P. de Mello. A linear-time algorithm
for proper interval graph recognition. Inform. Process. Lett., 56:179–184, 1995.
3. C. M. H. de Figueiredo, J. Meidanis, and C. P. de Mello. Local conditions for
edge-coloring. Technical report DCC 17/95, UNICAMP, 1995. To appear in J.
Combin. Math. Combin. Comput. 31 (1999).
4. C. M. H. de Figueiredo, J. Meidanis, and C. P. de Mello. On edge-colouring
indifference graphs. Theoret. Comput. Sci., 181:91–106, 1997.
5. C. M. H. de Figueiredo, J. Meidanis, and C. P. de Mello. Total-chromatic number
and chromatic index of dually chordal graphs. Inform. Process. Lett., 70:147–152,
1999.
6. M. Gutierrez and L. Oubiña. Minimum proper interval graphs. Discrete Math.,
142:77–85, 1995.
7. B. Hedman. Clique graphs of time graphs. J. Combin. Theory Ser. B, 37:270–278,
1984.
8. A. J. W. Hilton. Two conjectures on edge-colouring. Discrete Math., 74:61–64,
1989.
9. I. Holyer. The NP-completeness of edge-coloring. SIAM J. Comput., 10:718–720,
1981.
10. J. Misra and D. Gries. A constructive proof of Vizing’s theorem. Inform. Process.
Lett., 41:131–133, 1992.
11. F. S. Roberts. On the compatibility between a graph and a simple order. J.
Combin. Theory Ser. B, 11:28–38, 1971.
12. V. G. Vizing. On an estimate of the chromatic class of a p-graph. Diskrete Analiz.,
3:25–30, 1964. In Russian.
Two Conjectures on the Chromatic Polynomial
David Avis1 , Caterina De Simone2 , and Paolo Nobili3
1 School of Computer Science, McGill University, 3480 University Street, Montreal, Canada, H3A2A7. avis@cs.mcgill.ca†
2 Istituto di Analisi dei Sistemi ed Informatica (IASI), CNR, Viale Manzoni 30, 00185 Rome, Italy. desimone@iasi.rm.cnr.it
3 Dipartimento di Matematica, Università di Lecce, Via Arnesano, 73100 Lecce, Italy; and IASI-CNR. nobili@iasi.rm.cnr.it
Abstract. We propose two conjectures on the chromatic polynomial
of a graph and show their validity for several classes of graphs. Our
conjectures are stronger than an older conjecture of Bartels and Welsh
[1].
Keywords: Vertex colorings, chromatic polynomials of graphs
The goal of this paper is to propose two conjectures on the chromatic polynomial
of a graph and prove them for several classes of graphs. Our conjectures are
stronger than a conjecture of Bartels and Welsh [1] that was recently proved by
Dong [2].
Let G be a graph. The chromatic polynomial of G counts the colourings
of the vertices of G in which no two adjacent vertices get the same colour. If we
denote by ck (G) the number of ways to colour the vertices of G with exactly k
colours, then the chromatic polynomial of G is:
P (G, λ) =
n
X
ck (G)(λ)k ,
k=1
where (λ)k = λk k!.
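For experimenting with the bounds that follow, P (G, λ) can be evaluated with the classical deletion-contraction recurrence P (G) = P (G − e) − P (G/e). This evaluator is a standard tool, our own illustration rather than anything taken from the paper:

```python
# Deletion-contraction evaluator for the chromatic polynomial (illustration).

def chromatic_poly(vertices, edges, lam):
    """Evaluate P(G, lam); exponential time, so small graphs only."""
    vertices = frozenset(vertices)
    edges = sorted(set((min(a, b), max(a, b)) for a, b in edges))
    if not edges:
        return lam ** len(vertices)
    (u, v), rest = edges[0], edges[1:]
    # P(G) = P(G - uv) - P(G/uv): delete the edge, minus contract it.
    deleted = chromatic_poly(vertices, rest, lam)
    merged = set()
    for a, b in rest:        # contract uv: replace v by u, drop loops
        a = u if a == v else a
        b = u if b == v else b
        if a != b:
            merged.add((min(a, b), max(a, b)))
    contracted = chromatic_poly(vertices - {v}, merged, lam)
    return deleted - contracted

# P(K3, lam) = lam (lam - 1)(lam - 2) = (lam)_3, i.e. c_3(K3) = 1.
print(chromatic_poly({0, 1, 2}, [(0, 1), (0, 2), (1, 2)], 3))              # 6
# P(C4, lam) = (lam - 1)^4 + (lam - 1)
print(chromatic_poly({0, 1, 2, 3}, [(0, 1), (1, 2), (2, 3), (0, 3)], 3))   # 18
```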
Let ω(G) denote the clique number of G (maximum number of pairwise
adjacent vertices) and let χ(G) denote the chromatic number of G (minimum
number of colours used in a colouring).
We propose the following two conjectures on P (G, λ):
Conjecture 1
P (G, λ − 1)/P (G, λ) ≤ ((λ − χ(G))/λ) · ((λ − 1)/λ)^{n−χ(G)}  ∀λ ≥ n. (1)

Conjecture 2
P (G, λ − 1)/P (G, λ) ≤ ((λ − ω(G))/λ) · ((λ − 1)/λ)^{n−ω(G)}  ∀λ ≥ n. (2)
† This research was performed while the author was visiting IASI.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 154–162, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Conjectures 1 and 2 are related to a conjecture of Bartels and Welsh [1],
known as the Shameful Conjecture:

The Shameful Conjecture: For every graph G with n vertices,
P (G, n − 1)/P (G, n) ≤ ((n − 1)/n)^{n} .
The Shameful Conjecture was recently proved by Dong [2], who showed that
for every connected graph G with n vertices,
P (G, λ − 1)/P (G, λ) ≤ ((λ − 2)/λ) · ((λ − 2)/(λ − 1))^{n−2}  ∀λ ≥ n. (3)
What relates Conjectures 1 and 2 to Bartels and Welsh's conjecture is that
both are stronger than it. To show this, let us first prove the following
easy inequality:
(m − k)/m ≤ ((m − 1)/m)^{k}  for every two integers m, k with m ≥ k ≥ 0. (4)
The validity of (4) comes immediately from the following identity and
inequality (the inequality is strict if k > 1):
(m − k)/m = ∏_{i=1}^{k} (m − i)/(m − i + 1) ≤ ((m − 1)/m)^{k} ,
since each factor (m − i)/(m − i + 1) is at most (m − 1)/m.
Now, (4) immediately implies that Conjecture 1 is stronger than Conjecture 2
(write (4) with m = λ − ω(G) and k = χ(G) − ω(G)). To see that Conjecture 2
implies the Shameful Conjecture, write (2) with λ = n, that is:
P (G, n − 1)/P (G, n) ≤ ((n − ω(G))/n) · ((n − 1)/n)^{n−ω(G)} ,
and apply inequality (4) with m = n and k = ω(G).
Moreover, inequality (3) is not stronger than inequalities (1) and (2): in fact,
it is easy to show that for every graph G with n vertices and 2ω(G) ≥ n + 2, the
right hand side of inequality (2) is smaller than the right hand side of inequality
(3).
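As a small numerical illustration (ours, not from the paper), Conjecture 2 can be checked on the 5-cycle C5 , which is series-parallel and hence covered by Theorem 5 below. Its chromatic polynomial has the well-known closed form P (C5 , λ) = (λ − 1)^5 − (λ − 1), with n = 5 and ω(C5 ) = 2:

```python
# Numerical check of Conjecture 2 on C5 (verification script, illustration).

def P_C5(lam):
    """Closed form of the chromatic polynomial of the 5-cycle."""
    return (lam - 1) ** 5 - (lam - 1)

n, omega = 5, 2
ok = all(
    P_C5(lam - 1) / P_C5(lam)
    <= ((lam - omega) / lam) * ((lam - 1) / lam) ** (n - omega)
    for lam in range(n, 30)
)
print(ok)   # True
```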
The two upper bounds given in our conjectures can be considered as interpolations
between the respective ratios for the empty graphs On (graphs with n
vertices and no edges) and the complete graphs Kn (graphs with n vertices and
all edges), for which the conjectured bounds are clearly tight. Their strength
allowed us to define operations on graphs that maintain the validity of the conjectured bounds. In particular, we prove the validity of our conjectures for several
classes of graphs and then use these classes of graphs as building blocks to enlarge
the class of graphs for which our conjectures are true.
In [1], the concept of the mean colour number of a graph G with n vertices
was introduced as
µ(G) = n (1 − P (G, n − 1)/P (G, n)).
Since the Shameful Conjecture is true, it immediately yields a bound for µ(G),
namely
µ(G) ≥ n (1 − ((n − 1)/n)^{n} ),
which is tight only for the graph G = On . If Conjecture 2 were true, then we
could get a better bound for µ(G), namely
µ(G) ≥ n (1 − ((n − ω(G))/n) · ((n − 1)/n)^{n−ω(G)} ),
which is tight when G is the disjoint union of a clique and a stable set.
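A quick computation (ours) on C5 (n = 5, ω = 2, P (C5 , λ) = (λ − 1)^5 − (λ − 1)) shows how µ(G) compares with the two bounds, and that the Conjecture-2 bound is the sharper of the two:

```python
# Mean colour number of C5 versus the two lower bounds (illustration).

def P_C5(lam):
    return (lam - 1) ** 5 - (lam - 1)

n, omega = 5, 2
mu = n * (1 - P_C5(n - 1) / P_C5(n))
shameful = n * (1 - ((n - 1) / n) ** n)
conj2 = n * (1 - ((n - omega) / n) * ((n - 1) / n) ** (n - omega))
print(round(mu, 4), round(shameful, 4), round(conj2, 4))
# mu exceeds both bounds, and the Conjecture-2 bound dominates the other
assert shameful <= conj2 <= mu
```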
The next four theorems will give operations for building families of graphs
which satisfy Conjecture 1 (or Conjecture 2) from the basic graph O1 .
For this purpose, we first need some notation and definitions. Let G
be a graph with n vertices; if G is a tree then we shall denote it by Tn , and if G
is a cycle then we shall denote it by Cn .
A universal vertex in a graph is a vertex which is adjacent to all other vertices.
A clique-cutset in a graph G is a clique whose removal from G disconnects the
graph. If G has induced subgraphs G1 and G2 such that G = G1 ∪ G2 and
G1 ∩ G2 = Kt (for some t), then we say that G arises from G1 and G2 by clique
identification (see Figure 1). Clearly, if G arises by clique identification from two
other graphs, then G has a clique-cutset (namely, the clique Kt ).
Fig. 1. Clique cutset Kt (G is formed from G1 − Kt , Kt , and G2 − Kt )
A graph is chordal (or triangulated) if it contains no induced cycles other
than triangles. It is well known that a graph is chordal if and only if it can be
constructed recursively by clique identifications, starting from complete graphs.
Let uv be an edge of a graph G. By G|uv we denote the graph obtained from
G by contracting the edge uv into a new vertex which becomes adjacent to all
Two Conjectures on the Chromatic Polynomial
157
the former neighbours of u and v. We say that G is contractable to a graph F
if G contains a subgraph that becomes F after a series of edge contractions and
edge deletions. A graph is series-parallel if it is not contractable to K4 .
Finally, let uv be an edge of a graph G. Subdividing the edge uv means to
delete uv and add a new vertex x which is adjacent to only u and v. It is well
known that a series-parallel multigraph can be constructed recursively from a
K2 by the operations of subdividing and of doubling edges.
Now we are ready to prove our results.
Theorem 1 (Disjoint union) Let H be a graph obtained from two graphs G1
and G2 by disjoint union. If both G1 and G2 satisfy Conjecture 1 then H also
satisfies Conjecture 1.
Proof. Assume that H has n vertices. Let ni denote the number of vertices of
Gi (i = 1, 2) and let λ ≥ n (= n1 + n2 ). Assume that χ(H) = χ(G2 ) ≥ χ(G1 ).
Since
P (H, λ) = P (G1 , λ)P (G2 , λ),
we have
P (H, λ − 1)/P (H, λ) = (P (G1 , λ − 1)/P (G1 , λ)) · (P (G2 , λ − 1)/P (G2 , λ)).
Since both G1 and G2 satisfy Conjecture 1, we have
P (H, λ − 1)/P (H, λ) ≤ ((λ − χ(G1 ))/λ) · ((λ − χ(G2 ))/λ) · ((λ − 1)/λ)^{n−χ(G1 )−χ(G2 )} .
But (4) implies that
((λ − χ(G1 ))/λ) · ((λ − 1)/λ)^{−χ(G1 )} ≤ 1,
and so we are done.
Theorem 2 (Add a universal vertex) Let H be a graph obtained from some
graph G by adding a universal vertex. If G satisfies Conjecture 1 then H also
satisfies Conjecture 1.
Proof. Assume that H has n vertices. Write χ = χ(H) = χ(G) + 1 and let
λ ≥ n. Since P (H, λ) = λP (G, λ − 1), we have:
P (H, λ − 1)/P (H, λ) = ((λ − 1)/λ) · (P (G, λ − 2)/P (G, λ − 1)).
But then, since G satisfies Conjecture 1,
P (H, λ − 1)/P (H, λ) ≤ ((λ − 1)/λ) · ((λ − χ)/(λ − 1)) · ((λ − 2)/(λ − 1))^{n−χ} ,
and so we are done because (λ − 2)/(λ − 1) < (λ − 1)/λ.
Theorem 3 (Clique identification) Let H be a graph obtained from two graphs
G1 and G2 by clique identification. If both G1 and G2 satisfy Conjecture 1 then
H also satisfies Conjecture 1.
Proof. Set χ = χ(H). Without loss of generality, we can assume that χ(G2 ) ≥
χ(G1 ), and so χ = χ(G2 ). Let ni denote the number of vertices of Gi (i = 1, 2)
and let G1 ∩ G2 = Kt . Clearly, H has n = n1 + n2 − t vertices. Now, let λ ≥ n.
Since
P (H, λ) = P (G1 , λ)P (G2 , λ)/P (Kt , λ) = P (G1 , λ)P (G2 , λ)/(λ)t ,
we have
P (H, λ − 1)/P (H, λ) = (λ/(λ − t)) · (P (G1 , λ − 1)/P (G1 , λ)) · (P (G2 , λ − 1)/P (G2 , λ)).
Since both G1 and G2 satisfy (1), we have
P (H, λ − 1)/P (H, λ) ≤ (λ/(λ − t)) · ((λ − χ(G1 ))/λ) · ((λ − χ(G2 ))/λ) · ((λ − 1)/λ)^{n1 +n2 −χ(G1 )−χ(G2 )} ,
that is
P (H, λ − 1)/P (H, λ) ≤ ((λ − χ(G1 ))/(λ − t)) · ((λ − 1)/λ)^{t−χ(G1 )} · ((λ − χ)/λ) · ((λ − 1)/λ)^{n−χ} .
Hence, to prove the theorem, we only need show that
((λ − χ(G1 ))/(λ − t)) · ((λ − 1)/λ)^{t−χ(G1 )} ≤ 1,
that is
((λ − 1)/λ)^{χ(G1 )−t} ≥ (λ − χ(G1 ))/(λ − t).
Now, since χ(G1 ) ≥ t, (4) (with m = λ and k = χ(G1 ) − t) implies that
((λ − 1)/λ)^{χ(G1 )−t} ≥ (λ − χ(G1 ) + t)/λ.
But
(λ − χ(G1 ) + t)/λ ≥ (λ − χ(G1 ))/(λ − t),
and so we are done.
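The identity P (H, λ) = P (G1 , λ)P (G2 , λ)/(λ)t that drives this proof can be verified on the smallest interesting case: two triangles glued along an edge (t = 2), which yields K4 minus an edge, whose chromatic polynomial is λ(λ − 1)(λ − 2)^2 . The script below is our illustration:

```python
# Check of the clique-identification identity on two glued triangles.

def falling(lam, t):
    """The falling factorial (lam)_t = lam (lam-1) ... (lam-t+1)."""
    out = 1
    for i in range(t):
        out *= lam - i
    return out

def P_K3(lam):
    return lam * (lam - 1) * (lam - 2)

def P_diamond(lam):               # K4 minus an edge
    return lam * (lam - 1) * (lam - 2) ** 2

for lam in range(3, 10):
    assert P_diamond(lam) == P_K3(lam) * P_K3(lam) // falling(lam, 2)
print("identity checked for lam = 3..9")
```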
Theorem 4 (Edge subdivision) Let G be a graph with n vertices, let uv be an
edge of G, let r be a positive integer, and let H be the graph obtained from G by
deleting edge uv and by adding the new vertices x1 , · · · , xr and connecting each
of them to both u and v (see Figure 2). If the following two properties hold
(a) min{dG (u), dG (v)} ≤ (n + r + 1)/2,
(b) both G and G|uv satisfy Conjecture 2,
then the graph H also satisfies Conjecture 2.
Fig. 2. Subdivide edge uv (H arises from G by replacing the edge uv with new vertices x1 , . . . , xr , each adjacent to both u and v)
Before proving Theorem 4, we need the following technical lemma.
Lemma 1 Let x and r be two integers with x > r ≥ 1. Then
((x − 1)^{r} (x + 1)^{r+1} − x^{2r+1}) / (x[(x − 1)^{r} x^{r} − (x + 1)^{r} (x − 2)^{r} ]) < x/2.
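Lemma 1 can also be checked numerically over a range of admissible pairs (x, r); the brute-force script below is our own verification aid, not a substitute for a proof:

```python
# Brute-force numerical check of Lemma 1 for 1 <= r < x (illustration).

def lhs(x, r):
    """Left-hand side of the inequality in Lemma 1."""
    num = (x - 1) ** r * (x + 1) ** (r + 1) - x ** (2 * r + 1)
    den = x * ((x - 1) ** r * x ** r - (x + 1) ** r * (x - 2) ** r)
    return num / den

ok = all(lhs(x, r) < x / 2 for r in range(1, 12) for x in range(r + 1, 40))
print(ok)   # True
```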
Proof (of Theorem 4).
Write G′′ = G|uv , ω = ω(H), ω ′ = ω(G), and ω ′′ = ω(G′′ ). Since
P (H, λ) = P (H + uv, λ) + P (H|uv , λ) = (λ − 2)^{r} P (G, λ) + (λ − 1)^{r} P (G′′ , λ),
we have
P (H, λ − 1)/P (H, λ) = ((λ − 3)/(λ − 2))^{r} · (P (G, λ − 1)/P (G, λ)) · α + ((λ − 2)/(λ − 1))^{r} · (P (G′′ , λ − 1)/P (G′′ , λ)) · (1 − α),
where
α = (λ − 2)^{r} P (G, λ) / [(λ − 2)^{r} P (G, λ) + (λ − 1)^{r} P (G′′ , λ)].
To prove the theorem we have to show that, for every λ ≥ n + r,
P (H, λ − 1)/P (H, λ) ≤ ((λ − ω)/λ) · ((λ − 1)/λ)^{n+r−ω} .
For this purpose, write
R = ((λ − 3)/(λ − 2))^{r} , S = ((λ − 2)/(λ − 1))^{r} .
Since by assumption both G and G′′ satisfy Conjecture 2, we have
P (H, λ − 1)/P (H, λ) ≤ Rα · ((λ − ω ′ )/λ) · ((λ − 1)/λ)^{n−ω′} + S(1 − α) · ((λ − ω ′′ )/λ) · ((λ − 1)/λ)^{n−1−ω′′} .
Hence, to prove the theorem we only need show that
Rα · ((λ − ω ′ )/(λ − ω)) · (λ/(λ − 1))^{s′} + S(1 − α) · ((λ − ω ′′ )/(λ − ω)) · (λ/(λ − 1))^{s′′} ≤ 1, (4)
where s′ = r + ω ′ − ω and s′′ = r + ω ′′ − ω + 1.
For this purpose, first note that either ω ′ = ω or ω ′ = ω + 1, and that either
ω ′′ = ω or ω ′′ = ω + 1. But, since
((λ − ω ∗ )/(λ − ω)) · (λ/(λ − 1))^{ω∗−ω} ≤ 1,
where ω ∗ = ω ′ or ω ∗ = ω ′′ , it follows that we only need show the validity of (4)
in the case ω ′ = ω ′′ = ω, that is
((λ − 3)/(λ − 2))^{r} (λ/(λ − 1))^{r} α + ((λ − 2)/(λ − 1))^{r} (λ/(λ − 1))^{r+1} (1 − α) ≤ 1. (5)
Now, inequality (5) is equivalent to the following:
[((λ − 3)/(λ − 2))^{r} (λ/(λ − 1))^{r} − ((λ − 2)/(λ − 1))^{r} (λ/(λ − 1))^{r+1} ] α ≤ 1 − ((λ − 2)/(λ − 1))^{r} (λ/(λ − 1))^{r+1} .
Since the coefficient of α in this inequality is strictly negative, we can divide
both sides by this term and simplify to get the equivalent inequality:
α ≥ ((λ − 2)/λ)^{r} · ((λ − 2)^{r} λ^{r+1} − (λ − 1)^{2r+1}) / ((λ − 2)^{2r} λ − (λ − 3)^{r} (λ − 1)^{r+1}).
Replacing the expression for α, we have
P (G, λ)/P (G′′ , λ) ≥ ((λ − 2)^{r} λ^{r+1} − (λ − 1)^{2r+1}) / ((λ − 1)[(λ − 2)^{r} (λ − 1)^{r} − λ^{r} (λ − 3)^{r} ]). (6)
Hence, in order to prove the theorem, we only need show that (6) holds. Now,
Lemma 1 (applied with x = λ − 1) implies that
((λ − 2)^{r} λ^{r+1} − (λ − 1)^{2r+1}) / ((λ − 1)[(λ − 2)^{r} (λ − 1)^{r} − λ^{r} (λ − 3)^{r} ]) ≤ (λ − 1)/2.
Hence it is sufficient to show that for every λ ≥ n + r,
P (G, λ)/P (G′′ , λ) ≥ (λ − 1)/2.
For this purpose, consider any λ-colouring of the graph G′′ . Since G′′ has fewer
than λ vertices, this colouring can be extended to a λ-colouring of the graph G
as follows: give to vertex u (respectively, v) the same colour as that given to the
vertex in G′′ arising from contracting uv, and give to vertex v (respectively, u)
any of the λ − dG (u) (respectively, λ − dG (v)) colours not used by the neighbours
of vertex u (respectively, v). In other words,
P (G, λ) ≥ P (G′′ , λ) · (λ − min{dG (u), dG (v)}).
Now, by assumption, min{dG (u), dG (v)} ≤ (n + r + 1)/2, and so, since λ ≥ n + r,
λ − min{dG (u), dG (v)} ≥ λ − (n + r + 1)/2 ≥ (λ − 1)/2,
and we are done. The theorem follows.
The previous four theorems give operations for building families of graphs
which satisfy Conjecture 1 (or Conjecture 2) from the basic graph O1 .
The following corollary follows immediately from Theorems 2 and 3:
Corollary 1 Every chordal graph satisfies Conjecture 1.
In particular, the empty graphs On and the trees Tn satisfy Conjecture 1.
Moreover, Theorems 3 and 4 can be used to prove the following result:
Theorem 5 Every series-parallel graph satisfies Conjecture 2.
Proof. Let H be a series-parallel graph with m vertices. If m is small then the
theorem is obviously true. Hence, we can assume that every series-parallel graph
with fewer vertices than H verifies Conjecture 2. Moreover, we can assume that
H has no clique-cutset: for otherwise H would arise as clique-identification from
two other series-parallel graphs and so we could apply Theorem 3.
Now, by definition, H comes from some other series-parallel graph H ′ by
either duplicating some edge of H ′ or by subdividing some edge of H ′ . Since
duplicating an edge does not change the chromatic polynomial, in the first case
H still verifies Conjecture 2, and we only need show the validity of
the theorem when H is constructed from H ′ by subdividing some edge uv of H ′ .
Let x be the unique vertex of H that is not a vertex of H ′ . Set
T = {y ∈ V (H ′ ) : dH ′ (y) = 2, yu ∈ E(H ′ ), yv ∈ E(H ′ )}.
Write T = {x1 , · · · , xr−1 }, with r ≥ 1. Let G denote the graph obtained from
H ′ by removing all vertices in T . It follows that H can be built from G by
subdividing edge uv with the r vertices x1 , · · · , xr−1 , xr with xr = x, as shown
in Figure 2. Clearly, G is also series-parallel, and so it verifies Conjecture 2. Let
n denote the number of vertices of G. Note that H has n + r vertices. Since the
graph G|uv is also series-parallel, we can apply Theorem 4. Hence, to prove the
theorem, we only need show that
min{dG (u), dG (v)} ≤ (n + r + 1)/2.
For this purpose, set
A = {y ∈ V (G) : yu ∉ E(G), yv ∉ E(G)},
B = {y ∈ V (G) : yu ∈ E(G), yv ∈ E(G)},
C = V (G) − (A ∪ B ∪ {u, v}).
162
D. Avis, C. De Simone, P. Nobili
Now, if B contains at most one vertex, then dG (u) + dG (v) ≤ n + 1 and we are
done. Hence we can assume that B contains at least two vertices. Clearly, B is
a stable set in G (for otherwise, G would contain a K4 ).
First, note that:
• no vertex in C is adjacent to some vertex in B.
To see this, assume the contrary: there exists some vertex z ∈ C which is adjacent
to some vertex y ∈ B. Without loss of generality, we can assume that zu ∈ E(G).
Since H has no clique-cutset, it follows that {u, y} is not a clique-cutset in G, and
so there must exist a path P in G − {u, y} joining z to v. But then, contracting
all edges of P − {v}, we get a K4 , contradicting the assumption that G is series-parallel.
Next, note that:
• every vertex in A is adjacent to at most one vertex in B.
This is obviously true because G is not contractable to K4 .
Since by assumption H, and hence G, has no clique-cutset, every vertex in B is
adjacent to some vertex in A ∪ C. It follows that |A| ≥ |B| (recall that no vertex
in B is adjacent to a vertex in C), and so n = 2 + |A| + |B| + |C| ≥ 2 + 2|B| + |C|.
But then dG (u) + dG (v) = |C| + 2|B| + 2 ≤ n, and so min{dG (u), dG (v)} ≤ n/2,
and we are done.
Finding Skew Partitions Efficiently⋆
Celina M. H. de Figueiredo1 , Sulamita Klein1 , Yoshiharu Kohayakawa2 , and
Bruce A. Reed3
1
Instituto de Matemática and COPPE, Universidade Federal do Rio de Janeiro,
Brazil. {celina,sula}@cos.ufrj.br
2
Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil.
yoshi@ime.usp.br
3
CNRS, Université Pierre et Marie Curie, Institut Blaise Pascal, France.
reed@ecp6.jussieu.fr
Abstract. A skew partition as defined by Chvátal is a partition of the
vertex set of a graph into four nonempty parts A, B, C, D such that
there are all possible edges between A and B, and no edges between C
and D. We present a polynomial-time algorithm for testing whether a
graph admits a skew partition. Our algorithm solves the more general
list skew partition problem, where the input contains, for each vertex,
a list containing some of the labels A, B, C, D of the four parts. Our
polynomial-time algorithm settles the complexity of the original partition
problem proposed by Chvátal, and answers a recent question of Feder,
Hell, Klein and Motwani.
1 Introduction
A skew partition is a partition of the vertex set of a graph into four nonempty
parts A, B, C, D such that there are all possible edges between A and B, and
no edges between C and D. We present a polynomial-time algorithm for testing
whether a graph admits a skew partition, as well as for the more general list skew
partition problem, where the input contains, for each vertex, a list containing
some of the four parts.
Many combinatorial problems can be described as finding a partition of the
vertices of a given graph into subsets satisfying certain properties internally
(some parts may be required to be independent, or sparse in some other sense,
others may conversely be required to be complete or dense), and externally (some
pairs of parts may be required to be completely nonadjacent, others completely
adjacent). In [10], Feder et al. defined a parameterized family of graph problems
of this type.
The basic family of problems they considered is as follows: partition the
vertex set of a graph into k parts A1 , A2 , . . . , Ak with a fixed “pattern” of requirements as to which Ai are independent or complete and which pairs Ai , Aj
⋆ Research partially supported by CNPq, MCT/FINEP PRONEX Project 107/97,
CAPES (Brazil)/COFECUB (France) Project 213/97, FAPERJ, and by FAPESP
Proc. 96/04505-2.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 163–172, 2000.
c Springer-Verlag Berlin Heidelberg 2000
164
C.M.H. de Figueiredo et al.
are completely nonadjacent or completely adjacent. These requirements may be
conveniently encoded by a symmetric k×k matrix M in which the diagonal entry
Mi,i is 0 if Ai is required to be independent, 2 if Ai is required to be a clique,
and 1 otherwise (no restriction). Similarly, the off-diagonal entry Mi,j is 0, 1, or
2, if Ai and Aj are required to be completely nonadjacent, have arbitrary connections, or are required to be completely adjacent, respectively. Following [10],
we call such a partition an M -partition.
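As an illustration (ours, not from the paper), the M -partition condition translates directly into a check over vertex pairs. The sketch below assumes a graph given as an adjacency-set dict and a part assignment as a vertex-to-index map; names are ours.

```python
def is_m_partition(adj, parts, M):
    """Check whether `parts` (vertex -> part index) is an M-partition of the
    graph `adj` (vertex -> set of neighbours).
    M[i][j] = 0: no edges allowed between parts i and j,
              1: no restriction,
              2: all edges required (diagonal entries constrain a single part)."""
    verts = list(adj)
    for u in verts:
        for v in verts:
            if u >= v:
                continue
            req = M[parts[u]][parts[v]]
            edge = v in adj[u]
            if (req == 0 and edge) or (req == 2 and not edge):
                return False
    return True
```

For example, with the skew-partition pattern used later (rows 1211, 2111, 1110, 1101), the assignment placing each vertex of a suitable 4-vertex graph in its own part passes the check.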
Many combinatorial problems just ask for an M -partition. For instance, a
k-coloring is an M -partition where M is the adjacency matrix of the complete k-graph
and, more generally, H-coloring (homomorphism to a fixed graph H [13])
is an M -partition where M is the adjacency matrix of H. It is known that
H-coloring is polynomial-time solvable when H is bipartite and NP-complete
otherwise [13]. When M is the adjacency matrix of H plus twice the identity
matrix (all diagonal elements are 2), then M -partitions reduce to the so-called
(H, K)-partitions which were studied by MacGillivray and Yu [15]. When H is
triangle-free then (H, K)-partition is polynomial-time solvable, otherwise it is
NP-complete.
Other well-known problems ask for M -partitions in which all parts are restricted to be nonempty (e.g., skew partitions, clique cutsets, stable cutsets). In
yet other problems there are additional constraints, such as those in the definition of a homogeneous set (requiring one of the parts to have at least 2 and at
most n − 1 vertices). For instance, Winkler asked for the complexity of deciding
the existence of an M -partition, where M has the rows 1101, 1110, 0111, and
1011, such that all parts are nonempty and there is at least one edge between
parts A and B, B and C, C and D, and D and A. This has recently been shown
NP-complete by Vikas [17].
The most convenient way to express these additional constraints turns out
to be to allow specifying for each vertex (as part of the input) a “list” of parts in
which the vertex is allowed to be. Specifically, the list-M -partition problem asks
for an M -partition of the input graph in which each vertex is placed in a part
which is in its list. Both the basic M -partition problem (“Does the input graph
admit an M -partition?”), and the problem of existence of an M -partition with
all parts nonempty, admit polynomial-time reductions to the list-M -partition
problem, as do all of the above problems with the “additional” constraints. List
partitions generalize list-colorings, which have proved very fruitful in the study
of graph colorings [1, 12]. They also generalize list-homomorphisms which were
studied earlier [7, 8, 9].
Feder et al. [10] were the first to introduce and investigate the list version of
these problems. It turned out to be a useful generalization, since list problems
recurse more conveniently. This enabled them to classify the complexity (as
polynomial-time solvable or NP-complete) of list-M -partition problems for all
3 × 3 matrices M and some 4 × 4 matrices M . For other 4 × 4 matrices M they
were able to produce sub-exponential algorithms, including one for the skew
partition problem described below. This was the first sub-exponential algorithm
for the problem, and an indication that the problem is not likely to be
NP-complete. We were motivated by their approach, and show that in fact one can
use the mechanism of list partitions to give a polynomial-time algorithm for the
problem.
A skew partition is an M -partition, where M has the rows 1211, 2111, 1110,
and 1101, such that all parts are nonempty. List Skew Partition (LSP) is simply
the list-M -partition problem for this M . We can solve skew partition by solving
at most n^4 LSP problems, one for each possible quadruple (v1 , v2 , v3 , v4 ) of
vertices of the input graph, such that vi ∈ Ai for 1 ≤ i ≤ 4.
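For intuition only, the definition can be checked by exhaustive search on tiny graphs; this brute-force sketch (ours, and exponential, unlike the paper's algorithm) tries every assignment of vertices to the four parts.

```python
from itertools import product

def has_skew_partition(adj):
    """Brute-force test of the definition: a partition into nonempty parts
    A, B, C, D (indices 0-3) with all A-B edges present and no C-D edges."""
    verts = list(adj)
    for assign in product(range(4), repeat=len(verts)):
        if len(set(assign)) < 4:          # all four parts must be nonempty
            continue
        part = dict(zip(verts, assign))
        ok = True
        for u in verts:
            for v in verts:
                if u >= v:
                    continue
                pair = tuple(sorted((part[u], part[v])))
                edge = v in adj[u]
                if pair == (0, 1) and not edge:   # A-B must be complete
                    ok = False
                if pair == (2, 3) and edge:       # C-D must have no edges
                    ok = False
        if ok:
            return True
    return False
```

On the chordless five-cycle C5 this returns False, consistent with C5 having no skew cutset.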
The skew partition problem has interesting links to perfect graphs, and is
one of the main problems in the area. Before presenting our two algorithms we
discuss perfect graphs and their link to skew partition.
2 Skew Cutsets and the Strong Perfect Graph Conjecture
A graph is perfect if each induced subgraph admits a vertex colouring and a
clique of the same size. A graph is minimal imperfect if it is not perfect but all
of its proper induced subgraphs are perfect. Perfect graphs were first defined by
Berge [2] who was interested in finding a good characterization of such graphs.
He proposed the strong perfect graph conjecture: the only minimal imperfect
graphs are the odd chordless cycles of length at least five and their complements.
Since then researchers have enumerated a list of properties of minimal imperfect
graphs. The strong perfect graph conjecture remains open and is considered a
central problem in computational complexity, combinatorial optimization, and
graph theory.
Chvátal [4] proved that no minimal imperfect graph contains a structure
that he called a star cutset: a vertex cutset consisting of a vertex and some of
its neighbours. Chvátal exhibited a polynomial-time recognition algorithm for
graphs with a star cutset. He also conjectured that no minimal imperfect graph
contains a skew partition. Recalling our earlier definition, a skew partition is a
partition of the vertex set of a graph into four nonempty parts A, B, C, D such
that there are all possible edges between A and B, and no edges between C and
D. We call each of the four nonempty parts A, B, C, D a skew partition set. We
say that A ∪ B is a skew cutset. The complexity of testing for the existence of
a skew cutset has motivated many publications [5, 10, 14, 16]. Recently, Feder
et al. [10] described a quasi-polynomial algorithm for testing whether a graph
admits a skew partition, which strongly suggested that this problem was not
NP-complete. In this paper, we present a polynomial-time recognition algorithm
for testing whether a graph admits a skew partition.
Cornuéjols and Reed [5] proved that no minimal imperfect graph contains a
skew partition in which A and B are both stable sets. Actually, they proved the
following more general result. Let a complete multi-partite graph be one whose
vertex set can be partitioned into stable sets S1 , . . . , Sk , such that there are all
possible edges between Si and Sj , for i ≠ j. They proved that no minimal imperfect graph contains a skew cutset that induces a complete multi-partite graph.
Their work raised questions about the complexity of testing for the existence
either of a complete bipartite cutset or of a complete multi-partite cutset in a
graph.
Subsequently, Klein and de Figueiredo [16] showed how to use a result of
Chvátal [3] on matching cutsets in order to establish the NP-completeness of
recognizing graphs with a stable cutset. In addition, they established the NP-completeness of recognizing graphs with a complete multi-partite cutset. In particular, their proof showed that it is NP-complete to test for the existence of a
complete bipartite cutset, even if the cutset induces a K1,p .
As shown by Chvátal [4], to test for the existence of a star cutset is in P,
whereas to test for the existence of the special star cutset K1,p is NP-complete,
as shown in [16]. The polynomial-time algorithm described in this paper offers
an analogous complexity situation: to test for the existence of a skew cutset
is in P, whereas to test for the existence of a complete bipartite cutset is NP-complete [16].
3 Overview
The goal of this paper is to present a polynomial-time algorithm for the following
decision problem:
Skew Partition Problem
Input: a graph G = (V, E).
Question: Is there a skew partition A, B, C, D of G?
We actually consider list skew partition (LSP) problems, stated as decision
problems as follows:
List Skew Partition Problem
Input: a graph G = (V, E) and for each vertex v ∈ V , a subset Lv of {A, B, C, D}.
Question: Is there a skew partition A, B, C, D of G such that each v is contained
in some element of the corresponding Lv ?
Throughout the algorithm we have a partition of V into at most 15 sets
indexed by the nonempty subsets of {A, B, C, D}, i.e., {SL |L ⊆ {A, B, C, D}},
such that Property 1 is always satisfied. For convenience, we denote S{A} by SA .
Note that the relevant inputs for LSP have SA , SB , SC , and SD nonempty.
Property 1. If the algorithm returns a skew partition and v is in SL , then the
returned skew partition set containing v is in L.
Initially, we set SL = {v|Lv = L}.
We also restrict our attention to LSP instances which satisfy the following
property:
Property 2. If v ∈ SL , for some L with A ∈ L, then it sees every vertex of SB .
If v ∈ SL , for some L with B ∈ L, then it sees every vertex of SA . If v ∈ SL , for
some L with C ∈ L, then it is non-adjacent to every vertex of SD . If v ∈ SL , for
some L with D ∈ L, then it is non-adjacent to every vertex of SC .
Both Properties 1 and 2 hold throughout the algorithm.
Remark 1. Since SB must be contained in B, we know that if v is to be in A for
some solution to the problem, then v must see all of SB . Thus if some v ∈ SA
misses a vertex of SB , then there is no solution to the problem and we need not
continue. If there is some L with A properly contained in L and a vertex v in
SL which misses a vertex of SB , then we know in any solution to the problem v
must be contained in some element of L \ {A}. So we can reduce to a new problem
where we replace SL by SL \ {v}, replace SL\{A} by SL\{A} ∪ {v}, and leave all
other sets as before. Such a reduction reduces ΣL |SL ||L| by 1. Since this sum is at most
4n, after O(n) such reductions we must obtain an LSP problem satisfying
Property 2 (or halt because the original problem has no solution).
In our discussion we often create new LSP instances and whenever we do
so, we always perform this procedure to reduce to an LSP problem satisfying
Property 2.
For an instance I of LSP we have {SL (I) | L ⊆ {A, B, C, D}}, but we drop the
(I) when it is not needed, for clarity.
We will consider a number of restricted versions of the LSP problems:
– MAX-3-LSP: an LSP problem satisfying Property 2 such that SABCD = ∅;
– MAX-2-LSP: an LSP problem satisfying Property 2 such that if |L| > 2,
then SL = ∅;
– AC-TRIV-LSP: an LSP problem satisfying Property 2 such that SAC = ∅.
Remark 2. It is easy to obtain a solution to an instance of AC-TRIV-LSP
as follows: A = SA , B = ∪{SL : B ∈ L}, C = SC , and D = ∪{SL : D ∈ L, B ∉ L}.
By Property 2 this is indeed a skew partition.
– BD-TRIV-LSP, AD-TRIV-LSP, BC-TRIV-LSP: these problems are defined
and solved similarly to AC-TRIV-LSP.
Our algorithm for solving LSP requires four subalgorithms which replace an
instance of LSP by a polynomial number of instances of more restricted versions
of LSP.
Algorithm 1 Takes an instance of LSP and returns in polynomial time a list
L of a polynomial number of instances of MAX-3-LSP such that
(i) a solution to any problem in L is a solution of the original problem, and
(ii) if none of the problems in L have a solution, then the original problem has
no solution.
Algorithm 2 Takes an instance of MAX-3-LSP and returns in polynomial time
a list L of a polynomial number of instances of MAX-3-LSP such that:
(i) and (ii) of Algorithm 1 hold, and
(iii) for each problem in L, either SABC = ∅ or SABD = ∅.
Algorithm 3 Takes an instance of MAX-3-LSP and returns in polynomial time
a list L of a polynomial number of instances of MAX-3-LSP such that:
(i) and (ii) of Algorithm 1 hold, and
(iii) for each problem in L, either SBCD = ∅ or SACD = ∅.
Algorithm 4 Takes an instance of MAX-3-LSP such that
(a) either SABC or SABD is empty, and
(b) either SBCD or SACD is empty
and returns a list L of a polynomial number of problems each of which is an
instance of one of MAX-2-LSP, AC-TRIV-LSP, AD-TRIV-LSP, BC-TRIV-LSP,
or BD-TRIV-LSP such that (i) and (ii) of Algorithm 1 hold.
We also need two more algorithms for dealing with the most basic instances
of LSP.
Algorithm 5 Takes an instance of MAX-2-LSP and returns either
(i) a solution to this instance of MAX-2-LSP, or
(ii) the information that this problem instance has no solution.
Remark 3. Algorithm 5 simply applies 2-SAT as discussed in [6]; we omit the
details.
Algorithm 6 Takes an instance of AC-TRIV-LSP, AD-TRIV-LSP, BC-TRIV-LSP, or BD-TRIV-LSP and returns a solution using the partitions discussed in Remark 2.
To solve an instance of LSP we first apply Algorithm 1 to obtain a list
L1 of instances of MAX-3-LSP. For each problem instance I on L1 , we apply
Algorithm 2 and let LI be the output list of problem I. We let L2 be the
concatenation of the lists {LI |I ∈ L1 }. For each I in L2 , we apply Algorithm
3. Let L3 be the concatenation of the lists {LI |I ∈ L2 }. For each problem
instance I on L3 , we apply Algorithm 4. Let L4 be the concatenation of the
lists {LI |I ∈ L3 }. Each element of L4 can be solved using either Algorithm 5 or
Algorithm 6 in polynomial time. If any of these problems has a solution S, then
by the specifications of the algorithms, S is a solution to the original problem.
Otherwise, by the specifications of the algorithms, there is no solution to the
original problem. Clearly, the whole algorithm runs in polynomial time.
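The cascade just described has a simple generic shape; the sketch below (names and interfaces are ours) treats Algorithms 1 through 4 as instance expanders and Algorithms 5 and 6 as base solvers.

```python
def solve_lsp(instance, expanders, solve_base):
    """Run the cascade: each expander maps an instance to a list of more
    restricted instances satisfying properties (i) and (ii); solve_base
    handles the final MAX-2-LSP / *-TRIV-LSP instances, returning a
    solution or None."""
    level = [instance]
    for expand in expanders:
        level = [out for inst in level for out in expand(inst)]
    for inst in level:
        sol = solve_base(inst)
        if sol is not None:
            return sol   # by (i), also a solution of the original instance
    return None          # by (ii), the original instance has no solution
```

Because each expander produces polynomially many instances and there are four levels, the final list is still of polynomial size.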
4 Algorithm 1
We now present Algorithm 1. The other algorithms are similar and their details
are left for a longer version of this paper. The full details and proofs are in the
technical report [11].
Algorithm 1 recursively applies Procedure 1 which runs in polynomial time.
Procedure 1 Input: An instance I of LSP.
Output: Four instances I1 , I2 , I3 , I4 of LSP such that, for 1 ≤ j ≤ 4, we have
|SABCD (Ij )| ≤ (9/10)|SABCD (I)|.
It is easy to prove inductively that recursively applying Procedure 1 yields a
polynomial-time implementation of Algorithm 1 which, when applied to an input
graph with n vertices, creates as output a list L of instances of LSP such that
|L| ≤ 4^(log_{10/9} n) ≤ n^14 .
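The exponent in the bound |L| ≤ 4^(log_{10/9} n) = n^(log_{10/9} 4) can be checked numerically; this is a quick sanity check of ours, not part of the paper.

```python
import math

# S_ABCD shrinks by a factor of at most 9/10 per level, so the recursion
# depth is at most log base 10/9 of n, and each level branches 4 ways.
exponent = math.log(4, 10 / 9)   # log_{10/9} 4
assert 13 < exponent < 14        # hence 4**log_{10/9}(n) = n**exponent <= n**14
```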
Let n = |SABCD (I)|. For any skew partition {A, B, C, D}, let A′ = A ∩
SABCD (I), B ′ = B ∩ SABCD (I), C ′ = C ∩ SABCD (I), and D′ = D ∩ SABCD (I).

Case 1: There exists a vertex v in SABCD such that n/10 ≤ |SABCD ∩ N (v)| ≤ 9n/10.
Branch according to whether v ∈ A, v ∈ B, v ∈ C, or v ∈ D, with instances
IA , IB , IC , ID respectively. We define IA by initially setting SA (IA ) = SA (I) ∪ {v}
and reducing so that Property 2 holds. We define IB , IC , ID similarly. We note
that by Property 2, if v ∈ C, then D ∩ N (v) = ∅. So, SABCD (IC ) ⊂ SABCD (I) \
N (v). Because there are at least n/10 vertices in SABCD ∩ N (v), this means that
|SABCD (IC )| ≤ 9n/10. Symmetrically, |SABCD (ID )| ≤ 9n/10.
Similarly, by Property 2, SABCD (IA ) ⊂ SABCD (I) ∩ N (v), so |SABCD (IA )| ≤
9n/10. Symmetrically, |SABCD (IB )| ≤ 9n/10. □
Case 2: There are at least n/10 vertices v in SABCD such that |SABCD ∩ N (v)| < n/10,
and there are at least n/10 vertices v in SABCD such that |SABCD ∩ N (v)| > 9n/10.
Let W = {v ∈ SABCD : |SABCD ∩ N (v)| > 9n/10} and X = {v ∈ SABCD :
|SABCD ∩ N (v)| < n/10}. Branch according to |A′ | ≥ n/10, or |B ′ | ≥ n/10, or
|C ′ | ≥ n/10, or |D′ | ≥ n/10, with corresponding instances IA′ , IB ′ , IC ′ , and ID′ . Each
of these choices forces either all the vertices in W or all the vertices in X to
have smaller label sets, as follows. If |A′ | ≥ n/10, then every vertex in B has at least n/10
neighbours in SABCD (I), so B ∩ X = ∅. Thus, SABCD (IA′ ) = SABCD (I) \ X,
and |SABCD (IA′ )| ≤ 9n/10. If |B ′ | ≥ n/10, then a symmetrical argument shows that
X ∩ A = ∅. Thus, SABCD (IB ′ ) = SABCD (I) \ X, and |SABCD (IB ′ )| ≤ 9n/10. If
|C ′ | ≥ n/10, then every vertex in D has at least n/10 non-neighbours in SABCD (I).
Hence W ∩ D = ∅, SABCD (IC ′ ) = SABCD (I) \ W , and so |SABCD (IC ′ )| ≤ 9n/10.
If |D′ | ≥ n/10, then a symmetrical argument shows that |SABCD (ID′ )| ≤ 9n/10. □
Case 3: There are over 9n/10 vertices in W .
We will repeatedly apply the following procedure to various W ′ ⊆ W with
|W ′ | ≥ 8n/10. We recursively define a partition of W ′ into three sets O, T , and N T
such that:
– there are all edges between O and T ;
– for every w in N T , there exists v in O such that w misses v;
– the complement of O is connected.
Start by choosing v1 in W ′ and setting O = {v1 }, T = N (v1 ) ∩ W ′ , and N T =
W ′ \ (N (v1 ) ∪ {v1 }). Note that for each vertex v of W ′ , since v misses at most n/10
vertices of SABCD , |N (v) ∩ W ′ | > |W ′ | − n/10. So |N T | = |W ′ \ (N (v1 ) ∪ {v1 })| < n/10.
Grow O by moving an arbitrary vertex v from N T to O, and by moving T \ N (v)
from T to N T , until:
(i) |O| + |N T | ≥ n/10; or
(ii) N T = ∅.
If the growing process stops with condition (i), i.e., |O| + |N T | ≥ n/10, and
vi was the last vertex added to O, then adding vi to O increased |N T | by
at most |W ′ \ (N (vi ) ∪ {vi })| < n/10. Thus, |O| + |N T | < n/10 + n/10 = n/5. So,
|T | ≥ 8n/10 − n/5 = 6n/10 ≥ n/10.
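The growing procedure can be sketched directly (our names; `Wp` stands for W′, and the stopping threshold n/10 comes from conditions (i) and (ii)):

```python
def grow_O(adj, Wp, n):
    """Sketch of the growing procedure: maintain O, T, NT with all O-T edges
    present and every NT vertex missing some vertex of O, until
    |O| + |NT| >= n/10 (condition (i)) or NT is empty (condition (ii))."""
    Wp = set(Wp)
    v1 = next(iter(Wp))
    O = {v1}
    T = (set(adj[v1]) & Wp) - {v1}
    NT = Wp - T - O
    while NT and len(O) + len(NT) < n / 10:
        v = NT.pop()               # move an arbitrary vertex from NT to O
        O.add(v)
        moved = {t for t in T if t not in adj[v]}
        T -= moved                 # vertices of T that miss v drop to NT
        NT |= moved
    return O, T, NT
```

The loop preserves both invariants: a vertex enters O only from NT, and T is pruned of non-neighbours of the new O-vertex at every step.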
Our first application of the procedure is with W ′ = W . If we stop because
(i) holds, then we define four new instances of LSP according to the intersection
of the skew partition sets A, B, C, or D with O, as follows:
(a) I1 : C ∩ O ≠ ∅,
(b) I2 : C ∩ O = ∅, D ∩ O ≠ ∅,
(c) I3 : O ⊆ A,
(d) I4 : O ⊆ B.
Recall that the complement of O is connected, which implies that if O ∩
(C ∪ D) = ∅, then either O ⊆ A or O ⊆ B. If O ⊆ A, then N T ∩ B = ∅, since
(∀w ∈ N T )(∃v ∈ O) such that vw ∉ E. Thus, (O ∪ N T ) ∩ SABCD (I3 ) = ∅. Hence
|SABCD (I3 )| ≤ 9n/10. A symmetrical argument shows that |SABCD (I4 )| ≤ 9n/10.
If C ∩ O ≠ ∅, then D ∩ T = ∅. Thus, T ∩ SABCD (I1 ) = ∅, which implies
|SABCD (I1 )| ≤ 9n/10. A symmetrical argument shows that |SABCD (I2 )| ≤ 9n/10.
Thus, if our application of the above procedure halts with output (i), then we
have found the four desired output instances of LSP.
Otherwise, the growing process stops with condition (ii), i.e., N T = ∅ and
|O| < n/10. Set O1 = O and reapply the algorithm to W ′ = W \ O1 to obtain
O2 . More generally, having constructed disjoint sets O1 , . . . , Oi with |∪j≤i Oj | <
n/10, we construct Oi+1 by applying the algorithm to Wi = W \ ∪j≤i Oj . Note
|Wi | > 8n/10.
We continue until |∪j≤i Oj | ≥ n/10 or condition (i) occurs. If condition (i) ever
occurs, then we proceed as above. Otherwise, we stop after some iteration i∗
such that | ∪i<i∗ Oi | < n/10 and | ∪i≤i∗ Oi | ≥ n/10. Since |Oi∗ | < n/10, we have that
| ∪i≤i∗ Oi | ≤ n/5. Also, all the edges between the sets Z = ∪j≤i∗ Oj and Y = W \ ∪j≤i∗ Oj
exist, which implies that C ∩ Z = ∅ or D ∩ Y = ∅.
We now define two new instances of LSP according to the intersection of the
skew partition sets C or D with Z, as follows:
(a) I1 : C ∩ Z = ∅,
(b) I2 : D ∩ Y = ∅.
In either output instance Ii , |SABCD (Ii )| ≤ 9n/10. □
Note that the case |X| > 9n/10 is symmetric to Case 3 (consider the complement of G)
and is omitted.
5 Concluding Remarks
It is evident to the authors that the techniques we have developed will apply to
large classes of list-M -partition problems. We intend to address this in future
work.
References
1. N. Alon, and M. Tarsi. Colorings and orientations of graphs. Combinatorica 12
(1992) 125–134.
2. C. Berge. Les problèmes de coloration en théorie des graphes. Publ. Inst. Stat.
Univ. Paris 9 (1960) 123–160.
3. V. Chvátal. Recognizing decomposable graphs. J. Graph Theory 8 (1984) 51–53.
4. V. Chvátal. Star-Cutsets and perfect graphs. J. Combin. Theory Ser. B 39 (1985)
189–199.
5. G. Cornuéjols, and B. Reed. Complete multi-partite cutsets in minimal imperfect
graphs. J. Combin. Theory Ser. B 59 (1993) 191–198.
6. H. Everett, S. Klein, and B. Reed. An optimal algorithm for finding clique-cross
partitions. Congr. Numer. 135 (1998) 171–177.
7. T. Feder, and P. Hell. List homomorphisms to reflexive graphs. J. Combin. Theory
Ser. B 72 (1998) 236–250.
8. T. Feder, P. Hell, and J. Huang. List homomorphisms to bipartite graphs.
Manuscript.
9. T. Feder, P. Hell, and J. Huang. List homomorphisms to general graphs.
Manuscript.
10. T. Feder, P. Hell, S. Klein, and R. Motwani. Complexity of graph partition problems. Proceedings of the thirty-first ACM Symposium on Theory of Computing,
STOC’99 (1999) 464–472.
11. C. M. H. de Figueiredo, S. Klein, Y. Kohayakawa, and B. Reed. Finding Skew Partitions Efficiently. Technical Report ES-503/99, COPPE/UFRJ, Brazil. Available
at ftp://ftp.cos.ufrj.br/pub/tech reps/es50399.ps.gz.
12. H. Fleischner, and M. Stiebitz. A solution of a coloring problem of P. Erdös.
Discrete Math. 101 (1992) 39–48.
13. P. Hell, and J. Nešetřil. On the complexity of H-coloring. J. Combin. Theory Ser.
B 48 (1990) 92–110.
14. C. T. Hoàng. Perfect Graphs. Ph.D. Thesis, School of Computer Science, McGill
University, Montreal, (1985).
15. G. MacGillivray, and M.L. Yu. Generalized partitions of graphs. Discrete Appl.
Math. 91 (1999) 143–153.
16. S. Klein, and C. M. H. de Figueiredo. The NP-completeness of multi-partite cutset
testing. Congr. Numer. 119 (1996) 216–222.
17. N. Vikas. Computational complexity of graph compaction. Proceedings of the tenth
ACM-SIAM Symposium on Discrete Algorithms (1999) 977–978.
On the Competitive Theory and Practice of Portfolio Selection
(Extended Abstract)
Allan Borodin1 , Ran El-Yaniv2 , and Vincent Gogan3
1
Department of Computer Science, University of Toronto. bor@cs.toronto.edu
2
Department of Computer Science, Technion. rani@cs.technion.ac.il
3
Department of Computer Science, University of Toronto. vincent@cs.toronto.edu
Abstract. Given a set of say m stocks (one of which may be “cash”), the online
portfolio selection problem is to determine a portfolio for the ith trading period
based on the sequence of prices for the preceding i − 1 trading periods. Competitive analysis is based on a worst case perspective and such a perspective is
inconsistent with the more widely accepted analyses and theories based on distributional assumptions. The competitive framework does (perhaps surprisingly)
permit nontrivial upper bounds on relative performance against CBAL-OPT, an optimal offline constant rebalanced portfolio. Perhaps more impressive are some
preliminary experimental results showing that certain algorithms that enjoy “respectable” competitive (i.e. worst case) performance also seem to perform quite
well on historical sequences of data. These algorithms and the emerging competitive theory are directly related to studies in information theory and computational
learning theory and indeed some of these algorithms have been pioneered within
the information theory and computational learning communities. We present a
mixture of both theoretical and experimental results, including a more detailed
study of the performance of existing and new algorithms with respect to a standard sequence of historical data cited in many studies. We also present experiments
from two other historical data sequences.
1 Introduction
This paper is concerned with the portfolio selection (PS) problem, defined as follows.
Assume a market with m securities. The securities can be stocks, bonds, currencies,
commodities, etc. For each trading day i ≥ 0, let vi = (vi,1 , vi,2 , . . . , vi,m ) be the
price vector for the ith period, where vi,j , the price or value of the jth security, is
given in the “local” currency, called here cash or dollars. For analysis it is often more
convenient to work with relative prices rather than prices. Define xi,j = vi,j /vi−1,j
to be the relative price of the jth security corresponding to the ith period.1 Denote by
1
Here we are greatly simplifying the nature of the market and assuming that xi+1,j is the
ratio of the opening price on the (i + 1)st day to the opening price on the ith day. That is,
we are assuming that a trader can buy or sell at the opening price. Later we try to compensate
for this by incorporating bid-ask spreads into transaction costs.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 173–196, 2000.
c Springer-Verlag Berlin Heidelberg 2000
174
A. Borodin, R. El-Yaniv, V. Gogan
xi = (xi,1 , . . . , xi,m ) the market vector of relative prices corresponding to the ith day.
A portfolio b is specified by the proportions of current dollar wealth invested in each of
the securities:

b = (b1 , . . . , bm ),  0 ≤ bj ≤ 1,  Σj bj = 1 .

The return of a portfolio b w.r.t. a market vector x is b · x = Σj bj xj . The (compound)
return of a sequence of portfolios B = b1 , . . . , bn w.r.t. a market sequence X =
x1 , . . . , xn is

R(B, X) = ∏_{i=1}^{n} bi · xi .
A PS algorithm is any deterministic or randomized rule for specifying a sequence of
portfolios. If ALG is a deterministic (respectively, randomized) PS algorithm then its
(expected) return with respect to a market sequence X is denoted by ALG(X).
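The return definitions translate directly into code; a minimal sketch in pure Python (function name ours):

```python
def compound_return(B, X):
    """R(B, X): product over periods i of the one-period return b_i . x_i,
    where B is a sequence of portfolios and X the corresponding sequence
    of market vectors of relative prices."""
    r = 1.0
    for b, x in zip(B, X):
        r *= sum(bj * xj for bj, xj in zip(b, x))
    return r
```

For instance, holding the uniform portfolio over two securities for two days in which the second security first holds and then doubles while halving gives the product of the daily returns 1.0 and 1.25.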
The basic PS problem described here ignores several important factors such as transaction commissions, buy-sell spreads and risk tolerance and control.
A simple strategy, advocated by many financial advisors, is to simply divide up the
amount of cash available and to buy and hold a portfolio of the securities. This has the
advantage of minimizing transaction costs and takes advantage of the natural tendency
for the market to grow. In addition, there is a classical algorithm, due to Markowitz
[Mar59], for choosing the weightings of the portfolio so as to minimize the variance for
any target expected return.
An alternative approach to portfolio management is to attempt to take advantage of
volatility (exhibited in price fluctuations) and to actively trade on a “day by day” basis.
Such trading can sometimes lead to returns that dramatically outperform the performance
of the best security.
For example, consider the class of constant rebalanced algorithms. An algorithm in
this class, denoted CBALb , is specified by a fixed portfolio b and maintains a constant
weighting (by value) amongst the securities. Thus, at the beginning of each trading
period CBALb rebalances its portfolio so that it is b-balanced. The constant rebalanced
algorithms are motivated by several factors. In particular, it can be shown that the optimal
offline algorithm in this class, CBAL-OPT, can lead to exponential returns that dramatically
outperform the best stock (see e.g. [Cov91]).
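Since CBALb restores the weighting b at the start of every period, its compound return is just the product of the one-period returns b · xi. The classic two-stock illustration below (one stock flat, the other alternately doubling and halving) shows the effect: each stock alone returns 1, yet the uniform rebalanced portfolio grows exponentially. This is a sketch of ours with illustrative numbers, in the spirit of the example attributed to [Cov91].

```python
def cbal_return(b, X):
    """Return of the constant rebalanced portfolio CBAL_b over market
    sequence X: the weighting b is restored every period, so the compound
    return is the product of the one-period returns b . x_i."""
    r = 1.0
    for x in X:
        r *= sum(bj * xj for bj, xj in zip(b, x))
    return r

# One stock stays flat, the other doubles then halves, repeatedly: each
# two-day block multiplies CBAL_(1/2,1/2)'s wealth by 1.5 * 0.75 = 1.125.
market = [[1.0, 2.0], [1.0, 0.5]] * 10
```

Here cbal_return([0.5, 0.5], market) equals 1.125 to the 10th power, while either buy-and-hold stock ends at 1.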
One objective in studying the portfolio selection problem is to arrive at online trading
strategies that are guaranteed, in some sense, to perform well. What is the choice of
performance measure? We focus on a competitive analysis framework whereby the
performance of an online portfolio selection strategy is compared to that of a benchmark
algorithm on every input. One reasonable benchmark is the return provided by the best
stock. For more active strategies, an optimal algorithm (that has complete knowledge
of the future) could provide returns so extreme that any reasonable approach is doomed
when viewed in comparison. Specifically, any online algorithm competing against the
optimal offline algorithm, called OPT, is at best m^n-competitive, where n is the number
of trading days and m is the number of stocks.
In the more classical (and perhaps most influential) approach, the PS problem is
broken down into two stages. First, one uses statistical assumptions and historical data
to create a model of stock prices. After this the model is used to predict future price
movements. The technical difficulties encountered in the more traditional approach (i.e.
formulating a realistic yet tractable statistical model) motivate competitive analysis.
The competitive analysis approach starts with minimal assumptions, derives algorithms
within this worst case perspective and then perhaps adds statistical or distributional
assumptions as necessary (to obtain analytical results and/or to suggest heuristic improvements to the initial algorithms). It may seem unlikely that this approach would be
fruitful but some interesting results have been proven. In particular, Cover’s Universal
Portfolio algorithm [Cov91] was proven to possess important theoretical properties. We
are mainly concerned with competitive analyses against CBAL-OPT and say that a portfolio selection algorithm ALG is c-competitive (w.r.t. CBAL-OPT) if the supremum, over all
market sequences X, of the ratio CBAL-OPT(X)/ALG(X) is at most c.
Instead of looking at ALG’s competitive ratio we can equivalently measure the degree
of “universality” of ALG. Following Cover [Cov91], we say that ALG is universal if for
all X,
(1/n) log CBAL-OPT(X) − (1/n) log ALG(X) → 0 .
In this sense, the “regret” experienced by an investor that uses a universal online algorithm
approaches zero as time goes on. Clearly the rate at which this regret approaches zero
corresponds to the competitive ratio, and ALG is universal if and only if its competitive ratio
is 2^{o(n)}. One motivation for measuring performance by universality is that it corresponds
to the minimization of the regret, using a logarithmic utility function (see [BE98]). On
the other hand, it obscures the convergence rate and therefore we prefer to use the
competitive ratio. When the competitive ratio of a PS algorithm (against CBAL-OPT) can
be bounded by a polynomial in n (for a fixed number of stocks), we shall say that the
algorithm is competitive.
2 Some Classes of PS Algorithms
2.1 Buy-And-Hold (BAH) Algorithms
The simplest portfolio selection policy is buy-and-hold (BAH): Invest in a particular
portfolio and let it sit for the entire duration of the investment. Then, in the end, cash
the portfolio out of the market. The optimal offline algorithm, BAH-OPT, invests in the
best performing stock for the relevant period. Most investors would probably consider
themselves to be very successful if they were able to achieve the return of BAH-OPT.
2.2 Constant-Rebalanced (CBAL) Algorithms
The constant-rebalanced (CBAL) algorithm CBALb has an associated fixed portfolio b =
(b1 , . . . , bm ) and operates as follows: at the beginning of each trading period it makes
trades so as to rebalance its portfolio to b (that is, a fraction bi is invested in the ith
stock, i = 1, . . . , m). It is easy to see that the return of CBAL-OPT is bounded from below
by the return of BAH-OPT since every BAH strategy can be thought of as an extremal CBAL
algorithm. It has been empirically shown that in real market sequences, the return of
CBAL-OPT can dramatically outperform the best stock (see e.g. Table 3).
176
A. Borodin, R. El-Yaniv, V. Gogan
Example 1 (Cover and Gluss [CG86], Cover [Cov91]). Consider the case m = 2 with
one stock and cash. Consider the market sequence
X = (1, 1/2), (1, 2), (1, 1/2), (1, 2), . . .

Then

CBAL_(1/2,1/2)(X) = ( (1/2)(1 + 1/2) )^{n/2} · ( (1/2)(1 + 2) )^{n/2} = (3/4 · 3/2)^{n/2} = (9/8)^{n/2} .
Thus, for this market, the return of CBAL_(1/2,1/2) is exponential in n while the best stock is
moving nowhere.
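The Cover–Gluss market above is easy to simulate; the following minimal sketch (the function name `cbal_return` is our own, not from the paper) confirms the (9/8)^(n/2) closed form.

```python
# Simulate Example 1: cash (price relative 1 every day) and a stock that
# alternately halves and doubles in value.

def cbal_return(b, market):
    """Wealth of CBAL_b after the market, rebalancing to weights b each day."""
    wealth = 1.0
    for x in market:
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth

n = 20  # number of trading days (even)
market = [(1.0, 0.5) if i % 2 == 0 else (1.0, 2.0) for i in range(n)]

w = cbal_return((0.5, 0.5), market)
# Matches the closed form (9/8)^(n/2), while the stock itself ends where it started.
assert abs(w - (9 / 8) ** (n / 2)) < 1e-9
```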
Under the assumption that the daily market vectors are independently and identically
distributed (i.i.d.), there is yet another motivating reason for considering
CBAL algorithms.
Theorem 1 (Cover and Thomas [CT91]). Let X = x1 , . . . , xn be i.i.d. according to
some distribution F (x). Then, for some b, CBALb performs at least as well (in the sense
of expected return) as the best online PS algorithm.
Theorem 1 tells us that it is sufficient to look for our best online algorithm in the set
of CBAL algorithms, provided that the market is generated by an i.i.d. source. One should
keep in mind, however, that the i.i.d. assumption is hard to justify. (See [Gre72,BCK92]
for alternative theories.)
2.3 Switching Sequences (Extremal Algorithms)
Consider any sequence of stock indices,
J (n) = j1 , j2 , . . . , jn ,
ji ∈ {1, . . . , m} .
This sequence prescribes a portfolio management policy that switches its entire wealth
from stock ji to stock ji+1 . An algorithm for generating such sequences can be deterministic or randomized. Ordentlich and Cover [OC96] introduce switching sequence
algorithms (called extremal strategies). As investment strategies, switching sequences
may seem to be very speculative but from a theoretical perspective they are well motivated. In particular, for any market sequence, the true optimal algorithm (called OPT) is
a switching sequence.²
Example 2. Consider the following market sequence X:
X = (0, 2), (0, 2), (0, 2), (0, 2), (2, 0), (2, 0), (2, 0), (2, 0) .
Starting with $1, the switching sequence 2, 2, 2, 2, 1, 1, 1, 1 returns $256. In contrast,
any (mixture of) buy and holds will go bankrupt on X returning $0 and for all b,
CBALb(X) ≤ CBAL_(1/2,1/2)(X) = CBAL-OPT(X) = $1.
² Here we are assuming either no transaction costs or a simple transaction cost model such as a
fixed percentage commission.
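Example 2 can be checked directly in a few lines; this sketch verifies the $256 return of the switching sequence and the $1 return of the uniform CBAL.

```python
# Example 2 as code: two stocks that alternate crashing; the switching
# sequence 2,2,2,2,1,1,1,1 (1-indexed stocks) rides the surviving stock.

market = [(0, 2)] * 4 + [(2, 0)] * 4   # daily price relatives (stock 1, stock 2)
switch = [2, 2, 2, 2, 1, 1, 1, 1]      # which stock holds the entire wealth

wealth = 1.0
for x, j in zip(market, switch):
    wealth *= x[j - 1]
assert wealth == 256                   # $1 grows to $256

# CBAL_(1/2,1/2) just treads water: each day returns 0.5*0 + 0.5*2 = 1.
cbal = 1.0
for x in market:
    cbal *= 0.5 * x[0] + 0.5 * x[1]
assert cbal == 1.0
```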
3 Some Basic Properties
In this section we derive some basic properties concerning PS algorithms.
3.1 Kelly Sequences
A Kelly (or a “horse race”) market vector for m stocks is a vector of the form
(0, . . . , 0, 1, 0, . . . , 0)   (m components).
That is, except for one stock that retains its value, all stocks crash.
Let K_m^n be the set of all length n Kelly market sequences over m stocks. (When
either m and/or n is clear from the context it will be omitted.) There are m^n Kelly
market sequences of length n. Kelly sequences were introduced by Kelly [Kel56] to
model horse race gambling. We can use them to derive lower bounds for online PS
algorithms (see e.g. [CO96] and Lemma 6 below).
The following simple but very useful lemma is due to Ordentlich and Cover [OC96].
Lemma 1. Let ALG be any online algorithm. Then
∑_{K∈K_m^n} ALG(K) = 1 .
Proof. We sketch the proof for the case m = 2. The proof is by induction on n. For the
base case, n = 1, notice that ALG must specify its first portfolio, (b, 1 − b), before the
(Kelly) market sequence (in this case, of length one) is presented. The two possible Kelly
sequences are K1 = (0, 1) and K2 = (1, 0). Therefore, ALG(K1) + ALG(K2) = (1 − b) + b =
1. The induction hypothesis states that the lemma holds for n − 1 days. The proof is
complete when we add up the two returns corresponding to the two possible Kelly vectors
for the first day. ⊓⊔
Lemma 1 permits us to relate CBALs to online switching sequences as shown in the next
two lemmas.
Lemma 2. Let ALG be any PS algorithm. There exists an algorithm ALG′, a
mixture of switching sequences, that is equivalent (in terms of expected return) to ALG
over Kelly market sequences.
Proof. Fix n. By Lemma 1 we have ∑_{Kℓ∈K^n} ALG(Kℓ) = 1 with ALG(Kℓ) ≥ 0 for
all Kℓ ∈ K^n. Therefore, {ALG(Kℓ)}ℓ is a probability distribution. For a sequence of
Kelly market vectors K = k1, k2, . . . , kn, denote by S_K = s(k1), s(k2), . . . , s(kn) the
switching sequence where s(ki) is the index of the stock with relative price one in ki.
Let ALG′ = ∑_{Kℓ∈K^n} ALG(Kℓ) · S_{Kℓ} be the mixture of switching sequences that assigns
a weight ALG(Kℓ) to the sequence S_{Kℓ}. Clearly, for each Kelly market sequence K we
have ALG′(K) = ALG(K).
Lemma 2 can be extended in the following way.
Lemma 3. If ALG = CBALb is a constant-rebalanced algorithm then (for known n) we
can achieve the same return as CBALb on any market sequence using the same method.
That is, use the mixture ALG′ = ∑_{Kℓ∈K^n} CBALb(Kℓ) · S_{Kℓ}.
Proof. To prove this, consider any b = (b1, . . . , bm). The return of CBALb over a Kelly
market sequence Kℓ = k1ℓ, . . . , knℓ is CBALb(Kℓ) = ∏_i b_{s(kiℓ)}, which gives the weight of
S_{Kℓ} in ALG′. Therefore, for an arbitrary market sequence X we have

ALG′(X) = ∑_{Kℓ∈K^n} ( ∏_i b_{s(kiℓ)} ) S_{Kℓ}(X) = ∏_{j=1}^{n} ∑_{i=1}^{m} bi xij = ∏_i b · xi = CBALb(X).
Theorem 2. (i) A (mixture of) CBAL algorithms can emulate a (mixture of) BAH algorithms. (ii) A (mixture of) switching sequence (SS) algorithms can emulate a CBAL algorithm and hence any
mixture of CBALs.
Lemma 4. The competitive ratio is invariant under scaling of the relative prices. In
particular, it is invariant under scaling of each day independently. Thus we can assume
without loss of generality that all relative prices are in [0, 1].
Lemma 5. In the game against OPT it is sufficient to prove lower bounds using only
Kelly market sequences.
Proof. Let X = x1, . . . , xn be an arbitrary market sequence. Using Lemma 4 we can
scale each day independently so that in each market vector xi = (xi1, . . . , xim), the
maximum relative price of each day (say it is x_{i,jmax}) equals 1. Now we consider the
“Kelly projection” X′ of the market X; that is, in the market vector x′i, x′_{i,jmax} = 1 and
x′_{i,ℓ} = 0 for ℓ ≠ jmax. For any algorithm ALG, we have ALG(X) ≥ ALG(X′), but OPT
always satisfies (when there are no commissions) OPT(X) = OPT(X′). ⊓⊔
3.2 Lower Bound Proof Technique
Using Lemma 1 we can now describe a general lower bound proof technique due to
Cover and Ordentlich.
Lemma 6 (Portfolio Selection Lower Bound). Let OPT_C be an optimal offline algorithm
from a class C of offline algorithms (e.g. CBAL-OPT when C is the class of constant rebalanced portfolios). Then r_m^n = ∑_{K∈K_m^n} OPT_C(K) is a lower bound on the competitive
ratio of any online algorithm relative to OPT_C.
Proof. We use Lemma 1. Clearly, the maximum element in any set is not smaller than
any weighted average of all the elements in the set. Let Q_K = ALG(K)/∑_K ALG(K) = ALG(K).
We then have

max_{K∈K_m^n} OPT_C(K)/ALG(K) ≥ ∑_{K∈K_m^n} Q_K · ( OPT_C(K)/ALG(K) )
  = ∑_{K∈K_m^n} ALG(K) · ( OPT_C(K)/ALG(K) )
  = ∑_{K∈K_m^n} OPT_C(K) .   ⊓⊔
4 Optimal Bounds against BAH-OPT and OPT
In some cases, Lemma 6 provides the means for easily proving lower bounds. For
example, consider the PS game against BAH-OPT. For any market sequence, BAH-OPT invests
its entire wealth in the best stock. Therefore, ∑_{K∈K_m^n} BAH-OPT(K) = m, and m is a lower
bound on the competitive ratio of any investment algorithm for this problem. Moreover,
m is a tight bound since the algorithm that invests 1/m of its initial wealth in each of
the m stocks achieves a competitive ratio of m.
Similarly, for any Kelly market sequence K we have OPT(K) = 1. Therefore, as
there are m^n Kelly sequences of length n, we have ∑_{K∈K_m^n} OPT(K) = m^n, and thus
m^n is a lower bound on the competitive ratio of any online algorithm against OPT. In this
case, CBAL_(1/m,...,1/m) achieves the optimal bound!
5 How Well Can We Perform against CBAL-OPT?
Comparison against CBAL-OPT has received considerable attention in the literature. The
best known results are summarized in Table 1. Using the lower bound technique from
Section 3.2, we first establish the Ordentlich and Cover [OC96] lower bound.
The following lemma is immediate but very useful.
Lemma 7. Consider a Kelly market sequence X^n = x1, . . . , xn over m stocks. We can
represent X^n as a sequence x1, . . . , xn ∈ {1, 2, . . . , m}^n. Let CBALb be a constant-rebalanced algorithm. Suppose that in the sequence X^n there are nj occurrences of
(stock) j with ∑_j nj = n. We say that such a Kelly sequence has type (n1, n2, . . . , nm).
For a sequence X^n of type (n1, n2, . . . , nm), the return R(b, X^n) of CBALb on the
sequence X^n is

R(b, X^n) = b1^{n1} b2^{n2} · · · bm^{nm} = ∏_{j=1}^{m} bj^{nj} .

That is, the return of any CBAL depends only on the type.
The development in Section 6.2 gives an alternative and more informative proof in
that it determines the optimal portfolio b∗ for any Kelly sequence X^n of a given type;
namely b∗ = (n1 /n, . . . , nm /n).
We then can apply Lemma 6 to the class C of CBAL algorithms to obtain:
Lemma 8. Let ALG be any online PS algorithm. Then a lower bound for the competitive
ratio of ALG against CBAL-OPT is

r_m^n = ∑_{(n1,...,nm): ∑ nj = n} (n choose n1, n2, . . . , nm) ∏_j (nj/n)^{nj} .

Proof. Consider the case m = 2. There are clearly (n choose n1) length n Kelly sequences
X having type (n1, n2) = (n1, n − n1), and for each such sequence CBAL-OPT(X) =
∏_j (nj/n)^{nj}. ⊓⊔
Using a mixture of switching sequences and a min-max analysis, Ordentlich and
Cover provide a matching upper bound (for any market sequence of length n), showing
that r_m^n is the optimal bound for the competitive ratio in the context of a known horizon
n.
Table 1. The best known results w.r.t. CBAL-OPT

         Lower³                                            Upper (known n)   Upper
m = 2    r_2^n ≃ √(πn/2)                                   r_2^n             2√(n + 1)
m ≥ 2    r_m^n ≃ (Γ(1/2)/Γ(m/2)) (n/2)^{(m−1)/2}           r_m^n             2(n + 1)^{(m−1)/2}
Source   [OC96]                                            [OC96]            [CO96]
From the discussion above it follows immediately that any one CBAL (or any finite
mixture of CBALs) cannot be competitive relative to CBAL-OPT; indeed for any m ≥ 2, the
competitive ratio of any CBAL against CBAL-OPT will grow exponentially in n.
5.1
Cover’s Universal Portfolio Selection Algorithm
The Universal Portfolio algorithms presented by Cover [Cov91] are special cases of
the class of “µ-weighted” algorithms, which we denote by W^µ. A rather intuitive understanding of the µ-weighted algorithms was given by Cover and Ordentlich. These
algorithms are parameterized by a distribution µ over the set of all portfolios B. Cover
and Ordentlich show the following result:
wealth of W^µ = E_{µ(b)}[wealth of CBALb] .

³ The Gamma function is defined as Γ(x) = ∫_0^∞ e^{−t} t^{x−1} dt. It can be shown that Γ(1) = 1
and that Γ(x + 1) = xΓ(x). Thus if n ≥ 1 is an integer, Γ(n + 1) = n!. Note also that
Γ(1/2) = √π.
This observation is interesting because the definition of W^µ (see [CO96]) is in terms of a
sequence of adaptive portfolios (depending on the market sequence) that progressively
give more weight to the better-performing constant rebalanced portfolios. But the above
observation shows that the return of these µ-weighted algorithms is equivalent to a “nonlearning algorithm”. That is, a mixture of CBALs specifies a randomized trading strategy
that is in some sense independent of the stock market data. Of course, the composite
portfolio determined by a mixture of CBALs does depend on the market sequence.
Cover and Ordentlich analyze two instances of W^µ. One (called UNI) uses the
uniform distribution (equivalently, the Dirichlet(1, 1, . . . , 1) distribution) and another
(here simply called DIR) uses the Dirichlet(1/2, 1/2, . . . , 1/2) distribution. They prove
that the uniform algorithm UNI has competitive ratio

(n + m − 1 choose m − 1) ≤ (n + 1)^{m−1} ,   (1)
and that this bound is tight. Somewhat surprisingly (in contrast, see the discussion
concerning the algorithms in Sections 6.3–6.5), this bound can be extracted from UNI
by an adversary using only Kelly market sequences; in fact, by using a Kelly sequence
X̃ in which one fixed stock “wins” every day. For the case of m = 2, this can be easily
seen since the return CBAL_(b,1−b)(X̃) is b^n for n days and ∫_0^1 b^n db = 1/(n + 1).
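For m = 2, UNI's wealth is simply the uniform average over b of CBAL_(b,1−b)'s wealth, which can be approximated on a grid. This sketch (function names ours) reproduces the 1/(n+1) worst-case wealth on the all-ones Kelly sequence.

```python
# A sketch of UNI for m = 2: average the wealth of CBAL_(b,1-b) over a grid
# of b values in [0,1] (midpoint rule). On the worst-case Kelly sequence
# where stock 1 wins every day, the wealth is about 1/(n+1), i.e. the
# competitive ratio against CBAL-OPT (which returns 1) is n+1.

def uni_wealth(market, grid=10_000):
    total = 0.0
    for k in range(grid):
        b = (k + 0.5) / grid
        wealth = 1.0
        for x in market:
            wealth *= b * x[0] + (1 - b) * x[1]
        total += wealth
    return total / grid

n = 10
kelly = [(1, 0)] * n                  # stock 1 "wins" every day
assert abs(uni_wealth(kelly) - 1 / (n + 1)) < 1e-4
```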
Cover and Ordentlich show that the DIR algorithm has competitive ratio

Γ(1/2) Γ(n + m/2) / ( Γ(m/2) Γ(n + 1/2) ) ≤ 2(n + 1)^{(m−1)/2} ,   (2)
and here again the bound is tight and achieved using Kelly sequences. Hence for any
fixed m, there is a constant gap between the optimal lower bound for fixed horizon and
the upper bound provided by DIR for unknown horizon.
It is instructive to consider an elegant proof of the universality of UNI (with a slightly
inferior bound) due to Blum and Kalai [BK97].⁴ Let B be the (m − 1)-dimensional
simplex of portfolio vectors and let µ be any distribution over B. Recall that the return
of the µ-weighted algorithm is a µ-weighted average of the returns of all CBALb algorithms. Let X be
any market sequence of length n and let CBALb∗ = CBAL-OPT. Say that b is “near” b∗ if
b = (n/(n+1)) b∗ + (1/(n+1)) z for some z ∈ B. Therefore, for each day i we have

CBALb(xi) ≥ (n/(n+1)) · CBALb∗(xi) .

So, for n days,

CBALb∗(X)/CBALb(X) ≤ (1 + 1/n)^n ≤ e .
Let Volm(·) denote the m-dimensional volume.

⁴ See also the web page http://www.cs.cmu.edu/˜akalai/coltfinal/slides.
Under the uniform distribution over B, the probability that b is near b∗ is

Pr[b near b∗] = Vol_{m−1}( (n/(n+1)) b∗ + (1/(n+1)) B ) / Vol_{m−1}(B)
  = Vol_{m−1}( (1/(n+1)) B ) / Vol_{m−1}(B)
  = (1/(n+1))^{m−1} .

Thus a (1/(n+1))^{m−1} fraction of the initial wealth is invested in CBALs which are “near”
CBAL-OPT, each of which is attaining a ratio e. Therefore, the competitive ratio achieved
is e · (n + 1)^{m−1}.

5.2 Expert Advice and the EG Algorithm
The EG algorithm proposed by Helmbold et al. [HSSW98] takes a different approach. It
tries to move towards the CBAL-OPT portfolio by using an update rule that maximizes
an objective function of the form

F_t(b_{t+1}) = η log(b_{t+1} · x_t) − d(b_{t+1}, b_t)

where d(b, b′) is some distance or dissimilarity measure over distributions (portfolios)
and η is a learning rate parameter.
The competitive bound proven by Helmbold et al for EG is weaker than the bound obtained for UNI. However, EG is computationally much simpler than UNI and experimentally
it outperforms UNI on the New York Stock Exchange data (see [HSSW98] and Table 3).
The EG algorithm developed from a framework for online regression and a successful
body of work devoted to predicting based on expert advice. When trying to select the best
expert (or a weighting of experts), the EG algorithm is well motivated. It is trying to minimize a loss function based on the weighting of various expert opinions and in this regard
it is similar to UNI. However, it is apparent that CBAL-OPT does not make its money (over
buy and hold) by seeking out the best stock. If one is maintaining a constant portfolio,
one is selling rising stocks and buying falling ones. This strategy is advantageous when
the falling stock reverses its trend and starts rising. We also note that in order to prove
the universality of EG, the value of the learning rate η decreases to zero as the horizon
n increases. When η = 0, EG degenerates to the uniform CBAL(1/m,1/m,...,1/m) which is
not universal whereas the small learning rate (as given by their proof) of EG is sufficient
to make it universal. It is also the case that if each day the price relatives for all stocks
were identical, then (as one would expect) EG will again be identical to the uniform CBAL.
Hence when one combines a small learning rate with a “reasonably stable” market (i.e.
the price relatives are not too erratic), we might expect the performance of EG to be similar
to that of the uniform CBAL, and this seems to be confirmed by our experiments.
6 On Some Portfolio Selection Algorithms
In this section we present and discuss several online portfolio selection algorithms. The
DELTA algorithm is a new algorithm suggested by the goal of exploiting the rationale of
constant rebalancing algorithms. The other algorithms are adaptations of known prediction algorithms.
6.1 The DELTA Algorithm
In this section we define what we call the DELTA(r, w, tcorr) algorithm. Informally, this
online algorithm operates as follows. There is a risk parameter r between 0 and 1 controlling the fraction of a stock's value that the algorithm is willing to trade away on any
given day. Each stock will risk that proportion of its weight if it is climbing in value
and is sufficiently anti-correlated with other stock(s). If it is anti-correlated, the at-risk
amount is spread amongst the falling stocks proportionally to the correlation coefficient.⁵
The algorithm takes two other parameters. The “window” length, w, specifies the length
of history used in calculating the new portfolio. To take advantage of short-term movements in the price relatives, a small window is used. Finally, tcorr < 0 is a correlation
threshold which determines if a stock is sufficiently anti-correlated with another stock
(in which case the weighting of the stock is changed).
A theoretical analysis of the DELTA algorithm seems to be beyond reach, at this stage.
In Sections 7 and 8 we present experimental studies of the performance of the DELTA
algorithm.
6.2 The Relation between Discrete Sequence Prediction and Portfolio Selection
We briefly explore the well established relation between the portfolio selection problem
and prediction of discrete sequences. We then discuss the use of some known prediction
algorithms for the PS problem.
Simply put, the standard worst case prediction game under the log-loss measure is
a special case of the PS game where the adversary is limited to generating only Kelly
market vectors. As mentioned in Section 5.1, Cover and Ordentlich showed that the PS
algorithms UNI and DIR obtain their worst-case behavior over Kelly market sequences.
However, this does not imply that the PS problem is reducible to the prediction problem,
and we will see here several examples of prediction algorithms that are not competitive
(against CBAL-OPT) in the PS context but are competitive in the prediction game.
Here is a brief description of the prediction problem. (For a recent comprehensive
survey of online prediction see [MF98].) In the online prediction problem the online
player receives a sequence of observations x1 , x2 , . . . , xt−1 where the xi are symbols in
some alphabet of size m. At each time instance t the player must generate a prediction bt
for the next, yet unseen symbol xt . The prediction is in general a probability distribution
⁵ The correlation coefficient is a normalized covariance, with the covariance divided by the
product of the standard deviations; that is, Cor(X, Y) = Cov(X, Y)/(std(X) · std(Y)) where
Cov(X, Y) = E[(X − mean(X)) · (Y − mean(Y))].
bt(xt) = bt(xt | x1, . . . , xt−1) over the alphabet. Thus, bt gives the confidence that the
player has in each of the m possible future outcomes. After receiving xt the player
incurs a loss of l(bt, xt) where l is some loss function. Here we concentrate on the log-loss function where l(bt, xt) = − log bt(xt). The total loss of a prediction algorithm
B = b1, b2, . . . , bn with respect to a sequence X = x1, x2, . . . , xn is L(B, X) =
∑_{i=1}^{n} l(bi, xi).
As in the competitive analysis of online PS algorithms, it is customary to measure the
worst case competitive performance of a prediction algorithm with respect to a comparison class of offline prediction strategies. Here we only consider the comparison class of
constant predictors (which correspond to the class of constant rebalanced algorithms in
the PS problem). The “competitive ratio”6 of a strategy B with respect to the comparison
class B is maxX [L(B, X) − inf b∈B L(b, X)].
There is a complete equivalence between (competitive analysis of) online prediction
algorithms under the log loss measure with respect to (offline) constant predictors and
the (competitive analysis of) online PS algorithms with respect to (offline) constant
rebalanced algorithms whenever the only allowable market vectors are Kelly sequences.
To see that, consider the binary case m = 2 and consider the Kelly market sequence X =
x1, . . . , xn where xi ∈ {0, 1} represents a Kelly market vector (xi = 0 corresponds to
the Kelly market vector (1, 0) and xi = 1 corresponds to (0, 1)). For ease of exposition we
now consider (in this binary case) stock indices in {0, 1} (rather than {1, 2}). Let CBALb
be the constant-rebalanced algorithm with portfolio (b, 1 − b). The return of CBALb is
R(b, X) = ∏_{i=1}^{n} ((1 − xi) b + xi (1 − b)). Let (n0, n1) be the type of X (i.e. in X there
are n0 zeros and n1 ones, n0 + n1 = n). Taking the base 2 logarithm of the return and
dividing by n we get
(1/n) log R(b, X) = (n0/n) log b + (n1/n) log(1 − b)
  = −[ (n0/n) log((n0/n)/b) + (n1/n) log((n1/n)/(1 − b)) ] + [ (n0/n) log(n0/n) + (n1/n) log(n1/n) ]
  = −DKL[(n0/n, n1/n) || (b, 1 − b)] − H(n0/n, n1/n) .   (3)
As the KL-divergence DKL(·||·) is always non-negative, the optimal offline choice
for the constant-rebalanced portfolio (CBAL-OPT), the one that maximizes (1/n) log R(b, X), is b∗ =
n0/n, in which case the KL-divergence vanishes.
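The optimality of b∗ = n0/n is easy to confirm numerically; the grid search below (our own sketch, using the convention that b is the weight on stock 0, whose daily return is b when xi = 0) recovers it.

```python
# Numeric check that b* = n0/n maximizes the log-return of CBAL_(b,1-b)
# on a binary Kelly sequence, with weight b placed on stock 0.
from math import log

x = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]          # n0 = 7 zeros, n1 = 3 ones

def log_return(b):
    # Daily return is b when stock 0 wins (xi = 0) and 1-b when stock 1 wins.
    return sum(log(b if xi == 0 else 1 - b) for xi in x)

grid = [k / 1000 for k in range(1, 1000)]
best = max(grid, key=log_return)
assert abs(best - 7 / 10) < 1e-3             # b* = n0/n = 0.7
```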
We now consider the competitive ratio obtained by an algorithm ALG against
CBAL-OPT. Using the above expression for the log-return of CBAL-OPT we get
log(CBAL-OPT/ALG) = log CBAL-OPT − log ALG = −nH(n0/n, n1/n) − log ALG .

⁶ The more common term in the literature is “regret”.
Using Jensen's inequality we have

− log ALG(X) = − log ∏_{i=1}^{n} (xi bi + (1 − xi)(1 − bi))
  = − ∑_{i=1}^{n} log(xi bi + (1 − xi)(1 − bi))
  ≤ − ∑_{i=1}^{n} (xi log bi + (1 − xi) log(1 − bi))
  = ∑_{i=1}^{n} ( xi log(xi/bi) + (1 − xi) log((1 − xi)/(1 − bi)) )
  = ∑_{i=1}^{n} DKL[(xi, 1 − xi) || (bi, 1 − bi)] + ∑_{i=1}^{n} H(xi, 1 − xi) .

(Here bi denotes the weight that ALG places on outcome 1 on day i, so its return on day i is bi if xi = 1 and 1 − bi if xi = 0.)
Since each of the entropies H(xi, 1 − xi) = 0 (as xi = 0 or xi = 1), we have

− log ALG(X) ≤ ∑_{i=1}^{n} DKL[(xi, 1 − xi) || (bi, 1 − bi)] .

Putting all this together,

log(CBAL-OPT/ALG) ≤ ∑_{i=1}^{n} DKL[(xi, 1 − xi) || (bi, 1 − bi)] − nH(n0/n, n1/n) .
In the prediction game the online player must generate a prediction bi for the ith bit of
a binary sequence that is revealed online. The first, KL-divergence, term of the bound
measures in bits the total redundancy or inefficiency of the prediction. The second, entropy, term measures how predictable the sequence X is. In order to prove a competitive
ratio of C against CBAL-OPT, it is sufficient to prove that

∑_{i=1}^{n} DKL[(xi, 1 − xi) || (bi, 1 − bi)] ≤ log C + nH(n0/n, n1/n) .   (4)
If xi = 1 this expression reduces to log(1/bi), and if xi = 0, it reduces to log(1/(1 − bi)).
Therefore, the summation over all these KL-divergence terms can be expressed as the
logarithm of ∏_i (1/zi) where zi = bi iff xi = 1 and zi = 1 − bi iff xi = 0. We thus define,
for an online prediction algorithm, its “probability product” P corresponding to an input
sequence to be ∏_i (1/zi), and to prove a competitive ratio of C it is sufficient to prove that
log P = log ∏_i (1/zi) ≤ log C + nH(n0/n, n1/n). A similar development to the above can
be established for any m > 2.
6.3 The Add-beta Prediction Rule
Consider the following method for predicting the i + 1st bit of a binary sequence, based
on the frequency of zeros and ones that appeared until (and including) the ith round.
We maintain counts C_j^t, j = 0, 1, such that C_0^t records the number of zeros and
C_1^t records the number of ones until and including round t. Based on these counts the
(online) algorithm predicts that the t + 1st bit will be 0 with probability

p_0^{t+1} = (C_0^t + β) / (2β + C_0^t + C_1^t) ,

where the parameter β is a non-negative real. This rule is sometimes called the add-beta prediction rule. The instance β = 1 is called Laplace's law of succession (see
[MF98,CT91,Krich98]), and the case β = 1/2 is known to be optimal in both distributional and distribution-free (worst case) settings [MF98]. In the case where there
are m possible outcomes, 1, . . . , m, the add-beta rule becomes
b_i^{t+1} = (C_i^t + β) / (mβ + ∑_{1≤j≤m} C_j^t) .
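The m-ary add-beta rule is a one-liner in code; this sketch shows both the Laplace (β = 1) and β = 1/2 instances.

```python
# The add-beta rule as code: counts plus beta, normalized. beta = 1 is
# Laplace's law of succession; beta = 1/2 is the optimal choice cited above.

def add_beta(counts, beta=0.5):
    total = sum(counts) + beta * len(counts)
    return [(c + beta) / total for c in counts]

p = add_beta([0, 0, 0])                # m = 3 symbols, nothing seen yet
assert all(abs(pi - 1 / 3) < 1e-12 for pi in p)

p = add_beta([3, 1, 0], beta=1)        # Laplace: (C_i + 1) / (n + m)
assert abs(p[0] - 4 / 7) < 1e-12
```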
For the PS problem, one can use any prediction algorithm, such as the add-beta rule, in
the following straightforward manner. Assume that the price relatives are normalized
every day so that the largest price relative equals one. For each market vector X =
(x1, . . . , xm) consider its Kelly projection K(X), in which all components are zero except
for component arg max_i xi, which is normalized to equal one.⁷ At each round such an
algorithm yields a prediction for each of the m symbols.
Algorithm M0: The first prediction-based PS online algorithm we consider is called M0
(for “Markov of order zero”). This algorithm simply uses the add-beta prediction rule
on the Kelly projections of the market vectors. For the prediction game (equivalently,
PS with Kelly markets) one can show that algorithm M0 with β = 1 is (n + 1)^{m−1}-competitive; that is, it achieves an identical competitive ratio to UNI, the uniform-weighted
PS algorithm of Cover and Ordentlich (see Section 5.1). Here is a sketch of the analysis
for the case m = 2. The first observation is that the return of M0 is the same for all
Kelly sequences of the same type. This result can be shown by induction on the length
of the market sequence. It is now straightforward to calculate the “probability product”
P(J) of M0 with respect to a market sequence X of type J = (n0, n1) (by explicitly
calculating it for a sequence which contains n0 zeros followed by n1 ones), which equals
(n + 1)!/(n0! n1!). Since log (n choose n1) ≤ nH(n0/n, n1/n) (see [CT91]), we have
log P(J) = log( (n + 1) (n choose n1) ) = log(n + 1) + log (n choose n1) ≤ log(n + 1) + nH(n0/n, n1/n) .

⁷ To simplify the discussion, randomly choose amongst the stocks that achieve the maximum
price relative if there is more than one such stock.
Using inequality (4) and the development thereafter it follows that (with β = 1) algorithm
M0 is (n + 1)-competitive in the prediction game (or restricted PS with Kelly sequences),
and it is not hard to prove a tight lower bound of n + 1 on its competitive ratio using
the Kelly sequence of all ones. One can quite easily generalize the above arguments for
m > 2, and it is possible to prove using a similar (but more involved) analysis that M0
based on the add-1/2 rule achieves a competitive ratio of 2(n + 1)^{(m−1)/2}, the same as
the universal DIR algorithm for the general PS problem. (See Merhav and Feder [MF98]
and the references therein.)
Despite the fact that algorithm M0 is competitive in the online prediction game, it is
not competitive (nor even universal) in the unrestricted PS game against offline constant
rebalanced portfolios. For the case m = 2 this can be shown using market sequences of
the form

(1, 0)^n (1 − ǫ, 1)^n .

It is easy to see that the competitive ratio is 2^{Ω(n)}. Cover and Gluss [CG86] show how
a less naive learning algorithm is universal under the assumption that the set of possible
market vectors is bounded. In doing so, they illustrate how their algorithm (based on the
Blackwell [Bl56] approachability-excludability theorem) avoids the potential pitfalls of
a naive counting scheme such as M0.
Algorithm T0: One might surmise that the deficiency of algorithm M0 (in the PS game)
can be attributed to the fact that it ignores useful information in the market vectors. Like M0,
algorithm T0 uses the add-beta rule but now maintains its counters as follows:

C_j^{t+1} = C_j^t + log₂(x_{t+1,j} + 1) .
Here we again assume that price relatives have been normalized so that on any given
day the maximum is one. Clearly, T0 reduces to M0 in the prediction game. Algorithm T0
is also not competitive. This can be shown using sequences of the form

(1, 1)^{tn} (1, 0)^n .

For sufficiently large t, it is clear that the behavior of T0 on the sequence will be similar
to that of CBAL_(1/2,1/2), which is not competitive.
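T0's counter update is shown below as code; unlike M0, it credits every stock by log₂(price relative + 1) rather than incrementing only the winner.

```python
# T0's counter update on a normalized market vector (a sketch): every stock
# gets credit log2(x_j + 1), so a day with identical price relatives leaves
# the counters -- and hence the predictions -- balanced.
from math import log2

def t0_update(counts, x):             # x: price relatives, max normalized to 1
    return [c + log2(xj + 1) for c, xj in zip(counts, x)]

counts = t0_update([0.0, 0.0], [1.0, 0.25])
assert counts[0] == 1.0               # the winner gets log2(2) = 1
assert 0 < counts[1] < 1              # the other stock gets partial credit
```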
6.4 Prediction-Based Algorithms: Lempel-Ziv Trading
One can then suspect that the non-competitiveness of the add-beta variants (M0, T0) is
due to the fact that they ignore dependencies among market vectors (they record only
zero-order statistics). In an attempt to examine this possibility we also consider the
following trading algorithm based on the Lempel-Ziv compression algorithm [ZL78].
The LZ algorithm was also considered in the context of prediction (see Langdon [Lan83]
and Rissanen [Ris83]). Feder [Fed91] and Feder and Gutman [FG92] consider the worst
case competitive performance of the algorithm in the context of gambling. Feder shows
that the LZ algorithm is universal with respect to the (offline) class of all finite state
prediction machines.
Like M0, the PS algorithm based on Lempel-Ziv is the LZ prediction algorithm applied
to the Kelly projections of the market vectors. Using the same nemesis sequence as for
M0, namely

(1, 0)^n (1 − ǫ, 1)^n ,

it is easy to see that a lower bound for the competitive ratio of LZ is 2^{Ω(√n)} and hence
LZ is not competitive by our definition (although it might still be universal).
6.5 Portfolio Selection Work Function Algorithm
In his PhD thesis Ordentlich suggests the following algorithm (which is also a generalization of the M0 algorithm with β = 1/2). For the case m = 2 this algorithm chooses
the next portfolio b_{t+1} to be

b_{t+1} = ((t − 1)/t) b*_t + (1/t) (1/2, 1/2),

where b*_t is the optimal constant rebalanced portfolio until time t. With its two components, one that tracks the optimal offline algorithm so far and a second which may be
viewed as greedy, this algorithm can be viewed as a kind of work function algorithm
(see [BE98]).
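The update above can be sketched directly for m = 2. Here b*_t (the best constant rebalanced portfolio so far) is approximated by a grid search, which is an implementation choice of ours, not part of Ordentlich's description; all names are illustrative.

```python
def crp_wealth(b, xs):
    # wealth of rebalancing to (b, 1-b) on 2-asset market vectors xs
    w = 1.0
    for x in xs:
        w *= b * x[0] + (1 - b) * x[1]
    return w

def work_function_portfolio(xs, grid=1000):
    """Sketch of the work-function rule for m = 2:
    b_{t+1} = ((t-1)/t) * b*_t + (1/t) * (1/2, 1/2),
    with b*_t (the best CRP so far) approximated on a grid."""
    b = 0.5                      # start from the uniform portfolio
    portfolios = [(b, 1 - b)]
    for t in range(1, len(xs) + 1):
        hist = xs[:t]
        b_star = max((i / grid for i in range(grid + 1)),
                     key=lambda v: crp_wealth(v, hist))
        b = ((t - 1) / t) * b_star + (1 / t) * 0.5
        portfolios.append((b, 1 - b))
    return portfolios
```

Note how at t = 1 the tracking term vanishes and the rule again plays (1/2, 1/2), while for large t it leans on b*_t.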
Ordentlich shows that the sequence

[(0, 1) (1, 0)]^n (1, 1−ǫ)

produces a competitive ratio of Ω(n^2) from this algorithm.
We can slightly improve Ordentlich’s lower bound to Ω(n^{5/2}). We concatenate
(1, 0)^i (0, 1)^i for i = 2, . . . , k to the end of Ordentlich’s sequence, with k = Θ(√n), so
that the entire input sequence remains of length O(n).
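The concatenated input can be generated mechanically. The sketch below builds the alternating extreme vectors, the final (1, 1−ǫ) vector, and the appended blocks; the constant inside Θ(√n) is an arbitrary choice for illustration.

```python
def improved_nemesis(n, eps=0.01):
    """Sketch of the lower-bound input: n repetitions of the
    alternating pair (0,1),(1,0), then (1, 1-eps), then blocks
    (1,0)^i (0,1)^i for i = 2..k with k ~ sqrt(n)."""
    seq = []
    for _ in range(n):
        seq += [(0.0, 1.0), (1.0, 0.0)]
    seq.append((1.0, 1.0 - eps))
    k = max(2, int(n ** 0.5))
    for i in range(2, k + 1):
        seq += [(1.0, 0.0)] * i + [(0.0, 1.0)] * i
    return seq
```

The appended blocks contribute roughly 2(2 + 3 + ... + k) = Θ(k^2) = Θ(n) vectors, so the whole sequence indeed stays of length O(n).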
It remains an open question as to whether or not Ordentlich’s algorithm is competitive
(or at least universal). The Ω(n^{5/2}) lower bound shows that this algorithm is not as
competitive as UNI and DIR.
7 Experimental Results
We consider three data sets as test suites for most of the algorithms considered in the
relevant literature.^8 The first data set is the stock market data as first used by Cover
^8 We do not present experiments for the DIR algorithm nor for Ordentlich’s “work function
algorithm”. Even though DIR’s worst case competitive bound is better than that of UNI, it has been
found in practice (see [HSSW98]) that DIR’s performance is worse than UNI. It is computationally
time consuming to even approximate DIR and the work function algorithm. In the full version
of this paper we plan to present experimental results for these algorithms.
[Cov91] and then Cover and Ordentlich [CO96], Helmbold et al [HSSW98], Blum and
Kalai [BK97] and Singer [Sin98].^9 This data set contains 5651 daily prices for 36 stocks
in the New York Stock Exchange (NYSE) for the twenty-two-year period July 3rd, 1962
to Dec 31st, 1984. The second data set consists of 88 stocks from the Toronto Stock
Exchange (TSE), for the five-year period Jan 4th, 1994 to Dec 31st, 1998. The stocks
chosen were those that were traded on each of the 1258 trading days in this period. The
final data set is for intra-day trading in the foreign exchange (FX) market. Specifically,
the data covers the bid-ask quotes between USD ($) and Japanese Yen, and between
USD and German Marks (DM) for the one-year period Oct 1st, 1992 to Sep 30th, 1993.
As explained in Section 7.2, we interpret this data as 479081 price relatives for m = 2
stocks (i.e. Yen and DM).
7.1 Experiments on NYSE data
All 36 stocks in the sample had a positive return over the entire 22 year sample. The
returns ranged from a low of 3.1 to a high (BAH-OPT) of 54. Before running the online
algorithms on pairs of stocks, we determined the CBAL-OPT of each stock when traded
against cash and a 4% bond as shown in Table 2.
Of the 36 stocks, only three benefited from an active trading strategy against cash. The
winning strategy for the remaining 33 stocks was to buy and hold the stock. When cash
was replaced by a 4% annualized bond, seven stocks (the ones highlighted in Table 2)
benefited from active trading. It is interesting to note that the CBAL-OPT of all 36 stocks
is comprised of a portfolio of just 5 stocks. This portfolio has a return of 251.
Stock        Weight
Comm Metals  0.2767
Espey        0.1953
Iroquois     0.0927
Kin Ark      0.2507
Mei Corp     0.1845
Four of these five stocks are also the ones that most benefited from active trading by
CBAL-OPT against a bond. The CBAL-OPT of the remaining 31 stocks is still a respectable
69.9, beating BAH-OPT for the entire 36 stocks.
Pairs of Stocks When looking at individual pairs of stocks, however, one finds a different
story. Instead of just these five or seven stocks benefiting from being paired with one
another, one finds that of the possible 630 pairs, almost half (44%) have a CBAL-OPT that
is 10% or more above the return of the best stock. It is this fact that has encouraged the
consideration of competitive-based online algorithms as this sample does indicate that
many stock pairings can benefit from frequent trading. Of course, it can be argued that
identifying a “profitable pair” is the real problem.
^9 According to Helmbold et al, this data set was originally generated by Hal Stern. We do not
know what criteria were used in choosing this particular set of 36 stocks.
Stock         BAH    BND(4%)  CBAL-OPT(0%)  CBAL-OPT(4%)
Dupont *      3.07   2.41     3.07          3.18
Kin Arc *     4.13   2.41     12.54         18.33
Sears         4.25   2.41     4.25          4.25
Lukens *      4.31   2.41     4.31          4.93
Alcoa *       4.35   2.41     4.35          4.42
Ingersoll     4.81   2.41     4.81          4.81
Texaco        5.39   2.41     5.39          5.39
MMM           5.98   2.41     5.98          5.98
Kodak         6.21   2.41     6.21          6.21
Sher Will     6.54   2.41     6.54          6.54
GM            6.75   2.41     6.75          6.75
Ford          6.85   2.41     6.85          6.85
P and G       6.98   2.41     6.98          6.98
Pillsbury     7.64   2.41     7.64          7.64
GE            7.86   2.41     7.86          7.86
Dow Chem      8.76   2.41     8.76          8.76
Iroquois *    8.92   2.41     9.81          12.08
Kimb Clark    10.36  2.41     10.36         10.36
Fischbach     10.70  2.41     10.70         10.70
IBM           12.21  2.41     12.21         12.21
AHP           13.10  2.41     13.10         13.10
Coke          13.36  2.41     13.36         13.36
Espey *       13.71  2.41     14.88         17.89
Exxon         14.16  2.41     14.16         14.16
Merck         14.43  2.41     14.43         14.43
Mobil         15.21  2.41     15.21         15.21
Amer Brands   16.10  2.41     16.10         16.10
Pillsbury     16.20  2.41     16.20         16.20
Arco          16.90  2.41     16.90         16.90
JNJ           17.22  2.41     17.22         17.22
Mei Corp *    22.92  2.41     22.92         23.29
HP            30.61  2.41     30.61         30.61
Gulf          32.65  2.41     32.65         32.65
Schlum        43.13  2.41     43.13         43.13
Comm Metals   52.02  2.41     52.02         52.02
Morris        54.14  2.41     54.14         54.14

(* marks the seven stocks whose CBAL-OPT against the bond exceeds their BAH return.)
Table 2. Return of CBAL-OPT when traded against 0% cash and a 4% bond. For example,
when Dupont is balanced against cash (respectively, the bond), the return of CBAL-OPT = 3.07
(respectively, 3.18). The seven stocks that profit from active trading against a bond have been
highlighted.
We abbreviate this uniform CBAL algorithm as UCBAL_m. The uniform buy and hold
(denoted UBAH_m) and UCBAL_m algorithms give us reasonable (and perhaps more realistic)
benchmarks by which to compare online algorithms; that is, while one would certainly
expect a good online algorithm to perform well relative to the uniform BAH, it also seems
reasonable to expect good performance relative to UCBAL since both algorithms can be
considered as naive strategies. We found that for over half (51%) of the stock pairings,
CBAL-OPT has a 10% or more advantage over the (1/2, 1/2) CBAL.
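The two naive benchmarks above are one-liners to compute; the sketch below (our own helper names) compounds the daily average price relative for UCBAL_m and lets the initial equal split ride for UBAH_m.

```python
def ucbal_return(market_vectors):
    """Return of the uniform constant rebalanced portfolio UCBAL_m:
    rebalance to (1/m, ..., 1/m) every day, so each day's wealth is
    multiplied by the average price relative."""
    m = len(market_vectors[0])
    wealth = 1.0
    for x in market_vectors:
        wealth *= sum(x) / m
    return wealth

def ubah_return(market_vectors):
    """Return of the uniform buy-and-hold UBAH_m: split the initial
    dollar evenly and never trade."""
    m = len(market_vectors[0])
    per_stock = [1.0 / m] * m
    for x in market_vectors:
        per_stock = [w * r for w, r in zip(per_stock, x)]
    return sum(per_stock)
```

On a volatile pair that ends where it started, UCBAL profits from the rebalancing while UBAH does not, which is exactly the effect the stock-pair experiments exploit.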
The finding that EG (with small learning rate) has no substantial (say, 1%) advantage over UCBAL_2 confirms the comments made in Section 5.2. Previous expositions
demonstrated impressive returns for EG; that is, cases where EG outperformed UNI and could significantly outperform the best stock (in a pair of stocks). The same can now be said for
UCBAL_2. The other interesting result is that DELTA seems to do remarkably better than the
UCBAL_2 algorithm; it is at least 10% better than UCBAL_2 for 344 pairs. In fact, DELTA does
at least 10% better than CBAL-OPT a third of the time (204 pairs). In the full version of this
paper we will present several other algorithms, some that expand and some that limit
the risk.
All Stocks Some of the algorithms were exercised on a portfolio of all 36 stocks. In
order to get the most from the data and detect possible biases, the sample was split up
into 10 equal time periods. These algorithms were then run on each of the segments. In
addition, the algorithms were run on the reverse sequence to see how they would perform
in a falling market. These results will be presented in the full paper.
7.2 Experiments on the TSE and FX data
The TSE and FX data are quite different in nature from the NYSE data. In particular,
while every stock made money in the NYSE data, 32 of the 88 stocks in the TSE data lost
money. The best return was 6.28 (Gentra Inc.) and the worst return was .117 (Pure Gold
Minerals). There were 15 stocks having nonzero weight in CBAL-OPT, with three stocks
(including Gentra Inc.) constituting over 80% of the weight. Unlike its performance on
the NYSE data, with respect to the TSE data, UNI does outperform the online algorithms
UCBAL, EG, M0 and T0. It does not, however, beat DELTA and it is somewhat worse than
UBAH. The FX data was provided to us in a very different form, namely as bid-ask quotes
(as they occur) as opposed to (say) closing daily prices. We interpreted each “tick” as
a price by taking the average value (i.e. (ask+bid)/2). Since each tick only represents
one currency, we merged the ticks into blocks, where each block is the shortest run
of ticks in which each currency is traded. We then either ignored the bid-ask nature
of the prices or we used this information to derive an induced (and seemingly realistic)
transaction cost for trading a given currency at any point in time. The Yen decreased
with a return of 0.8841 while the DM increased with a return of 1.1568. It should also
be noted that the differences in prices for consecutive ticks are usually quite small and
thus frequent trading in the context of even small transaction costs (i.e. spreads) can be
a very poor strategy. Note that for this particular FX data, CBAL-OPT is simply BAH-OPT.
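The tick preprocessing just described can be sketched as follows. The tick layout (currency, bid, ask) and the function name are assumptions for illustration; the logic is the one above: take mid-prices, close a block once every currency has appeared, and emit one price-relative vector per block.

```python
def ticks_to_relatives(ticks):
    """Turn a stream of (currency, bid, ask) ticks into price-relative
    vectors: mid-price = (ask + bid) / 2; a block ends as soon as every
    currency has traded at least once within it."""
    currencies = sorted({c for c, _, _ in ticks})
    prev = None
    relatives = []
    block = {}
    for cur, bid, ask in ticks:
        block[cur] = (bid + ask) / 2.0       # latest mid in this block
        if len(block) == len(currencies):    # block complete
            mids = [block[c] for c in currencies]
            if prev is not None:
                relatives.append([m / p for m, p in zip(mids, prev)])
            prev = mids
            block = {}
    return relatives
```

Each emitted vector plays the role of one "day" of market vectors for the m = 2 currency experiments.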
Table 3 reports on the returns of the various algorithms for all three data sets with
and without transaction costs. Without transaction costs, we see that simple learning
algorithms such as M0 can sometimes do quite well, while more aggressive strategies
such as DELTA and LZ can have fantastic returns.
8 Portfolio Selection with Commissions and Bid-ask Spreads
Algorithmic portfolio selection with commissions is not so well studied. There can be
many commission models. Two simple models are the flat fee model and proportional
(or fixed rate) model. In some markets, such as foreign exchange there are no explicit
commissions at all (for large volume trades) but a similar (and more complicated) effect
is obtained due to buy-sell or bid-ask spreads. In the current reality, with the emerging
Internet trading services, the effect of commissions on traders is becoming less significant
but bid-ask spreads remain. The data for the FX market contains the bid-ask spreads.
When we want to view bid-ask spreads as transaction costs we define the transaction rate
as (ask − bid)/(ask + bid). The resulting (induced) transaction costs are quite nonuniform
(over time), ranging between .00015 and .0094, with a mean transaction rate of .00057.
Table 3 presents the returns for the various algorithms for all data sets using different
transaction costs. For the NYSE and TSE data sets, we used fixed transaction cost rates
of .1% (i.e. very small) and 2% (more or less “full service”). For the FX data we both
artificially introduced fixed rate costs or used the actual bid-ask spreads as discussed
above.
For the simple but important model of fixed rate transaction costs, it is not too difficult
to extend the competitive analysis results to reflect such costs. In particular, Blum and
Kalai [BK97] extend the proof of UNI’s competitiveness to this model of transaction
costs. Suppose then that there is a transaction rate cost of c (0 ≤ c ≤ 1); that is, to buy
(or sell) $d of a stock costs $(c/2)d or, alternatively, we can say that all transaction costs
are paid for by selling at a commission rate of c. Blum and Kalai prove that UNI has a
competitive ratio upper bounded by the binomial coefficient C((1+c)n + m − 1, m − 1),
generalizing the bound in Equation 1 in Section 5.1. Using their proof in Section 5.1,
one can obtain a bound of

CBAL_{b*}(X) / CBAL_b(X) ≤ (1 + 1/n)^{(1+c)n} ≤ e^{(1+c)}

whenever b is near b*, so that the competitive ratio is bounded above by e^{(1+c)} · (n + 1)^{m−1}.
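The fixed-rate cost model is easy to simulate. The sketch below (our own accounting convention: commissions are charged on the dollar volume sold when restoring the target weights, per the selling convention above) computes the wealth of a constant rebalanced portfolio under a transaction rate c.

```python
def crp_wealth_with_costs(b, market_vectors, c=0.0):
    """Wealth of rebalancing to a fixed portfolio b under a
    proportional transaction cost: holdings drift with the price
    relatives, and restoring the target weights costs rate c on the
    dollar volume sold."""
    holdings = list(b)                          # $1 initial wealth
    for x in market_vectors:
        holdings = [h * r for h, r in zip(holdings, x)]
        wealth = sum(holdings)
        target = [wealth * w for w in b]
        sold = sum(max(h - t, 0.0) for h, t in zip(holdings, target))
        wealth -= c * sold                      # commission on sales
        holdings = [wealth * w for w in b]
    return sum(holdings)
```

Setting c = 0 recovers the ordinary CRP wealth; even a small c visibly erodes the return of any strategy that trades a large volume every period, which is the effect seen in Table 3.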
Blum and Kalai [BK97], and then Helmbold et al. [HSSW98] and Singer [Sin98],^10
present a few experimental results which seem to indicate that although transaction costs
are significant, it is still possible to obtain reasonable returns from algorithms such as
UNI and EG. Indeed for the NYSE data, EG “beats the market” even in the presence of 2%
transaction costs. Our experiments seem to indicate that transaction costs may be much
more problematic than some of the previous results and the theoretical competitiveness
(say of UNI) suggests. Algorithms such as LZ and our DELTA algorithm can sometimes have
exceptionally good returns when there are no transaction costs but disastrous returns with
(not unreasonable) costs of 2%.
^10 We have been recently informed by Yoram Singer that the experimental results for his adaptive
γ algorithm are not correct and, in particular, the results with transaction costs are not as
encouraging as reported in [Sin98].
              UBAH     UCBAL    DELTA(1,10,-.01)  DELTA(.1,4,-.01)  M0(.5)
NYSE (0%)     14.4973  27.0752  1.9 × 10^8        326.585           111.849
NYSE (.1%)    14.4901  26.1897  1.7 × 10^7        246.224           105.975
NYSE (2%)     14.3523  13.9218  1.6 × 10^-13      1.15638           38.0202
TSE (0%)      1.61292  1.59523  4.99295           1.93648           1.27574
TSE (.1%)     1.61211  1.58026  2.91037           1.82271           1.2579
TSE (2%)      1.59679  1.32103  9.9 × 10^-05      .576564           .962617
FX (0%)       1.02047  1.0225   22094.4           3.88986           1.01852
FX (.1%)      1.01996  .984083  1.6 × 10^-26      5.7 × 10^-05      .979137
FX (2%)       1.01026  .475361  1.1 × 10^-322     8.2 × 10^-97      .462863
FX (bid-ask)  1.02016  .999421  1.02 × 10^-13     .00653223         .994662

              T0(.5)   EG(.01)  LZ                UNI               CBAL-OPT
NYSE (0%)     27.0614  27.0869  79.7863           13.8663           250.592
NYSE (.1%)    26.1773  26.2012  5.49837           13.8176           NC
NYSE (2%)     13.9252  14.6023  3.5 × 10^-22      9.90825           NC
TSE (0%)      1.59493  1.59164  1.32456           1.60067           6.43390
TSE (.1%)     1.58002  1.58006  .597513           1.58255           NC
TSE (2%)      1.32181  1.34234  1.5 × 10^-07      1.41695           NC
FX (0%)       1.0225   1.0223   716.781           1.02181           1.15682
FX (.1%)      .984083  .984435  1.9 × 10^-32      .996199           1.15624
FX (2%)       .475369  .51265   1.6 × 10^-322     .631336           1.14525
FX (bid-ask)  .999422  .999627  1.04 × 10^-17     1.00645           1.15661
Table 3. The returns of various algorithms for three different data sets using different transaction
costs. Note that UNI and CBAL-OPT have only been approximated. The notation NC indicates an
entry which has not yet been calculated.
9 Concluding Remarks and Future Work
From both a theoretical and experimental point of view, it is clear that a competitive-based
approach to portfolio selection is only just beginning to emerge. In contrast, the related
topic of sequence prediction is much better developed and indeed the practice seems
to closely follow the theory. There are, of course, at least two significant differences;
namely that (first) sequence prediction is a special case of portfolio selection and (second)
that transaction costs (or alternatively, bid-ask spreads) are a reality having a significant
impact. In the terminology of metrical task systems and competitive analysis (see [BE98]
and [BB97]), there is a cost to change states (i.e. portfolios).
On the other hand, PS algorithms can be applied to the area of expert prediction.
Specifically, we view each expert as a stock whose log loss for a given prediction can be
exponentiated to generate a stock price. Applying a PS algorithm to these prices yields
a portfolio which can be interpreted as a mixture of experts. (See Ordentlich [Or96] and
Kalai, Chen, Blum and Rosenfeld [KCBR99].)
Any useful online algorithm must at least “beat the market”; that is, the algorithm
should be able to consistently equal and many times surpass the performance of the
uniform buy and hold. In the absence of transaction costs, all of the online algorithms
discussed in this paper were able to beat the market for the NYSE stock data. However, the
same was not true for the TSE data, nor for the currency data. Furthermore, when even
modest transaction costs (e.g. .1%) were introduced many of the algorithms suffered
significant (and sometimes catastrophic) losses. This phenomenon is most striking for
the DELTA algorithm and for the LZ algorithm which is a very practical algorithm in the
prediction setting.
Clearly the most obvious direction for future research is to understand the extent
to which a competitive based theory of online algorithms can predict performance with
regard to real stock market data. And here we are only talking about a theory that
completely disregards feedback on the market of any successful algorithm. We conclude
with a few questions of theoretical interest.
1. What is the competitive ratio for Ordentlich’s “work function algorithm” and for the
Lempel Ziv PS algorithm?
2. Can an online algorithm that only considers the Kelly projection of the market
input sequence be competitive (or universal)? What other general classes of online
algorithms can be analyzed?
3. How can we define a portfolio selection “learning algorithm”? Is there a “true learning” PS algorithm that can attain the worst case competitive bounds of UNI or DIR?
4. To what extent can PS algorithms utilize “side information”, as defined in
Cover and Ordentlich [CO96]? See the very promising results in Helmbold
et al [HSSW98].
5. Determine the optimal competitive ratio against CBAL-OPT and against OPT in the
context of a fixed commission rate c.
6. Develop competitive bounds within the context of bid-ask spreads.
7. Continue the study of portfolio selection algorithms in the context of “short-selling”. (See Vovk and Watkins [VW98].)
8. Consider other benchmark algorithms as the basis of a competitive theory.
Acknowledgments
We thank Rob Schapire for his very constructive comments. We also thank Steve Bellantoni for providing the TSE data.
References

Bl56. Blackwell, D.: An Analog of the Minimax Theorem for Vector Payoffs. Pacific J. Math. 6 (1956) 1–8
BE98. Borodin, A., El-Yaniv, R.: Online Computation and Competitive Analysis. Cambridge University Press (1998)
BB97. Blum, A., Burch, C.: On-line Learning and the Metrical Task System Problem. Proceedings of the 10th Annual Conference on Computational Learning Theory (COLT '97) 45–53. To appear in Machine Learning.
BK97. Blum, A., Kalai, A.: Universal portfolios with and without transaction costs. Machine Learning 30(1) (1998) 23–30
BCK92. Bollerslev, T., Chou, R.Y., Kroner, K.F.: ARCH Modeling in Finance: A selective review of the theory and empirical evidence. Journal of Econometrics 52 (1992) 5–59
BKM93. Bodie, Z., Kane, A., Marcus, A.J.: Investments. Richard D. Irwin, Inc. (1993)
CG86. Cover, T.M., Gluss, D.: Empirical Bayes Stock Market Portfolios. Advances in Applied Mathematics 7 (1986) 170–181
CO96. Cover, T.M., Ordentlich, E.: Universal portfolios with side information. IEEE Transactions on Information Theory 42(2) (1996)
Cov91. Cover, T.M.: Universal portfolios. Mathematical Finance 1(1) (1991) 1–29
CT91. Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley & Sons, Inc. (1991)
CB99. Cross, J.E., Barron, A.R.: Efficient universal portfolios for past dependent target classes. DIMACS Workshop: On-Line Decision Making, July 1999.
Fed91. Feder, M.: Gambling using a finite state machine. IEEE Trans. Inform. Theory 37 (1991) 1459–1465
FG92. Feder, M., Gutman, M.: Universal Prediction of Individual Sequences. IEEE Trans. Inform. Theory 38 (1992) 1258–1270
Gre72. Green, W.: Econometric Analysis. Collier-McMillan (1972)
HSSW98. Helmbold, D.P., Schapire, R.E., Singer, Y., Warmuth, M.K.: On-line portfolio selection using multiplicative updates. Mathematical Finance 8(4) (1998) 325–347
HW98. Herbster, M., Warmuth, M.K.: Tracking the best expert. Machine Learning 32(2) (1998) 1–29
KCBR99. Kalai, A., Chen, S., Blum, A., Rosenfeld, R.: On-line Algorithms for Combining Language Models. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (1999)
Kel56. Kelly, J.: A new interpretation of information rate. Bell Sys. Tech. Journal 35 (1956) 917–926
Krich98. Krichevskiy, R.E.: Laplace law of succession and universal encoding. IEEE Trans. Inform. Theory 44(1) (1998)
Lan83. Langdon, G.G.: A note on the Lempel-Ziv model for compressing individual sequences. IEEE Trans. Inform. Theory IT-29 (1983) 284–287
Mar59. Markowitz, H.: Portfolio Selection: Efficient Diversification of Investments. John Wiley and Sons (1959)
MF98. Merhav, N., Feder, M.: Universal prediction. IEEE Trans. Inform. Theory 44(6) (1998) 2124–2147
Or96. Ordentlich, E.: Universal Investment and Universal Data Compression. PhD Thesis, Stanford University (1996)
OC96. Ordentlich, E., Cover, T.M.: The cost of achieving the best portfolio in hindsight. Accepted for publication in Mathematics of Operations Research. (Conference version appears in COLT '96 Proceedings under the title "On-line portfolio selection".)
Ris83. Rissanen, J.: A universal data compression system. IEEE Trans. Inform. Theory IT-29 (1983) 656–664
Sin98. Singer, Y.: Switching portfolios. International Journal of Neural Systems 8(4) (1997) 445–455
ZL78. Ziv, J., Lempel, A.: Compression of individual sequences via variable rate coding. IEEE Trans. Inform. Theory IT-24 (1978) 530–536
VW98. Vovk, V., Watkins, C.: Universal Portfolio Selection. COLT 1998.
Almost k-Wise Independence and Hard Boolean
Functions
Valentine Kabanets
Department of Computer Science
University of Toronto
Toronto, Canada
kabanets@cs.toronto.edu
Abstract. Andreev et al. [3] gave constructions of Boolean functions
(computable by polynomial-size circuits) with large lower bounds for
read-once branching programs (1-b.p.'s): a function in P with the lower
bound 2^{n−polylog(n)}, a function in quasipolynomial time with the lower
bound 2^{n−O(log n)}, and a function in LINSPACE with the lower bound
2^{n−log n−O(1)}. We point out alternative, much simpler constructions of
such Boolean functions by applying the idea of almost k-wise independence more directly, without the use of discrepancy set generators for
large affine subspaces; our constructions are obtained by derandomizing
the probabilistic proofs of existence of the corresponding combinatorial
objects. The simplicity of our new constructions also allows us to observe
that there exists a Boolean function in AC0[2] (computable by a depth 3,
polynomial-size circuit over the basis {∧, ⊕, 1}) with the optimal lower
bound 2^{n−log n−O(1)} for 1-b.p.'s.
1 Introduction
Branching programs represent a model of computation that measures the space
complexity of Turing machines. Recall that a branching program is a directed
acyclic graph with one source and with each node of out-degree at most 2. Each
node of out-degree 2 (a branching node) is labeled by an index of an input bit,
with one outgoing edge labeled by 0, and the other by 1; each node of out-degree
0 (a sink) is labeled by 0 or 1. The branching program accepts an input if there
is a path from the source to a sink labeled by 1 such that, at each branching
node of the path, the path contains the edge labeled by the input bit for the
input index associated with that node. Finally, the size of a branching program
is defined as the number of its nodes.
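The definition above translates directly into a small evaluator. The encoding below (a dict mapping node ids either to a sink label 0/1 or to a triple of input index and 0/1 successors) is a toy convention of ours, not from the paper.

```python
def eval_bp(nodes, source, x):
    """Evaluate a branching program on input bits x. `nodes` maps a
    node id either to a sink label (an int 0/1) or to a triple
    (input_index, zero_successor, one_successor)."""
    v = source
    while not isinstance(nodes[v], int):
        i, zero, one = nodes[v]
        v = one if x[i] else zero
    return nodes[v]
```

For instance, a four-branching-node program computing the parity x0 ⊕ x1 reads each variable once on every source-to-sink path, so it is also a 1-b.p. in the sense used below.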
While there are no nontrivial lower bounds on the size of general branching
programs, strong lower bounds were obtained for a number of explicit Boolean
functions in restricted models (see, e.g., [12] for a survey). In particular, for read-once branching programs (1-b.p.'s) — where, on every path from the source to a
sink, no two branching nodes are labeled by the same input index — exponential lower bounds of the form 2^{Ω(√n)} were given for explicit n-variable Boolean
functions in [17,18,5,7,8,16,10,6,4] among others. Moreover, [7,8,6,4] exhibited
Boolean functions in AC0 that require 1-b.p.'s of size at least 2^{Ω(√n)}.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 197–206, 2000.
c Springer-Verlag Berlin Heidelberg 2000
After lower bounds of the form 2^{Ω(√n)} were obtained for 1-b.p.'s, the natural
problem was to find an explicit Boolean function with the truly exponential lower
bound 2Ω(n) . The first such bound was proved in [1] for the Boolean function
computing the parity of the number of triangles in a graph; the constant factor
was later improved in [16]. With the objective to improve this lower bound, Savický and Žák [15] constructed a Boolean function in P that requires a 1-b.p. of
size at least 2^{n−3√n}, and gave a probabilistic construction of a Boolean function
requiring a 1-b.p. of size at least 2^{n−O(log n)}. Finally, Andreev et al. [3] presented a Boolean function in LINSPACE ∩ P/poly with the optimal lower bound
2^{n−log n+O(1)}, and, by derandomizing the probabilistic construction in [15], a
Boolean function in QP ∩ P/poly with the lower bound 2^{n−O(log n)}, as well as
a Boolean function in P with the lower bound 2^{n−polylog(n)}; here QP stands for
the quasipolynomial time n^{polylog(n)}.
The combinatorics of 1-b.p.’s is quite well understood: a theorem of Simon
and Szegedy [16], generalizing the ideas of many papers on the subject, provides
a way of obtaining strong lower bounds. A particular case of this theorem states
that any 1-b.p. computing an r-mixed Boolean function has size at least 2^r − 1.
Informally, an r-mixed function essentially depends on every set of r variables
(see the next section for a precise definition). The reason why this lower-bound
criterion works can be summarized as follows. A subprogram of a 1-b.p. Gn
starting at a node v does not depend on any variable queried along any path going
from the source s of Gn to v, and hence v completely determines a subfunction of
the function computed by Gn . If Gn computes an r-mixed Boolean function fn ,
then any two paths going from s to v can be shown to query the same variables,
whenever v is sufficiently close to s. Hence, such paths must coincide, i.e., assign
the same values to the queried variables; otherwise, two different assignments to
a set of at most r variables yield the same subfunction of fn , contradicting the
fact that fn is r-mixed. It follows that, near the source, Gn is a complete binary
tree, and so it must have exponentially many nodes.
Andreev et al. [3] construct a Boolean function f_n(x_1, . . . , x_n) in LINSPACE ∩
P/poly that is r-mixed for r = n − ⌈log n⌉ − 2 for almost all n. By the lower-bound criterion mentioned above, this yields the optimal lower bound Ω(2^n/n)
for 1-b.p.'s. A Boolean function in DTIME(2^{log² n}) ∩ P/poly that requires a 1-b.p.
of size at least 2^{n−O(log n)} is constructed by reducing the amount of randomness
used in the probabilistic construction of [15] to O(log² n) advice bits. Since these
bits turn out to determine a polynomial-time computable function with the lower
bound 2^{n−O(log n)}, one gets a function in P with the lower bound 2^{n−O(log² n)} by
making the advice bits a part of the input.
Both constructions in [3] use the idea of ǫ-biased sample spaces introduced
by Naor and Naor [9], who also gave an algorithm for generating small sample
spaces; three simpler constructions of such spaces were later given by Alon et
al. [2]. Andreev et al. define certain ǫ-discrepancy sets for systems of linear equations over GF(2), and relate these discrepancy sets to the biased sample spaces
of Naor and Naor through a reduction lemma. Using a particular construction
of a biased sample space (the powering construction from [2]), Andreev et al.
give an algorithm for generating ǫ-discrepancy sets, which is then used to derandomize both a probabilistic construction of an r-mixed Boolean function for
r = n − ⌈log n⌉ − 2 and the construction in [15] mentioned above.
Our results. We will show that the known algorithms for generating small ǫ-biased sample spaces can be applied directly to get the r-mixed Boolean function
as above, and to derandomize the construction in [15]. The idea of our first
construction is very simple: treat the elements (bit strings) of an ǫ-biased sample
space as the truth tables of Boolean functions. This will induce a probability
distribution on Boolean functions such that, on any subset A of k inputs, the
restriction to A of a Boolean function chosen according to this distribution will
look almost as if it were a uniformly chosen random function defined on the set
A. By an easy probabilistic argument, we will show that such a space of functions
will contain the desired r-mixed function, for a suitable choice of parameters ǫ
and k.
We indicate several ways of obtaining an r-mixed Boolean function with r =
n − ⌈log n⌉ − 2. In particular, using Razborov’s construction of ǫ-biased sample
spaces that are computable by AC0 [2] formulas [11] (see also [13]), we prove
that there are such r-mixed functions that belong to the class of polynomial-size
depth 3 formulas over the basis {&, ⊕, 1}. This yields the smallest (nonuniform)
complexity class known to contain Boolean functions with the optimal lower
bounds for 1-b.p.’s. (We remark that, given our lack of strong circuit lower
bounds, it is conceivable that the characteristic function of every language in
EXP can be computed in nonuniform AC0 [6].)
In our second construction, we derandomize a probabilistic existence proof
in [15]. We proceed along the usual path of derandomizing probabilistic algorithms whose analysis depends only on almost k-wise independence rather than
full independence of random bits [9]. Observing that the construction in [15]
is one such algorithm, we reduce its randomness complexity to O(log3 n) bits
(again treating strings of an appropriate sample space as truth tables). This gives
us a DTIME(2^{O(log³ n)})-computable Boolean function of quasilinear circuit-size
with the lower bound for 1-b.p.'s slightly better than that for the corresponding quasipolynomial-time computable function in [3], and a Boolean function in
quasilinear time, QL, with the lower bound for 1-b.p.'s at least 2^{n−O(log³ n)}, which
is only slightly worse than the lower bound for the corresponding polynomial-time function in [3]. In the analysis of our construction, we employ a combinatorial lemma due to Razborov [11], which bounds from above the probability that
none of n events occur, given that these events are almost k-wise independent.
The remainder of the paper. In the following section, we state the necessary
definitions and some auxiliary lemmas. In Section 3, we show how to construct an
r-mixed function that has the same optimal lower bound for 1-b.p. as that in [3],
and observe that such a function can be computed in AC0 [2]. In Section 4, we
give a simple derandomization procedure for a construction in [15], obtaining two
more Boolean functions (computable in polynomial time and quasipolynomial
time, respectively) that are hard with respect to 1-b.p.’s.
2 Preliminaries
Below we recall the standard definitions of k-wise independence and (ǫ, k)-independence. We consider probability distributions that are uniform over some
set S ⊆ {0, 1}^n; such a set is denoted by S_n and called a sample space.
Let S_n be a sample space, and let X = x_1 . . . x_n be a string chosen uniformly
from S_n. Then S_n is k-wise independent if, for any k indices i_1 < i_2 < · · · < i_k
and any k-bit string α, we have Pr[x_{i_1} x_{i_2} . . . x_{i_k} = α] = 2^{−k}. Similarly, for S_n
and X as above, S_n is (ǫ, k)-independent if |Pr[x_{i_1} x_{i_2} . . . x_{i_k} = α] − 2^{−k}| ≤ ǫ for
any k indices i_1 < i_2 < · · · < i_k and any k-bit string α.
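For small sample spaces, the definition can be checked exhaustively. The brute-force sketch below (our own helper, exponential in n and k, intended only for toy instances) returns the smallest ǫ for which a given space is (ǫ, k)-independent.

```python
from itertools import combinations, product

def max_bias(sample_space, k):
    """Smallest eps for which a sample space of n-bit strings is
    (eps, k)-independent: over all k index sets and all k-bit
    patterns, the largest deviation of the pattern frequency
    from 2^-k."""
    n = len(sample_space[0])
    worst = 0.0
    for idx in combinations(range(n), k):
        for alpha in product("01", repeat=k):
            hits = sum(1 for s in sample_space
                       if all(s[i] == a for i, a in zip(idx, alpha)))
            worst = max(worst, abs(hits / len(sample_space) - 2 ** -k))
    return worst
```

The full cube {0, 1}^n has bias 0 for every k ≤ n; interesting sample spaces achieve small bias with far fewer strings.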
Naor and Naor [9] present an efficient construction of small (ǫ, k)-independent
sample spaces; three simpler constructions are given in [2]. Here we recall just
one construction from [2], the powering construction, although any of their three
constructions could be used for our purposes.
Consider the Galois field GF(2^m) and the associated m-dimensional vector space over GF(2). For every element u of GF(2^m), let bin(u) denote the corresponding binary vector in the associated vector space. The sample space Pow_N^{2m} is defined as a set of N-bit strings in which each string ω is determined as follows. Two elements x, y ∈ GF(2^m) are chosen uniformly at random. For each 1 ≤ i ≤ N, the ith bit ω_i is defined as ⟨bin(x^i), bin(y)⟩, where ⟨a, b⟩ denotes the inner product over GF(2) of binary vectors a and b.
Lemma 1 ([2]). The sample space Pow_N^{2m} is (N/2^m, k)-independent for every k ≤ N.
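The powering construction is short enough to sketch directly. The code below is our illustration, not from [2] verbatim; the field GF(2^6) with modulus x^6 + x + 1 is our choice of parameters. Each pair (x, y) contributes one N-bit sample string, and for N = 4 the bound of Lemma 1 promises (1/16, k)-independence.

```python
M, POLY = 6, (1 << 6) | 0b11            # GF(2^6) with modulus x^6 + x + 1

def gf_mul(a, b):
    """Multiply two field elements, reducing modulo the fixed polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):                # degree reached m: subtract modulus
            a ^= POLY
    return r

def inner(u, v):
    """Inner product over GF(2) of two bit vectors stored as integers."""
    return bin(u & v).count("1") & 1

def pow_space(N):
    """All N-bit strings omega with omega_i = <bin(x^i), bin(y)>,
    one string for each choice of x, y in GF(2^m)."""
    space = []
    for x in range(1 << M):
        for y in range(1 << M):
            p, row = 1, []
            for _ in range(N):
                p = gf_mul(p, x)        # p runs through x^1, x^2, ..., x^N
                row.append(inner(p, y))
            space.append(tuple(row))
    return space
```

For N = 4 the space has 2^{2m} = 4096 strings, and all pairwise pattern probabilities empirically stay within ǫ = N/2^m = 1/16 of 1/4, as Lemma 1 predicts.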
As we have mentioned in the introduction, we shall view the strings of the sample space Pow_N^{2m} as the truth tables of Boolean functions of log N variables. It will be convenient to assume that N is a power of 2, i.e., N = 2^n. Thus, the uniform distribution over the sample space Pow_{2^n}^{2m} induces a distribution F_{n,m} on Boolean functions of n variables that satisfies the following lemma.
Lemma 2. Let A be any set of k strings from {0, 1}^n, for any k ≤ 2^n. Let φ be any Boolean function defined on A. For a Boolean function f chosen according to the distribution F_{n,m} defined above, we have |Pr[f|_A = φ] − 2^{−k}| ≤ 2^{−(m−n)}, where f|_A denotes the restriction of f to the set A.
Proof: The k strings in A determine k indices i_1, . . . , i_k in the truth table of f. The function φ is determined by its truth table, a binary string α of length k. Now the claim follows immediately from Lemma 1 and the definition of (ǫ, k)-independence.
Razborov [11] showed that there exist complex combinatorial structures (such
as the Ramsey graphs, rigid graphs, etc.) of exponential size which can be
encoded by polynomial-size bounded-depth Boolean formulas over the basis
{&, ⊕, 1}. In effect, Razborov gave a construction of ǫ-biased sample spaces
(using the terminology of [9]), where the elements of such sample spaces are
the truth tables of AC0 [2]-computable Boolean functions chosen according to a
certain distribution on AC0 [2]-formulas. We describe this distribution next.
Almost k-Wise Independence and Hard Boolean Functions
For n, m, l ∈ N, a random formula F(n, m, l) of depth 3 is defined as

F(n, m, l) = ⊕_{α=1}^{l} &_{β=1}^{m} ((⊕_{γ=1}^{n} λ_{αβγ} x_γ) ⊕ λ_{αβ}),     (1)
where {λαβ , λαβγ } is a collection of (n + 1)ml independent random variables
uniformly distributed on {0, 1}. The following lemma shows that this distribution
determines an ǫ-biased sample space; as observed in [13], a slight modification
of the above construction yields somewhat better parameters, but the simpler
construction would suffice for us here.
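Formulas of shape (1) are easy to evaluate directly. The sketch below is ours; the coefficient arrays lam[a][b][g] and lam0[a][b] play the roles of λ_{αβγ} and λ_{αβ}, and sampling them uniformly yields a function distributed as F(n, m, l).

```python
import random

def make_formula(lam, lam0):
    """Evaluator for the depth-3 formula
    XOR_a AND_b ( (XOR_g lam[a][b][g] * x_g) XOR lam0[a][b] )."""
    l, m, n = len(lam), len(lam[0]), len(lam[0][0])
    def f(x):
        out = 0
        for a in range(l):
            conj = 1
            for b in range(m):
                lin = lam0[a][b]
                for g in range(n):
                    lin ^= lam[a][b][g] & x[g]
                conj &= lin
            out ^= conj
        return out
    return f

def random_formula(n, m, l, seed=None):
    """Sample F(n, m, l): all (n+1)ml coefficients uniform over {0, 1}."""
    rng = random.Random(seed)
    lam = [[[rng.randrange(2) for _ in range(n)] for _ in range(m)]
           for _ in range(l)]
    lam0 = [[rng.randrange(2) for _ in range(m)] for _ in range(l)]
    return make_formula(lam, lam0)

# With l = m = 1, coefficients (1, 1) and offset 0, the formula is x0 XOR x1.
xor2 = make_formula([[[1, 1]]], [[0]])
```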
Lemma 3 ([11]). Let k, l, m ∈ N be any numbers such that k ≤ 2^{m−1}, let A be any set of k strings from {0, 1}^n, and let φ be any Boolean function defined on A. For a Boolean function f computed by the random formula F(n, m, l) defined in (1), we have |Pr[f|_A = φ] − 2^{−k}| ≤ e^{−l·2^{−m}}, where f|_A denotes the restriction of f to the set A.
The proof of Lemma 3 is most easily obtained by manipulating certain discrete Fourier transforms. We refer the interested reader to [11] or [13] for details.
Below we give the definitions of some classes of Boolean functions hard for 1-b.p.’s. We say that a Boolean function f_n(x_1, . . . , x_n) is r-mixed for some r ≤ n if, for every subset X of r input variables {x_{i_1}, . . . , x_{i_r}}, no two distinct assignments to X yield the same subfunction of f in the remaining n − r variables. We shall see in the following section that an r-mixed function for r = n − ⌈log n⌉ − 2 has a nonzero probability in a distribution F_{n,m}, where m ∈ O(n), and in the distribution induced by the random formula F(n, m, l), where m ∈ O(log n) and l ∈ poly(n).
It was observed by many researchers that r-mixed Boolean functions are hard
for 1-b.p.’s. The following lemma is implicit in [17,5], and is a particular case of
results in [7,16].
Lemma 4 ([17,5,7,16]). Let f_n(x_1, . . . , x_n) be an r-mixed Boolean function, for some r ≤ n. Then every 1-b.p. computing f_n has size at least 2^r − 1.
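The r-mixed property can be verified by brute force at toy scale; the checker below is ours, not from the paper. The 3-variable majority function turns out to be 1-mixed but not 2-mixed, so Lemma 4 gives it a (trivial, at this size) 1-b.p. lower bound of 2^1 − 1.

```python
from itertools import combinations, product

def is_r_mixed(f, n, r):
    """Check that for every set of r variables, all 2^r settings of those
    variables induce pairwise distinct subfunctions on the remaining n - r."""
    for idxs in combinations(range(n), r):
        rest = [i for i in range(n) if i not in idxs]
        seen = set()
        for fixed in product((0, 1), repeat=r):
            x = [0] * n
            for i, v in zip(idxs, fixed):
                x[i] = v
            tt = []                       # truth table of the subfunction
            for free in product((0, 1), repeat=n - r):
                for i, v in zip(rest, free):
                    x[i] = v
                tt.append(f(x))
            if tuple(tt) in seen:         # two assignments collide
                return False
            seen.add(tuple(tt))
    return True

maj3 = lambda x: int(x[0] + x[1] + x[2] >= 2)   # 3-variable majority
```

Fixing any one input of maj3 to 0 gives an AND and fixing it to 1 gives an OR of the other two variables, which are distinct; fixing two inputs to (0, 1) or (1, 0) gives the same subfunction, so 2-mixedness fails.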
Following Savický and Žák [15], we call a function φ : {0, 1}n → {1, 2, . . . , n}
(s, n, q)-complete, for some integers s, n, and q, if for every set I ⊆ {1, . . . , n} of
size n − s we have
1. for every 0-1 assignment to the variables xi , i ∈ I, the range of the resulting
subfunction of φ is equal to {1, 2, . . . , n}, and
2. there are at most q different subfunctions of φ, as one varies over all 0-1
assignments to xi , i ∈ I.
Our interest in (s, n, q)-complete functions is justified by the following lemma;
its proof is based on a generalization of Lemma 4.
Lemma 5 ([15]). Let φ : {0, 1}^n → {1, 2, . . . , n} be an (s, n, q)-complete function. Then the Boolean function f_n(x_1, . . . , x_n) = x_{φ(x_1,...,x_n)} requires 1-b.p.’s of size at least 2^{n−s}/q.
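Both completeness conditions are likewise checkable by brute force for tiny parameters. In the sketch below (our code and our toy map), φ(x) = ((x_1 + x_2 + x_3) mod 3) + 1 is (2, 3, 2)-complete, and fn is the induced hard function x_{φ(x)} of Lemma 5.

```python
from itertools import combinations, product

def is_complete(phi, n, s, q):
    """Check the two (s, n, q)-completeness conditions by enumeration."""
    for I in combinations(range(n), n - s):
        free = [i for i in range(n) if i not in I]
        subfns = set()
        for fixed in product((0, 1), repeat=n - s):
            x = [0] * n
            for i, v in zip(I, fixed):
                x[i] = v
            tt = []
            for vals in product((0, 1), repeat=s):
                for i, v in zip(free, vals):
                    x[i] = v
                tt.append(phi(x))
            if set(tt) != set(range(1, n + 1)):   # condition 1: full range
                return False
            subfns.add(tuple(tt))
        if len(subfns) > q:                       # condition 2: few subfunctions
            return False
    return True

phi = lambda x: (x[0] + x[1] + x[2]) % 3 + 1      # toy (2, 3, 2)-complete map
fn = lambda x: x[phi(x) - 1]                      # the hard function x_{phi(x)}
```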
The following lemma can be used to construct an (s, n, q)-complete function.
Lemma 6 ([15]). Let A be a t×n matrix over GF(2) with every t×s submatrix
of rank at least r. Let ψ : {0, 1}t → {1, 2, . . . , n} be a mapping such that its
restriction to every affine subset of {0, 1}t of dimension at least r has the range
{1, 2, . . . , n}. Then the function φ(x) = ψ(Ax) is (s, n, 2t )-complete.
A probabilistic argument shows that a t × n matrix A and a function ψ : {0, 1}^t → {1, 2, . . . , n} exist that satisfy the assumptions of Lemma 6 for the choice of parameters s, t, r ∈ O(log n), thereby yielding a Boolean function that requires 1-b.p.’s of size at least 2^{n−O(log n)}. Below we will show that the argument
uses only limited independence of random bits, and hence it can be derandomized
using the known constructions of (ǫ, k)-independent spaces. Our proof will utilize
the following lemma of Razborov.
Lemma 7 ([11]). Let l ≥ 2k be natural numbers, let 0 < θ, ǫ < 1, and let E_1, . . . , E_l be events such that, for every subset I ⊆ {1, . . . , l} of size at most k, we have |Pr[∧_{i∈I} E_i] − θ^{|I|}| ≤ ǫ. Then

Pr[∧_{i=1}^{l} Ē_i] ≤ e^{−θl} + C(l, k+1)·(ǫk + θ^k),

where C(l, k+1) denotes the binomial coefficient.
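A quick numeric sanity check of Lemma 7 (our code; we read the middle term as ǫ·k, and the parameter values below are illustrative only): for exactly independent events with Pr[E_i] = θ, the hypothesis holds with ǫ = 0 and the left-hand side is exactly (1 − θ)^l.

```python
from math import comb, exp

def razborov_bound(theta, eps, l, k):
    """Right-hand side of Lemma 7 (reading the middle term as eps * k)."""
    return exp(-theta * l) + comb(l, k + 1) * (eps * k + theta ** k)

# Exactly independent events: the lemma's hypothesis holds with eps = 0,
# and the probability that none of the l events occurs is (1 - theta)^l.
theta, l, k = 0.1, 50, 5
exact = (1 - theta) ** l
```

Since 1 − θ ≤ e^{−θ}, the exact probability is already below the first term e^{−θl} alone; the binomial term is the price paid for only approximate independence.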
3 Constructing r-Mixed Boolean Functions
First, we give a simple probabilistic argument showing that r-mixed functions
exist for r = n − ⌈log n⌉ − 2. Let f be a Boolean function on n variables that
is chosen uniformly at random from the set of all Boolean n-variable functions.
For any fixed set of indices {i1 , . . . , ir } ⊆ {1, . . . , n} and any two fixed binary
strings α = α1 , . . . , αr and β = β1 , . . . , βr , the probability that fixing xi1 , . . . , xir
to α and then to β will give the same subfunction of f in the remaining n − r
variables is 2^{−k}, where k = 2^{n−r}. Thus, the probability that f is not r-mixed is at most C(n, r)·2^{2r}·2^{−k} (here C(n, r) denotes the binomial coefficient), which tends to 0 as n grows.
We observe that the above argument only used the fact that f is random on
any set of 2k inputs: those obtained after the r variables xi1 , . . . , xir are fixed
to α, the set of which will be denoted as Aα , plus those obtained after the same
variables are fixed to β, the set of which will be denoted as Aβ . This leads us to
the following theorem.
Theorem 1. There is an m ∈ O(n) for which the probability that a Boolean n-variable function f chosen according to the distribution F_{n,m} is r-mixed, for r = n − ⌈log n⌉ − 2, tends to 1 as n grows.
Proof: By Lemma 2, the distribution F_{n,m} yields a function f which is equal to any fixed Boolean function φ defined on a set A_α ∪ A_β of 2k inputs with probability at most 2^{−2k} + 2^{−(m−n)}. The number of functions φ that assume the same values on the corresponding pairs of elements a ∈ A_α and b ∈ A_β is 2^k. Thus, the probability that f is not r-mixed is at most C(n, r)·2^{2r}·(2^{−k} + 2^{−(m−n−k)}).
If m = (7 + δ)n for any δ > 0, then this probability tends to 0 as n grows.
By definition, each function from F_{n,m} can be computed by a Boolean circuit of size poly(n, m). It must also be clear that checking whether a function from
F_{n,m}, given by a 2m-bit string, is r-mixed can be done in LINSPACE. It follows from Theorem 1 that we can find an r-mixed function, for r = n − ⌈log n⌉ − 2, in LINSPACE by picking the lexicographically first string of 2m bits that determines such a function. By Lemma 4, this function will have the optimal lower bound for 1-b.p.’s, Ω(2^n/n).
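The “lexicographically first good object” step can be mimicked at toy scale. The sketch below (ours) scans all truth tables of 3-variable functions in lexicographic order and returns the first 1-mixed one; in the theorem, the scan is over the 2m-bit seeds of the sample space rather than over all truth tables.

```python
from itertools import combinations, product

def is_r_mixed_tt(tt, n, r):
    """r-mixedness test for a function given by a truth table of length 2^n."""
    f = lambda x: tt[int("".join(map(str, x)), 2)]
    for idxs in combinations(range(n), r):
        rest = [i for i in range(n) if i not in idxs]
        seen = set()
        for fixed in product((0, 1), repeat=r):
            x = [0] * n
            for i, v in zip(idxs, fixed):
                x[i] = v
            sub = []
            for free in product((0, 1), repeat=n - r):
                for i, v in zip(rest, free):
                    x[i] = v
                sub.append(f(x))
            if tuple(sub) in seen:
                return False
            seen.add(tuple(sub))
    return True

def first_mixed(n, r):
    """Lexicographically first truth table of an r-mixed n-variable function."""
    for code in range(2 ** 2 ** n):
        tt = [(code >> i) & 1 for i in range(2 ** n)]
        if is_r_mixed_tt(tt, n, r):
            return tt
    return None
```

The search is guaranteed to succeed here because 1-mixed 3-variable functions exist (majority is one).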
We should point out that any of the three constructions of small (ǫ, k)-independent spaces in [2] could be used in the same manner as described above to obtain an r-mixed Boolean function computable in LINSPACE ∩ P/poly, for r = n − ⌈log n⌉ − 2. Applying Lemma 3, we can obtain an r-mixed function with the same value of r.
Theorem 2. There are m ∈ O(log n) and l ∈ poly(n) for which the probability
that a Boolean n-variable function f computed by the random formula F(n, m, l)
defined in (1) is r-mixed, for r = n − ⌈log n⌉ − 2, tends to 1 as n grows.
Proof: Proceeding as in the proof of Theorem 1, with Lemma 3 applied instead of Lemma 2, we obtain that the probability that f is not r-mixed is at most C(n, r)·2^{2r}·(2^{−k} + 2^{−(l·2^{−m} − k)}). If m = ⌈log n⌉ + 3 and l = (6 + δ)n·2^m for any δ > 0, then this probability tends to 0 as n grows.
Corollary 1. There exists a Boolean function computable by a polynomial-size depth-3 formula over the basis {&, ⊕, 1} that requires 1-b.p.’s of size at least Ω(2^n/n) for all sufficiently large n.
4 Constructing (s, n, q)-Complete Functions
Let us take a look at the probabilistic proof (as presented in [15]) of the existence
of a matrix A and a function ψ with the properties assumed in Lemma 6. Suppose
that a t × n matrix A over GF(2) and a function ψ : {0, 1}t → {1, 2, . . . , n} are
chosen uniformly at random. For a fixed t × s submatrix B of A, if rank(B) < r,
then there is a set of at most r − 1 columns in B whose linear span contains
each of the remaining s − r + 1 columns of B. For a fixed set R of such r − 1
columns in B, the probability that each of the s − r + 1 vectors chosen uniformly
at random will be in the linear span of R is at most (2^{r−1}/2^t)^{s−r+1}. Thus, the probability that the matrix A is “bad” is at most C(n, s)·C(s, r−1)·2^{−(t−r+1)(s−r+1)}.
For a fixed affine subspace H of {0, 1}^t of dimension r and a fixed 1 ≤ i ≤ n, the probability that the range of ψ restricted to H does not contain i is at most (1 − 1/n)^{2^r}. The number of different affine subspaces of {0, 1}^t of dimension r is at most 2^{(r+1)t}; the number of different i’s is n. Hence the probability that ψ is “bad” is at most 2^{(r+1)t}·n·(1 − 1/n)^{2^r} ≤ 2^{(r+1)t}·n·e^{−2^r/n}.
An easy calculation shows that setting s = ⌈(2 + δ) log n⌉, t = ⌈(3 + δ) log n⌉,
and r = ⌈log n + 2 log log n + b⌉, for any δ > 0 and sufficiently large b (say, b = 3
and δ = 0.01 ), makes both the probability that A is “bad” and the probability
that ψ is “bad” tend to 0 as n grows.
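The rank condition of Lemma 6 is directly testable over GF(2). The helpers below are ours; rows are stored as integer bitmasks, and the second function takes the minimum rank over all column-submatrices with s columns (feasible only for toy sizes).

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) by Gaussian elimination; rows are integers read
    as bit vectors."""
    basis = {}                          # leading-bit position -> reduced row
    for r in rows:
        while r:
            p = r.bit_length() - 1
            if p not in basis:
                basis[p] = r
                break
            r ^= basis[p]               # eliminate the leading bit
    return len(basis)

def min_submatrix_rank(rows, n, s):
    """Minimum GF(2)-rank over all t x s column-submatrices of a t x n
    matrix given as a list of t row bitmasks."""
    best = None
    for cols in combinations(range(n), s):
        proj = [sum(((r >> c) & 1) << j for j, c in enumerate(cols))
                for r in rows]
        rk = gf2_rank(proj)
        best = rk if best is None else min(best, rk)
    return best
```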
Theorem 3. There are d_1, d_2, d_3 ∈ N such that every (2^{−d_1 log³ n}, d_2 log² n)-independent sample space over n^{d_3}-bit strings contains both a matrix A and a function ψ with the properties as in Lemma 6, for s, r, t ∈ O(log n).
Proof: We observe that both probabilistic arguments used only partial independence of random bits. For A, we need a tn-bit string coming from an (ǫ, k)-independent sample space with k = ts and ǫ = 2^{−c_1 log² n}, for a sufficiently large
constant c1 . Indeed, for a fixed t × s submatrix B of A and a fixed set R of
r − 1 columns in B, the number of “bad” t × s-bit strings α filling B so that the column vectors in R contain in their linear span all the remaining s − r + 1 column vectors of B is at most 2^{(r−1)t}·2^{(r−1)(s−r+1)} = 2^{(r−1)(s+t−r+1)}. If A is chosen from the (ǫ, k)-independent sample space with ǫ and k as above, then the probability that some fixed “bad” string α is chosen is at most 2^{−ts} + ǫ. Thus,
in this case, the probability that A is “bad” is at most

C(n, s)·C(s, r−1)·(2^{−(t−r+1)(s−r+1)} + ǫ·2^{(r−1)(s+t−r+1)}).
Choosing the same s, t, and r as in the case of fully independent probability
distribution, one can make this probability tend to 0 as n grows, by choosing
sufficiently large c1 .
Similarly, for the function ψ, we need a 2^t⌈log n⌉-bit string from an (ǫ, k)-independent sample space with k = c_2 log² n and ǫ = 2^{−c_3 log³ n}, for sufficiently large constants c_2 and c_3. Here we view the truth table of ψ as a concatenation of 2^t ⌈log n⌉-bit strings, where each ⌈log n⌉-bit string encodes a number from
{1, . . . , n}. The proof, however, is slightly more involved in this case, and depends
on Lemma 7.
Let s, r, and t be the same as before. For a fixed affine subspace H ⊆ {0, 1}^t of dimension r, such that H = {a_1, . . . , a_l} for l = 2^r, and for a fixed 1 ≤ i ≤ n, let E_j, 1 ≤ j ≤ l, be the event that ψ(a_j) = i when ψ is chosen from the (ǫ, k)-independent sample space defined above. Then Lemma 7 applies with θ = 2^{−⌈log n⌉}, yielding that the probability that ψ misses the value i on the subspace H is

Pr[∧_{j=1}^{l} Ē_j] ≤ e^{−2^{r−⌈log n⌉}} + C(2^r, k+1)·(ǫk + 2^{−k⌈log n⌉}).     (2)
It is easy to see that the first term on the right-hand side of (2) is at most 2^{−4 log² n} (when b = 3 in r). We need to bound from above the remaining two terms: C(2^r, k+1)·2^{−k⌈log n⌉} and C(2^r, k+1)·ǫk. Using Stirling’s formula, one can show that the first of these two terms can be made at most 2^{−4 log² n}, by choosing c_2 sufficiently large. Having fixed c_2, we can also make the second of the terms at most 2^{−4 log² n}, by choosing c_3 > c_2 sufficiently large. It is then straightforward to verify that the probability that ψ misses at least one value i, 1 ≤ i ≤ n, on at least one affine subspace of dimension r tends to 0 as n grows.
Using any efficient construction of almost independent sample spaces, for example, Pow_N^{2m} with N = tn ∈ O(n log n) and m ∈ O(log² n), we can find a matrix A with the required properties in DTIME(2^{O(log² n)}) by searching through all elements of the sample space and checking whether any of them yields a desired matrix. Analogously, we can find the required function ψ in DTIME(2^{O(log³ n)}),
by considering, e.g., Pow_{N′}^{2m′} with N′ = 2^t⌈log n⌉ and m′ ∈ O(log³ n). Thus, constructing both A and ψ can be carried out in quasipolynomial time.
Given the corresponding advice strings of O(log³ n) bits, ψ is computable in time polylog(n) and all elements of A can be computed in time n·polylog(n). So, in this case, the function φ(x) = ψ(Ax) is computable in quasilinear time. Hence, by “hard-wiring” good advice strings, we get the function f_n(x) = x_{φ(x)} computable by quasilinear-size circuits, while, by Lemmas 5 and 6, f_n requires 1-b.p.’s of size at least 2^{n−(5+ǫ) log n}, for any ǫ > 0 and sufficiently large n; these parameters appear to be better than those in [3]. By making the advice strings a part of the input, we obtain a function in QL that requires 1-b.p.’s of size at least 2^{n−O(log³ n)}.
We end this section by observing that the method used above to construct an (s, n, q)-complete Boolean function could also be used to construct an r-mixed Boolean function for r = n − O(log n) by derandomizing Savický’s [14] modification of the procedure in [15]. This r-mixed function is also determined by an advice string of length polylog(n), and hence can be constructed in quasipolynomial time.
5 Concluding Remarks
We have shown how the well-known constructions of small ǫ-biased sample
spaces [11,9,2] can be directly used to obtain Boolean functions that are exponentially hard for 1-b.p.’s. One might argue, however, that the hard Boolean
functions constructed in Sections 3 and 4 are not “explicit” enough, since they
are defined as the lexicographically first functions in certain search spaces. It
would be interesting to find a Boolean function in P or NP with the optimal lower
bound Ω(2^n/n) for 1-b.p.’s. The problem of constructing a polynomial-time computable r-mixed Boolean function with r as large as possible is of independent interest; at present, the best such function is given in [15] for r = n − Ω(√n).
A related open question is to determine whether the minimum number of bits
needed to specify a Boolean function with the optimal lower bound for 1-b.p.’s,
or an r-mixed Boolean function for r = n − ⌈log n⌉ − 2, can be sublinear.
Acknowledgements. I am indebted to Alexander Razborov for bringing [11] to
my attention. I would like to thank Stephen Cook and Petr Savický for their
comments on a preliminary version of this paper, and Dieter van Melkebeek
for helpful discussions. I also want to express my sincere gratitude to Stephen
Cook for his constant encouragement and support. Finally, I am grateful to the
anonymous referee for suggestions and favourable comments.
References
1. M. Ajtai, L. Babai, P. Hajnal, J. Komlós, P. Pudlák, V. Rödl, E. Szemerédi, and G. Turán. Two lower bounds for branching programs. In Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing, pages 30–38, 1986.
2. N. Alon, O. Goldreich, J. Håstad, and R. Peralta. Simple constructions of almost k-wise independent random variables. Random Structures and Algorithms, 3(3):289–304, 1992. (preliminary version in FOCS’90).
3. A.E. Andreev, J.L. Baskakov, A.E.F. Clementi, and J.D.P. Rolim. Small pseudorandom sets yield hard functions: New tight explicit lower bounds for branching
programs. Electronic Colloquium on Computational Complexity, TR97-053, 1997.
4. B. Bollig and I. Wegener. A very simple function that requires exponential size
read-once branching programs. Information Processing Letters, 66:53–57, 1998.
5. P.E. Dunne. Lower bounds on the complexity of one-time-only branching programs. In L. Budach, editor, Proceedings of the Second International Conference
on Fundamentals of Computation Theory, volume 199 of Lecture Notes in Computer Science, pages 90–99, Springer Verlag, Berlin, 1985.
6. A. Gál. A simple function that requires exponential size read-once branching programs. Information Processing Letters, 62:13–16, 1997.
7. S. Jukna. Entropy of contact circuits and lower bound on their complexity. Theoretical Computer Science, 57:113–129, 1988.
8. M. Krause, C. Meinel, and S. Waack. Separating the eraser Turing machine classes
Le , NLe , co − NLe and Pe . Theoretical Computer Science, 86:267–275, 1991.
9. J. Naor and M. Naor. Small-bias probability spaces: Efficient constructions and
applications. SIAM Journal on Computing, 22(4):838–856, 1993. (preliminary
version in STOC’90).
10. S. Ponzio. A lower bound for integer multiplication with read-once branching
programs. SIAM Journal on Computing, 28(3):798–815, 1999. (preliminary version
in STOC’95).
11. A.A. Razborov. Bounded-depth formulae over {&, ⊕} and some combinatorial
problems. In S. I. Adyan, editor, Problems of Cybernetics. Complexity Theory and
Applied Mathematical Logic, pages 149–166. VINITI, Moscow, 1988. (in Russian).
12. A.A. Razborov. Lower bounds for deterministic and nondeterministic branching
programs. In L. Budach, editor, Proceedings of the Eighth International Conference on Fundamentals of Computation Theory, volume 529 of Lecture Notes in
Computer Science, pages 47–60, Springer Verlag, Berlin, 1991.
13. P. Savický. Improved Boolean formulas for the Ramsey graphs. Random Structures
and Algorithms, 6(4):407–415, 1995.
14. P. Savický. personal communication, January 1999.
15. P. Savický and S. Žák. A large lower bound for 1-branching programs. Electronic Colloquium on Computational Complexity, TR96-036, 1996.
16. J. Simon and M. Szegedy. A new lower bound theorem for read-only-once branching
programs and its applications. In J.-Y. Cai, editor, Advances in Computational
Complexity, pages 183–193. AMS-DIMACS Series, 1993.
17. I. Wegener. On the complexity of branching programs and decision trees for clique
function. Journal of the ACM, 35:461–471, 1988.
18. S. Žák. An exponential lower bound for one-time-only branching programs. In Proceedings of the Eleventh International Symposium on Mathematical Foundations
of Computer Science, volume 176 of Lecture Notes in Computer Science, pages
562–566, Springer Verlag, Berlin, 1984.
Improved Upper Bounds on the Simultaneous
Messages Complexity of the Generalized
Addressing Function
Andris Ambainis¹ and Satyanarayana V. Lokam²
¹ Department of Computer Science, University of California at Berkeley, Berkeley, CA. ambainis@cs.berkely.edu
² Department of Mathematical and Computer Sciences, Loyola University Chicago, Chicago, IL 60626. satya@math.luc.edu
Abstract. We study communication complexity in the model of Simultaneous Messages (SM). The SM model is a restricted version of the
well-known multiparty communication complexity model [CFL,KN]. Motivated by connections to circuit complexity, lower and upper bounds on
the SM complexity of several explicit functions have been intensively
investigated in [PR,PRS,BKL,Am1,BGKL].
A class of functions called the Generalized Addressing Functions (GAF),
denoted GAFG,k , where G is a finite group and k denotes the number of
players, plays an important role in SM complexity. In particular, lower
bounds on SM complexity of GAFG,k were used in [PRS] and [BKL]
to show that the SM model is exponentially weaker than the general
communication model [CFL] for sufficiently small number of players.
Moreover, certain unexpected upper bounds from [PRS] and [BKL] on
SM complexity of GAFG,k have led to refined formulations of certain
approaches to circuit lower bounds.
In this paper, we show improved upper bounds on the SM complexity of GAF_{Z_2^t,k}. In particular, when there are three players (k = 3), we give an upper bound of O(n^{0.73}), where n = 2^t. This improves a bound of O(n^{0.92}) from [BKL]. The lower bound in this case is Ω(√n) [BKL,PRS]. More generally, for the k-player case, we prove an upper bound of O(n^{H(1/(2k−2))}), improving a bound of O(n^{H(1/k)}) from [BKL],
where H(·) denotes the binary entropy function. For large enough k, this is nearly a quadratic improvement. The corresponding lower bound is Ω(n^{1/(k−1)}/(k − 1)) [BKL,PRS]. Our proof extends some algebraic techniques from [BKL] and employs a greedy construction of covering codes.
1 Introduction
The Multiparty Communication Model: The model of multiparty communication complexity plays a fundamental role in the study of Boolean function
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 207–216, 2000.
© Springer-Verlag Berlin Heidelberg 2000
complexity. It was introduced by Chandra, Furst, and Lipton [CFL] and has been
intensively studied (see the book by Kushilevitz and Nisan [KN] and references
therein). In a multiparty communication game, k players wish to collaboratively
evaluate a Boolean function f (x0 , . . . , xk−1 ). The i-th player knows each input
argument except xi ; we will refer to xi as the input missed by player i. We can
imagine input xi written on the forehead of player i. The players communicate
using a blackboard, visible to all the players. Each player has unlimited computational power. The “algorithm” followed by the players in their exchange of
messages is called a protocol. The cost of a protocol is the total number of bits
communicated by the players in evaluating f at a worst-case input. The multiparty communication complexity of f is then defined as the minimum cost of a
protocol for f .
The Simultaneous Messages (SM) Model: A restricted model of multiparty communication complexity, called the Simultaneous Messages (SM) model,
recently attracted much attention. It was implicit in a paper by Nisan and
Wigderson [NW, Theorem 7] for the case of three players. The first papers investigating the SM model in detail are by Pudlák, Rödl, and Sgall [PRS] (under the
name “Oblivious Communication Complexity”), and independently, by Babai,
Kimmel, and Lokam [BKL].
In the k-party SM model, we have k players as before with input xi of
f (x0 , . . . , xk−1 ) written on the forehead of the i-th player. However, in this
model, the players are not allowed to communicate with each other. Instead,
each player simultaneously sends a single message to a referee who sees none of
the input. The referee announces the value of the function upon receiving the
messages from the players. An SM protocol specifies how each of the players can
determine the message to be sent based on the part of the input that player
sees, as well as how the referee would determine the value of the function based
on the messages received from the players. All the players and the referee are
assumed to have infinite computational power. The cost of an SM protocol is
defined to be the maximum number of bits sent by a player to the referee, and
the SM-complexity of f is defined to be the minimum cost of an SM protocol
for f on a worst-case input. Note that in the SM model, we use the ℓ∞ -norm
of the message lengths as the complexity measure as opposed to the ℓ1 -norm in
the general model from [CFL] described above.
The main motivation for studying the SM model comes from the observation
that sufficiently strong lower bounds in this restricted model already have some of
the same interesting consequences to Boolean circuit complexity as the general
multiparty communication model. Moreover, it is proved in [PRS] and [BKL]
that the SM model is exponentially weaker than the general communication
model when the number of players is at most (log n)1−ǫ for any constant ǫ > 0.
This exponential gap is proved by comparing the complexities of the Generalized
Addressing Function (GAF) in the respective models.
Generalized Addressing Function (GAF): The input to GAFG,k , where
G is a group of order n, consists of n + (k − 1) log n bits partitioned among the
players as follows: player 0 gets a function x0 : G −→ {0, 1} (represented as an
n-bit string) on her forehead whereas players 1 through k − 1 get group elements
x1 , . . . , xk−1 , respectively, on their foreheads. The output of GAFG,k on this
input is the value of the function x0 on x1 ◦ . . . ◦ xk−1 , where ◦ represents the
group operation in G. Formally,
GAFG,k (x0 , x1 , . . . , xk−1 ) := x0 (x1 ◦ . . . ◦ xk−1 ).
In [BKL], a general lower bound is proved on the SM complexity of GAF_{G,k} for any finite group G. In particular, they prove a lower bound Ω(n^{1/(k−1)}/(k − 1)) for GAF_{Z_n,k} (i.e., G is a cyclic group) and GAF_{Z_2^t,k} (i.e., G is a vector space over GF(2)). Pudlák, Rödl, and Sgall [PRS] consider the special case of GAF_{Z_2^t,k} and prove the same lower bound using essentially the same technique.
Upper Bounds on SM complexity: While the Simultaneous Messages
model itself was motivated by lower bound questions, there have been some
unexpected developments in the direction of upper bounds in this model. We
describe some of these upper bounds and their significance below. This paper is
concerned with upper bounds on the SM complexity of GAFZt2 ,k .
The results of [BKL] and [PRS] include some upper bounds on the SM complexity of GAFZt2 ,k and GAFZn ,k , respectively. In [BGKL], upper bounds are
also proved on a class of functions defined by certain depth-2 circuits. This class
included the “Generalized Inner Product” (GIP) function, which was a prime
example in the study and applications of multiparty communication complexity
[BNS,G,HG,RW], and the “Majority of Majorities” function.
Babai, Kimmel, and Lokam [BKL] show an O(n^{0.92}) upper bound for GAF_{Z_2^t,3}, i.e., on the 3-party SM complexity of GAF_{G,k} when G = Z_2^t. More generally, they show an O(n^{H(1/k)}) upper bound for GAF_{Z_2^t,k}. Pudlák, Rödl, and Sgall [PRS] prove upper bounds for GAF_{Z_n,k}. They show an O(n log log n/log n) upper bound for k = 3, and an O(n^{6/7}) bound for k ≥ c log n. (Actually, upper bounds
in [PRS] are proved for the so-called “restricted semilinear protocols,” but it is
easy to see that they imply essentially the same upper bounds on SM complexity.)
These upper bounds are significantly improved by Ambainis [Am1] to O(n log^{1/4} n / 2^{√log n}) for k = 3 and to O(n^ǫ) for an arbitrary ǫ > 0 for k = O((log n)^{c(ǫ)}). Note that the upper bounds for GAF_{Z_2^t,k} are much better
than those for GAFZn ,k . It is interesting that the upper bounds, in contrast to
the lower bound, appear to depend heavily on the structure of the group G.
Specifically, the techniques used in [BKL] and in this paper for GAFZt2 ,k and
those used in [PRS] and [Am1] for GAFZn ,k seem to be quite different.
Our upper bounds: In this paper, we give an O(n^{0.73}) upper bound on the (3-player) SM complexity of GAF_{Z_2^t,3}, improving the upper bound of O(n^{0.92}) from [BKL] for the same problem. For general k we show an upper bound of O(n^{H(1/(2k−2))}) for GAF_{Z_2^t,k}, improving the upper bound of O(n^{H(1/k)}) from [BKL]. For large k, this is nearly a quadratic improvement. The lower bound on the SM complexity of GAF_{Z_2^t,k} is Ω(n^{1/(k−1)}/(k − 1)). Our results extend
some of the algebraic ideas from [BKL] and employ a greedy construction of covering codes of {0, 1}^n.
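The greedy construction of covering codes referred to above is, in our reading, an instance of the classic greedy set-cover heuristic: repeatedly pick the center whose radius-ρ Hamming ball covers the most yet-uncovered points. A toy sketch (our code and parameters) for {0, 1}^4 with radius 1:

```python
def greedy_cover(l, radius):
    """Greedily pick centers until every point of {0,1}^l lies within
    Hamming distance `radius` of some center (greedy set cover)."""
    def ball(c):
        pts = {c}
        frontier = {c}
        for _ in range(radius):
            frontier = {p ^ (1 << i) for p in frontier for i in range(l)}
            pts |= frontier
        return pts

    uncovered = set(range(1 << l))
    centers = []
    while uncovered:
        # pick the center covering the most still-uncovered points
        best = max(range(1 << l), key=lambda c: len(ball(c) & uncovered))
        centers.append(best)
        uncovered -= ball(best)
    return centers
```

Any cover of {0, 1}^4 by radius-1 balls (of size 5) needs at least ⌈16/5⌉ = 4 centers, and the greedy cover trivially uses at most 16.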
Significance of Upper Bounds: Upper bounds are obviously useful in assessing the strength of lower bounds. However, upper bounds on SM complexity
are interesting for several additional reasons.
First of all, upper bounds on SM complexity of GAF have led to a refined
formulation of a communication complexity approach to a circuit lower bound
problem. Before the counterintuitive upper bounds proved in [PR,PRS,BKL], it
appeared natural to conjecture that the k-party SM complexity of GAF should
be Ω(n) when k is a constant. In fact, proving an ω(n/ log log n) lower bound on
the total amount of communication in a 3-party SM protocol for GAF would have
proved superlinear size lower bounds on log-depth Boolean circuits computing an
(n-output) explicit function. This communication complexity approach toward
superlinear lower bounds for log-depth circuits is due to Nisan and Wigderson
[NW] and is based on a graph-theoretic reduction due to Valiant [Va]. However,
results from [PR,PRS,BKL] provide o(n/ log log n) upper bounds on the total
communication of 3-party SM protocols for GAFZt2 ,3 and GAFZn ,3 and hence
ruled out the possibility of using lower bounds on total SM complexity to prove
the circuit lower bound mentioned above. On the other hand, these and similar functions are expected to require superlinear size log-depth circuits. This
situation motivated a more careful analysis of Valiant’s reduction and a refined
formulation of the original communication complexity approach. In the refined
approach, proofs of nonexistence of 3-party SM protocols are sought when there
are certain constraints on the number of bits sent by individual players as opposed to lower bounds on the total amount of communication. This new approach
is described by Kushilevitz and Nisan in their book [KN, Section 11.3].
Secondly, Pudlák, Rödl, and Sgall [PRS] use their upper bounds on restricted
semilinear protocols to disprove a conjecture of Razborov’s [Ra] concerning the
contact rank of tensors.
Finally, the combinatorial and algebraic ideas used in designing stronger
upper bounds on SM complexity may find applications in other contexts. For
example, Ambainis [Am1] devised a technique to recursively compose SM protocols and used this to improve upper bounds from [PRS]. Essentially similar
techniques enabled him in [Am2] to improve upper bounds from [CGKS] on
the communication complexity of k-server Private Information Retrieval (PIR)
schemes.
1.1 Definitions and Preliminaries
Since we consider the SM complexity of GAF_{Z_2^t,k} only, we omit the subscripts for simplicity of notation. We describe the 3-player protocol in detail. The definitions and results extend naturally to the k-party case for general k.
We recall below the definition of GAF from the Introduction (in the special case when G = Z_2^ℓ, and we write A for x_0, the input on the forehead of Player 0).
Definition 1. Assume n = 2^ℓ for a positive integer ℓ. Then the function GAF(A, x_1, . . . , x_{k−1}), where A ∈ {0, 1}^n and x_i ∈ {0, 1}^ℓ, is defined by

GAF(A, x_1, . . . , x_{k−1}) := A(x_1 ⊕ · · · ⊕ x_{k−1}),

where A is viewed as an ℓ-input Boolean function A : {0, 1}^ℓ −→ {0, 1}.
Note that Player 0 knows only (k − 1) log n bits of information. This information can be sent to the referee if each of players i, for 1 ≤ i ≤ k − 2, sends x_{i+1}, and player k − 1 sends x_1 to the referee. (This adds a log n term to the lengths of messages sent by these players and will be insignificant.) Thus, we can assume that Player 0 remains silent for the entire protocol and that the referee knows all the inputs x_1, ..., x_{k−1}. The goal of players 1 through k − 1 is to send enough information about A to the referee to enable him to compute A(x_1 ⊕ ··· ⊕ x_{k−1}).
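As a concrete illustration of the function being computed (our own sketch, not part of the paper), GAF over Z_2^ℓ simply indexes the bit-table A by the XOR of the players' inputs:

```python
from functools import reduce

def gaf(A, xs):
    # GAF(A, x_1, ..., x_{k-1}) = A(x_1 XOR ... XOR x_{k-1});
    # A is a table of n = 2**l bits, each x_i an l-bit integer.
    return A[reduce(lambda a, b: a ^ b, xs, 0)]

# l = 3, n = 8; take A to be the parity-of-popcount function.
A = [bin(i).count("1") % 2 for i in range(8)]
assert gaf(A, [0b101, 0b011]) == A[0b110]  # 3-player case: two inputs
assert gaf(A, [1, 2, 4]) == A[7]           # 4-player case: three inputs
```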
We will use the following notation:

Λ(ℓ, b) := Σ_{j=0}^{b} (ℓ choose j).
The following estimates on Λ are well known (see, for instance, [vL, Theorem 1.4.5]):

Fact 1. For 0 ≤ α ≤ 1/2, ε > 0, and sufficiently large ℓ,

2^{ℓ(H(α)−ε)} ≤ Λ(ℓ, αℓ) ≤ 2^{ℓH(α)}.
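A quick numerical sanity check of the upper bound in Fact 1 (our own illustration; Λ is summed directly from binomial coefficients):

```python
from math import comb, log2

def Lam(l, b):
    # Lambda(l, b) = sum_{j=0}^{b} C(l, j)
    return sum(comb(l, j) for j in range(int(b) + 1))

def H(x):
    # binary entropy function
    return -x * log2(x) - (1 - x) * log2(1 - x) if 0 < x < 1 else 0.0

l, alpha = 64, 0.25
assert Lam(l, alpha * l) <= 2 ** (l * H(alpha))  # upper bound of Fact 1
```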
The following easily proved estimates on the binary entropy function H(x) = −x log x − (1 − x) log(1 − x) will also be useful:

Fact 2. (i) For |δ| ≤ 1/2,

1 − (π²/(3 ln 2)) δ² ≤ H(1/2 − δ) ≤ 1 − (2/ln 2) δ².

(ii) For k ≥ 3, H(1/k) ≤ log(ek)/k.
Our results extend algebraic techniques from [BKL]. The main observation
in the upper bounds in that paper is that if the function A can be represented
as a low-degree polynomial over GF(2), significant, almost quadratic, savings
in communication are possible compared to the trivial protocol. We use the
following lemma proved in [BKL]:
Lemma 1 (BKL). Let f be an ℓ-variate multilinear polynomial of degree at
most d over Z2 . Then GAF(f, x, y) has an SM-protocol in which each player
sends at most Λ(ℓ, ⌊d/2⌋) bits.
A. Ambainis, S.V. Lokam

2 Simple Protocol
For z ∈ {0,1}^ℓ, let |z| denote the number of 1's in z.

Lemma 2. Let A : {0,1}^ℓ −→ {0,1}. Then for each i, 0 ≤ i ≤ ℓ, there is a multilinear polynomial f_i of degree at most ℓ/2 such that f_i(z) = A(z) for every z with |z| = i.
Proof: For every z ∈ {0,1}^ℓ, define the polynomial

δ_z(x) := ∏_{i : z_i = 1} x_i,          if |z| ≤ ℓ/2,
δ_z(x) := ∏_{i : z_i = 0} (1 − x_i),    if |z| > ℓ/2.
Observe that if |x| = |z|, then δ_z(x) = 1 if x = z and δ_z(x) = 0 if x ≠ z.
Now, define f_i by

f_i(x) := Σ_{|z|=i} A(z) δ_z(x).
Clearly, f_i is of degree at most ℓ/2, since each δ_z(x) is of degree at most ℓ/2. Furthermore, when |x| = i, all terms except the term A(x)δ_x(x) vanish in the sum defining f_i, implying f_i(x) = A(x). (Note that when |x| ≠ i, f_i(x) need not be equal to A(x).)
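The construction in the proof can be verified exhaustively for small ℓ; a sketch of ours (hypothetical names, arithmetic over GF(2)):

```python
import random
from itertools import product

l = 6
A = {z: random.randint(0, 1) for z in product((0, 1), repeat=l)}

def delta(z, x):
    # delta_z(x): product of x_i over positions with z_i = 1 when |z| <= l/2,
    # else product of (1 - x_i) over positions with z_i = 0; degree <= l/2 either way.
    if sum(z) <= l / 2:
        return int(all(x[i] for i in range(l) if z[i]))
    return int(all(not x[i] for i in range(l) if not z[i]))

def f(i, x):
    # f_i(x) = sum_{|z| = i} A(z) * delta_z(x)  (mod 2)
    return sum(A[z] * delta(z, x) for z in A if sum(z) == i) % 2

# Lemma 2: f_i agrees with A on every input of Hamming weight i.
assert all(f(sum(x), x) == A[x] for x in A)
```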
Theorem 1. GAF(A, x, y) can be computed by an SM-protocol in which each player sends at most ℓΛ(ℓ, ℓ/4) = O(n^{0.82}) bits.
Proof: Players 1 and 2 construct the functions f_i for 0 ≤ i ≤ ℓ corresponding to A as given by Lemma 2. They execute the protocol given by Lemma 1 for each f_i to enable the Referee to evaluate f_i(x + y). The Referee, knowing x and y, can determine |x + y| and use the information sent by Players 1 and 2 for f_{|x+y|} to evaluate f_{|x+y|}(x + y). By Lemma 2, f_{|x+y|}(x + y) = A(x + y) = GAF(A, x, y), hence the protocol is correct. By Lemma 1, each of the players sends at most Λ(ℓ, d_i/2) bits for each 0 ≤ i ≤ ℓ, where d_i = deg f_i = min{i, ℓ − i} ≤ ℓ/2. The claimed bound follows using the estimates from Fact 1.
3 Generalization
We now present a protocol based on the notion of covering codes. The protocol
in the previous section will follow as a special case.
Definition 2. An (m, r)-covering code of length ℓ is a set of words {c_1, ..., c_m} in {0,1}^ℓ such that for any x ∈ {0,1}^ℓ, there is a codeword c_i such that d(x, c_i) ≤ r, where d(x, y) denotes the Hamming distance between x and y.
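The covering property is easy to check by brute force for small ℓ; the sketch below (our own) also confirms the trivial {all-0, all-1} code of radius ℓ/2 that appears later in Remark 1:

```python
def covers(code, l, r):
    # every word of {0,1}^l must lie within Hamming distance r of a codeword
    return all(
        any(bin(x ^ c).count("1") <= r for c in code)
        for x in range(2 ** l)
    )

l = 8
assert covers([0, 2 ** l - 1], l, l // 2)   # the (2, l/2)-covering code
assert not covers([0], l, l // 2 - 1)       # a single word with radius < l/2 fails
```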
Theorem 2. If there is an (m, r)-covering code of length ℓ, then there is an
SM-protocol for GAF(A, x, y) in which each player sends at most mrΛ(ℓ, r/2)
bits.
Proof: Given an (m, r)-covering code {c_1, ..., c_m}, let us denote by H_{ij} the Hamming sphere of radius j around c_i:

H_{ij} := {x ∈ {0,1}^ℓ : d(x, c_i) = j}.
For A : {0,1}^ℓ −→ {0,1}, it is easy to construct a multilinear polynomial f_{ij} that agrees with A on all inputs from H_{ij}. More specifically, let us write, for each 1 ≤ i ≤ m and S ⊆ [ℓ],

δ_{iS}(x) := ∏_{k∈S : c_{ik}=0} x_k · ∏_{k∈S : c_{ik}=1} (1 − x_k),

where c_{ik} denotes the k'th coordinate of c_i. Note that δ_{iS}(x) = 1 iff x differs from c_i in the coordinates k ∈ S.
Now, define the polynomial f_{ij} by

f_{ij}(x) := Σ_{|S|=j} A(c_i + s) δ_{iS}(x),

where s ∈ {0,1}^ℓ denotes the characteristic vector of S ⊆ [ℓ]. It is easy to verify that f_{ij}(x) = A(x) for all x ∈ H_{ij}. Note also that deg f_{ij} ≤ r.
The players use the protocol given by Lemma 1 on f_{ij} for 1 ≤ i ≤ m and 1 ≤ j ≤ r. The Referee can determine i and j such that (x + y) ∈ H_{ij} and evaluate f_{ij}(x + y) = A(x + y) from the information sent by the players.
Remark 1. 1. Trivially, the all-0 vector and the all-1 vector form a (2, ℓ/2)-covering code of length ℓ. Thus Theorem 1 is a special case of Theorem 2.
2. Assume ℓ = 2^v, v a positive integer. The first-order Reed-Muller code R(1, v) forms a (2ℓ, (ℓ − √ℓ)/2)-covering code of length ℓ [vL, Exercise 4.7.10]. Thus we have the following corollary.

Corollary 1. There is an SM-protocol of complexity ℓ(ℓ − √ℓ)Λ(ℓ, (ℓ − √ℓ)/4).
4 Limitations
Theorem 3. Any SM-upper bound obtained via Theorem 2 is at least n^{0.728}.
Proof: For any (m, r)-covering code of length ℓ, we must trivially have

mΛ(ℓ, r) ≥ 2^ℓ.    (1)

Thus the SM-upper bound from Theorem 2 is at least

r 2^ℓ Λ(ℓ, r/2)/Λ(ℓ, r).    (2)

Let r = αℓ. Using Fact 1 in (2), the upper bound is at least

ℓα 2^{ℓ(1+H(α/2)−H(α))}.

This is minimized at α = 1 − 1/√2, giving an upper bound of no less than

0.29289...ℓ 2^{0.7284...ℓ} ≥ n^{0.728}.
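The minimization at the end of the proof can be checked numerically (a sketch of ours, using the entropy-based exponent from Fact 1):

```python
from math import log2, sqrt

def H(x):
    return -x * log2(x) - (1 - x) * log2(1 - x)

def exponent(a):
    # exponent of r * 2^l * Lam(l, r/2) / Lam(l, r) at r = a*l, via Fact 1
    return 1 + H(a / 2) - H(a)

alpha = 1 - 1 / sqrt(2)                                 # ~0.29289
grid_min = min(exponent(i / 10000) for i in range(1, 10000))
assert exponent(alpha) <= grid_min + 1e-6               # alpha is the (numerical) minimizer
assert abs(exponent(alpha) - 0.7284) < 1e-3             # exponent ~0.7284
```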
5 Upper Bound for 3 Players
We construct a protocol which gives an upper bound matching the lower bound
of Theorem 3. First, we construct a covering code.
Lemma 3. For any ℓ, r, there is an (m, r)-covering code with m = O(ℓ 2^ℓ/Λ(ℓ, r)).
Proof: We construct the code greedily. The first codeword c1 can be arbitrary.
Each next codeword ci is chosen so that it maximizes the number of words x
such that d(x, c1 ) > r, d(x, c2 ) > r, . . ., d(x, ci−1 ) > r, but d(x, ci ) ≤ r. We can
find ci by exhaustive search over all words.
Let S_i be the set of words x such that d(x, c_1) > r, ..., d(x, c_i) > r, and let w_i be the cardinality of S_i.

Claim. w_{i+1} ≤ (1 − Λ(ℓ, r)/2^ℓ) w_i.

Proof of Claim: For each x ∈ S_i, there are Λ(ℓ, r) pairs (x, y) such that d(x, y) ≤ r. The total number of such pairs is exactly w_i Λ(ℓ, r). By the pigeonhole principle, there is a y such that there are at least w_i Λ(ℓ, r)/2^ℓ words x ∈ S_i with d(x, y) ≤ r. Recall that c_{i+1} is chosen so that it maximizes the number of such x. Hence, there are at least w_i Λ(ℓ, r)/2^ℓ newly covered words x ∈ S_i such that d(x, c_{i+1}) ≤ r. This implies

w_{i+1} ≤ (1 − Λ(ℓ, r)/2^ℓ) w_i.

This proves the claim.
We have w_0 = 2^ℓ, and

w_i ≤ (1 − Λ(ℓ, r)/2^ℓ)^i w_0 = (1 − Λ(ℓ, r)/2^ℓ)^i 2^ℓ.

From 1 − x ≤ e^{−x} it follows that

w_i ≤ e^{−Λ(ℓ, r) i / 2^ℓ} 2^ℓ.

Let m = ln 2 · (2^ℓ/Λ(ℓ, r)) ℓ + 1. Then w_m < 1. Hence, w_m = 0, i.e., {c_1, ..., c_m} is an (m, r)-covering code.
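The greedy construction in the proof can be run directly for small parameters (our sketch; exponential time, for illustration only):

```python
def greedy_covering_code(l, r):
    # repeatedly add the word whose radius-r Hamming ball covers the most
    # still-uncovered words (the choice made for c_{i+1} in the proof)
    uncovered = set(range(2 ** l))
    code = []
    while uncovered:
        best = max(
            range(2 ** l),
            key=lambda c: sum(bin(x ^ c).count("1") <= r for x in uncovered),
        )
        code.append(best)
        uncovered -= {x for x in uncovered if bin(x ^ best).count("1") <= r}
    return code

code = greedy_covering_code(6, 2)
# the result is a covering code: every word of {0,1}^6 is within distance 2
assert all(any(bin(x ^ c).count("1") <= 2 for c in code) for x in range(64))
```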
Theorem 4. There is an SM-protocol for GAF(A, x, y) with communication complexity O(n^{0.728...+o(1)}).

Proof: We apply Theorem 2 to the code of Lemma 3 and get a protocol with communication complexity

mrΛ(ℓ, r/2) = O(ℓ (2^ℓ/Λ(ℓ, r)) rΛ(ℓ, r/2)) = O(ℓ r 2^ℓ Λ(ℓ, r/2)/Λ(ℓ, r)).

Let r = αℓ, where α = 1 − 1/√2 is the constant from Theorem 3. Then, using the estimates from Fact 1, the communication complexity is at most

O(ℓ² 2^ℓ Λ(ℓ, αℓ/2)/Λ(ℓ, αℓ)) = O(n^{0.728...+o(1)}).
6 Upper Bounds for k Players
In this section, we generalize the idea from Section 2 to the k-player case. It
appears that for large values of k, the simpler ideas from Section 2 already give
nearly as efficient a protocol as can be obtained by generalizing Theorem 4. In
other words, the covering code (for large k) from Theorem 4 can be replaced
by the trivial (2, ℓ/2)-covering code given by the all-0 and all-1 vectors (cf.
Remark 1).
The starting point again is a lemma from [BKL] generalizing Lemma 1:
Lemma 4 (BKL). Let f be an ℓ-variate multilinear polynomial of degree at
most d over Z2 . Then GAF(f, x1 , . . . , xk−1 ) has a k-player SM protocol in which
each player sends at most Λ(ℓ, ⌊d/(k − 1)⌋) bits.
Theorem 5. GAF(A, x_1, ..., x_{k−1}) can be computed by a k-player SM protocol in which each player sends at most ℓΛ(ℓ, ℓ/(2k − 2)) = O(n^{H(1/(2k−2))}) bits.
Proof: Similar to Theorem 1. Players 1 through k − 1 construct the polynomials f_i, 0 ≤ i ≤ ℓ, corresponding to A as given by Lemma 2. Each f_i is of degree at most ℓ/2. The players then follow the protocol from Lemma 4 for each of these f_i.
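Numerically, the exponent H(1/(2k − 2)) of Theorem 5 is strictly below the exponent H(1/k) of the earlier [BKL] protocol for every k ≥ 3, since H is increasing on (0, 1/2); a quick check of ours:

```python
from math import log2

def H(x):
    # binary entropy
    return -x * log2(x) - (1 - x) * log2(1 - x)

for k in (3, 4, 8, 16):
    # e.g. k = 3: H(1/4) ~ 0.811 < H(1/3) ~ 0.918
    assert H(1 / (2 * k - 2)) < H(1 / k)
```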
7
Conclusions and Open Problems
We presented improved upper bounds on the SM complexity of GAF_{Z_2^t,k}. For k = 3, we prove an upper bound of O(n^{0.73}), improving the previous bound of O(n^{0.92}) from [BKL]. For general k, we show an upper bound of O(n^{H(1/(2k−2))}), improving the bound of O(n^{H(1/k)}) from [BKL]. The first open problem is to improve these bounds to close the gap between upper and lower bounds on the SM complexity of GAF_{Z_2^t,k}. Recall that the lower bound is Ω(n^{1/(k−1)}/(k − 1)) [BKL].
The second open problem concerns the SM complexity of GAF_{Z_n,k}, i.e., the generalized addressing function for the cyclic group. Note that the best known upper bounds for the cyclic group [Am1] are significantly weaker than the upper bounds we present here for the vector space Z_2^t. The lower bounds for both GAF_{Z_2^t,k} and GAF_{Z_n,k} are Ω(n^{1/(k−1)}/(k − 1)) and follow from a general result of [BKL] on the SM complexity of GAF_{G,k} for arbitrary finite groups G. In contrast, techniques for upper bounds appear to depend on the specific group G.
References
[Am1] A. Ambainis. Upper Bounds on Multiparty Communication Complexity of Shifts. Proc. of the 13th Annual Symposium on Theoretical Aspects of Computer Science, LNCS, Vol. 1046, pp. 631-642, 1996.
[Am2] A. Ambainis. Upper Bound on the Communication Complexity of Private Information Retrieval. Proc. of ICALP'97, LNCS, Vol. 1256, pp. 401-407, 1997.
[B] B. Bollobás. Random Graphs. Academic Press, 1985, pp. 307-323.
[BE] L. Babai, P. Erdős. Representation of Group Elements as Short Products. In J. Turgeon, A. Rosa, G. Sabidussi, editors, Theory and Practice of Combinatorics, Ann. Discr. Math. 12, North-Holland, 1982, pp. 21-26.
[BHK] L. Babai, T. Hayes, P. Kimmel. Communication with Help. ACM STOC, 1998.
[BGKL] L. Babai, A. Gál, P. Kimmel, S. V. Lokam. Communication Complexity of Simultaneous Messages. Manuscript. A significantly expanded version of [BKL] below.
[BKL] L. Babai, P. Kimmel, S. V. Lokam. Simultaneous Messages vs. Communication. Proc. of the 12th Symposium on Theoretical Aspects of Computer Science, 1995.
[BNS] L. Babai, N. Nisan, M. Szegedy. Multiparty Protocols, Pseudorandom Generators for Logspace and Time-Space Trade-offs. Journal of Computer and System Sciences, 45, 1992, pp. 204-232.
[BT] R. Beigel, J. Tarui. On ACC. Proc. of the 32nd IEEE FOCS, 1991, pp. 783-792.
[CFL] A.K. Chandra, M.L. Furst, R.J. Lipton. Multiparty Protocols. Proc. of the 15th ACM STOC, 1983, pp. 94-99.
[CGKS] B. Chor, O. Goldreich, E. Kushilevitz, M. Sudan. Private Information Retrieval. Proc. of the 36th IEEE FOCS, 1995, pp. 41-50.
[G] V. Grolmusz. The BNS Lower Bound for Multi-Party Protocols is Nearly Optimal. Information and Computation, 112, No. 1, 1994, pp. 51-54.
[HG] J. Håstad, M. Goldmann. On the Power of Small-Depth Threshold Circuits. Computational Complexity, 1, 1991, pp. 113-129.
[HMPST] A. Hajnal, W. Maass, P. Pudlák, M. Szegedy, G. Turán. Threshold Circuits of Bounded Depth. Proc. of the 28th IEEE FOCS, 1987, pp. 99-110.
[KN] E. Kushilevitz, N. Nisan. Communication Complexity. Cambridge University Press, 1997.
[MNT] Y. Mansour, N. Nisan, P. Tiwari. The Computational Complexity of Universal Hashing. Theoretical Computer Science, 107, 1993, pp. 121-133.
[NW] N. Nisan, A. Wigderson. Rounds in Communication Complexity Revisited. SIAM Journal on Computing, 22, No. 1, 1993, pp. 211-219.
[PR] P. Pudlák, V. Rödl. Modified Ranks of Tensors and the Size of Circuits. Proc. of the 25th ACM STOC, 1993, pp. 523-531.
[PRS] P. Pudlák, V. Rödl, J. Sgall. Boolean Circuits, Tensor Ranks and Communication Complexity. SIAM Journal on Computing, 26, No. 3, 1997, pp. 605-633.
[Ra] A. A. Razborov. On Rigid Matrices. Preprint of the Math. Inst. of the Acad. of Sciences of the USSR (in Russian), 1989.
[RW] A. A. Razborov, A. Wigderson. n^{Ω(log n)} Lower Bounds on the Size of Depth-3 Circuits with AND Gates at the Bottom. Information Processing Letters, 45, 1993, pp. 303-307.
[Va] L. Valiant. Graph-Theoretic Arguments in Low-Level Complexity. Proc. of the 6th Math. Foundations of Comp. Sci., LNCS, Vol. 53, Springer-Verlag, 1977, pp. 162-176.
[vL] J. H. van Lint. Introduction to Coding Theory. Springer-Verlag, 1982.
[Y] A. C-C. Yao. On ACC and Threshold Circuits. Proc. of the 31st IEEE FOCS, 1990, pp. 619-627.
Multi-parameter Minimum Spanning Trees
David Fernández-Baca⋆
Department of Computer Science, Iowa State University, Ames, IA 50011, USA
fernande@cs.iastate.edu
Abstract. A framework for solving certain multidimensional parametric
search problems in randomized linear time is presented, along with its
application to optimization on matroids, including parametric minimum
spanning trees on planar and dense graphs.
1 Introduction
In the multi-parameter minimum spanning tree problem, we are given an edge-weighted graph G = (V, E), where the weight of each edge e is an affine function of a d-dimensional parameter vector λ = (λ_1, λ_2, ..., λ_d), i.e., w(e) = a_0(e) + Σ_{i=1}^{d} λ_i a_i(e). The topology and weight of the minimum spanning tree are therefore functions of λ. Let z(λ) be the weight of the minimum spanning tree at λ. The goal is to find

z* = max_λ z(λ).    (1)
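For a fixed λ, z(λ) is just an ordinary MST weight; a minimal sketch of ours for evaluating it (Kruskal with union-find, edges given as hypothetical (u, v, a0, (a1, ..., ad)) tuples):

```python
def mst_weight(n, edges, lam):
    # z(lambda): MST weight when w(e) = a0(e) + sum_i lam[i] * a_i(e)
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def w(e):
        return e[2] + sum(li * ai for li, ai in zip(lam, e[3]))

    total = 0.0
    for e in sorted(edges, key=w):          # Kruskal: cheapest edges first
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:
            parent[ru] = rv
            total += w(e)
    return total

# Triangle with 2-parameter affine weights; z is piecewise linear in lambda.
edges = [(0, 1, 1.0, (1.0, 0.0)), (1, 2, 2.0, (-1.0, 0.5)), (0, 2, 3.0, (0.0, -2.0))]
assert mst_weight(3, edges, (0.0, 0.0)) == 3.0   # picks the weight-1 and weight-2 edges
```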
Problem (1) arises in the context of Lagrangian relaxation. For example, Camerini et al. [5] describe the following problem. Suppose each edge e of G has an installation cost w(e) and d possible maintenance costs m_i(e), one for each of d possible future scenarios, where scenario i has probability p_i. Edge e also has a reliability q_i(e) under each scenario i. Let T denote the set of all spanning trees of G. Minimizing the total installation and maintenance costs while maintaining an acceptable level of reliability Q under all scenarios is expressible as

min_{T∈T} { Σ_{e∈T} (w(e) + Σ_{i=1}^{d} p_i m_i(e)) : ∏_{e∈T} q_i(e) ≥ Q, i = 1, ..., d }.    (2)
A good lower bound on the solution to (2) can be obtained by solving the Lagrangian dual of (2), which has the form (1) with a_0(e) = w(e) + Σ_{i=1}^{d} p_i m_i(e) and a_i(e) = −log q_i(e) + (log Q)/(|V| − 1).⋆⋆
In this paper, we give linear-time randomized algorithms for the fixed-dimensional parametric minimum spanning tree problem for planar and dense graphs (i.e., those where m = Θ(n²)). Our algorithms are based on Megiddo's method

⋆ Supported in part by the National Science Foundation under grant CCR-9520946.
⋆⋆ We also need λ ≥ 0; this can easily be handled by our scheme.

G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 217–226, 2000.
© Springer-Verlag Berlin Heidelberg 2000
of parametric search [24], a technique that turns solutions to fixed-parameter problems (e.g., non-parametric minimum spanning trees) into algorithms for parametric problems. This conversion is often done at the price of incurring a polylogarithmic slowdown in the run time relative to the fixed-parameter algorithm. Our approach goes beyond the standard application of Megiddo's method to eliminate this slowdown, by applying ideas from the prune-and-search approach to fixed-dimensional linear programming. Indeed, the mixed graph-theoretic/geometric nature of our problems requires us to use pruning at two levels: geometrically through cuttings and graph-theoretically through sparsification.
History and New Results. The parametric minimum spanning tree problem is
a special case of the parametric matroid optimization problem [14]. The Lagrangian relaxations of several non-parametric matroid optimization problems
with side constraints — called matroidal knapsack problems by Camerini et al. [5]
— are expressible as problems of the form (1). More generally, (1) is among the
problems that can be solved by Megiddo’s method of parametric search [23,24],
originally developed for one-dimensional search, but readily extendible to any
fixed dimension [26,10,4].
The power of parametric search has been widely recognized (see, e.g., [2]). Part of the appeal of the method is its formulation as an easy-to-use "black box." The key requirement is that the underlying fixed-parameter problem — i.e., the problem of evaluating z(λ) for fixed λ — have an algorithm where all numbers manipulated are affine functions of λ. If this algorithm runs in time O(T), then the parametric problem can be solved in time O(T^{2d}), and if there is a W-processor, D-step parallel algorithm for the fixed-parameter problem, the run time can be improved to O(T(D log W)^d). In some cases, this can be further improved to O(T(D + log W)^d).⋆⋆ This applies to the parametric minimum spanning tree problem, for which one can obtain D = O(log n) and W = O(m), where n = |V| and m = |E|. The (fixed-parameter) minimum spanning tree problem can be solved in randomized O(m) expected time [21] and O(mα(m, n) log α(m, n)) deterministic time [7]. In any event, by its nature, parametric search introduces a log^{O(d)} n slowdown with respect to the run time of the fixed-parameter algorithm. Thus, the algorithms it produces are unlikely to be optimal (for an exception to this, see [11]).
Frederickson [20] was among the first to consider the issue of optimality in parametric search, in the sense that no slowdown is introduced, and showed how certain location problems on trees could be solved optimally. Later, certain one-dimensional parametric search problems on graphs of bounded tree-width [18] and the one-dimensional parametric minimum spanning tree problem on planar graphs and on dense graphs [17,19] were shown to be optimally solvable. A key technique in these algorithms is decimation: the organization of the search into phases to achieve geometric reduction of problem size. This is closely connected with the prune-and-search approach to fixed-dimensional linear programming
⋆⋆ Note that the O-notation in all these time bounds hides "constants" that depend on d. The algorithms to be presented here exhibit similar constants.
[25,12,9]. While the geometric nature of linear programming puts fewer obstacles to problem size reduction, a similar effect can be achieved for graph problems through sparsification [15,16], a method that has been applied to one-dimensional parametric minimum spanning trees before [1,19].
Here we show that multi-parameter minimum spanning trees can be found in randomized linear expected time on planar and on dense graphs. Our procedures use prune-and-search geometrically, through cuttings [8], to narrow the search region for the optimal solution, as well as graph-theoretically, through sparsification. More generally, we identify decomposability conditions that allow parametric problems to be solved within the same time bound as their underlying fixed-parameter problems.
2 Multidimensional Search
Let h be a hyperplane in R^d, let Λ be a convex subset of R^d, and let sign_Λ(h) be +1, 0, or −1, depending, respectively, on whether h(λ) < 0 for all λ ∈ Λ, h(λ) = 0 for some λ ∈ Λ, or h(λ) > 0 for all λ ∈ Λ. Hyperplane h is said to be resolved if sign_Λ(h) is known. An oracle is a procedure that can resolve any given hyperplane. The following result is known (see also [25]):
Theorem 1 (Agarwal, Sharir, and Toledo [3]). Given a collection H of n hyperplanes in R^d and an oracle B for Λ, it is possible to find either a hyperplane that intersects Λ or a simplex △ that fully contains Λ and intersects at most n/2 hyperplanes in H by making O(d³ log d) oracle calls. The time spent in addition to the oracle calls is n · O(d)^{10d} log^{2d} d.
Corollary 1. Given a set H of n hyperplanes and a simplex △ containing Λ, a simplex △′ intersecting at most n′ elements of H and such that Λ ⊆ △′ ⊆ △ can be found with O(d³ log d · log(n/n′)) calls to an oracle.
If Λ denotes the set of maximizers of function z, we have the following [4]:
Lemma 1. Locating the position of the maximizers of z relative to a given hyperplane h reduces to carrying out three (d − 1)-dimensional maximization problems
of the same form as (1).
3 Decomposable Problems
Let m be the problem size. A decomposable optimization problem is one whose fixed-parameter version can be solved in two stages:

– An O(m)-time decomposition stage, independent of λ, which produces a recursive decomposition of the problem represented by a bounded-degree decomposition tree D. The nodes of D represent subproblems and its root is the original problem. For each node v in D, m_v is the size of the subproblem associated with v. The children of v are the subproblems into which v is decomposed.
– An O(m)-time optimization stage, where the decomposition tree is traversed level by level from the bottom up and each node is replaced by a sparse substitute. z(λ) can be computed in O(1) time from the root's sparse substitute.

Note that after the decomposition stage is done we can evaluate z(λ) for multiple values of λ by executing only the optimization stage.
We will make some assumptions about the decomposition stage:
(D1) D is organized into levels, with leaves being at level 0, their parents at level 1, grandparents at level 2, and so forth. L_i will denote the set of nodes at level i. The index of the last level is k = k(n).
(D2) There exists a constant α > 1, independent of m, such that |L_i| = O(m/α^i) and m_u = O(α^i) for each u ∈ L_i.
We also make assumptions about the optimization stage:
(O1) For each v ∈ L_i, the solution to the subproblem associated with v can be represented by a sparse substitute of size O(β(i)), such that β(i)/α^i < 1/γ^i for some γ > 1. For i = 0 the sparse substitute can be computed in O(1) time by exhaustive enumeration, while for i > 0 the substitute for v depends only on the sparse substitutes for v's children.
(O2) The algorithm for computing the sparse substitute for any node v takes time linear in the total size of the sparse substitutes of v's children. This algorithm is piecewise affine; i.e., every number that it manipulates is expressible as an affine combination of the input numbers.
Note that (i) by (O1), the total size of all sparse substitutes for level i is O(m/γ^i), and (ii) assumption (O2) holds for many combinatorial algorithms; e.g., most minimum spanning tree algorithms are piecewise affine.
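The role of condition (O1) is to make the per-level work a geometric series that sums to O(m); a tiny numeric check of ours, with the hypothetical choice β(i) = (α/γ)^i:

```python
# |L_i| = m / alpha^i nodes per level, each with a substitute of size beta(i);
# beta(i) / alpha^i <= 1 / gamma^i makes the level sizes a geometric series.
m, alpha, gamma, k = 1 << 20, 2.0, 1.5, 20
beta = lambda i: (alpha / gamma) ** i
total = sum((m / alpha ** i) * beta(i) for i in range(k + 1))
assert total <= m * gamma / (gamma - 1)   # geometric series bound: here 3m
```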
4 The Search Strategy
We now describe our approach in general terms; its applications will be presented in Section 5. We use the following notation. Given a collection of hyperplanes H in R^d, A(H) denotes the arrangement of H, i.e., the decomposition of R^d into faces of dimension 0 through d induced by H [13]. Given Γ ⊆ R^d, A_Γ(H) denotes the restriction of A(H) to Γ.
Overview. The search algorithm simulates the bottom-up, level-by-level execution of the fixed-parameter algorithm for all λ within a simplex △ ⊆ R^d known to contain Λ, the set of maximizers of z. The outcome of this simulation for any node v in the decomposition tree is captured by a parametric sparse substitute for v, which consists of (i) a decomposition of △ into disjoint regions such that the sparse substitute for each region is unique, and (ii) the sparse substitute for each region of the decomposition. Given a parametric sparse substitute for v, obtaining the sparse substitute for v for any fixed λ ∈ △ becomes a point location problem in the decomposition.
After simulating the fixed-parameter algorithm, we will have a description of
all possible outcomes of the computation within △, which can be searched to
locate some λ∗ ∈ Λ. For efficiency, the simulation of each level is accompanied
by the shrinkage of △ using Theorem 1. This is the point where we run the risk
of incurring the polylogarithmic slowdown mentioned in the Introduction. We
get around this with three ideas. First, we shrink △ only to the point where the
average number of regions in the sparse substitute within △ for a node at level
i is constant-bounded (see also [17]). Second, the oracle used at level i relies on
bootstrapping: By Lemma 1, this oracle can be implemented by solving three
optimization problems in dimension d − 1. If parametric sparse substitutes for
all nodes at level i − 1 are available, the solution to any such problem will not
require reprocessing lower levels.
The final issue is the relationship between node v’s parametric sparse substitute P (v) and the substitutes for v’s children. An initial approximation to the
subdivision for P (v) is obtained by overlapping the subdivisions for the substitutes of the children of v. Within every face F of the resulting subdivision of
△ there is a unique sparse substitute for each of v’s children. However, F may
have to be further subdivided because there may still be multiple distinct sparse
substitutes for v within F . Instead of producing them all at once, which would
be too expensive, we proceed in three stages. First, we get rough subdivisions of
the F ’s through random cuttings [8]. Each face of these subdivisions will contain
only a relatively small number of regions of P (v). In the second stage, we shrink
△ so that the total number of regions in the rough subdivisions over all nodes
in level i is small. Finally, we generate the actual subdivisions for all v.
Intersection Hyperplanes. Consider a comparison between two values a(λ) and b(λ) that is carried out when computing the sparse certificate for v or one of its descendants for some λ ∈ R^d. By assumption (O2), a(λ) and b(λ) are affine functions of λ; their intersection hyperplane is h_ab = {λ : a(λ) = b(λ)}. The outcome of the comparison for a given λ-value depends only on which side of h_ab contains the value. Let I(v) consist of all such intersection hyperplanes for v. Then, there is a unique sparse substitute for each face of A_△(I(v)), since all comparisons are resolved in the same way within it. Thus, our parametric sparse substitutes consist of A_△(I(v)), together with the substitutes for its faces.
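Representing affine functions as (constant, coefficient-vector) pairs, the intersection hyperplane h_ab is just their difference (a small sketch of ours, names hypothetical):

```python
def intersection_hyperplane(a, b):
    # a(lam) = a0 + <av, lam>, b(lam) = b0 + <bv, lam>;
    # h_ab(lam) = a(lam) - b(lam) vanishes exactly where a and b agree
    (a0, av), (b0, bv) = a, b
    return (a0 - b0, tuple(p - q for p, q in zip(av, bv)))

def evaluate(h, lam):
    c0, cv = h
    return c0 + sum(c * x for c, x in zip(cv, lam))

# a(lam) = 1 + 2*lam and b(lam) = 5 - 2*lam meet at lam = 1
h = intersection_hyperplane((1.0, (2.0,)), (5.0, (-2.0,)))
assert evaluate(h, (1.0,)) == 0.0   # on the hyperplane
assert evaluate(h, (0.0,)) < 0.0    # the two sides resolve the comparison
```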
For fast retrieval of sparse substitutes, we need a point location data structure for A_△(I(v)). The complexity of the arrangement and the time needed to build it are O(n^d), where n is the number of elements of I(v) that intersect the interior of △ [13]. A point location data structure with space requirement and preprocessing time O(n^d) can be built which answers point location queries in O(log n) time [6].
We will make certain assumptions about intersection hyperplanes. Let F be a face of A_△(∪{I(u) : u is a child of v}). Then, we have the following.

(H1) The number of elements of I(v) that intersect the interior of F is O(m_v²).
(H2) A random sample of size n of the elements of I(v) that intersect the interior of F can be generated in O(n) time.
Shrinking the Search Region. Let κ_△(v) denote the number of faces of A_△(I(v)) and let I_△(v) denote the elements of I(v) that intersect the interior of △. The goal of the shrinking algorithm is to reduce the search region △ so that, after simulating level r of the fixed-parameter algorithm,

Λ ⊆ △ and Σ_{v∈L_r} |I_△(v)| ≤ m/α^{r(2d+1)}.    (3)

Lemma 2. If (3) holds, then Σ_{v∈L_r} κ_△(v) ≤ 2m/α^r.
Lemma 3. Suppose we are given a simplex △ satisfying (3) for r = i − 1 and parametric sparse substitutes within △ for all v ∈ L_{i−1}. Then, with high probability, O(i d⁴ log d) oracle calls and O(m/α^{i/(4d)}) overhead suffice to find a new simplex △ satisfying (3) for r = i, together with I_△(v) for all v ∈ L_i.
Proof. Let s = α^i and for each v ∈ L_i let H_v denote ∪{I_△(u) : u a child of v}. To shrink the search region, first do the following for each v ∈ L_i and each face F ∈ A(H_v): From among the elements of I(v) that arise during the computation of a sparse certificate for some λ ∈ F, choose uniformly at random a set C_F of size s^{1−1/(4d)}. The total number of elements in all the sets C_F is (m/s) · s^{1−1/(4d)} = m/s^{1/(4d)}. By the theory of cuttings [8], with high probability any face in a certain triangulation of A(C_F), known as the canonical triangulation, intersects at most s/s^{1−1/(4d)} elements of I_△(v).

Next, apply Corollary 1 to the set H = ∪_{v∈L_i} {h : h ∈ H_v or h ∈ C_F, F a face of A(H_v)} to find a simplex △′ that intersects at most m/s^{d+1} elements of H and such that Λ ⊆ △′ ⊆ △. Set △ ← △′. Since |H| = O(m/s^{1/(4d)}), the total number of oracle calls is O(d⁴ log d log s) = O(i d⁴ log d).
Now, for each v ∈ L_i, we compute I_△(v) in two steps. First, for each v ∈ L_i construct the canonical triangulation C_v of the arrangement of G_v, which consists of all hyperplanes in H_v ∪ {h ∈ C_F : F a face of A(H_v)} intersecting △. The total time is O(m/s), since at most m/s^{d+1} v's have non-empty G_v's and for each such v the number of regions in A_v(G_v) is O(s^d). Secondly, enumerate all hyperplanes in I_△(v) that intersect △′. This takes time O(m/s^{1/4}). With high probability, at most (m/s) · s^{1/(4d)} = m/s^{1−1/(4d)} hyperplanes will be found.

Finally, we apply Corollary 1 to the set H = ∪_{v∈L_i} I_△(v) to obtain a simplex △′ intersecting at most m/s^{2d+1} of the elements of H and such that Λ ⊆ △′ ⊆ △. Set △ ← △′. The total number of oracle calls is O(d⁴ log d log s) = O(i d⁴ log d).
A Recursive Solution Scheme. We simulate the execution of the fixed-parameter algorithm level by level, starting at level 0; each step of the simulation produces parametric sparse certificates for all nodes within a given level. We use induction on the level i and the dimension d. For i = 0 and every d ≥ 0, we compute the parametric sparse substitutes for all v ∈ L_0 by exhaustive enumeration. This takes time O(c_d m), for some constant c_d.
Lemma 4. Let △ be a simplex satisfying (3) for r = i − 1, and suppose that parametric sparse substitutes within △ for all v ∈ L_{i−1} are known. Then we can, with high probability, in time O(f_d · m/γ_d^i) find a new simplex within which the parametric sparse substitute for the root of D has O(1) regions.
Proof. To process Li , i > 0, we use recursion on the dimension, d. When d = 0,
we have the fixed-parameter problem. By (O1), given sparse substitutes for all
v ∈ Li−1 , we can process Li in time O(β(i) · m/αi ) = O(f0 m/γ0i ), where f0 = 1,
γ0 = γ. Hence, Li through Lk can be processed in time O(f0 m/γ0i ).
Next, consider d ≥ 1. To process level i, we assume that level i − 1 has been
processed so that (3) holds for r = i − 1. Our goal is to successively maintain
(3) for r = i, i + 1, . . . , k. Thus, after simulating level k, |I△ (root)| = O(1).
The first step is to use Lemma 3 to reduce △ to a simplex satisfying (3)
for r = i and to obtain I△ (v) for all v ∈ Li . The oracle will use the sparse
certificates already computed for level i − 1: Let h be the hyperplane to be
resolved. If h ∩ △ = ∅, we resolve h in time O(d) by finding the side of h that
contains △. Otherwise, by Lemma 1, we must solve three (d − 1)-dimensional
problems. For each such problem, we do as follows. For every v ∈ Li−1 , find
the intersection of h with A△ (v). This defines an arrangement in the (d − 1)dimensional simplex △′ = △ ∩ h. The sparse substitute for each region of the
arrangement is unique and known; thus, we have parametric sparse substitutes
i
).
for all v ∈ Li−1 . By hypothesis, we can compute zh∗ in time O(fd−1 m/γd−1
4
i
). If
By Lemma 3, the time for all oracle calls is O(i · d log d · fd−1 · m/γd−1
we discover that h intersects Λ, we return zh∗ . After shrinking △, I△ (v) will be
known and we can build A△ (I(v)). By Lemma 2, this arrangement has O(m/αi )
regions. By assumption (O1) we can find the sparse substitute for each face F
of A△ (I(v)) as follows. First, choose an arbitrary point λ0 in the interior of F .
Next, for each child u of v, find the sparse substitute for u at λ0 . Finally, use
these sparse substitutes to compute the sparse substitute for v at λ0 ; this will
be the sparse substitute for all λ ∈ F . Thus, the total time needed to compute
the parametric sparse substitutes for v ∈ L_i is O(β(i) · (m/α^i)) = O(m/γ^i). The
oracle calls dominate the work, taking a total time of O(i · g_d · f_{d−1} · m/γ_{d−1}^i) =
O(f_d · m/γ_d^i), where f_d = g_d · f_{d−1} and γ_d satisfies 1 < γ_d < γ_{d−1}.
Theorem 2. The optimum solution to a decomposable problem can be found in
O(m) time with high probability.
Proof. Let w be the root of the decomposition tree. After simulating the execution of levels 0 through k, we are left with a simplex △ that is crossed by
O(1) hyperplanes of I(w), and O(1) invocations of Theorem 1 suffice to reduce
△ to a simplex that is crossed by no hyperplane of I(w). Within this simplex,
there is a unique optimum solution, whose cost as a function of λ is, say, c · λ.
The maximizer can now be located through linear programming with objective
function c · λ. The run time is O(a_d), for some value a_d that depends only on d.
5 Applications
We now apply Theorem 2 to several matroid optimization problems. We
will rely on some basic matroid properties. Let M = (S, I) be a matroid on the
set of elements S, where I is the set of independent subsets of S. Let B be a subset
of S and let A be a maximum-weight independent subset of B. Then, there exists
some maximum-weight independent set of S that does not contain any element
of B \ A. Furthermore, the identity of the maximum-weight independent subset
of any such B depends on the relative order, but not on the actual values, of
the weights of the elements of B. Therefore, the maximum number of distinct
comparisons to determine the relative order of elements over all possible choices
of λ is (|B| choose 2). Thus, (H1) is satisfied. Assumption (H2) is satisfied, since we can
get a random sample of size n of the intersection hyperplanes for B by picking n
pairs of elements from B uniformly at random.
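The sampling step for (H2) can be sketched as follows; this is a minimal illustration, and the function name and the representation of an intersection hyperplane as an unordered pair of elements of B are our own assumptions:

```python
import random

def sample_intersection_hyperplanes(B, n, seed=None):
    """Draw a random sample of n intersection hyperplanes for B: each
    sample is an unordered pair of distinct elements of B, picked
    uniformly at random (independently for each of the n samples)."""
    rng = random.Random(seed)
    elements = list(B)
    return [frozenset(rng.sample(elements, 2)) for _ in range(n)]
```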
Uniform and Partition Matroids. (This example is for illustration; both problems
can be solved by linear programming.) Let S be an m-element set where every
e ∈ S has a weight w(e) = a_0(e) + Σ_{i=1}^d λ_i a_i(e), let k ≤ m be a fixed positive
integer, and let z(λ) = max_{B: |B|=k} Σ_{e∈B} w(e). The problem is to find z^* =
minλ z(λ). The fixed-parameter problem is uniform matroid optimization. D is
as follows: If |S| ≤ k, D consists of a single vertex v containing all the elements of
S. Otherwise, split S into two sets of size m/2; D consists of a root v connected
to the roots of trees for these sets. Thus, D satisfies conditions (D1) and (D2).
The non-parametric sparse substitute for node v consists of the k largest
elements among the subset S_v corresponding to v. A sparse substitute for v
can be computed from its children in O(k) time. Thus, (O2) is satisfied and, if
k = O(1), so is (O1). Hence, z^* can be found in O(m) time.
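The O(k) computation of a substitute from two children can be sketched as follows, assuming (our assumption, for illustration) that each substitute is kept as a list sorted in non-increasing order:

```python
def merge_top_k(a, b, k):
    """Merge two lists sorted in non-increasing order, keeping only the
    k largest elements -- a single O(k) linear merge, as in the text."""
    out = []
    i = j = 0
    while len(out) < k and (i < len(a) or j < len(b)):
        # take from a when b is exhausted or a's head is at least as large
        if j >= len(b) or (i < len(a) and a[i] >= b[j]):
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out
```

For a node whose children carry the substitutes [9, 5, 2] and [8, 3] with k = 3, the merged substitute is [9, 8, 5].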
In partition matroids, the set S is partitioned into disjoint subsets S_1, …, S_r
and z(λ) = max_B { Σ_{e∈B} w(e) : |B ∩ S_i| ≤ 1 for i = 1, 2, …, r }. z^* =
min_λ z(λ) can be computed in O(m) time by similar techniques.
Minimum Spanning Trees in Planar Graphs. D will represent a recursive
separator-based decomposition of the input graph G. Every node v in D corresponds to a
subgraph G_v of G with n_v vertices; the root of D represents all of G. The
children u_1, …, u_r of v represent a decomposition of G_v into edge-disjoint subgraphs
G_{u_i} such that n_{u_i} ≤ n_v/α for some α > 1, which share a set X_v of boundary
vertices with |X_v| = O(√n_{u_i}). D satisfies conditions (D1) and (D2) and
can be constructed in time O(n), where n is the number of vertices of G [22].
A sparse substitute for a node x with a set of boundary vertices X is obtained
as follows. First, compute a minimum spanning tree T_x of G_x. Next, discard all
edges in E(G_x) \ E(T_x) and all isolated vertices. An edge e is contractible if it
has a degree-one endpoint that is not a boundary vertex, or if it shares a
degree-two non-boundary vertex with another edge f such that cost(e) ≤ cost(f). Now,
repeat the following step while there is a contractible edge in G_x: choose any
contractible edge e and contract it. While doing this, keep a running total of
the cost of the contracted edges. The size of the resulting graph H_x is O(|X|) =
O(√n_x). Also, H_x is equivalent to G_x in that if the former is substituted for the
latter in the original graph, then the minimum spanning tree of the new graph,
together with the contracted edges, constitutes a minimum spanning tree of the
original graph. The sparse substitute computation satisfies (O1) and (O2).
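The contraction loop above can be sketched as follows. This is a simplified sketch under our own assumptions: the minimum spanning tree T_x is taken as already computed, and the edge-list representation and function name are ours:

```python
from collections import defaultdict

def sparsify_tree(edges, boundary):
    """Contract a minimum spanning tree down to O(|X|) size, following
    the text: repeatedly contract an edge with a non-boundary degree-one
    endpoint, or the cheaper of the two edges meeting at a non-boundary
    degree-two vertex. Returns (remaining_edges, contracted_cost)."""
    adj = defaultdict(dict)
    for u, v, c in edges:
        adj[u][v] = c
        adj[v][u] = c
    contracted = 0
    work = [v for v in adj if v not in boundary]
    while work:
        v = work.pop()
        if v not in adj or v in boundary:
            continue
        if len(adj[v]) == 1:                       # degree-one endpoint
            (u, c), = adj[v].items()
            contracted += c
            del adj[u][v], adj[v]
            work.append(u)
        elif len(adj[v]) == 2:                     # degree-two vertex
            (a, ca), (b, cb) = adj[v].items()
            if ca > cb:                            # contract the cheaper edge
                a, ca, b, cb = b, cb, a, ca
            contracted += ca
            del adj[a][v], adj[b][v], adj[v]
            adj[a][b] = adj[b][a] = cb             # the other edge survives
            work += [a, b]                         # their degrees changed
    remaining = {(min(u, v), max(u, v), c)
                 for u in adj for v, c in adj[u].items()}
    return remaining, contracted
```

On the path 1–2–3–4 with costs 5, 1, 7 and boundary {1, 4}, the two interior vertices are contracted away, leaving a single edge (1, 4) of cost 7 and a contracted total of 6 — the full tree cost 13 is preserved as 7 + 6.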
Minimum Spanning Trees in Dense Graphs. D is built in two steps. First, a
vertex partition tree is constructed by splitting the vertex set into two equal-size
parts (to within 1) and then recursively partitioning each half. This results in a
complete binary tree of height lg n where nodes at depth i have n/2^i vertices.
From the vertex partition tree we build an edge partition tree: for any two
nodes x and y of the vertex partition tree at the same depth i containing vertex
sets V_x and V_y, create a node E_{xy} in the edge partition tree containing all edges
of G in V_x × V_y. The parent of E_{xy} is E_{uw}, where u and w are, respectively, the
parents of x and y in the vertex partition tree. An internal node E_{xy} will have
three children if x = y and four otherwise. D is built from the edge partition
tree by including only non-empty nodes.
Let u = E_{xy} be a node in D. Let G_u be the subgraph of G with vertex set
V_x ∪ V_y and edge set E ∩ (V_x × V_y). For every j between 0 and the depth of D, (i)
there are at most 2^{2j} depth-j nodes, (ii) the edge sets of the graphs associated
with the nodes at depth j are disjoint and form a partition of E, and (iii) if u is
at depth j, G_u has at most n/2^j vertices and n^2/2^{2j} edges. If G is dense, then
m_u = |E(G_u)| = Θ(|V(G_u)|^2) for all u. Thus, (D1) and (D2) hold. The sparse
substitute for G_u is obtained by deleting from G_u all edges not in its minimum
spanning forest (which can be computed in O(|V(G_u)|^2) = O(m_u) time). The
size of the substitute is O(√m_u). Thus, (O1) and (O2) are satisfied.
6 Discussion
Our work shows the extent to which prune-and-search can be used in parametric
graph optimization problems. Unfortunately, the heavy algorithmic machinery
involved limits the practical use of our ideas. We also suspect that our decomposability framework is too rigid, and that it can be relaxed to solve other problems.
Finally, one may ask whether randomization is necessary. Simply substituting
replacing randomized cuttings with deterministic ones [6] gives an unacceptable slowdown.
References
1. P. K. Agarwal, D. Eppstein, L. J. Guibas, and M. R. Henzinger. Parametric and
kinetic minimum spanning trees. In Proceedings 39th IEEE Symp. on Foundations
of Computer Science, 1998.
2. P. K. Agarwal and M. Sharir. Algorithmic techniques for geometric optimization. In
J. van Leeuwen, editor, Computer Science Today: Recent Trends and Developments,
volume 1000 of Lecture Notes in Computer Science. Springer-Verlag, 1995.
3. P. K. Agarwal, M. Sharir, and S. Toledo. An efficient multidimensional searching
technique and its applications. Technical Report CS-1993-20, Computer Science
Department, Duke University, July 1993.
4. R. Agarwala and D. Fernández-Baca. Weighted multidimensional search and its
application to convex optimization. SIAM J. Computing, 25:83–99, 1996.
5. P. M. Camerini, F. Maffioli, and C. Vercellis. Multi-constrained matroidal knapsack
problems. Mathematical Programming, 45:211–231, 1989.
6. B. Chazelle. Cutting hyperplanes for divide-and-conquer. Discrete Comput. Geom.,
9(2):145–158, 1993.
7. B. Chazelle. A faster deterministic algorithm for minimum spanning trees. In
Proceedings 38th IEEE Symp. on Foundations of Computer Science, pages 22–31,
1997.
8. B. Chazelle and J. Friedman. A deterministic view of random sampling and its
use in geometry. Combinatorica, 10(3):229–249, 1990.
9. K. L. Clarkson. Linear programming in O(n × 3^{d²}) time. Information Processing
Letters, 22:21–24, 1986.
10. E. Cohen and N. Megiddo. Maximizing concave functions in fixed dimension. In
P. M. Pardalos, editor, Complexity in Numerical Optimization, pages 74–87. World
Scientific, Singapore, 1993.
11. R. Cole, J. S. Salowe, W. L. Steiger, and E. Szemerédi. An optimal-time algorithm
for slope selection. SIAM J. Computing, 18:792–810, 1989.
12. M. E. Dyer. On a multidimensional search technique and its application to the
euclidean one-centre problem. SIAM J. Computing, 15(3):725–738, 1986.
13. H. Edelsbrunner. Algorithms in Combinatorial Geometry. Springer-Verlag, Heidelberg, 1987.
14. D. Eppstein. Geometric lower bounds for parametric matroid optimization. Discrete Comput. Geom., 20:463–476, 1998.
15. D. Eppstein, Z. Galil, G. F. Italiano, and A. Nissenzweig. Sparsification — a
technique for speeding up dynamic graph algorithms. J. Assoc. Comput. Mach.,
44:669–696, 1997.
16. D. Eppstein, Z. Galil, G. F. Italiano, and T. H. Spencer. Separator-based sparsification I: planarity testing and minimum spanning trees. J. Computer and System
Sciences, 52:3–27, 1996.
17. D. Fernández-Baca and G. Slutzki. Linear-time algorithms for parametric minimum spanning tree problems on planar graphs. Theoretical Computer Science,
181:57–74, 1997.
18. D. Fernández-Baca and G. Slutzki. Optimal parametric search on graphs of
bounded tree-width. Journal of Algorithms, 22:212–240, 1997.
19. D. Fernández-Baca, G. Slutzki, and D. Eppstein. Using sparsification for parametric minimum spanning tree problems. Nordic Journal of Computing, 3(4):352–366,
1996.
20. G. N. Frederickson. Optimal algorithms for partitioning trees and locating p-centers in trees. Technical Report CSD-TR 1029, Department of Computer Science,
Purdue University, Lafayette, IN, October 1990.
21. D. R. Karger, P. N. Klein, and R. E. Tarjan. A randomized linear-time algorithm
for finding minimum spanning trees. J. Assoc. Comput. Mach., 42:321–328, 1995.
22. P. N. Klein, S. Rao, M. Rauch, and S. Subramanian. Faster shortest-path algorithms for planar graphs. In Proceedings of the 26th Annual ACM Symposium on
Theory of Computing, pages 27–37, 1994.
23. N. Megiddo. Combinatorial optimization with rational objective functions. Math.
Oper. Res., 4:414–424, 1979.
24. N. Megiddo. Applying parallel computation algorithms in the design of serial
algorithms. J. Assoc. Comput. Mach., 30(4):852–865, 1983.
25. N. Megiddo. Linear programming in linear time when the dimension is fixed. J.
Assoc. Comput. Mach., 31(1):114–127, 1984.
26. C. H. Norton, S. A. Plotkin, and É. Tardos. Using separation algorithms in fixed
dimension. Journal of Algorithms, 13:79–98, 1992.
Linear Time Recognition of Optimal
L-Restricted Prefix Codes
(Extended Abstract)
Ruy Luiz Milidiú¹ and Eduardo Sany Laber²

¹ Departamento de Informática, PUC-Rio, Brazil
Rua Marquês de São Vicente 225, RDC, sala 514, FPLF
Gávea, Rio de Janeiro, CEP 22453-900, phone 5521-511-1942
milidiu@inf.puc-rio.br
² COPPE/UFRJ, Caixa Postal 68511, 21945-970, Rio de Janeiro, RJ, Brazil
tel: +55 21 590-2552, fax: +55 21 290-6626
laber@inf.puc-rio.br
1 Introduction
Given an alphabet Σ = {a_1, …, a_n} and a corresponding list of weights [w_1, …, w_n],
an optimal prefix code is a prefix code for Σ that minimizes the weighted
length of a code string, defined to be Σ_{i=1}^n w_i l_i, where l_i is the length of the
codeword assigned to a_i. This problem is equivalent to the following problem:
given a list of weights [w_1, …, w_n], find an optimal binary code tree, that is,
a binary tree T that minimizes the weighted path length Σ_{i=1}^n w_i l_i, where l_i is
the level of the i-th leaf of T from left to right. If the list of weights is sorted,
this problem can be solved in O(n) time by one of the efficient implementations of
Huffman’s Algorithm [Huf52]. Any tree constructed by Huffman’s Algorithm is
called a Huffman tree.
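One well-known O(n) implementation for sorted input is the two-queue scheme; a minimal sketch (computing only the optimal weighted path length, not the tree itself — the function name is ours):

```python
from collections import deque

def huffman_cost_sorted(weights):
    """Optimal weighted path length for weights sorted non-decreasingly,
    via the two-queue Huffman construction; O(n) time."""
    if len(weights) < 2:
        return 0
    leaves, internal = deque(weights), deque()

    def pop_min():
        # both queues stay sorted, so the global minimum is at a front
        if not internal or (leaves and leaves[0] <= internal[0]):
            return leaves.popleft()
        return internal.popleft()

    total = 0
    while len(leaves) + len(internal) > 1:
        s = pop_min() + pop_min()
        internal.append(s)
        total += s  # weighted path length = sum of all merged weights
    return total
```

For example, huffman_cost_sorted([1, 1, 2, 3, 5]) returns 25, matching the optimal codeword lengths (4, 4, 3, 2, 1).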
In this paper, we consider optimal L-restricted prefix codes. Given a list of
weights [w_1, …, w_n] and an integer L, with ⌈log n⌉ ≤ L ≤ n − 1, an optimal
L-restricted prefix code is a prefix code that minimizes Σ_{i=1}^n w_i l_i subject to
l_i ≤ L for i = 1, …, n. Gilbert [Gil71] recommends the use of these codes when
the weights w_i are inaccurately known. Zobel and Moffat [ZM95] describe the
use of word-based Huffman codes for compression of large textual databases.
Their application allows a maximum of 32 bits for each codeword. For the
cases that exceed this limit, they recommend the use of L-restricted codes.
Some methods can be found in the literature to generate optimal L-restricted
prefix codes, and different algorithm design techniques have been used to solve
this problem. The first polynomial algorithm is due to Garey [Gar74]. The algorithm is based on dynamic programming and has O(n²L) complexity in
both time and space. Larmore and Hirschberg [LH90] presented the Package-Merge algorithm. This algorithm uses a greedy approach and runs in O(nL) time,
with an O(n) space requirement. The authors reduce the original problem to the
Coin Collector’s Problem, using a nodeset representation of a binary tree. Turpin
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 227–236, 2000.
© Springer-Verlag Berlin Heidelberg 2000
and Moffat [TM96] discuss some practical aspects of the implementation of the
Package-Merge algorithm. In [Sch95], Schieber obtains an O(n · 2^{O(√(log L log log n))})
time algorithm. Currently, this is the fastest strongly polynomial time algorithm
for constructing optimal L-restricted prefix codes. Despite the effort of several
researchers, it remains open whether there is an O(n log n) algorithm for this problem.
In this paper we give a linear time algorithm to recognize an optimal L-restricted prefix code. This linear time complexity holds under the assumption
that the given list of weights is already sorted. If the list of weights is not sorted,
then the algorithm requires an O(n log n) initial sorting step. This algorithm is
based on the nodeset representation of binary trees [LH90].
We assume that we are given an alphabet Σ = {a_1, …, a_n} with corresponding weights 0 < w_1 ≤ ⋯ ≤ w_n, an integer L with ⌈log n⌉ ≤ L < n, and a list
of lengths l = [l_1, …, l_n], where l_i is the length of the codeword assigned to a_i.
We say that l is optimal iff l_1, …, l_n are the codeword lengths of an optimal
L-restricted prefix code for Σ. The Recognition algorithm that we introduce here
determines if l is optimal or not.
The paper is organized as follows. In section 2, we present the nodeset representation for binary trees and we derive some useful properties. In section 3,
we present the Recognition algorithm. In section 4, we outline a proof of the
algorithm’s correctness.
2 Trees and Nodesets
For positive integers i and h, let us define a node as an ordered pair (i, h), where
i is called the node index and h is the node level. A set of nodes is called a
nodeset.
We define the background R(n, L) as the nodeset given by
R(n, L) = {(i, h) | 1 ≤ i ≤ n, 1 ≤ h ≤ L}.
Let T be a binary tree with n leaves and with corresponding leaf levels
l_1 ≥ … ≥ l_n. The treeset N(T) associated to T is defined as the nodeset given by
N(T) = {(i, h) | 1 ≤ h ≤ l_i, 1 ≤ i ≤ n}.
The background in figure 1 is the nodeset R(8, 5). The nodes inside the
polygon are those of N(T), where T is the tree with leaves at the following levels:
5, 5, 5, 5, 3, 3, 3, 1.
For any nodeset A ⊂ R(n, L) we define the complementary nodeset Ā as
Ā = R(n, L) − A. In figure 1, the nodes outside of the polygon are those of the
nodeset N̄(T).
Given a node (i, h), define width(i, h) = 2^{−h} and weight(i, h) = w_i. The
width and the weight of a nodeset are defined as the sums of the corresponding
widths and weights of its constituent nodes. Let T be a binary tree with n
leaves and corresponding leaf levels l_1 ≥ … ≥ l_n. It is not difficult to show
[LH90] that width(N(T)) = n − 1 and weight(N(T)) = Σ_{i=1}^n w_i l_i.
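These definitions translate directly into code. A small sketch (function names are ours; exact dyadic arithmetic via Fraction) that also lets one check the two identities on the tree of figure 1:

```python
from fractions import Fraction

def treeset(levels):
    """N(T) for leaf levels l_1 >= ... >= l_n (1-based indices)."""
    return {(i, h) for i, l in enumerate(levels, 1) for h in range(1, l + 1)}

def width(nodes):
    """Sum of 2**-h over the nodes, computed exactly."""
    return sum(Fraction(1, 2 ** h) for _, h in nodes)

def weight(nodes, w):
    """Sum of w_i over the nodes (w is 0-based)."""
    return sum(w[i - 1] for i, _ in nodes)
```

For the leaf levels 5, 5, 5, 5, 3, 3, 3, 1 of figure 1, width(N(T)) = 7 = n − 1, and weight(N(T)) equals Σ w_i l_i for any list of weights.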
In [LH90], Larmore and Hirschberg reduced the problem of finding an optimal
code tree with restricted maximal height L, for a given list of weights w1 , . . . , wn
to the problem of finding the nodeset with width n − 1 and minimum weight
Fig. 1. The background R(8, 5) and the treeset N (T ) associated to a tree T with leaf
levels 5, 5, 5, 5, 3, 3, 3, 1
included in the background R(n, L). Here, we need a slight variation of the main
theorem proved in [LH90].
Theorem 1. If a tree T is an optimal code tree with restricted maximum height
L, then the nodeset associated to T has minimum weight among all nodesets with
width n − 1 included in R(n, L).
The proof of the previous theorem is similar to that presented in [LH90].
Therefore, the list l = [l_1 ≥ ⋯ ≥ l_n] is a list of optimal L-restricted codeword
lengths for Σ if and only if the nodeset N = {(i, h) | 1 ≤ i ≤ n, 1 ≤ h ≤ l_i} has
minimum weight among the nodesets in R(n, L) that have width equal to n − 1.
In order to find the nodeset with width n − 1 and minimum weight, Larmore
and Hirschberg used the Package-Merge algorithm. This algorithm was proposed
in [LH90] to address the following problem: given a nodeset R and a width d,
find a nodeset X included in R with width d and minimum weight. The PackageMerge uses a greedy approach and runs in O(|R|), where |R| denotes the number
of nodes in the nodeset R. The Recognition algorithm uses the Package-Merge
as an auxiliary procedure.
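The Coin Collector's step at the heart of Package-Merge can be sketched as follows; the representation (a node as a triple (h, weight, label) of width 2^{−h}) and the function name are our assumptions. Denominations are processed smallest-width first; at each level the cheapest item is taken when the binary expansion of the remaining target width demands it, and the rest are paired into packages of the next denomination:

```python
from fractions import Fraction

def package_merge(coins, d):
    """coins: list of (h, weight, label), each of width 2**-h (h >= 0 int);
    d: target total width (a dyadic Fraction). Returns (weight, labels)
    of a minimum-weight subset of total width exactly d, or None."""
    by_h = {}
    for h, wt, label in coins:
        by_h.setdefault(h, []).append((wt, [label]))
    remaining = Fraction(d)
    hmax = max(by_h)
    hmin = -max(0, remaining.numerator.bit_length())  # safe lower bound
    solution, packages = [], []
    for h in range(hmax, hmin - 1, -1):
        items = sorted(by_h.get(h, []) + packages, key=lambda t: t[0])
        w = Fraction(1, 2) ** h
        if remaining % (2 * w) != 0:       # digit of d at 2**-h is one
            if not items:
                return None                # target width unreachable
            wt, labels = items.pop(0)      # take the cheapest item
            solution.append((wt, labels))
            remaining -= w
        # pair consecutive cheapest items into next-denomination packages
        packages = [(a[0] + b[0], a[1] + b[1])
                    for a, b in zip(items[0::2], items[1::2])]
    if remaining != 0:
        return None
    total = sum(wt for wt, _ in solution)
    labels = sorted(l for _, ls in solution for l in ls)
    return total, labels
```

For instance, among the nodes A, B of width 1/2 (weights 3, 5) and C, D, E of width 1/4 (weights 1, 2, 8), the minimum-weight subset of width 1 is {A, C, D} with weight 6.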
3 Recognition Algorithm

In this section, we describe a linear time algorithm for recognizing optimal L-restricted prefix codes. The algorithm is divided into two phases.
3.1 First Phase
First, the algorithm scans the list l to check if there is an index i such that
w_i < w_{i+1} and l_i < l_{i+1}. If that is the case, the algorithm outputs that l is
not optimal and stops, since the weighted path length can then be reduced by
interchanging l_i and l_{i+1}. In the negative case, we sort l in non-increasing order
of lengths. This can be done in linear time since all the elements of l are integers
not greater than n. We claim that l is optimal if and only if the list obtained by
sorting l is optimal. In fact, the only case where l_i < l_{i+1} is when w_i = w_{i+1}. In
this case, we can interchange l_i and l_{i+1} while maintaining the same external weighted
path length. Hence, we assume that l is sorted with l_1 ≥ ⋯ ≥ l_n.
After sorting, the algorithm verifies whether l_1 > L. In the affirmative case, the
algorithm stops and outputs “l is not optimal”. In the negative case, l respects
the length restriction. Then, the algorithm verifies whether Σ_{i=1}^n 2^{−l_i} = 1. This step
can be performed in O(n) time since l is sorted. If Σ_{i=1}^n 2^{−l_i} ≠ 1, then the algorithm
outputs “l is not optimal”. In effect, if Σ_{i=1}^n 2^{−l_i} > 1, then it follows from the
McMillan-Kraft inequality [McM56] that there is no prefix code with codeword
lengths given by l. On the other hand, if Σ_{i=1}^n 2^{−l_i} < 1, then Σ_{i=1}^n w_i l_i can be
reduced by decreasing by one unit the longest codeword length l_1. If
Σ_{i=1}^n 2^{−l_i} = 1, then the algorithm verifies two additional conditions:
1. l_1, …, l_n are the codeword lengths of an unrestricted optimal prefix code for Σ;
2. l_1 = L.
Condition 1 can be verified in O(n) time by comparing Σ_{i=1}^n w_i l_i to the optimal weighted path length obtained by Huffman’s algorithm. Condition 2 can be
checked in O(1) time. We have three cases, enumerated below:
Case 1) Condition 1 holds. The algorithm outputs “l is optimal” and stops.
Case 2) Neither condition 1 nor condition 2 holds. The algorithm outputs “l is not
optimal” and stops.
Case 3) Only condition 2 holds. The algorithm goes to phase 2.
In the second case, the external weighted path length can be reduced, without
violating the height restriction, by interchanging two nodes that differ by at most
one level in the tree with leaf levels given by l.
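The whole first phase can be sketched as follows. This is a sketch under our own conventions: huffman_cost is the optimal unrestricted weighted path length (e.g. from the two-queue construction), and the string return values are ours:

```python
def phase_one(weights, lengths, L, huffman_cost):
    """First phase of the Recognition algorithm (sketch).
    weights: 0 < w_1 <= ... <= w_n; lengths: candidate codeword lengths.
    Returns 'optimal', 'not optimal', or 'phase 2'."""
    n = len(weights)
    # a swap of l_i and l_{i+1} would reduce the weighted path length
    for i in range(n - 1):
        if weights[i] < weights[i + 1] and lengths[i] < lengths[i + 1]:
            return "not optimal"
    # counting sort into non-increasing order (lengths are small integers)
    count = [0] * (max(lengths) + 1)
    for l in lengths:
        count[l] += 1
    ls = [h for h in range(len(count) - 1, 0, -1) for _ in range(count[h])]
    if ls[0] > L:                     # height restriction violated
        return "not optimal"
    # Kraft sum in exact integer arithmetic: sum of 2^{-l_i} equal to 1?
    if sum(2 ** (L - l) for l in ls) != 2 ** L:
        return "not optimal"
    cost = sum(w * l for w, l in zip(weights, lengths))
    if cost == huffman_cost:          # condition 1: unrestricted optimum
        return "optimal"
    if ls[0] == L:                    # condition 2: go to the second phase
        return "phase 2"
    return "not optimal"
```

For weights [1, 1, 2, 3, 5] (Huffman cost 25), the lengths (4, 4, 3, 2, 1) are recognized as optimal when L = 4, while the lengths (3, 3, 2, 2, 2) with L = 3 satisfy the Kraft equality and l_1 = L but not condition 1, so they are passed on to the second phase.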
3.2 Second Phase
If the algorithm does not stop at the first phase, then it examines the treeset
N associated to the binary tree with leaf levels l_1 ≥ … ≥ l_n. In this phase,
the algorithm determines if N is optimal or not. Recall that this is equivalent to
determining if l is optimal or not.
First, we define three disjoint sets. We define the boundary F of a treeset N by

F = {(i, h) ∈ N | (i, h + 1) ∉ N}

Let us also define F_2 by

F_2 = {(i, h) ∈ N − F | h < L − ⌈log n⌉ − 1 and (i + 1, h) ∉ N − F}

Now, for h = L − ⌈log n⌉ − 1, …, L − 1, let i_h be the largest index i such that
(i, h) ∈ N − F. Then, define the nodeset M as

M = {(i, h) | L − ⌈log n⌉ − 1 ≤ h ≤ L − 1, i_h − 2^{h+⌈log n⌉−L+1} < i ≤ i_h}.
Linear Time Recognition of Optimal L-Restricted Prefix Codes
231
The way that the nodeset M is defined assures that M contains a nodeset
with minimum weight among the nodesets included in (N − F) that have width
equal to d, where d is a given dyadic¹ number not greater than 2^{⌈log n⌉−L+1}.
In figure 2, R(14, 10) is the background. The polygon bounds the nodeset N.
The nodes of the boundary F are the nodes inside the polygon that have the
letter F written inside. The nodes of the nodeset M are those inside the polygon
that have the letter M, while the nodes of the nodeset F_2 are those inside the
polygon that have the number 2.
Fig. 2. The nodesets used by the Recognition algorithm
Now, we define three other disjoint nodesets. The upper boundary U of the
nodeset N is defined by

U = {(i, h) ∈ N̄ | (i, h − 1) ∈ N}

Let us also define U_2 by

U_2 = {(i, h) ∈ N̄ − U | h < L − ⌈log n⌉ − 1 and (i − 1, h) ∈ N ∪ U}

Now, for h = L − ⌈log n⌉ − 1, …, L − 1, let i′_h be the smallest index i such
that (i, h) ∉ N ∪ U. We define the nodeset P in the following way:

P = {(i, h) | L − ⌈log n⌉ − 1 ≤ h ≤ L, i′_h ≤ i < 2^{h+⌈log n⌉−L+1} + i′_h}.
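For a treeset given by sorted leaf levels, the boundary F and the upper boundary U take a particularly simple form: the topmost node of each column of N, and the node of the complement directly above it. A small sketch, assuming (as the surrounding text indicates) that U is taken over the complement N̄; the function name is ours:

```python
def boundary_sets(levels, L):
    """F and U for the treeset with leaf levels l_1 >= ... >= l_n inside
    the background R(n, L): F is the topmost node of each column of N,
    and U is the first row of the complement directly above N."""
    F = {(i, l) for i, l in enumerate(levels, 1)}
    U = {(i, l + 1) for i, l in enumerate(levels, 1) if l < L}
    return F, U
```

On the figure-1 treeset (leaf levels 5, 5, 5, 5, 3, 3, 3, 1 with L = 5), F has one node per column and U has one node above every column whose top lies strictly below level L.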
In figure 2, the nodes of the upper boundary U are those outside the polygon
that have the letter U written inside. The nodes of the nodeset P are those
outside the polygon that have the letter P, while the nodes of the nodeset U_2
are those outside the polygon that have the number 2.

¹ A dyadic number is a number that can be written as a sum of integer powers of 2.
The recognition algorithm performs three steps. The pseudo-code is presented
in figure 3.
Recognition Algorithm: Phase 2
1. Evaluate the width and the weight of the nodeset F ∪ M ∪ F_2.
2. Apply the Package-Merge algorithm to obtain the nodeset X with minimum weight
among the nodesets included in F ∪ M ∪ F_2 ∪ U ∪ P ∪ U_2 that have width equal to
width(F ∪ M ∪ F_2).
3. If weight(X) < weight(F ∪ M ∪ F_2), then output “N is not optimal”; otherwise
output “N is optimal”.
Fig. 3. The second phase of the recognition algorithm.
As an example, let us assume that we want to decide whether the list of
lengths l = [10, 10, 10, 10, 9, 9, 7, 6, 6, 6, 4, 3, 2, 1] is optimal for an alphabet Σ with
weights [1, 1, 2, 3, 5, 8, 13, 21, 34, 61, 89, 144, 233, 377] and L = 10. The nodeset
N associated to l is the one presented in figure 2. The width of the nodeset
F ∪ M ∪ F_2 is equal to 2 + 2^{−4} + 2^{−7} and its weight is equal to 1646. Let A be the
nodeset {(5, 10), (6, 10), (7, 9), (7, 8), (8, 7)} and let B be the nodeset {(10, 6)}.
The nodeset F ∪ M ∪ F_2 ∪ A − B has width 2 + 2^{−4} + 2^{−7} and weight 1645, and as a
consequence, at step 2 the Package-Merge finds a nodeset with weight smaller
than or equal to 1645. Therefore, the algorithm outputs that l is not optimal.
3.3 Algorithm Analysis
The linear time complexity of the algorithm is established below.
Theorem 2. The Recognition algorithm runs in O(n) time.
Sketch of the Proof: Phase 1 can be implemented in O(n) time, as we
showed in section 3.1.
Let us analyze the second phase. Step 1 can be performed in linear time, as
we argue now. We define i_s(h) and i_b(h), respectively, as the smallest and the
biggest index i such that (i, h) ∈ F ∪ M ∪ F_2. It is easy to verify that i_b(h) is
given by the number of elements in l that are greater than or equal to h. On
the other hand, i_s(h) can be obtained through i_b(h) and the definitions of F, M
and F_2. Hence, the set {(i_s(L), i_b(L)), …, (i_s(1), i_b(1))} can be obtained in O(n)
time. Since the width of F ∪ M ∪ F_2 is given by Σ_{h=1}^L (i_b(h) − i_s(h) + 1) × 2^{−h},
it can be evaluated in linear time.
Now, we consider step 2. We define i′_s(h) and i′_b(h), respectively, as the smallest and the biggest index i such that (i, h) ∈ F ∪ M ∪ F_2 ∪ U ∪ P ∪ U_2. Observe that
i_s(h) = i′_s(h). Furthermore, i′_b(h) can be obtained through i_b(h) and the definitions of U, P and U_2. Hence, the set {(i′_s(L), i′_b(L)), …, (i′_s(1), i′_b(1))} is obtained
in O(n) time. Now, we show that the number of nodes in F ∪ M ∪ F_2 ∪ U ∪ P ∪ U_2
is O(n). We consider each nodeset separately. It follows from the definitions of F
and U that each of them has at most n nodes. In addition, both F_2 and U_2 have at
most L − ⌈log n⌉ − 2 nodes. Furthermore, from the definitions of M and P one can
show that |M ∪ P| < 5n. Then it follows that |F ∪ M ∪ F_2 ∪ U ∪ P ∪ U_2| ≤ 7n + 2L.
Therefore, the Package-Merge runs in O(n) time at step 2.
Step 3 is obviously performed in O(1) time.
The correctness of the algorithm is a consequence of the theorem stated
below.
Theorem 3. If N is not an optimal nodeset, then it is possible to obtain a new
nodeset with width n − 1 and weight smaller than that of N by replacing some nodes in
F ∪ M ∪ F_2 by other nodes in U ∪ P ∪ U_2.
The sketch of the proof of theorem 3 is presented in the next section. The
complete proof can be found in the full version of this paper.
Now, we prove that theorem 3 implies the correctness of the Recognition
algorithm.
Theorem 4. The second phase of the Recognition algorithm is correct.
Proof: First, we assume that N is not an optimal nodeset. In this case,
theorem 3 assures the existence of nodesets A and B that satisfy the following
conditions:
(i) A ⊂ U ∪ P ∪ U_2 and B ⊂ F ∪ M ∪ F_2;
(ii) width(A) = width(B);
(iii) weight(A) < weight(B).
These conditions imply that weight(F ∪ M ∪ F_2 ∪ A − B) < weight(F ∪ M ∪ F_2)
and width(F ∪ M ∪ F_2 ∪ A − B) = width(F ∪ M ∪ F_2). Let X be the nodeset found
by Package-Merge at step 2 in the second phase. Since F ∪ M ∪ F_2 ∪ A − B ⊂
F ∪ M ∪ F_2 ∪ U ∪ P ∪ U_2, it follows that weight(X) ≤ weight(F ∪ M ∪ F_2 ∪ A − B).
Therefore, weight(X) < weight(F ∪ M ∪ F_2). Hence, the algorithm outputs that
N is not optimal.
Now, we assume that N is optimal. In this case, weight(X) ≥ weight(F ∪
M ∪ F_2); otherwise, N ∪ X − (F ∪ M ∪ F_2) would have weight smaller than that
of N, which would contradict our assumption. Hence, the algorithm outputs that
N is optimal.
4 Correctness
In this section, we outline the proof of theorem 3. We start by defining the
concept of a decreasing pair.
Definition 1. If a pair of nodesets (A, B) satisfies conditions (i)–(iii) listed
below, then we say that (A, B) is a decreasing pair associated to N.
(i) A ⊂ N̄ and B ⊂ N;
(ii) width(A) = width(B);
(iii) weight(A) < weight(B).
For the sake of simplicity, we use the term DP to denote a decreasing pair
associated to N . We can state the following result.
Proposition 1. The nodeset N is not optimal if and only if there is a DP (A, B)
associated to N .
Proof: If N is not optimal, (N ∗ − N, N − N ∗ ) is a DP, where N ∗ is an
optimal nodeset. If (A, B) is a DP associated to N , then N ∪ A − B has width
equal to that of N and has weight smaller than that of N .
Now, we define the concept of a good pair (GP).
Definition 2. A GP is a DP (A, B) that satisfies the following conditions:
(i) For every DP (A′, B′), we have that width(A) ≤ width(A′);
(ii) If A′ ⊂ N̄ and width(A′) = width(A), then weight(A) ≤ weight(A′);
(iii) If B′ ⊂ N and width(B′) = width(B), then weight(B) ≥ weight(B′).
Now, we state some properties concerning good pairs.
Proposition 2. If N is not optimal, then there is a GP associated to N .
Proof: If N is not optimal, then it follows from proposition 1 that there is at
least one DP associated to N. Let d be the width of a minimum-width DP
associated to N. Furthermore, let A∗ be a nodeset with minimum weight
among the nodesets included in N̄ that have width d, and let B∗ be the nodeset
with maximum weight among the nodesets included in N that have width d. By
definition, (A∗, B∗) is a GP.
Proposition 3. Let (A∗, B∗) be a GP. If A′ ⊂ A∗ and B′ ⊂ B∗, then
width(A′) ≠ width(B′).
Proof: Let us assume that A′ ⊂ A∗, B′ ⊂ B∗ and width(A′) = width(B′).
In this case, let us consider the following partitions: A∗ = A′ ∪ (A∗ − A′) and
B∗ = B′ ∪ (B∗ − B′). Since weight(A∗) < weight(B∗), either weight(A′) <
weight(B′) or weight(A∗ − A′) < weight(B∗ − B′). If weight(A′) < weight(B′),
then (A′, B′) is a DP with width(A′) < width(A∗), which contradicts the fact that
(A∗, B∗) is a GP. On the other hand, if weight(A∗ − A′) < weight(B∗ − B′),
then (A∗ − A′, B∗ − B′) is a DP with width(A∗ − A′) < width(A∗), which also
contradicts the fact that (A∗, B∗) is a GP. Hence, width(A′) ≠ width(B′).
Proposition 4. If (A∗, B∗) is a GP, then the following conditions hold:
(a) width(A∗) = width(B∗) ≤ 2^{−1};
(b) width(A∗) = 2^{−s_1}, for some integer s_1 with 1 ≤ s_1 ≤ L;
(c) Either 1 = |A∗| < |B∗| or 1 = |B∗| < |A∗|.
Proof: (a) Let us assume that width(A∗) = width(B∗) > 2^{−1}. In this case,
one can show, by applying Lemma 1 of [LH90] at most L times, that both A∗
and B∗ contain a nodeset with width 2^{−1}. This contradicts proposition 3.
Hence, width(A∗) = width(B∗) ≤ 2^{−1}.
(b) Now, we assume that width(A∗) = Σ_{i=1}^k 2^{−s_i}, where 1 ≤ s_1 < s_2 < ⋯ < s_k
and k > 1. In this case, one can show that A∗ contains a nodeset A′ with
width 2^{−s_1} and B∗ contains a nodeset B′ with width 2^{−s_1}, which contradicts
proposition 3. Hence, k = 1 and width(A∗) = 2^{−s_1}.
(c) First, we show that either 1 = |A∗| or 1 = |B∗|. Let us assume the
opposite, that is, 1 < |A∗| and 1 < |B∗|. Since width(A∗) = width(B∗) = 2^{−s_1}
for some 1 ≤ s_1 ≤ L, one can show that both A∗ and B∗ contain a nodeset
with width 2^{−(s_1+1)}, which contradicts proposition 3. Hence, either 1 = |A∗|
or 1 = |B∗|. Furthermore, we cannot have 1 = |A∗| = |B∗|. In effect, if 1 =
|A∗| = |B∗|, we would have weight(A∗) ≥ weight(B∗), and as a consequence,
(A∗, B∗) would not be a GP.
The previous result allows us to divide our analysis into two cases: in the
first case, A∗ has only one node; in the second, B∗ has only one node. We
define two special kinds of pairs: removal good pairs (RGP) and addition good
pairs (AGP).
Definition 3. If (A∗, B∗) is a GP, |A∗| = 1 and, for every GP (A, B) with |A| = 1,
we have width(B∗ − F) ≤ width(B − F), then (A∗, B∗) is a removal good pair
(RGP).
Definition 4. If (A∗, B∗) is a GP, |B∗| = 1 and, for every GP (A, B) with |B| = 1,
we have width(A∗ − U) ≤ width(A − U), then (A∗, B∗) is an addition good
pair (AGP).
We can state the following result.
Proposition 5. If there is a GP associated to N, then there is either an RGP
associated to N or an AGP associated to N.
Proof: The proof is similar to that of proposition 2.
Theorem 5. If there is an RGP (A, B) associated to N, then there is an RGP (A∗, B∗) associated to N that satisfies the following conditions:
(a) If (i, h) ∈ A∗, then (i − 1, h) ∈ N;
(b) |B∗ − (F ∪ M)| ≤ 1;
(c) If (i, h) ∈ B∗ − (F ∪ M), then (i + 1, h) ∉ N − F.
Proof: We leave this proof for the full version of the paper. It requires some additional lemmas and uses arguments similar to those employed in the proof of Proposition 4.
Theorem 6. If there is an AGP (A, B) associated to N , then there is an AGP
(A∗ , B ∗ ) associated to N that satisfies the following conditions:
(a) If (i, h) ∈ B ∗ , then (i + 1, h) ∈ N ;
(b) |A∗ − (U ∪ P )| ≤ 1;
(c) If (i, h) ∈ A∗ − (U ∪ P), then (i − 1, h) ∈ N ∪ U.
R.L. Milidiú, E.S. Laber
Proof: We leave this proof to the full paper.
Now we prove Theorem 3, which implies the correctness of the Recognition algorithm.
Proof of Theorem 3: If N is not an optimal nodeset, then it follows from Proposition 2 that there is a GP associated to N. Hence, it follows from Proposition 5 that there is either an RGP or an AGP associated to N. We consider two cases:
Case 1) There is an RGP associated to N.
It follows from Theorem 5 that there is an RGP (A∗, B∗) associated to N that satisfies conditions (a), (b) and (c) proposed in that theorem. From the definitions of the nodesets F, M, F2, U, P, U2 it is easy to verify that those conditions imply that B∗ ⊂ F ∪ M ∪ F2 and A∗ ⊂ U ∪ P ∪ U2.
Case 2) There is an AGP associated to N . The proof is analogous to that of
case 1.
Hence, we conclude that if N is not optimal, then there is a GP (A∗, B∗) such that B∗ ⊂ F ∪ M ∪ F2 and A∗ ⊂ U ∪ P ∪ U2. Therefore, it is possible to reduce the weight of N by adding some nodes that belong to U ∪ P ∪ U2 and removing some other nodes that belong to F ∪ M ∪ F2.
References
[Gar74] M. R. Garey. Optimal binary search trees with restricted maximal depth. SIAM Journal on Computing, 3(2):101–110, June 1974.
[Gil71] E. N. Gilbert. Codes based on inaccurate source probabilities. IEEE Transactions on Information Theory, 17:304–314, 1971.
[Huf52] D. A. Huffman. A method for the construction of minimum-redundancy codes. Proc. Inst. Radio Eng., 40(9):1098–1101, September 1952.
[LH90] Lawrence L. Larmore and Daniel S. Hirschberg. A fast algorithm for optimal length-limited Huffman codes. Journal of the ACM, 37(3):464–473, July 1990.
[McM56] B. McMillan. Two inequalities implied by unique decipherability. IEEE Transactions on Information Theory, 22:155–156, 1956.
[Sch95] Baruch Schieber. Computing a minimum-weight k-link path in graphs with the concave Monge property. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 405–411, San Francisco, California, 22–24 January 1995.
[TM96] Andrew Turpin and Alistair Moffat. Efficient implementation of the package-merge paradigm for generating length-limited codes. In Michael E. Houle and Peter Eades, editors, Proceedings of Conference on Computing: The Australian Theory Symposium, pages 187–195, Townsville, 29–30 January 1996. Australian Computer Science Communications.
[ZM95] Justin Zobel and Alistair Moffat. Adding compression to a full-text retrieval system. Software—Practice and Experience, 25(8):891–903, August 1995.
Uniform Multi-hop All-to-All Optical Routings in Rings⋆
Jaroslav Opatrny
Concordia University, Department of Computer Science, Montréal, Canada,
email: opatrny@cs.concordia.ca
WWW home page: http://www.cs.concordia.ca/~faculty/opatrny/
Abstract. We consider the all-to-all routing problem in an optical ring network that uses wavelength-division multiplexing (WDM). Since one-hop, all-to-all optical routing in a WDM optical ring of n nodes needs w⃗(Cn, IA, 1) = ⌈(1/2)⌊n²/4⌋⌉ wavelengths, which can be too large even for moderate values of n, we consider in this paper j-hop implementations of all-to-all routing in a WDM optical ring, j ≥ 2. From among the possible routings we focus our attention on uniform routings, in which each node of the ring uses the same communication pattern. We show that there exist uniform 2-hop, 3-hop, and 4-hop implementations of all-to-all routing that need at most ((n+3)/3)√((n+16)/5) + n/4, (n/2)∛((⌈n/4⌉+4)/5), and ((n+16)/2)(⌈n/4⌉)^{1/4} + 8 wavelengths, respectively. These values are within multiplicative constants of lower bounds.
1 Introduction
Optical-fiber transmission systems are expected to provide a mechanism to build
high-bandwidth, error-free communication networks, with capacities that are orders of magnitude higher than traditional networks. The high data transmission
rate is achieved by transmitting information through optical signals, and maintaining the signal in optical form during switching. Wavelength-division multiplexing (or WDM for short) is one of the most commonly used approaches to
introduce concurrency into such high-capacity networks [5,6]. In this strategy,
the optical spectrum is divided into many different channels, each channel corresponding to a different wavelength. A switched WDM network consists of nodes connected by point-to-point fiber-optic links. Typically, a pair of nodes connected by a fiber-optic link is joined by a pair of optical cables. Each cable is used in one direction and can support a fixed number of wavelengths. The switches in the nodes are capable of redirecting incoming streams based on wavelengths. We assume that switches cannot change the wavelengths, i.e., there are no wavelength converters.
Thus, a WDM optical network can be represented by a symmetric digraph,
that is, a directed graph G with vertex set V (G) representing the nodes of the
⋆ The work was supported partially by a grant from NSERC, Canada.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 237–246, 2000.
© Springer-Verlag Berlin Heidelberg 2000
network and edge set E(G) representing optical cables, such that if directed edge
[x, y] is in E(G), then directed edge [y, x] is also in E(G). In the sequel, whenever
we refer to an edge or a path, we mean a directed edge or a directed path.
Different messages can use the same link (or directed edge) concurrently
if they are assigned distinct wavelengths. However, messages assigned the same
wavelength must be assigned edge-disjoint paths. In the graph model, each wavelength can be represented by a color.
Given an optical communication network and a pattern of communications among the nodes, one has to design a routing, i.e., a system of directed paths, and an assignment of wavelengths to the paths in the routing so that the given communications can be done simultaneously. We can express the problem more formally as follows.
Given a WDM optical network G, a communication request is an ordered
pair of nodes (x, y) of G such that a message is to be sent from x to y. A
communication instance I (or instance for short) is a collection of requests.
Let I be an instance in G. A j-hop solution [9,10] of I is a routing R in G
and an assignment of colors to paths in R such that
1. it is conflict-free, i.e., any two paths of R sharing the same directed edge
have different colors, and
2. for each request (x, y) in I, a directed path from x to y in R can be obtained
by concatenation of at most j paths in R.
Since the cost of an optical switch depends on the number of wavelengths it can
handle, it is important to determine paths and a conflict-free color assignment
so that the total number of colors is minimized.
In a 1-hop solution of I, usually called an all-optical solution, there is a path from x to y in R for each request (x, y) in I and all communications are done in optical form. In a j-hop solution, j ≥ 2, the signal must be converted into electronic form j − 1 times. The conversion into electronic form slows down the transmission, but j > 1 can significantly reduce the number of wavelengths needed [10].
For an instance I in a graph G, and a j-hop routing R for it, the j-hop wavelength index of the routing R, denoted w⃗(G, I, R, j), is defined to be the minimum number of colors needed for a conflict-free assignment of colors to paths in the routing R. The parameter w⃗(G, I, j), the j-hop optimal wavelength index for the instance I in G, is the minimum value over all possible routings for the given instance I in G. In general, the problem of determining the optimal wavelength index is NP-complete [3].
In this paper, we are interested in ring networks. In a ring network there is a link from each node to two other nodes. Thus, the topology of a ring network on n nodes, n ≥ 3, can be represented by a symmetric directed cycle Cn (see [4] for any graph terminology not defined here). A symmetric directed cycle Cn, n ≥ 3, consists of n nodes x0, x1, . . ., xn−1, and there is an arc from xi to x(i+1) mod n and from x(i+1) mod n to xi for 0 ≤ i ≤ n − 1; see C8 in Figure 1. We denote by pi,j a shortest path from xi to xj. The diameter of Cn, i.e., the maximum among the lengths of the shortest paths among nodes of Cn, denoted by dn, is equal to ⌊n/2⌋.
Fig. 1. Ring C8
The all-to-all communication instance IA is the instance that contains all pairs of nodes of a network. IA has been studied for rings and other types of network topologies [1], [3], [8], [11]. It has been shown for rings in [3] that w⃗(Cn, IA, 1) = ⌈(1/2)⌊n²/4⌋⌉. The optical switches that are available at present cannot handle hundreds of different wavelengths, and thus the value of w⃗(Cn, IA, 1) can be too large even for moderately large rings. One can reduce the number of wavelengths by considering j-hop solutions of the IA problem for j ≥ 2.
One possible 2-hop solution of the IA instance is the routing {p0,i : 1 ≤ i ≤ n−1} ∪ {pi,0 : 1 ≤ i ≤ n−1}, in which there is a path from x0 to all other nodes and a path from any node to x0. Any request (xi, xj) in IA can be obtained by a concatenation of pi,0 and p0,j, and we can get a conflict-free assignment of colors to all paths in R using ⌈(n−1)/2⌉ colors. However, this solution has all the drawbacks of having one server x0 for the network, i.e., a failure of x0 shuts down the whole network and x0 is a potential performance bottleneck. This is very obvious if we represent R by the routing graph GR [7], in which there is an edge from x to y if and only if there is a path in R from x to y. In the case of the routing above, the routing graph is the star of Figure 2 a).
For better fault-tolerance and better load distribution, we should consider
a uniform routing [11] in which each node can communicate directly with the
same number of nodes as any other node and at the same distance along the
ring. More specifically, a routing R on a ring of length n is uniform if for some
integer m < n and some integers b1 , b2 , . . . , bm the routing R consists of paths
connecting nodes that are at distance bi , 1 ≤ i ≤ m along the ring, i.e. R =
{pi,i+bj, pi,i−bj : 0 ≤ i ≤ n − 1, 1 ≤ j ≤ m}. In a uniform routing the routing graph is a distributed loop graph [2] of degree 2m; e.g., for m = 2, b1 = 1 and b2 = 3 we get the routing graph in Figure 2 b). As seen from the figure, this provides much better connectivity and can give a uniform load on the nodes.
Furthermore, the routing decisions can be the same in all nodes.
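As an illustration (a sketch of ours, not from the paper; the function name and parameters are our own), the routing graph of a uniform routing can be generated directly from the distance set:

```python
def routing_graph_edges(n, b):
    """Directed edges of the routing graph of the uniform routing on C_n
    with distance set b: node i reaches i + d and i - d (mod n) for each d in b."""
    return {(i, (i + s * d) % n) for i in range(n) for d in b for s in (1, -1)}

# The degree-4 distributed loop graph of Figure 2 b): n = 8, m = 2, b1 = 1, b2 = 3.
edges = routing_graph_edges(8, {1, 3})
assert len(edges) == 8 * 4  # every node has out-degree 2m = 4
assert all(len({v for (u, v) in edges if u == i}) == 4 for i in range(8))
```

Every node has the same out-neighborhood pattern, which is exactly what makes the routing decisions identical in all nodes.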
Thus, the problem that we consider in this paper is the following:
Given a ring Cn and integer j, j ≥ 2, find a routing Rn,j such that:
1. Rn,j is uniform routing on Cn ,
2. Rn,j is a j-hop solution of IA ,
Fig. 2. The routing graph in C8 of a) a non-uniform, b) a uniform routing
3. the number of colors needed for Rn,j is substantially less than ⌈(1/2)⌊n²/4⌋⌉.
In Section 2, we show that there exists a uniform 2-hop routing Rn,2 for IA that needs at most ((n+3)/3)√((n+16)/5) + n/4 colors. We show that this is within a constant factor of a lower bound.
Uniform j-hop solutions of IA, j ≥ 3, are considered in Section 3. We show that there exist a uniform 3-hop routing Rn,3 that needs at most (n/2)∛((⌈n/4⌉+4)/5) colors, and a 4-hop routing Rn,4 with at most ((n+16)/2)(⌈n/4⌉)^{1/4} + 8 colors. We present conclusions and open questions in Section 4. Most proofs in this extended abstract are either omitted or only outlined. Complete proofs appear in [12].
2 Uniform, 2-hop All-to-All Routing
Let n be an integer, n ≥ 5, and Cn be a ring of length n with nodes x0, x1, . . . , xn−1. A routing R is a 2-hop solution of the all-to-all problem in Cn if any request (xi, xj), i ≠ j, can be obtained as a concatenation of at most two paths in R. Since on a ring any pair of distinct nodes is at distance between 1 and dn = ⌊n/2⌋, routing R is a uniform 2-hop solution of the all-to-all instance on Cn if there is a set of integers B = {b1, b2, . . . , bm} such that R contains all paths on the ring of length in B, and any integer between 1 and dn can be obtained using at most one addition or subtraction of two elements in B.
Lemma 1. Let k and m be positive integers, k ≤ m/4, and Bk,m = B^1_{k,m} ∪ B^2_{k,m} ∪ B^3_{k,m} ∪ B^4_{k,m}, where B^1_{k,m} = {1, 2, . . . , ⌊k/2⌋}, B^2_{k,m} = {⌊m/2⌋, ⌊m/2⌋ + 1, ⌊m/2⌋ + 2, . . . , ⌊m/2⌋ + k − 1}, B^3_{k,m} = {m − tk + 1, m − (t − 1)k + 1, . . . , m − 3k + 1, m − 2k + 1} where t = ⌈(m − ⌊m/2⌋ − 2k + 2)/k⌉, and B^4_{k,m} = {m − k + 1, m − k + 2, . . . , m − 1, m}. Then any integer in the set {1, 2, . . . , 2m} can be obtained using at most one addition or subtraction of two elements in Bk,m.
Proof. It is easy to verify that by at most one addition or subtraction we can generate
set {2m − 2k + 2, . . . , 2m} from integers in B^4_{k,m},
set {2m − (t + 1)k + 2, . . . , 2m − 2k + 1} from B^4_{k,m} and B^3_{k,m},
set {m + ⌊m/2⌋ − k + 1, . . . , m + ⌊m/2⌋ + k − 1} from B^4_{k,m} and B^2_{k,m},
set {m + ⌊m/2⌋ − tk + 1, . . . , m + ⌊m/2⌋ − k} from B^2_{k,m} and B^3_{k,m},
set {2⌊m/2⌋, . . . , 2⌊m/2⌋ + 2k − 2} from B^2_{k,m},
set {m − k + 1, . . . , m} from B^4_{k,m},
set {m − k − ⌊k/2⌋ + 1, . . . , m − k} from B^1_{k,m} and B^4_{k,m},
set {m − tk − ⌊k/2⌋ + 1, . . . , m − 2k + ⌊k/2⌋ + 1} from B^3_{k,m} and B^1_{k,m},
set {⌊m/2⌋, . . . , ⌊m/2⌋ + k + ⌊k/2⌋ − 1} from B^2_{k,m} and B^1_{k,m},
set {m − ⌊m/2⌋ − 2k + 2, . . . , m − ⌊m/2⌋} from B^2_{k,m} and B^4_{k,m},
set {k, . . . , m − ⌊m/2⌋ − 2k + 2} from B^3_{k,m} and B^4_{k,m},
and set {1, . . . , k} from B^4_{k,m}.
Since m − tk + 1 ≤ ⌊m/2⌋ + 2k − 1, we reach the conclusion of the lemma. ✷
A lower bound on the number of colors needed for a routing based on set Bk,m is equal to the sum of the elements in Bk,m. One way to minimize the sum is to keep the size of the set Bk,m as small as possible. The size of the set Bk,m is equal to |Bk,m| = ⌊k/2⌋ + k + ⌈(m − ⌊m/2⌋ − 2k + 2)/k⌉ − 1 + k ≤ (5/2)k + (m + 4)/(2k).
For an integer m, we obtain the smallest set Bk,m that generates {1, 2, . . . , 2m}, given in the next lemma, by minimizing the value of |Bk,m| with respect to k.
Lemma 2. Let m be a positive integer and s(m) = ⌊√((m + 4)/5)⌋. Then we can generate the set {1, 2, . . . , 2m} by at most one operation of addition or subtraction on integers in the set Bs(m),m. Set Bs(m),m contains at most √(5(m + 4)) integers.
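The construction of Lemma 1 is concrete enough to check mechanically. The sketch below is ours (not the paper's code); it allows an element to be added to itself, as the proof's use of sums within B^2_{k,m} indicates, and verifies both the coverage claim and the size bound of Lemma 2 for a few values of m:

```python
import math

def b_set(k, m):
    """B_{k,m} = B^1 ∪ B^2 ∪ B^3 ∪ B^4 as defined in Lemma 1 (assumes k <= m/4)."""
    b1 = set(range(1, k // 2 + 1))
    b2 = set(range(m // 2, m // 2 + k))
    t = math.ceil((m - m // 2 - 2 * k + 2) / k)
    b3 = {m - j * k + 1 for j in range(2, t + 1)}
    b4 = set(range(m - k + 1, m + 1))
    return b1 | b2 | b3 | b4

def one_op(b):
    """Integers obtainable from b by at most one addition or subtraction."""
    return set(b) | {a + c for a in b for c in b} | {abs(a - c) for a in b for c in b}

for m in (20, 37, 64, 100):
    k = math.isqrt((m + 4) // 5)                   # s(m) of Lemma 2
    b = b_set(k, m)
    assert set(range(1, 2 * m + 1)) <= one_op(b)   # coverage of {1, ..., 2m}
    assert len(b) <= math.sqrt(5 * (m + 4))        # size bound of Lemma 2
```

For m = 100 this produces a 20-element set generating all of {1, . . . , 200}, against the bound √520 ≈ 22.8.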
If a node v in Cn can communicate directly with all nodes at distance in Bs(m),m from v, where m = ⌈n/4⌉, then by Lemma 2 node v can communicate in 2 hops with every node at distance at most dn in Cn. This gives us a way to define a uniform routing on Cn which is a 2-hop solution of IA.
Lemma 3. Let n be an integer, n ≥ 5, and let Rn,2 be the routing in Cn such that any node in Cn communicates directly with all nodes at distance in Bs(⌈n/4⌉),⌈n/4⌉. Then Rn,2 is a uniform, 2-hop solution of IA on Cn.
We determine an upper bound on the wavelength index of Rn,2 by a repeated
application of the following lemma.
Lemma 4. Let n be an integer, n ≥ 5, and P = {p1, p2, . . . , pk} be a set of positive integers such that Σ_{i=1}^{k} pi < n. Let IP be an instance in Cn such that every node in Cn communicates with all nodes at distance in P, and R be a routing of shortest paths.
If p1 + p2 + · · · + pk divides n, then w⃗(Cn, IP, R, 1) = p1 + p2 + · · · + pk.
If (p1 + p2 + · · · + pk) mod n = 1, then w⃗(Cn, IP, R, 1) ≤ p1 + p2 + · · · + pk + 1.
If (p1 + p2 + · · · + pk) < m, then w⃗(Cn, IP, R, 1) ≤ w⃗(Cn, Im, R, 1), where Im is an instance in Cn such that every node in Cn communicates with a node at distance m.
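For intuition on the first case of Lemma 4, consider a single distance p dividing n: the n clockwise paths of length p can be colored with exactly p colors by giving the path starting at node i the color i mod p (a sketch of ours, not the paper's construction; counterclockwise paths use the opposite edge direction, so they can reuse the same colors independently):

```python
def color_clockwise_paths(n, p):
    """Color the n clockwise paths of length p on C_n with p colors (p divides n).
    The path starting at node i uses directed edges (i, i+1), ..., (i+p-1, i+p)."""
    assert n % p == 0
    color = {i: i % p for i in range(n)}
    # Conflict check: edge (v, v+1) is used exactly by the paths starting at
    # v, v-1, ..., v-p+1 (mod n); their colors are p consecutive residues mod p.
    for v in range(n):
        users = [(v - j) % n for j in range(p)]
        assert len({color[u] for u in users}) == p  # pairwise distinct on each edge
    return len(set(color.values()))

assert color_clockwise_paths(12, 4) == 4
```

When p does not divide n the residues i mod p are no longer consistent around the ring, which is where the extra color of the lemma's second case comes from.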
Theorem 1. For any integer n ≥ 5 there exists a uniform routing Rn,2 on Cn which is a 2-hop solution of IA such that
w⃗(Cn, IA, Rn,2, 2) ≤ ((n + 3)/3)√((n + 16)/5) + n/4.
Proof. Let d = ⌊n/2⌋, m = ⌈d/2⌉, s(m) = ⌊√((m + 4)/5)⌋. We define Rn,2 to consist, for every node v in Cn, of paths from v to nodes at distances in set Bs(m),m. As specified in Lemma 1, Bs(m),m = B^1_{s(m),m} ∪ B^2_{s(m),m} ∪ B^3_{s(m),m} ∪ B^4_{s(m),m} where B^1_{s(m),m} = {1, 2, 3, . . . , ⌊s(m)/2⌋}, B^2_{s(m),m} = {m, m + 1, m + 2, . . . , m + s(m) − 1}, B^3_{s(m),m} = {2m − ts(m) + 1, 2m − (t − 1)s(m) + 1, . . . , 2m − 3s(m) + 1, 2m − 2s(m) + 1} with t = ⌈(m − ⌊m/2⌋ − 2s(m) + 2)/s(m)⌉, and B^4_{s(m),m} = {2m − s(m) + 1, 2m − s(m) + 2, . . . , 2m − 1, 2m}. Since any distance on the ring can be obtained as a combination of two elements in Bs(m),m, we have that Rn,2 is a 2-hop solution of IA.
We now determine an upper bound on the wavelength index of Rn,2 .
Assume first that n is divisible by 4. In this case m divides n. Thus, by Lemma 4 the wavelength index for all paths of length m in Cn is equal to m. Similarly, for any integer i, 1 ≤ i ≤ ⌊s(m)/2⌋, the wavelength index for all paths of length i and m − i is equal to m, and for any integer i, 0 ≤ i ≤ ⌊s(m)/3⌋, the wavelength index for all paths of length ⌊m/2⌋ + i, ⌊m/2⌋ + s(m) − 1 − 2i and m − s(m) + i + 1 is at most 2m. This deals with all paths whose length is in B^1_{s(m),m}, two thirds of the paths whose length is in B^2_{s(m),m}, and 5/6 of the paths whose length is in B^4_{s(m),m}. In total, the number of colors needed for these paths is equal to m(1 + ⌊s(m)/2⌋) + 2m⌊s(m)/3⌋.
For any i, 1 ≤ i ≤ s(m)/6, all paths of length m − ⌊s(m)/2⌋ − i need at most m colors, so these paths contribute at most m⌊s(m)/6⌋ to the wavelength index.
For 0 ≤ i ≤ ⌊s(m)/3⌋ we group together paths of length ⌊m/2⌋ + ⌊s(m)/3⌋ + 2i + 1, ⌊m/2⌋ + s(m) − 2i − 1 and m − (i + 2)s(m) + 1. Since ⌊m/2⌋ + ⌊s(m)/3⌋ + 2i + 1 + ⌊m/2⌋ + s(m) − 2i − 1 + m − (i + 1)s(m) + 1 ≤ 2m − (i + 1)s(m) + ⌊s(m)/3⌋, any of these path-length groups needs at most 2m colors, and they contribute at most 2m⌊s(m)/3⌋ to the wavelength index.
Any remaining path-lengths in Bs(m),m need at most m colors, and there are at most s(m) − ⌊s(m)/3⌋ of such path-lengths. Thus, w⃗(Cn, IA, Rn,2, 2) ≤ m(1 + ⌊s(m)/2⌋ + 2⌊s(m)/3⌋ + ⌊s(m)/6⌋ + 2⌊s(m)/3⌋ + s(m) − ⌊s(m)/3⌋) ≤ m(1 + 8s(m)/3) ≤ (n/3)√((n + 16)/5) + n/4.
If n is not divisible by 4, then m = ⌈⌊n/2⌋/2⌉, 4(m − 1) < n and 2(2m − 1) ≤ n. Thus, in this case we group the path-lengths together similarly as above, except that in each group we put the path-lengths whose total is either m − 1 or less, or 2m − 1 or less. This however may require one more color per group, which increases the wavelength index by at most s(m), and we get that w⃗(Cn, IA, Rn,2, 2) ≤ ((n + 3)/3)√((n + 16)/5) + n/4. ✷
The theorem shows that for a ring of length n, the ratio of the number of colors needed for a 1-hop all-to-all routing over the number of wavelengths needed for a uniform 2-hop all-to-all routing is approximately 0.8√n. This can be very significant in some applications.
Any set B of integers that generates the set {1, 2, . . . , n/2} by at most one addition or subtraction must contain at least √(n/4) integers. In order to generate all integers between n/4 and n/2, set B must contain at least √(n/8) integers between n/8 and n/4. This gives us the following lower bound on the value of w⃗(Cn, IA, 2).
Theorem 2. w⃗(Cn, IA, 2) ≥ (n/16)√(n/2).
Thus, the value of w⃗(Cn, IA, Rn,2, 2) is within a constant factor of the lower bound.
3 Uniform j-Hop, j ≥ 3, All-to-All Routing
We can obtain results for uniform j-hop, j ≥ 3, all-to-all routing by repeatedly using the results from the previous section. A routing R is a j-hop all-to-all routing in Cn if any request (xi, xl) can be obtained as a concatenation of at most j paths in R. Thus, R is a uniform, j-hop, all-to-all routing on Cn if there is a set of integers B(j) = {b1, b2, . . . , bm} such that R contains all paths on the ring of length in B(j), and any integer between 1 and ⌊n/2⌋ can be obtained using at most j − 1 operations of addition or subtraction on j elements in B(j).
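The membership condition on B(j) can be checked by a brute-force closure (a sketch of ours; each round applies one more addition or subtraction, and the absolute value reflects the symmetry of distances on the ring):

```python
def j_hop_generates(b, d_n, j):
    """True if every distance 1..d_n is a combination of at most j elements
    of b using at most j - 1 additions or subtractions."""
    reach = set(b)
    for _ in range(j - 1):
        reach |= {abs(x + s * a) for x in reach for a in b for s in (1, -1)}
    return set(range(1, d_n + 1)) <= reach

# On C_17 (d_n = 8): {1, 6} suffices with 3 hops but not with 2.
assert j_hop_generates({1, 6}, 8, 3)
assert not j_hop_generates({1, 6}, 8, 2)
```

The point of the constructions that follow is to find such sets B(j) that are as small as possible, since small generating sets translate into few wavelengths.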
Any integer between 1 and 2m can be obtained using at most 1 operation of addition or subtraction of 2 elements in set Bk,m = B^1_{k,m} ∪ B^2_{k,m} ∪ B^3_{k,m} ∪ B^4_{k,m} from Lemma 1. As seen from the proof of Lemma 1, integers in set B^3_{k,m} are not involved in additions or subtractions with integers from B^3_{k,m} in order to obtain {1, 2, . . . , 2m}. Since set B^3_{k,m} is a linear progression containing at most (m + 2)/(2k) integers between m/2 and m, we can obtain all integers in this set, similarly as in Lemma 1, by at most one addition or subtraction from integers in sets D^1_{b,k}, D^2_{b,k}, D^3_{b,k} where |D^1_{b,k} ∪ D^2_{b,k} ∪ D^3_{b,k}| ≤ (1/2)√(5(m + 4)/k) + 2. Thus, any integer between 1 and 2m can be obtained by at most 2 additions or subtractions from integers in set Dk,m = B^1_{k,m} ∪ B^2_{k,m} ∪ D^1_{k,m} ∪ D^2_{k,m} ∪ D^3_{k,m} ∪ B^4_{k,m}, and the total size of Dk,m is at most ⌊k/2⌋ + k + (1/2)√(5(m + 4)/k) + 2 + k. By minimizing the value of this size with respect to k we obtain the next lemma.
Lemma 5. Let m be a positive integer and r(m) = ⌊∛((m + 4)/20)⌋. Then we can generate the set {1, 2, . . . , 2m} by at most 2 operations of addition or subtraction from integers in the set Dr(m),m. Set Dr(m),m contains at most 3∛(m + 4) integers.
Clearly, if every node in cycle Cn communicates directly with nodes at distance in Dr(⌈n/4⌉),⌈n/4⌉, then every node can communicate with any other node in at most 3 hops.
Lemma 6. Let n be an integer, n ≥ 5, and let Rn,3 be the routing in Cn such that any node in Cn communicates directly with all nodes at distance in Dr(⌈n/4⌉),⌈n/4⌉. Then Rn,3 is a uniform, 3-hop solution of IA on Cn.
Theorem 3. For any integer n ≥ 5 there exists a uniform routing Rn,3 on Cn which is a 3-hop solution of IA such that
w⃗(Cn, IA, Rn,3, 3) ≤ (n/2)∛((⌈n/4⌉ + 4)/5).
Proof. Let Rn,3 be the 3-hop routing from Lemma 6. We calculate the wavelength index similarly as in Theorem 1. For path-lengths in sets B^1_{k,m}, B^3_{k,m} and B^4_{k,m} we use the same methods as in Theorem 1. We thus obtain that the wavelength index of all these paths is at most ⌈n/4⌉(1 + r(m)/2 + 2r(m)/3 + 2r(m)/3) = ⌈n/4⌉(1 + (11/6)⌊∛((⌈n/4⌉ + 4)/20)⌋).
Since all path-lengths in sets D^1_{k,m}, D^2_{k,m}, D^3_{k,m} are bounded from above by n/8, the wavelength index of all these paths is at most (5n/8)⌊∛((⌈n/4⌉ + 4)/20)⌋. Thus the wavelength index of Rn,3 is at most (n/2)∛((⌈n/4⌉ + 4)/5). ✷
Any integer between 1 and 2m can be obtained using at most 1 operation of addition or subtraction of 2 elements in set Bk,m = B^1_{k,m} ∪ B^2_{k,m} ∪ B^3_{k,m} ∪ B^4_{k,m} from Lemma 1.
Since the integers in any of the sets B^1_{k,m}, B^2_{k,m}, B^3_{k,m}, and B^4_{k,m} form a linear progression, we can obtain all integers in set B^i_{k,m}, 1 ≤ i ≤ 4, similarly as in Lemma 1, by at most one addition or subtraction from integers in a set of integers that contains at most √(5(|B^i_{k,m}|/2 + 4)) integers. This implies that any integer between 1 and 2m can be obtained by at most 3 additions or subtractions on sets of integers that contain at most √(5(k/4 + 4)) + √(5((m + 4)/(4k) + 4)) + √(5(k/2 + 4)) + √(5(k/2 + 4)) integers. By substituting s(m) = √((m + 4)/5) for k, we obtain the next lemma.
Lemma 7. Let m be a positive integer. There exists a set Em such that the set {1, 2, . . . , 2m} can be generated from integers in Em by at most 3 operations of addition or subtraction. Set Em contains at most 4((m + 4)^{1/4} + 4) integers, and any of its elements is less than or equal to ⌈m/2⌉.
Clearly, if every node in cycle Cn communicates directly with nodes at distance in E⌈n/4⌉, then every node can communicate with any other node in at most 4 hops.
Lemma 8. Let n be an integer, n ≥ 5, and let Rn,4 be the routing in Cn such that any node in Cn communicates directly with all nodes at distance in E⌈n/4⌉. Then Rn,4 is a uniform, 4-hop solution of IA on Cn.
Theorem 4. For any integer n ≥ 5 there exists a uniform routing Rn,4 on Cn which is a 4-hop solution of IA such that
w⃗(Cn, IA, Rn,4, 4) ≤ ((n + 16)/2)(⌈n/4⌉)^{1/4} + 8.
Proof. Let Rn,4 be the routing from Lemma 8. By Lemma 4, all paths of length k, k ≤ ⌈n/8⌉, need at most ⌈n/8⌉ + 1 colors. Thus,
w⃗(Cn, IA, Rn,4, 4) ≤ (⌈n/8⌉ + 1)(4(⌈n/4⌉ + 4)^{1/4} + 4) ≤ ((n + 16)/2)(⌈n/4⌉)^{1/4} + 8.
Note that in this proof the bound on the wavelength index is calculated less precisely than those for the 2-hop and 3-hop cases. ✷
We can show, similarly as in the 2-hop case, that the result above is within a constant factor of a lower bound.
Clearly, the process that we used for deriving the 3-hop and 4-hop wavelength indices can be extended to a higher number of hops.
4 Conclusions
We gave an upper bound on the wavelength index of a uniform j-hop all-to-all communication instance in a ring of length n for 2 ≤ j ≤ 4, which is within a multiplicative constant of a lower bound. The results show that there is a large reduction in the value of the wavelength index when going from a 1-hop to a 2-hop routing, since we replace one factor of n by √n. However, the rate of reduction diminishes for subsequent numbers of hops, since we only replace one factor of √n by ∛n when going from a 2-hop to a 3-hop routing, etc. For example, for a cycle on 100 nodes we get
w⃗(C100, IA, 1) = 1250, w⃗(C100, IA, 2) ≤ 165, w⃗(C100, IA, 3) ≤ 115.
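The 1-hop value follows directly from the exact formula of [3]; the sketch below (ours) evaluates it together with the closed-form upper bounds of Theorems 1 and 3. Note that the closed-form expressions need not reproduce the sharper figures 165 and 115 quoted above, but they show the same dramatic drop from the 1-hop value:

```python
import math

def one_hop(n):
    """Exact 1-hop wavelength index of I_A on C_n, from [3]: ⌈(1/2)⌊n²/4⌋⌉."""
    return math.ceil((n * n // 4) / 2)

def two_hop_bound(n):
    """Upper bound of Theorem 1."""
    return (n + 3) / 3 * math.sqrt((n + 16) / 5) + n / 4

def three_hop_bound(n):
    """Upper bound of Theorem 3."""
    return n / 2 * ((math.ceil(n / 4) + 4) / 5) ** (1 / 3)

assert one_hop(100) == 1250
assert two_hop_bound(100) < one_hop(100) / 6    # large drop from 1 hop to 2 hops
assert three_hop_bound(100) < two_hop_bound(100)
```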
The value of the upper bounds on the wavelength index that we obtained depends on the size of the set Bs(⌈n/4⌉),⌈n/4⌉ from which we can generate the set {1, 2, . . . , ⌊(n − 1)/2⌋} using at most one operation of addition or subtraction. Obviously, if we obtained an improvement on the size of Bs(⌈n/4⌉),⌈n/4⌉, we could improve the upper bounds on the value of the wavelength index of a uniform j-hop all-to-all communication instance, j ≥ 2. This seems to be an interesting combinatorial problem that, to the best of our knowledge, has not been studied previously. It would be equally interesting to get a better lower bound on the size of a set that generates the set {1, 2, . . . , ⌊(n − 1)/2⌋} using at most one operation of addition or subtraction. Thus we propose the following open problems:
Open Problem 1: Find a tight lower bound on the size of a set of integers B^1_m such that any integer in the set {1, 2, . . . , m} can be obtained by at most one operation of addition or subtraction from integers in B^1_m.
Open Problem 2: Find a set of integers B^1_m such that any integer in the set {1, 2, . . . , m} can be obtained by at most one operation of addition or subtraction from integers in B^1_m, and which is smaller in size than the set given in this paper.
Similar open problems can be asked for a higher number of operations.
An answer to Open Problem 2 does not by itself settle the wavelength index of a uniform j-hop all-to-all communication instance in rings, as it is still necessary to devise a coloring of the paths in the ring. As of now, there is no general algorithm that can give a good color assignment to paths in the case of a uniform instance on a ring. This leads us to propose the following open problem:
Open Problem 3: Give an algorithm that, given a uniform instance I in Cn, finds a good approximation of w⃗(Cn, I, 1).
References
1. B. Beauquier, S. Pérennes, and T. David. All-to-all routing and coloring in weighted trees of rings. In Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures, pages 185–190, 1999.
2. J. Bermond, F. Comellas, and D. Hsu. Distributed loop computer networks: A
survey. J. of Parallel and Distributed Computing, 24:2–10, 1995.
3. J.-C. Bermond, L. Gargano, S. Perennes, and A. A. Rescigno. Efficient collective
communication in optical networks. Lecture Notes in Computer Science, 1099:574–
585, 1996.
4. J. Bondy and U. Murty. Graph Theory with Applications. Macmillan Press Ltd,
1976.
5. C. Bracket. Dense wavelength division multiplexing networks: Principles and applications. IEEE J. Selected Areas in Communications, 8:948–964, 1990.
6. N. Cheung, K. Nosu, and G. Winzer. An introduction to the special issue on dense
WDM networks. IEEE J. Selected Areas in Communications, 8:945–947, 1990.
7. D. Dolev, J. Halpern, B. Simons, and R. Strong. A new look at fault tolerant
network routing. In Proceedings of ACM 10th STOC Conference, pages 526–535,
1984.
8. L. Gargano, P. Hell, and S. Perennes. Colouring paths in directed symmetric trees
with applications to WDM routing. In 24th International Colloquium on Automata,
Languages and Programming, volume 1256 of Lecture Notes in Computer Science,
pages 505–515, Bologna, Italy, 7–11 July 1997. Springer-Verlag.
9. B. Mukherjee. WDM-based local lightwave networks, part I: Single-hop systems.
IEEE Network Magazine, 6(3):12–27, May 1992.
10. B. Mukherjee. WDM-based local lightwave networks, part II: Multihop systems.
IEEE Network Magazine, 6(4):20–32, July 1992.
11. L. Narayanan, J. Opatrny, and D. Sotteau. All-to-all optical routing in chordal
rings of degree four. In Proceedings of the Symposium on Discrete Algorithms,
pages 695–703, 1999.
12. J. Opatrny. Low-hop, all-to-all optical routing in rings. Technical report, Concordia
University, 1999.
A Fully Dynamic Algorithm
for Distributed Shortest Paths
Serafino Cicerone¹, Gabriele Di Stefano¹, Daniele Frigioni¹,², and Umberto Nanni²
¹ Dipartimento di Ingegneria Elettrica, Università dell'Aquila, I-67040 Monteluco di Roio - L'Aquila, Italy. {cicerone,gabriele,frigioni}@infolab.ing.univaq.it
² Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", via Salaria 113, I-00198 Roma, Italy. nanni@dis.uniroma1.it
Abstract. We propose a fully dynamic distributed algorithm for the all-pairs shortest paths problem on general networks with positive real edge weights. If ∆σ is the number of pairs of nodes changing their distance after a single edge modification σ (insert, delete, weight-decrease, or weight-increase), then the message complexity of the proposed algorithm is O(n∆σ) in the worst case, where n is the number of nodes of the network. If ∆σ = o(n²), this is better than recomputing everything from scratch after each edge modification.
1 Introduction
The importance of finding shortest paths in graphs is motivated by the numerous theoretical and practical applications known in various fields such as combinatorial optimization and communication networks (e.g., see [1,10]). We consider the distributed all-pairs shortest paths problem, which is crucial when processors in a network need to route messages with minimum cost.
The problem of updating shortest paths in a dynamic distributed environment arises naturally in practical applications. For instance, the OSPF protocol, widely used in the Internet (e.g., see [9,13]), updates the routing tables of the nodes after a change to the network by using a distributed version of Dijkstra's algorithm. In this and many other crucial applications the worst-case complexity of the adopted protocols is never better than recomputing the shortest paths from scratch. Therefore, it is important to find distributed algorithms for shortest paths that do not recompute everything from scratch after each change to the network, because this could prove very expensive in practice.
If the topology of a network is represented as a graph, where nodes represent
processors and edges represent links between processors, then the typical update
operations on a dynamic network can be modeled as insertions and deletions of
edges and update operations on the weights of edges. When arbitrary sequences
of the above operations are allowed, we refer to the fully dynamic problem; if
only insertions and weight decreases (deletions and weight increases) of edges
are allowed, then we refer to the incremental (decremental) problem.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 247–257, 2000.
c Springer-Verlag Berlin Heidelberg 2000
248
S. Cicerone et al.
Previous works and motivations. Many solutions have been proposed in
the literature to find and update shortest paths in the sequential case on graphs
with non-negative real edge weights (e.g., see [1,10] for a wide variety). The state
of the art is that no efficient fully dynamic solution is known for general graphs
that is faster than recomputing everything from scratch after each update, both
for single-source and all-pairs shortest paths. Actually, only output-bounded fully dynamic solutions are known on general graphs [4,11].
Some attempts have been made also in the distributed case [3,5,7,12]. In
this field the efficiency of an algorithm is evaluated in terms of message, time
and space complexity as follows. The message complexity of a distributed algorithm is the total number of messages sent over the edges. We assume that each
message contains O(log n + R) bits, where R is the number of bits available to
represent a real edge weight, and n is the number of nodes in the network. In
practical applications messages of this kind are considered of “constant” size.
The time complexity is the total (normalized) time elapsed from a change. The
space complexity is the space usage per node.
In [5], an algorithm is given for computing all-pairs shortest paths requiring O(n^2) messages, each of size n. In [7], an efficient incremental solution has
been proposed for the distributed all-pairs shortest paths problem, requiring
O(n log(nW )) amortized number of messages over a sequence of edge insertions
and edge weight decreases. Here, W is the largest positive integer edge weight.
In [3], Awerbuch et al. propose a general technique that allows updating the all-pairs shortest paths in a distributed network in Θ(n) amortized number of messages and O(n) time, by using O(n^2) space per node. In [12], Ramarao and
Venkatesan give a solution for updating all-pairs shortest paths that requires
O(n^3) messages and time, and O(n) space. They also show that, in the worst case, the problem of updating shortest paths is as difficult as computing shortest paths.
The results in [12] have a remarkable consequence. They suggest that two
possible directions can be investigated in order to devise efficient fully dynamic
algorithms for updating all-pairs shortest paths: i) to study the trade-off between
the message, time and space complexity for each kind of dynamic change; ii) to
devise algorithms that are efficient in different complexity models (with respect
to worst case and amortized analyses).
Concerning the first direction, in [7] an efficient incremental solution has been
provided, and the difficulty of dealing with edge deletions has been addressed.
This difficulty arises also in the sequential case (see for example [2]).
In this paper, the second direction is investigated. We observed that the output complexity [4,10] was a good candidate, since it is a robust measure of performance for dynamic algorithms in the sequential case [4,10,11]. This notion applies when the algorithms operate within a framework where explicit updates are required on a given data structure. In such a framework, output complexity allows evaluating the cost of dynamic algorithms in terms of the number of updates to the output information of the problem that are needed after any input
A Fully Dynamic Algorithm for Distributed Shortest Paths
249
update. Here we show the merits of this model also in the field of distributed
computation, improving over the results in [12].
Results of the paper. The novelty of this paper is a new efficient and practical solution for the fully dynamic distributed all-pairs shortest paths problem.
To the best of our knowledge, the proposed algorithm represents the first fully
dynamic distributed algorithm whose message complexity compares favorably with recomputing everything from scratch after each edge modification. This result is achieved by explicitly devising an algorithm whose main
purpose is to minimize the cost of each output update.
We use the following model. Given an input change σ and a source node s, let δσ,s be the set of nodes changing either the distance or the parent in the shortest paths tree rooted at s as a consequence of σ. Furthermore, let δσ = ∪_{s∈V} δσ,s and ∆σ = Σ_{s∈V} |δσ,s|. We evaluate the message and time complexity as a function of ∆σ. Intuitively, this parameter represents a lower bound on the number of messages of constant size to be sent over the network after the input change σ. In fact, if the distance from u to v changes due to σ, then at least u and v have to be informed about the change.
We design an algorithm that updates only the distances and the shortest
paths that actually change after an edge modification. In particular, if maxdeg
is the maximum degree of the nodes in the network, then we propose a fully
dynamic algorithm for the distributed all-pairs shortest paths problem requiring
in the worst case: O(maxdeg · ∆σ ) messages and O(∆σ ) time for insert and
weight-decrease operations; O(max{|δσ |, maxdeg} · ∆σ ) messages and time for
delete and weight-increase operations. The space complexity is O(n) per node.
The given bounds compare favorably with the results of [12] when ∆σ = o(n^2), and they are only a factor (bounded by max{|δσ|, maxdeg} in the worst case) away from the optimal, that is, from the (hypothetical) algorithm that sends over the network a number of messages equal to the number of pairs of nodes affected by an edge modification.
2 Network Model and Notation
We consider point-to-point communication networks, where a processor can generate a single message at a time and send it to all its neighbors in one time step.
Messages are delivered to their respective destinations within a finite delay, but
they might be delivered out of order. The distributed algorithms presented in
this paper allow communications only between neighbors. We assume an asynchronous message passing system; that is, a sender of a message does not wait
for the receiver to be ready to receive the message. In a dynamic network, when a modification concerning an edge (u, v) occurs, we assume that only nodes u and v are able to detect the change. Furthermore, we do not allow changes to the network while the proposed algorithm is executed.
We represent a computer network, where computers are connected by communication links, by an undirected weighted graph G = (V, E, w), where V is a
finite set of n nodes, one for each computer; E is a finite set of m edges, one
for each link; and w is a weight function from E to positive real numbers. The
weight of the edge (u, v) ∈ E is denoted as w(u, v). For each node u, N(u) contains the neighbors of u. A path between two nodes u and v is a finite sequence p = ⟨u = v0, v1, . . . , vk = v⟩ of distinct nodes such that, for each 0 ≤ i < k, (vi, vi+1) ∈ E; the weight of the path is weight(p) = Σ_{0≤i<k} w(vi, vi+1).
The distance d(u, v) between any pair of nodes u and v is the minimum weight
of all possible paths connecting u to v. A shortest path from u to v is defined as
any path p such that weight(p) = d(u, v). If s ∈ V is an arbitrary source node,
we denote as Ts a shortest paths tree of G rooted at s; for any u ∈ V , Ts (u)
denotes the subtree of Ts rooted at u. We assume that each node u knows: i)
the identities of all nodes, 1, 2, . . . , n; ii) the identity of each node in N (u); iii)
for each ui ∈ N (u), the edge connecting u to ui , and the weight w(u, ui ).
3 The Fully-Dynamic Algorithm
We describe the algorithms handling weight-decrease and weight-increase operations; the extensions to insert and delete operations, respectively, are straightforward.
We use the following data structures. A routing table RT[·, ·] stores the information on the all-pairs shortest paths. Each node u in G maintains only the set of records RT[u, ·], one record RT[u, v] for each possible destination v ∈ V \ {u}. Each record has two fields: RT[u, v].weight and RT[u, v].via, where weight is the distance between u and v, and via is the neighbor of u in the path used to determine the weight. In the following, each subcomponent RT[u, v].field of the routing table will also be denoted as field(u, v). The space required to store the routing table is clearly O(n) per node.
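As an illustration, the per-node routing table can be sketched as a small dictionary-based structure. This is a hypothetical Python rendering; the class and field names are ours, not the paper's.

```python
from dataclasses import dataclass

@dataclass
class Record:
    weight: float  # weight(u, v): distance from this node to destination v
    via: int       # via(u, v): neighbor of u on the path realizing the weight

class RoutingTable:
    """RT[u, .]: one record per destination v != u, i.e. O(n) space per node."""
    def __init__(self, node_id, nodes):
        self.node_id = node_id
        self.rec = {v: Record(float("inf"), -1) for v in nodes if v != node_id}

    def weight(self, v):
        return self.rec[v].weight

    def via(self, v):
        return self.rec[v].via

# node 0 in a 4-node network learns a path to node 2 of weight 3.5 via neighbor 1
rt = RoutingTable(0, range(4))
rt.rec[2] = Record(3.5, 1)
```

The table holds only first hops, not full paths: routing a message to v repeatedly follows via(·, v) at each intermediate node.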
For each v ∈ V , d′ (s, v) denotes the distance from s to v in the graph G′
obtained from G after an edge modification (in general, we denote by γ ′ any
parameter γ after an edge modification).
After an edge modification, for each source s, the proposed procedures correctly update weight(v, s) as d′ (v, s), and via(v, s) as the neighbor of v in the
path used to determine weight(v, s) in G′. Notice that the procedures implicitly maintain a shortest paths tree Ts for each source s; Ts is the tree induced by the set of edges (u, via(u, s)), for each node u reachable from s.
Both for weight-decrease and for weight-increase operations, we describe the
behavior of the algorithm with respect to a fixed source s. To obtain the algorithm for updating all-pairs shortest paths, it is sufficient to apply the algorithm
with respect to all the possible sources.
3.1 Decreasing the Weight of an Edge
Suppose that a weight decrease operation σ is performed on edge (x, y), that is,
w′ (x, y) = w(x, y) − ǫ, ǫ > 0. In this case, if d(s, x) = d(s, y), then δσ,s = ∅, and
no recomputation is needed. Otherwise, without loss of generality, we assume
that d(s, x) < d(s, y). In this case, if d′ (s, y) < d(s, y) then all nodes that belong
to Ts (y) are in δσ,s . On the other hand, there could exist nodes not contained in
Ts (y) that belong to δσ,s . In any case, every node in δσ,s decreases its distance
from s as a consequence of σ.
The algorithm shown in Fig. 1 is based on the following property: if v ∈ δσ,s ,
then there exists a shortest path connecting v to s in G′ that contains the path
from v to y in Ty as subpath. This implies that d′ (v, s) can be computed as
d′ (v, s) = d(v, y) + w′ (x, y) + d(x, s).
Node v receives “weight(u, s)” from u.

1.  if via(v, y) = u then
2.  begin
3.      if weight(v, s) > w(v, u) + weight(u, s) then
4.      begin
5.          weight(v, s) := w(v, u) + weight(u, s)
6.          via(v, s) := u
7.          for each vi ∈ N(v) \ {u} do send “weight(v, s)” to vi
8.      end
9.  end

Fig. 1. The decreasing algorithm of node v.
Based on this property, the algorithm performs a visit of Ty starting from
y. This visit finds all the nodes in δσ,s and updates their routing tables. Each
of the visited nodes v performs the algorithm of Fig. 1. When v figures out
that it belongs to δσ,s (line 3), it sends “weight(v, s)” to all its neighbors. This
is required because v does not know its children in Ty (since y is arbitrary, maintaining this information would require O(n^2) space per node). Only when a node that has received the message “weight(u, s)” from a neighbor u performs line 1 does it figure out whether it is a child of a node in Ty.
Notice that the algorithm of Fig. 1 is performed by every node v distinct
from y. The algorithm for y is slightly different: (i) y starts the algorithm when
it receives the message “weight(u, s)” from u ≡ x. This message is sent to y as
soon as x detects the decrease on edge (x, y); (ii) y does not perform the test of
line 1; (iii) the weight w(v, u) at lines 3 and 5 coincides with w′ (x, y).
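To make the message flow concrete, the following sketch simulates the propagation of Fig. 1 sequentially on a centrally stored graph. This is a simplification: the real algorithm is distributed and asynchronous, and the function name and the example graph are ours.

```python
from collections import deque

def decrease_update(adj, weight, via, via_y, s, y, x):
    """Simulate Fig. 1 for a fixed source s after w(x, y) decreased.
    adj[v]   : dict neighbor -> edge weight (already updated for (x, y))
    weight   : dict v -> weight(v, s);  via : dict v -> via(v, s)
    via_y[v] : first hop from v toward y (used for the test of line 1)
    """
    queue = deque([(y, x)])                 # x sends "weight(x, s)" to y
    while queue:
        v, u = queue.popleft()
        if v != y and via_y.get(v) != u:    # line 1 (y itself skips the test)
            continue
        cand = adj[v][u] + weight[u]        # line 3
        if weight[v] > cand:
            weight[v] = cand                # line 5
            via[v] = u                      # line 6
            for vi in adj[v]:               # line 7: flood to other neighbors
                if vi != u:
                    queue.append((vi, v))

# path 0 - 1 - 2 - 3 with s = 0, x = 1, y = 2; w(1, 2) dropped from 5 to 1
adj = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
weight = {0: 0, 1: 1, 2: 6, 3: 7}   # distances to s before the update
via = {1: 0, 2: 1, 3: 2}
via_y = {0: 1, 1: 2, 3: 2}          # first hops toward y = 2
decrease_update(adj, weight, via, via_y, s=0, y=2, x=1)
print(weight)  # {0: 0, 1: 1, 2: 2, 3: 3}
```

Only the subtree whose distances actually improve is traversed, which is exactly what yields the O(maxdeg · ∆σ) message bound of Theorem 1.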
Theorem 1. Updating all-pairs shortest paths over a distributed network with
n nodes and positive real edge weights, after a weight-decrease or an insert operation, requires O(maxdeg · ∆σ ) messages, O(∆σ ) time, and O(n) space.
3.2 Increasing the Weight of an Edge
Suppose that a weight increase operation σ is performed on edge (x, y), that is,
w′ (x, y) = w(x, y) + ǫ, ǫ > 0. In order to distinguish the set of required updates
determined by the operation, we borrow from [4] the idea of coloring the nodes
with respect to s, as follows:
• color(q, s) = white if q changes neither the distance from s nor the parent in Ts (i.e., weight′(q, s) = weight(q, s) and via′(q, s) = via(q, s));
• color(q, s) = pink if q preserves its distance from s, but it must replace the old parent in Ts (i.e., weight′(q, s) = weight(q, s) and via′(q, s) ≠ via(q, s));
• color(q, s) = red if q increases its distance from s (i.e., weight′(q, s) > weight(q, s)).
According to this coloring, the nodes in δσ,s are exactly the red and pink nodes. Without loss of generality, let us assume that d(s, x) < d(s, y). In this case it is easy to see that if v ∉ Ts(y), then v ∉ δσ,s. In other words, all the red and pink nodes belong to Ts(y).
Initially all nodes are white. If v is pink or red, then either v is a child of a red node in Ts(y), or v ≡ y. If v is red, then the children of v in Ts(y) will be either pink or red. If v is pink or white, then the other nodes in Ts(v) are white.
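The three colors can be stated operationally. The following sequential sketch (ours, not the paper's distributed procedure) classifies nodes given the old and new routing-table entries for a fixed source s:

```python
def classify(nodes, w_old, w_new, via_old, via_new):
    """Color each node q with respect to s, per the three cases above."""
    colors = {}
    for q in nodes:
        if w_new[q] > w_old[q]:
            colors[q] = "red"        # distance from s increased
        elif via_new[q] != via_old[q]:
            colors[q] = "pink"       # same distance, but a new parent in Ts
        else:
            colors[q] = "white"      # nothing changed
    return colors

# toy instance: node 2 keeps its distance but switches parent, node 3 gets farther
c = classify([1, 2, 3],
             w_old={1: 1, 2: 2, 3: 3}, w_new={1: 1, 2: 2, 3: 5},
             via_old={1: 0, 2: 1, 3: 2}, via_new={1: 0, 2: 4, 3: 2})
print(c)  # {1: 'white', 2: 'pink', 3: 'red'}
```

Of course, the distributed algorithm cannot consult w_new in advance; the coloring phase below discovers the colors incrementally.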
By the above discussion, if we want to bound the number of messages delivered over the network to update the shortest paths from s as a function of the number of output updates, then we cannot search the whole Ts(y). In fact, if Ts(y) contains a pink node v, then the nodes in Ts(v) remain white and do not require any update. For each red or pink node v, we use the following notation:
• aps(v) denotes the set of alternative parents of v with respect to s; that is, a neighbor q of v belongs to aps(v) when d(s, q) + w(q, v) = d(s, v).
• bnrs(v) denotes the best non-red neighbor of v with respect to s; that is, a non-red neighbor q of v such that the quantity d(s, q) + w(q, v) is minimum.
If aps(v) is empty and bnrs(v) exists, then bnrs(v) represents the best way for v to reach s in G′ by means of a path that does not contain red nodes.
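Both quantities are computable from a node's neighborhood alone. A sketch (function names ours), with d[q] standing for d(s, q):

```python
def ap(v, adj, d):
    """ap_s(v): neighbors q of v with d(s, q) + w(q, v) = d(s, v)."""
    return {q for q, w in adj[v].items() if d[q] + w == d[v]}

def bnr(v, adj, d, red):
    """bnr_s(v): the non-red neighbor q of v minimizing d(s, q) + w(q, v)."""
    cand = [(d[q] + w, q) for q, w in adj[v].items() if q not in red]
    return min(cand)[1] if cand else None

adj = {2: {1: 1, 3: 1, 4: 2}}     # the neighborhood of node 2 only
d = {1: 1, 2: 2, 3: 3, 4: 5}      # distances from s
print(ap(2, adj, d))              # {1}: only neighbor 1 lies on a shortest path
print(bnr(2, adj, d, red={1}))    # 3: best remaining neighbor once 1 turns red
```

In the distributed setting, v learns d[q] and the color of each neighbor q by the request/answer exchanges of Figs. 2 and 3 rather than from shared memory.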
The algorithm that we propose for handling weight-increase operations consists of three phases, namely the Coloring, Boundarization, and Recomputing phases. In the following we describe these three phases in detail. We just state in advance that the coloring phase does not perform any update to RT[·, s]. A pink node v updates via(v, s) during the boundarization phase, whereas a red node v updates both weight(v, s) and via(v, s) during the recomputing phase.
Coloring phase. During this phase each node in Ts (y) decides its color. At the
beginning all these nodes are white. The pink and red nodes are found starting
from y and performing a pruned search of Ts (y). The coloring phase of a generic
node v is given in Fig. 2. Before describing the algorithm in detail, we remark
that it works under the following assumptions.
A1. If a node v receives a request for weight(v, s) and color(v, s) from a neighbor
(line 7), then it answers immediately.
A2. If a red node v receives the message “color(z, s) = red” from z ∈ N (v), then
it immediately sends “end-coloring” to z (see line 1 for red nodes).
When v receives “color(z, s) = red” from z, it understands that it has to decide its color. The behavior of v depends on its current color. Three cases may arise:
The red node v receives the message “color(z, s) = red” from z ∈ N(v).

1.  send to z the message “end-coloring”; HALT

The non-red node v receives the message “color(z, s) = red” from z ∈ N(v).

1.  if color(v, s) = white then
2.  begin
3.      if z ≠ via(v, s) then send to z the message “end-coloring”; HALT
4.      aps(v) := ∅
5.      for each vi ∈ N(v) \ {z} do
6.      begin
7.          ask vi for weight(vi, s) and color(vi, s)
8.          if color(vi, s) ≠ red and weight(v, s) = w(v, vi) + weight(vi, s)
9.              then aps(v) := aps(v) ∪ {vi}
10.     end
11. end
12. if z ∈ aps(v) then aps(v) := aps(v) \ {z}
13. if aps(v) ≠ ∅
14.     then color(v, s) := pink
15.     else begin
16.         color(v, s) := red
17.         for each vi ∈ N(v) \ {z} send to vi the message “color(v, s) = red”
18.         for each vi ∈ N(v) \ {z} wait from vi the message “end-coloring”
19.     end
20. send to z the message “end-coloring”;

Fig. 2. The coloring phase of node v.
1. v is white: In this case, v tests whether z is its parent in Ts(y) or not. If z ≠ via(v, s) (line 3), then the color of v remains white and v communicates to z the end of its coloring. If z = via(v, s), then v finds all the alternative parents with respect to s, and records them into aps(v) (lines 4–10). If aps(v) ≠ ∅ (line 13), then v sets its color to pink (line 14) and communicates to z the end of its coloring (line 20). If aps(v) = ∅ (line 15), then v does the following: i) sets its color to red (line 16); ii) propagates the message “color(v, s) = red” to each neighbor but z (line 17); iii) waits for the message “end-coloring” from each of these neighbors (line 18); iv) communicates the end of its coloring phase to z (line 20).
2. v is pink: In this case, the test at line 12 is the first action performed by v. If z is an alternative parent of v, then z is removed from aps(v) (since z is now red). After this removal, v performs the test at line 13: if there are still elements in aps(v), then v remains pink and sends to z the message concerning the end of its coloring phase (lines 14 and 20); otherwise, v becomes red and propagates the coloring phase to its neighbors (lines 15–19), as already described in case 1 above.
3. v is red: In this case, v performs a different procedure: it simply communicates to z the end of its coloring phase (see line 1 for red nodes). This is
done to guarantee that Assumption A2 holds.
According to this strategy, at the end of the coloring phase node y is aware
that each node in Ts (y) has been correctly colored. The algorithm of Fig. 2 is
performed by every node distinct from y. The algorithm for y is slightly different.
In particular, at line 20, y does not send “end-coloring” to z ≡ x. Instead, y starts
the boundarization phase described below and shown in Fig. 3.
Node v receives the message “start boundarization(ǫ)” from z = via(v, s).

1.  if color(v, s) = pink then
2.  begin
3.      via(v, s) := q, where q is an arbitrary node in aps(v)
4.      color(v, s) := white; HALT
5.  end
6.  if color(v, s) = red then
7.  begin
8.      ℓv := weight(v, s) + ǫ
9.      bnrs(v) := nil
10.     pink-childrens(v) := ∅; red-childrens(v) := ∅
11.     for each vi ∈ N(v) \ {z} do
12.     begin
13.         v asks vi for weight(vi, s), via(vi, s), and color(vi, s)
14.         if color(vi, s) ≠ red and ℓv > w(v, vi) + weight(vi, s) then
15.         begin
16.             ℓv := w(v, vi) + weight(vi, s)
17.             bnrs(v) := vi
18.         end
19.         if color(vi, s) = pink and via(vi, y) = v
20.             then pink-childrens(v) := pink-childrens(v) ∪ {vi}
21.         if color(vi, s) = red and via(vi, y) = v
22.             then red-childrens(v) := red-childrens(v) ∪ {vi}
23.     end
24.     if bnrs(v) = nil
25.         then Bs(v) := ∅           {v is not boundary for s}
26.         else Bs(v) := {⟨v; ℓv⟩}   {v is boundary for s}
27.     for each vi ∈ pink-childrens(v) ∪ red-childrens(v)
28.         do send “start boundarization(ǫ)” to vi
29.     for each vi ∈ red-childrens(v) do
30.     begin
31.         wait the message “Bs(vi)” from vi
32.         Bs(v) := Bs(v) ∪ Bs(vi)
33.     end
34.     send “Bs(v)” to via(v, s)
35. end

Fig. 3. The boundarization phase of node v.
Boundarization phase. During this phase, for each red and pink node v, a path (not necessarily the shortest one) from v to s is found. Note that, as in Assumption A1 of the coloring phase, if a node v receives a request for weight(v, s), via(v, s), and color(v, s) from a neighbor (line 13), then it answers immediately.
When a pink node v receives the message “start boundarization(ǫ)” (ǫ is the increment of w(x, y)) from via(v, s), it understands that the coloring phase has terminated; at this point v only needs to choose via(v, s) arbitrarily among the nodes in aps(v), and to set its color to white (lines 2–5).
When a red node v receives the message “start boundarization(ǫ)” from via(v, s), it has no alternative parent with respect to s. At this point, v computes the shorter of the old path from v to s (whose weight is now increased by ǫ (line 8)) and the path from v to s via bnrs(v) (if any). If bnrs(v) exists, then v can reach s through a path containing no red nodes. In order to find bnrs(v), v asks every neighbor vi for weight(vi, s), via(vi, s), and color(vi, s) (see lines 12–23). At the same time, using color(vi, s) and via(vi, s), v finds its pink and red children in Ts(y) and records them into pink-childrens(v) and red-childrens(v) (see lines 20 and 22).
If bnrs(v) exists and the path from v to s via bnrs(v) is shorter than weight(v, s) + ǫ, then v is called boundary for s. In this case, v initializes Bs(v) as {⟨v; ℓv⟩} (line 26), where ℓv is the weight of the path from v to s via bnrs(v). When v terminates the boundarization phase, the set Bs(v) contains all the pairs ⟨z; ℓz⟩ such that z ∈ Ts(v) is a boundary node. In fact, at line 28 v sends to each node vi ∈ pink-childrens(v) ∪ red-childrens(v) the value ǫ (to propagate the boundarization), and then waits to receive Bs(vi) from each vi ∈ red-childrens(v) (lines 30–33). Notice that v does not wait for any message from a pink child vi ∈ pink-childrens(v), because Bs(vi) is empty. Whenever v receives Bs(vi) from a child vi, it updates Bs(v) as Bs(v) ∪ Bs(vi) (line 32). Finally, at line 34, v sends Bs(v) to y via via(v, s).
At the end of the boundarization phase, the set Bs(y), containing all the boundary nodes for s, has been computed and stored in y. Notice that the algorithm of Fig. 3 is performed by every node distinct from y. The algorithm for y is slightly different. In particular, at line 34, y does not send “Bs(y)” to via(y, s). Instead, y uses this information to start the recomputing phase, in which y broadcasts the set Bs(y) through Ts(y) to each red node.
Recomputing phase. In this phase, each red node v computes weight ′ (v, s)
and via ′ (v, s). The recomputing phase of a red node v is shown in Fig. 4, and
described in what follows. Let us suppose that the red node v has received the
message “Bs (y)” from via(v, y).
Concerning the shortest path from v to s in G′, two cases may arise: a) it
coincides with the shortest path from v to s in G; b) it passes through a boundary
node. In case b) two subcases are possible: b1) the shortest path from v to s in
G′ passes through bnrs (v); b2) the shortest path from v to s in G′ contains a
boundary node different from v.
Node v performs the following local computation: it computes wmin as min{weight(v, b) + ℓb | ⟨b; ℓb⟩ ∈ Bs(y)} (line 1), and bmin as the boundary
node such that wmin = weight(v, bmin ) + ℓbmin (line 2). After v has updated
weight(v, s) as weight(v, s)+ǫ (line 3), v compares weight(v, s) (the new weight of
the old path from v to s) with wmin (line 4) and correctly computes weight ′ (v, s)
(line 6), according to cases a) and b) above. At lines 7–9, via(v, s) is computed
according to cases b1) and b2) above. Finally, by using the information contained
in red-childrens (v), v propagates Bs (y) to the red nodes in Ts (v).
The node v receives “Bs(y)” from via(v, y).

1.  wmin := min{weight(v, b) + ℓb | ⟨b; ℓb⟩ ∈ Bs(y)}
2.  let bmin be a node such that wmin = weight(v, bmin) + ℓbmin
3.  weight(v, s) := weight(v, s) + ǫ
4.  if weight(v, s) > wmin then
5.  begin
6.      weight(v, s) := wmin
7.      if bmin = v
8.          then via(v, s) := bnrs(v)
9.          else via(v, s) := via(v, bmin)
10. end
11. color(v, s) := white
12. for each vi ∈ red-childrens(v) do send “Bs(y)” to vi

Fig. 4. The recomputing phase of node v.
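The purely local part of Fig. 4 (lines 1–9) can be sketched as follows; dist_to and via_to stand for the routing-table entries weight(v, ·) and via(v, ·), and all names are ours:

```python
def recompute_red(weight_vs, via_vs, v, dist_to, via_to, bnr_v, eps, boundary):
    """Update weight(v, s) and via(v, s) for a red node v.
    boundary: dict b -> l_b, i.e. the set B_s(y) of pairs <b; l_b>."""
    # lines 1-2: cheapest route to s through some boundary node
    wmin, bmin = min((dist_to[b] + lb, b) for b, lb in boundary.items())
    weight_vs += eps                       # line 3: the old path got heavier
    if weight_vs > wmin:                   # line 4
        weight_vs = wmin                   # line 6
        # lines 7-9: go straight to bnr_s(v) if v itself is boundary
        via_vs = bnr_v if bmin == v else via_to[bmin]
    return weight_vs, via_vs

# v = 5 reaches boundary node 7 (l_7 = 4) at distance 3; its old path cost 6 + eps
w, via = recompute_red(6, 1, v=5, dist_to={5: 0, 7: 3}, via_to={7: 2},
                       bnr_v=9, eps=2, boundary={5: 10, 7: 4})
print(w, via)  # 7 2
```

No messages are needed for this step beyond receiving Bs(y) itself, which is why the recomputing phase only contributes the broadcast of line 12 to the message bound.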
It is easy to show that each phase is deadlock-free.
Theorem 2. Updating all-pairs shortest paths over a distributed network with n
nodes and positive real edge weights, after a weight-increase or a delete operation,
requires O(max{|δσ |, maxdeg} · ∆σ ) messages, O(max{|δσ |, maxdeg} · ∆σ ) time,
and O(n) space.
References
1. R. K. Ahuja, T. L. Magnanti and J. B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ (1993).
2. G. Ausiello, G. F. Italiano, A. Marchetti-Spaccamela and U. Nanni. Incremental
algorithms for minimal length paths. Journal of Algorithms, 12, 4 (1991), 615–638.
3. B. Awerbuch, I. Cidon and S. Kutten. Communication-optimal maintenance of
replicated information. Proc. IEEE Symp. on Found. of Comp. Sc., 492–502, 1990.
4. D. Frigioni, A. Marchetti-Spaccamela and U. Nanni. Fully dynamic output
bounded single source shortest paths problem. Proc. ACM-SIAM Symp. on Discrete Algorithms, 212–221, 1996. Full version, Journal of Algorithms, to appear.
5. S. Haldar. An all pair shortest paths distributed algorithm using 2n^2 messages. Journal of Algorithms, 24 (1997), 20–36.
6. P. Humblet. Another adaptive distributed shortest path algorithm. IEEE Transactions on Communications, 39, n. 6 (1991), 995–1003.
7. G. F. Italiano. Distributed algorithms for updating shortest paths. Proc. Int. Workshop on Distributed Algorithms. LNCS 579, 200–211, 1991.
8. J. McQuillan. Adaptive routing algorithms for distributed computer networks.
BBN Rep. 2831, Bolt, Beranek and Newman, Inc., Cambridge, MA 1974.
9. J. T. Moy. OSPF: Anatomy of an Internet Routing Protocol. Addison-Wesley, 1998.
10. G. Ramalingam. Bounded incremental computation. LNCS 1089, 1996.
11. G. Ramalingam and T. Reps. On the computational complexity of dynamic graph
problems. Theoretical Computer Science, 158, (1996), 233–277.
12. K. V. S. Ramarao and S. Venkatesan. On finding and updating shortest paths
distributively. Journal of Algorithms, 13 (1992), 235–257.
13. A. S. Tanenbaum. Computer Networks. Prentice Hall, Englewood Cliffs, NJ (1996).
Integer Factorization and Discrete Logarithms
Andrew Odlyzko
AT&T Labs, Florham Park, NJ 07932, USA
amo@research.att.com
http://www.research.att.com/˜amo
Abstract. Integer factorization and discrete logarithms have been known
for a long time as fundamental problems of computational number theory. The invention of public key cryptography in the 1970s then led to
a dramatic increase in their perceived importance. Currently the only
widely used and trusted public key cryptosystems rely for their presumed security on the difficulty of these two problems. This makes the
complexity of these problems of interest to the general public, and not just
to specialists.
This lecture will present a survey of the state of the art in integer factorization and discrete logarithms. Special attention will be devoted to the
rate of progress in both hardware and algorithms. Over the last quarter
century, these two factors have contributed about equally to the progress
that has been made, and each has stimulated the other. Some projections
for the future will also be made.
Most of the material covered in the lecture is available in the survey
papers [1,2] and the references listed there.
References
1. A.M. Odlyzko, The future of integer factorization, CryptoBytes (The
technical newsletter of RSA Laboratories), 1 (no. 2) (1995), pp. 5–
12. Available at http://www.rsa.com/rsalabs/pubs/cryptobytes/ and
http://www.research.att.com/˜amo.
2. A.M. Odlyzko, Discrete logarithms: The past and the future, Designs, Codes, and Cryptography 19 (2000), pp. 129-145. Available at
http://www.research.att.com/˜amo.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 258–258, 2000.
c Springer-Verlag Berlin Heidelberg 2000
Communication Complexity and Fourier
Coefficients of the Diffie–Hellman Key
Igor E. Shparlinski
Department of Computing, Macquarie University
Sydney, NSW 2109, Australia
igor@comp.mq.edu.au
Abstract. Let p be a prime and let g be a primitive root of the field IFp of p elements. In the paper we show that the communication complexity of the last bit of the Diffie–Hellman key g^xy is at least n/24 + o(n), where x and y are n-bit integers and n is defined by the inequalities 2^n ≤ p ≤ 2^{n+1} − 1. We also obtain a nontrivial upper bound on the Fourier coefficients of the last bit of g^xy. The results are based on some new bounds of exponential sums with g^xy.
1 Introduction
Let p be a prime and let IFp be a finite field of p elements which we identify with
the set {0, . . . , p − 1}. We define the integer n by the inequalities 2^n ≤ p ≤ 2^{n+1} − 1 and denote by Bn the set of n-bit integers,

Bn = {x ∈ ZZ : 0 ≤ x ≤ 2^n − 1}.
Throughout the paper we do not distinguish between n-bit integers x ∈ Bn
and their binary expansions. Thus Bn can be considered as the n-dimensional
Boolean cube Bn = {0, 1}n as well.
Finally, we recall the notion of communication complexity. Given a Boolean function f(x, y) of 2n variables

x = (x1, . . . , xn) ∈ Bn   and   y = (y1, . . . , yn) ∈ Bn,

we assume that there are two collaborating parties; the value of x is known to one party and the value of y is known to the other, but each party has no information about the value held by the other. The goal is to create a communication protocol P such that, for any inputs x, y ∈ Bn, at the end at least one party can compute the value of f(x, y). The largest number of bits exchanged by a protocol P, taken over all possible inputs x, y ∈ Bn, is called the communication complexity C(P) of this protocol. The smallest possible value of C(P), taken over all possible protocols, is called the communication complexity C(f) of the function f; see [2,21].
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 259–268, 2000.
c Springer-Verlag Berlin Heidelberg 2000
260
I.E. Shparlinski
Given two integers x, y ∈ Bn, the corresponding Diffie–Hellman key is defined as g^xy. Studying various complexity characteristics of this function is of primary interest for cryptography and complexity theory. Several lower bounds
on the various complexity characteristics of this function as well as the discrete
logarithm have been obtained in [30]. In particular, for a primitive root g of IFp ,
one can consider the Boolean function f(x, y) defined as the rightmost bit of g^xy, that is,

f(x1, . . . , xn, y1, . . . , yn) = 1, if g^xy ∈ {1, 3, . . . , p − 2};  0, if g^xy ∈ {2, 4, . . . , p − 1}.   (1)
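Concretely, f is easy to evaluate; its hardness is a communication phenomenon, not a computational one. A small Python sketch with an illustrative prime:

```python
def dh_last_bit(p, g, x, y):
    """f(x, y) of (1): the rightmost bit of g^(xy) in F_p = {0, ..., p-1}.
    Odd residues {1, 3, ..., p-2} map to 1, even ones to 0."""
    return pow(g, x * y, p) & 1  # three-argument pow: modular exponentiation

# p = 11, g = 2 (a primitive root mod 11)
print(dh_last_bit(11, 2, 2, 2))  # 1: 2^4 = 16 ≡ 5 (mod 11), odd
print(dh_last_bit(11, 2, 3, 4))  # 0: 2^12 ≡ 4 (mod 11), even
```

In the communication setting, however, one party holds only x and the other only y, so neither can run this computation alone.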
In many cases the complexity lower bounds of [30] are as strong as the best
known lower bounds for any other function. However, the lower bound C(f ) ≥
log2 n+O(1) of Theorem 9.4 of [30] is quite weak. Here, using a different method,
we derive the linear lower bound C(f ) ≥ n/24 + o(n) on the communication
complexity of f .
We also use the same method to obtain an upper bound on the Fourier coefficients of this function, that is, on

fˆ(u, v) = 2^{−2n} Σ_{x∈Bn} Σ_{y∈Bn} (−1)^{f(x,y) + ⟨ux⟩ + ⟨vy⟩},
where u, v ∈ Bn and ⟨wz⟩ denotes the dot product of the vectors w, z ∈ Bn. This bound can be combined with many known relations between Fourier coefficients and various complexity characteristics, such as the circuit complexity, the average sensitivity, the formula size, the average decision tree depth, the degrees of exact and approximate polynomial representations over the reals, and several others; see [3,4,8,15,22,23,27] and references therein.
We remark that, although these results do not seem to have any cryptographic implications, it is still interesting to study complexity characteristics of such an attractive number theoretic function. Various complexity lower bounds
for Boolean functions associated with other natural number theoretic problems
can be found in [1,5,6,7,14,30].
Our main tool is exponential sums, including a new upper bound on the double exponential sums

Sa(X, Y) = Σ_{x∈X} Σ_{y∈Y} e(a g^xy),

where e(z) = exp(2πiz/p), with a ∈ IFp and arbitrary sets X, Y ⊆ Bn. These
sums are of independent number theoretic interest. In particular they can be
considered as generalizations of the well known sums
    Ta(U, V) = Σ_{u∈U} Σ_{v∈V} e(auv)    and    Qa(H) = Σ_{x=1}^{H} e(a g^x),

where a ∈ IFp, 1 ≤ H ≤ t, and U, V ⊆ IFp, which are well known in the literature
and have proved to be useful for many applications, see [12,17,28,29] as well as
Problem 14.a to Chapter 6 of [31] for Ta(U, V) and [18,19,20,24,25] for Qa(H).
Communication Complexity and Fourier Coefficients
261
In this paper we estimate the sums Sa(X, Y) for arbitrary sets X and Y. Provided that both sets are of the same cardinality |X| = |Y| = N, our estimates
are nontrivial for N ≥ p^{15/16+δ} with any fixed δ > 0.
We remark that the distribution of the triples of (g x , g y , g xy ) for x, y ∈ Bn
has been studied in [9,10], see also [30]. In fact this paper relies on an estimate
of some double exponential sums from [9].
Throughout the paper the implied constants in symbols ‘O’, ‘≪’ and ‘≫’ are
absolute (we recall that A ≪ B and B ≫ A are equivalent to A = O(B)).
2   Preparations
We say that a set S ⊆ Bn is a cylinder if there is a set J ⊆ {1, . . . , n} such
that the membership x ∈ S does not depend on components xj , j ∈ J , of
x = (x1 , . . . , xn ) ∈ Bn . The discrepancy ∆(f ) of f is defined as
    ∆(f) = 2^{−2n} max_{S,T} |N₁(S, T) − N₀(S, T)|,
where the maximum is taken over all cylinders S, T ⊆ Bn and Nµ ( S, T ) is the
number of pairs (x, y) ∈ S × T with f (x, y) = µ.
The link between the discrepancy and communication complexity is provided
by the following statement which is a partial case of Lemma 2.2 from [2].
Lemma 1. The bound

    C(f) ≥ log₂ (1/∆(f))

holds.
We use exponential sums to estimate the discrepancy of the function (1).
The following statement has been proved in [9], see the proof of Theorem 8 of
that paper.
Lemma 2. Let λ ∈ IFp be of multiplicative order t. For any a, b ∈ IF*p, the bound

    | Σ_{u=1}^{t} Σ_{v=1}^{t} e(aλ^v + bλ^{uv}) |^4 ≪ p t^{11/3}

holds.
We recall the well known fact, see Theorem 5.2 of Chapter 1 of [26], that
for any integer m ≥ 2 the number of integer divisors τ(m) of m satisfies the
bound

    log₂ τ(m) ≤ (1 + o(1)) ln m / ln ln m.        (2)
We now apply Lemma 2 to estimate Sa(X, Y) for arbitrary sets X, Y ⊆ Bn.
Lemma 3. The bound

    max_{a∈IF*p} |Sa(X, Y)| ≪ |X|^{1/2} |Y|^{5/6} p^{5/8} τ(p − 1)

holds.
Proof. For a divisor d | p − 1 we denote by Y(d) the subset of y ∈ Y with gcd(y, p − 1) = d. Then

    |Sa(X, Y)| ≤ Σ_{d|p−1} |σ_d|,    where    σ_d = Σ_{x∈X} Σ_{y∈Y(d)} e(a g^{xy}).

Using the Cauchy inequality, we derive

    |σ_d|² ≤ |X| Σ_{x∈X} | Σ_{y∈Y(d)} e(a g^{xy}) |²
           ≤ |X| Σ_{x=1}^{p−1} | Σ_{y∈Y(d)} e(a g^{xy}) |²
           = |X| Σ_{y,z∈Y(d)} Σ_{x=1}^{p−1} e(a(g^{xy} − g^{xz})).
By the Hölder inequality we have

    |σ_d|^8 ≤ |X|^4 |Y(d)|^6 Σ_{y,z∈Y(d)} | Σ_{x=1}^{p−1} e(a(g^{xy} − g^{xz})) |^4
            ≤ |X|^4 |Y(d)|^6 Σ_{y∈Y(d)} Σ_{u=1}^{(p−1)/d} | Σ_{x=1}^{p−1} e(a(g^{xy} − g^{xud})) |^4.
Because each element y ∈ Y(d) can be represented in the form y = dv with
gcd(v, (p − 1)/d) = 1 and λ_d = g^d is of multiplicative order (p − 1)/d, we see
that the double sum over u and x does not depend on y. Therefore,

    |σ_d|^8 ≤ |X|^4 |Y(d)|^7 Σ_{u=1}^{(p−1)/d} | Σ_{x=1}^{p−1} e(a(λ_d^x − λ_d^{xu})) |^4
            = |X|^4 |Y(d)|^7 d^4 Σ_{u=1}^{(p−1)/d} | Σ_{x=1}^{(p−1)/d} e(a(λ_d^x − λ_d^{xu})) |^4.
By Lemma 2 we obtain

    |σ_d|^8 ≪ |X|^4 |Y(d)|^7 d^4 p ((p − 1)/d)^{11/3} ≤ |X|^4 |Y(d)|^7 p^{14/3} d^{1/3}.        (3)
Using the bound |Y(d)| ≤ |Y| for d ≤ p/|Y| and the bound |Y(d)| ≤ p/d for
d > p/|Y|, we see that

    |σ_d| ≪ |X|^{1/2} |Y|^{5/6} p^{5/8}

for any divisor d | p − 1, and the desired result follows.    ⊓⊔
Finally, to apply Lemma 3 to the discrepancy we need the following two
well known statements which are Problems 11.a and 11.c to Chapter 3 of [31],
respectively.
Lemma 4. For any integers u and m ≥ 2,

    Σ_{λ=0}^{m−1} e_m(λu) = { 0, if u ≢ 0 (mod m);
                              m, if u ≡ 0 (mod m).
Lemma 5. For any integers H and m ≥ 2,

    Σ_{a=1}^{m−1} | Σ_{z=0}^{H} e_m(az) | = O(m ln m).
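The complete-sum orthogonality in Lemma 4 is easy to confirm numerically; a small sketch of ours (with e_m(z) = exp(2πiz/m) and an arbitrary sample modulus m = 7):

```python
# Numerical sanity check (ours, not from the paper) of Lemma 4:
# e_m(z) = exp(2*pi*i*z/m) summed over a full period lambda = 0, ..., m-1
# gives m when m | u and 0 otherwise.
import cmath

def e_m(z: float, m: int) -> complex:
    return cmath.exp(2j * cmath.pi * z / m)

def full_sum(u: int, m: int) -> complex:
    return sum(e_m(l * u, m) for l in range(m))

# with m = 7: the sum is 7 for u = 0 or u = 14, and vanishes for u = 3
```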
3   Communication Complexity and Fourier Coefficients of the Diffie–Hellman Key
Now we can prove our main results.
Theorem 1. For the communication complexity of the function f(x, y) given
by (1), the bound

    C(f) ≥ n/24 + o(n)

holds.
Proof. We can assume that p ≥ 3. Put H = (p − 1)/2. Then for any sets
S, T ⊆ Bn (not necessarily cylinders), Lemma 4 implies that

    N₀(S, T) = (1/p) Σ_{a=0}^{p−1} Σ_{z=1}^{H} Σ_{x∈S} Σ_{y∈T} e(a(g^{xy} − 2z)).
Separating the term |S||T|H/p, corresponding to a = 0, we obtain

    | N₀(S, T) − |S||T|H/p | ≤ (1/p) Σ_{a=1}^{p−1} |Sa(S, T)| | Σ_{z=1}^{H} e(−2az) |.
Using Lemma 3 and then Lemma 5, we derive

    | N₀(S, T) − |S||T|H/p | ≪ |S|^{1/2} |T|^{5/6} p^{−3/8} τ(p − 1) Σ_{a=1}^{p−1} | Σ_{z=1}^{H} e(−2az) |
        ≪ p^{23/24} τ(p − 1) Σ_{a=1}^{p−1} | Σ_{z=1}^{H} e(−2az) |
        = p^{23/24} τ(p − 1) Σ_{a=1}^{p−1} | Σ_{z=1}^{H} e(az) |
        ≪ p^{47/24} τ(p − 1) ln p.
Because N₀(S, T) + N₁(S, T) = |S||T| and H/p = 1/2 + O(p^{−1}), we see that

    | N₁(S, T) − |S||T|H/p | ≪ p^{47/24} τ(p − 1) ln p
as well. Therefore the discrepancy of f satisfies the bound
    ∆(f) ≪ 2^{−2n} p^{47/24} τ(p − 1) ln p ≪ p^{−1/24} τ(p − 1) ln p.
Using (2), we derive that ∆(f) ≪ 2^{−n/24+o(n)}. Applying Lemma 1, we obtain
the desired result.    ⊓⊔
The same considerations also imply the following estimate on the Fourier
coefficients.
Theorem 2. For the Fourier coefficients of the function f(x, y) given by (1),
the bound
    max_{u,v∈Bn} |f̂(u, v)| ≪ 2^{−n/24+o(n)}
holds.
Proof. We fix some nonzero vectors u, v ∈ Bn and denote by X0 and Y0 the sets
of integers x ∈ Bn and y ∈ Bn for which huxi = 0 and hvyi = 0, respectively.
Similarly, we define the sets X1 and Y1 by the conditions huxi = 1 and hvyi = 1,
respectively. Then we obtain

    f̂(u, v) = 2^{−2n} Σ_{x∈X₀} Σ_{y∈Y₀} (−1)^{f(x,y)} + 2^{−2n} Σ_{x∈X₁} Σ_{y∈Y₁} (−1)^{f(x,y)}
             − 2^{−2n} Σ_{x∈X₁} Σ_{y∈Y₀} (−1)^{f(x,y)} − 2^{−2n} Σ_{x∈X₀} Σ_{y∈Y₁} (−1)^{f(x,y)}.
It is easy to see that

    Σ_{x∈X_η} Σ_{y∈Y_μ} (−1)^{f(x,y)} = 2N₀(X_η, Y_μ) − |X_η||Y_μ|,    η, μ = 0, 1,

where, as before, N₀(X_η, Y_μ) is the number of pairs (x, y) ∈ X_η × Y_μ with
f(x, y) = 0. Using the same arguments as in the proof of Theorem 1, we derive
that

    | N₀(X_η, Y_μ) − (1/2)|X_η||Y_μ| | ≪ p^{47/24} τ(p − 1) ln p,    η, μ = 0, 1,
and from (2) we derive the desired result for nonzero vectors u and v.
Now, if u = 0̄ is the zero vector and v is not, then defining Y₀ and Y₁ as before
we obtain

    f̂(0̄, v) = 2^{−2n} Σ_{x∈Bn} Σ_{y∈Y₀} (−1)^{f(x,y)} − 2^{−2n} Σ_{x∈Bn} Σ_{y∈Y₁} (−1)^{f(x,y)}.
As before we derive

    | N₀(Bn, Y_μ) − 2^{n−1}|Y_μ| | ≪ p^{47/24} τ(p − 1) ln p,    μ = 0, 1,

which implies the desired estimate in this case. The same arguments apply if
v = 0̄ is the zero vector and u is not.
Finally, if both u = v = 0̄ are zero vectors, then

    f̂(0̄, 0̄) = 2^{−2n} Σ_{x∈Bn} Σ_{y∈Bn} (−1)^{f(x,y)} = 2^{−2n} (2N₀(Bn, Bn) − 2^{2n}),

and using the bound

    | N₀(Bn, Bn) − 2^{2n−1} | ≪ p^{47/24} τ(p − 1) ln p,

we conclude the proof.    ⊓⊔

4   Remarks
Our bound on the discrepancy of f, obtained in the proof of Theorem 1, combined with Lemma 2.2 of [2], can be used to derive a linear lower bound on the
ε-distributional communication complexity, which is defined in a similar way,
except that the communicating parties are allowed to make mistakes on at most
ε2^{2n} inputs x, y ∈ Bn.
Similar considerations can be used to estimate the more general sums

    Sa(X₁, ..., X_k) = Σ_{x₁∈X₁} ··· Σ_{x_k∈X_k} e(a g^{x₁···x_k})
with k ≥ 2 sets X1 , . . . , Xk ⊆ Bn and thus study the multi-party communication
complexity of the function (1).
It is obvious that the bound of Lemma 3 can be improved if nontrivial
upper bounds on |Y(d)| are known and substituted in (3). Certainly one cannot
hope to obtain such bounds for arbitrary sets Y but for cylinders such bounds
can be proved. Unfortunately this does not yield any improvement of Theorems 1
and 2. Indeed, nontrivial bounds on | Y(d)| improve the statement of Lemma 3
for sets of small cardinality, however in our applications sets of cardinality of
order p turn out to be most important. But for such sets the trivial bound
| Y(d)| ≤ p/d, which has been used in the proof of Lemma 3, is the best possible.
The bound of Lemma 3 also implies the same results for the modular exponentiation u^x (mod p), u, x ∈ Bn. It would be interesting to extend this result to
modular exponentiation modulo arbitrary integers m. In some cases, for example,
when m contains a large prime divisor, this can be done within the framework
of this paper. Other moduli may require some new ideas.
Finally, similar results hold in a more general situation when g is an element
of multiplicative order t ≥ p3/4+δ with any fixed δ > 0, rather than a primitive
root.
On the other hand, our method does not seem to work for Boolean functions
representing middle bits of g xy and obtaining such results is an interesting open
question.
References
1. E. Allender, M. Saks and I. E. Shparlinski, ‘A lower bound for primality’, Proc.
14th IEEE Conf. on Comp. Compl., Atlanta, 1999, IEEE Press, 1999, 10–14.
2. L. Babai, N. Nisan and M. Szegedy, ‘Multiparty protocols, pseudorandom generators for logspace and time–space trade-offs’, J. Comp. and Syst. Sci., 45 (1992),
204–232.
3. A. Bernasconi, ‘On the complexity of balanced Boolean functions’, Inform. Proc.
Letters, 70 (1999), 157–163.
4. A. Bernasconi, ‘Combinatorial properties of classes of functions hard to compute
in constant depth’, Lect. Notes in Comp. Sci., Springer-Verlag, Berlin, 1449
(1998), 339–348.
5. A. Bernasconi, C. Damm and I. E. Shparlinski, ‘Circuit and decision tree complexity of some number theoretic problems’, Tech. Report 98-21 , Dept. of Math.
and Comp. Sci., Univ. of Trier, 1998, 1–17.
6. A. Bernasconi, C. Damm and I. E. Shparlinski, ‘On the average sensitivity of
testing square-free numbers’, Lect. Notes in Comp. Sci., Springer-Verlag, Berlin,
1627 (1999), 291–299.
7. A. Bernasconi and I. E. Shparlinski, ‘Circuit complexity of testing square-free
numbers’, Lect. Notes in Comp. Sci., Springer-Verlag, Berlin, 1563 (1999), 47–
56.
8. R. B. Boppana, ‘The average sensitivity of bounded-depth circuits’, Inform.
Proc. Letters, 63 (1997), 257–261.
9. R. Canetti, J. B. Friedlander, S. Konyagin, M. Larsen, D. Lieman and I. E. Shparlinski, ‘On the statistical properties of Diffie–Hellman distributions’, Israel J.
Math., (to appear).
10. R. Canetti, J. B. Friedlander and I. E. Shparlinski, ‘On certain exponential
sums and the distribution of Diffie–Hellman triples’, J. London Math. Soc., (to
appear).
11. H. Cohen, A course in computational algebraic number theory, Springer-Verlag,
Berlin, 1997.
12. J. Friedlander and H. Iwaniec, ‘Estimates for character sums’, Proc. Amer. Math.
Soc., 119 (1993), 363–372.
13. J. von zur Gathen and J. Gerhard, Modern computer algebra, Cambridge Univ.
Press, Cambridge, 1999.
14. J. von zur Gathen and I. E. Shparlinski, ‘The CREW PRAM complexity of
modular inversion’, SIAM J. Computing, (to appear).
15. M. Goldmann, ‘Communication complexity and lower bounds for simulating
threshold circuits’, Theoretical Advances in Neural Computing and Learning,
Kluwer Acad. Publ., Dordrecht (1994), 85–125.
16. D. M. Gordon, ‘A survey of fast exponentiation methods’, J. Algorithms, 27
(1998), 129–146.
17. H. Iwaniec and A. Sárközy, ‘On a multiplicative hybrid problem’, J. Number
Theory, 26 (1987), 89–95.
18. S. Konyagin and I. E. Shparlinski, Character sums with exponential functions
and their applications, Cambridge Univ. Press, Cambridge, 1999.
19. N. M. Korobov, ‘On the distribution of digits in periodic fractions’, Matem.
Sbornik , 89 (1972), 654–670 (in Russian).
20. N. M. Korobov, Exponential sums and their applications, Kluwer Acad. Publ.,
Dordrecht, 1992.
21. E. Kushilevitz and N. Nisan, Communication complexity, Cambridge University
Press, Cambridge, 1997.
22. N. Linial, Y. Mansour and N. Nisan, ‘Constant depth circuits, Fourier transform,
and learnability’, Journal of the ACM , 40 (1993), 607-620.
23. Y. Mansour, ‘Learning Boolean functions via the Fourier transform’, Theoretical
Advances in Neural Computing and Learning, Kluwer Acad. Publ., Dordrecht
(1994), 391–424.
24. H. Niederreiter, ‘Quasi-Monte Carlo methods and pseudo-random numbers’,
Bull. Amer. Math. Soc., 84 (1978), 957–1041.
25. H. Niederreiter, Random number generation and Quasi–Monte Carlo methods,
SIAM Press, Philadelphia, 1992.
26. K. Prachar, Primzahlverteilung, Springer-Verlag, Berlin, 1957.
27. V. Roychowdhry, K.-Y. Siu and A. Orlitsky, ‘Neural models and spectral methods’, Theoretical Advances in Neural Computing and Learning, Kluwer Acad.
Publ., Dordrecht (1994), 3–36.
28. A. Sárközy, ‘On the distribution of residues of products of integers’, Acta Math.
Hungar., 49 (1987), 397–401.
29. I. E. Shparlinski, ‘On the distribution of primitive and irreducible polynomials
modulo a prime’, Diskretnaja Matem., 1 (1989), no.1, 117–124 (in Russian).
30. I. E. Shparlinski, Number theoretic methods in cryptography: Complexity lower
bounds, Birkhäuser, 1999.
31. I. M. Vinogradov, Elements of number theory, Dover Publ., NY, 1954.
Quintic Reciprocity and Primality Test for
Numbers of the Form M = A5^n ± ω_n
Pedro Berrizbeitia¹, Mauricio Odreman Vera¹, and Juan Tena Ayuso²

¹ Universidad Simón Bolívar, Departamento de Matemáticas
Caracas 1080-A, Venezuela. {pedrob,odreman}@usb.ve
² Facultad de Ciencias, Universidad de Valladolid
Valladolid, Spain. tena@agt.uva.es
Abstract. The Quintic Reciprocity Law is used to produce an algorithm, running in polynomial time, that determines the primality of numbers M such that M⁴ − 1 is divisible by a power of 5 which is larger than √M, provided that a small prime p, p ≡ 1 (mod 5), is given such that M is not a fifth power modulo p. The same test equations are used for all such M.
If M is a fifth power modulo p, a sufficient condition that determines the primality of M is given.
1   Introduction
Deterministic primality tests that run in polynomial time, for numbers of the
form M = A5^n − 1, have been given by Williams [9]. Moreover, Williams and
Judd [11] also considered primality tests for numbers M such that M² ± 1 have
large prime factors. A more general deterministic primality test was developed by
Adleman, Pomerance and Rumely [1], improved by H. Cohen and H.W. Lenstra
[4], and implemented by H. Cohen and A.K. Lenstra [5]. Although this is more
general, for specific families of numbers one may find more efficient algorithms.
This is what we give in this paper, for numbers M such that M⁴ − 1 is
divisible by a large power of 5. More specifically, let M = A5^n ± ω_n, where

    0 < A < 5^n;    0 < ω_n < 5^n/2;    ω_n⁴ ≡ 1 (mod 5^n).
In the given range there are exactly two possible values of ω_n. One is ω_n = 1
and the other is computed inductively via Hensel's lemma: given ω_n satisfying
ω_n² ≡ −1 (mod 5^n), there is a unique x (mod 5) such that (ω_n + x5^n)² ≡
−1 (mod 5^{n+1}).
Once x (mod 5) is found, select ω_{n+1} = ω_n + x5^n or ω_{n+1} = 5^{n+1} − (ω_n + x5^n),
according to which one satisfies ω_{n+1} < 5^{n+1}/2.
For such integers M we use the Quintic Reciprocity Law to produce an
algorithm, which runs in polynomial time, that determines the primality of M
provided that a small prime p, p ≡ 1(mod 5), is given, such that M is not a fifth
power modulo p.
We next describe the theorem that leads naturally to the algorithm.
Let ζ = e2πi/5 be a fifth complex primitive root of unity.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 269–279, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Let D = Z[ζ] be the corresponding cyclotomic ring. Let π be a primary
irreducible element of D lying over p. Let K = Q(ζ + ζ^{−1}) = Q(√5). Let
G = Gal(Q(ζ)/Q) be the Galois group of the cyclotomic field Q(ζ) over Q. For
every integer c denote by σ_c the element of G that sends ζ to ζ^c. For τ in Z[G]
and α in D we denote by α^τ the action of the element τ of Z[G] on the element
α of D.
Let f be the order of M modulo 5 (f is also the order of M modulo 5^n). Denote
by Φ_f(x) the f-th cyclotomic polynomial. We note that Φ_f(M) ≡ 0 (mod 5^n).
For f = 1 and f = 2 let γ = π^{1−3σ₃}. For f = 4 let γ = π. In all cases let
α = (γ/γ̄)^{Φ_f(M)/5^n}, where the bar indicates complex conjugation.
Let T₀ = Trace_{K/Q}(α + ᾱ) and N₀ = Norm_{K/Q}(α + ᾱ).
For k ≥ 0 define T_{k+1}, N_{k+1} recursively by the formulas:

    T_{k+1} = T_k^5 − 5N_k T_k^3 + 5N_k² T_k + 15N_k T_k − 5T_k^3 + 5T_k        (1.1)

    N_{k+1} = N_k^5 − 5N_k^3 (T_k² − 2N_k) + 5N_k [(T_k² − 2N_k)² − 2N_k²]
              + 25N_k^3 − 25N_k (T_k² − 2N_k) + 25N_k        (1.2)
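The recurrences (1.1) and (1.2) can be checked numerically (our sanity check, not part of the paper): if T_k = A + B and N_k = AB, then the next values must equal A′ + B′ and A′B′, where A′ = A^5 − 5A^3 + 5A and B′ is defined likewise (these are the identities (3.5) and (3.6) proved in Section 3).

```python
# Sanity check (ours, not from the paper) of the recurrences (1.1) and (1.2):
# with T = A + B and N = A*B, the next values must be A' + B' and A'*B',
# where A' = A^5 - 5A^3 + 5A and B' likewise (identities (3.5)/(3.6)).
def step(T, N):
    T1 = T**5 - 5*N*T**3 + 5*N**2*T + 15*N*T - 5*T**3 + 5*T           # (1.1)
    N1 = (N**5 - 5*N**3*(T**2 - 2*N) + 5*N*((T**2 - 2*N)**2 - 2*N**2)
          + 25*N**3 - 25*N*(T**2 - 2*N) + 25*N)                       # (1.2)
    return T1, N1

def direct(A, B):
    A1 = A**5 - 5*A**3 + 5*A
    B1 = B**5 - 5*B**3 + 5*B
    return A1 + B1, A1 * B1

# e.g. A, B = 2, 3 (so T = 5, N = 6): both routes give (125, 246)
```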
Let Π_{j=1}^{g} P_j(x) be the factorization modulo M of the polynomial Φ₅(x) as a product of irreducible polynomials. Let μ_j = (M, P_j(ζ)) be the ideal of D generated
by M and P_j(ζ).
We prove the following theorem:

Theorem 1. Let M, A, ω_n be as before and suppose that M is not divisible by
any of the solutions of x⁴ ≡ 1 (mod 5^n), 1 < x < 5^n. The following statements
are equivalent:

i) M is prime.

ii) For each μ_k there is an integer i_k ≢ 0 (mod 5) such that

    α^{5^{n−1}} ≡ ζ^{i_k} (mod μ_k)        (1.3)

iii)

    T_{n−1} ≡ N_{n−1} ≡ −1 (mod M)        (1.4)
We note that the equivalence of (i) and (ii) is an extension of Proth's Theorem,
and the equivalence of (i) and (iii) extends the Lucas–Lehmer test.
We use the Quintic Reciprocity Law to extend Proth's Theorem, in the same
way Guthmann [6] and Berrizbeitia–Berry [3] used the Cubic Reciprocity Law to
extend Proth's Theorem for numbers of the form A3^n ± 1. From this extension
of Proth's Theorem we derive a Lucas–Lehmer type test, by taking traces and
norms of certain elements in the field Q(√5), in a way which is analogous to
Rosen's proof [8] of the Lucas–Lehmer test. Generalization of this scheme to a
wider family of numbers is the object of a forthcoming paper.
In section 2 of this paper we introduce the quintic symbol, and state the facts
we need from the arithmetic of the ring D, including the Quintic Reciprocity
Law. In section 3 we prove Theorem 1. Section 4 is devoted to remarks that
have interest of their own and are useful for implementation. We include a
theorem that gives a sufficient condition that determines the primality of M
when the assumption on p is removed. In section 5 we describe an implementation of
the algorithm derived from Theorem 1. Work similar to this had been done earlier
by Williams [10], who derived his algorithm from properties of some Lucas
sequences. Our algorithm is derived from a generalization of Proth's Theorem,
and gives a unified treatment to test the primality of numbers M such that M⁴ − 1 is
divisible by a large enough power of 5. In particular, an interesting observation
is that the algorithm we use to test numbers M of the form A5^n + 1 is the same
as the one we use to test numbers of the form A5^n − 1, which was not the case
for earlier algorithms we found in the literature.
2   The Ring D. Quintic Symbol and Quintic Reciprocity
What we state in this section may be found, among other places, in [7], chapters
12 to 14, from which we borrow the notation and presentation.
Let D = Z[ζ] be the ring of integers of the cyclotomic field Q(ζ). Let p be a
rational prime, p ≠ 5. Let f be the order of p modulo 5. Then p factors as the
product of 4/f prime ideals in D. If P and P′ are two of these prime ideals,
there is a σ in G = Gal(Q(ζ)/Q) such that σ(P) = P′. D/P is a finite field
with p^f elements and is called the residue class field mod P. The multiplicative
group of units mod P, denoted by (D/P)^*, is cyclic of order p^f − 1. Let α
in D be an element not in P. There is an integer i, unique modulo 5, such that
α^{(p^f −1)/5} ≡ ζ^i (mod P). The quintic symbol (α/P) is defined to be the unique
fifth root of unity satisfying

    α^{(p^f −1)/5} ≡ (α/P) (mod P)        (2.1)
The symbol has the following properties:
(α/P) = 1 if, and only if,

    x⁵ ≡ α (mod P)        (2.2)

is solvable in D.
For every σ ∈ G,

    (α/P)^σ = (σ(α)/σ(P))        (2.3)
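For a prime p ≡ 1 (mod 5) and a degree-one prime P over p, the residue field D/P is just IFp, so the quintic symbol of a rational integer a reduces to the power residue a^{(p−1)/5} mod p. A toy illustration of ours, with the hypothetical sample value p = 11:

```python
# Toy illustration (not from the paper): for p ≡ 1 (mod 5) and a degree-one
# prime P over p, D/P is isomorphic to F_p, and the quintic symbol of an
# integer a prime to p is the fifth root of unity a^{(p-1)/5} in F_p;
# it equals 1 exactly when a is a fifth power modulo p (property (2.2)).
def quintic_symbol(a: int, p: int) -> int:
    assert p % 5 == 1 and a % p != 0
    return pow(a, (p - 1) // 5, p)

p = 11
fifth_powers = {pow(x, 5, p) for x in range(1, p)}   # the set {1, 10}
# so 10 is a fifth power mod 11, while 3 is not
```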
Let A be an ideal in D, prime to 5. Then A can be written as a product
of prime ideals: A = P₁ ··· P_s. Let α ∈ D be prime to A. The symbol (α/A) is
defined as the product of the symbols (α/P₁) ··· (α/P_s). Let β ∈ D be prime to 5
and to α. The symbol (α/β) is defined as (α/(β)).
D is a Principal Ideal Domain (PID) (see the notes on page 200 of [7] for
literature on cyclotomic fields with class number one). An element α ∈ D is
called primary if it is not a unit, is prime to 5, and is congruent to a rational
integer modulo (1 − ζ)². For each α ∈ D, prime to 5, there is an integer c in Z,
unique modulo 5, such that ζ c α is primary. In particular, every prime ideal P
in D has a primary generator π.
Quintic Reciprocity Law:
Let M be an integer, prime to 5. Let α be a primary element of D and assume
α is prime to M and prime to 5. Then

    (α/M) = (M/α)        (2.4)

3   Proof of the Theorem
The condition imposed on the prime p implies p ≡ 1(mod 5) (otherwise the
equation x5 ≡ M (mod p) would have an integer solution). It follows that the
ideal (p) factors as the product of four prime ideals in D. These are all principal,
since D is a PID. We denote by π a primary generator of one of these prime
ideals. The other ideals are generated by the Galois conjugates of π, which are
also primary.
We note that (M/π) ≠ 1, for otherwise M would be a fifth power modulo each
of π's Galois conjugates, hence modulo p. We prove:
i) implies ii)
Suppose first f = 1. Let (M/π) = ζ^{i₁}. Then i₁ ≢ 0 (mod 5). Since M is a
rational prime with M ≡ 1 (mod 5), (M) factors in D as the product of 4 prime
ideals. We write (M) = (μ₁)(σ₂(μ₁))(μ̄₁)(σ₂(μ̄₁)). We get

    ζ^{i₁} = (M/π) = (π/M)                                        (by (2.4))
           = (π/μ₁)(π/σ₂(μ₁))(π/μ̄₁)(π/σ₂(μ̄₁))                    (because (M) = (μ₁)(σ₂(μ₁))(μ̄₁)(σ₂(μ̄₁)))
           = ((π/π̄)/μ₁)((π/π̄)/σ₂(μ₁))
           = ((π/π̄)/μ₁)(σ₃(π/π̄)/μ₁)^{−3} = ((π/π̄)^{1−3σ₃}/μ₁)    (by (2.3))
           ≡ ((π/π̄)^{1−3σ₃})^{(M−1)/5} (mod μ₁)                   (by (2.1))
           ≡ α^{5^{n−1}} (mod μ₁)                                 (since Φ₁(M) = M − 1)
Next suppose f = 2. In this case (M) = (μ)(σ₂(μ)). Again we use (2.4), (2.3)
and (2.1). This time we get: there is an integer i₂ ≢ 0 (mod 5) such that

    ζ^{i₂} = (π/M) = (π^{1−3σ₃}/μ)
           ≡ (π^{1−3σ₃})^{(M²−1)/5}
           ≡ (π^{1−3σ₃})^{(M−1)(M+1)/5} (mod μ).
Noting that raising to the M-th power mod μ is the same as complex conjugation mod μ, and that Φ₂(M) = M + 1, we get the result. Finally, if f = 4,
(M) remains prime in D. We get (M/π) = (π/M) ≡ π^{(M⁴−1)/5} (mod M) ≡
π^{(M²−1)(M²+1)/5} (mod M). This time raising to the power M² is equivalent to
complex conjugation and Φ₄(M) = M² + 1, so we obtain the desired result.
ii) implies iii)
For k ≥ 0 let T_k = Trace_{K/Q}(α^{5^k} + ᾱ^{5^k}) and N_k = Norm_{K/Q}(α^{5^k} + ᾱ^{5^k}).
We claim that T_k and N_k satisfy the recurrence relations given by (1.1) and (1.2).
To see this we let A_k = α^{5^k} + ᾱ^{5^k} and B_k = σ₂(A_k), so that
T_k = A_k + B_k and N_k = A_k B_k.
We first obtain (1.1). Raising T_k to the fifth power we get

    A_k^5 + B_k^5 = T_k^5 − 5N_k (A_k^3 + B_k^3) − 10N_k² T_k        (3.1)

Computing T_k^3 we obtain:

    A_k^3 + B_k^3 = T_k^3 − 3N_k T_k        (3.2)

On the other hand, keeping in mind that ᾱ = α^{−1}, one gets:

    A_k^5 = A_{k+1} + 5((α^{5^k})³ + (α^{−5^k})³) + 10A_k        (3.3)

and

    A_k^3 = (α^{5^k})³ + (α^{−5^k})³ + 3A_k        (3.4)

Combining (3.3) with (3.4) leads to:

    A_{k+1} = A_k^5 − 5A_k^3 + 5A_k        (3.5)

Similarly, one obtains

    B_{k+1} = B_k^5 − 5B_k^3 + 5B_k        (3.6)

Adding (3.5) and (3.6) we get

    T_{k+1} = (A_k^5 + B_k^5) − 5(A_k^3 + B_k^3) + 5T_k        (3.7)

Substituting (3.1) and (3.2) in (3.7) we obtain (1.1).
To obtain (1.2) we first multiply (3.5) and (3.6). This leads to:

    N_{k+1} = N_k^5 − 5N_k^3 (A_k² + B_k²) + 5N_k (A_k⁴ + B_k⁴)
              + 25N_k^3 − 25N_k (A_k² + B_k²) + 25N_k        (3.8)

Next we note:

    A_k² + B_k² = T_k² − 2N_k        (3.9)
from which we deduce

    A_k⁴ + B_k⁴ = (T_k² − 2N_k)² − 2N_k²        (3.10)

(1.2) is then obtained by substituting (3.9) and (3.10) in (3.8).
Since we have proved that T_k and N_k satisfy the recurrence relations given
by (1.1) and (1.2), ii) implies that T_{n−1} ≡ (ζ + ζ^{−1}) + (ζ² + ζ^{−2}) ≡ −1 (mod μ).
Since T_{n−1} is a rational number, the congruence holds modulo μ ∩ Z = M Z.
Similarly we get N_{n−1} ≡ −1 (mod M).
iii) implies i)
We will show that under the hypothesis every prime divisor Q of M is larger
than the square root of M. This will imply that M is prime. Let Q be a prime divisor
of M. Let Q be a prime ideal in D lying over Q. Clearly, (1.4) holds modulo Q.
We will show that (1.3) also holds modulo Q.
From

    T_{n−1} = Trace_{K/Q}(α^{5^{n−1}} + ᾱ^{5^{n−1}}) ≡ −1 (mod Q)

and

    N_{n−1} = Norm_{K/Q}(α^{5^{n−1}} + ᾱ^{5^{n−1}}) ≡ −1 (mod Q),

we deduce that α^{5^{n−1}} + ᾱ^{5^{n−1}} has the same norm and trace modulo Q as
(ζ + ζ^{−1}); it follows that α^{5^{n−1}} + ᾱ^{5^{n−1}} ≡ ζ + ζ^{−1} (mod Q) or
α^{5^{n−1}} + ᾱ^{5^{n−1}} ≡ ζ² + ζ^{−2} (mod Q). This fact, together with the fact that
α^{−1} = ᾱ, leads to α^{5^{n−1}} ≡ ζ^i (mod Q) for some i ≢ 0 (mod 5). Hence the class
of α (mod Q) has order 5^n in the multiplicative group of units (D/Q)^*. It follows
that 5^n divides the order of this group, which is a divisor of Q⁴ − 1. In other
words, Q⁴ − 1 ≡ 0 (mod 5^n). Since by hypothesis no solution of this last congruence
less than 5^n is a divisor of M, it follows that Q is larger than 5^n, which in turn
is larger than the square root of M, by the hypothesis made on A.
4   Remarks on the Implementation

In this section we make remarks on the implementation and find T₀ and N₀.
We will also study what happens if M is a fifth power modulo p.
– Although in principle part ii) of Theorem 1 provides an algorithm for testing
the primality of M, it assumes that a factorization of Φ₅(x) modulo M is
given. If M is not a prime, the algorithm that finds this factorization may
not converge. Part iii) instead gives an algorithm that is easy to implement,
provided that N₀ and T₀ are computed.
– Note that the recurrence relations (1.1) and (1.2) are independent of the
value of p. This is the case because ᾱ = α^{−1}.
Quintic Reciprocity and Primality Test
275
– In practice A is fixed, while n is taken in a range of values which vary from
relatively small to as large as possible. In the cases f = 1 and f = 2 we
obtain

    T₀ = Trace_{K/Q}(α + ᾱ) = Trace_{K/Q}((γ/γ̄)^A + (γ̄/γ)^A)    and
    N₀ = Norm_{K/Q}(α + ᾱ) = Norm_{K/Q}((γ/γ̄)^A + (γ̄/γ)^A).

Hence T₀ and N₀ are computable with O(log A) modular operations.
When f = 4 the calculation of α (mod M) is longer. In fact, in this case
α = (γ/γ̄)^{(M²+1)/5^n}. The exponent this time is very big, and the calculation
of T₀ and N₀ in this case involves a lot of work. The calculation is still done
with O(log M) modular operations, but no longer with O(log A), as in the
cases f = 1 and f = 2. The following observation reduces somewhat
the amount of work involved in the computation of α (mod M) for the case
f = 4.
When dividing (M² + 1)/5^n by M one obtains:

    (M² + 1)/5^n = AM + (ω_n² + 1)/5^n ± Aω_n

The calculation of α (mod M) is therefore simplified by keeping in mind that
raising (γ/γ̄) to the M-th power modulo M is equivalent to applying σ₂ or
σ₃, according to the congruence of M (mod 5).
– If p ≡ 1 (mod 5) and M is a fifth power modulo p, the following proposition
provides a sufficient condition to prove the primality of M.

Proposition 1. If T_k ≡ N_k ≡ −1 (mod M) for some k such that 5^k is
larger than the square root of M, and if no nontrivial solution of x⁴ ≡ 1 (mod 5^k),
x < 5^k, is a divisor of M, then M is prime.

The proof of this proposition goes along the lines of iii) implies i) in Theorem 1,
the key point being that α has order 5^k mod Q, which obliges Q either to be too
large or to be a smaller solution of x⁴ ≡ 1 (mod 5^k).
This proposition may be particularly useful when A is much smaller than 5^n.
5   Implementation

Table 1 below consists of a 25 × 2 matrix containing all numbers ω_n, 1 ≤ n ≤ 25,
such that ω_n² + 1 ≡ 0 (mod 5^n); 0 < ω_n < 5^n; ω_n ≡ ±2 (mod 5). The first
column contains exactly those ω_n which are congruent to 2 (mod 5) and the
second column those which are congruent to 3 (mod 5). The term n + 1 of the
first column, ω_{n+1}, is obtained from the n-th term of the same column by the
following formula:

    ω_{n+1} = ω_n + (((ω_n² + 1)/5^n) mod 5) · 5^n,    ω₁ = 2.

For the second column we use

    ω_{n+1} = ω_n + ((−(ω_n² + 1)/5^n) mod 5) · 5^n,    ω₁ = 3.
Table 1. ω_n

 n   ω_n (ω₁ = 2)         ω_n (ω₁ = 3)
 1   2                    3
 2   7                    18
 3   57                   68
 4   182                  443
 5   2057                 1068
 6   14557                1068
 7   45807                32318
 8   280182               110443
 9   280182               1672943
10   6139557              3626068
11   25670807             23157318
12   123327057            120813568
13   123327057            1097376068
14   5006139557           1097376068
15   11109655182          19407922943
16   102662389557         49925501068
17   407838170807         355101282318
18   3459595983307        355101282318
19   3459595983307        15613890344818
20   79753541295807       15613890344818
21   365855836217682      110981321985443
22   2273204469030182     110981321985443
23   2273204469030182     9647724486047943
24   49956920289342682    9647724486047943
25   109561565064733307   188461658812219818
Table 2 below also consists of a 24 × 2 matrix; the A-th row contains the list of
values of n, 1 ≤ n ≤ 100, such that M = A5^n + ω_n, with ω_n ≡ 2 (mod 5)
(respectively ω_n ≡ 3 (mod 5)), is prime, followed by the time in seconds it took
a Pentium II at 350 MHz to compute them, using the program we next describe.
Maple was used for the implementation.
Table 2. Primes for ω₁ = 2, 3

 A   ω₁ = 2: 1 ≤ n ≤ 100   time     ω₁ = 3: 1 ≤ n ≤ 100         time
 1   1                     28.891   2,3,6,16,17,25              31.563
 2   3,20,57,73            39.943   1,4,31                      26.087
 3   1,22,24               27.705   3,12,73,77,82               34.346
 4   2,3,5,17              24.494   1                           27.809
 5   4,9,64                27.938   5,6                         35.504
 6   2,5                   35.372   15,39                       27.162
 7   1,34                  28.933   2,5,16,35                   36.022
 8   14                    35.883   1,4,24                      28.936
 9   1,4,29,59             27.788   3,7,55                      36.717
10   2,3,10,11,13,43       37.103   1                           29.457
11   4,61,86               28.533   2,43,94                     36.183
12   2,27,32,63,73         36.900   21                          25.896
13   1,8,33,34,56          28.671   3,11,17,18,30,35,37,46,48   37.445
14   7,19,72               36.126   1,24,92                     28.857
15   –                     44.894   68,72                       38.615
16   5,13,17               37.311   1,28,76                     29.468
17   28                    28.510   2,5,11,27                   36.624
18   2,11,54,57            36.766   28,59                       30.104
19   1,15,21,23,69         28.971   5,7,35,81                   38.568
20   3,14                  38.138   1                           31.106
21   1                     28.237   3,13,14,19,42,57            38.671
22   2,7,12,16,75          36.921   1,8,56                      30.001
23   4,8                   29.075   2,58,81                     38.983
24   2,78                  36.275   4                           30.680
The first column of Table 3 contains the values of n for which A5^n + 1 is
prime, and the second column those values for which A5^n − 1 is prime.
5.1   Description of the Algorithm

Some precomputation is needed. We fix the primes p = 11, 31, 41, 61.
For each of these primes we found a prime element of the cyclotomic
ring D, which we will denote by Π_p(ζ), lying over p (this means that
|Norm_{Q(ζ)/Q}(Π_p(ζ))| = p).
Table 3. Primes for ω₁ = 1, −1

 A   ω₁ = 1: 1 ≤ n ≤ 100          time     ω₁ = −1: 1 ≤ n ≤ 100          time
 2   1,3,13,45                    12.013   4,6,16,24,30,54,96            12.321
 4   2,6,18,50                    14.149   1,3,9,13,15,25,39,69          12.497
 6   1,2,3,23,27,33,63            12.158   1,2,5,11,28,65,72             13.058
 8   1                            12.715   2,4,8,10,28                   13.219
12   1,5,7,18,19,23,46,51,55,69   12.893   1,3,4,8,9,28,31,48,51,81      13.309
14   1,7,23,33                    13.239   2,6,14                        13.587
16   2,14,22,26,28,42             13.072   1,3,5,7,13,17,23,33,45,77     13.446
18   3,4,6,10,15,30               13.199   1,2,5,6,9,13,17,24,26,49,66   13.577
22   4,10,40                      13.907   1,3,5,7,27,35,89              14.085
24   2,3,8,19,37,47               12.921   2,3,10,14,15,23,27,57,60      13.715
We note that the left side of equation (1.3) does not vary if Π_p(ζ) is replaced
by another prime lying over p when n ≥ 2. Therefore the condition of being
primary may be disregarded; hence we let Π₁₁(ζ) = (ζ + 2), Π₃₁(ζ) = (ζ − 2),
Π₄₁(ζ) = (ζ³ + 2ζ² + 3ζ + 3), Π₆₁(ζ) = (ζ + 3).
For the cases f = 1 and f = 2 (M = A5^n ± 1) we let

    β_{p,f} = (Π_p(ζ)/Π̄_p(ζ))^{1−3σ₃};

for the case f = 4 (or M = A5^n ± ω_n; ω_n ≡ ±2 (mod 5)), we let

    β_{p,f} = Π_p(ζ)/Π̄_p(ζ),

and

    T_{p,f,A,n} ≡ Trace_{K/Q}( β_{p,f}^{Φ_f(M)/5^n} + β̄_{p,f}^{Φ_f(M)/5^n} ) (mod M),

    N_{p,f,A,n} ≡ Norm_{K/Q}( β_{p,f}^{Φ_f(M)/5^n} + β̄_{p,f}^{Φ_f(M)/5^n} ) (mod M).
The program finds the first of these values of p for which M is not a fifth power
modulo p. If no such p is found, a note is made and these numbers are later
tested by other means.
Otherwise we set T₀ = T_{p,f,A,n} and N₀ = N_{p,f,A,n}, and we use the recurrence
equations (1.1) and (1.2) to verify whether (1.4) holds.
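Putting the pieces together, the whole test can be sketched in a few lines of Python. This is our own illustration under stated assumptions, not the authors' Maple program: it does arithmetic in Z[ζ]/(M) via coefficient vectors modulo Φ₅(x), takes Π₁₁(ζ) = ζ + 2 as above, uses a norm-based inversion (ours), and runs the case f = 1 on the example M = 2·5³ + 1 = 251, which is listed as prime in Table 3 (A = 2, ω₁ = 1, n = 3).

```python
# Sketch (ours, not the authors' Maple program) of the test for f = 1,
# i.e. M = A*5^n + 1, here A = 2, n = 3, M = 251 (prime, cf. Table 3).
# Elements of Z[zeta]/(M) are coefficient lists [c0, c1, c2, c3] in the basis
# 1, zeta, zeta^2, zeta^3, with zeta^4 = -(1 + zeta + zeta^2 + zeta^3).
A, n = 2, 3
M = A * 5**n + 1

def mul(a, b):
    c = [0] * 5                       # coefficients on 1, z, z^2, z^3, z^4
    for i in range(4):
        for j in range(4):
            c[(i + j) % 5] = (c[(i + j) % 5] + a[i] * b[j]) % M
    return [(c[k] - c[4]) % M for k in range(4)]

def sigma(a, c):                      # Galois map sigma_c: zeta -> zeta^c
    out = [0] * 5
    for i in range(4):
        out[(c * i) % 5] = (out[(c * i) % 5] + a[i]) % M
    return [(out[k] - out[4]) % M for k in range(4)]

def power(a, e):
    r = [1, 0, 0, 0]
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r

def inverse(a):                       # via the rational norm a*s2(a)*s3(a)*s4(a)
    b = mul(mul(sigma(a, 2), sigma(a, 3)), sigma(a, 4))
    c = mul(a, b)[0]                  # the norm, a constant mod M
    return [x * pow(c, -1, M) % M for x in b]

Pi = [2, 1, 0, 0]                              # Pi_11 = zeta + 2, over p = 11
beta = mul(Pi, inverse(sigma(Pi, 4)))          # Pi / conj(Pi)
gamma = mul(beta, power(inverse(sigma(beta, 3)), 3))  # beta^(1 - 3*sigma_3)
alpha = power(gamma, A)               # Phi_1(M)/5^n = A for M = A*5^n + 1

T = (4 * alpha[0] - alpha[1] - alpha[2] - alpha[3]) % M   # Tr_{K/Q}(alpha+conj)
s = [(x + y) % M for x, y in zip(alpha, sigma(alpha, 4))]
N = mul(s, sigma(s, 2))[0]            # Norm_{K/Q}(alpha+conj), constant mod M

for _ in range(n - 1):                # recurrences (1.1) and (1.2)
    T, N = ((T**5 - 5*N*T**3 + 5*N**2*T + 15*N*T - 5*T**3 + 5*T) % M,
            (N**5 - 5*N**3*(T**2 - 2*N) + 5*N*((T**2 - 2*N)**2 - 2*N**2)
             + 25*N**3 - 25*N*(T**2 - 2*N) + 25*N) % M)

is_prime_by_test = (T == M - 1 and N == M - 1)   # condition (1.4)
```

Here the primary condition on Π₁₁ is disregarded as the remark above allows for n ≥ 2, and the check that M is not a fifth power modulo 11 holds since M ≡ 9 (mod 11) while the fifth powers modulo 11 are {1, 10}.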
When f = 1 or 2 we note that φf (M )/5n = A. Hence Tp,f,A,n depends only
on A, not on n. In this case, for relatively small values of A, we recommend to
Quintic Reciprocity and Primality Test
279
compute the value of
φ (M )/5n
φ (M )/5n
(ζ) + βp f
(ζ) (not modulo M )
T rK/Q βp f
and
φ (M )/5n
φ (M )/5n
(ζ) (not modulo M )
(ζ) + βp f
N ormK/Q βp f
These same numbers may be used as the starting numbers T_0 and N_0 for all
numbers n in a given range. If, for a fixed value of A, the calculation of T_0 and
N_0 is counted as part of the precomputation, then the complexity of the primality
test for numbers of the form A·5^n ± 1 which are not congruent to a fifth power
modulo p is simply the complexity of calculating the recurrence relations
(1.1) and (1.2) n − 1 times.
When f = 4, φ_f(M)/5^n is large and depends on A and n. In this case, even
for small values of A, the computation of T_0 and N_0 is, for each value of M, of
approximately the same complexity as the computation of T_{n−1}, N_{n−1} given
T_0 and N_0.
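The overall flow of the test can be summarized as a short driver loop. This is only an illustrative skeleton, not the authors' implementation: the concrete recurrence step for (1.1)/(1.2) and the acceptance condition (1.4) are defined earlier in the paper, and are represented here by the hypothetical callables `step` and `accepts`.

```python
# Skeleton of the test driver described above. `step` and `accepts` are
# hypothetical placeholders for the paper's recurrences (1.1)/(1.2) and
# condition (1.4); they are not defined in this excerpt.
def quintic_test(T0, N0, n, M, step, accepts):
    """Iterate the (T, N) recurrence n - 1 times mod M, then test (1.4)."""
    T, N = T0 % M, N0 % M
    for _ in range(n - 1):          # n - 1 applications of (1.1) and (1.2)
        T, N = step(T, N, M)        # one combined recurrence step mod M
    return accepts(T, N, M)         # does condition (1.4) hold?
```

As the text notes, when T_0 and N_0 are counted as precomputation, the cost of this loop (n − 1 recurrence steps) is the whole cost of the test.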
Acknowledgements
We are grateful to Daniel Sandonil, who implemented the program.
We are also grateful to the referees for their useful comments, which helped
to improve the paper.
References
1. L. Adleman, C. Pomerance and R. Rumely, On distinguishing prime numbers from
composite numbers, Ann. of Math. 117 (1983), 173-206.
2. D. Bressoud, Factorization and Primality Testing, Springer-Verlag, New York, 1989.
3. P. Berrizbeitia, T.G. Berry, Cubic reciprocity and generalised Lucas-Lehmer tests
for primality of A·3^n ± 1, Proc. AMS 127 (1999), 1923-1925.
4. H. Cohen and H.W. Lenstra, Primality testing and Jacobi sums, Math. Comp. 42
(1984), 297-330.
5. H. Cohen and A.K. Lenstra, Implementation of a new primality test, Math. Comp.
48 (1987), 103-121.
6. A. Guthmann, Effective primality tests for integers of the forms N = K·3^n + 1
and N = K·2^m·3^n + 1, BIT 32 (1992), 529-534.
7. K. Ireland and M. Rosen, A Classical Introduction to Modern Number Theory, 2nd
ed., Springer-Verlag, New York, 1982.
8. M. Rosen, A proof of the Lucas-Lehmer test, Amer. Math. Monthly 95 (1988), no. 9,
855-856.
9. H.C. Williams, Effective primality tests for some integers of the forms A·5^n − 1
and A·7^n − 1, Math. of Comp. 48 (1987), 385-403.
10. H.C. Williams, A generalization of Lehmer's functions, Acta Arith. 29 (1976), 315-341.
11. H.C. Williams and J.S. Judd, Determination of the primality of N by using factors
of N² ± 1, Math. of Comp. 30 (1976), 157-172.
Determining the Optimal Contrast for Secret
Sharing Schemes in Visual Cryptography
Matthias Krause1 and Hans Ulrich Simon2
1
Theoretische Informatik
Universität Mannheim
D-68131 Mannheim, Germany
krause@th.informatik.uni-mannheim.de
2
Fakultät für Mathematik
Ruhr-Universität Bochum
D-44780 Bochum, Germany
simon@lmi.ruhr-uni-bochum.de
Abstract. This paper shows that the largest possible contrast C_{k,n} in
a k-out-of-n secret sharing scheme is approximately 4^{−(k−1)}. More precisely,
we show that 4^{−(k−1)} ≤ C_{k,n} ≤ 4^{−(k−1)} · n^k / (n(n − 1) · · · (n − (k − 1))).
This implies that the largest possible contrast equals 4^{−(k−1)} in the
limit when n approaches infinity. For large n, the above bounds leave
almost no gap. For values of n that come close to k, we will present
alternative bounds (being tight for n = k). The proofs of our results proceed
by revealing a central relation between the largest possible contrast
in a secret sharing scheme and the smallest possible approximation error
in problems occurring in Approximation Theory.
1 Introduction
Visual cryptography and k-out-of-n secret sharing schemes are notions introduced by Naor and Shamir in [10]. A sender wishing to transmit a secret message distributes n transparencies among n recipients, where the transparencies
contain seemingly random pictures. A k-out-of-n scheme achieves the following
situation: If any k recipients stack their transparencies together, then a secret
message is revealed visually. On the other hand, if only k − 1 recipients stack
their transparencies, or analyze them by any other means, they are not able to
obtain any information about the secret message. The reader interested in more
background information about secret sharing schemes is referred to [10].
An important measure of a scheme is its contrast, i.e., the clarity with which
the message becomes visible. This parameter lies in the interval [0, 1], where contrast 1 means “perfect clarity” and contrast 0 means “invisibility”. Naor and
Shamir constructed k-out-of-k secret sharing schemes with contrast 2−(k−1) and
were also able to prove optimality. However, they did not determine the largest
possible contrast Ck,n for arbitrary k-out-of-n secret sharing schemes.
Subsequently, several attempts were made to find accurate estimates of the optimal contrast and the optimal tradeoff between contrast and
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 280–291, 2000.
c Springer-Verlag Berlin Heidelberg 2000
subpixel expansion for arbitrary k-out-of-n secret sharing schemes [4],[5],[1],[2],
[3]. For k = 2 and arbitrary n this problem was completely solved by Hofmeister,
Krause, and Simon in [5]. But the underlying methods, which are based on the
theory of linear codes, do not work for k ≥ 3. Strengthening the approach of
Droste [4], the first step in the direction of determining Ck,n for some values k
and n, where k ≥ 3, was taken in [5]. They presented a simple linear program
LP(k, n) whose optimal solution represents a contrast-optimal k-out-of-n secret
sharing scheme. The profit achieved by this solution equals C_{k,n}. Although
C_{k,n} was computable in poly(n) steps this way, and even elementary formulas were
given for k = 3, 4, there was still no general formula for C_{k,n} (or for good
bounds). Based on computations of Ck,n for specific choices of k, n, it was conjectured in [5] that Ck,n ≥ 4−(k−1) with equality in the limit when n approaches
infinity. In [2] and [3], some of the results from [5] concerning k = 3, 4 and
arbitrary n could be improved. Furthermore, in [3], Blundo, D’Arco, DeSantis
and Stinson determine the optimal contrast of k-out-of-n secret sharing schemes
for arbitrary n and k = n − 1.
In this paper, we confirm the above conjecture of [5] by showing the following
bounds on Ck,n :
    4^{−(k−1)} ≤ C_{k,n} ≤ 4^{−(k−1)} · n^k / (n(n − 1) · · · (n − (k − 1))).
This implies that the largest possible contrast equals 4−(k−1) in the limit when
n approaches infinity. For large n, the above bounds leave almost no gap. For
values of n that come close to k, we will present alternative bounds (being
tight for n = k). The proofs of our results proceed by revealing a central relation
between the largest possible contrast in a secret sharing scheme and the smallest
possible approximation error in problems occuring in Approximation Theory. A
similar relation was used in the paper [8] of Linial and Nisan about Approximate
Inclusion-Exclusion (although there are also some differences and paper [8] endsup with problems in Approximation Theory that are different from ours).
2 Definitions and Notations
For the sake of completeness, we recall the definition of visual secret sharing
schemes given in [10]. In the sequel, we simply refer to them under the notion
scheme. For a 0-1–vector v, let H(v) denote the Hamming weight of v, i.e., the
number of ones in v.
Definition 1. A k-out-of-n scheme C = (C0 , C1 ) with m subpixels, contrast α =
α(C) and threshold d consists of two collections of Boolean n × m matrices C0 =
[C0,1 , . . . , C0,r ] and C1 = [C1,1 , . . . , C1,s ] , such that the following properties are
valid:
1. For any matrix S ∈ C0 , the OR v of any k out of the n rows of S satisfies
H(v) ≤ d − αm.
2. For any matrix S ∈ C1 , the OR v of any k out of the n rows of S satisfies
H(v) ≥ d.
3. For any q < k and any q-element subset {i1 , . . . , iq } ⊆ {1, . . . , n}, the two
collections of q × m matrices D0 and D1 obtained by restricting each n × m
matrix in C0 and C1 to rows i1 , . . . , iq are indistinguishable in the sense that
they contain the same matrices with the same relative frequencies.
k-out-of-n schemes are used in the following way to achieve the situation
described in the introduction. The sender translates every pixel of the secret
image into n sets of subpixels, in the following way: If the sender wishes to
transmit a white pixel, then she chooses one of the matrices from C0 according
to the uniform distribution. In the case of a black pixel, one of the matrices from
C1 is chosen. For all 1 ≤ i ≤ n, recipient i obtains the i-th row of the chosen
matrix as an array of subpixels, where a 1 in the row corresponds to a black
subpixel and a 0 corresponds to a white subpixel. The subpixels are arranged in
a fixed pattern, e.g. a rectangle. (Note that in this model, stacking transparencies
corresponds to “computing” the OR of the subpixel arrays.)
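To make the definition concrete, the classical 2-out-of-2 scheme of [10] (m = 2 subpixels, threshold d = 2, contrast α = 1/2) can be encoded and checked against the properties of Definition 1. The following is a minimal illustrative sketch in Python, not code from the paper:

```python
# The classical 2-out-of-2 scheme with 2 subpixels: white pixels use
# matrices with identical rows, black pixels matrices with complementary
# rows. Stacking transparencies corresponds to OR-ing rows.
C0 = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]   # collection for white pixels
C1 = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]   # collection for black pixels

def stacked_weight(S):
    """Hamming weight of the OR of all rows of S (stacking = OR)."""
    return sum(max(col) for col in zip(*S))

# Contrast property (d = 2, alpha*m = 1): white stacks have weight <= 1,
# black stacks have weight >= 2.
assert all(stacked_weight(S) <= 1 for S in C0)
assert all(stacked_weight(S) >= 2 for S in C1)

# Security property: each single recipient sees the same multiset of rows
# (with the same frequencies) whether the pixel is white or black.
for i in range(2):
    assert sorted(tuple(S[i]) for S in C0) == sorted(tuple(S[i]) for S in C1)
```

The assertions mirror the three conditions of Definition 1 for k = n = 2; this is the scheme whose contrast 2^{−(k−1)} = 1/2 Naor and Shamir proved optimal.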
The third condition in Definition 1 is often referred to as the “security property” which guarantees that any k − 1 of the recipients cannot obtain any information out of their transparencies. The “contrast property”, represented by
the first two conditions in Definition 1, guarantees that k recipients are able to
recognize black pixels visually since any array of subpixels representing a black
pixel contains a “significant” amount of black subpixels more than any array
representing a white pixel.1
In [5], it was shown that the largest possible contrast C_{k,n} in a k-out-of-n
scheme coincides with the maximal profit in the following linear program (with
variables ξ0 , . . . , ξn and η0 , . . . , ηn ):
Linear Program LP(k, n)

    max Σ_{j=0}^{n−k} [ \binom{n−k}{j} / \binom{n−1}{j} ] · (ξ_j − η_j)   subject to

    1. For j = 0, . . . , n: ξ_j ≥ 0, η_j ≥ 0.
    2. Σ_{j=0}^{n} ξ_j = Σ_{j=0}^{n} η_j = 1.
    3. For l = 0, . . . , k − 1: Σ_{j=l}^{n−k+l+1} [ \binom{n−k+1}{j−l} / \binom{n−1}{j} ] · (ξ_j − η_j) = 0.
The following sections only use this linear program (and do not explicitly refer
to Definition 1).
We make the following conventions concerning matrices and vectors. For
matrix A, A′ denotes its transpose (resulting from A by exchanging rows and
¹ The basic notion of a secret sharing scheme, as given in Definition 1, has been
generalized in several ways. The generalized schemes in [1], for instance, intend to
achieve a situation where certain subsets of recipients can work successfully together,
whereas other subsets will gain no information. If the two classes of subsets are the
sets of at least k recipients and the sets of at most k − 1 recipients, respectively, we
obtain (as a special case) the schemes considered in this paper. Another model for
2-out-of-2 schemes involving three colors is presented in [11].
columns). A vector which is denoted by c is regarded as a column vector. Thus,
its transpose c′ is a row vector. The all-zeros (column) vector is denoted as 0.
For matrix A, Aj denotes its j-th row vector. A′j denotes the j-th row vector of
its transpose (as opposed to the transpose of the j’th row vector).
3 Approximation Error and Contrast
In Subsection 3.1, we relate the problem of finding the best k-out-of-n secret
sharing scheme to approximation problems of type BAV and BAP. Problem BAV
(Best Approximating Vector) asks for the “best approximation” of a given vector
c within a vector space V . Problem BAP (Best Approximating Polynomial) asks
for the “best approximation” of a given polynomial p of degree k within the set
of polynomials of degree k − 1 or less. It turns out that, choosing c, V, p properly,
the largest possible contrast is twice the smallest possible approximation error.
In Subsection 3.2, we use this relationship to determine lower and upper bounds.
Moreover, the largest possible contrast is determined exactly in the limit (when
n approaches infinity). In Subsection 3.3, we derive a criterion that helps to
determine those pairs (k, n) for which Ck,n coincides with its theoretical upper
bound from Subsection 3.2.
3.1 Secret Sharing Schemes and Approximation Problems
As explained in Section 2, the largest possible contrast in an k-out-of-n secret
sharing scheme is the maximal profit in linear program LP(k, n). The special
form exhibited by LP(k, n) is captured by the more abstract definitions of a
linear program of type BAV (Best Approximating Vector) or of type BAP (Best
Approximating Polynomial).
We start with the discussion of type BAV. We say that a linear program LP
is of type BAV if there exists a matrix A ∈ ℜk×(1+n) and a vector c ∈ ℜn+1 such
that LP (with variables ξ = (ξ0 , . . . , ξn ) and η = (η0 , . . . , ηn )) can be written in
the following form:
The primal linear program LP(A, c) of type BAV

    max c′(ξ − η)   subject to

    (LP1) ξ ≥ 0, η ≥ 0
    (LP2) Σ_{j=0}^{n} ξ_j = Σ_{j=0}^{n} η_j = 1
    (LP3) A(ξ − η) = 0

Condition (LP2) implies that

    Σ_{j=0}^{n} (ξ_j − η_j) = 0.
Thus, we could add the all-ones row vector (1, . . . , 1) to matrix A in (LP3)
without changing the set of legal solutions. For this reason, we assume in the
sequel that the following condition holds in addition to (LP1), (LP2), (LP3):
(LP4) The vector space VA spanned by the row vectors of A contains the all-ones
vector.
We aim to show that linear program LP(A, c) can be reformulated as the
problem of finding the “best” approximation of c in VA . To this end, we pass to
the dual problem2 (with variables s, t and u = (u0 , . . . , uk−1 )):
The dual linear program DLP(A, c) of type BAV

    min s + t   subject to

    (DLP1) A′u + (s, . . . , s)′ ≥ c
    (DLP2) A′u − (t, . . . , t)′ ≤ c
Conditions (DLP1) and (DLP2) are obviously equivalent to

    s ≥ max_{j=0,...,n} (c_j − A′_j u)   and   t ≥ max_{j=0,...,n} (A′_j u − c_j),

and an optimal solution certainly satisfies

    s = max_{j=0,...,n} (c_j − A′_j u)   and   t = max_{j=0,...,n} (A′_j u − c_j).

Note that the vector A′u is a linear combination of the row vectors of A. Thus,
V_A = {A′u | u ∈ ℜ^k}. DLP(A, c) can therefore be rewritten as follows:

    min_{v∈V_A} [ max_{j=0,...,n} (c_j − v_j) + max_{j=0,...,n} (v_j − c_j) ]

Consider a vector v ∈ V_A and let

    j_−(v) = arg max_{j=0,...,n} (c_j − v_j)   and   j_+(v) = arg max_{j=0,...,n} (v_j − c_j).

Term S(v) := c_{j_−(v)} − v_{j_−(v)} represents the penalty for v_{j_−(v)} being smaller than
c_{j_−(v)}. Symmetrically, L(v) := v_{j_+(v)} − c_{j_+(v)} represents the penalty for v_{j_+(v)}
being larger than c_{j_+(v)}. Note that the total penalty S(v) + L(v) does not change
if we translate v by a scalar multiple of the all-ones vector (1, . . . , 1)′. According
to (LP4), any translation of this form can be performed within V_A. Choosing
the translation of v appropriately, we can achieve S(v) = L(v), that is, a perfect
balance between the two penalty terms. Consequently, the total penalty for v
is twice the distance between c and v measured by the metric induced by the
maximum-norm. We thus arrive at the following result.
Theorem 1. Given linear program LP(A, c) of type BAV, the maximal profit C
in LP(A, c) satisfies

    C = 2 · min_{v∈V_A} max_{j=0,...,n} |c_j − v_j|.
² The rules describing how the dual linear program is obtained from a given primal
can be looked up in any standard text about linear programming (like [12], for
instance).
Thus, the problem of finding an optimal solution to LP(A, c) boils down to the
problem of finding a best approximation of c in VA w.r.t. the maximum-norm.
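The balancing step in the argument above can be replayed numerically. The sketch below (illustrative only, with an arbitrary c and candidate v) translates v along the all-ones direction and confirms that the total penalty is invariant while the two penalty terms equalize:

```python
# Balancing the penalties S(v) and L(v) by translating v with
# ((S - L)/2) * (1, ..., 1): the total penalty S + L is unchanged,
# and afterwards S = L = max_j |c_j - w_j|.
def penalties(c, v):
    S = max(cj - vj for cj, vj in zip(c, v))   # largest undershoot
    L = max(vj - cj for cj, vj in zip(c, v))   # largest overshoot
    return S, L

c = [0.0, 0.5, 2.0]           # target vector
v = [0.1, 0.2, 0.3]           # some candidate (stand-in for a member of V_A)
S, L = penalties(c, v)
delta = (S - L) / 2
w = [vj + delta for vj in v]  # translate by a multiple of the all-ones vector
S2, L2 = penalties(c, w)

assert abs((S2 + L2) - (S + L)) < 1e-12   # total penalty is invariant
assert abs(S2 - L2) < 1e-12               # and now perfectly balanced
```

After balancing, S + L = 2 · max_j |c_j − w_j|, which is exactly the factor 2 in Theorem 1.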
We now pass to the discussion of linear programs of type BAP. We call
d ∈ ℜ1+n evaluation-vector of polynomial p ∈ ℜ[X] if dj = p(j) for j = 0, . . . , n.
We say that a linear program LP(A, c) of type BAV is of type BAP if, in addition
to Conditions (LP1), . . . , (LP4), the following holds:
(LP5) c is the evaluation vector of a polynomial, say p, of degree k.
(LP6) Matrix A ∈ ℜk×(1+n) has rank k, i.e., its row vectors are linearly independent.
(LP7) For l = 0, . . . , k−1, row vector Al is the evaluation vector of a polynomial,
say ql , of degree at most k − 1.
Let P_m denote the set of polynomials of degree at most m. Conditions (LP6)
and (LP7) imply that V_A is the vector space of evaluation vectors of polynomials
from P_{k−1}. Theorem 1 implies that the maximal profit C in a linear program of
type BAP satisfies

    C = 2 · min_{q∈P_{k−1}} max_{j=0,...,n} |p(j) − q(j)|.
Let λ denote the leading coefficient of p. Thus p can be written as the sum of λX^k
and a polynomial in P_{k−1}. Obviously, p is as hard to approximate within P_{k−1}
as |λ|X^k. We obtain the following result:

Corollary 1. Given linear program LP(A, c) of type BAP, let p denote the polynomial
of degree k with evaluation vector c, and λ the leading coefficient of p.
Then the maximal profit C in LP(A, c) satisfies

    C = 2 · min_{q∈P_{k−1}} max_{j=0,...,n} | |λ| · j^k − q(j) |.
We introduce the notation

    n^{\underline{k}} = n(n − 1) · · · (n − (k − 1))

for so-called “falling powers” and proceed with the following result:

Lemma 1. The linear program LP(k, n) is of type BAP. The leading coefficient
of the polynomial p with evaluation vector c is (−1)^k / n^{\underline{k}}.
The proof of this lemma is obtained by a close inspection of LP(k, n) and a
(more or less) straightforward calculation.
Corollary 2. Let C_{k,n} denote the largest possible contrast in a k-out-of-n secret
sharing scheme. Then:

    C_{k,n} = 2 · min_{q∈P_{k−1}} max_{j=0,...,n} | j^k / n^{\underline{k}} − q(j) |.

Thus, the largest possible contrast in a k-out-of-n secret sharing scheme
is identical to twice the smallest “distance” between the polynomial X^k / n^{\underline{k}} and a
polynomial in P_{k−1}, where the “distance” between two polynomials is measured
as the maximum absolute difference of their evaluations on the points 0, 1, . . . , n.
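As a small worked check of this identity (our illustration, not from the paper), take k = 2 and n = 2, so that c_j = j² / n^{\underline{2}} = j²/2 gives c = (0, 1/2, 2). For an affine q(j) = aj + b, the equioscillation conditions c_0 − q(0) = E, c_1 − q(1) = −E, c_2 − q(2) = E read

    −b = E,   1/2 − a − b = −E,   2 − 2a − b = E,

whose solution is a = 1, b = −1/4, E = 1/4. Hence C_{2,2} = 2E = 1/2 = 2^{−(2−1)}, matching the optimal contrast of Naor and Shamir's 2-out-of-2 scheme.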
3.2 Lower and Upper Bounds
Finding the “best approximating polynomial” of X^k within P_{k−1} is a classical
problem in Approximation Theory. Most of the classical results are stated for
polynomials defined on the interval [−1, 1]. In order to recall these results and to
apply them to our problem at hand, the definition of the following metric will
be useful:

    d_∞(f, g) = max_{x∈[−1,1]} |f(x) − g(x)|                                    (1)

This definition makes sense for functions that are continuous on [−1, 1] (in particular
for polynomials). The metric implicitly used in Corollaries 1 and 2 is different
because the distance between polynomials is measured on a finite set of points
rather than on a continuous interval. For this reason, we consider the sequence

    z_j = −1 + 2j/n   for j = 0, . . . , n.                                      (2)

It forms a regular subdivision of the interval [−1, 1] of step width 2/n. The following
metric is a “discrete version” of d_∞:

    d_n(f, g) = max_{j=0,...,n} |f(z_j) − g(z_j)|.                               (3)
Let U_k(X) = X^k, and let U*_{k,∞} denote the best approximation of U_k within
P_{k−1} w.r.t. d_∞. Analogously, U*_{k,n} denotes the best approximation of U_k within
P_{k−1} w.r.t. d_n. Then

    D_{k,∞} = d_∞(U_k, U*_{k,∞})   and   D_{k,n} = d_n(U_k, U*_{k,n})            (4)

are the corresponding approximation errors. It is well known from Approximation Theory³ that

    U*_{k,∞}(X) = X^k − 2^{−(k−1)} · T_k(X),                                     (5)

where T_k denotes the Chebyshev polynomial of degree k (defined and visualized
in Figure 1). It is well known that T_k = cos(kθ) is a polynomial of degree k in
X = cos(θ) ∈ [−1, 1] with leading coefficient 2^{k−1}. Thus, U*_{k,∞} is indeed from
P_{k−1}. Since max_{−1≤x≤1} |T_k(x)| = 1, we get

    D_{k,∞} = 2^{−(k−1)}.                                                        (6)
Unfortunately, there is no such simple formula for D_{k,n} (the quantity we are
interested in). It is, however, easy to see that the following inequalities are valid:

    (1 − k²/n) · 2^{−(k−1)} ≤ D_{k,n} ≤ D_{k,∞} = 2^{−(k−1)}                     (7)

The inequality D_{k,n} ≤ D_{k,∞} is obvious because d_n(f, g) ≤ d_∞(f, g) for all f, g. The
first inequality can be derived from the fact that the first derivative of T_k is
bounded by k² on [−1, 1] (applying some standard tricks). We will improve on
this inequality later and present a proof for the improved statement.

³ See Chapter 1.2 in [13], for instance.

Fig. 1. The Chebyshev polynomial T_k of degree k for k = 1, 2, 3. T_k(X) = cos(kθ),
where 0 ≤ θ ≤ π and X = cos(θ).
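The upper bound in (7) can be checked numerically: since U*_{k,∞} is one (not necessarily optimal) candidate in P_{k−1} for the discrete problem, evaluating U_k − U*_{k,∞} = 2^{−(k−1)} T_k on the grid z_0, . . . , z_n certifies D_{k,n} ≤ 2^{−(k−1)}. A small self-contained sketch (ours, not from the paper):

```python
# Certify D_{k,n} <= 2^{-(k-1)} by evaluating the continuous best
# approximant on the discrete grid z_j = -1 + 2j/n.
def cheb(k, x):
    """Chebyshev polynomial T_k(x) via T_{j+1} = 2x*T_j - T_{j-1}."""
    t0, t1 = 1.0, x
    if k == 0:
        return t0
    for _ in range(k - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

def discrete_error(k, n):
    """d_n(U_k, U*_{k,inf}) = max_j |2^{-(k-1)} * T_k(z_j)|, an upper
    bound on D_{k,n}."""
    zs = [-1 + 2 * j / n for j in range(n + 1)]
    return max(abs(2 ** -(k - 1) * cheb(k, z)) for z in zs)

for k in range(1, 6):
    for n in range(k, 30):
        assert discrete_error(k, n) <= 2 ** -(k - 1) + 1e-12
```

The lower bound of (7) is not certified this way; it is exactly the part the paper improves in (13).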
Quantities D_{k,n} and C_{k,n} are already indirectly related by Corollary 2. In
order to get the precise relation, we have to apply the linear transformation
X → (n/2)(X + 1), because the values attained by a function f(X) on X = 0, . . . , n
coincide with the values attained by the function f((n/2)(X + 1)) on X = z_0, . . . , z_n.
This transformation, applied to a polynomial of degree k with leading coefficient
λ, leads to a polynomial of the same degree with leading coefficient λ(n/2)^k. The
results corresponding to Corollaries 1 and 2 now read as follows:

Corollary 3. Given a linear program LP(A, c) of type BAP, let p denote the
polynomial of degree k with evaluation vector c, and λ the leading coefficient of
p. Then the maximal profit C in LP(A, c) satisfies

    C = 2 · |λ| · (n/2)^k · D_{k,n}.

Plugging in (−1)^k / n^{\underline{k}} for λ, we obtain
Corollary 4. The largest possible contrast in a k-out-of-n secret sharing scheme
satisfies

    C_{k,n} = (n^k / n^{\underline{k}}) · 2^{−(k−1)} · D_{k,n}.
Since D_{k,∞} = 2^{−(k−1)}, we get the following result:

Corollary 5. The limit of the largest possible contrast in a k-out-of-n secret
sharing scheme, when n approaches infinity, satisfies

    C_{k,∞} = lim_{n→∞} C_{k,n} = 4^{−(k−1)}.
The derivation of C_{k,∞} from D_{k,∞} profited from the classical Equation (6)
from Approximation Theory. For n = k, we can go the other way and derive
D_{k,k} from the fact (see [10]) that the largest possible contrast in a k-out-of-k
secret sharing scheme is 2^{−(k−1)}:

    C_{k,k} = 2^{−(k−1)}                                                         (8)

Applying Corollary 4, we obtain

    D_{k,k} = k! / k^k                                                           (9)

According to Stirling's formula, this quantity is asymptotically equal to √(2πk) · e^{−k}.
Equation (9) presents the precise value for the smallest possible approximation
error when X^k is approximated by a polynomial of degree k − 1 or less, and the
distance between polynomials is measured by the metric d_k.
Sequence Ck,n monotonically decreases with n because the secret sharing
scheme becomes harder to design when more people are going to share the secret
(and threshold k is fixed). Thus, the unknown value for Ck,n must be somewhere
between C_{k,∞} = 4^{−(k−1)} and C_{k,k} = 2^{−(k−1)}. We do not expect the sequence
D_{k,n} to be perfectly monotone. However, we know that D_{k,n} ≤ D_{k,∞}. If
n is a multiple of k, the regular subdivision of [−1, 1] with step width 2/n is a
refinement of the regular subdivision of [−1, 1] with step width 2/k. This implies
D_{k,n} ≥ D_{k,k}.
Figure 2 presents an overview of the results obtained so far. An edge from
a to b with label s should be interpreted as b = s · a. For instance, the edges
with labels r_{k,n}, r′_{k,n}, s′_{k,n}, s_{k,n} represent the equations

    C_{k,n} = r_{k,n} · C_{k,∞}   with r_{k,n} ≥ 1,
    C_{k,k} = r′_{k,n} · C_{k,n}   with r′_{k,n} ≥ 1,
    D_{k,n} = s′_{k,n} · D_{k,k}   with s′_{k,n} ≥ 1 if n is a multiple of k,
    D_{k,∞} = s_{k,n} · D_{k,n}   with s_{k,n} ≥ 1,

respectively. The edges between C_{k,n} and D_{k,n} explain how D_{k,n} is derived from
C_{k,n} and vice versa, i.e., these edges represent Corollary 4. Figure 2 can be used
to obtain approximations for the unknown parameters r_{k,n}, r′_{k,n}, s′_{k,n}, s_{k,n}. The
simple path from C_{k,∞} = 4^{−(k−1)} to D_{k,∞} = 2^{−(k−1)} corresponds to the equation

    2^{−(k−1)} = s_{k,n} · 2^{k−1} · (n^{\underline{k}} / n^k) · r_{k,n} · 4^{−(k−1)}.

Using r_{k,n} ≥ 1, s_{k,n} ≥ 1 and performing some cancellation, we arrive at

    r_{k,n} · s_{k,n} = n^k / n^{\underline{k}}.                                 (10)

A similar computation associated with the simple path from D_{k,k} to C_{k,k} leads
to

    r′_{k,n} · s′_{k,n} = (k^k · n^{\underline{k}}) / (k! · n^k) = \binom{n}{k} · (k/n)^k.   (11)
Fig. 2. Sequence C_{k,n}, sequence D_{k,n}, and the relations between them.
The following bounds on C_{k,n} and D_{k,n} are now evident from Figure 2
and (10):

    4^{−(k−1)} ≤ C_{k,n} = r_{k,n} · 4^{−(k−1)} ≤ (n^k / n^{\underline{k}}) · 4^{−(k−1)}     (12)

    (n^{\underline{k}} / n^k) · 2^{−(k−1)} ≤ 2^{−(k−1)} / s_{k,n} = D_{k,n} ≤ 2^{−(k−1)}     (13)

In both cases, the upper bound exceeds the lower bound by a factor of n^k / n^{\underline{k}} only
(approaching 1 when n approaches infinity).⁴ An elementary computation⁵ shows
that 1 − k²/n < n^{\underline{k}} / n^k ≤ 1 holds for all 1 ≤ k ≤ n. Thus, (13) improves on the
classical Inequality (7) from Approximation Theory.
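The elementary bound on the falling-power ratio is easy to verify exhaustively for small parameters; the following sketch (ours, not the paper's) does so:

```python
# Check 1 - k^2/n < n_falling_k / n^k <= 1 for all 1 <= k <= n in a range.
def falling(n, k):
    """Falling power n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

for n in range(1, 60):
    for k in range(1, n + 1):
        ratio = falling(n, k) / n ** k
        assert 1 - k * k / n < ratio <= 1
```

The inequality follows from ∏_{i<k}(1 − i/n) ≥ 1 − k(k−1)/(2n) > 1 − k²/n, which is the "elementary computation" alluded to above.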
Although bounds (12) and (13) are excellent for large n, they are quite poor
when n comes close to k. In this case, however, we obtain from Figure 2 and (11)

    ((n/k)^k / \binom{n}{k}) · 2^{−(k−1)} ≤ C_{k,n} ≤ 2^{−(k−1)},                (14)

    k! / k^k ≤ D_{k,n} ≤ n^{\underline{k}} / n^k,                                (15)

where the first inequality in (15) is only guaranteed if n is a multiple of k. These
bounds are tight for n = k.
⁴ Because of (10), the two gaps cannot be maximal simultaneously. For instance, at
least one of the upper bounds exceeds the corresponding lower bound by at most a
factor of √(n^k / n^{\underline{k}}).
⁵ Making use of e^{−2x} ≤ 1 − x ≤ e^{−x}, where the first inequality holds for all x ∈ [0, 1/2]
and the second for all x ∈ ℜ.
3.3 Discussion
Paper [5] presented some explicit formulas for C_{k,n}, but only for small values of
k. For instance, it was shown that C_{k,n} matches 4^{−(k−1)} · n^k / n^{\underline{k}} (which is the
theoretical upper bound, see (12)) if k = 2 and n is even, and if k = 3 and n is a
multiple of 4.
However, computer experiments (computing C_{k,n} by solving LP(k, n) and
comparing it with the theoretical upper bound from (12)) support the conjecture
that there is no such coincidence for most other choices of k, n. The goal of this
subsection is to provide a simple explanation for this phenomenon. Exploiting
basic results from Approximation Theory concerning best approximating polynomials
on finite subsets of the real line (see, e.g., Theorem 1.7 and Theorem
1.11 from [13]), it is possible to derive the following
Theorem 2. It holds that C_{k,n} = 4^{−(k−1)} · n^k / n^{\underline{k}} iff E_k ⊆ Z_n, where

    E_k = { cos( (k − i)π / k ) | i = 0, . . . , k },
    Z_n = { z_0, . . . , z_n } = { −1 + 2i/n | i = 0, . . . , n }.
Due to lack of space, for the proof of this result we refer to the journal version
of this paper. It is quite straightforward to derive that E_2 ⊆ Z_n iff n is even,
that E_3 ⊆ Z_n iff n is divisible by 4, and that E_k ⊄ Z_n for all n and k ≥ 4, as E_k
contains irrational numbers. Consequently,

Corollary 6. It holds that C_{k,n} = 4^{−(k−1)} · n^k / n^{\underline{k}} iff k = 2 and n is even, or
k = 3 and n is a multiple of 4.
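The membership condition of Theorem 2 is easy to test numerically for small k; the sketch below (ours, not from the paper) reproduces the case distinction of Corollary 6:

```python
from math import cos, pi, isclose

# Test E_k ⊆ Z_n: every extremal point cos((k-i)π/k) must hit a grid
# point -1 + 2i/n (up to floating-point tolerance).
def Ek_subset_Zn(k, n, tol=1e-9):
    E = [cos((k - i) * pi / k) for i in range(k + 1)]
    Z = [-1 + 2 * i / n for i in range(n + 1)]
    return all(any(isclose(e, z, abs_tol=tol) for z in Z) for e in E)

# k = 2: holds exactly for even n (E_2 = {-1, 0, 1}).
assert [n for n in range(2, 13) if Ek_subset_Zn(2, n)] == [2, 4, 6, 8, 10, 12]
# k = 3: holds exactly for multiples of 4 (E_3 = {-1, -1/2, 1/2, 1}).
assert [n for n in range(3, 13) if Ek_subset_Zn(3, n)] == [4, 8, 12]
# k = 4: never holds (E_4 contains the irrational ±sqrt(2)/2).
assert not any(Ek_subset_Zn(4, n) for n in range(4, 40))
```

The tolerance only absorbs floating-point noise in cos; the irrational extrema of E_4 stay far from every grid point in the tested range.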
We conclude the paper with a final remark and an open problem. Based on
the results of this paper, Kuhlmann and Simon [6] were able to design arbitrary
k-out-of-n secret sharing schemes with asymptotically optimal contrast. More
precisely, the contrast achieved by their schemes is optimal up to a factor of at
most 1−k 2 /n. For moderate values of k and n, these schemes are satisfactory. For
large values of n, they use too many subpixels. It is an open problem to determine
(as precisely as possible) the tradeoff between the contrast (which should be large)
and the number of subpixels (which should be small).
References
1. G. Ateniese, C. Blundo, A. De Santis, D. R. Stinson, Visual Cryptography for
General Access Structures, Proc. of ICALP 96, Springer, 416-428, 1996.
2. C. Blundo, A. De Santis, D. R. Stinson, On the contrast in visual cryptography
schemes, Journal of Cryptology 12 (1999), 261-289.
3. C. Blundo, P. D'Arco, A. De Santis, and D. R. Stinson,
Contrast optimal threshold visual cryptography schemes, Technical report, 1998
(see http://www.cacr.math.uwaterloo.ca/˜dstinson/#Visual Cryptography).
To appear in SIAM Journal on Discrete Mathematics.
4. S. Droste, New Results on Visual Cryptography, in “Advances in Cryptology” CRYPTO ’96, Springer, pp. 401-415, 1996.
5. T. Hofmeister, M. Krause, H. U. Simon, Contrast-Optimal k out of n Secret Sharing
Schemes in Visual Cryptography, in “Proceedings of the 3rd International Conference on Computing and Combinatorics” - COCOON ’97, Springer, pp. 176-186,
1997. Full version will appear in Theoretical Computer Science.
6. C. Kuhlmann, H. U. Simon, Construction of Visual Secret Sharing Schemes with
Almost Optimal Contrast. Submitted for publication.
7. R. Lidl, H. Niederreiter, Introduction to finite fields and their applications, Cambridge University Press, 1994.
8. N. Linial, N. Nisan, Approximate inclusion-exclusion, Combinatorica 10, 349-365,
1990.
9. J. H. van Lint, R. M. Wilson, A course in combinatorics, Cambridge University
Press, 1996.
10. M. Naor, A. Shamir, Visual Cryptography, in “Advances in Cryptology - Eurocrypt
94”, Springer, 1-12, 1995.
11. M. Naor, A. Shamir, Visual Cryptography II: Improving the Contrast via the Cover
Base, in Proc. of “Security protocols: international workshop 1996”, Springer LNCS
1189, 69-74, 1997.
12. Christos H. Papadimitriou and Kenneth Steiglitz, Combinatorial Optimization:
Algorithms and Complexity, Prentice Hall, 1982.
13. Theodore J. Rivlin, An Introduction to the Approximation of Functions, Blaisdell
Publishing Company, 1969.
Average-Case Analysis of Rectangle Packings
E.G. Coffman, Jr.1 , George S. Lueker2 , Joel Spencer3 , and Peter M. Winkler4
¹ New Jersey Institute of Technology, Newark, NJ 07102
² University of California, Irvine, Irvine, CA 92697
³ New York University, New York, NY 10003
⁴ Bell Labs, Lucent Technologies, Murray Hill, NJ 07974
Abstract. We study the average-case behavior of algorithms for finding
a maximal disjoint subset of a given set of rectangles. In the probability
model, a random rectangle is the product of two independent random
intervals, each being the interval between two points drawn uniformly at
random from [0, 1]. We have proved that the expected cardinality of a
maximal disjoint subset of n random rectangles has the tight asymptotic
bound Θ(n^{1/2}). Although tight bounds for the problem generalized to
d > 2 dimensions remain an open problem, we have been able to show
that Ω(n^{1/2}) and O((n log^{d−1} n)^{1/2}) are asymptotic lower and upper
bounds. In addition, we can prove that Θ(n^{d/(d+1)}) is a tight asymptotic
bound for the case of random cubes.
1 Introduction
We estimate the expected cardinality of a maximal disjoint subset of n rectangles
chosen at random in the unit square. We say that such a subset is a packing
of the n rectangles, and stress that a rectangle is specified by its position as
well as its sides; it cannot be freely moved to an arbitrary position, as in strip
packing or two-dimensional bin packing (see [2] and the references therein for
the probabilistic analysis of algorithms for these problems). A random rectangle
is the product of two independent random intervals on the coordinate axes; each
random interval in turn is the interval between two independent random draws
from a distribution G on [0, 1].
This problem is an immediate generalization of the one-dimensional problem
of packing random intervals [3]. It also generalizes in an obvious way to packing
random rectangles (boxes) in d > 2 dimensions into the d-dimensional unit cube,
where each such box is determined by 2d independent random draws from [0, 1],
two for every dimension. A later section also studies the case of random cubes in
d ≥ 2 dimensions. For this case, to eliminate irritating boundary effects that do
not influence asymptotic behavior, we wrap around the dimensions of the unit
cube to form a toroid. In terms of an arbitrarily chosen origin, a random cube is
then determined by d + 1 random variables, the first d locating the vertex closest
to the origin, and the last giving the size of the cube, and hence the coordinates
of the remaining 2d − 1 vertices. Each random variable is again an independent
random draw from the distribution G.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 292–297, 2000.
c Springer-Verlag Berlin Heidelberg 2000
Applications of our model appear in jointly scheduling multiple resources,
where customers require specific “intervals” of a resource or they require a resource for specific intervals of time. An example of the former is a linear communication network and an example of the latter is a reservation system. In a
linear network, we have a set S of call requests, each specifying a pair of endpoints (calling parties) that define an interval of the network. If we suppose also
that each request gives a future time interval to be reserved for the call, then a
call request is a rectangle in the two dimensions of space and time. In an unnormalized and perhaps discretized form, we can pose our problem of finding the
expected value of the number of requests in S that can be accommodated.
The complexity issue for the combinatorial version of our problem is easily settled. Consider the two-dimensional case, and in particular a collection of
equal size squares. In the associated intersection graph there is a vertex for each
square and an edge between two vertices if and only if the corresponding squares
overlap. Then our packing problem specialized to equal size squares becomes the
problem of finding maximal independent sets in intersection graphs. It is easy to
verify that this problem is NP-complete. For example, one can use the approach
in [1] which was applied to equal size circles; the approach is equally applicable to
equal size squares. We conclude that for any fixed d ≥ 2, our problem of finding
maximal disjoint subsets of rectangles is NP-complete, even for the special case
of equal size cubes. As a final note, we point out that, in contrast to higher dimensions, the one-dimensional (interval) problem has a polynomial-time solution
[3].
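For contrast with the NP-complete higher-dimensional cases, the one-dimensional problem can be solved in polynomial time; the classical earliest-finish greedy rule gives a maximum-cardinality set of pairwise disjoint intervals. The sketch below illustrates the idea (it is our own illustration, not necessarily the algorithm of [3]):

```python
def max_disjoint_intervals(intervals):
    """Maximum-cardinality set of pairwise disjoint half-open intervals:
    sort by right endpoint, then greedily take each interval that does
    not overlap the last one chosen."""
    chosen, last_end = [], float("-inf")
    for lo, hi in sorted(intervals, key=lambda iv: iv[1]):
        if lo >= last_end:
            chosen.append((lo, hi))
            last_end = hi
    return chosen
```

An exchange argument shows the greedy choice is never worse than any other selection, which is what fails in two or more dimensions.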
Let Sn be a given set of random boxes, and let Cn be the maximum cardinality of any set of mutually disjoint boxes taken from Sn . After preliminaries in
the next section, Section 3 proves that, in the case of cubes in d ≥ 2 dimensions,
E[Cn] = Θ(n^{d/(d+1)}), and Section 4 proves that, in the case of boxes in d dimensions, E[Cn] = Ω(n^{1/2}) and E[Cn] = O((n log^{d−1} n)^{1/2}). Section 5 contains our strongest result, which strengthens the above bounds for d = 2 by presenting a tight O(n^{1/2}) upper bound. We sketch a proof that relies on a similar result for
a reduced, discretized version of the two-dimensional problem.
2   Preliminaries
We restrict the packing problem to continuous endpoint distributions G. Within
this class, our results are independent of G, because the relevant intersection
properties of G depend only on the relative ordering of the points that determine
the intervals in each dimension. Thus, for simplicity, we assume hereafter that
G is the uniform distribution on [0, 1].
It is also easily verified that we can Poissonize the problem without affecting
our results. In this version, the number of rectangles is a Poisson distributed
random variable Tn with mean n, and we let C(n) denote the number packed
in a maximal disjoint subset. We will continue to parenthesize arguments in the
notation of the Poissonized model so as to distinguish quantities like Cn in the
model where the number of rectangles to pack is fixed at n.
294
E.G. Coffman, Jr. et al.
Let X1 , . . . , Xn be i.i.d. with a distribution F concentrated on [0, 1]. We
assume that F is regularly varying at 0 in that it is strictly increasing and that,
for some ξ > 0, some constants K, K ′ ∈ (0, 1), and all x ∈ (0, ξ), it satisfies
F(x/2)/F(x) ∈ [K, K′]. For (sn ∈ (0, 1]; n ≥ 1) a given sequence, let Nn(F, sn) be the
maximum number of the Xi that can be chosen such that their sum is at most
nsn on average. Equivalently, in terms of expected values, Nn is such that the
sum of the smallest Nn of the Xi is bounded by nsn , but the sum of the smallest
Nn + 1 of the Xi exceeds nsn .
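As a concrete numerical illustration (our own, with hypothetical helper names): take F uniform on [0, 1] (so F(x) = x) and sn = 1/n. Then Lemma 1 below predicts xn = √(2/n) and Nn ∼ nF(xn) = √(2n), and the empirical analogue of Nn on a random sample agrees with this to within sampling error:

```python
import random

def n_smallest_within_budget(xs, budget):
    """Largest m such that the sum of the m smallest values in xs is at
    most budget -- the empirical analogue of N_n(F, s_n)."""
    total, m = 0.0, 0
    for x in sorted(xs):
        if total + x > budget:
            break
        total += x
        m += 1
    return m
```

For n = 200000 uniforms and budget n·sn = 1, the returned count concentrates near √(2n) ≈ 632.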
Standard techniques along with a variant of Bernstein’s Theorem suffice to
prove the following technical lemma.
Lemma 1. With F and (sn, n ≥ 1) as above, let xn be the solution to

    sn = ∫_0^{xn} x dF(x),    (1)

and assume the sn are such that lim_{n→∞} xn = 0. Then if lim_{n→∞} nF(xn) = ∞ and nF(xn) = Ω(log² s_n^{−1}), we have

    E[Nn(F, sn)] ∼ nF(xn).    (2)

3   Random Cubes
The optimum packing of random cubes is readily analyzed. We work with a d-dimensional unit cube, and allow (toroidal) wrapping in all axes. The n cubes are generated independently as follows: First a vertex (v1, v2, . . . , vd) is generated by drawing each vi independently from the uniform distribution on [0, 1]. Then
one more value w is drawn independently, again uniformly from [0, 1]. The cube
generated is
[v1 , v1 + w) × [v2 , v2 + w) × · · · × [vd , vd + w),
where each coordinate is taken modulo 1. In this set-up, we have the following
result.
Theorem 1. The expected cardinality of a maximum packing of n random cubes
is Θ(nd/(d+1) ).
Proof: For the lower bound consider the following simple heuristic. Subdivide the cube into c^{−d} cells with sides

    c = α n^{−1/(d+1)},
where α is a parameter that may be chosen to optimize constants. For each cell
C, if there are any generated cubes contained in C, include one of these in the
packing. Clearly, all of the cubes packed are nonoverlapping.
One can now show that the probability that a generated cube fits into a particular cell C is c^{d+1}/(d + 1), and so the probability that C remains empty after generating all n cubes is

    (1 − c^{d+1}/(d + 1))^n = (1 − α^{d+1}/(n(d + 1)))^n ∼ exp(−α^{d+1}/(d + 1)).

Since the number of cells is 1/c^d = n^{d/(d+1)}/α^d, the expectation of the total number of cubes packed is

    α^{−d} (1 − exp(−α^{d+1}/(d + 1))) n^{d/(d+1)},
which gives the desired lower bound.
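The fit probability used in this argument can be checked numerically: with a cell [0, c]^d fixed and boundary effects ignored (the torus removes them), a cube of side w < c fits iff each vertex coordinate falls in a window of length c − w, so P = ∫_0^c (c − w)^d dw = c^{d+1}/(d + 1). A midpoint-rule quadrature (our own sketch, not code from the paper) confirms this:

```python
def cell_fit_probability(c, d, steps=200_000):
    """Numerically integrate P(random cube fits a fixed cell of side c):
    the side w is uniform on [0, 1]; conditioned on w < c, each of the d
    vertex coordinates must fall in a window of length c - w, giving
    P = integral_0^c (c - w)^d dw = c^(d+1) / (d + 1)."""
    h = c / steps
    return sum((c - (i + 0.5) * h) ** d for i in range(steps)) * h
```

The quadrature matches the closed form c^{d+1}/(d + 1) to high precision.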
The upper bound is based on the simple observation that the sum of the
volumes of the packed cubes is at most 1. First we consider the probability
distribution of the volume of a single generated cube. The side of this cube is a
uniform random variable U over [0, 1]. Thus the probability that its volume is
bounded by z is

    F(z) = Pr{U^d ≤ z} = Pr{U ≤ z^{1/d}} = z^{1/d}.

Then applying Lemma 1 with sn = 1/n, xn = ((d + 1)/n)^{d/(d+1)}, and F(xn) = ((d + 1)/n)^{1/(d+1)}, we conclude that the expected number of cubes selected before their total volume exceeds 1 is asymptotic to (d + 1)^{1/(d+1)} n^{d/(d+1)}, which gives the desired matching upper bound.
4   Bounds for d ≥ 2 Dimensional Boxes
Let Hd denote the unit hypercube in d ≥ 1 dimensions. The approach of the
last section can also be used to prove asymptotic bounds for the case of random
boxes in Hd .
Theorem 2. Fix d and draw n boxes independently and uniformly at random from Hd. The maximum number that can be packed is asymptotically bounded from below by Ω(√n) and from above by O(√(n ln^{d−1} n)).
Proof sketch: The lower bound argument is the same as that for cubes, except
that Hd is partitioned into cells with sides on the order of n^{−1/(2d)}. It is easy to verify that, on average, a constant fraction of the n^{1/2} cells each wholly contain at least one of the given rectangles.
To apply Lemma 1 in a proof of the upper bound, one first conducts an
asymptotic analysis of the distribution Fd , the volume of a d-dimensional box,
which shows that

    dFd(x) ∼ (2^d/(d − 1)!) ln^{d−1} x^{−1}.

Then, with sn = 1/n, we obtain

    xn ∼ √((d − 1)!/(n ln^{d−1} n))   and   Fd(xn) ∼ 2 √(ln^{d−1} n/((d − 1)! n)),

which together with Lemma 1 yields the desired upper bound.
5   Tight Bound for d = 2
Closing the gaps left by the bounds on E[Cn ] for d ≥ 3 remains an interesting
open problem. However, one can show that the lower bound for d = 2 is tight, i.e., E[Cn] = Θ(n^{1/2}). To outline the proof of the O(n^{1/2}) bound, we first introduce
the following reduced, discretized version. A canonical interval is an interval
that, for some i ≥ 0, has length 2−i and has a left endpoint at some multiple
of 2−i . A canonical rectangle is the product of two canonical intervals. In the
reduced, rectangle-packing problem, a Poissonized model of canonical rectangles
is assumed in which the number of rectangles of area a is Poisson distributed with
mean λa², independently for each possible a. Let C∗(λ) denote the cardinality
of a maximum packing for an instance of the reduced problem with parameter
λ.
Note that there are i + 1 shapes possible for a rectangle of area 2^{−i}, and that for each of these shapes there are 2^i canonical rectangles. The mean number of each of these is λ/2^{2i}. Thus, the total number T(λ) of rectangles in the reduced problem with parameter λ is Poisson distributed with mean

    Σ_{i=0}^{∞} (i + 1) 2^i (λ 2^{−2i}) = λ Σ_{i=0}^{∞} (i + 1) 2^{−i} = 4λ.    (3)
To convert an instance of the original problem to an instance of the reduced
problem, we proceed as follows. It can be seen that any interval in H1 contains either one or two canonical intervals of maximal length. Let the canonical
subinterval I ′ of an interval I be the maximal canonical interval in I, if only
one exists, and one such interval chosen randomly otherwise. A straightforward
analysis shows that a canonical subinterval I′ = [k2^{−i}, (k + 1)2^{−i}) has probability 0 if it touches a boundary of H1, and has probability (3/2) 2^{−2i} otherwise. The canonical subrectangle R′ of a rectangle R is defined by applying the above separately to both coordinates. Extending the calculations to rectangles, we get (9/4)a² as the probability of a canonical subrectangle R′ of area a, if R′ does not touch the boundary of H2, and 0 otherwise. Now consider a random family of rectangles
{Ri }, of which a maximum of C(n) can be packed in H2 . This family generates
a random family of canonical subrectangles {Ri′ }. The maximum number C ′ (n)
of the Ri′ that can be packed trivially satisfies C(n) ≤ C ′ (n). Since the number
of each canonical subrectangle of area a that does not touch a boundary is Poisson distributed with mean 9na²/4, we see that an equivalent way to generate a
random family {Ri′ } is simply to remove from a random instance of the reduced
problem with parameter 9n/4 all those rectangles touching a boundary. It follows easily that E[C(n)] ≤ E[C′(n)] ≤ E[C∗(9n/4)], so if we can prove that E[C∗(9n/4)], or more simply E[C∗(n)], has the O(n^{1/2}) upper bound, then we are done.
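The canonical-subinterval construction above is algorithmic: scan levels i = 0, 1, 2, . . . and stop at the first level at which some interval [k 2^{−i}, (k + 1) 2^{−i}) fits inside the given interval; at that level at most two such intervals exist (three would force a larger canonical interval to fit). A small sketch (our own helper, not code from the paper):

```python
import math

def maximal_canonical_subintervals(a, b, max_level=40):
    """Canonical intervals [k*2^-i, (k+1)*2^-i) of maximal length
    contained in [a, b); there are either one or two of them."""
    for i in range(max_level + 1):
        scale = 2.0 ** (-i)
        k = math.ceil(a / scale)
        found = []
        while (k + 1) * scale <= b:
            found.append((k * scale, (k + 1) * scale))
            k += 1
        if found:
            return found  # at the first fitting level, at most two fit
    return []
```

When two maximal canonical subintervals exist, the construction in the text picks one of them at random.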
The following observations bring out the key recursive structure of maximal
packings in the reduced problem. Let Z1 be the maximum number of rectangles that can be packed if we disallow packings that use rectangles spanning the
height of the square. Define Z2 similarly when packings that use rectangles spanning the width of the square are disallowed. By symmetry, Z1 and Z2 have the
same distribution, although they may not be independent. To find this distribution, we begin by noting that (i) a rectangle spanning the width of H2 and one
spanning the height of H2 must intersect and hence can not coexist in a packing; (ii) rectangles spanning the height of H2 are the only rectangles crossing
the horizontal line separating the top and bottom halves of H2 and rectangles
spanning the width of H2 are the only ones crossing the vertical line separating
the left and right halves of H2 . It follows that, if a maximum cardinality packing
is not just a single 1 × 1 square, then it consists of a pair of disjoint maximum
cardinality packings, one in the bottom half and one in the top half of H2 , or
a similar pair of subpackings, one in the left half and one in the right half of
H2 . After rescaling, these subpackings become solutions to our original problem
on H2 with the new parameter λ times the square of half the area of H2 , i.e.,
λ/4. We conclude that Z1 and Z2 are distributed as the sum of two independent
samples of C ∗ (λ/4), and that
    C∗(λ) ≤ Z0 + max(Z1, Z2),
where Z0 is the indicator function of the event that the entire square is one of
the given rectangles. Note that Z0 is independent of Z1 and Z2 .
To exploit the above recursion, it is convenient to work in terms of the generating function S(λ) := E[e^{αC∗(λ)}]. One can show that S(λ) ≤ 2e^α (S(λ/4))², and that a solution to this relation along with the inequality E[C∗(λ)] ≤ α^{−1} ln E[e^{αC∗(λ)}] yields the desired bound, E[C∗(λ)] = O(λ^{1/2}).
Acknowledgment
In the early stages of this research, we had useful discussions with Richard Weber,
which we gratefully acknowledge.
References
1. Clark, B. N., Colbourn, C. J., and Johnson, D. S., “Unit Disk Graphs,” Discrete Mathematics, 86 (1990), 165–177.
2. Coffman, E. G., Jr. and Lueker, G. S., An Introduction to the Probabilistic Analysis
of Packing and Partitioning Algorithms, Wiley & Sons, New York, 1991.
3. Justicz, J., Scheinerman, E. R., and Winkler, P. M., “Random Intervals,” Amer. Math. Monthly, 97 (1990), 881–889.
Heights in Generalized Tries and PATRICIA Tries⋆
Charles Knessl¹ and Wojciech Szpankowski²
¹ Dept. Mathematics, Statistics & Computer Science, University of Illinois, Chicago, Illinois 60607-7045, USA
² Dept. Computer Science, Purdue University, W. Lafayette, IN 47907, USA
Abstract. We consider digital trees such as (generalized) tries and PATRICIA tries, built from n random strings generated by an unbiased
memoryless source (i.e., all symbols are equally likely). We study limit
laws of the height which is defined as the longest path in such trees. For
tries, in the region where most of the probability mass is concentrated,
the asymptotic distribution is of extreme value type (i.e., double exponential distribution). Surprisingly enough, the height of the PATRICIA
trie behaves quite differently in this region: it exhibits an exponential of a Gaussian distribution (with an oscillating term) around the most probable value k1 = ⌊log2 n + √(2 log2 n) − 3/2⌋ + 1. In fact, the asymptotic
distribution of PATRICIA height concentrates on one or two points. For
most n all the mass is concentrated at k1 , however, there exist subsequences of n such that the mass is on the two points k1 − 1 and k1 ,
or k1 and k1 + 1. We derive these results by a combination of analytic
methods such as generating functions, Mellin transform, the saddle point
method and ideas of applied mathematics such as linearization, asymptotic matching and the WKB method.
1   Introduction
Data structures and algorithms on words have experienced a new wave of interest due to a number of novel applications in computer science, communications,
and biology. These include dynamic hashing, partial match retrieval of multidimensional data, searching and sorting, pattern matching, conflict resolution
algorithms for broadcast communications, data compression, coding, security,
genes searching, DNA sequencing, genome maps, IP-addresses lookup on the
internet, and so forth. To satisfy these diversified demands various data structures were proposed for these algorithms. Undoubtly, the most popular data
structures for algorithms on words are digital trees [9,12] (e.g., tries, PATRICIA
tries, digital search trees), and suffix trees [6,18].
The most basic digital tree is known as a trie (the name comes from retrieval).
The primary purpose of a trie is to store a set S of strings (words, keys), say
⋆ The work was supported by NSF Grant DMS-93-00136 and DOE Grant DE-FG02-93ER25168, as well as by NSF Grants NCR-9415491 and NCR-9804760.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 298–307, 2000.
© Springer-Verlag Berlin Heidelberg 2000
S = {X1 , . . . , Xn }. Each word X = x1 x2 x3 . . . is a finite or infinite string of
symbols taken from a finite alphabet. Throughout the paper, we deal only with
the binary alphabet {0, 1}, but all our results should be extendable to a general
finite alphabet. A string will be stored in a leaf of the trie. The trie over S is
built recursively as follows: For |S| = 0, the trie is, of course, empty. For |S| = 1,
trie(S) is a single node. If |S| > 1, S is split into two subsets S0 and S1 so that
a string is in Sj if its first symbol is j ∈ {0, 1}. The tries trie(S0 ) and trie(S1 )
are constructed in the same way except that at the k-th step, the splitting of
sets is based on the k-th symbol of the underlying strings.
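The recursive construction just described translates directly into code. The sketch below (our own helper name; it assumes distinct binary strings, none a prefix of another) performs the splits and returns the height, i.e., the longest root-to-leaf path counted in edges:

```python
def trie_height(strings, depth=0):
    """Height of the trie built over a set of distinct binary strings,
    following the recursive construction: split on the symbol at the
    current depth and recurse into the nonempty parts."""
    if len(strings) <= 1:
        return 0  # empty trie or a single leaf node
    s0 = [s for s in strings if s[depth] == "0"]
    s1 = [s for s in strings if s[depth] == "1"]
    height = 0
    for part in (s0, s1):
        if part:
            height = max(height, 1 + trie_height(part, depth + 1))
    return height
```

For example, the trie over {"00", "01", "1"} has height 2: the strings "00" and "01" share their first symbol and are only separated at the second.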
There are many possible variations of the trie. One such variation is the b-trie, in which a leaf is allowed to hold as many as b strings (cf. [12,18]). A second variation of the trie, the PATRICIA trie, eliminates the waste of space caused by
nodes having only one branch. This is done by collapsing one-way branches into
a single node. In a digital search tree (in short DST) strings are directly stored
in nodes, and hence external nodes are eliminated. The branching policy is the
same as in tries. The reader is referred to [6,9,12] for a detailed description of
digital trees. Here, we consider tries and PATRICIA tries built over n randomly
generated strings of binary symbols. We assume that every symbol is equally
likely, thus we are within the framework of the so called unbiased memoryless
model. Our interest lies in establishing asymptotic distributions of the heights for
random b-tries, HnT , and PATRICIA tries, HnP . The height is the longest path in
such trees, and its distribution is of considerable interest for several applications.
We now summarize our main results. We obtain asymptotic expansions of the
distributions Pr{HnT ≤ k} (b-tries) and Pr{HnP ≤ k} (PATRICIA tries) for three
ranges of n and k. For b-tries we consider: (i) the “right–tail region” k → ∞ and
n = O(1); (ii) the “central region” n, k → ∞ with ξ = n2−k and 0 < ξ < b; and
(iii) the “left–tail region” k, n → ∞ with n − b2k = O(1). We prove that most
probability mass is concentrated in between the right tail and the central region.
In particular, for real x,

    Pr{HnT ≤ ((1 + b)/b) log2 n + x} ∼ exp(−(1/(b + 1)!) 2^{−bx + b⟨((1+b)/b) log2 n + x⟩}),

where ⟨r⟩ = r − ⌊r⌋ is the fractional part of r.¹ In words, the asymptotic distribution of the b-trie height around its most likely value ((1 + b)/b) log2 n resembles a double exponential (extreme value) distribution. In fact, due to the oscillating term ⟨((1 + b)/b) log2 n + x⟩ the limiting distribution does not exist, but one can find
lim inf and lim sup of the distribution.
The height of PATRICIA tries behaves differently in the central region (i.e., where most of the probability mass is concentrated). It is concentrated at or near the most likely value k1 = ⌊log2 n + √(2 log2 n) − 3/2⌋ + 1. We shall prove that the asymptotic distribution around k1 resembles an exponential of a Gaussian distribution, with an oscillating term (cf. Theorem 3). In fact, there exist
¹ The fractional part ⟨r⟩ is often denoted as {r}, but in order to avoid confusion we adopt the above notation.
subsequences of n such that the asymptotic distribution of PATRICIA height
concentrates only on k1 , or on k1 and one of the two points k1 − 1 or k1 + 1.
With respect to previous results, Devroye [1] and Pittel [14] established the
asymptotic distribution in the central regime for tries and b-tries, respectively,
using probabilistic tools. Jacquet and Régnier [7] obtained similar results by
analytic methods. The most probable value, log2 n, of the height for PATRICIA was first proved by Pittel [13]. This was then improved to log2 n + √(2 log2 n)(1 + o(1)) by Pittel and Rubin [15], and independently by Devroye [2]. No results
concerning the asymptotic distribution for PATRICIA height were reported.
The full version of this paper with all proofs can be found at http://www.cs.purdue.edu/people/spa.
2   Summary of Results
As before, we let HnT and HnP denote, respectively, the height of a b-trie and a
PATRICIA trie. Their probability distributions are
    h̄_n^k = Pr{HnT ≤ k}   and   h_n^k = Pr{HnP ≤ k}.    (1)
We note that for tries h̄_n^k = 0 for n > b2^k (corresponding to a balanced tree), while for PATRICIA tries h_n^k = 0 for n > 2^k. In addition, for PATRICIA we have the following boundary condition: h_n^k = 1 for k ≥ n. It asserts that the
height in a PATRICIA trie cannot be bigger than n (due to the elimination of
all one-way branches).
The distribution of b-tries satisfies the recurrence relation

    h̄_n^{k+1} = 2^{−n} Σ_{i=0}^{n} (n choose i) h̄_i^k h̄_{n−i}^k,   k ≥ 0,    (2)
with the initial condition(s)

    h̄_n^0 = 1, n = 0, 1, 2, . . . , b;   and   h̄_n^0 = 0, n > b.    (3)
This follows from H_n^T = max{H_i^{LT}, H_{n−i}^{RT}} + 1, where H_i^{LT} and H_{n−i}^{RT} denote, respectively, the heights of the left subtree and the right subtree of sizes i and n − i, which happens with probability 2^{−n} (n choose i). Similarly, for PATRICIA tries we have
    h_n^{k+1} = 2^{−n+1} h_n^{k+1} + 2^{−n} Σ_{i=1}^{n−1} (n choose i) h_i^k h_{n−i}^k,   k ≥ 0,    (4)
with the initial conditions

    h_0^0 = h_1^0 = 1   and   h_n^0 = 0, n ≥ 2.    (5)
Unlike b-tries, in a PATRICIA trie the left and the right subtrees cannot be empty (which occurs with probability 2^{−n+1}).
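Recurrences (2)–(5) can be evaluated numerically by dynamic programming; note that in (4) the unknown h_n^{k+1} appears on both sides and must be solved for. A sketch (our own function names); for 1-tries the values agree with the closed form (2^k)!/(2^{nk}(2^k − n)!) derived in the next section:

```python
from math import comb

def btrie_height_cdf(n_max, k_max, b=1):
    """h[k][n] = Pr{H_n^T <= k} for b-tries, via recurrence (2)
    with initial condition (3)."""
    h = [[1.0 if n <= b else 0.0 for n in range(n_max + 1)]]
    for _ in range(k_max):
        prev = h[-1]
        h.append([sum(comb(n, i) * prev[i] * prev[n - i]
                      for i in range(n + 1)) / 2 ** n
                  for n in range(n_max + 1)])
    return h

def patricia_height_cdf(n_max, k_max):
    """h[k][n] = Pr{H_n^P <= k} via recurrence (4) with (5); the
    h_n^{k+1} term on the right-hand side is solved for explicitly."""
    h = [[1.0, 1.0] + [0.0] * (n_max - 1)]
    for _ in range(k_max):
        prev, cur = h[-1], [1.0, 1.0]
        for n in range(2, n_max + 1):
            s = sum(comb(n, i) * prev[i] * prev[n - i] for i in range(1, n))
            cur.append((s / 2 ** n) / (1.0 - 2.0 ** (1 - n)))
        h.append(cur)
    return h
```

For instance, with b = 1 the recurrence gives Pr{H_3^T ≤ 2} = 3/8, and for PATRICIA Pr{H_3^P ≤ 2} = 1, as the boundary condition demands.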
We shall analyze these problems asymptotically, in the limit n → ∞. Despite
the similarity between (2) and (4), we will show that even asymptotically the
two distributions behave very differently.
We first consider ordinary tries (i.e., 1-tries). It is relatively easy to solve (2)
and (3) explicitly and obtain the integral representation
    h̄_n^k = (n!/(2πi)) ∮ (1 + z 2^{−k})^{2^k} z^{−n−1} dz    (6)

         = 0 for n > 2^k,   and   = (2^k)!/(2^{nk} (2^k − n)!) for 0 ≤ n ≤ 2^k.
Here the loop integral is for any closed circle surrounding z = 0.
Using asymptotic methods for evaluating integrals, or applying Stirling’s formula to the second part of (6), we obtain the following.
Theorem 1. The distribution of the height of tries has the following asymptotic
expansions:
(i) Right-Tail Region: k → ∞, n = O(1):

    Pr{HnT ≤ k} = h̄_n^k = 1 − n(n − 1) 2^{−k−1} + O(2^{−2k}).

(ii) Central Region: k, n → ∞ with ξ = n2^{−k}, 0 < ξ < 1:

    h̄_n^k ∼ A(ξ) e^{nφ(ξ)},

where

    φ(ξ) = (1 − 1/ξ) log(1 − ξ) − 1,   A(ξ) = (1 − ξ)^{−1/2}.

(iii) Left-Tail Region: k, n → ∞ with 2^k − n = j = O(1):

    h̄_n^k ∼ (n^j/j!) e^{−n−j} √(2πn).
This shows that there are three ranges of k and n where the asymptotic form of
h̄_n^k is different.
We next consider the “asymptotic matching” (see [11]) between the three
expansions. If we expand (i) for n large, we obtain 1 − h̄_n^k ∼ n² 2^{−k−1}. For ξ → 0 we have A(ξ) ∼ 1 and φ(ξ) ∼ −ξ/2, so that the result in (ii) becomes

    A(ξ) e^{nφ(ξ)} ∼ e^{−nξ/2} = exp(−(1/2) n² 2^{−k}) ∼ 1 − (1/2) n² 2^{−k},    (7)

where the last approximation assumes that n, k → ∞ in such a way that n² 2^{−k} → 0. Since (7) agrees precisely with the expansion of (i) as n → ∞,
we say that (i) and (ii) asymptotically match. To be precise, we say they match
the leading order; higher order matchings can be verified by computing higher
order terms in the asymptotic series in (i) and (ii). We can easily show that the
expansion of (ii) as ξ → 1− agrees with the expansion of (iii) as j → ∞, so that
(ii) and (iii) also asymptotically match. The matching verifications imply that,
at least to leading order, there are no “gaps” in the asymptotics. In other words,
one of the results in (i)-(iii) applies for any asymptotic limit which has k and/or
n large. We recall that h̄_n^k = 0 for n > 2^k, so we need only consider k ≥ log2 n.
The asymptotic limits where (i)-(iii) apply are the three “natural scales” for
this problem. We can certainly consider other limits (such as k, n → ∞ with
k/n fixed), but the expansions that apply in these limits would necessarily be
limiting cases of one of the three results in Theorem 1. In particular, if we let k, n → ∞ with k − 2 log2 n = O(1), we are led to

    h̄_n^k ∼ exp(−(1/2) n² 2^{−k}) = exp(−(1/2) exp(−k log 2 + 2 log n)).    (8)
This result is well-known (see [1,7]) and corresponds to a limiting double exponential (or extreme value) distribution. However, according to our discussion, k = 2 log2 n + O(1) is not a natural scale for this problem. The scale
k = log2 n + O(1) (where (ii) applies) is a natural scale, and the result in (8)
may be obtained as a limiting case of (ii), by expanding (ii) for ξ → 0.
We next generalize Theorem 1 to arbitrary b, and obtain the following result whose proof can be found in our full paper available at http://www.cs.purdue.edu/people/spa.
Theorem 2. The distribution of the height of b-tries has the following asymptotic expansions for fixed b:
(i) Right-Tail Region: k → ∞, n = O(1):

    Pr{HnT ≤ k} = h̄_n^k ∼ 1 − (n!/((b + 1)!(n − b − 1)!)) 2^{−kb}.

(ii) Central Regime: k, n → ∞ with ξ = n2^{−k}, 0 < ξ < b:

    h̄_n^k ∼ A(ξ; b) e^{nφ(ξ;b)},

where

    φ(ξ; b) = −1 − log ω0 + (1/ξ)(b log(ω0 ξ) − log b! − log(1 − 1/ω0)),

    A(ξ; b) = 1/√(1 + (ω0 − 1)(ξ − b)).

In the above, ω0 = ω0(ξ; b) is the solution to

    1 − 1/ω0 = (ω0 ξ)^b / (b! (1 + ω0 ξ + (ω0 ξ)²/2! + · · · + (ω0 ξ)^b/b!)).
(iii) Left-Tail Region: k, n → ∞ with j = b2^k − n, j = O(1):

    h̄_n^k ∼ √(2πn) (n^j/j!) b^n exp(−(n + j)(1 + b^{−1} log b!)).
When b = 1 we can easily show that Theorem 2 reduces to Theorem 1 since
in this case ω0 (ξ; 1) = 1/(1 − ξ). We also can obtain ω0 explicitly for b = 2,
namely:

    ω0(ξ; 2) = 2/(1 − ξ + √(1 + 2ξ − ξ²)).    (9)
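The defining equation for ω0 can be solved numerically, which gives a quick consistency check of the closed forms for b = 1 and b = 2. A bisection sketch (our own code; it assumes ω0 is the unique root in (1, ∞) for 0 < ξ < b):

```python
from math import factorial

def omega0(xi, b):
    """Solve 1 - 1/w = (w*xi)^b / (b! * sum_{m=0}^b (w*xi)^m / m!)
    for w in (1, infinity) by bisection."""
    def g(w):
        t = w * xi
        s = sum(t ** m / factorial(m) for m in range(b + 1))
        return (1.0 - 1.0 / w) - t ** b / (factorial(b) * s)
    lo, hi = 1.0 + 1e-12, 1e9  # g < 0 near 1, g > 0 for large w
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For b = 1 the root agrees with 1/(1 − ξ), and for b = 2 with formula (9).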
For arbitrary b, we have ω0 → ∞ as ξ → b⁻ and ω0 → 1 as ξ → 0⁺. More precisely,

    ω0 = 1 + ξ^b/b! + O(ξ^{b+1}),   ξ → 0,    (10)

    ω0 = 1/(b − ξ) + (b − 1)/b + O(b − ξ),   ξ → b.    (11)
Using (10) and (11) we can also show that the three parts of Theorem 2
asymptotically match. In particular, by expanding part (ii) as ξ → 0 we obtain
    Pr{HnT ≤ k} ∼ A(ξ) e^{nφ(ξ)} ∼ exp(−n ξ^b/(b + 1)!) = exp(−n^{1+b} 2^{−kb}/(b + 1)!).    (12)
This yields the well-known (see [7,14]) asymptotic distribution of b-tries. We note
that, for k, n → ∞, (12) is O(1) for k − (1 + 1/b) log2 n = O(1). More precisely,
let us estimate the probability mass of HnT around (1 + 1/b) log2 n + x where x
is a real fixed value. We observe from (12) that
    Pr{HnT ≤ (1 + 1/b) log2 n + x} = Pr{HnT ≤ ⌊(1 + 1/b) log2 n + x⌋}
        ∼ exp(−(1/(1 + b)!) 2^{−bx + b⟨((1+b)/b) log2 n + x⟩}),    (13)

where ⟨x⟩ is the fractional part of x, that is, ⟨x⟩ = x − ⌊x⌋.
Corollary 1. While the limiting distribution of the height for b-tries does not exist, the following lower and upper envelopes can be established:

    lim inf_{n→∞} Pr{HnT ≤ (1 + 1/b) log2 n + x} = exp(−(1/(1 + b)!) 2^{−b(x−1)}),

    lim sup_{n→∞} Pr{HnT ≤ (1 + 1/b) log2 n + x} = exp(−(1/(1 + b)!) 2^{−bx})
for fixed real x.
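The envelopes arise because the fractional part in (13) ranges over [0, 1). A numerical sketch for b = 1, x = 0 (our own helper; it evaluates the displayed asymptotic formula, not the exact distribution) shows every value falls between the two envelopes:

```python
import math

def btrie_cdf_asym(n, x, b=1):
    """The asymptotic formula (13) for Pr{H_n^T <= (1 + 1/b) log2 n + x}."""
    r = (1 + 1 / b) * math.log2(n) + x
    frac = r - math.floor(r)  # the oscillating fractional part
    return math.exp(-2.0 ** (-b * x + b * frac) / math.factorial(1 + b))
```

For b = 1 and x = 0 the values oscillate in the interval (e^{−1}, e^{−1/2}], hitting the upper envelope exactly when n is a power of 2.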
We next turn our attention to PATRICIA tries. Using ideas of applied mathematics, such as linearization and asymptotic matching, we obtain the following.
The derivation can be found on http://www.cs.purdue.edu/people/spa where
we make certain assumptions about the forms of the asymptotic expansions, as
well as the asymptotic matching between the various scales.
Theorem 3. The distribution of PATRICIA tries has the following asymptotic
expansions:
(i) Right-Tail Regime: k, n → ∞ with n − k = j = O(1), j ≥ 2:

    Pr{HnP ≤ n − j} = h_n^{n−j} ∼ 1 − ρ0 Kj · n! · 2^{−n²/2 + (j−3/2)n},

where

    Kj = (1/j!) 2^{−j²/2} 2^{3j/2} Cj,    (14)

    Cj = (j!/(2πi)) ∮ z^{1−j} e^{z/2} Π_{m=0}^{∞} [(1 − exp(−z 2^{−m−1}))/(z 2^{−m−1})] dz,    (15)

and ρ0 = Π_{ℓ=2}^{∞} (1 − 2^{−ℓ})^{−1} = 1.73137 . . .
(ii) Central Regime: k, n → ∞ with ξ = n2^{−k}, 0 < ξ < 1:

    h_n^k ∼ √(1 + 2ξΦ′(ξ) + ξ²Φ″(ξ)) e^{−nΦ(ξ)}.

We know Φ(ξ) analytically only for ξ → 0 and ξ → 1. In particular,

    Φ(ξ) ∼ (ρ0/2) e^{ϕ(log2 ξ)} ξ^{3/2} exp(−log² ξ/(2 log 2)),   ξ → 0⁺,    (16)
with

    ϕ(x) = (log 2/2) x(x + 1) + Σ_{ℓ=0}^{∞} log[(1 − exp(−2^{x−ℓ}))/2^{x−ℓ}] + Σ_{ℓ=1}^{∞} log(1 − exp(−2^{x+ℓ}))

         = Ψ(x) − log 2/12 + (1/log 2)(γ²/2 + γ(1) − π²/12),    (17)

    Ψ(x) = Σ_{ℓ=−∞, ℓ≠0}^{∞} (1/(2πiℓ)) Γ(1 − 2πiℓ/log 2) ζ(1 − 2πiℓ/log 2) e^{2πixℓ}.    (18)
In the above, Γ (·) is the Gamma function, ζ(·) is the Riemann zeta function,
γ = −Γ ′ (1) is the Euler constant, and γ(1) is defined by the Laurent series
ζ(s) = 1/(s − 1) + γ − γ(1)(s − 1) + O((s − 1)²). The function Ψ(x) is periodic with a very small amplitude, i.e., |Ψ(x)| < 10⁻⁵. Moreover, for ξ → 1 the function Φ(ξ) becomes

    Φ(ξ) ∼ D1 + (1 − ξ) log(1 − ξ) − (1 − ξ)(1 + log D2),   ξ → 1⁻,
where D1 = 1 + log(K0∗) and D2 = K1∗ K0∗/e, with

    K0∗ = Π_{ℓ=1}^{∞} (1 − 2^{−2^ℓ+1})^{2^{−ℓ}} = 0.68321974 . . . ,

    K1∗ = Π_{ℓ=1}^{∞} Π_{m=1}^{∞} [(1 − 2^{−2^{ℓ+1}+2})^{−1} (1 − 2^{−2^{ℓ+m}+1})]^{2^{−m}} = 1.2596283 . . .
(iii) Left-Tail Regime: k, n → ∞ with 2^k − n = M = O(1):

    h_n^k ∼ (√(2π)/M!) M^{M+1/2} e^{−D1 n} D2^n,

where D1 and D2 are defined above.
The expressions for h_n^k in parts (i) and (iii) are completely determined. However, the expression in part (ii) involves the function Φ(ξ). We have not been able
to determine this function analytically, except for its behaviors as ξ approaches
0 or 1. The behavior of Φ(ξ) as ξ → 1− implies the asymptotic matching of
parts (ii) and (iii), while the behavior as ξ → 0+ implies the matching of (i) and
(ii). As ξ → 0, this behavior involves the periodic function ϕ(x), which satisfies
ϕ(x + 1) = ϕ(x). In part (ii) we give two different representations for ϕ(x); the
latter (which involves Ψ (x)) is a Fourier series.
Since Φ(ξ) > 0, we see that in (ii) and (iii) the distribution is exponentially small in n, while in (i), 1 − h_n^k is super-exponentially small (the dominant term in 1 − h_n^k is 2^{−n²/2}). Thus, (i) applies in the right tail of the distribution while (ii) and (iii) apply in the left tail. We wish to compute the range of k where h_n^k undergoes the transition from h_n^k ≈ 0 to h_n^k ≈ 1, as n → ∞. This must be in the
asymptotic matching region between (i) and (ii). We can show that Cj , defined
in Theorem 3(i), becomes, as j → ∞,

    Cj ∼ (1/2) j^{5/2} e^{ϕ(α)} exp(−log² j/(2 log 2)),    (19)

where α = ⟨log2 j⟩. With (19), we can verify the matching between parts (i) and
(ii), and the limiting form of (ii) as ξ → 0⁺ is

    h_n^k ∼ exp(−(ρ0/2) e^{ϕ(log2 n)} exp(−(log 2/2)((k + 3/2 − log2 n)² − 2 log2 n − 9/4)))

        = exp(−ρ0 e^{ϕ(log2 n)} 2^{1/8} n exp(−(log 2/2)(k + 1.5 − log2 n)²))    (20)

        = exp(−ρ0 · n · exp(−(log 2/2)(k + 1.5 − log2 n)² + θ + Ψ(log2 n))),    (21)
where ρ0 is defined in Theorem 3(i) and
    θ = (1/log 2)(γ²/2 + γ(1) − π²/12) + log 2/24 = −1.022401 . . . ,
while |Ψ(log2 n)| ≤ 10⁻⁵. We have written (20) in terms of k and n, recalling that ξ = n2^{−k}. We also have used √(1 + 2ξΦ′(ξ) + ξ²Φ″(ξ)) ∼ 1 as ξ → 0.
We now set, for an integer ℓ,

    kℓ = ⌊log2 n + √(2 log2 n) − 3/2⌋ + ℓ    (22)

       = log2 n + √(2 log2 n) − 3/2 + ℓ − βn,

where

    βn = ⟨log2 n + √(2 log2 n) − 3/2⟩ ∈ [0, 1).    (23)
In terms of ℓ and βn, (21) becomes

    Pr{HnP ≤ ⌊log2 n + √(2 log2 n) − 1.5⌋ + ℓ} ∼ exp(−ρ0 e^{θ+Ψ(log2 n)} 2^{−(ℓ−βn)²/2 − (ℓ−βn)√(2 log2 n)}).    (24)
For 0 < βn < 1 and n → ∞ the above is small for ℓ ≤ 0, and it is close to one
for ℓ ≥ 1. This shows that asymptotically, as n → ∞, all the mass accumulates
when k = k1 is given by (22) with ℓ = 1. Now suppose βn = 0 for some n, or more generally that we can find a sequence ni such that ni → ∞ as i → ∞ but √(2 log2 ni) ⟨log2 ni + √(2 log2 ni) − 3/2⟩ remains bounded. Then, the expression in (24) would be O(1) for ℓ = 0 (since βn √(2 log2 n) = O(1)). For ℓ = 1, (24) would then be asymptotically close to 1. Thus, now the mass would accumulate at two points, namely, k0 = k1 − 1 and k1. Finally, if βn = 1 − o(1) such that (1 − βn)√(2 log2 n) = O(1), then the probability mass is concentrated on k1 and k1 + 1.
In order to verify the latter assertions, we must either show that βn = 0 for an integer n or that there is a subsequence ni such that √(2 log2 ni) βni = O(1). The former is false, while the latter is true. To prove that βn = 0 is impossible for integer n, let us assume the contrary. If there exists an integer N such that

    log2 n + √(2 log2 n) − 3/2 = N,

then

    n = 2^{N + 5/2 − √(4+2N)}.

But this is impossible since this would require that √(4 + 2N) be odd. To see that there exists a subsequence such that R(ni) = βni √(2 log2 ni) = O(1), we observe that the function R(n) fluctuates from zero to √(2 log2 n). We can show that if ni = ⌊2^{i + 5/2 − √(2i+4)}⌋ + 1, then R(ni) → 0 as i → ∞. Note that this subsequence corresponds to the minima of R(n).
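These quantities are easy to tabulate. The sketch below (our own helper names) computes βn, R(n), and the subsequence ni = ⌊2^{i+5/2−√(2i+4)}⌋ + 1, on which R(ni) is indeed small:

```python
import math

def beta_n(n):
    """Fractional part <log2 n + sqrt(2 log2 n) - 3/2> from (23)."""
    t = math.log2(n) + math.sqrt(2 * math.log2(n)) - 1.5
    return t - math.floor(t)

def R(n):
    """R(n) = sqrt(2 log2 n) * beta_n, which controls where the
    probability mass of the PATRICIA height sits."""
    return math.sqrt(2 * math.log2(n)) * beta_n(n)

def n_i(i):
    """Subsequence n_i = floor(2^(i + 5/2 - sqrt(2i + 4))) + 1,
    along which R(n_i) -> 0."""
    return math.floor(2.0 ** (i + 2.5 - math.sqrt(2 * i + 4))) + 1
```

For example, i = 10 gives ni = 195 with R(195) ≈ 0.03, already far below the generic fluctuation range [0, √(2 log2 n)).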
Corollary 2. The asymptotic distribution of PATRICIA height is concentrated among the three points k1 − 1, k1 and k1 + 1, where k1 = ⌊log2 n + √(2 log2 n) − 3/2⌋ + 1, that is,

    Pr{HnP = k1 − 1 or k1 or k1 + 1} = 1 − o(1)

as n → ∞. More precisely: (i) there are subsequences ni for which Pr{HnPi = k1} = 1 − o(1) provided that

    R(ni) = √(2 log2 ni) ⟨log2 ni + √(2 log2 ni) − 3/2⟩ → ∞

as i → ∞; (ii) there are subsequences ni for which Pr{HnPi = k1 − 1 or k1} = 1 − o(1) provided that R(ni) = O(1); (iii) finally, there are subsequences ni for which Pr{HnPi = k1 or k1 + 1} = 1 − o(1) provided that √(2 log2 ni) − R(ni) = O(1).
References
1. L. Devroye, A Probabilistic Analysis of the Height of Tries and the complexity of
Trie Sort, Acta Informatica, 21, 229–237, 1984.
2. L. Devroye, A Note on the Probabilistic Analysis of Patricia Tries, Random Structures & Algorithms, 3, 203–214, 1992.
3. L. Devroye, A Study of Trie-Like Structures Under the Density Model, Ann. Appl.
Probability, 2, 402–434, 1992.
4. P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica, 20, 345–369, 1983.
5. N. Fröman and P. O. Fröman, JWKB Approximation, North-Holland, Amsterdam,
1965.
6. D. Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge University
Press, 1997.
7. P. Jacquet and M. Régnier, Trie Partitioning Process: Limiting Distributions, Lecture Notes in Computer Science, 214, 196–210, Springer-Verlag, New York, 1986.
8. P. Jacquet and W. Szpankowski, Asymptotic Behavior of the Lempel-Ziv Parsing
Scheme and Digital Search Trees, Theoretical Computer Science, 144, 161-197,
1995.
9. D. Knuth, The Art of Computer Programming. Sorting and Searching, Second
Edition, Addison-Wesley, 1998.
10. C. Knessl and W. Szpankowski, Quicksort Algorithm Again Revisited, Discrete
Mathematics and Theoretical Computer Science, 3, 43-64, 1999.
11. P. Lagerstrom, Matched Asymptotic Expansions: Ideas and Techniques, Springer-Verlag, New York, 1988.
12. H. Mahmoud, Evolution of Random Search Trees, John Wiley & Sons, New York
1992.
13. B. Pittel, Asymptotic Growth of a Class of Random Trees, Ann. of Probab., 13,
414–427, 1985.
14. B. Pittel, Path in a Random Digital Tree: Limiting Distributions, Adv. Appl. Prob.,
18, 139–155, 1986.
15. B. Pittel and H. Rubin, How Many Random Questions Are Necessary to Identify
n Distinct Objects?, J. Combin. Theory, Ser. A., 55, 292–312, 1990.
16. W. Szpankowski, Patricia Tries Again Revisited, Journal of the ACM, 37, 691–711,
1990.
17. W. Szpankowski, On the Height of Digital Trees and Related Problems, Algorithmica, 6, 256–277, 1991.
18. W. Szpankowski, A Generalized Suffix Tree and Its (Un)Expected Asymptotic
Behaviors, SIAM J. Comput., 22, 1176–1198, 1993.
19. J. Ziv and A. Lempel, Compression of Individual Sequences via Variable-rate Coding, IEEE Trans. Information Theory, 24, 530-536, 1978.
On the Complexity of Routing Permutations on
Trees by Arc-Disjoint Paths
Extended Abstract
D. Barth¹, S. Corteel², A. Denise², D. Gardy¹, and M. Valencia-Pabon²
¹ PRiSM, Université de Versailles,
45 Av. des Etats-Unis, 78035 Versailles, France
² L.R.I., Bât. 490, Université Paris-Sud,
91405 Orsay, France
Abstract. In this paper we show that the routing permutation problem
is NP-hard even for binary trees. Moreover, we show that in the case of
unbounded degree tree networks, the routing permutation problem is
NP-hard even if the permutations to be routed are involutions. Finally,
we show that the average-case complexity of the routing permutation
problem on linear networks is n/4 + o(n).
Keywords: Average-Case Complexity, Routing Permutations, Path Coloring, Tree Networks, NP-completeness.
1
Introduction
Efficient communication is a prerequisite for exploiting the performance of large
parallel systems. The routing problem on communication networks consists in
the efficient allocation of resources to connection requests. In such a network,
establishing a connection between two nodes requires selecting a path connecting
the two nodes and allocating sufficient resources on all links along the paths
associated with the collection of requests. In the case of all-optical networks, data
is transmitted on lightwaves through optical fiber, and several signals can be
transmitted through a fiber link simultaneously, provided that different wavelengths
are used in order to prevent interference (wavelength-division multiplexing) [4].
As the number of wavelengths is a limited resource, it is desirable to establish
a given set of connection requests with a minimum number of wavelengths.
In this context, it is natural to think of wavelengths as colors. Thus the routing
problem for all-optical networks can be viewed as a path coloring problem: it
consists in finding a suitable collection of paths on the network associated with
the collection of connection requests so as to minimize the number of colors
needed to color these paths in such a way that any two different paths sharing
the same link of the network are assigned different colors. For simple networks,
such as trees, the routing problem is simpler, as there is always a unique path
for each communication request.
This paper is concerned with routing permutations on trees by arc-disjoint paths,
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 308–317, 2000.
© Springer-Verlag Berlin Heidelberg 2000
that is, the path coloring problem on trees when the collection of connection requests represents a permutation of the nodes of the tree network.
Previous and related work. In [1], Aumann and Rabani have shown that
O(log² n / β²) colors suffice for routing any permutation on any bounded degree
network on n nodes, where β is the arc expansion of the network. This result
almost matches the existential lower bound of Ω(1/β²) obtained by Raghavan
and Upfal [18]. In the case of specific network topologies, Gu and Tamaki [13]
proved that 2 colors are sufficient to route any permutation on any symmetric
directed hypercube. Independently, Paterson et al. [17] and Wilfong and
Winkler [22] have shown that the routing permutation problem on ring networks
is NP-hard. Moreover, in [22] a 2-approximation algorithm is given for this
problem on ring networks. To our knowledge, the routing permutation problem
on tree networks by arc-disjoint paths has not been studied in the literature.
Our results. In Section 2 we first give some definitions and recall previous
results. In Section 3 we show that for arbitrary permutations, the routing permutation problem is NP-hard even for binary trees. Moreover, we show that the
routing permutations problem on unbounded degree trees is NP-hard even if
the permutations to be routed are involutions, i.e. permutations with cycles of
length at most two. In Section 4 we focus on linear networks. In this particular
case, since the problem reduces to coloring an interval graph, the routing of any
permutation is easily done in polynomial time [14]. We show that the average
number of colors needed to color any permutation on a linear network on n vertices is n/4 + o(n). As far as we know, this is the first result on the average-case
complexity for routing permutations on networks by arc-disjoint paths. Finally,
in Section 5 we give some open problems and future work.
2
Definitions and Preliminary Results
We model the tree network as a rooted labeled symmetric directed tree T =
(V, A), where processors and switches are vertices and links are modeled by
two arcs in opposite directions. In the sequel, we assume that the labels of the
vertices of a tree T on n vertices are {1, 2, . . . , n} and are such that a postfix tree
traversal visits them exactly in the order 1, 2, . . . , n. This implies that for any
internal vertex labeled by i, the labels of the vertices in its subtree are less than i.
Given two vertices i and j of the tree T, we denote by <i, j> the unique path from
vertex i to vertex j. The arc from vertex i to its father (resp. from the father of i
to i), 1 ≤ i ≤ n − 1, is labeled by i⁺ (resp. i⁻). See Figure 1(a) for the linear network
on n = 6 vertices rooted at vertex i = 6. We want to route permutations in
Sn on any tree T on n vertices. Given a tree T and a vertex i, we call T(i) the
subtree of T rooted at vertex i.
We associate with any permutation a graphical representation. To represent
the permutation σ we draw an arrow from i to σ(i) whenever i ≠ σ(i), that is, the
path <i, σ(i)>, 1 ≤ i ≤ n. The arrow going from i to σ(i) crosses the arc j⁺ if and
only if i is in T(j) and σ(i) is not in T(j), and it crosses the arc j⁻ if and only
if i is not in T(j) and σ(i) is in T(j), 1 ≤ j ≤ n − 1.
Fig. 1. (a) Labeling of the vertices and the arcs for the linear network on n = 6 vertices
rooted at vertex i = 6. (b) Representation of the permutation σ = (3, 1, 6, 5, 2, 4) on the
linear network given in (a).
Definition 1. Let T be a tree on n vertices and σ be a permutation in Sn. We
define the height of the arc i⁺ (resp. height of the arc i⁻), 1 ≤ i ≤ n − 1,
denoted h⁺_T(σ, i) (resp. h⁻_T(σ, i)), as the number of paths crossing the arc i⁺
(resp. i⁻); that is, h⁺_T(σ, i) = |{j ∈ T(i) | σ(j) ∉ T(i)}| (resp. h⁻_T(σ, i) = |{j ∉
T(i) | σ(j) ∈ T(i)}|).
Lemma 1. Let T be a tree with n vertices. For all σ in Sn and for all i ∈
{1, 2, . . . , n − 1}, h⁺_T(σ, i) = h⁻_T(σ, i).
This lemma is straightforward to prove. It tells us that in order to study the
height of a permutation on a tree on n vertices, it suffices to consider only the
height of the labeled arcs i+ .
Definition 2. Given a tree T and a permutation σ to be routed on T, the height
of σ, denoted h_T(σ), is the maximum number of paths crossing any arc of T:
h_T(σ) = max_{1 ≤ i ≤ n−1} h⁺_T(σ, i).
For example, the permutation σ = (3, 1, 6, 5, 2, 4) on the linear network in Figure
1(a) has height 2 (see Figure 1(b)). The maximum is reached at the arcs 4±.
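On the linear network rooted at vertex n with the labeling above, T(i) = {1, . . . , i}, so h⁺_T(σ, i) = |{j ≤ i : σ(j) > i}|. The following sketch (our own illustration, not part of the paper) checks the example:

```python
def arc_heights(sigma):
    # sigma is a 1-indexed permutation given as a tuple: sigma[j-1] = sigma(j).
    # On the linear network rooted at vertex n, T(i) = {1, ..., i}, hence
    # h+(sigma, i) = |{ j <= i : sigma(j) > i }| for 1 <= i <= n-1.
    n = len(sigma)
    return [sum(1 for j in range(1, i + 1) if sigma[j - 1] > i)
            for i in range(1, n)]

def height(sigma):
    # h_T(sigma) = max over the arcs i+ (Definition 2, using Lemma 1).
    return max(arc_heights(sigma))

sigma = (3, 1, 6, 5, 2, 4)
print(arc_heights(sigma))  # [1, 1, 1, 2, 1]: the maximum 2 is reached at arc 4+
print(height(sigma))       # 2
```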
Definition 3. Given a tree T and a permutation σ to be routed on T, the
coloration number of σ, denoted R_T(σ), is the minimum number of colors
assigned to the paths on T associated with σ such that no two paths sharing the
same arc of T are assigned the same color.
Clearly, for any permutation σ of the vertex set of a tree T, we have R_T(σ) ≥
h_T(σ). For linear networks equality holds, because the conflict graph of
the paths associated with σ is an interval graph (see [12]). Moreover, an optimal
vertex coloring of an interval graph can be computed efficiently [14]. However, for
arbitrary tree networks, equality does not hold, as we will see in Section 3.3.
3
Complexity of Computing the Coloration Number
We begin this section by showing the NP-completeness of the routing
permutation problem on binary trees, and then for the case of routing involutions
on unbounded degree trees. Finally, we discuss some polynomial cases of this
problem and we show, by an example, that in the case of binary trees having at
most two vertices with degree equal to 3, the equality between the height and
the coloration number of permutations does not hold.
3.1
NP-Completeness Results
Independently, Kumar et al. [15] and Erlebach and Jansen [6] have shown that
computing a minimal coloring of any collection of paths on symmetric directed
binary trees is NP-hard. However, the construction given in [15,6] does not work
when the collection of paths represents a permutation of the vertex set of a
binary tree. Thus, by using a reduction similar to the one used in [15,6] we
obtain the following result.
Theorem 1. Let σ ∈ Sn be any permutation to be routed on a symmetric directed binary tree T on n vertices. Then computing R_T(σ) is NP-hard.
Sketch of the proof. We use a reduction from the ARC-COLORING problem
[19]. The ARC-COLORING problem can be defined as follows: given a positive
integer k, an undirected cycle Cn with vertex set numbered clockwise as
1, 2, . . . , n, and any collection of paths F on Cn, where each path <v, w> ∈ F
is regarded as the path beginning at vertex v and ending at vertex w, again
in the clockwise direction, can F be colored with k colors so that no two
paths sharing an edge of Cn are assigned the same color? It is well known that
the ARC-COLORING problem is NP-complete [10]. Let I be an instance of the
ARC-COLORING problem. We construct from I an instance I′ of the routing
permutation problem on binary trees, consisting of a symmetric directed binary
tree T and a permutation-set of paths F′ on T such that F can be k-colored if
and only if F′ can be k-colored. Without loss of generality, we may assume that
each edge of Cn is crossed by exactly k paths in F. If some edge of Cn is crossed
by more than k paths, then this can be discovered in polynomial time, and it
implies that the answer for this instance I must be “no”. If some edge [i, i + 1] of
Cn is crossed by r < k paths, then we can add k − r paths of the form <i, i + 1>
(or <i, 1> if i = n) to F without changing its k-colorability.
Let B(i) ⊂ F (resp. E(i) ⊂ F) be the subcollection of paths of F beginning
(resp. ending) at vertex i of Cn, 1 ≤ i ≤ n. Thus, by the previous hypothesis, it
is easy to verify that the following property holds for instance I.
Claim. For all vertices i of Cn , |B(i)| = |E(i)|.
Construction of the binary tree T of I′: first, construct a line on 2k + n vertices,
denoted from left to right by l_k, l_{k−1}, . . . , l_2, l_1, v_1, v_2, . . . , v_n, r_1, r_2, . . . , r_k. Next,
for each vertex l_i (resp. r_i), 1 ≤ i ≤ k, construct a new, distinct line on 2k + 1
vertices, denoted from left to right by ll_i^1, ll_i^2, . . . , ll_i^k, wl_i, rl_i^k, rl_i^{k−1}, . . . , rl_i^1 (resp.
lr_i^1, lr_i^2, . . . , lr_i^k, wr_i, rr_i^k, rr_i^{k−1}, . . . , rr_i^1), and add to T the arc set {(wl_i, l_i),
(l_i, wl_i)} (resp. {(wr_i, r_i), (r_i, wr_i)}). Finally, for each vertex v_i, 1 ≤ i ≤ n, if
|B(i)| > 1, then construct a new, distinct line on α_i = |B(i)| − 1 vertices, denoted
by v_i^1, v_i^2, . . . , v_i^{α_i}, and add to T the arc set {(v_i^1, v_i), (v_i, v_i^1)}.
The construction of the permutation-set of paths F′ of I′ is as follows: for each
path <i, j> ∈ F, let b_i (resp. e_j) be the first vertex of T in {v_i, v_i^1, . . . , v_i^{α_i}}
(resp. {v_j, v_j^1, . . . , v_j^{α_j}}) not already used by any path in F′ as beginning-vertex
(resp. ending-vertex); then we consider the following two types of paths in F:
• Type 1: i < j. Then add to F′ the path set {<b_i, e_j>}.
• Type 2: i > j. Let r_p (resp. l_q) be the first vertex of T in {r_1, r_2, . . . , r_k} (resp.
{l_1, l_2, . . . , l_k}) such that the arc (r_p, wr_p) (resp. (l_q, wl_q)) of T has not already been
used by any path in F′; then add to F′ the path set {<b_i, rr_p^1>, <lr_p^1, rl_q^1>,
<ll_q^1, e_j>}. In addition, for each i, 1 ≤ i ≤ k, add to F′ the following path sets:
{<ll_i^j, rl_i^j> : 2 ≤ j ≤ k} ∪ {<rl_i^s, ll_i^s> : 1 ≤ s ≤ k} and {<lr_i^j, rr_i^j> : 2 ≤ j ≤
k} ∪ {<rr_i^s, lr_i^s> : 1 ≤ s ≤ k}. The paths <ll_i^j, rl_i^j> and <lr_i^j, rr_i^j>, 2 ≤ j ≤ k,
1 ≤ i ≤ k, act as blockers. They make sure that all three paths in F′
corresponding to one path in F of type 2 are colored with the same color in any
k-coloration of F′. The other paths, which we call permutation paths, are used to
ensure that the path collection F′ represents a permutation of the vertex set
of T. In Figure 2 we present an example of this polynomial construction. By
Fig. 2. Partial construction of I′ from I, where k = 3.
our construction, it is easy to check that the set of paths F ′ on T represents a
permutation of the vertex set of T , and that there is a k-coloring of F if and
only if there is a k-coloring of F ′ .
✷
In the case of unbounded degree symmetric directed trees, Caragiannis et
al. [3] have shown that the path coloring problem remains NP-hard even if the
collection of paths is symmetric (we call this problem the symmetric path
coloring problem), i.e., for each path beginning at vertex v1 and ending at vertex v2,
there also exists its symmetric counterpart, a path beginning at v2 and ending at
v1. Thus, using a polynomial reduction from the symmetric path coloring problem
on trees [3], we have the following result, whose proof is omitted for lack of space.
Theorem 2. Let σ ∈ In be any involution to be routed on an unbounded degree
tree T on n vertices. Then computing RT (σ) is NP-hard.
3.2
Polynomial Cases
As noticed in Section 2, the coloration number associated with any permutation
to be routed on a linear network can be computed in polynomial time [14]. In the
case of generalized star networks, i.e., tree networks having only one vertex with
degree greater than 2 and all other vertices with degree at most 2, Gargano et
al. [11] show that an optimal coloring of any collection of paths on these networks
can be computed in polynomial time. Moreover, it is also shown in [11] that the
number of colors needed to color any collection of paths on a generalized star
network is equal to the height of such a collection of paths. Thus, based on the
results given in [11], we obtain the following proposition.
Proposition 1. Given a generalized star network G on n vertices and a
permutation σ ∈ Sn to be routed on G, the coloration number R_G(σ) can be computed
in polynomial time. Moreover, R_G(σ) = h_G(σ) always holds.
3.3
General Trees
Given any permutation σ ∈ Sn to be routed on a tree T on n vertices, the
equality between the height h_T(σ) and the coloration number R_T(σ) does not
always hold. In Figure 3(a) we give an example of a permutation σ ∈ S10 to be
routed on a tree T on 10 vertices, whose height h_T(σ) is equal to 2. Moreover,
in Figure 3(b) we present the conflict graph G associated with σ, that is, the
undirected graph whose vertices are the paths on T associated with σ, and in
which two vertices are adjacent if and only if their associated paths share the
same arc in T. Thus, clearly, the coloration number R_T(σ) is equal to the
chromatic number of G. Since the conflict graph G has the cycle C5 as an induced
subgraph, the chromatic number of G is equal to 3, and thus R_T(σ) = 3.
Fig. 3. (a) A tree T on 10 vertices and a permutation σ = (5, 4, 8, 2, 6, 3, 9, 10, 7, 1) to
be routed on T. (b) The conflict graph G associated with the permutation σ in (a).
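The conflict-graph construction can be made concrete as follows. This is our own sketch on a small hypothetical tree (given by a parent array), not the tree of Figure 3; the first-fit coloring only gives an upper bound on R_T(σ), since computing it exactly is NP-hard by Theorem 1.

```python
def path_arcs(parent, i, j):
    # Arcs of the unique directed path <i, j> in the rooted tree; parent[v]
    # is the father of v (None for the root).
    def chain(v):
        out = [v]
        while parent[v] is not None:
            v = parent[v]
            out.append(v)
        return out
    up, down = chain(i), chain(j)
    lca = next(v for v in up if v in down)       # lowest common ancestor
    arcs = []
    v = i
    while v != lca:                              # climb i -> lca (arcs v+)
        arcs.append((v, parent[v]))
        v = parent[v]
    for v in reversed(down[:down.index(lca)]):   # descend lca -> j (arcs v-)
        arcs.append((parent[v], v))
    return arcs

def greedy_coloration(parent, sigma):
    # Upper bound on R_T(sigma): first-fit coloring of the conflict graph,
    # in which two paths are adjacent iff they share a directed arc.
    paths = [set(path_arcs(parent, i, s)) for i, s in sigma.items() if i != s]
    colors = []
    for p in paths:
        used = {colors[q] for q in range(len(colors)) if paths[q] & p}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return max(colors) + 1 if colors else 0

# Hypothetical example: the linear network on 4 vertices rooted at vertex 4.
parent = {1: 2, 2: 3, 3: 4, 4: None}
sigma = {1: 3, 3: 1, 2: 4, 4: 2}
print(greedy_coloration(parent, sigma))  # 2 (equals the height of sigma here)
```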
The best known approximation algorithm for coloring any collection of paths
with height h on any tree network is given in [7]; it uses at most ⌈5h/3⌉ colors.
Therefore it trivially also holds for any permutation-set of paths with height h
on any tree.
Proposition 2. Given a tree T on n vertices and a permutation σ ∈ Sn to be
routed on T with height h_T(σ), there exists a polynomial algorithm for coloring
the paths on T associated with σ which uses at most ⌈5 h_T(σ)/3⌉ colors.
4
Average Coloration Number on Linear Networks
The main result of this section is the following:
Theorem 3. The average coloration number of the permutations in Sn to be
routed on a linear network on n vertices is
n/4 + (λ/2) n^{1/3} + O(n^{1/6}),
where λ = 0.99615 . . . .
To prove this result, we use the equality between the height and the coloration
number (see Section 2). Our approach, developed in Subsections 4.1 and 4.2, is
as follows: first we recall a bijection between permutations in Sn and special
walks in N × N, called “Motzkin walks”, which are labeled in a certain way. The
bijection is such that the height parameter is “preserved”. Then we prove
Theorem 3 by studying the asymptotic behaviour of the height of these walks.
Finally, in Subsection 4.3 we derive the generating function of permutations
with coloration number k, for any given k. This gives rise to an algorithm to
compute exactly the average coloration number of the permutations for any
fixed n.
4.1
A Bijection between Permutations and Motzkin Walks
A Motzkin walk of length n is an (n+1)-tuple (s0, s1, . . . , sn) of points in N × N
satisfying the following conditions:
– for all 0 ≤ i ≤ n, si = (i, yi) with yi ≥ 0;
– y0 = yn = 0;
– for all 0 ≤ i < n, yi+1 − yi equals either 1 (North-East step), or 0 (East
step), or −1 (South-East step).
The height of a Motzkin walk ω is H(ω) = max_{0 ≤ i ≤ n} yi.
Labeled Motzkin walks are Motzkin walks in which steps can be labeled by
integers. These structures are related to several well-studied combinatorial
objects [8,20,21] and in particular to permutations. The walks we deal with are
labeled as follows:
– each South-East step (i, yi) → (i + 1, yi − 1) is labeled by an integer between
1 and yi² (or, equivalently, by a pair of integers, each one between 1 and yi);
– each East step (i, yi) → (i + 1, yi) is labeled by an integer between 1 and
2yi + 1.
Let Pn be the set of such labeled Motzkin walks of length n. We recall that
Sn is the set of permutations on [n]. The following result was first established
by Françon and Viennot [9]:
Theorem (Françon-Viennot) There is a one-to-one correspondence between the
elements of Pn and the elements of Sn .
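The correspondence can be checked numerically: the total weight of the labeled Motzkin walks of length n (each walk counted with the product of its label ranges) must equal n!. A small dynamic program over the walk height (our own sketch, not from the paper) confirms this:

```python
import math

def count_labeled_motzkin(n):
    # f[y] = total weight of labeled walk prefixes ending at height y, where
    # an East step at height y carries 2y+1 labels and a South-East step
    # leaving height y carries y^2 labels (North-East steps are unlabeled).
    f = {0: 1}
    for _ in range(n):
        g = {}
        for y, w in f.items():
            g[y + 1] = g.get(y + 1, 0) + w               # North-East
            g[y] = g.get(y, 0) + w * (2 * y + 1)         # East
            if y > 0:
                g[y - 1] = g.get(y - 1, 0) + w * y * y   # South-East
        f = g
    return f.get(0, 0)                                   # walks ending at height 0

print([count_labeled_motzkin(n) for n in range(1, 7)])  # [1, 2, 6, 24, 120, 720]
```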
Several bijective proofs of this theorem are known. Biane’s bijection [2] is special
in the sense that it preserves the height: to any labeled Motzkin walk of length n
and height k corresponds a permutation in Sn with height k (and so with
coloration number k). We do not present Biane’s bijection in full here; we just
focus on the construction of the (unlabeled) Motzkin walk associated with a
permutation, in order to show that the height is preserved. This property, which
is not explicitly noticed in Biane’s paper, is essential for our purpose.
Biane’s correspondence between a permutation σ = (σ(1), σ(2), . . . , σ(n))
and a labeled Motzkin walk ω = (s0, s1, . . . , sn) is such that, for 1 ≤ i ≤ n:
– step (si−1, si) is a North-East step if and only if σ(i) > i and σ⁻¹(i) > i;
– step (si−1, si) is a South-East step if and only if σ(i) < i and σ⁻¹(i) < i;
– otherwise, step (si−1, si) is an East step.
Now, for any 1 ≤ i ≤ n, the height of point si in ω is obviously equal to the
number of North-East steps minus the number of South-East steps in the partial
walk (s0, s1, . . . , si). On the other hand, we can easily prove that the height of
arc i⁺ in σ is equal to the number of integers j ≤ i such that σ(j) > j and
σ⁻¹(j) > j, minus the number of integers j ≤ i such that σ(j) < j and
σ⁻¹(j) < j. This proves the property. We present in Figure 4 an example of the
correspondence. The above description permits us to construct the “skeleton”
of the permutation, given the Motzkin walk. Then the labeling of the walk
allows us to complete the permutation. This is described in detail in [2] and in
the full version of this paper, in preparation.
Fig. 4. From a walk to a permutation.
4.2
Proof of Theorem 3
In [16], Louchard analyzes some list structures; in particular his “dictionary
structure” corresponds to our labeled Motzkin walks. We will use his notation
in order to refer directly to his article. From Louchard’s Theorem 6.2, we deduce
the following lemma:
Lemma 2. The height Y*([nv]) of a random labeled Motzkin walk of length n
after step [nv] (v ∈ [0, 1]) has the following behavior:
(Y*([nv]) − nv(1 − v)) / √n ⇒ X(v),
where “⇒” denotes weak convergence and X is a Markovian process with
mean 0 and covariance C(s, t) = 2s²(1 − t)², s ≤ t.
Then the work of Daniels and Skyrme [5] gives us a way to compute the maximum
of Y*([nv]), that is, the height of a random labeled Motzkin walk.
Proposition 3. The height of a random labeled Motzkin walk Y* is
max_v Y*([nv]) = n/4 + m √(n/2) + O(n^{1/6}),     (1)
where m is asymptotically Gaussian with mean E(m) ∼ λ n^{−1/6} (1/2)^{1/2} and
variance σ²(m) ∼ 1/8, and λ = 0.99615 . . . .
In formula (1) of Proposition 3, the only non-deterministic part is m, which is
Gaussian. So we just have to replace m by E(m) to prove Theorem 3.
4.3
An Algorithm to Compute Exactly the Average Coloration Number
We just have to look at known results in enumerative combinatorics [8,21] to
get the generating function of the permutations of coloration number exactly
k, that is,
(k!)² z^{2k} / (P*_{k+1}(z) P*_k(z)),
with P_0(z) = 1, P_1(z) = z − b_0 and P_{n+1}(z) = (z − b_n)P_n(z) − λ_n P_{n−1}(z) for
n ≥ 1 (here b_n = 2n + 1 and λ_n = n², the label counts of the walks of
Subsection 4.1), where P* is the reciprocal polynomial of P, that is, P*_n(z) =
z^n P_n(1/z) for n ≥ 0.
This generating function leads to a recursive algorithm to compute the number
of permutations with coloration number k, denoted by h_{n,k}.
Proposition 4. The number h_{n,k} of permutations in Sn with coloration
number k satisfies the following recurrence:
h_{n,k} = 0 if n < 2k;
h_{n,k} = (k!)² if n = 2k;
h_{n,k} = − Σ_{i=1}^{2k+1} p(i) h_{n−i,k} otherwise;
where p(i) is the coefficient of z^i in P*_{k+1}(z)P*_k(z).
From this result we are able to compute the average height of a permutation,
namely h̄(n) = Σ_{k≥0} k h_{n,k} / n!.
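The numbers h_{n,k} can equivalently be obtained from the walks of Subsection 4.1 by bounding the height in the dynamic program. The following sketch (our own illustration, with a brute-force cross-check against the definition of the height on the line) computes h_{n,k} and h̄(n):

```python
import math
from fractions import Fraction
from itertools import permutations

def count_height_at_most(n, k):
    # Total weight of labeled Motzkin walks of length n with height <= k;
    # by Biane's height-preserving bijection this counts the permutations
    # of [n] whose height on the linear network is at most k.
    f = {0: 1}
    for _ in range(n):
        g = {}
        for y, w in f.items():
            if y + 1 <= k:
                g[y + 1] = g.get(y + 1, 0) + w           # North-East
            g[y] = g.get(y, 0) + w * (2 * y + 1)         # East
            if y > 0:
                g[y - 1] = g.get(y - 1, 0) + w * y * y   # South-East
        f = g
    return f.get(0, 0)

def h_nk(n, k):
    # Number of permutations in S_n with coloration number exactly k.
    return count_height_at_most(n, k) - (count_height_at_most(n, k - 1) if k else 0)

def average_coloration(n):
    # h-bar(n) = sum_k k * h_{n,k} / n!   (heights never exceed n // 2)
    return sum(Fraction(k * h_nk(n, k), math.factorial(n)) for k in range(n // 2 + 1))

def brute_height(sigma):
    # Height on the linear network: max_i |{ j <= i : sigma(j) > i }|.
    n = len(sigma)
    return max((sum(1 for j in range(i) if sigma[j] > i)
                for i in range(1, n)), default=0)

# Cross-check h_{n,k} against exhaustive enumeration for small n.
for n in range(1, 7):
    counts = {}
    for s in permutations(range(1, n + 1)):
        counts[brute_height(s)] = counts.get(brute_height(s), 0) + 1
    assert all(h_nk(n, k) == counts.get(k, 0) for k in range(n // 2 + 1))
```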
5
Open Problems and Future Work
The complexity of routing involutions on binary trees by arc-disjoint paths
remains open. The average coloration number of permutations to be routed
on general trees is also an interesting open problem. Computing the average
coloration number of permutations to be routed on arbitrary topology networks
seems a very difficult problem.
Acknowledgements. We are very grateful to Philippe Flajolet, Dominique
Gouyou-Beauchamps and Guy Louchard for their help.
References
1. Y. Aumann, Y. Rabani. Improved bounds for all optical routing. In Proc. of the
6th ACM-SIAM SODA, pp 567-576, 1995.
2. Ph. Biane. Permutations suivant le type d’excédance et le nombre d’inversions
et interprétation combinatoire d’une fraction continue de Heine. Eur. J. Comb.,
14(4):277-284, 1993.
3. I. Caragiannis, Ch. Kaklamanis, P. Persiano. Wavelength Routing of Symmetric
Communication Requests in Directed Fiber Trees. In Proc. of SIROCCO, 1998.
4. N. K. Cheung, K. Nosu, and G. Winzer, editors. Special Issue on Dense Wavelength Division Multiplexing Techniques for High Capacity and Multiple Access
Communications Systems. IEEE J. on Selected Areas in Comm., 8(6), 1990.
5. H. E. Daniels, T. H. R. Skyrme. The maximum of a random walk whose mean path
has a maximum. Adv. Appl. Probab., 17:85-99, 1985.
6. T. Erlebach and K. Jansen. Call scheduling in trees, rings and meshes. In Proc. of
HICSS-30, vol. 1, pp 221-222. IEEE CS Press, 1997.
7. T. Erlebach, K. Jansen, C. Kaklamanis, M. Mihail, P. Persiano. Optimal wavelength
routing on directed fiber trees. Theoret. Comput. Sci. 221(1-2):119-137, 1999.
8. Ph. Flajolet. Combinatorial aspects of continued fractions. Discrete Math., 32:125-161, 1980.
9. J. Françon, X. Viennot. Permutations selon leurs pics, creux, doubles montées et
doubles descentes, nombres d’Euler et nombres de Genocchi. Discrete Math., 28:21-35, 1979.
10. M.R. Garey, D.S. Johnson, G.L. Miller, C.H. Papadimitriou. The complexity of
colouring circular arcs and chords. SIAM J. Alg. Disc. Meth., 1(2):216-227, 1980.
11. L. Gargano, P. Hell, S. Perennes. Coloring all directed paths in a symmetric tree
with applications to WDM routing. In Proc. of ICALP, LNCS 1256, pp 505-515,
1997.
12. M. C. Golumbic. Algorithmic graph theory and perfect graphs. Academic Press,
New York, 1980.
13. Q.-P. Gu, H. Tamaki. Routing a permutation in the hypercube by two sets of edge-disjoint paths. In Proc. of 10th IPPS. IEEE CS Press, 1996.
14. U. I. Gupta, D. T. Lee, J. Y.-T. Leung. Efficient algorithms for interval graphs
and circular-arc graphs. Networks, 12:459-467, 1982.
15. S. R. Kumar, R. Panigrahy, A. Russell, R. Sundaram. A note on optical routing on
trees. Inf. Process. Lett., 62:295-300, 1997.
16. G. Louchard. Random walks, Gaussian processes and list structures. Theoret. Comput. Sci., 53:99-124, 1987.
17. M. Paterson, H. Schröder, O. Sýkora, I. Vrto. On Permutation Communications
in All-Optical Rings. In Proc. of SIROCCO, 1998.
18. P. Raghavan, U. Upfal. Efficient routing in all optical networks. In Proc. of the
26th ACM STOC, pp 133-143, 1994.
19. A. Tucker. Coloring a family of circular arcs. SIAM J. Appl. Maths., 29(3):493-502,
1975.
20. X. Viennot. A combinatorial theory for general orthogonal polynomials with extensions and applications. Lect. Notes Math., 1171:139-157, 1985. Polynômes orthogonaux et applications, Proc. Laguerre Symp., Bar-le-Duc/France 1984.
21. X. Viennot. Une théorie combinatoire des polynômes orthogonaux généraux. Notes
de conférences, Univ. Quebec, Montréal, 1985.
22. G. Wilfong, P. Winkler. Ring routing and wavelength translation. In Proc. of the
9th Annual ACM-SIAM SODA, pp 333-341, 1998.
Subresultants Revisited
Extended Abstract
Joachim von zur Gathen and Thomas Lücking
FB Mathematik-Informatik, Universität Paderborn
33095 Paderborn, Germany
{gathen,luck}@upb.de
1
Introduction
1.1
Historical Context
The Euclidean Algorithm was first documented by Euclid (320–275 BC). Knuth
(1981), p. 318, writes: “We might call it the granddaddy of all algorithms, because it is the oldest nontrivial algorithm that has survived to the present day.” It
performs division with remainder repeatedly until the remainder becomes zero.
With inputs 13 and 9 it performs the following:
13 = 1 · 9 + 4,
9 = 2 · 4 + 1,
4 = 4 · 1 + 0.
This allows us to compute the greatest common divisor (gcd) of two integers
as the last non-vanishing remainder. In the example, the gcd of 13 and 9 is
computed as 1.
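The computation above takes only a few lines of code (our own illustration):

```python
def euclid(a, b):
    # Repeated division with remainder; the gcd is the last nonzero remainder.
    while b != 0:
        a, b = b, a % b
    return a

print(euclid(13, 9))  # 1, as in the example above
```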
At the end of the 17th century the concept of polynomials was evolving.
Researchers were interested in finding the common roots of two polynomials f
and g. One question was whether it is possible to apply the Euclidean Algorithm
to f and g. In 1707 Newton solved this problem and showed that this always
works in Q[x].
x³ + 2x² − x − 2 = (x/2 + 3/2)(2x² − 2x − 4) + (4x + 4),
2x² − 2x − 4 = (x/2 − 1)(4x + 4) + 0.
In this example f = x3 + 2x2 − x − 2 and g = 2x2 − 2x − 4 have a greatest
common divisor 4x + 4, and therefore the only common root is −1. In a certain
sense the Euclidean Algorithm computes all common roots. If you only want to
know whether f and g have at least one common root, then the whole Euclidean
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 318–342, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Algorithm has to be executed. Thus the next goal was to find an indicator for
common roots without using any division with remainder.
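Newton's observation, that division with remainder, and hence the Euclidean Algorithm, always works in Q[x], can be sketched as follows (our own illustration; polynomials are coefficient lists with the highest degree first):

```python
from fractions import Fraction

def poly_divmod(f, g):
    # Division with remainder in Q[x]; e.g. x^3 + 2x^2 - x - 2 is [1, 2, -1, -2].
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = []
    while len(r) >= len(g):
        c = r[0] / g[0]                               # next quotient coefficient
        q.append(c)
        pad = g + [Fraction(0)] * (len(r) - len(g))
        r = [a - c * b for a, b in zip(r, pad)][1:]   # leading term cancels
    while r and r[0] == 0:                            # normalize the remainder
        r = r[1:]
    return q, r

def poly_gcd(f, g):
    # Euclidean Algorithm in Q[x]: the gcd is the last nonzero remainder.
    while g:
        f, g = g, poly_divmod(f, g)[1]
    return f

f = [1, 2, -1, -2]     # x^3 + 2x^2 - x - 2
g = [2, -2, -4]        # 2x^2 - 2x - 4
q, r = poly_divmod(f, g)
assert q == [Fraction(1, 2), Fraction(3, 2)] and r == [4, 4]  # (x/2 + 3/2), 4x + 4
assert poly_gcd(f, g) == [4, 4]   # gcd 4x + 4: the only common root is -1
```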
The key to success was found in 1748 by Euler, and later by Bézout. They
defined the resultant of f and g as the smallest polynomial in the coefficients of f
and g that vanishes if and only if f and g have a common root. In 1764 Bézout was
the first to find a matrix whose determinant is the resultant. The entries of this
Bézout matrix are quadratic functions of the coefficients of f and g. Today we
use the matrix discovered by Sylvester in 1840, known as the Sylvester matrix. Its
entries are simply coefficients of the polynomials f and g. Sylvester generalized
his definition and introduced what we now call subresultants as determinants
of certain submatrices of the Sylvester matrix. They are nonzero if and only if
the corresponding degree appears as a degree of a remainder of the Euclidean
Algorithm.
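Sylvester's construction is easy to state in code. The sketch below (our own illustration; the cofactor determinant is only meant for tiny matrices) builds the Sylvester matrix and evaluates its determinant, which vanishes exactly when f and g have a common root:

```python
def sylvester(f, g):
    # Sylvester matrix of f (degree n) and g (degree m): m shifted copies of
    # f's coefficients stacked over n shifted copies of g's; size (n+m) x (n+m).
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * i + f + [0] * (size - n - 1 - i) for i in range(m)]
    rows += [[0] * i + g + [0] * (size - m - 1 - i) for i in range(n)]
    return rows

def det(M):
    # Cofactor expansion along the first row (fine for small examples).
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

f = [1, 2, -1, -2]   # x^3 + 2x^2 - x - 2 = (x - 1)(x + 1)(x + 2)
g = [2, -2, -4]      # 2x^2 - 2x - 4 = 2(x - 2)(x + 1)
assert det(sylvester(f, g)) == 0                  # common root -1: resultant vanishes
assert det(sylvester([1, 0, -2], [1, 1])) == -1   # x^2 - 2 and x + 1 are coprime
```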
These indicators, in particular the resultant, also work for polynomials in
Z[x]. So the question came up whether it is possible to apply the Euclidean
Algorithm to f and g in Z[x] without leaving Z[x]. The answer is no, as illustrated
in the example above, since division with remainder is not always defined in Z[x],
although the gcd exists. In the example it is x + 1.
However, in 1836 Jacobi found a way out. He introduced pseudo-division:
he multiplied f by a certain power of the leading coefficient of g before
performing the division with remainder. This is always possible in Z[x]. So using
pseudo-division instead of division with remainder in every step of the Euclidean
Algorithm yields an algorithm with all intermediate results in Z[x].
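Jacobi's pseudo-division can be sketched as follows (our own illustration): premultiplying f by lc(g)^(deg f − deg g + 1) makes every division step exact over Z.

```python
def pseudo_divmod(f, g):
    # Pseudo-division in Z[x]: lc(g)^(deg f - deg g + 1) * f = q*g + r with
    # q, r in Z[x]; polynomials are coefficient lists, highest degree first.
    d = len(f) - len(g) + 1
    r = [c * g[0] ** d for c in f]                # premultiply by lc(g)^d
    q = []
    while len(r) >= len(g):
        c = r[0] // g[0]                          # exact by the premultiplication
        q.append(c)
        pad = g + [0] * (len(r) - len(g))
        r = [a - c * b for a, b in zip(r, pad)][1:]
    while r and r[0] == 0:                        # normalize the remainder
        r = r[1:]
    return q, r

f = [1, 2, -1, -2]   # x^3 + 2x^2 - x - 2
g = [2, -2, -4]      # 2x^2 - 2x - 4
assert pseudo_divmod(f, g) == ([2, 6], [16, 16])  # 4f = (2x + 6)g + (16x + 16)
```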
About 40 years later Kronecker did research on the Laurent series in x−1 of
g/f for two polynomials f and g. He considered the determinants of a matrix
whose entries are the coefficients of the Laurent series of g/f . He obtained the
same results as Sylvester, namely that these determinants are nonzero if and
only if the corresponding degree appears in the degree sequence of the Euclidean
Algorithm. Furthermore Kronecker gave a direct way to compute low degree
polynomials s, t and r with sf + tg = r via determinants of matrices derived
again from the Laurent series of g/f, and showed that these polynomials are
essentially the only ones. He also proved that the polynomial r, if nonzero,
agrees with a remainder in the Euclidean Algorithm, up to a constant multiple.
This was the first occurrence of polynomial subresultants.
In the middle of our century, again 70 years later, the advent of computers
made it possible to perform more and more complicated algorithms faster and
faster. However, using pseudo-division in every step of the Euclidean Algorithm
causes exponential coefficient growth. This was suspected in the late 1960’s.
Collins (1967), p. 139 writes: “Thus, for the Euclidean algorithm, the lengths
of the coefficients increases exponentially.” In Brown & Traub (1971) we find:
“Although the Euclidean PRS algorithm is easy to state, it is thoroughly impractical since the coefficients grow exponentially.” An exponential upper bound
is in Knuth (1981), p. 414: “Thus the upper bound [. . . ] would be approximately
N^{0.5·(2.414)^n}, and experiments show that the simple algorithm does in fact have
this behavior; the number of digits in the coefficients grows exponentially at
each step!”. However, we did not find a proof of an exponential lower bound;
our bound in Theorem 7.3 seems to be new.
One way out of this exponential trap is to make every intermediate result
primitive, that is, to divide the remainders by the greatest common divisors of
their coefficients, the so-called content. However, computing the contents seemed
to be very expensive since in the worst case the gcd of all coefficients has to be
computed. So researchers tried to find divisors of the contents without using any gcd computation. Around 1970, first Collins and then Brown & Traub
reinvented the polynomial subresultants as determinants of a certain variant of
the Sylvester matrix. Habicht had also defined them independently in 1948.
Collins and Brown & Traub showed that they agree with the remainders of the
Euclidean Algorithm up to a constant factor. They gave simple formulas to compute this factor and introduced the concept of polynomial remainder sequences
(PRS), generalizing the concept of Jacobi. The final result is the subresultant
PRS that features linear coefficient growth with intermediate results in Z[x].
Since then two further concepts have come up. On the one hand, the fast
EEA allows one to compute an arbitrary intermediate line in the Euclidean Scheme
directly. Using the fast O(n log n log log n) multiplication algorithm of Schönhage
and Strassen, the time for a gcd reduces from O(n^2) to O(n log^2 n log log n)
field operations (see Strassen (1983)). On the other hand, the modular EEA is
very efficient. These two topics are not considered in this work; for further
information we refer to von zur Gathen & Gerhard (1999), Chapters 6 and 11.
1.2 Outline
After introducing the notation and some well-known facts in Section 2, we start
with an overview and comparison of various definitions of subresultants in Section 3. Mulders (1997) describes an error in software implementations of an
integration algorithm which was due to the confusion caused by these various
definitions. It turns out that there are essentially two different ways of defining
them: the scalar and the polynomial subresultants. Furthermore we show their
relation with the help of the Euclidean Algorithm. In the remainder of this work
we will mainly consider the scalar subresultants.
In Section 4 we give a formal definition of polynomial remainder sequences
and derive the most famous ones as special cases of our general notion. The
relation between polynomial remainder sequences and subresultants is exhibited
in the Fundamental Theorem 5.1 in Section 5. It unifies many results in the
literature on various types of PRS which can be derived as corollaries from
this theorem. In Section 6 we apply it to the various definitions of polynomial
remainder sequences already introduced. This yields a collection of results from
Collins (1966, 1967, 1971, 1973), Brown (1971, 1978), Brown & Traub (1971),
Lickteig & Roy (1997) and von zur Gathen & Gerhard (1999). Lickteig & Roy
(1997) found a recursion formula for polynomial subresultants not covered by the
Fundamental Theorem. We translate it into a formula for scalar subresultants
and use it to finally solve an open question in Brown (1971), p. 486. In Section 7
we analyse the coefficient growth and the running time of the various PRS.
Finally in Section 8 we report on implementations of the various polynomial
remainder sequences and compare their running times. It turns out that computing contents is quite fast for random inputs, and that the primitive PRS behaves
much better than expected.
Much of this Extended Abstract is based on the existing literature. The
following results are new:
– rigorous and general definition of division rules and PRS,
– proof that all constant multipliers in the subresultant PRS for polynomials
over an integral domain R are also in R,
– exponential lower bound for the running time of the pseudo PRS (algorithm).
2 Foundations
In this chapter we introduce the basic algebraic notions. We refer to von zur Gathen & Gerhard (1999), Sections 2.2 and 25.5, for the notation and fundamental
facts about greatest common divisors and determinants. More information on
these topics is in Hungerford (1990).
2.1 Polynomials
Let R be a ring. In what follows, this always means a commutative ring with 1.
A basic tool in computer algebra is division with remainder. For given polynomials f and g in R[x] of degrees n and m, respectively, the task is to find
polynomials q and r in R[x] with
f = qg + r and deg r < deg g.    (2.1)
Unfortunately such q and r do not always exist.
Example 2.2. It is not possible to divide x2 by 2x + 3 with remainder in Z[x]
because x2 = (ux + v)(2x + 3) + r with u, v, r ∈ Q has the unique solution
u = 1/2, v = −3/4 and r = 9/4, which is not over Z.
If defined and unique we call q = quo(f, g) the quotient and r = rem(f, g)
the remainder. A ring with a length function (like the degree of polynomials)
and where division with remainder is always defined is a Euclidean domain. R[x]
is a Euclidean domain if and only if R is a field. Moreover a solution of (2.1) is
not necessarily unique if the leading coefficient lc(g) of g is a zero divisor.
Example 2.3. Let R = Z8 and consider f = 4x2 + 2x and g = 2x + 1. With
q1 = 2x, r1 = 0 and q2 = 2x + 4, r2 = 4
we obtain
q1 g + r1 = 2x(2x + 1) + 0 = 4x2 + 2x = f,
q2 g + r2 = (2x + 4)(2x + 1) + 4 = 4x2 + 10x + 8 = 4x2 + 2x = f.
Thus we have two distinct solutions (q1 , r1 ) and (q2 , r2 ) of (2.1).
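Both identities are easy to check by machine. The following sketch (the helper names are ours, not the paper's) multiplies out both candidate pairs over Z8:

```python
# Checks Example 2.3: over Z8, dividing f = 4x^2 + 2x by g = 2x + 1 admits
# two distinct quotient/remainder pairs, because lc(g) = 2 is a zero divisor
# mod 8. Polynomials are coefficient lists, lowest degree first.

def poly_mul_mod(a, b, m):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % m
    return out

def poly_add_mod(a, b, m):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % m for x, y in zip(a, b)]

f = [0, 2, 4]          # 4x^2 + 2x
g = [1, 2]             # 2x + 1
q1, r1 = [0, 2], [0]   # q1 = 2x,     r1 = 0
q2, r2 = [4, 2], [4]   # q2 = 2x + 4, r2 = 4

for q, r in [(q1, r1), (q2, r2)]:
    lhs = poly_add_mod(poly_mul_mod(q, g, 8), r, 8)
    lhs = lhs + [0] * (len(f) - len(lhs))
    assert lhs == f    # both pairs satisfy q·g + r = f in Z8[x]
```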
A way to get solutions for all commutative rings is the general pseudo-division
which allows multiplication of f by a ring element α:
αf = qg + r, deg r < deg g.    (2.4)
If α = gm^{n−m+1}, then this is the (classical) pseudo-division. If lc(g) is not a zero
divisor, then (2.4) always has a unique solution in R[x]. We call q = pquo(f, g)
the pseudo-quotient and r = prem(f, g) the pseudo-remainder.
Example 2.2 continued. For x2 and 2x + 3 we get the pseudo-division
2^2 · x^2 = (2x − 3)(2x + 3) + 9.
A simple computation shows that we cannot choose α = 2.
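Pseudo-division is easy to state as code. The sketch below (a helper of our own, not from the paper) scales f by lc(g)^{n−m+1} and then divides as usual, checking that quotient and remainder land back in Z[x]; it reproduces the identity 2^2 · x^2 = (2x − 3)(2x + 3) + 9 above.

```python
from fractions import Fraction

def pseudo_divmod(f, g):
    """Return (q, r) with lc(g)^(deg f − deg g + 1)·f = q·g + r over Z.

    f, g are integer coefficient lists, highest degree first, deg f >= deg g.
    """
    n, m = len(f) - 1, len(g) - 1
    alpha = g[0] ** (n - m + 1)
    rem = [Fraction(alpha * c) for c in f]
    q = []
    for i in range(n - m + 1):
        t = rem[i] / g[0]              # exact arithmetic, no rounding
        q.append(t)
        for j, gc in enumerate(g):
            rem[i + j] -= t * gc
    r = rem[n - m + 1:]
    # classical pseudo-division always lands back in Z[x]
    assert all(c.denominator == 1 for c in q + r)
    return [int(c) for c in q], [int(c) for c in r]

q, r = pseudo_divmod([1, 0, 0], [2, 3])   # divide x^2 by 2x + 3
assert (q, r) == ([2, -3], [9])           # 4x^2 = (2x − 3)(2x + 3) + 9
```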
Lemma 2.5.
(i) Pseudo-division always yields a solution of (2.4) in R[x].
(ii) If lc(g) is not a zero divisor, then any solution of (2.4) has deg q = n − m.
Lemma 2.6. The solution (q, r) of (2.4) is uniquely determined if and only if
lc(g) is not a zero-divisor.
Let R be a unique factorization domain. We then have gcd(f, g) ∈ R[x] for
f, g ∈ R[x], and the content cont(f) = gcd(f0, . . . , fn) ∈ R of f = Σ_{0≤j≤n} fj x^j.
The polynomial is primitive if cont(f ) is a unit. The primitive part pp(f ) is
defined by f = cont(f ) · pp(f ). Note that pp(f ) is a primitive polynomial.
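For R = Z the content and primitive part are one-liners; the sample polynomial below is the pseudo-remainder −297x2 − 729x + 1620 that appears later in Example 4.7 (the helper names are ours):

```python
from functools import reduce
from math import gcd

def content(f):
    # f: nonzero integer coefficient list
    return reduce(gcd, (abs(c) for c in f))

def primitive_part(f):
    c = content(f)
    return [coef // c for coef in f]

f = [-297, -729, 1620]
assert content(f) == 27
assert primitive_part(f) == [-11, -27, 60]   # f = cont(f) · pp(f)
```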
The Euclidean Algorithm computes the gcd of two polynomials by iterating
the division with remainder:
ri−1 = qi ri + ri+1.    (2.7)

3 Various Notions of Subresultants

3.1 The Sylvester Matrix
The various definitions of the subresultant are based on the Sylvester matrix.
Therefore we first take a look at the historical motivation for this special matrix.
Our goal is to decide whether two polynomials f = Σ_{0≤j≤n} fj x^j and
g = Σ_{0≤j≤m} gj x^j ∈ R[x] of degree n ≥ m > 0 over a commutative ring R
in the indeterminate x have a common root. To find an answer for this question, Euler (1748) and Bézout (1764) introduced the (classical) resultant that
vanishes if and only if this is true. Although Bézout also succeeded in finding
a matrix whose determinant is equal to the resultant, today called Bézout matrix, we will follow the elegant derivation in Sylvester (1840). The two linear
equations
fn xn + fn−1 xn−1 + · · · + f1 x1 + f0 x0 = 0
gm xm + gm−1 xm−1 + · · · + g1 x1 + g0 x0 = 0
in the indeterminates x0 , . . . , xn are satisfied if xj = αj for all j, where α is a
common root of f and g. For n > 1 there are many more solutions of these two
linear equations in many variables, but Sylvester eliminates them by adding the
(m − 1) + (n − 1) linear equations that correspond to the following additional
conditions:
xf (x) = 0 , . . . , xm−1 f (x) = 0,
xg(x) = 0 , . . . , xn−1 g(x) = 0.
These equations give a total of n + m linear relations among the variables
xm+n−1 , · · · , x0 :
fn xm+n−1 + · · · + f0 xm−1 = 0
          ⋮
fn xn + fn−1 xn−1 + · · · + f0 x0 = 0
gm xm+n−1 + · · · + g0 xn−1 = 0
          ⋮
gm xm + gm−1 xm−1 + · · · + g0 x0 = 0
Clearly xj = αj gives a solution for any common root α of f and g, but the point
is that (essentially) the converse also holds: a solution of the linear equations
gives a common root (or factor). The (n+m)×(n+m) matrix, consisting of coefficients of f and g, that belongs to this system of linear equations is often called
Sylvester matrix. In the sequel we follow von zur Gathen & Gerhard (1999), Section 6.3, p. 144, and take its transpose.
Definition 3.1. Let R be a commutative ring and let f = Σ_{0≤j≤n} fj x^j and
g = Σ_{0≤j≤m} gj x^j ∈ R[x] be polynomials of degree n ≥ m > 0, respectively.
Then the (n + m) × (n + m) matrix
              ⎡ fn                     gm                     ⎤
              ⎢ fn−1  fn               gm−1  gm               ⎥
              ⎢  ⋮    fn−1  ⋱           ⋮    gm−1  ⋱          ⎥
              ⎢  ⋮     ⋮    ⋱   fn      ⋮     ⋮    ⋱   gm     ⎥
Syl(f, g) =   ⎢ f0     ⋮        fn−1   g0     ⋮        gm−1   ⎥
              ⎢       f0    ⋱    ⋮           g0    ⋱    ⋮     ⎥
              ⎢              ⋱   ⋮                  ⋱    ⋮    ⎥
              ⎣                 f0                      g0    ⎦
                └────── m ──────┘       └────── n ──────┘

is called the Sylvester matrix of f and g: each of the first m columns contains
the coefficients fn, . . . , f0 of f, shifted down by one row from column to column,
and the remaining n columns likewise contain the coefficients gm, . . . , g0 of g.
Remark 3.2. Multiplying the (n + m − j)th row by xj and adding it to the
last row for 1 ≤ j < n + m, we get the (n + m) × (n + m) matrix Syl∗(f, g). Thus
det(Syl(f, g)) = det(Syl∗(f, g)).
More details on resultants can be found in Biermann (1891), Gordan (1885) and
Haskell (1892). Computations for both the univariate and multivariate case are
discussed in Collins (1971).
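The kernel argument above can be checked mechanically. The sketch below (helper names are ours) builds the Sylvester matrix in the transposed convention, so that each column lists one equation x^i·f = 0 or x^i·g = 0, and verifies that for two polynomials with the common root α = 1 the vector of powers of α satisfies all n + m equations:

```python
def sylvester(f, g):
    # coefficient lists, highest degree first; column convention of
    # von zur Gathen & Gerhard: m shifted columns of f-coefficients,
    # then n shifted columns of g-coefficients
    n, m = len(f) - 1, len(g) - 1
    cols = [[0] * i + f + [0] * (m - 1 - i) for i in range(m)]
    cols += [[0] * i + g + [0] * (n - 1 - i) for i in range(n)]
    return [[col[r] for col in cols] for r in range(n + m)]

f = [1, -3, 2]    # (x − 1)(x − 2)
g = [1, -4, 3]    # (x − 1)(x − 3)
alpha = 1         # common root of f and g
S = sylvester(f, g)
size = len(S)
powers = [alpha ** (size - 1 - j) for j in range(size)]
# each column of S encodes one of the equations x^i·f = 0, x^i·g = 0:
for c in range(size):
    assert sum(S[r][c] * powers[r] for r in range(size)) == 0
```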
3.2 The Scalar Subresultant
We are interested in finding out which degrees appear in the degree sequence of
the intermediate results in the Euclidean Algorithm. Below we will see that the
scalar subresultants provide a solution to this problem.
Definition 3.3. Let R be a commutative ring and f = Σ_{0≤j≤n} fj x^j and
g = Σ_{0≤j≤m} gj x^j ∈ R[x] polynomials of degree n ≥ m > 0, respectively. The
determinant σk(f, g) ∈ R of the (m + n − 2k) × (m + n − 2k) matrix
             ⎡ fn                       gm                        ⎤
             ⎢ fn−1      fn             gm−1       gm             ⎥
             ⎢  ⋮         ⋱   ⋱          ⋮          ⋱   ⋱         ⎥
             ⎢ fn−m+k+1   ⋯   ⋯   fn    gk+1        ⋯   ⋯   gm    ⎥
Sk (f, g) =  ⎢  ⋮                 ⋮      ⋮                  ⋮     ⎥
             ⎢ fk+1       ⋯   ⋯   fm    gm−n+k+1    ⋯   ⋯   gm    ⎥
             ⎢  ⋮                 ⋮      ⋮                  ⋮     ⎥
             ⎣ f2k−m+1    ⋯   ⋯   fk    g2k−n+1     ⋯   ⋯   gk    ⎦
               └───── m − k ─────┘      └───── n − k ─────┘
is called the kth (scalar) subresultant of f and g. By convention an fj or
gj with j < 0 is zero. If f and g are clear from the context, then we write Sk
and σk for short instead of Sk (f, g) and σk (f, g).
Sylvester (1840) already contains an explicit description of the (scalar) subresultants. In Habicht (1948), p. 104, σk is called Nebenresultante (minor resultant) for polynomials f and g of degrees n and n − 1. The definition is also in
von zur Gathen (1984) and is used in von zur Gathen & Gerhard (1999), Section 6.10, p. 169.
Remark 3.4.
(i) S0 = Syl(f, g) and therefore σ0 = det(S0 ) is the resultant.
(ii) σm = gm^{n−m}.
(iii) Sk is the matrix obtained from the Sylvester matrix by deleting the last 2k
rows, the last k columns with coefficients of f , and the last k columns with
coefficients of g.
(iv) Sk is a submatrix of Si if k ≥ i.
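Remark 3.4(iii) gives a direct way to compute the σk: delete rows and columns of Syl(f, g) and take a determinant. The sketch below (our own helpers, exact arithmetic via fractions) does this for the polynomials of Example 4.7, whose Euclidean degree sequence is 6, 4, 2, 1, 0, so that among σ0, . . . , σ4 exactly σ3 vanishes, as Corollary 5.2 predicts:

```python
from fractions import Fraction

def sylvester(f, g):
    n, m = len(f) - 1, len(g) - 1
    cols = [[0] * i + f + [0] * (m - 1 - i) for i in range(m)]
    cols += [[0] * i + g + [0] * (n - 1 - i) for i in range(n)]
    return [[col[r] for col in cols] for r in range(n + m)]

def det(rows):
    # plain fraction-based Gaussian elimination; fine for a small sketch
    a = [[Fraction(x) for x in row] for row in rows]
    size, sign = len(a), 1
    for i in range(size):
        p = next((r for r in range(i, size) if a[r][i] != 0), None)
        if p is None:
            return Fraction(0)
        if p != i:
            a[i], a[p], sign = a[p], a[i], -sign
        for r in range(i + 1, size):
            t = a[r][i] / a[i][i]
            for c in range(i, size):
                a[r][c] -= t * a[i][c]
    result = Fraction(sign)
    for i in range(size):
        result *= a[i][i]
    return result

def scalar_subresultant(f, g, k):
    # Remark 3.4(iii): drop the last 2k rows and the last k columns
    # of each coefficient block of Syl(f, g)
    n, m = len(f) - 1, len(g) - 1
    syl = sylvester(f, g)
    keep = list(range(m - k)) + list(range(m, m + n - k))
    return det([[row[c] for c in keep] for row in syl[: n + m - 2 * k]])

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
assert [k for k in range(5) if scalar_subresultant(f, g, k) != 0] == [0, 1, 2, 4]
assert scalar_subresultant(f, g, 4) == 9       # Remark 3.4(ii): gm^(n−m) = 3^2
assert scalar_subresultant(f, g, 2) == 9801
```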
3.3 The Polynomial Subresultant
We now introduce two slightly different definitions of polynomial subresultants.
The first one is from Collins (1967), p. 129, and the second one is from Brown
& Traub (1971), p. 507 and also in Zippel (1993), Chapter 9.3, p. 150. They
yield polynomials that are related to the intermediate results in the Euclidean
Algorithm.
Definition 3.5. Let R be a commutative ring, and f = Σ_{0≤j≤n} fj x^j and
g = Σ_{0≤j≤m} gj x^j ∈ R[x] polynomials of degree n ≥ m > 0. Let Mik = Mik(f, g) be
the (n + m − 2k) × (n + m − 2k) submatrix of Syl(f, g) obtained by deleting the last
k of the m columns of coefficients of f , the last k of the n columns of coefficients
of g, and the last 2k + 1 rows except row (n + m − i − k), for 0 ≤ k ≤ m and 0 ≤
i ≤ n. The polynomial Rk(f, g) = Σ_{0≤i≤n} det(Mik) x^i ∈ R[x] is called the kth
polynomial subresultant of f and g. In fact Collins (1967) considered the
transposed matrices. If f and g are clear from the context, then we write Rk for
short instead of Rk(f, g). Note that det(Mik) = 0 if i > k since then the last row
of Mik is identical to the (n + m − i − k)th row. Thus Rk = Σ_{0≤i≤k} det(Mik) x^i.
Remark 3.6.
(i) M00 = Syl(f, g) and therefore R0 = det(M00 ) is the resultant.
(ii) Remark 3.4(i) implies σ0 = R0 .
Definition 3.7. Let R be a commutative ring and f = Σ_{0≤j≤n} fj x^j and
g = Σ_{0≤j≤m} gj x^j ∈ R[x] polynomials of degree n ≥ m > 0. We consider the
determinant Zk(f, g) = det(Mk∗) ∈ R[x] of the (n + m − 2k) × (n + m − 2k) matrix Mk∗
obtained from Mik by replacing the last row with (x^{m−k−1} f, · · · , f, x^{n−k−1} g, · · · , g).
Table 1 gives an overview of the literature concerning these notions. There
is a much larger body of work about the special case of the resultant, which we
do not quote here.
3.4 Comparison of the Various Definitions
As in Brown & Traub (1971), p. 508, and Geddes et al. (1992), Section 7.3,
p. 290, we first have the following theorem which shows that the definitions
in Collins (1967) and Brown & Traub (1971) describe the same polynomial.
Theorem 3.8.
(i) If σk (f, g) ≠ 0, then σk (f, g) is the leading coefficient of Rk (f, g). Otherwise,
deg Rk (f, g) < k.
(ii) Rk (f, g) = Zk (f, g).

Definition                                  Authors
σk(f, g) = det(Sk) ∈ R                      Sylvester (1840), Habicht (1948),
                                            von zur Gathen (1984),
                                            von zur Gathen & Gerhard (1999)
Rk(f, g) = Σ_{0≤i≤n} det(Mik) x^i           Collins (1967), Loos (1982),
                                            Geddes et al. (1992)
  = Zk(f, g) = det(Mk∗) ∈ R[x]              Brown & Traub (1971),
                                            Zippel (1993), Lickteig & Roy (1997),
                                            Reischert (1997)

Table 1. Definitions of subresultants
Lemma 3.9. Let F be a field, f and g in F [x] be polynomials of degree n ≥
m > 0, respectively, and let ri , si and ti be the entries in the ith row of the
Extended Euclidean Scheme, so that ri = si f + ti g for 0 ≤ i ≤ ℓ. Moreover, let
ρi = lc(ri ) and ni = deg ri for all i. Then
(σni/ρi) · ri = Rni for 2 ≤ i ≤ ℓ.
Remark 3.10. Let f and g be polynomials over an integral domain R, let F be
the field of fractions of R, and consider the Extended Euclidean Scheme of f
and g in F [x]. Then the scalar and the polynomial subresultants are in R and
R[x], respectively, and Lemma 3.9 also holds:
(σni/ρi) · ri = Rni ∈ R[x].
Note that ri is not necessarily in R[x], and ρi not necessarily in R.
4 Division Rules and Polynomial Remainder Sequences (PRS)
We cannot directly apply the Euclidean Algorithm to polynomials f and g over
an integral domain R since polynomial division with remainder in R[x], which
is used in every step of the Euclidean Algorithm, is not always defined. Hence
our goal now is to modify the definitions in such a way that they yield a variant of
the Euclidean Algorithm that works over an integral domain. We introduce a
generalization of the usual pseudo-division, the concept of division rules, which
leads to intermediate results in R[x].
Subresultants Revisited
327
Definition 4.1. Let R be an integral domain. A one-step division rule is a partial mapping
R : R[x]2 ⇢ R2
such that for all (f, g) ∈ def(R) there exist q, r ∈ R[x] satisfying
(i) R(f, g) = (α, β),
(ii) αf = qg + βr and deg r < deg g.
Recall that def(R) ⊆ R[x]2 is the domain of definition of R, that is, the set
of (f, g) ∈ R[x]2 at which R is defined. In particular, R : def(R) −→ R2 is a
total map. In the examples below, we will usually define one-step division rules
by starting with a (total or partial) map R0 : R[x]2 →− R2 and then taking R
to be the maximal one-step division rule consistent with R0 . Thus
def(R) = {(f, g) ∈ R[x]2 :∃α, β ∈ R, ∃q, r ∈ R[x]
(α, β) = R0 (f, g) and (ii) holds},
and R is R0 restricted to def(R). Furthermore (f, 0) is never in def(R) (“you
can’t divide by zero”), so that
def(R) ⊆ Dmax = R[x] × (R[x] \ {0}).
We are particularly interested in one-step division rules R with def(R) = Dmax .
In our examples, (0, g) will always be in def(R) if g ≠ 0.
We may consider the usual remainder as a partial function rem : R[x]2 ⇢
R[x] with rem(f, g) = r if there exist q, r ∈ R[x] with f = qg + r and deg r <
deg g, and def(rem) maximal. Recall from Section 2 the definitions of rem, prem
and cont.
Example 4.2. Let f and g be polynomials over an integral domain R of degrees
n and m, respectively, and let fn = lc(f ), gm = lc(g) ≠ 0 be their leading
coefficients. Then the three most famous types of division rules are as follows:
– classical division rule: R(f, g) = (1, 1).
– monic division rule: R(f, g) = (1, lc(rem(f, g))).
– Sturmian division rule: R(f, g) = (1, −1).
Examples are given below. When R is a field, these three division rules have
the largest possible domain of definition def(R) = Dmax , but otherwise, it may
be smaller; we will illustrate this in Example 4.7. Hence they do not help us in
achieving our goal of finding rules with maximal domain Dmax . But there exist
two division rules which, in contrast to the first examples, always yield solutions
in R[x]:
– pseudo-division rule: R(f, g) = (gm^{n−m+1}, 1).
In case R is a unique factorization domain, we have the
– primitive division rule: R(f, g) = (gm^{n−m+1}, cont(prem(f, g))).
For algorithmic purposes, it is then useful for R to be a Euclidean domain.
The disadvantage of the pseudo-division rule, however, is that in the Euclidean Algorithm it leads to exponential coefficient growth; the coefficients of
the intermediate results are usually enormous: their bit length may be exponential in the bit length of the input polynomials f and g. If R is a UFD, we get
the smallest intermediate results if we use the primitive division rule, but the
computation of the content in every step of the Euclidean Algorithm seems to
be expensive. Collins (1967) already observed this in his experiments. Thus he
tries to avoid the computation of contents and to keep the intermediate results
“small” at the same time by using information from all intermediate results in
the EEA, not only the two previous remainders. Our concept of one-step division
rules does not cover his method. So we now extend our previous definition, and
will actually capture all the “recursive” division rules from Collins (1967, 1971,
1973), Brown & Traub (1971) and Brown (1971) under one umbrella.
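The trade-off just described is easy to observe by running the Euclidean loop once with the pseudo-division rule and once with the primitive division rule; the inputs below are the polynomials used in Example 4.7. This is only a sketch with our own helper names, not the paper's code:

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def prem(f, g):
    # classical pseudo-remainder over Z; coefficient lists, highest degree first
    n, m = len(f) - 1, len(g) - 1
    rem = [Fraction(g[0]) ** (n - m + 1) * c for c in f]
    for i in range(n - m + 1):
        t = rem[i] / g[0]
        for j, gc in enumerate(g):
            rem[i + j] -= t * gc
    r = list(rem[n - m + 1:])
    while r and r[0] == 0:
        r = r[1:]
    assert all(c.denominator == 1 for c in r)   # pseudo-division stays in Z[x]
    return [int(c) for c in r]

def primitive_rule(a, b):
    # primitive division rule: divide each pseudo-remainder by its content
    r = prem(a, b)
    if not r:
        return r
    c = reduce(gcd, (abs(x) for x in r))
    return [x // c for x in r]

def prs(f, g, step):
    seq = [f, g]
    while len(seq[-1]) > 1:
        r = step(seq[-2], seq[-1])
        if not r:
            break
        seq.append(r)
    return seq

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]

pseudo = prs(f, g, prem)
assert pseudo[2] == [-297, -729, 1620]
assert pseudo[4] == [-1659945865306233453993]   # 22 digits after two more steps

prim = prs(f, g, primitive_rule)
assert prim[2:] == [[-11, -27, 60], [18320, -27659], [-1]]
```

The pseudo PRS already shows the coefficient explosion discussed above, while the primitive PRS keeps every remainder as small as possible.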
Definition 4.3. Let R be an integral domain. A division rule is a partial mapping
R : R[x]2 ⇢ (R2)∗
associating to (f, g) ∈ def(R) a sequence ((α2 , β2 ), . . . , (αℓ+1 , βℓ+1 )) of arbitrary
length ℓ such that for all (f, g) ∈ def(R) there exist ℓ ∈ N≥0 , q1 , . . . , qℓ ∈ R[x]
and r0 , . . . , rℓ+1 ∈ R[x] satisfying
(i) r0 = f, r1 = g,
(ii) Ri (f, g) = R(f, g)i = (αi , βi ),
(iii) αi ri−2 = qi−1 ri−1 + βi ri and deg ri < deg ri−1
for 2 ≤ i ≤ ℓ + 1. The integer ℓ = |R(f, g)| is the length of the sequence.
A division rule where ℓ = 1 for all values is the same as a one-step division
rule, and from an arbitrary division rule we can obtain a one-step division rule
by projecting to the first coordinate (α2 , β2 ) if ℓ ≥ 2. Using Lemma 2.6, we find
that for all (f, g) ∈ def(R), qi−1 and ri are unique for 2 ≤ i ≤ ℓ + 1. If we have
a one-step division rule R∗ which is defined at all (ri−2 , ri−1 ) for 2 ≤ i ≤ ℓ + 1
(defined recursively), then we obtain a division rule R by using R∗ in every step:
Ri (f, g) = R∗ (ri−2 , ri−1 ) = (α, β).
If we truncate R at the first coordinate, we get R∗ back. But the notion of
division rules is strictly richer than that of one-step division rules; for example
the first step in the reduced division rule below is just the pseudo-division rule,
but using the pseudo-division rule repeatedly does not yield the reduced division
rule.
Example 4.2 continued. Let f = r0 , g = r1 , r2 , . . . , rℓ ∈ R[x] be as in Definition 4.3, let ni = deg ri be their degrees, ρi = lc(ri ) their leading coefficients,
and di = ni − ni+1 ∈ N≥0 for 0 ≤ i ≤ ℓ (if n0 ≥ n1 ). We now present two
different types of recursive division rules. They are based on polynomial subresultants. It is not obvious that they have domain of definition Dmax , since
divisions occur in their definitions. We will show that this is indeed the case in
Remarks 6.8 and 6.12.
– reduced division rule: Ri (f, g) = (αi , βi ) for 2 ≤ i ≤ ℓ + 1, where we set
α1 = 1 and
    (αi , βi ) = (ρi−1^{di−2 + 1}, αi−1) for 2 ≤ i ≤ ℓ + 1.
– subresultant division rule: Ri (f, g) = (αi , βi ) for 2 ≤ i ≤ ℓ + 1, where we
set ρ0 = 1, ψ2 = −1, and ψ3 , . . . , ψℓ+1 ∈ R with
    (αi , βi ) = (ρi−1^{di−2 + 1}, −ρi−2 ψi^{di−2}) for 2 ≤ i ≤ ℓ + 1,
    ψi = (−ρi−2)^{di−3} ψi−1^{1 − di−3} for 3 ≤ i ≤ ℓ + 1.
The subresultant division rule was invented by Collins (1967), p. 130. He tried
to find a rule such that the ri ’s agree with the polynomial subresultants up to a
small constant factor. Brown (1971), p. 486, then provided a recursive definition
of the αi and βi as given above. Brown (1971) also describes an “improved
division rule”, where one has some magical divisor of ρi .
We note that the exponents in the recursive definition of the ψi ’s in the
subresultant division rule may be negative. Hence it is not clear that the βi ’s are
in R. However, we will show this in Theorem 6.15, and so answer the following
open question that was posed in Brown (1971), p. 486:
Question 4.4. “At the present time it is not known whether or not these equations imply ψi , βi ∈ R.”
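Question 4.4 can at least be probed numerically. The sketch below (helper names are ours) runs the subresultant division rule on the polynomials of Example 4.7, tracking ψ as an exact fraction since the recursion may demand negative powers, and checks that every βi nevertheless divides the pseudo-remainder exactly:

```python
from fractions import Fraction

def prem_frac(f, g):
    # pseudo-remainder over the fractions; coefficients highest degree first
    n, m = len(f) - 1, len(g) - 1
    rem = [Fraction(g[0]) ** (n - m + 1) * c for c in f]
    for i in range(n - m + 1):
        t = rem[i] / g[0]
        for j, gc in enumerate(g):
            rem[i + j] -= t * gc
    r = list(rem[n - m + 1:])
    while r and r[0] == 0:
        r = r[1:]
    return r

def subresultant_prs(f, g):
    r = [[Fraction(c) for c in f], [Fraction(c) for c in g]]
    deg = lambda p: len(p) - 1
    rho = {0: Fraction(1), 1: Fraction(g[0])}   # ρ0 := 1 by the rule's convention
    psi = {2: Fraction(-1)}                     # ψ2 := −1
    i = 2
    while deg(r[i - 1]) > 0:
        if i >= 3:
            dp = deg(r[i - 3]) - deg(r[i - 2])  # d_{i−3}
            psi[i] = (-rho[i - 2]) ** dp * psi[i - 1] ** (1 - dp)
        d = deg(r[i - 2]) - deg(r[i - 1])       # d_{i−2}
        beta = -rho[i - 2] * psi[i] ** d
        rem = prem_frac(r[i - 2], r[i - 1])
        if not rem:
            break
        ri = [c / beta for c in rem]
        assert all(c.denominator == 1 for c in ri)   # the βi do divide exactly
        r.append(ri)
        rho[i] = ri[0]
        i += 1
    return [[int(c) for c in p] for p in r]

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
prs = subresultant_prs(f, g)
assert prs[2] == [297, 729, -1620]
assert prs[3] == [13355280, -20163411]
assert prs[4] == [9657273681]
```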
By definition, a division rule R defines a sequence (r0 , . . . , rℓ ) of remainders; recall that they are uniquely defined. Since it is more convenient to work
with these “polynomial remainder sequences”, we fix this notion in the following
definition.
Definition 4.5. Let R be a division rule. A sequence (r0 , . . . , rℓ ) with each ri ∈
R[x] \ {0} is called a polynomial remainder sequence (PRS) for (f, g) according
to R if
(i) r0 = f, r1 = g,
(ii) Ri (f, g) = (αi , βi ),
(iii) αi ri−2 = qi−1 ri−1 + βi ri ,
for 2 ≤ i ≤ ℓ + 1, where ℓ is the length of R(f, g). The PRS is complete if
rℓ+1 = 0. It is called normal if di = deg ri − deg ri+1 = 1 for 1 ≤ i ≤ ℓ − 1
(Collins (1967), p. 128/129).
In fact the remainders for PRS according to arbitrary division rules over an
integral domain only differ by a nonzero constant factor.
Proposition 4.6. Let R be an integral domain, f, g ∈ R[x] and r = (r0 , . . . , rℓ )
and r∗ = (r0∗, . . . , r∗ℓ∗) be two PRS for (f, g) according to two division rules R
and R∗, respectively, none of whose results αi , βi , αi∗, βi∗ is zero. Then ri∗ = γi ri
with

    γi = Π_{0≤k≤i/2−1} (α∗i−2k βi−2k) / (αi−2k β∗i−2k) ∈ F \ {0}

for 0 ≤ i ≤ min{ℓ, ℓ∗}, where F is the field of fractions of R.
The proposition yields a direct way to compute the PRS for (f, g) according
to R∗ from the PRS for (f, g) according to R and the αi , βi , αi∗ , βi∗ . In particular,
the degrees of the remainders in any two PRS are identical.
In Example 4.2 we have seen seven different division rules. Now we consider
the different polynomial remainder sequences according to these rules. Each PRS
will be illustrated by the following example.
Example 4.7. We perform the computations on the polynomials
f = r0 = 9x6 − 27x4 − 27x3 + 72x2 + 18x − 45 and
g = r1 = 3x4 − 4x2 − 9x + 21
over R = Q and, wherever possible, also over R = Z.
In every PRS, r0 = f and r1 = g; the further remainders are:

 i   classical                        monic                    Sturmian                        pseudo
 2   −11x2 − 27x + 60                 x2 + (27/11)x − 60/11    11x2 + 27x − 60                 −297x2 − 729x + 1620
 3   −(164880/1331)x + 248931/1331    x − 27659/18320          (164880/1331)x − 248931/1331    3245333040x − 4899708873
 4   −1959126851/335622400            1                        −1959126851/335622400           −1659945865306233453993

 i   primitive            reduced                    subresultant
 2   −11x2 − 27x + 60     −297x2 − 729x + 1620       297x2 + 729x − 1620
 3   18320x − 27659       120197520x − 181470699     13355280x − 20163411
 4   −1                   86915463129                9657273681
1. Classical PRS. The most familiar PRS for (f, g) is obtained according
to the classical division rule. Collins (1973), p. 736, calls this the natural
Euclidean PRS (algorithm). The intermediate results of the classical PRS
and of the Euclidean Algorithm coincide.
2. Monic PRS. In Collins (1973), p. 736, the PRS for (f, g) according to the
monic division rule is called monic PRS (algorithm). The ri are monic for
2 ≤ i ≤ ℓ, and we get the same intermediate results as in the monic Euclidean
Algorithm in von zur Gathen & Gerhard (1999), Section 3.2, p. 47.
3. Sturmian PRS. We choose the PRS for (f, g) according to the Sturmian division rule as introduced in Sturm (1835). Kronecker (1873), p. 117,
Habicht (1948), p. 102, and Loos (1982), p. 119, deal with this generalized
Sturmian PRS (algorithm). Kronecker (1873) calls it Sturmsche Reihe (Sturmian sequence), and in Habicht (1948) it is the verallgemeinerte Sturmsche
Kette (generalized Sturmian chain). If g = ∂f /∂x as in Habicht (1948), p. 99,
then this is the classical Sturmian PRS (algorithm). Note that the Sturmian
PRS agrees with the classical PRS up to sign.
If R is not a field, then Example 4.7 shows that the first three types of PRS
may not have Dmax as their domain of definition. In the example they are only
of length 1. But fortunately there are division rules that have this property.
4. Pseudo PRS. If we choose the PRS according to the pseudo-division rule,
then we get the so-called pseudo PRS. Collins (1967), p. 138, calls this the
Euclidean PRS (algorithm) because it is the most obvious generalization of
the Euclidean Algorithm to polynomials over an integral domain R that is
not a field. Collins (1973), p. 737, also calls it pseudo-remainder PRS.
5. Primitive PRS. To obtain a PRS over R with minimal coefficient growth,
we choose the PRS according to the primitive division rule which yields
primitive intermediate results. Brown (1971), p. 484, calls this the primitive
PRS (algorithm).
6. Reduced PRS. A perceived drawback of the primitive PRS is the (seemingly) costly computation of the content; recently the algorithm of Cooperman et al. (1999) achieves this with an expected number of less than two
integer gcd’s. In fact, in our experiments in Section 8, the primitive PRS
turns out to be most efficient among those discussed here. But Collins (1967)
introduced his reduced PRS (algorithm) in order to avoid the computation of
contents completely. His algorithm uses the reduced division rule and keeps
the intermediate coefficients reasonably small but not necessarily as small as
with the primitive PRS.
7. Subresultant PRS. The reduced PRS is not the only way to keep the coefficients small without computing contents. We can also use the subresultant
division rule. According to Collins (1967), p. 130, this is the subresultant
PRS (algorithm).
5 Fundamental Theorem on Subresultants
Collins’ Fundamental Theorem on subresultants expresses an arbitrary subresultant as a power product of certain data in the PRS, namely the multipliers α and
β and the leading coefficients of the remainders in the Euclidean Algorithm. In
this section our first goal is to prove the Fundamental Theorem on subresultants
for polynomial remainder sequences according to an arbitrary division rule R.
The following result is shown for PRS in Brown & Traub (1971), p. 511,
Fundamental theorem, and for reduced PRS in Collins (1967), p. 132, Lemma 2,
and p. 133, Theorem 1.
Fundamental Theorem 5.1. Let f and g ∈ R[x] be polynomials of degrees
n ≥ m > 0, respectively, over an integral domain R, let R be a division rule and
(r0 , . . . , rℓ ) be the PRS for (f, g) according to R, (αi , βi ) = Ri (f, g) the constant
multipliers, ni = deg ri and ρi = lc(ri ) for 0 ≤ i ≤ ℓ, and di = ni − ni+1 for
0 ≤ i ≤ ℓ − 1.
(i) For 0 ≤ j ≤ n1 , the jth subresultant of (f, g) is

    σj(f, g) = (−1)^{bi} ρi^{ni−1 − ni} Π_{2≤k≤i} (βk/αk)^{nk−1 − ni} ρk−1^{nk−2 − nk}

if j = ni for some 1 ≤ i ≤ ℓ, otherwise 0, where bi = Σ_{2≤k≤i} (nk−2 − ni)(nk−1 − ni).
(ii) The subresultants satisfy for 1 ≤ i < ℓ the recursive formulas

    σn1(f, g) = ρ1^{d0} and
    σni+1(f, g) = σni(f, g) · (−1)^{di(n0 − ni+1 + i + 1)} (ρi+1 ρi)^{di} (Π_{2≤k≤i+1} βk/αk)^{di}.
Corollary 5.2. Let R be a division rule and (r0 , . . . , rℓ ) be the PRS for (f, g)
according to R, let ni = deg ri for 0 ≤ i ≤ ℓ be the degrees in the PRS, and let
0 ≤ k ≤ n1 . Then
σk ≠ 0 ⇐⇒ ∃i: k = ni.
6 Applications of the Fundamental Theorem
Following our program, we now derive results for the various PRS for polynomials
f, g ∈ R[x] of degrees n ≥ m ≥ 0, respectively, over an integral domain R,
according to the division rules in Example 4.2.
Corollary 6.1. Let (r0 , . . . , rℓ ) be a classical PRS and 1 ≤ i ≤ ℓ. Then

(i) σni(f, g) = (−1)^{bi} ρi^{di−1} Π_{2≤k≤i} ρk−1^{nk−2 − nk}.

(ii) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{di(n0 − ni+1 + i + 1)} (ρi+1 ρi)^{di}.

If the PRS is normal, then this simplifies to:

(iii) σni(f, g) = (−1)^{(d0+1)(i+1)} ρi ρ1^{d0+1} Π_{3≤k≤i} ρk−1^2 for i ≥ 2.

(iv) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{d0+1} ρi+1 ρi.
The following is the Fundamental Theorem 11.13 in von zur Gathen & Gerhard
(1999), Chapter 11.2, p. 307.
Corollary 6.2. Let (r0 , . . . , rℓ ) be a monic PRS, and 1 ≤ i ≤ ℓ. Then

(i) σni(f, g) = (−1)^{bi} ρ0^{n1 − ni} ρ1^{n0 − ni} Π_{2≤k≤i} βk^{nk−1 − ni}.

(ii) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{di(n0 − ni+1 + i + 1)} (ρ0 ρ1 β2 · · · βi+1)^{di}.

If the PRS is normal, then this simplifies to:

(iii) σni(f, g) = (−1)^{(d0+1)(i+1)} ρ0^{i−1} ρ1^{d0+i−1} Π_{2≤k≤i} βk^{i−(k−1)} for i ≥ 2.

(iv) The subresultants satisfy for 1 ≤ i < ℓ the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{d0+1} ρ0 ρ1 β2 · · · βi+1.
Corollary 6.3. Let (r0 , . . . , rℓ ) be a Sturmian PRS, and 1 ≤ i ≤ ℓ. Then

(i) σni(f, g) = (−1)^{bi + Σ_{2≤k≤i} (nk−1 − ni)} ρi^{di−1} Π_{2≤k≤i} ρk−1^{nk−2 − nk}.

(ii) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{di(n0 − ni+1 + 1)} (ρi+1 ρi)^{di}.

If the PRS is normal, then this simplifies to:

(iii) σni(f, g) = (−1)^{(d0+1)(i+1)} ρ1^{d0+1} ρi Π_{3≤k≤i} ρk−1^2 for i ≥ 2.

(iv) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{d0+i+1} ρi+1 ρi.
The following corollary can be found in Collins (1966), p. 710, Theorem 1,
for polynomial subresultants.
Corollary 6.4. Let (r0 , . . . , rℓ ) be a pseudo PRS, and 1 ≤ i ≤ ℓ. Then

(i) σni(f, g) = (−1)^{bi} ρi^{di−1} Π_{2≤k≤i} ρk−1^{nk−2 − nk − (nk−1 − ni)(dk−2 + 1)}.
(ii) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{di(n0 − ni+1 + i + 1)} (ρi+1 ρi)^{di} Π_{2≤k≤i+1} ρk−1^{−(dk−2 + 1) di}.

If the PRS is normal, then this simplifies to:

(iii) σni(f, g) = (−1)^{(d0+1)(i+1)} ρ1^{(d0+1)(2−i)} ρi Π_{3≤k≤i−1} ρk−1^{2(k−i)} for i ≥ 2.

(iv) The subresultants satisfy the recursive formulas
    σn1(f, g) = ρ1^{d0}, and
    σni+1(f, g) = σni(f, g) · (−1)^{d0+1} ρ1^{−(d0+1)} ρi+1 ρi Π_{3≤k≤i+1} ρk−1^{−2}.
Remark 6.5. If the PRS is normal, then Corollary 6.4(iii) implies that
σ_{n_i}(f, g) (−1)^{(d_0+1)(i+1)} ρ_1^{(d_0+1)(i−2)} ∏_{3≤k≤i−1} ρ_{k−1}^{2(i−k)} = ρ_i.
Thus σ_{n_i}(f, g) divides ρ_i. This result is also shown for polynomial subresultants in Collins (1966), p. 711, Corollary 1.
Since the content of two polynomials cannot be expressed in terms of our parameters ρ_i and n_i, we do not consider the Fundamental Theorem for the primitive PRS.
The following is shown for polynomial subresultants in Collins (1967), p. 135,
Corollaries 1.2 and 1.4.
Corollary 6.6. Let (r_0, ..., r_ℓ) be a reduced PRS, and 1 ≤ i ≤ ℓ. Then
(i) σ_{n_i}(f, g) = (−1)^{b_i} ρ_i^{d_{i−1}} ∏_{2≤k≤i} ρ_{k−1}^{d_{k−2}(1−d_{k−1})}.
(ii) The subresultants satisfy the recursive formulas
σ_{n_1}(f, g) = ρ_1^{d_0}, and
σ_{n_{i+1}}(f, g) = σ_{n_i}(f, g) · (−1)^{d_i(n_0−n_{i+1}+i+1)} ρ_{i+1}^{d_i} ρ_i^{−d_{i−1}d_i}.
If the PRS is normal, then this simplifies to:
(iii) σ_{n_i}(f, g) = (−1)^{(d_0+1)(i+1)} ρ_i for i ≥ 2.
(iv) The subresultants satisfy the recursive formulas
σ_{n_1}(f, g) = ρ_1^{d_0}, and
σ_{n_{i+1}}(f, g) = σ_{n_i}(f, g) · (−1)^{d_0+1} ρ_{i+1} ρ_i^{−1}.
Remark 6.7. We obtain from Corollary 6.6(i)
σ_{n_i}(f, g) ∏_{2≤k≤i} (−1)^{(n_{k−2}−n_i)(n_{k−1}−n_i)} ρ_{k−1}^{d_{k−2}(d_{k−1}−1)} = ρ_i^{d_{i−1}}.
Thus σ_{n_i}(f, g) divides ρ_i^{d_{i−1}}. This result can also be found in Collins (1967), p. 135, Corollary 1.2.
Remark 6.8. For every reduced PRS, r_i is in R[x] for 2 ≤ i ≤ ℓ. Note that Corollary 6.6(iii) implies r_i = (−1)^{(d_0+1)(i+1)} R_i(f, g). So the normal case is clear. An easy proof for the general case, based on polynomial subresultants, is in Collins (1967), p. 134, Corollary 1.1, and Brown (1971), pp. 485–486.
Lemma 6.9. Let e_{i,j} = d_{j−1} ∏_{j≤k≤i} (1−d_k), and let ψ_i be as in the subresultant division rule. Then
ψ_i = − ∏_{1≤j≤i−2} ρ_j^{e_{i−3,j}} for 2 ≤ i ≤ ℓ.
Corollary 6.10. Let (r_0, ..., r_ℓ) be a subresultant PRS, and 1 ≤ i ≤ ℓ. Then
(i) σ_{n_i}(f, g) = ∏_{1≤k≤i} ρ_k^{e_{i−1,k}}.
(ii) The subresultants satisfy the recursive formulas
σ_{n_1}(f, g) = ρ_1^{d_0}, and
σ_{n_{i+1}}(f, g) = σ_{n_i}(f, g) · ρ_{i+1}^{d_i} ∏_{1≤k≤i} ρ_k^{−d_i e_{i−1,k}}.
If the PRS is normal, then this simplifies to:
(iii) σ_{n_i}(f, g) = ρ_i for i ≥ 2.
(iv) The subresultants satisfy the recursive formulas
σ_{n_1}(f, g) = ρ_1^{d_0}, and
σ_{n_{i+1}}(f, g) = σ_{n_i}(f, g) · ρ_{i+1} ρ_i^{−1}.
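To make the recursion concrete, here is a minimal Python sketch (our own illustration, not code from the paper) of the subresultant PRS in the Collins/Brown style of Knuth's Algorithm C, where each pseudo-remainder is divided by g·h^δ; this agrees with the subresultant PRS of Corollary 6.10 up to signs, since the exact divisors β_i differ from g·h^δ only by a factor ±1.

```python
# Sketch (ours, not the paper's): subresultant PRS over Z[x] in the
# Collins/Brown style (Knuth's Algorithm C).  Each pseudo-remainder is
# divided by g * h**delta, which matches the subresultant PRS up to sign.
# Polynomials are coefficient lists, lowest degree first, no trailing zeros.

def prem(u, v):
    """Pseudo-remainder of u by v: lc(v)**(deg u - deg v + 1) * u mod v."""
    r, dv, lv = u[:], len(v) - 1, v[-1]
    e = len(u) - len(v) + 1          # number of scalings still owed
    while r and len(r) - 1 >= dv:
        lr, dr = r[-1], len(r) - 1
        r = [lv * c for c in r]      # one scaling per reduction step
        for j in range(dv + 1):      # cancel the leading term of r
            r[dr - dv + j] -= lr * v[j]
        while r and r[-1] == 0:
            r.pop()
        e -= 1
    return [c * lv**e for c in r]    # pay the remaining scalings

def exact_div(p, d):
    assert all(c % d == 0 for c in p)   # exactness is guaranteed by theory
    return [c // d for c in p]

def subresultant_prs(u, v):          # assumes deg u >= deg v >= 0, v != 0
    prs, g, h = [u[:], v[:]], 1, 1
    while True:
        delta = len(u) - len(v)      # deg u - deg v
        r = prem(u, v)
        if not r:
            return prs
        r = exact_div(r, g * h**delta)
        prs.append(r)
        u, v = v, r
        g = u[-1]                    # lc of the divisor just used
        h = h if delta == 0 else g**delta // h**(delta - 1)
```

For instance, on f = (x² + 1)(x − 2) and g = (x² + 1)(x + 3) the last element of the sequence is −5(x² + 1), exposing the gcd.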
Now we have all the tools to prove the relation between the normal reduced and the normal subresultant PRS, which can be found in Collins (1967), p. 135, Corollary 1.3, and Collins (1973), p. 738.
Corollary 6.11. Let (r0 , . . . , rℓ ) be a normal reduced PRS and (a0 , . . . , aℓ ) a
normal subresultant PRS for the polynomials r0 = a0 = f and r1 = a1 = g.
Then the following holds for 2 ≤ i ≤ ℓ:
lc(ri ) = (−1)(n0 −ni )(n1 −ni ) · lc(ai ).
Remark 6.12. For every subresultant PRS the polynomials ri are in R[x] for
2 ≤ i ≤ ℓ. Note that Corollary 6.10(iii) implies ri = Ri (f, g). So the normal case
is clear. An easy proof for the general case based on polynomial subresultants is
in Collins (1967), p. 130, and Brown (1971), p. 486.
Corollary 6.10 does not provide the only recursive formula for subresultants. Another one is based on an idea in Lickteig & Roy (1997), p. 12, and Reischert (1997), p. 238, where the following formula was proved for polynomial subresultants. Translating this result into a theorem on scalar subresultants leads us to an answer to Question 4.4.
Theorem 6.13. Let (r_0, ..., r_ℓ) be a subresultant PRS. Then the subresultants satisfy for 1 ≤ i < ℓ the recursive formulas
σ_{n_1}(f, g) = ρ_1^{d_0} and
σ_{n_{i+1}}(f, g) = σ_{n_i}(f, g)^{1−d_i} · ρ_{i+1}^{d_i}.
The proof of the conjecture now becomes quite simple:
Corollary 6.14. Let ψ_2 = −1 and ψ_i = (−ρ_{i−2})^{d_{i−3}} ψ_{i−1}^{1−d_{i−3}} for 3 ≤ i ≤ ℓ. Then
ψ_i = −σ_{n_{i−2}}(f, g) for 3 ≤ i ≤ ℓ.
Since all subresultants are in R, this gives an answer to Question 4.4:
Theorem 6.15. The coefficients ψi and βi of the subresultant PRS are always
in R.
7 Analysis of Coefficient Growth and Running Time
We first estimate the running times for normal PRS. A proof of an exponential upper bound for the pseudo PRS is in Knuth (1981), p. 414, but our goal is to show an exponential lower bound. To this end, we prove two such bounds on the bit length of the leading coefficients ρ_i in this PRS. Recall that ρ_1 = lc(g) and σ_{n_1} = ρ_1^{d_0+1}.
Lemma 7.1. Suppose that (f, g) ∈ Z[x]² have a normal pseudo PRS. Then
|ρ_i| ≥ |ρ_1|^{2^{i−3}} for 3 ≤ i ≤ ℓ.
Lemma 7.2. Suppose that (f, g) ∈ Z[x]² have a normal pseudo PRS, and that |ρ_1| = 1. Then
|ρ_i| ≥ |σ_{n_i}(f, g) ∏_{2≤k≤i−2} σ_{n_k}(f, g)^{2^{i−k−1}}| for 3 ≤ i ≤ ℓ.
Theorem 7.3. Computing the pseudo PRS takes exponential time, at least 2^n in some cases, with input polynomials of degrees at most n.
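The exponential growth behind Theorem 7.3 is easy to observe in practice. The following sketch (ours; the function names and the choice of example are not from the paper) runs the undivided pseudo PRS on Knuth's classical example pair and reports the degree sequence and the bit length of the final constant, which dwarfs the 5-bit input coefficients.

```python
# Sketch (ours): pseudo PRS r_{i+1} = prem(r_{i-1}, r_i) with no division,
# run on Knuth's classic example to exhibit the exponential coefficient
# growth behind Theorem 7.3.  Coefficient lists, lowest degree first.

def prem(u, v):
    """Pseudo-remainder of u by v: lc(v)**(deg u - deg v + 1) * u mod v."""
    r, dv, lv = u[:], len(v) - 1, v[-1]
    e = len(u) - len(v) + 1
    while r and len(r) - 1 >= dv:
        lr, dr = r[-1], len(r) - 1
        r = [lv * c for c in r]
        for j in range(dv + 1):
            r[dr - dv + j] -= lr * v[j]
        while r and r[-1] == 0:
            r.pop()
        e -= 1
    return [c * lv**e for c in r]

def pseudo_prs(u, v):
    prs = [u, v]
    while True:
        r = prem(u, v)
        if not r:
            return prs
        prs.append(r)
        u, v = v, r

f = [-5, 2, 8, -3, -3, 0, 1, 0, 1]   # x^8 + x^6 - 3x^4 - 3x^3 + 8x^2 + 2x - 5
g = [21, -9, -4, 0, 5, 0, 3]         # 3x^6 + 5x^4 - 4x^2 - 9x + 21
prs = pseudo_prs(f, g)
print([len(p) - 1 for p in prs])     # degree sequence of the remainders
print(max(abs(c) for c in prs[-1]).bit_length())   # bits of the last constant
```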
We have the following running time bound for the normal reduced PRS algorithm.
Theorem 7.4. Let ‖f‖_∞, ‖g‖_∞ ≤ A, B = (n+1)^n A^{n+m}, and let (r_0, ..., r_ℓ) be the normal reduced PRS for f, g. Then the max-norm of the r_i is at most 4B³, and the algorithm uses O(n³ m log²(nA)) word operations.
Corollary 7.5. Since Corollary 6.11 shows that the normal reduced PRS and the normal subresultant PRS agree up to sign, the estimates in Theorem 7.4 also hold for the normal subresultant PRS.
We conclude the theoretical part of our comparison with an overview of the worst-case running times for the various normal PRS in Table 2. The lengths of the coefficients of f and g are assumed to be at most n. The estimates that are not proven here can be found in von zur Gathen & Gerhard (1999).

PRS                   | time            | proven in
classical/Sturmian    | n^8             | von zur Gathen & Gerhard (1999)
monic                 | n^6             | von zur Gathen & Gerhard (1999)
pseudo                | c^n with c ≥ 2  | Theorem 7.3
primitive             | n^6             | von zur Gathen & Gerhard (1999)
reduced/subresultant  | n^6             | Theorem 7.4

Table 2. Comparison of various normal PRS. The time in bit operations is for polynomials of degree at most n and with coefficients of length at most n, and ignores logarithmic factors.
8 Experiments
We have implemented six of the PRS for polynomials with integral coefficients
in C++, using Victor Shoup’s “Number Theory Library” NTL 3.5a for integer
and polynomial arithmetic. Since the Sturmian PRS agrees with the classical
PRS up to sign, it is not mentioned here. The contents of the intermediate results in the primitive PRS are simply computed by successive gcd computations.
Cooperman et al. (1999) propose a new algorithm that uses only an expected number of two gcd computations, but on random inputs it is slower than the naïve approach. All timings are averages over 10 pseudorandom inputs. The software ran on a Sun Sparc Ultra 1 clocked at 167 MHz.
In the first experiment we pseudorandomly and independently chose three
polynomials f, g, h ∈ Z[x] of degree n − 1 with nonnegative coefficients of length
Fig. 1. Computation of polynomial remainder sequences for polynomials of degree n − 1 with coefficients of bit length less than n, for 1 ≤ n ≤ 32. (The figure plots CPU seconds against n for the pseudo, classical, monic, reduced, subresultant, and primitive PRS.)
less than n, for various values of n. Then we used the various PRS algorithms to compute the gcd of fh and gh, of degrees less than 2n. The running times are shown in Figures 1 and 2.
As seen in Table 2, the pseudo PRS turns out to be the slowest algorithm. The reason is that for random inputs with coefficients of length at most n, the second polynomial is almost never monic. Thus Theorem 7.3 shows that for random inputs the running time of the pseudo PRS is typically exponential. A surprising result is that the primitive PRS, even implemented in a straightforward manner, turns out to be the fastest PRS. Collins and Brown & Traub invented the subresultant PRS precisely in order to avoid the primitive PRS, since it seemed too expensive; but our tests show that for our current software this is not a problem.
Polynomial remainder sequences of random polynomials tend to be normal.
Since Corollary 6.11 shows that reduced and subresultant PRS agree up to signs
in the normal case, their running times also differ by little.
We are also interested in comparing the reduced and subresultant PRS, so
we construct PRS which are not normal. To this end, we pseudorandomly and
independently choose six polynomials f, f1 , g, g1 , h, h1 for various degrees n as
follows:
Fig. 2. Computation of polynomial remainder sequences for polynomials of degree n − 1 with coefficients of bit length less than n, for 32 ≤ n ≤ 96. Time is now measured in minutes. (The figure plots CPU minutes against n for the monic, reduced, subresultant, and primitive PRS.)
polynomial | degree | coefficient length
f, g       | n/6    | n/4
f1, g1     | n/3    | n
h          | n/2    | 3n/4
h1         | n      | n
So the polynomials
F = (fh · x^n + f1) h1,
G = (gh · x^n + g1) h1
have degrees less than 2n with coefficient length less than n, and every polynomial remainder sequence of F and G has a degree jump of n/3 at degree 2n − n/6. Then we used the various PRS algorithms to compute the gcd of F and G. The running times are illustrated in Figures 3 and 4.
As in the first test series the pseudo PRS turns out to be the slowest, and
the primitive PRS is the fastest. Here the monic PRS is faster than the reduced
PRS. Since the PRS is non-normal, the αi ’s are powers of the leading coefficients
of the intermediate results, and their computation becomes quite expensive.
Fig. 3. Computation of non-normal polynomial remainder sequences for polynomials of degree 2n − 1 with coefficient length less than n and a degree jump of n/3 at degree 2n − n/6, for 1 ≤ n ≤ 32. (The figure plots CPU seconds against n for the pseudo, classical, monic, reduced, subresultant, and primitive PRS.)
References
Étienne Bézout, Recherches sur le degré des équations résultantes de
l’évanouissement des inconnues. Histoire de l’académie royale des sciences (1764),
288–338. Summary 88–91.
Otto Biermann, Über die Resultante ganzer Functionen. Monatshefte fuer Mathematik und Physik (1891), 143–146. II. Jahrgang.
W. S. Brown, On Euclid’s Algorithm and the Computation of Polynomial Greatest
Common Divisors. Journal of the ACM 18(4) (1971), 478–504.
W. S. Brown, The Subresultant PRS Algorithm. ACM Transactions on Mathematical
Software 4(3) (1978), 237–249.
W. S. Brown and J. F. Traub, On Euclid’s Algorithm and the Theory of Subresultants. Journal of the ACM 18(4) (1971), 505–514.
G. E. Collins, Polynomial remainder sequences and determinants. The American
Mathematical Monthly 73 (1966), 708–712.
George E. Collins, Subresultants and Reduced Polynomial Remainder Sequences.
Journal of the ACM 14(1) (1967), 128–142.
George E. Collins, The Calculation of Multivariate Polynomial Resultants. Journal
of the ACM 18(4) (1971), 515–532.
G. E. Collins, Computer algebra of polynomials and rational functions. The American
Mathematical Monthly 80 (1973), 725–755.
Fig. 4. Computation of non-normal polynomial remainder sequences for polynomials of degree 2n − 1 with coefficient length less than n and a degree jump of n/3 at degree 2n − n/6, for 32 ≤ n ≤ 96. Time is now measured in minutes. (The figure plots CPU minutes against n for the monic, reduced, subresultant, and primitive PRS.)
Gene Cooperman, Sandra Feisel, Joachim von zur Gathen, and George
Havas, Gcd of many integers. In COCOON ’99, ed. T. Asano et al., Lecture
Notes in Computer Science 1627. Springer-Verlag, 1999, 310–317.
Leonhard Euler, Démonstration sur le nombre des points où deux lignes des ordres
quelconques peuvent se couper. Mémoires de l’Académie des Sciences de Berlin 4
(1748), 1750, 234–248. Eneström 148. Opera Omnia, ser. 1, vol. 26, Orell Füssli,
Zürich, 1953, 46–59.
Joachim von zur Gathen, Parallel algorithms for algebraic problems. SIAM Journal
on Computing 13(4) (1984), 802–824.
Joachim von zur Gathen and Jürgen Gerhard, Modern Computer Algebra. Cambridge University Press, 1999.
K. O. Geddes, S. R. Czapor, and G. Labahn, Algorithms for Computer Algebra.
Kluwer Academic Publishers, 1992.
Paul Gordan, Vorlesungen über Invariantentheorie. Erster Band: Determinanten.
B. G. Teubner, Leipzig, 1885. Herausgegeben von Georg Kerschensteiner.
Walter Habicht, Eine Verallgemeinerung des Sturmschen Wurzelzählverfahrens.
Commentarii Mathematici Helvetici 21 (1948), 99–116.
M. W. Haskell, Note on resultants. Bulletin of the New York Mathematical Society
1 (1892), 223–224.
Thomas W. Hungerford, Abstract Algebra: An Introduction. Saunders College
Publishing, Philadelphia PA, 1990.
C. G. J. Jacobi, De eliminatione variabilis e duabus aequationibus algebraicis. Journal
für die Reine und Angewandte Mathematik 15 (1836), 101–124.
Donald E. Knuth, The Art of Computer Programming, vol.2, Seminumerical Algorithms. Addison-Wesley, Reading MA, 2nd edition, 1981.
L. Kronecker, Die verschiedenen Sturmschen Reihen und ihre gegenseitigen
Beziehungen. Monatsberichte der Königlich Preussischen Akademie der Wissenschaften, Berlin (1873), 117–154.
L. Kronecker, Zur Theorie der Elimination einer Variabeln aus zwei algebraischen Gleichungen. Monatsberichte der Königlich Preussischen Akademie der Wissenschaften, Berlin (1881), 535–600. Werke, Zweiter Band, ed. K. Hensel, Leipzig,
1897, 113–192. Reprint by Chelsea Publishing Co., New York, 1968.
Thomas Lickteig and Marie-Françoise Roy, Cauchy Index Computation. Calcolo
33 (1997), 331–357.
R. Loos, Generalized Polynomial Remainder Sequences. Computing 4 (1982), 115–
137.
Thom Mulders, A note on subresultants and the Lazard/Rioboo/Trager formula in
rational function integration. Journal of Symbolic Computation 24(1) (1997), 45–
50.
Isaac Newton, Arithmetica Universalis, sive de compositione et resolutione arithmetica liber. J. Senex, London, 1707. English translation as Universal Arithmetick:
or, A Treatise on Arithmetical composition and Resolution, translated by the late
Mr. Raphson and revised and corrected by Mr. Cunn, London, 1728. Reprinted in:
Derek T. Whiteside, The mathematical works of Isaac Newton, Johnson Reprint
Co, New York, 1967, p. 4 ff.
Daniel Reischert, Asymptotically Fast Computation of Subresultants. In Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation
ISSAC ’97, Maui HI, ed. Wolfgang W. Küchlin. ACM Press, 1997, 233–240.
V. Strassen, The computational complexity of continued fractions. SIAM Journal on
Computing 12(1) (1983), 1–27.
C. Sturm, Mémoire sur la résolution des équations numériques. Mémoires présentés
par divers savants à l’Acadèmie des Sciences de l’Institut de France 6 (1835), 273–
318.
J. J. Sylvester, A method of determining by mere inspection the derivatives from two
equations of any degree. Philosophical Magazine 16 (1840), 132–135. Mathematical
Papers 1, Chelsea Publishing Co., New York, 1973, 54–57.
Richard Zippel, Effective polynomial computation. Kluwer Academic Publishers,
1993.
A Unifying Framework for the Analysis of a Class of
Euclidean Algorithms
Brigitte Vallée
GREYC, Université de Caen, F-14032 Caen (France)
Brigitte.Vallee@info.unicaen.fr
Abstract. We develop a general framework for the analysis of algorithms of a
broad Euclidean type. The average-case complexity of an algorithm is seen to be
related to the analytic behaviour in the complex plane of the set of elementary
transformations determined by the algorithms. The methods rely on properties of
transfer operators suitably adapted from dynamical systems theory. As a consequence, we obtain precise average-case analyses of four algorithms for evaluating
the Jacobi symbol of computational number theory fame, thereby solving conjectures of Bach and Shallit. These methods provide a unifying framework for
the analysis of an entire class of gcd-like algorithms together with new results
regarding the probable behaviour of their cost functions.
1 Introduction
Euclid's algorithm, discovered as early as 300 BC, was analysed first in the worst case in 1733 by de Lagny, then in the average case around 1969 independently by Heilbronn [8] and Dixon [5], and finally in distribution by Hensley [9], who proved in 1994 that the Euclidean algorithm has Gaussian behaviour; see Knuth's and Shallit's vivid accounts [12,20]. The first methods used range from combinatorial (de Lagny, Heilbronn) to probabilistic (Dixon). In parallel, studies by Lévy, Khinchin, Kuzmin and Wirsing had established the metric theory of continued fractions by means of a specific density transformer. The more recent works rely to a large extent on transfer operators, a far-reaching generalization of density transformers, originally introduced by Ruelle [17,18] in connection with the thermodynamic formalism and dynamical systems theory [1]. Examples are Mayer's studies on the continued fraction transformation [14], Hensley's work [9] and several papers of the author [21,22], including her analysis of the Binary GCD Algorithm [23].
In this paper, we provide new analyses of several classical and semi-classical variants of the Euclidean algorithm. A strong motivation for our study is a group of gcd-like algorithms that compute the Jacobi symbol, whose relation to quadratic properties of numbers is well known.
Methods. Our approach consists in viewing an algorithm of the broad gcd type as a dynamical system, where each iterative step is a linear fractional transformation (LFT) of the form z → (az + b)/(cz + d). The control of the system may be simple, which we call generic below, or multimodal, which we call Markovian. A specific set of transformations is then associated to each algorithm. It will appear from our treatment that the computational complexity of an algorithm is in fact dictated by the collective dynamics of its associated set of transformations. More precisely, two factors intervene: (i) the characteristics of the LFT's in the complex domain; (ii) their contraction properties, notably
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 343–354, 2000.
© Springer-Verlag Berlin Heidelberg 2000
near fixed points. There results a classification of gcd-like algorithms in terms of the average number of iterations: some of them are “fast”, that is, of logarithmic complexity Θ(log N), while others are “slow”, that is, of the log-squared type Θ(log² N).
It is established here that strong contraction properties of the elementary transformations that build up a gcd-like algorithm entail logarithmic cost, while the presence of
an indifferent fixed-point leads to log-squared behaviour. In the latter case, the analysis
requires a special twist that takes its inspiration from the study of intermittency phenomena in physical systems that was introduced by Bowen [2] and is nicely exposed in
a paper of Prellberg and Slawny [15]. An additional benefit of our approach is to open
access to characteristics of the distribution of costs, including information on moments:
the fast algorithms appear to have concentration of distribution—the cost converges in
probability to its mean—while the slow ones exhibit an extremely large dispersion of
costs.
Technically, this paper relies on a description of the relevant parameters by means of generating functions, by now a common tool in the average-case analysis of algorithms [7]. As is usual in number-theoretic contexts, the generating functions are Dirichlet series. They are first proved to be algebraically related to specific operators that encapsulate all the important information relative to the “dynamics” of the algorithm. Their analytical properties depend on spectral properties of the operators, most notably the existence of a “spectral gap” that separates the dominant eigenvalue from the remainder of the spectrum. This determines the singularities of the Dirichlet series of costs. The asymptotic extraction of coefficients is then achieved by means of Tauberian theorems [4], one of several ways to derive the prime number theorem. Average complexity estimates finally result. The main thread of the paper is thus adequately summarized by the chain:
Euclidean algorithm ❀ Associated transformations ❀ Transfer operator ❀ Dirichlet series of costs ❀ Tauberian inversion ❀ Average-case complexity.
This chain then leads to effective and simple criteria for distinguishing slow algorithms from fast ones, for establishing concentration of distribution, for analysing various cost parameters of algorithms, etc. The constants relative to the slow algorithms are all explicit, while the constants relative to the fast algorithms are closely related to the entropy of the associated dynamical system: they are computable numbers; however, except in two classical cases, they do not seem to be related to classical constants of analysis.
Motivations. We study here eight algorithms: the first four are variations of the classical Euclidean algorithm and are called Classical (G), By-Excess (L), Classical Centered (K), and Subtractive (T). The last four serve to compute the Jacobi symbol introduced in Section 2, and are called Even (E), Odd (O), Ordinary (U) and Centered (C).
The complexity of the first four algorithms is now known: the two classical algorithms (G) and (K) have been analysed by Heilbronn, Dixon and Rieger [16]. The Subtractive algorithm (T) was studied by Knuth and Yao [25], and Vardi [24] analysed the By-Excess Algorithm (L) by comparing it to the Subtractive Algorithm. The methods used are rather disparate, and their applicability to new situations is somewhat unclear. Here, we design
a unifying framework that also provides new results on the distribution of costs.
Two of the Jacobi Symbol Algorithms, the Centered (C) and Even (E) algorithms, have
been introduced respectively by Lebesgue [13] in 1847 and Eisenstein [6] in 1844. Three
of them, the Centered, Ordinary and Even algorithms, have been studied by Shallit [19]
who provided a complete worst-case analysis. The present paper completely solves a conjecture of Bach and Shallit. Indeed, in [19], Shallit writes: “Bach has also suggested that one could investigate the average number of division steps in computing the Jacobi symbol [...]. This analysis is probably feasible to carry out for the Even Algorithm, and it seems likely that the average number of division steps is Θ(log² N). However, determining the average behaviour for the two other algorithms seems quite hard.”
Results and plan of the paper. Section 3 is the central technical section of the paper.
There, we develop the line of attack outlined earlier and introduce successively Dirichlet
generating functions, transfer operators of the Ruelle type, and the basic elements of
Tauberian theory that are adequate for our purposes. The main results of this section
are summarized in Theorem 1 and Theorem 2 that imply a general criterion for logarithmic versus log-squared behaviour, while providing a framework for higher moment
analyses.
In Section 4, we return to our eight favorite algorithms: four classical variations and four Jacobi symbol variations. The corresponding analyses are summarized in Theorems 3 and 4, where we list our main results, some old and some new, that fall as natural consequences of the present framework. It results from the analysis (Theorem 3) that the Fast Class contains two classic algorithms, the Classical Algorithm (G) and the Classical Centered Algorithm (K), together with three Jacobi Symbol algorithms: the Odd (O), Ordinary (U) and Centered (C) Algorithms. Their respective average-case complexities on pairs of integers less than N are of the form H_N ∼ A_H log N for H ∈ {G, K, O, U, C}.
The five constants are effectively characterized in terms of entropies of the associated dynamical system, and the constants related to the two classical algorithms are easily obtained: A_G = (12/π²) log 2, A_K = (12/π²) log φ.
Theorem 4 proves that the Slow Class contains the remaining three algorithms: the By-Excess Algorithm (L), the Subtractive Algorithm (T), and one of the Jacobi Symbol Algorithms, the Even Algorithm (E). They all have a complexity of the log-squared type:
L_N ∼ (3/π²) log² N,  T_N ∼ (6/π²) log² N,  E_N ∼ (2/π²) log² N.
Finally, Theorem 5 provides new probabilistic characterizations of the distribution of the costs: in particular, the approach applies to the analysis of the subtractive GCD algorithms, for which we derive the order of growth of higher moments, which appears to be new. We also prove that concentration of distribution holds in the case of the five fast algorithms (G, K, O, U, C).
Finally, apart from specific analyses, our main contributions are the following:
(a) We show how the transfer operator method may be extended to cope with complex situations where the associated dynamical system may be either random or Markovian (or both!).
(b) An original feature in the context of analysis of algorithms is the encapsulation of the method of inducing (related to intermittency, as evoked above).
(c) Our approach opens access to information on higher moments of the distribution of costs.
2 Eight Variations of the Euclidean Algorithm
We present here the eight algorithms to be analysed; the first four are classical variants
of the Euclidean Algorithm, while the last four are designed for computing the Jacobi
Symbol.
2.1. Variations of the classical Euclidean Algorithm. There are two divisions of v by u (v > u) that produce a positive remainder r with 0 ≤ r < u: the classical division (by default), of the form v = cu + r, and the division by excess, of the form v = cu − r. The centered division of v by u (v > u), of the form v = cu + εr with ε = ±1, produces a positive remainder r with 0 ≤ r < u/2. To these three types of division correspond three Euclidean algorithms, respectively called the Classical Algorithm (G), the By-Excess Algorithm (L), and the Classical Centered Algorithm (K). Finally, the Subtractive Algorithm (T) uses only subtractions and no divisions, since it replaces the classical division v = cu + r by exactly c subtractions of the form v := v − u.
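A compact way to compare the four variants is to run them side by side; the sketch below (our own illustration, with hypothetical function names) implements one step counter per division rule, each returning the number of iterations together with the gcd.

```python
# Sketch (ours): step counters for the four classical variants of the
# Euclidean algorithm on integers v > u >= 1.  Each function returns
# (number of iterations, gcd).

def steps_G(u, v):                  # classical: v = cu + r, 0 <= r < u
    n = 0
    while u:
        u, v = v % u, u
        n += 1
    return n, v

def steps_L(u, v):                  # by excess: v = cu - r, 0 <= r < u
    n = 0
    while u:
        u, v = (-v) % u, u          # r = ceil(v/u)*u - v
        n += 1
    return n, v

def steps_K(u, v):                  # centered: v = cu + eps*r, 0 <= r < u/2
    n = 0
    while u:
        r = v % u
        if 2 * r > u:
            r = u - r               # take the remainder of smaller size
        u, v = r, u
        n += 1
    return n, v

def steps_T(u, v):                  # subtractive: one subtraction per step
    n = 0
    while u != v:
        if v < u:
            u, v = v, u             # exchange
        v -= u
        n += 1
    return n, u
```

For instance, steps_G(34, 55) performs 8 classical divisions (consecutive Fibonacci numbers are a worst case), while steps_K(34, 55) needs only 5 centered ones.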
2.2. Variations for computing the Jacobi symbol. The Jacobi symbol, introduced in [11], is a very important tool in algebra, since it is related to quadratic characteristics of modular arithmetic. Interest in its efficient computation has been reawakened by its utilisation in primality tests and in some important cryptographic schemes.
For two integers u and v (v odd), the possible values of the Jacobi symbol J(u, v) are −1, 0, +1. Even if the Jacobi symbol can be directly computed from the classical Euclidean algorithm, thanks to a formula due to Hickerson [10], quoted in [24], we are mainly interested in specific algorithms that run faster. These algorithms are fundamentally based on the following two properties,
Quadratic Reciprocity law: J(u, v) = (−1)^{(u−1)(v−1)/4} J(v, u) for u, v ≥ 0 odd,
Modulo law: J(v, u) = J(v − bu, u),
and they perform, like the classical Euclidean algorithm, a sequence of Euclidean-like divisions and exchanges. However, the Quadratic Reciprocity law being true only for odd integers, the standard Euclidean division has to be transformed into a pseudo-euclidean division of the form
v = bu + ε2^k s,  with ε = ±1, s odd and strictly less than u,
that creates another pair (s, u) for the following step. Then the symbol J(u, v) is easily computed from the symbol J(s, u).
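Such a pseudo-euclidean division is easy to state in code. The sketch below (ours; the function name is hypothetical) computes the triple (c, k, s) with v = c·u + 2^k·s for the Ordinary algorithm (U), and also illustrates the Markovian link used in Section 2.3: for odd u and v with nonzero remainder, the remainder is even exactly when the quotient c is odd.

```python
# Sketch (ours): pseudo-euclidean division of the Ordinary algorithm (U):
# v = c*u + 2**k * s with s odd (or s = 0), for odd inputs 0 < u < v.

def pseudo_div_U(u, v):
    c, r = divmod(v, u)             # ordinary quotient and remainder
    k = 0
    while r and r % 2 == 0:         # strip powers of 2 from the remainder
        r //= 2
        k += 1
    return c, k, r                  # invariant: v == c*u + 2**k * r

print(pseudo_div_U(9, 35))          # 35 = 3*9 + 2**3 * 1
```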
The binary division, used in the Binary GCD algorithm, can also be used for computing the Jacobi symbol. However, it is different, since the pseudo-division that it uses is not a modification of the classical Euclidean division. We consider here four main algorithms, according to the kind of pseudo-euclidean division that is performed. They are called the Even, Odd, Ordinary and Centered Algorithms, and their inputs are odd integers. The Even algorithm (E) performs divisions with even pseudo-quotients, and thus odd pseudo-remainders. The Odd algorithm (O) performs divisions with odd pseudo-quotients, and thus even pseudo-remainders, from which powers of 2 are removed. The Ordinary (U) and Centered (C) Algorithms perform divisions where the pseudo-quotients are equal to the ordinary quotients or to the centered quotients; then the remainders may be even or odd, and, when they are even, powers of two are removed to obtain the pseudo-remainders.
Alg., Type | Division | LFT's | Conditions

(G) (1, all, 0): v = cu + r, 0 ≤ r < u.  LFT's: 1/(c + x), c ≥ 1.  F: c ≥ 2.
(L) (1, all, 1): v = cu − r, 0 ≤ r < u.  LFT's: 1/(c − x), c ≥ 2.  F: c ≥ 3.
(K) (1/2, all, 0): v = cu + εr, c ≥ 2, ε = ±1, 0 ≤ r < u/2.  LFT's: 1/(c + εx), c ≥ 2, ε = ±1, (c, ε) ≠ (2, −1).  F: ε = 1.
(T) (1, all, 0): v = u + (v − u).  LFT's: T = {q = x/(1 + x), p = 1/(1 + x)}.  Finishes with pq.
(E) (1, odd, 1): v = cu + εs, c even, ε = ±1, s odd, 0 < s < u.  LFT's: 1/(c + εx), c even, ε = ±1.  F: ε = 1.
(O) (1, odd, 0): v = cu + ε2^k s, c odd, ε = ±1, s odd, k ≥ 1, 0 ≤ 2^k s < u.  LFT's: 2^k/(c + εx), k ≥ 1, c odd, c ≥ 2^k + 1.  J: k = 0.
(U) (1, odd, 0): v = cu + 2^k s, s = 0 or s odd, k ≥ 0, 0 ≤ 2^k s < u.  LFT's: U_0 = {1/(c + x), c ≥ 1}, U_1 = {2^k/(c + x), k ≥ 1, c ≥ 2^k}, U_{i|j} = U_j ∩ {c ≡ i mod 2}.  Initial state: 0, final state: 1.
(C) (1/2, odd, 0): v = cu + ε2^k s, s = 0 or s odd, k ≥ 0, 0 ≤ 2^k s < u/2.  LFT's: C_0 = {1/(c + εx)}, ε = ±1, c ≥ 2, (c, ε) ≠ (2, −1); C_1 = {2^k/(c + εx)}, k ≥ 1, ε = ±1, c ≥ 2^{k+1}, (c, ε) ≠ (2^{k+1}, −1); C_{i|j} = C_j ∩ {c ≡ i mod 2}.  Initial state: 0, final state: 1.
2.3. The sets of linear fractional transformations. When performing ℓ (pseudo-)euclidean divisions on the input (u, v), each of the eight algorithms builds a specific continued fraction of height ℓ that decomposes the rational x = u/v as
u/v = h_1 ∘ h_2 ∘ ... ∘ h_ℓ(a),
where the h_i's are linear fractional transformations (LFT's) and a is the last value of the rational. The value a equals 1 for the Even Algorithm (E) and the By-Excess Algorithm (L), and equals 0 for the other six algorithms. The rational inputs of all the algorithms always belong to a basic interval of the form I = [0, ρ], with ρ = 1/2 for the two centered algorithms (K) and (C) and ρ = 1 in the other six cases. For the first four algorithms, the valid inputs are all the rationals of I, while the valid inputs of the last four algorithms are only the odd rationals of I. The variable valid has two possible values {all, odd}, and finally, the type of the algorithm is defined as the value of the triple (ρ, valid, a).
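For the Classical Algorithm (G), this decomposition is just the continued fraction expansion; a small sketch (ours, with hypothetical names) using exact rationals checks that composing the maps h_i(x) = 1/(c_i + x) at the last value a = 0 recovers u/v.

```python
from fractions import Fraction

# Sketch (ours): the LFT chain of the Classical Algorithm (G).
# cf_quotients lists the quotients c_i; evaluate composes the maps
# h_i(x) = 1/(c_i + x) innermost first, starting from the last value a = 0.

def cf_quotients(u, v):             # 0 < u < v
    qs = []
    while u:
        qs.append(v // u)
        u, v = v % u, u
    return qs

def evaluate(qs):
    x = Fraction(0)
    for c in reversed(qs):
        x = 1 / (c + x)             # apply h(x) = 1/(c + x)
    return x

qs = cf_quotients(13, 30)           # [2, 3, 4]
assert evaluate(qs) == Fraction(13, 30)
```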
The precise form of the possible LFT's depends on the algorithm, and there are two classes of algorithms, the generic class and the Markovian class:
In the case of the first six algorithms, there may exist special sets of LFT's in the initial step (J) or in the final step (F). However, all the other steps are generic, in the sense that they use the same set of LFT's, which we call the generic set. These algorithms are themselves called generic.
On the contrary, the last two algorithms, the Ordinary Algorithm (U) and the Centered Algorithm (C), have a Markovian flavour. If the quotient b is odd, then the remainder is even, and thus k satisfies k ≥ 1; if b is even, then the remainder is odd, and thus k satisfies k = 0. This link is of Markovian type, and we consider two states: the 0 state, which means “the remainder of (u, v) is odd”, i.e. k = 0, and the 1 state, which means “the remainder of (u, v) is even”, i.e. k ≥ 1. Denoting by U_j, resp. C_j, the set of LFT's which can be used in state j, we obtain four different sets U_{i|j}, resp. C_{i|j}, each of which brings rationals from state j to state i. The initial state is the 0 state and the final state is the 1 state.
3 Dynamical Operators and Tauberian Theorems
Here, we describe the general tools for analysing algorithms of the Euclidean type
that are based on some division-like operation and exchanges. We first introduce the
generating functions relative to the height of the continued fraction and we relate them
to the dynamical operator associated to the algorithm. This operator can be generic or
Markovian, according to the structure of the algorithm. In this way, the two Dirichlet
series that intervene in the analysis, called F (s) and G(s), are expressed in terms of
the Ruelle operator. The average number of steps involves partial sums of coefficients
of these two Dirichlet series, and Tauberian Theorems are a classical tool that transfers
analytical properties of Dirichlet series into asymptotic behaviour of their coefficients.
3.1. Generating functions. We consider the following sets relative to I := [0, ρ],

  Ω̃ := {(u, v); u, v valid, (u/v) ∈ I},   Ω̃_N := {(u, v) ∈ Ω̃, v ≤ N},
  Ω := {(u, v); u, v valid, gcd(u, v) = 1, (u/v) ∈ I},   Ω_N := {(u, v) ∈ Ω, v ≤ N},

for the possible inputs of an algorithm, and we denote by Ω^{[ℓ]}, Ω̃^{[ℓ]}, Ω_N^{[ℓ]}, Ω̃_N^{[ℓ]} the subsets of Ω, Ω̃, Ω_N, Ω̃_N for which the algorithm performs exactly ℓ pseudo-divisions; equivalently, the height of the continued fraction is equal to ℓ. We study the average numbers of steps S_N, S̃_N of the algorithm on Ω_N, Ω̃_N,
  S_N := (1/|Ω_N|) Σ_{ℓ≥0} ℓ |Ω_N^{[ℓ]}|,   S̃_N := (1/|Ω̃_N|) Σ_{ℓ≥0} ℓ |Ω̃_N^{[ℓ]}|,   (1)
and we wish to evaluate their asymptotic behaviour (for N → ∞). We first consider pairs (u, v) with fixed v = n, and we denote by ν_n^{[ℓ]} (resp. ν̃_n^{[ℓ]}) the number of such
A Unifying Framework for the Analysis of a Class of Euclidean Algorithms
349
elements of Ω^{[ℓ]} (resp. Ω̃^{[ℓ]}). We introduce the double generating functions S(s, w) and S̃(s, w) of the sequences (ν_n^{[ℓ]}) and (ν̃_n^{[ℓ]}),

  S(s, w) := Σ_{ℓ≥1} w^ℓ Σ_{n>1} ν_n^{[ℓ]}/n^s,   S̃(s, w) := Σ_{ℓ≥1} w^ℓ Σ_{n>1} ν̃_n^{[ℓ]}/n^s.   (2)
The Riemann series ζ̂ relative to valid numbers, ζ̂(s) := Σ_{v valid} v^{−s}, relates the two generating functions via the equality S̃(s, w) = ζ̂(s) S(s, w). It is then sufficient to study
S(s, w). We introduce the two sequences (an ) and (bn ) together with their associated
Dirichlet series F (s), G(s),
  a_n := Σ_{ℓ≥1} ν_n^{[ℓ]},   b_n := Σ_{ℓ≥1} ℓ ν_n^{[ℓ]},   F(s) = Σ_{n>1} a_n/n^s,   G(s) = Σ_{n>1} b_n/n^s.   (3)
Now, F(s) and G(s) can be easily expressed in terms of S(s, w), since

  F(s) = S(s, 1),   G(s) = (d/dw) S(s, w) |_{w=1},   (4)
and they intervene, via partial sums of their coefficients, in the quantity S_N:

  S_N = (Σ_{n≤N} Σ_{ℓ≥0} ℓ ν_n^{[ℓ]}) / (Σ_{n≤N} Σ_{ℓ≥0} ν_n^{[ℓ]}) = (Σ_{n≤N} b_n) / (Σ_{n≤N} a_n).   (5)
3.2. Ruelle operators. We now show how the Ruelle operators associated to the algorithms intervene in the evaluation of the generating function S(s, w). We denote by L a set of LFT's. For each h ∈ L, D[h] denotes the denominator of the linear fractional transformation (LFT) h, defined for h(x) = (ax + b)/(cx + d) with a, b, c, d coprime integers by D[h](x) := |cx + d| = |det h|^{1/2} |h′(x)|^{−1/2}. The Ruelle operator L_s relative to the set L depends on a complex parameter s,

  L_s[f](x) := Σ_{h∈L} (1/D[h](x)^s) f ∘ h(x).   (6)
More generally, when given two sets of LFT's, L and K, the set LK is formed of all h ∘ g with h ∈ L and g ∈ K, and the multiplicative property of the denominator D, i.e., D[h ∘ g](x) = D[h](g(x)) D[g](x), implies that the operator K_s ∘ L_s uses all the LFT's of LK:

  K_s ∘ L_s[f](x) := Σ_{h∈LK} (1/D[h](x)^s) f ∘ h(x).   (7)
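As a concrete instance of (6), the generic set of the Classical Algorithm consists of the LFT's h_m(x) = 1/(m + x) for m ≥ 1, with det h_m = ±1 and D[h_m](x) = m + x. A minimal numerical sketch (truncating the infinite sum; the function names are ours) checks the density-transformer property at s = 2 that is used in Section 3.5: the Gauss density is fixed by the operator.

```python
import math

def H_s(f, x, s, M=100000):
    """Truncated Ruelle operator (6) for the classical set {h_m(x) = 1/(m+x), m >= 1}:
    each term is f(h_m(x)) / D[h_m](x)^s with D[h_m](x) = m + x."""
    return sum(f(1.0 / (m + x)) / (m + x) ** s for m in range(1, M + 1))

# At s = 2 the operator is a density transformer (lambda(2) = 1): the Gauss
# density psi(x) = 1 / ((1 + x) log 2) is its fixed point, up to truncation error.
psi = lambda x: 1.0 / ((1.0 + x) * math.log(2.0))
print(H_s(psi, 0.5, 2.0), psi(0.5))   # nearly equal
```

The agreement is exact in the limit M → ∞, since the sum telescopes for this particular density.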
3.3. Ruelle operators and generating functions. The first six algorithms are generic,
since they use the same set H at each generic step. In this case, the ℓ-th iterate of
Hs generates all the LFT’s used in ℓ (generic) steps of the algorithm. The last two
algorithms are Markovian. There are four sets of LFT’s, and each of these sets, denoted
by U_{i|j} “brings" rationals from state j to state i. We denote by U_{s,i|j} the Ruelle operator associated to the set U_{i|j}, and by U_s the “matrix operator"

  U_s = ( U_{s,0|0}  U_{s,0|1}
          U_{s,1|0}  U_{s,1|1} ).   (8)

By the multiplicative property (7), the ℓ-th iterate of U_s generates all the elements of U^{<ℓ>}, i.e., all the possible LFT's of height ℓ. More precisely, the coefficient of index (i, j) of the matrix U_s^ℓ is the Ruelle operator relative to the set U_{i|j}^{<ℓ>} that brings rationals from state j to state i in ℓ steps.
In both cases, the Ruelle operator is then a “generating" operator, and generating functions themselves can be easily expressed with the Ruelle operator:
Proposition 1. The double generating function S(s, w) of the sequence (ν_n^{[ℓ]}) can be expressed as a function of the Ruelle operators associated to the algorithm. In the generic case,

  S(s, w) = w K_s[1](a) + w² F_s ∘ (I − wH_s)^{−1} ∘ J_s[1](a).

Here, the Ruelle operators H_s, F_s, J_s, K_s are relative to the generic set H, final set F, initial set J or mixed set K := J ∩ F; the value a is the final value of the rational u/v. In the Markovian case, the Ruelle operator U_s is a matrix operator, and

  S(s, w) = (0  1) · wU_s (I − wU_s)^{−1} · (1, 0)^T (0).

In both cases, the Dirichlet series F(s) and G(s) involve powers of the quasi-inverse of the Ruelle operator, of order 1 for F(s) and of order 2 for G(s).
3.4. Tauberian Theorems. Finally, we have shown that the average number of steps S_N of the four Algorithms on Ω_N is a ratio whose numerator and denominator involve the partial sums of the Dirichlet series F(s) and G(s). Thus, the asymptotic evaluation of S_N, S̃_N (for N → ∞) is possible if we can apply the following Tauberian theorem [4] to the Dirichlet series F(s), ζ̂(s)F(s), G(s), ζ̂(s)G(s).

Tauberian Theorem. [Delange] Let F(s) be a Dirichlet series with non-negative coefficients such that F(s) converges for ℜ(s) > σ > 0. Assume that (i) F(s) is analytic on ℜ(s) = σ, s ≠ σ, and (ii) for some β ≥ 0, one has F(s) = A(s)(s − σ)^{−β−1} + C(s), where A, C are analytic at σ, with A(σ) ≠ 0. Then, as N → ∞,

  Σ_{n≤N} a_n = (A(σ)/(σ Γ(β + 1))) N^σ log^β N [1 + ε(N)],   ε(N) → 0.
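As a toy illustration of Delange's statement (this is not one of the paper's series, just the simplest Dirichlet series with a pole at σ = 2), take F(s) = ζ(s − 1) = Σ_{n≥1} n/n^s, so a_n = n:

```python
from math import gamma

# F(s) = zeta(s - 1) has a_n = n, converges for Re(s) > sigma = 2, and near s = 2
# behaves as F(s) = A(s) (s - 2)^{-1} + C(s) with beta = 0 and A(2) = 1 (residue 1).
# Delange's theorem then predicts sum_{n <= N} a_n ~ A(2) / (2 * Gamma(1)) * N^2 = N^2 / 2.
N = 10_000
partial = sum(range(1, N + 1))            # exactly N (N + 1) / 2
prediction = N ** 2 / (2 * gamma(1))
print(partial / prediction)               # ratio tends to 1 as N grows
```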
σΓ (β + 1)
In the remainder of the paper, we show that the Tauberian Theorem applies to F (s), G(s)
with σ = 2. For F (s), it applies with β = 0. For G(s), it applies with β = 1 or β = 2.
For the slow algorithms, β equals 2, and the average number of steps will be of order log² N. For the fast algorithms, β equals 1, and this will prove the logarithmic behaviour of the average number of steps. First, the function F(s) is closely linked to the ζ̂ function relative to valid numbers. Then, the Tauberian Theorem applies to F(s) and ζ̂(s)F(s), with σ = 2 and β = 0.
3.5. Functional Analysis. Here, we consider the following conditions on a set H of LFT's that will entail that the Ruelle operator H_s relative to H fulfills all the properties that we need for applying the Tauberian Theorem to the Dirichlet series G(s).

Conditions Q(H). There exist an open disk V containing I and a real α > 2 such that

(C1) Every LFT h ∈ H has an analytic continuation on V, and h maps the closure V̄ of the disk V inside V. Every function |h′| has an analytic continuation on V, denoted by h̃.

(C2) For each h ∈ H, there exists δ(h) < 1 for which 0 < |h̃(z)| ≤ δ(h) for all z ∈ V, and such that the series Σ_{h∈H} (δ(h)/det(h))^{s/2} converges on the plane ℜ(s) > α.

(C3) Let H^{[k]} denote the subset of H defined as H^{[k]} := {h ∈ H | det(h) = 2^k}. One of the two conditions (i) or (ii) holds: (i) H = H^{[0]}; (ii) for any k ≥ 1, H^{[k]} is not empty and H = ∪_{k≥1} H^{[k]}. Moreover, for any k ≥ 0 for which H^{[k]} ≠ ∅, the intervals (h(I), h ∈ H^{[k]}) form a pseudo-partition of I.

(C4) For some integer A, the set H contains a subset D of the form D := {h | h(x) = A/(c + x), with integers c → ∞}.
We denote by H_s a Ruelle operator associated to a generic set H, and by U_s a Markovian Ruelle operator (associated to a Markovian process with two states). In this case, the subset U_i denotes the set relative to state i. Here I denotes the basic interval [0, ρ]. We consider that conditions Q(H) and Q(U_i) (for i = 0, 1) hold. Then, we can prove the following: under conditions (C1) and (C2), the Ruelle operator H_s acts on A_∞(V) for ℜ(s) > α and the operator U_s acts on A_∞(V)² for ℜ(s) > α. They are compact (even nuclear in the sense of Grothendieck). Furthermore, for real values of the parameter s, they have dominant spectral properties: there exist a unique dominant eigenvalue λ(s), positive and analytic for s > α, a unique dominant eigenfunction ψ_s, and a unique dominant projector e_s such that e_s[ψ_s] = 1. Then, there is a spectral gap between the dominant eigenvalue and the remainder of the spectrum. Under condition (C3), the operators H_2, U_2 are density transformers; thus one has λ(2) = 1 and e_2[f] = ∫₀¹ f(t) dt (generic case) or e_2[f₀, f₁] = ∫₀¹ [f₀(t) + f₁(t)] dt (Markovian case). Finally, condition (C4) implies that the operators H_s, U_s have no eigenvalue equal to 1 on the line ℜ(s) = 2, s ≠ 2.
Finally, the powers of the quasi-inverse of the Ruelle operator which intervene in the expression of the generating functions F(s) and G(s) fulfill all the hypotheses of the Tauberian Theorem:
Theorem 1. Let H be a generic set that satisfies Q(H) and U be a Markovian set
that satisfies Q(Ui ) (i = 0, 1). Then, for any p ≥ 1, the p-th powers (I − Hs )−p ,
(I − Us )−p of the quasi–inverse of the Ruelle operators relative to H and U are analytic
on the punctured plane {ℜ(s) ≥ 2, s 6= 2} and have a pole of order p at s = 2. Near
s = 2, one has, for any function f positive on J, and any x ∈ J,

  (I − H_s)^{−p}[f](x)   or   (I − U_s)^{−p}[f](x)   ∼   (1/(s − 2)^p) · (−1/λ′(2))^p · ψ₂(x) e₂[f].
Here, λ(s) is the dominant eigenvalue, ψs is the dominant eigenfunction and es the
dominant projector with the condition es [ψs ] = 1.
Then the Euclidean Algorithm associated to H or to U performs an average number of steps on valid rationals of I with denominator less than N that is asymptotically logarithmic,

  H_N ∼ A_H log N,   U_N ∼ A_U log N.

The constants A_H, A_U involve the entropy of the associated dynamical systems.
In the case when the set H is only almost well-behaved –it contains one “bad" LFT p, but
the set Q := H \ {p} is well-behaved– we adapt the method of inducing that originates
from dynamical systems theory.
Theorem 2. Let H be a generic set of LFT's for which the following holds: (i) there exists an element p of H which possesses an indifferent point, i.e., a fixed point where the absolute value of the derivative equals 1; (ii) the LFT p does not belong to the final set F; (iii) if Q denotes the set H \ {p}, and M, M⁺ the sets p⋆Q, p⋆F, then conditions Q(M), Q(M⁺) are fulfilled.
Then the Euclidean Algorithm associated to H performs an average number of steps on valid rationals of I with denominator less than N that is asymptotically of log-squared type,

  H_N ∼ H̃_N ∼ A_H log² N.

The average number Q_N of good steps (i.e., steps that use elements of Q) performed by the Euclidean Algorithm on valid rationals of I with denominator less than N satisfies Q_N ∼ Q̃_N ∼ A_Q log N, and the constant A_Q involves the entropy of the dynamical system relative to the set M.
4 Average-Case Analysis of the Algorithms
We now come back to the analysis of the eight algorithms, and we study successively the fast algorithms and the slow algorithms.
4.1. The Fast Algorithms. We consider the generic sets G, K, O relative to the Classical
Algorithm, the Classical Centered Algorithm and the Odd Algorithm, or the Markovian
sets U or C relative to the Ordinary or the Centered Algorithm. It is easy to verify that
the conditions Q(G), Q(K), Q(O), Q(Ui )(i = 0, 1), Q(Ci )(i = 0, 1), hold.
Moreover, at s = 2, the Ruelle operators can be viewed as density transformers. However,
the dynamical systems to which they are associated may be complex objects, since they
are random for the Odd Algorithm, and are both random and Markovian for the Ordinary
and Centered Algorithms. The reason is that the three pseudo-divisions (odd, ordinary, centered) are related to the dyadic valuation, so that continued fraction expansions are only defined for rational numbers. However, one can define random continued fractions for real numbers by choosing in a suitable random way the dyadic valuation of a real number. Then, the Ruelle operator relative to each algorithm can be viewed as the transfer operator relative to this random dynamical system. Now, the application of Theorem 1
gives our first main result:
Theorem 3. Consider the five algorithms, the Classical Algorithm (G), the Classical
Centered Algorithm (K), the Odd Algorithm (O), the Ordinary Algorithm (U) or the
Centered Algorithm (C). The average numbers of division steps performed by each
of these five algorithms, on the set of valid inputs of denominator less than N are of
asymptotic logarithmic order. They all satisfy
  H_N ∼ H̃_N ∼ (2/h(H)) log N   for   H ∈ {G, K, O, U, C},
where h(H) is the entropy of the dynamical system relative to the algorithm. For the first two algorithms, the entropies are explicit,

  h(G) = π²/(6 log 2),   h(K) = π²/(6 log φ).
Each of the previous entropies can be computed by adapting methods developed in previous papers [3,22]. What we have at the moment is values from simulations that already provide a consistent picture of the relative merits of the Centered, Odd, and Ordinary Algorithms, namely,

  A_O ≈ 0.435 ± 0.005,   A_C ≈ 0.430 ± 0.005,   A_U ≈ 0.535 ± 0.005.
It is to be noted that the computer algebra system Maple makes use of the Ordinary Algorithm (perhaps on the basis that only unsigned integers need to be manipulated), although, from our analysis, this algorithm appears to be the fast algorithm with the worst convergence rate.
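The relative merits of the fast algorithms can be observed directly. The sketch below compares the Classical Algorithm (G) with a centered division (our reading of the pseudo-division behind the Classical Centered Algorithm (K); the paper's exact variant may differ), averaging both over the same coprime pairs for a side-by-side comparison, even though the valid inputs of (K) are the rationals of [0, 1/2]:

```python
from math import gcd, log, pi

def classical_steps(u, v):
    n = 0
    while u:
        u, v = v % u, u
        n += 1
    return n

def centered_steps(u, v):
    # centered pseudo-division: remainder of absolute value at most u/2
    n = 0
    while u:
        q = (2 * v + u) // (2 * u)      # integer nearest to v/u
        u, v = abs(v - q * u), u
        n += 1
    return n

def averages(N):
    tot_g = tot_k = cnt = 0
    for v in range(2, N + 1):
        for u in range(1, v):
            if gcd(u, v) == 1:
                tot_g += classical_steps(u, v)
                tot_k += centered_steps(u, v)
                cnt += 1
    return tot_g / cnt, tot_k / cnt

g, k = averages(400)
# entropy constants of Theorem 3: (12 log 2)/pi^2 = 0.8427...  vs  (12 log phi)/pi^2 = 0.5852...
print(g, k, 12 * log(2) / pi ** 2, 12 * log((1 + 5 ** 0.5) / 2) / pi ** 2)
```

As the constants predict, the centered average stays below the classical one.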
4.2. The Slow Algorithms. For the Even Algorithm and the By-Excess Algorithm, the “bad" LFT is defined by p(x) := 1/(2 − x), with an indifferent point at 1. For the Subtractive Algorithm, the “bad" LFT is defined by p(x) := x/(1 + x), with an indifferent point at 0. In the latter case, the induced set M coincides with the set G relative to the Classical Algorithm (G).
When applying Theorem 2 to sets E, L, T , we obtain our second main result:
Theorem 4. Consider the three algorithms, the By-Excess Algorithm (L), the Subtractive Algorithm (T), the Even Algorithm (E). The average numbers of steps performed by each of the three algorithms, on the set of valid inputs of denominator less than N, are of asymptotic log-squared order. They satisfy

  L_N ∼ L̃_N ∼ A_L log² N,   T_N ∼ T̃_N ∼ A_T log² N,   E_N ∼ Ẽ_N ∼ A_E log² N,

with A_L = 3/π², A_T = 6/π², A_E = 2/π².
The average numbers of good steps performed by the algorithms on the set of valid inputs of denominator less than N satisfy

  P_N ∼ P̃_N ∼ A_P log N,   G_N ∼ G̃_N ∼ A_G log N,   M_N ∼ M̃_N ∼ A_M log N,

with A_P = (6 log 2)/π², A_G = (12 log 2)/π², A_M = (4 log 3)/π².
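The log-squared behaviour of the Subtractive Algorithm is easy to observe numerically, since its step count on (u, v) is the sum of the partial quotients of the continued fraction of u/v (up to 1). A small sketch (names ours):

```python
from math import gcd, log, pi

def subtractive_steps(u, v):
    """Number of subtractions performed by the Subtractive Algorithm (T) on coprime (u, v)."""
    n = 0
    while u != v:
        if u > v:
            u -= v
        else:
            v -= u
        n += 1
    return n

def T(N):
    """Average number of subtractive steps over coprime pairs u < v <= N."""
    tot = cnt = 0
    for v in range(2, N + 1):
        for u in range(1, v):
            if gcd(u, v) == 1:
                tot += subtractive_steps(u, v)
                cnt += 1
    return tot / cnt

N = 400
print(T(N), (6 / pi ** 2) * log(N) ** 2)   # same order: A_T log^2 N with A_T = 6/pi^2
```

Convergence to the constant 6/π² is slow (the lower-order terms are of order log N), but the log² N growth is already visible at moderate N.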
4.3. Higher moments. Our methods apply to other parameters of the continued fraction. On the other hand, by using successive derivatives of the double generating function S(s, w), we can easily evaluate higher moments of the random variable “number of iterations".
Theorem 5. (i) For any integer ℓ ≥ 1 and any of the five fast algorithms, the ℓ-th moment of the cost function is asymptotic to the ℓ-th power of the mean. In particular, the standard deviation is o(log N). Consequently, the random variable expressing the cost satisfies the concentration of distribution property.
(ii) For any integer ℓ ≥ 2 and any of the three slow algorithms, the ℓ-th moment of the cost function is of order N^{ℓ−1} and the standard deviation is Θ(√N).
Acknowledgements. I wish to thank Pierre Ducos for early discussions in 1992, Charlie Lemée and Jérémie Bourdon for their master's theses closely related to this work, Thomas Prellberg for his explanations of the inducing method, Dieter Mayer for his introduction to random dynamical systems, Ilan Vardi for Hickerson's formula, and Philippe Flajolet for the experiments.
References
1. Bedford, T., Keane, M., and Series, C., Eds. Ergodic Theory, Symbolic Dynamics and
Hyperbolic Spaces, Oxford University Press, 1991.
2. Bowen, R. Invariant measures for Markov maps of the interval, Commun. Math. Phys. 69
(1979) 1–17.
3. Daudé, H., Flajolet, P., and Vallée, B. An average-case analysis of the Gaussian algorithm
for lattice reduction, Combinatorics, Probability and Computing 6 (1997) 397–433.
4. Delange, H. Généralisation du Théorème d’Ikehara, Ann. Sc. ENS, (1954) 71, pp 213–242.
5. Dixon, J. D. The number of steps in the Euclidean algorithm, Journal of Number Theory 2
(1970), 414–422.
6. Eisenstein, G. Einfacher Algorithmus zur Bestimmung des Werthes von (a/b), J. für die
Reine und Angew. Math. 27 (1844) 317–318.
7. Flajolet, P. and Sedgewick, R. Analytic Combinatorics, Book in preparation (1999), see
also INRIA Research Reports 1888, 2026, 2376, 2956.
8. Heilbronn, H. On the average length of a class of continued fractions, Number Theory and
Analysis, ed. by P. Turan, New-York, Plenum, 1969, pp 87-96.
9. Hensley, D. The number of steps in the Euclidean algorithm, Journal of Number Theory
49, 2 (1994), 142–182.
10. Hickerson, D. Continued fractions and density results for Dedekind sums, J. Reine Angew.
Math. 290 (1977) 113-116.
11. Jacobi, C.G.J. Über die Kreistheilung und ihre Anwendung auf die Zahlentheorie, J. für die
Reine und Angew. Math. 30 (1846) 166–182.
12. Knuth, D.E. The Art of Computer Programming, Volume 2, 3rd edition, Addison Wesley,
Reading, Massachusetts, 1998.
13. Lebesgue V. A. Sur le symbole (a/b) et quelques unes de ses applications, J. Math. Pures
Appl. 12(1847) pp 497–517
14. Mayer, D. H. Continued fractions and related transformations, In Ergodic Theory, Symbolic Dynamics and Hyperbolic Spaces, T. Bedford, M. Keane, and C. Series, Eds. Oxford
University Press, 1991, pp. 175–222.
15. Prellberg, T. and Slawny, J. Maps of intervals with Indifferent fixed points: Thermodynamic formalism and Phase transitions. Journal of Statistical Physics 66 (1992) 503-514
16. Rieger, G. J. Über die mittlere Schrittanzahl bei Divisionsalgorithmen, Math. Nachr. (1978)
pp 157–180.
17. Ruelle, D. Thermodynamic formalism, Addison Wesley (1978)
18. Ruelle, D. Dynamical Zeta Functions for Piecewise Monotone Maps of the Interval, vol. 4
of CRM Monograph Series, American Mathematical Society, Providence, 1994.
19. Shallit, J. On the worst case of the three algorithms for computing the Jacobi symbol,
Journal of Symbolic Computation 10 (1990) 593–610.
20. Shallit, J. Origins of the analysis of the Euclidean Algorithm, Historia Mathematica 21
(1994) pp 401-419
21. Vallée, B. Opérateurs de Ruelle-Mayer généralisés et analyse des algorithmes d'Euclide et
de Gauss, Acta Arithmetica 81.2 (1997) 101–144.
22. Vallée, B. Fractions continues à contraintes périodiques, Journal of Number Theory 72
(1998) pp 183–235.
23. Vallée, B. Dynamics of the Binary Euclidean Algorithm: Functional Analysis and Operators, Algorithmica 22 (4) (1998) 660–685.
24. Vardi, I. Continued fractions, Preprint, chapter of a book in preparation.
25. Yao, A.C., and Knuth, D.E. Analysis of the subtractive algorithm for greatest common
divisors. Proc. Nat. Acad. Sc. USA 72 (1975) pp 4720-4722.
Worst–Case Complexity of the Optimal LLL Algorithm
Ali Akhavi
GREYC - Université de Caen, F-14032 Caen Cedex, France
ali.akhavi@info.unicaen.fr
Abstract. In this paper, we consider the open problem of the complexity of the
LLL algorithm in the case when the approximation parameter of the algorithm
has its extreme value 1. This case is of interest because the output is then the
strongest Lovász–reduced basis. Experiments reported by Lagarias and Odlyzko
[13] seem to show that the algorithm remains polynomial on average. However, no bound better than a naive exponential one is established for the worst-case complexity of the optimal LLL algorithm, even for fixed small dimension (higher than 2). Here we prove that, for any fixed dimension n, the number of iterations of the LLL algorithm is linear with respect to the size of the input. It is easy to deduce from [17] that the linear order is optimal. Moreover, in 3 dimensions, we
give a tight bound for the maximum number of iterations and we characterize
precisely the output basis. Our bound also improves the known one for the usual
(non–optimal) LLL algorithm.
1 Introduction
A Euclidean lattice is a set of all integer linear combinations of p linearly independent
vectors in Rn . Any lattice can be generated by many bases (all of them of cardinality
p). The lattice basis reduction problem is to find bases with good Euclidean properties, that is, with sufficiently short and almost orthogonal vectors. The problem is old and there exist numerous notions of reduction; the most natural ones are due to Minkowski or to Korkine–Zolotarev. For a general survey, see for example [8,16]. Both of these reduction processes are “strong”, since they build reduced bases with, in some sense, the best Euclidean properties. However, such bases are also computationally hard to find, since they demand that the first vector of the basis be a shortest one in the lattice. It appears that finding such an element in a lattice is likely to be NP-hard [18,1,5].
Fortunately, even approximate answers to the reduction problem have numerous theoretical and practical applications in computational number theory and cryptography:
factoring polynomials with rational coefficients [12], finding linear Diophantine approximations (Lagarias, 1980), breaking various cryptosystems [15], and integer linear
programming [7,11]. In 1982, Lenstra, Lenstra and Lovász [12] gave a powerful approximation reduction algorithm. It depends on a real approximation parameter δ ∈ [1, 2[
and is called LLL(δ). It is a possible generalization of its 2–dimensional version, which
is the famous Gauss algorithm. The celebrated LLL algorithm seems difficult to analyze precisely, both in the worst case and in the average case. The original paper [12] gives an
upper bound for the number of iterations of LLL(δ), which is polynomial in the data
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 355–366, 2000.
c Springer-Verlag Berlin Heidelberg 2000
356
A. Akhavi
size, for all values of δ except the optimal value 1: when given n input vectors of R^n of length at most M, the data size is O(n² log M) and the upper bound is n² log_δ M + n. When the approximation parameter δ is 1, the only known upper bound is M^{n²}, which is exponential even for fixed dimension. It was still an open problem whether the optimal LLL algorithm is polynomial. In this paper, we prove that the number of iterations of the algorithm is linear for any fixed dimension. More precisely, it is O(A^{n³} log M), where A is any constant strictly greater than (2/√3)^{1/6}. We prove also that, under a quite reasonable heuristic principle, the number of iterations is O((2/√3)^{n²/2} log M).
In the 3–dimensional case (notice that the problem was totally open even in this case),
we provide a precise linear bound, which is even better than the usual bounds on the
non–optimal versions of the LLL algorithm. Several reasons motivate our work on the
complexity of the optimal LLL algorithm.
1. This problem is cited as an open question by respected authors [4,17], and I think that the answer will bring at least a better understanding of the lattice reduction process. Of course, this paper is just a first insight into the general answer to the question.
2. The optimal LLL algorithm provides the strongest Lovász-reduced basis in a lattice (the best bounds on the classical length defects and the orthogonality defect). In many applications, people seem to be interested in such a basis [13], and sometimes even in fixed low dimension [14].
3. We believe that the complexity of finding an optimal Lovász-reduced basis is of great interest, and the LLL algorithm is the most natural way to find such a basis in a lattice (we develop this more in the conclusion).
Plan of the paper. Section 2 presents the LLL algorithm and introduces some definitions
and notations. In Section 3, we recall some known results in 2 dimensions. Section 4
deals with the worst–case complexity of the optimal LLL algorithm in 3-dimensional
case. Finally, in Section 5, we prove that in any fixed dimension, the number of iterations
of the LLL algorithm is linear with respect to the length of the input.
2 General Description of the LLL Algorithm
Let R^p be endowed with the usual scalar product (·, ·) and Euclidean length |u| = (u, u)^{1/2}. The notation (u)⊥H denotes the projection of the vector u onto the orthogonal complement H⊥ of H in R^p. The set ⟨u₁, u₂, ..., u_r⟩ denotes the vector space spanned by a family of vectors (u₁, u₂, ..., u_r). A lattice of R^p is the set of all integer linear combinations of a set of linearly independent vectors. Generally it is given by one of its bases (b₁, b₂, ..., b_n), and the number n is the dimension of the lattice. So, if M is the maximum length of the vectors b_i, the data size is O(n² log M), and when working in fixed dimension, the data size is O(log M). The determinant det(L) of the lattice L is the volume of the n-dimensional parallelepiped spanned by the origin and the vectors of any basis. Indeed, it does not depend on the choice of a basis. The usual Gram–Schmidt orthogonalization process builds in polynomial time, from a basis b = (b₁, b₂, ..., b_n), an orthogonal basis b* = (b₁*, b₂*, ..., b_n*) (which is generally not a basis for the lattice generated by b) and a lower-triangular matrix m = (m_{ij}) that expresses the system b
into the system b*. By construction,

  b₁* = b₁,   b₂* = (b₂)⊥⟨b₁⟩,   ...,   b_i* = (b_i)⊥⟨b₁,...,b_{i−1}⟩,   ...,   b_n* = (b_n)⊥⟨b₁,...,b_{n−1}⟩,   (1)

and the matrix m is lower-triangular with 1's on the diagonal: its row i expresses b_i on the system (b₁*, ..., b_n*), with entries m_{i1}, ..., m_{i,i−1}, then 1, and 0's above the diagonal.
We recall that if L is the lattice generated by the basis b, its determinant det(L) is expressed in terms of the lengths |b_i*|:  det(L) = ∏_{i=1}^{n} |b_i*|.
The ordered basis b is called proper if |mij | ≤ 1/2, for 1 ≤ j < i ≤ n. There exists a
simple polynomial-time algorithm which makes any basis proper by means of integer
translations of each bi in the directions of bj , for j decreasing from i − 1 to 1.
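The construction above can be carried out exactly over the rationals. The following sketch (helper names ours) computes b* and m for a small integer basis and checks the determinant identity; note that det(L)² = ∏ |b_i*|² is the Gram determinant, an integer for integer bases:

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(c) for a, c in zip(u, v))

def gram_schmidt(b):
    """Exact Gram-Schmidt over Q: returns the orthogonal system b* of (1) and the
    lower-triangular matrix m with unit diagonal expressing b in terms of b*."""
    n = len(b)
    bstar, m = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in b[i]]
        for j in range(i):
            m[i][j] = dot(b[i], bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - m[i][j] * y for x, y in zip(v, bstar[j])]
        m[i][i] = Fraction(1)
        bstar.append(v)
    return bstar, m

b = [(1, 1, 1), (-1, 0, 2), (3, 5, 6)]
bstar, m = gram_schmidt(b)
# det(L)^2 = product of |b_i*|^2; here det(L) = -3, so the product is 9
print(dot(bstar[0], bstar[0]) * dot(bstar[1], bstar[1]) * dot(bstar[2], bstar[2]))
```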
Definition 1. [12] For any δ ∈ [1, 2[, the basis (b₁, ..., b_n) is called δ-reduced (or LLL(δ)-reduced, or δ-Lovász-reduced) if it fulfills the two conditions:
(i) (b₁, ..., b_n) is proper.
(ii) ∀i ∈ {1, 2, ..., n − 1},   (1/δ) |(b_i)⊥⟨b₁,...,b_{i−1}⟩| < |(b_{i+1})⊥⟨b₁,...,b_{i−1}⟩|.
The optimal LLL algorithm (δ = 1) is a possible generalization of its 2-dimensional version, the famous Gauss algorithm, whose precise analysis has already been carried out both in the worst case [10,17,9] and in the average case [6]. In the sequel, a reduced basis always denotes an optimal LLL-reduced one. When talking about the algorithm without further precision, we always mean the optimal LLL algorithm.
For every integer i in {1, 2, ..., n − 1}, we call B_i the two-dimensional basis formed by the two vectors (b_i)⊥⟨b₁,...,b_{i−1}⟩ and (b_{i+1})⊥⟨b₁,...,b_{i−1}⟩. Then, by the previous Definition, (b₁, ..., b_n) is reduced iff it is proper and all the bases B_i are reduced.
Definition 2. Let t be a real parameter such that t > 1. We call a basis (b₁, ..., b_n) t-quasi-reduced if it satisfies the following conditions:
(i) the basis (b₁, ..., b_{n−1}) is proper.
(ii) for all 1 ≤ i ≤ n − 2, the bases B_i are reduced.
(iii) the last basis B_{n−1} is not reduced but it is t-reduced: |m_{n,n−1}| < 1/2 and
  (1/t) |(b_{n−1})⊥⟨b₁,...,b_{n−2}⟩| ≤ |(b_n)⊥⟨b₁,...,b_{n−2}⟩| < |(b_{n−1})⊥⟨b₁,...,b_{n−2}⟩|.
In other words, whenever the beginning basis (b₁, ..., b_{n−1}) is reduced but the whole basis b = (b₁, ..., b_n) is not, then for all t > 1 such that the last two-dimensional basis B_{n−1} is t-reduced, the basis b is called t-quasi-reduced.
Here is a simple enunciation of the LLL(δ) algorithm:
The LLL(δ)-reduction algorithm:
Input: a basis b = (b₁, ..., b_n) of a lattice L.
Output: an LLL(δ)-reduced basis of the lattice L.
Initialization: compute the orthogonalized system b* and the matrix m.
i := 1;
While i < n do
  b_{i+1} := b_{i+1} − ⌊m_{i+1,i}⌉ b_i   (⌊x⌉ is the integer nearest to x);
  Test: is the two-dimensional basis B_i δ-reduced?
    If true, make (b₁, ..., b_{i+1}) proper by translations; set i := i + 1;
    If false, swap b_i and b_{i+1}; update b* and m; if i ≠ 1 then set i := i − 1;
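The loop above can be sketched in exact rational arithmetic. This is an illustrative implementation, not the paper's enunciation verbatim: in particular, ties in the test on B_i are treated as already reduced, so that the optimal case δ = 1 terminates on this toy input without invoking the paper's finer analysis.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def gso(b):
    """Gram-Schmidt: orthogonal system b* and coefficients mu (mu[i][j] for j < i)."""
    n = len(b)
    bstar, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = b[i][:]
        for j in range(i):
            mu[i][j] = dot(b[i], bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu

def nearest(q):
    # integer nearest to the rational q
    return (2 * q.numerator + q.denominator) // (2 * q.denominator)

def lll(basis, delta=Fraction(1)):
    b = [[Fraction(x) for x in row] for row in basis]
    n = len(b)
    i = 1
    while i < n:
        bstar, mu = gso(b)
        # translation step: b_i := b_i - round(mu) b_{i-1}
        b[i] = [x - nearest(mu[i][i - 1]) * y for x, y in zip(b[i], b[i - 1])]
        bstar, mu = gso(b)
        # test on B_i: compare the projections orthogonally to <b_0,...,b_{i-2}>
        ri = dot(bstar[i], bstar[i]) + mu[i][i - 1] ** 2 * dot(bstar[i - 1], bstar[i - 1])
        if delta ** 2 * ri >= dot(bstar[i - 1], bstar[i - 1]):
            for j in range(i - 1, -1, -1):      # make (b_0,...,b_i) proper
                _, mu = gso(b)
                b[i] = [x - nearest(mu[i][j]) * y for x, y in zip(b[i], b[j])]
            i += 1
        else:
            b[i - 1], b[i] = b[i], b[i - 1]     # swap and step back
            i = max(i - 1, 1)
    return b

reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced)
```

The output basis is proper, satisfies the (non-strict) δ = 1 Lovász conditions on every B_i, and generates the same lattice (its Gram determinant is unchanged).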
During an execution of the algorithm, the index i varies in {1, . . . , n}. It is called the
current index. When i equals some k ∈ {1, . . . , n−1}, the beginning lattice generated by
(b1 , . . . , bk ) is already reduced. Then, the reduction of the basis Bk is tested. If the test is
positive, the basis (b1 , . . . , bk+1 ) is made proper and the beginning lattice generated by
(b1 , . . . , bk+1 ) is then reduced. So, i is incremented. Otherwise, the vectors bk and bk+1
are swapped. At this moment, nothing guarantees that (b1 , . . . , bk ) “remains” reduced.
So, i is decremented and the algorithm updates b∗ and m, translates the new bk in
the direction of bk−1 and tests the reduction of the basis Bk−1 . Thus the index i may
fall down to 1. Finally when i equals n, the whole basis is reduced and the algorithm
terminates. An example of the variation of the index i is shown by Figure 1.
[Figure: the value of the current index i, between 1 and n, plotted as a walk against time; a t-phase of type I and a t′-phase of type II are marked, and the walk ends when i reaches n.]
Fig. 1. Variation of the index i presented as a walk.
In the sequel an iteration of the LLL algorithm is precisely an iteration of the “while”
loop in the previous enunciation. So each iteration has exactly one test (Is the two–
dimensional basis Bi reduced ?) and the number of steps is exactly the number of tests.
Notice that whenever a test at a level i is negative, i.e. the basis B_i is not reduced, after the swap of b_i and b_{i+1} the determinant d_i of the lattice (b₁, . . . , b_i) is decreased. Moreover, for any t > 1, if at the moment of the test the basis B_i is not even t-reduced, the determinant d_i is decreased by a factor at least 1/t. This explains the next definition.
Definition 3. For a real parameter t > 1, a step of index i is called t–decreasing if at
the moment of the test, the basis Bi is not t–reduced. Else, it is called t–non–decreasing.
[12] pointed out that during the execution of a non-optimal LLL algorithm, say LLL(δ) for some δ > 1, all steps with negative tests are δ-decreasing. Similarly, we assert the next lemma, based on the decrease of the integer quantity

  D := ∏_{i=1}^{n−1} d_i² = ∏_{i=1}^{n−1} ∏_{j=1}^{i} |b_j*|²,   (2)

by a factor 1/t² whenever a step is t-decreasing (other steps do not make D increase).
Lemma 1. Let the LLL(1) algorithm run on an integer input (b₁, . . . , b_n) of length log M. For any t > 1, the number of t-decreasing steps is less than (n(n − 1)/2) log_t M.
Definition 4. A phase is a sequence of steps that occur between two successive tests of
reduction of the last two–dimensional lattice Bn−1 . For a real t > 1, we say that a phase
is a t–phase if at the beginning of the phase the basis (b1 , . . . bn ) is t–quasi–reduced.
Phases are classified in two groups: a phase is called of type I if, during the phase, the first vector b₁ is never swapped. Else, it is called of type II (see Figure 1).
3 Some Known Results in 2 Dimensions: Gauss’ Algorithm
In two dimensions a phase of the algorithm is an iteration (of the “while” loop) and the
only positive test occurs at the end of the algorithm. Thus the number of steps equals the
number of negative tests plus one. For any t > 1, before the input is t-quasi-reduced,
each step is t–decreasing. So by Lemma 1 any input basis (b1 , b2 ) will be t-quasi-reduced
within at most log_t M steps. Then the next Lemma leads to the bound log_{√3} M + 2 for the total number of steps of Gauss' algorithm. This bound is not optimal [17]. However, in the next sections we generalize this argumentation to the case of an arbitrary fixed dimension.
Notice that the Lemma does not suppose that the input basis is integral and this fact is
used in the sequel (proof in [2]).
Lemma 2. For any t ∈ ]1, √3], during the execution of Gauss' algorithm on any input basis (not necessarily integral), there are at most 2 t-non-decreasing steps.
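Gauss' algorithm itself is short enough to state as code. The sketch below (our reading of the step counting: one step per test, as in the enunciation of Section 2) checks the bound log_{√3} M + 2 on a couple of integer bases with entries of length at most M:

```python
from math import log, sqrt

def gauss_steps(b1, b2):
    """Gauss' 2-dimensional reduction on integer vectors, counting one step per test."""
    steps = 0
    while True:
        steps += 1
        # translation b2 := b2 - round(mu) b1, with mu = (b2, b1) / |b1|^2, exactly
        num = b1[0] * b2[0] + b1[1] * b2[1]
        den = b1[0] ** 2 + b1[1] ** 2
        q = (2 * num + den) // (2 * den)      # integer nearest to mu
        b2 = (b2[0] - q * b1[0], b2[1] - q * b1[1])
        if den <= b2[0] ** 2 + b2[1] ** 2:    # positive test: (b1, b2) is reduced
            return steps
        b1, b2 = b2, b1                       # negative test: swap and continue

M = 10 ** 6
print(gauss_steps((987654, 456789), (314159, 271828)), log(M) / log(sqrt(3)) + 2)
```

On such inputs the observed count stays well below the bound, which is itself not optimal [17].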
4 The 3–Dimensional Case
Let t be a real parameter such that 1 < t ≤ 3/2. Here we count separately the iterations
that are inside t–phases and the iterations that are not inside t–phases.
First we show that the total number of steps that are not inside t–phases is linear with respect to the input length log M (Lemma 3). Second, we prove that the total number of iterations inside t–phases is always less than nine (Lemma 4). Thus we exhibit for the first time a linear bound for the number of iterations of the LLL algorithm in 3–dimensional space (the naive bound is M⁶). In addition, our argument gives a precise characterization of a reduced basis in three–dimensional space.
Theorem 1. The number of iterations of the LLL(1) algorithm on an input integer basis (b1 , b2 , b3 ) of length log M is less than log_{√3} M + 6 log_{3/2} M + 9.
A. Akhavi

The linear order of our bound is in fact optimal, since it is so in dimension 2 [17], and one can obviously build from a basis b of n − 1 vectors of maximal length M another basis b′ of n vectors of the same maximal length such that the number of iterations of the LLL algorithm on b′ is strictly greater than on b. Moreover, even if we have not tried here to give the best coefficient of linearity in dimension 3, our bound is quite acceptable, since [19] exhibits a family of bases of length log M for which the number of iterations of the algorithm is greater than 2.6 log_2 M + 4; our bound is 13.2 log_2 M + 9. Observe that the classical bound on the usual non–optimal LLL(2/√3) is 28.9 log_2 M + 2, and even computed more precisely as in Lemma 3 it remains 24.1 log_2 M + 2. So our bound, which is also valid for LLL(δ) with δ < 3/2, improves the classical upper bound on the number of steps of LLL(δ) provided that δ < 1.3.
The next Lemma (proof in [2]) is a more precise version of Lemma 1 in the particular case of 3 dimensions. (It is used in the sequel for t ≤ 3/2.)

Lemma 3. Let the LLL algorithm run on an integer basis (b1 , b2 , b3 ) of length log M . Let t be a real parameter such that 1 < t ≤ √3. The number of steps that are not inside any t–phase is less than log_{√3} M + 6 log_t M .
4.1 From a Quasi–reduced Basis to a Reduced One
Lemma 4. Let t be a real parameter in ]1, 3/2]. When the dimension is fixed at 3, there
are at most three t–phases during an execution of the algorithm. The total number of
steps inside t–phases is less than nine.
The proof is based on Lemma 5 and Corollaries 1 and 2. Lemma 5 shows that a t–phase of type I is necessarily an ending phase and has exactly 3 iterations. The central role in the proof is played by Lemma 7 and its Corollary 2, asserting that in 3 dimensions there are at most 2 t–phases of type II during an execution. Finally, Lemma 6 and Corollary 1 show that any t–phase of type II has at most 3 iterations¹.
Remarks. (1) For t chosen closer to 1 (√(6/5) rather than 3/2), if during an execution a t–quasi–reduced basis is obtained, then a reduced one will be obtained after at most 9 steps (see [2]). (A t–phase is necessarily followed by another t–phase.) (2) Of course, the general argument of the next section (for an arbitrary fixed dimension) also holds here. But both of these directions lead to a less precise final upper bound.
Lemma 5. For all t ∈ ]1, √3], a t–phase of type I has 3 steps and is an ending phase.
Proof. The vector b1 is not modified during a phase of type I. Then, by Lemma 2, the basis ((b2 )⊥b1 , (b3 )⊥b1 ) will be reduced after only two iterations². But here there is one additional step (of current index 1, with a positive test) between these two iterations.
Lemma 6. For any t > 1, if a basis (b1 , . . . , bn−1 , bn ) is t–quasi–reduced, then the basis (b1 , . . . , bn−2 , bn ) is t′–quasi–reduced, with t′ = (2/√3) t.
¹ These facts are used to make the proof clearer, but they are not essential: actually, if a phase has more than 3 iterations, then the additional steps (which necessarily have negative tests and index i equal to 1) are t–decreasing, and all t–decreasing steps are already counted by Lemma 3.
² Lemma 2 does not demand the input basis to be integral.
Corollary 1. In 3 dimensions, for all t ∈ ]1, 3/2], a t–phase of type II has 3 steps.

Proof. Since (b1 , b2 , b3 ) is 3/2–quasi–reduced, by the previous Lemma (b1 , b3 ) is √3–quasi–reduced. Then by Lemma 2, (b1 , b3 ) will be reduced after 2 steps.
The next Lemma plays a central role in the whole proof. This result, which remains true when (b1 , b2 , b3 ) is reduced, also gives a precise characterization of a 1–Lovász reduced basis in dimension 3. A detailed proof is available in [2].

Lemma 7. For all t ∈ ]1, 3/2], if the basis (b1 , b2 , b3 ) is t–quasi–reduced and proper, then among all the vectors of the lattice that are not in the plane ⟨b1 , b2 ⟩, there is at most one pair of vectors ±u whose lengths are strictly less than |b3 |.
Proof (sketch). Let u := x b1 + y b2 + z b3 be a vector of the lattice ((x, y, z) ∈ ℤ³). The vector u is expressed in the orthogonal basis b∗ defined by (1), and its length satisfies

|u|² = (x + y m21 + z m31 )² |b∗1|² + (y + z m32 )² |b∗2|² + z² |b∗3|².

First, since (b1 , b2 , b3 ) is 3/2–quasi–reduced, one gets easily that if |z| > 1 or |y| > 1 or |x| > 1, then |u| > |b3 |. Now, if z = 1, by considering the ratio |u|²/|b3 |², one shows that there exists at most one pair (x, y) ∈ {0, 1, −1}²\{(0, 0)} such that |u| < |b3 |. This unique vector depends on the signs of m21 , m31 and m32 , as recapitulated in Table 1.
m21  m31  m32 | u
 +    +    +  | b3 − b2
 +    +    −  | b3 − b1 + b2
 +    −    +  | b3 + b1 − b2
 +    −    −  | b3 + b2
 −    +    +  | b3 − b1 − b2
 −    +    −  | b3 + b2
 −    −    +  | b3 − b2
 −    −    −  | b3 + b1 + b2

Table 1. The unique vector possibly strictly shorter than b3 , as a function of the signs of the mij .
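Lemma 7 is easy to sanity-check numerically: for a concrete quasi–reduced basis, enumerate the small coordinates that the proof sketch allows. The basis below is my own example, and this is an illustration, not a proof:

```python
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm2(u):
    return dot(u, u)

# a sample proper, reduced integer basis, chosen for illustration
b1, b2, b3 = (3, 0, 0), (1, 3, 0), (1, 1, 3)

short = []
# the proof shows |x|, |y|, |z| <= 1 is the only possible range
for x, y, z in product((-1, 0, 1), repeat=3):
    if z == 0:
        continue  # only lattice points outside the plane <b1, b2>
    u = tuple(x * a + y * b + z * c for a, b, c in zip(b1, b2, b3))
    if norm2(u) < norm2(b3):
        short.append((x, y, z))

# Lemma 7 predicts at most one +/- pair of such vectors
assert len(short) <= 2
```

For this particular basis no such vector exists at all; Table 1 describes which single candidate can survive when one does.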
Corollary 2. During an execution of the LLL(1) algorithm in three dimensions, for all t ∈ ]1, 3/2], there are at most two t–phases of type II.

Proof. Assume (b1 , b2 , b3 ) is the t–quasi–reduced basis at the beginning of a first t–phase of type II, and let (b1 , b2 , b′3 ) denote the basis obtained from (b1 , b2 , b3 ) by making the latter proper. Since the t–phase is of type II, |b′3 | < |b1 | and the algorithm swaps b1 and b′3 . As (b1 , b2 ) is Gauss–reduced, b1 is a shortest vector of the sub–lattice generated by (b1 , b2 ). Thus the fact |b′3 | < |b1 |, together with the previous Lemma, shows that in the whole lattice there is at most one pair of vectors ±u strictly shorter than b′3 . So the vector b′3 can be swapped only once. In particular, only one new t′–phase (for any t′ > 1) of type II may occur before the end of the algorithm, and after the first t–phase (t ≤ 3/2) of type II, all phases except possibly one have exactly 2 iterations.
5 Arbitrary Fixed Dimension n
In the previous section we argued in an additive way: we chose a tractable value t0 (3/2 in dimension 3) such that for 1 < t ≤ t0 we could easily count the maximum number of steps inside all t–phases. Then we added this bound (9 in the last section) to the total number of iterations that were not inside t–phases.
Here we argue differently. On one hand, the total number of t–decreasing steps is classically upper bounded by n² log_t M (Lemma 1). Now, for all t > 1, we call a t–non–decreasing sequence a sequence of consecutive t–non–decreasing steps. During such a sequence, just before any negative test of index i, the basis (b1 , . . . , bi+1 ) is t–quasi–reduced. The problem is that during a t–non–decreasing sequence we cannot efficiently quantify the decrease of the usual integer potential function³ D (whose definition is recalled in (2)). The crucial point here (Proposition 1) is that for all t ∈ ]1, √3], there exists some integer c(n, t) such that any t–non–decreasing sequence of the LLL(1) algorithm – when it works on an arbitrary input basis (b1 , . . . , bn ) (no matter the lengths of the vectors) – has strictly less than c(n, t) steps. In short, any sequence of iterations longer than c(n, t) has a t–decreasing step.
Hence, our argument is in some sense multiplicative, since the total number of iterations with negative tests is thus bounded from above by c(n, t) n² log_t M . We deduce the following theorem, which for the first time exhibits a linear bound on the number of iterations of the LLL algorithm in fixed dimension.
Theorem 2. For any fixed dimension n, let the optimal LLL algorithm run on an integer
input (b1 , . . . , bn ) of length log M . The maximum number of iterations K satisfies:
(i) for all constants A > (2/√3)^{1/6}, K is O(A^{n³} log M );
(ii) under a very plausible heuristic, K is also O((2/√3)^{n²/2} log M ).
The first formulation (i) is based on Proposition 1 and Lemmata 1 and 8. For the second formulation (ii) we also use Lemma 9 (proved under a very plausible heuristic).
The next Lemma is an adaptation of counting methods used by Babai, Kannan and
Schnorr [3,7,14] when finding a shortest vector in a lattice with a Lovász–reduced basis
on hand. For a detailed proof, see [2].
Lemma 8. Let t ∈ ]1, 2[ be a real parameter and L be a lattice generated by a basis b := (b1 , . . . , bn ), which is not necessarily integral and whose vectors are of arbitrary length. If b is proper and t–quasi–reduced, then there exists an integer α(n, t) such that the number of vectors of the lattice L whose lengths are strictly less than |b1 | is strictly less than α(n, t). Moreover,

α(n, t) < √(3t²/(4 − t²)) · 3^{n−1} (2/√3)^{n(n−1)/2} .    (3)
Remark. The sequence α(n, t) is increasing with n (and also with t).
Proposition 1. Let n be a fixed dimension and t a real parameter in ]1, √3]. There exists an integer c(n, t) such that the length of any t–non–decreasing sequence of the LLL(1) algorithm – on any input basis (b1 , . . . , bn ), no matter the lengths of its vectors and whether or not the basis is integral – is strictly less than c(n, t).
³ The naive bound is obtained using only the fact that D is a strictly positive integer less than M^{n(n−1)} and that it strictly decreases at each step with a negative test.
Proof (sketch). By induction on n. The case n = 2 is trivial and c(2, t) = 2 (Lemma 2). Suppose that the assertion holds for any basis of n − 1 vectors and let the algorithm run on a basis b := (b1 , . . . , bn ). Let us consider the longest possible t–non–decreasing sequence. After at most c(n − 1, t) t–non–decreasing steps, b is t–quasi–reduced⁴.
If the next phase is of type I, then the algorithm actually works with the basis of cardinality n − 1, b⊥b1 := ((b2 )⊥b1 , . . . , (bn )⊥b1 ), which is also t–quasi–reduced. Then, by the induction hypothesis, the t–non–decreasing sequence will be finished after at most c(n − 1, t) + α(n − 1, t) more steps⁵.
On the other hand, there are at most α(n, t) successive phases of type II, since Lemma 8 asserts that the first vector of the t–quasi–reduced basis (b1 , . . . , bn ) can be modified at most α(n, t) times. Each of them has no more than c(n − 1, t) steps, because the algorithm actually works on (b1 , . . . , bn−1 ).
After the last t–phase of type II, there may be one more t–phase of type I. Finally, since α(n, t) is increasing with respect to n, the quantity c(n, t) is less than

c(n − 1, t) + c(n − 1, t) α(n, t) + c(n − 1, t) + α(n − 1, t) < (c(n − 1, t) + 1)(α(n, t) + 2),    (4)

and finally c(n, t) + 1 ≤ (c(2, t) + 1) ∏_{i=2}^{n} (α(i, t) + 2).
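Bound (3) and the closed form after (4) combine into an explicit, if astronomical, numeric bound on c(n, t); a quick Python evaluation (my own transcription, using c(2, t) = 2 from Lemma 2):

```python
import math

def alpha_bound(n, t):
    # right-hand side of (3)
    return math.sqrt(3 * t**2 / (4 - t**2)) * 3 ** (n - 1) \
           * (2 / math.sqrt(3)) ** (n * (n - 1) / 2)

def c_bound(n, t):
    # c(n,t) + 1 <= (c(2,t) + 1) * prod_{i=2}^{n} (alpha(i,t) + 2), with c(2,t) = 2
    prod = 1.0
    for i in range(2, n + 1):
        prod *= alpha_bound(i, t) + 2
    return 3 * prod - 1

t = math.sqrt(3)
assert 0 < c_bound(3, t) < c_bound(4, t) < c_bound(5, t)   # grows roughly like A**(n**3)
```

The product over i of (2/√3)^{i(i−1)/2} is what produces the A^{n³} order in Theorem 2(i).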
Proof ((i) of Theorem 2). Each sequence of c(n, t) steps contains at least one t–decreasing step. At each t–decreasing step the quantity D, which always lies in the interval [1, M^{n²}], decreases at least by the factor 1/t. So the total number of iterations of the algorithm is always less than c(n, t) n² log_t M . Now, by choosing a fixed t ∈ ]1, √3], relations (3) and (4) together show that the quantity n² c(n, t) is bounded from above by A^{n³}, where A is any constant greater than (2/√3)^{1/6}.
In the first proof, we chose for t an arbitrary fixed value in the interval ]1, √3]. Now we improve our bound by choosing t as a function of n. What we really need here is to evaluate the number of possible successive t–phases of type II. So the main question is: when a basis (b1 , . . . , bn ) of a lattice L is t–quasi–reduced, how many lattice points u satisfy (1/t)|b1 | < |u| < |b1 |? More precisely, is it possible to choose t, as a function of the dimension n, such that the open volume between the two n–dimensional balls of radii |b1 | and (1/t)|b1 | does not contain any lattice point?
Now we answer these questions under a quite reasonable heuristic principle which is often satisfied, so the bound on c(n, t) and on the maximum number of iterations will be improved. This heuristic is due to Gauss. Consider a lattice of determinant det(L). The heuristic claims that the number of lattice points inside a ball B is well approximated by volume(B)/det(L); more precisely, the error is of the order of the surface of the ball B. This principle holds for a very large class of lattices, in particular those used in applications (for instance "regular" lattices, where the minima are close to each other and where the fundamental parallelepiped is close to a hyper–cube). Moreover, notice that this heuristic also leads to the result of Lemma 8.
⁴ Otherwise, there would be a t–non–decreasing sequence of more than c(n − 1, t) steps while the algorithm runs on the basis (b1 , . . . , bn−1 ).
⁵ During the c(n − 1, t) steps on b⊥b1 , each change of the first vector (b2 )⊥b1 (no more than α(n − 1, t) changes, by Lemma 8) is followed by one step (of current index one) with a positive test which has not been counted yet.
Under this assumption, and if γn = π^{n/2}/Γ(1 + n/2) denotes the volume of the n–dimensional unit ball, then the number β(n, t) of lattice points that lie strictly between the balls of radii |b1 | and (1/t)|b1 | satisfies (at least asymptotically)

β(n, t) ≤ γn (|b1 |ⁿ/det(L)) (1 − 1/tⁿ).

Now if (b1 , . . . , bn ) is t–quasi–reduced with t ≤ √3,

|b1 |ⁿ/det(L) = |b1 |ⁿ / ∏_{i=1}^{n} |b∗i| ≤ 3ⁿ (2/√3)^{n(n−1)/2} .    (5)

Then, using the classical Stirling approximation, β(n, t) is bounded from above:

β(n, t) < 3ⁿ (πe/n)^{n/2} (2/√3)^{n(n−1)/2} (1 − 1/tⁿ).
By routine computation, we deduce the following Lemma.
Lemma 9. Suppose that there exists n0 such that for n ≥ n0 , relation (5) is true. Then there exists a sequence tn > 1 satisfying:
(i) (tn ) is decreasing and tends to 1 as n → ∞;
(ii) for all n ≥ n0 , β(n, tn ) < 1 and 1/log tn < 3ⁿ (πe/n)^{n/2} (2/√3)^{n(n−1)/2} .
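The sequence tn of Lemma 9 can be made concrete: pick tn so that the displayed bound on β(n, tn ) falls below 1. A small Python sketch under the same heuristic (the safety margin 0.5 is an arbitrary choice of mine):

```python
import math

def K(n):
    # the n-dependent factor in the Stirling-approximated bound on beta(n, t)
    return 3 ** n * (math.pi * math.e / n) ** (n / 2) \
           * (2 / math.sqrt(3)) ** (n * (n - 1) / 2)

def t_n(n):
    # choose t_n > 1 with K(n) * (1 - t_n**(-n)) <= 0.5 < 1
    eps = 0.5 / K(n)
    return (1 / (1 - eps)) ** (1 / n)

for n in (3, 5, 10):
    t = t_n(n)
    assert t > 1 and K(n) * (1 - t ** (-n)) < 1
```

As K(n) grows, eps shrinks and tn tends to 1, which is exactly part (i) of the Lemma; 1/log tn then grows like K(n), which is part (ii).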
Remark: One deduces from (i) that if β(n − 1, tn ) = 0, then β(n − 1, tn−1 ) = 0.
Proof (sketch for (ii) of Theorem 2). The quantities tn and n0 are defined by the previous Lemma. First we prove that for n > n0 , with the notations of Proposition 1,

c(n, tn ) ≤ c(n − 1, tn ) + c(n0 , tn ) + α(n0 , tn ).    (6)

Indeed, after c(n − 1, tn ) steps, (b1 , . . . , bn ) is tn–quasi–reduced, and if H denotes the vector space ⟨b1 , . . . , bn−n0−1 ⟩, the n0–dimensional basis b⊥H := ((bn−n0 )⊥H , . . . , (bn )⊥H ) is tn–quasi–reduced as well. Thus, by the previous Lemma, during the tn–non–decreasing sequence its first vector cannot be modified. So, from the first time that (b1 , . . . , bn ) is tn–quasi–reduced until the end of the tn–non–decreasing sequence, the current index i will always be in the integer interval {n − n0 , . . . , n}. Then, by Proposition 1, the sequence of tn–non–decreasing iterations may continue for at most c(n0 , tn ) + α(n0 , tn ) more iterations⁶. This ends the proof of relation (6).
So, for n > n0 ,

c(n, tn ) ≤ (n − n0 + 1)(c(n0 , tn ) + α(n0 , tn )).

Since tn < tn0 , the basis b⊥H is also tn0–quasi–reduced and, by Lemma 8, α(n0 , tn ) ≤ α(n0 , tn0 ). (The same relation is true for k < n0 .) Finally, the quantity c(n0 , tn ) + α(n0 , tn ) is a constant B that depends only on n0 . We then have c(n, tn ) < nB. So a sequence longer than nB always contains a tn–decreasing step, and the total number of iterations is less than nB · n² log_{tn} M . Finally, Lemma 9 gives an upper bound for 1/log tn and leads to (ii) of Theorem 2.
⁶ The quantity α(n0 , tn ) corresponds to the maximum number of positive tests with the current index i = n − n0 after the first time b⊥H is tn–quasi–reduced.
6 Conclusion
Our paper gives for the first time linear bounds for the maximum number of iterations of the optimal LLL algorithm in fixed dimension. I believe that the complexity of finding an optimal Lovász–reduced basis is of great interest and not yet well understood.
Kannan presented [7] an algorithm which uses the non–optimal LLL algorithm (δ > 1) as a sub–routine and outputs a Korkine–Zolotarev basis of the lattice in O(nⁿ) log M steps. Such an output is also an optimal Lovász–reduced basis (actually, it is stronger). Thus Kannan's algorithm provides an upper bound on the complexity of finding an optimal Lovász–reduced basis⁷. For the future, one of the two following possibilities (or both) has to be considered.
(1) Our upper bound is likely to be improved. However, observe that in this paper we have already notably improved the naive bound for fixed dimension (the exponential order is replaced by a linear order). For the moment our bound remains worse than the one Kannan exhibits for his algorithm.
(2) The LLL algorithm, which is the most natural way to find an optimal Lovász–reduced basis, is not the best way (and then the same phenomenon may be possible for finding a non–optimal Lovász–reduced basis: more efficient algorithms than the classical LLL algorithm may output the same reduced basis).
Acknowledgments
I am indebted to Brigitte Vallée for drawing my attention to algorithmic problems in
lattice theory and for regular helpful discussions.
References
1. M. Ajtai. The shortest vector problem in L2 is NP-hard for randomized reductions. ECCC TR97-047, 1997. http://www.eccc.uni-trier.de/eccc/.
2. A. Akhavi. The worst case complexity of the optimal LLL algorithm. Preprint.
http://www.info.unicaen.fr/˜akhavi/publi.html, Caen, 1999.
3. L. Babai. On Lovász’ lattice reduction and the nearest lattice point problem. Combinatorica,
6(1):1–13, 1986.
4. A. Bachem and R. Kannan. Lattices and the basis reduction algorithm. CMU-CS, pages
84–112, 1984.
5. J. Cai. Some recent progress on the complexity of lattice problems. ECCC-TR99-006, 1999.
http://www.eccc.uni-trier.de/eccc/
6. H. Daudé, Ph. Flajolet, and B. Vallée. An average-case analysis of the Gaussian algorithm
for lattice reduction. Comb., Prob. & Comp., 123:397–433, 1997.
7. R. Kannan. Improved algorithm for integer programming and related lattice problems. In
15th Ann. ACM Symp. on Theory of Computing, pages 193–206, 1983.
8. R. Kannan. Algorithmic geometry of numbers. Ann. Rev. Comp. Sci., 2:231–267, 1987.
⁷ Moreover, as far as I know, there exists no polynomial-time algorithm for finding a Korkine–Zolotarev reduced basis from an optimal Lovász–reduced one. Finding an optimal Lovász–reduced basis seems to be strictly easier than finding a Korkine–Zolotarev reduced one.
9. M. Kaib and C. P. Schnorr. The generalized Gauss reduction algorithm. J. of Algorithms,
21:565–578, 1996.
10. J. C. Lagarias. Worst-case complexity bounds for algorithms in the theory of integral quadratic
forms. J. Algorithms, 1:142–186, 1980.
11. H.W. Lenstra. Integer programming with a fixed number of variables. Math. Oper. Res.,
8:538–548, 1983.
12. A. K. Lenstra, H. W. Lenstra, and L. Lovász. Factoring polynomials with rational coefficients.
Math. Ann., 261:513–534, 1982.
13. J. C. Lagarias and A. M. Odlyzko. Solving low-density subset sum problems. In 24th IEEE
Symposium FOCS, pages 1–10, 1983.
14. C. P. Schnorr. A hierarchy of polynomial time lattice basis reduction algorithm. Theoretical
Computer Science, 53:201–224, 1987.
15. A. Joux and J. Stern. Lattice reduction: A toolbox for the cryptanalyst. J. of Cryptology,
11:161–185, 1998.
16. B. Vallée. Un problème central en géométrie algorithmique des nombres: la réduction des
réseaux. Inf. Th. et App., 3:345–376, 1989.
17. B. Vallée. Gauss’ algorithm revisited. J. of Algorithms, 12:556–572, 1991.
18. P. van Emde Boas. Another NP-complete problem and the complexity of finding short vectors
in a lattice. Rep. 81-04 Math. Inst. Amsterdam, 1981.
19. O. von Sprang. Basisreduktionsalgorithmen für Gitter kleiner Dimension. PhD thesis, Universität des Saarlandes, 1994.
Iteration Algebras Are Not Finitely Axiomatizable

Extended Abstract

Stephen L. Bloom¹ and Zoltán Ésik²⋆

¹ Stevens Institute of Technology, Department of Computer Science, Hoboken, NJ 07030. bloom@cs.stevens-tech.edu
² A. József University, Department of Computer Science, Szeged, Hungary. esik@inf.u-szeged.hu
Abstract. Algebras whose underlying set is a complete partial order and whose term-operations are continuous may be equipped with a least fixed point operation µx.t. The set of all equations involving the µ-operation which hold in all continuous algebras determines the variety of iteration algebras. A simple argument is given here reducing the axiomatization of iteration algebras to that of Wilke algebras. It is shown that Wilke algebras do not have a finite axiomatization. This fact implies that iteration algebras do not have a finite axiomatization, even by "hyperidentities".
1 Introduction
For a fixed signature Σ, a µ/Σ-algebra A = (A, σ)σ∈Σ is a Σ-algebra equipped with an operation (µx.t)A for each µ/Σ-term t. Algebras whose underlying set A
is equipped with a complete partial order and whose basic operations σ : An → A
are continuous, determine µ/Σ-algebras in which µx.t is defined using least fixed
points (see below). The variety of µ/Σ-algebras generated by these continuous
algebras is the variety of µ/Σ-iteration algebras. Such algebras have been used in
many studies in theoretical computer science (for only a few of many references,
see [14,8,15,9,11,12,1].)
The main theorem in the current paper shows that the identities satisfied by
continuous algebras are not finitely based. This result has been known for some
⋆ Partially supported by grant no. FKFP 247/1999 from the Ministry of Education of Hungary and grant no. T22423 from the National Foundation of Hungary for Scientific Research.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 367–376, 2000.
© Springer-Verlag Berlin Heidelberg 2000
time [6], but only in an equivalent form, for iteration theories. In this note we give an argument which may have independent interest: we show how to translate "scalar" iteration algebra identities into Wilke algebra identities [17]. Since the identities of Wilke algebras are not finitely based, as we show, the same property holds for iteration algebras.
In fact, we prove a stronger result: no finite set of hyperidentities axiomatizes iteration algebras. Our notion of "hyperidentity" is stronger than that introduced by Taylor [19]. (In this extended abstract, we omit many of the proofs.)
2 µ/Σ-Terms and Algebras
In this section, we formulate the notion of a µ/Σ-algebra, where Σ is a signature,
i.e., a ranked set. We do not assume that the underlying set of an algebra is
partially ordered. Let V = {x1 , x2 , . . . } be a countably infinite set of “variables”,
and let Σ = (Σ0 , Σ1 , . . . ) be a ranked alphabet. The set of µ/Σ-terms, denoted
TΣ , is the smallest set of expressions satisfying the following conditions:
– each variable is in TΣ ;
– if σ ∈ Σn and t1 , . . . , tn ∈ TΣ , then σ(t1 , . . . , tn ) is in TΣ ;
– if x ∈ V and t ∈ TΣ , then µx.t is in TΣ .
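The grammar above is easy to mirror as a small abstract syntax in Python (a sketch of mine, not part of the paper), together with the usual free-variable computation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:            # sigma(t1, ..., tn)
    sym: str
    args: tuple

@dataclass(frozen=True)
class Mu:             # mu x. t
    var: str
    body: object

def free_vars(t):
    """The free variables of a mu/Sigma-term, defined as usual."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return set().union(*(free_vars(a) for a in t.args)) if t.args else set()
    return free_vars(t.body) - {t.var}   # Mu binds its variable

t = Mu("x", App("f", (Var("x"), Var("y"))))
assert free_vars(t) == {"y"}             # x is bound by the mu
```

Substitution t[t1/x1, ...] would be a similar structural recursion, renaming bound variables to respect the capture-avoidance convention stated below.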
Every occurrence of the variable x is bound in µx.t. The free variables occurring
in a term are defined as usual. We use the notation t = t[x1 , . . . , xn ] to indicate
that t is a term whose free variables are among the distinct variables x1 , . . . , xn ,
and no bound variable is among x1 , . . . , xn . Perhaps confusingly, we write
t[t1 /x1 , . . . , tn /xn ]
to indicate the term obtained by simultaneously substituting the terms ti for
the free occurrences of xi in t, for i = 1, . . . , n. (By convention, we assume no
variable free in ti becomes bound in t[t1 /x1 , . . . , tn /xn ].) But here, we do not
rule out the possibility that there are other free variables in t not affected by
this substitution.
Definition 1. A µ/Σ-algebra consists of a Σ-algebra A = (A, σ A )σ∈Σ and an
assignment of a function tA : An → A to each µ/Σ-term t = t[x1 , . . . , xn ] which
satisfies the (somewhat redundant) requirements that
1. x_i^A (a1 , . . . , an ) = ai , for i = 1, 2, . . . , n;
2. for each σ ∈ Σn ,
(σ(t1 , . . . , tn ))^A (a1 , . . . , an ) = σ^A (t1^A (a1 , . . . , an ), . . . , tn^A (a1 , . . . , an ));
3. if s and t differ only in their bound variables, then s^A = t^A ;
4. if s = s[x1 , . . . , xn ], t = t[x1 , . . . , xn ] and if s^A (a1 , . . . , an ) = t^A (a1 , . . . , an ) for all a1 , . . . , an ∈ A, then, for each i ∈ [n] and all aj ∈ A, 1 ≤ j ≤ n, j ≠ i,
(µxi .s)^A (a1 , . . . , ai−1 , ai+1 , . . . , an ) = (µxi .t)^A (a1 , . . . , ai−1 , ai+1 , . . . , an );
5. if t = t[x1 , . . . , xn ], then the function t^A depends on at most the arguments corresponding to the variables occurring freely in t.
If A and B are µ/Σ-algebras, a morphism ϕ : A → B is a function A → B such that for all terms t = t[x1 , . . . , xn ] and all a1 , . . . , an ∈ A,

ϕ(t^A (a1 , . . . , an )) = t^B (ϕ(a1 ), . . . , ϕ(an )).

In particular, a morphism of µ/Σ-algebras is also a Σ-algebra morphism. As usual, we say that a µ/Σ-algebra A satisfies an identity s ≈ t between µ/Σ-terms if the functions s^A and t^A are the same.
3 Conway and Iteration Algebras
Definition 2. A µ/Σ-Conway algebra is a µ/Σ-algebra satisfying the double iteration identities (1) and the composition identities (2):

µx.µy.t ≈ µz.t[z/x, z/y]    (1)
µx.s[r/x] ≈ s[µx.r[s/x]/x],    (2)

for all terms t = t[x, y, z1 , . . . , zp ], s = s[x, z1 , . . . , zp ], and r = r[x, z1 , . . . , zp ] in TΣ . A morphism of Conway algebras is just a morphism of µ/Σ-algebras. The class of Conway algebras is the class of all µ/Σ-Conway algebras, as Σ varies over all signatures.
Letting r be the variable x in the composition identity, we obtain the following
well known fact.
Lemma 1. Any µ/Σ-Conway algebra satisfies the fixed point identities
µx.s ≈ s[µx.s/x].
In particular, µx.s ≈ s whenever x does not occur free in s.
We will be mostly concerned with scalar signatures. A signature Σ is scalar if
Σn = ∅ for n > 1. The class of scalar Conway algebras is the collection of all
µ/Σ-Conway algebras, as Σ varies over all scalar signatures.
Proposition 1. Suppose that Σ is a scalar signature. If A is a µ/Σ-algebra
satisfying the composition identities, then the double iteration identities (1) hold
in A. Moreover, a µ/Σ-algebra A is a Conway-algebra iff (2) holds in A for all
terms r = r[x] and s = s[x].
✷
Note again that, unlike most treatments of µ/Σ-algebras, we do not assume that such an algebra comes with a partial order with various completeness properties guaranteeing that all monotone and/or continuous functions have least fixed points. We say that a µ/Σ-algebra A = (A, σ^A )σ∈Σ is continuous if the underlying set A is equipped with a directed-complete partial order and each basic function σ^A : Aⁿ → A preserves all sups of directed sets; the µ-operator is then defined via least fixed points (see [8]). For example, when a term t[x, y] denotes such a function t^A : A × A → A, then µx.t denotes the function A → A whose value at b ∈ A is the least a in A such that a = t^A (a, b).
For us, µx.t is interpreted as a function which has no particular properties. Of course, we will be interested in classes of algebras in which these functions are required to satisfy certain identities.
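In the continuous case, µx.t can be computed by the usual Kleene iteration from the bottom element; a toy Python illustration on a finite chain (the chain and the terms are my own examples, not from the paper):

```python
# least fixed point by Kleene iteration on the finite chain 0 < 1 < ... < 5
BOTTOM = 0

def mu(f):
    """Least a with a = f(a), for a monotone f on a finite chain."""
    a = BOTTOM
    while f(a) != a:
        a = f(a)       # iterate: bottom <= f(bottom) <= f(f(bottom)) <= ...
    return a

# t[x] = min(x + 2, 4) is monotone on the chain; its least fixed point is 4
assert mu(lambda x: min(x + 2, 4)) == 4
# a constant term: mu x.s = s when x is not free in s (cf. Lemma 1 below)
assert mu(lambda x: 3) == 3
```

On an arbitrary µ/Σ-algebra in the sense of this paper, no such computation is assumed; µx.t is just some function constrained by identities.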
Definition 3. A µ/Σ-algebra is a µ/Σ-iteration algebra if it satisfies all identities satisfied by the continuous Σ-algebras or, equivalently, the regular Σ-algebras of [15] or the iterative Σ-algebras of [11,16]. A morphism of iteration algebras is a morphism of µ/Σ-algebras.
For axiomatic treatments of iteration algebras, see [2]. It is proved in [7] that an
µ/Σ-algebra is an iteration algebra iff it is a Conway algebra satisfying certain
“group-identities”.
When Σ is either clear from context or not important, we say only iteration
algebra instead of “µ/Σ-iteration algebra”. The class of scalar iteration algebras is the class of all µ/Σ-iteration algebras, as Σ varies over all scalar
signatures. The following is well known [2].
Proposition 2. If A is an iteration algebra, then A satisfies the power identities: for each term t = t[x, y1 , . . . , yp ] and each n ≥ 1,

µx.t ≈ µx.tⁿ,

where t¹ = t and t^{k+1} := t[t^k /x]. ✷
4 Scalar µ/Σ-Iteration Algebras
We have already pointed out in Proposition 2 that any iteration algebra satisfies
the power identities. It turns out that for scalar signatures, the composition and
power identities are complete.
Theorem 1. When Σ is scalar, a µ/Σ-algebra A is an iteration algebra iff A
satisfies the composition identities and the power identities.
Proof. We need only prove that if A satisfies the composition and power identities, then A satisfies all iteration algebra identities. The idea of the proof is to show that in fact A is a quotient of a free iteration algebra.
✷
5 Wilke Algebras
A Wilke algebra is a two-sorted algebra A = (Af , Aω ) equipped with an associative operation Af × Af → Af , written u · v, a binary operation Af × Aω → Aω , written u · x, and a unary operation Af → Aω , written u† , which satisfy the following identities:

(u · v) · w = u · (v · w), u, v, w ∈ Af    (3)
(u · v) · x = u · (v · x), u, v ∈ Af , x ∈ Aω    (4)
(u · v)† = u · (v · u)† , u, v ∈ Af    (5)
(uⁿ)† = u† , u ∈ Af , n ≥ 2.    (6)
(See [17], where these structures were called “binoids”. In [18,13], it is shown
how Wilke algebras may be used to characterize regular sets of ω-words.)
A morphism h = (hf , hω ) : A → B of Wilke algebras A = (Af , Aω ) and
B = (Bf , Bω ), is a pair of functions
hf : Af → Bf
hω : Aω → Bω
which preserve all of the structure, so that hf is a semigroup morphism, and
hω (u† ) = hf (u)† , u ∈ Af
hω (u · x) = hf (u) · hω (x), u ∈ Af , x ∈ Aω
The function Af × Aω → Aω is called the action. We refer to the two equations
(3) and (4) as the associativity conditions; equation (5) is the commutativity condition, and (6) are the power identities.
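The motivating model is finite words acting on ultimately periodic ω-words, with u† = uuu· · ·. A Python sketch of mine, encoding an ultimately periodic word as a (prefix, period) pair and comparing words on a long common prefix (sound for the fixed examples here, though not a general decision procedure):

```python
def omega_prefix(prefix, period, n=200):
    """First n letters of the ultimately periodic word prefix.period^omega
    (period assumed nonempty)."""
    s = prefix
    while len(s) < n:
        s += period
    return s[:n]

def dagger(u):          # u† = u.u.u...  encoded as (prefix, period) = ("", u)
    return ("", u)

def act(u, x):          # the action: u . (p, q) = (u + p, q)
    return (u + x[0], x[1])

def eq(x, y):
    return omega_prefix(*x) == omega_prefix(*y)

u, v = "ab", "ba"
assert eq(dagger(u + v), act(u, dagger(v + u)))   # (u v)† = u (v u)†, identity (5)
assert eq(dagger(u * 3), dagger(u))               # (u^3)† = u†, an instance of (6)
```

The associativity conditions (3) and (4) are immediate for concatenation, so this structure is a Wilke algebra in the sense just defined.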
We will also consider unitary Wilke algebras, which are Wilke algebras A =
(Af , Aω ) in which Af is a monoid with neutral element 1, which satisfies the
unit conditions:
1 · u = u = u · 1, u ∈ Af
1 · x = x, x ∈ Aω .
A morphism of unitary Wilke algebras h : A → B is a morphism of Wilke
algebras which preserves the neutral element. (It follows that h(1† ) = 1† .)
We will need a notion weaker than that of a unitary Wilke algebra.
Definition 4. A unitary Wilke prealgebra (Af , Aω ) is an algebra with the
operations and constant of a unitary Wilke algebra whose operations need to
satisfy only the associativity and unit conditions. A morphism h = (hf , hω ) :
(Af , Aω ) → (Bf , Bω ) is defined as for unitary Wilke algebras.
6 Axiomatizing Wilke Algebras
We adapt an argument from [4] to show that unitary Wilke algebras do not have a finite axiomatization; thus, neither do Wilke algebras. In particular, we prove the following fact.
Proposition 3. For each prime p > 2 there is a unitary Wilke prealgebra Mp = (Mf , Mω ) which satisfies the commutativity condition (5) and all power identities (6) for integers n < p. However, for some u ∈ Mf , (u^p)† ≠ u† . Thus, (unitary) Wilke algebras have no finite axiomatization.
Proof. Let ⊥, ⊤ be distinct elements not in N, the set of nonnegative integers.
We define a function ρp : N → {⊥, ⊤} by ρp (n) = ⊤ if p divides n, and ρp (n) = ⊥ otherwise.
Let Mf = N, and Mω = {⊥, ⊤}. The monoid operation u · v on Mf is addition,
u + v, and the action of Mf on Mω is trivial: u · x = x, for u ∈ Mf , x ∈ Mω .
Lastly, define u† = ρp (u). It is clear that (Mf , Mω ) is a unitary Wilke prealgebra
satisfying the commutativity condition.
Now, p divides u^n = nu iff p divides n or p divides u. Thus, in particular, for n < p, (u^n)† = u†. Also, (uv)† = u(vu)†, since uv = vu and the action is trivial. But if u = 1, then u^p = p and u† = ⊥ ≠ ⊤ = p† = (u^p)†.
Now if there were any finite axiomatization of unitary Wilke algebras, then, by
the compactness theorem, there would be a finite axiomatization consisting of
the associativity, commutativity and unit conditions, together with some finite
subset of the power identities. This has just been shown impossible.
✷
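The counterexample in this proof is concrete enough to machine-check. The sketch below (not from the paper; the encoding is mine) realizes Mp for a small prime and verifies that all power identities with n < p hold while the identity for n = p fails:

```python
# The prealgebra M_p of Proposition 3: M_f = N under addition (so u^n
# is n*u), the action on M_omega = {bottom, top} is trivial, and
# u-dagger = rho_p(u), i.e. top iff p divides u.
p = 5  # any prime > 2 works

def dagger(u):
    """u-dagger: True plays the role of T (top), False of bottom."""
    return u % p == 0

# All power identities (u^n)-dagger = u-dagger with 2 <= n < p hold,
# since p is prime and does not divide n ...
for u in range(200):
    for n in range(2, p):
        assert dagger(n * u) == dagger(u)

# ... but the identity for n = p fails at u = 1:
# (1^p)-dagger = p-dagger = top, while 1-dagger = bottom.
assert dagger(p * 1) != dagger(1)
```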
Iteration Algebras Are Not Finitely Axiomatizable
7 From Iteration Algebras to Wilke Algebras and Back
Suppose that Σ is a signature, not necessarily scalar, and that B is a µ/Σ-algebra. Let Af denote the set of all functions t^B : B → B for Σ-terms t = t[x] having at most the one free variable x, and let Aω denote the set of elements of the form t^B, for µ/Σ-terms t having no free variables. We give A = (Af , Aω ) the structure of a unitary Wilke prealgebra as follows.
For ai = ti^B ∈ Af , i = 1, 2, and w = s^B ∈ Aω ,
a1 · a2 := (t1[t2/x])^B ; a1 · w := (t1[s/x])^B ; a1† := (µx.t1)^B .
Proposition 4. With the above definitions, (Af , Aω ) is a unitary Wilke prealgebra. Moreover, (Af , Aω ) is a unitary Wilke algebra iff B is an iteration algebra.
✷
Notation: We write BW for the unitary Wilke algebra determined by the
iteration algebra B.
We want to give a construction of a scalar µ/Σ-algebra A[B] from a unitary Wilke prealgebra B = (Bf , Bω ). Define the scalar signature Σ as follows:
Σ1 := {σb : b ∈ Bf }
Σ0 := {σz : z ∈ Bω }.
Let the underlying set A of A[B] be Bf ∪ Bω . The functions σ^A for σ ∈ Σ1 are defined as follows:
σb^A(w) := b · w.
For z ∈ Bω , σz^A := z. Lastly, we define the functions t^A for all µ/Σ-terms t = t[x], by induction on the structure of the term t. By Lemma 1, we need only consider the case t = µx.s[x] and x occurs free in s. In this case, s is σb1(. . . σbk(x) . . . ), for some k ≥ 0 and bj ∈ Bf . If k = 0, (µx.x)^A := 1†; otherwise,
(µx.σb1(. . . σbk(x) . . . ))^A := (b1 · . . . · bk)†.
Proposition 5. A[B] is a Conway algebra iff B satisfies the commutativity condition.
✷
Lemma 2. For each n ≥ 2, the Conway algebra A[B] satisfies the power identity
µx.t ≈ µx.tn , for all terms t = t[x] iff B satisfies the identity x† ≈ (xn )† .
✷
Corollary 1. A[B] is an iteration algebra iff B is a unitary Wilke algebra.
Corollary 2. For any finite subset X of the power identities, there is a scalar
Conway algebra A which satisfies all of the identities in X, but fails to satisfy
all power identities.
Proof. This follows from the previous lemma and Proposition 3.
✷
Remark 1. When B = (Bf , Bω ) is generated by a set B0 ⊆ Bf , one may reduce
the signature of A[B] to contain only the letters associated with the elements of
B0 . By the same argument one obtains the following stronger version of Corollary 2: For any finite subset X of the power identities, there is a scalar Conway
µ/Σ-algebra A having a single operation which satisfies all of the identities in
X, but fails to satisfy all power identities.
Corollary 3. Each unitary Wilke algebra B is isomorphic to A[B]W. ✷
8 Hyperidentities
A ‘hyperidentity’ differs from a standard identity only in the way it is interpreted. Suppose that ∆ is a fixed ranked signature in which ∆n is countably
infinite, for each n ≥ 0. For any signature Σ, an identity s ≈ t between µ/∆
terms is said to be a hyperidentity of the µ/Σ-algebra A, if for each way of
substituting a µ/Σ term tδ [x1 , . . . , xn , y] for each letter δ ∈ ∆n in the terms s, t,
the resulting µ/Σ-identity is true in A. Thus, the operation symbols in ∆ may
be called “meta-operation symbols”. This definition of “hyperidentity” extends
the notion in Taylor [19], in that terms with more than n variables are allowed
to be substituted for n-ary function symbols.
For example, if F, G ∈ ∆1 , then
µx.F (G(x)) ≈ F (µx.G(F (x)))
(7)
is a hyperidentity of the class of all iteration algebras. Indeed, this is just a
restatement of the composition identity.
The following proposition follows from our definition of hyperidentity.
Proposition 6. The two hyperidentities (7) and
µx.µy.F (x, y) ≈ µz.F (z, z)
axiomatize the Conway algebras.
✷
Each group identity mentioned above may be formulated as a hyperidentity (containing no free variables), as well. Thus, iteration algebras can be axiomatized
by an infinite set of hyperidentities.
Theorem 2. There is no finite set of hyperidentities which axiomatize iteration
algebras.
Proof idea. Suppose that there is a finite set E of hyperidentities that axiomatize
iteration algebras. Then there is a finite set E ′ of scalar hyperidentities such that
a scalar µ/Σ-algebra is an iteration algebra iff it is a model of the hyperidentities
E ′ . The equations in E ′ are obtained from those in E by replacing each meta-operation symbol F (x1 , . . . , xn ), n > 1, by unary meta-operation symbols fi (xj ) in all possible ways. (For example, F (x, y) ≈ F (y, x) gets replaced by the two equivalent equations f1 (x1 ) ≈ f1 (x2 ) and f2 (x1 ) ≈ f2 (x2 ).) When the rank
n of F is zero, F remains unchanged. Now if a scalar hyperidentity s ≈ t holds
in all iteration algebras, then either no variable occurs free in either s or t, or
both sides contain the same free variable. We then translate each such scalar
hyperidentity s ≈ t in the finite set E ′ into an identity tr(s) ≈ tr(t) between
two unitary Wilke prealgebra terms. The translation t ↦ tr(t) is by induction
on the structure of the term t. For example, if x has a free occurrence in t,
the translation of µx.t is tr(t)† ; otherwise tr(µx.t) is tr(t). We then show the
resulting set of identities tr(s) ≈ tr(t) together with the axioms for unitary Wilke
prealgebras gives a finite axiomatization of unitary Wilke algebras, contradicting
Proposition 3.
✷
9 Conclusion
Although most equational theories involving a fixed point operation which are
of interest in theoretical computer science are nonfinitely based, several of them
have a finite relative axiomatization over iteration algebras. Examples of such
theories are the equational theory of Kleene algebras of binary relations, or
(regular) languages, the theory of bisimulation or tree equivalence classes of
processes equipped with the regular operations, etc. See [10,3,7] and [5]. Since the
nonfinite equational axiomatizability of these theories is caused by the nonfinite
axiomatizability of iteration algebras, the constructions of this paper may be
used to derive simple proofs of the nonfinite axiomatizability of these theories
as well.
References
1. S.L. Bloom and Z. Ésik. Iteration algebras. International Journal of Foundations
of Computer Science, 3(3):245–302, 1992. Extended abstract in Colloq. on Trees
in Algebra and Programming, volume 493 of Lecture Notes in Computer Science,
264–274, 1991.
2. S.L. Bloom and Z. Ésik. Iteration Theories: The Equational Logic of Iterative Processes. EATCS Monographs on Theoretical Computer Science. Springer–Verlag,
1993.
3. S.L. Bloom and Z. Ésik. Equational axioms for regular sets. Mathematical Structures in Computer Science, 3:1–24, 1993.
4. S.L. Bloom and Z. Ésik. Shuffle binoids. Theoretical Informatics and Applications,
32(4-5-6):175–198, 1998.
5. S.L. Bloom, Z. Ésik and D. Taubner. Iteration theories of synchronization trees.
Information and Computation, 102:1–55, 1993.
6. Z. Ésik. Independence of the equational axioms of iteration theories. Journal of
Computer and System Sciences, 36:66–76, 1988.
7. Z. Ésik. Group axioms for iteration. Information and Computation, 148:131–180,
1999.
8. J. Goguen, J. Thatcher, E. Wagner, and J. Wright. Initial algebra semantics and
continuous algebras. Journal of the ACM, 24:68–95, 1977.
9. I. Guessarian. Algebraic Semantics. Lecture Notes in Computer Science 99,
Springer, Berlin-New York, 1981.
10. D. Krob. Complete systems of B-rational identities. Theoretical Computer Science,
89:207–343, 1991.
11. E. Nelson. Iterative algebras. Theoretical Computer Science 25:67–94, 1983.
12. D. Niwinski. Equational mu-calculus. Computation theory (Zaborów, 1984), Lecture Notes in Comput. Sci., 208:169–176, Springer, Berlin-New York, 1985.
13. D. Perrin and J-E. Pin. Semigroups and automata on infinite words. in J. Fountain
(ed.), Semigroups, Formal Languages and Groups, pages 49–72, Kluwer Academic
Pub., 1995.
14. D. Scott. Data types as lattices. SIAM Journal of Computing, 5:522–587, 1976.
15. J. Tiuryn. Fixed points and algebras with infinitely long expressions, I. Regular
algebras. Fundamenta Informaticae, 2:103–127, 1978.
16. J. Tiuryn. Unique fixed points vs. least fixed points. Theoretical Computer Science,
12:229–254, 1980.
17. T. Wilke. An Eilenberg Theorem for ∞-languages. In “Automata, Languages
and Programming”, Proc. of 18th ICALP Conference, vol. 510 of Lecture Notes in
Computer Science, 588–599, 1991.
18. T. Wilke. An algebraic theory for regular languages of finite and infinite words.
International Journal of Algebra and Computation, 3:447–489, 1993.
19. W. Taylor. Hyperidentities and hypervarieties. Aequationes Mathematicae, 21:30–
49, 1981.
Undecidable Problems in Unreliable Computations
Richard Mayr ⋆
Department of Computer Science, University of Edinburgh,
JCMB, Edinburgh EH9 3JZ, UK. e-mail: mayrri@dcs.ed.ac.uk
Abstract. Lossy counter machines are defined as Minsky n-counter machines
where the values in the counters can spontaneously decrease at any time. While
termination is decidable for lossy counter machines, structural termination (termination for every input) is undecidable. This undecidability result has far-reaching
consequences. Lossy counter machines can be used as a general tool to prove the
undecidability of many problems, for example (1) The verification of systems that
model communication through unreliable channels (e.g. model checking lossy
fifo-channel systems and lossy vector addition systems). (2) Several problems for
reset Petri nets, like structural termination, boundedness and structural boundedness. (3) Parameterized problems like fairness of broadcast communication
protocols.
1 Introduction
Lossy counter machines (LCM) are defined just like Minsky counter machines [19], but
with the addition that the values in the counters can spontaneously decrease at any time.
This is called ‘lossiness’, since a part of the counter is lost. (In a different framework this
corresponds to lost messages in unreliable communication channels.) There are many
different kinds of lossiness, i.e. different ways in which the counters can decrease. For
example, one can define that either a counter can only spontaneously decrease by 1, or
it can only become zero, or it can change to any smaller value. All these different ways
are described by different lossiness relations (see Section 2).
The addition of lossiness to counter machines weakens their computational power.
Some types of lossy counter machines (with certain lossiness relations) are not Turing-powerful, since reachability and termination are decidable for them. Since lossy counter
machines are weaker than normal counter machines, any undecidability result for lossy
counter machines is particularly interesting.
The main result of this paper is that structural termination (termination for every
input) is undecidable for every type of lossy counter machine (i.e. for every lossiness
relation).
This result can be applied to prove the undecidability of many problems. To prove
the undecidability of a problem X, it suffices to choose a suitable lossiness relation L
and reduce the structural termination problem for lossy counter machines with lossiness
relation L to the problem X. The important and nice point here is that problem X does
not need to simulate a counter machine perfectly. Instead, it suffices if X can simulate a
counter machine imperfectly, by simulating only a lossy counter machine. Furthermore,
one can choose the right type of imperfection (lossiness) by choosing the lossiness
relation L.
⋆ Work supported by DAAD Post-Doc grant D/98/28804.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 377–386, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Thus lossy counter machines can be used as a general tool to prove the undecidability
of problems. Firstly, they can be used to prove new undecidability results, and secondly
they can be used to give more elegant, simpler and much shorter proofs of existing results
(see Section 5).
2 Definitions
Definition 1. An n-counter machine [19] M is described by a finite set of states Q, an initial state q0 ∈ Q, a final state accept ∈ Q, n counters c1 , . . . , cn and a finite set of instructions of the form (q : ci := ci + 1; goto q ′ ) or (q : If ci = 0 then goto q ′ else ci := ci − 1; goto q ′′ ) where i ∈ {1, . . . , n} and q, q ′ , q ′′ ∈ Q. A configuration of M is described by a tuple (q, m1 , . . . , mn ) where q ∈ Q and mi ∈ IN is the content of the counter ci (1 ≤ i ≤ n). The size of a configuration is defined by size((q, m1 , . . . , mn )) := Σ_{i=1}^n mi . The possible computation steps are defined by
1. (q, m1 , . . . , mn ) → (q ′ , m1 , . . . , mi + 1, . . . , mn )
if there is an instruction (q : ci := ci + 1; goto q ′ ).
2. (q, m1 , . . . , mn ) → (q ′ , m1 , . . . , mn ) if there is an instruction (q : If ci =
0 then goto q ′ else ci := ci − 1; goto q ′′ ) and mi = 0.
3. (q, m1 , . . . , mn ) → (q ′′ , m1 , . . . , mi − 1, . . . , mn ) if there is an instruction
(q : If ci = 0 then goto q ′ else ci := ci − 1; goto q ′′ ) and mi > 0.
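A minimal sketch of this operational semantics (the tuple encoding and function name are mine, not from the paper):

```python
# One computation step of an n-counter machine (Definition 1).
# Instructions are encoded as tuples:
#   ('inc', q, i, q2)        for (q : ci := ci + 1; goto q2)
#   ('test', q, i, qz, qnz)  for (q : If ci = 0 then goto qz
#                                 else ci := ci - 1; goto qnz)
def step(instructions, config):
    q, counters = config
    for ins in instructions:
        if ins[0] == 'inc' and ins[1] == q:
            _, _, i, q2 = ins
            cs = list(counters)
            cs[i] += 1
            return (q2, tuple(cs))
        if ins[0] == 'test' and ins[1] == q:
            _, _, i, qz, qnz = ins
            if counters[i] == 0:
                return (qz, counters)   # zero branch, counter unchanged
            cs = list(counters)
            cs[i] -= 1
            return (qnz, tuple(cs))
    return None  # no instruction for state q: the run ends
```

Iterating `step` from a configuration yields exactly the runs defined next.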
A run of a counter machine is a (possibly infinite) sequence of configurations
s0 , s1 , . . . with s0 → s1 → s2 → s3 → . . .. Lossiness relations describe spontaneous
changes in the configurations of lossy counter machines.
Definition 2. Let →_s (for ‘sum’) be a relation on configurations of n-counter machines:
(q, m1 , . . . , mn ) →_s (q ′ , m′1 , . . . , m′n ) :⇔ (q, m1 , . . . , mn ) = (q ′ , m′1 , . . . , m′n ) ∨ (q = q ′ ∧ Σ_{i=1}^n mi > Σ_{i=1}^n m′i )
This relation means that either nothing is changed or the sum of all counters strictly decreases. Let id be the identity relation. A relation →_l is a lossiness relation iff id ⊆ →_l ⊆ →_s . A lossy counter machine (LCM) is given by a counter machine M and a lossiness relation →_l . Let → be the normal transition relation of M . The lossy transition relation ⇒_l of the LCM is defined by s1 ⇒_l s2 :⇔ ∃s′1 , s′2 . s1 →_l s′1 → s′2 →_l s2 . An arbitrary lossy counter machine is a lossy counter machine with an arbitrary (unspecified) lossiness relation. The following relations are lossiness relations:
Perfect: The relation id is a lossiness relation. Thus arbitrary lossy counter machines subsume normal counter machines.
Classic Lossiness: The classic lossiness relation →_cl is defined by (q, m1 , . . . , mn ) →_cl (q ′ , m′1 , . . . , m′n ) :⇔ q = q ′ ∧ ∀i. mi ≥ m′i . Here the contents of the counters can become any smaller value. A relation →_l is called a subclassic lossiness relation iff id ⊆ →_l ⊆ →_cl .
Bounded Lossiness: A counter can lose at most x ∈ IN before and after every computation step. Here the lossiness relation →_l(x) is defined by (q, m1 , . . . , mn ) →_l(x) (q ′ , m′1 , . . . , m′n ) :⇔ q = q ′ ∧ ∀i. mi ≥ m′i ≥ max {0, mi − x}. Note that →_l(x) is a subclassic lossiness relation.
Reset Lossiness: If a counter is tested for zero, then it can suddenly become zero. The lossiness relation →_rl is defined as follows: (q, m1 , . . . , mn ) →_rl (q ′ , m′1 , . . . , m′n ) iff q = q ′ and for all i either m′i = mi , or m′i = 0 and there is an instruction (q : If ci = 0 then goto q ′ else ci := ci − 1; goto q ′′ ). Note that →_rl is subclassic.
The definition of these lossiness relations carries over to other models like Petri nets [21], where places are considered instead of counters.
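As an illustration, the classic lossiness relation can be enumerated directly for small configurations (a sketch; the function name and encoding are mine):

```python
from itertools import product

# All classic-lossy variants of a configuration: the state is unchanged
# and every counter may independently drop to any value <= its current one.
def classic_lossy_successors(config):
    q, counters = config
    return {(q, ms) for ms in product(*(range(m + 1) for m in counters))}

# Example: a 2-counter configuration with counters (2, 1) has 3 * 2 = 6
# classic-lossy variants, including the unchanged configuration itself.
succ = classic_lossy_successors(('q', (2, 1)))
```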
Definition 3. For any arbitrary lossy n-counter machine and any configuration s let runs(s) be the set of runs that start at configuration s. (There can be more than one run if the counter machine is nondeterministic or lossy.) Let runsω(s) be the set of infinite runs that start at configuration s. A run r = {(q^i , m^i_1 , . . . , m^i_n )} (i = 0, 1, . . . ) ∈ runsω(s) is space-bounded iff ∃c ∈ IN. ∀i. Σ_{j=1}^n m^i_j ≤ c. Let runsω_b(s) be the set of space-bounded infinite runs that start at s. An (arbitrary lossy) n-counter machine M is
zero-initializing iff in the initial state q0 it first sets all counters to 0.
space-bounded iff the space used by M is bounded by a constant c: ∃c ∈ IN. ∀r ∈ runs((q0 , 0, . . . , 0)). ∀s ∈ r. size(s) ≤ c.
input-bounded iff in every run from any configuration the size of every reached configuration is bounded by the input: ∀s. ∀r ∈ runs(s). ∀s′ ∈ r. size(s′ ) ≤ size(s).
strongly-cyclic iff every infinite run from any configuration visits the initial state q0 infinitely often: ∀q ∈ Q, m1 , . . . , mn ∈ IN. ∀r ∈ runsω((q, m1 , . . . , mn )). ∃m′1 , . . . , m′n ∈ IN. (q0 , m′1 , . . . , m′n ) ∈ r.
bounded-strongly-cyclic iff every space-bounded infinite run from any configuration visits the initial state q0 infinitely often: ∀q ∈ Q, m1 , . . . , mn ∈ IN. ∀r ∈ runsω_b((q, m1 , . . . , mn )). ∃m′1 , . . . , m′n ∈ IN. (q0 , m′1 , . . . , m′n ) ∈ r.
If M is input-bounded then it is also space-bounded. If M is strongly-cyclic then it is also bounded-strongly-cyclic. If M is input-bounded and bounded-strongly-cyclic then it is also strongly-cyclic.
3 Decidable Properties
Since arbitrary LCM subsume normal counter machines, nothing is decidable for them.
However, some problems are decidable for classic LCM (with the classic lossiness
relation). They are not Turing-powerful. The results in this section are special
cases of positive decidability results in [4,5,2].
Lemma 1. Let M be a classic LCM and s a configuration of M . The set pre ∗ (s) :=
{s′ | s′ =⇒∗ s} of predecessors of s is effectively constructible.
Theorem 1. Reachability is decidable for classic LCM.
Lemma 2. Let M be a classic LCM with initial configuration s0 . It is decidable whether there is an infinite run that starts at s0 , i.e. whether runsω(s0 ) ≠ ∅.
Theorem 2. Termination is decidable for classic LCM.
It has been shown in [4] that even model checking classic LCM with the temporal
logics EF and EG (natural fragments of computation tree logic (CTL) [7,10]) is decidable.
4 The Undecidability Result
We show that structural termination is undecidable for LCM for every lossiness relation.
We start with the problem CM, which was shown to be undecidable by Minsky [19].
CM
Instance: A 2-counter machine M with initial state q0 .
Question: Does M accept (q0 , 0, 0) ?
BSC-ZI-CMω_b
Instance: A bounded-strongly-cyclic, zero-initializing 3-counter machine M with initial state q0 .
Question: Does M have an infinite space-bounded run from (q0 , 0, 0, 0), i.e. runsω_b((q0 , 0, 0, 0)) ≠ ∅ ?
Lemma 3. BSC-ZI-CMω_b is undecidable.
Proof. We reduce CM to BSC-ZI-CMω_b . Let M be a 2-counter machine with initial state
q0 . We construct a 3-counter machine M ′ as follows: First M ′ sets all three counters to
0. Then it does the same as M , except that after every instruction it increases the third
counter c3 by 1. Every instruction of M of the form (q : ci := ci + 1; goto q ′ ) with
(1 ≤ i ≤ 2) is replaced by q : ci := ci +1; goto q2 and q2 : c3 := c3 +1; goto q ′ , where
q2 is a new state. Every instruction of the form (q : If ci = 0 then goto q ′ else ci :=
ci − 1; goto q ′′ ) with (1 ≤ i ≤ 2) is replaced by three instructions: q : If ci =
0 then goto q2 else ci := ci − 1; goto q3 , q2 : c3 := c3 + 1; goto q ′ , q3 : c3 :=
c3 + 1; goto q ′′ where q2 , q3 are new states.
Finally, we replace the accepting state accept of M by the initial state q0′ of M ′ ,
i.e. we replace every instruction (goto accept) by (goto q0′ ). M ′ is zero-initializing by
definition. M ′ is bounded-strongly-cyclic, because c3 is increased after every instruction
and only set to zero at the initial state q0′ .
⇒ If M is a positive instance of CM then it has an accepting run from (q0 , 0, 0). This run has finite length and is therefore space-bounded. Then M ′ has an infinite space-bounded cyclic run that starts at (q0′ , 0, 0, 0). Thus M ′ is a positive instance of BSC-ZI-CMω_b .
⇐ If M ′ is a positive instance of BSC-ZI-CMω_b then there exists an infinite space-bounded run that starts at the configuration (q0′ , 0, 0, 0). By the construction of M ′ this run contains an accepting run of M from the configuration (q0 , 0, 0). Thus M is a positive instance of CM. ⊓⊔
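The instruction-level rewriting in this proof is mechanical; the sketch below (tuple encoding and fresh-state names are mine) performs the core of the construction, threading the extra counter c3 through every original instruction:

```python
# Rewrite every instruction of a 2-counter machine so that a fresh third
# counter (index 2) is incremented after each original step, as in the
# proof of Lemma 3. Encoding: ('inc', q, i, q2) and ('test', q, i, qz, qnz).
def add_step_counter(instructions):
    out, fresh = [], 0
    for ins in instructions:
        if ins[0] == 'inc':
            _, q, i, q2 = ins
            mid = f'aux{fresh}'; fresh += 1
            out.append(('inc', q, i, mid))
            out.append(('inc', mid, 2, q2))        # c3 := c3 + 1
        else:  # 'test'
            _, q, i, qz, qnz = ins
            mz, mnz = f'aux{fresh}', f'aux{fresh + 1}'; fresh += 2
            out.append(('test', q, i, mz, mnz))
            out.append(('inc', mz, 2, qz))         # c3 := c3 + 1 (zero branch)
            out.append(('inc', mnz, 2, qnz))       # c3 := c3 + 1 (dec branch)
    return out
```

Redirecting `goto accept` to the initial state, as the proof does, is a separate renaming pass not shown here.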
∃nLCMω
Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state q0 .
Question: Does there exist an n ∈ IN s.t. runsω((q0 , 0, 0, 0, n)) ≠ ∅ ?
Theorem 3. ∃nLCMω is undecidable for every lossiness relation.
Proof. We reduce BSC-ZI-CMω_b to ∃nLCMω with any lossiness relation →_l . For any bounded-strongly-cyclic, zero-initializing 3-counter machine M we construct a strongly-cyclic, input-bounded lossy 4-counter machine M ′ with initial state q0′ and lossiness relation →_l as follows: The 4th counter c4 holds the ‘capacity’. In every
operation it is changed in a way s.t. the sum of all counters never increases. (More
exactly, the sum of all counters can increase by 1, but only if it was decreased by 1 in
the previous step.) Every instruction of M of the form (q : ci := ci + 1; goto q ′ ) with
(1 ≤ i ≤ 3) is replaced by two instructions: q : If c4 = 0 then goto fail else c4 :=
c4 −1; goto q2 , q2 : ci := ci +1; goto q ′ , where fail is a special final state and q2 is a new
state. Every instruction of the form (q : If ci = 0 then goto q ′ else ci := ci − 1; goto q ′′ )
with (1 ≤ i ≤ 3) is replaced by two instructions: q : If ci = 0 then goto q ′ else ci :=
ci − 1; goto q2 , q2 : c4 := c4 + 1; goto q ′′ , where q2 is a new state.
M ′ is bounded-strongly-cyclic, because M is bounded-strongly-cyclic. M ′ is input-bounded, because every run from a configuration (q, m1 , . . . , m4 ) is space-bounded by m1 + m2 + m3 + m4 . Thus M ′ is also strongly-cyclic.
⇒ If M is a positive instance of BSC-ZI-CMω_b then there exists an n ∈ IN and an infinite run of M that starts at (q0 , 0, 0, 0), visits q0 infinitely often and always satisfies c1 + c2 + c3 ≤ n. Since id ⊆ →_l , there is also an infinite run of M ′ that starts at (q0 , 0, 0, 0, n), visits q0 infinitely often and always satisfies c1 + c2 + c3 + c4 ≤ n. Thus M ′ is a positive instance of ∃nLCMω .
⇐ If M ′ is a positive instance of ∃nLCMω then there exists an n ∈ IN s.t. there is an infinite run that starts at the configuration (q0′ , 0, 0, 0, n). This run is space-bounded, because it always satisfies c1 + c2 + c3 + c4 ≤ n. By the construction of M ′ , the sum of all counters can only increase by 1 if it was decreased by 1 in the previous step. By the definition of lossiness (see Def. 2) we get the following: If lossiness occurs (when the contents of the counters spontaneously change) then this strictly and permanently decreases the sum of all counters. It follows that lossiness can occur at most n times in this infinite run and the sum of all counters is bounded by n. Thus there is an infinite suffix of this run of M ′ where lossiness does not occur. Thus there exist q ′ ∈ Q, m′1 , . . . , m′4 ∈ IN s.t. an infinite suffix of this run of M ′ without lossiness starts at (q ′ , m′1 , . . . , m′4 ). It follows that there is an infinite space-bounded run of M that starts at (q ′ , m′1 , . . . , m′3 ). Since M is bounded-strongly-cyclic, this run must eventually visit q0 . Thus there exist m′′1 , . . . , m′′3 ∈ IN s.t. an infinite space-bounded run of M starts at (q0 , m′′1 , . . . , m′′3 ). Since M is zero-initializing, there is an infinite space-bounded run of M that starts at (q0 , 0, 0, 0). Thus M is a positive instance of BSC-ZI-CMω_b . ⊓⊔
Note that this undecidability result even holds under the additional condition that
the LCMs are strongly-cyclic and input-bounded. It follows immediately that model
checking LCM with the temporal logics CTL (computation-tree logic [7,10]) and LTL
(linear-time temporal logic [22]) is undecidable, since the question of ∃nLCMω can be
expressed in these logics. There are two variants of the structural termination problem:
Structterm-LCM, Variant 1
Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state q0 .
Question: Does M terminate for all inputs from q0 ?
Formally: ∀n1 , . . . , n4 ∈ IN. runsω((q0 , n1 , n2 , n3 , n4 )) = ∅ ?
Structterm-LCM, Variant 2
Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state q0 .
Question: Does M terminate for all inputs from every control state q ?
Formally: ∀n1 , . . . , n4 ∈ IN. ∀q ∈ Q. runsω((q, n1 , n2 , n3 , n4 )) = ∅ ?
Theorem 4. Structural termination is undecidable for lossy counter machines. Both variants of Structterm-LCM are undecidable for every lossiness relation.
Proof. The proof of Theorem 3 carries over, because the LCM is strongly-cyclic and the 3-CM in BSC-ZI-CMω_b is zero-initializing. ⊓⊔
Space-Boundedness for LCM
Instance: A strongly-cyclic 4-counter LCM M with initial configuration (q0 , 0, 0, 0, 0)
Question: Is M space-bounded ?
Theorem 5. Space-boundedness for LCM is undecidable for all lossiness relations.
Theorem 6. Structural space-boundedness for LCM is undecidable for every lossiness
relation.
Proof. The proof is similar to Theorem 4. An extra counter is used to count the length of the run. It is unbounded iff the run is infinite. All other counters are bounded. ⊓⊔
5 Applications
5.1 Lossy Fifo-Channel Systems
Fifo-channel systems are systems of finitely many finite-state processes that communicate with each other by sending messages via unbounded fifo-channels (queues,
buffers). In lossy fifo-channel systems these channels are lossy, i.e. they can spontaneously lose (arbitrarily many) messages. This can be used to model communication
via unreliable channels. While normal fifo-channel systems are Turing-powerful, some
safety-properties are decidable for lossy fifo-channel systems [2,5,1]. However, liveness properties are undecidable even for lossy fifo-channel systems. In [3] Abdulla and
Jonsson showed the undecidability of the recurrent-state problem for lossy fifo-channel
systems. This problem asks whether certain states of the system can be visited infinitely often. The undecidable core of the problem is essentially whether there exists an initial configuration of a
lossy fifo-channel system s.t. it has an infinite run. The undecidability proof in [3] was
done by a long and complex reduction from a variant of Post’s correspondence problem
(2-permutation PCP [23], which is (wrongly) called cyclic PCP in [3]).
Lossy counter machines can be used to give a much simpler proof of this result.
The lossiness of lossy fifo-channel systems is classic lossiness, i.e. the contents of a
fifo-channel can change to any substring at any time. A lossy fifo-channel system can
simulate a classic LCM (with some additional deadlocks) in the following way: Every
lossy fifo-channel contains a string in X ∗ (for some symbol X) and is used as a classic
lossy counter. The only problem is the test for zero. We test the emptiness of a channel
by adding a special symbol Y and removing it in the very next step. If it can be done then
the channel is empty (or has become empty by lossiness). If this cannot be done, then the
channel was not empty or the symbol Y was lost. In this case we get a deadlock. These
additional deadlocks do not affect the existence of infinite runs, and thus the results of
Section 4 carry over. Thus the problem ∃nLCMω (for the classic lossiness relation) can
be reduced to the problem above for lossy fifo-channel systems and the undecidability
follows immediately from Theorem 3.
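The zero test described above can be sketched as follows (the function name and the `lose` hook are mine; for simplicity, lossiness is modeled as happening just before the test):

```python
# A classic lossy counter as a fifo channel of X's (Section 5.1).
# To test for zero: append the special symbol Y and try to remove it in
# the very next step. If Y is not at the front, the channel was non-empty
# (or Y itself was lost) and the simulation deadlocks.
def zero_test(channel, lose=lambda ms: list(ms)):
    channel = lose(list(channel)) + ['Y']  # lossiness may drop messages
    if channel[0] == 'Y':
        return channel[1:]   # success: the channel was (or became) empty
    return None              # deadlock
```

As the text notes, the extra deadlocks introduced when the test fails do not affect the existence of infinite runs, which is why the reduction goes through.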
5.2 Model Checking Lossy Basic Parallel Processes
Petri nets [21] (also described as ‘vector addition systems’ in a different framework) are
a widely known formalism used to model concurrent systems. They can also be seen as
counter machines without the ability to test for zero, and are not Turing-powerful, since
the reachability problem is decidable for them [17]. Basic Parallel Processes correspond
to communication-free nets, the (very weak) subclass of labeled Petri nets where every
transition has exactly one place in its preset. They have been studied intensively in the
framework of model checking and semantic equivalences (e.g. [12,18,6,15,20]).
An instance of the model checking problem is given by a system S (e.g. a counter machine, Petri net, pushdown automaton,. . . ) and a temporal logic formula ϕ. The question
asks whether the system S has the properties described by ϕ, denoted S |= ϕ.
The branching-time temporal logics EF, EG and EGω are defined as extensions of Hennessy-Milner Logic [13,14,10] by the operators EF , EG and EG ω , respectively. s |= EF ϕ iff there exists an s′ s.t. s →∗ s′ and s′ |= ϕ. s0 |= EG ω ϕ iff there exists an infinite run s0 → s1 → s2 → . . . s.t. ∀i. si |= ϕ. EG is similar, except that it also includes finite runs that end in a deadlock. Alternatively, EF and EG can be seen as fragments of computation-tree logic (CTL [7,10]), since EF ϕ = true U ϕ and EGϕ = ϕ wU false.
Model checking Petri nets with the logic EF is undecidable, but model checking
Basic Parallel Processes with EF is PSPACE-complete [18]. Model checking Basic
Parallel Processes with EG is undecidable [12]. It is different for lossy systems: By
induction on the nesting-depth of the operators EF , EG and EG ω , and constructions
similar to the ones in Lemma 1 and Lemma 2, it can be shown that model checking
classic LCM with the logics EF, EG and EGω is decidable. Thus it is also decidable for
classical lossy Petri nets and classical lossy Basic Parallel Processes (see [4]).
However, model checking lossy Basic Parallel Processes with nested EF and EG/EG ω
operators is still undecidable for every subclassic lossiness relation. This is quite surprising, since lossy Basic Parallel Processes are an extremely weak model of infinite-state
concurrent systems and the temporal logic used is very weak as well.
Theorem 7. Model checking lossy Basic Parallel Processes (with any subclassic
lossiness relation) with formulae of the form EF EG ω Φ, where Φ is a Hennessy-Milner
Logic formula, is undecidable.
Proof. Esparza and Kiehn showed in [12] that for every counter machine M (with all
counters initially 0) a Basic Parallel Process P and a Hennessy-Milner Logic formula
ϕ can be constructed s.t. M does not halt iff P |= EG ω ϕ. The construction carries over
to subclassic LCM and subclassic lossy Basic Parallel Processes. The control-states of
the counter machine are modeled by special places of the Basic Parallel Processes. In
every infinite run that satisfies ϕ exactly one of these places is marked at any time.
We reduce ∃nLCMω to the model checking problem. Let M be a subclassic LCM. Let
P be the corresponding Basic Parallel Processes as in [12] and let ϕ be the corresponding
Hennessy-Milner Logic formula as in [12]. We use the same subclassic lossiness relation
on M and on P . P stores the contents of the 4-th counter in a place Y . Thus P ∥ Y^n corresponds to the configuration of M with n in the 4-th counter (and 0 in the others). We define a new initial state X and transitions X −a→ X ∥ Y and X −b→ P , where a and b do not occur in P . Let Φ := ϕ ∧ ¬⟨b⟩true. Then M is a positive instance of ∃nLCMω iff X |= EF EG ω Φ. The result follows from Theorem 3. ⊓⊔
For Petri nets and Basic Parallel Processes, the meaning of Hennessy-Milner Logic
formulae can be expressed by boolean combinations of constraints of the form p ≥ k
(at least k tokens on place p). Thus the results also hold if boolean combinations of such
constraints are used instead of Hennessy-Milner Logic formulae. Another consequence
of Theorem 7 is that model checking lossy Petri nets with CTL is undecidable.
5.3 Reset/Transfer Petri Nets
Reset Petri nets are an extension of Petri nets by the addition of reset-arcs. A reset-arc
between a transition and a place has the effect that, when the transition fires, all tokens
are removed from this place, i.e. it is reset to zero. Transfer nets and transfer arcs are
defined similarly, except that all tokens on this place are moved to some different place.
It was shown in [8] that termination is decidable for ‘Reset Post G-nets’, a more general
extension of Petri nets that subsumes reset nets and transfer nets. (For normal Petri nets,
termination is EXPSPACE-complete [24].) While boundedness is trivially decidable
for transfer nets, the same question for reset nets was open for some time (and even a
wrong decidability proof was published). Finally, it was shown in [8,9] that boundedness
(and structural boundedness) is undecidable for reset Petri nets. The proof in [8] was
done by a complex reduction from Hilbert’s 10th problem (a simpler proof was later
given in [9]).
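The firing rule for reset- and transfer-arcs described above can be sketched in a few lines (an illustrative Python sketch with our own names, not code from [8]; we assume resets and transfers are applied after the ordinary input tokens have been consumed):

```python
from collections import Counter

def fire(marking, consume, produce, resets=(), transfers=()):
    """Fire one transition of a Petri net extended with reset- and
    transfer-arcs.  marking: dict place -> token count.
    Returns the successor marking, or None if the transition is disabled.
    (Assumed order: consume ordinary inputs, then apply resets and
    transfers, then produce outputs.)"""
    m = Counter(marking)
    if any(m[p] < w for p, w in consume.items()):
        return None                       # some input place lacks tokens
    for p, w in consume.items():
        m[p] -= w
    for p in resets:                      # reset-arc: remove all tokens
        m[p] = 0
    for src, dst in transfers:            # transfer-arc: move all tokens
        m[dst] += m[src]
        m[src] = 0
    for p, w in produce.items():
        m[p] += w
    return dict(m)

# A transition with a reset-arc on place p empties p no matter how many
# tokens it carries:
fire({'p': 3, 'q': 1}, consume={'q': 1}, produce={'q': 1}, resets=('p',))
# -> {'p': 0, 'q': 1}
```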
Here we generalize these results by using lossy counter machines. This also gives a
unified framework and considerably simplifies the proofs.
Lemma 4. Reset Petri nets can simulate lossy counter machines with reset-lossiness.
Theorem 8. Structural termination, boundedness and structural boundedness are undecidable for lossy reset Petri nets with every subclassic lossiness relation.
Proof. It follows from Lemma 4 that a lossy reset Petri net with subclassic lossiness
relation →^l can simulate a lossy counter machine with lossiness relation →^l ∪ →^rl. The
results follow from Theorem 4, Theorem 5 and Theorem 6. ⊓⊔
The undecidability result on structural termination carries over to transfer nets (instead of a reset, the tokens are moved to a special ‘dead’ place), but the other results do not.
Note that for normal Petri nets structural termination and structural boundedness can be
decided in polynomial time (just check if there is a positive linear combination of effects
of transitions). Theorem 7 and Theorem 8 also hold for arbitrary lossiness relations, but
this requires an additional argument. The main point is that Petri nets (unlike counter
machines) can increase one place/counter and decrease another in the same step.
5.4 Parameterized Problems
We consider verification problems for systems whose definition includes a parameter
n ∈ ℕ. Intuitively, n can be seen as the size of the system. Examples are
– Systems of n indistinguishable communicating finite-state processes.
– Systems of communicating pushdown automata with n-bounded stack.
– Systems of (a fixed number of) processes that communicate through (lossy) buffers
or queues of size n.
Undecidable Problems in Unreliable Computations
Let P(n) be such a system with parameter n. For every fixed n, P(n) is a system
with finitely many states, and thus (almost) every verification problem is decidable for
it. So the problem P(n) |= Φ is decidable for any temporal logic formula Φ from
any reasonable temporal logic, e.g. the modal µ-calculus [16] or monadic second-order
theory. The parameterized verification problem asks whether a property holds independently of
the parameter n, i.e. for every size. Formally, the question is whether for given P and Φ we
have ∀n ∈ ℕ. P(n) |= Φ (or ¬∃n ∈ ℕ. P(n) |= ¬Φ). Many of these parameterized
problems are undecidable by the following meta-theorem.
Theorem 9. A parameterized verification problem is undecidable if it satisfies the following conditions:
1. It can encode an n-space-bounded lossy counter machine (for some lossiness relation) in such a way that P(n) corresponds to the initial configuration with n in one
counter and 0 in the others.
2. It can check for the existence of an infinite run.
Proof. By a reduction of ∃nLCM^ω and Theorem 3. The important point is that in the
problem ∃nLCM^ω one can require that the LCM is input-bounded. ⊓⊔
The technique of Theorem 9 is used in [11] to show the undecidability of the model
checking problem for linear-time temporal logic (LTL) and broadcast communication
protocols. These are systems of n indistinguishable communicating finite-state processes
where a ‘broadcast’ by one process can affect all other n − 1 processes. Such a broadcast
can be used to set a simulated counter to zero. However, there is no test for zero. One
reduces ∃nLCM^ω with lossiness relation →^rl to the model checking problem. In the
same way, similar results can be proved for parameterized problems about systems with
bounded buffers, stacks, etc.
6 Conclusion
Lossy counter machines can be used as a general tool to show the undecidability of many
problems. They provide a unified way of reasoning about many quite different classes
of systems. For example, the recurrent-state problem for lossy fifo-channel systems,
the boundedness problem for reset Petri nets, and the fairness problem for broadcast
communication protocols were previously thought to be completely unrelated. Yet lossy
counter machines show that the principles behind their undecidability are the same.
Moreover, the undecidability proofs for lossy counter machines are very short and much
simpler than previous proofs of weaker results [3,8].
Lossy counter machines have also been used in this paper to show that even for very
weak temporal logics and extremely weak models of infinite-state concurrent systems,
the model checking problem is undecidable (see Subsection 5.2). We expect that many
more problems can be shown to be undecidable with the help of lossy counter machines,
especially in the area of parameterized problems (see Subsection 5.4).
Acknowledgments
Thanks to Javier Esparza and Petr Jančar for fruitful discussions.
References
1. P. Abdulla, A. Bouajjani, and B. Jonsson. On-the-fly Analysis of Systems with Unbounded,
Lossy Fifo Channels. In 10th Intern. Conf. on Computer Aided Verification (CAV’98). LNCS
1427, 1998.
2. P. Abdulla and B. Jonsson. Verifying Programs with Unreliable Channels. In LICS’93. IEEE,
1993.
3. P. Abdulla and B. Jonsson. Undecidable verification problems for programs with unreliable
channels. Information and Computation, 130(1):71–90, 1996.
4. A. Bouajjani and R. Mayr. Model checking lossy vector addition systems. In Proc. of
STACS’99, volume 1563 of LNCS. Springer Verlag, 1999.
5. G. Cécé, A. Finkel, and S.P. Iyer. Unreliable Channels Are Easier to Verify Than Perfect
Channels. Information and Computation, 124(1):20–31, 1996.
6. S. Christensen, Y. Hirshfeld, and F. Moller. Bisimulation equivalence is decidable for Basic
Parallel Processes. In E. Best, editor, Proceedings of CONCUR 93, volume 715 of LNCS.
Springer Verlag, 1993.
7. E.M. Clarke and E.A. Emerson. Design and synthesis of synchronization skeletons using
branching time temporal logic. volume 131 of LNCS, pages 52–71, 1981.
8. C. Dufourd, A. Finkel, and Ph. Schnoebelen. Reset nets between decidability and undecidability. In Proc. of ICALP’98, volume 1443 of LNCS. Springer Verlag, 1998.
9. C. Dufourd, P. Jančar, and Ph. Schnoebelen. Boundedness of Reset P/T Nets. In Proc. of
ICALP’99, volume 1644 of LNCS. Springer Verlag, 1999.
10. E.A. Emerson. Temporal and modal logic. In J. van Leeuwen, editor, Handbook of Theoretical
Computer Science : Volume B, FORMAL MODELS AND SEMANTICS. Elsevier, 1994.
11. J. Esparza, A. Finkel, and R. Mayr. On the verification of broadcast protocols. In Proc. of
LICS’99. IEEE, 1999.
12. J. Esparza and A. Kiehn. On the model checking problem for branching time logics and Basic
Parallel Processes. In CAV’95, volume 939 of LNCS. Springer Verlag, 1995.
13. M. Hennessy and R. Milner. On observing nondeterminism and concurrency. volume 85 of
LNCS, pages 295–309, 1980.
14. M. Hennessy and R. Milner. Algebraic laws for nondeterminism and concurrency. Journal
of the Association for Computing Machinery, 32:137–162, 1985.
15. Y. Hirshfeld, M. Jerrum, and F. Moller. A polynomial-time algorithm for deciding bisimulation
equivalence of normed Basic Parallel Processes. Journal of Mathematical Structures in
Computer Science, 6:251–259, 1996.
16. D. Kozen. Results on the propositional µ-calculus. TCS, 27:333–354, 1983.
17. E. Mayr. An algorithm for the general Petri net reachability problem. SIAM Journal of
Computing, 13:441–460, 1984.
18. R. Mayr. Weak bisimulation and model checking for Basic Parallel Processes. In Foundations
of Software Technology and Theoretical Computer Science (FST&TCS’96), volume 1180 of
LNCS. Springer Verlag, 1996.
19. M.L. Minsky. Computation: Finite and Infinite Machines. Prentice-Hall, 1967.
20. F. Moller. Infinite results. In Ugo Montanari and Vladimiro Sassone, editors, Proceedings of
CONCUR’96, volume 1119 of LNCS. Springer Verlag, 1996.
21. J.L. Peterson. Petri net theory and the modeling of systems. Prentice-Hall, 1981.
22. A. Pnueli. The temporal logic of programs. In FOCS’77. IEEE, 1977.
23. K. Ruohonen. On some variants of Post’s correspondence problem. Acta Informatica, 19:357–
367, 1983.
24. H. Yen. A unified approach for deciding the existence of certain Petri net paths. Information
and Computation, 96(1):119–137, 1992.
Equations in Free Semigroups with
Anti-involution and Their Relation to Equations
in Free Groups
Claudio Gutiérrez¹,²
¹ Computer Science Group, Dept. of Mathematics, Wesleyan University
² Departamento de Ingeniería Matemática, D.I.M., Universidad de Chile
(Research funded by FONDAP, Matemáticas Aplicadas)
cgutierrez@wesleyan.edu
Abstract. The main result of the paper is the reduction of the problem
of satisfiability of equations in free groups to the satisfiability of equations
in free semigroups with anti-involution (SGA), by a non-deterministic
polynomial time transformation.
A free SGA is essentially the set of words over a given alphabet plus an
operator which reverses words. We study equations in free SGA, generalizing several results known for equations in free semigroups, among them
that the exponent of periodicity of a minimal solution of an equation E
in free SGA is bounded by 2^{O(|E|)}.
1 Introduction
The study of the problem of solving equations in free SGA (unification in free
SGA) and of its computational complexity is closely related to the problem of solving equations in free semigroups and in free groups, which have lately
attracted much attention in the theoretical computer science community [3], [12],
[13], [14].
A free semigroup with anti-involution is a structure which lies in between
free semigroups and free groups. Besides this relationship with semigroups and
groups, the axioms defining SGA show up in several important theories, such as
algebras of binary relations, transposition of matrices, and inverse semigroups.
The problem of solving equations in free semigroups was proven to be decidable by Makanin in 1976 in a long paper [10]. Some years later, in 1982,
Makanin proved that solving equations in free groups is also decidable
[11]. The technique used was similar to that of the first paper, although the
details are much more involved. He reduced equations in free groups to solving
equations in free SGA with special properties (‘non-contractible’), and showed
decidability for equations of this type. For free SGA (without any further condition) the decidability of the problem of satisfiability of equations is still open,
although we conjecture it is decidable.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 387–396, 2000.
© Springer-Verlag Berlin Heidelberg 2000

Both of Makanin’s algorithms have received much attention. The enumeration of all unifiers was done by Jaffar for semigroups [6] and by Razborov
for groups [15]. Since then, complexity has become the main issue. Several authors
have analyzed the complexity of Makanin’s algorithm for semigroups [6], [16], [1],
with EXPSPACE being the best upper bound so far [3]. Very recently Plandowski,
without using Makanin’s algorithm, presented a PSPACE upper bound for
the problem of satisfiability of equations in free semigroups [14]. On the other
hand, the complexity of Makanin’s algorithm for groups was analyzed
by Koscielski and Pacholski [8], who showed that it is not primitive recursive.
With respect to lower bounds, the only bound known for both problems
is NP-hardness, which seems weak for the case of free groups. It is easy to see
that this lower bound holds for the case of free SGA as well.
The main result of this paper is the reduction of equations in free groups to
equations in free SGA (Theorem 9 and Corollary 10). This is achieved by generalizing to SGA several known results for semigroups, using some of Makanin’s
results in [11], and proving a result that links these results (Proposition 3).
Although we do not use it here, we show that the standard bounds on the exponent of periodicity of minimal solutions to word equations also hold with minor
modifications in the case of free SGA (Theorem 5).
For concepts of word combinatorics we will follow the notation of [9]. By ǫ
we denote the empty word.
2 Equations in Free SGA
A semigroup with anti-involution (SGA) is an algebra with a binary associative operation (written as concatenation) and a unary operation ( )^{-1} with the
equational axioms

(xy)z = x(yz),   (xy)^{-1} = y^{-1}x^{-1},   (x^{-1})^{-1} = x.   (1)
A free semigroup with anti-involution is an initial algebra for this variety. It is
not difficult to check that, for a given alphabet C, the set of words over C ∪ C^{-1}
together with the operator ( )^{-1}, which reverses a word and changes every letter
to its twin (e.g. a to a^{-1} and conversely), is a free SGA over C.
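The anti-involution on words can be made concrete in a few lines (an illustrative Python sketch with our own encoding: a letter a is the string 'a' and its twin a^{-1} the string 'a-1'; words are tuples of letters):

```python
def inv(word):
    """The anti-involution on words over C ∪ C^{-1}: reverse the word
    and replace every letter by its twin (a <-> a^{-1})."""
    def twin(c):
        return c[:-2] if c.endswith('-1') else c + '-1'
    return tuple(twin(c) for c in reversed(word))

# The SGA axioms (xy)^{-1} = y^{-1} x^{-1} and (x^{-1})^{-1} = x hold:
x, y = ('a', 'b-1'), ('c',)
assert inv(x + y) == inv(y) + inv(x)
assert inv(inv(x)) == x
```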
Equations and Solutions. Let C and V be two disjoint alphabets of constants
and variables, respectively. Denote C^{-1} = {c^{-1} : c ∈ C}, and similarly for V^{-1}.
An equation E in free SGA with constants C and variables V is a pair (w_1, w_2) of
words over the alphabet A = C ∪ C^{-1} ∪ V ∪ V^{-1}. The number |E| = |w_1| + |w_2| is
the length of the equation E and |E|_V will denote the number of occurrences of
variables in E. These equations are also known as equations in a paired alphabet.
A map S : V → (C ∪ C^{-1})* can be uniquely extended to an SGA-homomorphism S̄ : A* → (C ∪ C^{-1})* by defining S(c) = c for c ∈ C and
S(u^{-1}) = (S(u))^{-1} for u ∈ C ∪ V. We will use the same symbol S for the map
S and the SGA-homomorphism S̄. A solution S of the equation E = (w_1, w_2)
is (the unique SGA-homomorphism defined by) a map S : V → (C ∪ C^{-1})*
such that S(w_1) = S(w_2). The length of the solution S is |S(w_1)|. By S(E)
we denote the word S(w_1) (which is the same as S(w_2)). Each occurrence of
a symbol u ∈ A in E with S(u) ≠ ǫ determines a unique factor in S(E), say
S(E)[i, j], which we will denote by S(u, i, j) and call simply an image of u in
S(E).
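The extension of a map S : V → (C ∪ C^{-1})* to an SGA-homomorphism, and the check S(w_1) = S(w_2), can be sketched as follows (illustrative Python; we encode a letter a as the string 'a' and its twin a^{-1} as 'a-1', our own convention, not the paper's):

```python
def twin(c):
    """Twin letter: a <-> a^{-1}."""
    return c[:-2] if c.endswith('-1') else c + '-1'

def apply_solution(S, word, variables):
    """Apply the SGA-homomorphism extending S to a word over
    A = C ∪ C^{-1} ∪ V ∪ V^{-1}: constants map to themselves,
    and u^{-1} maps to (S(u))^{-1}."""
    out = []
    for u in word:
        base, inverted = (u[:-2], True) if u.endswith('-1') else (u, False)
        if base in variables:
            img = S[base]
            if inverted:  # (S(u))^{-1}: reverse and twin every letter
                img = tuple(twin(c) for c in reversed(img))
            out.extend(img)
        else:
            out.append(u)
    return tuple(out)

# The equation (x, y^{-1}) is solved by S(x) = ab^{-1}, S(y) = ba^{-1}:
S = {'x': ('a', 'b-1'), 'y': ('b', 'a-1')}
assert apply_solution(S, ('x',), {'x', 'y'}) == \
       apply_solution(S, ('y-1',), {'x', 'y'}) == ('a', 'b-1')
```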
The Equivalence Relation (S, E). Let S be a solution of E and let P be the set of
positions of S(E). Define the binary relation (S, E)′ on P × P as follows: given
positions p, q ∈ P, p (S, E)′ q if and only if one of the following holds:
1. p = i + k and q = i′ + k, where S(x, i, j) and S(x, i′, j′) are images of x in
S(E) and 0 ≤ k < |S(x)|.
2. p = i + k and q = j′ − k, where S(x, i, j) and S(x^{-1}, i′, j′) are images of x
and x^{-1} in S(E) and 0 ≤ k < |S(x)|.
Then define (S, E) as the transitive closure of (S, E)′. Observe that (S, E) is an
equivalence relation.
Contractible Words. A word w ∈ A* is called non-contractible if for every u ∈ A
the word w contains neither the factor uu^{-1} nor the factor u^{-1}u. An equation (w_1, w_2) is
called non-contractible if both w_1 and w_2 are non-contractible. A solution S to
an equation E is called non-contractible if for every variable x which occurs in
E, the word S(x) is non-contractible.
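Non-contractibility is a purely local condition on adjacent letters and is easy to test (an illustrative sketch; the tuple encoding of words and the '-1' suffix for twin letters are our own conventions):

```python
def is_non_contractible(word):
    """A word over C ∪ C^{-1} is non-contractible iff no two adjacent
    letters form a factor u u^{-1} or u^{-1} u, i.e. no letter is
    immediately followed by its twin."""
    def twin(c):
        return c[:-2] if c.endswith('-1') else c + '-1'
    return all(b != twin(a) for a, b in zip(word, word[1:]))

assert is_non_contractible(('a', 'b', 'a-1'))       # a b a^{-1}: fine
assert not is_non_contractible(('a', 'a-1', 'b'))   # contains a a^{-1}
```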
Boundaries and Superpositions. Given a word w ∈ A*, we define a boundary of
w as a pair of consecutive positions (p, p + 1) in w. We will write simply p_w, the
subscript denoting the corresponding word. By extension, we define i(w) = 0_w
and f(w) = |w|_w, the initial and final boundaries respectively. Note that the
boundaries of w have a natural linear order (p_w ≤ q_w iff p ≤ q as integers).
Given an equation E = (w_1, w_2), a superposition (of the boundaries of the
left and right hand sides) of E is a linear order ≤ on the set of boundaries of w_1
and w_2 extending the natural orders of the boundaries of w_1 and w_2, such that
i(w_1) = i(w_2) and f(w_1) = f(w_2), possibly identifying some p_{w_1} and q_{w_2}.
Cuts and Witnesses. Given a superposition ≤ of E = (w_1, w_2), a cut is a boundary j of w_2 (resp. w_1) such that j ≠ b for all boundaries b of w_1 (resp. w_2).
Hence a cut determines at least three symbols of E, namely w_2[j], w_2[j + 1] and
w_1[i + 1], where i is such that i_{w_1} < j_{w_2} < (i + 1)_{w_1} in the linear order; see
Figure 1. The triple of symbols (w_2[j], w_2[j + 1], w_1[i]) is called a witness of the
cut. A superposition is called consistent if w_1[i + 1] is a variable.
Observe that every superposition gives rise to a system of equations (E, ≤),
which codifies the constraints given by ≤, by adding the corresponding equations
and variables x = x′y which the cuts determine. Also observe that every solution
S of E determines a unique consistent superposition, denoted ≤_S. Note finally
that the cut j determines a boundary (r, r + 1) in S(E); if p ≤ r < q, we say
that the subword S(E)[p, q] of S(E) contains the cut j.
Lemma 1 Let E be an equation in free SGA. Then E has a solution if and only
if (E, ≤) has a solution for some consistent superposition ≤. There are no more
than |E|^{4|E|_V} consistent superpositions.
Fig. 1. The cut j_{w_2}: the boundary j of w_2 lies strictly between the boundaries i_{w_1} and (i + 1)_{w_1} of w_1 in the linear order.
Proof. Obviously if (E, ≤) has a solution for some consistent superposition ≤,
then E has a solution. Conversely, if E has a solution S, consider the superposition generated by S.
As for the bound, let E = (w_1, w_2) and write v for |E|_V. First observe
that if w_2 consists only of constants, then there are at most |w_2|^v consistent
superpositions. To get a consistent superposition in the general case, first insert
each initial and final boundary of each variable of w_2 into the linear order of the
boundaries of w_1 (this can be done in at most |E| + v ways). Then it remains to
deal with the subwords of w_2 in between variables (hence consisting only of
constants, and of total length ≤ |E| − v). Summing up, there are no more than
(|E| + v)^{2v}(|E| − v)^v ≤ |E|^{4v} consistent superpositions.
Lemma 2 (Compare Lemma 6, [12]) Assume S is a minimal (w.r.t. length)
solution of E. Then
1. For each subword w = S(E)[i, j] with |w| > 1, there is an occurrence of w
or w^{-1} which contains a cut of (E, ≤_S).
2. For each letter c = S(E)[i] of S(E), there is an occurrence of c or c^{-1} in E.
Proof. Let 1 ≤ p ≤ q ≤ |S(E)|. Suppose that neither w = S(E)[p, q] nor w^{-1} has
an occurrence in S(E) which contains a cut. Consider the position p in S(E) and its
(S, E)-equivalence class P, and define, for each variable x occurring in E,

S′(x) = the subsequence of some image S(x, i, j) of x consisting of
all positions which are not in the set P (i.e. “cut off” from S(x, i, j) all
the positions in P).

It is not difficult to see that S′ is well defined, i.e. it does not depend on the
particular image S(x, i, j) of x chosen, and that S′(w_1) = S′(w_2) (these facts
follow from the definition of (S, E)-equivalence). Now, if P does not contain any
images of constants of E, it is easy to see that S′ is a solution of the equation E.
But |S′(E)| < |S(E)|, which is impossible because S was assumed to be minimal.
Hence, for each word w = S(E)[p, q], its first position must be in the same (S, E)-class as the position of the image of a constant c of E. If p < q, the right (resp.
left) boundary of that constant is a cut in w (resp. w^{-1}) which is neither initial
nor final (check the definition of (S, E)-equivalence for S(E)[p + 1], etc.), and we are
in case 1. If p = q we are in case 2.
Proposition 3 For each non-contractible equation E there is a finite list of
systems of equations Σ_1, . . . , Σ_k such that the following conditions hold:
1. E has a non-contractible solution if and only if some Σ_i has a solution.
2. k ≤ |E|^{8|E|_V}.
3. There is a constant c > 0 such that |Σ_i| ≤ c|E| and |Σ_i|_V ≤ c|E|_V for each
i = 1, . . . , k.
Proof. Let ≤ be a consistent superposition of E, and let

(x_1, y_1, z_1), . . . , (x_r, y_r, z_r)   (2)

be a list of those witnesses of the cuts of (E, ≤) for which at least one of
x_i, y_i is a variable. Let

D = {(c, d) ∈ (C ∪ C^{-1})^2 : c ≠ d^{-1} ∧ d ≠ c^{-1}},

and define, for each r-tuple ⟨(c_i, d_i)⟩_i of pairs of D, the system

Σ_{⟨(c_i,d_i)⟩_i} = (E, ≤) ∪ {(x_i, x′_i c_i), (y_i, d_i y′_i) : i = 1, . . . , r}.

Now, if S is a non-contractible solution of (E, ≤) then S defines a solution of
some Σ_i, namely the one defined by the r-tuple of elements (c_i, d_i) =
(S(x_i)[|S(x_i)|], S(y_i)[1]), for i = 1, . . . , r. Note that because E and S are non-contractible, each (c_i, d_i) is in D.
In the other direction, suppose that S is a solution of some Σ_i. Then obviously S is a solution of (E, ≤). We only need to prove that S(z) is non-contractible for all variables z occurring in E. Suppose some S(z) has a factor cc^{-1},
for c ∈ C. Then by Lemma 2 there is an occurrence of cc^{-1} (its inverse is
the same word) which contains a cut of (E, ≤). But because E is non-contractible, we
must have that one of the terms in (2), say (x_j, y_j, z_j), witnesses this occurrence,
hence x_j = x′_j c and y_j = c^{-1} y′_j, which is impossible by the definition of the Σ_i’s.
The bound in 2. follows by simple counting: observe that r ≤ 2|E|_V and |D|^r ≤
|C|^{2r} ≤ |E|^{4|E|_V}, and the number k of systems is no bigger than the number
of superpositions times |D|^r. For the bounds in 3. just sum the corresponding
numbers of the new equations added.
The following is an old observation of Hmelevskii [5] for free semigroups
which extends easily to free SGA:
Proposition 4 For each system of equations Σ in free SGA with generators C,
there is an equation E in free SGA with generators C ∪ {c}, c ∉ C ∪ C^{-1}, such
that
1. S is a solution of E if and only if S is a solution of Σ.
2. |E| ≤ 4|Σ| and |E|_V = |Σ|_V.
Moreover, if the equations in Σ are non-contractible, then E is non-contractible.
Proof. Let (v_1, w_1), . . . , (v_n, w_n) be the system of equations Σ. Define E as

(v_1 c v_2 c · · · c v_n c v_1 c^{-1} v_2 c^{-1} · · · c^{-1} v_n , w_1 c w_2 c · · · c w_n c w_1 c^{-1} w_2 c^{-1} · · · c^{-1} w_n).

Clearly E is non-contractible because so was each equation (v_i, w_i), and c is a
fresh letter. Also, if S is a solution of Σ, it is obviously a solution of E. Conversely,
if S is a solution of E, then

|S(v_1 c v_2 c · · · c v_n)| = |S(v_1 c^{-1} v_2 c^{-1} · · · c^{-1} v_n)|,

hence

|S(v_1 c v_2 c · · · c v_n)| = |S(w_1 c w_2 c · · · c w_n)|,

and the same for the second pair of expressions with c^{-1}. Now it is easy to show
that S(v_i) = S(w_i) for all i: suppose not, for example |S(v_1)| < |S(w_1)|. Then
S(w_1)[|S(v_1)| + 1] = c and S(w_1)[|S(v_1)| + 1] = c^{-1}, which is impossible. Then argue the
same for the rest.
The bounds are simple calculations.
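The combination of a system into a single equation can be sketched directly (illustrative Python following the construction in the proof above; words are tuples of letters, and '#' / '#-1' are our own encoding of the fresh letter c and its twin c^{-1}):

```python
def combine(system, fresh='#'):
    """Combine a system of SGA word equations (v_1,w_1),...,(v_n,w_n)
    into one equation (Hmelevskii-style): each side becomes
    v_1 c v_2 c ... c v_n c v_1 c^{-1} v_2 c^{-1} ... c^{-1} v_n,
    where c is a letter not occurring in the system."""
    def side(parts):
        c, ci = (fresh,), (fresh + '-1',)
        word = ()
        for p in parts:                  # v_1 c v_2 c ... c v_n c
            word += p + c
        for i, p in enumerate(parts):    # v_1 c^{-1} v_2 ... c^{-1} v_n
            word += p + (ci if i + 1 < len(parts) else ())
        return word
    return side([v for v, _ in system]), side([w for _, w in system])

# Two trivial equations a = a and b = b become one equation:
lhs, rhs = combine([(('a',), ('a',)), (('b',), ('b',))])
assert lhs == ('a', '#', 'b', '#', 'a', '#-1', 'b')
```

The length bound |E| ≤ 4|Σ| of Proposition 4 is easy to confirm on such small instances.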
The next result is a very important one, and follows from a straightforward
generalization of the result in [7], where it is proved for semigroups.
Theorem 5 Let E be an equation in free SGA. Then the exponent of periodicity
of a minimal solution of E is bounded by 2^{O(|E|)}.
Proof. It is not worth reproducing here the ten-page proof of [7], because the
changes needed to generalize it to free SGA are minor. We will assume that
the reader is familiar with the paper [7].
The proof there consists of two independent parts: (1) obtaining from the
word equation E a linear Diophantine equation, and (2) getting a good bound
for it. We will sketch how to do step (1) for free SGA. The rest is completely
identical.
First, let us sketch how the system of linear equations is obtained from a
word equation E. Let S be a solution of E. Recall that a P-stable presentation
of S(x), for a variable x, has the form

S(x) = w_0 P^{µ_1} w_1 P^{µ_2} · · · w_{n−1} P^{µ_n} w_n.

From here, for a suitable P (which is the word that witnesses the exponent of
periodicity of S(E)), a system of linear Diophantine equations LD_P(E) is built,
roughly speaking, by replacing the µ_i by variables x_{µ_i} in the case of variables,
plus some other pieces of data. Then it is proved that if S is a minimal solution
of E, the solution x_{µ_i} = µ_i is a minimal solution of LD_P(E).
For the case of free SGA, there are two key points to note. First, for the
variables of the form x^{-1}, the solution S(x^{-1}) will have the following P^{-1}-stable
presentation (same P, w_i, µ_i as before):

S(x^{-1}) = w_n^{-1} (P^{-1})^{µ_n} w_{n−1}^{-1} (P^{-1})^{µ_{n−1}} · · · w_1^{-1} (P^{-1})^{µ_1} w_0^{-1}.
Second, note that P^{-1} is a subword of PP if and only if P is a subword of
P^{-1}P^{-1}. Call a repeated occurrence of P in w, say w = uP^k v, maximal if P
is neither a suffix of u nor a prefix of v. It then holds that maximal occurrences
of P and P^{-1} in w either (1) do not overlap each other, or (2) overlap almost
completely (the exponents differ by at most 1).
In case (1), consider the system LD_P(E′) ∪ LD_{P^{-1}}(E′) (each one constructed
exactly as in the case of word equations), where E′ is the equation E in which we
treat the pairs of variables x^{-1}, x as independent for the sake of building
the system of linear Diophantine equations. And, of course, the variables x_{µ_i}
obtained from the same µ_i in S(x) and S(x^{-1}) are the same.
In case (2), notice that the P-stable and P^{-1}-stable presentations of a variable
x differ very little. So it is enough to consider LD_P(E′), taking care to use for
the P-presentation of S(x^{-1}) the same set of Diophantine variables (adding 1
or −1 where it corresponds) used for the P-presentation of S(x).
It must then be proved that if S is a minimal solution of the equation E in free
SGA, then the solution x_{µ_i} = µ_i is a minimal solution of the corresponding
system of linear Diophantine equations defined as above. This can be proved
easily with the help of Lemma 2.
Finally, as for the parameters of the system of Diophantine equations,
observe that |E′| = |E|, hence the only parameters that grow are the number of
variables and equations, and only by a factor of at most 2. So the asymptotic bound
remains the same as for the case of E′, which is 2^{O(|E|)}.
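For concrete words, the exponent of periodicity — the largest k such that some power P^k of a nonempty word P occurs as a factor — can be computed by brute force (an illustrative sketch on Python strings; Theorem 5 bounds this quantity for minimal solutions of an equation, which this naive search does not construct):

```python
def max_power(w, p):
    """Largest k such that p^k occurs as a factor of w."""
    k = 0
    while p * (k + 1) in w:
        k += 1
    return k

def exponent_of_periodicity(w):
    """Brute force over all nonempty factors p of w: the maximum k
    with p^k a factor of w.  O(|w|^2) candidate periods."""
    n = len(w)
    return max(max_power(w, w[i:j])
               for i in range(n) for j in range(i + 1, n + 1))

assert exponent_of_periodicity("abababc") == 3   # witnessed by P = "ab"
```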
The last result we will prove concerning equations in free SGA follows from
the trivial observation that every equation in free semigroups is an equation in
free SGA. Moreover:
Proposition 6 Let M be a free semigroup on the set of generators C, let N
be a free SGA on the set of generators C, and let E be an equation in M. Then E is
satisfiable in M if and only if it is satisfiable in N.
Proof. An equation in free SGA which does not contain ( )^{-1} has a solution if
and only if it has a solution which does not contain ( )^{-1}. So the codification of
equations in free semigroups into free SGA is straightforward: the same equation.
We immediately get a lower bound for the problem of satisfiability of equations in free SGA by using the corresponding result for the free semigroup case.
Corollary 7 Satisfiability of equations in free SGA is NP-hard.
3 Reducing the Problem of Satisfiability of Equations in Free Groups to Satisfiability of Equations in Free SGA
A group is an algebra with a binary associative operation (written as concatenation), a unary operation ( )^{-1}, and a constant 1, with the axioms (1) plus

xx^{-1} = 1,   x^{-1}x = 1,   1x = x1 = x.   (3)
As in the case of free SGA, it is not hard to see that the set of non-contractible
words over C ∪ C^{-1}, plus the empty word, with the operations of composition and
reversal suitably defined, is a free group with generators C.
Equations in free groups. The formal concept of an equation in free groups is almost
exactly the same as that for free SGA, hence we will not repeat it here. The
difference comes when speaking of solutions. A solution S of the equation E
is (the unique group-homomorphism S : A* → (C ∪ C^{-1})* defined by) a map
S : V → (C ∪ C^{-1})*, extended by defining S(c) = c for each c ∈ C and S(w^{-1}) =
(S(w))^{-1}, which satisfies S(w_1) = S(w_2). Observe that the only difference with
the case of SGA is that now we possibly have ‘simplifications’ of subexpressions
of the form ww^{-1} or w^{-1}w to 1, i.e. the use of the equations (3).
Proposition 8 (Makanin, Lemma 1.1 in [11]) For any non-contractible
equation E in the free group G with generators C we can construct a finite list
Σ_1, . . . , Σ_k of systems of non-contractible equations in the free SGA G′ with
generators C such that the following conditions are satisfied:
1. E has a non-contractible solution in G if and only if k > 0 and some system
Σ_j has a non-contractible solution in G′.
2. There is a constant c > 0 such that |Σ_i| ≤ |E| + c|E|_V^2 and |Σ_i|_V ≤ c|E|_V^2 for
each i = 1, . . . , k.
3. There is a constant c > 0 such that k ≤ (|E|_V)^{c|E|_V^2}.
Proof. This is essentially the proof in [11], with the bounds improved. Let E be
the equation

C_0 X_1 C_1 X_2 · · · C_{v−1} X_v C_v = 1,   (4)

where the C_i are non-contractible, v = |E|_V, and the X_i are meta-variables representing
the actual variables in E.
Let S be a non-contractible solution of E. By a known result (see [11], p. 486),
there is a set W of non-contractible words in the alphabet C, |W| ≤ 2v(2v + 1),
such that each C_i and each S(X_i) can be written as a concatenation of no more than
2v words in W, and after this replacement Equation (4) holds in the free group with
generators W.
Let Z be a set of 2v(2v + 1) fresh variables. Then choose non-contractible words
y_0, x_1, y_1, x_2, . . . , x_v, y_v ∈ (Z ∪ Z^{-1})*, each of length at most 2v,
and define the system of equations
1. C_j = y_j, j = 0, . . . , v,
2. X_j = x_j, j = 1, . . . , v.
Each such set of equations, for which Equation (4) holds in the free group with
generators Z when replacing the C_i and X_i by the corresponding words in (Z ∪ Z^{-1})*,
defines one system Σ_i.
It is clear from the result mentioned earlier that E has a solution if and
only if some Σ_i has a non-contractible solution. How many Σ_i are
there? No more than [(2v(2v + 1))^{2v}]^{2v+1}.
Theorem 9 For each equation E in a free group G with generators C there
is a finite set Q of equations in a free semigroup with anti-involution G′ with
generators C ∪ {c_1, c_2}, c_1, c_2 ∉ C, such that the following hold:
1. E is satisfiable in G if and only if one of the equations in Q is satisfiable in
G′.
2. There is a constant c > 0 such that for each E′ ∈ Q, it holds that |E′| ≤ c|E|^2.
3. |Q| ≤ |E|^{c|E|_V^3}, for some constant c > 0.
Proof. By Proposition 8, there is a list of systems of non-contractible equations Σ_1, . . . , Σ_k which are equivalent to E (w.r.t. non-contractible satisfiability). By Proposition 4, each such system Σ_j is equivalent (w.r.t. satisfiability)
to a non-contractible equation E′. Then, by Proposition 3, for each such non-contractible E′, there is a system of equations (now without the restriction of
non-contractibility) Σ′_1, . . . , Σ′_{k′} such that E′ has a non-contractible solution if
and only if one of the Σ′_j has a solution (not necessarily non-contractible). Finally, by Proposition 4, for each system Σ′ we have an equation E′′ which has
the same solutions (if any) as Σ′. So we have a finite set of equations (the E′′’s)
with the property that E is satisfiable in G if and only if one of the E′′ is
satisfiable in G′.
The bounds in 2. and 3. follow by easy calculations from the bounds in the
corresponding results used above.
Remark. It is not difficult to check that the set Q in the previous theorem can
be generated non-deterministically in polynomial time.
Corollary 10. Assume that fT is an upper bound for the deterministic TIME-complexity of the problem of satisfiability of equations in free SGA. Then

max{fT(c|E|^2), |E|^{c|E|V^3}},

for c > 0 a constant, is an upper bound for the deterministic TIME-complexity
of the problem of satisfiability of equations in free groups.
4 Conclusions
Our results show that solving equations in free SGA comprises the cases of free
groups and free semigroups, the former with an exponential reduction (Theorem 9)
and the latter with a linear reduction (Proposition 6). This suggests that free
SGA, due to its simplicity, is the ‘appropriate’ theory to study when seeking
algorithms for solving equations in those theories.
In a preliminary version of this paper we stated the following conjectures:
1. Satisfiability of equations in free groups is PSPACE-hard.
2. Satisfiability of equations in free groups is in EXPTIME.
3. Satisfiability of equations in free SGA is decidable.
In the meantime the author proved that satisfiability of equations in free SGA is
in PSPACE, hence answering positively (2) and (3). Also independently, Diekert
and Hagenah announced the solution of (3) [2].
C. Gutiérrez
Acknowledgements
Thanks to Volker Diekert for useful comments.
References
1. V. Diekert, Makanin’s Algorithm for Solving Word Equations with Regular Constraints, in forthcoming M. Lothaire, Algebraic Combinatorics on Words. Report
Nr. 1998/02, Fakultät Informatik, Universität Stuttgart.
2. V. Diekert, Personal communication, 8 Oct. 1999.
3. C. Gutiérrez, Satisfiability of Word Equations with Constants is in Exponential
Space, in Proc. FOCS 98.
4. C. Gutiérrez, Solving Equations in Strings: On Makanin’s Algorithm, in Proceedings of the Third Latin American Symposium on Theoretical Informatics,
LATIN’98, Campinas, Brazil, 1998. In LNCS 1380, pp. 358-373.
5. J.I. Hmelevskii, Equations in a free semigroup, Trudy Mat. Inst. Steklov 107(1971),
English translation: Proc. Steklov Inst. Math. 107(1971).
6. J. Jaffar, Minimal and Complete Word Unification, Journal of the ACM, Vol. 37,
No.1, January 1990, pp. 47-85.
7. A. Kościelski, L. Pacholski, Complexity of Makanin’s algorithm, J. Assoc. Comput.
Mach. 43 (1996) 670-684.
8. A. Kościelski, L. Pacholski, Makanin’s algorithm is not primitive recursive, Theoretical Computer Science 191 (1998) 145-156.
9. Lothaire, M. Combinatorics on Words, Cambridge Mathematical Texts, reprinted
1998.
10. G.S. Makanin, The problem of satisfiability of equations in a free semigroup, Mat.
Sbornik 103, 147-236 (in Russian). English translation in Math. USSR Sbornik 32,
129-198.
11. G.S. Makanin. Equations in a free group. Izvestiya NA SSSR 46, 1199-1273, 1982
(in Russian). English translation in Math USSR Izvestiya, Vol. 21 (1983), No. 3.
12. W. Rytter and W. Plandowski, Applications of Lempel-Ziv encodings to the solution
of word equations, In Proceedings of the 25th. ICALP, 1998.
13. Plandowski, W., Satisfiability of word equations with constants is in NEXPTIME,
in Proc. STOC’99.
14. Plandowski, W., Satisfiability of word equations with constants is in PSPACE, in
Proc. FOCS’99.
15. A.A. Razborov, On systems of equations in a free group, Izvestiya AN SSSR 48
(1984) 779-832 (in Russian). English translation in Math. USSR Izvestiya 25 (1985)
115-162.
16. K. Schulz, Word Unification and Transformation of Generalized Equations, Journal
of Automated Reasoning 11:149-184, 1993.
Squaring Transducers:
An Efficient Procedure for Deciding
Functionality and Sequentiality of Transducers
Marie-Pierre Béal¹, Olivier Carton¹, Christophe Prieur², and Jacques Sakarovitch³
¹ Institut Gaspard Monge, Université de Marne-la-Vallée
² LIAFA, Université Paris 7 / CNRS
³ Laboratoire Traitement et Communication de l’Information, ENST / CNRS
Abstract. We describe here a construction on transducers that gives a
new conceptual proof for two classical decidability results on transducers:
it is decidable whether a finite transducer realizes a functional relation,
and whether a finite transducer realizes a sequential relation. A better
complexity then follows for the two decision procedures.
In this paper we give a new presentation and a conceptual proof for two
classical decision results on finite transducers.
Transducers are finite automata with input and output; they thus realize
relations between words, the so-called rational relations. Even though they are a
very simple model of machines that compute relations — they can be seen as
2-tape 1-way Turing machines — most of the problems such as equivalence or
intersection are easily shown to be equivalent to the Post Correspondence Problem and thus undecidable. The situation is drastically different for transducers
that are functional, that is, transducers that realize functions, and the above
problems then become easily decidable. And this is of interest because of the
following result.
Theorem 1. [12] Functionality is a decidable property for finite transducers.
Among the functional transducers, those which are deterministic in the input (they are called sequential) are probably the most interesting, both from a
practical and from a theoretical point of view: they correspond to machines that
can really and easily be implemented. A rational function is sequential if it can
be realized by a sequential transducer. Of course, a non-sequential transducer
may realize a sequential function, and this occurrence is known to be decidable.
Theorem 2. [7] Sequentiality is a decidable property for rational functions.
The original proofs of these two theorems are based on what could be called a
“pumping” principle, implying that a word which contradicts the property may
be chosen of a bounded length, and providing thus directly decision procedures
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 397–406, 2000.
© Springer-Verlag Berlin Heidelberg 2000
of exponential complexity. Theorem 1 was published again in [4], with exactly
the same proof, hence the same complexity.
Later, it was proved that the functionality of a transducer can be decided
in polynomial time, as a particular case of a result obtained by reduction to
another decision problem on another class of automata ([10, Theorem 2]).
With this communication, we shall see how a very natural construction performed on the square of the transducer yields a decision procedure for the two
properties, that is, it can be read on the result of the construction whether the
property holds or not.
The size of the object constructed for deciding functionality is quadratic in
the size of the considered transducer. In the case of sequentiality, one has to be
more subtle for the constructed object may be too large. But it is shown that it
can be decided in polynomial time whether this object has the desired property.
Due to the limited space available in the proceedings, the proofs of the results
are omitted here; they will be published in a forthcoming paper.
1 Preliminaries
We basically follow the definitions and notation of [9,2] for automata.
The set of words over a finite alphabet A, i.e. the free monoid over A, is
denoted by A∗ . Its identity, or empty word is denoted by 1A∗ .
An automaton A over a finite alphabet A, noted A = ⟨Q, A, E, I, T⟩, is a
directed graph labelled by elements of A; Q is the set of vertices, called states,
I ⊂ Q is the set of initial states, T ⊂ Q is the set of terminal states and
E ⊂ Q × A × Q is the set of labelled edges, called transitions. The automaton A
is finite if Q is finite.
The definition of automata as labelled graphs extends readily to automata
over any monoid: an automaton A over M, noted A = ⟨Q, M, E, I, T⟩, is a
directed graph whose edges are labelled by elements of the monoid M. A
computation is a path in the graph A; its label is the product of the labels of
its transitions. A computation is successful if it begins with an initial state and
ends with a final state. The behaviour of A is the subset of M consisting of the
labels of the successful computations of A.
A state of A is said to be accessible if it belongs to a computation that begins
with an initial state; it is useful if it belongs to a successful computation. The
automaton A is trim if all of its states are useful. The accessible part and the
useful part of a finite automaton A are easily computable from A.
An automaton T = ⟨Q, A∗×B∗, E, I, T⟩ over a direct product A∗×B∗ of two
free monoids is called a transducer from A∗ to B∗. The behaviour of a transducer T
is thus (the graph of) a relation α from A∗ into B∗: α is said to be realized by T.
A relation is rational (i.e. its graph is a rational subset of A∗×B∗) if and only
if it is realized by a finite transducer.
It is a slight generalization — that does not increase the generating power of
the model — to consider transducers T = ⟨Q, A∗×B∗, E, I, T⟩ where I and T
are not subsets of Q (i.e. functions from Q into {0, 1}) but functions from Q
into B∗ ∪ ∅ (the classical transducers are those for which the image of a state
by I or T is either ∅ or 1B∗).
A transducer is said to be real-time if the label of every transition is a
pair (a, v) where a is a letter of A, the input of the transition, and v a word over B,
the output of the transition, and if for any states p and q and any letter a there
is at most one transition from p to q whose input is a. Using classical algorithms
from automata theory, any transducer T can be transformed into a real-time
transducer if T realizes a function ([9, Th. IX.5.1], [2, Prop. III.7.1]).
If T = ⟨Q, A∗×B∗, E, I, T⟩ is a real-time transducer, the underlying input
automaton of T is the automaton A over A obtained from T by forgetting the
second component of the label of every transition and by replacing the functions I
and T by their respective domains. The language recognized by A is the domain
of the relation realized by T .
We call sequential a transducer that is real-time, functional, and whose underlying input automaton is deterministic. A function α from A∗ into B∗ is
sequential if it can be realized by a sequential transducer. It has to be
acknowledged that this is not the usual terminology: what we call “sequential”
(transducers or functions) has been called “subsequential” since the seminal
paper by Schützenberger [13] — cf. [2,5,7,8,11, etc.]. There are good reasons for
this change of terminology, which has already been advocated by V. Bruyère and
Ch. Reutenauer: “the word subsequential is unfortunate since these functions
should be called simply sequential” ([5]). Someone has to make the first move.
2 Squaring Automata and Ambiguity
Before defining the square of a transducer, we recall what the square of an
automaton is and how it can be used to decide whether an automaton is unambiguous or not. A trim automaton A = ⟨Q, A, E, I, T⟩ is unambiguous if any
word it accepts is the label of a unique successful computation in A.
Let A′ = ⟨Q′, A, E′, I′, T′⟩ and A′′ = ⟨Q′′, A, E′′, I′′, T′′⟩ be two automata
on A. The Cartesian product of A′ and A′′ is the automaton C defined by

C = A′×A′′ = ⟨Q′×Q′′, A, E, I′×I′′, T′×T′′⟩

where E is the set of transitions defined by

E = {((p′, p′′), a, (q′, q′′)) | (p′, a, q′) ∈ E′ and (p′′, a, q′′) ∈ E′′}.
Let A×A = ⟨Q×Q, A, F, I×I, T×T⟩ be the Cartesian product of the automaton A = ⟨Q, A, E, I, T⟩ with itself; the set F of transitions is defined by:

F = {((p, r), a, (q, s)) | (p, a, q), (r, a, s) ∈ E}.

Let us call diagonal of A×A the sub-automaton D of A×A determined by the
diagonal D of Q×Q, i.e. D = {(q, q) | q ∈ Q}, as set of states. The states and
transitions of A and D are in bijection, hence A and D are equivalent.
Lemma 1. [3, Prop. IV.1.6] A trim automaton A is unambiguous if and only
if the trim part of A×A is equal to D.
Note that, like (un)ambiguity, determinism can also be described in terms
of the Cartesian square, by a simple rewording of the definition: a trim automaton A
is deterministic if and only if the accessible part of A×A is equal to D.
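As an illustration, here is a small Python sketch (the names and the representation of automata as sets of triples (p, a, q) are ours, not from the paper) that builds the Cartesian square, trims it, and applies Lemma 1 to test unambiguity.

```python
def square(E):
    """Transitions of the Cartesian square A x A."""
    return {((p, r), a, (q, s)) for (p, a, q) in E for (r, b, s) in E if a == b}

def reach(edges, roots):
    """States reachable from `roots` along `edges`."""
    seen, todo = set(roots), list(roots)
    while todo:
        x = todo.pop()
        for (p, _, q) in edges:
            if p == x and q not in seen:
                seen.add(q)
                todo.append(q)
    return seen

def is_unambiguous(E, I, T):
    """Lemma 1: A is unambiguous iff the trim part of A x A is the diagonal."""
    sq = square(E)
    acc = reach(sq, {(i, j) for i in I for j in I})
    coacc = reach({(q, a, p) for (p, a, q) in sq}, {(t, u) for t in T for u in T})
    return all(p == q for (p, q) in acc & coacc)

# The word aa has two successful paths here, so the automaton is ambiguous.
E = {(0, 'a', 0), (0, 'a', 1), (1, 'a', 1)}
print(is_unambiguous(E, {0}, {1}))  # False
```

The off-diagonal pair-state (0, 1) is both accessible and co-accessible in the square, which is exactly the witness of ambiguity that Lemma 1 predicts.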
3 Product of an Automaton by an Action
We now recall what an action is, how an action can be seen as an automaton,
and what can then be defined as the product of a (normal) automaton by an
action. We end this section with the definition of the specific action that will be
used in the sequel.
Actions. A (right) action of a monoid M on a set S is a mapping δ : S×M → S
which is consistent with the multiplication in M:

∀s ∈ S, ∀m, m′ ∈ M    δ(s, 1M) = s  and  δ(δ(s, m), m′) = δ(s, m m′).

We write s · m rather than δ(s, m) when it causes no ambiguity.
Actions as automata. An action δ of M on a set S with s0 as distinguished
element may then be seen as an automaton on M (without terminal states):

Gδ = ⟨S, M, E, s0⟩

defined by the set of transitions E = {(s, m, s · m) | s ∈ S, m ∈ M}.
Note that, as both S and M are usually infinite, the automaton Gδ is “doubly”
infinite: the set of states is infinite, and, for every state s, the set of transitions
whose origin is s is infinite as well.
Product of an automaton by an action. Let A = ⟨Q, M, E, I, T⟩ be a (finite
trim) automaton on a monoid M and δ an action of M on a (possibly infinite)
set S. The product of A and Gδ is the automaton on M:

A×Gδ = ⟨Q×S, M, F, I×{s0}, T×S⟩

the transitions of which are defined by

F = {((p, s), m, (q, s · m)) | s ∈ S, (p, m, q) ∈ E}.

We shall call product of A by δ, and denote by A×δ, the accessible part of A×Gδ.
The projection on the first component induces a bijection between the transitions of A whose origin is p and the transitions of A×δ whose origin is (p, s),
for any p in Q and any (p, s) in A×δ. The following holds (by induction on the
length of the computations):

(p, s) −m→ (q, t) in A×δ  =⇒  t = s · m.
We call value of a state (p, s) of A×δ the element s of S. We shall say that the
product A×δ itself is a valuation if the projection on the first component is a
1-to-1 mapping between the states of A×δ and the states of A.
Remark 1. Let us stress again the fact that A×δ is the accessible part of A×Gδ.
This makes it possible for A×δ to be finite even though Gδ is
infinite (cf. Theorem 5).
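The accessible part A×δ can indeed be computed lazily, without ever materializing the (possibly doubly infinite) automaton Gδ. A minimal Python sketch, with a representation of our own choosing: transitions are triples (p, m, q) and the action is a callable.

```python
from collections import deque

def product_by_action(trans, inits, delta, s0):
    """Accessible part of A x G_delta: pair states (p, s) with transitions
    ((p, s), m, (q, delta(s, m))), built breadth-first from I x {s0}."""
    start = {(p, s0) for p in inits}
    seen, todo, edges = set(start), deque(start), []
    while todo:
        p, s = todo.popleft()
        for (p2, m, q) in trans:
            if p2 != p:
                continue
            t = delta(s, m)                      # t = s . m
            edges.append(((p, s), m, (q, t)))
            if (q, t) not in seen:
                seen.add((q, t))
                todo.append((q, t))
    return seen, edges

# The action of (Z, +) on Z is infinite, yet the accessible product is finite:
states, _ = product_by_action({(0, 1, 1), (1, -1, 0)}, {0}, lambda s, m: s + m, 0)
print(sorted(states))  # [(0, 0), (1, 1)]
```

The construction terminates exactly when the accessible part is finite, which is the situation exploited in Theorem 5.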
The “Advance or Delay” action. Let B∗ be a free monoid and let us denote
by HB the set HB = (B∗×1B∗) ∪ (1B∗×B∗) ∪ {0}. A mapping ψ : B∗×B∗ → HB
is defined by:

∀u, v ∈ B∗    ψ(u, v) = (v^-1 u, 1B∗)  if v is a prefix of u,
              ψ(u, v) = (1B∗, u^-1 v)  if u is a prefix of v,
              ψ(u, v) = 0              otherwise.
Intuitively, ψ(u, v) tells either how much the first component u is ahead of the
second component v, or how much it is late, or whether u and v are not prefixes of a
common word. In particular, ψ(u, v) = (1B∗, 1B∗) if, and only if, u = v.
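A sketch of ψ, and of the AD action of the next lemma, in Python (the encoding is ours: an element of HB is ('advance', w) for (w, 1B∗), ('delay', w) for (1B∗, w), or None for the absorbing element 0):

```python
def psi(u: str, v: str):
    """psi(u, v): how far u is ahead of v, or behind, or None (the element 0)."""
    if u.startswith(v):
        return ('advance', u[len(v):])   # (v^-1 u, 1)
    if v.startswith(u):
        return ('delay', v[len(u):])     # (1, u^-1 v)
    return None

def ad(state, uv):
    """The AD action: omega_B((f, g), (u, v)) = psi(f u, g v); 0 is absorbing."""
    if state is None:
        return None
    kind, w = state
    u, v = uv
    return psi(w + u, v) if kind == 'advance' else psi(u, w + v)

# Acting step by step agrees with psi on the concatenated outputs:
print(ad(psi('ab', 'a'), ('c', 'bc')) == psi('abc', 'abc'))  # True
```

The final check is an instance of the action law of Lemma 2: applying ωB to ψ(u1, v1) with (u2, v2) gives ψ(u1u2, v1v2).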
Lemma 2. The mapping ωB from HB×(B∗×B∗) into HB defined by:

∀(f, g) ∈ HB \ 0    ωB((f, g), (u, v)) = ψ(f u, g v)  and  ωB(0, (u, v)) = 0

is an action, which will be called the “Advance or Delay” (or “AD”) action
(relative to B∗) and will thus be denoted henceforth by a dot.
Remark 2. The transition monoid of ωB is isomorphic to B∗×B∗ if B has
at least two letters, and to Z if it has only one letter. (We have denoted by 0 the
absorbing element of HB under ωB in order to avoid confusion with 0, the
identity element of the monoid Z.)
4 Deciding Functionality
Let T = ⟨Q, A∗×B∗, E, I, T⟩ be a real-time trim transducer such that the
output of every transition is a single word of B∗ — recall that this is a necessary
condition for the relation realized by T to be a function. The transducer T is
not functional if and only if there exist two distinct computations in T:

q′0 −a1/u′1→ q′1 · · · −an/u′n→ q′n    and    q′′0 −a1/u′′1→ q′′1 · · · −an/u′′n→ q′′n

with u′1 u′2 · · · u′n ≠ u′′1 u′′2 · · · u′′n. There exists then at least one i such
that u′i ≠ u′′i, and thus such that q′i ≠ q′′i.
This implies, by projection on the first component, that the underlying input
automaton A of T is ambiguous. But it may be the case that A is ambiguous
and T still functional, as is shown for instance by the transducer Q1 represented at the top of Figure 1 (cf. [2]). We shall now carry the method of the
Cartesian square of Section 2 over from automata to transducers.
Cartesian square of a real-time transducer. By definition, the Cartesian product of T by itself is the transducer T×T from A∗ into B∗×B∗:

T×T = ⟨Q×Q, A∗×(B∗×B∗), F, I×I, T×T⟩

whose transition set F is defined by:

F = {((p, r), (a, (u′, u′′)), (q, s)) | (p, (a, u′), q) and (r, (a, u′′), s) ∈ E}.

The underlying input automaton of T×T is the square of the underlying
input automaton A of T. If A is unambiguous, then T is functional, and the
trim part of A×A is reduced to its diagonal.
An effective characterization of functionality. The transducer T×T is an
automaton on the monoid M = A∗×(B∗×B∗). We can consider the AD action
as an action of M on HB, by forgetting the first component. We can thus
take the product of T×T, or of any of its subautomata, by the AD action ωB.
Fig. 1. Cartesian square of Q1, valued by the product with the action ω{x}.
As the output alphabet has only one letter, H{x} is identified with Z and the states
are labelled by an integer. Labels of transitions are not shown: the input is always a
and is kept implicit; an output of the form (x^n, x^m) is coded by the integer n − m,
which is itself symbolised by the drawing of the arrow: a dotted arrow for 0, a simple
solid arrow for +1, a double one for +2 and a bold one for +3; and the corresponding
dashed arrows for the negative values.
Theorem 3. A transducer T from A∗ into B∗ is functional if and only if the
product of the trim part U of the Cartesian square T×T by the AD action ωB
is a valuation of U such that the value of any final state is (1B∗, 1B∗).
Figure 1 shows the product of the Cartesian square of a transducer Q1 by
the AD action.¹
Let us note that if α is the relation realized by T, the transducer obtained
from T×T by forgetting the first component is a transducer from B∗ into itself
that realizes the composition product α ◦ α^-1. The condition expressed above may then
be seen as a condition for α ◦ α^-1 to be the identity, which is clearly a condition
for the functionality of α.
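To illustrate Theorem 3, here is a self-contained Python sketch of the functionality test (all names and the data representation are ours; a real-time transducer is given as a list of transitions (p, (a, u), q) and is assumed trim). The mapping ψ and the AD action are re-implemented inline; the test propagates AD values over the trim part of the square and checks the valuation condition.

```python
def psi(u, v):
    if u.startswith(v):
        return ('advance', u[len(v):])
    if v.startswith(u):
        return ('delay', v[len(u):])
    return None                                  # the absorbing element 0

def ad(state, uv):
    if state is None:
        return None
    kind, w = state
    u, v = uv
    return psi(w + u, v) if kind == 'advance' else psi(u, w + v)

def functional(trans, inits, finals):
    """Theorem 3 sketch: T is functional iff the AD values over the trim part
    of T x T form a valuation with value (1, 1) at every final pair-state."""
    pairs = [((p, r), (u, v), (q, s))
             for (p, (a, u), q) in trans for (r, (b, v), s) in trans if a == b]
    # co-accessible pair-states (backward closure from final pairs)
    co = {(t1, t2) for t1 in finals for t2 in finals}
    changed = True
    while changed:
        changed = False
        for (x, _, y) in pairs:
            if y in co and x not in co:
                co.add(x)
                changed = True
    value = {}
    todo = [((i, j), ('advance', '')) for i in inits for j in inits if (i, j) in co]
    while todo:
        x, w = todo.pop()
        if x in value:
            if value[x] != w:
                return False                     # two values: not a valuation
            continue
        if w is None:
            return False                         # reached the absorbing element
        value[x] = w
        if x[0] in finals and x[1] in finals and w != ('advance', ''):
            return False                         # final value must be (1, 1)
        for (x2, uv, y) in pairs:
            if x2 == x and y in co:
                todo.append((y, ad(w, uv)))
    return True

print(functional([(0, ('a', 'x'), 0)], {0}, {0}))                      # True
print(functional([(0, ('a', 'x'), 1), (0, ('a', 'y'), 1)], {0}, {1}))  # False
```

The second example outputs either x or y on the same input a, so its square carries the value 0 at a useful pair-state, exactly the failure detected by the theorem.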
5 Deciding Sequentiality
The original proof of Theorem 2 goes in three steps: first, sequential functions
are characterized by a property expressed by means of a distance function;
then this property (on the function) is proved to be equivalent to a property on
the transducer; and finally a pumping-lemma-like procedure is given for deciding
the latter property (cf. [7,2]). We shall see how the last two steps can be replaced
by the computation of the product of the Cartesian square of the transducer by
the AD action. We first recall the first step.
5.1 A Quasi-Topological Characterization of Sequential Functions
If f and g are two words, we denote by f ∧ g the longest prefix common to f
and g. The free monoid is then equipped with the prefix distance

∀f, g ∈ A∗    dp(f, g) = |f| + |g| − 2|f ∧ g|.

In other words, if f = h f′ and g = h g′ with h = f ∧ g, then dp(f, g) = |f′| + |g′|.
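The prefix distance is immediate to compute; a short sketch (the helper name is ours):

```python
import os

def prefix_distance(f: str, g: str) -> int:
    """d_p(f, g) = |f| + |g| - 2 |f ^ g|, with f ^ g the longest common prefix."""
    h = os.path.commonprefix([f, g])   # the longest common prefix f ^ g
    return len(f) + len(g) - 2 * len(h)

print(prefix_distance('abcx', 'abcyz'))  # 3: f = (abc)x, g = (abc)yz, |x| + |yz| = 3
```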
Definition 1. A function α : A∗ → B∗ is said to be uniformly diverging²
if for every integer n there exists an integer N which is greater than the prefix
distance of the images by α of any two words (in the domain of α) whose prefix
distance is smaller than n, i.e.

∀n ∈ N, ∃N ∈ N, ∀f, g ∈ Dom α    dp(f, g) ≤ n  =⇒  dp(f α, gα) ≤ N.

Theorem 4. [7,13] A rational function is sequential if, and only if, it is uniformly diverging.
Remark 3.
The characterization of sequential functions by uniform divergence
holds in the larger class of functions whose inverse preserves rationality. This is
a generalization of a theorem of Ginsburg and Rose due to Choffrut, a much
stronger result, the full strength of which will not be of use here (cf. [5,8]).
¹ It turns out that, in this case, the trim part is equal to the whole square.
² After [7] and [2], the usual terminology is “function with bounded variation”. We
rather avoid an expression that is already used, with another meaning, in other
parts of mathematics.
5.2 An Effective Characterization of Sequential Functions
Theorem 5. A transducer T realizes a sequential function if, and only if, the
product of the accessible part V of T×T by the AD action ωB:
i) is finite;
ii) has the property that if a state with value 0 belongs to a cycle in V, then
the label of that cycle is (1B∗, 1B∗).
The parallel between automata and transducers is now to be emphasized.
Unambiguous (resp. deterministic) automata are characterized by a condition on
the trim (resp. accessible) part of the Cartesian square of the automaton whereas
functional transducers (resp. transducers that realize sequential functions) are
characterized by a condition on the product by ωB of the trim (resp. accessible)
part of the Cartesian square of the transducer.
Figure 2 shows two cases where the function is sequential: in (a) since the
accessible part of the product is finite and no state has value 0 ; in (b) since the
accessible part of the product is finite as well and the states whose value is 0 all
belong to a cycle every transition of which is labelled by (1B ∗ , 1B ∗ ).
Fig. 2. Two transducers that realize sequential functions.
Figure 3 shows two cases where the function is not sequential: in (a) since
the accessible part of the product is infinite; in (b) since, although the accessible
part of the product is finite, some states whose value is 0 belong to a cycle whose
label is different from (1B∗, 1B∗).
The following lemma is the key to the proof of Theorem 5 as well as to its
effectivity.
Lemma 3. Let w = (1B∗, z) be in HB \ 0 and (u, v) in B∗×B∗ \ (1B∗, 1B∗).
Then the set {w · (u, v)^n | n ∈ N} is finite and does not contain 0 if, and only
if, u and v are conjugate words and z is a prefix of a power of u.
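The two conditions of Lemma 3 are easy to test on concrete words; a sketch with names of our own, where conjugacy of u and v is checked via the classical criterion that |u| = |v| and v is a factor of uu:

```python
def conjugate(u: str, v: str) -> bool:
    """u and v are conjugate iff u = st and v = ts for some words s, t."""
    return len(u) == len(v) and v in u + u

def prefix_of_power(z: str, u: str) -> bool:
    """Is z a prefix of some power u^k (u nonempty)?"""
    k = len(z) // len(u) + 1
    return (u * k).startswith(z)

print(conjugate('abc', 'bca'), prefix_of_power('aba', 'ab'))  # True True
```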
Remark 4. The original proof of Theorem 2 by Ch. Choffrut goes by the
definition of the so-called twinning property (cf. [2, p. 128]). It is not difficult
to check that two states p and q of a real-time transducer T are (non trivially)
twinned when: i) (p, q) is accessible in T×T; ii) (p, q) belongs to a cycle
in V every transition of which is not labelled by (1B∗, 1B∗); iii) (p, q) does
not have the value 0 in the product of V by ωB.
It is then shown that a transducer realizes a sequential function if, and only
if, every pair of its states has the twinning property.
Fig. 3. Two transducers that realize non-sequential functions.
6 The Complexity Issue
The “size” of an automaton A (on a free monoid A∗ ) is measured by the number m of transitions. (The size |A| = k of the (input) alphabet is seen as a
constant.) The size of a transducer T will be measured by the sum of the sizes
of its transitions where the size of a transition (p, (u, v), q) is the length |u v|. It
is denoted by |T |.
The size of the transducer T×T is |T|^2 and the complexity to build it is
proportional to that size. The complexity of determining the trim part as well
as the accessible part is linear in the size of the transducer.
Deciding whether the product of the trim part U of T ×T by the AD action ωB
is a valuation of U (and if the value of any final state is (1B ∗ , 1B ∗ )) is again linear
in the size of U. Hence deciding whether a transducer T is functional is quadratic
in the size of the transducer. Note that the same complexity is also established
in [6].
The complexity of a decision procedure for the sequentiality of a function,
based on Theorem 5, is polynomial. However, this is less straightforward to
establish than for functionality, for the size of the product V×ωB may be exponential.
One first checks whether the label of every cycle in V is of the form (u, v)
with |u| = |v|. It suffices to check this on a base of simple cycles, and this can
be done by a depth-first search in V. Let us call true cycle a cycle which is not
labelled by (1B∗, 1B∗) and let W be the subautomaton of V consisting of the states
from which a true cycle is accessible. By Theorem 5, it suffices to consider the
product W×ωB. This product may still be of exponential size. However, one does
not construct it entirely. For every state of W, the number of values which are
to be considered in W×ωB may be bounded by the size of T. This yields an
algorithm of polynomial complexity for deciding the sequentiality of the
function realized by T.
In [1], it is shown directly that the twinning property is decidable in polynomial time.
References
1. M.-P. Béal and O. Carton: Determinization of transducers over finite and infinite
words, to appear.
2. J. Berstel: Transductions and context-free languages, Teubner, 1979.
3. J. Berstel and D. Perrin: Theory of codes, Academic Press, 1985.
4. M. Blattner and T. Head: Single valued a-transducers, J. Computer System Sci. 7
(1977), 310–327.
5. V. Bruyère and Ch. Reutenauer: A proof of Choffrut’s theorem on subsequential
functions, Theoret. Comput. Sci. 215 (1999), 329–335.
6. O. Carton, Ch. Choffrut and Ch. Prieur: How to decide functionality of rational
relations on infinite words, to appear.
7. Ch. Choffrut: Une caractérisation des fonctions séquentielles et des fonctions sous-séquentielles en tant que relations rationnelles, Theoret. Comput. Sci. 5 (1977),
325–337.
8. Ch. Choffrut: A generalization of Ginsburg and Rose’s characterization of g-s-m
mappings, in Proc. of ICALP’79 (H. Maurer, Ed.), Lecture Notes in Comput.
Sci. 71 (1979), 88–103.
9. S. Eilenberg: Automata, Languages and Machines vol. A, Academic Press, 1974.
10. E. M. Gurari and O. H. Ibarra: Finite-valued and finitely ambiguous transducers,
Math. Systems Theory 16 (1983), 61-66.
11. Ch. Reutenauer: Subsequential functions: characterizations, minimization, examples, Lecture Notes in Comput. Sci. 464 (1990), 62–79.
12. M. P. Schützenberger: Sur les relations rationnelles, in Automata Theory and Formal Languages (H. Brackhage, Ed.), Lecture Notes in Comput. Sci. 33 (1975),
209–213.
13. M. P. Schützenberger: Sur une variante des fonctions séquentielles, Theoret. Comput. Sci. 4 (1977), 47–57.
Unambiguous Büchi Automata
Olivier Carton¹ and Max Michel²
¹ Institut Gaspard Monge, 5, boulevard Descartes, F-77454 Marne-la-Vallée cedex 2
Olivier.Carton@univ-mlv.fr
² CNET, 38, rue du Général Leclerc, F-92131 Issy-les-Moulineaux
Max.Michel@cnet.francetelecom.fr
Abstract. In this paper, we introduce a special class of Büchi automata
called unambiguous. In these automata, any infinite word labels exactly
one path going infinitely often through final states. The word is accepted
by the automaton if this path starts at an initial state. The main result of
the paper is that any rational set of infinite words is recognized by such
an automaton. We also provide two characterizations of these automata.
We finally show that they are well suited for Boolean operations.
1 Introduction
Automata on infinite words were introduced by Büchi [3] in order to prove
the decidability of the monadic second-order logic of the integers. Since then,
automata on infinite objects have often been used to prove the decidability of
numerous problems. From a more practical point of view, they also lead to
efficient decision procedures, as for temporal logic [12]. Therefore, automata on
infinite words or infinite trees are one of the most important ingredients in model
checking tools [14]. The complementation of automata is then an important issue,
since the systems are usually modeled by logical formulas which involve the
negation operator.
There are several kinds of automata that recognize sets of infinite words.
In 1962, Büchi [3] introduced automata on ω-words, now referred to as Büchi
automata. These automata have initial and final states and a path is successful if
it starts at an initial state and goes infinitely often through final states. However,
not all rational sets of infinite words are recognized by a deterministic Büchi
automaton [5]. Therefore, complementation is a rather difficult operation on
Büchi automata [12].
In 1963, Muller [9] introduced automata, now referred to as Muller automata,
whose accepting condition is a family of accepting subsets of states. A path is
then successful if it starts at the unique initial state and if the set of states which
occur infinitely often in the path is accepting. A deep result of McNaughton [6] shows
that any rational set of infinite words is recognized by a deterministic Muller
automaton. A deterministic automaton is unambiguous in the following sense.
With each word is associated a canonical path which is the unique path starting
at the initial state. A word is then accepted iff its canonical path is successful.
In a deterministic Muller automaton, the unambiguity is due to the uniqueness
of the initial state and to the determinism of the transitions. Independently,
the acceptance condition determines whether a path is successful or not. The
unambiguity of a deterministic Muller automaton makes it easy to complement: it
suffices to exchange accepting and non-accepting subsets of states. However, the
main drawback of using deterministic Muller automata is that the acceptance
condition is much more complicated. It is a family of subsets of states instead
of a simple set of final states. There are other kinds of deterministic automata
recognizing all rational sets of infinite words, like Rabin automata [11], Streett
automata or parity automata [8]. In all these automata, the acceptance condition
is more complicated than a simple set of final states.
In this paper, we introduce a class of Büchi automata in which any infinite word labels exactly one path going infinitely often through final states.
A canonical path can then be associated with each infinite word and we call
these automata unambiguous. In these automata, the unambiguity is due to
the transitions and to the final states whereas the initial states determine if a
path is successful. An infinite word is then accepted iff its canonical path starts
at an initial state. The main result is that any rational set of infinite words is
recognized by such an automaton. It turns out that these unambiguous Büchi
automata are codeterministic, i.e., reverse deterministic. Our result is thus the
counterpart of McNaughton’s result for codeterministic automata. It has already
been proved independently in [7] and [2] that any rational set of infinite words
is recognized by a codeterministic automaton but the construction given in [2]
does not provide unambiguous automata. We also show that unambiguous automata are well suited for boolean operations and especially complementation.
In particular, our construction can be used to find a Büchi automaton which
recognizes the complement of the set recognized by another Büchi automaton.
For a Büchi automaton with n states, our construction provides an unambiguous
automaton which has at most (12n)n states.
The unambiguous automata introduced in the paper recognize right-infinite
words. However, the construction can be adapted to bi-infinite words. Two unambiguous automata on infinite words can be joined to make an unambiguous
automaton on bi-infinite words. This leads to an extension of McNaughton’s
result to the realm of bi-infinite words.
The main result of this paper was first obtained by the second author,
and his proof circulated as a hand-written manuscript among a number of
people. It was, however, never published. Later, the first author found a different
proof of the same result, based on algebraic constructions on semigroups. Both
authors have decided to publish their whole work on this subject together.
The paper is organized as follows. Section 2 is devoted to basic definitions on
words and automata. Unambiguous Büchi automata are defined in Sect. 3. The
main result (Theorem 1) is stated there. The first properties of these automata
are presented in Sect. 4. Boolean operations are studied in Sect. 5.
Unambiguous Büchi Automata

2 Automata
We recall here some elements of the theory of rational sets of finite and infinite
words. For further details on automata and rational sets of finite words, see [10]
and for background on automata and rational sets of infinite words, see [13]. Let
A be a set called an alphabet and usually assumed to be finite. We respectively
denote by A∗ and A+ the set of finite words and the set of nonempty finite
words. The set of right-infinite words, also called ω-words, is denoted by Aω .
A Büchi automaton A = (Q, A, E, I, F) is a non-deterministic automaton
with a set Q of states, subsets I, F ⊂ Q of initial and final states and a set
E ⊂ Q × A × Q of transitions. A transition (p, a, q) of A is denoted by p −a→ q.
A path in A is an infinite sequence

γ : q0 −a0→ q1 −a1→ q2 · · ·
of consecutive transitions. The starting state of the path is q0 and the ω-word
λ(γ) = a0 a1 . . . is called the label of γ. A final path is a path γ such that at least
one of the final states of the automaton is infinitely repeated in γ. A successful
path is a final path which starts at an initial state.
As usual, an ω-word is accepted by the automaton if it is the label of a
successful path. The set of accepted ω-words is said to be recognized by the
automaton and is denoted by L(A). It is well known that a set of ω-words is
rational iff it is recognized by some automaton.
A state of a Büchi automaton A is said to be coaccessible if it is the starting
state of a final path. A Büchi automaton is said to be trim if all states are
coaccessible. Any state which occurs in a final path is coaccessible, and thus non-coaccessible states of an automaton can be removed. In the sequel, automata are
usually assumed to be trim.
An automaton A = (Q, A, E, I, F) is said to be codeterministic if for any
state q and any letter a, there is at most one incoming transition p −a→ q for some
state p. If this condition is met, then for any state q and any finite word w, there is
at most one path p −w→ q ending in q.
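As a small illustration (ours, not the authors'), the codeterminism condition can be checked mechanically when transitions are given as (p, a, q) triples: every pair of letter and target state may have at most one source. The transition set E2 is our reading of the figure of Example 2 below, not taken verbatim from the paper.

```python
def is_codeterministic(transitions):
    """Codeterminism: for each state q and letter a there is at most
    one incoming transition (p, a, q)."""
    seen = set()
    for (p, a, q) in transitions:
        if (a, q) in seen:          # a second incoming transition for (q, a)
            return False
        seen.add((a, q))
    return True

# Transition set consistent with Example 2 below, where both states are
# final, L(A0) = aA^w and L(A1) = bA^w.
E2 = {(0, "a", 0), (0, "a", 1), (1, "b", 0), (1, "b", 1)}
```

Here `is_codeterministic(E2)` holds; adding a second a-transition into the same state would violate the condition.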
3 Unambiguous Automata
In this section, we introduce the concept of unambiguous Büchi automata. We
first give the definition and we state one basic property of these automata. We
then establish a characterization of these automata. We give some examples and
we state the main result.
Definition 1. A Büchi automaton A is said to be unambiguous (respectively
complete) iff any ω-word labels at most (respectively at least) one final path
in A.
O. Carton, M. Michel

The set of final paths is only determined by the transitions and the final
states of A. Thus, the property of being unambiguous or complete does not
depend on the set of initial states of A. In the sequel, we will freely say that
an automaton A is unambiguous or complete without specifying its set of initial
states.
The definition of the word “complete” we use here is not the usual definition
given in the literature. A deterministic automaton is usually said to be complete
if for any state q and letter a there is at least one outgoing transition labeled
by a. This definition implies that any finite or infinite word labels at least one
path starting at the initial state. It will be stated in Proposition 1 that unambiguity implies that the automaton is codeterministic. Thus, we should reverse the
definition and say that for any state q and letter a there is at least one
incoming transition labeled by a. However, since the words are right-infinite, this
condition no longer implies that any ω-word labels a path going infinitely
often through final states, as shown in Example 3. Thus, the definition chosen
in this paper really ensures that any ω-word is the label of a final path. It will
be stated in Proposition 1 that this condition is actually stronger than the usual
one.
In the sequel, we write UBA for Unambiguous Büchi Automaton and CUBA
for Complete Unambiguous Büchi Automaton. The following example is the
simplest CUBA.
Example 1. The automaton ({0}, A, E, I, {0}) with E = {0 −a→ 0 | a ∈ A} is
obviously a CUBA. It recognizes the set Aω of all ω-words if the state 0 is initial
and recognizes the empty set otherwise. It is called the trivial CUBA.
The following proposition states that a UBA must be codeterministic. Such
an automaton can be seen as a deterministic automaton which reads infinite
words from right to left. It starts at infinity and ends at the beginning of the
word. Codeterministic automata on infinite words have already been considered
in [2]. It is proved in that paper that any rational set of ω-words is recognized by
a codeterministic automaton. Our main theorem generalizes this result. It states
that any rational set of ω-words is recognized by a CUBA.
Proposition 1. Let A = (Q, A, E, I, F) be a trim Büchi automaton. If A is
unambiguous, then A is codeterministic. If A is complete, then for any state q
and any letter a, there is at least one incoming transition p −a→ q for some state p.
The second statement of the proposition says that our definition of completeness implies the usual one. Example 3 shows that the converse does not hold.
However, Proposition 3 provides an additional condition on the automaton which
ensures that it is unambiguous and complete.

Before giving some other examples of CUBA, we provide a simple characterization of CUBA which makes it easy to verify that an automaton is unambiguous
and complete. This proposition also shows that it can be effectively checked whether a
given automaton is unambiguous or complete.
Let A = (Q, A, E, I, F ) be a Büchi automaton and let q be a state of A. We
denote by Aq = (Q, A, E, {q}, F ) the new automaton obtained by taking the
singleton {q} as set of initial states. The set L(Aq ) is then the set of ω-words
labeling a final path starting at state q.
Proposition 2. Let A = (Q, A, E, I, F) be a Büchi automaton. For q ∈ Q,
let Aq be the automaton (Q, A, E, {q}, F). The automaton A is unambiguous iff
the sets L(Aq) are pairwise disjoint. The automaton A is complete iff
Aω ⊂ ⋃_{q∈Q} L(Aq).

In particular, the automaton A is unambiguous and complete iff the family
of sets L(Aq) for q ∈ Q is a partition of Aω. It can be effectively verified that
the two sets recognized by the automata Aq and Aq′ are disjoint for q ≠ q′. It
can then be checked whether the automaton is unambiguous. Furthermore, this test
can be performed in polynomial time. The set ⋃_{q∈Q} L(Aq) is recognized by
the automaton AQ = (Q, A, E, Q, F) in which all states are initial. The inclusion
Aω ⊂ ⋃_{q∈Q} L(Aq) holds iff this automaton recognizes Aω. This can be checked,
but it does not seem that it can be performed in polynomial time.
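The disjointness test behind Proposition 2 can be sketched as follows; this is our illustration, not the authors' procedure. Disjointness of L(Aq) and L(Aq′) reduces to an emptiness check on the product of Aq and Aq′: the intersection is nonempty iff a product state whose first component is final and a product state whose second component is final lie on a common reachable cycle. The transition sets E2 and E3 are our reconstructions of the automata of Example 2 and Fig. 3 below.

```python
def successors(E):
    """Product transition relation: pairs of transitions with the same label."""
    succ = {}
    for (p, a, q) in E:
        for (r, b, s) in E:
            if a == b:
                succ.setdefault((p, r), set()).add((q, s))
    return succ

def reachable(starts, succ):
    seen, stack = set(starts), list(starts)
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def joint_word_exists(E, F, q1, q2):
    """Is L(A_q1) ∩ L(A_q2) nonempty?  Look for a reachable cycle of the
    product visiting final states in both components infinitely often."""
    succ = successors(E)
    for x in reachable({(q1, q2)}, succ):
        if x[0] not in F:
            continue
        from_x = reachable(succ.get(x, set()), succ)   # one or more steps from x
        for y in from_x:
            if y[1] in F and x in reachable(succ.get(y, set()), succ):
                return True                            # cycle through x and y
    return False

def is_unambiguous(Q, E, F):
    """Proposition 2: unambiguous iff the L(A_q) are pairwise disjoint."""
    states = sorted(Q)
    return not any(joint_word_exists(E, F, p, q)
                   for i, p in enumerate(states) for q in states[i + 1:])

E2 = {(0, "a", 0), (0, "a", 1), (1, "b", 0), (1, "b", 1)}   # Example 2
E3 = {(0, "a", 1), (0, "b", 1), (1, "a", 0), (1, "b", 0)}   # our reading of Fig. 3
```

With these data `is_unambiguous({0, 1}, E2, {0, 1})` holds, while the automaton of Fig. 3 is reported ambiguous, in accordance with Example 4.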
We now come to examples. We use Proposition 2 to verify that the following
two automata are unambiguous and complete. In the figures, a transition p −a→ q
of an automaton is represented by an arrow labeled by a from p to q. Initial states
have a small incoming arrow while final states are marked by a double circle. A
Büchi automaton which is complete but ambiguous is given in Example 4.
Fig. 1. CUBA of Example 2
Example 2. Let A be the alphabet A = {a, b} and let A be the automaton
pictured in Fig. 1. This automaton is unambiguous and complete since we have
L(A0 ) = aAω and L(A1 ) = bAω . It recognizes the set aAω of ω-words beginning
with an a.
Fig. 2. CUBA of Example 3
The following example shows that a CUBA may have several connected components.
Example 3. Let A be the alphabet A = {a, b} and let A be the automaton
pictured in Fig. 2. It is unambiguous and complete since we have L(A0) = A∗baω,
L(A1) = aω, L(A2) = a(A∗b)ω and L(A3) = b(A∗b)ω. It recognizes the set (A∗b)ω
of ω-words having an infinite number of b.

The automaton of the previous example has two connected components. Since
it is unambiguous and complete, any ω-word labels exactly one final path in this
automaton. This final path is in the first component if the ω-word has finitely
many b and in the second component otherwise. This automaton shows that
our definition of completeness for an unambiguous Büchi automaton is stronger
than the usual one. Any connected component is complete in the usual sense
if it is considered as a whole automaton. For any letter a and any state q in
this component, there is exactly one incoming transition p −a→ q. However, each
component is not complete according to our definition since not every ω-word
labels a final path in this component.
In the realm of finite words, an automaton is usually made unambiguous by
the usual subset construction [4, p. 22]. This construction associates with an
automaton A an equivalent deterministic automaton whose states are subsets of
states of A. Since left and right are symmetric for finite words, this construction
can be reversed to get a codeterministic automaton which is also equivalent
to A. In the case of infinite words, the result of McNaughton [6] states that a
Büchi automaton can be replaced by an equivalent Muller automaton which is
deterministic. However, this construction cannot be reversed since ω-words are
right-infinite. We have seen in Proposition 1 that a CUBA is codeterministic.
The following theorem is the main result of the paper. It states that any rational
set of ω-words is recognized by a CUBA. This theorem is thus the counterpart
of McNaughton’s result for codeterministic automata. Like Muller automata,
CUBA make the complementation very easy to do. This will be shown in Sect. 5.
The proof of Theorem 1 contains a new proof that the class of rational sets of
ω-words is closed under complementation.
Theorem 1. Any rational set of ω-words is recognized by a complete unambiguous Büchi automaton.
There are two proofs of this result which are both rather long. Both proofs
yield effective procedures which give a CUBA recognizing a given set of ω-words.
The first proof is based on graphs and it directly constructs a CUBA from a
Büchi automaton recognizing the set. The second proof is based on semigroups
and it constructs a CUBA from a morphism from A+ into a finite semigroup
recognizing the set. An important ingredient of both proofs is the notion of a
generalized Büchi automaton.
In a Büchi automaton, the set of final paths is the set of paths which go
infinitely often through final states. In a generalized Büchi automaton, the set of
final paths is given in a different way. A generalized Büchi automaton is equipped
with an output function µ which maps any transition to a nonempty word over
an alphabet B and with a fixed set K of ω-words over B. A path is final if the
concatenation of the outputs of its transitions belongs to K. A generalized Büchi
automaton can be seen as an automaton with an output function. We point out
that usual Büchi automata are a particular case of generalized Büchi automata.
Indeed, if the function µ maps any transition p −a→ q to 1 if p or q is final and to 0
otherwise, and if K = (0∗1)ω is the set of ω-words over {0, 1} having an infinite
number of 1, then a path in A is final iff some final state occurs infinitely often in it.
The notions of unambiguity and completeness are then extended to generalized Büchi automata. A generalized Büchi automaton is said to be unambiguous
(respectively complete) if any ω-word labels at most (respectively at least) one
final path.
The generalized Büchi automata can be composed. If a set X is recognized
by an automaton A whose fixed set K is recognized by automaton B which has a
fixed set K ′ , then X is also recognized by an automaton having the fixed set K ′
which can be easily constructed from A and B. Furthermore, this composition
is compatible with unambiguity and completeness. This means that if both automata A and B are unambiguous (respectively complete), then the automaton
obtained by composition is also unambiguous (respectively complete).
4 Properties and Characterizations
In this section, we present some additional properties of CUBA. We first give another characterization of CUBA which involves loops going through final states.
We present some consequences of this characterization. The characterization of
CUBA given in Proposition 2 uses sets of ω-words. The family of sets of ω-words
labeling a final path starting in the different states must be a partition of the
set Aω of all ω-words. The following proposition only uses sets of finite words to
characterize UBA and CUBA.
Proposition 3. Let A = (Q, A, E, I, F) be a Büchi automaton such that for
any state q and any letter a, there exists exactly one incoming transition p −a→ q.
Let Sq be the set of nonempty finite words w such that there is a path q −w→ q
going through a final state. The automaton A is unambiguous iff the sets Sq are
pairwise disjoint. The automaton A is unambiguous and complete iff the family
of sets Sq for q ∈ Q is a partition of A+. In this case, the final path labeled by
the periodic ω-word wω is the path

q −w→ q −w→ q · · ·

where q is the unique state such that w ∈ Sq.
The second statement of the proposition says that if the automaton A
is supposed to be unambiguous, then it is complete iff the inclusion A+ ⊂ ⋃_{q∈Q} Sq
holds. The assumption that the automaton is unambiguous is necessary. As the
following example shows, it is not true in general that the automaton is complete
iff the inclusion holds.
Example 4. The automaton of Fig. 3 is ambiguous since the ω-word bω labels
two final paths. Since this automaton is deterministic and all states are final, it
is complete. However, it is not true that A+ ⊂ ⋃_{q∈Q} Sq. Indeed, no loop in this
automaton is labeled by the finite word a.

Fig. 3. CUBA of Example 4
Proposition 3 gives another method to check whether a given Büchi automaton is
unambiguous and complete. It must first be verified that for any state q and
any letter a, there is exactly one incoming transition p −a→ q. Then, it must be
checked whether the family of sets Sq for q ∈ Q forms a partition of A+. The sets Sq are
rational and a codeterministic automaton recognizing Sq can be easily deduced
from the automaton A. It is then straightforward to verify that the sets Sq form
a partition of A+.
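For small automata, membership w ∈ Sq can also be decided by a direct forward search instead of building the codeterministic automata mentioned above; the sketch below is our illustration. We reuse our reconstruction of the Example 2 automaton, for which S0 = aA∗ and S1 = bA∗, and we count the endpoints of the loop as visited states.

```python
def in_Sq(E, F, q, w):
    """Is there a path q --w--> q going through a final state?
    Track pairs (current state, final state visited so far)."""
    frontier = {(q, q in F)}
    for a in w:
        frontier = {(t, seen or t in F)
                    for (s, seen) in frontier
                    for (p, b, t) in E if p == s and b == a}
    return (q, True) in frontier

# Example 2 automaton (our reconstruction); both states are final.
E2 = {(0, "a", 0), (0, "a", 1), (1, "b", 0), (1, "b", 1)}
F2 = {0, 1}
```

Here `in_Sq(E2, F2, 0, "ab")` holds and `in_Sq(E2, F2, 1, "ab")` does not: the sets S0 = aA∗ and S1 = bA∗ form a partition of A+, as Proposition 3 demands for a CUBA.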
The last statement of Proposition 3 says that the final path labeled by a
periodic word is also periodic. It is worth mentioning that the same result does
not hold for deterministic automata.
It follows from Proposition 3 that the trivial CUBA with one state (see
Example 1) is the only CUBA which is deterministic.
5 Boolean Combinations
In this section, we show that CUBA behave well under boolean operations. From CUBA recognizing two sets X and Y, CUBA recognizing the
complement Aω \ X, the union X ∪ Y and the intersection X ∩ Y can be easily
obtained. For usual Büchi automata or for Muller automata, automata recognizing the union and the intersection are easy to get. It is sufficient to consider
the product of the two automata with some small additional memory. However,
complementation is very difficult for general Büchi automata.
5.1 Complement
We begin with complementation which turns out to be a very easy operation
for CUBA. Indeed, it suffices to change the initial states of the automaton to
recognize the complement.
Proposition 4. Let A = (Q, A, E, I, F) be a CUBA recognizing a set X of ω-words. The automaton A′ = (Q, A, E, Q \ I, F), where Q \ I is the set of non-initial states, is unambiguous and complete and it recognizes the complement
Aω \ X of X.
It must be pointed out that it is really necessary for the automaton A to be
unambiguous and complete. Indeed, if A is ambiguous, it may happen that an
ω-word x of X labels a final path starting at an initial state and another final
path starting at a non-initial state. In this case, the ω-word x is also recognized
by the automaton A′ . If A is not complete, some ω-word x labels no final path.
This ω-word which does not belong to X is not recognized by the automaton A′ .
By the previous result, the proof of Theorem 1 also provides a new proof of the
fact that the family of rational sets of ω-words is closed under complementation.
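Proposition 4 makes complementation a one-line operation. The sketch below is our illustration (an automaton encoded as a tuple of states, alphabet, transitions, initial and final states, with the Example 2 transitions being our reconstruction of the figure); it simply flips the initial set.

```python
def complement(cuba):
    """Proposition 4: for a complete unambiguous Büchi automaton,
    swapping initial and non-initial states complements the language."""
    Q, A, E, I, F = cuba
    return (Q, A, E, Q - I, F)

# Example 2 automaton: with initial set {0} it recognizes aA^w, so the
# complemented automaton, with initial set {1}, recognizes bA^w.
E2 = {(0, "a", 0), (0, "a", 1), (1, "b", 0), (1, "b", 1)}
cuba = ({0, 1}, {"a", "b"}, E2, {0}, {0, 1})
comp = complement(cuba)
```

Applying `complement` twice returns the original initial set, reflecting that Aω \ (Aω \ X) = X.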
5.2 Union and Intersection
In this section, we show how CUBA recognizing the union X1 ∪ X2 and the
intersection X1 ∩ X2 can be obtained from CUBA recognizing X1 and X2 .
We suppose that the sets X1 and X2 are respectively recognized by the CUBA
A1 = (Q1 , A, E1 , I1 , F1 ) and A2 = (Q2 , A, E2 , I2 , F2 ). We will construct two
CUBA U = (Q, A, E, IU , F ) and I = (Q, A, E, II , F ) respectively recognizing
the union X1 ∪ X2 and the intersection X1 ∩ X2. Both automata U and I share
the same state set Q, the same transition set E and the same set F of final
states.
We first describe the states and the transitions of both automata U and I.
These automata are based on the product of the automata A1 and A2, but a
third component is added. The final states may not appear at the same time in
A1 and A2. The third component synchronizes the two automata by indicating
in which of the two automata the first final state occurs. The set Q of states is
Q = Q1 × Q2 × {1, 2}. Each state is then a triple (q1, q2, ε) where q1 is a state of A1,
q2 is a state of A2 and ε is 1 or 2. There is a transition (q1′, q2′, ε′) −a→ (q1, q2, ε) if
q1′ −a→ q1 and q2′ −a→ q2 are transitions of A1 and A2 and if ε′ is defined as follows:

ε′ = 1 if q1 ∈ F1,
ε′ = 2 if q1 ∉ F1 and q2 ∈ F2,
ε′ = ε otherwise.

This definition is not completely symmetric. When both q1 and q2 are final
states, we choose to set ε′ = 1. We now define the set F of final states as

F = {(q1, q2, ε) | q2 ∈ F2 and ε = 1}.

This definition is also not symmetric.
It may be easily verified that any loop around a final state (q1 , q2 , ε) also
contains a state (q1′ , q2′ , ε′ ) such that q2′ ∈ F2 . This implies that the function
which maps a path γ to the pair (γ1, γ2) of paths in A1 and A2 is one-to-one
from the set of final paths in U or I to the set of pairs of final paths in A1
and A2 . Thus if both A1 and A2 are unambiguous and complete, then both
automata U and I are also unambiguous and complete.
If q1 and q2 are the respective starting states of γ1 and γ2 , the starting state
of γ is then equal to (q1 , q2 , ε) with ε ∈ {1, 2}. We thus define the sets IU and II
of initial states of the automata U and I as follows:

IU = {(q1, q2, ε) | (q1 ∈ I1 or q2 ∈ I2) and ε ∈ {1, 2}}
II = {(q1, q2, ε) | q1 ∈ I1 and q2 ∈ I2 and ε ∈ {1, 2}}
From these definitions, it is clear that both automata U and I are unambiguous
and complete and that they respectively recognize X1 ∪ X2 and X1 ∩ X2 .
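The whole construction of this section can be sketched directly; as before the encoding is ours, and the Example 2 transitions are our reconstruction. From two CUBA over the same alphabet it builds the common state set Q1 × Q2 × {1, 2}, the backward-determined transitions, the final set F and the two initial sets IU and II:

```python
def union_and_intersection(A1, A2):
    Q1, S, E1, I1, F1 = A1
    Q2, _, E2, I2, F2 = A2
    Q = {(p, q, e) for p in Q1 for q in Q2 for e in (1, 2)}
    E = set()
    for (p1, a, q1) in E1:
        for (p2, b, q2) in E2:
            if a != b:
                continue
            for eps in (1, 2):      # third component of the target state
                if q1 in F1:
                    src = 1          # eps' = 1 if q1 in F1
                elif q2 in F2:
                    src = 2          # eps' = 2 if q1 not in F1 and q2 in F2
                else:
                    src = eps        # eps' = eps otherwise
                E.add(((p1, p2, src), a, (q1, q2, eps)))
    F  = {(p, q, e) for (p, q, e) in Q if q in F2 and e == 1}
    IU = {(p, q, e) for (p, q, e) in Q if p in I1 or q in I2}
    II = {(p, q, e) for (p, q, e) in Q if p in I1 and q in I2}
    return (Q, S, E, IU, F), (Q, S, E, II, F)

# Example 2's automaton once with initial state 0 (aA^w) and once with
# initial state 1 (bA^w).
E2 = {(0, "a", 0), (0, "a", 1), (1, "b", 0), (1, "b", 1)}
X1 = ({0, 1}, {"a", "b"}, E2, {0}, {0, 1})
X2 = ({0, 1}, {"a", "b"}, E2, {1}, {0, 1})
U, I_ = union_and_intersection(X1, X2)
```

For these inputs the union automaton has 6 initial states out of 8 and the intersection automaton has 2, matching the definitions of IU and II above.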
Acknowledgment
The authors would like to thank Dominique Perrin, Jean-Éric Pin and Pascal Weil for helpful discussions and suggestions.
References
1. André Arnold. Rational omega-languages are non-ambiguous. Theoretical Computer Science, 26(1-2):221–223, 1983.
2. Danièle Beauquier and Dominique Perrin. Codeterministic automata on infinite
words. Information Processing Letters, 20:95–98, 1985.
3. J. Richard Büchi. On a decision method in the restricted second-order arithmetic.
In Proc. Int. Congress Logic, Methodology and Philosophy of science, Berkeley
1960, pages 1–11. Stanford University Press, 1962.
4. John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979.
5. Lawrence H. Landweber. Decision problems for ω-automata. Math. Systems Theory, 3:376–384, 1969.
6. Robert McNaughton. Testing and generating infinite sequences by a finite automaton. Inform. Control, 9:521–530, 1966.
7. A. W. Mostowski. Determinancy of sinking automata on infinite trees and inequalities between various Rabin’s pair indices. Inform. Proc. Letters, 15(4):159–163,
1982.
8. A. W. Mostowski. Regular expressions for infinite trees and a standard form for
automata. In A. Skowron, editor, Computation theory, volume 208 of Lect. Notes
in Comput. Sci., pages 157–168. Springer-Verlag, Berlin, 1984.
9. David Muller. Infinite sequences and finite machines. In Proc. of Fourth Annual
IEEE Symp., editor, Switching Theory and Logical Design, pages 3–16, 1963.
10. Dominique Perrin. Finite automata. In J. van Leeuwen, editor, Handbook of
Theoretical Computer Science, volume B, chapter 1. Elsevier, 1990.
11. Michael O. Rabin. Decidability of second-order theories and automata on infinite
trees. Trans. Amer. Math. Soc., 141:1–35, 1969.
12. A. Prasad Sistla, Moshe Y. Vardi, and Pierre Wolper. The complementation problem for Büchi automata and applications to temporal logic. Theoret. Comput. Sci.,
49:217–237, 1987.
13. Wolfgang Thomas. Automata on infinite objects. In J. van Leeuwen, editor,
Handbook of Theoretical Computer Science, volume B, chapter 4. Elsevier, 1990.
14. Moshe Y. Vardi. An automata-theoretic approach to linear temporal logic. In
Logics for Concurrency: Structure versus Automata, volume 1043 of Lect. Notes in
Comput. Sci., pages 238–266. Springer-Verlag, 1996.
Linear Time Language Recognition on Cellular
Automata with Restricted Communication
Thomas Worsch
Universität Karlsruhe, Fakultät für Informatik
worsch@ira.uka.de
Abstract. It is well-known that for classical one-dimensional one-way
CA (OCA) it is possible to speed up language recognition times from
(1 + r)n, r ∈ R+, to (1 + r/2)n. In this paper we show that this no
longer holds for OCA in which a cell can communicate only one bit (or
more generally a fixed amount) of information to its neighbor in each
step. For arbitrary real numbers r2 > r1 > 1, 1-bit OCA working in time r2n can
recognize strictly more languages than those operating in time r1 n. Thus
recognition times may increase by an arbitrarily large constant factor
when restricting the communication to 1 bit. For two-way CA there is
also an infinite hierarchy but it is not known whether it is as dense as
for OCA. Furthermore it is shown that for communication restricted
CA two-way flow of information can be much more powerful than an
arbitrary number of additional communication bits.
1 Introduction
The model of 1-bit CA results from the standard definition by restricting the
amount of information which can be transmitted by a cell to its neighbors in
one step to be only 1 bit. We call this the communication bandwidth.
Probably the first paper investigating 1-bit CA is the technical report [2],
where it is shown that even with this model solutions of the FSSP in optimal time
are possible. More recently, [4] has described 1-bit CA for several one- and two-dimensional problems (e.g. generation of Fibonacci sequences and determining
whether two-dimensional patterns are connected) which again run in the
minimum time possible. Therefore the question immediately arises about the
consequences of the restriction to 1-bit information flow in the general case.
In Section 2 basic definitions are given and it is proved that each CA with s
states can be simulated by a 1-bit CA with a slowdown by a factor of at most
⌈log s⌉. This seems to be some kind of folklore, but we include the proof for the
sake of completeness and reference in later sections.
In Section 3 it is shown that for one-way CA (OCA) in general there must be
a slowdown. More specifically there is a very fine hierarchy with an uncountable
number of distinct levels (order isomorphic to the real numbers greater than 1)
within the class of languages which can be recognized by 1-bit OCA in linear
time.
In Section 4 we consider two-way CA with restricted communication.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 417–426, 2000.
© Springer-Verlag Berlin Heidelberg 2000
The results obtained are in contrast to those for cellular devices with unrestricted information flow. For example, general speedup theorems have been
shown for iterative arrays [1] and cellular automata [3].
2 Simulation of k-bit CA by 1-bit CA
A deterministic CA is determined by a finite set of states Q, a neighborhood
N′ = N ∪ {0} and a local rule. (For a simpler notation below, we assume that
N = {n1, . . . , n|N|} does not contain 0.) The local rule of C is of the form
τ : Q × Q^N → Q, i.e. each cell has the full information on the states of its
neighbors.

In a k-bit CA B, each cell only gets k bits of information about the state
of each neighbor. To this end there are functions bi : Q → B^k specified, where
B = {0, 1}. If a cell is in state q then bi(q) are the bits observed by neighbor ni.
We allow different bits to be seen by different neighbors. The local transformation
of B is of the form τ : Q × (B^k)^N → Q.

Given a configuration c : Z → Q and its successor configuration c′, the new
state of a cell i is c′_i = τ(c_i, b1(c_{i+n1}), . . . , b_{|N|}(c_{i+n_{|N|}})).
As usual, for the recognition of formal languages over an input alphabet A
one chooses Q ⊃ A and a set of accepting final states F ⊆ Q \ A. In the initial
configuration for an input x1 · · · xn ∈ A^n, cell i is in state xi for 1 ≤ i ≤ n and all
other cells are in a quiescent state q (satisfying τ(q, q^N) = q). A configuration c
is accepting iff c1 ∈ F.
Given a k-bit CA C one can construct a 1-bit CA C ′ with the same neighborhood simulating C in the following sense: Each configuration c of C is also
a legal configuration of C ′ , and there is a constant l (independent of c) such
that if c′ is C’s successor configuration of c then C ′ when starting in c reaches
c′ after l steps. The basic idea is to choose representations of states by binary
words and to transmit them bit by bit to the neighbors before doing a “real”
state transition.
Let B^{≤i} denote B^0 ∪ · · · ∪ B^i. Denote by b_{i,j}(q) the j-th bit of bi(q), i.e.
bi(q) = b_{i,k}(q) · · · b_{i,1}(q).

Algorithm 1. As the set of states of C′ choose Q′ = Q × (B^N)^{≤k−1}; i.e. each
state q′ consists of a q ∈ Q and binary words v1, . . . , v_{|N|} of identical length j
for some 0 ≤ j ≤ k − 1. For each q ∈ Q identify (q, ε, . . . , ε) with q so that Q can
be considered a subset of Q′. (Here, ε is the empty word.) For j ≤ k − 1 and a
q′ = (q, v1, . . . , v_{|N|}) ∈ Q × (B^N)^j define b′_i(q′) = b_{i,j+1}(q), where the b′_i are the
functions describing the bit seen by neighbor ni in C′.

The local transformation τ′ of C′ is defined as follows:

– If the length j of all v_e is < k − 1 then τ′((q, v1, . . . , v_{|N|}), x1, . . . , x_{|N|}) =
(q, x1v1, . . . , x_{|N|}v_{|N|}).
– If the length j of all v_e is = k − 1 then τ′((q, v1, . . . , v_{|N|}), x1, . . . , x_{|N|}) =
(τ(q, x1v1, . . . , x_{|N|}v_{|N|}), ε, . . . , ε).
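A toy instance of Algorithm 1 (our illustration, with an example local rule, neighborhood N = {1} and k = 2 on a ring of cells) shows the intended simulation: each macro-step of the 2-bit CA is reproduced by exactly k micro-steps of the serialized automaton, which collects the neighbor's bits least significant first.

```python
K = 2                                   # bits per state

def tau(q, v):                          # example local rule of the k-bit CA
    return (q + v) % 4                  # v = value sent by the right neighbor

def step_kbit(cfg):                     # one macro-step on a ring of cells
    n = len(cfg)
    return [tau(cfg[i], cfg[(i + 1) % n]) for i in range(n)]

def step_1bit(cfg):
    """One micro-step of Algorithm 1.  A state is (q, received_bits); the
    bit a cell exposes is bit j+1 of its own q, where j = len(received_bits)."""
    n = len(cfg)
    nxt = []
    for i in range(n):
        q, v = cfg[i]
        nq, nv = cfg[(i + 1) % n]
        x = (nq >> len(nv)) & 1         # bit currently exposed by the neighbor
        if len(v) < K - 1:
            nxt.append((q, v + [x]))    # keep collecting bits
        else:
            bits = v + [x]              # all K bits received, LSB first
            val = sum(b << m for m, b in enumerate(bits))
            nxt.append((tau(q, val), []))   # the "real" state transition
    return nxt

cfg, cfg1 = [3, 1, 0, 2], [(q, []) for q in [3, 1, 0, 2]]
for _ in range(3):                      # three macro-steps
    cfg = step_kbit(cfg)
    for _ in range(K):                  # = K micro-steps each
        cfg1 = step_1bit(cfg1)
assert [q for q, _ in cfg1] == cfg      # slowdown is exactly the factor K
```

The final assertion is exactly Lemma 2 for this instance: after K micro-steps per macro-step the two configurations agree.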
The above construction shows that the following lemma holds:
Lemma 2. A k-bit CA can be simulated by a 1-bit CA with the same neighborhood and slowdown k.
Since the states of a set Q can be unambiguously represented as binary words
of length ⌈log2 |Q|⌉, it is straightforward to see:
Corollary 3. Each CA with s states can be simulated by a 1-bit CA with slowdown ⌈log2 s⌉ having the same neighborhood and identical functions bi for all
neighbors.
It should be observed that the above slowdown happens if the bit visible to other
cells is the same for all neighbors. One could wonder whether the slowdown is
always less if different bits are sent to different neighbors. However this is not
the case. The proofs below for the lower bounds do not specifically make any
use of the fact that all neighbors are observing the same bit; they work even if
there were |N | (possibly different) functions bi for the neighboring cells.
On the other hand one should note that for certain CA there is a possibility
for improvement, i.e. conversion to 1-bit CA with a smaller slowdown: Sometimes it is already known that neighbors do not need the full information about
each state. In a typical case the set of states might be the Cartesian product
of some sets and a neighbor only needs to know one component, as it is by
definition the case in so-called partitioned CA. It is then possible to apply a
similar construction as above, but only to that component. Since the latter can
be described with fewer bits than the whole state, the construction results in a
smaller slowdown.
We will make use of this and a related trick in Section 3.3.
3 A Linear-Time Hierarchy for 1-bit OCA
For a function f : N+ → N+ denote by OCAk (f (n)) the family of languages
which can be recognized by k-bit OCA in time f (n). In this section we will prove:
Theorem 4. For all real numbers 1 < r1 < r2 the following holds:

OCA1(r1n) ⊊ OCA1(r2n)
We will proceed in 3 major steps.
3.1 An Infinite Hierarchy
Let Am be an input alphabet with exactly m = 2^l − 1 symbols. Hence l bits are
needed to describe one symbol of Am ∪ {□}, where □ is the quiescent state. The
case of alphabets with an arbitrary number of symbols will be considered later.
Denote by Lm the set {vv^R | v ∈ Am^+} of all palindromes of even length over
Am.
420
T. Worsch
Lemma 5. Each 1-bit OCA recognizing Lm needs at least time (l−ε)n for every
ε > 0.
Proof. Consider a 1-bit OCA C recognizing Lm and an input length n. Denote
by t the worst case computation time needed by C for inputs of length n.

Consider the boundary between cells k and k + 1, 1 ≤ k < n, which separates a
left and a right part of an input. The computations in the left part are completely
determined by the corresponding part of the input and the sequence Br of bits
received by cell k from the right during time steps 1, . . . , t − k. There are exactly
2^{t−k} such bit sequences. On the other hand there are m^{n−k} = (2^l − 1)^{n−k}
right parts of inputs of length n.

Assume that 2^{t−k} < (2^l − 1)^{n−k}. Then there would exist two different
words v1 and v2 of length n − k resulting in the same bit string received by cell
k during any computation for an input of one of the forms vv1 or vv2. Since we
are considering OCA, the bit string is independent of any symbols to the left of
cell k + 1. Therefore C would either accept or reject both inputs v1^R v1 and v1^R v2,
although exactly one of them is in Lm. Contradiction.

Therefore 2^{t−k} ≥ (2^l − 1)^{n−k}. For sufficiently large n there is an arbitrarily
small ε″ such that this implies 2^{t−k} ≥ 2^{(l−ε″)(n−k)}, i.e. t − k ≥ (l − ε″)(n − k),
i.e. t ≥ ln + k − lk − ε″n + ε″k. For an arbitrarily chosen ε′ > 0 consider the case
k = ε′n (for sufficiently large n). One then gets t ≥ ln + ε′n − lε′n − ε″n + ε″ε′n =
ln − (ε′(l − 1 − ε″) + ε″)n. If ε′ and ε″ are chosen sufficiently small, this is larger
than ln − εn for a given ε.
Lemma 6. Lm can be recognized by 1-bit OCA in time (l + 1)n + O(1).

Proof. Algorithm 7 below describes an (l + 1)-bit OCA recognizing the language
in time n + O(1). Hence the claim follows from Lemma 2.
Algorithm 7. We describe an (l + 1)-bit OCA recognizing Lm. The set of states
can be chosen to be of the form Qm = Am ∪ Am × (Am ∪ {}) × B^2. The local
rule mapping the state qc of a cell and the state qr of its neighbor to τ(qc, qr) is
chosen as follows. For the first step, with a, b′ ∈ Am:

    τ(a, b′) = (a, b′, 1, 1)

For later steps:

    τ((a, a′, x, x′), (b, b′, y, y′)) = (a, b′, x′ ∧ [a = a′], y)

where [a = a′] is 1 if the symbols are equal and 0 otherwise. As can be seen
immediately, the only information needed from the right neighbor is one symbol
b′ and one bit y. Hence an (l + 1)-bit OCA can do the job.
A closer look at the local rule reveals that the OCA above indeed recognizes
palindromes in time n + O(1) if one chooses as the set of final states Am × {} ×
{1} × B (see [5] for details). Hence Lm can also be recognized by 1-bit OCA in
time (l + 1)n + O(1).
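To make the local rule concrete, here is a small simulation of Algorithm 7, reconstructed from the two equations for τ above. The boundary handling (a blank symbol '#' fed in from the right by an assumed quiescent border state) is our own guess, since the paper defers those details to [5]; the leftmost cell accepts after n steps iff its second component is blank and its third component is 1, mirroring the final states Am × {} × {1} × B:

```python
def oca_palindrome(word, blank='#'):
    """Simulate the (l+1)-bit OCA of Algorithm 7 on `word` for n steps.

    States are tuples (a, a', x, x'); in each step a cell reads only the
    symbol b' and the bit y of its right neighbor, as in the rule tau.
    """
    n = len(word)
    syms = list(word)
    # first step: tau(a, b') = (a, b', 1, 1)
    states = [(syms[j], syms[j + 1] if j + 1 < n else blank, 1, 1)
              for j in range(n)]
    border = (blank, blank, 1, 1)   # assumed border state
    for _ in range(n - 1):
        nxt = []
        for j in range(n):
            a, a2, x, x2 = states[j]
            b, b2, y, y2 = states[j + 1] if j + 1 < n else border
            # tau((a,a',x,x'), (b,b',y,y')) = (a, b', x' and [a = a'], y)
            nxt.append((a, b2, x2 & (1 if a == a2 else 0), y))
        states = nxt
    a, a2, x, x2 = states[0]
    return a2 == blank and x == 1

assert oca_palindrome("abba") and oca_palindrome("aba")
assert not oca_palindrome("abca") and not oca_palindrome("ab")
```

Serializing each (symbol, bit) message over l + 1 single-bit steps, in the spirit of Lemma 2, then yields the (l + 1)n + O(1) bound of Lemma 6.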
The upper bound of the previous lemma is not very close to the lower bound of
Lemma 5, and it is not obvious how to improve at least one of them.
Linear Time Language Recognition on Cellular Automata
3.2 Reducing the Gap to (l ± ε)n
We will now define variants of the palindrome language for which gaps between
upper and lower bound can be proved to be very small.
We will use vectors of length r of symbols from an alphabet A as symbols
of a new alphabet A′ . Although a vector of symbols is more or less the same
as a word of symbols, we will use different notations for both concepts in order
to make the construction a little bit clearer. Denote by M^⟨r⟩ the set of vectors
of length r of elements from a set M and by A^r the set of words of length r
consisting of symbols from A. The obvious mapping ⟨x1, . . . , xr⟩ ↦ x1 · · · xr
induces a monoid homomorphism h : (A^⟨r⟩)^∗ → (A^r)^∗ ⊆ A^∗.
Definition 8. For integers m ≥ 1 and r ≥ 1 let

    Lm,r = { v h(v)^R | v ∈ (Am^⟨r⟩)^+ }

Lm,r is a language over the alphabet Am^⟨r⟩ ∪ Am. The words in Lm,r are still more
or less palindromes, where in the left part of a word groups of r elements from
Am are considered as one (vector) symbol. As a special case one has Lm,1 = Lm
as defined earlier.
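The flattening homomorphism h and the shape of words in Lm,r can be illustrated directly; the helper names below are ours:

```python
def h(vector_word):
    # flatten a word of r-vectors into a word of plain symbols
    return [s for vec in vector_word for s in vec]

def member_of_Lmr(v):
    # build the word v h(v)^R from a non-empty list v of r-tuples
    return list(v) + h(v)[::-1]

# r = 2: the left part uses vector symbols, the right part plain symbols
w = member_of_Lmr([('a', 'b'), ('c', 'a')])
assert w == [('a', 'b'), ('c', 'a'), 'a', 'c', 'b', 'a']

# r = 1 recovers (an isomorphic copy of) the language Lm
assert member_of_Lmr([('a',), ('b',)]) == [('a',), ('b',), 'b', 'a']
```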
Lemma 9. For each ε > 0 there is an r ≥ 1 such that each 1-bit OCA recognizing Lm,r needs at least time (l − ε)n.
A proof can be given analogously to the proof of Lemma 5 above. One only has
to observe that the border between cells k and k + 1 must not lie within “the left
part v” of an input. Therefore for small ε one must choose a sufficiently large r,
e.g. r > 1/ε, to make sure that |v| < ε|v h(v)^R|.
Thus, for sufficiently large r, although ⌈log2 |Am|⌉ · n is not a lower bound on
the recognition time of Lm,r by 1-bit OCA, it is “almost” one.
Lemma 10. For each ε > 0 and r = 1/ε the language Lm,r can be recognized
by a 1-bit OCA in time (l + ε)n + O(1).
Thus, for sufficiently large r, although ⌈log2 |Am|⌉ · n is not an upper bound on
the achievable recognition time on 1-bit OCA, it is “almost” one.
For the proof we use a construction similar to Algorithm 7.
Algorithm 11. The CA uses a few additional steps before and after the check
for palindromes, where the check itself also has to be adapted to the different
form of inputs.
– In the first step each cell sends one bit to its left neighbor indicating whether
its input symbol is from Am^⟨r⟩ or Am. Thus, if the input is not in (Am^⟨r⟩)^∗ Am^∗,
this is detected by at least one cell and an error indicator is stored locally.
It will be used later.
– One may therefore assume now that the input is of the indicated form, and
we will call cells with an input symbol from Am the “right” cells and those
with a symbol from Am^⟨r⟩ the “left” cells.
After the first step the rightmost of the left cells has identified itself.
– With the second step an algorithm for palindrome checking is started. The
modifications with respect to Algorithm 7 are as follows:
– Each cell is counting modulo lr + 1 in each step. (This doesn’t require
any communication.)
– During the first lr steps of a cycle the right cells are shifting r symbols
to the left. In the (lr + 1)-st step they do not do anything.
– During the first lr steps of a cycle the left cells are also shifting r symbols
to the left. In addition they are accumulating what they receive in registers. In step lr they are comparing whether the register contents
“match” their own input symbol, and in step lr + 1 they are sending the
result of the comparison, combined with the previously received comparison bit, to their left neighbor.
One should observe that the last point is the basic trick: the comparison bit
does not have to be transported one cell to the left each time a symbol has been
received, but only once every r symbols. Thus by increasing r the fraction of time
needed for transmitting these bits can be made arbitrarily small.
– All the algorithms previously described have the following property: The part
of the time space diagram containing all information which is needed for
the decision whether to accept or reject an input has the form of a triangle.
Its longest line is a diagonal with some slope n/t(n) (or t(n)/n, depending on
how you look at it) leading from the rightmost input cell to the leftmost one.
Furthermore every cell can know when it has done its job, because afterwards
it only receives the encodings of the quiescent state.
– Therefore the following signal can be implemented easily: It starts at the
rightmost input cell and collects the results of the checks done in the very
first step. It is moved to the left immediately after a cell has transmitted
at least one (encoding of the) quiescent state in an (lr + 1)-cycle. Thus this
signal adds only one step to the overall recognition time.
Since the above algorithm needs lr + 1 steps per r input symbols from Am and
since the rightmost r symbols have to travel approximately n cells far, the total
running time is n · (lr + 1)/r + O(1), i.e. (l + 1/r)n + O(1) as required.
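The resulting time coefficient can be tabulated; this trivial check (our own) shows how l + 1/r approaches l as r grows:

```python
def coeff(l, r):
    # lr + 1 steps per r input symbols gives coefficient (l*r + 1) / r
    return (l * r + 1) / r

assert coeff(3, 1) == 4.0                 # r = 1: the (l + 1)n bound of Lemma 6
assert abs(coeff(3, 100) - 3.01) < 1e-12  # r = 100: within eps = 0.01 of l*n
```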
From Lemmata 9 and 10 one can immediately deduce the following:
Corollary 12. For each integer constant c the set of languages which can be
recognized by 1-bit OCA in time cn is strictly included in the set of languages
which can be recognized by 1-bit OCA in time (c + 2)n.
This has to be contrasted with unlimited OCA, where there is no such infinite
hierarchy within the family of languages which can be recognized in linear time.
One therefore gets the situation depicted in Figure 1.
In the top row one uses the fact that for each i ≥ 1
OCA1(2in) ⊆ OCA1((2i + 1 − ε)n) ⊊ OCA1((2i + 1 + ε)n) ⊆ OCA1((2i + 2)n)
and for each column one has to observe that
OCA1(2in) ⊊ OCA1((2i + 2)n) ⊆ OCA((2i + 2)n) = OCA(2in) .
OCA1(2n) ⊊ OCA1(4n) ⊊ OCA1(6n) ⊊ OCA1(8n) ⊊ · · ·
   ⊊          ⊊          ⊊          ⊊
OCA(2n)  =  OCA(4n)  =  OCA(6n)  =  OCA(8n)  = · · ·

Fig. 1. A hierarchy for 1-bit OCA.
3.3 There Are Small Gaps Everywhere
Finally we will prove now that a small increase of the linear-time complexity
already leads to an increased recognition power not only around (r ± ε)n for
natural numbers r, but for all real numbers r > 1. Since the rational numbers
are dense in R it suffices to prove the result for r ∈ Q.
The basic idea is the following: The number l playing an important role in
the previous sections is something like an “average number of bits needed per
symbol”. What we want to achieve below is an average number r of bits needed
per symbol.
Assume that an arbitrary rational number r > 1 has been fixed, as well as
relatively prime natural numbers x and y < x such that r = x/y. Then
the above is more or less equivalent to saying that one needs x bits for every y
symbols.
Therefore choose the smallest m such that 2^x < m^y and a set M of 2^x − 1
different words from Am^y. Then extend the alphabet and “mark” the first and
last symbols of these words. These markings will only be used in Algorithm 15.
For the sake of simplicity we will ignore them in the following descriptions. In order
to define the languages Lx,y,m,r to be used later, we start with the languages Lm′,r
considered in the previous section, where m′ = 2^x − 1. Denote by gx,y a one-to-one mapping gx,y : Am′ → M, which is extended to vectors of length r by
considering it as a function mapping each r-tuple of symbols from Am′ to a word
of length y of r-tuples of symbols from Am, and extending this further to a
monoid homomorphism in the obvious way. Now choose

    Lx,y,m,r = gx,y(Lm′,r)
Lemma 13. For each ε > 0 there is an r ≥ 1 such that each 1-bit OCA recognizing Lx,y,m,r needs at least time (x/y − ε)n.
It is a routine exercise to adapt the proof of Lemma 9 to the new situation.
Lemma 14. For each ε > 0 and r = 1/ε the language Lx,y,m,r can be recognized
by a 1-bit OCA in time (x/y + ε)n + O(1).
Algorithm 15. Basically the same idea as in Algorithm 11 can be used. Two
modifications are necessary.
The first one is a constant number of steps which have to be carried out in
the very beginning. During these steps each cell collects the information about
the y − 1 input symbols to its right, so that it knows which of the words w ∈ M
is to its right, assuming that it is the left end of one of them. From then on these
marked cells play the role of all cells from Algorithm 11.
The second modification is an additional signal of appropriate speed which
is sent from the right end of the input word. It checks that all the left end and
right end of word markings are indeed distributed equidistantly over the whole
input word. If this is not the case the input is rejected.
As a consequence there is an uncountable set of families of languages ordered
by proper inclusion which is order isomorphic to the real numbers greater than
1 as already claimed at the beginning of this section:
Proof (of Theorem 4). Choose a rational number x/y and an ε > 0 such that
r1 < x/y − ε < x/y + ε < r2. From Lemmata 13 and 14 it follows that there is a
language in OCA1((x/y + ε)n) ∖ OCA1((x/y − ε)n), which is then also a witness
for the properness of the above inclusion.
Therefore the hierarchy depicted in Figure 1 can be generalized to the following,
where r1 and r2 are arbitrary real numbers satisfying 1 < r1 < r2 :
· · · ⊊ OCA1(r1 n) ⊊ · · · ⊊ OCA1(r2 n) ⊊ · · ·
          ⊊                   ⊊
· · · =  OCA(r1 n)  = · · · =  OCA(r2 n)  = · · ·

Fig. 2. The very fine hierarchy for 1-bit OCA.
4 Two-Way CA
For two-way CA (CA for short) with 1-bit communications one has the following
result:
Lemma 16. Each 1-bit CA recognizing Lm needs at least time ((l + 2)/4)n = (1 + l/2)n/2.
Proof. Consider a 1-bit CA C recognizing Lm and an input length n = 2k.
Denote by t the worst case computation time needed by C for inputs of length
n.
Consider the boundary between cells k = n/2 and k + 1, which separates
the two halves of an input. The computations in the left half are completely
determined by the sequence Br of bits received by cell k during the time steps
1, . . . , t − n/2 from the right, and the computations in the right half are completely determined by the sequence Bl of bits received by cell k + 1 during the
time steps 1, . . . , t − n/2 from the left. There are exactly 2^(t−k) bit sequences Bl
and 2^(t−k) bit sequences Br. On the other hand there are m^k = 2^(l·k) left resp. right
halves of inputs of length n.
Assume that 2^(2(t−k)) < 2^(l·k). Then there would exist two different words v1
and v2 of length k resulting in the same bit strings Bl and Br for the inputs
v1 v1 and v2 v2 . Therefore, in cells 1, . . . , k the computation of C for the input
v1 v2 would be the same as for v1 v1 and since C has to accept v1 v1 , it would also
accept v1 v2 . Contradiction.
Therefore 2^(2(t−k)) ≥ 2^(l·k), i.e. t − k ≥ (1/2)l·k, i.e. t ≥ ((l + 2)/4)n.
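This bound can again be checked numerically (helper names ours): the smallest t with 2^(2(t−k)) ≥ 2^(l·k) for k = n/2 is exactly ((l + 2)/4)n when the numbers divide evenly:

```python
from math import ceil

def min_time_ca(n, l):
    # n = 2k; smallest t with 2^(2(t - k)) >= 2^(l * k),
    # i.e. t >= k + l*k/2 = (l + 2) * n / 4
    k = n // 2
    return k + ceil(l * k / 2)

assert min_time_ca(100, 4) == (4 + 2) * 100 // 4   # = 150
```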
On the other hand it is not difficult to construct a CA which shows:
Lemma 17. Lm can be recognized by 1-bit CA in time (l + 2)n/2.
The straightforward construction of shifting the input symbols in both directions,
accumulating comparison results everywhere and using the result of the middle
cell suffices.
Lemmata 16 and 17 immediately give rise to an infinite hierarchy of complexity classes, but the gaps are large. For example one has
Corollary 18.
CA1(n) ⊊ CA1(3n) ⊊ CA1(3^2 n) ⊊ CA1(3^3 n) ⊊ · · ·
In fact the constants can be improved somewhat (using the lemmata above,
ultimately to c^j for any constant c > 2 if j is large enough). On the other hand
it is unfortunately not at all clear how to obtain results which are as sharp as for OCA.
Finally we point to the following relation between communication bounded
OCA and CA. As mentioned above Lm can be recognized by 1-bit CA in time
(l + 2)n/2, but on the other hand it cannot be recognized by 1-bit OCA in time
(l − ε)n. This is a gap of (l − ε)n − (l + 2)n/2 = (l − 2 − 2ε)n/2 which can be
made arbitrarily large! In other words:
Lemma 19. For each constant k > 1 there are languages for which 1-bit CA
can be faster than any 1-bit OCA recognizing it by a factor of k.
Corollary 20. For no constants r > 1 and k > 1 is CA1 (rn) ⊆ OCA1 (krn).
For no constants r > 1 and k > 1 is CA1 (rn) ⊆ OCAk (rn).
Thus in a sense sometimes the ability to communicate in both directions is more
powerful than any bandwidth for communication in only one direction.
5 Conclusion and Outlook
It has been shown that for all real numbers r > 1 and ε > 0 there are problems
which can be solved on 1-bit OCA in time (r + ε)n, but not in time rn. As a
consequence there are problems the solution of which on 1-bit OCA must be
slower than on unlimited OCA by a factor of at least r.
It is therefore interesting, and in some way surprising, that certain problems
which are considered to be nontrivial, e.g. the FSSP, can be solved on 1-bit CA
without any loss of time.
Two-way CA with the ability to communicate 1 bit of information in each
direction are more powerful than one-way CA with the ability to communicate
k bits in one direction. For certain formal languages the latter have to be slower
by a constant factor, and this factor cannot be bounded.
Our current research on communication restricted CA is mainly concerned
with two problem fields. One is the improvement of the results for two-way CA. In
particular we suspect that the lower bound given in Lemma 16 can be improved.
The other is an extension of the definitions to CA with an “average bandwidth”
of z bits, where z > 1 is allowed to be a rational number. We conjecture that
for OCA there is also a dense hierarchy with respect to the bandwidth (while
keeping the time fixed). This is true if one restricts oneself to integers. For
rational numbers there are some additional technical difficulties due to the not
completely straightforward definition of z-bit CA.
Acknowledgements
The author would like to thank Hiroshi Umeo for valuable hints concerning
preliminary versions of this paper and for asking the right questions. Interesting
discussions with Thomas Buchholz and Martin Kutrib during IFIPCA 98 were
also helpful.
References
1. S. N. Cole. Real-time computation by n-dimensional iterative arrays of finite-state
machines. IEEE Transactions on Computers, C-18(4):349–365, 1969.
2. Jacques Mazoyer. A minimal time solution to the firing squad synchronization
problem with only one bit of information exchanged. Technical Report TR 89-03,
Ecole Normale Supérieure de Lyon, Lyon, 1989.
3. Alvy Ray Smith III. Cellular automata complexity trade-offs. Information and
Control, 18:466–482, 1971.
4. Hiroshi Umeo. Cellular algorithms with 1-bit inter-cell communications. In Thomas
Worsch and Roland Vollmar, editors, MFCS’98 Satellite Workshop on Cellular Automata, pages 93–104, 1998.
5. Roland Vollmar and Thomas Worsch. Modelle der Parallelverarbeitung – eine
Einführung. Teubner, Stuttgart, 1995.
From Semantics to Spatial Distribution⋆

Luis R. Sierra Abbate1⋆⋆, Pedro R. D’Argenio2⋆⋆⋆, and Juan V. Echagüe1

1 Instituto de Computación. Universidad de la República. Uruguay
  {echague,sierra}@fing.edu.uy
2 Dept. of Computer Science. University of Twente. The Netherlands
  dargenio@cs.utwente.nl
Abstract. This work studies the notion of locality in the context of
process specification. It relates naturally to other works where information about the localities of a program is obtained from
its description written down in a programming language.
This paper presents a new approach for this problem. In our case, the
information about the system will be given in semantic terms using asynchronous transition systems. Given an asynchronous transition system we
build an algebra of localities whose models are possible implementations
of the known system. We present different results concerning the models
for the algebra of localities. In addition, our approach neatly considers
the relation of localities and non-determinism.
1 Introduction
In the framework of so-called true concurrency, the idea of causality has
been widely studied [13,12,8,7,15]. Localities, an idea somehow orthogonal to
causality, have also become interesting [1,4,5,10,11,9,3]. Causality states which
events are necessary for the execution of a new one, while localities observe in
which way the events are distributed. The two approaches have been shown not
to be equivalent: they differ in a very discriminating point [6,17].
The idea of the work on localities is to state where an event occurs given
the already known structure of a process. Thus, the starting point is a process
written in a clearly defined syntax. For instance, consider the process
    a.c.stop ||c c.b.stop    (1)
where ||c is the CSP parallel composition: there are two processes running together, but they must synchronize in the action c. This process may execute
actions a@ • |∅, b@∅|•, and c@ • |•. The term in the right hand side of the @
indicates the places in which the action on the left side of @ occurs. In particular,
the • shows in which side of the parallel operation the action takes place. Notice
that a and b do not share any locality: a occurs at the left hand side of the
⋆ This work is supported by the CONICYT/BID project 140/94 from Uruguay
⋆⋆ This author is supported by the PEDECIBA program
⋆⋆⋆ This author is supported by the PROGRESS project TES-4999
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 427–436, 2000.
© Springer-Verlag Berlin Heidelberg 2000
428
L.R. Sierra Abbate, P.R. D’Argenio, J.V. Echagüe
parallel composition while b occurs at the right hand side. On the other hand,
the process
    a.c.b.stop    (2)
presents the same sequence of actions a, c, and b, although in this case they
occur exactly in the same place.
Besides, these works on localities have a nagging drawback: in some cases
where nondeterministic choice and parallel composition are involved, localities
for actions do not seem to match our intuition. For instance, in the process
    (a.stop || b.stop) + (c.stop || d.stop)    (3)
we have that a@ • |∅ and d@∅|•. We could think that a and d do not share any
resource, but in a causal-based model they are clearly in conflict: the occurrence
of one of them forbids the occurrence of the other. From a causal point of view
actions a and d must be sharing some locality.
The approach we chose is to deduce the distribution of events from the semantics of a given process. We use asynchronous transition systems [14,16] (ATS
for short) to describe its behavior. Thus, in our case the architecture (i.e., the
syntax) of the process is not known.
Our contribution consists of the statement and exploration of this original
semantic-based approach. For each ATS we define an algebra of localities with
a binary operation ∧ that returns the common places of two events, and a constant 0 meaning “nowhere”. The axioms of this algebra will give the minimal
requirements needed for events to share or not to share some place. The axiomatization does not specify anything if such a statement cannot be deduced
from the behavior. Thus, given the interpretation of the processes (1) and (2),
we may deduce that a and c must have some common place, and we will write
a ∧ c ≠ 0. However, the axiomatization is not going to state whether a ∧ b = 0
or a ∧ b ≠ 0. This will depend on the model chosen for the axiomatization, which
gives the definitive criterion for the distribution of events: our models will be
true implementations of ATS. We will show that our approach detects situations
like the one described in process (3). In this case, we will have an explicit axiom
saying that a and d share some common place, i.e., a ∧ d ≠ 0.
In addition, we discuss different models for the algebra of localities of a given
ATS. These models may be associated to a program whose specification was
given in terms of the original ATS. First we introduce the non-independence
models which consider whether two events are independent in the corresponding
ATS. Then, we define models which take into account whether two events are
adjacent.
Consider two events sharing a locality in a model M for a given ATS. If they
share some locality in every possible model for this ATS, we call M a minimal
sharing model. On the other hand, if two events share a locality in M only when
they share a locality in any other model, then we call M a maximal sharing
model. We show that the models concerning adjacency introduced in this work
hold one of these properties.
The paper is organized as follows. Section 2 recalls the definition of ATS as
well as some notions of graph theory. Section 3 introduces the algebra of localities. Six models for this algebra are presented in Section 4. Finally, conclusions
and future works are given in Section 5.
2 Preliminaries
Asynchronous Transition Systems. Asynchronous transition systems [14,16]
are a generalization of labeled transition systems. In ATSs, transitions are labeled with events, and each event represents a particular occurrence of an action.
In addition, ATSs incorporate the idea of independent events. Two independent
events can be executed in parallel, and so they cannot have resources in common.
Formally, we define:
Definition 1. Let A = {α, β, γ, . . .} be a set of actions. An asynchronous transition system is a structure T = (S, E, I, −→, ℓ) where

– S = {s, t, s′, . . .} is a set of states and E = {a, b, c, . . .} is a set of events;
– I ⊆ E × E is an irreflexive and symmetric relation of independence. We
write aIb instead of (a, b) ∈ I;
– −→ ⊆ S × E × S is the transition relation. We write s −a→ s′ instead of
(s, a, s′) ∈ −→;
– ℓ : E → A is the labeling function.

In addition, T has to satisfy the following axioms:

Determinism:        s −a→ s′ ∧ s −a→ s′′ =⇒ s′ = s′′
Forward stability:  aIb ∧ s −a→ s′ ∧ s −b→ s′′ =⇒ ∃t ∈ S. s′ −b→ t ∧ s′′ −a→ t
Commutativity:      aIb ∧ s −a→ s′ ∧ s′ −b→ t =⇒ ∃s′′ ∈ S. s −b→ s′′ ∧ s′′ −a→ t  ⊓⊔
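For a finite ATS the three axioms are directly mechanizable. The following checker is our own sketch (an ATS is given as explicit sets; names are ours):

```python
def check_ats(states, events, indep, trans):
    """Verify determinism, forward stability and commutativity.

    trans: set of (s, a, s') triples; indep: symmetric, irreflexive
    set of event pairs.  Returns True or raises AssertionError.
    """
    step = {}
    for s, a, s2 in trans:
        assert step.setdefault((s, a), s2) == s2, "determinism"
    succ = lambda s, a: step.get((s, a))
    for a, b in indep:
        assert a != b, "irreflexivity"
        assert (b, a) in indep, "symmetry"
        for s in states:
            s1, s2 = succ(s, a), succ(s, b)
            if s1 is not None and s2 is not None:           # forward stability
                assert succ(s1, b) is not None and succ(s1, b) == succ(s2, a)
            if s1 is not None and succ(s1, b) is not None:  # commutativity
                assert s2 is not None and succ(s2, a) == succ(s1, b)
    return True

# the diamond s0 -a-> s1 -b-> s3, s0 -b-> s2 -a-> s3 with aIb
trans = {(0, 'a', 1), (1, 'b', 3), (0, 'b', 2), (2, 'a', 3)}
assert check_ats({0, 1, 2, 3}, {'a', 'b'}, {('a', 'b'), ('b', 'a')}, trans)
```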
Example 1. In the Introduction we have mentioned a couple of examples. We
are going to use them as running examples. To simplify notation, we use the
same name for events and actions.
We can represent both a.c.b.stop and a.c.stop ||c c.b.stop by the ATS in
Figure 1. Notice that for the second process, we could have aIb although that is
not actually relevant. However, it is important to notice that ¬(aIc) and ¬(bIc)
in both cases.
The ATS for process (a.stop || b.stop) + (c.stop || d.stop) is depicted in
Figure 2. Notice that aIb and cId while any other pair of events is not independent. Shadowing is used to show the independence relation between events. ⊓⊔
Graphs. A graph G consists of a finite set V of vertices together with a set X
of unordered pairs of distinct vertices of V. The elements of X are the edges of
G. We will note {v, w} ∈ X as vw. We will write (V, X) for the graph G. Two
vertices v and w are adjacent in G if vw ∈ X. Two edges e and f are adjacent
if e ∩ f ≠ ∅.
Fig. 1. The ATS for a.c.b.stop and a.c.stop ||c c.b.stop

Fig. 2. The ATS for (a.stop || b.stop) + (c.stop || d.stop)
Definition 2 (Subgraphs). We call H = (V′, X′) a subgraph of G = (V, X),
and note H ⊆ G, whenever V′ ⊆ V and X′ ⊆ X. We write ℘G for the set of
all subgraphs of G. We write ℘℘G for the power set of ℘G. ⊓⊔
A clique of a graph G is a maximal complete subgraph of G. As a complete
graph is defined by its vertices, we will identify a clique with its corresponding
set of vertices. We write K(G) for the set of cliques of the graph G.
Lemma 1. Let v and w be two vertices of G = (V, X). Then, vw ∈ X iff there
exists a clique K ∈ K(G) such that vw ∈ X(K).
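Cliques of the small graphs used below can be enumerated by brute force; the sketch (our helper) identifies each clique with its vertex set, and the last assertion illustrates Lemma 1:

```python
from itertools import combinations

def cliques(vertices, edges):
    # maximal complete subgraphs of G = (V, X), as vertex sets
    E = {frozenset(e) for e in edges}
    complete = [set(c) for k in range(1, len(vertices) + 1)
                for c in combinations(sorted(vertices), k)
                if all(frozenset(p) in E for p in combinations(c, 2))]
    return [c for c in complete if not any(c < d for d in complete)]

K = cliques({'a', 'b', 'c', 'd'},
            [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')])
assert K == [{'a', 'c'}, {'a', 'd'}, {'b', 'c'}, {'b', 'd'}]
# Lemma 1: the edge ac lies inside a clique, here {a, c}
assert any({'a', 'c'} <= k for k in K)
```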
3 The Algebra of Localities
In this section we explain how to obtain an algebra of localities from a given
ATS. The algebra of localities is constructed over a semilattice by adding some
particular axioms for each ATS.
Definition 3. A semilattice is a structure (L, ∧, 0) where ∧ : L × L → L and
0 ∈ L satisfy the following axioms:

a ∧ b = b ∧ a              (commutativity)
a ∧ a = a                  (idempotence)
a ∧ (b ∧ c) = (a ∧ b) ∧ c  (associativity)
a ∧ 0 = 0                  (absorption)  ⊓⊔
Each element in the set L refers to a set of “places”. In particular, 0 means
“nowhere”. The operation ∧ gives the “common places” between the operands.
The axioms make sense under this new nomenclature. Commutativity says that
the common places of a and b are the same as the common places of b and a.
Associativity says that the common places of a, b, and c are always the same,
regardless of whether we first consider the common places of a and b, or the common places
of b and c. According to idempotency, the common places of a and itself are
again the places of a. Finally, absorption says that any element of L has no
common place with nowhere.
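These axioms are exactly those satisfied by a powerset under intersection, which is also the shape all models in Section 4 take. A quick mechanical check (our own helper names):

```python
from itertools import product

def is_semilattice(L, meet, zero):
    return (all(meet(a, b) == meet(b, a) for a, b in product(L, repeat=2))
        and all(meet(a, a) == a for a in L)
        and all(meet(a, meet(b, c)) == meet(meet(a, b), c)
                for a, b, c in product(L, repeat=3))
        and all(meet(a, zero) == zero for a in L))

# the powerset of {x, y} under intersection, with 0 the empty set
P = [frozenset(s) for s in ([], ['x'], ['y'], ['x', 'y'])]
assert is_semilattice(P, lambda a, b: a & b, frozenset())
```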
Now we introduce the concept of adjacent events. Two events are adjacent
if they label two consecutive transitions, or two outgoing transitions from the
same state.
Definition 4. Let T = (S, E, I, −→, ℓ) be an ATS. Two events a, b ∈ E are
adjacent in T, notation adj(a, b), if and only if there exist s, s′, s′′ ∈ S such that

    s −a→ s′ −b→ s′′   or   s −b→ s′ −a→ s′′   or   s −a→ s′ and s −b→ s′′  ⊓⊔
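Adjacency is decidable by a scan over the transition relation; a direct transcription of Definition 4 (helper names ours):

```python
def adj(trans, a, b):
    # trans: set of (s, e, s') triples
    for s, e, s1 in trans:
        for t, f, t1 in trans:
            if {e, f} == {a, b}:
                if s1 == t:            # two consecutive transitions
                    return True
                if s == t and e != f:  # two transitions leaving one state
                    return True
    return False

# the chain s0 -a-> s1 -c-> s2 -b-> s3 (as in Figure 1)
trans = {(0, 'a', 1), (1, 'c', 2), (2, 'b', 3)}
assert adj(trans, 'a', 'c') and adj(trans, 'c', 'b')
assert not adj(trans, 'a', 'b')
```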
We are interested in the independence relation between adjacent events. When
two events are not adjacent, an observer cannot tell whether they are
independent. For instance, in the ATS of Figure 1 it is not relevant whether a
and b are independent, since that does not affect the overall behavior.
The carrier set of the algebra of localities associated to an ATS includes an
appropriate interpretation of its events. Such an interpretation refers to “the
places where an event happens”.
Definition 5. Let T = (S, E, I, −→, ℓ) be an ATS. The algebra of localities
associated to T is a structure A = (L, E, ∧, 0) satisfying:

1. E ⊆ L, and (L, ∧, 0) is a semilattice
2. aIb and adj(a, b) =⇒ a ∧ b = 0
3. ¬(aIb) and adj(a, b) =⇒ a ∧ b ≠ 0  ⊓⊔
Example 2. For the ATS of Figure 1 we obtain the following axioms:

    a ∧ c ≠ 0    c ∧ b ≠ 0

Notice that the axiom system does not say whether a ∧ b ≠ 0 or a ∧ b = 0.
Thus, the algebra does not contradict the decision of implementing the ATS
either with process a.c.b.stop, in which a and b occur in the same place, or with
a.c.stop ||c c.b.stop, in which a and b occur in different places.
For the ATS of Figure 2 we obtain the following axioms:
    a ∧ b = 0    c ∧ d = 0
    a ∧ c ≠ 0    a ∧ d ≠ 0    b ∧ c ≠ 0    b ∧ d ≠ 0
Notice that the axioms state that a and d must share some places. On the other
hand, as we already said, other approaches to localities cannot identify such a
conflict. ⊓⊔
4 Models for the Algebra of Localities
In this section we introduce several models for the algebra of localities associated
to a given ATS, thus proving its soundness. Each of our models may be an
implementation. The interpretation for the events will be based on the relations
of independence and adjacency. The names of the models are taken from these
basic relations.
Fig. 3. I models for a.c.b.stop and a.c.stop ||c c.b.stop

Fig. 4. I model for (a.stop || b.stop) + (c.stop || d.stop)
The I Models. The non-independence models (I models for short) for the algebra of localities associated to a given ATS assign common places to non-independent events. We define the non-independence models I and I2, based on
cliques and edges respectively.
Let T = (S, E, I, −→, ℓ) be an ATS. We define the graph GI = (E, {{a, b} ⊆
E | ¬(aIb)}). We define the interpretation of an event a in the model I (I2) to
be the set of cliques (edges) in GI in which a appears.
    [[a]]I =def {A ∈ K(GI) | a ∈ A}    ([[a]]I2 =def {A ∈ X(GI) | a ∈ A})
Each set A ∈ [[a]]I is a different place where a may happen: each place is
identified with the set of all events that can happen there. Moreover, an event
can happen in several places simultaneously. The operation ∧ of the algebra of
localities is interpreted as the intersection ∩ between sets, and the constant 0 is
interpreted as the empty set ∅.
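The I interpretation can be computed mechanically. The sketch below (our own helper; cliques are found by brute force) reproduces the interpretations listed in Example 3 for the ATS of Figure 2:

```python
from itertools import combinations

def model_I(events, indep):
    # G_I joins non-independent events; [[a]]_I = cliques of G_I containing a
    edges = {frozenset(p) for p in combinations(events, 2)
             if p not in indep and p[::-1] not in indep}
    complete = [set(c) for k in range(1, len(events) + 1)
                for c in combinations(events, k)
                if all(frozenset(p) in edges for p in combinations(c, 2))]
    maximal = [c for c in complete if not any(c < d for d in complete)]
    return {a: [c for c in maximal if a in c] for a in events}

# the ATS of Figure 2: aIb and cId, all other pairs non-independent
m = model_I(['a', 'b', 'c', 'd'], {('a', 'b'), ('c', 'd')})
assert m['a'] == [{'a', 'c'}, {'a', 'd'}]
assert m['c'] == [{'a', 'c'}, {'b', 'c'}]
# interpreting the meet as intersection: a and d share the place {a, d}
assert [A for A in m['a'] if A in m['d']] == [{'a', 'd'}]
```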
Example 3. For the ATS in Figure 1 with aIb, we obtain the graph GI on the
left of Figure 3. This implementation uses two places or localities. One of them is
shared by a and c, and the other by b and c. So, this model is well suited for the
implementation a.c.stop ||c c.b.stop. In this case, both I and I2 interpretations
coincide. These could be written down as
[[a]]I = {{a, c}}
[[b]]I = {{b, c}}
[[c]]I = {{a, c}, {b, c}}
We have a new interpretation in case a and b are not independent. We can
see it on the right of the Figure 3. Now, every event occurs in the same place.
In other words, if ¬(aIb), the I model implements a.c.b.stop.
[[a]]I = [[b]]I = [[c]]I = {{a, b, c}}
A different interpretation is established for model I2. In this case, we have
[[a]]I2 = {{a, c}, {a, b}}
[[b]]I2 = {{b, c}, {a, b}}
[[c]]I2 = {{a, c}, {b, c}}
This model implements the program a.b.stop || a.c.stop || c.b.stop that uses
three localities.
For the ATS of Figure 2 we have GI depicted in Figure 4. The execution of
a requires two places, one shared with c and the other with d. Thus, the event
a prevents the execution of d by occupying a place required by this event. This
reflects the fact that selection between non independent events occurs actually
in a place. For this implementation, we have
[[a]]I = {{a, c}, {a, d}}
[[c]]I = {{a, c}, {b, c}}
[[b]]I = {{b, c}, {b, d}}
[[d ]]I = {{a, d}, {b, d}}
⊓⊔
Now we prove that non-independence models are indeed models for the algebra of localities.
Theorem 1 (Soundness). Let T = (S, E, I, −→, ℓ) be an ATS, A its algebra of
localities, and [[E]]I(I2) = {[[a]]I(I2) | a ∈ E}. Then,

    MI =def (℘℘GI, [[E]]I, ∩, ∅)   and   MI2 =def (℘℘GI, [[E]]I2, ∩, ∅)

are models for A.
Proof. By definition, [[E]]I2 ⊆ ℘℘GI. Moreover, (℘℘GI, ∩, ∅) is a well-known
semilattice.
Suppose that aIb and adj(a, b). Then a and b are not adjacent in GI, and so
no edge of GI contains both. Thus, [[a]]I2 ∩ [[b]]I2 = ∅.
Finally, suppose that ¬(aIb) and adj(a, b). Then ab ∈ X(GI), and hence
[[a]]I2 ∩ [[b]]I2 ≠ ∅.
The proof for model I is similar, taking into account Lemma 1. ⊓⊔
We can see that, although localities may change, the relation between these
two models remains substantially unchanged. More explicitly, two events sharing
resources in one of these models will share resources in the other.
Theorem 2. MI |= a ∧ b ≠ 0 if and only if MI2 |= a ∧ b ≠ 0
Minimal Sharing Models: IJ and IJ2. In the models IJ and IJ2 we assign
common places to events that are both adjacent and non-independent. We will
show they are minimal sharing in the following sense: whenever two events share
a place in these models, they share a place in any other model.
Let T = (S, E, I,−→, ℓ) be an ATS. Taking adjacent events into account we
define the graph GIJ = (E, {{a, b} ⊆ E | ¬(aIb) and adj(a, b)}). As before, we
define the interpretation of an event a to be the set of cliques or edges in GIJ
where a appears.
    [[a]]IJ =def {A ∈ K(GIJ) | a ∈ A}    [[a]]IJ2 =def {A ∈ X(GIJ) | a ∈ A}
Theorem 3 (Soundness). Let T = (S, E, I, −→, ℓ) be an ATS and let A be its
algebra of localities. Then,

    MIJ =def (℘℘GIJ, [[E]]IJ, ∩, ∅)   and   MIJ2 =def (℘℘GIJ, [[E]]IJ2, ∩, ∅)

are models for A.
Theorem 4. MIJ |= a ∧ b ≠ 0 if and only if MIJ2 |= a ∧ b ≠ 0
These models enjoy the following property: if two events are distributed (i.e.,
do not share a place) in some model for the algebra of localities of a given ATS,
they are also distributed in these models. This justifies calling them minimal sharing models. The following theorem states the contrapositive of that property.
Theorem 5. Let T = (S, E, I,−→, ℓ) be an ATS and let A be its algebra of
localities. Let M be any model for A. Then, for all events a, b ∈ E,
MIJ2 |= a ∧ b ≠ 0 =⇒ M |= a ∧ b ≠ 0
Proof. Suppose MIJ2 |= a ∧ b ≠ 0, that is, [[a]]IJ2 ∩ [[b]]IJ2 ≠ ∅. Thus ab ∈ X(GIJ ), which implies ¬(aIb) and adj(a, b). So, by Definition 5, A ⊢ a ∧ b ≠ 0. Hence, for any model M of A, M |= a ∧ b ≠ 0. ⊓⊔
An easy application of Theorem 4 gives us this corollary:
Corollary 1. MIJ is a minimal sharing model. ⊓⊔
Maximal Sharing Models: InJ and InJ2. In a similar way we construct a model of maximal sharing. In this case, two events share places unless they must execute independently. We call them InJ models because they may require non-adjacency.
Let T = (S, E, I,−→, ℓ) be an ATS. We define the graph GInJ = (E, {{a, b} ⊆
E | ¬ ( aIb and adj(a, b) )}). We define the interpretation of an event a to be
the set of cliques or edges in GInJ where a appears.
[[a]]InJ =def {A ∈ K(GInJ ) | a ∈ A}
[[a]]InJ2 =def {A ∈ X(GInJ ) | a ∈ A}
Theorem 6 (Soundness). Let T = (S, E, I,−→, ℓ) be an ATS and let A be its
algebra of localities. Then,
MInJ =def ⟨℘℘ GInJ , [[E ]]InJ , ∩, ∅⟩ and MInJ2 =def ⟨℘℘ GInJ , [[E ]]InJ2 , ∩, ∅⟩
are models for A.
Theorem 7. MInJ2 |= a ∧ b ≠ 0 if and only if MInJ |= a ∧ b ≠ 0.
This model describes maximal sharing in the sense that if two events are
distributed in it, they are distributed in any other model. The following theorem states this property for the InJ models.
Theorem 8. Let T = (S, E, I,−→, ℓ) be an ATS and let A be its algebra of
localities. Let M be any model for A. Then, for all events a, b ∈ E,
MInJ2 |= a ∧ b = 0 =⇒ M |= a ∧ b = 0
From Semantics to Spatial Distribution
Corollary 2. MInJ is a maximal sharing model.
⊓⊔
Example 4. We can see on the right of Figure 3 the graph GInJ for the ATS of Figure 1, no matter whether a and b are independent. Thus, we obtain the following interpretation in the maximal sharing model:
[[a]]InJ = [[b]]InJ = [[c]]InJ = {{a, b, c}}
Thus, MInJ |= a ∧ b ≠ 0. However, from Example 3 we know that when aIb, MI |= a ∧ b = 0. So, I models are not maximal sharing models.
For the same ATS, when a and b are not independent, MI |= a ∧ b ≠ 0 and MIJ |= a ∧ b = 0. Thus, I models are not minimal sharing models either. ⊓⊔
5 Conclusions
In this work we have exploited the information about localities hidden in the ATS definition. Such information helps us to find implementations of systems with certain properties, like maximal or minimal sharing of localities.
The way to state how the localities of events are related is by means of the algebra of localities. We have introduced several models for this algebra and shown that this is not a trivial set of models. Figure 5 summarizes our results in Section 4: the up-going arrows in the picture mean that sharing in the lower models implies sharing in the upper models.
(Fig. 5. Models of localities: the model pairs InJ/InJ2, I/I2 and IJ/IJ2, arranged in a hierarchy connected by up-going arrows.)
We have also shown that our semantic approach clearly exposes difficulties that arise in syntactic, language-oriented approaches when dealing with nondeterministic choices.
As a consequence of this work we can extract locality information from a specification written in terms of an ATS. Hence, the ATS formalism appears to be a good candidate to become a theoretical assembler for distributed programming. There are at least three interesting directions in which to continue this work. One of them is to gain a deeper understanding of locality models. The nature of the hierarchy of models seems far from trivial, requiring more detailed studies of its structure. We believe that research in this direction will allow us to detect not only minimal sharing models, but also models with some constraints which require fewer localities to work.
We may develop the same strategy for other semantic formalisms, that is, associate an algebra of localities and obtain a model as before. Event structures [12], from which the notion of independence can be easily derived, would be a good candidate to study.
Another direction for future work would be to extend ATS with new characteristics. Time is a natural factor to consider in these extensions, since resources are used by events during certain periods of time. A relation between a yet-to-be-defined timed ATS and timed graphs [2] would enable us to move into timed systems, where tools and methods for automatic verification have been developed.
Another way to continue our work is the development of a toolkit for the description of systems based on ATS. We believe that semantic studies in programming must come together with software development, and so the implementation of good toolkits for both theoretical and practical developments will become more important in the future.
References
1. L. Aceto. A static view of localities. Formal Aspects of Computing, 1994.
2. R. Alur and D. L. Dill. A theory of timed automata. Theoretical Computer Science,
126, 1994.
3. R. M. Amadio. An asynchronous model of locality, failure, and process mobility.
Technical Report 3109, INRIA Sophia Antipolis, Feb. 1997.
4. G. Boudol, I. Castellani, M. Hennessy, and A. Kiehn. Observing localities. Theoretical Computer Science, 114:31–61, 1993.
5. G. Boudol, I. Castellani, M. Hennessy, and A. Kiehn. A theory of processes with
localities. Formal Aspects of Computing, 6(2):165–200, 1994.
6. I. Castellani. Observing distribution in processes: static and dynamic localities.
Int. Journal of Foundations of Computer Science, 1995.
7. P. Degano, R. D. Nicola, and U. Montanari. Partial orderings descriptions and
observations of nondeterministic concurrent processes. In REX School and Workshop on Linear Time, Branching Time and Partial Order in Logics and Models for
Concurrency, 1989.
8. R. v. Glabbeek. Comparative Concurrency Semantics and Refinement of Actions.
PhD thesis, Free University, Amsterdam, 1990.
9. U. Montanari, M. Pistore, and D. Yankelevich. Efficient minimization up to location equivalence. In Programming Languages and Systems – ESOP’96, 1996.
10. U. Montanari and D. Yankelevich. A parametric approach to localities. In Proceedings 19th ICALP, Vienna, 1992.
11. U. Montanari and D. Yankelevich. Location equivalence in a parametric setting.
Theoretical Computer Science, 149:299–332, 1995.
12. M. Nielsen, G. Plotkin, and G. Winskel. Petri nets, event structures and domains,
part I. Theoretical Computer Science, 13(1):85–108, 1981.
13. W. Reisig. Petri nets – an introduction. EATCS Monographs on Theoretical
Computer Science, Volume 4. Springer-Verlag, 1985.
14. M. Shields. Deterministic asynchronous automata. In Formal Methods in Programming. North-Holland, 1985.
15. W. Vogler. Bisimulation and action refinement. Theoretical Computer Science,
114:173–200, 1993.
16. G. Winskel and M. Nielsen. Models for concurrency. Technical Report DAIMI
PB-492, Comp. Sci. Dept., Aarhus Univ., Nov. 1992.
17. D. Yankelevich. Parametric views of process description languages. PhD thesis,
University of Pisa, 1993.
On the Expressivity and Complexity of Quantitative
Branching-Time Temporal Logics
F. Laroussinie, Ph. Schnoebelen, and M. Turuani
Lab. Spécification & Vérification
ENS de Cachan & CNRS UMR 8643
61, av. Pdt. Wilson, 94235 Cachan Cedex France
email: {fl,phs,turuani}@lsv.ens-cachan.fr
Abstract. We investigate extensions of CTL that allow expressing quantitative requirements about an abstract notion of time in a simple discrete-time framework, and study the expressive power of several relevant logics.
When only subscripted modalities are used, polynomial-time model checking is
possible even for the largest logic we consider, while introducing freeze quantifiers
leads to a complexity blow-up.
1 Introduction
Temporal logic is widely used as a formal language for specifying the behaviour of
reactive systems (see [7]). This approach allows model checking, i.e. the automatic
verification that a finite-state system satisfies its expected behavioural specifications.
The main limitation to model checking is the state-explosion problem but, in practice,
symbolic model checking techniques [5] have been impressively successful, and model
checking is now commonly used in the design of critical reactive systems.
Real-time. While temporal logics only deal with “before and after” properties, real-time temporal logics and more generally quantitative temporal logics aim at expressing
quantitative properties of the time elapsed during computations. Popular real-time logics
are based on timed transition systems and appear in several tools (e.g., HyTech, Uppaal,
Kronos). The main drawback is that model checking is expensive [2,4].
Efficient model checking. By contrast, some real-time temporal logics retain usual discrete Kripke structures as models and allow referring to quantitative information with “bounded” modalities such as “AF≤10 A”, meaning that A will inevitably occur in at
most 10 steps. A specific aspect of this framework is that the underlying Kripke structures
have no inherent concept of time. It is the designer of the Kripke structure who decides
to encode the flow of elapsing time by this or that event, so that the temporal logics in use
are more properly called quantitative temporal logics than real-time logics. [8] showed
that RTCTL (i.e. CTL plus bounded modalities “A U≤k ” and “E U≤k ” in the Kripke
structure framework) still enjoys the bilinear model checking time complexity of CTL.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 437–446, 2000.
c Springer-Verlag Berlin Heidelberg 2000
Our contribution. One important question is how far one can go along the lines of RTCTL-like logics while still allowing efficient model checking. Here we study two
quantitative extensions of CTL, investigate their expressive power and evaluate the
complexity of model checking.
The first extension, called TCTLs , s for “subscripts”, is basically the most general
logic along the lines of the RTCTL proposal: it allows combining “≤ k”, “≥ k” and
“= k” (so that modalities counting w.r.t. intervals are possible). We show this brings real
improvements in expressive power, and model checking is still in polynomial time. This
extends the results for RTCTL beyond the increased expressivity: we use a finer measure for the size of formulae (EF=k has size O(log k), not O(k)) and do not require that one step use one unit of time.
The second extension, called TCTLc , c for “clocks”, uses formula clocks, a.k.a.
freeze quantifiers [3], and is a more general way of counting events. TCTLc can still be
translated directly into CTL but model checking is expensive.
The results on expressive power formalize natural intuitions which (as far as we know) have never been proven formally, even in the dense-time framework.¹ Furthermore, in our discrete-time framework our results on expressive power must be stated in terms of how succinctly one logic can express this or that property. Such proofs are scarce in the literature (one example is [13]).
Related work. TCTLs and TCTLc are similar to (and inspired from) logics used in
dense real-time frameworks (though, in the discrete framework we use here, their behaviour is quite different). Our results on complexity of model checking build on ideas
from [6,11,4,10].
Other branching-time extensions of RTCTL have been considered. Counting with
regular patterns makes model checking intractable [9]. Merging different time scales
makes model checking NP-complete [10]. Allowing parameters makes model checking
exponential in the number of parameters [10].
Another extension with freeze variables can be found in [14] where richer constraints
on number of occurrences of events can be stated (rendering satisfiability undecidable).
On the other hand, the “until” modality is not included and the expressive power of
different kinds of constraints is not investigated.
Plan of the paper. We introduce the basic notions and definitions in § 2. We discuss
expressive power in § 3 and model checking in § 4. We assume the reader is familiar with
standard notions of branching-time temporal logic (see [7]) and structural complexity
(see [12]). Complete proofs appear in a full version of the paper, available from the
authors.
2 CTL + Discrete Time
We write N for the set of natural numbers, and AP = {A, B, . . .} for a finite set of
atomic propositions. Temporal formulae are interpreted over states in Kripke structures.
Formally,
¹ See e.g. the conjecture at the end of [1] which becomes an unproved statement in [2].
Definition 2.1. A Kripke structure (a “KS”) is a tuple S = ⟨QS , RS , lS ⟩ where QS = {q1 , . . .} is a non-empty set of states, RS ⊆ QS × QS is a total transition relation, and lS : QS → 2AP labels every state with the propositions it satisfies.
Below, we drop the “S” subscript in our notations whenever no ambiguity arises. A computation in a KS is an infinite sequence π of the form q0 q1 . . . s.t. (qi , qi+1 ) ∈ R for all i ∈ N. For i ∈ N, π(i) (resp. π|i ) denotes the i-th state qi (resp. the i-th prefix q0 q1 . . . qi ). We write Π(q) for the set of all computations starting from q. Since R is
total, Π(q) is never empty.
The flow of time. We assume a special atomic proposition tick ∈ AP that describes the
elapsing of time in the model. The intuition is that states labeled by tick are states where
we observe that time has just elapsed, that the clock just ticked. Equivalently, we can
see all transitions as taking 1 time unit if they reach a state labeled by tick, and as being instantaneous otherwise.² In pictures, we use different grey levels to distinguish tick states from non-tick ones.
Given a computation π = q0 q1 . . . and i ≥ 0, Time(π|i ) denotes |{j | 0 < j ≤ i ∧ tick ∈ l(qj )}|, the time it took to reach qi from q0 along π.
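As a small illustration, Time(π|i) just counts the tick-states among q1 , . . . , qi (the first state of the prefix does not count). A minimal sketch; the state numbering and labelling function below are our own invention:

```python
def elapsed_time(prefix, label):
    """Time(pi|i): count the tick-states q_j with 0 < j <= i along the prefix."""
    return sum(1 for q in prefix[1:] if 'tick' in label(q))

# Hypothetical labelling: states 1 and 3 are tick-states.
label = lambda q: {'tick'} if q in (1, 3) else set()
print(elapsed_time([0, 1, 2, 3], label))  # 2: positions 1 and 3 tick
print(elapsed_time([0], label))           # 0: the start state never counts
```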
2.1 TCTLs
Syntax. TCTLs formulae are given by the following grammar:
ϕ, ψ ::= ¬ϕ | ϕ ∧ ψ | EXϕ | EϕUI ψ | AϕUI ψ | A | B | . . .
where I can be any finite union [a1 , b1 [ ∪ · · · ∪ [an , bn [ of disjoint integer intervals with 0 ≤ a1 < b1 < a2 < b2 < · · · < an < bn ≤ ω.
Standard abbreviations include ⊤, ⊥, ϕ ∨ ψ, ϕ ⇒ ψ, . . . as well as EFI ϕ (for
E⊤UI ϕ), AFI ϕ (for A⊤UI ϕ), EGI ϕ (for ¬AFI ¬ϕ), and AGI ϕ (for ¬EFI ¬ϕ).
Moreover we let U<k stand for U[0,k[ , U>k for U[k+1,ω[ , and U=k for U[k,k+1[ .
The usual CTL operators are included since the usual U corresponds to U<ω .
Semantics. Figure 1 defines when a state q in some KS S, satisfies a TCTLs formula
ϕ, written q |= ϕ, by induction over the structure of ϕ.
We let TCTLs [<], TCTLs [<, =], etc. denote the fragments of TCTLs where only
simple constraints using only < (resp. < or =, etc.) are allowed. E.g., RTCTL is
TCTLs [<] (with the proviso that our KS’s have tick’s).
2.2 TCTLc
TCTLc uses freeze quantifiers [3]. Here “clocks” are introduced in the formula, set to
zero when they are bound, and can be referenced “later” in arbitrary ways. This standard
construct gives more flexibility than subscripts.
² Thus KS’s with tick’s can be seen as discrete timed structures, i.e. KS’s where edges (q, q ′ ) ∈ R are labeled by a natural number: the time it takes to follow the edge. While discrete timed structures are more natural, KS’s with tick are an essentially equivalent framework where technicalities are simpler since they do not need labels on the edges.
q |= A        iff A ∈ l(q),
q |= ¬ϕ       iff q ⊭ ϕ,
q |= ϕ ∧ ψ    iff q |= ϕ and q |= ψ,
q |= EXϕ      iff there exists π ∈ Π(q) s.t. π |= Xϕ,
q |= EϕUI ψ   iff there exists π ∈ Π(q) s.t. π |= ϕUI ψ,
q |= AϕUI ψ   iff for all π ∈ Π(q), we have π |= ϕUI ψ,
π |= Xϕ       iff π(1) |= ϕ,
π |= ϕUI ψ    iff there exists i ≥ 0 s.t. Time(π|i ) ∈ I and π(i) |= ψ and π(j) |= ϕ for all 0 ≤ j < i.

Fig. 1. Semantics of TCTLs
Syntax. For a set Cl = {x, y, . . .} of clocks, TCTLc formulae are given by the following
grammar:
ϕ, ψ ::= ¬ϕ | ϕ ∧ ψ | EXϕ | EϕUψ | AϕUψ | x in ϕ | x ∼ k | A | B | . . .
where ∼ ∈ {=, ≤, <, ≥, >} and k ∈ N. Constraints referring to clocks are restricted to the simple form x ∼ k, in the spirit of TCTLs .
An occurrence of a formula clock x in some x ∼ k is bound if it is in the scope of an “x in ” freeze quantifier; otherwise it is free. A formula is closed if it has no free variables. Only closed formulae express properties of states in KS’s.
Semantics. TCTLc formulae are interpreted over a state of a KS S together with a
valuation v : Cl → N of the clocks free in ϕ.
q, v |= A        iff A ∈ l(q),
q, v |= ¬ϕ       iff q, v ⊭ ϕ,
q, v |= ϕ ∧ ψ    iff q, v |= ϕ and q, v |= ψ,
q, v |= EXϕ      iff there exists π ∈ Π(q) s.t. π, v |= Xϕ,
q, v |= EϕUψ     iff there exists π ∈ Π(q) s.t. π, v |= ϕUψ,
q, v |= AϕUψ     iff for all π ∈ Π(q), we have π, v |= ϕUψ,
q, v |= x in ϕ   iff q, v[x ← 0] |= ϕ,
q, v |= x ∼ k    iff v(x) ∼ k,
π, v |= Xϕ       iff π(1), v + d |= ϕ with d = Time(π|1 ),
π, v |= ϕUψ      iff there exists i ≥ 0 s.t. π(i), v + di |= ψ and π(j), v + dj |= ϕ for all 0 ≤ j < i (where dl =def Time(π|l )).

Fig. 2. Semantics of TCTLc
Figure 2 defines when q, v |= ϕ in some KS S by induction over the structure of ϕ.
For m ∈ N, v + m denotes the valuation which maps each clock x ∈ Cl to the value
v(x)+m, and v[x ← 0] is v where now x evaluates to 0.
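The three operations on valuations used here — v + m, v[x ← 0], and the ≡m relation of Lemma 2.2 below — are easy to sketch directly. The clock names and values are hypothetical:

```python
def advance(v, m):
    """v + m: advance every clock by m time units."""
    return {x: t + m for x, t in v.items()}

def reset(v, x):
    """v[x <- 0]: the effect of the freeze quantifier on the valuation."""
    return {**v, x: 0}

def equiv(v, w, m):
    """v ==_m w: each clock agrees in v and w, or is beyond m in both."""
    return all(v[x] == w[x] or (v[x] > m and w[x] > m) for x in v)

v = {'x': 3, 'y': 7}
w = {'x': 3, 'y': 9}
print(equiv(v, w, 5))                          # True: both y-values exceed 5
print(equiv(advance(v, 1), advance(w, 1), 5))  # True, as used in Lemma 2.2
print(equiv(reset(v, 'y'), reset(w, 'y'), 5))  # True
print(equiv(v, w, 8))                          # False: y differs below 8
```

The last two checks mirror exactly the closure facts used in the proof of Lemma 2.2: ≡m is preserved by advancing and by resetting clocks.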
Clearly the TCTLs operators can be defined with TCTLc operators:
EϕUI ψ =def x in EϕU(I(x) ∧ ψ)    AϕUI ψ =def x in AϕU(I(x) ∧ ψ)
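For instance (a worked example of this translation, ours rather than the paper's), with I = [3, 6[ the constraint I(x) is (3 ≤ x) ∧ (x < 6), so:

```latex
\mathsf{EF}_{[3,6[}\,A \;=\; \mathsf{E}\,\top\,\mathsf{U}_{[3,6[}\,A
\;\equiv\; x \text{ in } \mathsf{E}\,\top\,\mathsf{U}\bigl((3 \le x) \wedge (x < 6) \wedge A\bigr)
```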
where, for I of the form [a1 , b1 [ ∪ · · · ∪ [an , bn [, I(x) denotes the clock constraint ∨i=1...n ((ai ≤ x) ∧ (x < bi )). Hence TCTLs can be seen as a fragment of TCTLc where only one formula clock is allowed (and used in restricted ways).
A standard observation for logics such as TCTLc is that the actual values recorded in v are only relevant up to a certain point depending on the formula at hand. Let Mϕ denote the largest constant appearing in ϕ (the largest k in the “x ∼ k”’s) and, for m ∈ N, let v ≡m v ′ when for any x ∈ Cl, either v(x) = v ′ (x) or both v(x) > m and v ′ (x) > m (i.e. v and v ′ agree, or are both beyond m).
Lemma 2.2. If v ≡m v ′ and m ≥ Mϕ , then q, v |= ϕ iff q, v ′ |= ϕ.
Proof. Easy induction over the structure of ϕ, using the fact that v ≡m v ′ entails v + k ≡m v ′ + k and v[x ← 0] ≡m v ′ [x ← 0]. ⊓⊔
Remark 2.3. A related property is used by Emerson et al. in their study of RTCTL: when checking whether q |= ϕ inside some KS with |Q| = m states, it is possible to replace by m any constant k larger than m in the subscripts of ϕ. We emphasize that this property does not hold for TCTLs [=] (it does hold for TCTLs [<, >]). ⊓⊔
The size of our formulae is the length of the string³ used to write them down in a sufficiently succinct way; e.g., |AαUI β| = 1 + |α| + |β| + |I|. For I of the form [a1 , b1 [ ∪ · · · ∪ [an , bn [, we have |I| =def ⌈log a1 ⌉ + · · · + ⌈log bn ⌉ (assuming log(0) = log(ω) = 0). ht(ϕ) denotes the temporal height of formula ϕ. As usual, it is the maximal number of nested modalities in ϕ. Obviously, ht(ϕ) is smaller than the size of ϕ (even when viewed as a dag).
3 Expressivity
Formally, TCTLs or TCTLc do not add expressive power to CTL:
Theorem 3.1. Any closed TCTLc (or TCTLs ) formula is equivalent to a CTL formula.
Proof. With any TCTLc formula ϕ, and valuation v, we associate a CTL formula (ϕ)v
s.t. q, v |= ϕ iff q |= (ϕ)v for any state q of any Kripke structure. Then, if ϕ has no free clock variables, any (ϕ)v is a CTL formula equivalent to ϕ. The definition of (ϕ)v is given by
the following rewrite rules:
(ϕ ∧ ψ)v =def ϕv ∧ ψ v
(¬ϕ)v =def ¬ϕv
(A)v =def A
(x ∼ k)v =def ⊤ if v(x) ∼ k, ⊥ otherwise
(x in ϕ)v =def ϕv[x←0]

³ We sometimes see a formula as a dag, where identical subformulae are only counted once. Such cases are stated explicitly.
(AFϕ)v =def AFϕv    if v + 1 ≡Mϕ v,
(AFϕ)v =def ϕv ∨ AX( A(¬tick) U ((¬tick ∧ ϕv ) ∨ (tick ∧ (AFϕ)v+1 )) )    otherwise

(Eϕ U ψ)v =def Eϕv U ψ v    if v + 1 ≡Mϕ,ψ v,
(Eϕ U ψ)v =def ψ v ∨ (ϕv ∧ EX( E(ϕv ∧ ¬tick) U ((ψ v ∧ ¬tick) ∨ (tick ∧ (EϕUψ)v+1 )) ))    otherwise
This gives a well-founded definition for ( )v since in the right-hand sides either ( )v
is recursively applied over subformulae, or ( )v+1 is applied on the same formula (or
both). But moving from ( )v to ( )v+1 is only done until v ≡M v + 1, which is bound
to eventually happen. Then it is a routine matter to check that the correctness invariant (i.e., “q, v |= ϕ iff q |= (ϕ)v ”) is preserved by these rules. ⊓⊔
The translation we just gave is easy to describe but the resulting (ϕ)v formulae have
enormous size. It turns out that this cannot be avoided. Even more, we can say that
moving from CTL to TCTLs [<] to TCTLs to . . . allows writing new formulae that
have no succinct equivalent at the previous level.
Theorem 3.2. 1. TCTLs [<] can be exponentially more succinct than CTL,
2. TCTLs [<, >] can be exponentially more succinct than TCTLs [<].
The proof is given by the following lemmas.
Lemma 3.3. Any CTL formula equivalent to EF<n A (a log n-sized formula) has temporal height at least n.
Proof. Consider the KS described in Figure 3. One easily shows (by structural induction over ϕ) that for any CTL formula ϕ, ht(ϕ) ≤ i implies αi |= ϕ iff αi+1 |= ϕ. On the other hand, αj |= EF<n A iff j < n. Thus any CTL equivalent to EF<n A must have temporal height larger than n. ⊓⊔

Fig. 3. A KS with states αn , αn−1 , . . . , α0 , γ, where αi |= tick ∧ ¬A and γ |= ¬tick ∧ A, such that αn |= EF<n+1 A and αn+1 ⊭ EF<n+1 A
Lemma 3.4. Any TCTLs [<] formula equivalent to EF>n A (a log n-sized formula) has
temporal height at least n.
Proof. Consider the KS described in Figure 4. One easily shows (by structural induction over ϕ) that for any formula ϕ in TCTLs [<], ht(ϕ) ≤ i implies αi |= ϕ iff αi+1 |= ϕ and βi |= ϕ iff βi+1 |= ϕ. On the other hand, αj |= EF>n A iff j > n. Thus any TCTLs [<] equivalent to EF>n A must have temporal height larger than n. ⊓⊔
Let us mention two (natural) conjectures that would allow separating further fragments:
Fig. 4. A KS with states αn , . . . , α0 , βn , . . . , β0 , γ, where αi |= tick ∧ ¬A, βi |= ¬tick ∧ ¬A and γ |= ¬tick ∧ A, such that αn ⊭ EF>n A and αn+1 |= EF>n A
Conjecture 3.5. 1. TCTLs [<, >, =] can be exponentially more succinct than
TCTLs [<, >],
2. TCTLc can be exponentially more succinct than TCTLs .
We have not yet been able to find the required proofs, which appear hard to build. The first point is based on the conjecture that any TCTLs [<, >] formula equivalent to EF=k A has temporal height at least k. For the second one, we conjecture that any TCTLs formula equivalent to x in EF(A ∧ EF(B ∧ x = k)) has size at least k.
We have explained how TCTLs becomes more and more expressive when we allow
subscripts with <, then also with >, then also with =. Subscripts of the form “= k” are
the main difference between RTCTL and our proposal. They enhance expressivity and
make model checking more complex (see § 4).
Once we have TCTLs [<, >, =], subscripts with intervals are just a convenient shorthand:
Theorem 3.6. TCTLs is not more succinct than TCTLs [<, >, =].
Proof. For I of the form ∪i=1...n [ai , bi [, we denote by I−k the set ∪i=1...n [ai −k, bi −k[ (after the obvious normalization if k > a1 ).
Let ϕ be a TCTLs formula. We build an equivalent TCTLs [<, >, =] formula ϕ̃ with the following equivalences:

E α UI β ≡ ∨i=1...n E α U=ai (E α U<bi −ai β)

A α UI β ≡ A α U=a1 (A α UI−a1 β)    if a1 > 0,
A α UI β ≡ ¬E(¬β)U<b1 (¬α ∧ ¬β) ∧ ¬E(¬β)U=b1 ¬(A α U=a2 −b1 (A α UI−a2 β))    otherwise

Correctness is easy to check. The size of ϕ̃, seen as a dag, is linear in the size of ϕ seen as a dag.⁴ ⊓⊔
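The E-case of this decomposition is easy to mechanize. The sketch below emits the decomposed formula as a plain string; the concrete syntax rendering is our own, not the paper's:

```python
def decompose_EU(alpha, beta, intervals):
    """E a U_I b  ==  OR_i  E a U_{=a_i} (E a U_{<b_i - a_i} b), per Theorem 3.6.
    Intervals are pairs (a_i, b_i); b_i == 'w' stands for omega (U_{<w} is plain U)."""
    parts = []
    for a, b in intervals:
        inner = f"E {alpha} U {beta}" if b == 'w' else f"E {alpha} U<{b - a} {beta}"
        parts.append(f"E {alpha} U={a} ({inner})")
    return " | ".join(parts)  # "|" rendering the disjunction over the intervals

print(decompose_EU('p', 'q', [(2, 4), (7, 9)]))
# E p U=2 (E p U<2 q) | E p U=7 (E p U<2 q)
```

As the theorem notes, the output is linear in the (dag) size of the input when the subscript bounds are shared.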
4 Model Checking
For the logics we investigate, the model checking problem is the problem of computing
whether q |= ϕ for q a state of a KS S and ϕ a temporal formula. In this section we
analyse the complexity of model checking problems for TCTLs and TCTLc .
⁴ Viewing formulae as dags is convenient here, and agrees with our later use of Theorem 3.6 when we investigate efficient model checking for TCTLs .
Given a KS S and a formula ϕ, the complexity of model checking can be evaluated in terms of |S| and |ϕ|. But more discriminating information can be obtained by also
looking at the program complexity of model checking (i.e., the complexity when ϕ is
fixed and S, q is the only input) and the formula complexity (i.e., when S, q is fixed and
ϕ is the only input).
While TCTLs model checking can be done efficiently, this is not true for TCTLc
(even when considering a fixed KS).
Theorem 4.1. Let S = ⟨Q, R, l⟩ be a KS and ϕ a TCTLs formula. There exists a model checking algorithm running in time O((|Q|³ + |R|) × |ϕ|). Moreover, if ϕ belongs to TCTLs [<, >], the algorithm runs in time O((|Q| + |R|) × |ϕ|).
Proof (Idea). The algorithm extends the classical algorithms for CTL and RTCTL (see [8]) with procedures dealing with the TCTLs [<, =, >] operators (as seen in Theorem 3.6, formulae with interval subscripts can be decomposed). The most expensive procedure concerns the EU= case, where we compute transitive closures of relations, hence the (quite naive) O(|Q|³ + |R|). The TCTLs [<, >] fragment uses only procedures in O((|Q| + |R|) × |ϕ|). ⊓⊔
Theorem 4.2. The model checking problem for TCTLc is PSPACE-complete. The formula complexity of TCTLc model checking is PSPACE-complete.
Proof. To prove this result, it is sufficient to show that TCTLc model checking is in PSPACE⁵ and that the formula complexity is PSPACE-hard. The proof of this last point
relies on ideas from [4]: let P be an instance of QBF (Quantified Boolean Formula, a
PSPACE-complete problem). W.l.o.g. P is some Q1 p1 . . . Qn pn .ϕ (with Qi ∈ {∃, ∀}
and ϕ a propositional formula over p1 , . . . , pn ). We reduce P to a model checking
problem S, q |= Φ where S is the simple KS ({q}, {q → q}, {l(q) = tick}) and Φ is the
following TCTLc formula:

t in EF[ t = 1 ∧ O1 (x1 in EF(t = 2 ∧ . . . t = i ∧ Oi (xi in EF(t = i+1 ∧ . . . EF(t = n+1 ∧ ϕ̃) . . . ]
where Oi is EF≤1 (resp. EG≤1 ) if Qi is ∃ (resp. ∀) and ϕ̃ is ϕ where occurrences of
pi have been replaced by xi = n + 1 − i. Observe that any clock xi is reset at time i
or i + 1 and depending on this reset time the atomic propositions pi will be interpreted
as true or false after the n+1-th transition. The operator EF≤1 (resp. EG≤1 ) allows quantifying existentially (resp. universally) over these two reset times. Clearly P is valid iff S, q |= Φ. ⊓⊔
In practice, one can easily use any CTL model checker for model checking TCTLc formulae, and the resulting algorithm runs in time O(|S| · M^|Cl| · |ϕ|). For example, with SMV, one just adds one variable for each formula clock and updates them in the obvious way. This is much more practical than an approach based on Theorem 3.1, and the complexity is not too frightening for formulae with |Cl| = 1 (only one clock), a fragment already more expressive than TCTLs .
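The O(|S| · M^|Cl| · |ϕ|) bound comes from exploring the product of S with the (capped) clock values, which is what a CTL checker with added clock variables effectively does. A reachability-only sketch for a single clock; the KS and the goal formula are hypothetical:

```python
from collections import deque

def product_reach(q0, succ, label, M, goal):
    """Breadth-first search over the product (state, clock), with the clock
    capped at M + 1 (by Lemma 2.2, larger values are indistinguishable).
    The clock advances by one on entering a tick-state, mimicking an extra
    clock variable added to the model checker's input."""
    start = (q0, 0)                     # clock just frozen to 0 by "x in"
    seen, hits = {start}, set()
    queue = deque([start])
    while queue:
        q, x = queue.popleft()
        if goal(q, x):
            hits.add((q, x))
        for r in succ(q):
            x2 = min(x + (1 if 'tick' in label(r) else 0), M + 1)
            if (r, x2) not in seen:
                seen.add((r, x2))
                queue.append((r, x2))
    return hits

# Hypothetical KS: chain 0 -> 1 -> 2 -> 3 (3 loops); every state ticks; A at 3.
succ = lambda q: [min(q + 1, 3)]
label = lambda q: {'tick'}
# "x in EF(A ∧ x = 3)": can we reach state 3 exactly 3 time units after the freeze?
print(product_reach(0, succ, label, 3, lambda q, x: q == 3 and x == 3))
```

The product has at most |S| · (M + 2) states per clock, which is where the M^|Cl| factor comes from.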
⁵ This uses standard arguments; see the long version for details.
A theoretical view. The following table gives a synthetic summary of complexity measures for model checking CTL, TCTLs and TCTLc , showing that model checking the
full TCTLs is as tractable as model checking CTL in both arguments. On the other hand,
model checking TCTLc requires polynomial space even for a fixed Kripke structure.
                               CTL                  TCTLs                TCTLc
Complexity of model checking   P-complete           P-complete           PSPACE-complete
Formula complexity             LOGSPACE             LOGSPACE             PSPACE-complete
Program complexity             NLOGSPACE-complete   NLOGSPACE-complete   NLOGSPACE-complete
Filling the table. Model checking TCTLs is in P as we just saw. P-hardness results from
the obvious reading of the circuit-value problem (with proper alternation) as a model
checking problem for the EX fragment of CTL. The formula complexity of model
checking CTL is LOGSPACE and this result can be easily extended to TCTLs . The
program complexity of model checking TCTLs and TCTLc is NLOGSPACE-complete
since we proved (Theorem 3.1) that these logics can be translated into CTL, for which
the NLOGSPACE-complete complexity is given in [11].
Symbolic model checking. When it comes to symbolic model checking (i.e., when S is given under the form of a synchronized product of k structures S1 , . . . , Sk ), CTL model checking becomes PSPACE-complete [11]; this is also true for TCTLs and TCTLc :
Theorem 4.3. The symbolic model checking problem for TCTLs and TCTLc is
PSPACE-complete.
5 Conclusion
We investigated the expressive power and the complexity of model checking for TCTLs
and TCTLc , two quantitative extensions of CTL along the lines of RTCTL [8,10].
The expressive power must be measured in a framework where, strictly speaking,
everything can be translated into CTL.
We showed that TCTLs , while more succinct than RTCTL, still allows an efficient model checking algorithm. By contrast, TCTLc , the extension of CTL with freeze quantifiers, leads to a complexity blow-up.
References
1. R. Alur, C. Courcoubetis, and D. Dill. Model-checking for real-time systems. In Proc. 5th
IEEE Symp. Logic in Computer Science (LICS’90), Philadelphia, PA, USA, June 1990, pages
414–425, 1990.
2. R. Alur, C. Courcoubetis, and D. Dill. Model-checking in dense real-time. Information and
Computation, 104(1):2–34, 1993.
3. R. Alur and T. A. Henzinger. A really temporal logic. Journal of the ACM, 41(1):181–203,
1994.
4. L. Aceto and F. Laroussinie. Is your model checker on time ? In Proc. 24th Int. Symp. Math.
Found. Comp. Sci. (MFCS’99), Szklarska Poreba, Poland, Sep. 1999, volume 1672 of Lecture
Notes in Computer Science, pages 125–136. Springer, 1999.
5. J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Symbolic model
checking: 10²⁰ states and beyond. Information and Computation, 98(2):142–170, 1992.
6. S. Demri and Ph. Schnoebelen. The complexity of propositional linear temporal logics in
simple cases (extended abstract). In Proc. 15th Ann. Symp. Theoretical Aspects of Computer
Science (STACS’98), Paris, France, Feb. 1998, volume 1373 of Lecture Notes in Computer
Science, pages 61–72. Springer, 1998.
7. E. A. Emerson. Temporal and modal logic. In J. van Leeuwen, editor, Handbook of Theoretical
Computer Science, vol. B, chapter 16, pages 995–1072. Elsevier Science, 1990.
8. E. A. Emerson, A. K. Mok, A. P. Sistla, and J. Srinivasan. Quantitative temporal reasoning. In
Proc. 2nd Int. Workshop Computer-Aided Verification (CAV’90), New Brunswick, NJ, USA,
June 1990, volume 531 of Lecture Notes in Computer Science, pages 136–145. Springer,
1991.
9. E. A. Emerson and R. J. Trefler. Generalized quantitative temporal reasoning: An automata-theoretic approach. In Proc. 7th Int. Joint Conf. Theory and Practice of Software Development
(TAPSOFT’97), Lille, France, Apr. 1997, volume 1214 of Lecture Notes in Computer Science,
pages 189–200. Springer, 1997.
10. E. A. Emerson and R. J. Trefler. Parametric quantitative temporal reasoning. In Proc. 14th
IEEE Symp. Logic in Computer Science (LICS’99), Trento, Italy, July 1999, pages 336–343,
1999.
11. O. Kupferman, M.Y. Vardi, and P. Wolper. An automata-theoretic approach to branching-time
model checking, 1998. Full version of the CAV’94 paper, accepted for publication in J. ACM.
12. C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
13. T. Wilke. CTL+ is exponentially more succinct than CTL. In Proc. 19th Conf. Found. of
Software Technology and Theor. Comp. Sci. (FST&TCS’99), Chennai, India, Dec. 1999,
volume 1738 of Lecture Notes in Computer Science. Springer, 1999.
14. J. Yang, A. K. Mok, and F. Wang. Symbolic model checking for event-driven real-time
systems. ACM Transactions on Programming Languages and Systems, 19(2):386–412, 1997.
A Theory of Operational Equivalence for
Interaction Nets
Maribel Fernández¹ and Ian Mackie²
¹ LIENS (CNRS UMR 8548), École Normale Supérieure,
45 Rue d'Ulm, 75005 Paris, France. maribel@dmi.ens.fr
² CNRS-LIX (UMR 7650), École Polytechnique,
91128 Palaiseau Cedex, France. mackie@lix.polytechnique.fr
Abstract. The notion of contextual equivalence is fundamental in the theory of
programming languages. By setting up a notion of bisimilarity, and showing that
it coincides with contextual equivalence, one obtains a simple coinductive proof
technique for showing that two programs are equivalent in all contexts. In this
paper we apply these (now standard) techniques to interaction nets, a graphical
programming language characterized by local reduction. This work generalizes
previous studies of operational equivalence in interaction nets since it can be
applied to untyped systems, thus all systems of interaction nets are captured.
1 Introduction
Interaction nets, introduced by Lafont [7], are graph rewriting systems that generalize the
multiplicative proof nets of linear logic, and can be seen both as a high-level programming
language or as a low-level implementation language. A program consists of a net (a graph
built from a set of agents and wires) and a set of interaction rules that describe the way in
which the net will be reduced. We are interested in the problem of defining an equivalence
relation between programs that compute the same results, or in other words, that behave
in the same way, in all contexts. In that case, one program can be replaced by the other, for
example for efficiency reasons, without altering the operational semantics of the system.
To define this equivalence relation we first need to develop an operational theory of
interaction nets specifying in a precise way how programs are executed (i.e. a strategy
of evaluation of nets and a notion of value).
In [2] we proposed a way of adapting the coinductive techniques, used successfully
for the functional and object-oriented programming paradigms, to give a notion of operational equivalence for the interaction paradigm. The language of interaction nets that
was studied focussed on the notion of type, which is natural if interaction nets are seen
as a programming paradigm. In particular, types allow us to distinguish values from programs. However, some applications of interaction nets do not fit into the typed framework
in a natural way. For instance, systems based on the interaction combinators [8], or the
systems of interaction used for the encoding of the λ-calculus [9], are untyped. Although
it is possible to develop a type system for them [6], a natural approach would be to develop
an operational theory of equivalence of interaction nets that does not rely on the notion
of types. The same remark can be made in the case of functional languages based on the
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 447–456, 2000.
c Springer-Verlag Berlin Heidelberg 2000
λ-calculus, where two different approaches can be found in the literature, depending on
whether the calculus is typed (see for instance [10]) or untyped (see for instance [1]).
In this paper we present an operational theory for untyped interaction nets, including
a notion of contextual equivalence and an associated bisimilarity relation which permits
the use of coinductive techniques in the proofs of operational equivalence. To express
these notions we use the textual calculus of interaction nets presented in [3] instead of
the graphical language, since it allows us to give a concise and formal presentation. We
leave the use of diagrams for the examples and intuitive explanations.
A system of interaction nets is a user-defined language, in the same spirit as systems
based on term-rewriting. Our results are applicable to any system of interaction nets;
we are not restricted to one specific set of rules. If the system is typed, the information
provided by types can be used to obtain a more refined equivalence relation between
nets, recovering the results of [2]. We remark that interaction nets are also used as an
object language for the coding of other rewriting systems. The λ-calculus is perhaps the
most studied example of this (see e.g. [4,9]). Our results are also applicable here, so we
have a proof technique for optimizations of such systems.
The paper is organized as follows. In the next section we set up the definition of
interaction nets and define our evaluation strategy. Section 3 sets up the notion of bisimilarity. In Section 4 we give some examples of the use of this relation. In Section 5 we
formalize the notion of contextual equivalence, and Section 6 shows that this coincides
with bisimilarity. Finally we conclude the paper in Section 7.
2 Background: Interaction Nets
We begin by presenting the textual calculus of interaction nets that we will use for the
rest of the paper; we refer the reader to [3] for a more detailed description and examples.
Let Σ be a set of symbols, called agents, ranged over by α, β, . . ., each with a given
arity, one principal port and a number of auxiliary ports equal to its arity. Σ can be
partitioned into a set C of constructors and a set D of destructors, depending on the
application. Let N be a disjoint set of names, ranged over by x, y, z, etc. Terms are
defined by the grammar: t ::= x | α(t1 , . . . , tn ), where x ∈ N , α ∈ Σ, arity(α) = n
and t1 , . . . , tn are terms, with the restriction that each name may appear at most twice.
N (t) denotes the set of names occurring in t. If a name occurs twice in a term, we say
that it is bound, otherwise it is free. We write t for a list of terms t1 , . . . , tn . Graphically,
a term of the form α(t) can be seen as a tree with connections between its leaves: the
principal port of α (indicated by an arrow) is at the root, and the terms t1 , . . . , tn are the
subtrees connected to the auxiliary ports of α. A free variable represents a free port, and
a bound variable represents a wire connecting two auxiliary ports.
[Figure: the term α(t1, . . . , tn) drawn as a tree, with the principal port of α (marked by an arrow) at the root and the subtrees t1, . . . , tn attached to its auxiliary ports.]
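The grammar and the at-most-twice restriction on names are easy to make concrete. The following Python sketch (ours; the paper defines no implementation, and all function names are illustrative) represents a term as either a name or an (agent, arguments) pair:

```python
from collections import Counter

# Illustrative helpers (ours, not the paper's): a term is a name (a string)
# or a pair (agent, args).

def names(t):
    """All name occurrences in a term, with multiplicity."""
    if isinstance(t, str):
        return [t]
    _agent, args = t
    return [x for a in args for x in names(a)]

def is_term(t):
    """Grammar restriction: each name occurs at most twice."""
    return all(k <= 2 for k in Counter(names(t)).values())

def free_names(t):
    """Names occurring once are free ports; twice, bound wires."""
    return {x for x, k in Counter(names(t)).items() if k == 1}

assert is_term(("S", ("x",))) and free_names(("S", ("x",))) == {"x"}
assert free_names(("alpha", ("y", "y"))) == set()   # y is a wire
```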
If t and u are terms, then the (unordered) pair t == u is an equation. ∆, Θ, . . . will be
used to range over multisets of equations. The graphical representation of an equation
is a pair of trees connected by their roots (principal ports).
A Theory of Operational Equivalence for Interaction Nets
449
Interaction rules are pairs of terms written as α(t) ⊲⊳ β(u), where (α, β) ∈ Σ² is
the active pair of the rule. All names occur exactly twice in a rule, and there is one rule
for each pair of agents.
Definition 1 (Configurations). A configuration is a pair c = (R, ⟨t | ∆⟩), where R is
a set of rules, t a list t1, . . . , tn of terms, and ∆ a multiset of equations. Each variable
occurs at most twice in c. If a name occurs once in c then it is free, otherwise it is bound.
For simplicity we sometimes omit R when there is no ambiguity. We use c, c′ to range
over configurations. We call t the head and ∆ the body of a configuration.
Intuitively, (R, ⟨t | ∆⟩) represents a net that we evaluate using R. To draw the net we
simply draw the trees for the terms in ⟨t | ∆⟩, connect the common variables together,
and connect the roots of the trees corresponding to the members of an equation together.
The roots of the terms in the head of the configuration and the free names correspond to
free ports in the interface of the net. Note that the head of the configuration may contain
all or just some of the ports in the interface of the net, called observable. For this reason,
the head is called the observable interface of the configuration.
We work modulo α-equivalence for bound names as usual, but also for free names.
Configurations that differ only on the names of the free variables are equivalent, since
they represent the same net.
Computation is performed by rewriting configurations using the following rewrite
system, where if r is a rule, r̂ denotes a fresh generic instance of r, that is, a copy of r
where we introduce a new set of names:
Indirection: If x ∈ N(u), then x == t, u == v −→ u[t/x] == v.
Interaction: If r ∈ R and r̂ = α(t′1, . . . , t′n) ⊲⊳ β(u′1, . . . , u′m), then
α(t1, . . . , tn) == β(u1, . . . , um) −→ t1 == t′1, . . . , tn == t′n, u1 == u′1, . . . , um == u′m
Context: If ∆ −→ ∆′, then ⟨t | Γ, ∆, Γ′⟩ −→ ⟨t | Γ, ∆′, Γ′⟩.
Collect: If x ∈ N(t), then ⟨t | x == u, ∆⟩ −→ ⟨t[u/x] | ∆⟩.
This rewrite system generates an equational theory; the corresponding equivalence
relation is denoted by c ↔∗ c′. The reduction relation −→ is strongly confluent [3] since
there is one rule for each pair of agents. Various strategies of evaluation are defined
in [3]. The values that we use in this paper, called interface normal forms, have terms
rooted by agents in the head whenever this is possible.
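As an illustration of how these rules operate, here is a minimal Python evaluator (our sketch, not the authors' code; it naively normalizes all equations rather than following a particular strategy, and it assumes a terminating, cycle-free configuration). The example system, also illustrative, encodes unary addition with agents Z, S and Add:

```python
import itertools

# Terms are names (strings) or (agent, args) pairs; an equation is a pair
# of terms; a rule maps an agent pair to its instance alpha(t') >< beta(u').

_fresh = itertools.count()

def _rename(t, env):
    """Build a fresh generic instance: rename every name consistently."""
    if isinstance(t, str):
        return env.setdefault(t, "v%d" % next(_fresh))
    agent, args = t
    return (agent, tuple(_rename(a, env) for a in args))

def _subst(t, x, s):
    """Replace the name x by the term s in t."""
    if isinstance(t, str):
        return s if t == x else t
    agent, args = t
    return (agent, tuple(_subst(a, x, s) for a in args))

def reduce_config(rules, head, eqs):
    """Apply Interaction and Indirection/Collect until no equation is left."""
    head, eqs = list(head), list(eqs)
    while eqs:
        l, r = eqs.pop(0)
        if not isinstance(l, str) and not isinstance(r, str):
            # Interaction: alpha(t1..tn) == beta(u1..um)
            if (l[0], r[0]) not in rules:
                l, r = r, l
            env = {}  # shared so that names are shared across the rule pair
            lhs, rhs = (_rename(t, env) for t in rules[(l[0], r[0])])
            eqs += list(zip(l[1], lhs[1])) + list(zip(r[1], rhs[1]))
        else:
            # Indirection / Collect: x == t, substitute for the other
            # occurrence of x (names occur at most twice)
            x, t = (l, r) if isinstance(l, str) else (r, l)
            head = [_subst(h, x, t) for h in head]
            eqs = [(_subst(a, x, t), _subst(b, x, t)) for a, b in eqs]
    return head

# Illustrative system: unary numbers with agents Z, S and Add(result, arg):
#   Add(x, x) >< Z                  (Add meets Z: result := second argument)
#   Add(S(w), y) >< S(Add(w, y))    (Add meets S: push S to the result)
rules = {
    ("Add", "Z"): (("Add", ("x", "x")), ("Z", ())),
    ("Add", "S"): (("Add", (("S", ("w",)), "y")),
                   ("S", (("Add", ("w", "y")),))),
}

Z = ("Z", ())
def S(t): return ("S", (t,))

# <r | Add(r, S(Z)) == S(S(Z))> computes 2 + 1
head = reduce_config(rules, ["r"], [(("Add", ("r", S(Z))), S(S(Z)))])
assert head == [S(S(S(Z)))]
```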
Definition 2 (Interface Normal Form). A configuration (R, ⟨t | ∆⟩) is in interface
normal form (INF) if each ti in t is of one of the following canonical forms:
– α(s). E.g. ⟨S(x) | x == Z, ∆⟩.
– x where either x ∈ N(tj) for some j ≠ i, or x ∈ N(u) for some y == u ∈ ∆ such
that y ∈ N is free (x is in an open path). E.g. ⟨x, x | ∆⟩.
– x where x ∈ N(u) for some y == u ∈ ∆ such that y ∈ N(u) (x occurs in a cycle).
E.g. ⟨x | y == α(β(y), x), ∆⟩.
We denote by INFi the set of configurations where the ith port in the head is canonical.
Computing interface normal forms suggests that we do the minimum work required
to bring principal ports to the interface. This strategy is defined by the inference rules:
Axiom:
c ∈ INF
c ⇓ c
Collect:
⟨t1, . . . , t, . . . , tn | ∆⟩ ⇓ c
⟨t1, . . . , x, . . . , tn | x == t, ∆⟩ ⇓ c
Indirection: if x ∈ N(u) and y ∈ N(t, u == v)
⟨t1, . . . , y, . . . , tn | u[t/x] == v, ∆⟩ ⇓ c
⟨t1, . . . , y, . . . , tn | x == t, u == v, ∆⟩ ⇓ c
Interaction: if x ∈ N(α(t) == β(u)), r ∈ R, r̂ = α(t′) ⊲⊳ β(u′)
⟨s1, . . . , x, . . . , sn | t == t′, u == u′, ∆⟩ ⇓ c
⟨s1, . . . , x, . . . , sn | α(t) == β(u), ∆⟩ ⇓ c
This system is deterministic [3]. If c ⇓ v can be derived with these rules we say that v
is the interface normal form of c. We write c ⇓i v (i.e. the position i in the head of v
is canonical) if the rules Indirection and Interaction are only applied at position i in the
head of the configuration and the axiom is replaced by
c ∈ INFi
c ⇓i c
Example 1 (Combinators). The interaction combinators [8] are a universal system of
interaction built from the 0-ary agent ǫ and the binary agents δ and γ, with the rules:
δ(x, y) ⊲⊳ δ(x, y)    γ(x, y) ⊲⊳ γ(y, x)    ǫ ⊲⊳ ǫ
δ(ǫ, ǫ) ⊲⊳ ǫ    γ(ǫ, ǫ) ⊲⊳ ǫ    δ(γ(a, b), γ(c, d)) ⊲⊳ γ(δ(a, c), δ(b, d))
The configuration ⟨x | γ(y, x) == δ(ǫ, y)⟩ gives a non-terminating sequence of reductions, but has an interface normal form: ⟨δ(a, b) | ǫ == γ(c, a), δ(c, d) == γ(d, b)⟩.
3 Bisimilarity
In functional languages, we consider two functions equivalent when we can apply them
to the same arguments and obtain the same results. In other words, we perform some
form of experiment on the objects under test, and compare the results. For interaction
nets, we take this general idea as inspiration. The way that we can make experiments
with a net is to interact with it on a free principal port. Connecting nets on free principal
ports is our analogue of applying a function to an argument. After evaluation, we can
observe whether some principal ports are at the interface, which is analogous to observing
whether a λ-term has evaluated to an abstraction. Only the observable ports (in the head
of the configuration) are available for the experiments. Configurations with the same
number of terms in the head will be said to be comparable.
Two configurations that cannot be distinguished by any experiment will be called
bisimilar. We will show that the bisimilarity relation can be defined as the greatest
postfixpoint of an operator (which allows us to use coinductive techniques to prove that
two configurations are bisimilar), and more important, it coincides with the contextual
equivalence, that is, two bisimilar configurations cannot be distinguished by any context
and can therefore be exchanged without altering the behaviour of the system. We begin
with some basic definitions to formalise these ideas.
Definition 3 (Visible Interface). A configuration c = ⟨t1, . . . , tn | ∆⟩ ∈ INFi has a
visible interface at position i if either ti is not a variable or there is an open path starting at
ti and finishing at some tj = α′(u) ∈ t, that is, ti = α(u) or ti = x and there is an open
path to tj = α′(u). The visible agent at position i is α in the first case, α′ in the second
case. The rest of the net is called the kernel: Ki(c) = ⟨t1, . . . , tk−1, u, tk+1, . . . , tn | ∆⟩
where k = i or k = j depending on whether we are in the first or second case. The set
of new observable positions in Ki(c), denoted NPK(c, i), is the set of positions of the
terms u if k = i, otherwise it contains just the new position of ti. We denote by Vi the
set of all the configurations with a visible interface at position i.
If v and v ′ are comparable and have the same visible agent at position i, we write
SVAi (v, v ′ ). If the visible agents are different, but they are not both constructors, we
write ¬Constri (v, v ′ ).
Example 2. Let c = ⟨I(x), x | ⟩. Since t1 = I(x) is not a variable, c ∈ V1. Since t2 = x
and there is an open path to t1 = I(x), c ∈ V2. The visible agent is I for both positions,
and K1(c) = K2(c) = ⟨x, x | ⟩.
When we have different agents in the visible interfaces of the nets under test, and
they are not constructors, we need to see if these agents behave in the same way for each
possible agent interacting with them. For this we use closings.
Definition 4 (Closing). A closing at position i of a configuration c = ⟨t | ∆⟩ ∈ Vi,
denoted by cli (c), is obtained from c by one of the following operations, where k = i,
or k = j if there is an open path starting at position i and finishing at position j in c:
1. replace tk ≡ α(s) in t, by a list of new variables z1 , . . . , zp ∈ N (u), p ≥ 0, and
add to ∆ the equation tk == α′ (u), where α′ is any agent, and the terms in u are
either new variables (in which case they can appear twice in α′ (u) or once in α′ (u)
and once in z) or elements of t, in which case they are erased from t.
The set N Pcl (c, i) of new observable positions in cli (c) contains the positions of
the variables z1 , . . . , zp in the new head if i = k, otherwise it contains just the new
position of ti .
2. erase tk ≡ α(s) and another term tp in t and add tk == tp to ∆. In this case
N Pcl (c, i) = ∅ if i = k, otherwise it contains just the new position of ti .
By abuse of notation, we will denote by cli(c′) the result of applying to a configuration
c′ comparable with c the operations that define a closing at position i for c.
Graphically, the first operation corresponds to connecting the principal port of an
agent α′ to the kth observable port in the interface of the net, and connecting some
auxiliary ports of α′ between them (if a variable appears twice in u), or to other observable
ports in the net (if u contains terms in t). The second operation corresponds to simply
adding a wire connecting the observable ports k and p.
We consider a complete lattice (Rel, ⊆) where Rel is the set of binary relations
between pairs (c, i), (c′, i) such that c, c′ are comparable configurations whose heads
have at least i elements (i.e. we can talk of the ith observable port). The operators ⟨R⟩
and [R] for R ∈ Rel will be used to define similarity and bisimilarity respectively.
Definition 5 (Operators). Let c, c′ be comparable configurations with at least i terms
in the head.
(c, i) ⟨R⟩ (c′, i) ⇐⇒def c ⇓i v ∈ Vi ⇒ ∃v′, (c′ ⇓i v′ and
either SVAi(v, v′) and ∀p ∈ NPK(v, i), (Ki(v), p) R (Ki(v′), p)
or ¬Constri(v, v′), ∀cli(v), ∀p ∈ NPcl(v, i), (cli(v), p) R (cli(v′), p))
(c, i) [R] (c′, i) ⇐⇒def (c, i) ⟨R⟩ (c′, i) and (c′, i) ⟨R⟩ (c, i)
Property 1. ⟨·⟩ and [·] are monotone operators.
Definition 6 (Similarity, Bisimilarity).
– A relation S ∈ Rel such that S ⊆ ⟨S⟩ (i.e. S is a post-fixpoint of ⟨·⟩) is a simulation.
The greatest such S is called similarity, and written as ≲. If c, c′ are comparable
configurations with n elements in the head, then c ≲ c′ if (c, i) ≲ (c′, i), 1 ≤ i ≤ n.
– A relation B ∈ Rel such that B ⊆ [B] (i.e. B is a post-fixpoint of [·]) is a bisimulation.
The greatest such B is called bisimilarity, and written as ≃. If c, c′ are comparable
configurations with n elements in the head, then c ≃ c′ if (c, i) ≃ (c′, i), 1 ≤ i ≤ n.
Note that ⟨·⟩ and [·] possess a greatest post-fixpoint by the Tarski-Knaster Fixed Point
Theorem. Moreover, ≲ and ≃ are fixed points, i.e. ≲ = ⟨≲⟩ and ≃ = [≃].
Remark 1. The main difference with the typed approach resides in the definition of
closings and the way they are used in the definition of the operators h·i and [·]. Here
closings are applied “on demand” whereas they are a static notion in the typed framework.
More precisely, in a typed net a closing is built just by connecting agents to all the free
input ports. The Subject Reduction property ensures that reduction will not create new
free input ports. Instead, here we close one principal port at a time, and since reduction
might create a new free principal port, closings are applied in a dynamic way.
The relations ≲ and ≃ can be defined by levels, as done by Abramsky for the untyped
λ-calculus [1].
Proposition 1 (Coinduction Principle). Let c, c′ be comparable configurations with n
observable ports. To prove c≃c′ it suffices to find a bisimulation B such that
(c, i)B(c′ , i) for 1 ≤ i ≤ n.
By coinduction we can show that the equational theory is included in the bisimilarity
relation. In Section 4 we give more examples of application of coinduction to prove
bisimilarity, in particular we will show that this inclusion is strict.
Theorem 1 (Bisimilarity includes the equational theory). c ↔∗ c′ ⇒ c≃c′ .
4 Examples
The Identity agent and a wire. Let I be the identity agent defined by rules
I(α(x1 , . . . , xn )) ⊲⊳ α(I(x1 ), . . . , I(xn ))
for any α ∈ Σ. We can prove ⟨I(x), x | ⟩ ≃ ⟨x, x | ⟩ by coinduction. Take a symmetric
R containing the pairs ((c, i), (c′ , i)) such that c′ is obtained from c by erasing the I
agents at the root of a term in the head, or at the root of a member of an equation. We
show that R is a bisimulation: if c ⇓i v ∈ Vi , then c′ ⇓i v ′ , and either they have the
same visible agent α at position i, in which case the kernels are in the relation for all the
new observable positions, or if they differ, then one is rooted by I and the other is just a
variable. In that case the closings are in the relation, which is sufficient since I is not a
constructor.
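The behaviour of I on closed tree-like terms can be animated directly. The following standalone sketch (ours; illustrative only) repeatedly applies the rule I(α(x1, . . . , xn)) ⊲⊳ α(I(x1), . . . , I(xn)), treating an I meeting another I as an annihilation leaving a wire, and shows that every I is eventually erased, as the bisimilarity above suggests:

```python
def push_I(t):
    """Rewrite by I(alpha(x1,...,xn)) >< alpha(I(x1),...,I(xn)).
    Terms are (agent, args) tuples; closed tree terms only (no names)."""
    agent, args = t
    if agent == "I":
        inner_agent, inner_args = args[0]
        if inner_agent == "I":
            # I meeting I annihilates (the schema with alpha = I): a wire
            return push_I(inner_args[0])
        # push I one level down, onto each auxiliary port of the inner agent
        return push_I((inner_agent, tuple(("I", (a,)) for a in inner_args)))
    return (agent, tuple(push_I(a) for a in args))

Z = ("Z", ())
def S(t): return ("S", (t,))

# I(S(I(Z))) normalizes to S(Z): every I vanishes
assert push_I(("I", (S(("I", (Z,))),))) == S(Z)
```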
Copying before erasing or just erasing. In the system of the interaction combinators,
replacing a net of the form
[Figure: a δ agent with an ǫ agent connected to each of its auxiliary ports, its principal port free]
by the agent ǫ seems an intuitive optimization. We can prove that they are bisimilar by
coinduction. The main theorem (Theorem 2) then tells us that these configurations are contextually
equivalent: the optimization is correct.
Agents γ and δ. The following nets are bisimilar:
[Figure: two nets built from γ and δ agents, each with one free principal port, related by ≃]
To show it using the coinduction principle we consider a relation containing ≃ and
these pairs, for any closing of the free principal port. The interesting closings are built
by adding an agent ǫ, γ, or δ (the closings using just wires do not reduce to a value
with a visible interface). The case of ǫ is trivial. For the other cases, by reducing to
interface normal form we obtain configurations that have the same visible agents and
whose kernels are easily shown to be bisimilar, hence contained in our relation.
η-rules for γ and δ. The following nets are bisimilar:
[Figure: the η-rules: for each of γ and δ, two bisimilar nets built from a single agent]
Note that these last two equivalences are neither included in the equational theory (since the nets are different normal forms) nor provable using the path semantics
developed by Lafont [8].
5 Contextual Equivalence
We define a set of operations that build a context for a configuration, in the same way
that closings were defined by operations. But there are more operations in the case of
contexts, and we can have a sequence of operations instead of just one operation.
Definition 7 (Context). A context at position i for a configuration c = ⟨t | ∆⟩ is defined
by a (possibly empty) sequence of operations, where k = i, or k = j if there is an open
path starting at position i and finishing at position j in c. Non-empty sequences (i.e.
contexts) are defined inductively; there are three cases according to the first operation
used.
1. Addition of agent by principal port: This operation replaces tk in t by a list of new
variables z1 , . . . , zp ∈ N (u), and adds to ∆ the equation tk == α(u), where α is
any agent, and the terms in u are either new variables (in which case they can occur
twice in α(u) or once in α(u) and once in z) or elements of t, in which case they
are erased from t.
In this case the rest of the sequence is the concatenation of contexts at the positions
of the variables z1 , . . . , zp in the new head and at the new position of ti if i 6= k.
2. Addition of agent by auxiliary port: This operation replaces tk in t by a list of new
variables z1 , . . . , zp occurring free in y == α(u) and adds this equation to ∆, where
α is any agent, and the terms in u are either new variables (in which case they can
occur twice in α(u) or once in α(u) and once in z) or elements of t, in which case
they are erased from t. The term tk must occur in u.
Also in this case the rest of the sequence is composed of contexts at the positions of
the variables z1 , . . . , zp in the new head and at the new position of ti if i 6= k.
3. Addition of a wire: erase tk and another term tp in t and add tk == tp to ∆. In this
case the rest of the sequence is empty if i = k, otherwise it is a context at the new
position of ti .
We denote by opi,j (c) the result of applying an operation as above to the configuration c
at position i, using the positions j in t. We denote by Ci[c] the configuration resulting from
applying the context C, defined by a sequence of operations as above, to the configuration
c at position i, and by C(c, i) a generic context for c at position i. We will also denote
by Ci [c′ ] the result of applying to a configuration c′ comparable with c the operations
that define a context at position i for c.
The set N PC (c, i) of new observable positions of Ci [c] is computed as follows: we
start with the set {i}, and compute a new set each time we perform an operation. The
first and second operations add the positions of the variables z1 , . . . , zp in the new head,
and if i = k they erase i, otherwise they replace i by the new position of ti in the head.
The third operation simply erases the position i from the set if i = k, otherwise replaces
i by the new position of ti .
Graphically, the first two operations correspond to connecting an agent α to an
observable port of the net (using the principal port of α in the first one, and an auxiliary
port in the second one). The third operation corresponds to adding a wire connecting
the observable ports k and p. Closings are particular cases of contexts defined by one
operation of the first or third class.
Definition 8 (Contextual Preorder and Contextual Equivalence). Let c, c′ be comparable configurations with n elements in the head.
c ≤ c′ ⇐⇒def ∀i ∈ [1 . . . n], (c, i) ≤ (c′, i)
(c, i) ≤ (c′, i) ⇐⇒def ∀C(c, i), ∀p ∈ NPC(c, i), Ci[c] ⇓p v ∈ Vp ⇒ ∃v′, (Ci[c′] ⇓p v′
and either SVAp(v, v′) or ¬Constrp(v, v′))
(c, i) = (c′, i) ⇐⇒def (c, i) ≤ (c′, i) and (c′, i) ≤ (c, i)
c = c′ ⇐⇒def ∀i ∈ [1 . . . n], (c, i) = (c′, i)
6 Main Result
We will show that the notions of contextual equivalence and bisimilarity coincide, if the
interaction net system has “enough contexts” to extract the kernels of all values.
Definition 9. A system of interaction is complete if for any v ∈ Vi with visible agent α at
position i, there exists a context Cα such that ∀p ∈ NPK(v, i), (Ki(v), p) ≃ (Cαi[v], p)
and p ∈ NPCα(v, i).
Theorem 2. If the interaction net system is complete, ≤ (resp. =) coincides with ≲ (resp. ≃). Otherwise ≲ (resp. ≃) is included in ≤ (resp. =).
Proof. To prove ≲ ⊆ ≤ it is sufficient to show that ≲ is preserved by context. Following
Howe [5] we prove that ≲ is a precongruence (a preorder preserved by context) using
an auxiliary relation ≲∗, the precongruence candidate, defined as follows.
Let c, c′ be comparable configurations with n elements in their heads.
c ≲∗ c′ ⇐⇒def ∀i ∈ [1 . . . n], (c, i) ≲∗ (c′, i)
(c, i) ≲∗ (c′, i) ⇐⇒def either (c, i) ≲ (c′, i),
or c = opp,j(d), i ∈ NPop(d, p), (d, q) ≲∗ (d′, q) for all q ∈ j, and (opp,j(d′), i) ≲ (c′, i).
The precongruence candidate enjoys the following properties.
Property 2. 1. ≲ ⊆ ≲∗.
2. ≲∗ is reflexive.
3. c ≲∗ c′, c′ ≲ c′′ ⇒ c ≲∗ c′′.
4. ≲∗ is preserved by context: c ≲∗ c′ ⇒ ∀i, ∀C(c, i), Ci[c] ≲∗ Ci[c′].
To show that ≲ is a precongruence it is sufficient to prove that it coincides with ≲∗,
for which it remains to prove ≲∗ ⊆ ≲. This follows, by coinduction, from:
Proposition 2. 1. v ∈ Vi, (v, i) ≲∗ (c′, i) ⇒ (v, i) ⟨≲∗⟩ (c′, i).
2. c ≲∗ c′, c ⇓i v ∈ Vi ⇒ v ≲∗ c′.
This concludes the proof of the first inclusion: ≲ ⊆ ≤. Now we prove ≤ ⊆ ≲ by
coinduction, showing ≤ ⊆ ⟨≤⟩. Assume (c, i) ≤ (c′, i). By definition of ≤, using an
empty context, c ⇓i v ∈ Vi ⇒ ∃v′, c′ ⇓i v′ and either SVAi(v, v′) or ¬Constri(v, v′).
In the latter case we are done, since closings are particular cases of contexts. In the
first case, we know by completeness that (Ki(v), p) ≃ (Cαi[v], p), ∀p ∈ NPK(v, i).
Moreover, since bisimilarity includes the equational theory (Theorem 1), and (c, i) ≤
(c′, i): (Cαi[v], p) ≃ (Cαi[c], p) ≤ (Cαi[c′], p) ≃ (Cαi[v′], p). Again by completeness
(since SVAi(v, v′)), (Cαi[v′], p) ≃ (Ki(v′), p). Since we have already proved ≃ ⊆ =, we
get (Ki(v), p) ≤ (Ki(v′), p), ∀p ∈ NPK(v, i) as required. ⊓⊔
7 Conclusion
In this paper we have presented a notion of bisimilarity for (untyped) interaction nets.
This notion has been shown to coincide with the contextual equivalence, thus we have
a simple proof technique for showing when two nets are equivalent in all contexts.
One of the main applications that we see for this work is in general correctness proofs
for optimizations in interaction net implementations of various systems, such as the
λ-calculus or term rewriting systems.
References
1. Samson Abramsky. The lazy λ-calculus. In David A. Turner, editor, Research Topics in
Functional Programming, chapter 4, pages 65–117. Addison Wesley, 1990.
2. Maribel Fernández and Ian Mackie. Coinductive techniques for operational equivalence of
interaction nets. In Proceedings of the 13th Annual IEEE Symposium on Logic in Computer
Science (LICS’98), pages 321–332. IEEE Computer Society Press, June 1998.
3. Maribel Fernández and Ian Mackie. A calculus for interaction nets. In Proceedings of
the first International Conference on Principles and Practice of Declarative Programming
(PPDP’99), Lecture Notes in Computer Science, Springer-Verlag, September 1999.
4. Georges Gonthier, Martín Abadi, and Jean-Jacques Lévy. The geometry of optimal lambda
reduction. In Proceedings of the 19th ACM Symposium on Principles of Programming Languages (POPL’92), pages 15–26. ACM Press, January 1992.
5. Douglas J. Howe. Proving congruence of bisimulation in functional programming languages.
Information and Computation, 124(2):103–112, 1996.
6. Lionel Khalil. Mémoire de DEA SPP, 1999. Available at http://www.dmi.ens.fr/∼khalil.
7. Yves Lafont. Interaction nets. In Proceedings of the 17th ACM Symposium on Principles of
Programming Languages (POPL’90), pages 95–108. ACM Press, January 1990.
8. Yves Lafont. Interaction combinators. Information and Computation, 137(1):69–101, 1997.
9. Ian Mackie. YALE: Yet another lambda evaluator based on interaction nets. In Proceedings
of the 3rd ACM SIGPLAN International Conference on Functional Programming (ICFP’98),
pages 117–128. ACM Press, September 1998.
10. Andrew M. Pitts. Operationally-based theories of program equivalence. In P. Dybjer and A. M.
Pitts, editors, Semantics and Logics of Computation, Publications of the Newton Institute,
pages 241–298. Cambridge University Press, 1997.
Run Statistics for Geometrically Distributed
Random Variables
(Extended Abstract)
Peter J. Grabner¹, Arnold Knopfmacher², and Helmut Prodinger³⋆
¹ Institut für Mathematik A, Technische Universität Graz,
Steyrergasse 30, 8010 Graz, Austria. grabner@weyl.math.tu-graz.ac.at
² The John Knopfmacher Centre for Applicable Analysis and Number Theory,
Department of Computational and Applied Mathematics,
University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa.
arnoldk@gauss.cam.wits.ac.za
WWW home page: http://www.wits.ac.za/science/number_theory/arnold.htm
³ The John Knopfmacher Centre for Applicable Analysis and Number Theory,
Department of Mathematics,
University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa.
helmut@gauss.cam.wits.ac.za
WWW home page: http://www.wits.ac.za/helmut/index.htm
Abstract. For words of length n, generated by independent geometric
random variables, we consider the mean and variance, and thereafter
the distribution of the number of runs of equal letters in the words. In
addition, we consider the mean length of a run as well as the length of
the longest run over all words of length n.
1 Introduction
Let X denote a geometrically distributed random variable, i. e. P{X = k} =
pq^{k−1} for k ∈ N and q = 1 − p. The combinatorics of n geometrically distributed
independent random variables X1, . . . , Xn has attracted recent interest, especially
because of applications in computer science. We mention just two areas,
the skip list [1,13,15,8] and probabilistic counting [3,6,7,9].
⋆ The first named author is supported by the START-project Y96-MAT of the Austrian
Science Foundation. Part of this work was done during his visit to the John
Knopfmacher Centre for Applicable Analysis and Number Theory at the University
of the Witwatersrand, Johannesburg, South Africa.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 457–462, 2000.
c Springer-Verlag Berlin Heidelberg 2000
In [14] the number of left-to-right maxima was investigated for words
a1 . . . an , where the letters ai are independently generated according to the geometric distribution. In [10] the study of left-to-right maxima was continued, but
now the parameters studied were the mean value and mean position of the r-th
maximum.
In this article we study runs of consecutive equal letters in a string of
n geometrically distributed independent random letters. For example in w =
22211114431 we have 5 runs of equal letters of respective lengths 3, 4, 2, 1, 1. In
the sequel we denote by Rn (w) the number of runs in the word w, where w
is of length n. Run statistics play a significant role in the behaviour of sorting
algorithms, as explained at length in [12].
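For concreteness, counting runs is a one-liner; this snippet (ours) reproduces the example above:

```python
from itertools import groupby

# Runs of equal letters in a word; R_n(w) is the number of such runs.
def run_lengths(w):
    return [len(list(g)) for _, g in groupby(w)]

w = "22211114431"
lengths = run_lengths(w)
assert lengths == [3, 4, 2, 1, 1]   # 5 runs, as in the example
assert len(lengths) == 5
```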
In section 2 we study the mean and variance of Rn (w). Thereafter, in section 3
we study the distribution of the number of runs, which turns out to be Gaussian.
Subsequently, in section 4 we study the average length of the runs per word.
Finally, in section 5 we determine the mean and variance of the length of the
longest run in a word of length n.
2 Moments of Number of Runs
In order to determine the mean and variance of the number of runs we will make use of the following decomposition of the set of all (non-empty) words. Here {≥ k} denotes the set {k, k + 1, …}; for a given set A we denote
$$A^+ = \bigcup_{k=1}^{\infty} A^k, \qquad A^* = \varepsilon \cup A^+,$$
where ε stands for the empty word. We decompose the set of non-empty words according to runs of 1's, separated by words consisting of larger digits only:
$$\{\ge 1\}^+ = (\varepsilon + 1^+)\,\big(\{\ge 2\}^+\, 1^+\big)^{*}\,\{\ge 2\}^+\,(\varepsilon + 1^+) \;+\; 1^+; \tag{1}$$
here we find it more convenient to write + instead of ∪.
We consider a probability generating function F(z, u), where z labels the length of the word, and u counts the number of runs. We should always have F(z, 1) = z/(1 − z), and a replacement of z by qz if we increase all letters by 1.
Then (1) translates into the functional equation
$$F(z,u) = \frac{F(qz,u)}{1 - F(qz,u)\dfrac{pzu}{1-pz}}\left(\frac{pzu}{1-pz} + 1\right)^{2} + \frac{pzu}{1-pz}. \tag{2}$$
Now we differentiate it w.r.t. u, plug in u = 1, set $G(z) = \frac{\partial}{\partial u}F(z,1)$, and get
$$G(z) = G(qz)\,\frac{(1-qz)^2}{(1-z)^2} + \frac{pz(1-pz)}{(1-z)^2}.$$
Run Statistics for Geometrically Distributed Random Variables
Setting H(z) = (1 − z)²G(z) yields
$$H(z) = H(qz) + pz(1-pz).$$
Comparing coefficients, we see that
$$[z]H(z) = 1, \qquad [z^2]H(z) = -\frac{p^2}{1-q^2} = -\frac{p}{1+q},$$
and that the other coefficients are zero. Consequently,
$$H(z) = z - \frac{p}{1+q}\,z^2,$$
and
$$G(z) = \frac{z - \frac{p}{1+q}z^2}{(1-z)^2}.$$
This leads to
Proposition 1. The mean value of the number of runs for n ≥ 1 is given by
$$\mu_n = \mathbf{E}R_n = [z^n]G(z) = \frac{2q}{1+q}\,n + \frac{p}{1+q}.$$
The computation of the variance is rather lengthy and requires that we differentiate (2) twice. This leads after some work to

Proposition 2. The variance of the number of runs is given for n ≥ 2 by
$$\sigma_n^2 = \mathbf{V}R_n = \frac{2q(1-q)^2(2+q^2)}{(1+q)^2(1-q^3)}\,n - \frac{2q(1-q)^2(3-q+q^2)}{(1+q)^2(1-q^3)}.$$
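Both propositions can be checked by an exact dynamic program over the pair (last letter, number of runs so far), with the geometric alphabet truncated at L letters; the parameter choices p = q = 1/2, L = 60, n = 10 below are ours, not the paper's:

```python
p = q = 0.5
L, n = 60, 10
w = [p * q ** v for v in range(L)]   # P{X = v + 1}; tail mass ~ q^L is negligible

# f[v][r] = P{word so far ends with letter v + 1 and has r runs}
f = [[0.0] * (n + 1) for _ in range(L)]
for v in range(L):
    f[v][1] = w[v]
for _ in range(n - 1):
    g = [[0.0] * (n + 1) for _ in range(L)]
    for v in range(L):
        for r in range(1, n):
            if f[v][r]:
                for u in range(L):
                    g[u][r + (u != v)] += f[v][r] * w[u]
    f = g

mean = sum(r * f[v][r] for v in range(L) for r in range(n + 1))
var = sum(r * r * f[v][r] for v in range(L) for r in range(n + 1)) - mean ** 2

mu = 2 * q / (1 + q) * n + p / (1 + q)                          # Proposition 1
s2 = (2 * q * (1 - q) ** 2 * (2 + q ** 2) * n
      - 2 * q * (1 - q) ** 2 * (3 - q + q ** 2)) / ((1 + q) ** 2 * (1 - q ** 3))  # Proposition 2
print(abs(mean - mu) < 1e-9, abs(var - s2) < 1e-9)              # True True
```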
3  Distribution of the Number of Runs
In this section we discuss a central limit theorem for the distribution of the number of runs. In order to derive this, we have to extract further information from the functional equation (2). We observe that the terms on the right-hand side are all simple rational functions, except for the terms containing F(qz, u). By investigating the analytic properties of F(z, u) it can be shown that F(z, u) can be written as
$$F(z,u) = \frac{g(z,u)}{1 - f(u)z} + R(z,u), \tag{3}$$
where g(z, u) and R(z, u) are holomorphic in |z| < 1 + δ, |u − 1| < δ for some δ > 0. Now we are in the general framework of Hwang's quasi-power theorem (cf. [5]) and can deduce the following theorem.
Theorem 1. The number of runs in words of length n produced by independent geometric random variables obeys a central limit law; more precisely,
$$P\left\{R_n(w) \le \frac{2q}{1+q}\,n + t\,\sqrt{\frac{2q(1-q)^2(2+q^2)}{(1+q)^2(1-q^3)}\,n}\right\} = \Phi(t) + O\big(n^{-1/2}\big). \tag{4}$$
4  Average Length of Runs
Given a string w of geometric random variables of length n with Rn(w) = k runs, we define the average length of a run to be Ln(w) = n/Rn(w). It is of interest to determine the moments and the distribution of this parameter over all strings of length n. Intuitively, one expects that the mean length of a run should be close to n divided by the mean number of runs, which is
$$\frac{n}{\frac{2q}{1+q}n + \frac{p}{1+q}} = \frac{1+q}{2q} - \frac{1-q^2}{4q^2}\,\frac{1}{n} + O\!\left(\frac{1}{n^2}\right).$$
In fact we obtain
Proposition 3. For n ≥ 1 the mean and variance of Ln(w) are given respectively by
$$\frac{1+q}{2q} + O\!\left(\frac1n\right), \qquad \frac{(1-q^2)^2(2+q^2)}{8q^3(1-q^3)}\,\frac1n + O\!\left(\frac1{n^2}\right).$$
Moreover, Ln(w) obeys a central limit theorem:
$$P\left\{L_n(w) - \frac{1+q}{2q} \le \frac{(1-q^2)\sqrt{2+q^2}\;t}{\sqrt{8q^3(1-q^3)}\,\sqrt{n}}\right\} = \Phi(t) + O\big(n^{-1/2}\big).$$
The proof makes use of the distribution obtained for the number of runs in Theorem 1.

5  Longest Runs
In this section we study the mean of the longest run Mn(w) of equal digits in a string of length n. For this purpose we introduce the probability generating function Gh(z) of all strings that have runs only of length less than h. Similar arguments as in the proof of (2) show that Gh satisfies
$$G_h(z) = \left(\frac{1-(pz)^h}{1-pz}\right)^{2}\frac{G_h(qz)}{1 - G_h(qz)\dfrac{pz}{1-pz}\big(1-(pz)^{h-1}\big)} + \frac{pz\big(1-(pz)^{h-1}\big)}{1-pz}. \tag{5}$$
In order to extract the asymptotic behaviour of the probability that a string of length n has runs of length at most h, we have to find the singularities of Gh(z). Using bootstrapping we estimate ρh, the dominant singularity of the function Gh. Combining this with estimates for Gh leads to
$$P(M_n(w) < h) = (1 - pq^{h})^n + O(hq^h). \tag{6}$$
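Estimate (6) can be compared with the exact probability, obtained by propagating the pair (last letter, current run length) and never letting a run reach length h; all parameter choices below are ours, and the tolerance is a generous reading of the O(hq^h) error term:

```python
p = q = 0.5
L, n, h = 40, 1000, 10
w = [p * q ** v for v in range(L)]   # truncated geometric letter probabilities

# f[v][l] = P{no run of length >= h so far, word ends with l copies of letter v + 1}
f = [[0.0] * h for _ in range(L)]
for v in range(L):
    f[v][1] = w[v]
for _ in range(n - 1):
    tot = sum(map(sum, f))
    g = [[0.0] * h for _ in range(L)]
    for v in range(L):
        # a letter different from the last one starts a fresh run of length 1
        g[v][1] = (tot - sum(f[v])) * w[v]
        # the same letter extends the current run; reaching length h is forbidden
        for l in range(1, h - 1):
            g[v][l + 1] = f[v][l] * w[v]
    f = g

exact = sum(map(sum, f))             # P{M_n(w) < h}
approx = (1 - p * q ** h) ** n
print(abs(exact - approx) < 0.02)    # True
```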
Using (6) and Abel summation we then find that the first and second moment of the longest run are given by
$$\mathbf{E}M_n(w) = \sum_{h\ge1}\big(1 - P(M_n(w) < h)\big) = \sum_{h\ge1}\big(1-(1-pq^h)^n\big) + O(1),$$
$$\mathbf{E}M_n(w)^2 = 2\sum_{h\ge1} h\big(1 - P(M_n(w) < h)\big) - \mathbf{E}M_n(w) = \sum_{h\ge1}(2h-1)\big(1-(1-pq^h)^n\big) + O(1). \tag{7}$$
In order to compute the asymptotic behaviour of these two moments, we use the now classical exponential approximation technique (cf. [12]). Thereafter we make use of the Mellin transform and Mellin inversion formula to obtain finally

Proposition 4. The mean value of the length of the longest run Mn(w) in a string of n geometric random variables satisfies
$$\mathbf{E}M_n(w) = \log_{1/q} n + O(1).$$
Similarly, we could obtain an expression for the second moment:
$$\mathbf{E}M_n(w)^2 = \log_{1/q}^2 n + O(\log n). \tag{8}$$
References
1. L. Devroye. A limit theory for random skip lists. Advances in Applied Probability, 2:597–609, 1992.
2. P. Flajolet, X. Gourdon, and P. Dumas. Mellin transforms and asymptotics: Harmonic sums. Theoretical Computer Science, 144:3–58, 1995.
3. P. Flajolet and G. N. Martin. Probabilistic counting algorithms for data base applications. Journal of Computer and System Sciences, 31:182–209, 1985.
4. L. Guibas and A. Odlyzko. Long repetitive patterns in random sequences. Zeitschrift für Wahrscheinlichkeitstheorie, 53:241–262, 1980.
5. H.-K. Hwang. On convergence rates in the central limit theorems for combinatorial structures. European Journal of Combinatorics, 19:329–343, 1998.
6. P. Kirschenhofer and H. Prodinger. On the analysis of probabilistic counting. In E. Hlawka and R. F. Tichy, editors, Number-theoretic Analysis, volume 1452 of Lecture Notes in Mathematics, pages 117–120, 1990.
7. P. Kirschenhofer and H. Prodinger. A result in order statistics related to probabilistic counting. Computing, 51:15–27, 1993.
8. P. Kirschenhofer and H. Prodinger. The path length of random skip lists. Acta Informatica, 31:775–792, 1994.
9. P. Kirschenhofer, H. Prodinger, and W. Szpankowski. Analysis of a splitting process arising in probabilistic counting and other related algorithms. Random Structures and Algorithms, 9:379–401, 1996.
10. A. Knopfmacher and H. Prodinger. Combinatorics of geometrically distributed random variables: Value and position of the rth left-to-right maximum. Discrete Mathematics, to appear.
11. D. E. Knuth. The average time for carry propagation. Indagationes Mathematicae, 40:238–242, 1978.
12. D. E. Knuth. The Art of Computer Programming, volume 3: Sorting and Searching. Addison-Wesley, 1973. Second edition, 1998.
13. T. Papadakis, I. Munro, and P. Poblete. Average search and update costs in skip lists. BIT, 32:316–332, 1992.
14. H. Prodinger. Combinatorics of geometrically distributed random variables: Left-to-right maxima. Discrete Mathematics, 153:253–270, 1996.
15. W. Pugh. Skip lists: a probabilistic alternative to balanced trees. Communications of the ACM, 33:668–676, 1990.
Generalized Covariances of Multi-dimensional Brownian Excursion Local Times

Guy Louchard

Université Libre de Bruxelles, Département d'Informatique, CP 212, Boulevard du Triomphe, B-1050 Bruxelles, Belgium
Email: louchard@ulb.ac.be
Abstract. Expressions for the generalized covariances of multi-dimensional Brownian excursion local times are derived from the corresponding density transforms. Typical applications are moments of the cost of structures such as the M/G/1 queue, random trees, the Markov stack or the priority queue in Knuth's model. Brownian excursion area and a result of Biane and Yor are also revisited.
1  Introduction
Throughout this paper, the standard Brownian motion (BM) will be denoted by x(t). Fix t > 0 and denote the last zero of x before t and the first zero of x after t by
$$G(t) := \sup\{s : s \le t;\; x(s) = 0\}\qquad\text{and}\qquad D(t) := \inf\{s : s \ge t;\; x(s) = 0\}.$$
The processes restricted to [G(t), t] and [G(t), D(t)] are called the meandering process ending at t: Z(u) := x^+(G(t) + u), 0 ≤ u ≤ L^−(t) := t − G(t), and the excursion process straddling t: Y(u) := x^+(G(t) + u), 0 ≤ u ≤ L(t) := D(t) − G(t), respectively. The standard scale excursion (BE) is X(u) := [Y(u) | L = 1]; note that $Y(u) \stackrel{d}{=} \sqrt{\ell}\,X(u/\ell)$ when L = ℓ. The distributions of G and L are well known: see Chung [2, Theorem 1].
The local time of x(t) at a, denoted by
$$t^+(t,a) = \lim_{\epsilon\to0}\frac{1}{\epsilon}\int_0^t I_{[a,a+\epsilon]}(x(s))\,ds,$$
and the local time of the standard scaled excursion X at a, denoted by τ^+(a), have been studied by several authors (note that for an excursion of length ℓ we have $\tau^+(\ell,a) \stackrel{d}{=} \sqrt{\ell}\,\tau^+(a/\sqrt{\ell})$). See for instance Getoor and Sharpe [9], Knight [13], Cohen and Hooghiemstra [3], Hooghiemstra [12], Drmota and Gittenberger [4], Louchard [14], Gittenberger and Louchard [11]. Intuitively, the local time at a is the total time spent by the excursion in the neighbourhood of a.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 463–472, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Applications of the BE are numerous; we will mention a few of them, emphasizing the meaning of the local time. For instance, consider an M/G/1 queuing system. There the customers arrive according to a Poisson process (πt, t ≥ 0) with rate α^{−1}, where α > 0. Denote the arrival time of the n-th customer by tn and the service time by sn, which is assumed to be independent of the arrival process πt. Then the actual waiting time process is defined by w1 := 0, w_{n+1} := max{0, wn + sn − (t_{n+1} − tn)} and the virtual waiting time process by
$$v_t := \max\{0,\; w_{\pi_t} + s_{\pi_t} - (t - t_{\pi_t})\},\qquad t\ge0.$$
Furthermore, denote the length of the first busy period by ℓ. Then Cohen and Hooghiemstra [3] have shown that for arbitrary δ > 0 the following limit theorem holds:
$$\left(\frac{v_{su}}{\sqrt{2\alpha s}}\;\Big|\; s < \ell \le s+\delta\right),\ 0\le u\le 1\ \xrightarrow{\;d\;}\ X(u),\qquad s\to\infty.$$
In this context the BE local time process appears as the weak limit of the (suitably normalized) number of downcrossings of the virtual waiting time process, i.e.
$$d(v) = \#\{t : 0\le t\le \ell,\; v_t = v\};$$
(#A denotes the cardinality of A) conditioned on the number of customers
served during the first busy period (see [3, Sec. 7]). Another BE application is the number of nodes at some level in a random tree. Consider a simply generated random tree (according to the notion of Meir and Moon [19]) or, equivalently, the family tree of a Galton-Watson branching process conditioned on the total progeny. Then BE appears as the weak limit of the contour process of this tree, i.e. the process constructed from the distances of the nodes from the root when traversing the tree (for details see Gittenberger [10]). The local time corresponds here to the number of nodes at some level. The generation sizes of the branching processes converge weakly to BE local time. The external path length (EPL) of a random tree is given by the sum of distances from the root to the leaves.
Dynamical algorithms are also related to the BE. The stack structure of length 2n (see Flajolet [6] p. 126) is asymptotically equivalent to a BE (Louchard [15]). The priority queue in Knuth's model is combinatorially equivalent to a Markov stack (see Louchard et al. [17]). So the distribution of the size of this structure is asymptotically related, after suitable normalization, to the BE local time. The local time corresponds to the time spent by the structure at some level.
The cost G of structures such as the M/G/1 queue busy period, random tree, Markov stack or priority queue in Knuth's model is asymptotically given, for any cost function g(·), by $G = \int_0^1 g[X(u)]\,du$. For stacks and the priority queue, the cost is related to the size. For the M/G/1 queue, the cost is related to the waiting time. For the EPL, the cost is related to the distance to the leaves.
Moments of G are immediately related to the local time: we have
$$\mathbf{E}[G^d] = d!\int_0^\infty dx_1\int_{x_1}^\infty dx_2\cdots\int_{x_{d-1}}^\infty dx_d\; g(x_1)g(x_2)\cdots g(x_d)\,K(x_1,x_2,\cdots,x_d)$$
with $K(x_1,x_2,\cdots,x_d) := \mathbf{E}[\tau^+(x_1)\tau^+(x_2)\cdots\tau^+(x_d)]$ denoting the generalized covariances. In this paper we obtain explicit expressions for $K(x_1,x_2,\cdots,x_d)$. We revisit also two classical examples: the BE area (g(x) = x) related to the Airy distribution (which has a lot of applications in combinatorics and data structures) and a result of Biane and Yor [1] related to g(x) = 1/x.
The paper is organized as follows. Sec. 2 gives the basic formulas we need in the sequel; Sec. 3 provides an efficient algorithm for the computation of the generalized covariances. In Sec. 4, we consider two typical applications: the Brownian excursion area and the Biane and Yor formula.
2  Basic Formulas

In this section, we start from known results to derive expressions for the first generalized covariances $K(x_1,x_2,\cdots,x_d)$, d = 1, …, 4.
In [11] we obtained the following result, depending on some Laplace transforms:
$$\mathbf{E}\big[e^{-\sum_{1}^{d}\beta_i\tau^+(x_i)},\ 1 > m_{x_d}\big] = \frac{1}{2\pi i}\int_S e^{\alpha}\,\Theta(d)\,d\alpha,\qquad x_d > x_{d-1} > \cdots > x_1,$$
where S := [a − i∞, a + i∞], a > 0, m_x := inf{s : X(s) = x},
$$\Theta(d) = \frac{\alpha^d}{2[F_1(d)]^2\big[\beta_d + C_1(d) + C_2(d)D_2(d)/F_1(d)\big]}$$
with some functions depending only on α and x_·:
$$C_1(d) = \sqrt{\frac{\alpha}{2}}\;E(d,d-1)/Sh(d,d-1),$$
$$C_2(d) = -\frac{\alpha}{2\,Sh^2(d,d-1)},$$
$$C_3(d) = \sqrt{\frac{\alpha}{2}}\;\frac{Sh(d,d-2)}{Sh(d,d-1)\,Sh(d-1,d-2)},$$
$$C_4(d) = C_2(d-1),\qquad C_5(d) = \sqrt{2}\,Sh(d,d-1),$$
and some functions depending also on β_·:
$$F_1(d) = \beta_{d-1}D_2(d) + D_1(d),\qquad D_2(d) = \beta_{d-2}D_4(d) + D_3(d),$$
$$\frac{D_1(d)}{D_2(d)} = C_3(d) + C_4(d)\,\frac{D_4(d)}{D_2(d)},$$
$$D_3(d) = C_5(d)\,D_1(d-1),\qquad D_4(d) = C_5(d)\,D_2(d-1), \tag{1}$$
and
$$E(\ell,m) := e^{\sqrt{\cdot}\,[x_\ell - x_m]},\qquad Sh(\ell,m) := \sinh[\sqrt{\cdot}\,(x_\ell - x_m)],\qquad Sh(\ell) := \sinh(\sqrt{\cdot}\,x_\ell),\qquad \sqrt{\cdot} := \sqrt{2\alpha}.$$
Initialisations are given by
$$D_1(2) = \sqrt{\alpha}\,Sh(2)/\sqrt{2},\qquad D_2(2) = Sh(1)\,Sh(2,1).$$
From (1), it is possible (with MAPLE) to derive explicit expressions for successive derivatives of Θ(d). For instance, with $E(i) := e^{\sqrt{\cdot}\,x_i}$,
$$\bar K(\alpha,x_1) = \frac{\partial\Theta}{\partial\beta_1}\bigg|_{\beta_1=0} = 2E(1)^{-2},$$
$$\bar K(\alpha,x_1,x_2) = \frac{\partial^2\Theta}{\partial\beta_2\,\partial\beta_1}\bigg|_{\beta_1,\beta_2=0} = \frac{-2\sqrt2}{\sqrt\alpha}\,E(2)^{-2}\big(E(1)^{-2}-1\big), \tag{2}$$
$$\bar K(\alpha,x_1,x_2,x_3) = \frac{\partial^3\Theta}{\partial\beta_3\,\partial\beta_2\,\partial\beta_1}\bigg|_{\beta_1,\beta_2,\beta_3=0} = \frac{-2}{E(1)^{-2}\,\alpha}\,\big(3E(1)^{-2}E(2)^{-2} - E(2)^{-2} - 2E(1)^{-2}\big)\big(E(1)^{-2}-1\big)E(3)^{-2}, \tag{3}$$
$$\bar K(\alpha,x_1,x_2,x_3,x_4) = \frac{\partial^4\Theta}{\partial\beta_4\,\partial\beta_3\,\partial\beta_2\,\partial\beta_1}\bigg|_{\beta_1,\ldots,\beta_4=0} = -\frac{4}{\sqrt2}\,E(4)^{-2}\big(E(1)^{-2}-1\big)\cdot\big(6E(1)^{-2}E(2)^{-4}E(3)^{-2} - 6E(1)^{-2}E(3)^{-2}E(2)^{-2} + 2E(1)^{-2}E(2)^{-2} + E(2)^{-4} + E(1)^{-2}E(3)^{-2} - 3E(1)^{-2}E(2)^{-4} + 2E(3)^{-2}E(2)^{-2} - 3E(2)^{-4}E(3)^{-2}\big)\big/\big(E(2)^{-2}E(1)^{-2}\,\alpha^{3/2}\big).$$
Higher-order derivatives become difficult to compute, even with MAPLE, so another technique is obviously needed.
3  An Efficient Algorithm for Generalized Covariances Computation
In this section, we first derive a recurrence equation for some functions arising in the generalized covariances. This leads to some differential equations for related exponential generating functions. A simple matrix representation is finally obtained, and it remains to invert the Laplace transforms.
3.1  A Recurrence Equation
Let us first differentiate Θ w.r.t. βd (each time, after differentiation w.r.t. βi, we set βi = 0). This gives
$$\frac{-\alpha^d}{2[C_1F_1 + C_2D_2]^2}\,.$$
We should write C1(d), etc., but we drop the d-dependency to ease notation. Differentiating now w.r.t. β_{d−1}, this leads to
$$\frac{\alpha^d\,C_1 D_2}{[C_1D_1 + C_2D_2]^3} = \frac{\alpha^d\,C_1[\beta_{d-2}D_4 + D_3]}{[C_7\beta_{d-2}D_4 + C_7D_3 + C_1C_4D_4]^3}$$
with $C_7(d) := C_1(d)C_3(d) + C_2(d) = C_1(d)C_1(d-1)$ after detailed computation.
It is clear that the next differentiations will lead to some pattern. Indeed, set
$$H(d,i) := \frac{\partial^{d-2}}{\partial\beta_{d-2}\cdots\partial\beta_1}\,\frac{D_2(d)^i}{[C_1(d)D_1(d) + C_2(d)D_2(d)]^{i+2}}\bigg|_{\beta_{d-2}\cdots\beta_1=0}; \tag{4}$$
obviously
$$\frac{\partial^d\Theta}{\partial\beta_d\cdots\partial\beta_1}\bigg|_{\beta_d\cdots\beta_1=0} = C_1(d)\,\alpha^d\,(-1)^d\,H(d,1). \tag{5}$$
Expanding (4), we derive (omitting the details)
$$H(d,i) = \frac{1}{C_1(d)^{i+2}\,C_1(d-1)^{i-1}\,C_5(d)^2}\cdot\sum_{j=0}^{i-1}\binom{i-1}{j}(-1)^{i-1-j}C_2(d-1)^{i-1-j}\big[-2H(d-1,i-j) + (i+2)\,C_2(d-1)\,H(d-1,i-j+1)\big]. \tag{6}$$
(6) is still too complicated. So we first set H1(d, i) := H(d, i)C2(d)^{i−1}. This leads to
$$H_1(d,i) = \frac{C_2(d)^{i-1}}{C_1(d)^{i+2}\,C_1(d-1)^{i-1}\,C_5(d)^2}\cdot\sum_{j=0}^{i-1}\binom{i-1}{j}(-1)^{i-1-j}\big[-2H_1(d-1,i-j) + (i+2)H_1(d-1,i-j+1)\big].$$
But we remark that
$$\frac{C_2(d)}{C_1(d)\,C_1(d-1)} = -\frac{C_6(d-1)}{C_6(d)}\qquad\text{with}\qquad C_6(d) := E(d)^2 - E(d-1)^2. \tag{7}$$
Then, we set H2(d, i) := H1(d, i)C6(d)^{i−1} and we obtain
$$H_2(d,i) = \frac{1}{C_1(d)^3\,C_5(d)^2}\cdot\sum_{j=0}^{i-1}\binom{i-1}{j}(-1)^{j}C_6(d-1)^{j}\Big[-2H_2(d-1,i-j) + \frac{i+2}{C_6(d-1)}\,H_2(d-1,i-j+1)\Big]. \tag{8}$$
3.2  Some Generating Function
Eq. (8) is a perfect candidate for an exponential generating function (see Flajolet and Sedgewick [5]). We set
$$\varphi_2(d,v) := \sum_{i\ge1}\frac{H_2(d,i)\,v^{i-1}}{(i-1)!}\,;$$
(8) leads to
$$\varphi_2(d,v) = \frac{1}{C_1(d)^3\,C_5(d)^2}\cdot\Big[-2\varphi_2(d-1,v) + \frac{1}{C_6(d-1)^2}\,\frac{\partial^2}{\partial v^2}\big[\varphi_2(d-1,v)\cdot v\big]\,e^{-vC_6(d-1)} + \frac{1}{C_6(d-1)}\,\partial_v\varphi_2(d-1,v)\cdot\partial_v\big[e^{-vC_6(d-1)}\cdot v\big]\Big].$$
With (7), we are led to set
$$\varphi_3(d,v) := \varphi_2(d,v)\,e^{vE(d-1)^2}\qquad\text{and}\qquad H(d,1) = \varphi_3(d,0).$$
Before establishing the corresponding equation for φ3, it is now time to find the effect of all our transforms on φ2. Indeed, H(2, i) = γδ3^{i−1} (see (4)) with
$$\delta_1 = D_2(2),\qquad \delta_2 = C_1(2)D_1(2) + C_2(2)D_2(2),$$
$$\gamma = \frac{\delta_1}{\delta_2^3} = \frac{-2E(2)^{-2}\big(E(1)^{-2} - E(2)^{-2}\big)\big(E(1)^{-2}-1\big)}{E(1)^{-2}\,\alpha^3},\qquad \delta_3 = \frac{\delta_1}{\delta_2}.$$
So
$$H_1(2,i) = \gamma\delta_4^{i-1},\ \text{with } \delta_4 = \delta_3 C_2(2);\qquad H_2(2,i) = \gamma\delta_5^{i-1},\ \text{with } \delta_5 = \delta_4 C_6(2);$$
$$\varphi_2(2,v) = \gamma e^{v\delta_5},\qquad \varphi_3(2,v) = \gamma e^{v\delta_6},\ \text{with } \delta_6 = \delta_5 + E(1)^2 = 1,$$
after all computations. We see that it is convenient to finally set
$$\varphi_3(d,v) := \varphi_4(d,v)\,e^v,\qquad \varphi_4(2,v) = \gamma,\qquad H(d,1) = \varphi_4(d,0).$$
The differential equation for φ4 is computed as follows (we omit the details):
$$\varphi_4(d,v) = \frac{2\big(E(d-1)^{-2} - E(d)^{-2}\big)E(d)^{-2}}{E(d-1)^{-4}\big(E(d-2)^{-2} - E(d-1)^{-2}\big)\,\sqrt{\cdot}^{\,3}}\cdot\Big[\mu_1 v\,\frac{\partial^2}{\partial v^2}\varphi_4(d-1,v) + (\mu_2+\mu_3 v)\,\frac{\partial}{\partial v}\varphi_4(d-1,v) + (\mu_4+\mu_5 v)\,\varphi_4(d-1,v)\Big] \tag{9}$$
with
$$\mu_1 := E(d-2)^{-2}E(d-1)^{-2},\qquad \mu_2 := 3E(d-2)^{-2}E(d-1)^{-2},$$
$$\mu_3 := 2E(d-2)^{-2}E(d-1)^{-2} - E(d-2)^{-2} - E(d-1)^{-2},$$
$$\mu_4 := -2E(d-2)^{-2} - E(d-1)^{-2} + 3E(d-2)^{-2}E(d-1)^{-2},$$
$$\mu_5 := E(d-2)^{-2}E(d-1)^{-2} - E(d-1)^{-2} - E(d-2)^{-2} + 1.$$
3.3  A Matrix Representation
It is now clear that φ4(d, v) is made of two parts: the first one is given by the product of γ with all the coefficients in front of (9). The other part is given, for each d, by a polynomial in v, the coefficients of which are given by the following algorithm.
Start with vec2[0] = 1, vec2[i] = 0, i ≥ 1. Construct a tri-diagonal band matrix Ad as follows: if we apply the differential operator of (9), i.e. $\mu_1 v\frac{\partial^2}{\partial v^2} + (\mu_2+\mu_3 v)\frac{\partial}{\partial v} + (\mu_4+\mu_5 v)$, to a polynomial $\sum_0 a_i v^i$, we see that the new polynomial $\sum_0 \bar a_i v^i$ is given by
$$\bar a_0 = A_d[0,1]\,a_1 + A_d[0,0]\,a_0,$$
$$\bar a_i = A_d[i,i+1]\,a_{i+1} + A_d[i,i]\,a_i + A_d[i,i-1]\,a_{i-1},\qquad i\ge1,$$
with
$$A_d[i,i+1] := \big[i(i+1) + 3(i+1)\big]E(d-1)^{-2}E(d-2)^{-2},$$
$$A_d[i,i] := (2i+3)E(d-1)^{-2}E(d-2)^{-2} - (i+2)E(d-2)^{-2} - (i+1)E(d-1)^{-2},$$
$$A_d[i,i-1] := E(d-1)^{-2}E(d-2)^{-2} - E(d-1)^{-2} - E(d-2)^{-2} + 1. \tag{10}$$
All other elements of Ad are set to 0. Successive applications of Aℓ to vec2 give the coefficients of the polynomial part of φ4(d, v):
$$vec_d := \prod_{\ell=3}^{d} A_\ell\; vec_2. \tag{11}$$
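That the band matrix of (10) indeed represents the differential operator appearing in (9) can be checked mechanically on a polynomial with random coefficients; the numbers e1, e2 below are arbitrary numerical stand-ins for E(d−1)^{−2} and E(d−2)^{−2}:

```python
import random

random.seed(1)
e1, e2 = random.random(), random.random()     # stand-ins for E(d-1)^{-2}, E(d-2)^{-2}
mu1, mu2 = e1 * e2, 3 * e1 * e2
mu3 = 2 * e1 * e2 - e1 - e2
mu4 = -2 * e2 - e1 + 3 * e1 * e2
mu5 = e1 * e2 - e1 - e2 + 1

def op(a):
    """Apply mu1*v*f'' + (mu2 + mu3*v)*f' + (mu4 + mu5*v)*f to a coefficient list a."""
    out = [0.0] * (len(a) + 1)
    for i, ai in enumerate(a):
        if i >= 2:
            out[i - 1] += mu1 * i * (i - 1) * ai   # mu1 * v * f''
        if i >= 1:
            out[i - 1] += mu2 * i * ai             # mu2 * f'
            out[i] += mu3 * i * ai                 # mu3 * v * f'
        out[i] += mu4 * ai                         # mu4 * f
        out[i + 1] += mu5 * ai                     # mu5 * v * f
    return out

def A(i, j):
    """Entries of the band matrix A_d from (10)."""
    if j == i + 1:
        return (i * (i + 1) + 3 * (i + 1)) * e1 * e2
    if j == i:
        return (2 * i + 3) * e1 * e2 - (i + 2) * e2 - (i + 1) * e1
    if j == i - 1:
        return e1 * e2 - e1 - e2 + 1
    return 0.0

a = [random.random() for _ in range(6)]
direct = op(a)
via_matrix = [sum(A(i, j) * a[j] for j in range(len(a))) for i in range(len(direct))]
print(all(abs(x - y) < 1e-12 for x, y in zip(direct, via_matrix)))   # True
```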
Now, by (5), $\frac{\partial^d\Theta}{\partial\beta_d\cdots\partial\beta_1}\big|_{\beta_d\cdots\beta_1=0} = C_1(d)\,\alpha^d\,(-1)^d\,\varphi_4(d,0) = C_8(d)\,vec_d[0]$, where C8(d), after all simplifications, is given with (9) by
$$C_8(d) = \frac{-4\,(-1)^d\,E(d)^{-2}\big(E(1)^{-2}-1\big)}{\sqrt{\cdot}^{\,d-1}\,E(1)^{-2}\cdots E(d-2)^{-2}},\qquad d\ge3. \tag{12}$$
Let us summarize our results in the following theorem.

Theorem 1. $\bar K(\alpha,x_1,x_2,\cdots,x_d) = \frac{\partial^d\Theta}{\partial\beta_d\cdots\partial\beta_1}\big|_{\beta_d\cdots\beta_1=0} = C_8(d)\,vec_d[0]$, where C8(d) is given by (12), $vec_d = \prod_{\ell=3}^{d}A_\ell\,vec_2$, with vec2[0] = 1, vec2[i] = 0, i ≥ 1, and the band matrix Aℓ is given by (10).
The computation of our covariances is now trivial.
3.4  Inverting the Laplace Transforms
It is well known that $L_\alpha[f(u)] = e^{-\sqrt{\cdot}\,a}$, with $f(u) = \dfrac{a\,e^{-a^2/2u}}{\sqrt{2\pi}\,u^{3/2}}$. Also $L_\alpha[g(u)] = \dfrac{e^{-\sqrt{\cdot}\,a}}{\sqrt{2}\,\sqrt{\alpha}}$, with $g(u) = \dfrac{e^{-a^2/2u}}{\sqrt{2\pi u}}$.
Hence, from (2),
$$\mathbf{E}[\tau^+(x_1)\tau^+(x_2)] = -4\big[e^{-2(x_1+x_2)^2} - e^{-2x_2^2}\big],\qquad x_2\ge x_1. \tag{13}$$
(13)
We recover immediately Cohen and Hooghiemstra [3], (6.15)
Similarly, from (3),
E[τ + (x1 )τ + (x2 )τ + (x3 )]
Z 1n
2
2
3[e−2[x1 +x2 +x3 ] /t (x1 + x2 + x3 ) − e−2[x2 +x3 ] /t (x2 + x3 )]
=4
0
−[e−2[x2 +x3 ]
2
/t
2
(x2 + x3 ) − e−2[x3 +x2 −x1 ] (x2 + x3 − x1 )]
o dt
2
2
−2[e−2[x1 +x3 ] /t (x1 + x3 ) − e−2[x3 ] /t (x3 )] 3/2
t
(14)
Next covariances lead to similar expressions, with multiple integrals on t.
4  Some Applications
In this section, we apply the generalized covariances to two classical problems: the Brownian excursion area and a result of Biane and Yor related to the cost function g(x) = 1/x. We can proceed either from the Laplace transforms or from the explicit covariances.
4.1  Brownian Excursion Area

In [14], [16] we proved that $W_d := \mathbf{E}\big[\int_0^1 X(u)\,du\big]^d$ satisfies the following recurrence equation. Let $\gamma_k := (36\sqrt2)^k\,W_k\,\Gamma\big(\frac{2k-1}{2}\big)\big/2\sqrt\pi$. Then
$$\gamma_n = \frac{12n}{6n-1}\left(\phi_n - \sum_{k=1}^{n-1}\binom{n}{k}\phi_k\,\gamma_{n-k}\right) \tag{15}$$
with $\phi_k := \Gamma\big(3k+\frac12\big)\big/\Gamma\big(k+\frac12\big)$.
The corresponding distribution, also called the Airy distribution, has been the object of recent renewed interest (see Spencer [20], Flajolet et al. [8], where many examples are given).
(15) leads to $W_1 = \sqrt{2\pi}/4$, $W_2 = 5/12$, $W_3 = \sqrt{2\pi}\,15/128$, … From (2) we compute the Laplace transforms
$$2\int_0^\infty x_1\,dx_1\int_{x_1}^\infty x_2\,dx_2\;\frac{-2\sqrt2}{\sqrt\alpha}\,E(2)^{-2}\big[E(1)^{-2}-1\big] = \frac{5\sqrt2}{32\,\alpha^{5/2}}.$$
Inverting, this leads to 5/12 as expected. Similarly, with G(α) given by (3), $3!\int_0^\infty x_1\,dx_1\int_{x_1}^\infty x_2\,dx_2\int_{x_2}^\infty x_3\,dx_3\;G(\alpha) = \frac{6\cdot15}{128\,\alpha^4}$. Inverting, this leads to $\sqrt{2\pi}\,15/128$ as expected.
An interesting question is how to derive the recurrence (15) from the matrix
representation given by Theorem 1.
4.2  A Formula of Biane and Yor
In [1], Biane and Yor proved that
$$Y := \int_0^1\frac{dt}{X(t)}\ \stackrel{D}{\equiv}\ 2\xi,\qquad\text{where}\quad \xi := \sup_{[0,1]} X(t).$$
With our techniques, we prove in the full report [18] that all moments of both sides are equal.
5  Conclusion
We have constructed a simple and efficient algorithm to compute the generalized covariances $K(x_1,x_2,\cdots,x_d)$. Another challenge would be to derive the cross-moments of any order:
$$K(x_1,i_1,x_2,i_2,\cdots,x_d,i_d) = \mathbf{E}\big[\tau^+(x_1)^{i_1}\tau^+(x_2)^{i_2}\cdots\tau^+(x_d)^{i_d}\big].$$
It appears that the first two derivatives $\partial^{i_d}_{\beta_d}\partial^{i_{d-1}}_{\beta_{d-1}}\Theta\big|_{\beta_d=\beta_{d-1}=0}$ lead to a linear combination of terms of type
$$D_2(d)^{\ell}\,D_4(d)^{r}\big/\big(C_1(d)D_1(d) + C_2(d)D_2(d)\big)^{m}.$$
The next derivative $\partial^{i_{d-2}}_{\beta_{d-2}}\Theta$, after all simplifications (and setting $\beta_{d-2} = 0$), leads to terms of type
$$H(d-1,s,t) := \frac{D_2(d-1)^{s}}{\big[C_1(d-1)D_1(d-1) + C_2(d-1)D_2(d-1)\big]^{t}},$$
and this pattern appears in all successive derivatives. So we can, in principle, construct (complicated) recurrence equations for H(·, s, t) and recover our K by linear combinations. This is quite tedious and, up to now, we couldn't obtain such a simple generating function as in Sec. 3.2.
References
1. Biane, Ph., Yor, M.: Valeurs principales associées aux temps locaux Browniens. Bull. Sc. math. 2e série, 111 (1987) 23–101.
2. Chung, K.L.: Excursions in Brownian motion. Ark. Mat. 14 (1976) 155–177.
3. Cohen, J.W., Hooghiemstra, G.: Brownian excursion, the M/M/1 queue and their occupation times. Math. Operat. Res. 6 (1981) 608–629.
4. Drmota, M., Gittenberger, B.: On the profile of random trees. Rand. Str. Alg. 10 (1997) 421–451.
5. Flajolet, Ph., Sedgewick, R.: An Introduction to the Analysis of Algorithms. Addison-Wesley, U.S.A. (1996)
6. Flajolet, Ph.: Analyse d'algorithmes de manipulation d'arbres et de fichiers. Université Pierre-et-Marie-Curie, Paris (1981)
7. Flajolet, Ph., Odlyzko, A.: Singularity analysis of generating functions. SIAM J. Disc. Math. 3, 2 (1990) 216–240.
8. Flajolet, Ph., Poblete, P., Viola, A.: On the analysis of linear probing hashing. Algorithmica 22 (1998) 490–515.
9. Getoor, R.K., Sharpe, M.J.: Excursions of Brownian motion and Bessel processes. Z. Wahrscheinlichkeitsth. 47 (1979) 83–106.
10. Gittenberger, B.: On the contour of random trees. SIAM J. Discr. Math., to appear.
11. Gittenberger, B., Louchard, G.: The Brownian excursion multi-dimensional local time density. To appear in JAP. (1998)
12. Hooghiemstra, G.: On the explicit form of the density of Brownian excursion local time. Proc. Amer. Math. Soc. 84 (1982) 127–130.
13. Knight, F.B.: On the excursion process of Brownian motion. Zbl. Math. 426, abstract 60073 (1980)
14. Louchard, G.: Kac's formula, Levy's local time and Brownian excursion. J. Appl. Prob. 21 (1984) 479–499.
15. Louchard, G.: Brownian motion and algorithm complexity. BIT 26 (1986) 17–34.
16. Louchard, G.: The Brownian excursion area: a numerical analysis. Comp. & Maths. with Appls. 10, 6 (1984) 413–417.
17. Louchard, G., Randrianarimanana, B., Schott, R.: Dynamic algorithms in D.E. Knuth's model: a probabilistic analysis. Theoret. Comp. Sci. 93 (1992) 201–225.
18. Louchard, G.: Generalized covariances of multi-dimensional Brownian excursion local times. TR 396, Département d'Informatique (1999)
19. Meir, A., Moon, J.W.: On the altitude of nodes in random trees. Can. J. Math. 30 (1978) 997–1015.
20. Spencer, J.: Enumerating graphs and Brownian motion. Comm. Pure and Appl. Math. L (1997) 291–294.
Combinatorics of Geometrically Distributed Random Variables: Length of Ascending Runs

Helmut Prodinger⋆

The John Knopfmacher Centre for Applicable Analysis and Number Theory, Department of Mathematics, University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa
helmut@gauss.cam.wits.ac.za, WWW home page: http://www.wits.ac.za/helmut/index.htm
Abstract. For n independently distributed geometric random variables we consider the average length of the m-th run, for fixed m and n → ∞. One particular result is that this parameter approaches 1 + q. In the limiting case q → 1 we thus rederive known results about runs in permutations.
1  Introduction
Knuth in [6] has considered the average length Lk of the kth ascending run in random permutations of n elements (for simplicity, mostly the instance n → ∞ was discussed). This parameter has an important impact on the behaviour of several sorting algorithms.
Let X denote a geometrically distributed random variable, i.e. P{X = k} = pq^{k−1} for k ∈ N and q = 1 − p. In a series of papers we have dealt with the combinatorics of geometric random variables, and it turned out that in the limiting case q → 1 the results (when they made sense) were the same as in the instance of permutations. Therefore we study the concept of ascending runs in this setting. We are considering infinite words, with letters 1, 2, …, which appear with probabilities p, pq, pq², …. If we decompose a word into ascending runs
$$a_1 < \cdots < a_r \ge b_1 < \cdots < b_s \ge c_1 < \cdots < c_t \ge \cdots,$$
then r is the length of the first, s of the second, t of the third run, and so on. We are interested in the averages of these parameters.
⋆ This research was partially conducted while the author was a guest of the projet Algo at INRIA, Rocquencourt. The funding came from the Austrian–French "Amadée" cooperation.
G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 473–482, 2000.
© Springer-Verlag Berlin Heidelberg 2000
2  Words with Exactly m Runs
As a preparation, we consider the probability that a random word of length n has m ascending runs. For m = 1, this is given by
$$[z^n]\prod_{i\ge1}\big(1 + pq^{i-1}z\big)$$
for n ≥ 1. But the product involved here is well known in the theory of partitions; the usual notation is
$$(a)_n = (a;q)_n = (1-a)(1-aq)(1-aq^2)\cdots(1-aq^{n-1})$$
and
$$(a)_\infty = (a;q)_\infty = (1-a)(1-aq)(1-aq^2)\cdots.$$
Therefore
$$\prod_{i\ge1}\big(1+pq^{i-1}z\big) = (-pz)_\infty = \sum_{n\ge0}\frac{p^n q^{\binom n2} z^n}{(q)_n}\,,$$
the last equality being the celebrated identity of Euler [2]. This was already noted in [7]. If we set
$$\Lambda_m(z) = \sum_{n\ge0}\big[\text{Pr. that a word of length } n \text{ has (exactly) } m \text{ ascending runs}\big]z^n,$$
then Λ0(z) = 1 and Λ1(z) = (−pz)∞ − 1.
Now for general m we should consider $(-pz)_\infty^m$. Indeed, words with exactly m ascending runs have a unique representation in this product. However, this product also contains words with less than m runs, and we have to subtract that.
A word with m − 1 ascending runs is n + 1 times as often contained as in Λ_{m−1}(z). This is so because we can choose any gap between two letters (also on the border) in n + 1 ways. Such a gap means that we deliberately cut a run into pieces. Then, however, everything is unique. In terms of generating functions, this is D(zΛ_{m−1}(z)). (We write D = d/dz.) For m − 2 ascending runs, we can select 2 gaps in $\binom{n+2}{2}$ ways, which amounts to $\frac{1}{2!}D^2\big(z^2\Lambda_{m-2}(z)\big)$, and so on. Therefore we have the following recurrence:
$$\Lambda_m(z) = P^m - \sum_{k=1}^{m}\frac{1}{k!}D^k\big(z^k\Lambda_{m-k}(z)\big),\qquad \Lambda_0(z) = 1; \tag{1}$$
Combinatorics of Geometrically Distributed Random Variables
475
here are the first few values. We use the abbreviations P = (−pz)∞ and Pk =
z k Dk P .
Λ0 = 1,
Λ1 = P − 1
Λ2 = (P − 1)P − P1 ,
1
Λ3 = (P − 1)P 2 + P1 − 2P P1 + P2 ,
2
1
1
3
Λ4 = (P − 1)P + 2P P1 − P2 − 3P 2 P1 + P12 + P P2 − P3 .
2
6
In the limiting case q → 1 we can specify these quantities explicitly. This
was obtained by experiments after a few keystrokes with trusty Maple.—Instead
of P we just have ez , and that definitely makes life much easier, since all the
derivatives are still P .
Theorem 1. The sequence Λm(z) is defined as follows:
$$\Lambda_m(z) := e^{mz} - \sum_{k=1}^{m}\frac{1}{k!}D^k\big(z^k\Lambda_{m-k}(z)\big),\qquad \Lambda_0(z) := 1.$$
Then we have for m ≥ 1
$$\Lambda_m(z) = \sum_{j=0}^{m} e^{jz}\,z^{m-j-1}\,\frac{(-1)^{m-j}\,j^{m-j-1}\,\big(j(z-1)+m\big)}{(m-j)!}\,.$$
Proof. First notice that if we write
$$\lambda_m(z) = \sum_{j=0}^{m} e^{jz}\,\frac{(-jz)^{m-j}}{(m-j)!}$$
for m ≥ 0, then Λm(z) = λm(z) − λ_{m−1}(z). And the equivalent formula is
$$\sum_{k=0}^{m}\frac{1}{k!}D^k\big(z^k\lambda_{m-k}(z)\big) = \sum_{j=1}^{m} e^{jz},$$
which we will prove by induction, the basis being trivial (as usual). Now
$$\sum_{k=0}^{m}\frac{1}{k!}D^k\big(z^k\lambda_{m-k}(z)\big) = \sum_{k=0}^{m}\frac{1}{k!}D^k\sum_{j=1}^{m-k} e^{jz} z^{m-j}\,\frac{(-j)^{m-k-j}}{(m-k-j)!}$$
$$= \sum_{k=0}^{m}\frac{1}{k!}\sum_{i=0}^{k}\binom{k}{i}\sum_{j=1}^{m-k} j^{k-i}\,e^{jz}\,z^{m-j-i}\,\frac{(m-j)!}{(m-j-i)!}\,\frac{(-j)^{m-k-j}}{(m-k-j)!}\,,$$
and we have to prove that the coefficient of e^{jz} therein is 1. Writing M := m − j it is
$$\sum_{k=0}^{M}\frac{1}{k!}\sum_{i=0}^{k}\binom{k}{i}\,j^{k-i}\,z^{M-i}\,\frac{M!}{(M-i)!}\,\frac{(-j)^{M-k}}{(M-k)!}$$
$$= \sum_{i=0}^{M}\frac{(jz)^{M-i}}{(M-i)!}\sum_{k=i}^{M}\frac{M!}{k!}\binom{k}{i}\frac{(-1)^{M-k}}{(M-k)!}$$
$$= \sum_{i=0}^{M}\frac{(jz)^{M-i}}{(M-i)!}\binom{M}{i}\sum_{k=i}^{M}\binom{M-i}{k-i}(-1)^{M-k}$$
$$= \sum_{i=0}^{M}\frac{(jz)^{M-i}}{(M-i)!}\binom{M}{i}(-1)^{M-i}\,\delta_{M,i} = 1.$$
Thus
$$\sum_{k=1}^{m}\frac{1}{k!}D^k\big(z^k\Lambda_{m-k}(z)\big) = \sum_{k=1}^{m}\frac{1}{k!}D^k\big(z^k\lambda_{m-k}(z)\big) - \sum_{k=1}^{m}\frac{1}{k!}D^k\big(z^k\lambda_{m-1-k}(z)\big)$$
$$= \sum_{j=1}^{m} e^{jz} - \lambda_m(z) - \sum_{j=1}^{m-1} e^{jz} + \lambda_{m-1}(z) = e^{mz} - \Lambda_m(z),$$
and the result follows. □
Remark. Since every word has some number of ascending runs, we must have that
$$\sum_{m\ge0}\Lambda_m(z) = \frac{1}{1-z}\,.$$
In the limiting case q → 1 we will give an independent proof. For the general case, see the next sections. Consider
$$\Lambda_0(z) + \Lambda_1(z) + \cdots + \Lambda_m(z) = \lambda_m(z);$$
for m = 6 we get e.g.
$$\lambda_6(z) = 1 + z + z^2 + z^3 + z^4 + z^5 + z^6 + \tfrac{5039}{5040}z^7 + \tfrac{5009}{5040}z^8 + \tfrac{38641}{40320}z^9 + O\big(z^{10}\big).$$
Now this is no coincidence, since we will prove that
$$[z^n]\lambda_m(z) = 1\qquad\text{for } n\le m.$$
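The displayed coefficients can be reproduced in exact rational arithmetic from the closed form for λm(z) used in the proof of Theorem 1:

```python
from fractions import Fraction
from math import factorial

def lam_coeffs(m, N):
    """First N coefficients of lambda_m(z) = sum_{j=0}^m e^{jz} (-jz)^{m-j}/(m-j)!."""
    c = [Fraction(0)] * N
    for j in range(m + 1):
        pref = Fraction((-j) ** (m - j), factorial(m - j))
        for t in range(N - (m - j)):
            c[m - j + t] += pref * Fraction(j ** t, factorial(t))
    return c

c = lam_coeffs(6, 10)
print(c[:7] == [Fraction(1)] * 7)        # True: [z^n] lambda_6 = 1 for n <= 6
print(c[7] == Fraction(5039, 5040))      # True
print(c[8] == Fraction(5009, 5040))      # True
```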
This amounts to proving that¹
$$\frac{1}{n!}\sum_{k=0}^{n}\binom{n}{k}(m-k)^n(-1)^k = 1.$$
Notice that
$$\frac{1}{n!}\sum_{k=0}^{n}\binom{n}{k}(-1)^{n-k}k^h = \left\{{h\atop n}\right\}$$
are Stirling subset numbers, and they are zero for h < n. Therefore, upon expanding (m − k)^n by the binomial theorem, almost all the terms are annihilated; only (−k)^n survives. But then it is a Stirling number $\left\{{n\atop n}\right\} = 1$. It is not too hard to turn this proof involving "discrete convergence" into one with "ordinary convergence." □
3  The Average Length of the mth Run
We consider the parameter “combined lengths” of the first m runs in infinite
strings and its probability generating function.
Note carefully hat now the elements of the probability space are infinite
words, as opposed to the previous chapter where we had words of length n. This
is not unlike the situation in the paper [4].
To say that this parameter is larger or equal to n is the same as to say that
a word of length n (the first n letters of the infinite word) has ≤ m ascending
runs. Therefore the sought probability generating function is
1
z−1
Λ0 (z) + · · · + Λm (z) + .
Fm (z) =
z
z
Now
′
Fm
(1) = Λ0 (1) + · · · + Λm (1) − 1 = Λ1 (1) + · · · + Λm (1)
is the expected value of the combined lengths of the first m runs. Thus Λm (1)
is the expected value of the length of the mth run (m ≥ 1).
In the limiting case q → 1 we can say more since we know Λm (z) explicitly:
$$L_m = \Lambda_m(1) = m\sum_{j=0}^{m}\frac{(-1)^{m-j}\,j^{m-j-1}}{(m-j)!}\,e^{j},$$
and this is exactly the formula that appears in [6] for the instance of permutations.
There, we also learn that Lm → 2; in the general case, we will see in the next
sections that Lm → 1 + q.
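The explicit formula lends itself to a quick numerical illustration; the following Python sketch (ours, not the paper's) evaluates Lm in floating point and watches it approach 2.

```python
from math import exp, factorial

def L(m):
    """L_m = m * sum_{j=0}^m (-1)^(m-j) j^(m-j-1) e^j / (m-j)!   (q -> 1 case).
    Note j**(m-j-1) has a negative exponent only at j = m, where it is 1/m."""
    return m * sum((-1)**(m - j) * float(j)**(m - j - 1) * exp(j) / factorial(m - j)
                   for j in range(1 if m == 0 else 0, m + 1))

# L_1 = e - 1 (the classical expected length of the first run),
# L_2 = e^2 - 2e, and L_m tends to 2.
assert abs(L(1) - (exp(1) - 1)) < 1e-12
assert abs(L(2) - (exp(2) - 2 * exp(1))) < 1e-9
assert abs(L(12) - 2.0) < 1e-6
```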
¹ A gifted former student of electrical engineering (Hermann A.) contacted me years after he was in my Discrete Mathematics course, telling me that he found this (or an equivalent) formula, but could not prove it. He was quite excited, but after I emailed him how to prove it, I never heard from him again.
478
H. Prodinger

4 Solving the Recursion
The effect of the operator $\frac{1}{k!}D^k(z^k f)$ can also be described by the Hadamard product (see [3] for definitions) of f with the series
$$T_k = \sum_{n}\binom{n+k}{n}z^n.$$
The reformulation of the recursion is then
$$P^m = \sum_{k=0}^{m}T_k \odot \Lambda_{m-k}.$$
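As a quick numerical illustration of this equivalence (the helper names below are ours, and the check is only a sketch), one can compare the operator's action on a sample polynomial with the Hadamard product against T_k:

```python
from math import comb, factorial

def hadamard(f, g):
    """Coefficientwise (Hadamard) product of two coefficient lists."""
    return [a * b for a, b in zip(f, g)]

def op(f, k):
    """Coefficients of (1/k!) D^k (z^k f) for a polynomial f given as a
    coefficient list (exact integer arithmetic)."""
    g = [0] * k + list(f)                       # multiply by z^k
    for _ in range(k):                          # differentiate k times
        g = [(n + 1) * g[n + 1] for n in range(len(g) - 1)]
    return [c // factorial(k) for c in g]

# [z^n] of (1/k!) D^k (z^k f) equals C(n+k, n) * [z^n] f, i.e. T_k ⊙ f.
f = [3, 1, 4, 1, 5, 9]
for k in range(4):
    T_k = [comb(n + k, n) for n in range(len(f))]
    assert op(f, k) == hadamard(T_k, f)
```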
We want to invert this relation to get the Λm(z)'s from the powers of P. We get
$$\Lambda_m(z) = \sum_{k=0}^{m}U_k \odot P^{m-k},$$
where
$$U_k = [w^k]\,1\Bigl/\sum_{j\ge0}T_j w^j.$$
j≥0
Lemma 1.
$$U_k = (-1)^k\sum_{n}\binom{n+1}{k}z^n.$$
Proof. Since
$$U_n = -\sum_{k=0}^{n-1}U_k T_{n-k},$$
it is best to prove the formula by induction, the instance n = 0, i.e. U0 = 1/(1−z)
being obvious.
The right-hand side multiplies the coefficient of z^n by
$$-\sum_{l=0}^{k-1}(-1)^l\binom{n+1}{l}\binom{n+k-l}{n} = (-1)^{k+1}\sum_{l=0}^{k-1}\binom{n+1}{l}\binom{-n-1}{k-l} = (-1)^k\binom{n+1}{k},$$
which finishes the proof.
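Lemma 1 can also be confirmed numerically: since the z-wise products here are Hadamard products, the lemma is equivalent to Σ_k U_k ⊙ T_{K−k} being the Hadamard unit 1/(1−z) for K = 0 and zero for K ≥ 1, coefficient by coefficient. A small Python sketch (an illustration only; the function names are ours):

```python
from math import comb

def T(k, n):
    """[z^n] T_k = C(n+k, n)."""
    return comb(n + k, n)

def U(k, n):
    """[z^n] U_k = (-1)^k C(n+1, k), by Lemma 1."""
    return (-1)**k * comb(n + 1, k)

# For every coefficient index n: sum_{k=0}^{K} U_k ⊙ T_{K-k} must have
# coefficient 1 for K = 0 (the Hadamard unit 1/(1-z)) and 0 for K >= 1.
for K in range(8):
    for n in range(12):
        assert sum(U(k, n) * T(K - k, n) for k in range(K + 1)) == (K == 0)
```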
Therefore we get the formula

Proposition 1.
$$[z^n]\Lambda_m(z) = [z^n]\sum_{k=0}^{m}(-1)^k\binom{n+1}{k}P^{m-k};$$
for n < m this is zero by the combinatorial interpretation or directly.
Now this form definitely looks like an outcome of the inclusion–exclusion
principle. Indeed, there are n + 1 gaps between the n letters, and one can pick
k of them where the condition that a new run should start is violated. Since we
don’t need that, we confine ourselves to this brief remark.
Let us prove that
$$\sum_{m\ge0}\Lambda_m(z) = \frac{1}{1-z}:$$
$$[z^n]\sum_{m\ge0}\Lambda_m(z) = [z^n]\sum_{0\le k\le m\le n}(-1)^k\binom{n+1}{k}P^{m-k}$$
(only m ≤ n contributes, since [z^n]Λm(z) = 0 for m > n)
$$= [z^n]\sum_{k=0}^{n}(-1)^{n-k}\binom{n+1}{k+1}\sum_{i=0}^{k}P^i = [z^n]\sum_{k=0}^{n}(-1)^{n-k}\binom{n+1}{k+1}\sum_{i=0}^{k}\sum_{j=0}^{i}\binom{i}{j}(P-1)^j$$
$$= [z^n](P-1)^n = 1.$$
(Note that P − 1 = z + · · · .)
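The telescoping can be checked in exact integer arithmetic in the limiting case q → 1, where [z^n]P^j = j^n/n! (an assumption of this sketch, with a hypothetical helper name):

```python
from math import comb, factorial

def coeff_sum(n):
    """n! * [z^n] sum_m Lambda_m(z) in the limit q -> 1, using Proposition 1
    with [z^n] P^j = j^n / n!  (i.e. P = e^z)."""
    return sum((-1)**k * comb(n + 1, k) * (m - k)**n
               for m in range(n + 1) for k in range(m + 1))

# The identity predicts [z^n] sum_m Lambda_m(z) = 1 for every n.
assert all(coeff_sum(n) == factorial(n) for n in range(10))
```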
Let us also rederive the formula for Λm(1) in the limiting case q → 1; since
$$[z^n]\Lambda_m(z) = \sum_{k=0}^{m}(-1)^{m-k}\binom{n+1}{m-k}\frac{k^n}{n!},$$
we have
$$\sum_{n\ge0}[z^n]\Lambda_m(z) = \sum_{k=0}^{m}\frac{(-1)^{m-k}}{(m-k)!}\sum_{n\ge0}k^n\left(\frac{1}{(n-m+k)!} + \frac{m-k}{(n+1-m+k)!}\right)$$
$$= \sum_{k=0}^{m}\frac{(-1)^{m-k}\bigl(k^{m-k} + (m-k)\,k^{m-k-1}\bigr)}{(m-k)!}\,e^k = m\sum_{k=0}^{m}\frac{(-1)^{m-k}}{(m-k)!}\,k^{m-k-1}\,e^k.$$
5 A Double Generating Function
From Proposition 1 we infer that
$$\Lambda_m(z) = \sum_{k=0}^{m}\frac{(-1)^k}{k!}D^k\bigl(zP^{m-k}\bigr) = \sum_{k=0}^{m}\frac{(-1)^k}{k!}\bigl(zD^kP^{m-k} + kD^{k-1}P^{m-k}\bigr)$$
$$= z\sum_{k=0}^{m}\frac{(-1)^k}{k!}D^kP^{m-k} - \sum_{k=0}^{m-1}\frac{(-1)^k}{k!}D^kP^{m-1-k}.$$
Now these forms look like convolutions, and thus we introduce the double generating function
$$R(w,z) = \sum_{m\ge0}\Lambda_m(z)\,w^m.$$
Upon summing we find
$$R(w,z) = ze^{-wD}\frac{1}{1-wP} - we^{-wD}\frac{1}{1-wP} = \frac{z-w}{1-w\bigl(-p(z-w)\bigr)_\infty};$$
the last step was by noticing that $e^{aD}$ is the shift operator $E^a$ (see e.g. [1]).
It is tempting to plug in w = 1, because of the summation of the Λm (z)’s,
but this is forbidden, because of divergence.
The instance (q → 1)
$$R(w,1) = \frac{1-w}{1-we^{1-w}}$$
differs from Knuth's
$$\frac{w(1-w)}{e^{w-1}-w} - w$$
just by 1, because in [6] L0 = 0, whereas here it is L0 = 1.
Theorem 2. The generating function of the numbers Lm is given by
$$\frac{1-w}{1-w\displaystyle\prod_{i\ge0}\bigl(1-(w-1)pq^i\bigr)}.$$
The dominant singularity is at w = 1, and the local expansion starts as
$$\frac{1+q}{1-w} + \cdots.$$
From this, singularity analysis [5] entails that
Lm = 1 + q + O(ρ−m )
for some ρ > 1 that depends on q.
Experiments indicate the expansion (m → ∞)
$$L_m = 1 + q - q^{m+1} + 2m\,q^{m+2} - (1+2m^2)\,q^{m+3} - \frac{2m^4+10m^2-15m+9}{3}\,q^{m+4} + \cdots,$$
so that it seems that one could take ρ = 1/q − ǫ. However, that would require a proof.
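One way to observe this behaviour is to expand the generating function of Theorem 2 numerically. The following Python sketch is not part of the paper; the truncation of the infinite product and of the power series, and the function name, are assumptions of the illustration.

```python
def L_strict(q, M, terms=200):
    """Coefficients L_0..L_M of (1-w) / (1 - w * prod_i (1-(w-1) p q^i)),
    the generating function of Theorem 2, with p = 1 - q and the infinite
    product truncated after `terms` factors."""
    p = 1.0 - q
    prod = [1.0] + [0.0] * M                   # the product, as a poly in w
    for i in range(terms):
        c = p * q**i                           # factor (1 + c) - c*w
        prod = [(1 + c) * prod[j] - (c * prod[j - 1] if j else 0.0)
                for j in range(M + 1)]
    den = [1.0] + [-prod[j - 1] for j in range(1, M + 1)]   # 1 - w*prod
    num = [1.0, -1.0] + [0.0] * (M - 1)                     # 1 - w
    L = []
    for n in range(M + 1):                     # power-series division
        L.append(num[n] - sum(den[k] * L[n - k] for k in range(1, n + 1)))
    return L

L = L_strict(0.5, 40)
assert L[0] == 1.0                             # L_0 = 1, as noted above
assert abs(L[40] - 1.5) < 1e-3                 # L_m -> 1 + q for q = 1/2
```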
6 Weakly Ascending Runs
Relaxing the conditions, we might now say that · · · > a1 ≤ · · · ≤ ar > · · · is a
run (of length r).
Many of the previous considerations carry over, so we only give a few remarks.
The recursion (1) stays the same, but with P = 1/(pz)∞ .
With this choice of P the formula
$$[z^n]\Lambda_m(z) = [z^n]\sum_{k=0}^{m}(-1)^k\binom{n+1}{k}P^{m-k}$$
still holds.
The bivariate generating function is
$$\frac{z-w}{1-w\bigl/\bigl(p(z-w)\bigr)_\infty}.$$
The poles of interest are the solutions of
$$\prod_{i\ge0}\bigl(1+(w-1)pq^i\bigr) = w;$$
the dominant one is w = 1 with a local expansion
$$\Bigl(1+\frac{1}{q}\Bigr)\frac{1}{1-w} + \cdots,$$
from which we can conclude that Lm → 1 + 1/q.
And the experiments indicate that
$$L_m = 1 + \frac{1}{q} - (-1)^m q^{2m-1}\Bigl(1 - 2(m-1)q + (2m^2-5m+4)q^2 + \cdots\Bigr).$$
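As in the strictly ascending case, the limit can be watched numerically by expanding this generating function at z = 1; the truncations below and the function name are assumptions of this sketch, not the paper's.

```python
def L_weak(q, M, terms=200):
    """Coefficients L_0..L_M of (1-w) / (1 - w / (p(1-w))_inf), the weakly
    ascending case at z = 1, with p = 1 - q; equivalently
    (1-w) B(w) / (B(w) - w) for B(w) = prod_i (1 - p (1-w) q^i), truncated."""
    p = 1.0 - q
    B = [1.0] + [0.0] * M
    for i in range(terms):
        c = p * q**i                           # factor (1 - c) + c*w
        B = [(1 - c) * B[j] + (c * B[j - 1] if j else 0.0)
             for j in range(M + 1)]
    num = [B[0]] + [B[j] - B[j - 1] for j in range(1, M + 1)]   # (1-w)*B
    den = list(B)
    den[1] -= 1.0                              # B(w) - w
    L = []
    for n in range(M + 1):                     # power-series division
        L.append((num[n] - sum(den[k] * L[n - k] for k in range(1, n + 1))) / den[0])
    return L

Lw = L_weak(0.5, 30)
assert abs(Lw[0] - 1.0) < 1e-12
assert abs(Lw[30] - 3.0) < 1e-3                # L_m -> 1 + 1/q for q = 1/2
```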
References
1. M. Aigner. Combinatorial Theory. Springer, 1997. Reprint of the 1979 edition.
2. G. E. Andrews. The Theory of Partitions, volume 2 of Encyclopedia of Mathematics
and its Applications. Addison–Wesley, 1976.
3. L. Comtet. Advanced Combinatorics. Reidel, Dordrecht, 1974.
4. P. Flajolet, D. Gardy, and L. Thimonier. Birthday paradox, coupon collectors,
caching algorithms, and self–organizing search. Discrete Applied Mathematics,
39:207–229, 1992.
5. P. Flajolet and A. Odlyzko. Singularity analysis of generating functions. SIAM
Journal on Discrete Mathematics, 3:216–240, 1990.
6. D. E. Knuth. The Art of Computer Programming, volume 3: Sorting and Searching.
Addison-Wesley, 1973. Second edition, 1998.
7. H. Prodinger. Combinatorial problems of geometrically distributed random variables and applications in computer science. In V. Strehl and R. König, editors,
Publications de l’IRMA (Straßbourg), volume 30, pages 87–95, 1993.
Author Index
Akhavi, Ali, 355
Ambainis, Andris, 207
Avis, David, 154
Barth, D., 308
Béal, Marie-Pierre, 397
Bender, Michael A., 88
Berrizbeitia, Pedro, 269
Bloom, Stephen L., 367
Borodin, Allan, 173
Carton, Olivier, 397, 407
Cicerone, Serafino, 247
Coffman Jr., E. G., 292
Cohen, Myra B., 95
Colbourn, Charles J., 95
Corneil, Derek G., 126
Corteel, S., 308
D’Argenio, Pedro R., 427
De Simone, Caterina, 154
Denise, A., 308
Di Stefano, Gabriele, 247
Echagüe, Juan V., 427
El-Yaniv, Ran, 173
Ésik, Zoltán, 367
Farach-Colton, Martín, 88
Fernández, Maribel, 447
Fernández-Baca, David, 217
Figueiredo, Celina M.H. de, 145, 163
Frigioni, Daniele, 247
Gardy, D., 308
Gathen, Joachim von zur, 318
Goerdt, Andreas, 38
Gogan, Vincent, 173
Grabner, Peter, 457
Gutiérrez, Claudio, 387
Habib, Michel, 126
Jansen, Klaus, 68
Kabanets, Valentine, 197
Klein, Sulamita, 163
Knessl, Charles, 298
Knopfmacher, Arnold, 457
Kohayakawa, Yoshiharu, 1, 48, 163
Krause, Matthias, 280
Laber, Eduardo Sany, 227
Lanlignel, Jean-Marc, 126
Laroussinie, F., 437
Linhares Sales, Cláudia, 135
Lokam, Satyanarayana V., 207
Louchard, Guy, 463
Lücking, Thomas, 318
Lueker, George S., 292
Mackie, Ian, 447
Maffray, Frédéric, 135
Mastrolilli, Monaldo, 68
Mayr, Richard, 377
Michel, Max, 407
Milidiú, Ruy Luiz, 227
Miyazawa, F. K., 58
Molloy, Mike, 38
Moura, Lucia, 105
Nanni, Umberto, 247
Nobili, Paolo, 154
Odlyzko, Andrew, 258
Odreman Vera, Mauricio, 269
Opatrny, Jaroslav, 237
Ortiz, Carmen, 145
Picinin de Mello, Célia, 145
Prieur, Christopher, 397
Prodinger, Helmut, 457, 473
Raghavan, Prabhakar, 123
Ravelomanana, Vlady, 28
Reed, Bruce A., 126, 163
Rödl, V., 1, 48
Rotics, Udi, 126
Sakarovitch, Jacques, 397
Schnoebelen, Ph., 437
Shparlinski, Igor E., 259
Sierra Abbate, Luis R., 427
Simon, Hans Ulrich, 280
Skokan, J., 48
Solis-Oba, Roberto, 68
Spencer, Joel, 292
Stevens, Brett, 115
Szpankowski, Wojciech, 298
Taylor, Stephen, 78
Tena Ayuso, Juan, 269
Thimonier, Loÿs, 28
Turuani, M., 437
Valencia-Pabon, M., 308
Vallée, Brigitte, 343
Wakabayashi, Y., 58
Winkler, Peter M., 292
Worsch, Thomas, 417
Zito, Michele, 18