


Science of Computer Programming 78 (2013) 544–555


Spanders: Distributed spanning expanders

Shlomi Dolev ∗, Nir Tzachar
Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel

Article info

Article history: Received 24 August 2011; Received in revised form 30 September 2012; Accepted 1 October 2012; Available online 11 October 2012.

Keywords: Expanders; Random walks; Self-stabilization; Self-organization; Dynamic networks.

Abstract

Self-stabilizing distributed construction of expanders by the use of short random walks. We consider self-stabilizing and self-organizing distributed construction of a spanner that forms an expander. We advocate the importance of designing systems to be self-stabilizing and self-organizing, as designers cannot predict and address all fault scenarios and should address unexpected faults in the fastest possible way. We use folklore results to randomly define an expander graph. Given the randomized nature of our algorithms, a monitoring technique is presented for ensuring the desired results. The monitoring is based on the fact that expanders have a rapid mixing time and the possibility of examining the rapid mixing time by O(n log n) short (O(log⁴ n) length) random walks even for non-regular expanders. We then use our results to construct a hierarchical sequence of spanders, each being an expander spanning the previous spander. Such a sequence of spanders may be used to achieve different quality of service (QoS) assurances in different applications. Several snap-stabilizing algorithms that are used for monitoring are presented, including: (i) snap-stabilizing data link, (ii) snap-stabilizing message passing reset, and (iii) snap-stabilizing token tracing.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Self-stabilizing and self-organizing distributed algorithms. Self-stabilization ensures automatic recovery from an arbitrary state, whereas self-organization is a property of algorithms that displays local attributes. More precisely, an algorithm is self-organizing (a) if it converges in sublinear time and (b) if it reacts ‘‘fast’’ to topology changes. If s(n) is an upper bound on the convergence time and d(n) is an upper bound on the convergence time following a topology change, then s(n) ∈ o(n) and d(n) ∈ o(s(n)). Self-organization and self-stabilization appear to be very important properties in a variety of natural phenomena, for example, in social communities and neural networks. The property of self-organization can be used for obtaining, in sub-linear time, global properties and reaction to changes.

We advocate the importance of designing systems to be self-stabilizing (see [6,7]), as designers cannot predict and address each possible fault scenario. In the unfortunate case in which a system experiences a fault scenario that had not been taken into consideration by the designer, the system may malfunction forever. In contrast, designing self-stabilizing systems ensures recovery following an arbitrary fault scenario that leaves the system in an arbitrary global state. Self-organizing design ensures fast convergence following an unexpected fault scenario. Thus, a system that is designed to be self-stabilizing and self-organizing automatically recovers, in a graceful manner, following the occurrence of arbitrary faults.

∗ Corresponding author. E-mail addresses: [email protected] (S. Dolev), [email protected] (N. Tzachar).

0167-6423/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.scico.2012.10.001


Self-stabilizing and self-organizing algorithms for many distributed tasks, including distributed snapshot and leader election, are presented in [10]. Such algorithms assume that it is possible to locally define (and then use) hyperlinks,1 as done in phone connections. A new randomized self-stabilizing distributed algorithm for cluster definition in communication graphs of bounded degree processors is presented in [10]. Such graphs fit sensor network deployments. The algorithm converges in O(log n) expected number of rounds, handles dynamic changes locally, and is therefore self-organizing. Applying the clustering algorithm in O(log n) levels, using an overlay network abstraction, results in a self-stabilizing and self-organizing distributed algorithm for hierarchy definition.

Spanders, spanning expanders. In this work, we present the construction of expanders for the cases in which hyperlinks are not supported (extending the conference version [12] that is based on [11,25,13]). Self-stabilizing and self-organizing distributed construction of a spanner that forms an expander is considered. Roughly speaking, an expander graph is characterized by a lower bound on the ratio between the number of outgoing edges from a set of nodes and the number of nodes in the set. Expander graphs are sparse graphs (in which the number of edges is relatively small) with strong connectivity and small diameter. A spanner that is an expander is extremely local (as the entire graph has logarithmic diameter), robust and dynamic (as the structure is typically obtained by a random process and is therefore flexible rather than rigid). Folklore results for randomly defining an expander are used.

Given the randomized nature of the algorithms, a monitoring technique is presented for ensuring the desired results. The monitoring is based on the fact that expanders have a rapid mixing time and the possibility of examining the rapid mixing time by O(n · log n) short (O(log⁴ n) length) random walks (e.g., [9,19,20]) even for non-regular expanders. We then use our results to construct a hierarchical sequence of spanders, each being an expander spanning the previous one. Such a sequence of spanders may be used to achieve different quality of service (QoS) assurances in different applications. One bold application for spanners is message routing. A spanner is defined by a subset of the edges of the graph such that the subset preserves the connectivity of the graph and therefore yields a significantly more efficient distributed routing data structure. In the extreme, a tree is a spanner with the minimum number of edges, n − 1 edges, that allows broadcast with no further routing information beyond the distributed information that is used to define the (spanning) tree. Other spanners that have more edges than the minimal possible number of edges ensure additional properties, such as a small stretch factor, where the stretch factor is the maximal ratio between the length of the shortest path between two nodes in the spanner and in the original graph. We present a spanner that significantly reduces the number of edges in relation to the original graph and therefore reduces the distributed data structure used for routing, while ensuring a small additive stretch factor.

1.1. Related work

Expander graphs are of great importance in computer science, with applications in randomness extraction, error correcting codes, spanners in communication networks and complexity theory [15]. In particular, expander graphs can be used as graph spanners [22,23]. To the best of our knowledge, there is a limited number of works that address the problem of distributed expander construction. In [18], Law and Siu propose to construct an expander graph by composing a sufficient number of Hamiltonian cycles. The proposed construction makes the following assumptions: the communication network is an overlay network (two nodes can directly communicate as long as their identifiers are known to each other), the algorithm starts from a predefined graph of at least three nodes, and nodes wishing to join the graph must send a special message to a node that has already joined the graph. Unfortunately, the proposed algorithms cannot be started in an arbitrary state and therefore are not self-stabilizing.

A different approach for distributed construction of expanders is proposed in [24], in which Reiter et al. suggest using uniform sampling to select, for each node, a set of expander neighbors. The goal in [24] is to construct an ‘‘almost’’ regular graph, in which each node maintains a list of expander neighbors of size between d − c and d + c, where d is the desired degree of the almost regular graph and c is a small constant. When a node v selects a node u to be its neighbor, v sends a join message to u. However, u might already have more than d + c neighbors, so u evicts one neighbor, chosen uniformly, from its neighbor list and inserts v instead. u then sends a leave message to the evicted neighbor, w, to notify w to remove u from w’s neighbor list. As a result, it is not straightforward to predict whether the given algorithm converges or oscillates among different incomplete graphs.

The definition of self-organization that we use here was first presented in [10]. However, the system model of [10] differs from the present one; in [10] we assumed the system was designed to support hyperlinks, i.e., given a path between two nodes, u and v, a direct link between u and v may be established. Moreover, the communication overhead of such links was assumed to be one time unit. In contrast, our current model assumes a more conventional system, in which hyperlinks cannot be defined. The algorithms we present achieve self-organization by employing the characteristics of the underlying expander graph.

Expansion evaluation has been considered in the past. For example, from a mathematical viewpoint, our techniques for expansion evaluation resemble those presented in [14]. However, the authors of [14] deal only with bounded degree graphs, and their methods are not readily convertible to distributed settings. Independently of [11], in a recent work, Czumaj

1 Hyperlinks are paths of edges in the network, through which a sent message does not stop in intermediate nodes. Hyperlinks are commonly used in peer-to-peer overlay networks and in the MPLS and ATM communication protocols.


and Sohler [5] extended the result of [14] by selecting optimal parameters for the expansion testing algorithm. Recently, [21,16] have improved upon the results of [5]. The results obtained are closely related to our own. However, the question of a distributed implementation remains open in all of the work discussed above.

2. System settings

The system is defined by a communication graph, G = (V, E), where V is a set of nodes {v1, v2, . . . , vn} and E is a set of undirected communication links; if (v, u) ∈ E, then v and u can communicate by sending messages of bounded size to each other. Message sizes are restricted to O(log n) bits. We further assume that each communication link is a bounded capacity fifo queue. Let lc denote the capacity of the links. When messages are sent over a full link, we assume that one of the messages (either a message already in the link or a new message) is lost. We require only that messages that are sent infinitely often are received infinitely often. We present data-link algorithms to ensure that communication over such links is snap-stabilizing (see the discussion on the snap-stabilizing data link in Section 5.1).
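For concreteness, the following Python sketch (ours; the class and method names are purely illustrative, not part of the model) simulates such a bounded-capacity fifo link that loses one arbitrary message on overflow:

import random
from collections import deque

class BoundedFifoLink:
    """Toy model of the link above: a fifo queue of capacity lc that,
    on overflow, loses one arbitrary message (a queued one or the new one)."""

    def __init__(self, lc):
        self.lc = lc
        self.queue = deque()

    def send(self, m):
        if len(self.queue) < self.lc:
            self.queue.append(m)
            return
        # Link is full: one of the lc + 1 candidate messages is lost.
        victim = random.randrange(self.lc + 1)
        if victim < self.lc:
            del self.queue[victim]   # drop a queued message
            self.queue.append(m)
        # else: the newly sent message m itself is the one that is lost

    def receive(self):
        # fifo delivery; returns None when the link is empty
        return self.queue.popleft() if self.queue else None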

Nodes may join and leave the system at any time. We make no distinction between a node that leaves and a node that crashes, assuming that both can be detected by neighboring nodes in a timely fashion.

For a graph G = (V, E), given two sets of nodes, V1, V2, we define the following: E(V1, V2) = {e = (v1, v2) ∈ E | v1 ∈ V1 ∧ v2 ∈ V2} (i.e., the set of edges between V1 and V2). We also define V̄1 = V \ V1, the set of nodes not in V1.

A graph G = (V, E) is an edge expander if there exists a constant a, such that for each set S of vertices (where |S| < |V|/2) it follows that |E(S, S̄)|/|S| > a. For a comprehensive overview of expander graphs, their properties and sample uses thereof, we recommend to the reader the excellent text in [15].

Given a graph G = (V, E), a spander, S = (V, E′), is a spanning subgraph of G if there exists a constant p > 0 such that |E′| ≤ p|E| and the edge expansion of S is at worst p times the edge expansion of G.

A configuration c of the system is a tuple c = (S, L); S is a vector of states, ⟨s1, s2, . . . , sn⟩, where the state si is a state of node vi; L is a vector of link states ⟨li,j, . . .⟩ for each (i, j) ∈ E. A link li,j is modeled by a fifo queue of messages that are waiting to be received by vj, and the content of the queue is the state of the link. Whenever vi sends a message m to vj, m is enqueued in li,j (if the link is full, an arbitrary message in the queue will be dropped). Also, whenever vj receives a message m from vi, m is dequeued from li,j. A node changes its state according to its transition function (or program). A transition of node vi from a state sj to state sk is called an atomic step (or simply a step) and is denoted by a. A step a consists of local computation and terminates with either a single send or a single receive operation.

The system is asynchronous, meaning that there is no correlation between the non-constant rates of steps taken by the nodes. We model our system by using the interleaving model. An execution is a sequence of global configurations and steps, E = ⟨c0, a0, c1, a1, . . .⟩, so that configuration ci is reached from ci−1 by a step ai of one node vj. The changes in ci due to ai are associated with vj and consist of: the state change of vj according to the transition function of vj, and possibly the state of a link attached to vj. The content of a link state is changed when vj sends or receives a message during ai. An execution E is fair if every node executes a step infinitely often in E. Within the scope of self-stabilization, we consider executions that are started in an arbitrary initial configuration.

A task is defined by a set of executions termed legal executions and denoted LE. A configuration c is a safe configuration for a system and a set of legal executions LE if every fair execution that starts in c is in LE. A system is self-stabilizing for a task and a set of legal executions LE if every infinite execution reaches a safe configuration in relation to LE. We sometimes use the term ‘‘the algorithm stabilizes’’ to note that the algorithm has reached a safe configuration with regard to the legal executions of the corresponding task.

To measure time, we use the notion of communication rounds: a communication round (or simply a round) is a sequence of atomic steps such that each node has taken at least one atomic step during this sequence. If an atomic step involves a send operation of a message m over link l, then we require that the atomic step that corresponds to the receipt of a message from l, which has been sent during this sequence of atomic steps, will also appear in the sequence. Such time measurements are appropriate when a protocol involves the entire system. When measuring the time complexity of a protocol that involves only a subset of the nodes in the system, we use the notion of the happened before relation (see [17]). We then say that the time complexity of the protocol is the longest chain of happened before relations induced by the protocol.

A distributed algorithm is termed self-organizing ([10]) if it satisfies the following properties: (1) the algorithm is self-stabilizing, (2) the convergence time to a safe configuration, s(n), is in o(n), and (3) after reaching a safe configuration, the convergence time following a dynamic change, d(n), is in o(s(n)).

A distributed algorithm is termed snap-stabilizing if the algorithm stabilizes following the first request by any node and before, or simultaneously with, a notification arriving at the requesting node at the completion of the request (for more information, see e.g., [4]). Some of the proofs are omitted from this version and can be found in [11,25,13].

3. Expander extraction

In this section, we develop a simple, yet effective, technique for building a spander, given an arbitrary expander. In some cases, the initial topology of the graph is unknown at each node or cannot be stored at each node due to memory constraints (consider, for example, a peer-to-peer system with millions of nodes). The only input may be that the initial graph is a good


expander. Yet, a spander construction must be carried out under such constraints; one may wish to employ the technique used for the complete graph to define a spander over any given expander graph. However, when the graph is not complete, choosing a constant number of neighbors at each node may result in a non-expander graph. As an example, consider the following graph: let G = (V, E) be a regular (degree) expander graph. Now, let V1 be a set of half the nodes in V and V2 = V \ V1. Consider the following graph, G′ = (V, E′), where E′ = E ∪ {(v, u) : v, u ∈ V1} ∪ {(v, u) : v, u ∈ V2}. Since adding edges can only increase the expansion of a graph, G′ is a good expander (at least as good as G). Suppose we then proceed to generate a constant-degree expander from G′ by choosing for each node a constant number of neighbors, independently at random. It is easy to see that with non-negligible probability we will obtain a disconnected graph, and with overwhelming probability the resulting graph will not be a good expander.

Taking the above observation into account, we still wish to reduce the number of edges of an arbitrary expander without sacrificing the expansion property. When considering a graph G = (V, E) such that each cut in the graph contains sufficiently many edges, implying that the graph is a good edge expander, one may notice that if each edge is selected with a constant probability, then with very high probability all cuts will remain large — thus keeping the edge expansion of the graph.

In the following lemma, we prove that starting from a graph with good enough edge expansion, namely, with edge expansion in Θ(log n), and selecting each edge with a constant probability results in an edge expander with overwhelming probability:

Lemma 3.1. Let G = (V, E) be a graph with edge expansion c · log n, where c is a constant and n = |V|. The graph G∗ = (V, E∗), such that P[(u, v) ∈ E∗] = p for each (u, v) ∈ E, has edge expansion of at least pc log n/2 with overwhelming probability for appropriate p and c. For example, when p = 1/2 and c = 48, the probability is at least 1 − 1/n.

Proof. First consider a set S ⊂ V such that |S| = s ≤ n/2. Since G is an edge expander, we know that h = |E(S, S̄)| > sc log n. Next, we calculate the probability that such a cut is ‘‘small’’ in G∗, applying Chernoff’s inequality for the cumulative distribution function of the binomial distribution:

P[|E_G∗(S, S̄)| < ph/2] ≤ exp(−(ph − ph/2)²/(2ph)) = exp(−ph/8) ≤ exp(−psc log n/8) = n^(−psc/8).

Now, using the union bound and calling a cut S ‘‘bad’’ if |E_G∗(S, S̄)| < ph/2, the probability that some ‘‘bad’’ cut exists is:

P[∃ a ‘‘bad’’ set S] ≤ Σ_{1≤s≤n/2} C(n, s) · n^(−psc/8) ≤ Σ_{1≤s≤n/2} n^(s(1−pc/8)) ≤ n^(2−pc/8).

It is easy to see that appropriate values of c, p (for example, p = 1/2, c = 48) imply a probability of failure that is less than 1/n. □

Realizing such a construction in a distributed manner is simple: for each edge to be chosen with probability p, each node must choose each of its adjacent edges with probability 1 − √(1 − p); the edge is chosen if at least one node chooses it (hence, the probability that the edge is not chosen is √(1 − p) · √(1 − p) = 1 − p).
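A minimal Python sketch of this rule follows (the function names are ours; in a distributed setting the two coin flips are performed independently by the two endpoints and reconciled over the link):

import math
import random

def edge_survives(p):
    """Simulate the two endpoints' independent choices for one edge:
    each endpoint picks the edge with probability q = 1 - sqrt(1 - p),
    and the edge is kept iff at least one endpoint picked it, which
    happens with probability 1 - (1 - q)^2 = p."""
    q = 1 - math.sqrt(1 - p)
    return (random.random() < q) or (random.random() < q)

def extract_spander_edges(edges, p=0.5):
    """Keep each edge of the expander independently with probability p
    (the regime of Lemma 3.1, e.g., p = 1/2)."""
    return [e for e in edges if edge_survives(p)]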

4. Expansion monitoring

Assuming that, with high probability, we can construct an expander, sometimes it might be necessary to evaluate our construction: we may wish to check that the resulting graph is a good enough expander (or even whether it forms a connected component) and whether enough edges were removed. Moreover, if the construction resulted in a good expander, we wish to preserve the current state of the network, so as not to disrupt service. As a result, periodically reconstructing the network is unacceptable, which implies the need for an algorithm to monitor the resulting construction.

When message sizes, memory, and processing power of a single node are not restricted, it is easy to collect the entire topology of the graph at each node and check whether the resulting graph is a good expander. When restricting the message sizes, the memory requirements at each node, and the convergence time to O(log n), such solutions are no longer feasible.

In the remainder of this section, we address the task of monitoring the result of our construction when message sizes are limited to O(log n) by employing the mixing rate of expanders. We present a distributed expansion evaluation algorithm, which displays an inherent two-sided error. When the evaluated graph has expansion in Θ(log n), the algorithm returns ‘‘good’’ with overwhelming probability. When the graph has expansion less than O(1/log³ n), the algorithm returns ‘‘bad’’ with probability at least 1/2. When the graph’s expansion is in between the above values, our monitoring algorithm will return an arbitrary answer.

Monitoring by random sampling. Our first approach to monitoring is presented for the sake of building intuition, as the time it takes for detection is exponential. The monitoring is achieved by sampling sets of nodes from the graph and calculating the expansion for each such set. If a set of nodes is found to be a set of small expansion, a reset procedure will follow, which will reinitialize the nodes to a predetermined, consistent state.

Sets are sampled in the following manner: a node repeatedly starts a randomized propagation of information with feedback (PIF) flooding of the graph, which defines the set of nodes. Each of these sampled nodes will report back the number


of neighboring nodes that were not selected. This information will then propagate back to the initiating node, which will then calculate the expansion of the selected set.

The main observation is given in Lemma 4.1.

Lemma 4.1. Let G = (V, E) be a graph. If there exists a set of nodes, S ⊂ V, |S| ≤ n/2, such that |E(S, S̄)|/|S| ≤ c for some c, then there exists a subset S′ ⊆ S, |S′| ≤ n/2, such that |E(S′, S̄′)|/|S′| ≤ c and S′ induces a connected subgraph of G.

Proof. The set S can be decomposed in the following manner: S = ∪_i S_i, where ∀i ≠ j : E(S_i, S_j) = ∅ and ∀i : S_i induces a connected subgraph, and each S_i is maximal (vertex-wise). Such a decomposition is easily obtained by using greedy selection. Assume to the contrary that none of these subsets satisfies Lemma 4.1. It follows that for each subset S_i, |E(S_i, S̄_i)|/|S_i| > c. Now, since the S_i are disjoint and have no edges between them,

|E(S, S̄)|/|S| = (Σ_i |E(S_i, S̄_i)|)/(Σ_i |S_i|) > (Σ_i c · |S_i|)/(Σ_i |S_i|) = c,

while |E(S, S̄)|/|S| ≤ c; it follows that c < c, which is a contradiction. Hence, one of the sets S_i must satisfy Lemma 4.1. □

More details on the way to implement analogous self-stabilizing monitoring will be presented in the next section as part of the description of the efficient monitoring scheme.

Mixing-rate-based monitoring. Expander graphs are rapidly mixing graphs (O(log n) mixing rate, implying that a random walk on the graph converges to the uniform distribution following O(log n) steps), and it follows that the cover time of such graphs is also short (O(n log n)). We can use this fact in the following way: assume a node, v, wishes to check if the graph is rapidly mixing. v will start a random walk of length O(n log n) and attach a random color to this walk, chosen from a large enough domain. Furthermore, v associates three counters of O(log n) bits each with the random walk: one to limit the number of hops taken by the token, one to count the nodes discovered by the walk, and one to count the edges of the graph. Each time the walk visits a node for the first time (Fig. 1, lines 1–4), the node increments the node counter by 1 and the edge counter by the number of its neighbors. Afterwards, the token is transferred to a random neighbor (lines 9–10). When the walk terminates (line 6), the counters are examined, either by the last node or by v after routing a message with the counters back to v. If the walk has covered fewer than n nodes, which implies that the graph is not rapidly mixing, or if there are too many edges in the graph (relative to the original graph, which implies that the construction is not as productive as desired), then a reset is initiated.

When the exact n is not known, the protocol has to be adjusted slightly; since counting the nodes cannot be used to determine coverage, each node should remember the last color of a token traversing it. After the random walk is terminated at node v, v initiates a flooding of the network to check whether all the nodes were colored by the same color as the token. If the flooding detects a node that has not been colored, a reset procedure will ensue.
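The following Python sketch (ours, for a centralized simulation) illustrates this coloring variant: each node remembers the last color that traversed it, and the final flooding is collapsed into a global check:

import random

def walk_and_color(adj, start, length, color, last_color):
    """One random walk that stamps every node it visits with its color;
    adj maps a node to its list of neighbors."""
    v = start
    last_color[v] = color
    for _ in range(length):
        v = random.choice(adj[v])
        last_color[v] = color

def all_nodes_colored(adj, color, last_color):
    """The flooding phase, collapsed into a global check here: the
    initiator succeeds only if every node carries the walk's color."""
    return all(last_color.get(u) == color for u in adj)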

To speed up the detection rate, we have to use a short random walk. To achieve this, v may send out many random walks, each of polylogarithmic length, in parallel, and attach the same color to all of these walks. This elegant technique was suggested in [1] and analyzed for regular expanders (expanders with a regular edge degree); the proof in [1] is not applicable to non-regular graphs, because the authors used the symmetric nature of the random walk on a regular graph in their proof, namely, the probability that a random walk of length i visits a specific node at step i does not deviate by more than (λ/d)^i (where λ is the second largest eigenvalue of the transition matrix and d is the degree of the graph) from the uniform distribution. Here, we employ more general bounds on the mixing time, derived from bounds on the stationary distribution of a random walk on an edge expander, to extend the analogy to the non-regular case.

Each of the random walks will hold counters in a fashion similar to that in the single random walk solution (see Fig. 1). These random walks will cover the graph with high probability (see Lemmas 4.3 and 4.4). To count the number of visited nodes, the final counters must be routed back to v (we will discuss this point in greater detail in Section 5.3). If the total number of nodes visited is less than n, then, with very high probability, the graph is not a good expander. Analogously to the discussion above, coloring and checking nodes can be used for the case in which n is not known.

If the graph is not a good expander, we argue that initiating the random walks from a random edge results in failure with probability larger than half (see Lemma 4.6).

The proof determines, for a given walk length, the probability of visiting a specific node. Next, we calculate the number of walks needed to cover the entire graph with high probability. We begin by deriving a lower bound on the spectral gap 1 − λ of the graph, which we later employ.

Lemma 4.2. Let G be a connected, non-bipartite graph, such that the second largest eigenvalue of the Laplacian of G is λ and the expansion of the graph is h. Let ∆² = dmax/dmin be the ratio between the maximal and minimal degrees of nodes in G. Then 1 − λ > h²/(2∆⁸d²max).


N = list of neighbors
L = an upper limit on the length of the walk
last_seen = ∅

Receive(color, node_counter, edge_counter, length):
1  if color ∉ last_seen then
2      last_seen ← last_seen ∪ {color}
3      node_counter ← node_counter + 1
4      edge_counter ← edge_counter + |N|
5  fi
6  if length > L then
7      Report counters to initiating node
8  else
9      choose u ∈ N uniformly at random
10     Send(u, color, node_counter, edge_counter, length + 1)
11 fi

Fig. 1. Rules for node u.

Proof. Let π(v) be the stationary distribution of v. It easily follows that for each v, 1/(2n∆²) ≤ π(v) ≤ ∆²/(2n). Let Φ(S) = |E(S, S̄)|/(2|E|π(S)π(S̄)) be the conductance of a set S ⊂ V. The conductance of the graph is defined by Φ = min_S Φ(S). From [19], we know that 1 − λ > Φ²/8. For each set S, such that |S| < n/2, we use the following:

2|E|π(S)π(S̄) < 2|E| · (|S||S̄|∆⁴)/(4n²) < (∆⁴|S|dmax)/2.

It follows that for all S with |S| < n/2, Φ(S) ≥ 2|E(S, S̄)|/(∆⁴|S|dmax), which further implies that Φ ≥ 2h/(∆⁴dmax). Plugging into the bound given in [19], we get the desired bound. □

Lemma 4.3. Using the notations of Lemma 4.2, let s = 8 log n/(1 − λ) > log(4n∆³)/log(1/λ). The probability of a random walk of length 2 · s, starting at any vertex u ∈ V, to visit a given vertex v ∈ V, is at least log n/(2∆⁴n).

Proof. We will follow the proofs presented in [1]. Fix u as the node from which the random walk starts and fix a node, v. Let Y_i, s ≤ i ≤ 2s, be indicator random variables such that Y_i = 1 if the walk visited v at step i. Let Y = Σ_{i=s}^{2s} Y_i be the sum of these random variables. We will show that P[Y > 0] > s/(4∆²n + 4s∆⁴ + 8∆³n/(1 − λ)).

Using the Cauchy–Schwartz inequality, we can see that

(Σ_{j>0} P[Y = j]) · (Σ_{j>0} j²P[Y = j]) ≥ (Σ_{j>0} jP[Y = j])²,    (1)

so that

P[Y > 0] = Σ_{j>0} P[Y = j] ≥ (Σ_{j>0} jP[Y = j])² / (Σ_{j>0} j²P[Y = j]) = (E(Y))²/E(Y²).    (2)

From here on, we will focus on estimating both E(Y) and E(Y²). From linearity of expectation, E(Y) = Σ_{i=s}^{2s} E(Y_i), and for each i, E(Y_i) equals the probability that the walk that started at u visits v precisely at step i.

In [19], the following bound is given, where P_k(u, v) is the probability that a random walk of length k, started at u, will terminate at v:

|P_k(u, v) − π(v)| ≤ λ^k ∆,

which implies the following:

π(v) − λ^k ∆ ≤ P_k(u, v) ≤ π(v) + λ^k ∆.

A simple calculation shows that when s = 8 log n/(1 − λ) > log(4n∆³)/log(1/λ), for each Y_i and k > s we get that P(Y_i = 1) = P_k(u, v) ≥ π(v)/2 ≥ 1/(4∆²n). From linearity of expectation, we obtain that E(Y) ≥ s/(4∆²n).

We now turn our attention to calculating an upper bound for E(Y²). From the definition of Y we obtain:

E(Y²) = E[(Σ_{i=s}^{2s} Y_i)²] = Σ_{i,j} E(Y_i Y_j) = Σ_i E(Y_i²) + Σ_{i≠j} E(Y_i Y_j) = Σ_i E(Y_i) + 2 Σ_{s≤i<j≤2s} E(Y_i Y_j).

Now, E(Y_i Y_j) is exactly the probability that the walk visits v at step i and then at step j (i < j), which is the probability of visiting v at step i times the probability of returning to v after a walk of length j − i. The probability of returning to v after k steps equals P_k(v, v) < π(v) + ∆λ^k < 2∆²/n + ∆λ^k. Therefore,

E(Y²) = E(Y) + 2 Σ_{s≤i<j≤2s} E(Y_i)P(Y_j | Y_i) ≤ E(Y) + 2 Σ_{s≤i≤2s} E(Y_i) (s∆²/(2n) + Σ_{k>0} ∆λ^k) ≤ E(Y) (1 + s∆²/n + 2∆/(1 − λ)).

Therefore,

P[Y > 0] ≥ (E(Y))²/E(Y²) ≥ E(Y)/(1 + s∆²/n + 2∆/(1 − λ)) ≥ (s/(4∆²n))/(1 + s∆²/n + 2∆/(1 − λ)) ≥ s/(4∆²n + 4s∆⁴ + 8∆³n/(1 − λ)).

Using the facts that s = 8 log n/(1 − λ) and log(1/λ) > 1 − λ, and plugging into the equation above, one can show that P[Y > 0] > log n/(2∆⁴n). □

Lemma 4.4. Using the notations of Lemma 4.2, k = 4n∆⁴ random walks starting from the same node in the graph, u, each of length 2s, cover the entire graph with probability at least 1 − 1/n.

Proof. For each node v, the probability that it is not visited by a specific random walk is less than 1 − log n/(2∆⁴n). The probability that none of the k random walks visits v is less than (1 − log n/(2∆⁴n))^k. When k = 4n∆⁴, we get that this probability is smaller than 1/n². Using the union bound, we get that the probability that all nodes are covered is larger than 1 − 1/n. □

Corollary 4.5. When ∆² < log n and h > log n, we get that k < 4n log³ n random walks, each of length 2s, where s = 8 log n/(1 − λ) < 16∆⁸d²max log n/h² < 16d²max log² n ∈ O(log⁴ n), cover the graph with overwhelming probability.

Corollary 4.5, in fact, implies that a single instance of a monitoring session on such a graph takes O(log⁴ n) rounds and requires O(n log⁷ n) messages.

Following Lemma 4.4, we now know that if a graph is a good expander, the short random walks that we use will cover the graph with high probability. However, we also wish to investigate the case in which a graph is not a good expander. The following lemma shows that when a graph has less than constant expansion, the short random walks used will not cover the graph with probability at least 1/2. For brevity, constants are omitted, and we assume that ∆ ∈ o(log n).

Lemma 4.6. Let G = (V, E) be a graph, such that there exists S ⊂ V, |S| ≤ |V|/2 = n/2, for which |E(S, S̄)| < |S|/(8 log³ n). Then n log n random walks starting from a uniformly chosen edge, each of length O(log n), will not cover the entire graph with probability at least 1/2.

Proof. Let (u, v) ∈ E be a directed edge, chosen uniformly at random. Start the n log n random walks from v. For each edge e ∈ E define X_e as a random variable counting the number of times one of the random walks traverses e. Now, since (u, v) is chosen uniformly, which is the stationary distribution of the edges of the graph, for each directed edge e ∈ E we have E(X_e) = n log² n/|E|. Let Y = Σ_{e∈E(S,S̄)} X_e (where e is a directed edge). Linearity of expectation implies that

E(Y) = 2n log² n · |E(S, S̄)|/|E| ≤ 2n log² n · |S|/(8n log³ n) = |S|/(4 log n).

From Markov’s inequality, we obtain that with probability at least 1/2, the cut between S and S̄ is not crossed more than |S|/(2 log n) times (in both directions). Assuming v ∉ S, and since each walk is of length log n, we cover at most |S|/2 nodes within S. If v ∈ S, we get that we cover even fewer of the nodes of S̄, since |S| ≤ |S̄|. It follows that with probability at least 1/2, we do not cover the entire graph. □


5. Self-stabilizing distributed monitoring

In the following sections, we present a self-stabilizing distributed monitoring algorithm to monitor the mixing rate of a graph. The algorithm is based on the repeated selection of a random edge from the graph and on starting random walks from this edge. Whenever a problem is observed in the graph, the snap-stabilizing reset that we describe here will be utilized to reset the monitoring algorithm and to rebuild the current graph (if it is not a good spander). As the entire construction and monitoring must be logarithmic in the number of nodes, we had to introduce, reform and augment new techniques.

5.1. Snap-stabilizing data link

A distributed algorithm is termed snap-stabilizing if the algorithm stabilizes following the first request by any node and before, or simultaneously with, a notification arriving at the requesting node upon completion of the request.

Throughout the text we assume the use of a snap-stabilizing data-link layer. We consider several algorithms that can be used to realize a snap-stabilizing data-link layer.
• Spontaneous receiver. When the receiver can send duplicated answers for a single frame sent (or when the duplication is caused by the underlying physical layer), we suggest the following algorithm (see [10]): when the sender, s, wishes to send a frame, f, to the receiver, r, s will send f to r repeatedly and attach a sequence number to f. At first, s will attach the number 1 to f, and repeatedly send f to r with sequence number 1. Once s receives an acknowledgment from r upon the receipt of f with sequence number 1, s will repeatedly send f with sequence number 2. s will keep incrementing the sequence number of f each time s receives an acknowledgment for the current sequence number, until the sequence number reaches 2 · lc + 1 (where lc is the bound on the link capacity). At this point f is assured to be delivered at r.

When r receives a frame f with sequence number 2 · lc + 1, following a frame with a different sequence number, r will deliver the frame upwards in r’s network stack.
• Non-spontaneous receiver. When frames are not duplicated (either by r or by the physical layer), we suggest the following algorithm: s will first repeatedly send f to r but will mark f with 0. s will then count the number of acknowledgments it receives from r. When s has received 2 · lc + 1 acknowledgments, s will then mark f with 1 and repeat the process. After s has received 2 · lc + 1 additional acknowledgments for f, f is assured to be delivered at r.

When r receives a frame f with a mark 0 that is immediately followed by a frame with a mark 1, r will deliver the frame upwards in r’s network protocol stack.

The algorithm for the non-spontaneous receiver is more efficient both communication-wise and time-wise.
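A schematic Python rendering of the non-spontaneous variant follows; send_frame and recv_ack are assumed primitives standing in for the unreliable fifo link, and all names are ours, not part of the protocol:

def snap_send(frame, lc, send_frame, recv_ack):
    """Sender side of the non-spontaneous-receiver variant: send the
    frame marked 0 until 2*lc + 1 acknowledgments arrive, then marked 1
    until 2*lc + 1 further acknowledgments arrive. The 2*lc + 1
    threshold exhausts any stale frames or acks that an arbitrary
    initial link state may contain."""
    for mark in (0, 1):
        acks = 0
        while acks < 2 * lc + 1:
            send_frame((mark, frame))
            if recv_ack() is not None:   # wait for the next ack
                acks += 1

def on_frame(prev_mark, mark, frame, deliver):
    """Receiver side: deliver the frame when a 0-marked copy is
    immediately followed by a 1-marked copy; returns the new prev_mark."""
    if prev_mark == 0 and mark == 1:
        deliver(frame)
    return mark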

5.2. Snap-stabilizing message passing reset

A reset procedure ensures that once it is initiated, and before the reset is terminated, each node receives a logical ‘‘reset’’ signal (possibly resetting the node’s state to a predetermined state). Moreover, following a logical reset at node u, if u sends a message to node v, then node v will have received a reset signal before processing the message from u.

We present a reset procedure for message passing networks. The reset is executed on a graph G, which in our case is of diameter d ∈ O(log n). We assume the use of a snap-stabilizing data-link protocol for passing messages between nodes. A similar technique appears in [2,3,7]; here we present and prove snap-stabilization and termination that fit both networks with identifiers and anonymous networks. The reset procedure for a single node, u, appears in Fig. 2.

The reset procedure is based on counters bounded by 3d at each node. Each node u maintains a reset counter. To initiate a reset, u simply sets u’s counter to zero (line 1). The technique used by the reset procedure is that a node, u, will only increment u’s counter after u is certain that all of u’s neighbors have counters that are greater than or equal to u’s counter. This is achieved by first saving the counter values received from the neighbors (line 3) and assigning u’s counter the minimal value among the counters of all of u’s neighbors, plus one (line 14).

u must also repeatedly broadcast its counter value to its neighbors. The broadcast is based on the snap-stabilizing data-link algorithm presented above. The SnapSend procedure used in line 17 ensures that the counter value cntu is sent to the neighbor v by using a snap-stabilizing data-link algorithm, thus ensuring the delivery of the new value. The receiving node of the SnapSend procedure piggybacks its own counter value while sending acknowledgments, which are returned from the SnapSend procedure on the initiator side. This ensures that upon the termination of a single SnapSend procedure both ends hold the other side’s counter values that existed during the SnapSend execution.

The flag variable is used to ensure that u will not reassign its counter before updating all of its neighbors with its current value. This property is vital to our proof and establishes a happened before relation between counter updates across the graph.

The proof is based on the following observation: when a node, v, assigns 0 to its counter (starting a reset), this value will propagate in the graph, causing other nodes to adopt a counter value not greater than their distance to v. First, v’s immediate neighbors will set their counters to at most 1. Thereafter, v’s neighbors’ neighbors will set their counters to (at most) 2, and so on. We then show that by the time v’s counter reaches 2d, the counter of each node cannot exceed 3d. The full proof can be found in [11,25,13].


cntu = a counter
d = the diameter of the graph
N[i] = the last counter received from neighbor i
flag = a boolean flag

Reset
1  cntu ← 0
2  flag ← true

Receive counter value c from neighbor v
3  N[v] ← c
4  if c < cntu then
5      flag ← true
6      cntu ← c + 1
7  fi

While cntu < 3d
8      while flag = true do
9          flag ← false
10         Broadcast()
11     done
12     if ∀i ∈ N : N[i] ≤ cntu + 1 then
13         flag ← true
14         cntu ← min{cntu, min_i N[i]} + 1
15     fi

Broadcast
16     foreach neighbor v do
17         c ← SnapSend cntu to v
18         Receive(c, v)
19     done

Fig. 2. Snap-stabilizing reset procedure for u.

Theorem 5.1. The reset protocol of Fig. 2 is a snap-stabilizing reset protocol, which terminates after at most 3d rounds following initialization.

The snap-stabilization paradigm requires that once one node initiates the (reset) algorithm, the algorithm will terminate successfully, namely, the initiator receives a notification on the termination and no more messages are sent. The termination property of the snap-stabilizing reset is a rare property in the scope of self-stabilizing algorithms. The reset algorithm is snap-stabilizing, since once a node, v, starts the reset algorithm, each node in the system will go through a reset, namely, lower its counter value below 3d, before v receives an indication of the reset termination.

In the context of our monitoring algorithm presented in the following section, a leader must be elected following the reset, and a bfs tree rooted in the leader has to be defined. We suggest the following algorithm: once the counter of a node, v, reaches 3d, v will start broadcasting to its neighbors its candidate for the bfs leader and the distance to the candidate, starting with v itself: (v, 0). v will repeatedly collect the bfs leaders of its neighbors. If there exists a bfs leader with a lower identifier than v’s current bfs leader, or if one of v’s neighbors is closer to the leader than v, v will adopt this bfs leader, mark the node from which v received this bfs leader as v’s predecessor in the bfs tree, and add one to the distance to the bfs leader. Once any bfs leader has a distance larger than d, v can safely discard this bfs leader.
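A per-node Python sketch of this update rule follows (our naming; each node would run it once per round over the latest values received from its neighbors):

def leader_election_step(my_id, neighbor_states, d):
    """One update of node v's bfs-leader candidate. Each neighbor state
    is a pair (leader_id, distance); v prefers the smallest leader
    identifier, breaking ties by distance, and discards any candidate
    whose distance would exceed the diameter bound d."""
    best = (my_id, 0, None)                  # (leader, distance, parent)
    for nbr, (n_leader, n_dist) in neighbor_states.items():
        cand = (n_leader, n_dist + 1, nbr)
        if cand[1] <= d and (cand[0], cand[1]) < (best[0], best[1]):
            best = cand
    return best    # repeated every round until the tree stabilizes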

It is easy to see that this leader election algorithm, when started after a reset, will elect a leader, define a bfs tree rooted in the leader, and establish correct distances. After a further 2d communication rounds (due to the liveness property), all nodes will agree on one leader, the node with the lowest id in the system. If we let nodes continue the counting of the reset algorithm, starting to execute the leader election algorithm at 3d and up to 5d, then the leader can start a new monitoring session once its counter reaches 5d. In the sequel, once the counter reaches 5d, the tree structure remains fixed in the sense that parent pointers are not changed.

Monitoring Algorithm for a Single Node. We next present a self-stabilizing expansion monitoring algorithm. In fact, given the self-stabilizing reset presented earlier, a simpler non-stabilizing version can be used, as long as there is a local predicate for checking the consistency of the monitoring that triggers the reset. Still, the self-stabilizing monitoring may be of interest in its own right. Next, we consider an instance of the algorithm that is started from one node, which we denote by ml (for monitoring leader). The monitoring algorithm is a self-stabilizing version of the technique presented in Section 4, namely, a node starting a monitoring session will send the required number of tokens (see Lemma 4.4), each performing a random walk, counting the number of nodes and edges in the graph.

When designing the self-stabilizing version of the monitoring algorithm, we may notice two problems that we have to solve. First, we have to ensure that tokens can be routed back to ml once they have traveled the required length. Second, if ml assumes that a monitoring session has started and none of ml’s tokens exist in the graph, we must prevent ml from waiting forever. To overcome both of these obstacles, we use the same repeated token sending mechanism.


N = list of neighbors
L = an upper limit on the length of a walk
tokens[] = an array, holding the last tokens received
answers[] = an array, holding the last hop of a token (if exists)

Start Walks
1  choose a random color c
2  for i ∈ tokens do
3      choose a random neighbor p
4      tokens[i] ← (c, 0, 0, 0, p)
5  done

Receive(color, index, origin, hop_cntr, node_cntr, edge_cntr) from v
6  choose a random neighbor p
7  if tokens[index].color ≠ color then
8      tokens[index] ← (color, hop_cntr, node_cntr + 1,
9                        edge_cntr + |N|, p)
10 else if tokens[index].hop_cntr < hop_cntr then
11     tokens[index] ← (color, hop_cntr, node_cntr,
12                       edge_cntr, p)
13 fi
14 if answers[index].color = color then
15     Send (index, answers[index]) to v
16 fi

Repeatedly
17 foreach i ∈ tokens do
18     (color, hop_cntr, node_cntr, edge_cntr, p) ← tokens[i]
19     if hop_cntr < L then
20         Send (color, i, u, hop_cntr + 1, node_cntr, edge_cntr) to p
21     else
22         answers[i] ← (color, node_cntr, edge_cntr)
23     fi
24 done

Receive(index, answer) from v
25 (color, hop_cntr, node_cntr, edge_cntr, p) ← tokens[index]
26 if color = answer.color then
27     answers[index] ← answer
28 fi

Fig. 3. Self-stabilizing monitoring algorithm for u.

Each node u will follow the algorithm presented in Fig. 3: ml will add a serial number to each token it sends, i.e., the tokens will be consecutively numbered, t1, t2, . . . . Each node u records not only the color of the token, but also, for each token ti, to which neighbor, w, u had sent token ti after the most recent arrival of ti at u (lines 4, 8–9 and 11–12). We call this the forward pointer of ti at u. u then repeatedly sends all the tokens it has received to ensure delivery (lines 17–24). We use the repeated send technique to establish the termination property of each monitoring phase.

To ensure the delivery of the tokens back to ml when they have reached the end of their random walk, we use the following mechanism: when a random walk terminates (reaches its maximum hop count), the node at which the walk has terminated saves the information collected by the walk (line 22). Each time a node u receives a token (whether for the first time or not), if u currently holds an answer to the token, u forwards the answer to the sender of the token (lines 14–15).

We next show that the traversal of a single token sent by ml will terminate, and the information contained in the token (node count and edge count) will be propagated back to ml. The proof considers one token, ti. Denote by k the length of the random walks according to the previous section (k ∈ O(log n)). The full proof of the following theorem can be found in [11,25,13].

Theorem 5.2. The monitoring algorithm is a self-stabilizing algorithm that stabilizes in O(k) rounds. Furthermore, the monitoring algorithm requires O(t(log t + log k + log N)) bits of memory at each node.

5.3. Global monitoring algorithm

Given the monitoring algorithm and the reset procedure above, we present a self-stabilizing expansion monitoring algorithm. The algorithm is based on repeatedly invoking the monitoring algorithm for a single node, each instance from a different node.

To coordinate between the nodes and to ensure only one monitoring session is active at a time, we employ a single leader in the system, with a bfs tree rooted in this same leader. Repeatedly, the bfs leader, bl, selects a directed edge, ⟨u, v⟩, uniformly among all directed edges, and informs v, the edge’s endpoint, to start a new monitoring session as the monitoring


session leader. If a specific percentage of monitoring sessions fail, bl may conclude that the graph is not a good expander and initiate a reset to rebuild the graph.

We next detail the specific techniques used to realize the global monitoring algorithm.
• Self-Stabilizing bfs tree. The bfs tree is assumed to be defined. Nodes will repeatedly inspect the status of the tree and upon finding any error will initiate a reset; namely, each node v repeatedly checks that its neighbors have the same leader as it has. If v is the leader, v also checks whether its distance to the bfs root is 0. Otherwise, when v is not a leader, it checks whether its parent in the bfs tree is closer to the leader than itself by exactly one and whether there is no neighbor that is closer to the leader. Once any node (either a leader or a non-leader node) detects a violation of the above assertions, the node will initiate a snap-stabilizing reset, ensuring that a single bfs leader will be elected and that a correct bfs tree will hence be constructed.

To facilitate communication between the bfs leader and the rest of the nodes, the bfs leader will repeatedly color the bfs tree defined by the parent–child relation, using randomly selected colors. Such a self-stabilizing coloring technique using broadcast and converge-cast over a tree has been well investigated (see [7]).

When coloring the bfs tree with a new color, the bfs leader may piggyback messages on the messages used for coloring, and nodes may piggyback information back to the bfs leader with their replies.
• Selecting the next monitoring node. Each monitoring session must start from a directed edge, chosen uniformly. To select the edge, the bfs leader, bl, will employ the coloring of the tree defined above.

During and after the construction of the bfs tree, each node v will be repeatedly notified by its children about se_u, the number of edges connected to all the nodes in the subtree rooted in each child u. v will repeatedly define se_v to be ∆_v + Σ_{u child of v} se_u, where ∆_v is the number of edges directly connected to v. In particular, a leaf u will repeatedly notify its parent with ∆_u. Moreover, each node maintains a local ordering of its children.

Each node u associates a set of natural numbers between 1 and se_u with its children and with itself, so that each number is associated with a single node and each node v receives a set of numbers, N_v, of cardinality se_v, so that N_v = {i_1, i_2, . . . , i_{se_v}}.

The bfs leader, bl, initiates the selection of the new node by first selecting, uniformly at random, a natural number k between 1 and se_bl. If k is associated with bl, bl is the new monitoring leader. Otherwise, let v be bl’s child with which k is associated, and let k = i_j ∈ N_v. bl will inform v (by piggybacking on the coloring algorithm’s messages) that v should continue the process with the number j.

When a node v receives a message that it should select the next monitoring leader using the number k, v first checks with which node k is associated. If k is associated with v itself, v selects itself as the new monitoring leader. Otherwise, assume k is associated with a child, u, such that k = i_j ∈ N_u. v will notify u that u should continue the process with the number j. Note that the selection algorithm always terminates and selects a node with a uniform distribution over all the endpoints of the edges; a centralized rendering of the descent is sketched below.
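In a centralized simulation, the descent reads as follows (a sketch with our names: se maps each node to its subtree edge count se_v, deg gives ∆_v, and children gives the local ordering):

import random

def select_monitoring_leader(root, children, deg, se):
    """Walk a uniformly chosen index k in [1, se[root]] down the bfs
    tree: each node v owns the first deg[v] indices (deg[v] = number of
    edges at v) and forwards the remainder into the child subtree
    covering it, so a node is selected with probability proportional to
    its degree, i.e., uniformly over directed-edge endpoints. Assumes
    the invariant se[v] = deg[v] + sum(se[u] for u in children[v])."""
    k = random.randint(1, se[root])
    v = root
    while True:
        if k <= deg[v]:
            return v                 # k is associated with v itself
        k -= deg[v]
        for u in children[v]:        # the node's local child ordering
            if k <= se[u]:
                v = u                # continue the descent at child u
                break
            k -= se[u]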

At the end of the coloring of the bfs tree, bl learns the identity of the node that has been selected for the next monitoring session. Moreover, the natural number, k, which bl has randomly chosen, uniquely identifies the next monitoring leader. bl will then initiate a new coloring of the tree, in which bl broadcasts the identity of the new monitoring leader. Next, we describe the mechanism used to ensure that the monitoring leader does indeed perform the monitoring.
• Ensuring that a monitoring leader exists. bl must ensure that the current monitoring leader is active. There are several ways to achieve this, amongst them using the coloring to communicate with the current monitoring leader. A different way, which does not involve the coloring algorithm, is routing messages directly to the monitoring leader, down the bfs tree, by employing the same mechanism used to select the monitoring leader; as bl remembers k, the randomly chosen natural number that identifies the monitoring leader, bl can route messages to the monitoring leader, which can then send replies up the tree towards bl. To ensure self-stabilization, each time bl sends a message, bl will attach a random color to the message and expect a reply containing the same random color.

If bl does not receive confirmation that a monitoring leader exists, bl will perform a reset to reinitialize the bfs tree.
• Detecting bad expansion. There exists a small constant c, such that if the monitoring algorithm (for a single node) has indicated that the graph is not a good expander at least c/2 times when running c monitoring sessions consecutively (from different nodes), then with probability 1 − o(1) the graph is not a good expander. This follows from examining the conditional probability that the graph is not a good expander, given that the monitoring algorithm has indicated so in more than half the times of c successive sessions, and that the probability of a correct answer on a good expander and the probability of getting a good expander a priori are both 1 − o(1).

bl will employ the constant c to check the last c invocations of the monitoring algorithm. If more than c/2 invocations have failed, bl will initiate a reset to rebuild the graph. We wish to draw the reader’s attention to the fact that monitoring results may not be reused, i.e., once c successive monitoring sessions are finished, a new set of successive monitoring sessions will start; the decision rule is sketched below.
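A one-function sketch of this decision rule (our naming, assuming the session outcomes have been gathered at bl):

def should_reset(session_results):
    """bl's decision over one batch of c fresh monitoring sessions
    (True = the session reported a good expander): reset and rebuild
    iff more than c/2 sessions failed; results are never reused
    across batches."""
    c = len(session_results)
    failures = sum(1 for ok in session_results if not ok)
    return failures > c / 2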

Note that one may employ more than one monitoring session in parallel, by uniformly choosing a set of edges which, in turn, defines a set of nodes such that each such node is a monitoring leader, thereby boosting the speed of detection.
• Stretch factor. The diameter of the resulting spander graph is in O(log n). This, in turn, implies that the spander has an additive stretch factor in O(log n).

We note that the construction can be extended to form a hierarchical spander construction, such that each spander will have fewer edges than the one before it (and, as a by-product, smaller expansion). This hierarchical construction can


then be used to ensure quality of service (QoS) in a wide variety of applications; the system can automatically adjust thecommunication graph used in order to achieve its goals, taking into account the underlying structure of the chosen graphand the fitting number of edges versus the probability of expansion. Details can be found in [11,25,13].

6. Concluding remarks

The design of self-stabilizing and self-organizing schemes and software obviates the need to exhaustively consider every particular fault scenario and the appropriate remedy. System designers can incorporate event-driven actions as specific remedies and use self-stabilization as a fall-back mechanism [8].

Self-stabilizing and self-organizing distributed graph algorithms are of great importance for the emerging heterogeneous and mobile communication networks. Such algorithms ensure automatic recovery from an arbitrary global state and fast response to changes.

We believe that the application of techniques established in theoretical computer science, such as graph expanders, the new technique that uses short random walks for monitoring expansion, and the snap-stabilizing reset, in the scope of such dynamic networks holds great promise for practical systems.

Acknowledgments

The research of the first author has been supported by the Ministry of Science and Technology (MOST), the Lynne and William Frankel Center for Computer Science at Ben-Gurion University, the ICT Programme of the European Union under contract number FP7-215270 (FRONTS), Microsoft, the US Air Force, the Israel Science Foundation (grant number 428/11), the Verisign 25th Anniversary of .COM grant, Deutsche Telekom Labs at BGU, and the Rita Altura Trust Chair in Computer Sciences. The second author was partially supported by EU ICT-2008-215270 FRONTS and the Lynne and William Frankel Center for Computer Sciences.

References

[1] N. Alon, C. Avin, M. Koucký, G. Kozma, Z. Lotker, M.R. Tuttle, Many random walks are faster than one, in: F.M. auf der Heide, N. Shavit (Eds.), Proceedings of the 20th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 2008, Munich, Germany, June 14–16, 2008, ACM, 2008, pp. 119–128.
[2] A. Arora, S. Dolev, M. Gouda, Maintaining digital clocks in step, Parallel Processing Letters 1 (1) (1991) 11–18.
[3] B. Awerbuch, G. Varghese, Distributed program checking: a paradigm for building self-stabilizing distributed protocols, in: Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, FOCS’91, San Juan, Puerto Rico, October 1–4, 1991, IEEE Computer Society Press, 1991, pp. 258–267.
[4] A. Cournier, A. Datta, F. Petit, V. Villain, Enabling snap-stabilization, in: Proc. of the 23rd International Conference on Distributed Computing Systems, 2003, pp. 12–19.
[5] A. Czumaj, C. Sohler, Testing expansion in bounded-degree graphs, in: FOCS, IEEE Computer Society, 2007, pp. 570–578.
[6] E.W. Dijkstra, Self-stabilizing systems in spite of distributed control, Communications of the ACM 17 (11) (1974) 643–644.
[7] S. Dolev, Self-Stabilization, MIT Press, 2000.
[8] S. Delaet, S. Dolev, O. Peres, Safe and eventually safe: comparing stabilizing algorithms and non-stabilizing algorithms on a common ground, in: Proc. of the 2009 International Conference on Principles of Distributed Systems, OPODIS, December 2009. Also brief announcement: Safer than safe: on the initial state of self-stabilizing systems, SSS 2009, pp. 775–776.
[9] S. Dolev, E. Schiller, J.L. Welch, Random walk for self-stabilizing group communication in ad hoc networks, IEEE Transactions on Mobile Computing 5 (7) (2006) 893–905.
[10] S. Dolev, N. Tzachar, Empire of colonies: self-stabilizing and self-organizing distributed algorithms, Theoretical Computer Science 410 (6–7) (2009) 514–532. Special issue of OPODIS06.
[11] S. Dolev, N. Tzachar, Spanders: distributed spanning expanders, TR 08-02, Department of Computer Science, Ben-Gurion University, 2007.
[12] S. Dolev, N. Tzachar, Spanders: distributed spanning expanders, in: Proc. of the 25th ACM Symposium on Applied Computing, SAC-SCS, 2010.
[13] S. Dolev, N. Tzachar, Self-stabilizing and self-organizing virtual infrastructures for mobile networks, in: Theoretical Aspects of Distributed Computing in Sensor Networks, Springer, 2010.
[14] O. Goldreich, D. Ron, On testing expansion in bounded-degree graphs, Electronic Colloquium on Computational Complexity (ECCC) 7 (2000) 20.
[15] S. Hoory, N. Linial, A. Wigderson, Expander graphs and their applications, Bulletin of the AMS 43 (4) (2006) 439–561.
[16] S. Kale, C. Seshadhri, Testing expansion in bounded degree graphs, ECCC report TR07-076, 2007.
[17] L. Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM 21 (7) (1978) 558–565.
[18] C. Law, K.-Y. Siu, Distributed construction of random expander networks, in: INFOCOM, 2003.
[19] L. Lovász, Random walks on graphs: a survey.
[20] R. Motwani, P. Raghavan, Randomized Algorithms, Cambridge University Press, 2006.
[21] A. Nachmias, A. Shapira, Testing the expansion of graphs, ECCC report TR07-118, 2007.
[22] D. Peleg, A.A. Schäffer, Graph spanners, Journal of Graph Theory 13 (1989) 99–116.
[23] D. Peleg, Distributed Computing: A Locality-Sensitive Approach, SIAM, 2000.
[24] M.K. Reiter, A. Samar, C. Wang, Distributed construction of a fault-tolerant network from a tree, in: Proceedings of the 24th IEEE Symposium on Reliable Distributed Systems, SRDS’05, IEEE Computer Society, Washington, DC, USA, 2005, pp. 155–165.
[25] N. Tzachar, Self-Stabilizing and Self-Organizing Distributed Algorithms, Ph.D. Thesis, Ben-Gurion University of the Negev, 2008.