
Màster en Enginyeria Informàtica i de la Seguretat (MEIS)

Màster interuniversitari en Intel·ligència Artificial (MIA)

Submission of Research Project / Master's Thesis

Student name: Raül Arlàndez Reverté
Research project supervisor: Dr. Francesc Serratosa
Acknowledgment: Albert Solé Ribalta
Submission date:
Title of the research project / master's thesis: Attributed Graph Matching of Planar Graphs

Tick the boxes corresponding to the track and study plan:

MEIS, Security track
MEIS, Intelligent Systems track
MIA
MEIS, 2006 study plan (Research Project, 10.5 credits)
MEIS, 2007 study plan (Research Project, 30 credits)
MIA, 2006 study plan (Master's Thesis, 30 credits)

Grade:
President Secretary Member
___________ _____________ _______________

Raül Arlàndez Reverté: Attributed Planar Graph Matching


CONTENTS

1. Problem description to solve, research objectives
1.1. Tasks
2. Basic concepts and definitions
2.1. Related with graphs
2.1.1. Graph definition
2.1.2. Attributed graph
2.1.3. Subgraph
2.1.4. Graph edit distance
2.1.5. Adjacency matrix
2.1.6. Cycle
2.1.7. Girth of a planar graph
2.2. Related with planar graphs
2.2.1. Planar graph
2.2.2. Attribute planar graph
2.2.3. Representation based on planar graphs
2.3. Related with planar graph matching
2.3.1. Dynamic programming
2.3.2. Rooted Spanning Tree
2.3.3. Tree width
2.3.4. Tree Decomposition
2.3.5. Graph isomorphism
2.3.6. Graph isomorphism problem
2.4. Other definitions
2.4.1. Euclidean distance
3. Current technology status about the problem to solve
3.1. Introduction
3.2. State of the art in the literature
4. New solution design, justifying the new proposal
4.1. Planar Graph Isomorphism
4.1.1. Objectives
4.1.2. Definitions
4.1.3. Consistency



4.1.3.1. Checking given 2 vertexes N and N1 are consistent
4.1.4. Compatible triple
4.1.4.1. Checking given 3 vertices N, N1 and N2 are consistent
4.1.5. Joining
4.1.6. Graph and planar graph edges
4.1.7. Partial Isomorphism list
4.1.8. Subgraph Isomorph Algorithm
4.2. Attributed planar graph matching
4.2.1. Objectives
4.2.2. Structure
4.2.3. Graph and planar graph edges and vertices attributes
4.2.4. Subgraph Isomorph Algorithm with attributes
4.2.5. Attribute planar graph matching samples
5. Development and implementation
5.1. How to generate all the possible partial isomorphism?
5.1.1. L(N) Combinations
5.1.2. X and Y combinations
5.1.3. Final combinations
5.2. How to join two labels?
5.3. Attribute edge function
5.4. Sorting Euclidean distance
6. Practical evaluation
6.1. Analysis between old system and the new system
6.2. Tree decomposition analysis
6.3. Vertices analysis
6.4. Euclidean distance isomorphism analysis
7. Conclusion, future works
References
8. Annex 1: Software application
8.1. Requirement
8.2. How does it work?



1. Problem description to solve, research objectives

Many fields, such as computer vision, scene analysis, chemistry and molecular biology, have applications in which images have to be processed and some regions have to be searched for and identified. When this processing is to be performed automatically by a computer, without the assistance of a human expert, a useful way of representing the knowledge is by using attributed graphs, which have proved to be an effective way of representing objects. When graphs are used to represent objects or images, vertices usually represent regions (or features) of the object or image, and edges represent the relations between regions.

Planar graphs are graphs that can be drawn in the plane without any two edges crossing. Most applications use planar graphs to represent an image. Graph matching (with attributes or not) is an NP-complete problem; nevertheless, when we use planar graphs without attributes we can solve this problem in polynomial time [1]. No algorithms have been presented that solve the attributed graph-matching problem and use the properties of planar graphs. In this master's thesis we study attributed planar graph matching. The aim is to find a fast algorithm by studying in depth the properties and restrictions imposed by planar graphs.

1.1. Tasks

⇒ Current status report about graph matching in planar graphs
⇒ Define a new framework for matching attributed planar graphs
⇒ Implement and test the new model
⇒ Publish the model and results at a conference



2. Basic concepts and definitions

2.1. Related with graphs

2.1.1. Graph definition

A graph is a collection of vertices (or nodes) together with a collection of edges that connect pairs of vertices. A graph is defined as a pair G = (V, E), where:

• V = {V1, ..., Vn} is a set of elements called vertices
• E is a set of edges between the vertices, $E \subseteq \{(u, v) \mid u, v \in V\}$

A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or directed, meaning that each edge goes from one vertex to another.

Figure 1: directed graph and undirected graph, respectively

2.1.2. Attributed graph

In applications such as character recognition, schematic drawing analysis or two-dimensional shape analysis, plain graphs cannot carry enough information to solve the problem. In these cases we use attributed graphs, which can incorporate more semantic information in their vertices or edges.

2.1.3. Subgraph

A subgraph of a graph G is a graph whose vertex set and edge set are subsets of those of G. We say that a graph G contains another graph H if some subgraph G’ of G is equal to H or isomorphic to H. An induced subgraph of G is a subgraph G’ of G that contains every edge of G whose endpoints both belong to the vertex set of G’.



Let G = (VG, AG). Then H = (VH, AH) is a subgraph of G if:

• VH ⊆ VG
• AH ⊆ AG
• (VH, AH) is a graph, that is, every edge in AH joins two vertices of VH

Figure 2: Subgraphs
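The three subgraph conditions can be checked mechanically. The following is a minimal Python sketch for undirected graphs; the function names `norm_edges`, `is_subgraph` and `induced_subgraph` and the small example graph are illustrative assumptions, not part of the software described in this work.

```python
def norm_edges(edges):
    """Store undirected edges as frozensets so that (a, b) equals (b, a)."""
    return {frozenset(e) for e in edges}


def is_subgraph(vh, ah, vg, ag):
    """Return True if H = (vh, ah) is a subgraph of G = (vg, ag)."""
    vh, vg = set(vh), set(vg)
    ah, ag = norm_edges(ah), norm_edges(ag)
    return (
        vh <= vg                        # VH is a subset of VG
        and ah <= ag                    # AH is a subset of AG
        and all(e <= vh for e in ah)    # every edge of H joins vertices of H
    )


def induced_subgraph(vg, ag, keep):
    """Return the subgraph of G induced by the vertex subset `keep`."""
    keep = set(keep) & set(vg)
    return keep, {e for e in norm_edges(ag) if e <= keep}


# Hypothetical example graph: a triangle a-b-c with a pendant vertex d.
VG = {"a", "b", "c", "d"}
AG = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
```

With this example, `is_subgraph({"a", "b"}, [("b", "a")], VG, AG)` holds, while a candidate whose edge uses a vertex outside its own vertex set is rejected by the third condition.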

2.1.4. Graph edit distance

The basic idea behind the graph edit distance is to define the dissimilarity of two graphs as the minimum amount of distortion required to transform one graph into the other. To this end, a number of distortion or edit operations e are defined, consisting of the insertion, deletion and substitution of both nodes and edges. Given these edit operations, for every pair of graphs g1 and g2 there exists a sequence of edit operations, or edit path, p(g1, g2) = (e1, e2, ..., ek) (where each ei denotes an edit operation) that transforms g1 into g2. Figure 3 gives an example of an edit path between two graphs g1 and g2. This edit path consists of one edge deletion, one node substitution, one node insertion and two edge insertions.

In general, several edit paths exist between two given graphs. This set of edit paths is denoted by ρ(g1, g2). In order to evaluate quantitatively which edit path is the best, edit costs are introduced. The fundamental idea is to assign a penalty cost c to each edit operation according to the amount of distortion it introduces in the transformation. The edit distance between two graphs g1 and g2, denoted by d(g1, g2), is the cost of the minimum-cost edit path that transforms one graph into the other.

Given two graphs $g_1 = (V_1, E_1, \mu_1, \nu_1)$ and $g_2 = (V_2, E_2, \mu_2, \nu_2)$, the graph edit distance between $g_1$ and $g_2$ is defined by:

$$d(g_1, g_2) = \min_{(e_1, \ldots, e_k) \in \rho(g_1, g_2)} \sum_{i=1}^{k} c(e_i)$$

where $\rho(g_1, g_2)$ denotes the set of edit paths that transform $g_1$ into $g_2$ and $c(e)$ denotes the cost of an edit operation $e$.

Figure 3: A possible edit path between two graphs g1 and g2. Note that node labels are indicated by different colours.
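To make the definition concrete, the sketch below evaluates the cost of individual edit paths and takes the minimum over a given list of candidates. It is only an illustration: the cost table `COST`, the operation names and both example paths are assumptions, and a real edit distance has to minimise over all edit paths in ρ(g1, g2), not over a hand-picked list.

```python
def path_cost(path, cost):
    """Sum the penalty c(e) of every edit operation e in one edit path."""
    return sum(cost[kind] for kind, _target in path)


def edit_distance(paths, cost):
    """Cost of the cheapest edit path among the candidates supplied."""
    return min(path_cost(p, cost) for p in paths)


# Hypothetical penalty cost for each kind of edit operation.
COST = {"node_sub": 1, "node_ins": 2, "node_del": 2, "edge_ins": 1, "edge_del": 1}

# Same mix of operations as the edit path described for Figure 3
# (the target names are made up).
PATH_A = [("edge_del", "e1"), ("node_sub", "v1"), ("node_ins", "v4"),
          ("edge_ins", "e2"), ("edge_ins", "e3")]

# A second, made-up edit path between the same two graphs.
PATH_B = [("node_del", "v1"), ("node_ins", "v1b"), ("node_ins", "v4"),
          ("edge_ins", "e2"), ("edge_ins", "e3"), ("edge_ins", "e4")]

distance = edit_distance([PATH_A, PATH_B], COST)
```

Under these assumed costs the first path costs 6 and the second 9, so the minimum over the two candidates is 6.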


2.1.5. Adjacency matrix

To record whether two vertices are connected by an edge, we use a matrix. For a graph with n vertices, we construct an n × n matrix A. If there is an edge between the vertex of row i and the vertex of column j, the entry A[i, j] is 1; otherwise it is 0.

Figure 4: graph along with his adjacency matrix
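As an illustration, the following Python sketch builds the adjacency matrix of an undirected graph from its edge list. The function name `adjacency_matrix`, the 0-based vertex numbering and the example graph are assumptions made for this sketch.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected graph.

    Vertices are assumed to be numbered 0..n-1.
    """
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1   # edge between row u and column v
        matrix[v][u] = 1   # undirected graph: the matrix is symmetric
    return matrix


# Hypothetical graph: a triangle 0-1-2 plus the edge 2-3.
A = adjacency_matrix(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
for row in A:
    print(row)
```

For a directed graph only the first assignment would be kept, so the matrix would no longer need to be symmetric.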

2.1.6. Cycle

A cycle graph (or circular graph) is a graph that consists of a single cycle, in other words, some number of vertices connected in a closed chain. The cycle graph with n vertices is called Cn. The number of vertices in Cn equals the number of edges, and every vertex has degree 2, that is, every vertex has exactly two edges incident with it.

2.1.7. Girth of a planar graph

The girth of a planar graph is the length of a shortest cycle contained in the graph, that is, the smallest number of edges in any closed chain of vertices. If the graph does not contain any cycle (i.e. it is an acyclic graph), its girth is defined to be infinity. A square has girth 4 and a triangle has girth 3.

Figure 5: girth of planar graph=3
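A simple way to compute the girth of a small graph is to run a breadth-first search from every vertex and record the shortest cycle closed by a non-tree edge. The sketch below follows this standard approach for simple undirected graphs; it runs in O(n·(n + a)) time and is not the linear-time method for planar graphs cited later in this work. The function name `girth` and the example graphs are assumptions.

```python
from collections import deque


def girth(adj):
    """Length of a shortest cycle of a simple undirected graph.

    `adj` maps each vertex to the set of its neighbours.
    Returns float("inf") when the graph is acyclic.
    """
    best = float("inf")
    for source in adj:
        dist = {source: 0}
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:            # tree edge: keep exploring
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:         # non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[v] + 1)
    return best


# Hypothetical examples: a triangle, a square and a path.
TRIANGLE = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
SQUARE = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
PATH = {0: {1}, 1: {0, 2}, 2: {1}}
```

On these examples the function returns 3 for the triangle, 4 for the square and infinity for the path, matching the values stated in the text.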

2.2. Related with planar graphs

2.2.1. Planar graph

A planar graph is a graph that can be drawn in the plane without any two edges crossing. Kuratowski’s theorem characterizes this kind of graph:

• A graph is planar if and only if it does not contain a subgraph that is a subdivision of K5 (the complete graph on five vertices) or of K3,3 (the complete bipartite graph on six vertices, three on each side).

A subdivision is the result of inserting new vertices along the edges that go from node to node.

Figure 6: K5
Figure 7: K3,3
Figure 8: Planar Graph

Whether a graph is planar can be decided in linear time O(n). In addition, the following two theorems give quick necessary conditions for planarity:

• Theorem 1: If n ≥ 3 then a ≤ 3n − 6 (where n is the number of vertices and a the number of edges)

• Theorem 2: If n > 3 and there are no cycles of length 3, then a ≤ 2n − 4

With these conditions we can prove that K5 and K3,3 are not planar. K5 has n = 5 ≥ 3 and a = 10, but 10 ≤ 3·5 − 6 = 9 does not hold, so it violates Theorem 1. K3,3 has n = 6 > 3, a = 9 and no cycles of length 3, but 9 ≤ 2·6 − 4 = 8 does not hold, so it violates Theorem 2. Note that the two theorems can only show that a graph is not planar: a graph that satisfies both inequalities is not necessarily planar.
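The two edge-count bounds are easy to evaluate. The sketch below applies Theorems 1 and 2 to a few well-known graphs; the function name `passes_edge_bounds` and the table `EXAMPLES` are assumptions of this illustration. Passing the test does not prove planarity, as the Petersen graph (10 vertices, 15 edges, no triangles, not planar) shows.

```python
def passes_edge_bounds(n, a, triangle_free=False):
    """Check the necessary planarity conditions of Theorems 1 and 2.

    n: number of vertices, a: number of edges.
    Returns False only when the graph is certainly not planar.
    """
    if n >= 3 and a > 3 * n - 6:                      # Theorem 1 violated
        return False
    if triangle_free and n > 3 and a > 2 * n - 4:     # Theorem 2 violated
        return False
    return True


# (vertices, edges, has no cycle of length 3)
EXAMPLES = {
    "K5": (5, 10, False),        # 10 > 3*5 - 6 = 9
    "K3,3": (6, 9, True),        # 9 > 2*6 - 4 = 8
    "K4": (4, 6, False),         # planar, and within the bound
    "Petersen": (10, 15, True),  # not planar, yet within both bounds
}

for name, (n, a, tri_free) in EXAMPLES.items():
    print(name, passes_edge_bounds(n, a, tri_free))
```

A full planarity test needs more than edge counting, for example a search for the Kuratowski subgraphs or a dedicated linear-time planarity algorithm.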

2.2.2. Attribute planar graph

Earlier we defined an attributed graph as a graph that can incorporate more semantic information in its vertices or edges. An attributed planar graph is the same concept applied to a planar graph.

Figure 9: Planar graph with their attributes in vertices and edges

Attributes: colour = ”red”, position = {0,0}, weight = 10




2.2.3. Representation based on planar graphs

A representation based on graphs or planar graphs allows us to understand better the domain being studied, and the same domain can have several such representations. Many different objects can be represented using planar graphs. The following figure shows how a street map can be transformed into a planar graph.

Figure 10: Building planar graph from street map

2.3. Related with planar graph matching

2.3.1. Dynamic programming

Dynamic programming is a programming method used to reduce the runtime of an algorithm on problems that have overlapping subproblems and optimal substructure. For example, when finding the shortest path between two vertices of a graph, exploiting optimal substructure means first finding the best paths from the initial vertex to intermediate vertices and then using these solutions to choose the best route. Dynamic programming can be carried out in either of two ways:

- Top-down: The problem is divided into subproblems, whose solutions are stored when they are first computed. It is a combination of recursion and memoization.

- Bottom-up: All the subproblems that might be needed are solved before tackling a bigger problem. This approach reduces function-call overhead and memory consumption.

The latter is the method that we use in our algorithm to traverse the tree decomposition.
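The difference between the two styles can be seen on the shortest-path example. The sketch below solves the same problem top-down (recursion plus memoization) and bottom-up (subproblems solved in a fixed order). The weighted acyclic graph `EDGES`, the vertex `TARGET` and the function names are assumptions made for this illustration; it is not the traversal used on the tree decomposition.

```python
from functools import lru_cache

# Hypothetical weighted directed acyclic graph: EDGES[u] = [(v, weight), ...].
# Vertices are numbered so that every edge goes from a lower to a higher number.
EDGES = {0: [(1, 2), (2, 5)], 1: [(2, 1), (3, 6)], 2: [(3, 2)], 3: [(4, 1)], 4: []}
TARGET = 4


@lru_cache(maxsize=None)
def shortest_top_down(u):
    """Top-down: recurse on the successors and memoize each result."""
    if u == TARGET:
        return 0
    return min((w + shortest_top_down(v) for v, w in EDGES[u]),
               default=float("inf"))


def shortest_bottom_up(source):
    """Bottom-up: solve every subproblem before the ones that depend on it."""
    dist = {TARGET: 0}
    for u in sorted(EDGES, reverse=True):   # successors are processed first
        if u != TARGET:
            dist[u] = min((w + dist[v] for v, w in EDGES[u]),
                          default=float("inf"))
    return dist[source]
```

Both functions return the same distance from vertex 0 to the target (6 for this graph); the bottom-up version avoids recursion entirely, which is the property the text refers to.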

2.3.2. Rooted Spanning Tree

Given a graph G, a spanning tree is a subgraph of G that is a tree and contains all the vertices of G. A rooted spanning tree is obtained by additionally choosing an arbitrary vertex as the root.



Figure 11: Rooted Spanning Tree given a graph G
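One common way to obtain a rooted spanning tree is a breadth-first search from the chosen root, keeping for every vertex the edge through which it was first reached. The sketch below assumes a connected undirected graph; the function name `rooted_spanning_tree` and the example graph `ADJ` are illustrative assumptions.

```python
from collections import deque


def rooted_spanning_tree(adj, root):
    """Return a parent map describing a spanning tree rooted at `root`.

    `adj` maps each vertex to the set of its neighbours. The graph is
    assumed to be connected; otherwise only the component of `root` is spanned.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):      # sorted only to make the result deterministic
            if v not in parent:
                parent[v] = u         # tree edge (u, v)
                queue.append(v)
    return parent


# Hypothetical graph: triangle A-B-C with an extra vertex D attached to B.
ADJ = {"A": {"B", "C"}, "B": {"A", "C", "D"}, "C": {"A", "B"}, "D": {"B"}}
TREE = rooted_spanning_tree(ADJ, "A")
```

The resulting tree has one edge fewer than the number of vertices; choosing a different root or a different search order generally yields a different spanning tree.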

2.3.3. Tree width

The width of a tree decomposition is the size of its largest label set minus one. The tree width of a graph G is the minimum width among all possible tree decompositions of G. Graphs with tree width at most k are also called partial k-trees. Determining whether a given graph G has tree width at most a given value k is an NP-complete problem. Nonetheless, when k is a fixed constant, the graphs with tree width k can be recognized, and a tree decomposition of width k constructed for them, in linear time.

2.3.4. Tree Decomposition

Tree decomposition is a graph-theoretic technique that maps a given graph onto a tree in order to make problems on the original graph easier to solve, for example to speed them up. Why do we use this technique? Many problems on general graphs are NP-complete, so no polynomial-time algorithm is known for them. A tree decomposition helps because the size of each tree node's label is bounded by the width of the decomposition (tree-width); when this value is a constant, the decomposition can be computed in linear time.

Definition: A tree decomposition of a graph G consists of a tree T in which each node N ∈ T has a label L(N) ⊂ V(G), such that the set of tree nodes whose labels contain any particular vertex of G forms a contiguous subtree of T, and such that any edge of G connects two vertices belonging to the same label L(N) for at least one node N of T. The width of the tree decomposition is one less than the size of the largest label set in T. The tree-width of G is the minimum width of any tree decomposition of G.

Figure 12: Planar graph tree decomposition


In this example we can see that the definition holds. We have a tree T in which each label is a set of vertices belonging to G. Starting from an arbitrary root node, such as L(N) = {ABC}, we can see that the tree keeps a contiguous structure. Moreover, each label shares at least two vertices with the label of each neighbouring node, whether a node has one child or two, as with {BEG}, which connects with {BFG} and {EGH}. The width of this tree decomposition is w = 2.
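The two requirements of the definition in this section (edge coverage and contiguity) can be verified directly. The sketch below does so for a small hypothetical graph and decomposition, which are not the ones of Figure 12; the names `is_tree_decomposition`, `width`, `LABELS` and `TREE_EDGES` are assumptions. The contiguity test relies on the fact that, inside a tree, a set of k nodes is connected exactly when it spans k − 1 tree edges.

```python
def is_tree_decomposition(vertices, edges, labels, tree_edges):
    """Check the tree decomposition conditions.

    labels: dict mapping each tree node N to its label L(N), a set of graph vertices.
    tree_edges: pairs of tree nodes, assumed to form a tree.
    """
    # Every edge of G must lie inside the label of at least one tree node.
    for u, v in edges:
        if not any(u in lab and v in lab for lab in labels.values()):
            return False
    # The tree nodes containing a given vertex must form a contiguous subtree.
    for x in vertices:
        nodes = {n for n, lab in labels.items() if x in lab}
        inner = sum(1 for a, b in tree_edges if a in nodes and b in nodes)
        if inner != len(nodes) - 1:   # also fails when no label contains x
            return False
    return True


def width(labels):
    """Width of the decomposition: size of the largest label minus one."""
    return max(len(lab) for lab in labels.values()) - 1


# Hypothetical graph and decomposition.
VERTICES = {"A", "B", "C", "D", "E"}
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
LABELS = {0: {"A", "B", "C"}, 1: {"B", "C", "D"}, 2: {"D", "E"}}
TREE_EDGES = [(0, 1), (1, 2)]
```

For this example the check succeeds and the width is 2; moving vertex D from the middle label to the first one would keep every edge covered but break contiguity.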

2.3.5. Graph isomorphism

In graph theory, an isomorphism between two graphs G and H is a bijection $f: V(G) \to V(H)$ between their vertex sets that preserves adjacency: two vertices u and v are adjacent in G if and only if f(u) and f(v) are adjacent in H. Isomorphic graphs must therefore have the same number of vertices and edges. The following example helps to understand the definition:

Figure 13: Isomorphism example between graph G and graph H

In Figure 13 the same colours are used to show the correspondence between the vertices of graph G and graph H; both graphs contain the same number of edges and their adjacencies match. Graph isomorphism properties:

• Same number of vertices and edges
• Same vertex degrees (the degree sequences of the two graphs are equal)
• Adjacency matrices are equal up to a reordering of the vertices
• Same structural features (loops, multiple edges, triangles...)

Isomorphism between G and H

f(a)=1 f(b)=6 f(c)=8 f(d)=3 f(g)=5 f(h)=2 f(i)=4 f(j)=7


2.3.6. Graph isomorphism problem

Deciding whether two graphs with the same number of vertices n and the same number of edges e are isomorphic is known as the graph isomorphism problem. The problem admits a brute-force approach that checks whether any of the n! bijections preserves adjacency, but no efficient algorithm is known yet, at least for the general case. In this context, efficient means that the number of steps grows more slowly than $O(e^n)$. In computational complexity theory, the graph isomorphism problem belongs to NP, but it is not known whether it is solvable in polynomial time or whether it is NP-complete.
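The brute-force approach mentioned above can be written in a few lines. The sketch below tries all n! bijections and is therefore usable only for very small graphs; it is meant to illustrate the definition, not to be an efficient method. The function names and the example graphs are assumptions.

```python
from itertools import permutations


def is_isomorphism(f, edges_g, edges_h):
    """Check that the bijection f (a dict) maps the edges of G onto those of H."""
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    # With equal edge counts and f one-to-one, mapping every edge of G to an
    # edge of H means adjacency is preserved in both directions.
    return len(eg) == len(eh) and all(
        frozenset(f[u] for u in e) in eh for e in eg
    )


def find_isomorphism(vg, edges_g, vh, edges_h):
    """Try every bijection V(G) -> V(H); return one isomorphism or None."""
    vg, vh = list(vg), list(vh)
    if len(vg) != len(vh):
        return None
    for image in permutations(vh):
        f = dict(zip(vg, image))
        if is_isomorphism(f, edges_g, edges_h):
            return f
    return None


# Hypothetical examples: two drawings of a 4-cycle, then a path versus a star.
CYCLE_G = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
CYCLE_H = [(1, 3), (3, 2), (2, 4), (4, 1)]
PATH_G = [("a", "b"), ("b", "c"), ("c", "d")]
STAR_H = [(1, 2), (1, 3), (1, 4)]
```

The two 4-cycles are isomorphic, whereas the path and the star are not, even though they have the same number of vertices and edges; the vertex degrees already differ.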

2.4. Other definitions

2.4.1. Euclidean distance

Euclidean distance is the “ordinary” distance between two points that one would measure with a ruler, and is given by the Pythagorean formula. The Euclidean distance between two points p and q is the length of the line segment $\overline{pq}$. In Cartesian coordinates, if $p = (p_1, p_2, \ldots, p_n)$ and $q = (q_1, q_2, \ldots, q_n)$ are two points in Euclidean n-space, then the distance from p to q is given by:

$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2} = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

In one dimension, the distance between two points on the real line is the absolute value of their numerical difference. Thus if x and y are two points on the real line, then the distance between them is computed as

$$\sqrt{(x - y)^2} = |x - y|$$
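The formula translates directly into code. The sketch below is a plain Python version for points given as equal-length sequences of coordinates; the function name `euclidean` is an assumption, and the standard library function `math.dist` (Python 3.8 and later) computes the same quantity.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Two-dimensional example: the classic 3-4-5 right triangle.
d2 = euclidean((0, 0), (3, 4))

# One-dimensional example: the distance reduces to |x - y|.
d1 = euclidean((2,), (7,))
```

Here `d2` is 5.0 and `d1` equals `abs(2 - 7)`, in agreement with the one-dimensional case described above.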



3. Current technology status about the problem to solve

3.1. Introduction

In recent years graphs have been recognized as a powerful concept for representing structural patterns. Similarity measures for graphs that are based on an exact structural correspondence, such as graph isomorphism and maximum common subgraph, are often quite efficient. Due to the enormous computational complexity of the matching problem for general graphs, a number of authors have studied special classes of graphs, such as trees, bounded-valence graphs and graphs with unique node labels.

Graphs are commonly used as abstract representations of complex scenes, and many computer vision problems can be formulated as an attributed graph matching problem, where the nodes of the graphs correspond to local features of the images and the edges correspond to relational aspects between features. Graph matching consists of finding a correspondence between the nodes of two graphs such that the graphs look almost equal when their vertices are labeled according to that correspondence. Research in graph matching has mainly focused on finding faster algorithms that solve this NP problem approximately.

We can define the graph matching problem as follows: given two graphs G1 = (V1, E1) and G2 = (V2, E2) with |V1| = |V2|, the problem is to find a one-to-one mapping $f: V_2 \to V_1$ such that $(u, v) \in E_2$ if and only if $(f(u), f(v)) \in E_1$. When such a mapping f exists, it is called an isomorphism, and G2 is said to be isomorphic to G1. This type of problem is called exact graph matching. Donatello Conte [2] notes that a weaker form of matching is subgraph isomorphism, which requires that an isomorphism holds between one of the two graphs and a node-induced subgraph of the other.
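Subgraph isomorphism in the node-induced sense can be illustrated by exhaustive search: every injective mapping of the pattern vertices into the larger graph is tested. The sketch below is exponential and serves only to make the definition concrete; it is not the planar algorithm of [1]. The function name `subgraph_isomorphisms`, the `induced` flag and the example graphs are assumptions.

```python
from itertools import permutations


def subgraph_isomorphisms(vh, eh, vg, eg, induced=True):
    """List every injective mapping of pattern H into G that preserves edges.

    With induced=True, non-adjacent pattern vertices must also map to
    non-adjacent vertices of G (node-induced subgraph isomorphism).
    """
    vh = list(vh)
    eh_set = {frozenset(e) for e in eh}
    eg_set = {frozenset(e) for e in eg}
    found = []
    for image in permutations(list(vg), len(vh)):
        f = dict(zip(vh, image))
        ok = True
        for i, u in enumerate(vh):
            for v in vh[i + 1:]:
                in_h = frozenset((u, v)) in eh_set
                in_g = frozenset((f[u], f[v])) in eg_set
                if in_h and not in_g:
                    ok = False      # an edge of H is missing in G
                if induced and in_g and not in_h:
                    ok = False      # G has an extra edge between the images
        if ok:
            found.append(f)
    return found


# Hypothetical graph G: a square 1-2-3-4 with the diagonal 1-3.
VG = [1, 2, 3, 4]
EG = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
# Patterns: a triangle and a path on three vertices.
TRIANGLE = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
PATH3 = (["a", "b", "c"], [("a", "b"), ("b", "c")])
```

In this example the triangle occurs in two places (each counted once per automorphism of the pattern), and the path has fewer induced occurrences than plain ones because only the pair (2, 4) is non-adjacent in G.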

Nonetheless, it is often not possible to find an isomorphism between the two graphs to be matched. This happens when the number of vertices in G1 and G2 is different, which may be due to the schematic nature of the model and the difficulty of accurately segmenting the image into meaningful entities. In these cases the graph matching problem does not consist of searching for an exact way of matching the vertices of one graph with the vertices of the other; instead, we want to find the best matching between both graphs. These cases lead to a class of problems known as inexact graph matching, in which the matching focuses on finding a non-bijective correspondence between G1 and G2.

Interest in inexact graph matching has increased in the last few years due to the application of computer vision to areas such as cartography, medicine and character recognition. In all of these areas, automatic segmentation of images often results in over-segmentation, and therefore two graphs may contain different numbers of vertices and edges. That is the reason why applications in these fields need inexact graph matching techniques. Donatello Conte [2] also notes that inexact graph matching using approximate or suboptimal matching algorithms only ensures finding a local minimum of the matching cost. Usually this minimum is not very far from the global one, but there are no guarantees.



Although inexact graph matching is faster than exact graph matching, it cannot guarantee an optimal solution, because it discards cases that it considers not possible. Exact graph matching, on the other hand, is able to find all the possible solutions, but takes more time.

The best correspondence of a graph matching problem is defined as the optimum of some objective function which measures the similarity between matched vertices and edges.

Figure 14 shows a classification of the graph matching types, split into exact graph matching and inexact graph matching:

Figure 14: Graph matching classification

3.2. State of the art in literature

So far no papers have been presented that study attributes in planar graph matching. Nevertheless, several authors have studied applications of planar graph matching.

Frank R. [4] introduced a fast graph cut algorithm for planar graphs, which leads to an efficient method applied to shape matching and image segmentation. Numerous computational challenges, such as image segmentation or shape matching, can be solved by planar graph cuts. In particular, the algorithm is able to match two different planar shapes of N points in O(N² log N) and to segment a given image of N pixels in O(N log N).

Figure 15: Planar graph cut applications [4]

Another study, not related to matching in planar graphs but concerned with finding planar subgraphs, was introduced by G. Calinescu, Cristina G. Fernandes, U. Finkler and H. Karloff [5]. They sought a better approximation algorithm for finding planar subgraphs: given a graph G, the problem is to find a largest planar subgraph of G, that is, a planar subgraph with the maximum number of edges. Solutions to this problem have applications in circuit layout, facility layout and graph drawing.

[Figure 14 diagram labels: Graph matching; Exact Graph Matching (Graph Isomorphism, Subgraph Isomorphism); Inexact (Best) Graph Matching (Attributed Graph Matching, Attributed Subgraph Matching)]



Michel Neuhaus and Horst Bunke [6] focus on the problem of efficiently matching attributed planar graphs in the context of the edit distance framework. Their paper is the study that most resembles our work, because it is based on attributes. Nevertheless, it is not the same: their method is error-tolerant, since computing the graph edit distance requires inserting, deleting or substituting nodes and edges, and with isomorphisms these kinds of operations are not allowed. Furthermore, they use attributes in planar graph edges.

Sayyed Bashir Sadjad [7] presents the algorithm of Eppstein [1] for the well-known subgraph isomorphism problem in planar graphs. In the next section we discuss this algorithm in depth.

Results so far: In general, given a pattern H, the time needed to search for isomorphisms is exponential, but with the planar graph algorithm this time can be improved. The results achieved so far are:

- We can test whether any fixed pattern H is a subgraph of a planar graph G, or count the number of occurrences of H as a subgraph of G, in time O(n).

- If a connected pattern H has k occurrences as a subgraph of a planar graph G, we can list all occurrences in time O(n + k). If H is 3-connected, then k = O(n) [8], and we can list all occurrences in time O(n).

- We can count the number of induced subgraphs of a planar graph G isomorphic to any fixed connected pattern H in time O(n), and if there are k occurrences we can list them in time O(n + k).

- For any planar graph G for which we know a constant bound on the diameter, we can compute the exact diameter in time O(n).

- For any constant h we can solve the h-clustering and connected h-clustering problems [9] in planar graphs in time O(n).

- For any planar graph G for which we know a constant bound on the girth, we can compute the exact girth in time O(n). The same bound holds if instead of girth we ask for the shortest separating cycle or for the shortest nonfacial cycle in a given plane embedding of the graph.

- For any planar graph G, we can compute the vertex connectivity and edge connectivity of G in time O(n). (For planar multigraphs, we can test k-edge-connectivity for any fixed k in time O(n).)

- For any planar graph G and any constant l, we can construct in time O(n) a linear-space routing data structure which can test for any pair of vertices whether their distance is at most l, and if so find a shortest path between them, in time O(log n).



4. New solution design, justifying the new proposal

In this section we describe the theoretical parts of the new model. The main idea is based on Eppstein’s algorithm [1], which matches graphs against planar graphs and finds how many isomorphisms there are between them. In section 3 we discussed the different types of graph matching. We now want to add a new feature to exact graph matching, because Eppstein’s algorithm works with subgraph isomorphism. First, we describe how his algorithm works, since our new solution is based on it and would be impossible to understand without it. After that, we define our solution, which applies attributed planar graph isomorphism to perform the matching.

Figure 16: Graph matching classification including attributes in the Subgraph Isomorphism

4.1. Planar Graph Isomorphism

4.1.1. Objectives

We know that graph matching is an NP-complete problem, but for planar graphs without attributes it can be solved in polynomial time. When we match two graphs, we compute all the possible combinations in which the vertices and edges of a graph G have the same structure as the vertices and edges of a graph H. To solve the matching in polynomial time, Eppstein considers graph H as a subgraph of the planar graph G; his algorithm is based on partitioning the planar graph into pieces of small tree-width and applying dynamic programming within each piece. The same methods can be used to solve other planar graph problems, including connectivity, diameter, induced subgraph isomorphism and shortest paths. Thus Eppstein’s algorithm [1] matches a planar graph G against a graph H that is a subgraph of G, and finds how many isomorphisms there are between them.

[Figure 16 diagram nodes: Graph Matching; Exact Graph Matching; Graph Isomorphism; Subgraph Isomorphism; Exact Attributed Subgraph Isomorphism; Inexact Subgraph Isomorphism with attributes; Inexact (Best) Graph Matching; Attributed Graph Matching; Attributed Subgraph Matching]



4.1.2. Definitions

Lemma 1: The subtree rooted at N provides a tree decomposition of the associated induced subgraph of G.

Proof: The only property of a tree decomposition that does not follow immediately is the requirement that each edge connects two vertices contained in the label of some node. Any edge (v,v') of the induced subgraph must have {v,v'} ⊂ L(N') for some node N', but N' may not be a descendant of N. If it is not, v belongs both to L(N') and, by assumption, to L(N'') for some descendant N'' of N. Therefore, by contiguity, v ∈ L(N), and by the same argument v' ∈ L(N), so in this case v and v' still both belong to the label of at least one node in the subtree.

Figure 17: Example with 2 vertices of subgraph L(N) are contained in a subgraph L(N’)

In Figure 17 we see that the vertices of the subgraph L(N) and the vertices of the subgraph L(N') are always connected by an edge. Furthermore, if the vertices {a,b} are mapped onto {C,D} respectively, they still belong to the subgraph L(N'') because it also belongs to the subtree rooted at N.

Lemma 2: Assume we are given a graph G with n vertices along with a tree decomposition T of G with width w. Let S be a subset of the vertices of G, S ⊆ V(G), and let H be a fixed graph with at most w vertices. Then in time $2^{O(w \log w)} n$ we can count all isomorphs of H in G that include some vertex in S. We can list all such isomorphs in time $2^{O(w \log w)} n + O(kw)$, where k denotes the number of isomorphs and the term kw represents the total output size.

NOTE: We assume that S is equal to V(G), so we work with the whole vertex set of G.

Proof: We process the tree T by dynamic programming. A partial isomorphism at a node N of the tree is an isomorphism between an induced subgraph H' of the pattern H and the induced subgraph of G associated with the subtree rooted at N. For a node N, let G'_N be the graph formed from the subgraph of G induced by the vertex set L(N) by adding two additional vertices x and y, which are connected to all the vertices of L(N).


Figure 18: subgraph generation G’N with vertices x and y

The idea is that, for each node N of the tree decomposition, we count the partial isomorphisms of H, working from the leaf nodes to the root, i.e. bottom-up. Vertex x represents any vertex that appears in a label of the subtree rooted at N but is not in the current label L(N). Vertex y represents the vertices that do not appear in any label of the subtree rooted at N, including L(N) itself.

Recall that the width w is a constant equal to the size of the largest label L(N) minus 1.

Figure 19: Partial isomorph of a triangle graph H in the induced subgraph associated with node {A,B,D,E}, and corresponding partial isomorph boundary mapping the triangle to G'

Figure 19 shows a typical situation that arises when computing partial isomorphisms. Two vertices of H are mapped directly onto vertices A and B of the induced subgraph G', while vertex b of H is mapped to y. This means that vertex b has not been matched yet, because the current subgraph G' does not contain vertex F; it will be matched when we go up the tree decomposition and find that vertex F of G is contained in subgraph G'' or in another induced subgraph.

The next example shows a case in which graph H is matched successfully to graph G. Figure 20 shows a mapping that lies entirely within the induced subgraph G'', so there is no need to check the other labels of the tree, because no vertex of H is mapped to x or y.

Figure 20: Partial isomorph of a triangle graph H in the induced subgraph associated with node {A,B,C,D}, and corresponding partial isomorph boundary successfully mapping the triangle to G''

4.1.3. Consistency

Suppose that node N has children N1 and N2. We say that two partial isomorphisms B: H → G'_N and B1: H → G'_N1 (or, likewise, B: H → G'_N and B2: H → G'_N2) are consistent if the following conditions all hold:

• For each vertex v ∈ H, if B(v) ∈ L(N1) or B1(v) ∈ L(N), then B(v) = B1(v).
• For each vertex v ∈ H, if B(v) ≠ X, then B1(v) ∈ L(N) ∪ {Y}.
• At least one vertex v ∈ H has B1(v) ∉ L(N) ∪ {Y1} or B2(v) ∉ L(N) ∪ {Y2}. This condition applies only when both B1 and B2 exist.

When N has two children, these conditions are applied to B1 and to B2 in the same way, since both have the same parent.

When N has only one child, B1 must also satisfy the following condition, although it is not strictly one of the consistency conditions.



• For each vertex v ∈ H, if B(v) = x and B1(v) = y1, the candidate partial isomorphism is discarded.

Figure 21: Discarding partial isomorphism

4.1.3.1. Checking that two nodes N and N1 are consistent

There are several situations to consider when checking the consistency between a parent and its child. Applying the consistency conditions discards the partial isomorphisms that are not valid. In total there are nine different situations, depending on whether the image of a vertex under B belongs to L(N), X or Y and whether its image under B1 belongs to L(N1), X or Y1.

Figure 22: Different options to check the consistency

Figure 22 shows that when B maps a vertex v into L(N), B1 may map it into L(N1), X or Y1. It is also possible that B maps v to X, which produces three more options. Finally, B may map v to Y, which gives three more. The nine situations are labelled C1 to C9. Let us work through an example to check the conditions:

- We are given the vertex set V(H) = {a,b,c} of the subgraph H.
- N is a node of the tree whose label L(N) contains a set of vertices of the planar graph G.
- Furthermore, we have B: H → G'_N and B1: H → G'_N1.

With this setting we check all the possible cases and discard every invalid partial isomorphism.

• For each vertex v ∈ H, if B(v) ∈ L(N1) or B1(v) ∈ L(N), then B(v) = B1(v)

The mapped vertices must have the same image in B and B1 to be consistent. In the following cases we assume that N has just one child.

Test 1:

       A    C    D    E    X    Y
B:     a    b    c    Ø    Ø    Ø

       A    B    C    E    X    Y1
B1:    a    b    c    Ø    Ø    Ø

Case 1 - B(a) ∈ L(N1)? Yes, or B1(a) ∈ L(N)? Yes => B(a) = B1(a)? Yes
Case 2 - B(b) ∈ L(N1)? Yes, or B1(b) ∈ L(N)? No => B(b) = B1(b)? No
Case 3 - B(c) ∈ L(N1)? No, or B1(c) ∈ L(N)? Yes => B(c) = B1(c)? No

In this test, Case 1 corresponds to situation C1 and is accepted, so it is a possible partial isomorphism. Case 2 is also situation C1, but this time the condition does not hold because B1(b) is not mapped to the same vertex as B(b), so we discard it. The same happens in Case 3. Bear in mind that the condition must hold for all v ∈ H: if any vertex is rejected by any of the conditions, the whole mapping is discarded as a possible partial isomorphism.

Test 2:

       A    C    D    E    X={B}    Y
B:     c    Ø    Ø    Ø    a        b

       A    B    C    E    X={Ø}    Y1
B1:    Ø    a    b    Ø    c        Ø

Case 4 - B(a) ∈ L(N1)? Yes, or B1(a) ∈ L(N)? No => B(a) = B1(a)? Yes
Case 5 - B(b) ∈ L(N1)? No, or B1(b) ∈ L(N)? Yes => B(b) = B1(b)? No
Case 6 - B(c) ∈ L(N1)? Yes, or B1(c) ∈ L(N)? No => B(c) = B1(c)? No

Case 4 is situation C4 and is accepted because B(a) = B1(a). In Case 5, B1(b) ∈ L(N) but B(b) ∉ L(N1), so we discard it; this is situation C7. It means that vertex b was mapped in L(N1) and has already been treated when we reach the current node, so it can be discarded.

Case 6 is situation C2. This case never happens, because X contains the vertices that are in L(N1) but not in L(N); when we go up the tree these vertices have already been treated and act as an expansion of the label. X is always the result of L(N1) - L(N).

• For each vertex v ∈ H, if B(v) ≠ X, then B1(v) ∈ L(N) ∪ {Y}

Test 3:

       A    C    D    E    X={B}    Y
B:     a    Ø    c    Ø    Ø        b

       A    B    C    E    X={Ø}    Y1
B1:    Ø    Ø    c    Ø    b        a

Case 7 - B(a) ≠ X? Yes => B1(a) ∈ L(N) ∪ {Y1}? Yes
Case 8 - B(b) ≠ X? Yes => B1(b) ∈ L(N) ∪ {Y1}? No
Case 9 - B(c) ≠ X? Yes => B1(c) ∈ L(N) ∪ {Y1}? Yes

Case 7 is situation C3 and the condition is accepted. Case 8 is situation C8. As noted before, no vertex can have B1(v) = x, so we discard this possible isomorphism. Case 9 satisfies the current condition but not the first consistency condition. It is situation C1.

• For each vertex v ∈ H, if B(v) = x and B1(v) = y1, the candidate partial isomorphism is discarded.

Test 4:

       A    C    D    E    X={E,K}    Y
B:     Ø    Ø    Ø    Ø    a,b        c

       A    B    C    E    X={Ø}    Y1
B1:    Ø    Ø    Ø    Ø    a        b,c

Case 10 - B(b) = x and B1(b) = y1 (discard)
Case 11 - B(a) = x and B1(a) = x (discard)
Case 12 - B(c) = y and B1(c) = y1 (not treated yet)

Case 10 is situation C6. We must discard the partial isomorphism because of the last condition. Case 11 is situation C5, although it will never happen. Finally, Case 12 is situation C9. We accept it because it is a possible partial isomorphism that will be treated later.
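The checks carried out in Tests 1 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary representation of B and B1, and the strings "x" and "y" are hypothetical, and the numbering of the situations C1 to C9 (rows for the image under B, columns for the image under B1) is inferred from the cases discussed in the tests rather than taken from Figure 22. The condition that involves both children is not covered here.

```python
X, Y = "x", "y"  # the two extra vertices of G'_N (names assumed for this sketch)

def situation(b_img, b1_img):
    """Number 1..9 of the situation (C1..C9) for one vertex of H (inferred numbering)."""
    order = ("L", X, Y)  # image inside the label, on x, or on y
    row = order.index(b_img if b_img in (X, Y) else "L")
    col = order.index(b1_img if b1_img in (X, Y) else "L")
    return 3 * row + col + 1

def condition1(b, b1, label_n, label_n1):
    """If B(v) is in L(N1) or B1(v) is in L(N), both images must coincide."""
    return {v: b[v] == b1[v] if (b[v] in label_n1 or b1[v] in label_n) else True
            for v in b}

def condition2(b, b1, label_n):
    """If B(v) is not x, B1(v) must lie in L(N) or be y."""
    return {v: b[v] == X or b1[v] in label_n or b1[v] == Y for v in b}

def single_child_rule(b, b1):
    """Reject a vertex that B sends to x while B1 still sends it to y."""
    return {v: not (b[v] == X and b1[v] == Y) for v in b}

def consistent(b, b1, label_n, label_n1):
    """True only if every vertex of H passes every check."""
    checks = (condition1(b, b1, label_n, label_n1),
              condition2(b, b1, label_n),
              single_child_rule(b, b1))
    return all(all(c.values()) for c in checks)

# Data of Test 1: L(N) = {A,C,D,E}, L(N1) = {A,B,C,E}
label_n, label_n1 = set("ACDE"), set("ABCE")
b = {"a": "A", "b": "C", "c": "D"}
b1 = {"a": "A", "b": "B", "c": "C"}
per_vertex = condition1(b, b1, label_n, label_n1)
print(per_vertex, consistent(b, b1, label_n, label_n1))
```

With the data of Test 1, the per-vertex result reproduces Cases 1 to 3: vertex a passes the first condition while b and c fail, so the mapping as a whole is rejected.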



4.1.4. Compatible triple

We say that two partial isomorph boundaries B1: H → G'_N1 and B2: H → G'_N2 form a compatible triple with B if the following conditions hold:

1. B1 and B2 are both consistent with B.

2. For each v with B(v) = x, exactly one of B1(v) and B2(v) is equal to y.

4.1.4.1. Checking that three nodes N, N1 and N2 are consistent

When computing the possible partial isomorphisms we sometimes find a node that has two children. We then apply the same conditions as before, with the difference that it is enough for one of the children to yield a possible partial isomorphism for the mapping to be considered consistent. The first condition of a compatible triple, that B1 and B2 are both consistent with B, is checked as in the previous section, and the following condition applies only to nodes that have two children.

• At least one vertex v ∈ H has B1(v) ∉ L(N) ∪ {Y1} or B2(v) ∉ L(N) ∪ {Y1}. This condition applies only when both B1 and B2 exist.

Test 5:

Recall the setting:

- We are given the vertex set V(H) = {a,b,c} of the subgraph H.
- N is a node of the tree whose label L(N) contains a set of vertices of the planar graph G.
- Furthermore, we have B: H → G'_N and B1: H → G'_N1, and B: H → G'_N and B2: H → G'_N2.

       A    F    H    M    X={E,K}    Y
B:     c    b    Ø    Ø    a          Ø

       A    E    F    M    X={Ø}    Y1
B1:    c    a    b    Ø    Ø        Ø

       F    H    K    M    X={Ø}    Y1
B2:    b    Ø    Ø    Ø    Ø        a,c

Looking at vertex a in B1, we see that the condition holds. In B2, however, a is mapped to Y1; since only one child needs to satisfy the condition, we accept this partial isomorphism.

Test 6:

       A    F    H    M    X={E,K}    Y
B:     Ø    Ø    c    Ø    a,b        Ø

       A    E    F    M    X={Ø}    Y1
B1:    Ø    a    Ø    c    Ø        b

       F    H    K    M    X={Ø}    Y1
B2:    a    Ø    b    Ø    Ø        c

Here vertices a and b satisfy the condition, but vertex c does not satisfy the consistency condition, so we discard this possible partial isomorphism.

Finally, we check the last condition of a compatible triple.

- For each v with B(v) = x, exactly one of B1(v) and B2(v) is equal to y.

Test 7:

       A    F    H    M    X={E,K}    Y
B:     Ø    c    Ø    Ø    a,b        Ø

       A    E    F    M    X={Ø}    Y1
B1:    Ø    a    Ø    c    Ø        b

       F    H    K    M    X={Ø}    Y1
B2:    c    Ø    b    Ø    Ø        a

This means that when a node has two children, X contains at least two vertices, one from the label of N1 and one from the label of N2. If vertex a is mapped to E, then in B2 vertex a must be mapped to Y1, because E does not exist in the label of N2. Likewise, if vertex b is mapped to K, then in B1 vertex b must be mapped to Y1, because K does not exist in the label of N1. If this condition does not hold, we discard the partial isomorphism.

4.1.5. Joining

Consistency and compatible triples matter because they are needed while we traverse the tree towards the root. Each time we go up one label of the tree, we have to check both between the parent label and its child, or its children if the parent has two.

Now define X1(B) as the number of partial isomorphisms with boundary B that include at least one vertex of S, and X2(B) as the number of those that include no vertex of S. To compute Xi(B) we use the values Xi(B1) and Xi(B2) for all mappings B1 and B2 computed at nodes N1 and N2.

For each partial isomorph B1 that is consistent with B, we increment X1(B) by X1(B1) and X2(B) by X2(B1). For a compatible triple B, B1 and B2 (two children), we increment X1(B) by X1(B1) · X1(B2) + X1(B1) · X2(B2) + X2(B1) · X1(B2) and increment X2(B) by X2(B1) · X2(B2). In our case we assume that S = V(G), so:

• For each partial isomorph B1 that is consistent with B, we increment X1(B) by X1(B1).
• For each compatible triple B, B1 and B2, we increment X1(B) by X1(B1) · X1(B2).
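The update rules for the counters can be illustrated with a short Python sketch. The function names and the representation of the pair (X1, X2) as a tuple are assumptions made for this example only; X1 is taken to count the partial isomorphisms that touch S and X2 those that do not, following the general formula above.

```python
def add_single_child(acc, child):
    """Accumulate the counts of one child boundary B1 that is consistent with B."""
    return (acc[0] + child[0], acc[1] + child[1])

def add_compatible_triple(acc, c1, c2):
    """Accumulate the counts of a compatible triple (B, B1, B2).

    Each argument is a pair (X1, X2): counts with and without a vertex of S.
    """
    with_s = c1[0] * c2[0] + c1[0] * c2[1] + c1[1] * c2[0]
    without_s = c1[1] * c2[1]
    return (acc[0] + with_s, acc[1] + without_s)

# Hypothetical counts: B1 has (X1, X2) = (2, 1) and B2 has (3, 4)
counts = add_compatible_triple((0, 0), (2, 1), (3, 4))
print(counts)
```

A quick sanity check of the rule is that the two results always add up to (X1(B1) + X2(B1)) · (X1(B2) + X2(B2)), because every pair of partial isomorphisms from the two children is counted exactly once.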

Figure 23: Resulting label after joining B with B1

Figure 24: Resulting label after joining B with B1 and B2

4.1.6. Graph and planar graph edges

Edges play a very important role when we compute all the possible partial isomorphisms. We examine every possible combination, and if two vertices of the subgraph are joined by an edge but their images in the planar graph are not, the combination cannot be a partial isomorphism and is discarded. This speeds up the computation, because fewer combinations remain to be checked as we go up the tree.

We consider all mappings B from V(H) to L(N) ∪ {x,y} such that if B(h) = B(h') then B(h) ∉ L(N) or h = h'. In addition, if (h,h') is an edge of H, then B(h) = B(h') ∈ {x,y} or (B(h),B(h')) ∈ E(G), where G is the planar graph, H is the subgraph and B is the partial isomorphism.
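As a minimal sketch of this filter, the following Python function tests whether a mapping B is a valid candidate. The function name and the data layout are hypothetical. One assumption goes beyond the wording of the condition above: following the definition of G'_N, where x and y are connected to all the vertices of L(N), an edge of H with exactly one endpoint mapped to x or y is accepted.

```python
X, Y = "x", "y"  # extra vertices of G'_N (names assumed for this sketch)

def is_valid_boundary(b, label, h_edges, g_edges):
    """Check the injectivity and edge conditions for a mapping b: V(H) -> L(N) + {x, y}."""
    label = set(label)
    g_edges = {frozenset(e) for e in g_edges}
    # Two different vertices of H may share an image only outside L(N).
    in_label = [img for img in b.values() if img in label]
    if len(in_label) != len(set(in_label)):
        return False
    for h, h2 in h_edges:
        p, q = b[h], b[h2]
        if p in (X, Y) and p == q:
            continue  # both endpoints collapsed onto x, or both onto y
        if (p in (X, Y)) != (q in (X, Y)):
            continue  # assumption: x and y are adjacent to every vertex of L(N)
        if frozenset((p, q)) in g_edges:
            continue  # the edge of H is matched by an edge of G
        return False
    return True

# Example of the text: G has vertices A, B, C and no edge between A and C
label = "ABC"
g_edges = [("A", "B"), ("B", "C")]
h_edges = [("a", "b")]
print(is_valid_boundary({"a": "A", "b": "B"}, label, h_edges, g_edges))
print(is_valid_boundary({"a": "A", "b": "C"}, label, h_edges, g_edges))
```

The second call is rejected because the edge (a,b) of H would have to be matched by the missing edge (A,C).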


Suppose that we have a graph G with three vertices {A,B,C} and a graph H with two vertices {a,b}, and we want to list all the possible partial isomorphisms. There are several combinations, but not all of them are valid partial isomorphisms:

    A    B    C
    a    b    Ø
    a    Ø    b
    Ø    a    b
    b    a    Ø
    b    Ø    a
    Ø    b    a

Figure 25: Checking and discarding combinations that are not partial isomorphisms

In this figure, graph G has no edge between vertex A and vertex C (marked in colour), so when we compute all the possible combinations we must discard the cases that map a and b onto A and C. Doing so improves performance. In total there are 18 candidate combinations, of which only 12 are partial isomorphisms.

4.1.7. Partial isomorphism list

If there is no v for which B(v) = x, then all partial isomorphisms having boundary B involve only vertices in L(N) and can be enumerated by brute force in time $2^{O(w \log w)}$. Previously we explained how to obtain the partial isomorphisms by traversing the tree. Once this is done and we have arrived at the root, we compute the number of isomorphs for which B(v) ≠ y for all v. We discard any partial isomorphism that still has some vertex mapped to y, because at the root all the vertices have already been treated. The total time for testing all triples for compatibility and performing the above computation is $O(w^{3(w+3)+1}) = 2^{O(w \log w)}$.

Figure 26: Discarding y in the last step

Candidates at the root (columns A B C D x y):

1    a    b    c    Ø    Ø    Ø
2    a    Ø    Ø    b    Ø    c
3    a    b    Ø    Ø    Ø    c
4    b    c    Ø    Ø    Ø    a
5    a    Ø    Ø    Ø    Ø    b,c
6    b    Ø    c    a    Ø    Ø
7    a    c    b    Ø    Ø    Ø
8    b    Ø    a    c    Ø    Ø
9    Ø    b    Ø    Ø    Ø    a,c

After discarding the rows with a vertex in y (columns A B C D x y):

1    a    b    c    Ø    Ø    Ø
2    b    Ø    c    a    Ø    Ø
3    a    c    b    Ø    Ø    Ø
4    b    Ø    a    c    Ø    Ø
5    b    Ø    c    a    Ø    Ø



4.1.8. Subgraph Isomorph Algorithm

The algorithm consists of the following steps:

1- Apply the method of Lemma 5 to find a partition of the vertices into sets Si associated with graphs Gi having low-width tree decompositions.

2- For each i ≥ 0, count or list the subgraph isomorphs of H in Gi that involve at least one vertex of Si, using the algorithm of Lemma 2.

3- Sum all the counts or concatenate the lists to get a count or list of the isomorphs in G.

4.2. Attributed planar graph matching

4.2.1. Objectives

So far we have computed all the possible partial isomorphisms by comparing only vertices and edges. We now apply the same method but add more information in the form of attributes. Attributes attach information such as colour, weight or position to the vertices or edges of a graph. Our model uses attributes only on the vertices, although attributes on the edges would also be possible. Once attributes are implemented, we want to measure the speed of the new algorithm and see whether it improves on the original.

Figure 27: Planar graph with attributes (vertex attributes shown: color, position, weight)

When we look for all the possible partial isomorphisms we apply the same mechanism as in 4.1.6 and, in addition, compare the attributes of the subgraph nodes with those of the planar graph nodes. If they have the same attribute, for example the same colour, we accept the mapping; otherwise we discard that possible partial isomorphism.

Moreover, we can work with approximations, for instance accepting possible partial isomorphisms whose attributes lie within a specified range.

4.2.2. Structure

Matching with attributed planar graphs follows the same structure as matching planar graphs without attributes (explained in the previous section), but now the graphs carry information in their vertices. The most important change in the new model is therefore the edge function, because this is where we check the node attributes as well as whether two vertices share an edge. Our model matches two graphs by looking for possible isomorphisms between them. This means that if two vertices are mapped one to one and have equal attributes, but the edge between the vertices in one graph has no counterpart in the other, the isomorphism properties do not hold and we discard that candidate solution.

Figure 28: Matching planar graph that does not hold isomorphism properties

Figure 28 shows that the pattern on the left has the same attributes as the one on the right. Nonetheless, the latter has no edge between the vertices, so it does not satisfy the isomorphism properties and we do not accept this matching.

4.2.3. Graph and planar graph edges and vertex attributes

As before, the edges are the main means of checking all the possible combinations, but now we add attributes to the vertices joined by these edges. This brings us closer to exact matching; consequently, fewer combinations remain to be checked.

We consider all mappings B from V(H) and A(H) to L(N) ∪ {x,y} such that if B(h) = B(h') then B(h) ∉ L(N) or h = h'. In addition, if (h,h') is an edge of H, then B(h) = B(h') ∈ {x,y} or (B(h),B(h')) ∈ E(G), where G is the planar graph, H the subgraph and B the partial isomorphism. Moreover, if (h,h') is an edge of H with respective attributes (ha,ha'), then B(ha) = B(ha') ∈ {x,y} or (B(ha),B(ha')) ∈ E(G).

Assume the same example as before, now with attributes:

- graph G with three vertices {A,B,C} and attributes {1,2,3} respectively
- graph H with two vertices {a,b} and attributes {1,2} respectively
- we want to list all the possible partial isomorphisms.



We have several combinations, but not all of them are valid partial isomorphisms:

    A    B    C
    a    b    Ø
    a    Ø    b
    Ø    a    b
    b    a    Ø
    b    Ø    a
    Ø    b    a

Figure 28: Checking and discarding combinations that are not partial isomorphisms with attributes

Comparing with the same example as before, adding attributes makes the method completely strict when searching for partial isomorphisms: it obtains only exact isomorphisms, or approximate isomorphisms if we so wish. Discarding the greatest possible number of candidate partial isomorphisms means that fewer combinations remain to be checked in the tree, so the method should be faster than without attributes. We can use a tolerance variable t that accepts values within ±t. A node whose attribute lies within the tolerance t is accepted, and the others are discarded. In conclusion, we have two methods to compute partial isomorphisms with attributes:

- Exact method: completely strict method to obtain partial isomorphisms. This method only yields perfect matchings.

- Approximation method: method in which a range is defined and the partial isomorphisms that fall within this range are obtained. This method yields approximate matchings.
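The two methods can be expressed with a single comparison, as in the following Python sketch. The function names and the dictionaries of numeric attributes are assumptions made for illustration; a tolerance of 0 corresponds to the exact method and a positive tolerance t to the approximation method. Vertices of H that are still mapped to x or y are skipped, since they have no attribute to compare yet.

```python
X, Y = "x", "y"  # extra vertices of G'_N (names assumed for this sketch)

def attributes_match(attr_h, attr_g, tolerance=0):
    """Exact method when tolerance is 0, approximation method otherwise."""
    return abs(attr_h - attr_g) <= tolerance

def mapping_respects_attributes(b, attrs_h, attrs_g, tolerance=0):
    """Check every vertex of H that is mapped onto a real vertex of G."""
    return all(attributes_match(attrs_h[v], attrs_g[img], tolerance)
               for v, img in b.items() if img not in (X, Y))

# Example of the text: G = {A,B,C} with attributes 1,2,3 and H = {a,b} with attributes 1,2
attrs_g = {"A": 1, "B": 2, "C": 3}
attrs_h = {"a": 1, "b": 2}
print(mapping_respects_attributes({"a": "A", "b": "B"}, attrs_h, attrs_g))
print(mapping_respects_attributes({"a": "B", "b": "C"}, attrs_h, attrs_g))
print(mapping_respects_attributes({"a": "B", "b": "C"}, attrs_h, attrs_g, tolerance=1))
```

The mapping of a and b onto B and C is rejected by the exact method but accepted with a tolerance of 1, because each attribute differs by exactly one unit.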

4.2.4. Subgraph Isomorph Algorithm with attributes

The structure of the algorithm is the same as that of the Subgraph Isomorph Algorithm without attributes. The difference is in the function that checks whether an edge exists: besides checking the edge, it now also confirms that the subgraph vertices and the planar graph vertices have similar attributes, if we use the approximation method, or the same attributes, if we want an exact isomorphism. Once the isomorphisms are obtained, we compute the Euclidean distance of each one and sort them in ascending order. The first isomorphism is the one most similar to the planar graph we have matched; if the matching has been exact, all of them have the same distance. The algorithm is:



1- Apply the method of Lemma 5 to find a partition of the vertices into sets Si associated with graphs Gi having low-width tree decompositions.

2- For each i ≥ 0, count or list the subgraph isomorphs of H in Gi that involve at least one vertex of Si, using the algorithm of Lemma 2. Apply the functions that check the attributes of the vertices (exact method or approximation method).

3- Sum all the counts or concatenate the lists to get a count or list of the isomorphs in G.

4- Compute the Euclidean distance between the planar graph and the given graph for each isomorphism obtained, and then sort the isomorphisms by distance in ascending order.
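Step 4 can be sketched as follows. The text does not spell out how the distance is computed, so this example assumes that it is the Euclidean distance between the attribute of each vertex of H and the attribute of its image in G, and that every vertex of a finished isomorphism is mapped onto a real vertex of G. The function names and the attribute values are hypothetical.

```python
import math

def attribute_distance(iso, attrs_h, attrs_g):
    """Euclidean distance between the attributes of H and those of the matched vertices of G."""
    return math.sqrt(sum((attrs_h[v] - attrs_g[img]) ** 2 for v, img in iso.items()))

def rank_isomorphisms(isos, attrs_h, attrs_g):
    """Sort the isomorphisms by distance in ascending order."""
    return sorted(isos, key=lambda iso: attribute_distance(iso, attrs_h, attrs_g))

# Hypothetical attribute values, chosen only for illustration
attrs_g = {"A": 7, "B": 4, "C": 4, "D": 3}
attrs_h = {"a": 7, "b": 4, "c": 3}
candidates = [
    {"a": "A", "b": "B", "c": "C"},  # c (3) against C (4): distance 1
    {"a": "A", "b": "B", "c": "D"},  # exact match: distance 0
]
ranked = rank_isomorphisms(candidates, attrs_h, attrs_g)
print(ranked[0])
```

With these values the exact match is placed first, followed by the approximate one.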

4.2.5. Attributed planar graph matching samples

Let us look at some samples of different situations in attributed planar graph matching. Assume we are given an attributed planar graph G and an attributed graph H to be matched.

Sample 1:

Figure 29: Partial isomorph of a triangle graph H with attributes in the induced subgraph associated with node {A,B,D,E}, and corresponding partial isomorph boundary mapping the triangle to G'

Sample 1 shows a situation that arises while computing partial isomorphisms. Here we use the exact method when checking vertices. Two vertices of H are mapped onto vertices A and B of the induced subgraph G' because they have the same attributes. Vertex b, which has the same attribute as vertex F, is mapped to y because the current subgraph G' does not contain F; it will be treated later.


Sample 2:

Figure 30: Partial isomorph of a triangle graph H with attributes in the induced subgraph associated with node {A,B,C,D}, and corresponding partial isomorph boundary mapping the triangle to G'

Here vertices B, D and C are the images of a, c and b respectively, and all satisfy the edge condition, but the attributes of D and C do not correspond to those of c and b. The mapping is therefore not a partial isomorphism and must be discarded. If the vertices had no attributes, we would not discard it.

Sample 3:

Figure 31: Partial isomorph of a triangle graph H with approximate attributes (tolerance ±1) in the induced subgraph associated with node {A,B,C,D}, and corresponding partial isomorph boundary mapping the triangle to G'


Sample 3 shows an approximate matching with a tolerance of ±1. The three vertices of each graph satisfy the same isomorphism properties and, using the approximation method for the attributes, vertices D and C are accepted as part of the partial isomorphism. Had we used the exact method, we would have discarded this matching because the attributes of D and C would not have been equal to those of their counterparts.



5. Development and implementation

This work has been developed in MATLAB, an environment that makes it easy to program functions for numerical computing, statistics and probability, among others. It also allows easy matrix manipulation, function plotting, the creation of user interfaces and communication with other programming languages.

5.1. How to generate all the possible partial isomorphisms?

Here we explain how to obtain all the possible combinations from which the partial isomorphisms are generated. We have to compute the entries for L(N) and for x; y receives the vertices that do not correspond to either of them.

5.1.1. L(N) Combinations

Before finding all the possible partial isomorphisms, we need to know all the possible combinations that arise when matching the subgraph against the planar graph. The idea is to generate all the candidate cases and then discard those that do not satisfy the isomorphism properties. Let us assume that L(N) = {A,B,C,D} in G' and that we look for partial isomorphisms of the subgraph H with vertex set {a,b,c}. The number of ways of placing k vertices in n positions is the number of k-permutations of n:

$$V_{n,k} = \frac{n!}{(n-k)!} = n(n-1)\cdots(n-k+1)$$

We apply this operation because the order of the mapped vertices matters and, once a vertex is mapped, it cannot be used again, so there are no repetitions. The variable n is the size of L(N) and k is the number of vertices that we choose. We start by mapping a single vertex onto the nodes of L(N).

Columns: A B C D

1: a Ø Ø Ø     2: Ø a Ø Ø     3: Ø Ø a Ø     4: Ø Ø Ø a
5: b Ø Ø Ø     6: Ø b Ø Ø     7: Ø Ø b Ø     8: Ø Ø Ø b
9: c Ø Ø Ø    10: Ø c Ø Ø    11: Ø Ø c Ø    12: Ø Ø Ø c

Figure 32: Generating combinations in L(N), selecting one vertex

Figure 32 shows the case in which one vertex is mapped in L(N). Applying the equation gives $V_{4,1} = \frac{4!}{(4-1)!} = \frac{24}{6} = 4$, but we have three vertices, so there are several vertex combinations depending on the number of vertices that we take. For v ∈ H we get:

$$C_{v,k} = \binom{v}{k} = \frac{v!}{k!\,(v-k)!}$$

In this first case we take only one vertex, which gives:

$$C_{3,1} = \binom{3}{1} = \frac{3!}{1!\,(3-1)!} = \frac{6}{2} = 3$$

Finally, the number of possible partial isomorphisms is:

$$V_{4,1} \times C_{3,1} = 4 \times 3 = 12$$

Now we repeat the same process, mapping two vertices in L(N).

$V_{4,2} = \frac{4!}{(4-2)!} = \frac{24}{2} = 12$, and the possible combinations taking 2 vertices are:

$$C_{3,2} = \binom{3}{2} = \frac{3!}{2!\,(3-2)!} = \frac{6}{2} = 3 \;\rightarrow\; V_{4,2} \times C_{3,2} = 12 \times 3 = 36$$

Using 2 vertices we obtain 36 possible partial isomorphisms. The combinations are the following.

Columns: A B C D

 1: b a Ø Ø     2: b Ø a Ø     3: b Ø Ø a
 4: Ø b a Ø     5: Ø b Ø a     6: Ø Ø b a
 7: c a Ø Ø     8: c Ø a Ø     9: c Ø Ø a
10: Ø c a Ø    11: Ø c Ø a    12: Ø Ø c a
13: c b Ø Ø    14: c Ø b Ø    15: c Ø Ø b
16: Ø c b Ø    17: Ø c Ø b    18: Ø Ø c b
19: a b Ø Ø    20: a Ø b Ø    21: a Ø Ø b
22: Ø a b Ø    23: Ø a Ø b    24: Ø Ø a b
25: a c Ø Ø    26: a Ø c Ø    27: a Ø Ø c
28: Ø a c Ø    29: Ø a Ø c    30: Ø Ø a c
31: b c Ø Ø    32: b Ø c Ø    33: b Ø Ø c
34: Ø b c Ø    35: Ø b Ø c    36: Ø Ø b c

Figure 33: Generating combinations in L(N), selecting two vertices

We finish with the case in which all three vertices of H are used.

Columns: A B C D

 1: c b a Ø     2: c b Ø a     3: c Ø b a     4: Ø c b a
 5: b c a Ø     6: b c Ø a     7: b Ø c a     8: Ø b c a
 9: c a b Ø    10: c a Ø b    11: c Ø a b    12: Ø c a b
13: b a c Ø    14: b a Ø c    15: b Ø a c    16: Ø b a c
17: a b c Ø    18: a b Ø c    19: a Ø b c    20: Ø a b c
21: a c b Ø    22: a c Ø b    23: a Ø c b    24: Ø a c b

Figure 34: Generating combinations in L(N), selecting three vertices

$V_{4,3} = \frac{4!}{(4-3)!} = \frac{24}{1} = 24$, and the possible combinations taking 3 vertices are:

$$C_{3,3} = \binom{3}{3} = \frac{3!}{3!\,(3-3)!} = \frac{6}{6} = 1 \;\rightarrow\; V_{4,3} \times C_{3,3} = 24 \times 1 = 24$$

Finally, we must not forget the case in which no vertex is mapped in L(N):

1: Ø Ø Ø Ø

Adding everything up gives the total number of possible isomorphisms: 12 + 36 + 24 + 1 = 73.

The general equation for the number of L(N) combinations is:

$$\sum_{k=1}^{v} V_{n,k} \times C_{v,k} + 1 = \sum_{k=1}^{v} \frac{n!}{(n-k)!} \times \frac{v!}{k!\,(v-k)!} + 1$$

where n is the size of the label L(N), v is the number of vertices of the subgraph and k is the number of subgraph vertices taken.
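The general equation can be checked numerically with a short Python sketch. The work itself was implemented in MATLAB; Python is used here only for illustration, and the function names are hypothetical. The first function evaluates the formula and the second enumerates the combinations explicitly, so that both results can be compared.

```python
from itertools import combinations, permutations
from math import comb, perm

def label_combinations(n, v):
    """Evaluate sum_{k=1..v} V(n,k) * C(v,k) + 1 for a label of size n and v subgraph vertices."""
    return sum(perm(n, k) * comb(v, k) for k in range(1, v + 1)) + 1

def enumerate_label_mappings(label, pattern):
    """List every way of placing 0..len(pattern) subgraph vertices in distinct label positions."""
    rows = [{}]  # the case in which no vertex is mapped in L(N)
    for k in range(1, len(pattern) + 1):
        for chosen in combinations(pattern, k):   # which vertices of H are used
            for slots in permutations(label, k):  # where they are placed, order matters
                rows.append(dict(zip(slots, chosen)))
    return rows

total = label_combinations(4, 3)
rows = enumerate_label_mappings("ABCD", "abc")
print(total, len(rows))
```

For L(N) = {A,B,C,D} and V(H) = {a,b,c}, both the formula and the explicit enumeration give 73, which agrees with the sum 12 + 36 + 24 + 1 obtained above.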

5.1.2. X and Y combinations

To generate the X combinations we use the same technique as before. The difference is that x is a single entry that groups several vertices, so the subgraph vertices mapped to it always occupy the same position, and the number of combinations is:

$$\sum_{k=1}^{v} C_{X,k} + 1 = \sum_{k=1}^{v} \binom{X}{k} + 1 = \sum_{k=1}^{v} \frac{X!}{k!\,(X-k)!} + 1$$

where X is the number of vertices contained in x, k is the number of elements taken for the combination and v is the number of vertices of the subgraph H.

On the other hand, y holds the vertices of H that have not been mapped in L(N) or in x. Moreover, if the number of vertices of a given graph H is greater than |L(N)| - 1, we compute the combinations of a subgraph H' of size |L(N)| - 1, and y contains H - H' plus the vertices that are not mapped in L(N) or x.

Assume that L(N) = {A,B,C,D} in G' with x = {R,S}, and that we look for partial isomorphisms of the subgraph H with vertex set {a,b,c}. Some of the possibilities are:

     A    B    C    D    x={R,S}    y
1    a    Ø    Ø    Ø    c          b
1    a    Ø    Ø    Ø    b          c
1    a    Ø    Ø    Ø    c,b        Ø
1    a    Ø    Ø    Ø    Ø          c,b
2    Ø    c    Ø    Ø    a          b
2    Ø    c    Ø    Ø    b          a
2    Ø    c    Ø    Ø    a,b        Ø
2    Ø    c    Ø    Ø    Ø          a,b
3    Ø    Ø    Ø    Ø    a          b,c
3    Ø    Ø    Ø    Ø    b          a,c
3    Ø    Ø    Ø    Ø    c          a,b
3    Ø    Ø    Ø    Ø    a,b        c
3    Ø    Ø    Ø    Ø    a,c        b
3    Ø    Ø    Ø    Ø    b,c        a
3    Ø    Ø    Ø    Ø    Ø          a,b,c

Figure 35: Generating combinations when x has values

5.1.3. Final combinations

Once the combinations for L(N) and x have been computed, the last step is to join all the possibilities to obtain the final number of possible partial isomorphisms, using the following expressions:

$$a = \sum_{k=1}^{v} V_{n,k} \times C_{v,k} \times \left( \sum_{k_x=1}^{v-k} C_{x,k_x} + 1 \right) \quad \text{if } v - k \neq 0$$

$$b = b + \sum_{k=1}^{v} V_{n,k} \times C_{v,k} \quad \text{otherwise}$$

Initially, b = 0. The condition v - k ≠ 0 means that there may be combinations belonging to x; otherwise all the vertices have been mapped in L(N) and no vertex is mapped in x.

$$\text{Final Combination} = a + b + \sum_{k_x=1}^{v} C_{x,k_x} + 1$$

The last terms of the final expression cover the case in which nothing is mapped in L(N) but something is mapped in x.

5.2. How to join two labels?

First of all, we define the data structures and the elements that we work with. Matching requires two graphs: g1 can be any graph and g2 is a planar graph. Each graph holds the following information:

- Adjacency matrix: matrix that describes the edges, where rows and columns are node identifiers. If there is an edge between the node of a row and the node of a column, the value of AdjacencyMatrix[row, column] is 1; otherwise it is 0.

         a    b    c
    a    0    1    1
    b    1    0    1
    c    1    1    0

Figure 36: Graph with its adjacency matrix

- Label: variable that contains the concatenation of the vertices. In the example above the label has the value 'abc'.

- Nodes: number of vertices of the graph.

- Atribut: vector with the attribute information of each node.
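The graph record described above can be mirrored in a small Python sketch. The original implementation is in MATLAB; the class, the helper function and the attribute values below are hypothetical and only illustrate how the four fields fit together, using the triangle of Figure 36.

```python
from dataclasses import dataclass

@dataclass
class Graph:
    adjacency: list  # adjacency[i][j] is 1 if nodes i and j share an edge, 0 otherwise
    label: str       # concatenation of the vertex names, e.g. 'abc'
    nodes: int       # number of vertices
    atribut: list    # attribute of each node, in the order given by label

def has_edge(g, u, v):
    """Look up the adjacency matrix using the position of each vertex in the label."""
    i, j = g.label.index(u), g.label.index(v)
    return g.adjacency[i][j] == 1

# Triangle of Figure 36; the attribute values are made up for the example
triangle = Graph(
    adjacency=[[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]],
    label="abc",
    nodes=3,
    atribut=[1, 2, 3],
)
print(has_edge(triangle, "a", "b"), has_edge(triangle, "a", "a"))
```

Because the label stores the vertices in matrix order, the position of a vertex in the label string is also its row and column index.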

Furthermore, we need another structure for the tree decomposition of the planar graph, with the following information:

- arbre: vector that contains each label of the tree. Each label has:

o id: identifier of the label. The root label has id 1.

o Nivell: level of the label. The root label has level 0 and its child has level 1.

o Label: variable that contains the concatenation of the vertices of the planar graph that form the subgraph.

o Combinacions: matrix that holds all the possible partial isomorphisms obtained when matching a given graph with the induced subgraph associated with the current label.

o X: variable that contains the vertices of the lower label that are not in the current label. A leaf label has no X.

o Fills: Boolean that indicates whether the parent of the current label has more children.

o Labelplus: variable that contains the vertices that have already been checked in lower labels.

o Atributnode: vector that contains the attributes of each vertex of the label. The first position of this vector corresponds to the attribute of the first vertex of the variable label.

Now, starting with the leaf label and always working bottom-up, we can compute (applying the equations of 5.1) all the possible partial isomorphisms of the induced subgraph g2'(L(N')) and of the label above it, in this case L(N). We then store the resulting matrix in the combinacions field of each label. Once this is done, we check whether the edges and the attributes satisfy the same conditions as in g1; if not, we discard the combination as a possible partial isomorphism. Finally, we make a list for each label (L(N) and L(N')) and join them into a single label that contains the possible partial isomorphisms of the two labels together.

Figure 37: Creation of the join list table

Let us first look at the structure of the matrix combinacio, in order to understand the join list matrix afterwards.

- Combinacio: the first columns hold the vertices of the label; the next column contains all the x combinations; and the last one contains all the y combinations, which depend on the x values. Suppose that g1 has vertices v={a,b,c} and g2’ has vertices {A,B,C,E}. Some rows of combinacio will be the following:

[Figure 37 diagram: LIST L(N) holds combination rows 1, 2 and 3; LIST L(N') holds combination rows 1 and 2. The JOIN LIST pairs a row of L(N) with a row of L(N'): (row 1, row 1), (row 1, row 2), (row 2, row 1) and (row 3, row 1).]

To build the above table we need to compute all the possibilities between two labels. We therefore use an index list with 7 columns holding information about L(N) and L(N’), in which we store the indices of the two labels that we are matching. The structure of the list is the following:

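| L(N) | x of N | y of N | L(N') | x of N' | y of N' | Boolean |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 15 | Ø | 1 | Yes |
| 1 | 2 | 2 | 1 | Ø | 1 | Yes |
| 1 | 2 | 2 | 2 | Ø | 1 | No |
| 1 | 2 | 2 | 3 | Ø | 1 | No |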
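As a rough illustration of how such an index list can be produced, the Python sketch below pairs every row of L(N) with every row of L(N') and records a Boolean for each pair. It is a simplification under stated assumptions: each row is represented as a dictionary from planar-graph vertex to pattern vertex, and the predicate `agree` is a hypothetical stand-in for the consistency conditions, checking only that shared vertices are mapped identically and that no pattern vertex is used twice.

```python
from itertools import product


def agree(m1, m2):
    """Simplified consistency check between two partial mappings."""
    # A planar vertex present in both rows must map to the same pattern vertex.
    for v in m1.keys() & m2.keys():
        if m1[v] != m2[v]:
            return False
    # A pattern vertex may not be assigned to two different planar vertices.
    used = {}
    for m in (m1, m2):
        for planar_vertex, pattern_vertex in m.items():
            if used.setdefault(pattern_vertex, planar_vertex) != planar_vertex:
                return False
    return True


def build_index_list(rows_n, rows_child):
    """Return (index in L(N), index in L(N'), Boolean) for every pair of rows."""
    return [(i, j, agree(rows_n[i], rows_child[j]))
            for i, j in product(range(len(rows_n)), range(len(rows_child)))]


rows_n = [{"A": "a", "C": "b"}, {"A": "a", "D": "b"}]
rows_child = [{"A": "a", "B": "c"}, {"C": "a"}]

index_list = build_index_list(rows_n, rows_child)
kept = [(i, j) for i, j, ok in index_list if ok]
```

Rows whose Boolean is false are the ones that would be deleted before the two labels are joined.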

An empty cell (Ø) means that the current label is a leaf, so there is no x to compute. This is the only case in which a cell is empty. The last column is a Boolean that records whether the possible partial isomorphism satisfies the consistency conditions. If it does not, we must delete the row as a possible combination.

We can then use the index list to join the two labels into one, and we repeat the same process until we arrive at the root label. Once there, the last step is to delete the rows that contain any node in y, because nothing remains to be computed for them.

To join the labels we use an additional matrix that has the same attributes as the arbre vector, but in which the current vertices and the possible partial isomorphisms are updated at each step. The structure of jointable is the following: L(N), X, Y, vertices already treated.

- The first columns are reserved for the vertices of L(N).

- Column size(L(N))+1 is reserved for X.

L(N) x y

A B C D x={E,F} Y

1 a

b c c b b,c Ø Ø b,c

2 a b c Ø Ø

3 a b c Ø Ø c


- Column size(L(N))+2 is reserved for Y.

- Columns size(L(N))+3 to the end are reserved for all the vertices that have been joined in the lower labels.

NOTE: the vertices already treated still belong to variable x, but they have already been mapped in the current label, so it is not necessary to compute the possible combinations that use these vertices. The x cell always contains the vertices that remain when we compute L(N)-L(N’).

When we arrive at the top of the tree, the last step is to delete the rows that are not complete and all the rows that contain any value in the y cell, because there are no more possible combinations to check. To do that we use the following functions:

- eliminar_incomplets(matrix,subgraph,graph), where matrix is the jointable with all the results, graph contains the information of the planar graph, and subgraph contains the information of the given graph.

- eliminarisomorfismesY(matrix,graph), where matrix is the jointable with all the results, and graph contains the information of the planar graph.
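The effect of these two clean-up functions at the root can be sketched in Python as two filters. This is not the Matlab code of the thesis: the row representation (a dictionary with a `mapping` from planar-graph vertex to pattern vertex and a set `y` of pattern vertices still pending) and the function names are assumptions made for illustration.

```python
def remove_rows_with_y(rows):
    """Drop every row that still has a pending vertex in its y cell."""
    return [row for row in rows if not row["y"]]


def remove_incomplete(rows, pattern_vertices):
    """Keep only the rows in which every pattern vertex has been mapped."""
    wanted = set(pattern_vertices)
    return [row for row in rows if set(row["mapping"].values()) == wanted]


rows = [
    {"mapping": {"A": "a", "F": "b", "G": "c"}, "y": set()},   # complete
    {"mapping": {"A": "a", "F": "b"}, "y": {"c"}},              # c still pending
    {"mapping": {"A": "a", "F": "b"}, "y": set()},              # c never mapped
]

final_rows = remove_incomplete(remove_rows_with_y(rows), "abc")
```

Only the first row survives both filters, which corresponds to a complete partial isomorphism of a triangle.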

5.3. Attribute edge function

This is the most important function, because here we treat the attributes of the vertices, accepting or discarding each possible isomorphism depending on the threshold. If the threshold is 0, we compare the attributes of both graphs exactly; otherwise we accept an approximation that depends on the threshold value. If the threshold is very high and the attribute values are low, the result is the same as having no attributes, because every possible case is treated and no vertex is discarded. The functions are described below:

- esaresta(option, letterA, letterB): returns 1 if there is an edge between letterA and letterB, otherwise returns 0. Option has two values: 1 if the vertices letterA and letterB belong to the planar graph, or 0 if they belong to the given graph.

- Esaresta_atribut4parametres_llindar(letterA, letterB, attribut_labelA, attribut_labelB, attribut_subgraphA, attribut_subgraphB, threshold): letterA and letterB are vertices of the planar graph; attribut_labelA and attribut_labelB are the attributes of letterA and letterB respectively; attribut_subgraphA and attribut_subgraphB are the attributes of the given graph; and threshold is a value between 1 and 10. The function returns 1 if the attributes of the planar graph and of the given graph are the same or fall within the range allowed by the threshold; otherwise it returns 0.
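The threshold test can be sketched in Python as follows. The comparison rule is an assumption: two attributes are taken to match when their absolute difference does not exceed the threshold, which reduces to exact equality for a threshold of 0. The function name is hypothetical and the edge check performed by the original function is left out.

```python
def attributes_within_threshold(attr_label_a, attr_label_b,
                                attr_subgraph_a, attr_subgraph_b,
                                threshold):
    """Return 1 if both attribute pairs match within the threshold, else 0."""
    match_a = abs(attr_label_a - attr_subgraph_a) <= threshold
    match_b = abs(attr_label_b - attr_subgraph_b) <= threshold
    return 1 if (match_a and match_b) else 0


exact = attributes_within_threshold(3, 7, 3, 7, threshold=0)
near_miss = attributes_within_threshold(3, 7, 4, 7, threshold=0)
approximate = attributes_within_threshold(3, 7, 5, 9, threshold=2)
```

With attribute values between 1 and 10, a threshold of 10 or more accepts every pair, which matches the remark that a very high threshold behaves like having no attributes.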



5.4. Sorting Euclidean distance

The last step, once we have obtained all the partial subgraph isomorphisms, is to sort them from lowest to highest Euclidean distance:

$$d(g_1, g_2) = |x_1 - y_1| + \dots + |x_n - y_n| + K \cdot (\text{number of different edges})$$

where K is a constant to which we have assigned the value 10, the number of different edges counts the edges that belong to the planar graph but have no counterpart in the graph that we have matched, and x and y are the attributes of the planar graph vertices and of the graph vertices respectively.
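A small Python sketch of this sorting step is given below. It assumes that the attribute term is a sum of absolute differences and that each candidate isomorphism is stored as a dictionary with hypothetical keys `x`, `y` and `different_edges`; K defaults to 10 as stated above.

```python
def distance(planar_attrs, pattern_attrs, different_edges, k=10):
    """Attribute differences plus a penalty of k per different edge."""
    attr_term = sum(abs(x - y) for x, y in zip(planar_attrs, pattern_attrs))
    return attr_term + k * different_edges


def sort_by_distance(candidates, k=10):
    """Sort candidate isomorphisms from lowest to highest distance."""
    return sorted(
        candidates,
        key=lambda c: distance(c["x"], c["y"], c["different_edges"], k),
    )


candidates = [
    {"name": "extra edge", "x": [1, 2, 3, 4], "y": [1, 2, 3, 4], "different_edges": 1},
    {"name": "perfect", "x": [1, 2, 3, 4], "y": [1, 2, 3, 4], "different_edges": 0},
]

ranked = sort_by_distance(candidates)
```

With exact attribute matches, the distance is 0 for a perfect match and 10 when one extra edge is present, so perfect matches are listed first.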



6. Practical evaluation

In this section we describe the results of the original system and of the new system in order to compare them. The tests were run on an Intel Core machine with 2 processors at 1.85 GHz and 2 GB of RAM, using Matlab. We tested the algorithm using a given tree decomposition. Tests can be done with different kinds of graph.

Planar Graphs

To check possible partial isomorphisms, the planar graph was initially used for several operations. As stated previously, we skip the steps that obtain the tree decomposition. Below are the two planar graphs that we use to evaluate our system, with their corresponding tree decompositions.

Figure 38: Planar graph 1 and planar graph 2 with their corresponding tree decompositions, respectively

Tree decomposition labels of planar graph 1: { A, F, G, H }, { A, F, H, M }, { A, E, F, M }, { F, H, K, M }, { A, D, E, M }, { H, K, L, M }, { A, C, D, E }, { A, B, C, E }

Tree decomposition labels of planar graph 2: { A, B, H, G }, { A, B, D, G }, { A, D, F, G }, { A, B, C, D }, { A, D, E, F }


Tree Decompositions

In addition to the previous tree decompositions, here we present two more tree decompositions, of planar graph 1 and planar graph 2 respectively, which allow us to match graphs with at least 4 vertices (w=4).

Figure 39: Tree decompositions of planar graph 1 and planar graph 2, respectively, with label size 5

Tree decomposition labels of planar graph 1: { A, F, G, H, M }, { A, D, E, F, M }, { F, H, K, L, M }, { A, B, C, D, E }

Tree decomposition labels of planar graph 2: { A, D, F, G, E }, { A, B, G, H, D }, { A, B, D, G, C }

Given graphs

We use four types of graphs to match with the planar graphs:

Figure 40: Different graphs used to match with planar graphs

6.1. Analysis between old system and the new system

Using Eppstein's algorithm [1] and a tree decomposition of size w=3, we look for all the partial isomorphisms of a given graph H in a planar graph G.

Figure 41: Triangle graph H and planar graph G1

We now analyze this problem. The following table shows the values of our solution:

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| Step 1: join {ACDE} and {ABCE} | 3526 | 95 | 64 |
| Step 2: join {ADEM} and {ACDE+B} | 5248 | 159 | 73 |
| Step 3: join {AEFM} and {ADEM+BC} | 6862 | 215 | 97 |
| Step 4: join {FHKM} and {HKLM} | 6160 | 125 | 85 |
| Step 5: join {AFHM} between {AEFM+BCD} and {FHKM+L} | 16562 | 775 | 108 |
| Step 6: join {AFGH} and {AFHM+BCDLEK} | 12096 | 312 | 139 |

Partial isomorphisms {AFGHBCDLEKM}: 60

Table 1: Resulting table after matching triangle graph H with planar graph G1 without attributes

We can see that, as we go up the tree, the number of combinations generally increases, along with the number of possible partial isomorphisms, so more combinations must be computed at each level. Step 5 takes the longest to compute because it joins two children. Finally, in the last step, we obtain the total number of partial isomorphisms (60). Here we are at the root, and we must delete the partial isomorphisms that still contain vertices in y, since the entire tree has already been computed.

Now we test the same example using attributes. Each node has a value, and we only accept an isomorphism if the matched nodes hold the same properties.

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| Step 1: join {ACDE} with {ABCE} | 810 | 34 | 24 |
| Step 2: join {ADEM} with {ACDE+B} | 1248 | 55 | 42 |
| Step 3: join {AEFM} with {ADEM+BC} | 2268 | 55 | 25 |
| Step 4: join {FHKM} with {HKLM} | 986 | 36 | 22 |
| Step 5: join {AFHM} between {AEFM+BCD} and {FHKM+L} | 2773 | 192 | 20 |
| Step 6: join {AFGH} with {AFHM+BCDLEK} | 2688 | 221 | 16 |

Partial isomorphisms {AFGHBCDLEKM}: 6

Table 2: Resulting table after matching triangle graph H with planar graph G1 with attributes

Figure 42: Combinations before and after applying consistency step by step

Using attributes, we compute fewer possible partial isomorphism combinations than with the planar graph without attributes. On the other hand, this method yields few partial isomorphisms, because we only obtain the exact ones. Comparing the two systems, from step 1 to step 3 the number of combinations for planar graphs without attributes increases considerably, whereas with attributed graphs the growth at each step is roughly one third of that. Step 5 is the hardest step because it is the union of two children, and at this point we must compute more combinations than in the other steps. In the old system the growth in combinations is noticeable, but in the new system it does not amount to a big increase. Nonetheless, at step 5 the number of combinations after applying consistency increases because of the union of the two children, so we have to check twice as many consistencies as before.
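To make the comparison concrete, the short Python sketch below computes, for each step, the ratio between the possible combinations with attributes (Table 2) and without attributes (Table 1). Only the figures are taken from the tables; the function and variable names are illustrative.

```python
# Possible partial isomorphism combinations per step, Steps 1 to 6.
WITHOUT_ATTRIBUTES = [3526, 5248, 6862, 6160, 16562, 12096]   # Table 1
WITH_ATTRIBUTES = [810, 1248, 2268, 986, 2773, 2688]          # Table 2


def reduction_ratios(with_attrs, without_attrs):
    """Fraction of the combinations that remain when attributes are used."""
    return [a / b for a, b in zip(with_attrs, without_attrs)]


ratios = reduction_ratios(WITH_ATTRIBUTES, WITHOUT_ATTRIBUTES)
largest_saving_step = ratios.index(min(ratios)) + 1
```

Every ratio is below 1, and the smallest ratios fall on the steps where the version without attributes produces the most combinations.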

[Figure 42 panels: “Possible partial isomorphism combinations” and “Combinations after applying consistency”, each plotted for Step 1 to Step 6 and comparing the attributed planar graph with the planar graph without attributes.]

When we match a square, the number of possible combinations increases significantly compared with matching a triangle. Let us see such a matching, now using planar graph 2.

Figure 43: Square graph H and planar graph G2

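W=4 without attributes (square)

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| STEP 1: {A,B,G,H,D} join with {A,B,D,G,C} | 168961 | 663 | 437 |
| STEP 2: {A,D,F,G,E} join with {A,B,D,G,H+C,B} | 293227 | 3335 | 586 |
| TOTAL | 462188 | 3998 | 1023 |

Partial isomorphisms: 169

W=4 with attributes (a=5, b=5, c=2, d=1)

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| STEP 1: {A,B,G,H,D} join with {A,B,D,G,C} | 12502 | 149 | 115 |
| STEP 2: {A,D,F,G,E} join with {A,B,D,G,H+C,B} | 17365 | 486 | 188 |
| TOTAL | 29867 | 635 | 303 |

Partial isomorphisms: 72

Table 3: Results of matching planar graph 2 with a square graph H without attributes and with attributes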

To compute all the possible combinations when matching a square, we must use a tree decomposition that allows 4 vertices. That is why we have used the other tree decomposition. This tree has just two steps because it has three labels with one child per level. Fewer levels involve more combinations in each label.

Figure 44: Combinations before and after applying consistency, using planar graph 2 and square graph (panels: “Possible partial isomorphism combinations” and “Combinations after applying consistency”)

Adding just one node to graph H greatly increases the number of combinations needed to find the subgraph in the planar graph. Furthermore, when the number of vertices of graph H is greater than the value of w, we must modify the tree decomposition and increase the size of the labels. Both facts cause a large increase. In figure 44 we can see that, in the final step, planar graph 2 without attributes has 293,227 possible combinations, whereas planar graph 2 with attributes has only 32,825. Likewise, the number of combinations after applying consistency remains lower for the planar graph with attributes.

Figure 45: Total combinations using the same tree decomposition matching with square and triangle (panels: “Total combinations of planar graph 2 matching a square (w=4)” and “Total combinations of planar graph 2 matching a triangle (w=4)”)

In the last figure, the total number of possible combinations with four vertices is 15 times greater than with 3 vertices, and with more vertices the difference will be even greater. In contrast, once the possible combinations are found and consistency is applied, there is only a 4-fold difference between one system and the other.

[Figure 44 data labels. Possible partial isomorphism combinations: planar graph 2 with attributes (a=5, b=5, c=2, d=1): 16524 (STEP 1), 32825 (STEP 2); planar graph 2 without attributes: 168961 (STEP 1), 293227 (STEP 2). Combinations after applying consistency: with attributes: 135 (STEP 1), 706 (STEP 2); without attributes: 663 (STEP 1), 3335 (STEP 2).]

[Figure 45 data labels. Square (w=4): planar graph 2 without attributes: 462188 possible partial isomorphism combinations and 3998 combinations after applying consistency; planar graph 2 with attributes (a=5, b=5, c=2, d=1): 49349 and 841. Triangle (w=4): planar graph without attributes: 29867 and 635; planar graph 2 with attributes (a=1, b=2, c=3): 3273 and 184.]

6.2. Tree decomposition analysis

Depending on the tree decomposition used, the number of possible combinations at each level of the tree will differ. If the labels of the tree are large, computing the possible combinations is more laborious than with more labels of a smaller size. Below we compare the tree decomposition of label size 5 with the tree decomposition of label size 4 for planar graph 2, matching a triangle without attributes:

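W=3

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| STEP 1: {A,B,D,G} join with {A,B,C,D} | 6862 | 107 | 94 |
| STEP 2: {A,D,F,G} join with {A,D,E,F} | 3268 | 74 | 64 |
| STEP 3: join {A,B,H,G} between {A,B,D,G+C} and {A,D,F,G+E} | 17222 | 892 | 102 |
| TOTAL | 27352 | 1073 | 260 |

Partial isomorphisms: 72

W=4

| Step | Possible partial isomorphism combinations | Combinations after applying consistency | Partial isomorphisms with edge relation hold |
|---|---|---|---|
| STEP 1: {A,B,G,H,D} join with {A,B,D,G,C} | 12502 | 149 | 115 |
| STEP 2: {A,D,F,G,E} join with {A,B,D,G,H+C,B} | 17365 | 486 | 188 |
| TOTAL | 29867 | 635 | 295 |

Partial isomorphisms: 72

Table 4: Results of matching planar graph 2 with a triangle, using tree decompositions of size w=3 and w=4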

Figure 46: Combinations using planar graph 2 and a triangle graph without attributes (panel: “Total combinations before and after using consistency”)

In figure 46 we can see that the tree decomposition of w=3 produces fewer combinations than that of w=4, although the tree of w=4 has fewer labels.

[Figure 46 data labels. W=3: 27352 possible combinations and 1073 combinations after applying consistency; W=4: 29867 and 635.]

Nevertheless, the opposite happens with the combinations after applying consistency. One possible cause is that in the tree decomposition of w=4 we never have to join two children, since every level has a single child. In contrast, the tree of w=3 has one level with two children, which produces many consistency combinations that must be checked.

6.3. Vertices analysis

We can count all the isomorphisms of H in a planar graph G in time $2^{O(w \log w)} n$. We measured the time of the old system and of the new one, and the results suggest that the new solution can also obtain the isomorphisms in linear time.

Figure 47: Comparison between old system and the new system time

[Figure 47 panels: “$2^{O(w \log w)} n$ (old system)” with series w=2, w=3 and w=4 without attributes; “$2^{O(w \log w)} n$ (new system)” with series w=2, w=3 and w=4 with attributes; and “Old system VS New system” with all six series. The horizontal axes run from 1 to 11 and the vertical axes are logarithmic, in powers of 4.]

In figure 47 we can see a comparison between the two systems. When graph H has few vertices, the results with and without attributes are almost the same; but as the number of vertices grows, the difference in combinations between the two systems becomes large. Eventually a point is reached at which graph H would have as many vertices as the planar graph G; we would then no longer be looking for a subgraph of G, so it would be an NP problem and we could not find the solution.

6.4. Euclidean distance isomorphism analysis

Once we have obtained all the possible partial isomorphisms, a problem can arise that we address with the Euclidean distance. The isomorphism between the planar graph G and the given graph H is a bijection between vertices, so the subgraph of the planar graph must have the same number of vertices as the given graph H. However, the edges of H map only injectively into the edges of G, so G may have more edges between the mapped vertices than H does. When this happens we use the Euclidean distance to obtain the partial isomorphisms ordered by distance, best first. Let us show a practical example:

- Given a square graph H with attributes (a=1, b=2, c=3 and d=4)

- Planar graph G2 with attributes

- Threshold=0 (exact matching)

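| # | A | D | F | G | E | x | y | C | B | H | Euclidean distance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [] | a | [] | d | b | [] | [] | c | [] | [] | 0 |
| 2 | [] | a | c | d | [] | [] | [] | [] | b | [] | 0 |
| 3 | [] | d | [] | a | b | [] | [] | c | [] | [] | 0 |
| 4 | [] | d | c | a | [] | [] | [] | [] | b | [] | 0 |
| 5 | [] | a | [] | d | [] | [] | [] | c | b | [] | 10 |
| 6 | [] | a | c | d | b | [] | [] | [] | [] | [] | 10 |
| 7 | [] | d | [] | a | [] | [] | [] | c | b | [] | 10 |
| 8 | [] | d | c | a | b | [] | [] | [] | [] | [] | 10 |

Table 5: Resulting table after matching the two graphs

Figure 48: Square graph H mapped to planar graph G2. In blue line isomorphism number 8 and in red line isomorphism number 1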

Table 5 lists all the subgraph isomorphisms contained in planar graph G2, sorted by Euclidean distance. In figure 48 we can see two different cases that can arise once the partial isomorphisms have been obtained. Partial isomorphism number 1 holds the properties of the isomorphism and, since the threshold is 0, its Euclidean distance is 0, so it is a perfect match. In contrast, partial subgraph isomorphism number 8 has an edge between E and F that the square graph H does not have. We can still accept this isomorphism, but its Euclidean distance now reflects the approximation and tells us that it is not a perfect match.

Now we test the attributed approximation. Each node has a value, and we only accept an isomorphism if the values fall within a specific range. In all the samples the vertices have values between 1 and 10, so specifying a threshold higher than 10 is the same as using the algorithm without attributes, because all the graph vertices are checked. Let us assume that we now have:

- Threshold (2, 4, 7)

- Planar graph G1 with attributes

- Triangle graph H with attributes (a=3, b=7, c=9)

Figure 49: Increase of combinations depending on threshold

[Figure 49 panels: “Possible combinations, threshold =2” and further per-threshold panels for planar graph 1 with attributes, plus a combined panel “Possible combinations, threshold 2,4,7” that plots STEP 1 to STEP 6 and TOTAL for planar graph 1 with attributes (threshold=2, threshold=4 and threshold=7) and for planar graph 1 without attributes. The vertical axes run from 0 to 60000.]

In figure 49 we can see that, as the threshold grows, the number of combinations tends to increase until it reaches the number obtained without using attributes. With threshold=7, the possible combinations are the same as for the planar graph without attributes, because with this threshold and these vertex attribute values we accept all the possible cases and do not discard any partial isomorphism. Even with a large threshold, however, matching with attributes has an important advantage over the system without attributes: having computed all the possible combinations, we can obtain the Euclidean distance and hence know which matching is the best.

7. Conclusion, future works

In this work we proposed an improvement of the D. Eppstein [1] algorithm that adds attributes to the vertices of the planar graphs and of the graphs to be matched, in order to find partial isomorphisms in linear time. We have shown how to solve attributed planar subgraph isomorphism, given a pattern with attributes, in time O(n). Compared with the D. Eppstein [1] algorithm, introducing attributes in the vertices produces a significant improvement, although when we define a threshold the improvement decreases as the threshold increases, because we get closer to the computation without attributes.

Depending on the label size of the tree decomposition, the computation of the possible partial isomorphisms can be better or worse. The factor w is the most important one when looking for partial subgraph isomorphisms. As we increase the constant w we can find partial subgraph isomorphisms for a fixed pattern of w vertices, but when w reaches the size of the planar graph there is no subgraph, because we are matching the planar graph with a fixed graph of the same size. We then have a graph isomorphism problem, which is also NP, and which we cannot compute with this system.

In conclusion, when w is small we can find partial subgraph isomorphisms in little time. When w is larger, but still lower than the number of vertices of the planar graph, we can find them in linear time, although at a very high cost. For example, finding the partial subgraph isomorphisms of a given pattern of w=4 without attributes took two days of computation, so with w=10 the processor would have to work through a far larger number of possible combinations.

Finally, we have introduced the Euclidean distance, for both approximate and exact matching, to obtain a sorted list of the possible partial subgraph isomorphisms. Sometimes the vertices in the planar graph have more edges than those in the fixed graph, and the Euclidean distance then tells us which is the best approximation of a partial subgraph isomorphism.

Working with optimal algorithms involves looking for all the possible solutions, one by one, which takes a great deal of time; that is the reason for the exhaustive numbers of possible combinations that we obtained. In contrast, a suboptimal algorithm tries to find the solution without checking all the possible combinations and may miss some of them, though it is faster than the optimal one.

In the future, we expect to build our own spanning tree for a given planar graph. We could then generate a tree decomposition and work with several data structures for planar graphs. Once this is done, we could study how the choice of spanning tree affects the partial isomorphism solution. Moreover, tree decomposition, like the subgraph isomorphism problem, is NP. In this work we have used tree decompositions of constant label size; tree decompositions of non-constant label size that do not exceed w have not yet been tested. They could yield efficient isomorphism computations, possibly in linear time.

7- References

[1] David Eppstein: Subgraph Isomorphism in Planar Graphs and Related Problems, vol 3, No. 3 (1999), pp. 1-27.

[2] Donatello Conte et al.: Thirty Years Of Graph Matching In Pattern Recognition. IJPRAI 18(3): 265-298, 2004.

[3] S. Gold and A. Rangarajan: A Graduated Assignment Algorithm for Graph Matching. IEEE TPAMI, 18(4): 377-388, 1996.

[4] Frank R. Schmidt, Eno Töppe and Daniel Cremers: Efficient Planar Graph Cuts with Applications in Computer Vision. IEEE CVPR, 2009.

[5] Gruia Calinescu, Cristina G. Fernandes, Ulrich Finkler and Howard Karloff: A Better Approximation Algorithm for Finding Planar Subgraphs. Journal of Algorithms, vol 27, 269-302 (1998).

[6] Michel Neuhaus and Horst Bunke: An Error-Tolerant Approximate Matching Algorithm for Attributed Planar Graphs and Its Application to Fingerprint Classification. SSPR&SPR 2004, LNCS 3138, pp. 180-189, 2004.

[7] Subgraph Isomorphism in Planar Graphs: Sauued Bashir Sadjad, 2004.

[8] D. Eppstein: Connectivity, graph minors, and subgraph multiplicity. J. Graph Theory 17:409-416, 1993.

[9] J. M. Keil and T. B. Brecht: The complexity of clustering in planar graphs. J. Combinatorial Mathematics and Combinatorial Computing 9:155-159, 1991.

8. Annex 1: Software application

An application that includes 2 planar graphs and four graphs is available to check the new solution.

8.1. Requirement

At least an Intel Pentium IV (2.5 GHz) and 512 MB of RAM. Depending on the processor speed, matching two graphs will be quicker or slower.

8.2. How does it work?

Open the file interficie_Grafica.m and run it. The main interface then appears on the screen:

Figure 50: Application main interface

1- Select the planar graph G that you want (Planar Graph 1 or Planar Graph 2).

2- Select the graph H that you want (there are several options):

a. Triangle without attributes (w=3): tree decomposition of label size 4
b. Triangle with attributes (w=3): the same as a. but with attributes
c. Triangle without attributes (w=4): tree decomposition of label size 5
d. Triangle with attributes (w=4): the same as c. but with attributes
e. Square without attributes (w=4): tree decomposition of label size 5
f. Square with attributes (w=4): the same as e. but with attributes
g. Vector without attributes (w=3): tree decomposition of label size 4
h. Vector with attributes (w=3): the same as g. but with attributes

NOTE: If you select a graph H with attributes, you have to insert the desired values here: The letters correspond to the vertices of the different graphs H. For a square, use all the letters (a, b, c, d); for a triangle, use letters a, b and c; for a vector, use letters a and b. It is not necessary to modify the rest. Furthermore, if you want a threshold, change the value in this box; its usual value is 0.

3- Click on the “MATCHING” button.

4- The application starts looking for all the isomorphisms, and a message appears on the screen:

NOTE: Testing the square, with or without attributes, is not recommended, because the processor has to go through all the combinations: finding all the isomorphisms can take 2 days without attributes, and from 5 to 9 hours with attributes, depending on the situation.

5- Once a “matching successful” message appears on the screen, the application has finished its work.

6- The last step is to check all the isomorphisms in the table. They are sorted by Euclidean distance, so the first one is the most exact if we are using a threshold different from 0. In the table, the names of the planar graph vertices appear at the top of the columns, and the cells contain the vertices of graph H. The last column is the Euclidean distance. The columns called x and y will always be empty, but they appear because this is the table structure used during the matching.
