Abstract: 

We propose a parallel algorithm to extract all connected subgraphs, each of which shares a common itemset whose size is not less than a given threshold, from a given graph in which each vertex is associated to an itemset. We also propose implementations of this algorithm using the taskparallel language Tascell. This kind of graph mining can be applied to analysis of social or biological networks. We have already proposed an efficient sequential search algorithm called COPINE for this problem. COPINE reduces the search space of a dynamically growing tree structure by pruning its branches corresponding to the following subgraphs; already visited, having itemsets smaller than the threshold, and having alreadyvisited supergraphs with identical itemsets. For the third pruning, we use a table associating alreadyvisited subgraphs and their itemsets. To avoid excess pruning in a parallel search where a unique set of subtrees (tasks) is assigned to each worker, we should put a certain restriction on a worker when it is referring to a table entry registered by another worker. We designed a parallel algorithm as an extension of COPINE by introducing this restriction. A problem of the implementation is how workers efficiently share the table entries so that a worker can safely use as many entries registered by other workers as possible. We implemented a sharing method where a single table controlled by locks is shared among workers. We also developed task creation strategies in consideration of the tradeoff between the effect of pruning and the number of work stealing events. We evaluated these implementations using a real protein network. As a result, we achieved a speedup of approximately a factor five with 16 workers.
Authors Shingo Okuno, Kyoto University; Tasuku Hiraishi, Kyoto University; Hiroshi Nakashima, Kyoto University; Masahiro Yasugi, Kyushu Institute of Technology; Jun Sese, Tokyo Institute of Technology

