Link Prediction in Graph Stream by Distributed Computation Approach  
Author Khanh-Duy Le-Trinh


Co-Author(s) Anh-Thu Nguyen-Thi; Tu-Anh Nguyen-Hoang


Abstract Recently, the graph stream has become an essential model for representing interacting elements in massive networks. It underpins many real-world applications, such as social networks, e-commerce networks, and telecommunication networks. However, most existing link prediction methods focus on predicting the existence of links in snapshot graphs, whereas most recent applications take the form of graph streams. Applying these methods to a graph stream raises two challenges: a) a large number of nodes 𝑛 produces on the order of 𝑛² possible links, resulting in significant computational complexity, and b) the graph stream evolves rapidly. In this paper, we introduce an effective method for predicting the existence of links in a graph stream in real time. We propose an efficient Graph Stream Distributed Computation (GSDC) framework that is immediately amenable to parallelization, facilitating a scalable distributed implementation on the Apache Spark platform. Within our framework, we use the Distributed Computation Score (DCS) feature, which is designed to compute the similarity scores between nodes in parallel. Experimental results on three real-world social networks demonstrate the effectiveness of the proposed framework.
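The idea of scoring candidate links partition by partition can be illustrated with a minimal sketch. The abstract does not define DCS, so this example substitutes a plain common-neighbor count as the similarity score, and it simulates partitions with list chunks in a single process rather than on Apache Spark. All function names here (`build_adjacency`, `partition`, `score_partition`, `common_neighbor_scores`) and the sample edge window are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from an iterable of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def partition(items, num_parts):
    """Split a list into round-robin chunks (a stand-in for cluster partitions)."""
    return [items[i::num_parts] for i in range(num_parts)]


def score_partition(pairs, adj):
    """Score each candidate pair by its number of common neighbors.

    Assumption: common-neighbor count stands in for the paper's DCS feature.
    """
    return {(u, v): len(adj[u] & adj[v]) for u, v in pairs}


def common_neighbor_scores(edges, num_parts=2):
    """Score every unconnected node pair in the current window of the stream."""
    adj = build_adjacency(edges)
    # Candidate links are node pairs that are not yet connected.
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    scores = {}
    # Each chunk is scored independently, so this loop could become a parallel map.
    for chunk in partition(candidates, num_parts):
        scores.update(score_partition(chunk, adj))
    return scores


# A hypothetical window of edges taken from a graph stream.
window = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
scores = common_neighbor_scores(window)
```

Because each chunk of candidate pairs is scored without reference to the others, the same structure would map onto a distributed map step. The quadratic number of candidate pairs is still enumerated in this sketch, which is the cost the abstract identifies as the first challenge.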


Keywords Link Prediction, Distributed Method, Data Mining
Article #: DSBFI19-99
Proceedings of ISSAT International Conference on Data Science in Business, Finance and Industry
July 3-5, 2019 - Da Nang, Vietnam