Large Graph Partitioning and Iterative Processing Based on Edge Clustering under the BSP Model
  • Abstract

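    In recent years, with the spread of the Internet and the maturing of related technologies, large-scale graph data processing has become a new research focus. Because traditional general-purpose cloud platforms such as Hadoop are ill-suited to iterative processing of graph data, researchers have proposed new solutions based on the BSP model, such as Pregel, Hama, and Giraph. However, graph algorithms must frequently exchange intermediate results along the graph's topology, which incurs heavy communication overhead and seriously degrades the performance of BSP-based systems. This paper first analyzes the processing schemes of current mainstream BSP systems from the perspective of reducing message communication. It then proposes an edge-cluster-based vertical hybrid partitioning strategy (EC-VHP) and builds a cost-benefit model to analyze its effect on message communication. Building on EC-VHP, it proposes a vertex-edge computation model and designs a simple Hash index and a multi-queue parallel sequential index mechanism to further improve the efficiency of message processing. Finally, extensive experiments on real and synthetic datasets verify the correctness and effectiveness of the EC-VHP strategy and the index mechanisms.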

  • Authors

    冷芳玲 (Leng Fangling), 刘金鹏 (Liu Jinpeng), 王志刚 (Wang Zhigang), 陈昌宁 (Chen Changning), 鲍玉斌 (Bao Yubin), 于戈 (Yu Ge), 邓超 (Deng Chao)

  • Affiliations

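    College of Information Science and Engineering, Northeastern University, Shenyang 110819; Institute of Business Support, China Mobile Research Institute, Beijing 100053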

  • Issue

    2015, Issue 4 (indexed in ISTIC, EI, PKU)

  • Keywords

    large graph; BSP model; graph partition; vertex-edge computation model; index structure
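
The communication overhead that the abstract identifies as the bottleneck of BSP-based systems can be illustrated with a small sketch. The code below is not EC-VHP and not any system's actual API; it is a minimal, assumed model of one PageRank-style BSP superstep under plain modulo hash partitioning, and it only counts how many messages stay on one worker versus cross worker boundaries. The function names `hash_partition` and `superstep`, the damping constants, and the toy graph are all illustrative assumptions.

```python
from collections import defaultdict


def hash_partition(vertex, num_workers):
    """Assign a vertex to a worker by modulo hashing (illustrative assumption)."""
    return vertex % num_workers


def superstep(edges, values, num_workers):
    """Run one PageRank-style BSP superstep on a directed edge list.

    Returns (new_values, local_messages, remote_messages), where a message
    is remote when source and destination live on different workers.
    """
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1

    inbox = defaultdict(float)
    local = remote = 0
    # Compute and communicate: each vertex sends value / out_degree
    # along every outgoing edge.
    for src, dst in edges:
        inbox[dst] += values[src] / out_degree[src]
        if hash_partition(src, num_workers) == hash_partition(dst, num_workers):
            local += 1
        else:
            remote += 1

    # Barrier: every message is delivered before any vertex value is updated.
    new_values = {v: 0.15 + 0.85 * inbox[v] for v in values}
    return new_values, local, remote


# Hypothetical toy graph: 3 vertices, 4 directed edges, 2 workers.
toy_edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
toy_values = {0: 1.0, 1: 1.0, 2: 1.0}
new_values, local_msgs, remote_msgs = superstep(toy_edges, toy_values, 2)
```

On this toy graph, half of the messages cross worker boundaries. A partitioning strategy that co-locates densely connected vertices or edges would aim to lower that remote share, which is the kind of saving the abstract attributes to its partitioning and indexing design.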
