Improved Parallel Computation of PageRank for Web Searching

Affiliations

  • Department of Computer Science and Engineering, Maulana Azad National Institute of Technology, Link Road Number 3, Near Kali Mata Mandir, Bhopal - 462003, Madhya Pradesh, India
  • Department of Mechanical Engineering, National Institute of Technology, NH 66, Srinivas Nagar, Surathkal, Mangaluru - 575025, Karnataka, India

Abstract


Background/Objectives: PageRank, introduced by Brin and Page in 1998, has become a dominant link analysis method that web search engines use to rank their search results. Computing PageRank values quickly and efficiently for very large web graphs is an important problem for search engines today, as is recognizing and combating spam web pages. Methods/Statistical Analysis: In this paper, we propose an efficient, accelerated parallel computation of PageRank scores on Graphics Processing Units (GPUs) that uses a non-uniform distribution of PageRank values. The method was evaluated on datasets from the Stanford Large Network Dataset Collection, on a system equipped with an NVIDIA Quadro 2000 graphics card and programmed with CUDA. Findings: The proposed method achieves a speedup of 3.22 to 7.5 and is also capable of dealing with spam web pages. Application: The proposed algorithm helps in detecting spam web pages.
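
For orientation, the computation being parallelized can be sketched as a pull-style power iteration. This is a minimal sketch of the standard PageRank formulation with a uniform split of rank across out-links, not the non-uniform scheme proposed in the paper, whose details are not given in the abstract. The function name `pagerank`, the adjacency-list input, the damping factor 0.85, and the even redistribution of rank from dangling pages are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank by power iteration (illustrative sketch only).

    out_links[p] is the list of pages that page p links to.
    All parameter names and defaults here are assumptions.
    """
    n = len(out_links)

    # Build the reverse adjacency so each page can gather from its in-neighbours.
    in_links = [[] for _ in range(n)]
    for src, targets in enumerate(out_links):
        for dst in targets:
            in_links[dst].append(src)
    out_degree = [len(targets) for targets in out_links]

    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread evenly (assumption).
        dangling = sum(rank[p] for p in range(n) if out_degree[p] == 0)
        base = (1.0 - damping) / n + damping * dangling / n

        # Each new entry reads only the previous vector, so the entries are
        # independent of one another; a GPU version could assign one thread per page.
        new_rank = [
            base + damping * sum(rank[q] / out_degree[q] for q in in_links[p])
            for p in range(n)
        ]

        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Tiny hypothetical graph: 0 -> 1, 2; 1 -> 2; 2 -> 0; 3 -> 2.
    for page, score in enumerate(pagerank([[1, 2], [2], [0], [2]])):
        print(page, round(score, 4))
```

The pull formulation is chosen here because every page writes only its own entry, which avoids the atomic updates a push formulation would need when many threads add to the same target page.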

Keywords

CUDA, GPU, Parallel PageRank, Spam Web Pages

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.