
An Improvised Topsis Approach to Select Web Source as External Data Source for Web Warehousing


  • SC & SS, Jawaharlal Nehru University, New Delhi - 110067, India


Objective: The main objective of this paper is to incorporate external web data efficiently into a web warehouse, since the evolution of the web and the demands of data analytics make such data necessary for an effective decision support system. Methods/Statistical Analysis: The data owned by an organization is insufficient on its own for a decision support system, yet the dynamic and complex nature of the web poses various challenges when selecting relevant web data. Evaluating web resources for selection as external sources of a web warehouse is therefore a crucial phase of warehousing. Various Multi-Criteria Decision Making (MCDM) approaches have been used for this purpose. All of these approaches evaluate web resources on the basis of a set of features that define the relevance of a resource. Findings: The paper focuses on one MCDM approach, the "Technique for Order Preference by Similarity to Ideal Solution" (TOPSIS), and improvises it for more efficient evaluation of web resources. Traditional TOPSIS uses Euclidean distance to compute the proximity of real web sources to ideal web sources. Euclidean distance measures only the distance between the real and ideal web resources, not the differences between them. To compute these differences, Kullback-Leibler divergence is used in place of Euclidean distance. Application/Improvements: The improvised TOPSIS computes symmetric as well as asymmetric distances to capture these differences, and is therefore efficient at computing proximity for the evaluation of web resources.
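The idea of swapping the distance measure inside TOPSIS can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the function names, the vector normalisation step, the example scores and weights, and in particular the use of the generalized (unnormalised) Kullback-Leibler divergence, which is chosen here so that weighted score vectors need not sum to 1. Scores are assumed to be strictly positive.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Asymmetric generalized KL divergence D(p || q) for non-negative vectors.

    Reduces to the standard KL divergence when p and q both sum to 1.
    eps is a small constant (an assumption of this sketch) that avoids log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q) - p + q))

def symmetric_kl(p, q):
    """Symmetrised divergence: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def euclidean(p, q):
    """Distance used by traditional TOPSIS."""
    return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))

def topsis_closeness(scores, weights, benefit, distance=euclidean):
    """Relative closeness of each alternative (row) to the ideal solution.

    scores:   alternatives x criteria matrix of positive values
    weights:  one weight per criterion
    benefit:  True where larger is better, False where smaller is better
    distance: pluggable measure between a row and the (anti-)ideal vector
    Assumes the alternatives are not all identical (otherwise 0/0 occurs).
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalise each criterion column, then apply the weights.
    weighted = scores / np.sqrt((scores ** 2).sum(axis=0)) * weights
    # Ideal takes the best value per criterion, anti-ideal the worst.
    ideal = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti_ideal = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))
    d_pos = np.array([distance(row, ideal) for row in weighted])
    d_neg = np.array([distance(row, anti_ideal) for row in weighted])
    return d_neg / (d_pos + d_neg)

if __name__ == "__main__":
    # Hypothetical data: three candidate web sources, three made-up criteria.
    scores = [[0.9, 0.7, 120.0], [0.6, 0.9, 300.0], [0.8, 0.5, 80.0]]
    weights = [0.5, 0.3, 0.2]
    benefit = [True, True, False]  # third criterion treated as a cost
    for name, dist in [("euclidean", euclidean),
                       ("asymmetric KL", kl_divergence),
                       ("symmetric KL", symmetric_kl)]:
        print(name, np.round(topsis_closeness(scores, weights, benefit, dist), 3))
```

Passing the distance as a parameter keeps the rest of the TOPSIS procedure unchanged, so the Euclidean, asymmetric, and symmetric variants can be compared on the same decision matrix. If each row were instead rescaled to sum to 1 before applying the standard KL divergence, two sources whose scores differ only by a constant factor would become indistinguishable, which is why this sketch uses the unnormalised form.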


Keywords: Improvised TOPSIS, Web-Data, Web-Warehouse, Web-Resources.






This work is licensed under a Creative Commons Attribution 3.0 License.