Effective Handling of Recurring Concept Drifts in Data Streams
Background: Many current applications generate large volumes of data whose underlying concept changes over time. Such data must be handled with high accuracy, even in a resource-constrained environment. Objectives: To achieve better generalization accuracy on data with drifting concepts, mainly recurrent drifts, we propose an ensemble system called Recurring Dynamic Weighted Majority (RDWM). Methods: Our system maintains a primary online ensemble of experts that represent the present concepts and a secondary ensemble of experts that represent the old concepts seen since the beginning of learning. An effective pruning methodology removes redundant and old classifiers from the system. Findings: Experimental analysis on the Stagger dataset shows that our system performs best on data containing abrupt as well as recurrent drifts, achieving the best prequential accuracy when an optimal window size is used. RDWM also proves to be highly resource-efficient compared with the EDDM approach. Experimental evaluation on a real-world electricity pricing dataset likewise shows RDWM to be the best-performing system, remaining accurate even in a resource-constrained environment. Improvements: The system can be further enhanced to handle novelty detection in data streams.
Concept Drift, Data Streams, Recurring, Recurring Concept
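The abstract reports prequential accuracy measured with a window size. As a rough illustration of that evaluation style only (not the RDWM system itself), the sketch below computes test-then-train accuracy over a sliding window. The names `MajorityClassLearner` and `prequential_accuracy` are hypothetical, and the learner is a deliberately trivial stand-in that assumes nothing about the experts used in the paper.

```python
from collections import deque


class MajorityClassLearner:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # No prediction is possible before the first training example.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(stream, learner, window_size):
    """Test-then-train evaluation: each example is first used to test the
    learner and only afterwards to train it. Returns the accuracy over the
    most recent `window_size` predictions after every example."""
    recent = deque(maxlen=window_size)  # 1 = correct, 0 = wrong
    history = []
    for x, y in stream:
        recent.append(1 if learner.predict(x) == y else 0)
        learner.learn(x, y)
        history.append(sum(recent) / len(recent))
    return history


# Toy stream with an abrupt drift from label "a" to label "b" (assumed data).
stream = [(None, "a")] * 4 + [(None, "b")] * 4
print(prequential_accuracy(stream, MajorityClassLearner(), window_size=2))
```

On this toy stream the windowed accuracy rises while the first concept is stable and falls once the label switches, which is the kind of signal a drift-handling ensemble is meant to recover from.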
This work is licensed under a Creative Commons Attribution 3.0 License.