Twitter Streaming and Analysis through R

Affiliations

  • Department of Information Technology, GMRIT, Rajam - 532127, Andhra Pradesh, India
  • Department of Computer Science, SPMVV, Tirupati - 517502, Andhra Pradesh, India

Abstract


Objectives: To retrieve tweets from Twitter through the Twitter API. The domain chosen for analysis is the Make-In-India dataset. Methods/Statistical Analysis: The work has two phases: 1. data streaming from Twitter and 2. knowledge mining in R-Studio. The methods used for these two operations are 1. the Twitter API and 2. sentiment analysis in R. A Twitter application is created to request a connection to the Twitter database. Once the connection is established, authentication keys are generated. Given the search key “Make-In-India” and the number of tweets required, a data frame (.df) containing the tweets is generated and converted into a .CSV (Comma Separated Values) file suitable for analysis. Sentiment analysis [1], also called opinion mining, extracts facts from the tweets such as how many people support Make-In-India, how many are negative about the scheme and how many are neutral. For this process, each tweet is compared against a file of positive words and a file of negative words to calculate its positive score and negative score. The difference between these scores is the final score of the tweet. Findings: Tweets are identified as positive, negative or neutral, and the counts are plotted so that the status of Make-In-India can be visualised in a graph. First, tweets about “Make-In-India” are extracted from Twitter in the R-Studio environment. Second, the extracted raw tweets are parsed in R according to their types and stored in .CSV format in the R database; scores are calculated for all the tweets and stored in a file. Third, visual analysis is performed on the stored data using the R statistical software to assess the impact of the programme. Application/Improvements: The methodology can be applied to derive findings [2] from public opinion expressed in tweets on a particular government issue, political parties or the medical status around the country. It is also useful for assessing the popularity of a political leader or programme. Sentiment analysis of user tweets can thereby support decision making.
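The lexicon-based scoring described in the abstract (positive-word matches minus negative-word matches, then a positive, negative or neutral label) can be illustrated with a minimal sketch. It is written in Python rather than the R used in the paper, and the two word sets, the sample tweets and the function names `score_tweet`, `label` and `summarize` are all hypothetical stand-ins; it is not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons (assumption): the paper compares tweets against
# full positive-words and negative-words files, which are not reproduced here.
POSITIVE_WORDS = {"good", "great", "boost", "support", "growth"}
NEGATIVE_WORDS = {"bad", "fail", "poor", "against", "loss"}


def score_tweet(text, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Return (positive matches) - (negative matches) for one tweet."""
    # Lower-case the tweet and split it into alphabetic tokens.
    words = re.findall(r"[a-z']+", text.lower())
    positive_score = sum(word in positive for word in words)
    negative_score = sum(word in negative for word in words)
    return positive_score - negative_score


def label(score):
    """Map a final score to the three classes named in the abstract."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(tweets):
    """Count how many tweets fall into each class."""
    return Counter(label(score_tweet(tweet)) for tweet in tweets)


# Made-up sample tweets (assumption), used only to exercise the functions.
tweets = [
    "Make-In-India is a great boost for growth",
    "Make-In-India will fail, poor planning",
    "Make-In-India launched this week",
]
counts = summarize(tweets)
```

The resulting `counts` mapping holds the per-class totals that would feed the bar chart mentioned under Findings. Simple word matching of this kind ignores negation and sarcasm, so the scores are only a rough indicator of opinion.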

Keywords

Big Data, Data Analytics, Make-In-India Data Set, Streaming, R-Studio, Tweets, Twitter API.

References


  • Agarwal A, Xie B, Vovsha I, Rambow O. Sentiment analysis of Twitter Data. Proceedings of the …, 2011. Available from: dl.acm.org.
  • Rahmath H. Opinion mining and sentiment analysis-challenges and applications. IJAIEM. 2014 May; 3(5):1–3.
  • Spencer J, Uchyigit G. Sentimentor: Sentiment analysis of Twitter Data. CiteSeerX 10M; 2012.
  • Sharma Y, Mangat V, Mandeep K. Sentiment analysis and opinion mining. International Journal of Soft Computing and Artificial Intelligence. 2015 May; 3(1).
  • Rao NP, Srinivas SN, Prashanth CM. Real time opinion mining of Twitter Data. International Journal of Computer Science and Information Technologies. 2015; 6(3):2923–7.
  • Pak A, Paroubek P. Twitter as a corpus for sentiment analysis and opinion mining. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10); 2010 May 19-21.
  • Ding X, Liu B, Yu PS. A holistic lexicon-based approach to opinion mining. Proceedings of First ACM International Conference on Web Search and Data Mining WSDM; 2008.
  • Bifet A, Frank E. Sentiment knowledge discovery in Twitter Streaming Data. New Zealand: Springer Link; 2010. p. 1–15.
  • Bifet A, Holmes G, Pfahringer B, Gavalda R. Detecting sentiment change in Twitter streaming data. Journal of Machine Learning Research. Proceedings Track. 2011; 17:5–11.
  • Fiaidhi J, Mohammed O, Mohammed S, Fong S, Kim TH. Opinion mining over Twitter space: Classifying tweets programmatically using the R Approach. IEEE. 978-1-4673-2430-4/12.
  • Parthiban P, Selvakumar S. Big data architecture for capturing, storing, analyzing and visualizing of web server logs. Indian Journal of Science and Technology. 2016 Jan; 9(4). DOI: 10.17485/ijst/2016/v9i4/84173.
  • Liang M, Trejo C, Muthu L, Ngo LB, Luckow A, Amy W. Evaluating R-based big data analytic frameworks. IEEE International Conference on Cluster Computing; 2015, p. 508–9.
  • Morstatter F, Pfeffer J. Is the sample good enough? Comparing data from Twitter’s streaming API with Twitter’s Firehose. The 7th International AAAI Conference on Weblogs and Social Media. 2013.
  • Sukhpal K, Rashid EM. Web news mining using back propagation neural network and clustering using K-Means algorithm in Big data. Indian Journal of Science and Technology. 2016 Nov; 9(41).


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.