Big-Data Analytics and Cloud Computing by Marcello Trovati Richard Hill Ashiq Anjum Shao Ying Zhu & Lu Liu

Author: Marcello Trovati, Richard Hill, Ashiq Anjum, Shao Ying Zhu & Lu Liu
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham


5.3 Communication for Data Collection on Twitter

Twitter is a platform that supports a wide range of communication, such as the discussion of particular issues between individuals and groups. Communication analysis on Twitter studies the communicative patterns within hashtags, keywords and data sets extracted from daily interactions. Starting from a general description of Twitter communication analysis [4], the approach and methodology must be customised to carry out particular analyses. In addition to the general findings about Twitter’s communication structure, a large amount of data can be used to acquire a better understanding of certain issues or events [5] and to predict specific real-time events, such as massive traffic congestion on a particular road in Edinburgh.

In essence, Twitter communication analysis can be performed using the metadata provided by the application programming interface (API). The use of metrics for Twitter communication analysis is uncomplicated and builds upon the communication data collected through the Twitter API. The Twitter API [6] includes a streaming API, which provides real-time access to tweets in sampled and filtered forms, making it possible to extract live interaction data from Twitter users. Real-time streaming gives access to tweets and retweets as they are broadcast by Twitter users. A key advantage of real-time streaming is that the data set can be updated live, minute by minute. This can improve the decision-making process by making the most up-to-date data available for further processing. In the context of this paper, receiving real-time data is intended to improve bus arrival time prediction.
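As an illustration of how simple such metrics can be once the data has been collected, the following is a minimal sketch in Python. The sample records and their field names (`user`, `text`, `hashtags`, `is_retweet`) are assumptions made for this example and do not reproduce the schema returned by the Twitter API.

```python
from collections import Counter

# Hypothetical, already-collected tweets. The field names are assumptions
# for illustration only, not the actual Twitter API schema.
tweets = [
    {"user": "alice", "text": "Heavy traffic on Princes Street #edintravel",
     "hashtags": ["edintravel"], "is_retweet": False},
    {"user": "bob", "text": "RT: Accident on the bypass #edintravel #A720",
     "hashtags": ["edintravel", "A720"], "is_retweet": True},
    {"user": "alice", "text": "Still queueing at Haymarket #EdinTravel",
     "hashtags": ["EdinTravel"], "is_retweet": False},
    {"user": "carol", "text": "Buses running late this morning",
     "hashtags": [], "is_retweet": False},
]


def hashtag_counts(records):
    """Count how often each hashtag occurs, ignoring letter case."""
    counts = Counter()
    for record in records:
        counts.update(tag.lower() for tag in record["hashtags"])
    return counts


def retweet_share(records):
    """Return the fraction of records that are retweets (0.0 if empty)."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["is_retweet"]) / len(records)


def tweets_per_user(records):
    """Count how many tweets each user contributed."""
    return Counter(r["user"] for r in records)


print(hashtag_counts(tweets).most_common(2))
print(retweet_share(tweets))
print(tweets_per_user(tweets).most_common(1))
```

Counting per hashtag, per user and per tweet type covers the simplest descriptive metrics; any analysis-specific metric would be built on the same collected records.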

The streaming API is based on a push-based approach [6], in which data streams continuously once a request has been made. Users can use the Twitter API to receive live data on a continuous basis and, at the same time, process that data for a specific use. Because the data is provided as a live feed, particular data can be extracted as soon as it is tweeted. Studies of real-time events, such as road traffic incidents, require researchers to rigorously establish a stream for the collection and sorting of data, in order to compile an analytically useful set.
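The push-based idea can be sketched without any network access. In the Python example below, `TweetStream` is a toy stand-in for a streaming connection and is not part of any Twitter client library; the keyword list and the sample messages are likewise assumptions for illustration. The point is only that the consumer registers a handler once and is then called for each tweet as it arrives, instead of repeatedly polling for new data.

```python
class TweetStream:
    """Toy push-based stream (hypothetical, not a real Twitter client).

    Handlers registered with subscribe() are called for every tweet
    that is pushed into the stream.
    """

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def push(self, tweet):
        for handler in self._handlers:
            handler(tweet)


def make_keyword_collector(keywords, sink):
    """Build a handler that appends tweets mentioning any keyword to sink."""
    lowered = [keyword.lower() for keyword in keywords]

    def handler(tweet):
        text = tweet["text"].lower()
        if any(keyword in text for keyword in lowered):
            sink.append(tweet)

    return handler


stream = TweetStream()
traffic_tweets = []
stream.subscribe(
    make_keyword_collector(["traffic", "congestion", "accident"], traffic_tweets)
)

# Simulate tweets arriving one by one; a live feed would push them instead.
for incoming in [
    {"text": "Accident on the A720, long delays"},
    {"text": "Lovely morning in Leith"},
    {"text": "Traffic at a standstill on Leith Walk"},
]:
    stream.push(incoming)

print(len(traffic_tweets))
```

Filtering inside the handler means that irrelevant tweets are discarded on arrival, so only the analytically useful subset is retained for later processing.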

In this research, we make use of Tweetbinder (a Twitter API tool), which can generate deep analytics of Twitter users based on various filters, such as keywords, hashtags, pictures, textual tweets, and retweets. The binder is a core component of Tweetbinder, developed to tackle two problems: textual organisation and statistics. The textual organiser makes it possible to sort hashtagged tweets into categories (binders), making it easier to organise tweets from a particular event (or hashtag) according to set criteria. In each binder, tweets are separated into textual tweets, pictures, links, and retweets, which allows researchers to conduct granular data analysis. The concepts introduced in this paper provide a pragmatic set of analytical tools with which to study information retrieved from Twitter communications. The specific metrics that relate to particular Twitter communicative contexts may also be leveraged.
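The binder concept can be approximated in a few lines of Python. This is a sketch of the general idea only and not Tweetbinder's implementation, which is not described here. The record fields (`hashtags`, `media`, `urls`, `is_retweet`) and the precedence used when a tweet fits several types (retweet, then picture, then link, then text) are assumptions made for this example.

```python
from collections import defaultdict


def classify(tweet):
    """Assign a tweet to one of four types.

    The precedence order is an assumption made for this sketch.
    """
    if tweet.get("is_retweet"):
        return "retweets"
    if tweet.get("media"):
        return "pictures"
    if tweet.get("urls"):
        return "links"
    return "text"


def build_binders(tweets):
    """Group tweets by hashtag (one binder each), then by tweet type."""
    binders = defaultdict(lambda: defaultdict(list))
    for tweet in tweets:
        for tag in tweet.get("hashtags", []):
            binders[tag.lower()][classify(tweet)].append(tweet)
    return binders


# Hypothetical sample records for illustration.
sample = [
    {"text": "Queue at Haymarket #edintravel", "hashtags": ["edintravel"],
     "media": [], "urls": [], "is_retweet": False},
    {"text": "Photo of the tailback #edintravel", "hashtags": ["edintravel"],
     "media": ["photo1.jpg"], "urls": [], "is_retweet": False},
    {"text": "Roadworks update #edintravel", "hashtags": ["edintravel"],
     "media": [], "urls": ["https://example.org/roadworks"], "is_retweet": False},
    {"text": "RT Queue at Haymarket #edintravel", "hashtags": ["edintravel"],
     "media": [], "urls": [], "is_retweet": True},
]

binders = build_binders(sample)
for kind, items in sorted(binders["edintravel"].items()):
    print(kind, len(items))
```

Once tweets are grouped in this way, per-binder statistics reduce to counting the entries under each type.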


