DS501 Homework 3 – Collecting, Manipulating and Blending Data from Twitter

Problem 1: Sampling Twitter Data with Streaming API about a certain topic

  • Select a topic that you are interested in, for example, “#WPI” or “#DataScience”
  • Use the Twitter Streaming API to sample a collection of tweets about this topic in real time. (The recommended number of tweets is more than 50 but fewer than 500.)
  • Store the tweets you downloaded in a local CSV file.
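library(twitteR)
library(stringr)

# The four credentials are assumed to be defined earlier in the session.
setup_twitter_oauth(consumerKey, consumerSecret, accessToken, accessTokenSecret)

# Note: searchTwitter() queries the REST search API, not the Streaming API.
tweets <- searchTwitter('#rstats', n = 50)
tweetsDF <- twListToDF(tweets)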
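The storage step can be sketched as follows. This is a minimal example that assumes a small hand-made data frame standing in for the output of twListToDF(); the column names mirror the ones twitteR produces, but the rows and the output path are made up for illustration.

```r
# Stand-in for tweetsDF as returned by twListToDF(tweets); the rows are invented.
tweetsDF <- data.frame(
  id = c("1", "2", "3"),
  screenName = c("alice", "bob", "carol"),
  text = c("Learning #rstats today",
           "RT @alice: Learning #rstats today",
           "Plots, commas, and \"quotes\""),
  retweetCount = c(5, 5, 0),
  stringsAsFactors = FALSE
)

# Hypothetical output location; any writable path would do.
out_file <- file.path(tempdir(), "tweets.csv")

# row.names = FALSE avoids writing an extra, unnamed index column.
write.csv(tweetsDF, out_file, row.names = FALSE)

# Read the file back to confirm that nothing was lost.
# Tweet IDs are kept as character because they can exceed the precision of a double.
tweets_back <- read.csv(out_file, stringsAsFactors = FALSE,
                        colClasses = c(id = "character"))
print(nrow(tweets_back))
```

Reading the file back is optional, but it is a cheap check that commas and quotes inside tweet text survived the round trip.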

Report some statistics about the tweets you collected

  • The topic of interest: < INSERT YOUR TOPIC HERE>
  • The total number of tweets collected: < INSERT THE NUMBER HERE>
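One possible way to produce these figures is sketched below. It assumes a data frame shaped like the output of twListToDF(); the sample rows and the topic string are placeholders, and the extra counts (distinct users, retweets) are optional additions rather than something the assignment requires.

```r
# Stand-in for the collected tweets; the rows are invented.
tweetsDF <- data.frame(
  screenName = c("alice", "bob", "alice"),
  text = c("Learning #rstats", "RT @alice: Learning #rstats", "More #rstats"),
  isRetweet = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

topic <- "#rstats"                                # placeholder topic
n_tweets <- nrow(tweetsDF)                        # total tweets collected
n_users <- length(unique(tweetsDF$screenName))    # distinct authors
n_retweets <- sum(tweetsDF$isRetweet)             # how many rows are retweets

cat("Topic of interest:", topic, "\n")
cat("Total number of tweets collected:", n_tweets, "\n")
cat("Distinct users:", n_users, "\n")
cat("Retweets:", n_retweets, "\n")
```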

Problem 2: Analyzing Tweets and Tweet Entities with Frequency Analysis

  1. Word Count:
    • Use the tweets you collected in Problem 1, and compute the frequencies of the words being used in these tweets.

# Your R code here

    • Display a table of the top 30 words (ONLY) with their counts.

# Your R code here
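The word count and the top-30 table could be approached along the following lines. This is only a sketch in base R: the sample texts are invented, and the cleaning rules (lowercasing, dropping URLs, keeping # and @ as part of a word) are assumptions that you may want to change, for example by also removing stop words.

```r
# Stand-in for tweetsDF$text; the tweets are invented.
texts <- c("Learning #rstats is fun",
           "RT @alice: learning R is fun fun",
           "Data science with R https://t.co/abc123")

clean <- tolower(texts)
clean <- gsub("https?://\\S+", " ", clean, perl = TRUE)  # drop URLs
clean <- gsub("[^a-z0-9#@_ ]", " ", clean)               # drop punctuation, keep # and @

# Split on whitespace and discard empty tokens.
words <- unlist(strsplit(clean, "\\s+"))
words <- words[words != ""]

# Frequency of every word, most frequent first.
word_counts <- sort(table(words), decreasing = TRUE)

# Keep at most the 30 most frequent words and show them as a table.
top_words <- head(word_counts, 30)
top_table <- data.frame(word = names(top_words),
                        count = as.integer(top_words),
                        stringsAsFactors = FALSE)
print(top_table, row.names = FALSE)
```

With real data, replacing `texts` by `tweetsDF$text` should be the only change needed.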

  2. Find the most popular tweets in your collection of tweets
    • Please display a table of the top 10 tweets (ONLY) that are the most popular among your collection, i.e., the tweets with the largest retweet counts.

# Your R code here
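A possible ranking by retweet count is sketched below, again on an invented stand-in for tweetsDF. Dropping rows flagged as retweets before ranking is a design choice made here so that the same tweet is not listed several times; it is not something the assignment prescribes.

```r
# Stand-in for tweetsDF; the rows are invented.
tweetsDF <- data.frame(
  screenName = c("alice", "bob", "carol", "dave"),
  text = c("Tweet A", "RT @alice: Tweet A", "Tweet B", "Tweet C"),
  retweetCount = c(12, 12, 3, 40),
  isRetweet = c(FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Optional: keep only original tweets so a retweet does not duplicate its source.
originals <- tweetsDF[!tweetsDF$isRetweet, ]

# Sort by retweet count, largest first, and keep at most 10 rows.
ranked <- originals[order(originals$retweetCount, decreasing = TRUE), ]
top_tweets <- head(ranked[, c("screenName", "text", "retweetCount")], 10)
print(top_tweets, row.names = FALSE)
```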

  3. Find the most popular Tweet Entities in your collection of tweets
    • Please display a table of the top 10 hashtags (ONLY) and the top 10 user mentions (ONLY) that are the most popular in your collection of tweets.

# Your R code here
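Hashtags and user mentions can be pulled out of the tweet text with regular expressions, as in the sketch below. It uses base R so that it runs without extra packages; `extract_entities` is a hypothetical helper written for this example, the sample texts are invented, and lowercasing the matches (so that #RStats and #rstats count as one hashtag) is an assumption.

```r
# Stand-in for tweetsDF$text; the tweets are invented.
texts <- c("Learning #rstats with @hadleywickham",
           "RT @alice: #RStats and #DataScience",
           "No entities here",
           "@alice loves #datascience")

# Hypothetical helper: return every match of `pattern` across all texts, lowercased.
extract_entities <- function(texts, pattern) {
  matches <- regmatches(texts, gregexpr(pattern, texts, perl = TRUE))
  tolower(unlist(matches))
}

hashtags <- extract_entities(texts, "#\\w+")
mentions <- extract_entities(texts, "@\\w+")

# Count each entity and keep at most the 10 most frequent.
top_hashtags <- head(sort(table(hashtags), decreasing = TRUE), 10)
top_mentions <- head(sort(table(mentions), decreasing = TRUE), 10)

print(as.data.frame(top_hashtags, stringsAsFactors = FALSE))
print(as.data.frame(top_mentions, stringsAsFactors = FALSE))
```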

Problem 3: Getting any 20 friends and any 20 followers of a popular user on Twitter

  • Choose a Twitter user who has many followers, such as @hadleywickham.
  • Get the lists of friends and followers of that user.
  • Display 20 of the followers, with their ID numbers and screen names, in a table.
  • Display 20 of the friends (if the user has more than 20 friends), with their ID numbers and screen names, in a table.
  • Compute the mutual friends of the two groups, i.e., the users who appear in both the friend list and the follower list. Display their ID numbers and screen names in a table.
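The display and intersection steps above can be sketched as follows. The two data frames are invented stand-ins; in a live session they would presumably come from applying twListToDF() to the results of the getFriends() and getFollowers() methods of a twitteR user object obtained with getUser(), which is an assumption here and is not exercised by this code.

```r
# Stand-ins for the friend and follower tables; all IDs and names are invented.
friendsDF <- data.frame(
  id = c("101", "102", "103", "104"),
  screenName = c("ann", "ben", "cat", "dan"),
  stringsAsFactors = FALSE
)
followersDF <- data.frame(
  id = c("103", "104", "105"),
  screenName = c("cat", "dan", "eve"),
  stringsAsFactors = FALSE
)

# Show at most 20 rows of each group, ID and screen name only.
followers20 <- head(followersDF[, c("id", "screenName")], 20)
friends20 <- head(friendsDF[, c("id", "screenName")], 20)
print(followers20, row.names = FALSE)
print(friends20, row.names = FALSE)

# Mutual friends: users whose ID appears in both lists.
# IDs are compared as character strings to avoid losing precision on large values.
mutualDF <- friendsDF[friendsDF$id %in% followersDF$id, c("id", "screenName")]
print(mutualDF, row.names = FALSE)
```

Matching on `id` rather than `screenName` is the safer choice, since screen names can change while IDs do not.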

Problem 4 (Optional): Explore the data

Run some additional experiments with your data to gain familiarity with Twitter data and the Twitter API.
