The coup attempt in Turkey kept me from going to sleep after two nice hours of “Back to the Future” on German television. So despite the tragedy behind it: Twitter was exploding as I watched the news come in. I was interested in the timeline of tweets, locations and “everything”. I downloaded about 10 GB of Twitter data, and here is my analysis of everything ‘#turkey’ from Friday till Monday.
When I first thought about it and tried to get tweets from the Twitter API, I hit the rate limits quite soon. So how was it possible to get over 10 GB of Twitter data in just 8 hours?

Collecting Twitter Data
After Google showed me some workarounds that were not working, I found this post and applied its logic to my situation:

```python
import tweepy
import jsonpickle

# get the following by creating an app on dev.twitter.com
consumer_key = 'your consumer key here'
consumer_secret = 'your consumer secret here'

# app-only auth gives higher search rate limits than user auth
auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

searchQuery = '#Turkey'       # this is what we're searching for
maxTweets = 10000000          # some arbitrary large upper bound
tweetsPerQry = 100            # this is the max the API permits per request
fName = '/tmp/tweetsgeo.txt'  # we'll store the tweets in a text file

sinceId = None  # lower bound (newest tweet already seen), None on the first run
max_id = -1     # upper bound for paging backwards through older results

tweetCount = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, 'w') as f:
    while tweetCount < maxTweets:
        try:
            # page backwards: everything older than max_id, newer than sinceId
            kwargs = {'q': searchQuery, 'count': tweetsPerQry}
            if sinceId:
                kwargs['since_id'] = sinceId
            if max_id > 0:
                kwargs['max_id'] = str(max_id - 1)
            new_tweets = api.search(**kwargs)
            if not new_tweets:
                print("No more tweets found")
                break
            for tweet in new_tweets:
                # one JSON object per line
                f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            max_id = new_tweets[-1].id  # continue below the oldest tweet of this batch
        except tweepy.TweepError as e:
            print("some error : " + str(e))
            break
```
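To get from the raw dump to the timeline I was after, here is a minimal sketch of how one could read the file back in and count tweets per minute. The field names (`created_at`, `coordinates`, `text`) are standard Twitter API v1.1 tweet attributes; using pandas for the aggregation is my own assumption, not something prescribed above:

```python
import json
import pandas as pd

# read the dump back in: one JSON-encoded tweet per line
tweets = []
with open('/tmp/tweetsgeo.txt') as f:
    for line in f:
        t = json.loads(line)
        tweets.append({
            'created_at': t['created_at'],
            'coordinates': t.get('coordinates'),  # GeoJSON point or None
            'text': t['text'],
        })

df = pd.DataFrame(tweets)
# Twitter timestamps look like 'Fri Jul 15 22:30:05 +0000 2016'
df['created_at'] = pd.to_datetime(df['created_at'])

# tweets per minute -> the timeline of the night
per_minute = df.set_index('created_at').resample('1min').size()
print(per_minute.head())
```

The `coordinates` column is kept because only a fraction of tweets carry a geotag; filtering on it is the starting point for the location analysis.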