Anita showed some nice examples of tweets in QGIS in 2012. Since then it has been quiet about Twitter content in QGIS. Yet tweets can be an interesting source of information: they can tell you something about the spatiotemporal pattern of a keyword, the digital heartbeat of a defined region and much more. We need to be careful with the data, though, as it is heavily biased. But how do we get this data stream into QGIS?

The First Insights: Tweets in QGIS
Back in 2011/2012 the Twitter API was easy to query with a single line of code, and you were able to stream data. Anita showed this in a nice little way:

    curl -k -d @locations.txt https://stream.twitter.com/1/statuses/filter.json -uuser:password > tweets.json

With this line you collected all tweets for a defined region (as stated in locations.txt) in a file. You could then simply scan the file for tweets with a defined location, add those lines as features to a point shapefile in QGIS, and off you go. But soon after, Twitter changed its API policy and switched to a more advanced authentication system, and Anita's solution stopped working. Still, nice examples of tweets in a geographic context appeared online. The most extreme map is probably this one: The one million tweet map
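The file-scanning step of that old workflow can be sketched in a few lines of Python. This is a minimal sketch, assuming the stream wrote one JSON object per line and that geotagged tweets carry a `geo` object whose `coordinates` are in `[latitude, longitude]` order. The function name `geotagged_points` and the sample lines are made up for illustration.

```python
import json


def geotagged_points(lines):
    """Yield (lon, lat, text) for every line that holds a geotagged tweet.

    Assumes one JSON document per line; blank lines and lines that
    are not valid JSON are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # not a JSON document
        if not isinstance(tweet, dict):
            continue
        geo = tweet.get("geo")
        if not geo or "coordinates" not in geo:
            continue  # tweet without a point location
        lat, lon = geo["coordinates"][:2]
        yield lon, lat, tweet.get("text", "")


# Hypothetical sample data standing in for the content of tweets.json
sample = [
    '{"text": "hello", "geo": {"type": "Point", "coordinates": [52.5, 13.4]}}',
    '',
    '{"text": "no location", "geo": null}',
    'not json',
]
points = list(geotagged_points(sample))
print(points)
```

Since a file object iterates line by line, the same function could be fed an opened tweets.json directly.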
Recent versions of ArcGIS also added the possibility to add tweets to ArcGIS map products. So it's time to add this to QGIS as well.

Tweets in QGIS - Prerequisites
At the moment the most reliable way to access the Twitter API from a Python environment seems to be tweepy. The installation is quite easy under Ubuntu. For an installation on Windows, however, you need some information about environment parameters, and only you know whether you have the OSGeo4W shell installed. Long story short: you need to install the tweepy module by hand before reading streams from Twitter. Furthermore, the Twitter API needs some tokens and keys. Make sure to have a Twitter account and create an application so you'll get your codes:
- access token
- access token secret
- consumer key
- consumer key secret
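If you would rather not paste these four values into a script, one option is to read them from environment variables. The following is only a sketch of that idea: the variable names and the helper `load_credentials` are hypothetical and not part of tweepy or QGIS.

```python
import os

# Hypothetical variable names; choose whatever fits your setup.
REQUIRED = (
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
)


def load_credentials(environ=None):
    """Return the four values as a dict, or raise KeyError if any is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}


# Demonstration with a fake environment instead of real secrets.
fake_env = {name: "dummy" for name in REQUIRED}
credentials = load_credentials(fake_env)
print(sorted(credentials))
```

Calling `load_credentials()` without an argument reads from the real environment and fails early if a value is missing.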
The first part of gathering tweets is to call tweepy with a defined location:

    import tweepy

    access_token = "put your token here"
    access_token_secret = "put your token secret here"
    consumer_key = "put your key here"
    consumer_secret = "put your key secret here"
    key = tweepy.OAuthHandler(consumer_key, consumer_secret)
    key.set_access_token(access_token, access_token_secret)

    # here comes the tweepy part:
    class stream2lib(tweepy.StreamListener):
        output = {}  # class attribute, so every instance shares this dictionary

        def __init__(self, api=None):
            self.api = api or tweepy.API(key)
            self.n = 0  # we will start with zero tweets
            self.m = 10  # let's stop with 10 tweets

        def on_status(self, status):
            # we will parse the interesting information into a nice format
            self.output[status.id] = {
                'tweet': status.text.encode('utf8'),  # text could have non-ASCII characters, so encode it as UTF-8
                'user': status.user.screen_name.encode('utf8'),  # user name should be UTF-8 encoded as well
                'geo': status.geo,  # this is the point location of the device
                'localization': status.user.location,  # user location as part of the user profile (normally set fixed per user)
                'time_zone': status.user.time_zone,  # time zone as set in the user profile
                'time': status.timestamp_ms}  # the timestamp given in ms since 01.01.1970
            # we will only care about tweets with geo
            if self.output[status.id]['geo'] is not None:
                self.n = self.n + 1  # we found a geotweet. but that's always true when calling the command with "locations=[-x,-y,x,y]" as below
            return self.n < self.m  # returning False stops the stream

    stream = tweepy.streaming.Stream(key, stream2lib())  # initiate the stream
    stream.filter(locations=[-180, -90, 180, 90])  # filter the stream for tweets in this "box"
    tweetdic = stream2lib().output  # copy it in a variable (works because output is shared by the class)
    print(tweetdic)  # just to be sure
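Because the listener stores every status it receives, the dictionary can be narrowed down afterwards in plain Python. A minimal sketch, assuming entries shaped like the ones built in `on_status` (a `geo` value that is either `None` or a dict with `coordinates` in `[latitude, longitude]` order) and a box given as `[west, south, east, north]`, the same order used for `locations`. The helper name `tweets_in_box` and the sample entries are invented for illustration.

```python
def tweets_in_box(tweets, box):
    """Keep only geotagged tweets whose point lies inside box.

    box is [west, south, east, north] in degrees. Boxes that cross
    the antimeridian are not handled by this sketch.
    """
    west, south, east, north = box
    selected = {}
    for tweet_id, tweet in tweets.items():
        geo = tweet.get("geo")
        if geo is None:
            continue  # tweet without a device location
        lat, lon = geo["coordinates"]
        if west <= lon <= east and south <= lat <= north:
            selected[tweet_id] = tweet
    return selected


# Hypothetical entries in the same shape as the listener output
sample = {
    1: {"tweet": "in Berlin", "geo": {"type": "Point", "coordinates": [52.5, 13.4]}},
    2: {"tweet": "no location", "geo": None},
    3: {"tweet": "in Sydney", "geo": {"type": "Point", "coordinates": [-33.9, 151.2]}},
}
europe = tweets_in_box(sample, [-10, 35, 30, 70])
print(sorted(europe))  # → [1]
```

Note the swap between the two conventions: the box is ordered by longitude first, while the `geo` coordinates come as latitude first.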
As you can see, we gather tweets and filter the stream. Unfortunately the filtered stream can't be stored in a variable, so we attached the output to the listener class itself, and we need to filter it for tweets with a coordinate afterwards. With the lines above we have a dictionary of tweets…

Adding Tweets as Points to QGIS
As we have the tweets in a variable, we can simply iterate over the dictionary and fill a virtual layer in QGIS with the point information. First we create this virtual layer:

    import datetime
    from PyQt4.QtCore import QVariant  # QGIS 2.x ships with PyQt4
    from qgis.core import (QgsVectorLayer, QgsField, QgsFeature,
                           QgsGeometry, QgsPoint, QgsMapLayerRegistry)

    vl = QgsVectorLayer("Point", "temporary_twitter_results", "memory")
    pr = vl.dataProvider()
    # changes are only possible when editing the layer
    vl.startEditing()

At the moment the layer doesn't contain any attributes, so let's add them as well:

    pr.addAttributes([QgsField("user_name", QVariant.String),
                      QgsField("localization", QVariant.String),
                      QgsField("tweet", QVariant.String),
                      QgsField("time", QVariant.String)])

The next lines iterate over the dictionary and, if a tweet has a coordinate, use the coordinates as point location and the tweet attributes as attributes of the feature:

    for tweet in tweetdic:
        if tweetdic[tweet]['geo'] is not None:
            fet = QgsFeature()  # it's a new feature
            # geo holds [lat, lon], QgsPoint expects (x, y) = (lon, lat)
            fet.setGeometry(QgsGeometry.fromPoint(QgsPoint(
                tweetdic[tweet]['geo']['coordinates'][1],
                tweetdic[tweet]['geo']['coordinates'][0])))  # use the coordinates for point location
            # seconds are everything but the last three digits, which are the milliseconds
            tweettime = datetime.datetime.utcfromtimestamp(
                float(tweetdic[tweet]['time'][:-3] + "." + tweetdic[tweet]['time'][-3:])
            ).strftime('%Y-%m-%d %H:%M:%S:%f')  # parse the time to fit YYYY-MM-DD HH:MM:SS:MS
            fet.setAttributes([tweetdic[tweet]['user'],
                               tweetdic[tweet]['localization'],
                               tweetdic[tweet]['tweet'],
                               tweettime])  # set attributes of current tweet at current location
            pr.addFeatures([fet])  # and add the feature to the layer

As we have finished the iteration, let's stop the editing and publish the layer to the current QGIS project:

    # commit to stop editing the layer
    vl.commitChanges()
    # update layer's extent when new features have been added
    # because change of extent in provider is not propagated to the layer
    vl.updateExtents()
    QgsMapLayerRegistry.instance().addMapLayer(vl)

In the end you can collect as many tweets as you like. But be warned: this solution might freeze your QGIS application for a few moments until a new tweet is found.
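The loop above slices the `timestamp_ms` string by hand. As an alternative sketch, the conversion can be isolated in a small function that works on the whole number, assuming the value is a count of milliseconds since 01.01.1970 (UTC). The name `format_timestamp_ms` is hypothetical, and the function needs nothing from QGIS or tweepy.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)


def format_timestamp_ms(timestamp_ms):
    """Convert milliseconds since 1970-01-01 (UTC) to 'YYYY-MM-DD HH:MM:SS:ffffff'."""
    moment = EPOCH + datetime.timedelta(milliseconds=int(timestamp_ms))
    return moment.strftime("%Y-%m-%d %H:%M:%S:%f")


print(format_timestamp_ms("1420070400123"))  # → 2015-01-01 00:00:00:123000
```

Adding a `timedelta` to the epoch avoids floating-point rounding and accepts the value either as a string or as an integer.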
10’000 tweets in 3min ;-)
Together with Anita's TimeManager plugin you can now create nice videos.

The whole script can be downloaded: tester2.

Furthermore, I created a QGIS plugin called twitter2qgis or geotweet: twitter2qgis/geotweet for qgis: a plugin
The plugin is also under development on GitHub, so please report any issues or contribute with knowledge/programming…

A Warning
You can collect a large number of tweets and also mine them. But be aware: Twitter users are a quite specific group, and those who also allow Twitter to use their current location are an even more specific one. If you do any analysis with those tweets, keep these aspects in mind and also think about reading these articles:
- Mapping the global Twitter heartbeat
- Geography of Twitter networks
- Typical Twitter user is a young woman with an iPhone
- An Exhaustive Study of Twitter Users Across the World