pydata

Keep Looking, Don't Settle

sentiment analysis with twitter 01: twitter analysis with tweepy

1. set up twitter account

go to dev.twitter.com and register to get the consumer_key, consumer_secret, access_token and access_token_secret

2. install tweepy

run pip install tweepy to install it

3. using tweepy to access twitter

3.1 set up consumer key and access token

import tweepy
from time import sleep
import os
consumer_key = 'xxxxxxxxxxxxxxxxxxxxxxxx'
consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'


auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
auth.secure = True
api = tweepy.API(auth)
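
Hard-coding the four keys in the script makes them easy to leak when the file is shared. A minimal sketch, assuming the keys are exported as environment variables instead; the variable names and the helper load_credentials are made up for this example and are not something tweepy defines:

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values can then be passed to tweepy.OAuthHandler and auth.set_access_token in place of the literal strings.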

After it is set up, let's try a first example to verify the setup: print out the Wall Street Journal '@WSJ' account information.

# print out the user information; it is long, so print only the first 800 chars
print(str(api.get_user(screen_name = '@WSJ'))[:800])
User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=False, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': False, u'profile_text_color': u'333333', u'default_profile_image': False, u'id': 3108351, u'profile_background_image_url_https': u'https://pbs.twimg.com/profile_background_images/378800000026078008/c07ad7d60f01c3a836dd701fc3f71233.jpeg', u'verified': True, u'profile_location': None, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/685113343204585473/jV72Zljq_normal.jpg', u'profile_sidebar_fill_color': u'EFEFEF', u'entities': {u'url': {u'urls': [{u'url': u'https://t.co/GhhR6PLfem', u'indices': [0, 23], u'expanded_url': u'http://wsj.com', u'display_url': u'wsj.com'}]}, u'descrip
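
Slicing the string representation cuts the output in the middle of a field. A sketch that picks a few fields from the raw dictionary instead; it assumes the `_json` attribute visible in the output above, and the helper name summarize_user is made up for this example:

```python
def summarize_user(user_json, fields=("id", "screen_name", "verified", "followers_count")):
    """Keep only the requested keys; keys absent from the payload are skipped."""
    return {key: user_json[key] for key in fields if key in user_json}


# A hand-written stand-in for api.get_user(...)._json, trimmed to a few keys.
sample = {"id": 3108351, "verified": True, "profile_text_color": "333333"}
print(summarize_user(sample))
```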

3.2 search a keyword in twitter, list users, retweet/favorite the tweets, and follow the users

This shows how to search twitter with tweepy.Cursor and api.search, and print out some sample info. You can set the search language, how many tweets to download, and where in the timeline to start (since_id). For more details, check the help page of the function.

 API.search(q[, lang][, locale][, rpp][, page][, since_id][, geocode][, show_user])

    Returns tweets that match a specified query.
    Parameters:

        q – the search query string
        lang – Restricts tweets to the given language, given by an ISO 639-1 code.
        locale – Specify the language of the query you are sending. This is intended for language-specific clients and the default should work in the majority of cases.
        rpp – The number of tweets to return per page, up to a max of 100.
        page – The page number (starting at 1) to return, up to a max of roughly 1500 results (based on rpp * page).
        since_id – Returns only statuses with an ID greater than (that is, more recent than) the specified ID.
        geocode – Returns tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taken from the Geotagging API, but will fall back to their Twitter profile. The parameter value is specified by “latitude,longitude,radius”, where radius units must be specified as either “mi” (miles) or “km” (kilometers). Note that you cannot use the near operator via the API to geocode arbitrary locations; however you can use this geocode parameter to search near geocodes directly.
        show_user – When true, prepends “<user>:” to the beginning of the tweet. This is useful for readers that do not display Atom’s author field. The default is false.

    Return type:

    list of SearchResult objects
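
The geocode value is easy to get wrong because it packs three numbers and a unit into one string. A small sketch of a builder for it, following the “latitude,longitude,radius” format described above; make_geocode is a hypothetical helper, not part of tweepy:

```python
def make_geocode(latitude, longitude, radius, unit="km"):
    """Build the 'latitude,longitude,radius' string with a 'mi' or 'km' suffix."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("latitude/longitude out of range")
    if radius <= 0:
        raise ValueError("radius must be positive")
    return "{},{},{}{}".format(latitude, longitude, radius, unit)


print(make_geocode(48.8566, 2.3522, 10))
```

The resulting string would be passed as the geocode argument of api.search.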

Basically, we will do two things: 1. retweet and favorite the tweets if we have not done so yet; 2. follow the user.

The following code does this. We search for tweets in English that mention '#EURO2016' and take the first 5. If we have not retweeted or favorited a tweet, we retweet it with tweet.retweet() and favorite it with tweet.favorite().

If we do not follow the author yet, we can also follow them with tweet.user.follow().

for tweet in tweepy.Cursor(api.search, q = '#EURO2016', lang = 'en').items(5):
    try:
        print("found tweet by @" + tweet.user.screen_name)
        if not tweet.retweeted or not tweet.favorited:
            tweet.retweet()
            tweet.favorite()
            print("retweeted and favorited the user")
        if not tweet.user.following:
            # follow the users, it is better to do this with your test account
            # tweet.user.follow()
            print("Followed the user @" + tweet.user.screen_name)
    except tweepy.TweepError as e:
        # if you see a 403 error, most likely permission was denied
        print(e.reason)
        sleep(10)

found tweet by @Brndstr
retweeted and favorited the user
Followed the user @Brndstr
found tweet by @zoorromoha
retweeted and favorited the user
Followed the user @zoorromoha
found tweet by @Brndstr
retweeted and favorited the user
Followed the user @Brndstr
found tweet by @notfirstnoel
retweeted and favorited the user
Followed the user @notfirstnoel
found tweet by @abed31131
retweeted and favorited the user
Followed the user @abed31131
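
The loop above sleeps 10 seconds after any error and moves on to the next tweet. For transient failures where the same call is worth repeating, a small wrapper with a growing delay is one option. This is a generic sketch, not part of tweepy: call_with_retry and its parameters are hypothetical, and the sleep function is injectable so the demo runs without waiting. With the tweepy version used here, exceptions would be (tweepy.TweepError,).

```python
import time


def call_with_retry(func, exceptions=(Exception,), attempts=3, delay=10, sleep=time.sleep):
    """Call func(); on a listed exception wait and retry, doubling the delay."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
            delay *= 2


# Demo with a fake call that fails twice, then succeeds.
calls = {"n": 0}
waits = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = call_with_retry(flaky, exceptions=(RuntimeError,), sleep=waits.append)
print(result, waits)
```

Retrying does not help with permanent errors such as a denied permission, so those are better logged and skipped.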

However, if a tweet was sent by my own account, we should ignore it rather than retweet it.

myBot = api.get_user(screen_name = '@rixin008')

for tweet in tweepy.Cursor(api.search, q = '#EURO2016', lang = 'en').items(5):
    try:
        # skip tweets sent by our own account
        if tweet.user.id == myBot.id:
            continue
        print("found tweet by @" + tweet.user.screen_name)
        if not tweet.retweeted or not tweet.favorited:
            tweet.retweet()
            tweet.favorite()
            print("retweeted and favorited the user")
        if not tweet.user.following:
            # follow the users, it is better to do this with your test account
            # tweet.user.follow()
            print("Followed the user @" + tweet.user.screen_name)
    except tweepy.TweepError as e:
        print(e.reason)
        sleep(10)
found tweet by @garysnook
retweeted and favorited the user
Followed the user @garysnook
found tweet by @Brndstr
retweeted and favorited the user
Followed the user @Brndstr
found tweet by @WulanRhm
retweeted and favorited the user
Followed the user @WulanRhm
found tweet by @humashzaadi
retweeted and favorited the user
Followed the user @humashzaadi
found tweet by @TheAOSNOfficial
retweeted and favorited the user
Followed the user @TheAOSNOfficial
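
The same account can show up several times in one search (@Brndstr appears twice in the first run above). A sketch of a filter that skips our own tweets and keeps only the first tweet per author; the stand-in Tweet and User tuples only mimic the tweet.user.id and tweet.user.screen_name attributes used above, and first_tweet_per_author is a hypothetical helper:

```python
from collections import namedtuple

User = namedtuple("User", ["id", "screen_name"])
Tweet = namedtuple("Tweet", ["user", "text"])


def first_tweet_per_author(tweets, skip_user_ids=()):
    """Yield tweets whose author is not skipped and has not been seen yet."""
    seen = set(skip_user_ids)
    for tweet in tweets:
        if tweet.user.id in seen:
            continue
        seen.add(tweet.user.id)
        yield tweet


sample = [
    Tweet(User(1, "me"), "my own tweet"),
    Tweet(User(2, "alice"), "first"),
    Tweet(User(2, "alice"), "second"),
    Tweet(User(3, "bob"), "hello"),
]
kept = [t.text for t in first_tweet_per_author(sample, skip_user_ids=[1])]
print(kept)
```

In the real loop, the tweepy.Cursor(...).items(5) iterator would take the place of sample and myBot.id would go into skip_user_ids.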

3.3 create a list to add a group of users

We can collect the users related to a username or screen_name in a list. First create the list with api.create_list; once it is created, you can find it at https://twitter.com/rixin008/lists/.

myBot = api.get_user(screen_name = '@google')
# create a list, will comment out after it is created
api.create_list('google-related-username', mode = "public", description = "I add you, sorry!")
List(_api=<tweepy.api.API object at 0x7f24e440cdd0>, subscriber_count=0, member_count=0, name=u'google-related-username', created_at=datetime.datetime(2016, 7, 10, 17, 50, 44), uri=u'/rixin008/lists/google-related-username1', mode=u'public', id_str=u'752198382114840576', id=752198382114840576, user=User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'default_profile_image': True, u'id': 228525243, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'profile_text_color': u'333333', u'profile_image_url_https': u'https://abs.twimg.com/sticky/default_profile_images/default_profile_2_normal.png', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 0, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'228525243', u'profile_background_color': u'C0DEED', u'listed_count': 2, u'is_translation_enabled': False, u'utc_offset': None, u'statuses_count': 113, u'description': u'', u'friends_count': 4, u'location': u'', u'profile_link_color': u'0084B4', u'profile_image_url': u'http://abs.twimg.com/sticky/default_profile_images/default_profile_2_normal.png', u'following': False, u'geo_enabled': False, u'profile_background_image_url': u'http://abs.twimg.com/images/themes/theme1/bg.png', u'screen_name': u'rixin008', u'lang': u'en', u'profile_background_tile': False, u'favourites_count': 102, u'name': u'rixin008 song', u'notifications': False, u'url': None, u'created_at': u'Mon Dec 20 00:10:43 +0000 2010', u'contributors_enabled': False, u'time_zone': None, u'protected': False, u'default_profile': True, u'is_translator': False}, time_zone=None, id=228525243, _api=<tweepy.api.API object at 0x7f24e440cdd0>, verified=False, profile_text_color=u'333333', 
profile_image_url_https=u'https://abs.twimg.com/sticky/default_profile_images/default_profile_2_normal.png', profile_sidebar_fill_color=u'DDEEF6', is_translator=False, geo_enabled=False, entities={u'description': {u'urls': []}}, followers_count=0, protected=False, id_str=u'228525243', default_profile_image=True, listed_count=2, lang=u'en', utc_offset=None, statuses_count=113, description=u'', friends_count=4, profile_link_color=u'0084B4', profile_image_url=u'http://abs.twimg.com/sticky/default_profile_images/default_profile_2_normal.png', notifications=False, profile_background_image_url_https=u'https://abs.twimg.com/images/themes/theme1/bg.png', profile_background_color=u'C0DEED', profile_background_image_url=u'http://abs.twimg.com/images/themes/theme1/bg.png', name=u'rixin008 song', is_translation_enabled=False, profile_background_tile=False, favourites_count=102, screen_name=u'rixin008', url=None, created_at=datetime.datetime(2010, 12, 20, 0, 10, 43), contributors_enabled=False, location=u'', profile_sidebar_border_color=u'C0DEED', default_profile=True, following=False), full_name=u'@rixin008/google-related-username1', following=False, slug=u'google-related-username1', description=u'I add you, sorry!')

After the list 'google-related-username' is created, you can add members to it with api.add_list_member.

myList = api.get_list("@rixin008", slug = 'google-related-username')
print("Running bot: @" + myBot.screen_name + "\n using list: " + myList.name + "\n members count:" + str(myList.member_count) + "; subs count: " + str(myList.subscriber_count))
Running bot: @google
 using list: google-related-username
 members count:0; subs count: 0
# myList = api.get_list("@"+myBot.screen_name, slug = 'sorry-for-adding-you')
# print "Running bot: @" + myBot.screen_name + "\n using list: " + myList.name + "\n members count:" + str(myList.member_count) + "; subs count: " + str(myList.subscriber_count)


for tweet in tweepy.Cursor(api.search, q = '#EURO2016', lang = 'en').items(5):
    try:
        # skip tweets sent by the bot account itself
        if tweet.user.id == myBot.id:
            continue
        print("found tweet by @" + tweet.user.screen_name)
        if not tweet.retweeted or not tweet.favorited:
            tweet.retweet()
            tweet.favorite()
            # add the user to the list
            api.add_list_member(screen_name = tweet.user.screen_name, owner_screen_name = myBot.screen_name, list_id = myList.id)
            print("retweeted and favorited the user, and added the user to the list")
        if not tweet.user.following:
            # follow the users, it is better to do this with your test account
            # tweet.user.follow()
            print("Followed the user @" + tweet.user.screen_name + ", not really, just test")
    except tweepy.TweepError as e:
        # if you see a 403 error, most likely permission was denied
        print(e.reason)
        sleep(10)

found tweet by @rosscoeff
retweeted and favorited the user, and added the user to the list
Followed the user @rosscoeff, not really, just test
found tweet by @Mbuzimawa
retweeted and favorited the user, and added the user to the list
Followed the user @Mbuzimawa, not really, just test
found tweet by @tugablog
[{u'message': u'You have already retweeted this tweet.', u'code': 327}]
found tweet by @biggi_abdul
retweeted and favorited the user, and added the user to the list
Followed the user @biggi_abdul, not really, just test
found tweet by @minyokmarquez91
retweeted and favorited the user, and added the user to the list
Followed the user @minyokmarquez91, not really, just test
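
One way to hit the 'You have already retweeted this tweet' error shown in the output is the `or` in the condition: a tweet that is already retweeted but not yet favorited still reaches tweet.retweet(). A sketch that decides each action separately; plan_actions is a hypothetical helper, and FakeTweet only mimics the retweeted and favorited attributes:

```python
from collections import namedtuple

FakeTweet = namedtuple("FakeTweet", ["retweeted", "favorited"])


def plan_actions(tweet):
    """Return the names of the actions still needed for this tweet."""
    actions = []
    if not tweet.retweeted:
        actions.append("retweet")
    if not tweet.favorited:
        actions.append("favorite")
    return actions


print(plan_actions(FakeTweet(retweeted=True, favorited=False)))
```

In the loop, each name would map to tweet.retweet() or tweet.favorite(), so neither call is made twice for the same tweet.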

3.4 unfollow users

Just as we can follow a bunch of users, we can unfollow one friend or all of them: iterate over api.friends and call follower.unfollow() on each.

for follower in tweepy.Cursor(api.friends).items():
    try:
        # not really unfollow, so I comment it out
        # follower.unfollow()
        print("unfollowed: " + follower.screen_name)
    except tweepy.TweepError as e:
        print(e)
        sleep(10)
unfollowed: agile_toolsmith
unfollowed: EmanuelDerman
unfollowed: Dropbox
unfollowed: iphone_dev
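
Unfollowing everyone is rarely what we want. A sketch that first filters the friends against a keep list; accounts_to_unfollow is a hypothetical helper, and the Friend tuple stands in for the objects returned by api.friends, using screen names from the output above:

```python
from collections import namedtuple

Friend = namedtuple("Friend", ["screen_name"])


def accounts_to_unfollow(friends, keep=()):
    """Return friends whose screen name is not in keep (case-insensitive)."""
    keep_lower = {name.lower() for name in keep}
    return [f for f in friends if f.screen_name.lower() not in keep_lower]


friends = [Friend("agile_toolsmith"), Friend("EmanuelDerman"), Friend("Dropbox")]
for friend in accounts_to_unfollow(friends, keep=["dropbox"]):
    print("would unfollow: " + friend.screen_name)
```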