Access Data from the Twitter API using Tweepy (Python)

The Twitter API lets you do many things, including retrieving tweet data. To access this data, you need a developer account. This tutorial covers:

  • How to Set Up a Twitter Developer Account
  • How to Use tweepy (Python) to Access Twitter Data

How to Set Up a Twitter Developer Account

1.) Create a Twitter account if you do not have one.

2.) On the Twitter developer account page, you will be asked to answer a few questions. For example, I was asked for a phone number, country, and use case. The next step is to read and agree to the developer agreement.

3.) Verify your email.

4.) After verifying your email, you will be sent to a welcome screen. Name your app and click on Get keys.

5.) You now have access to your keys. Save them in a secure location, because you will need them to access data using the Twitter API. The Bearer Token from this screen is all you need for OAuth 2.0, which is what the code in this tutorial uses.
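One way to keep the Bearer Token out of your source code is to read it from an environment variable. The snippet below is a minimal sketch of that idea. The helper name load_bearer_token and the variable name TWITTER_BEARER_TOKEN are assumptions made for this example, not something Twitter or tweepy requires.

```python
import os


def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Return the Bearer Token stored in an environment variable.

    Both this helper and the default variable name are hypothetical
    choices for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token
```

With this in place, the client used later in the tutorial could be created with tweepy.Client(bearer_token=load_bearer_token()) instead of pasting the token into the script.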

Using tweepy to Access Twitter Data

This section briefly goes over how to use the Python tweepy library to access Twitter data. To get started with the library, install it with pip.

pip install tweepy

Search for Tweets from the Last 7 Days

The code below searches for tweets from the last 7 days and returns at most 100 tweets per request. This particular query matches tweets (not retweets) in English that contain the hashtag #petday.

# OAuth2.0 Version 
import tweepy

# Put your Bearer Token in the parentheses below
client = tweepy.Client(bearer_token='Change this')

"""
If you don't understand search queries, there is an excellent introduction to it here: 
https://github.com/twitterdev/getting-started-with-the-twitter-api-v2-for-academic-research/blob/main/modules/5-how-to-write-search-queries.md
"""

# Get tweets that contain the hashtag #petday
# -is:retweet means I don't want retweets
# lang:en is asking for the tweets to be in English
query = '#petday -is:retweet lang:en'
tweets = client.search_recent_tweets(query=query, tweet_fields=['context_annotations', 'created_at'], max_results=100)

"""
What context_annotations are: 
https://developer.twitter.com/en/docs/twitter-api/annotations/overview
"""
# tweets.data is None when the search returns no results
for tweet in tweets.data or []:
    print(tweet.text)
    if len(tweet.context_annotations) > 0:
        print(tweet.context_annotations)
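The query string used above is a set of operators separated by spaces. If you want to vary the hashtag or language without retyping the string, a small helper can assemble it. This is only a sketch, and build_query is a hypothetical function written for this example, not part of tweepy.

```python
def build_query(hashtag, lang="en", include_retweets=False):
    """Assemble a search query string from its parts (hypothetical helper)."""
    # Accept the hashtag with or without the leading '#'
    parts = ["#" + hashtag.lstrip("#")]
    if not include_retweets:
        # -is:retweet excludes retweets
        parts.append("-is:retweet")
    if lang:
        # lang:xx restricts results to one language
        parts.append("lang:" + lang)
    return " ".join(parts)


query = build_query("petday")
print(query)  # #petday -is:retweet lang:en
```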

Note that to get tweets older than the last 7 days, you need to use the search_all_tweets method, which is only available if you upgrade to the academic research product track or another elevated access level. There is also a good blog on using that method here.
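For readers who do have that access, here is a minimal sketch of what a full-archive search could look like. The wrapper name search_archive and the example dates are assumptions for illustration. The call will only succeed with a client whose access level permits full-archive search.

```python
from datetime import datetime, timezone


def search_archive(client, query, start_time, end_time, max_results=100):
    """Run one full-archive search and return a list of tweets (possibly empty).

    search_archive is a hypothetical wrapper; `client` is expected to be
    a tweepy.Client created with a suitable Bearer Token.
    """
    response = client.search_all_tweets(
        query=query,
        start_time=start_time,
        end_time=end_time,
        tweet_fields=["created_at"],
        max_results=max_results,
    )
    # response.data is None when nothing matched
    return response.data or []


# Example date range (arbitrary values chosen for this sketch)
start = datetime(2021, 1, 1, tzinfo=timezone.utc)
end = datetime(2021, 1, 31, tzinfo=timezone.utc)

# Example call, left commented out because it needs elevated access:
# client = tweepy.Client(bearer_token='Change this')
# tweets = search_archive(client, '#petday -is:retweet lang:en', start, end)
```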

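Get More than 100 Tweets at a Time using Paginator

A single request returns at most 100 tweets. If you need more, wrap the search method in tweepy.Paginator and call flatten with a limit, i.e. the total number of tweets that you want. Here max_results is the number of tweets per request and limit is the total across all requests. Replace limit=1000 with the maximum number of tweets you want.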

# OAuth2.0 Version 
import tweepy

# Put your Bearer Token in the parentheses below
client = tweepy.Client(bearer_token='Change this')

"""
If you don't understand search queries, there is an excellent introduction to it here: 
https://github.com/twitterdev/getting-started-with-the-twitter-api-v2-for-academic-research/blob/main/modules/5-how-to-write-search-queries.md
"""

# Get tweets that contain the hashtag #petday
# -is:retweet means I don't want retweets
# lang:en is asking for the tweets to be in English
query = '#petday -is:retweet lang:en'
tweets = tweepy.Paginator(client.search_recent_tweets, query=query,
                              tweet_fields=['context_annotations', 'created_at'], max_results=100).flatten(limit=1000)

for tweet in tweets:
    print(tweet.text)
    if len(tweet.context_annotations) > 0:
        print(tweet.context_annotations)
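To make the difference between the per-request size and the total limit concrete, here is a small pure-Python illustration of the same idea. It is not tweepy's implementation and makes no network calls. fetch_page and flatten are hypothetical stand-ins that page through a local list.

```python
def fetch_page(items, next_token, max_results):
    """Stand-in for one API request: return up to max_results items and a token."""
    start = next_token or 0
    end = start + max_results
    page = items[start:end]
    # A token of None means there are no more pages
    new_token = end if end < len(items) else None
    return page, new_token


def flatten(items, max_results, limit):
    """Yield items page by page until `limit` items were yielded or pages run out."""
    yielded = 0
    token = None
    while yielded < limit:
        page, token = fetch_page(items, token, max_results)
        for item in page:
            if yielded == limit:
                return
            yield item
            yielded += 1
        if token is None:
            return


fake_tweets = [f"tweet {i}" for i in range(250)]
result = list(flatten(fake_tweets, max_results=100, limit=220))
print(len(result))  # 220
```

Three "requests" of up to 100 items each are made here, and iteration stops as soon as 220 items have been yielded. If the limit is larger than what is available, you simply get everything that exists.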

Conclusion

This tutorial was about getting started with the Twitter API. Future tutorials will go over how to export Twitter data as well as sentiment analysis. If you have any questions or thoughts on the tutorial, feel free to comment here on the cnvrg.io Discourse.