This repository was archived by the owner on Jun 19, 2025. It is now read-only.

Twitter Client

foonicorn edited this page Sep 4, 2012 · 16 revisions

PyCoaches Tutorial: Twitter Client

Introduction

Goal

At the end of this tutorial, you will have a client which can grab data from Twitter and analyze it.

Who this Tutorial is for

This tutorial is for Python beginners who have a basic understanding of the language and want to improve their skills by programming a small project.

Tools

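All you need for this tutorial is a browser and a standard installation of Python. The code examples call urllib.urlopen(), which exists in Python 2; in Python 3 the equivalent function is urllib.request.urlopen().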

Get data with the browser

Goal

Get tweets from Twitter with your browser.

Instructions

Open this URL in your browser: http://search.twitter.com/search.json?q=python&rpp=1

Explanation

You will see a chunk of data which is not very readable, but it actually contains one tweet related to the keyword python. You can copy and paste the data into a tool like jsonlint.com to make it more readable.

Queries to Twitter's search API consist of a base part and parameters, most of which are optional. The base part http://search.twitter.com/search.json? is always the same. It is followed by the parameters in the form of key-value pairs key=value. The key-value pairs are separated by the & character.

The example query from this exercise consists of the base part http://search.twitter.com/search.json?, the required parameter q (query) with value python and the optional parameter rpp (results per page) with value 1.
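To see this structure from Python, the example URL can be taken apart again. The following is a minimal sketch that assumes Python 3, where the URL helpers live in urllib.parse (in Python 2 they are in the urlparse module); the variable names are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

url = 'http://search.twitter.com/search.json?q=python&rpp=1'

# Split the URL into its components; everything after '?' is the query string.
parts = urlsplit(url)
base = '{0}://{1}{2}'.format(parts.scheme, parts.netloc, parts.path)

# parse_qs() splits the query string at '&' and '=' and returns a dictionary
# that maps each key to a list of values.
params = parse_qs(parts.query)

print(base)
print(params)
```

Each value is a list because a key may appear more than once in a query string.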

Exercise: Change the rpp parameter to different values and see how the amount of data varies.

Resources

Get data with Python

Goal

Get the raw data from Twitter into Python.

Instructions

Type the following code into the Python interpreter.

>>> import urllib
>>> response = urllib.urlopen('http://search.twitter.com/search.json?q=python&rpp=1')
>>> raw_data = response.read()
>>> print(raw_data)

Explanation

First, we import the module urllib from the Python Standard Library, a handy tool for fetching data from the World Wide Web.

Then we open the query URL and store the response in the variable response.

Because the response is actually a file-like object, we use its read() method to access the data and store it in the variable raw_data.

Finally, we print the data, which will look very similar to what you have seen in the browser.
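What "file-like" means can be tried out without a network connection. In this sketch, io.BytesIO stands in for the response object and the bytes inside it are made up for illustration; it assumes Python 3, where read() returns bytes.

```python
import io

# Stand-in for the object returned by urlopen(): made-up data, no network access.
response = io.BytesIO(b'{"query": "python", "results": []}')

# read() returns everything from the current position up to the end.
raw_data = response.read()
print(raw_data)

# The data has been consumed, so a second read() returns an empty result.
leftover = response.read()
print(leftover)
```

This is why the tutorial stores the result of read() in a variable: calling read() again on the same response would not return the data a second time.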

Resources

Make the data accessible

Goal

Convert the text-based data into an accessible data structure.

Instructions

>>> import json
>>> data = json.loads(raw_data)
>>> print(data.keys())
>>> print(data['query'])
>>> tweets = data['results']
>>> print(len(tweets))
>>> first_tweet = tweets[0]
>>> print(first_tweet.keys())
>>> print(first_tweet['text'])

Explanation

Twitter uses JSON to format the response. Fortunately, the Python standard library contains a JSON parser which does all the work for us. After importing it, we can use json.loads() to convert a JSON string into a data structure consisting of lists and dictionaries.

Examine the keys of the dictionary. The key query contains the query string we sent to Twitter: python. Much more interesting is results, which contains a list of tweets that matched our query. Its length should be equal to the value of the rpp parameter in the query.

Each tweet is itself a dictionary containing various pieces of information about the tweet and, of course, the message itself.
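The mapping from JSON text to dictionaries and lists can also be tried offline. The string below is a hypothetical, heavily shortened sample shaped like the response described above; the user names and messages are invented, and a real response contains many more keys.

```python
import json

# Hypothetical sample shaped like the search response; not real Twitter data.
raw_data = '''
{
  "query": "python",
  "results": [
    {"from_user": "alice", "text": "Learning Python today"},
    {"from_user": "bob", "text": "Python makes JSON easy"}
  ]
}
'''

data = json.loads(raw_data)   # a JSON object becomes a dictionary
tweets = data['results']      # a JSON array becomes a list

print(type(data).__name__)
print(type(tweets).__name__)
print(len(tweets))
print(tweets[0]['text'])
```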

Exercise: Use a for loop to print the username (from_user) of each tweet, followed by its message.

Resources

Reusable Code

Goal

Create a function which calls the API and returns the tweets.

Instructions

import json
import urllib

def fetch_tweets(query, rpp=100):
    url = 'http://search.twitter.com/search.json?q={0}&rpp={1}'.format(query, rpp)
    response = urllib.urlopen(url)
    raw_data = response.read()
    data = json.loads(raw_data)
    return data['results']

>>> import twitter
>>> tweets = twitter.fetch_tweets('python', 100)

Explanation

You don't want to type these commands into the interpreter every time, so let's create a function for this. Open a new file twitter.py and paste the first code block into it.

The function has two arguments: query and rpp. Note the default value of rpp - it's used when the argument is omitted.

To construct the query URL, we use Python's string formatting. {0} is replaced by the format() method's first argument, {1} by the second.
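The default value and the format() placeholders can be checked in isolation, without calling the API. The helper build_url below is hypothetical and exists only for this illustration; it is not part of the tutorial's twitter module.

```python
def build_url(query, rpp=100):
    # Hypothetical helper: only builds the URL string, it does not contact Twitter.
    # {0} is filled with query, {1} with rpp.
    return 'http://search.twitter.com/search.json?q={0}&rpp={1}'.format(query, rpp)

# rpp is omitted here, so the default value 100 is used.
default_url = build_url('python')

# rpp is passed explicitly here and overrides the default.
explicit_url = build_url('python', 5)

print(default_url)
print(explicit_url)
```

Note that format() inserts the values exactly as given, so a query containing spaces or & characters would produce a malformed URL.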

Now you can import your twitter module and get the tweets with a single function call.

Exercise: Use urllib.urlencode() to construct the query URL.

Resources
