Dave Cross on building TwittElection

2015-05-20 09:25:56 admin Guest blog posts News 3 Comments

Dave Cross has been using Perl since 1996. In 1998 he started the London Perl Mongers. He blogs about Perl at Perl Hacks and is @davorg on Twitter.


In the UK we have just had a general election. Over the last few weeks many web sites have sprung up to share information about the campaign and to help people decide how to vote. I have set up my own site called TwittElection and in this article I'd like to explain a little about how it works.

It might help to explain how elections are run in the UK. The country is divided into 650 electoral areas (called "constituencies"). Candidates stand for election in one of these constituencies. Most of these candidates will represent one of the major parties, but others will come from smaller parties or even be completely independent. Voters get one vote which they can give to one of the candidates standing in their local constituency. In each constituency, the candidate with the most votes wins (this is called the "first past the post" system) and becomes the Member of Parliament (MP) for that constituency. The 650 MPs sit in parliament in Westminster and the party with the most MPs gets to form the government.

I was interested in how candidates used Twitter to communicate with the electorate. So I planned to create a Twitter list for each constituency. Each list would be made up of the candidates for that constituency. It's then simple enough to create a web page for each constituency that contains a Twitter List Widget for that list.

My first problem was to find out which candidates were standing in which constituency and what their Twitter accounts were. Luckily, this was a problem that many sites were having and a group called the Democracy Club set up a crowd-sourcing effort at YourNextMP in order to compile this information. They gathered data on almost 4,000 candidates. And they have an API, so I could query that data whenever I wanted.

There was a chance that this site was going to be really popular. So I decided that, if possible, I would host it on GitHub Pages rather than my own server. That meant that it needed to be a static site rather than something like a Dancer application. But that's OK. We really only need 650 pages (one for each constituency) and once they have the correct Twitter List Widget embedded in them, they don't need to be updated very frequently.

I decided on the following architecture.

  • A program that queries the YourNextMP API, gets up-to-date information about candidates and stores that data in a local database.
  • A program that queries the local database and uses that information to update the Twitter list for each constituency.
  • A program that rebuilds the pages for the site and publishes them to GitHub Pages.

The database wasn't hard to design. It has three tables - candidate, constituency and party. There are foreign keys from candidate to party (a candidate represents a party) and from candidate to constituency (a candidate is standing in a constituency). I then generated DBIx::Class schema classes representing these tables.

The first program in my list above isn't very difficult. It uses LWP::Simple to query the YourNextMP API which returns data as JSON. I then use JSON.pm to parse the JSON and get the data that I need. It then compares this new data to the existing data and makes all the required updates. If updates are made, then a 'candidates_updated_time' column on the constituency table is set so that the next program knows that it has work to do.
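The comparison step described above can be sketched roughly as follows. This is a minimal illustration rather than the code from the site: the JSON layout, the field names and the diff_candidates() helper are all assumptions, and a plain hash stands in for the database. It uses JSON::PP (the core implementation of the JSON.pm interface) so that it runs without extra modules.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical API response for one constituency. The real YourNextMP
# JSON layout is not shown in the article, so these keys are assumptions.
my $json = <<'END_JSON';
{
  "constituency": "Example Constituency",
  "candidates": [
    { "id": 101, "name": "Alice Example", "twitter": "alice_example" },
    { "id": 102, "name": "Bob Sample",    "twitter": "bob_sample_mp" },
    { "id": 104, "name": "Dana Newcomer", "twitter": "dana_newcomer" }
  ]
}
END_JSON

# What the local database currently holds, keyed by candidate id.
my %stored = (
    101 => { name => 'Alice Example', twitter => 'alice_example' },
    102 => { name => 'Bob Sample',    twitter => 'bob_sample' },
    103 => { name => 'Carol Former',  twitter => 'carol_former' },
);

# Work out which candidate ids need inserting, updating or deleting.
sub diff_candidates {
    my ($stored, $fetched) = @_;
    my %fetched = map { $_->{id} => $_ } @$fetched;
    my (@insert, @update);

    for my $id (sort { $a <=> $b } keys %fetched) {
        my $new = $fetched{$id};
        my $old = $stored->{$id};
        if (!$old) {
            push @insert, $id;
        }
        elsif ($old->{name} ne $new->{name} || $old->{twitter} ne $new->{twitter}) {
            push @update, $id;
        }
    }

    # Candidates that have vanished from the API must be removed as well.
    my @delete = sort { $a <=> $b } grep { !exists $fetched{$_} } keys %$stored;

    return { insert => \@insert, update => \@update, delete => \@delete };
}

my $data    = decode_json($json);
my $changes = diff_candidates(\%stored, $data->{candidates});

# Any change at all means the constituency's Twitter list needs attention,
# which is when 'candidates_updated_time' would be set.
my $candidates_updated = (grep { @$_ } values %$changes) ? 1 : 0;

printf "%-6s %s\n", $_, join(',', @{ $changes->{$_} }) for qw(insert update delete);
```

Keeping the three kinds of change separate makes it harder to forget one of them, such as the deletions mentioned in the next paragraph.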

In the early days of the campaign, candidate lists can be in quite a state of flux and this showed up a couple of interesting bugs in this program. Firstly, an early version of the program forgot to check for candidates that needed to be deleted from the list. It didn't take long to find and fix that one. Secondly, whilst a candidate can only stand in one constituency, it is surprisingly common for a candidate to move from one constituency to another. Initially I handled this as a deletion and an insertion, but as the constituencies were being processed in alphabetical order I occasionally found situations where I tried to insert a new candidacy before deleting the old one. This broke a unique index constraint and the insertion would fail. Switching to use DBIC's update_or_create() method fixed this problem.
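To make the ordering problem concrete, here is a small simulation using only core Perl. It is a sketch under stated assumptions: a hash keyed by candidate id stands in for the candidate table and its unique index, the constituency and candidate names are invented, and update_or_create_candidacy() only imitates the update-if-present, create-if-absent behaviour rather than calling DBIx::Class.

```perl
use strict;
use warnings;

# In-memory stand-in for the candidate table: one row per candidate id,
# which models the unique index described above.
my %candidate_row = ( 7 => { name => 'Pat Mover', constituency => 'Zedtown' } );

# Plain insert: dies if the candidate already has a row.
sub insert_candidacy {
    my ($id, $name, $constituency) = @_;
    die "unique constraint violated for candidate $id\n"
        if exists $candidate_row{$id};
    $candidate_row{$id} = { name => $name, constituency => $constituency };
    return;
}

# Upsert: update the existing row if there is one, otherwise create it.
sub update_or_create_candidacy {
    my ($id, $name, $constituency) = @_;
    my $action = exists $candidate_row{$id} ? 'updated' : 'created';
    $candidate_row{$id} = { name => $name, constituency => $constituency };
    return $action;
}

# Remove rows for a constituency whose candidates are no longer standing there.
sub delete_missing {
    my ($constituency, @still_standing) = @_;
    my %keep = map { $_ => 1 } @still_standing;
    for my $id (keys %candidate_row) {
        next unless $candidate_row{$id}{constituency} eq $constituency;
        delete $candidate_row{$id} unless $keep{$id};
    }
    return;
}

# Pat has moved from Zedtown to Ayford. Constituencies are processed
# alphabetically, so Ayford's insert runs before Zedtown's delete.
my $insert_ok    = eval { insert_candidacy(7, 'Pat Mover', 'Ayford'); 1 };
my $insert_error = $insert_ok ? '' : $@;
print "Plain insert failed: $insert_error" unless $insert_ok;

# The upsert copes with the row already being there.
my $action = update_or_create_candidacy(7, 'Pat Mover', 'Ayford');
print "Upsert $action candidate 7\n";

# When Zedtown is processed later, Pat's row no longer belongs to it.
delete_missing('Zedtown');
print "Candidate 7 now stands in $candidate_row{7}{constituency}\n";
```

In this model the later clean-up for the old constituency only touches rows that still point at it, so the moved candidate's row survives.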

The second program on the list - the one to update the Twitter lists - was more interesting as I had never before written very complex code for interacting with the Twitter API. Luckily, we have Net::Twitter which makes working with Twitter from Perl pretty simple. I subclassed this module, adding an authorise() method which hid away the (necessarily) complex OAuth dance that all Twitter applications have to go through.

As well as the 'candidates_updated_time' column that I mentioned above, the constituency table also has a 'list_rebuilt_time' column which is updated every time that I change the Twitter list for that constituency. If 'candidates_updated_time' is earlier than 'list_rebuilt_time' then I know I can skip processing that constituency as there are no new changes to deal with.
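The skip test boils down to comparing two timestamps. The following is a minimal sketch, assuming both columns hold epoch seconds and that an empty column comes back as undef; the needs_rebuild() helper and the handling of missing values are assumptions, not something the article specifies.

```perl
use strict;
use warnings;

# Decide whether a constituency's Twitter list needs rebuilding.
# Both times are assumed to be epoch seconds (or undef if never set).
sub needs_rebuild {
    my ($constituency) = @_;
    my $updated = $constituency->{candidates_updated_time};
    my $rebuilt = $constituency->{list_rebuilt_time};

    return 0 unless defined $updated;    # no candidate changes recorded yet
    return 1 unless defined $rebuilt;    # changes exist but list never built
    return $updated >= $rebuilt ? 1 : 0; # skip only if changes are older
}

my @constituencies = (
    { name => 'Ayford',  candidates_updated_time => 1_000, list_rebuilt_time => 2_000 },
    { name => 'Beeham',  candidates_updated_time => 3_000, list_rebuilt_time => 2_000 },
    { name => 'Zedtown', candidates_updated_time => 3_000, list_rebuilt_time => undef },
);

my @todo = map { $_->{name} } grep { needs_rebuild($_) } @constituencies;
print "Lists to rebuild: @todo\n";
```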

Dealing with Twitter can be complicated. One of the hardest problems is that each Twitter application is only allowed to make a certain number of requests in a given time period before you find your requests being blocked. There are two approaches to dealing with this. You can keep a count of the number of requests you have made and stop working when you think you're about to hit the limit. Or you can take the easier approach. I took the easier approach.

The easier approach is to keep making requests until Twitter sends you a response saying that you have gone over your limit. At that point you stop work. This is easier as you let Twitter do all the housekeeping about how many requests you have made. As a result, most of my Twitter calls are wrapped inside one big try/catch block (using Try::Tiny, of course). When we get an error, we examine it and work out the best steps to take. By trial and error I discovered that error codes of 403 and 429 mean that you have gone over some request limit, so at that point I kill my program.
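The "keep going until told to stop" approach might look something like this. It is only a sketch: to stay runnable with core Perl it uses eval and die instead of Try::Tiny, and the Sketch::Client and Sketch::ApiError classes are invented stand-ins for the real Twitter client and its exception class. The sketch ends the loop where the real program exits.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Minimal stand-in for an HTTP-level API error; only code() is modelled.
package Sketch::ApiError {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub code { $_[0]{code} }
}

# Fake client that allows a fixed number of calls, then reports 429.
package Sketch::Client {
    sub new {
        my ($class, %args) = @_;
        return bless { remaining => $args{limit} }, $class;
    }
    sub add_list_member {
        my ($self, $list, $user) = @_;
        die Sketch::ApiError->new(code => 429) if $self->{remaining}-- <= 0;
        return 1;
    }
}

package main;

# HTTP codes that the article found to mean "over some request limit".
my %RATE_LIMIT_CODE = map { $_ => 1 } (403, 429);

# Work through the queue until it is empty or the API says "enough".
sub process_queue {
    my ($client, @queue) = @_;
    my $done = 0;
    for my $job (@queue) {
        my $ok = eval { $client->add_list_member(@$job); 1 };
        if ($ok) {
            $done++;
            next;
        }
        my $err = $@;
        if (blessed($err) && $err->can('code') && $RATE_LIMIT_CODE{ $err->code }) {
            last;    # over the limit: leave the rest for the next run
        }
        die $err;    # anything else is unexpected, so re-throw it
    }
    return $done;
}

my $client = Sketch::Client->new(limit => 3);
my @queue  = map { [ 'example-list', "user$_" ] } 1 .. 5;
my $added  = process_queue($client, @queue);
print "Added $added of ", scalar(@queue), " members before stopping\n";
```

The appeal of this design is that the client keeps no request counters of its own; the only state is whatever work is left in the queue.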

Twitter errors are interesting. As you query the API over HTTP, a lot of the errors are HTTP response codes. But sometimes you get a more detailed Twitter-specific error code as well. Net::Twitter includes an exception class, Net::Twitter::Error, which encapsulates all of this complexity. You can simply call the has_twitter_error() method to see if you have one of these errors and twitter_error_code() and twitter_error_text() will give you the details.

I got a couple of interesting values in the Twitter error code. 108 means that you are trying to add a non-existent user to a Twitter list. This usually means that the user has changed their Twitter username and the data in YourNextMP is out of date. In this case, it's simple to find the updated name and update YourNextMP to the correct value - which will filter through to my system on the next run.

The other interesting error code I got was 106. This means that the user has blocked my account (or, rather, the @twittelection account that runs the application) so that I can't add them to Twitter lists. It's strange that a parliamentary candidate would block a user which exists simply to make their tweets easier for constituents to find, but it has happened twice during the campaign. In both cases, the candidate represented the UK Independence Party. I'm not sure what that says about their social media strategy!
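Putting the HTTP codes and the Twitter-specific codes together, the decision logic could be sketched like this. The Sketch::TwitterError class below is a stand-in that borrows the has_twitter_error() and twitter_error_code() method names mentioned above; its code() accessor for the HTTP status, the classify_error() helper and the returned labels are assumptions made for illustration, and the HTTP codes paired with 108 and 106 in the examples are arbitrary placeholders.

```perl
use strict;
use warnings;

# Stand-in exception object with the accessors used by classify_error().
package Sketch::TwitterError {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub code               { $_[0]{http_code} }
    sub has_twitter_error  { defined $_[0]{twitter_code} ? 1 : 0 }
    sub twitter_error_code { $_[0]{twitter_code} }
}

package main;

# Map an error object to the action the list-updating program should take.
sub classify_error {
    my ($err) = @_;

    # Look at the Twitter-specific code first, as it is more precise.
    if ($err->has_twitter_error) {
        my $code = $err->twitter_error_code;
        return 'username_changed' if $code == 108;    # user does not exist
        return 'blocked_by_user'  if $code == 106;    # account has blocked us
    }

    # Fall back to the HTTP status for request-limit problems.
    return 'rate_limited' if $err->code == 403 || $err->code == 429;
    return 'unknown';
}

my @examples = (
    Sketch::TwitterError->new(http_code => 400, twitter_code => 108),
    Sketch::TwitterError->new(http_code => 400, twitter_code => 106),
    Sketch::TwitterError->new(http_code => 429),
    Sketch::TwitterError->new(http_code => 500),
);

my @actions = map { classify_error($_) } @examples;
print "$_\n" for @actions;
```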

The final program I wrote was one that generated the web site from my local database. This was a pretty standard program - pulling data from the database and then using the Template Toolkit to build the various pages of the site. As well as the 650 constituency pages, there were also a few other pages - the home page, a constituency index, an about page and a page of statistics. This stats page was the most fun to write as I got to write some interesting DBIC queries to extract the information that I wanted to display.
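The shape of a static-site build like this is one output file per constituency. The sketch below illustrates that loop with core Perl only: a heredoc replaces the Template Toolkit, a hard-coded array replaces the database, and the slugs, names and page layout are invented. The real pages embed a Twitter List Widget, which is not reproduced here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical data that would normally come from the local database.
my @constituencies = (
    { slug => 'ayford',  name => 'Ayford',  candidates => [ 'alice_example', 'bob_sample' ] },
    { slug => 'zedtown', name => 'Zedtown', candidates => ['carol_former'] },
);

# Build the HTML for one constituency page. No HTML escaping is done,
# so this is only safe for the plain ASCII sample data above.
sub render_page {
    my ($c) = @_;
    my $items = join '', map { "      <li>\@$_</li>\n" } @{ $c->{candidates} };
    return <<"END_HTML";
<html>
  <head><title>$c->{name} - TwittElection</title></head>
  <body>
    <h1>$c->{name}</h1>
    <ul>
$items    </ul>
  </body>
</html>
END_HTML
}

# Write one static file per constituency into a scratch directory.
my $out_dir = tempdir(CLEANUP => 1);
my @written;
for my $c (@constituencies) {
    my $path = File::Spec->catfile($out_dir, "$c->{slug}.html");
    open my $fh, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
    print {$fh} render_page($c);
    close $fh or die "Cannot close $path: $!";
    push @written, $path;
}

print "Wrote ", scalar(@written), " pages to $out_dir\n";
```

Because the output is just a directory of files, publishing is a matter of committing that directory to the repository that the static host serves.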

The site was launched 100 days before the election. A lot of people who saw the site seemed to find it useful - I got some nice compliments from people - but the audience was rather smaller than I hoped it would be. As is often the case, the problem came down to marketing. I just didn't dedicate enough time to telling people about the existence of the site.

A small number of people found the site useful though. And I look forward to doing something similar for the next general election in 2020. The code is available on GitHub and a lot of it would be reusable by any site that wanted to maintain a large number of themed Twitter lists. If anyone wanted to use it for a similar project, I'd be very interested to hear about it.

