By Stephen Fisher, John Kenny and Rosalind Shorrocks.
Since our first forecast combining different indicators of the election outcome last week, the Conservatives and particularly Labour have edged up in the polls at the expense of UKIP and the Liberal Democrats. The crucial Conservative-Labour lead in the polls this week has varied between 13 and 20 points. On average the lead has narrowed by just under two points to 16.5.
In line with this movement in the polls, the predicted Conservative majority from the betting markets and the simple models has dropped, but not, intriguingly, in the complex model forecasts. On average, our combined forecast is for a Tory majority of 123; down a little from 132 last week but still a big win.
The combined probability of a Conservative majority, at 91%, has remained unchanged from last week. However, the probability of a Conservative landslide has taken a small dip to a 64% chance (down from 71%). This drop is almost entirely due to Labour’s better performance in more recent polls, which means fewer polls now show the Conservative lead of 16 points or more that is required for a 100+ majority (the basis of our pseudo-probability from the polls).
The narrowing of the Conservative lead over Labour is largely thanks to the three most recent polls which have leads of 13, 14, and 15; lower than all but two of the other polls since the local elections. These recent polls were conducted after the (leaked) Labour manifesto was published but before the Conservative manifesto was released, and so may represent a temporary respite for Labour.
The betting markets are still expecting a bigger Conservative lead than the polls are showing, and this is clearly not because they are just over-reacting to the most recent information.
Three of the four forecasting models with vote shares are expecting the Tories to underperform their polling averages, despite the party’s historical tendency to do better than the polls suggest. However, the statistical models are no longer predicting that Labour will outperform their polling numbers. This might be because Labour’s share in the polls is so close to their 2015 vote share that no reversion to past performance is predicted.
What follows is a more detailed description of our method than we managed to write last week. The main change in method this week is that, instead of using the most recent single poll from each pollster to calculate probabilities, we use the two most recent polls from each pollster within the last two weeks. We have also changed some of our sources. The additional sources are Lord Ashcroft’s forecast models, a new citizen forecast, and a new polling average. The sources we have dropped because they have not been updated recently enough are the polling averages from Polling Observatory and Adrian Kavanagh.
The basic approach is to combine forecasts by averaging them within each category and then taking the average across categories. Since the different sources do not all present equally clear figures that can be averaged on a like-for-like basis, we have made various judgement calls on how to treat the data.
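As a rough illustration, the two-stage averaging can be sketched in Python as follows. The function name, category labels and seat-majority figures are invented for the example and are not the actual inputs to the forecast.

```python
from statistics import mean


def combine_forecasts(forecasts_by_category):
    """Average within each category, then average the category means.

    `forecasts_by_category` maps a category label to a list of numeric
    forecasts (for example, predicted Conservative majorities).
    """
    category_means = {
        category: mean(values)
        for category, values in forecasts_by_category.items()
        if values  # ignore categories with no usable forecasts
    }
    combined = mean(list(category_means.values()))
    return combined, category_means


# Illustrative figures only, not the real forecasts.
example = {
    "betting markets": [120, 130],
    "simple models": [140],
    "complex models": [100, 110, 120],
}
combined, by_category = combine_forecasts(example)
print(by_category)
print(combined)
```

Averaging within categories first means that a category with many sources (such as the models) does not outweigh a category with only one or two.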
Historically the idea of combining forecasts from different sources has had a good track record, though it has to be admitted that our attempt to do one for the EU referendum did not work out well. Most recently the pollyvote.com combined forecast of the US presidential election last year was 2 points out on the share of the vote.
For vote shares, we use the various available polling averages, or ‘polls of polls’, and take their average. We exclude any polling average whose most recently published figures are more than a week old. There are seven different polling averages. They are in truth nowcasts rather than forecasts, but we are in effect treating them as forecasts. Some admittedly are quite sophisticated, allowing for pollster (aka house) effects, but they are nonetheless estimates of current public opinion and not future votes.
We do not attempt to say what seats outcome is implied by polls (that is the job of the modellers). However, since statistical models are rarely if ever clear about the probabilities their models place on key events like a Conservative majority and a 100+ majority, we have included in a probabilities table some pseudo-probabilities from the polls. Taking the two most recent polls published by each pollster in the last two weeks we calculate the proportion showing a Conservative lead over Labour of 6 points or more as the pseudo-probability of a Conservative majority. Using the same polls, we use the proportion showing a Conservative lead over Labour of 16 points or more as the pseudo-probability of a Conservative landslide. These thresholds of 6 and 16 points are based on what would be required under uniform swing assumptions for the Conservatives to win a bare majority and a 100+ majority respectively.
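The pseudo-probability calculation described above can be sketched in Python along the following lines. The pollster names, dates and leads are hypothetical, and the helper names are ours for illustration only; the 6 and 16 point thresholds are the ones given in the text.

```python
from datetime import date, timedelta


def recent_polls(polls, today, per_pollster=2, window_days=14):
    """Keep each pollster's most recent polls inside the time window."""
    cutoff = today - timedelta(days=window_days)
    by_pollster = {}
    for poll in polls:
        if poll["date"] >= cutoff:
            by_pollster.setdefault(poll["pollster"], []).append(poll)
    selected = []
    for items in by_pollster.values():
        items.sort(key=lambda p: p["date"], reverse=True)
        selected.extend(items[:per_pollster])
    return selected


def pseudo_probability(leads, threshold):
    """Proportion of polls with a Conservative lead of at least `threshold`."""
    if not leads:
        raise ValueError("need at least one poll")
    return sum(lead >= threshold for lead in leads) / len(leads)


# Hypothetical polls: 'lead' is the Conservative lead over Labour in points.
polls = [
    {"pollster": "A", "date": date(2017, 5, 18), "lead": 13},
    {"pollster": "A", "date": date(2017, 5, 12), "lead": 18},
    {"pollster": "A", "date": date(2017, 5, 8), "lead": 20},   # third most recent
    {"pollster": "B", "date": date(2017, 5, 16), "lead": 15},
    {"pollster": "B", "date": date(2017, 5, 1), "lead": 19},   # outside window
    {"pollster": "C", "date": date(2017, 5, 17), "lead": 16},
]
leads = [p["lead"] for p in recent_polls(polls, today=date(2017, 5, 19))]
p_majority = pseudo_probability(leads, threshold=6)
p_landslide = pseudo_probability(leads, threshold=16)
print(p_majority, p_landslide)
```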
There are numerous betting markets for the various outcomes in the election. We have taken those that are most helpful for the four forecasts we want to produce. For seat shares, we take the mid-point of the spread as the seat share, and average these mid-points between different sources. Note that the markets imply fewer seats in total than there actually are in the House of Commons. This is because the markets are separate for each party and do not need to be consistent collectively.
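A minimal sketch of the spread calculation, assuming each source quotes a sell and a buy price for a party's seat total. The quotes below are invented for the example.

```python
from statistics import mean


def spread_midpoint(sell, buy):
    """Mid-point of a spread-betting quote."""
    return (sell + buy) / 2


def market_seat_forecast(spreads):
    """Average the mid-points of the spreads quoted by different sources."""
    return mean(spread_midpoint(sell, buy) for sell, buy in spreads)


# Hypothetical (sell, buy) quotes for one party from two sources.
conservative_spreads = [(392, 398), (390, 396)]
seats = market_seat_forecast(conservative_spreads)
print(seats)
```

Because this is done party by party, nothing forces the resulting totals to sum to the number of seats in the Commons.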
For vote share, we use betting markets for the Conservatives, Labour and UKIP. (Vote share markets for other parties are unavailable on the betting market aggregation site Oddschecker.) Odds are given for 5-point ranges of vote share. We take a weighted sum of the mid-points of these ranges where the weights are the implied probabilities. For the top and bottom options we use 2.5 points above/below the upper/lower bound (e.g. 52.5 for “Above 50” and 27.5 for “Below 30”). The weighted sum is calculated just using the three categories with the largest implied probabilities, because the probabilities for other categories are so small and unstable. For UKIP we use just the two most likely categories.
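The weighted mid-point calculation can be sketched as below. Two things are assumptions on our part rather than stated above: that the implied probabilities of the retained categories are rescaled to sum to one before weighting, and the probabilities themselves, which are invented for the example.

```python
def band_midpoint(lower=None, upper=None, half_width=2.5):
    """Mid-point of a vote-share band; open-ended bands sit 2.5 points out."""
    if lower is None:
        return upper - half_width      # e.g. "Below 30" -> 27.5
    if upper is None:
        return lower + half_width      # e.g. "Above 50" -> 52.5
    return (lower + upper) / 2


def expected_share(bands, top_n=3):
    """Weighted mean of band mid-points over the `top_n` likeliest bands.

    `bands` is a list of (midpoint, implied_probability) pairs. The weights
    are rescaled to sum to one (an assumption of this sketch).
    """
    top = sorted(bands, key=lambda band: band[1], reverse=True)[:top_n]
    total = sum(prob for _, prob in top)
    return sum(mid * prob for mid, prob in top) / total


# Hypothetical implied probabilities for one party's vote-share bands.
bands = [
    (band_midpoint(lower=50), 0.25),            # "Above 50"
    (band_midpoint(lower=45, upper=50), 0.50),
    (band_midpoint(lower=40, upper=45), 0.25),
    (band_midpoint(lower=35, upper=40), 0.05),  # dropped: not in the top three
]
share = expected_share(bands, top_n=3)
print(share)
```

For UKIP the same function would be called with `top_n=2`.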
For the probability of a Conservative majority we give an average of the implied probability from sites offering this market. For the probability of a Conservative landslide, we use the combined prices on PredictIt that the Conservatives will win 370-379 seats, 380-389 seats, and 390 or more seats. This really represents a majority of 90 or more but that was as close as we could get to 100.
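A rough sketch of both market probabilities follows. It assumes that odds are quoted in decimal form (so the implied probability is one over the odds) and that "combined prices" means the sum of the three bracket prices, each expressed as a probability between 0 and 1. The site names, odds and prices are invented.

```python
from statistics import mean


def implied_probability(decimal_odds):
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


def majority_probability(decimal_odds_by_site):
    """Average implied probability across sites offering the market."""
    return mean(implied_probability(o) for o in decimal_odds_by_site.values())


def landslide_probability(bracket_prices):
    """Sum of the prices for the seat brackets at or above the threshold."""
    return sum(bracket_prices.values())


# Hypothetical inputs.
odds = {"site_a": 1.25, "site_b": 1.6}
brackets = {"370-379": 0.20, "380-389": 0.15, "390 or more": 0.30}

p_majority_market = majority_probability(odds)
p_landslide_market = landslide_probability(brackets)
print(p_majority_market, p_landslide_market)
```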
There are numerous statistical forecasting models this year (and more to come). We have divided them into two categories: simple (poll average plus uniform swing seats projection) and complex (anything more elaborate than the simple models, although they are not necessarily particularly complex). Some adjust for long-run differences between pollsters, some for constituency variation, and some estimate by how much the eventual result will differ from current polls. Chris Hanretty’s forecast at electionforecast.co.uk does all of these things. Within these categories we simply average the available estimates of seats and shares.
We should note that not all of the models are the modellers’ favourite. Some are counterpoints to their main models for comparison. We have included these on the basis that they are still talked of and expected to be reasonable estimates. For example, at Electoral Calculus, Martin Baxter has a model based on local election results. We have not excluded any models based on our judgement of quality, but they do have to be statistical models as opposed to personal guesses.
The citizen forecasts come from the Times Red Box sweepstake, where both podcast contributors and members of the public can make predictions about seats for the Conservatives and Labour.
Some polls ask people what they think that the outcome will be on June 8th. Different pollsters use different survey questions but they can be combined to generate pseudo-probabilities. We use the proportion of poll respondents who think the Conservatives will win/there will be a Conservative majority, excluding don’t knows and re-percentaging, as the pseudo-probability of a Conservative majority. We similarly use the proportion of poll respondents who think that there will be a Conservative landslide/the Conservatives will win more than 100 seats, for the probability of a Conservative landslide. Due to the limited number of polls these questions are asked in, we take results from the last two months.
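Excluding don't knows and re-percentaging amounts to the following. The response labels and percentages are made up for the illustration.

```python
def repercentage(responses, exclude=("Don't know",)):
    """Drop excluded answers and rescale the rest to sum to one."""
    kept = {answer: share for answer, share in responses.items()
            if answer not in exclude}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no substantive responses to re-percentage")
    return {answer: share / total for answer, share in kept.items()}


# Hypothetical poll on expected outcomes (percent of all respondents).
responses = {
    "Conservative majority": 60,
    "Other outcome": 20,
    "Don't know": 20,
}
rescaled = repercentage(responses)
p_majority_expectations = rescaled["Conservative majority"]
print(p_majority_expectations)
```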
Note: Estimates come from the morning of 19th May 2017. In all seat estimates, the Speaker’s seat is counted as Conservative.
The sources we used are listed below in no particular order. Please let us know of any that you think we have missed or misclassified. Some polling averages we know of were not included because they were more than a week old.
Complex Forecasting Models:
ElectionForecast.co.uk (Chris Hanretty)
Electoral Calculus (main and local election forecast)
PME Politics (Patrick English)
Nigel Marriot (Uniform Regional Swing + Tactical Voting Model)
Chris Prosser (GE vote shares from Local elections vote shares)
Lord Ashcroft (3 models based on different turnout estimates)
Polling Averages (less than a week old):
The Times Red Box Sweepstake
The three authors are equal contributors and our names are in alphabetical order.