opinion polling 2.0?
With two days to go, Barack Obama stays ahead of John McCain in all opinion polls, even with the notorious margin of error factored in - roughly plus or minus 3 percentage points for evenly split voting intentions in polls of 1,000 respondents. Although some polls carried out over the last couple of days indicate a rebound in favour of John McCain, Barack Obama still enjoys a 6-percentage-point lead, on average, in nationwide polls, and estimates give his ticket with Joe Biden over 300 electoral votes.
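For readers wondering where the roughly 3-point figure comes from, here is a minimal sketch of the usual calculation. It assumes a 95% confidence level (z = 1.96) and simple random sampling, neither of which is stated above; real polls apply weighting that typically widens the margin somewhat. The function name is ours, for illustration only.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion.

    p: observed share (0.5 for evenly split voting intentions)
    n: number of respondents
    z: critical value; 1.96 is an assumption corresponding to 95% confidence
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Evenly split voting intentions in a poll of 1,000 respondents
moe = margin_of_error(p=0.5, n=1000)
print(f"{moe * 100:.1f} percentage points")  # prints "3.1 percentage points"
```

Note that this margin applies to each candidate's share separately; the uncertainty on the gap between two candidates is larger.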
As is often the case, some have begun to cast doubt on the accuracy of polling data, notably in light of the faulty 2004 exit polls, which initially indicated that John Kerry, the Democratic Party’s candidate, had defeated George W. Bush. Large discrepancies between the latest poll results (gaps of up to 10 points) also raise doubts. These can be attributed to the difficulties opinion research firms face when deciding how to adjust the raw data they get from their phone or face-to-face interviews: the traditional turnout models they use might not be relevant in 2008, as the candidates seem to have energized and mobilized likely voters as well as registered voters. If the French experience in this respect is of any relevance, we should assume the opinion research firms are doing their best to get things right this time (the French polling institutes were accused of failing to properly capture the far-right candidate’s surge in the last days of the 2002 presidential election race and his presence in the second and final round, whereas their results were much more accurate in 2007).
Speaking of the 2007 French presidential election, you should know that linkfluence had set up a website to monitor political opinion on the web, observatoire presidentielle, akin to PW08, which yielded data accurate to within one percentage point in the days prior to the first round of the election (the French presidential election is held by direct universal suffrage in two rounds). And how, you ask, did we achieve such accuracy? Well, we simply monitored the share of voice of each candidate in the French political web and observed, after election day, that those figures were almost identical to the actual ballot results. Before going any further, I should stress that we do not by any means claim this method is, for now at least, a valid statistical model that should replace, as is, other quantitative polling data. Yet, using this “share of voice” principle, we thought we might share with you some of the insight these data provide.
The following figures and graphs were produced from the conversations of the U.S. political web - a dataset of 4,000+ websites - between 1 August 2008 and 31 October 2008.
shift in key topics on the public agenda
This won’t come as a surprise, yet it is interesting to note that the political web has been very responsive (some might say proactive) to the shift in focus among the public policy issues that make up the public agenda and frame voters’ intentions. Over the 3-month stretch from August to October, the aggregate topic “foreign policy (Iran) / war (Iraq) / national security” and the “economy” have been the two main topics on the agenda, far above the other ones we monitored. Please note that each topic has been semantically refined and has thus been followed using a set of 20 to 40 relevant keywords that best characterize, in the political web, the topic at hand.
public policy topics share of voice August-October (results weighted with the linkfluence score of each article)
However, as the two following graphs demonstrate, the Iran + Iraq + National Security topic aggregate was still center stage in the public debate in August, and might then have been deciding the fate of the election. Yet the Economy came to the fore with the first signs of a large and widespread financial crisis in September (see the Economy’s spike on the previous graph, and note the short-lived spike in interest for national security on 9/11), and the Economy topic clearly occupied most of the political web agenda in October.
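To make the idea of a weighted topic share of voice more concrete, here is a rough sketch of how such a figure could be computed. It is not our production pipeline: the keyword sets are hypothetical stand-ins for the 20 to 40 keywords actually used per topic, substring matching is a simplification, and `score` is just a number standing in for the linkfluence score of an article.

```python
from collections import defaultdict

# Hypothetical keyword sets, for illustration only
TOPIC_KEYWORDS = {
    "economy": {"economy", "financial crisis", "bailout"},
    "national security": {"iraq", "iran", "national security"},
}

def topic_shares(articles):
    """Return each topic's weighted share of voice.

    `articles` is an iterable of (text, score) pairs. An article that
    mentions a topic adds its score to that topic's total; an article
    can count towards several topics.
    """
    totals = defaultdict(float)
    for text, score in articles:
        lowered = text.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                totals[topic] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {topic: weight / grand_total for topic, weight in totals.items()}

# Made-up sample articles with made-up scores
sample = [
    ("The bailout dominates talk about the economy.", 3.0),
    ("Debate over Iraq troop levels continues.", 1.0),
    ("Iran and the financial crisis in one speech.", 1.0),
]
shares = topic_shares(sample)
for topic, share in sorted(shares.items()):
    print(f"{topic}: {share:.0%}")
```

With this sample, the economy gets two thirds of the weighted voice and national security one third, because the highest-scored article only discusses the economy.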
top three topics in August (results weighted with the linkfluence score of each article)
top three topics in October (results weighted with the linkfluence score of each article)
These data are in line with what most opinion polls have found so far regarding the most prominent topics on voters’ minds. Hence, it is safe to say that the candidate with an edge on his opponent with respect to the Economy should enjoy a strong advantage as we enter the actual voting phase - which has in fact already begun thanks to early voting, with up to a tenth of registered voters having already cast their ballots.
candidates’ qualified shares of voice
We could measure each candidate’s or each ticket’s share of voice in the U.S. political web and rely solely on these data to gain some insight. This would yield the following for the month of October.
two main tickets’ shares of voice in October (results weighted with the linkfluence score of each article)
The Republican ticket has been discussed more than the Democratic one in October. To be sure, Sarah Palin’s iconoclastic political character and experience as Governor of Alaska - and shopping spree - have helped her secure a substantial share of voice in the U.S. political web. Now, these raw figures don’t seem to align with the latest opinion polls giving Barack Obama a 5 to 7 point lead nationwide. How about the shares of voice of the two presidential nominees?
two main candidates’ shares of voice in October (results weighted with the linkfluence score of each article)
Barack Obama has had a 58% share of voice, compared to John McCain’s 42%, in the October conversations of the U.S. political web. This 16-percentage-point gap is much wider than what most traditional opinion polls have shown in the last few weeks, where the gap between Obama and McCain has been in the 5 to 8 point range. This is why we also rely on what we call qualified shares of voice, i.e. the share of voice of each candidate in association with the main public policy issues framing the debate. These are not just raw data, but data that indicate how much each candidate has been discussed along with the 9 main policy issues we have been monitoring. And the results are a bit different there:
two main candidates’ qualified shares of voice in October (results weighted with the linkfluence score of each article)
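The difference between a raw and a qualified share of voice can be sketched as follows. This is only an illustration under assumptions of ours: a candidate counts as discussed “in association with” an issue when both appear in the same article, the issue keywords are hypothetical, and `score` again stands in for the linkfluence score.

```python
CANDIDATES = ("obama", "mccain")

# Hypothetical issue keywords; the monitoring described above covers 9 policy issues
ISSUE_KEYWORDS = ("economy", "iraq", "health care")

def shares_of_voice(articles, qualified=False):
    """Return each candidate's weighted share of voice.

    `articles` is an iterable of (text, score) pairs. With qualified=True,
    only articles that also mention a policy issue are counted.
    """
    totals = dict.fromkeys(CANDIDATES, 0.0)
    for text, score in articles:
        lowered = text.lower()
        if qualified and not any(kw in lowered for kw in ISSUE_KEYWORDS):
            continue  # skip articles with no policy issue in them
        for name in CANDIDATES:
            if name in lowered:
                totals[name] += score
    grand_total = sum(totals.values())
    return {
        name: (weight / grand_total if grand_total else 0.0)
        for name, weight in totals.items()
    }

# Made-up sample articles with made-up scores
sample = [
    ("Obama rally draws a record crowd.", 2.0),
    ("Obama and McCain clash on the economy.", 1.0),
    ("McCain outlines his Iraq strategy.", 1.0),
    ("Obama unveils health care plan.", 1.0),
]
raw = shares_of_voice(sample)
qualified = shares_of_voice(sample, qualified=True)
```

In this made-up sample, the raw shares are two thirds for Obama and one third for McCain, while the qualified shares are even, because the highest-scored article mentions no policy issue and is left out.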
Obama’s lead over John McCain settles at around 9 percentage points for the month of October, closer to the upper end of the range of leads given to him by traditional opinion polls. This is a reversal of the trends of August (McCain +8%) and September (McCain +24%), which saw John McCain take center stage, thanks in large part to his choice of Sarah Palin as his running mate. Yet, with the shift in focus to the Economy and the short-lived Palin effect gone, October reverted to a strong advantage in favour of Obama, with this trend amplifying towards the end of the month (in the last week of October, Obama led by 20 percentage points).
Now it won’t be long before the American people make their choice, and we can then analyze all this in retrospect.
PS: Marcel Lebrun, on Media Philosopher, uses an alternative method to try and make use of the web’s opinion to guess the results of the presidential election. Using keywords such as “voting for” or “vote for”, and excluding those such as “don’t vote for” or “not voting for”, we ran a similar experiment using only the U.S. political web. As you’ll see, the trends are quite similar to those found by Marcel Lebrun, yet the gaps are narrower here.
two main candidates’ “vote for” / “voting for” charts (results weighted with the linkfluence score of each article, McCain is in blue, Obama is in yellow)
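The “vote for” / “voting for” filter described in the PS can be approximated with two regular expressions. This is a simplified sketch, not the exact rules we or Marcel Lebrun used: it only handles the phrases quoted above, matches candidates by last name only, and counts mentions without the linkfluence weighting applied in the charts.

```python
import re

# Phrases that signal a voting intention for one of the two candidates
INTENT = re.compile(r"\b(?:voting|vote) for (obama|mccain)\b", re.IGNORECASE)

# Negated phrases to exclude; both straight and curly apostrophes are accepted
NEGATED = re.compile(
    r"\b(?:don['’]t vote for|not voting for) (?:obama|mccain)\b",
    re.IGNORECASE,
)

def count_vote_mentions(text):
    """Count 'vote for' / 'voting for' mentions per candidate, skipping negations."""
    # Blank out negated phrases first so they cannot match INTENT
    cleaned = NEGATED.sub(" ", text)
    counts = {"obama": 0, "mccain": 0}
    for match in INTENT.finditer(cleaned):
        counts[match.group(1).lower()] += 1
    return counts

# Made-up example text
text = "I'm voting for Obama. Don't vote for McCain! My dad will vote for McCain."
counts = count_vote_mentions(text)
print(counts)  # prints {'obama': 1, 'mccain': 1}
```

Removing the negated phrases before counting is what keeps “don’t vote for McCain” from being read as support for McCain; other negations, such as “won’t vote for”, would need to be added to the exclusion list.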
Update: see FiveThirtyEight.com’s interesting post on the inclusion of cellphones in opinion polls and the resulting 9-point lead for Obama…

















