November 2nd, 2008

opinion polling 2.0?

With 2 days to go, Barack Obama stays ahead of John McCain in all opinion polls, even with the notorious margin of error factored in - roughly plus or minus 3 percentage points for evenly split voting intentions in polls of 1,000 respondents. Although some polls carried out over the last couple of days indicate a rebound in favour of John McCain, Barack Obama still enjoys a 6-percentage-point lead, on average, in nationwide polls, as well as estimates of over 300 electoral votes going to his ticket with Joe Biden.
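The 3-point figure quoted above follows from the usual margin-of-error formula for a proportion. Here is a minimal sketch, assuming simple random sampling and a 95% confidence level (z = 1.96), neither of which is stated in this post; the function name is ours.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion.

    p: observed share (0.5 for evenly split voting intentions)
    n: number of respondents
    z: critical value (1.96 is an assumed 95% confidence level)
    """
    return z * math.sqrt(p * (1.0 - p) / n)

moe = margin_of_error(p=0.5, n=1000)
print(f"{moe * 100:.1f} percentage points")  # prints "3.1 percentage points"
```

Real polls are weighted and adjusted, so their effective margin is usually somewhat wider than this textbook value.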

As is often the case, some have begun to cast doubt on the accuracy of polling data, notably in light of the faulty 2004 exit polls, which initially indicated that John Kerry, the Democratic Party’s candidate, had defeated George W. Bush. Large discrepancies between the latest poll results (gaps of up to 10%) also raise doubts. These can be attributed to the difficulties opinion research firms face when deciding how to adjust the raw data they get from their phone or face-to-face interviews: the traditional turnout models they use might not be relevant in 2008, since the candidates seem to have energized and mobilized likely voters in addition to registered voters. If the French experience is of any relevance here, we should assume the opinion research firms are doing their best to get things right this time (the French polling institutes had been accused of failing to properly capture the far-right candidate’s surge in the last days of the 2002 presidential election race and his presence in the second and final round, whereas their results were much more accurate in 2007).

Speaking of the 2007 French presidential election, you should know that linkfluence had set up a political web opinion watch website, observatoire presidentielle, akin to PW08. In the days prior to the first round of the election, it yielded data accurate to within one percentage point (the French presidential election is carried out using a two-round universal direct ballot mechanism). And how, you ask, did we achieve such accuracy? Well, we simply monitored the share of voice of each candidate in the French political web and observed, after election day, that those figures were almost identical to the actual ballot results. Before going any further, I should stress that we do not by any means claim this method is, for now at least, a valid statistical model that should replace, as is, other quantitative polling data. Yet, using this “share of voice” principle, we thought we might share with you some of the insight these data provide.
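For readers who want to picture the “share of voice” principle, here is a minimal sketch. It assumes that an article counts once for every candidate it names and that each article carries a weight (1.0 for a plain count); the function name, the matching rule and the sample posts are ours for illustration, not the actual linkfluence pipeline.

```python
from collections import defaultdict

def share_of_voice(articles, candidates):
    """Return each candidate's percentage of all (weighted) candidate mentions.

    `articles` is an iterable of (text, weight) pairs. A candidate is credited
    with an article's weight once if their name appears anywhere in the text.
    """
    totals = defaultdict(float)
    for text, weight in articles:
        lowered = text.lower()
        for name in candidates:
            if name.lower() in lowered:
                totals[name] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in candidates}
    return {name: 100.0 * totals[name] / grand_total for name in candidates}

# Made-up sample posts with placeholder candidates, for illustration only.
sample = [
    ("Candidate A unveils an economic plan", 1.0),
    ("Candidate B answers Candidate A on taxes", 1.0),
    ("Candidate A campaigns in the west", 2.0),
]
shares = share_of_voice(sample, ["Candidate A", "Candidate B"])
print(shares)  # {'Candidate A': 80.0, 'Candidate B': 20.0}
```

Plain substring matching is the simplest possible rule; a real system would need to handle nicknames, misspellings and namesakes.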

The following figures and graphs were produced from the conversations of the U.S. political web - a dataset of 4,000+ websites - between 1 August 2008 and 31 October 2008.

shift in key topics on the public agenda

This won’t come as a surprise, yet it is interesting to note that the political web has been very responsive (some might say proactive) to the shift in focus among the public policy issues that make up the public agenda and frame the voters’ intentions. Over the 3-month stretch from August to October, the aggregate topic “foreign policy (Iran) / war (Iraq) / national security” and the “economy” have been the two main topics on the agenda, far above the other ones we monitored - please note that each topic has been semantically refined and has thus been followed using a set of 20 to 40 relevant keywords that best characterize, in the political web, the topic at hand.
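To illustrate how a topic can be followed through a keyword set and weighted by an article score, here is a rough sketch. The keyword lists below are hypothetical stand-ins (the real sets hold 20 to 40 terms each), and the rule that an article counts once per topic is our assumption.

```python
import re

# Hypothetical keyword sets; the real ones are not published in this post.
TOPIC_KEYWORDS = {
    "economy": {"economy", "bailout", "unemployment"},
    "national security": {"iraq", "iran", "national security"},
}

def topics_in(text):
    """Return the topics that have at least one keyword present as a whole word or phrase."""
    lowered = text.lower()
    found = set()
    for topic, keywords in TOPIC_KEYWORDS.items():
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                found.add(topic)
                break  # one hit is enough to tag the article with this topic
    return found

def topic_shares(articles):
    """Weighted share (in %) of each topic across (text, score) pairs."""
    totals = {topic: 0.0 for topic in TOPIC_KEYWORDS}
    for text, score in articles:
        for topic in topics_in(text):
            totals[topic] += score
    grand_total = sum(totals.values())
    return {t: (100.0 * v / grand_total if grand_total else 0.0) for t, v in totals.items()}

# Made-up articles and scores, for illustration only.
sample = [
    ("The bailout dominates the economy debate", 3.0),
    ("Iraq withdrawal timeline questioned", 1.0),
]
print(topic_shares(sample))  # {'economy': 75.0, 'national security': 25.0}
```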

public policy topics share of voice August-October (results weighted with the linkfluence score of each article)

pw08 - topics - 3 months

However, as the two following graphs demonstrate, the Iran + Iraq + National Security topic aggregate was still centre stage in the public debate in August, and at that point might have been deciding the fate of the election. Yet, the Economy came to the fore with the first signs of a large and widespread financial crisis in September (see the Economy’s spike on the previous graph, and note the short-lived spike in interest for national security on 9/11), with the Economy topic clearly occupying most of the political web agenda in October.

top three topics in August (results weighted with the linkfluence score of each article)

pw08 - topics - august

top three topics in October (results weighted with the linkfluence score of each article)

pw08 - topics - october

These data are in line with what most opinion polls have found so far regarding the topics most prominent in voters’ minds. Hence, it is safe to say that the candidate with an edge over his opponent on the Economy should enjoy a strong advantage as we enter the actual voting phase - which has in fact already begun thanks to early voting, with up to a tenth of registered voters having already cast their ballots.

candidates’ qualified shares of voice

We could measure each candidate’s or each ticket’s share of voice in the U.S. political web and rely solely on these data to gain some insight. This would yield the following for the month of October.

two main tickets’ shares of voice in October (results weighted with the linkfluence score of each article)

pw08 - tickets - October

The Republican ticket has been discussed more than the Democratic one in October. To be sure, Sarah Palin’s iconoclastic political character and experience as Governor of Alaska - and shopping spree - have helped her secure a substantial share of voice in the U.S. political web. Now, these raw figures don’t seem to align with the latest opinion polls giving Barack Obama a 5 to 7 point lead nationwide. How about the shares of voice of the two presidential nominees?

two main candidates’ shares of voice in October (results weighted with the linkfluence score of each article)

pw08 - candidates - October

Barack Obama has had a 58% share of voice, compared to John McCain’s 42%, in the October conversations of the U.S. political web. This 16-percentage-point gap is much wider than what most traditional opinion polls have shown in the last few weeks, the gap between Obama and McCain being in the 5 to 8 points range. This is why we also rely on what we call qualified shares of voice, i.e. the share of voice of each candidate in association with the main public policy issues framing the debate. These are not just raw data, but data that indicate how much each candidate has been discussed along with the 9 main policy issues we have been monitoring. And the results are a bit different there:

two main candidates’ qualified shares of voice in October (results weighted with the linkfluence score of each article)

pw08 - candidates/topics - October-2

Obama’s lead over John McCain settles around 9 percentage points for the month of October, closer to the upper end of the range of leads given to him by traditional opinion polls. This is a reversal of the trends of August (McCain +8%) and September (McCain +24%), which saw John McCain take centre stage, thanks in large part to his choice of Sarah Palin as his running mate. Yet, with the shift in focus to the Economy and the short-lived Palin effect gone, October reverted to a strong advantage in favour of Obama, with this trend amplifying towards the end of the month (in the last week of October, Obama led by 20 percentage points).
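One way to picture a qualified share of voice is sketched below. It assumes articles have already been tagged with the candidates and policy topics they mention, and that a candidate is credited with the article’s score once per co-occurring topic; the `Article` structure, this crediting rule and the sample data are our assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Article:
    candidates: frozenset  # candidates mentioned in the article
    topics: frozenset      # policy topics detected in the article
    score: float           # influence weight of the article

def qualified_shares(articles, candidates):
    """Credit each candidate with the article score once per topic they co-occur with."""
    totals = {name: 0.0 for name in candidates}
    for article in articles:
        for name in article.candidates:
            if name in totals:
                totals[name] += article.score * len(article.topics)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return dict(totals)
    return {name: 100.0 * value / grand_total for name, value in totals.items()}

def lead_in_points(shares, leader, trailer):
    """Gap between two shares, in percentage points."""
    return shares[leader] - shares[trailer]

# Made-up articles with placeholder candidates "A" and "B".
sample = [
    Article(frozenset({"A"}), frozenset({"economy"}), 1.0),
    Article(frozenset({"A", "B"}), frozenset({"economy", "health care"}), 1.0),
    Article(frozenset({"B"}), frozenset(), 5.0),  # no policy topic, so it adds nothing
]
shares = qualified_shares(sample, ["A", "B"])
print(shares, lead_in_points(shares, "A", "B"))
```

Note how the third article, which would weigh heavily in a raw share of voice, contributes nothing here because it is not tied to any policy issue.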

Now it won’t be long before the American people make their choice, and we can then analyze all this in retrospect.

PS: Marcel Lebrun, on Media Philosopher, uses an alternative method to try and make use of the web’s opinion to guess the results of the presidential election. Using keywords such as “voting for” or “vote for”, and excluding those such as “don’t vote for” or “not voting for”, we ran a similar experiment restricted to the U.S. political web. As you’ll see, the trends are quite similar to those found by Marcel Lebrun, yet the gaps are narrower here.

two main candidates’ “vote for” / “voting for” charts (results weighted with the linkfluence score of each article, McCain is in blue, Obama is in yellow)

pw08 - candidates - votefor - October

pw08 - candidates - votefor - September

pw08 - candidates - votefor - August
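The “vote for” / “voting for” counting with exclusions described in the PS can be approximated as follows. This is a hedged sketch: the regular expressions, the function name and the placeholder surnames are ours, and only the two negated forms quoted above are filtered out.

```python
import re

# Negated forms ("don't vote for", "not voting for", ...) are stripped first
# so that they cannot be counted as support.
NEGATED = re.compile(r"\b(?:don't|not)\s+(?:vote|voting)\s+for\b", re.IGNORECASE)

def count_vote_for(texts, candidate):
    """Count non-negated "vote for <candidate>" / "voting for <candidate>" phrases."""
    positive = re.compile(
        r"\b(?:vote|voting)\s+for\s+" + re.escape(candidate) + r"\b", re.IGNORECASE
    )
    total = 0
    for text in texts:
        normalized = text.replace("’", "'")  # treat curly apostrophes like straight ones
        cleaned = NEGATED.sub(" ", normalized)
        total += len(positive.findall(cleaned))
    return total

# Made-up sentences with placeholder surnames, for illustration only.
texts = [
    "I'm voting for Smith",
    "Please don’t vote for Smith",
    "I am not voting for Smith, I will vote for Jones",
    "Vote for Smith!",
]
print(count_vote_for(texts, "Smith"), count_vote_for(texts, "Jones"))  # 2 1
```

Sarcasm, quotations and longer negations (“I would never vote for…”) slip through a filter this simple.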

Update: see FiveThirtyEight.com’s interesting post on the inclusion of cellphones in opinion polls and the resulting 9-point lead for Obama…

September 28th, 2008

by popular demand: “seeing political memes” goes public

Having received a good number of e-mails and comments asking us if the maps revealing the way McCain’s celebrity ad and Paris Hilton’s video response had spread throughout the U.S. political web territory in August were available on the web, we have decided to upload them to PW08. For a reminder of the story that underpins this technology and these examples, see our previous post - or what Techpresident has to say about it. To see for yourself, just follow the links:

- McCain’s video “linkspread”©

- Hilton’s video “linkspread”©

September 25th, 2008

seeing political memes : the viral spread of McCain’s & Hilton’s “celebrity” movies

“How cool is it to see a meme?”: that’s the question Philip Sheldrake asked in this must-read post (Can you see it? Making influence visible) to summarize a key concept for the future of Social Web Analytics: gathering the data is no longer the issue. “The next biggest challenge”, to paraphrase him, is about making the data (exponentially growing amounts of data) easily understandable and actionable for marcom professionals. That’s where information visualization kicks in.

“How cool is it to see a meme” then? Well, probably very cool, provided you can actually pin it down and make it show up on a screen. But memes are, to say the least, elusive and hardly predictable (but that could change…) in the way they spread like wildfire above and below the surface of the “visible” web. It’s a bit like stormchasing, although a lot safer.

We have actually been working on this very issue, to provide our clients with the ability not only to monitor the viral spread of a blog post or viral video, but to actually see it propagate from one site to the next, from within one community to the web at large. When you’re in the agency business, it’s one cool thing to be able to get the buzz going about a product; it’s an even cooler one to be able to show your client (and your client’s client) where, when and how it went viral.

Having built the most comprehensive map of the US political web for the 2008 Personal Democracy Forum, we had an ideal dataset to overlay the spread of two of the most blogged-about videos of this electoral cycle: John McCain’s “Celebrity” attack ad, and Paris Hilton’s blockbuster response.

Naturally, the Hilton video propagated well beyond the limits of the “political web” (a dataset of the 4,000+ leading sites and blogs covering US politics). With over 2,700 direct links to the video (according to Google Blogsearch) and more than 3,000,000 views at the end of August, the Hilton response video dwarfs the stats of the initial McCain ad (as shown in the graph below).

Aside from these raw numbers, the animated visualization below provides us with a glimpse of the dynamics of propagation over time on the political web: who blogged about it first, and who picked up on it among progressive or conservative communities (with direct links to the post and authority ranks for each one of them). It is clear, from this viral propagation map, that Paris Hilton’s video, unsurprisingly, elicited more “buzz” within the U.S. political web than the original McCain ad.


linkfluence - pw08 - viral spread - McCain & Hilton videos from linkfluence on Vimeo.

But this is not just about creating cool animations. This type of data visualization has, time and again, provided us (and our clients) with the ability to answer three (out of six) open questions asked by Philip Sheldrake in his post:

- “Who’s most likely to have started this rumour?” [all content is indexed and time-stamped, making it easy to spot the “fire-starter” blog at the onset of the animation* and track propagation henceforth]

- “Who or what is exerting most influence?” [everyone’s got their own ‘secret sauce’ to determine influence on the web. Ours is called the “linkfluence score”, which is essentially based on a site’s relative position of authority within its community (see this primer for more details)]

- “Who should we add to our list of key contacts / influencers?” [here again, visualization comes in handy: key influencers don’t exist in a vacuum, they are positioned at the center of their own community of readers and peers. They are, first and foremost, hubs of information absorption and dissemination, showing up as large ‘nodes’ (larger dots) in the social graph.]
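The first two answers above boil down to sorting time-stamped posts and ranking sites by score. Here is a minimal sketch under those assumptions; the record layout, site names, timestamps and scores are all made up, and the score is only a stand-in for the linkfluence score, whose actual computation is not described here.

```python
from datetime import datetime

# Hypothetical records: (site, time the post was indexed, influence score).
posts = [
    ("blog-b.example", datetime(2008, 8, 6, 9, 30), 12.0),
    ("blog-a.example", datetime(2008, 8, 5, 22, 15), 3.5),
    ("blog-c.example", datetime(2008, 8, 6, 14, 0), 40.0),
]

def propagation_order(records):
    """Sites sorted by the first time they were seen linking to the item."""
    first_seen = {}
    for site, seen_at, _score in records:
        if site not in first_seen or seen_at < first_seen[site]:
            first_seen[site] = seen_at
    return sorted(first_seen, key=first_seen.get)

def fire_starter(records):
    """The earliest site in the propagation order."""
    return propagation_order(records)[0]

def top_influencers(records, n=1):
    """The n sites with the highest score among those that relayed the item."""
    best = {}
    for site, _seen_at, score in records:
        best[site] = max(score, best.get(site, score))
    return sorted(best, key=best.get, reverse=True)[:n]

print(fire_starter(posts))     # blog-a.example
print(top_influencers(posts))  # ['blog-c.example']
```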

As to Sheldrake’s conclusion about the beauty of some visualization, well, we do our best, but no one could fault you if your preference went to watching the meme itself, especially one that’s wearing a skimpy swimsuit and shiny high heels ;-)

*In the case of the Celebrity and Paris Hilton videos, there is no single “fire-starter” website, as both videos received considerable paid and earned media exposure, both off-line and online. It should be noted, however, that the Progressive community, acting as an aggregate trigger of online discussions, moved faster and in the end displayed more interest than the Conservative community online.

June 24th, 2008

Thanks, community!

We’ve received lots of interesting, and sometimes constructively critical, comments on our map of the U.S. political webosphere. As we’re still at the Personal Democracy Forum, we haven’t had time to factor in everything, but thank you all. We have simply corrected a graphing error for Andrew Sullivan’s presence on the map. We actually had two websites for him, on both sides of the political spectrum. This has been amended and http://andrewsullivan.theatlantic.com/ is now the sole remaining website, slightly in the progressive community…

June 12th, 2008

Entering the general election season with a sneak preview

0806_PW08_Thumbnail

It seems that both the Republican party and the Democratic party have finally settled on their respective nominees. Once the vice presidential candidates have been chosen, the 2008 primary season will be definitively behind us.

At Presidential Watch 08 we also believe it’s more than time to shift our focus to the general election.

As a consequence, we have updated our map of the U.S. political webosphere for even more insight into the presidential race. Our new map, of which a small preview is shown here, will reveal:

- the 500 most influential websites of the U.S. political webosphere;

- the newcomers to this online territory;

- the locations of the candidates’ websites;

- a refined categorisation of websites that abandons traditional media categories (Mainstream Media vs. Social Media) to offer an accurate view of this territory as one of mainly partisan websites, with some playing the role of information pits (or infopits, but we’ll come back to that in a following post).

We have also prepared very interesting case studies that will reveal, on our maps, the presence of large policy issues in specific areas of the political webosphere or the existence of sub-communities among the Liberals and the Conservatives.

We will be showcasing all this and more at the Beyondbroadcast conference in Washington D.C., at the Personal Democracy Forum in NYC and on presidentialwatch08.com.

Stay tuned for an insightful approach to the general networked election!

April 17th, 2008

Political Cartography 2.0: an Interview with MaptheCandidates.com

As you’ve no doubt realized by now, we like maps, and of course we like people who also like cutting-edge, slightly puzzling internet maps. So who better than Chadwick Matlin and E. J. Kalafarski, founders of MapTheCandidates.com and hosts of a recent panel on “Political Cartography 2.0” at the 2008 Politics Online Conference to have a little chat about internet mapping:

- First of all, could you tell us a bit about Mapthecandidates.com: how did it come about? Are you planning to add new features for the general election?

MTC came about last summer from the realization that there was going to be massive amounts of data about presidential candidates available in the coming election, but no intuitive way of searching it. We wanted to provide a tool that could be useful in different ways to different demos: the “average joe” voter who might be wondering where their candidate is today, media analysts looking for trends in statistics, and political operatives trying to get a feel for the campaign map.

A lot of usability testing went into finding the most intuitive ways of viewing and manipulating the map. We thought of the sidebars as the “axes” of a graph, starting with lists of the candidates and locations as widgets that users should immediately recognize. With the explosion of the amount of data, ways of filtering and restricting the data obviously became more crucial, such as the inclusion of the timeline, a suggestion from a colleague. Tying in articles and videos came from our obvious love of new media, and we hope it’s been useful for people who can’t follow around their favorite candidate in person.

For the general election, we’re pursuing options for making MTC available for more than just the Presidential race. We’re envisioning a network of smaller MTC applications, maintained by regional Web sites and publications, creating a hierarchical database of data on local, state, and national races. The types of analysis that could come from such a detailed database could really be fascinating.

- You recently hosted a conference panel on “political cartography 2.0”. This is a brand new trend in digital politics, thanks in large part to mash-ups and new applications. Where do you see this trend going? Do you foresee broader implications for political and social communications, beyond campaign season?

We absolutely see this field expanding even after this election cycle is long over. One interesting statistic I learned recently is that Google estimates that 80% of data can be mapped in some way; there are huge amounts of data already collected that can be looked at in new ways we haven’t even tried yet. Maps are a tremendous tool because of their ubiquity; most users can recognize a map of their country, and thus immediately have a starting point for analysis of the data mapped on it; it’s a generous learning curve you don’t get with other interfaces.

Moving forward, I think the toughest problem is going to be standardization of the data. Data comes in so many different formats (and sometimes, such as in emailed correspondence from campaigns, completely unformatted) that formatting it logically will become the bottleneck very quickly as the mapping technologies themselves jump forward.

- Last but not least, could you recommend to our readers a few sites (beside yours) showing interesting demos of political cartography?

The Electoral Map is a fantastic blog/round-up of the most interesting maps on the Internet that we read frequently, and its author, Patrick Ottenhoff, was a participant on our panel at POLC. The tools generated by the development team at NYTimes.com, really a powerhouse of map creation, are very interesting also.

February 5th, 2008

Super Tuesday Watch Round-Up

Before Republican and Democratic voters from 24 States reach the voting booths on this Super Tuesday, we offer a brief round-up of the state of the political webosphere.

The progressive community online

In the final run-up before Super Tuesday, it seems the US progressive webosphere is trying to deal with all the polls, debates, campaign news and trails in the 20 or so States that will have a say today. As a result, large media outlets such as USA Today or the Washington Post receive a lot of attention and interest from bloggers. One might suppose that, whereas these media outlets have the expertise, money and manpower to cover all the issues at stake in an insightful and documented manner, bloggers don’t have a similar claim to exhaustiveness.
To put it in a nutshell, bloggers from the progressive community seem to react to, comment on and debate materials drawn from the ultimate common opinion ground: the Mass Media.

Obviously, websites (Dailykos, TPM, Huffington Post, etc.) within the progressive community that are more akin to participative media outlets than to individual blogs offer perspective and substance, in line with what mass media offer - and maybe with even more insight, see here for an interesting Op-Ed.

There are some interesting conversations about the whole primary process: the indecisive character of the Democratic primaries caused by the proportional rule and the part played by the superdelegates - of whom Hillary Clinton seems to have won more to her candidacy than Barack Obama has, although some say the winds might be changing.

The debate is not so much about which of Clinton or Obama is the best candidate; it is rather focused on who is running the best campaign. It seems the lack of fundamental policy differences between the two, and their focus on style (hope, change, experience) over substance, has led many in the webosphere, but not all (see here and there as well for examples of opinionated posts), to comment on the campaign rather than plainly endorse a candidate.

The buzz is clearly mounting on Barack Obama’s campaign. Although opinion polls show the candidates neck and neck, many pundits say he could overtake Hillary Clinton in some key States. As far as the Hispanic vote is concerned, a good number of media and social media sites believe the obstacle that stands in senator Obama’s way is not so much that his candidacy does not engage Hispanic voters but that he has yet to gain the widespread visibility that senator Clinton enjoys among this community - the Washington Post has offered a series of articles on this subject.

Hence, after raising an impressive 32 million dollars in January, senator Obama’s campaign seems to have made a smart move in deciding to spend over 10 million dollars on advertisements to raise awareness of his candidacy - see Obama’s SuperBowl ad on YouTube.

The webosphere is abuzz (here and there as well for instance) with the number of high-profile endorsements the Illinois senator is getting, both from superdelegates (the Kennedys) and from progressive movements or media (LA Times, MoveOn, LA Opinion). It should be noted that the political webosphere has emphasized those endorsements that may bridge the gap between Obama and the Hispanic community, notably in California.

Overall, Barack Obama’s candidacy seems to be gaining momentum and could thus emerge in a leading position after all the Super Tuesday States have had their say - not necessarily in terms of number of delegates, but with respect to his chances in the race, with favorable-looking primaries coming up in February. With respect to large States such as California, New York and Illinois, Barack Obama’s share of voice is often double that of Hillary Clinton in the US progressive webosphere. Once more, these figures should not be construed as opinion polls. However, they both reveal and exaggerate trends that have always been both hard to factor into traditional opinion polls and very important in deciding the outcome of an election.

The following chart from Pollster.com clearly shows the pendulum shifting from Clinton to Obama in opinion polls (with black dots representing the latest polls).

The conservative community online

To an external observer, a substantial part of the conservative online community forms a hunting party going after moderate Republicans, those whose credentials are not conservative enough. You already see where this is going. Unlike the progressive webosphere, there’s a clear divide between those who support Mitt Romney and those who are behind John McCain - with the former clearly holding the most visible online ground.

There’s a lot of media coverage for the acrimonious battle going on between the two Democratic candidates, yet it’s nothing compared to what is going on between McCain and conservative pundits. Influential websites in this community have unleashed hell against the not-so-conservative-to-them senator from Arizona (1, 2, 3, 4, etc.)

Conversely, those supporting McCain (occupying a smaller portion of the territory than their opponents) have gone negative on Romney (see here for instance). There are some who wonder what the witch hunt is all about, seeing McCain as a bona fide conservative - but they stand in the middle of the progressive community.

What does the debate tell us about Super Tuesday? Well, the “true conservative” label has been denied to John McCain by authoritative websites such as Michelle Malkin. As a result, Mitt Romney’s edge on the battleground seems to be one by default or, to say it in the words often used by bloggers and media alike, it rests upon his electability.

Notwithstanding the anti-McCain wave in the webosphere, McCain seems to remain way ahead in the polls. Does it mean that the most active online voices are not representative of the larger conservative base? Or does it mean that pollsters haven’t factored in the aversion towards McCain’s candidacy? We’ll soon have an answer to these questions.

January 29th, 2008

Has the fat lady sung in Florida?

Over the last few days, the Republican political webosphere has been closely following Florida’s primaries, especially with respect to former New York Mayor Rudy Giuliani’s standing. Largely commenting on the joint lead by John McCain and Mitt Romney, some believe Rudy Giuliani’s absence from the spotlight in January hampered his candidacy although he invested time, money and effort in Florida. In any case, voters are about to reach a verdict on whether Rudy Giuliani still stands a good chance in the presidential nomination process. As usual, you’ll find below the share of voice figures from the Republican political webosphere (last 7 days):

1. McCain 39%
2. Romney 30%
3. Huckabee 17%
4. Giuliani 14%

Although they don’t exactly match opinion polls, these figures may turn out to be accurate with respect to the order in which the candidates will emerge from the ballots.

January 8th, 2008

New Hampshire primaries, Obama and McCain

Now to the New Hampshire primaries. As we did for Iowa, we offer a brief analysis of each candidate’s voice share in the US political webosphere with respect to their standing in New Hampshire.

We have fine-tuned our metrics in order to exclude from Hillary Clinton’s voice share that of her husband, former President Bill Clinton, whose voice share remains substantial in the debate. Our previous analysis of the Iowa caucuses showed the voice shares of the Republican candidates pretty much matched the actual results. The discrepancy between the leading voice share (H. Clinton) and the ballot winner (B. Obama) for the Democrats can be explained by the aforementioned methodological glitch.
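As an illustration of the kind of fine-tuning mentioned above, one simple way to keep Bill Clinton out of Hillary Clinton’s count is to skip every “Clinton” that directly follows “Bill” or “President”. The pattern, function name and sample sentence below are ours and only sketch the idea, not the actual rule we applied.

```python
import re

# "Clinton" as a whole word, unless directly preceded by "Bill " or "President ".
NOT_BILL_CLINTON = re.compile(r"(?<!Bill )(?<!President )\bClinton\b")

def hillary_mentions(text):
    """Count "Clinton" mentions that are not explicitly attributed to Bill Clinton."""
    return len(NOT_BILL_CLINTON.findall(text))

# Made-up sentence, for illustration only.
sample = (
    "Bill Clinton campaigned for Hillary Clinton. "
    "Former President Clinton praised her. Clinton leads."
)
print(hillary_mentions(sample))  # 2
```

A bare “Clinton” that actually refers to Bill still gets through, so a rule like this reduces the glitch rather than eliminating it.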

Now, the online buzz matches the polls. On the liberal side, Obama is way ahead of Hillary Clinton and John Edwards. On the conservative side, the online trends are not as conspicuous as the polls. John McCain is leading by a narrow margin, with Romney and Huckabee closely following.

The political webosphere gives the following orders on each side:

Republicans:

  1. McCain
  2. Romney
  3. Huckabee

Democrats:

  1. Obama
  2. Clinton
  3. Edwards

For the French version of this post, see here.

January 3rd, 2008

Iowa Caucus: Huckabee and Clinton dominating the political web

With every poll showing the top-tier candidates neck and neck in Iowa, it is also interesting to take a glimpse into the web’s crystal ball. Following the predictions of Hitwise, let us consider what the US political web is saying about the Iowa caucus, with one research objective in mind: to analyze, and perhaps confirm (as we did during the last French presidential election), the correlation between each candidate’s “share of voice” on the web, opinion polls, and ballot results.

map pw08 december2007

Prior to unveiling the numbers, let us first go over the methodology. We have measured the number of quotes and mentions of each candidate in connection with mentions of the Iowa Caucus across our dataset of the 2,000 main sites and blogs of the US political web. To be fair, those quotes and references are not qualified in terms of positive or negative language; this is essentially a quantitative measure of the level of buzz on a select sample of the most politically active and influential sites and blogs on the US web.

democrat caucus iowa

Generally speaking, the web is highly reactive to news coverage and events, and thus acts as an amplifier, often yielding previous insights. When focusing exclusively on the chatter over the past 10 days, Hillary Clinton seems to dominate the Democratic side of our dataset of sites, with a 31% share of voice, closely followed by Barack Obama (29%) and John Edwards (26%).

When looking at Edwards’ share of voice stats over the past 2 months, one cannot help but notice his impressive online surge, apparently confirmed by recent investments in additional servers by the campaign (http://marcambinder.theatlantic.com/archives/2007/12/a_real_edwards_surge.php), albeit not sufficient to leap ahead of the top two Democratic contenders. Conversely, Hillary Clinton seems to have lost some virtual ground since November, considering she once peaked at a 42% share of voice between November 10 and November 19.

rep caucus iowa

On the Republican side of things, the matter isn’t straightforward either. In the last 10 days, the Republican political webosphere (all the Republican sites in our 2,000-site dataset) has confirmed the “Huckaboom” by putting him at the top of the charts with a 26% share of voice. Mitt Romney follows with a solid 24% while John McCain and Fred Thompson are lagging behind, with a 16% and a 13% share respectively - the latter two’s voice shares being slightly better than their standings in the Iowa polls. Again, if we take a look at the trends over the last 3 months, we’ll notice that Huckabee rose over his competitors at the beginning of December, both online and in the polls. Giuliani’s steady decline in the polls in the last months mirrors his diminishing share of voice among the Republican online community.

iowa rep polls

To sum it up, here is the share of voice for each one of the leading candidates:

Republicans

Mike Huckabee 26%
Mitt Romney 24%
John McCain 16%
Fred Thompson 13%

Democrats

Hillary Clinton 31%
Barack Obama 29%
John Edwards 26%

Now, let us wait for the first actual results.