April 17th, 2008

Political Cartography 2.0: an Interview with MapTheCandidates.com

As you’ve no doubt realized by now, we like maps, and of course we like people who also like cutting-edge, slightly puzzling internet maps. So who better to chat with about internet mapping than Chadwick Matlin and E. J. Kalafarski, founders of MapTheCandidates.com and hosts of a recent panel on “Political Cartography 2.0” at the 2008 Politics Online Conference?

- First of all, could you tell us a bit about MapTheCandidates.com: how did it come about? Are you planning to add new features for the general election?

MTC came about last summer from the realization that there was going to be massive amounts of data about presidential candidates available in the coming election, but no intuitive way of searching it. We wanted to provide a tool that could be useful in different ways to different demos: the “average Joe” voter who might be wondering where their candidate is today, media analysts looking for trends in statistics, and political operatives trying to get a feel for the campaign map.

A lot of usability testing went into finding the most intuitive ways of viewing and manipulating the map. We thought of the sidebars as the “axes” of a graph, starting with lists of the candidates and locations as widgets that users should immediately recognize. With the explosion of the amount of data, ways of filtering and restricting the data obviously became more crucial, such as the inclusion of the timeline, a suggestion from a colleague. Tying in articles and videos came from our obvious love of new media, and we hope it’s been useful for people who can’t follow around their favorite candidate in person.

For the general election, we’re pursuing options for making MTC available for more than just the Presidential race. We’re envisioning a network of smaller MTC applications, maintained by regional Web sites and publications, creating a hierarchical database of data on local, state, and national races. The types of analysis that could come from such a detailed database could really be fascinating.

- You recently hosted a conference panel on “political cartography 2.0”. This is a brand new trend in digital politics, thanks in large part to mash-ups and new applications. Where do you see this trend going? Do you foresee broader implications for political and social communications, beyond campaign season?

We absolutely see this field expanding even after this election cycle is long over. One interesting statistic I learned recently is that Google estimates that 80% of data can be mapped in some way; there are huge amounts of data already collected that can be looked at in new ways we haven’t even tried yet. Maps are a tremendous tool because of their ubiquity; most users can recognize a map of their country, and thus immediately have a starting point for analysis of the data mapped on it; it’s a generous learning curve you don’t get with other interfaces.

Moving forward, I think the toughest problem is going to be standardization of the data. Data comes in so many different formats (and sometimes, such as in emailed correspondence from campaigns, completely unformatted) that formatting it logically will become the bottleneck very quickly as the mapping technologies themselves jump forward.

- Last but not least, could you recommend to our readers a few sites (beside yours) showing interesting demos of political cartography?

The Electoral Map is a fantastic blog/round-up of the most interesting maps on the Internet that we read frequently, and its author, Patrick Ottenhoff, was a participant on our panel at POLC. The tools generated by the development team at NYTimes.com, really a powerhouse of map creation, are very interesting also.

February 5th, 2008

Super Tuesday Watch Round-Up

Before Republican and Democratic voters from 24 States reach the voting booths on this Super Tuesday, we offer a brief round-up of the state of the political webosphere.

The progressive community online

In the final run-up before Super Tuesday, it seems the US progressive webosphere is trying to deal with all the polls, debates, campaign news and trails in the 20 or so States that will have a say today. As a result large media outlets such as USA Today or the Washington Post receive a lot of attention and interest from bloggers. One might suppose that where these media outlets have the expertise, money and manpower to cover all the issues at stake in an insightful and documented manner, bloggers don’t have a similar claim to exhaustiveness.
To put it in a nutshell, bloggers from the progressive community seem to react to, comment on and debate materials drawn from the ultimate common opinion ground: the Mass Media.

Obviously, websites (Dailykos, TPM, Huffington Post, etc.) within the progressive community that are more akin to participative media outlets than to individual blogs offer perspective and substance, in line with what mass media offer - and maybe with even more insight, see here for an interesting Op-Ed.

There are some interesting conversations about the whole primary process: the indecisive character of the Democratic primaries caused by the proportional rule and the part played by the superdelegates - of whom Hillary Clinton seems to have won more to her candidacy than Barack Obama has, although some say the winds might be changing.

The debate is not so much about whether Clinton or Obama is the better candidate; it’s rather focused on who’s running the better campaign. It seems the lack of fundamental policy differences between the two and their focus on style (hope, change, experience) over substance has led many in the webosphere, but not all (see here and there as well for examples of opinionated posts), to comment on the campaign rather than plainly endorse a candidate.

The buzz is clearly mounting around Barack Obama’s campaign. Although opinion polls show the candidates neck and neck, many pundits say he could overtake Hillary Clinton in some key States. As far as the Hispanic vote is concerned, a good number of media and social media sites believe the obstacle that stands in Senator Obama’s way is not so much that his candidacy does not engage Hispanic voters but that he has yet to gain the widespread visibility that Senator Clinton enjoys among this community - the Washington Post has offered a series of articles on the subject.

Hence, with an impressive 32 million dollars raised in January, the Obama campaign’s decision to spend over 10 million dollars on advertisements seems like a smart way to raise awareness of his candidacy - see Obama’s SuperBowl ad on YouTube.

The webosphere is abuzz (here and there as well for instance) with the number of high-profile endorsements the Illinois senator is getting, both from superdelegates (the Kennedys) and from progressive movements or media (LA Times, MoveOn, LA Opinion). It should be noted that the political webosphere has emphasized those endorsements that may bridge the gap between Obama and the Hispanic community, notably in California.

Overall, Barack Obama’s candidacy seems to be gaining momentum and could thus emerge in a leading position after all the Super Tuesday States have had their say - not necessarily in terms of number of delegates, but with respect to his chances in the race, with favorable-looking primaries coming up in February. With respect to large States such as California, New York and Illinois, Barack Obama’s share of voice is often double that of Hillary Clinton in the US progressive webosphere. Once more, these figures should not be construed as opinion polls. However, they both reveal and exaggerate trends that have always been both hard to factor into traditional opinion polls and very important in deciding the outcome of an election.

The following chart from Pollster.com clearly shows the pendulum shifting from Clinton to Obama in opinion polls (with black dots representing the latest polls).

The conservative community online

To an external observer, a substantial part of the conservative online community forms a hunting party going after moderate Republicans, those whose credentials are not conservative enough. You already see where this is going. Unlike the progressive webosphere, there’s a clear divide between those who support Mitt Romney and those who are behind John McCain - with the former clearly holding the most visible online ground.

There’s a lot of media coverage of the acrimonious battle going on between the two Democratic candidates, yet it’s nothing compared to what is going on between McCain and conservative pundits. Influential websites in this community have unleashed hell against the not-so-conservative-to-them senator from Arizona (1, 2, 3, 4, etc.).

Conversely, those supporting McCain (occupying a smaller portion of the territory than their opponents) have gone negative on Romney (see here for instance). There are some who wonder what the witch hunt is all about, seeing McCain as a bona fide conservative - but they stand in the middle of the progressive community.

What does the debate tell us about Super Tuesday? Well, the “true conservative” label has been denied to John McCain by authoritative websites such as Michelle Malkin. As a result, Mitt Romney’s edge on the battleground seems to be his by default or, in the words often used by bloggers and media alike, to rest upon his electability.

Notwithstanding the anti-McCain wave in the webosphere, he seems to remain way ahead in the polls. Does it mean that the most active online voices are not representative of the larger conservative base? Or does it mean that pollsters haven’t factored in the aversion towards McCain’s candidacy? We’ll soon have an answer to these questions.

January 29th, 2008

Has the fat lady sung in Florida?

Over the last few days, the Republican political webosphere has been closely following Florida’s primaries, especially with respect to former New York Mayor Rudy Giuliani’s standing. Largely commenting on the joint lead held by John McCain and Mitt Romney, some believe Rudy Giuliani’s absence from the spotlight in January hampered his candidacy although he invested time, money and effort in Florida. In any case, voters are about to reach a verdict on whether Rudy Giuliani still stands a good chance in the presidential nomination process. As usual, you’ll find below the share of voice figures from the Republican political webosphere (last 7 days):

1. McCain 39%
2. Romney 30%
3. Huckabee 17%
4. Giuliani 14%

Although they don’t exactly match opinion polls, these figures may turn out to be accurate with respect to the order in which the candidates will emerge from the ballots.

January 8th, 2008

New Hampshire primaries, Obama and McCain

Now to the New Hampshire primaries. As we did for Iowa, we offer a brief analysis of each candidate’s voice share in the US political webosphere with respect to their standing in New Hampshire.

We have fine-tuned our metrics in order to exclude from Hillary Clinton’s voice share that of her husband, former President Bill Clinton, whose voice share remains substantial in the debate. Our previous analysis of the Iowa caucuses showed the voice shares of the Republican candidates pretty much matched the actual results. The discrepancy between the leading voice share (H. Clinton) and the ballot winner (B. Obama) for the Democrats can be explained by the aforementioned methodological glitch.

Now, the online buzz matches the polls. On the liberal side, Obama is way ahead of Hillary Clinton and John Edwards. On the conservative side, the online trends are not as conspicuous as the polls. John McCain is leading by a narrow margin, with Romney and Huckabee closely following.

The political webosphere gives the following orders on each side:

Republicans:

  1. McCain
  2. Romney
  3. Huckabee

Democrats:

  1. Obama
  2. Clinton
  3. Edwards

For the French version of this post, see here.

January 3rd, 2008

Iowa Caucus: Huckabee and Clinton dominating the political web

With every poll showing the top-tier candidates neck and neck in Iowa, it is also interesting to glimpse into the web’s crystal ball. Following the predictions of Hitwise, let us consider what the US political web is saying about the Iowa caucus, with one research objective in mind: to analyze, and perhaps confirm (as we did during the last French presidential election), the correlation between each candidate’s “share of voice” on the web, opinion polls, and ballot results.

[Map: PW08, December 2007]

Prior to unveiling the numbers, let us first go over the methodology. We have measured the number of quotes and mentions of each candidate with respect to mentions of the Iowa Caucus against our dataset of the 2,000 main sites and blogs of the US political web. To be clear, those quotes and references are not qualified in terms of positive or negative language; this is essentially a quantitative measure of the level of buzz on a select sample of the most politically active and influential sites and blogs on the US web.
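As a rough illustration of this kind of measure, the sketch below counts candidate mentions only in texts that also mention the topic, then converts the counts into percentages. The function name `share_of_voice`, the sample posts and the simple substring matching are all assumptions made for the example; the actual counting rules applied to the dataset are not described in this post.

```python
from collections import Counter


def share_of_voice(documents, candidates, topic):
    """Return each candidate's share (in %) of all candidate mentions
    found in documents that also mention the topic."""
    counts = Counter({name: 0 for name in candidates})
    topic_lower = topic.lower()
    for text in documents:
        lowered = text.lower()
        if topic_lower not in lowered:
            continue  # only count mentions made in relation to the topic
        for name in candidates:
            counts[name] += lowered.count(name.lower())
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in candidates}
    return {name: 100.0 * counts[name] / total for name in candidates}


# Made-up sample posts, for illustration only (not taken from the dataset).
posts = [
    "Iowa caucus preview: Huckabee leads, Romney close behind. Huckabee rallies tonight.",
    "Romney on the economy",  # ignored: no mention of the Iowa caucus
    "McCain skips part of the Iowa caucus campaign.",
]
shares = share_of_voice(posts, ["Huckabee", "Romney", "McCain"], "Iowa caucus")
```

Like the measure described above, this count is purely quantitative: a critical mention weighs as much as a favorable one.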

[Chart: Democratic caucus, Iowa]

Generally speaking, the web is highly reactive to news coverage and events, and thus acts as an amplifier, often yielding previous insights. When focusing exclusively on the chatter over the past 10 days, Hillary Clinton seems to dominate the Democratic side of our dataset of sites, with a 31% share of voice, closely followed by Barack Obama (29%) and John Edwards (26%).

When looking at Edwards’ share of voice stats over the past 2 months, one cannot help but notice his impressive online surge, apparently confirmed by recent investments in additional servers by the campaign (http://marcambinder.theatlantic.com/archives/2007/12/a_real_edwards_surge.php), albeit not sufficient to leap ahead of the top two Democratic contenders. Conversely, Hillary Clinton seems to have lost some virtual ground since November, considering she once peaked at a 42% share of voice between November 10 and November 19.

[Chart: Republican caucus, Iowa]

On the Republican side of things, the matter isn’t straightforward either. In the last 10 days, the Republican political webosphere (all the Republican sites in our 2,000-site dataset) has confirmed the “Huckaboom” by putting him at the top of the charts with a 26% share of voice. Mitt Romney follows with a solid 24% while John McCain and Fred Thompson are lagging behind, with a 16% and a 13% share respectively – the latter two’s voice shares being slightly better than their standings in the Iowa polls. Again, if we take a look at the trends over the last 3 months, we’ll notice that Huckabee rose over his competitors at the beginning of December, both online and in the polls. Giuliani’s steady decline in the polls in recent months matches his diminishing share of voice among the Republican online community.

[Chart: Iowa Republican polls]

To sum it up, here is the share of voice for each one of the leading candidates:

Republicans

Mike Huckabee 26%
Mitt Romney 24%
John McCain 16%
Fred Thompson 13%

Democrats

Hillary Clinton 31%
Barack Obama 29%
John Edwards 26%

Now, let us wait for the first actual results.

January 3rd, 2008

All you’ve always wanted to know about our map, but never dared to ask…

Curious about the Presidential Watch ‘08 map? Here are some answers to the most common questions asked:

I. Drawing the map

The PresidentialWatch08 map is composed of the 297 most visible and influential websites and blogs, selected from a complete dataset of over 2,000 sites gathered using Linkfluence’s proprietary crawl technology.

The map includes both social media and mainstream media outlets. The sites are divided into four different categories, or communities (manually labelled):
- Conservative
- Independent
- Mass Media
- Progressive

In terms of methodology, we initiated the process by focusing on a set of a few hundred websites and blogs well-recognized by search engines and other sites related to US politics. Then, we collected the URLs of all sites located just one click away from our initial set - which amounted to tens of thousands of websites.

Why was this step important? Because when it comes to networks – and the web is one giant network – there’s a rule that says that what’s similar to a given node in terms of content will stand close to this node in terms of location. Working with a set of websites large enough, one can collect all the other important websites dealing with the same topics using the “one click removed” idea.
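The “one click removed” idea can be sketched as a single expansion step over a link graph. The snippet below is only an illustration under stated assumptions: the `one_click_away` function, the `outlinks` dictionary and the example domains are hypothetical, and it says nothing about how the proprietary crawler actually works.

```python
def one_click_away(seed_sites, outlinks):
    """Return the sites reachable in exactly one click from the seed set,
    excluding the seeds themselves."""
    seeds = set(seed_sites)
    neighbours = set()
    for site in seeds:
        neighbours.update(outlinks.get(site, ()))
    return neighbours - seeds


# Hypothetical link data: site -> sites it links to.
outlinks = {
    "seed-a.example": ["seed-b.example", "blog-1.example"],
    "seed-b.example": ["blog-1.example", "blog-2.example"],
    "blog-1.example": ["far-away.example"],  # two clicks away, not collected
}
candidates = one_click_away(["seed-a.example", "seed-b.example"], outlinks)
```

Starting from a few hundred seeds, a step like this quickly yields a very large candidate set, which is why a filtering stage is needed afterwards.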

Thanks to a series of metrics, both topology-related (i.e. how many sites link to a particular site) and semantics-related (i.e. are the words used of a political nature), we were able to single out over 2,000 websites that constitute the core of the US political webosphere. From these, we extracted the most link-relevant 297 sites.
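A filter combining one topology-related metric and one semantics-related metric might look like the sketch below. Everything in it is an assumption made for illustration: the `select_core` function, the thresholds, the tiny vocabulary of political terms and the sample data. The metrics actually used for the map are not detailed in this post.

```python
def select_core(sites, inbound, texts, political_terms, min_inbound, min_ratio):
    """Keep sites with enough inbound links (topology) and a high enough
    share of political words (semantics), most-linked sites first."""
    selected = []
    for site in sites:
        words = texts.get(site, "").lower().split()
        hits = sum(1 for w in words if w.strip(".,;:!?") in political_terms)
        ratio = hits / len(words) if words else 0.0
        if inbound.get(site, 0) >= min_inbound and ratio >= min_ratio:
            selected.append(site)
    return sorted(selected, key=lambda s: inbound.get(s, 0), reverse=True)


# Hypothetical sample data.
sites = ["a.example", "b.example", "c.example"]
inbound = {"a.example": 12, "b.example": 40, "c.example": 55}
texts = {
    "a.example": "senate election primary vote today",
    "b.example": "the caucus and the election results",
    "c.example": "fly fishing on the lake this weekend",
}
terms = {"election", "primary", "caucus", "senate", "vote"}
core = select_core(sites, inbound, texts, terms, min_inbound=10, min_ratio=0.3)
```

In this example the most-linked site is dropped because its vocabulary is not political, which is the point of combining the two kinds of metrics.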

II. Navigating the map

The PW08 map’s default view is set to display all the categories at once (Conservative, Independent, Mass Media, Progressive). You can select the individual communities you’re interested in and more carefully analyze the links existing between them - most notably to see who links to whom, and what their level of authority is within their community.

See the notice for more practical details on map navigation.

III. Understanding the map

As shown in the map’s navigation bar, a node’s color indicates the community it belongs to, and a node’s size indicates its authority degree (overall number of inbound links) or its Xeno degree (number of inbound links coming from nodes belonging to other categories).

The more links a node receives from other nodes shown on the map, the bigger it appears on the map. Note that the link count is based solely upon links coming from nodes on the map. Links coming from websites located outside of the map are excluded. Based on this approach, we can determine the level of authority attributed to a given site within these communities. This approach may occasionally favor bloggers who splog (spam-blog) others, artificially generating inbound links to their blogs by an abusive use of such techniques as trackbacks. Given the size of the map’s set of websites, we were able to make sure such artificial results were not present.
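The two size measures described above can be computed from a list of links and a community label per node. The following is a minimal sketch: the `node_degrees` function and the sample sites are hypothetical, and two choices in it are assumptions not stated in the text, namely that repeated links between the same pair of sites count once and that self-links are ignored.

```python
def node_degrees(links, community):
    """Compute the authority degree and Xeno degree of every node on the map.

    links: iterable of (source, target) pairs
    community: dict mapping each on-map node to its community label
    """
    authority = {node: 0 for node in community}
    xeno = {node: 0 for node in community}
    for source, target in set(links):  # assumption: duplicate links count once
        # Links from or to sites outside the map are excluded.
        if source not in community or target not in community:
            continue
        if source == target:
            continue  # assumption: self-links are ignored
        authority[target] += 1
        if community[source] != community[target]:
            xeno[target] += 1  # inbound link from another community
    return authority, xeno


# Hypothetical sample map.
community = {
    "paper.example": "Mass Media",
    "left.example": "Progressive",
    "left2.example": "Progressive",
    "right.example": "Conservative",
}
links = [
    ("left.example", "paper.example"),
    ("right.example", "paper.example"),
    ("left2.example", "left.example"),
    ("left.example", "left2.example"),
    ("offmap.example", "paper.example"),  # excluded: source is not on the map
]
authority, xeno = node_degrees(links, community)
```

Here the media site receives links from two different communities, so its authority degree and its Xeno degree are both 2, while the two progressive blogs only link to each other and have a Xeno degree of 0.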

Nodes are positioned on the map according to a topological placement algorithm, i.e. each node is positioned solely according to its linking pattern, without consideration for the stated political affiliation of the site or its content.

Many algorithms make it possible to render an adjacency matrix - i.e. the matrix describing any graph - in 2D. We used a Fruchterman-Reingold algorithm, which shares with all the others the same basic principle: minimizing the system’s energy while maximizing the use of the space available for the representation of the data. To minimize the system’s energy, one can for instance assume that nodes that are not linked to each other push away from each other whereas nodes that are linked to each other attract each other. Through iterative steps the algorithm tries to find a way to position nodes where there is as little link overlap as possible. To maximize the use of the mapped space, the graph is spread as much as possible over the surface allocated for its display.
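For readers who want to see the principle in action, here is a minimal, textbook-style sketch of a force-directed layout in the spirit of Fruchterman and Reingold. It is not the code used to draw the PW08 map; the function name, the parameters, the starting temperature and the linear cooling schedule are assumptions chosen for brevity. Note that in the classic formulation every pair of nodes repels, and linked pairs additionally attract.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0, iterations=50, seed=0):
    """Return {node: [x, y]} positions for a non-empty list of nodes."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    start_temperature = width / 10.0            # caps how far a node moves per step

    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Attraction: linked nodes pull together with force d^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node, limited by the temperature, and keep it in the frame.
        temperature = start_temperature * (1.0 - step / iterations)
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            move = min(length, temperature)
            pos[n][0] = min(width, max(0.0, pos[n][0] + disp[n][0] / length * move))
            pos[n][1] = min(height, max(0.0, pos[n][1] + disp[n][1] / length * move))
    return pos


# Tiny hypothetical graph: a chain a-b-c plus an unlinked node d.
layout = fruchterman_reingold(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

The random starting positions are seeded, so the same input gives the same layout; with a different seed the picture may come out rotated or mirrored, which is why absolute directions on such a map carry no meaning by themselves.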

These positioning principles call for the following reading conventions:

A site’s position on the map depends solely upon its linking policy. A node has no predefined position, the latter being the result of the relations it has with other nodes. This means that a node with no links at all cannot be positioned on the map, which is why we excluded such websites from the PW08 map;

North, East, South and West don’t matter. The displayed space is not based on the cardinal system (North, East, South, West), which means that the choice of a relative left-right or top-bottom position is purely arbitrary. Overall, we chose to respect the obvious left-right political axis. The further left you look, the more liberal the site. The further right, the more conservative;

Hubs are center-stage. The displayed space is polarized in a center to periphery tension. The nodes positioned at the center are the ones receiving the most links from other nodes that don’t link much to one another (exogamous nodes). The nodes positioned at the periphery receive fewer links but they receive them from other nodes that tend to link to one another (endogamous nodes). For instance, the PW08 map clearly shows the pivotal position held by the mass media, the sites of large media outlets receiving links from sites pertaining to all the other communities;

It’s not size, it’s density. The map shouldn’t be interpreted with respect to the surface occupied by a given community or subset of nodes. Rather it should be construed with respect to density levels. For instance, two communities may stretch over equally-sized surfaces, with one forming a tight-knit community and the other being looser-knit. An online territory can be occupied by few sites with few links, thus showing a low density level; it can also be occupied by many sites with many links, thus showing a high density level. On the map the “strength” of a community can be inferred from its density and the thickness of the web woven by its nodes. For instance, a zone with a low density level spreading over a large surface should be construed as containing sites with hardly any links to sites in other communities, links being made between nodes within this community (hypertextual endogamy).
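The reading conventions above are visual, but the notions of density and endogamy can also be put into numbers. The sketch below shows one possible way to do so, using the standard density of a directed graph (links present divided by links possible) and the share of a community’s outbound links that stay inside it. The function names and sample data are hypothetical, and these are not necessarily the measures behind the map.

```python
def community_density(links, community, label):
    """Directed density of a community: internal links / possible internal links."""
    members = {n for n, c in community.items() if c == label}
    n = len(members)
    if n < 2:
        return 0.0
    internal = {(s, t) for s, t in links if s in members and t in members and s != t}
    return len(internal) / (n * (n - 1))


def endogamy(links, community, label):
    """Share of a community's on-map outbound links that stay inside it."""
    members = {n for n, c in community.items() if c == label}
    outgoing = {(s, t) for s, t in links if s in members and t in community and s != t}
    if not outgoing:
        return 0.0
    inside = sum(1 for _, t in outgoing if t in members)
    return inside / len(outgoing)


# Hypothetical sample map.
community = {
    "p1": "Progressive", "p2": "Progressive", "p3": "Progressive",
    "c1": "Conservative", "c2": "Conservative",
}
links = [("p1", "p2"), ("p2", "p1"), ("p2", "p3"), ("p3", "c1"), ("c1", "c2")]
progressive_density = community_density(links, community, "Progressive")
progressive_endogamy = endogamy(links, community, "Progressive")
```

In this toy example the progressive community has 3 of its 6 possible internal links, and 3 of its 4 outbound links stay within the community.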

That’s it. Now you can navigate inside the PW08 map and analyze in detail the relations between sites and communities.

December 19th, 2007

US Elections web geography

[Map: blogopolUS_1]

With less than a year to go before election day, the battlefield is already crowded with troops. The Republican and Democratic primaries have brought all supporters and cybersupporters into the debate. Whereas a few months ago American candidates were sending envoys to France to spot the presidential netcampaign’s best practices, they are now the ones steering the wheel, finding new ways to campaign online, pushing the borders of traditional politics further. The netcampaign will take place in every corner of the Internet, from the now ancient e-mails and newsgroups to the new web 2.0 community sites and apps such as Twitter or Digg. It will visit both the most crowded spaces such as YouTube, MySpace or Facebook and the most confidential and secluded - what about some political debating in Lake Ontario’s fly fishing newsgroups? And of course it will still happen within the blogosphere, on thousands of opinion outlets held by supporters, journalists, candidates, writers or citizens. Continuously or from time to time, they will carry, consider or mix the impressive flow of texts, images and sounds published daily by the mass media and, more and more, by their peers.

What do we offer? Some perspective on this very dense flow of opinions. The ability to apprehend the size of this phenomenon by measuring it.

The first measures are made by the topographic surveyor: measure a territory, draw its borders, distinguish its vicinity, spot the highs and lows. The first territory we have mapped is not the multi-dimensional Internet, with too many fronts to cover at the same time! No, the first territory we’ve mapped is the political blogosphere, the territory of all the blogs that will follow and take part in this election. Maybe we should talk about the political webosphere, as the blogs contained therein are not isolated from their hypertextual environments, from the sites they link to and are linked from. It is this whole ecosystem of intertwined websites that we’ve represented and that we’ll monitor in 2008.

Last spring, we mapped the French political webosphere within the context of the 2007 presidential election. The most astonishing part is that the pulse of this territory, as shown in the map and the various monitors we had set up, actually gave a very good idea of the final outcome, with the ones leading the race on the Internet actually leading the polls. Hence, we suggest you keep a close eye on Presidential Watch 2008 throughout the year!

The troops are now ready and trained, the battlefield is before us. Let the political strategists unfold their maps and their most ambitious tactics.