January 09, 2008

POLLING ERRORS IN NEW HAMPSHIRE

Hillary Clinton's stunning win over Barack Obama in New Hampshire is not only sure to be a legendary comeback but equally sure to become a standard example of polls picking the wrong winner. By a lot.

There is a ton of commentary already out on this, and much more to come. Here I simply want to illustrate the poll errors themselves; they show the scale of the problem and help clarify the issues. I'll be back later with some analysis of these errors, but for now let's just look at the data.

In the chart, the "cross-hairs" mark the outcome of the race, 39.1% Clinton, 36.4% Obama. This is the "target" the pollsters were shooting for.

The "rings" mark 5%, 10% and 15% errors. Normal sampling error would put a scatter of points inside the "5-ring", if everything else were perfect.

In fact, most polling shoots low and to the left, though often within or near the 5-ring. The reason is undecided voters in the survey. Unless the survey organization "allocates" these voters by estimating a vote for them, some 3-10% of respondents in a typical election survey are left out of the final vote estimate. Some measures of survey accuracy divide the undecided among the candidates, either evenly or proportionately. There are good reasons to do that, which I'll take up in another post. But pollsters almost always publish the unallocated numbers, so it seems fair to plot here the percentages the pollsters actually published, not ones with the undecided reallocated.
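For readers who want the mechanics, here is a minimal sketch of the two allocation rules just mentioned; the poll figures are hypothetical:

```python
def allocate_evenly(shares, undecided):
    """Give each candidate an equal slice of the undecided percentage."""
    bonus = undecided / len(shares)
    return {name: pct + bonus for name, pct in shares.items()}

def allocate_proportionately(shares, undecided):
    """Split the undecided in proportion to each candidate's decided share."""
    decided = sum(shares.values())
    return {name: pct + undecided * pct / decided
            for name, pct in shares.items()}

# Hypothetical poll: Clinton 31%, Obama 37%, all others 24%, undecided 8%.
poll = {"Clinton": 31.0, "Obama": 37.0, "Others": 24.0}
print(allocate_evenly(poll, 8.0))           # each gains 2.67 points
print(allocate_proportionately(poll, 8.0))  # Obama gains most: +3.22 points
```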

What we see for the Democrats is quite stunning. The polls actually spread very evenly around the actual Obama vote. Whatever went wrong, it was NOT an overestimate of Obama's support. The standard trend estimate for Obama was 36.7%, the sensitive estimate was 39.0% and the last five poll average was 38.4%, all reasonably close to his actual 36.4%.

It is the Clinton vote that was massively underestimated. Every New Hampshire poll was outside the 5-ring. Clinton's trend estimate was 30.4%, with the sensitive estimate even worse at 29.9% and the five-poll average at 31.0%, compared to her actual vote of 39.1%.
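Spelled out as simple arithmetic, using only the figures quoted above:

```python
# Signed errors of the published estimates against the actual result.
actual = {"Clinton": 39.1, "Obama": 36.4}
estimates = {
    "Clinton": {"trend": 30.4, "sensitive": 29.9, "five-poll avg": 31.0},
    "Obama":   {"trend": 36.7, "sensitive": 39.0, "five-poll avg": 38.4},
}

for candidate, methods in estimates.items():
    for method, value in methods.items():
        print(f"{candidate} {method}: {value - actual[candidate]:+.1f} points")
# Clinton errors: -8.7, -9.2, -8.1 (all huge); Obama: +0.3, +2.6, +2.0
```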

So the clear puzzle that needs to be addressed is whether Clinton won on turnout (or Obama's was low), or whether last-minute decisions broke overwhelmingly for Clinton. Or whether the pollsters' likely-voter screens mis-estimated the makeup of the electorate. Or whether the weekend hype led to a feeding frenzy of media coverage that was very favorable to Obama and very negative toward Clinton, which depressed her support in the polls but oddly did not lower her actual vote.

On the Republican side we see a more typical pattern, and with better overall results. About half of the post-Iowa polls were within the 5-ring for the Republicans, and most of the rest within the 10-ring.

As expected, errors tend to be low and left, but the overall accuracy is not bad. This fact adds to the puzzle in an important way:

If the polls were systematically flawed methodologically, then we'd expect similar errors for both parties. Almost all the pollsters did simultaneous Democratic and Republican polls, with the same interviewers asking the same questions; the only difference was the screen for which primary a voter would participate in. So if the turnout model was bad for the Democrats, why wasn't it also bad for the Republicans? If the demographics were "off" for the Dems, why not for the Reps?

This is the best reason to think that the failure of polling in New Hampshire was tied to swiftly changing politics rather than to failures of methodology. However, we can't know until much more analysis is done, and more data about the polls themselves become available.

A good starting point would be for each New Hampshire pollster to release its demographic and cross-tab data. That would allow sample compositions to be compared, along with voter preferences within demographic groups. Another valuable bit of information would be voter preference by day of interview.

In 1948 the polling industry suffered its worst failure when confidently predicting Truman's defeat. In the wake of that polling disaster, the profession responded positively by appointing a review committee which produced a book-length report on what went wrong, how it could have been avoided and what "best practices" should be adopted. The polling profession was much the better for that examination and report.

The New Hampshire results are not on the same level of embarrassment as 1948, but they do represent a moment when the profession could respond positively by releasing the kind of data that will allow an open assessment of methods. Such an assessment may reveal that in fact the polls were pretty good, but the politics just changed dramatically on election day. Or the facts could show that pollsters need to improve some of their practices and methods. Pollsters have legitimate proprietary interests to protect, but big mistakes like New Hampshire mean there are times when some openness can buy back lost credibility.

Cross-posted at Political Arithmetik.

Charles Franklin

January 09, 2008

NEW HAMPSHIRE: SO WHAT HAPPENED?

There is obviously one and only one topic on the minds of those who follow polls today. What happened in New Hampshire? Why did every poll fail to predict Hillary Clinton's victory?

Let's begin by acknowledging the obvious. There is a problem here. Even if the discrepancy between the last polls and the results turns out to be the result of a big last-minute shift to Hillary Clinton that the polls somehow missed (and that certainly sounds like a strong possibility), just about every consumer of the polling data got the impression that a Barack Obama victory was inevitable. One way or another, that's a problem.

For the best summary of the error itself, I highly recommend the graphics and summary Charles Franklin posted earlier today. Here's a highlight of how the result compared to our trend estimates:

What we see for the Democrats is quite stunning. The polls actually spread very evenly around the actual Obama vote. Whatever went wrong, it was NOT an overestimate of Obama's support. The standard trend estimate for Obama was 36.7%, the sensitive estimate was 39.0% and the last five poll average was 38.4%, all reasonably close to his actual 36.4%.

It is the Clinton vote that was massively underestimated . . . Clinton's trend estimate was 30.4%, with the sensitive estimate even worse at 29.9% and the five-poll average at 31.0%, compared to her actual vote of 39.1%.

So what went wrong? We certainly have no shortage of theories. See Ambinder, Halperin, Kaus, and, for the conspiratorially minded, Friedman. The pollsters that have weighed in so far (that I've seen at least) are ABC's Gary Langer (also on video), Gallup's Frank Newport, Scott Rasmussen and John Zogby. Also, Nancy Mathiowetz, president of the American Association for Public Opinion Research (AAPOR) has blogged her thoughts on Huffington Post.

Figuring out what happened and sorting through the possibilities is obviously a much bigger task than one blog post the morning after the election. But let me quickly review some of the more plausible or widely repeated theories and review what hard evidence we have, for the moment, regarding each.

1) A last-minute shift? - Perhaps the polls had things about "right" as of the rolling snapshot taken from Saturday to Monday, but missed a final swing to Hillary Clinton that occurred over the last 24 hours, even as voters made their final decisions in the voting booth. After all, we knew that a big chunk of the Democratic electorate remained uncertain and conflicted, with strong positive impressions of all three Democratic front-runners. The final CNN/WMUR/UNH poll showed 21% of Democrats "still trying to decide" which candidate they would support, and the exit poll showed 17% reporting that they decided on Election Day, with another 21% deciding within the last three days. Polls showed Clinton in the mid-to-upper 30s during the late fall and early winter, before a decline in December. Perhaps some supporters simply came home in the final hours of the campaign.
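As a rough way to size the shift this theory requires (a back-of-envelope sketch, not a claim about what happened), suppose the final polls were exactly right about everyone who decided before Election Day:

```python
# Back-of-envelope: how lopsided would the Election-Day break have to be?
# Strong assumption (labeled, not from the data): the final polls were
# accurate for everyone who had already decided before Election Day.
poll_clinton = 31.0    # five-poll average for Clinton (Franklin's post)
actual_clinton = 39.1  # her actual share
late_share = 17.0      # % deciding on Election Day (exit poll)

early = poll_clinton * (100 - late_share) / 100       # ~25.7 points banked
needed = (actual_clinton - early) / late_share * 100  # share of late deciders
print(f"Clinton would need ~{needed:.0f}% of Election-Day deciders")  # ~79%
```

Of course, the final polls also carried a large undecided share, so the real arithmetic is murkier; the sketch only sizes the puzzle.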

I did a quick comparison late last night of the crosstabs from the exit polls and the final CNN/WMUR/UNH survey. Clinton's gains looked greatest among women and college-educated voters. That pattern, if it also holds for other polls (a big if), seems suggestive of a late shift tied to the intense focus on Clinton's passionate and emotional remarks, especially over the last 24 hours of the campaign.

2) Too Many Independents? - One popular theory is that polls over-sampled independent voters who ultimately opted for a Republican ballot to vote for John McCain. I have not yet seen any hard turnout data on independents from the New Hampshire Secretary of State, but the exit poll does not offer promising evidence for this theory. As I blogged yesterday, final Democratic polls put the percentage of registered independents (technically "undeclared" voters) at between 26% and 44% (on the four polls that released results for a party-registration question). The exit poll put the registered-independent number at 42%, with another 6% reporting they were new registrants. So, if anything, polls may have had the independent share among Democrats too low.

On the Republican side, pre-election pollsters reported registered-independent numbers ranging between 21% and 34%. The exit poll put the share at 34%, with 5% previously unregistered. So here too, the percentage of independents in the polls may have been, if anything, too low.

Apply those percentages to the actual turnout, do a little math (sketched below, after the turnout figures), and you get an estimate of how the undeclared voters split: roughly 60% took a Democratic ballot and 40% a Republican. That is precisely the split that CNN/WMUR/UNH found in their last poll, although other polls may have shown a different mix.

Keep in mind that the overall turnout was 526,671 (or 53.3% of eligible adults). Eight years ago (the last time both parties had contested primaries) it was 396,385 (or 44.4% of eligible adults at the time). That helps explain why we may have seen an increase in independents in both primaries.
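For the curious, here is that "little math" spelled out. The Democratic turnout figure comes from the McCutcheon quote below, and treating all remaining ballots as Republican is an approximation:

```python
# Undeclared-voter split implied by the exit polls and total turnout.
total_turnout = 526_671
dem_turnout = 287_849                       # from the McCutcheon quote below
rep_turnout = total_turnout - dem_turnout   # ~238,822; approximation

dem_indep = dem_turnout * 0.42   # exit poll: 42% undeclared on the Dem side
rep_indep = rep_turnout * 0.34   # exit poll: 34% undeclared on the Rep side

dem_pct = dem_indep / (dem_indep + rep_indep) * 100
print(f"Undeclared split: {dem_pct:.0f}% Dem / {100 - dem_pct:.0f}% Rep")  # 60/40

# The 53.3% turnout rate implies an eligible-adult base of about:
print(f"{total_turnout / 0.533:,.0f} eligible adults")  # ~988,000
```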

Of course, we are missing a lot of data here: Nothing yet on undeclared voter participation from the Secretary of State, and roughly half the pollsters never released a result for party registration.

3) Wrong Likely Voters? OK, so maybe they had the independent share right, but perhaps pollsters still sampled the wrong "likely voters" by some other measure. The turnout above means that pollsters had to select (or model) a likely electorate amounting to roughly half of the adults in New Hampshire that they reached with a random-digit-dial sample.

Getting the right mix is always challenging, possibly more so because the Democratic turnout was so much higher than in previous elections. That's an argument blogged today by Allan McCutcheon of Edison Research:

In 2004, a (then) record of 219,787 voters turned out to vote--the previous record for the Democratic primary was in 1992, when 167,819 voters participated. This year, a record shattering 287,849 voters participated in the New Hampshire Democratic primary--including nearly two thirds (66.3%) of the state's registered Democrats (up from 43.3% in 2004). Simply stated, the 2008 New Hampshire Democratic primary had a voter turnout rate that resembled a November presidential election, not a usual party primary, and the likely voter models for the polling organizations were focused on a primary--this time, that simply did not work.

One way to assess whether polls sampled the wrong kinds of voters would be to look carefully at their demographics (gender, age, education, region) and see how they compared to the exit poll and vote return data. Unfortunately, as is so often the case, only a handful of New Hampshire pollsters reported demographic composition.

4) The Bradley/Wilder effect? The term, as Wikipedia tells us, derives from the 1982 gubernatorial campaign of Tom Bradley, then the longtime African-American mayor of Los Angeles. Bradley led in pre-election polls but lost narrowly. A similar effect, in which polls understated support for the opponents of African-American candidates, seemed to hold in various instances during the 1980s. Consider this summary of polls compiled by the Pew Research Center for a 1998 report, which they updated in February 2007:

Note that, in almost every instance, the polls were generally about right in their estimate for the African-American candidate but tended to underestimate the percentage won by the white opponent. The theory is that some respondents are reluctant to share an opinion that might create "social discomfort" between respondent and interviewer, such as telling a stranger on the telephone that you intend to oppose an African-American candidate.

Of course, the Pew Center also looked at six races for Senate and governor in 2006 that featured an African-American candidate and did not see a similar effect. Also keep in mind that all of the reports mentioned above that show the effect were from general election contests, not primaries.

What other evidence might suggest the Bradley/Wilder effect operating in New Hampshire in 2008? We might want to consider whether the race of the interviewer, or the use of an automated (interviewer-free) methodology, made a difference, although these kinds of analyses are difficult because other variables can confound them. For what it's worth, the final Rasmussen automated survey had Obama leading by seven points (37% to 30%), roughly the same margin as the other pollsters. We might also look at whether pushing undecided voters harder helped Clinton more than the other candidates.

Update: My colleagues at AAPOR have made three relevant articles from Public Opinion Quarterly available to non-subscribers on the AAPOR web site.

5) Non-response bias? We would be crazy to rule it out, since even the best surveys are getting response rates in the low twenty percent range. If Clinton supporters were less willing to be interviewed last weekend than Obama supporters, it might contribute to the error. Unfortunately, it is next to impossible to investigate, since we have little or no data on the non-respondents. However, if pollsters were willing to be completely transparent, we might compare the results among those with relatively high response rates to those with lower rates. We might also check to see if response rates declined significantly over the final weekend.

6) Ballot Placement? Gary Langer's review points to a theory offered by Stanford University professor Jon Krosnick that Clinton's placement near the top of the New Hampshire ballot boosted her vote total. Krosnick believes that ballot order netted Clinton "at least 3 percent more votes than Obama."

7) Weekend Interviewing? I blogged my concerns on Sunday. Hard data on whether this might be a factor are difficult to come by, but it is certainly an issue worth pursuing.

8) Fraud? As Marc Ambinder puts it, some are ready to believe "[t]here was a conspiracy, somehow, because pre-election polls are just so much more valid than actual vote counts." Put me down as dubious, but Brad Friedman's Brad Blog has the relevant Diebold connections for those who are interested.

Again, no one should interpret any of the above as the last word on what happened in New Hampshire. Most of these theories deserve more scrutiny and I agree with Gary Langer that "it is incumbent on us - and particularly on the producers of the New Hampshire pre-election polls - to look at the data, and to look closely, and to do it without prejudging." This is just a quick review, offering what information is most easily accessible. I am certain I will have more to say about this in coming days.

-- Mark Blumenthal
