Saturday, November 3, 2012

Party Identification, Revisited

Nate Silver is now contending, with fairly strong evidence, that the only way that Romney can win is if there is statistical bias in the polls.  By this he means a systematic sampling error, rather than some kind of partisan bias.

Today we also have an NBC/WSJ/Marist poll from Ohio showing Obama with a 51 to 45 lead.  (Full PDF here.)

All of this got me to thinking about the brouhaha back in September about party identification.

Back then, there were a whole bunch of Republican proxies raising questions about Democrats being oversampled in the polls (all rapidly southbound for Romney at the time).  The argument was that pollsters appeared to be filtering likely voters based on a party identification model that looked weird, given the GOP's strength in 2010 vs. 2008.

It was a bad argument, because party identification isn't used to filter likely voters.  However, party identification is a commonly asked question in presidential preference polls.  One would expect party ID to be correlated with presidential preference, but not perfectly.  As a general rule, party ID should be a lot less volatile than presidential preference.

This is where things get kinda strange.  Take, for example, the Marist poll above.  It found party ID to be 38% Democratic, 29% Republican, and 32% independent.  When you compare that to exit polling in Ohio in 2008 and 2010, here's what you get:

Party ID | 2008 Exit Ohio | 2010 Exit Ohio | 11/03/12 Marist Ohio

Obama won handily in 2008 and the Democrats made significant gains in the House and Senate, so a +8 spread for the Democrats sounded reasonable.  And, in 2010, note that the zero spread between the two parties indicated a sharp shift to the Republicans, as the election results confirmed.

But look at the Marist results for November 3 likely voters: you wind up with a spread of Democrats +9, even greater than that of 2008.  Now, I'm willing to believe that the GOP isn't as strong in Ohio as it was in 2010 but, given the state of the country and Romney's rapid (and dramatic!) rise following the debates, the GOP is almost certainly not doing worse than it did in 2008.  Something's wrong.
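For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The Marist shares and the two exit-poll spreads are the ones quoted in this post; the function and variable names are just illustrative.

```python
# Party ID shares (percent) quoted above for the 11/03/12 Marist Ohio poll.
marist = {"Democratic": 38, "Republican": 29, "Independent": 32}


def party_spread(shares):
    """Democratic share minus Republican share, in percentage points."""
    return shares["Democratic"] - shares["Republican"]


marist_spread = party_spread(marist)

# Democratic-minus-Republican spreads quoted above for the Ohio exit polls.
exit_spreads = {"2008 exit": 8, "2010 exit": 0}

print(f"Marist spread: D {marist_spread:+d}")
for label, spread in exit_spreads.items():
    # Positive means the Marist sample is more Democratic than that exit poll.
    print(f"Marist vs. {label}: {marist_spread - spread:+d} points")
```

The three shares add up to 99 rather than 100, presumably because of rounding or respondents who gave some other answer.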

Now, I am not saying that the pollsters have done biased filtering for likely voters.  These people live and die on their professional reputations, and they're simply not going to risk going into the tank by applying any illegitimate filter criterion, let alone one with such a strong coefficient.

But there is something profoundly wrong with these numbers.  They don't make sense.  Why?

The only answer I can come up with is that there is indeed some sort of systematic sampling error at work.  Maybe it's a problem with the cell phone model.  Maybe it's just that the people who will actually answer a survey these days are very, very odd.  Maybe it's something more subtle.  But the party ID numbers don't pass the smell test, and that means the sampling is weird.
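To get a feel for how much of a party ID spread could be plain random noise, as opposed to the systematic kind of error I'm talking about, here is a rough sketch. It assumes a simple random sample, which real polls with weighting are not, and the sample size is a hypothetical round number, not the poll's actual one. The formula is the standard variance of the difference between two shares drawn from the same sample.

```python
import math


def spread_margin(p_dem, p_rep, n, z=1.96):
    """Approximate margin of error on (p_dem - p_rep).

    Assumes a simple random sample of size n; z=1.96 corresponds to
    roughly 95% confidence. Weighted polls will have larger margins.
    """
    diff = p_dem - p_rep
    # Variance of the difference of two shares from one multinomial sample.
    variance = (p_dem + p_rep - diff ** 2) / n
    return z * math.sqrt(variance)


# HYPOTHETICAL sample size, chosen only for illustration.
HYPOTHETICAL_N = 1000

moe = spread_margin(0.38, 0.29, HYPOTHETICAL_N)
print(f"Margin of error on the D-R spread: about {100 * moe:.1f} points")
```

Whatever this prints covers random sampling error only. A systematic problem, like a bad cell phone model or an odd pool of people willing to answer, would not show up in it at all, which is exactly why it's worrying.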

Given that I think that Nate Silver is correct, this is very good news for Romney.  Or at least not bad news.  Or... news.  My conclusions have a high standard error.

