Political Ideology Scores: Comparing Models

Last summer, I wrote a piece on political ideology in the Washington State Senate, and another on my methodology for computing those ideology scores. In the year since, I’ve sunk deeper into political science Twitter. A few days ago, I came across a tweet that said “New update to our state legislative ideology data,” which piqued my interest. In my initial searches last year, I had trouble finding individual-level ideology data for the Washington State Legislature, but sure enough, Boris Shor and Nolan McCarty have taken on the huge task of pulling together an up-to-date data set of state legislators and ideology scores based on Vote Smart’s Political Courage Test.

I was thrilled to find this data set, and I wondered: how well do its ideology scores support or refute my conclusions from analyzing roll call votes? I downloaded their data for Washington State Senators and compared the results to mine.

Background: Shor and McCarty’s Model

I wanted to understand Shor and McCarty’s data a bit better before I dug in. Fortunately, they documented their methodology for assembling this data set [1]. At the heart of their model is The Political Courage Test (PCT, formerly called the National Political Awareness Test), administered by the non-profit Vote Smart. The PCT is a survey in which political candidates and elected officials are asked to indicate their position on a variety of issues. You can look up your local candidates and officials on Vote Smart’s website to see what positions they’ve taken. It’s a really terrific service.

The survey has one big challenge, which should not surprise anyone who has tried to administer a survey: response rates. Response rates are fairly low and declining, which means that many politicians have no PCT score. Many of them refuse to answer the survey because they are wary of painting themselves into a political corner. This is why Vote Smart has rebranded the project as the “Political Courage Test.” To address this, Shor and McCarty have created a model of legislator ideal positions based on vote history (kind of like I did, but theirs is much more robust), and mapped that into PCT space. This allowed them to impute scores for non-respondents.

On the plus side, since the PCT score is based on a survey and not on actual policy actions, it is possible to assign PCT scores to candidates who have not yet been elected. A roll call model could never do that, and it presents opportunities for interesting analysis of how prospective legislators compare to the status quo.

The Shape of the 2016 State Senate

I spent a lot of time looking at the ideological landscape scatterplot in my last blog, and my first question here was: how does the distribution of points in my model compare to theirs?

The NP-score (the score in Shor and McCarty’s data, which I refer to here as the PCT score) ranks politicians on a single dimension of ideology. This squares with the findings of political science researchers, who have found that most of the time, ideology in Congress is most pronounced along one dimension, with some notable exceptions [2]. In my last post, I plotted ideologies on two dimensions, but if you remember, the second dimension didn’t show as much meaningful variance between legislators as the first. For the sake of this analysis, I am using only the first dimension of my model [3], which I’m calling “Roll Call Score.”

Histogram of Roll Call Scores in the 2016 WA Senate

This multimodal distribution, with a gulf between the modes, reflects our understanding from before that there are two distinct clusters of legislators, liberal and conservative, and that each tends toward its own central point. In this distribution, the right cluster (Republicans) is much more densely packed. In the last blog post, I hypothesized that this was because the Republicans had a majority, and thus more opportunities to vote as a bloc.

The distribution for the PCT score shows a multimodal distribution as well:

PCT Scores Histogram for the Washington Senate

The first thing I noticed about these two plots was that the Roll Call scores were less evenly distributed: the two subpopulations had much more pronounced modes. I wanted to do a more direct comparison, so I normalized both sets of scores to z-scores [4] and plotted them together:

Distribution of PCT and Roll Call Scores

Here we see that the PCT scores are more dispersed, with more extreme minima and maxima. I was initially surprised by this, but after a bit of research into the PCT I think it makes sense. The PCT covers a wide range of policy questions, many of which are not regularly voted on in the state legislature (for example, foreign affairs and defense issues). State legislatures also presumably debate some local issues that aren’t directly measured by the PCT, but I’d expect the ideological coverage of the PCT to be greater than that of state roll call votes.
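For readers who want to try the same comparison, here is a minimal sketch of the z-score normalization in plain Python. The scores below are made up for illustration and are not the real Senate data, and the helper name z_scores is my own. Once both sets are on the same scale, comparing their ranges gives a rough sense of which one is more dispersed.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale values so they have mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

# Made-up scores for illustration only (NOT the real Senate data).
roll_call = [-3.0, -2.5, -2.0, 1.5, 2.0, 4.0]
pct = [-1.2, -0.9, -1.0, 0.4, 0.7, 2.0]

roll_call_z = z_scores(roll_call)
pct_z = z_scores(pct)

# After normalization, the range (max - min) is a crude dispersion measure
# that can be compared across the two models.
roll_call_range = max(roll_call_z) - min(roll_call_z)
pct_range = max(pct_z) - min(pct_z)
print(roll_call_range, pct_range)
```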

Correlation Analysis

The distributions have some similarities, but the real test of how well matched these models are is how well a given legislator’s score in one model can be predicted from their score in the other. How correlated is a given legislator’s PCT score with their Roll Call Score? Calculating the correlation for the 2016 senators, I found a correlation coefficient of .966. That’s a very strong positive correlation: a liberal senator in the Roll Call model is very likely to be liberal in the PCT model (and vice versa), and the same goes for a conservative senator.
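As a rough sketch of the calculation, assuming the coefficient in question is the ordinary Pearson correlation, here is how it can be computed from two paired lists of scores. The function name pearson_r and the sample numbers are mine, for illustration only.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up paired scores, one pair per legislator (NOT the real data).
pct = [-1.2, -0.9, -1.0, 0.4, 0.7, 2.0]
roll_call = [-3.0, -2.5, -2.0, 1.5, 2.0, 4.0]
print(pearson_r(pct, roll_call))
```

The result always falls between -1 and 1, and the order of the two arguments does not matter.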

This strong correlation is fairly obvious in the scatterplot as well:

Scatterplot of PCT and Roll Call scores with regression line

This linear model has a mean squared error of 2.84, with 93% of the variance in Roll Call Score explained by the PCT score.
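A quick sanity check on those numbers: for a linear fit with a single predictor, the coefficient of determination equals the square of the correlation coefficient, and .966 squared is about .933, which matches the 93% figure. Below is a minimal sketch of the fit and both metrics in plain Python, assuming an ordinary least squares fit with PCT score as the predictor. The helper names and sample data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

def mse_and_r2(xs, ys, slope, intercept):
    """Mean squared error and coefficient of determination for the fit."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return ss_res / len(ys), 1.0 - ss_res / ss_tot

# Made-up data: PCT score as predictor, Roll Call Score as response.
pct = [-1.0, -0.5, 0.0, 0.5, 1.0]
roll_call = [-2.1, -0.9, 0.1, 1.2, 1.9]

slope, intercept = fit_line(pct, roll_call)
mse, r2 = mse_and_r2(pct, roll_call, slope, intercept)
print(slope, intercept, mse, r2)
```

Note that the mean squared error is in squared Roll Call Score units, so its size only means something relative to the scale of those scores.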

Position of Key Legislators

The correlation is nice, but it is still a bit abstract. I investigated further to see how individual legislators were rated or ranked differently between the two models.

Let’s start with our “typical” folks: the median legislator, the most liberal, and most conservative.

The median legislator and most liberal legislator match across models (Litzow and Jayapal, respectively). The most conservative legislator does not match across models, however. The PCT model ranks Dansel farthest to the right. The Roll Call model puts Dansel in the 90th percentile of conservatism (and if you remember from my last blog post, he was the most “distant” from Jayapal). In the Roll Call model, Honeyford is the most conservative, and he falls in the 95th percentile of conservatism in the PCT model.

I was naturally curious about Tim Sheldon, the Democrat who caucuses (and generally votes) with Republicans. My Roll Call model put him in the 70th percentile of conservatism, and the PCT model puts him in the 55th percentile. His PCT score does show that he is more conservative than the median legislator in the state (as I would have assumed). The Roll Call model suggests that he is significantly more conservative relative to other legislators than the PCT model does. I would be interested to understand what is at play here, and I’m pondering some ways to investigate more. If you order the legislators from most liberal to most conservative, Sheldon’s positions differ by 12 slots, which is among the largest differences.

After pondering Senator Sheldon for a while, I wanted to look at the overall “movement” of legislators between models. If ordered from most liberal to most conservative, how much are the positions “shuffled” between models? I calculated the difference in positions for each legislator, and the distribution of differences is below:

Histogram of position differences between PCT and Roll Call models

Most legislators fell in a similar position in both models. The distribution of differences is highly skewed, with a median difference of 3, and 55% of legislators moved no more than 3 spots between models. Not too shabby.
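The position comparison can be sketched like this: sort the legislators by score in each model, record each one’s slot, and take the absolute difference. This is an illustration under my own assumptions, namely that lower scores mean more liberal in both models and that there are no tied scores. The legislator names and scores are placeholders.

```python
from statistics import median

def rank_positions(scores):
    """Map each legislator to a 0-based slot, lowest score first.

    Assumes lower scores are more liberal and that no two scores tie.
    """
    ordered = sorted(scores, key=scores.get)
    return {name: pos for pos, name in enumerate(ordered)}

def position_differences(scores_a, scores_b):
    """Absolute difference in slot for each legislator between two models."""
    ranks_a = rank_positions(scores_a)
    ranks_b = rank_positions(scores_b)
    return {name: abs(ranks_a[name] - ranks_b[name]) for name in ranks_a}

# Placeholder legislators and scores (NOT the real data).
roll_call = {"A": -2.0, "B": -1.0, "C": 0.5, "D": 1.5}
pct = {"A": -1.5, "B": 0.2, "C": -0.3, "D": 1.0}

diffs = position_differences(roll_call, pct)
share_within_one = sum(d <= 1 for d in diffs.values()) / len(diffs)
print(diffs, median(diffs.values()), share_within_one)
```

In this toy example B and C swap places between the models, so each moves one slot while A and D stay put.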


There is a strong correlation between the Roll Call model I created and the PCT score. The relative positions of legislators also hold up reasonably well. I have more faith in the science (and scientists) behind the PCT score than I do in my algorithms, so this serves as reasonable validation of my roll call analysis methodology. I didn’t invent the idea of modeling ideology by roll call votes, but I did roll my own version of it, and while it passed the sniff test, it’s good to see corroborating evidence.

There are some key differences between PCT scores and roll call scores, however. One of the most significant is that PCT scores are “lifetime” scores, whereas roll call scores can be recomputed for each legislative session. This means that the PCT score cannot easily be used to model a legislator’s ideological change over time, whereas the roll call scores can.

I am most excited about the close relationship of these two models, because that means that I can use one to compensate for weaknesses or gaps in the other. For example, if I want to analyze where a greenhorn candidate would fall in the Senate, I can use the relationship between their PCT score and Roll Call score to project them into the Roll Call space with some confidence. Conversely, if a legislator declined to fill out the PCT, I can project them back into PCT space from the roll call space.
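Here is a minimal sketch of what that projection could look like, assuming a simple least squares line is fit in each direction on legislators who have both scores. The function names and data are hypothetical. One design note: the line that predicts Roll Call Score from PCT score is not simply the inverse of the line that predicts PCT score from Roll Call Score (unless the correlation is perfect), so the sketch fits a separate line for each direction.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def make_projector(source_scores, target_scores):
    """Return a function mapping a score from source space to target space."""
    slope, intercept = fit_line(source_scores, target_scores)

    def project(score):
        return slope * score + intercept

    return project

# Made-up scores for legislators who have BOTH scores (NOT the real data).
pct = [-1.0, -0.5, 0.0, 0.5, 1.0]
roll_call = [-2.1, -0.9, 0.1, 1.2, 1.9]

to_roll_call = make_projector(pct, roll_call)  # e.g. for a new candidate
to_pct = make_projector(roll_call, pct)        # e.g. for a PCT non-respondent

print(to_roll_call(0.3), to_pct(1.0))
```

Any projection like this inherits the error of the underlying fit, so the projected scores are estimates rather than measurements.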

Finally, a hearty “thank you” to Boris Shor and Nolan McCarty for taking on the arduous task of collecting, cleaning, and sharing all this data.


[1] Their paper is available on Harvard Dataverse. It’s very wonk-ish but worth reading if you’re interested in how the sausage gets made.

[2] Most notably Keith Poole and Howard Rosenthal, who created DW-NOMINATE. DW-NOMINATE uses roll call votes to model legislative positions. It’s similar to the approach I used except more robust and just all-around better. They wrote a good book on the subject called Ideology and Congress. Poole and Rosenthal found that generally speaking, the US Congress has been divided starkly on one dimension, with occasional rifts emerging in a second dimension. The first dimension is best thought of as our traditional “liberal vs. conservative” spectrum, while the second dimension became important during two racially charged periods: the Civil War and Reconstruction, and the Civil Rights Movement.

[3] Adding the second component as a variable in the linear model resulted in an increase in the coefficient of determination of less than .0003. Given a small difference like that, I will generally opt for the simpler choice.

[4] Neither data set is normally distributed, so the z-score is not really the best measurement to use for many things, but I think it works well enough to compare dispersion in two similarly shaped data sets.
