Mapping ADA Voting Scores 1947-2015

Jul 4, 2017 · 1981 words · 10 minutes read ggpubr highcharter htmlwidgets plot R

Tracking Legislator Voting Patterns

How do US legislators vote once they get elected? Or, perhaps more dynamically, how do they react to external shocks (e.g. the dissolution of the Soviet Union, 9/11) that might blur partisan lines? More generally, how does voting behavior change across time and space? Let’s try to provide some answers to these questions using R. You can jump straight to the interactive Shiny app by clicking here.

Data

Organizations such as the Americans for Democratic Action, among other things, track the consistency of the legislators by documenting their voting patterns after they get elected. An aggregated version of their data, compiled by Justin Briggs, provides district-level voting data grouped by year, state, chamber, and party from 1947 to 2015. In addition to the nominal ADA voting scores, he also provides adjusted scores based on this post by Tim Groseclose. Let’s take a look:

library(dplyr)
library(readxl)
adaData <- read_excel("ada4715.xlsx") #aforementioned dataset
glimpse(adaData)

## Observations: 36,432
## Variables: 13
## $ Year                         <dbl> 1947, 1947, 1947, 1947, 1947, 194...
## $ Congress                     <dbl> 80, 80, 80, 80, 80, 80, 80, 80, 8...
## $ ICPSR                        <dbl> 937, 3754, 195, 4471, 7695, 4892,...
## $ `Full Name`                  <chr> "BOYKIN, FRANK WILLIAM", "GRANT, ...
## $ State                        <chr> "ALABAMA", "ALABAMA", "ALABAMA", ...
## $ StateAbbr                    <chr> "AL", "AL", "AL", "AL", "AL", "AL...
## $ Chamber                      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ District                     <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1, ...
## $ Party                        <chr> "Democrat", "Democrat", "Democrat...
## $ Nominal.Score                <dbl> 33, 83, 83, 75, 91, 83, 91, 91, 7...
## $ Alt.Nominal.Score.Groseclose <chr> "25", "75", "75", "75", "92", "83...
## $ Adjusted.Score               <dbl> 24.017374, 70.699040, 70.699040, ...
## $ Mean.Adj.Score               <dbl> 22.56041, 29.70871, 23.96539, 52....

Looks pretty good. Still, some dplyr action will be needed to accomplish the task. We are probably interested in some sort of average score for comparison purposes. For starters, we can get the yearly averages by chamber and party:

annual <- adaData %>%
  group_by(Year, Chamber, Party) %>%
  summarise(ADA = round(mean(Nominal.Score), 2),
            aADA = round(mean(Adjusted.Score), 2))
head(annual)

## # A tibble: 6 x 5
## # Groups:   Year, Chamber [3]
##    Year Chamber          Party   ADA  aADA
##   <dbl>   <dbl>          <chr> <dbl> <dbl>
## 1  1947       1 American Labor 91.00 78.17
## 2  1947       1       Democrat 73.66 61.97
## 3  1947       1     Republican 14.94  7.15
## 4  1947       2       Democrat 70.68 47.02
## 5  1947       2     Republican 23.73  4.65
## 6  1948       1 American Labor 72.50 67.58

American Labor? Good old times, eh. Next, we include states to track them over time:

states <- adaData %>%
  group_by(Year, StateAbbr, Chamber, Party) %>%
  summarise(ADA = round(mean(Nominal.Score), 2),
            aADA = round(mean(Adjusted.Score), 2))
head(states)

## # A tibble: 6 x 6
## # Groups:   Year, StateAbbr, Chamber [6]
##    Year StateAbbr Chamber    Party   ADA  aADA
##   <dbl>     <chr>   <dbl>    <chr> <dbl> <dbl>
## 1  1947        AL       1 Democrat 78.33 66.34
## 2  1947        AL       2 Democrat 95.00 68.96
## 3  1947        AR       1 Democrat 58.00 47.36
## 4  1947        AR       2 Democrat 70.00 46.40
## 5  1947        AZ       1 Democrat 87.00 74.43
## 6  1947        AZ       2 Democrat 95.00 68.96

At this point, we create multiple copies using dplyr primarily with the Shiny app in mind. Also note that some year/state/chamber combinations will be missing depending on election performance. Finally, in addition to nation-wide averages, we would also want some indicator of within-state change over time. A basic way of doing this is to calculate a lagged value of this grouped set after arranging by year:

trend <- states  %>%
  arrange(Year) %>%
  group_by(StateAbbr, Chamber, Party) %>%
  mutate(Diff = ADA - lag(ADA),
         Diff2 = aADA - lag(aADA))
head(trend[trend$Year == 1948, ]) #First year will have NAs because of the lag structure

## # A tibble: 6 x 8
## # Groups:   StateAbbr, Chamber, Party [6]
##    Year StateAbbr Chamber    Party   ADA  aADA   Diff  Diff2
##   <dbl>     <chr>   <dbl>    <chr> <dbl> <dbl>  <dbl>  <dbl>
## 1  1948        AL       1 Democrat 60.22 54.48 -18.11 -11.86
## 2  1948        AL       2 Democrat 93.50 70.52  -1.50   1.56
## 3  1948        AR       1 Democrat 47.86 41.29 -10.14  -6.07
## 4  1948        AR       2 Democrat 49.50 27.98 -20.50 -18.42
## 5  1948        AZ       1 Democrat 72.50 67.58 -14.50  -6.85
## 6  1948        AZ       2 Democrat 65.00 42.96 -30.00 -26.00

At this point, we might as well provide a baseline value to signal the change from the previous year; that is, whether the shift is toward liberalism or conservatism. Given the coding, this simply leads to a binary classification based on sign:

trend$Threshold <- factor(ifelse(trend$Diff > 0, "Liberal", "Conservative"),
                          levels = c("Liberal", "Conservative"))
trend$Threshold2 <- factor(ifelse(trend$Diff2 > 0, "Liberal", "Conservative"),
                          levels = c("Liberal", "Conservative"))
head(trend[trend$Year == 1948, 6:10], n = 2)

## # A tibble: 2 x 5
##    aADA   Diff  Diff2    Threshold   Threshold2
##   <dbl>  <dbl>  <dbl>       <fctr>       <fctr>
## 1 54.48 -18.11 -11.86 Conservative Conservative
## 2 70.52  -1.50   1.56 Conservative      Liberal

This wraps up the data setup.

Visualization

Disclaimer: Apparently blogdown is not playing well with some widgets (see 1, 2, 3, 4, and 5). I only realized this after deploying the website. Specifically, highcharter maps that are displayed locally do not carry over to R Markdown. I am currently implementing the solution (hack?) described in 5, which suggests binding the output and not evaluating the code, saving it as a widget, and then calling it as an iframe object (this is what you get to see). Yeah, not the prettiest method, but think of this as a band-aid solution for now. Hope blogdown fixes this soon, because it’s great. Moving on.

Edit to Disclaimer: I decided to replace the hack with static pictures instead, as it messes up the pagination settings.

There are many good R packages for charting maps. One of the better looking ones is highcharter.¹ Furthermore, I have yet to try it so I will use this excuse to get to know it. Relevant.² Anyway, it is powered by htmlwidgets so it comes with built-in Shiny integration. You can chart US states by calling:

library(highcharter)
map1 <- hcmap("countries/us/us-all")

hcmap creates an interactive legend by default; if you click on ‘Series 1’, the map will disappear because we have yet to project any other data. You can turn this behavior off by passing showInLegend = FALSE. You can also get and download the map data:

mapData <- get_data_from_map(download_map_data("countries/us/us-all"))
head(mapData$`hc-a2`)

## [1] "MA" "WA" "CA" "OR" "WI" "ME"

We see that state abbreviations are stored in hc-a2. We will us this as the key to match it with the states data set. Let’s map House Democrats in 1963 as an example:

hd63 <- states[states$Year == 1963 & states$Party == "Democrat" & states$Chamber == 1, ]
head(hd63)

## # A tibble: 6 x 6
## # Groups:   Year, StateAbbr, Chamber [6]
##    Year StateAbbr Chamber    Party   ADA  aADA
##   <dbl>     <chr>   <dbl>    <chr> <dbl> <dbl>
## 1  1963        AK       1 Democrat 92.00 81.40
## 2  1963        AL       1 Democrat 51.25 47.95
## 3  1963        AR       1 Democrat 56.25 52.06
## 4  1963        AZ       1 Democrat 75.00 67.45
## 5  1963        CA       1 Democrat 91.87 81.29
## 6  1963        CO       1 Democrat 83.50 74.42

mapData <- (download_map_data("countries/us/us-all")) #We overrride the version we downloaded in the previous chunk

hc <- highchart(type = "map") %>%
          hc_add_series_map(map = mapData,                       #map data
                            df = hd63,                           #voting score data subset
                            value = "ADA",                       #variable to map from df
                            joinBy = c("hc-a2", "StateAbbr"),    #linking map and data by state
                            name = "ADA Voting Score",           #hover title
                            nullColor = "#dadada",               #null color for NAs
                            borderColor = "white",               #border outline color
                            dataLabels = list(enabled = TRUE,    #hover data display
                            format = '{point.name}'))

Looks good for a first cut. We first call the appropriate map data from highcharts.com. We specify the data set of interest with df. value takes a string that specifies the variable we want to chart. joinBy links the two data sets by a matching variable; in our case abbreviated state codes. I assigned a null color to differentiate zeros from NAs.

Still, the colors can use more work. At it is now, we get a spectrum between 0 and 100, with the former signaling a slide towards conservatism and the latter conveying more liberal voting patterns. You might want to assign canonical colors of the parties to advertise these shifts. For this, we need to supply stop breaks to the hc_colorAxis. Say we want to assign red to 0, white to 50 (to display moderation/on-the-fence behavior), and blue to 100:

stops <- data.frame(q = c(0, .5, 1),
                    c = c("#ea5148", "#ffffff", "#18469e"))
stops

##     q       c
## 1 0.0 #ea5148
## 2 0.5 #ffffff
## 3 1.0 #18469e

stops <- list_parse2(stops)    #highcharter wrapper
map3 <- hc_colorAxis(hc, stops = stops)

Now we can clearly differentiate between voting patterns: States with blue hues maintained a liberal agenda, whereas those that did slide into voting more conservative (remember, we are only looking at House Democrats) are colored red. Finally, let’s add a title and change the border color to grey for increased visibility. Similar to ggplot, highcharter uses the pipe operator so it’s a breeze:

hc <- highchart(type = "map") %>%
          hc_add_series_map(map = mapData, df = hd63, value = "ADA", joinBy = c("hc-a2", "StateAbbr"),    
                            name = "ADA Voting Score", nullColor = "#dadada", borderColor = "grey", borderWidth = .2,   
                            dataLabels = list(enabled = TRUE, format = '{point.name}')) %>%
          hc_title(text = "1963 House Democrats ADA Voting Scores")
map4 <- hc_colorAxis(hc, stops = stops)

We can also convey yearly changes with a deviation plot using the ggpubr package. We have already calculated within-state change by lagging the ADA scores by one year after grouping. ggpubr as the name suggests builds on ggplot so the grammar works the way you would expect:

hd63T <- trend[trend$Year == 1963 & trend$Party == "Democrat" & trend$Chamber == 1, ]
library(ggpubr)
p <- ggbarplot(data = hd63T,
               x = "StateAbbr",
               y = "Diff",                               #use difference in nominal score
               fill = "Threshold",                       #factor identifying negative/positive scores
               color = "white", size = .1, width = 1,    #bin color, size, outline
               palette = c("#18469e", "#ea5148"),        #red/blue colorway
               sort.val = "asc",                         #sort by ascending value
               sort.by.groups = FALSE,                   #do not sort by group
               x.text.angle = 90,                        #rotate x axis text
               ylab = "Change from Last Year",
               xlab = FALSE,                             #hide x axis
               legend.title = "Trend Direction")
p +
geom_abline(slope = 0, color = "gray")                   #add a reference line at zero

All good and well, however, showing stats from a single year (plus only one chamber, one party) is not that captivating. Enter Shiny. I will not go into the details of building the app, as the code is publicly available on GitHub.³ Instead, I will briefly touch on how to transform the above code into reactive programming.

Assume we want to allow the users to switch years, chambers, party, and ADA score type. In order words, we need to have input concerning these four variables. The easiest way is to create a dynamic subsetting mechanism. An example would be:

#Not evaluated
selectedData <- reactive({
    states <- states[states$Year == input$Year & states$Party == input$Party &
                         states$Chamber == ifelse(input$Chamber == "House", 1, 2), ]
  })

This creates a reactive data set called selectedData() that changes according to user input to any of the variables defined above. Within Shiny, you have to call reactive items with brackets (). Using them outside of reactive environments will also irritate R. Below is a screenshot of the opening page of the completed app:

If you made it this far and still haven’t checked the app, you should do so now! The inclusion of president portraits took some consideration, however it made a strong case for starting out with JFK.

Get inspired at http://jkunst.com/highcharter/index.html ↩
Literally what happened.↩
Which assumes some understanding of how Shiny works. Feel free to contact me on any one of these platforms if you have questions.↩