Politics and Education - cont'd

Below is data from the 2008 General Social Survey about education levels and party affiliation. This time we will look at ways to manipulate tables in R to easily work with data like this.

# Load the survey counts
myData = read.csv('http://people.hsc.edu/faculty-staff/blins/spring17/math222/data/polpartyfull.csv')
# Turn both variables into ordered factors so the table prints in a sensible order
education = factor(myData$Education,levels=c("None","Highschool","JrCollege","Bachelor","Graduate"),ordered=TRUE)
politics = factor(myData$Politics,levels=c("StrongDemocrat","WeakDemocrat","NearDemocrat","Otherparty","Independent","NearRepublican","WeakRepublican","StrongRepublican"),ordered=TRUE)
# Cross-tabulate the counts by education and party
myTable = xtabs(myData$Count~education+politics)
myTable
##             politics
## education    StrongDemocrat WeakDemocrat NearDemocrat Otherparty
##   None                   63           45           34          2
##   Highschool            185          183          132         16
##   JrCollege              32           30           19          4
##   Bachelor               56           44           57         11
##   Graduate               54           29           20          5
##             politics
## education    Independent NearRepublican WeakRepublican StrongRepublican
##   None                87             19             20               25
##   Highschool         156             78            147               98
##   JrCollege           22             20             33               13
##   Bachelor            31             35             76               43
##   Graduate            26             10             27               22

Selecting a Single Row or Column

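To select a single row or column of a table in R, index it with square brackets: the row goes before the comma and the column goes after it. Leaving one side blank keeps everything along that dimension, so the command below returns the whole Bachelor row.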

myTable["Bachelor",]
##   StrongDemocrat     WeakDemocrat     NearDemocrat       Otherparty 
##               56               44               57               11 
##      Independent   NearRepublican   WeakRepublican StrongRepublican 
##               31               35               76               43
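The same bracket syntax selects a column or a single cell. The sketch below is only an illustration: it rebuilds a small excerpt of the table by hand (the name excerpt is made up for this example, and the counts are copied from the output above) so that it runs without downloading the data.

```r
# Hypothetical excerpt: two rows and three columns copied from the table above
excerpt <- as.table(matrix(c(57, 11, 31,
                             20,  5, 26),
                           nrow = 2, byrow = TRUE,
                           dimnames = list(education = c("Bachelor", "Graduate"),
                                           politics = c("NearDemocrat", "Otherparty", "Independent"))))

# Blank before the comma: keep every row, select one column by name
independentColumn <- excerpt[, "Independent"]
independentColumn

# Name on both sides of the comma: select a single cell
oneCell <- excerpt["Graduate", "Independent"]
oneCell
```

With the full table, myTable[,"Independent"] and myTable["Graduate","Independent"] should work the same way.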

Removing Rows or Columns

In the table above, very few people identify as Otherparty. Counts this small can make the expected cell counts too low for the chi-square test to be reliable, so one option is to remove that column. The command to do this is myTable[,-4], since Otherparty is the fourth column. To remove a row instead, use myTable[-m,], where m is the number of the row to remove. Note that the minus sign only works with numbers: a name can select a row or column, but it cannot be negated, so removing by name takes a logical test on colnames() or rownames().

myTable = myTable[,-4]
myTable
##             politics
## education    StrongDemocrat WeakDemocrat NearDemocrat Independent
##   None                   63           45           34          87
##   Highschool            185          183          132         156
##   JrCollege              32           30           19          22
##   Bachelor               56           44           57          31
##   Graduate               54           29           20          26
##             politics
## education    NearRepublican WeakRepublican StrongRepublican
##   None                   19             20               25
##   Highschool             78            147               98
##   JrCollege              20             33               13
##   Bachelor               35             76               43
##   Graduate               10             27               22
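To see why the Otherparty column is a concern, and how a column can be dropped by name, here is a minimal sketch. It assumes the counts from the first table, typed in by hand; fullTable, expected, and noOther are names invented for this example. The cutoff of 5 for expected counts is a common rule of thumb, not a hard requirement.

```r
# Counts copied from the first table in this section
politicsLevels <- c("StrongDemocrat", "WeakDemocrat", "NearDemocrat", "Otherparty",
                    "Independent", "NearRepublican", "WeakRepublican", "StrongRepublican")
educationLevels <- c("None", "Highschool", "JrCollege", "Bachelor", "Graduate")
fullTable <- as.table(matrix(c(
   63,  45,  34,  2,  87, 19,  20, 25,
  185, 183, 132, 16, 156, 78, 147, 98,
   32,  30,  19,  4,  22, 20,  33, 13,
   56,  44,  57, 11,  31, 35,  76, 43,
   54,  29,  20,  5,  26, 10,  27, 22),
  nrow = 5, byrow = TRUE,
  dimnames = list(education = educationLevels, politics = politicsLevels)))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(fullTable), colSums(fullTable)) / sum(fullTable)
round(expected[, "Otherparty"], 2)   # compare these with the rule-of-thumb cutoff of 5

# Negative indices must be numeric, so drop a named column with a logical test
noOther <- fullTable[, colnames(fullTable) != "Otherparty"]
colnames(noOther)
```

The logical test gives the same result as myTable[,-4] but does not depend on the column's position.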

Combining Rows or Columns

The cbind() function lets you combine column vectors into a matrix. You can use this to combine columns from a table in R. There is also an rbind() function that builds a matrix out of rows.

myTable2 = cbind(myTable[,"StrongDemocrat"]+myTable[,"WeakDemocrat"]+myTable[,"NearDemocrat"],myTable[,"Independent"],myTable[,"StrongRepublican"]+myTable[,"WeakRepublican"]+myTable[,"NearRepublican"])
myTable2
##            [,1] [,2] [,3]
## None        142   87   64
## Highschool  500  156  323
## JrCollege    81   22   66
## Bachelor    157   31  154
## Graduate    103   26   59

Notice that we lost the column names, because cbind() was given unnamed vectors. You can assign new column names with the colnames() function:

colnames(myTable2)=c("Democrat","Independent","Republican")
myTable2
##            Democrat Independent Republican
## None            142          87         64
## Highschool      500         156        323
## JrCollege        81          22         66
## Bachelor        157          31        154
## Graduate        103          26         59

A mosaic plot gives a quick visual summary of the combined table:

mosaicplot(myTable2,color=TRUE,las=1)
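As a possible shortcut, naming the arguments passed to cbind() sets the column names in one step, and rbind() combines rows in the same way. The sketch below assumes the seven-column table shown earlier, typed in by hand; sevenCol, collapsed, and byRows are names invented for this example, and merging None with Highschool is only an illustration of rbind(), not a step from the analysis above.

```r
# Counts copied from the table printed after Otherparty was removed
sevenCol <- as.table(matrix(c(
   63,  45,  34,  87, 19,  20, 25,
  185, 183, 132, 156, 78, 147, 98,
   32,  30,  19,  22, 20,  33, 13,
   56,  44,  57,  31, 35,  76, 43,
   54,  29,  20,  26, 10,  27, 22),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    education = c("None", "Highschool", "JrCollege", "Bachelor", "Graduate"),
    politics = c("StrongDemocrat", "WeakDemocrat", "NearDemocrat", "Independent",
                 "NearRepublican", "WeakRepublican", "StrongRepublican"))))

# Named arguments to cbind() become the column names of the result
collapsed <- cbind(
  Democrat    = rowSums(sevenCol[, c("StrongDemocrat", "WeakDemocrat", "NearDemocrat")]),
  Independent = sevenCol[, "Independent"],
  Republican  = rowSums(sevenCol[, c("StrongRepublican", "WeakRepublican", "NearRepublican")]))
collapsed

# rbind() stacks rows: merge the two lowest education levels, keep the rest as they are
byRows <- rbind(
  HighschoolOrLess = collapsed["None", ] + collapsed["Highschool", ],
  collapsed[c("JrCollege", "Bachelor", "Graduate"), ])
byRows
```

Here rowSums() replaces the chain of + signs; either form should give the same totals as myTable2.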