Tuesday, April 26, 2011

Plotting Maps with R



I stumbled upon this tutorial while not studying and I thought it would be fun to try and plot maps of the San Francisco Bay Area household income, education, population density, poverty, etc...

To do this I needed a Shapefile for the Bay Area zip codes similar to the London borough file used in the tutorial. I found a .shp file for the Bay Area zip code boundaries here.

I have a demographic database on my computer that I used to add the needed information to the Shapefile. In the code I included a vector of the median household incomes for the corresponding zipcodes in the the .shp file.

The following code is only slightly modified from the code included in the Spatial Analysis tutorial.


library(maptools)
library(RColorBrewer)
library(classInt)

## set the working directory.
setwd("/the/folder/where/bayarea_zipcode.shp/is/saved/")

## load the shapefile
zip<- readShapePoly("bayarea_zipcodes.shp")

#add the median household incomes for each zipcode to the .shp file
med.income = c(55995, 54448, 50520, 43846, 50537, 67705, 45142, 36518, NA, 65938, 50500, 61022, 65959, 17188, 53881, 66970, 35699, 76194, 63777, 63838, 66010, 48523, 70758, 41002, 60082, 48672, 61429, 60402, 75707, 58333, 60769, 60375, 76627, 68515, 64485, 60971, 60833, 54732, 60804, 33962, 57601, 43649, 59889, NA, 61494, 67824, 64429, 82528, 101555, 96658, 50300, 41573, 75747, 53750, 56905, 85479, 77455, 64389, 91283, 119832, 103791, 100590, 85109, 88184, 57153, 34951, 51418, 38613, 43640, 19750, 139997, 39290, NA, 76808, 98525, 106492, 68112, 68853, 77952, 34398, 75026, 33556, NA, 109771, 142459, 80959, 21124, 49066, 95588, 55321, 20034, 57976, 40990, 73571, 84710, 43444, 32273, 56569, 54174, 105393, 51896, 31542, 33152, 54879, 88976, 14609, 61609, 61776, 22351, 31131, 47288, NA, 63983, 60733, 29181, 75727, 61362, 53795, 76044, 34755, 66627, 37146, 92644, 87855, 95313, 50888, 55000, 57629, 54342, 77122, 44723, 64534, 65658, 60711, 57214, 54594, 48523, 90107, 69014, 49452, 72288, 56973, 81923, 61289, 71863, 61939, 0, 82735, 68067, 82188, 70026, 101977, 55112, 84442, 82777, 82796, 92989, 67152, 68121, 69350, 104958, 49279, 80973, 89016, 96677, 89572, 64256, 84565, 16250, 64839, 200001, 82072, 58304, 66807, 97758, 68721, 77539, 41313, NA, 82314, 164479, 69087, 145425, NA, 71056, 128853, 84856)
zip$INCOME = med.income

#select color palette and the number colors (levels of income) to represent on the map
colors <- brewer.pal(9, "YlOrRd")

#set breaks for the 9 colors
brks<-classIntervals(zip$INCOME, n=9, style="quantile")
brks<- brks$brks

#plot the map
plot(zip, col=colors[findInterval(zip$INCOME, brks,all.inside=TRUE)], axes=F)

#add a title
title(paste ("SF Bay Area Median Household Income"))

#add a legend
legend(x=6298809, y=2350000, legend=leglabs(round(brks)), fill=colors, bty="n",x.intersp = .5, y.intersp = .5)




This is the map that results from the code above.

Here is a visualization of the population density in the Bay Area.

Finally, here is a visualization of the percent of adults (25 years or older) with a bachelors degree or higher.


I'm not familiar with GIS software so I imagine that this is sort of a round about way of doing something pretty simple, but I think the results are cool.

6 comments:

  1. Very nice introduction to maps in R!

    ReplyDelete
  2. It looks very pretty!! However, when I run the script, the legend of my chart is narrowed ... Do you use another legend R-function in these plots?

    ReplyDelete
  3. There is also the geospacom R-package that automatize many of these features

    ReplyDelete
  4. How do you match the zip codes you have in your database to the one in the .shp file?
    I would like to match where people come from based on their zip codes

    ReplyDelete
  5. Where did you find the zip codes file?

    ReplyDelete
  6. very nice discovery.

    ReplyDelete