Here is a set of RDS files that contain sf objects of state county boundaries. We are going to work with these using iteration and functions for some of this week’s work.
Let’s warm up with some sf practice. The function readRDS() reads in RDS files. The dplyr function bind_rows() can take the rows of data frames, tibbles, or sf objects and bind them together properly. Using the purrr library, read in all of the county files and then combine them into a single data frame. Plot the result.
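As a rough sketch of the read-then-bind pattern, here is a toy version that does not use the course files. The directory, file names, and columns are all made up for illustration, and plain data frames stand in for the sf objects.

```r
library(purrr)
library(dplyr)

# Toy stand-ins for the county files: two small data frames saved as .rds
# in a temporary directory (hypothetical file names and columns).
tmp_dir <- file.path(tempdir(), "toy_counties")
dir.create(tmp_dir, showWarnings = FALSE)
saveRDS(data.frame(state = "A", county = c("a1", "a2")), file.path(tmp_dir, "A.rds"))
saveRDS(data.frame(state = "B", county = "b1"), file.path(tmp_dir, "B.rds"))

# List every .rds file, read each one into a list element, then stack them.
files <- list.files(tmp_dir, pattern = "\\.rds$", full.names = TRUE)
combined <- bind_rows(map(files, readRDS))

nrow(combined)
```

The same two steps (map() over file paths, then bind_rows()) should carry over to the real files, assuming they all share compatible columns.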
This is great. Now, I’m curious - is there a link between the number of counties in a state and the ratio of area of the largest county in the state to the total state area? Let’s find out!
A. Write a function that, given a state name, will use readRDS() to read in a single data file and fix up the CRS (these are all in lat/long - you want a Mollweide projection, in which distance is in meters). Plot Massachusetts to make sure everything works.
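For the CRS step, one possible approach is st_transform() with a PROJ string. The sketch below uses a single made-up point rather than a state file, and "+proj=moll" is just one way to request a Mollweide projection.

```r
library(sf)

# A single toy point in lon/lat (EPSG:4326), roughly Boston; not course data.
pt <- st_sfc(st_point(c(-71.06, 42.36)), crs = 4326)

# Reproject to Mollweide via a PROJ string; coordinates are then in meters.
pt_moll <- st_transform(pt, crs = "+proj=moll")

st_is_longlat(pt)       # the original is in degrees
st_is_longlat(pt_moll)  # the reprojected version is not
```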
B. Write a function that, given an sf object of a single state and its counties, will return a one row data frame with the number of counties, the area of the largest county, the average county area, the state’s area, and the ratio of the largest county to total area. st_area() will help you calculate area - but you will need to wrap the result in as.numeric() - and if you take an sf object and use summarize() on it, it will merge all of the polygons into one.
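To see how st_area(), as.numeric(), and summarize() behave together, here is a small sketch on two made-up squares. The helper sq(), the object names, and the choice of EPSG:3857 are assumptions for illustration only.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a 1000 m x 1000 m square whose lower-left corner is (x0, 0).
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1000, 0), c(x0 + 1000, 1000), c(x0, 1000), c(x0, 0)
  )))
}

# Two adjacent toy "counties" in a projected CRS (EPSG:3857 is an arbitrary choice).
toy <- st_sf(name = c("left", "right"),
             geometry = st_sfc(sq(0), sq(1000), crs = 3857))

# st_area() returns a units vector; as.numeric() drops the units.
county_areas <- as.numeric(st_area(toy))

# summarize() on an sf object unions the polygons into a single row.
merged <- summarize(toy, n = n())
total_area <- as.numeric(st_area(merged))

max(county_areas) / total_area
```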
C. Using iteration, make a data frame that has all of the above information for all of the states. +1 EXTRA CREDIT - have a column named state with the state name. (hint: ?setNames)
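One way the setNames() hint might play out: name a vector by its own values, iterate over it, and let bind_rows() turn the names into a column. The state list and toy_summary() below are hypothetical placeholders, not the function from part B.

```r
library(purrr)
library(dplyr)

states <- c("Maine", "Vermont")           # hypothetical inputs
named_states <- setNames(states, states)  # each element is named after itself

# Hypothetical stand-in for a per-state summary function.
toy_summary <- function(state) data.frame(name_length = nchar(state))

# map() keeps the names; bind_rows(.id = ) stores them in a new column.
result <- bind_rows(map(named_states, toy_summary), .id = "state")
result
```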
D. Plot the largest-county ratio against the number of counties! What do you learn? +1 extra credit for each exploration beyond this.
The repurrsive package has an object in it, got_chars, with information about the characters from the Game of Thrones series. Notice it is a list of lists. To explore it, check out listviewer::jsonedit(got_chars, mode = "view"). Now, using purrr functions, make a tibble with the following columns:
Who has more aliases on average? Men or women? Visualize however you see fit.
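One thing that is cool about list columns is that we can filter on them. We can remove rows with list columns that have a length of 0 with filter(lengths(x) > 0) where x is some column name. Note we are using lengths() and not length(): lengths() returns the length of each element of a list, while length() returns only the number of elements in the list.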
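A quick toy illustration of the difference, using a made-up tibble (the names and nicknames are invented and have nothing to do with got_chars):

```r
library(dplyr)

# Toy list column: Bo has no nicknames at all.
toy <- tibble(
  name = c("Ann", "Bo", "Cy"),
  nicknames = list(c("A", "Annie"), character(0), "C")
)

length(toy$nicknames)   # 3: the number of list elements (rows)
lengths(toy$nicknames)  # 2 0 1: the length of each element

# Keep only rows whose list-column entry is non-empty.
kept <- filter(toy, lengths(nicknames) > 0)
kept$name
```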
Another cool thing is that we can always tidyr::unnest() columns to expand them out, repeating, say, names or other elements of a data frame.
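Here is a minimal sketch of what unnesting does, again on invented data rather than got_chars. Each element of the list column becomes its own row, and the other columns repeat to match.

```r
library(dplyr)
library(tidyr)

# Toy data: Ann has two nicknames, Cy has one.
toy <- tibble(
  name = c("Ann", "Cy"),
  nicknames = list(c("A", "Annie"), "C")
)

# One row per nickname; name is repeated as needed.
long <- unnest(toy, cols = nicknames)

# Counting rows per name then gives the number of nicknames each person has.
count(long, name, sort = TRUE)
```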
A. Select just name and aliases. Filter the resulting data down to something usable, and then unnest aliases. Use the resulting data to determine who had the most aliases!
B. Great! Now, let’s use this idea of unnesting to build and then visualize a dataset that shows, within each allegiance, whether men or women have more aliases. What does this visualization teach you about the different allegiances?
E.C. +8 Write a function that takes a state name and plots the state, but with the height of each county as its % area, using deckgl or mapdeck.