Homework: I love it when a Plot comes together
1. This week, we’ll work with a dataset from sampling plankton in the Plum Island Estuary by the PIE Long Term Ecological Research site. This dataset is in an Excel file with both metadata and data. There’s a lot of information in it, and we’ll come back to this dataset a few times through the semester. To load the data, make a folder called data. Download this file and save it into the data folder. Then, load the data with:
library(readxl)
lter_dat <- read_excel("data/EST-PR-PlanktonChemTax.xls",
                       sheet = "EST-PR-PlanktonChemTax",
                       na = "NA")
You could also have specified sheet = 2. Hey! Now you know how to read in a tabular Excel file! AND you even specified which strings should be read as NA!
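Here is a small sketch of the two ways to pick a sheet. It uses the demo workbook that ships with readxl (`readxl_example("datasets.xlsx")`) as a stand-in for the PIE file, so it should run without any download. The workbook and the object names are assumptions for illustration only; swap in your own path and sheet.

```r
library(readxl)

# Demo workbook bundled with readxl (a stand-in, not the homework data)
demo_path <- readxl_example("datasets.xlsx")

# List the sheet names so you know what you can ask for
excel_sheets(demo_path)

# Select a sheet by position...
by_index <- read_excel(demo_path, sheet = 2, na = "NA")

# ...or by name; both calls read the same sheet
by_name <- read_excel(demo_path, sheet = excel_sheets(demo_path)[2], na = "NA")

# Compare the two results
identical(by_index, by_name)
```

Selecting by name tends to be safer if someone later reorders the tabs in the workbook.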
1A. Make a scatterplot of the relationship between Chlorophytes and TotalChlA. Is there more Chlorophyll when there are more Chlorophytes? Note: if one of your axes is unreadable because it shows a tick for every value, that column was read in as character rather than numeric. Go back and fix this in how you load the data. See the code above.
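A minimal sketch of the scatterplot pattern, using R’s built-in mtcars data as a stand-in (wt and mpg are placeholders for Chlorophytes and TotalChlA, not the real columns). It also shows one way to check column types, since a character column is what typically produces an unreadable axis.

```r
library(ggplot2)

# Stand-in data: built-in mtcars, not the plankton dataset
dat <- mtcars

# Check the column types first: a character column is treated as
# discrete, so every distinct value gets its own axis tick
sapply(dat[, c("wt", "mpg")], class)

# Basic scatterplot: one point per row
scatter <- ggplot(dat, aes(x = wt, y = mpg)) +
  geom_point()

scatter
```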
1B. Many processes can modify this relationship. They all tend to covary with distance from the mouth of the estuary, where it empties into the ocean and is highly saline. Maybe distance from the estuary mouth (the Distance column) affects the relationship between Chlorophytes and total chlorophyll? Can you see any pattern of how distance alters this relationship by coloring the points by Distance? Use something other than the default color scale. What do you learn?
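One possible way to color points by a continuous variable with a non-default scale is sketched below. It again uses mtcars as a stand-in, with hp playing the role of Distance. The viridis scale is just one option among many, not a required choice.

```r
library(ggplot2)

# Stand-in data: mtcars, with hp playing the role of Distance
colored <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point(size = 2) +
  # Replace the default blue gradient with a viridis palette
  scale_color_viridis_c(option = "magma")

colored
```

Other continuous scales, such as scale_color_gradient() or scale_color_distiller(), slot into the same place.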
1C. As distance is continuous, any patterns might still be hard to see. What if we made a discrete variable out of distance using cut_interval() and used facet_wrap() to see its influence?
For example, facet_wrap(vars(cut_interval(X, n = 4))) will facet a figure by variable X, where it has split X into four groups with equal range. Try different values of n. Feel free to keep color if you think it useful.
What patterns do you see?
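To see what cut_interval() actually does before faceting with it, here is a rough sketch on the built-in mtcars data (hp stands in for the continuous variable, so the groups here are not the homework’s). Tabulating the result first shows that the groups have equal range, not equal counts.

```r
library(ggplot2)

# Stand-in data: mtcars; hp is the continuous variable being binned.
# cut_interval() makes n groups of equal range (not equal counts)
table(cut_interval(mtcars$hp, n = 4))

# Facet the scatterplot by those four groups
faceted <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(vars(cut_interval(hp, n = 4)))

faceted
```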
1D. The estuary was sampled at times of year when temperature varied, and distance from the mouth might have a different effect under cold vs. warm temperatures. Let’s look at whether temperature and distance act in concert using facets.
What do you see if you use both Distance and Temp with cut_interval() and facet_grid() to examine the effects of both temperature and distance from the mouth? How do temperature and distance jointly influence the relationship? (Note: this might be easier if you drop color.)
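A sketch of the two-variable version, assuming the same stand-in data: facet_grid() takes one binned variable for rows and another for columns. Here qsec and hp are placeholders for Temp and Distance, and the values of n are arbitrary.

```r
library(ggplot2)

# Stand-in data: mtcars; qsec and hp are placeholders for Temp and Distance
grid_plot <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point() +
  facet_grid(
    rows = vars(cut_interval(qsec, n = 2)),  # row facets: first binned variable
    cols = vars(cut_interval(hp, n = 3))     # column facets: second binned variable
  )

grid_plot
```

Reading across a row holds the row variable roughly constant while the column variable changes, and vice versa for columns.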
2. Let’s make this plot look good! Choose one of the plots that you worked on in part 1.
2A. Give it a title, x label, and y label with labs(). You can learn more about the variables in the first tab of the Excel spreadsheet.
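For reference, the general shape of a labs() call looks something like this. The plot and label text are for the stand-in mtcars data, so treat them as placeholders rather than labels for the plankton variables.

```r
library(ggplot2)

# Stand-in plot on mtcars; labels describe those columns only
labeled <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Heavier cars get fewer miles per gallon",
    x = "Weight (1000 lbs)",
    y = "Fuel economy (miles per US gallon)"
  )

labeled
```

Including units in the axis labels saves the reader a trip to the metadata.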
2B. Now, let’s theme it using the ggthemes package. Look through the theme options it gives you. Choose one and add it to your plot (e.g., + theme_bw(base_size = 12), which ships with ggplot2 itself; ggthemes adds others such as theme_few()). Why did you choose this theme? What about it aids in your visualization?
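As a rough sketch of how a ggthemes theme gets added, here is theme_few() on the stand-in mtcars plot. It assumes ggthemes is installed, and theme_few() is only one example, not a recommendation.

```r
library(ggplot2)
library(ggthemes)  # assumes install.packages("ggthemes") has been run

# A complete theme is added like any other layer; base_size sets the
# base font size in points
themed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme_few(base_size = 12)

themed
```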
2C. Extra credit - look at the theme help file. Customize your plot even more using theme() and justify your choices.
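A sketch of what customizing with theme() can look like, layered on top of a complete theme. The particular tweaks here (legend position, minor grid lines, bold title) are arbitrary illustrations on the stand-in mtcars data, not suggested answers.

```r
library(ggplot2)

custom <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point() +
  labs(title = "Weight vs. fuel economy") +
  theme_bw(base_size = 12) +
  # theme() tweaks go after the complete theme, or they get overwritten
  theme(
    legend.position = "bottom",              # move the legend under the plot
    panel.grid.minor = element_blank(),      # drop minor grid lines
    plot.title = element_text(face = "bold") # bold title
  )

custom
```

Order matters: adding theme_bw() after theme() would reset the custom settings.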
3. What is your favorite data visualization? Grab a jpg of it and put it into this Quarto document (you’ll need to look up the syntax for getting images into Quarto documents).
Now tell us why this is your favorite example of a data visualization.
4. Based on what you talked about with respect to data for a final project in last week’s homework, describe one visualization you’d like to make with it.