In the R community, there’s a weekly event known as Tidy Tuesday where everyone comes together around a single big dataset and attempts to create the most interesting visualizations possible, posting code and data viz on Twitter using #TidyTuesday. I’d like us to try a… Tidy Friday, with data on the ongoing coronavirus pandemic.
With the Covid-19 coronavirus outbreak spreading across the world, people want to know more, and they want to know more now! Fortunately, the R community has begun coming together and making tools to rapidly disseminate data. There are currently two packages out there, but let’s focus on the coronavirus package. To install it, use the following code:
install.packages("coronavirus")
Let’s see what is there:
library(coronavirus)
head(coronavirus)
## date province country lat long type cases uid iso2 iso3
## 1 2020-01-22 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## 2 2020-01-23 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## 3 2020-01-24 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## 4 2020-01-25 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## 5 2020-01-26 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## 6 2020-01-27 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
## code3 combined_key population continent_name continent_code
## 1 124 Alberta, Canada 4413146 North America NA
## 2 124 Alberta, Canada 4413146 North America NA
## 3 124 Alberta, Canada 4413146 North America NA
## 4 124 Alberta, Canada 4413146 North America NA
## 5 124 Alberta, Canada 4413146 North America NA
## 6 124 Alberta, Canada 4413146 North America NA
Each row is a single record of cases for one place, one date, and one case type. There is information on the province, country, continent, latitude, longitude, and date. We also see the number of cases and whether the row counts confirmed cases, recoveries, or deaths (the type column). You can take a look at ?coronavirus for more information or at https://ramikrispin.github.io/coronavirus/.
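Since every row is a count for one place, date, and type, a natural first step is to add the rows up. Here is a minimal sketch of that idea in base R. The function name total_by_type and the toy data frame are my own inventions for illustration (the toy numbers are made up, not real counts); the only thing assumed about the real data is that it has the country, type, and cases columns shown above.

```r
# Sum the daily counts into one total per country and case type.
# `df` is assumed to have the columns country, type, and cases.
total_by_type <- function(df) {
  totals <- aggregate(cases ~ country + type, data = df, FUN = sum)
  # Sort so each country's types sit next to each other
  totals[order(totals$country, totals$type), ]
}

# Tiny made-up data frame so the sketch runs without the package installed.
# These numbers are illustrative only.
toy <- data.frame(
  country = c("A", "A", "A", "B", "B"),
  type    = c("confirmed", "confirmed", "death", "confirmed", "death"),
  cases   = c(3, 5, 1, 2, 0)
)
toy_totals <- total_by_type(toy)
print(toy_totals)

# With the real data loaded, the same call should work:
# library(coronavirus)
# head(total_by_type(coronavirus))
```

Note that aggregate() with a formula silently drops rows where cases is missing, so check for NA values first if exact totals matter.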
There is also a second dataset on vaccines.
head(covid19_vaccine)
## country_region date doses_admin people_partially_vaccinated
## 1 Afghanistan 2021-02-22 0 0
## 2 Afghanistan 2021-02-23 0 0
## 3 Afghanistan 2021-02-24 0 0
## 4 Afghanistan 2021-02-25 0 0
## 5 Afghanistan 2021-02-26 0 0
## 6 Afghanistan 2021-02-27 0 0
## people_fully_vaccinated report_date_string uid province_state iso2 iso3 code3
## 1 0 2021-02-22 4 <NA> AF AFG 4
## 2 0 2021-02-23 4 <NA> AF AFG 4
## 3 0 2021-02-24 4 <NA> AF AFG 4
## 4 0 2021-02-25 4 <NA> AF AFG 4
## 5 0 2021-02-26 4 <NA> AF AFG 4
## 6 0 2021-02-27 4 <NA> AF AFG 4
## fips lat long combined_key population continent_name continent_code
## 1 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
## 2 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
## 3 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
## 4 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
## 5 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
## 6 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
This dataset is much richer in spatial data, but it can be used in a similar manner.
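As one example of using it in a similar manner, here is a rough sketch that keeps each country's most recent report and computes the share of the population that is fully vaccinated. The function name latest_vaccination_share and the toy data are hypothetical (the toy values are made up), and the sketch assumes that country-level rows are the ones where province_state is NA, as in the rows printed above. Check that assumption against the real data before trusting the result.

```r
# Share of each country's population that is fully vaccinated,
# using the most recent report per country.
# `df` is assumed to have the columns country_region, province_state,
# date, people_fully_vaccinated, and population.
latest_vaccination_share <- function(df) {
  # Keep country-level rows only (assumption: province_state is NA there)
  df <- df[is.na(df$province_state), ]
  # Sort by country and date, then keep the last row for each country
  df <- df[order(df$country_region, df$date), ]
  latest <- df[!duplicated(df$country_region, fromLast = TRUE), ]
  latest$share_fully <- latest$people_fully_vaccinated / latest$population
  latest[, c("country_region", "date", "share_fully")]
}

# Tiny made-up data frame so the sketch runs without the package installed.
toy_vax <- data.frame(
  country_region          = c("A", "A", "B", "B"),
  province_state          = NA_character_,
  date                    = as.Date(c("2021-03-01", "2021-03-02",
                                      "2021-03-01", "2021-03-02")),
  people_fully_vaccinated = c(10, 20, 5, 50),
  population              = c(100, 100, 200, 200)
)
toy_share <- latest_vaccination_share(toy_vax)
print(toy_share)

# With the real data loaded, the same call should work:
# library(coronavirus)
# head(latest_vaccination_share(covid19_vaccine))
```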
I want you to load the data, look through it, and then make it tell a story! To do this, I want you to
ggthemes
or other places. Google around.
As you make your first cool data viz, copy it and post it to Slack! Show it off to the class so everyone can see what you found!
After you finish your lab report (your .Rmd file), compile it, and submit it at your homework link. I want to make a gallery of interesting reports for you all to look at to see what is possible.
Extra credit if you post it to Twitter with the hashtags #rstats and #coronavirus, making sure to mention that you used https://ramikrispin.github.io/coronavirus/.