1.1 Download the data
It is important that you follow along with the guide by replicating the code and analysis presented throughout. To do so, you will need the three datasets linked below. You may find it convenient to save the files to a project folder for this course.
DCPS testing.RData (download DCPS data here)
This dataset from the DC Public School System records the results of the PARCC (Partnership for Assessment of Readiness for College and Careers) Assessment from 2017-2018. This version of the data includes the school name, level, and number of students tested, as well as the percentage of students performing at or above grade level in language and math. You can find more information about the test at the DC PARCC results page.
biopics.xls (download biopics data here)
This is a shortened version of the data behind the story “‘Straight Outta Compton’ Is The Rare Biopic Not About White Dudes,” published on fivethirtyeight.com. It contains IMDb data on 177 biopics from 1915 to 2014. Variables include the sex and race of the lead actor at the center of the biopic and the year in which the film was released.
dc_weather_2018.csv (download weather data here)
The National Oceanic and Atmospheric Administration maintains networks of weather monitoring stations around the country, an important input into understanding weather and climate. Collected during 2018 as part of the Global Historical Climatology Network Daily (GHCND) data, this file contains daily weather information from a station at Washington National Airport. It includes information about temperature and a categorization of weather type (e.g., rain, snow, fog).
Note: in some browsers (e.g., Safari), a left click on the link for the weather data above will display the file in the browser rather than downloading it, because it is a plain-text comma-separated values (CSV) file. One easy way to download the file is to right click on the link and then choose “Download linked file,” “Download linked file as…,” or a similar option.
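Once the downloads finish, it can help to confirm that all three files actually landed in your project folder, especially if a browser displayed the CSV instead of saving it. The following is a minimal sketch in Python, using only the standard library. The helper name `missing_files` is hypothetical, the filenames are assumed to match the labels above exactly (your browser may have renamed them), and `"."` stands in for wherever you saved the files.

```python
from pathlib import Path


def missing_files(folder, expected):
    """Return the names in `expected` that are not regular files in `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: filenames are exactly as labeled in this section.
    expected = ["DCPS testing.RData", "biopics.xls", "dc_weather_2018.csv"]
    # Assumption: "." (the current working directory) is your project folder.
    missing = missing_files(".", expected)
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All three datasets found.")
```

If a file is reported as missing, check your browser's default download location and the exact filename before downloading it again.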