Publications by Tyler Simko
API-209: predict()
Using the predict() function So far, we have calculated predicted values using our models visually (by looking at regression functions) or manually (by adding and multiplying our estimated parameters). predict() is a built-in R function that can do this for you. We will do a short example here to give you better practice with the syntax: For th...
2516 sym R (2322 sym/18 pcs) 2 tbl
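As a rough illustration of the syntax the excerpt above describes, here is a minimal sketch. It assumes R's built-in mtcars data set and a hypothetical one-predictor model; neither comes from the course materials, and the names fit and new_cars are invented for this example. It compares predict() against the manual calculation the excerpt mentions.

```r
# Hypothetical example using the built-in mtcars data (not the course's data set).
fit <- lm(mpg ~ wt, data = mtcars)

# New observations must use the same predictor name as the model formula.
new_cars <- data.frame(wt = c(2.5, 3.5))

# predict() applies the estimated coefficients to each row of newdata.
predicted <- predict(fit, newdata = new_cars)

# The same values computed manually: intercept + slope * wt.
manual <- coef(fit)[["(Intercept)"]] + coef(fit)[["wt"]] * new_cars$wt

# Check that the two approaches agree (names are dropped before comparing).
all.equal(unname(predicted), manual)
```

Passing a data frame through newdata is what lets predict() score values that were not in the original data; without it, predict() returns the fitted values for the rows used to estimate the model.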
API-209: Prediction
For this question, we will revisit and expand on the housing data from Buenos Aires that you have been using in section. Once again, please imagine that you are a policy analyst for Asociación Civil por la Igualdad y la Justicia, an advocacy organization in Argentina. You specialize in housing policy, and are preparing a report about housing aff...
7026 sym 6 img
Gov 50: Assignment Instructions
These instructions will walk you through submitting Problem Sets and Exams. Gov 50 staff have worked very hard to make this process as smooth and convenient as possible for you, but it can be confusing the first few times! This document walks through each step of the process. There is a checklist at the bottom of this document that you may find it he...
10707 sym 11 img
Gov 50: Animation Example
p <- ggplot(gapminder, aes(x = gdpPercap, y=lifeExp, size = pop, colour = country)) + geom_point(show.legend = FALSE, alpha = 0.7) + scale_color_viridis_d() + scale_size(range = c(2, 12)) + scale_x_log10() + labs(x = "GDP per capita", y = "Life expectancy") p + transition_time(year) + labs(title = "Yea...
6 sym R (365 sym/1 pcs) 1 img
Lesson 8: Relationships Between Variables
1. What is a relationship anyway? You will often hear that two variables of some kind are “related,” “linked,” or “associated” in some way. What does this mean? Well, we generally mean that the relationship between them is predictable in some way. For example: library(tidyverse) ## ── Attaching packages ─────────...
3940 sym R (4999 sym/27 pcs) 9 img
Lesson 0: The Tools of Data Science
For this course, you will need to install some free and safe software. This document will walk you through installing the tools we will be using in this course. Installing R First, we will install a programming language called R. This is the tool we will be using throughout the course to analyze data and make plots. R is a programming language t...
5404 sym R (199 sym/3 pcs) 12 img
Lesson 7: Text Data
1. Getting Started We have primarily used nice, clean data in datasets. However, sometimes we will need to work with our data to extract the information we want. This process is called data cleaning and will inevitably be a large part of your project and all analyses you do in the future. Text data is one of the most common types of data you will...
5324 sym R (8087 sym/60 pcs) 9 img 1 tbl
Lesson 5: Mapping and Merging Data
1. Maps So far, we have worked with relatively simple data types - numbers (340) and characters ("Data Science"). However, R knows how to work with many different types of data. Today, we’ll work with maps. Download the world.rds object at this link and place it in your data folder. This is an R-Dataset file (RDS) - basically, an object that I ...
6557 sym R (12444 sym/65 pcs) 9 img
Lesson 4: Summarizing Data
1. When to summarize? When designing any visualization, you need to think about the story you want to get across. What is the most important point you are trying to make? Find that idea, and design your visualization around it. Often, you will need to make changes to your data before it is in the format you need to make the point that you want. F...
5440 sym R (8852 sym/63 pcs) 23 img
Final Project Instructions
The goal of data science is to gain knowledge and insights from data. The ultimate goal of the course is to prepare you to use the tools of data science to conduct an independent analysis of your own. This is your opportunity to apply everything we’ll learn about in the class to a topic that you personally care about. Schedule The final proje...
5021 sym