Publications by Thomas Wood (
Dplyr and dates exercise
A couple of exercises with dplyr and lubridate. Two things I’d like to accomplish with today’s quick exercise: Further embed some of the tidy principles. Use a new package, lubridate, which is super useful for manipulating date data, something that comes up in stats applications all the time. Specifically, that we haven’t had previous lab exposu...
1761 sym R (181 sym/1 pcs)
A fleeting tidy reprise
Reprise exercises We’ve not all been in the same room for such a long time! Let’s make sure that the tidy toolkit is still at our fingertips. Some nice NFL game-by-game player data (t0) and some salary data (t1) library(tidyverse) library(nflreadr) library(magrittr) t0 <- load_player_stats( seasons = T, stat_type = "offense" ...
1359 sym R (157 sym/1 pcs)
Document
Strings Strings (just sequences of natural language letters, intended to be legible by humans) are super useful for doing categorical data analysis. Whenever people from outside R look at the way it’s used, often they’re struck by how flexible and fundamental R’s string implementation is. Manipulating and interacting with label text is oft...
3997 sym R (12165 sym/23 pcs) 1 img 2 tbl
Code Lab review exercise
Review exercises Since we’ll not be meeting for a couple of weeks, I wanted to provide a couple of questions to further embed some of the recent tidy tools we’ve demonstrated. The following is a table summarizing college football games. It includes every game between 1869 and 2022. library(tidyverse) library(magrittr) t1 <- "https://github....
1319 sym R (152 sym/1 pcs)
Code Lab Session 2--Pivoting data
Reshaping tabular data I can very clearly remember reading Hadley Wickham’s 2007 JSS paper (written while still a grad student!) on reshaping data, and it was a real struggle to understand why we’d want to invest so much time in his approach to reshaping. I initially thought reshaping was a toolkit designed to address an anomaly, where we...
6699 sym R (7209 sym/20 pcs) 1 img 3 tbl
2023 code lab, lab 1 - dplyr verbs
dplyr’s 5 verbs (and 1 adverb) The principal advantage to the dplyr approach is cognitive: that a huge, potentially limitless set of challenges and statistical tasks… how do I report means and standard errors by groups? how do I include a new variable with population mean evaluation with each separate respondent evaluation? how do I co...
6624 sym R (5568 sym/19 pcs) 3 img
Answers for Problem Set 1
Question 1 Report the univariate distributions of ideology and partisanship, both as they’re found in the ANES, and as three-part scales library(plyr) library(tidyverse) library(magrittr) d1 <- "https://github.com/thomasjwood/ps7160/raw/master/anes_cdf_20211118.rds" %>% url %>% gzcon %>% readRDS d1$pid_3 <- d1$VCF0301 %>% ...
926 sym R (7868 sym/10 pcs) 3 img
PS7160 Lab 2 Fall 2022
Question 1 Q1. Report the relationship between three issue scales (Guaranteed Jobs and Incomes-VCF0809, Government Health Insurance Scale-VCF0806, Government Services-Spending Scale-VCF0839). Which is the most strongly related to presidential vote choice? library(plyr) library(tidyverse) library(magrittr) library(purrrlyr) library(bro...
1805 sym R (6705 sym/6 pcs) 3 img
Document
Question 1 Report the univariate distributions of ideology and partisanship, both as they’re found in the ANES, and as three-part scales library(plyr) library(tidyverse) library(magrittr) d1 <- "https://github.com/thomasjwood/ps7160/raw/master/anes_cdf.RDS" %>% url %>% gzcon %>% readRDS %>% as_tibble d1$pid_3 <- d1$VCF0301...
920 sym R (7877 sym/10 pcs) 3 img
PS7160 Lab 2 Answers
Question 1 Q1. Report the relationship between three issue scales (Guaranteed Jobs and Incomes-VCF0809, Government Health Insurance Scale-VCF0806, Government Services-Spending Scale-VCF0839). Which is the most strongly related to presidential vote choice? ## # A tibble: 3 x 2 ## scale .out ## <chr> ...
1788 sym R (5233 sym/5 pcs) 3 img