Publications by George
Curiobaby Drop Exp 3 Analysis
Load packages library(png) library(grid) library(ggplot2) library(xtable) require(here) ## Loading required package: here ## here() starts at /Users/gkacherg/Documents/GitHub/curiobaby_drop require(tidyverse) ## Loading required package: tidyverse ## ── Attaching packages ──...
CDI Bias
First, let’s get some Wordbank data for American English. We’ll start with production data from the CDI: Words & Gestures (WG) form. We’ll look for differential item functioning (DIF) based on 1) sex and 2) SES (high vs. low). Question: how do we choose a dozen anchor items (that we expect to be unbiased)? action_words animals body_parts clothing descriptive_words food...
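As a rough illustration of what screening items for DIF can look like, here is a minimal sketch using logistic-regression DIF on simulated data. This is a stand-in technique, not necessarily the method used in the linked analysis. The data, the `dif_test` helper, and the choice of a rest score as the matching variable are all assumptions made for the example.

```r
# Simulated data only: 400 hypothetical children, 12 hypothetical items.
set.seed(42)
n <- 400
group <- rep(c(0, 1), each = n / 2)   # hypothetical grouping (e.g., two SES levels)
ability <- rnorm(n)
n_items <- 12
difficulty <- seq(-1.5, 1.5, length.out = n_items)

# n x n_items matrix of 0/1 responses with no DIF.
responses <- sapply(difficulty, function(b) rbinom(n, 1, plogis(ability - b)))

# Inject uniform DIF into item 6: it is easier for group 1 at equal ability.
responses[, 6] <- rbinom(n, 1, plogis(ability - difficulty[6] + 1.0 * group))

# Likelihood-ratio test for uniform DIF on one item, matching on the
# rest score (the sum of all other items). Hypothetical helper.
dif_test <- function(item, responses, group) {
  rest_score <- rowSums(responses[, -item, drop = FALSE])
  y <- responses[, item]
  base <- glm(y ~ rest_score, family = binomial)
  full <- glm(y ~ rest_score + group, family = binomial)
  anova(base, full, test = "Chisq")[2, "Pr(>Chi)"]
}

p_values <- sapply(seq_len(n_items), dif_test,
                   responses = responses, group = group)
```

Items with large p-values on such a screen are one possible starting point for choosing anchor items, although substantive judgment about which words should be unbiased would still be needed.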
CDI-III IRT Chapter Analyses
Descriptives We have item-level data for 234 children, aged 24-48 months. However, we will focus our analyses on the 114 participants 30-37 months of age, the intended age range of the CDI-III. (Although comfortingly, in Appendix 1 we show that fitting IRT models to the dataset from the entire age range results in negligible impacts on item param...
CDI Bias: WG comprehension
First, let’s get some Wordbank data for American English. We’ll start with production data from the CDI: Words & Gestures (WG) form. We’ll look for differential item functioning (DIF) based on 1) sex and 2) SES (high vs. low). Question: how do we choose a dozen anchor items (that we expect to be unbiased)? Let’s do comprehension data. None Primary Some Secondary S...
CDI-CAT Spanish Production
Introduction Our goal here is to develop and test via simulation a bank of CDI:WG items and IRT parameters that we can recommend to those wanting to develop and conduct computerized adaptive tests (CATs) of children’s early word learning. This is to complement the CDI:WS item bank and parameters, and to be used by parents of children who are no...
Document
Goal Can we identify words on the CDI that are more bookish, speechy, or from TV? (Controlling for difficulty, as more bookish words are probably more difficult) Using this distributional source information, can we find features of children (e.g., mother’s education) that relate to knowledge of subsets of words (e.g. bookish words)? To start, ...
Children's language input
Goal Use average hourly input rates from three sources (overheard speech, child-directed speech, and child-read books) to estimate how many tokens a ‘standard’ child will experience over their early development. Overheard and Child-directed speech Using children’s median number of waking hours, we extrapolate from our best estimates of hou...
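The extrapolation described in this entry amounts to multiplying an hourly rate by waking hours and by the number of days lived. A minimal sketch follows, assuming constant rates and constant waking hours across development. Every number below is a hypothetical placeholder, not one of the estimates from the linked analysis, and `estimate_tokens` is a name invented for this example.

```r
# Hypothetical placeholder values; NOT the estimates used in the original analysis.
waking_hours_per_day <- 12
tokens_per_hour <- c(
  overheard      = 1000,  # assumed overheard-speech tokens per waking hour
  child_directed = 500,   # assumed child-directed tokens per waking hour
  books          = 50     # assumed book tokens per waking hour, averaged over the day
)

# Cumulative tokens from each source by a given age in months,
# assuming rates and waking hours stay constant over development.
estimate_tokens <- function(age_months, rates = tokens_per_hour,
                            hours = waking_hours_per_day,
                            days_per_month = 30.44) {
  rates * hours * days_per_month * age_months
}

by_source <- estimate_tokens(36)  # per-source totals at 36 months
total <- sum(by_source)           # all sources combined
```

A more faithful version would let waking hours and hourly rates vary with age, which is why the rates are passed as arguments rather than hard-coded inside the function.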
RL agent behavior
Load packages Preprocess data full_df <- read.csv('reply_editor_ppo.csv') # Initialize df with column names full_df$condition <- ff(full_df$file, c("-curr-", "curr8K", "2M", "curr_rev", "200K", "nocur"), c("400K", "800K", "2M", "0", "200K", "never"), NA, ignore.case = TRUE) full_df$condition <- factor(full_df$condition, levels = c("0", "200K", "...
Authorship Credit
Load data ## ## -- Column specification -------------------------------------------------------- ## cols( ## indicator = col_character(), ## `Identifier for the researcher credit ratings being evaluated` = col_character() ## ) ## ## -- Column specification -------------------------------------------------------- ## cols( ## X1 = c...
booky-cdi-english
Load Data First we load in the frequency data from several corpora preprocessed in CDIorigins.Rmd. The frequencies are already normalized to counts per million tokens. load("data/merged_word_freqs.Rdata") aoas <- readRDS("data/english_(american)_aoa_bydefinition.rds") %>% filter(measure=="produces") Word frequency table Use to pick compelling...
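For readers unfamiliar with the normalization mentioned in this entry, the sketch below shows how raw counts can be converted to counts per million tokens so that corpora of different sizes are comparable. The words, counts, corpus names, and corpus sizes are invented for illustration and do not come from `merged_word_freqs.Rdata`.

```r
# Toy raw counts for two hypothetical corpora; invented for illustration only.
raw_counts <- data.frame(
  word  = c("dog", "ball", "moon"),
  books = c(120, 45, 30),
  tv    = c(300, 150, 10)
)
corpus_sizes <- c(books = 250000, tv = 1200000)  # assumed total token counts

# Convert a raw count to counts per million tokens.
per_million <- function(count, corpus_size) count / corpus_size * 1e6

# Normalize each corpus column by that corpus's total size.
freq_cpm <- raw_counts
for (src in names(corpus_sizes)) {
  freq_cpm[[src]] <- per_million(raw_counts[[src]], corpus_sizes[[src]])
}
```

After normalization, a word's values can be compared directly across sources even though the corpora differ in size.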