Publications by George
Examining Cross-linguistic Difficulty of CDI Items
Our goal is to look at IRT parameters across a diverse set of languages and find a subset of uni-lemmas that are somewhat similar in their difficulty. We’ll start with 2PL fits to WG data (comprehension and production separately) for 18 languages: British Sign Language, Croatian, Danish, English (American), Korean, Spanish (Mexican), Italian, M...
4382 sym R (411 sym/6 pcs) 9 img 2 tbl
CDI-III demographics
Summary statistics of the WebCDI sample of CDI-III data (English and Spanish) from Virginia. Adding Philip’s data. English Demographics We have item-level data from 234 participants, and summary data from a further 499 participants from the 2007 CDI-III norms. Vocabulary totals from all participants are shown below, but the targeted age range ...
2550 sym R (336 sym/2 pcs) 4 img
Cross-linguistic Comprehension vs. Production Ability
We want to compare the IRT-estimated ability vs. age plots for each language model to make sure that nothing is funky with the fits, particularly since the Danish and Norwegian comprehension item parameters have shifted distributions compared to the other languages. Overall Ability Distributions The ability distributions per language greatly ove...
691 sym R (80 sym/2 pcs) 4 img
CDI-III IRT Analysis
Load data
load("data/CDI-III-Spanish.Rdata")
load("data/CDI-III-English.Rdata")
bad_Ss = which(rowSums(en_voc, na.rm=T)==0) # 2 subjects with no correct items; can't estimate
# weird values... one "3", one "11", and one "12"..replace with NAs for now
#table(unlist(en_voc))
which(apply(en_voc, 1, max)>1)
## [1] 152 154
en_voc[152, which(en_voc[1...
2173 sym R (11236 sym/53 pcs) 8 img 2 tbl
Models of Attention, Learning, and Curiosity
Goal Implement and test simple models of learning (habituation) and attention shifting, and test them for stereotyped looking time behavior (e.g., the Hunter & Ames 1988 model). Pelz et al. 2015 Model The Pelz model has five components: the Gompertz learning curve that the learner follows when attending to a stimulus, decay of short term memor...
1110 sym R (363 sym/1 pcs) 2 img
Swadesh CDI comparisons and GPCM
Variability of Difficulty by CDI Category
Below we show the standard deviation of cross-linguistic item difficulties by CDI category (for 437 uni-lemmas that are defined in at least 5 languages). ...
1709 sym R (2825 sym/12 pcs) 4 img
MB3 Pilot Analysis
Load Data
Load pilot data (from Julien Mayor’s lab).
d1 <- read_csv(here("pilot/data/first_session_babies.csv")) %>%
  rename(sd_LT_incongruent_trials = sd_LT_congruent_trials_1)
## Warning: Duplicated column names deduplicated: 'sd_LT_congruent_trials' =>
## 'sd_LT_congruent_trials_1' [18]
##
## ── Column specification ──────...
531 sym R (8960 sym/22 pcs) 3 img 2 tbl
Peekbank Time Windowing Analysis
Motivation Peelle and Van Engen (2020) style multiverse analysis considering possible time windows with logistic growth curve models in a dataset with words of varying frequency, stimuli with varying levels of noise, and with young or old adults. For our analysis, we will restrict ourselves to familiar words, and will model age effects. # get loc...
2297 sym R (5642 sym/13 pcs) 4 img 1 tbl
Comprehension vs. Production (SP + EN) IRT parameters
Overview The “mod_2pl” files (for Spanish/English, production/comprehension) each contain a coefs_2pl dataframe of the item parameters (in mirt’s slope-intercept form), as well as a mod_2pl mirt model object, and fscores_2pl (the estimated ability parameters from Wordbank participants). Production English it = list() # item parameters ab ...
1977 sym R (5899 sym/27 pcs) 3 img
Bilingual CDI Analysis
Load data
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   ParticipantId = col_character(),
##   Gender = col_character(),
##   Ethnic = col_characte...
291 sym R (9591 sym/11 pcs) 2 img