Publications by Ken Wood
Bayes Regression
This second lab will deal with model assumptions, selection, and interpretation. The concepts tested here will prove useful for the final peer assessment, which is much more open-ended. First, let us load the data: load("ames_train.Rdata") library(MASS) library(dplyr) library(ggplot2) library(plotly) library(devtools) library(statsr) library(broo...
5496 sym R (8488 sym/22 pcs) 1 img
Bayesian Inference Lab
Bayesian Inference: Getting Started. In this lab we will review exploratory data analysis using the ggplot2 package for data visualization, which is included in the tidyverse. The main focus of this lab is to obtain and interpret credible intervals and hypothesis tests using Bayesian methods for numerical variables. The data and functio...
18166 sym R (13841 sym/46 pcs) 8 img 2 tbl
Bayesian Statistics - Week 2 Practice Quiz
Question 5: You are hired as a data analyst by politician A. She wants to know the proportion of people in Metrocity who favor her over politician B. From previous poll numbers, you place a Beta(40,60) prior on the proportion. From polling 200 randomly sampled people in Metrocity, you find that 103 people prefer politician A to politician B. What...
1333 sym R (276 sym/6 pcs)
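The excerpt above is cut off before the actual question, so the following is only a minimal sketch of the conjugate Beta-binomial update that the stated setup implies, not the quiz's answer. It assumes the 200 responses are independent and that each respondent prefers either A or B. The variable names are illustrative and do not come from the original publication.

```r
# Prior from the excerpt: Beta(40, 60) on the proportion favoring politician A
prior_a <- 40
prior_b <- 60

# Poll data from the excerpt: 103 of 200 sampled people prefer A
n         <- 200
successes <- 103

# Conjugate update: add successes to the first parameter, failures to the second
post_a <- prior_a + successes        # 40 + 103 = 143
post_b <- prior_b + (n - successes)  # 60 + 97  = 157

# Posterior mean of the proportion
post_mean <- post_a / (post_a + post_b)  # 143 / 300

# Equal-tailed 95% credible interval from the posterior Beta quantiles
cred_int <- qbeta(c(0.025, 0.975), post_a, post_b)

print(c(post_a = post_a, post_b = post_b))
print(post_mean)
print(cred_int)
```

The posterior is Beta(143, 157), so the prior acts like 100 earlier observations added to the 200 new ones.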
Modeling and Prediction for Movies
Setup Load packages library(ggplot2) library(dplyr) library(statsr) library(plotly) library(GGally) Introduction Congratulations on getting a job as a data scientist at Paramount Pictures! Your boss has just acquired data about how much audiences and critics like movies as well as numerous other variables about the movies. This dataset is provi...
5482 sym R (10849 sym/21 pcs) 1 img
Multiple Linear Regression
Grading the professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related charac...
12915 sym R (9659 sym/46 pcs) 7 img 1 tbl
Intro to Linear Regression
Batter up The movie Moneyball focuses on the “quest for the secret of success in baseball”. It follows a low-budget team, the Oakland Athletics, who believed that underused statistics, such as a player’s ability to get on base, better predict the ability to score runs than typical statistics like home runs, RBIs (runs batted in), and battin...
11975 sym R (7614 sym/39 pcs) 5 img
Statistical Inference with GSS Data
Setup Load packages library(ggplot2) library(dplyr) library(statsr) library(plotly) Load data and clean by removing columns where ‘NA’ values comprise more than 10% of total rows. We will also delete the caseid column since it serves no useful purpose in our analysis. load("gss.RData") gss_filtered <- gss[,colSums(is.na(gss)) <= 0.1*nrow(gss...
4199 sym R (2794 sym/11 pcs) 2 img
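The cleaning step described in the excerpt (drop columns with more than 10% missing values, then drop caseid) can be sketched on a small made-up data frame. The `toy` data and its column values are assumptions for illustration only. The filter mirrors the visible part of the truncated line in the excerpt and is not a reconstruction of the author's full code.

```r
# Hypothetical stand-in for the gss data: 20 rows, 4 columns
toy <- data.frame(
  caseid = 1:20,
  age    = c(NA, 21:39),            # 1 of 20 missing (5%): kept
  income = c(rep(NA, 8), 41:52),    # 8 of 20 missing (40%): dropped
  sex    = rep(c("F", "M"), 10)     # no missing values: kept
)

# Keep columns whose NA count is at most 10% of the row count
keep <- colSums(is.na(toy)) <= 0.1 * nrow(toy)
toy_filtered <- toy[, keep, drop = FALSE]

# Remove the identifier column, which carries no analytic information
toy_filtered$caseid <- NULL

print(names(toy_filtered))
```

Comparing NA counts against `0.1 * nrow(toy)` keeps the threshold tied to the size of the data, so the same line works unchanged on the full dataset.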
Foundations for inference - Confidence intervals
Complete all Exercises, and submit answers to Questions on the Coursera platform. If you have access to data on an entire population, say the size of every house in Ames, Iowa, it’s straightforward to answer questions like, “How big is the typical house in Ames?” and “How much variation is there in sizes of houses?”. If you have acces...
8641 sym R (2661 sym/24 pcs) 1 img
Intro to Probability & Data in R: Data Analysis Project
Introduction The Behavioral Risk Factor Surveillance System (BRFSS) is a collaborative project between all of the states in the United States (US) and participating US territories and the Centers for Disease Control and Prevention (CDC). The BRFSS is administered and supported by CDC’s Population Health Surveillance Branch, under the Division o...
8466 sym R (7074 sym/23 pcs)
Intro to Probability & Data in R: Data Analysis Project
Introduction The Behavioral Risk Factor Surveillance System (BRFSS) is a collaborative project between all of the states in the United States (US) and participating US territories and the Centers for Disease Control and Prevention (CDC). The BRFSS is administered and supported by CDC’s Population Health Surveillance Branch, under the Division o...
5610 sym R (54 sym/2 pcs)