Publications by Ken Wood
Practical Machine Learning in R - Quiz 2
Question 1 library(caret) library(AppliedPredictiveModeling) data(AlzheimerDisease) What is the set of commands that will create non-overlapping training and test sets with about 50% of the observations assigned to each? adData = data.frame(diagnosis,predictors) trainIndex...
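The idea behind this question, a non-overlapping split with roughly half the rows on each side, can be sketched in base R. This is a minimal illustration and not the quiz answer: the built-in `iris` data stands in for the Alzheimer's data as an assumption, and `sample()` replaces caret's `createDataPartition()`, which would additionally stratify the split on the outcome.

```r
# Sketch only: iris stands in for the quiz data (an assumption for illustration)
set.seed(123)                                   # make the random split reproducible
n <- nrow(iris)
train_rows <- sample(n, size = floor(0.5 * n))  # draw half the row indices without replacement

training <- iris[train_rows, ]                  # rows selected for training
testing  <- iris[-train_rows, ]                 # every remaining row, so the sets cannot overlap
```

Negative indexing is what guarantees the two sets are disjoint: a row is either in `train_rows` or it is not.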
Regression Models - Course Project
Executive Summary Motor Trend, a magazine about the automobile industry, wants to look at a data set of a collection of cars to learn more about mileage. They are interested in exploring the relationship between a set of variables and miles per gallon (MPG) (outcome). Specifically, they are interested in answering the following two questions: ...
Reproducible Research in R - Week 4 Assignment
Executive Summary Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This analysis leverages the U.S. National Oceanic and Atmo...
Statistical Inference in R Course Project - Part 2
Introduction For the second portion of the course project, we’re going to analyze the ToothGrowth data in the R datasets package. Specifically, we will: Load the ToothGrowth data and perform some basic exploratory data analyses. Provide a basic summary of the data. Use confidence intervals and/or hypothesis tests to compare tooth growth by sup...
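A comparison of tooth growth by supplement type with a confidence interval could look like the following. This is a hedged sketch, not the report's actual analysis: it assumes a Welch two-sample t-test (the default of `t.test()`) at the 95% level, and the variable names `res`, `ci`, and `group_means` are illustrative.

```r
# ToothGrowth ships with the base R 'datasets' package
data(ToothGrowth)

# Welch two-sample t-test of tooth length by supplement type (OJ vs VC)
res <- t.test(len ~ supp, data = ToothGrowth)

ci <- res$conf.int                     # 95% confidence interval for the difference in means
group_means <- tapply(ToothGrowth$len, # mean tooth length within each supplement group
                      ToothGrowth$supp, mean)
```

If the interval in `ci` contains zero, the data do not rule out equal mean growth for the two supplements at that confidence level.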
Getting and Cleaning Data in R - Course Project
# Create one R script called run_analysis.R that does the following: # 1. Merges the training and the test sets to create one data set. # 2. Extracts only the measurements on the mean and standard deviation for each measurement. # 3. Uses descriptive activity names to name the activities in the data set # 4. Appropriately labels the data set wit...
R Programming - Week 2 Assignment
pollutantmean <- function(directory, pollutant, id=1:332) { # Create a list of files in the directory argument files_list <- list.files(directory, full.names = TRUE) df <- data.frame() #creates an empty data frame # Loop through the files, rbinding them together for (i in id) { df <- rbind(df, read.csv(files_list[i])) } # S...
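The excerpt above is cut off, so the rest of the author's function is not shown. As a rough sketch of the same read-bind-average pattern, the version below assumes the function ends by averaging the chosen pollutant column with missing values dropped. The name `pollutant_mean_sketch`, the temporary directory, and the two tiny CSV files are all hypothetical and exist only to make the example self-contained.

```r
# Hypothetical demo data: two small monitor files in a temporary directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, NA, 3), nitrate = c(2, 4, NA)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(5, 7), nitrate = c(NA, 6)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

pollutant_mean_sketch <- function(directory, pollutant, id = 1:332) {
  # List the files in the directory argument (sorted by name)
  files_list <- list.files(directory, full.names = TRUE)
  # Read only the requested files, then bind once instead of growing a data frame in a loop
  frames <- lapply(files_list[id], read.csv)
  df <- do.call(rbind, frames)
  # Assumed final step: average the pollutant column, ignoring missing values
  mean(df[[pollutant]], na.rm = TRUE)
}

result <- pollutant_mean_sketch(demo_dir, "sulfate", id = 1:2)
```

Binding once with `do.call(rbind, ...)` avoids repeatedly copying a growing data frame, which matters when all 332 files are read.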
Data Science Capstone in R - Week 2 Analysis Alternate
Instructions The goal of this project is to display that we’ve become familiar with the data and that we are on track to create our prediction algorithm. This report (to be submitted on R Pubs (http://rpubs.com/)) explains our exploratory analysis and our goals for the eventual app and algorithm. This document should be concise and explain only...
Data Science Capstone in R - Week 3 N-gram Generator
As noted earlier, a corpus is a body of text from which we build and test LMs. rm(list = ls()) library(quanteda) ## Package version: 2.1.2 ## Parallel computing: 2 of 4 threads used. ## See https://quanteda.io for tutorials and examples. ## ## Attaching package: 'quanteda' ## The following object is masked from 'package:utils': ## ## View l...
Data Science Capstone in R - Shiny App Presentation
11/15/2020 Executive Summary: Natural Language Processing App. All code implemented using R. Hosted at https://www.shinyapps.io. Goal: predict the third word of a tri-gram given two leading words. Use the Katz Back-Off Method for predictions. Provide a list of word predictions along with probabilities. Training Corpus & Prediction Method: Three ...
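The back-off idea behind the app can be illustrated with a toy example. This is a simplified sketch and not the app's code: it backs off from trigram to bigram to unigram counts using plain relative frequencies, and it omits the discounting step that full Katz back-off applies. The three-sentence corpus and the names `count_ngrams` and `predict_next` are hypothetical.

```r
# Toy corpus (hypothetical; not the capstone training data)
corpus <- c("i love new york", "i love new cars", "we love new york")
tokens <- strsplit(tolower(corpus), "\\s+")

# Build all n-grams of order n from one tokenized sentence
make_ngrams <- function(tok, n) {
  if (length(tok) < n) return(character(0))
  sapply(seq_len(length(tok) - n + 1),
         function(i) paste(tok[i:(i + n - 1)], collapse = " "))
}

# Count n-grams across the corpus as a named numeric vector
count_ngrams <- function(n) {
  tab <- table(unlist(lapply(tokens, make_ngrams, n = n)))
  setNames(as.numeric(tab), names(tab))
}

tri <- count_ngrams(3)
bi  <- count_ngrams(2)
uni <- count_ngrams(1)

# Predict the next word from two leading words, backing off when no match exists
predict_next <- function(w1, w2) {
  prefix <- paste0(w1, " ", w2, " ")
  hits <- tri[startsWith(names(tri), prefix)]   # trigrams that begin with "w1 w2"
  if (length(hits) == 0) {                      # back off to bigrams that begin with "w2"
    prefix <- paste0(w2, " ")
    hits <- bi[startsWith(names(bi), prefix)]
  }
  if (length(hits) == 0) {                      # last resort: unigram frequencies
    prefix <- ""
    hits <- uni
  }
  probs <- sort(hits / sum(hits), decreasing = TRUE)
  names(probs) <- substring(names(probs), nchar(prefix) + 1)  # keep only the predicted word
  probs
}

p_seen   <- predict_next("love", "new")   # answered from trigram counts
p_unseen <- predict_next("hate", "new")   # no trigram match, so bigram counts are used
```

Each call returns candidate words with their estimated probabilities, sorted from most to least likely, which mirrors the "list of word predictions along with probabilities" described in the slide.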
Bayesian Statistics - Data Analysis Project Rubric
Part 1: Data (2 points) 1 pt for correct reasoning for generalizability – Answer should discuss whether random sampling was used. Learners might discuss any reservations; those should be well justified. 1 pt for correct reasoning for causality – Answer should discuss whether random assignme...