Publications by Daniel Moscoe
DATA606 Lab1
The RStudio Interface. The goal of this lab is to introduce you to R and RStudio, which you’ll be using throughout the course both to learn the statistical concepts discussed in the course and to analyze real data and come to informed conclusions. To clarify which is which: R is the name of the programming language itself and RStudio is a conven...
17922 sym R (1584 sym/29 pcs)
DATA605 HW1
Create first initial, D:
Dx = c(rep(0, 1000), seq(0, 1, length.out = 500), seq(1, 1.5, length.out = 350), rep(1.5, 500), seq(1.5, 1, length.out = 350), seq(1, 0, length.out = 500))
Dy = c(seq(-1, 1, length.out = 1000), rep(1, 500), seq(1, 0.5, length.out = 350), seq(0.5, -0.5, length.out = 500), seq(-0.5, -1, length.out = 350), rep(-1, 500))
...
790 sym R (2010 sym/11 pcs) 32 img
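A minimal sketch of how the coordinate vectors quoted in the excerpt above trace the letter: it rebuilds `Dx` and `Dy` as quoted, stacks them into a 2-row matrix, and checks that the outline ends where it starts. The matrix name `D` and the closure check are assumptions for illustration; the truncated remainder of the original code is not reconstructed here.

```r
# Coordinates of the letter D, copied from the excerpt above.
Dx <- c(rep(0, 1000), seq(0, 1, length.out = 500), seq(1, 1.5, length.out = 350),
        rep(1.5, 500), seq(1.5, 1, length.out = 350), seq(1, 0, length.out = 500))
Dy <- c(seq(-1, 1, length.out = 1000), rep(1, 500), seq(1, 0.5, length.out = 350),
        seq(0.5, -0.5, length.out = 500), seq(-0.5, -1, length.out = 350), rep(-1, 500))

# Hypothetical name: one column per point, x in row 1 and y in row 2.
D <- rbind(Dx, Dy)

# The outline is closed if the last point coincides with the first.
is_closed <- isTRUE(all.equal(D[, 1], D[, ncol(D)]))
```

Calling `plot(Dx, Dy, asp = 1)` on these vectors would draw the outline with equal axis scaling.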
DATA607 Wk1 Assignment
Overview. What regions in the United States are best equipped to deliver intensive care treatment to COVID-19 patients? This dataset compares the number of people at high risk of requiring ICU treatment due to COVID-19 with the number of ICU beds in that region. In regions where the number of high-risk residents is large relative to ICU beds, the...
2689 sym R (1730 sym/11 pcs) 1 img
DATA605 Wk 4 Assignment
Introduction. Principal Components Analysis (PCA) is an unsupervised approach to reducing the dimensionality of a dataset. Reducing the number of variables in a dataset confers three advantages. First, it makes explanations of variability easier for humans to understand. Second, it reduces the size of the dataset, simplifying the computations req...
6111 sym R (2230 sym/18 pcs) 4 img
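A minimal sketch of the dimensionality reduction described in the excerpt above, using base R's `prcomp`. The built-in `USArrests` data and the 90% variance threshold are assumptions chosen for illustration; they are not taken from the assignment.

```r
# Assumption: USArrests (shipped with base R) stands in for the assignment's data.
pca <- prcomp(USArrests, scale. = TRUE)  # center and scale each variable first

# Share of total variance carried by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Assumption: keep the fewest components that explain at least 90% of variance.
k <- which(cum_var >= 0.90)[1]

# Reduced dataset: one row per observation, k columns instead of ncol(USArrests).
reduced <- pca$x[, seq_len(k), drop = FALSE]
```

Scaling matters here because the variables are measured in different units; without it, the variable with the largest variance would dominate the first component.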
DATA607 Wk3 Assignment
1. Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset, provide code that identifies the majors that contain either “DATA” or “STATISTICS”.
Import the dataset into R:
college_majors_csv = "https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv"
college_majors <- read_csv(url...
2931 sym R (3048 sym/22 pcs)
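A minimal sketch of the matching step from question 1 above, using base R's `grepl` in place of whatever the truncated original uses. The `majors` vector is a short hypothetical stand-in for the dataset's column of major names, so the example runs without a download.

```r
# Hypothetical stand-in for the column of major names in majors-list.csv.
majors <- c("STATISTICS AND DECISION SCIENCE",
            "COMPUTER PROGRAMMING AND DATA PROCESSING",
            "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
            "BIOLOGY",
            "GENERAL BUSINESS")

# The alternation "DATA|STATISTICS" matches a name containing either word.
matches <- majors[grepl("DATA|STATISTICS", majors)]
```

With the full dataset loaded, the same pattern could be applied to the relevant column instead of the stand-in vector.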
DATA607 SQL and R
Connect to SQL server:
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.0.4 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidy...
2045 sym R (9859 sym/101 pcs) 2 img
DATA 607 Project 1
Introduction. How can we transform a text file with partially structured data into a more useful form? In this project I examine one such file that reports the proceedings of a chess tournament. By exploiting the structure of the text file and employing tools in R, I show how to create a tibble containing all the information in the text file in a ...
4981 sym R (8158 sym/13 pcs)
DATA 605 Linear Regression with R
Introduction. The cars dataset contains 50 observations of the speeds (mph) of various cars along with their stopping distances (ft). Because the data were gathered in the 1920s, the regression parameters for these data will not extrapolate to cars currently on the road. However, if the relationship between stopping distance and speed is linear fo...
5584 sym R (2370 sym/16 pcs) 3 img
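A minimal sketch of the regression the excerpt above sets up, fitting stopping distance against speed with base R's `lm` on the built-in `cars` data. The variable names `fit`, `coefs`, `r_squared`, and `predicted_15`, and the choice of 15 mph as a prediction point, are assumptions for illustration.

```r
# cars ships with base R: 50 rows, columns speed (mph) and dist (ft).
fit <- lm(dist ~ speed, data = cars)

# Intercept and slope of the fitted line dist = b0 + b1 * speed.
coefs <- coef(fit)

# Proportion of variance in stopping distance explained by speed.
r_squared <- summary(fit)$r.squared

# Assumed example: predicted stopping distance at 15 mph, inside the observed range.
predicted_15 <- predict(fit, newdata = data.frame(speed = 15))
```

Inspecting `plot(fit)` afterwards is one way to check whether the residuals support the linearity assumption the excerpt raises.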
DATA606 HW6
2010 Healthcare Law. (6.48, p. 248) On June 28, 2012, the U.S. Supreme Court upheld the much-debated 2010 healthcare law, declaring it constitutional. A Gallup poll released the day after this decision indicates that 46% of 1,012 Americans agree with this decision. At a 95% confidence level, this sample has a 3% margin of error. Based on this inf...
10258 sym R (1149 sym/20 pcs)
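A minimal sketch of where the 3% margin of error in the exercise above comes from, assuming the usual normal-approximation interval for a single proportion. The variable names are illustrative; the sample proportion and sample size are the ones quoted in the excerpt.

```r
p_hat <- 0.46  # sample proportion agreeing with the decision
n <- 1012      # sample size

# Critical value for a 95% confidence level (two-sided).
z <- qnorm(0.975)

# Standard error of a sample proportion, then the margin of error.
se <- sqrt(p_hat * (1 - p_hat) / n)
me <- z * se

# 95% confidence interval for the population proportion.
ci <- c(lower = p_hat - me, upper = p_hat + me)
```

The computed `me` lands close to the 3% the poll reports.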
DATA606 HW5
Heights of adults. (7.7, p. 260) Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender, for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters. What is the point estimate for the average height...
8762 sym R (460 sym/8 pcs) 6 img 2 tbl