Publications by David Blumenstiel
Data 607 Homework 9
Assignment – Web APIs. The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis. You’ll need to start by signing up for an API key. Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame. To actual...
1420 sym R (4702 sym/3 pcs)
Data 607 Project 3
Overview: The objective of this project was to answer the question: “Which are the most valued data science skills?” The dataset used to answer this question was sourced from Kaggle: https://www.kaggle.com/elroyggj/indeed-dataset-data-scientistanalystengineer. It contains information from 5715 data-science-related job postings on the job-li...
2713 sym R (25939 sym/49 pcs) 34 img 3 tbl
Data 606 Homework 6
2010 Healthcare Law. (6.48, p. 248) On June 28, 2012 the U.S. Supreme Court upheld the much debated 2010 healthcare law, declaring it constitutional. A Gallup poll released the day after this decision indicates that 46% of 1,012 Americans agree with this decision. At a 95% confidence level, this sample has a 3% margin of error. Based on this inf...
8138 sym R (1262 sym/18 pcs)
Data 606 Lab 6
In August of 2012, news outlets ranging from the Washington Post to the Huffington Post ran a story about the rise of atheism in America. The source for the story was a poll that asked people, “Irrespective of whether you attend a place of worship or not, would you say you are a religious person, not a religious person or a convinced atheist?”...
13851 sym R (5102 sym/49 pcs) 10 img
Data 607 Homework 7
Assignment – Working with XML and JSON in R. Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, ...
1753 sym R (3167 sym/14 pcs)
Data 606 Homework 5
Heights of adults. (7.7, p. 260) Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender, for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters. What is the point estimate for the average height...
9109 sym R (1341 sym/33 pcs) 6 img
Data 607 Project 2
Choose 3 wide data sets, tidy, and analyze. Dataset #1: Agricultural Land Values. This dataset contains the values of agricultural land from 2015-2019 across different states and regions. Importing from csv and basic cleaning: #Importing agland <- read.csv("https://raw.githubusercontent.com/davidblumenstiel/data/master/Agricultural%20Land%20Value...
2656 sym R (15149 sym/29 pcs) 5 img
Data 606 Lab 5a
In this lab, we investigate the ways in which the statistics from a random sample of data can serve as point estimates for population parameters. We’re interested in formulating a sampling distribution of our estimate in order to learn about the properties of the estimate, such as its distribution. The data: We consider real estate data from th...
11619 sym R (3557 sym/47 pcs) 9 img
Data 606 Homework 4
Area under the curve, Part I. (4.1, p. 142) What percent of a standard normal distribution \(N(\mu=0, \sigma=1)\) is found in each region? Be sure to draw a graph. #A pnorm(-1.35) ## [1] 0.08850799 #B 1- pnorm(1.48) ## [1] 0.06943662 #C pnorm(1.5) - pnorm(-0.4) ## [1] 0.5886145 #D 1-pnorm(2) + pnorm(-2) ## [1] 0.04550026 \(Z < -1.35\) 8.85...
6691 sym R (1247 sym/38 pcs) 2 img
Data 606 Lab 4
In this lab we’ll investigate the probability distribution that is most central to statistics: the normal distribution. If we are confident that our data are nearly normal, that opens the door to many powerful statistical methods. Here we’ll use the graphical tools of R to assess the normality of our data and also learn how to generate random...
9833 sym R (2898 sym/42 pcs) 14 img