Publications by David Blumenstiel

R HW, Week 1

21.12.2019

R HW Week 1
David Blumenstiel

1. Write a loop that calculates 12-factorial

x <- 1
for (i in 1:12) { x <- x * i }
print(x)
## [1] 479001600

2. Show how to create a numeric vector that contains the sequence from 20 to 50 by 5.

vec <- seq(20, 50, 5)
print(vec)
## [1] 20 25 30 35 40 45 50

3. Create the function “quadratic” that takes...

347 sym R (332 sym/8 pcs)
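
The two results quoted in the R HW Week 1 excerpt can be cross-checked against base R built-ins. The following is a minimal sketch rather than part of the original homework: `factorial()` and `prod()` are used only as an independent check on the loop, and the named arguments to `seq()` are a stylistic choice.

```r
# Loop version, as in the homework excerpt
x <- 1
for (i in 1:12) {
  x <- x * i
}

# Built-in equivalents, used here only as a cross-check on the loop
stopifnot(x == factorial(12), x == prod(1:12))

# seq() with the step named explicitly: 20, 25, ..., 50
vec <- seq(from = 20, to = 50, by = 5)

print(x)
print(vec)
```

Naming `by` makes the step size explicit, which helps avoid confusing it with `length.out` when reading the call later.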

David Blumenstiel Data 606 Final Project: How to Make a Successful Movie

10.05.2020

Sections: Introduction, Data, Exploratory data analysis, Inference, Conclusions, References. Introduction: I’m sure movie executives have long debated one topic in particular: What makes a movie great? Is it the actors? The story? The art form? No. It’s profit, of course! For this project, we’ll try to figure out what makes a movie great (at making money). Da...

6436 sym R (6511 sym/23 pcs) 4 img

Data 607 Final Project: Energy Load vs Time and Precipitation

09.05.2020

Sections: Abstract, Introduction, Loading and Tidying the Data, Initial Investigation: Energy Usage, Initial Investigation: Weather, Modeling Preparation, Modeling, Discussion/Conclusions. Abstract: The objectives of this project were to discover trends in power load on an electrical grid over time, and to determine if rain had any noticeable effect on load. The ...

11666 sym R (9683 sym/23 pcs) 8 img

Data 606 Homework 9

03.05.2020

Baby weights, Part I. (9.1, p. 350) The Child Health and Development Studies investigate a range of topics. One study considered all pregnancies between 1960 and 1967 among women in the Kaiser Foundation Health Plan in the San Francisco East Bay area. Here, we study the relationship between smoking and weight of the baby. The variable smoke is c...

7084 sym R (1236 sym/13 pcs) 2 img

Data 606 Lab 9

03.05.2020

Grading the professor: Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related charac...

12835 sym R (5856 sym/27 pcs) 12 img 1 tbl

Data 607 Project 4

23.04.2020

No one likes spam email. Thus, we have spam filters built into our email services that detect and send spam into a deep dark folder where no one can hear it scream. Let’s make a spam detector. The first task will be to load sets of spam and ham (non-spam) messages into a dataframe. Now that we’ve got our data loaded in, we need to separat...

3335 sym R (16422 sym/18 pcs) 1 img
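
The first step described in the Project 4 excerpt, loading spam and ham messages into one dataframe, might look roughly like the sketch below. The message strings and the names `spam_msgs`, `ham_msgs`, and `emails` are hypothetical placeholders; the original project loads real message sets, and its actual code is not shown in the excerpt.

```r
# Hypothetical example messages standing in for the real spam and ham sets
spam_msgs <- c("WIN a FREE prize now", "Cheap meds, click here")
ham_msgs  <- c("Meeting moved to 3pm",
               "Here are the notes from class",
               "Lunch tomorrow?")

# One row per message, with a label column recording its class
emails <- data.frame(
  text  = c(spam_msgs, ham_msgs),
  label = factor(
    c(rep("spam", length(spam_msgs)), rep("ham", length(ham_msgs))),
    levels = c("ham", "spam")
  ),
  stringsAsFactors = FALSE
)

# Count of messages per class
print(table(emails$label))
```

Keeping the text and the label in the same dataframe means later steps can shuffle or subset rows without losing track of which class each message belongs to.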

Data 606 Lab 8

19.04.2020

Batter up: The movie Moneyball focuses on the “quest for the secret of success in baseball”. It follows a low-budget team, the Oakland Athletics, who believed that underused statistics, such as a player’s ability to get on base, better predict the ability to score runs than typical statistics like home runs, RBIs (runs batted in), and batting...

10535 sym R (5185 sym/28 pcs) 11 img

Data 606 Homework 8

19.04.2020

Nutrition at Starbucks, Part I. (8.22, p. 326) The scatterplot below shows the relationship between the number of calories and amount of carbohydrates (in grams) Starbucks food menu items contain. Since Starbucks only lists the number of calories on the display items, we are interested in predicting the amount of carbs a menu item has based on i...

6205 sym R (433 sym/8 pcs) 10 img

Data 607 Pandora Recommendation System

15.04.2020

Overview: Pandora is a music streaming service whose main function is to allow the user to create ‘stations’ which play similar types of music to a selected artist, song, or style, and allow for further customization. While users can compile playlists, its primary way of delivering songs to the user is almost fully reliant on recommendations s...

4495 sym

Data 607 Homework 10

05.04.2020

The Assignment: In Text Mining with R, Chapter 2 looks at Sentiment Analysis. In this assignment, you should start by getting the primary example code from chapter 2 working in an R Markdown document. You should provide a citation to this base code. You’re then asked to extend the code in two ways: Work with a different corpus of your choosin...

3583 sym R (19431 sym/97 pcs) 11 img