Publications by Diego Correa

DATA607 - Assignment 9

24.10.2020

Introduction The New York Times web site provides a rich set of APIs, as described at https://developer.nytimes.com/apis. You’ll need to start by signing up for an API key. Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame. I started by creating an...

1302 sym R (7576 sym/17 pcs)

speed reducers

05.12.2020

Introduction Speed humps are traffic calming devices intended to slow traffic speeds on low volume, low speed roads. Speed humps are generally installed on local residential (non-truck route, non-bus route locations), while speed cushions are generally installed on designated truck route locations and bus route locations. Intuitively, I suspect t...

2658 sym R (21705 sym/43 pcs) 10 img

Star Wars Fan

03.12.2020

Star Wars Fans, Diego Correa, 12/3/2020. Introduction FiveThirtyEight ran a poll surveying 1,186 respondents from June 3 to 6, 2014. The survey included the following questions: Are you a fan? Which movies have you watched? Age Group, Sex, Household Income, and Education. Research questions: Does this data provide convincing evidence of an in...

3455 sym R (4308 sym/15 pcs) 7 img

DATA 607 - Assignment 10

31.10.2020

Introduction In this assignment, you should start by getting the primary example code from chapter 2 working in an R Markdown document. You should provide a citation to this base code. You’re then asked to extend the code in two ways: 1. Work with a different corpus of your choosing, and 2. Incorporate at least one additional sentiment lexicon ...

1087 sym R (9560 sym/59 pcs) 10 img

DATA605 - Discussion 11

05.11.2020

Discussion Using R, create a simple linear regression model and test its assumptions.
# Loading data
url <- 'https://github.com/dcorrea614/MSDS/raw/master/cereal.csv'
cereal <- read.csv(url)
# linear model
lm_cereal <- lm(cereal$rating ~ cereal$calories)
summary(lm_cereal)
##
## Call:
## lm(formula = cereal$rating ~ cereal$calories)
##...

95 sym R (1129 sym/4 pcs) 2 img

DATA605 - Final Exam

20.12.2020

youtube video
Problem 1 Using R, generate a random variable X that has 10,000 random uniform numbers from 1 to N, where N can be any number of your choosing greater than or equal to 6. Then generate a random variable Y that has 10,000 random normal numbers with a mean of \(\mu = \sigma = (N+1)/2\)
set.seed(123)
N <- 10
sigma <- (N + 1)/2
mu ...

4256 sym R (22173 sym/70 pcs) 8 img

DATA624 - Final Project

08.12.2021

Libraries library(kableExtra) library(tidyverse) library(ggplot2) library(dplyr) library(psych) library(caret) library(mice) library(randomForest) library(caTools) library(corrplot) library(class) library(rpart) library(AppliedPredictiveModeling) library(naniar) library(xgboost) library(DiagrammeR) library(readxl) library(writexl...

6123 sym R (27456 sym/59 pcs) 10 img 12 tbl

DATA624 - HW7

08.11.2021

6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: Start R and use these commands to load the data: library(AppliedPredictiveModeling) data(permeability) The ...

3827 sym R (120498 sym/28 pcs) 2 img

DATA622 - HW3

05.11.2021

Libraries library(kableExtra) library(tidyverse) library(ggplot2) library(dplyr) library(psych) library(caret) library(mice) library(randomForest) library(caTools) library(corrplot) library(class) library(rpart) library(rpart.plot) library(naniar) Background For this assignment, we will be working with a dataset on loan approval sta...

10245 sym R (32326 sym/81 pcs) 15 img 6 tbl

DATA624 - HW4

03.10.2021

3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. Using visualizations, explore the predi...

2142 sym R (8291 sym/11 pcs) 3 img