Publications by Harris Dupre

Multiple and Logistic Regression

04.05.2020

Baby weights, Part I. (9.1, p. 350) The Child Health and Development Studies investigate a range of topics. One study considered all pregnancies between 1960 and 1967 among women in the Kaiser Foundation Health Plan in the San Francisco East Bay area. Here, we study the relationship between smoking and weight of the baby. The variable smoke is c...

7536 sym R (351 sym/7 pcs) 2 img

Project 1

04.04.2022

PART A Data Exploration First we will load and explore the data. We will also convert the data into a time series object (tsibble). The xlsx data was opened in Excel and exported as a CSV. atm_raw <- read.csv("https://raw.githubusercontent.com/hdupre/DATA624/main/Project1/ATM624Data.csv") head(atm_raw) ## DATE ATM Cash ## 1 5...

5185 sym R (13617 sym/68 pcs) 16 img

HW6

27.03.2022

9.1 a. These plots have different confidence intervals for their autocorrelation coefficients. The plots do indicate that the data are white noise because the autocorrelation lines are randomly distributed, show no obvious pattern, and none breach the 95% confidence interval. b. The length of the time series, T, is part of the calculation for...

3729 sym R (9017 sym/80 pcs) 27 img

HW5

13.03.2022

8.1 a. fit <- aus_livestock %>% filter(Animal == 'Pigs', State == 'Victoria') %>% model(ETS(Count ~ error("A") + trend("N") + season("N"))) fc <- fit %>% forecast(h = 4) fc ## # A fable: 4 x 6 [1M] ## # Key: Animal, State, .model [1] ## Animal State .model Month Count .mean ## <fct> <fct...

2474 sym R (6202 sym/41 pcs) 10 img

HW4

06.03.2022

3.1 data(Glass) glass_df <- as.data.frame(Glass) summary(glass_df) ## RI Na Mg Al ## Min. :1.511 Min. :10.73 Min. :0.000 Min. :0.290 ## 1st Qu.:1.517 1st Qu.:12.91 1st Qu.:2.115 1st Qu.:1.190 ## Median :1.518 Median :13.30 Median :3.480 Median :1.360 ## Mean...

2912 sym R (13401 sym/32 pcs) 8 img

HW3

28.02.2022

5.1 Australian population global_economy %>% filter(Country == "Australia") %>% autoplot(Population) This plot appears appropriate for the drift method because of the upward trend. aus_population <- global_economy %>% filter(Country == "Australia") fit <- aus_population %>% model(Drift = RW(Population ~ drift())) aus_population_foreca...

3014 sym R (5307 sym/50 pcs) 25 img

HW2

20.02.2022

3.1 global_economy %>% group_by(Country) %>% autoplot(GDP/Population, show.legend = FALSE) + labs(title = 'Global GDP Per Capita', y = '$', x = 'Year') ## `mutate_if()` ignored the following grouping variables: ## • Column `Country` ## Warning: Removed 3242 row(s) containing missing values (geom_path). global_economy %>% ...

4248 sym R (5291 sym/37 pcs) 26 img

HW1 DATA624

14.02.2022

2.1 GAFA help(gafa_stock) head(gafa_stock) ## # A tsibble: 6 x 8 [!] ## # Key: Symbol [1] ## Symbol Date Open High Low Close Adj_Close Volume ## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 AAPL 2014-01-02 79.4 79.6 78.9 79.0 67.0 58671200 ## 2 AAPL 2014-01-03 79.0 79.1 77.2 77.3 ...

3435 sym R (7623 sym/29 pcs) 12 img

DATA624 Project 2 HDupre

15.05.2022

Our initial step is to import the libraries that provide the models required for this analysis. library(mlbench) library(randomForest) library(caret) library(party) library(Cubist) library(dplyr) library(rpart.plot) library(kernlab) library(earth) library(nnet) library(DataExplorer) library(RANN) library(corrplot) pacman::p_load(tidyverse,...

1831 sym R (22527 sym/45 pcs) 19 img 1 tbl

Market Basket

08.05.2022

Market Basket I’ll follow the guidelines on this page: https://www.kirenz.com/post/2020-05-14-r-association-rule-mining/#association-rules However, rather than converting the CSV into a list of character vectors as the article describes, I will use read.transactions() from arules. library(arules) groceries <- read.transactions("/Users/harris/ds_...

1786 sym R (12063 sym/13 pcs) 2 img