Publications by Betsy Rosalen

CUNY MSDS DATA624 HW5

09.03.2020

Exercise 7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month. a. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. pigs <- fma::pigs tail(pigs) ## Mar Apr May Jun Jul Aug ## 1995 106723 84307 114896 10674...

7388 sym R (18327 sym/104 pcs) 34 img 20 tbl

CUNY MSDS DATA624 HW4

04.03.2020

Exercise 3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. The data can be accessed v...

8515 sym R (9560 sym/33 pcs) 14 img 9 tbl

CUNY MSDS DATA624 HW3

24.02.2020

Exercise 6.2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. a. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? Description: Monthly sales of product A for a plastics manufacturer. plastics <- fma::plastics # befor...

3663 sym R (3541 sym/22 pcs) 15 img 1 tbl

CUNY MSDS DATA624 HW2

17.02.2020

Exercise 3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance. usnetelec usgdp mcopper enplanements usnetelec Description: Annual US net electricity generation (billion kwh) for 1949-2003 usnetelec <- expsmooth::usnetelec # before BoxCox autoplot(usnetelec) frequency(usnetelec) ## [1] 1 # ...

5151 sym R (3117 sym/58 pcs) 29 img 5 tbl

CUNY MSDS DATA624 HW1

09.02.2020

Exercise 2.1 Use the help function to explore what the series gold, woolyrnq and gas represent. Use autoplot() to plot each of these in separate plots. What is the frequency of each series? Hint: apply the frequency() function. Use which.max() to spot the outlier in the gold series. Which observation was it? gold ??gold Help pages: forecast::gol...

5634 sym R (1502 sym/49 pcs) 32 img 2 tbl

CUNY MSDS DATA698 Project Proposal

25.02.2020

Introduction - The Problem The majority of community college students, approximately 80%, begin their college education intending to transfer to and complete a bachelor’s degree program, either before or after completing their associate’s degree. However, only about 17% succeed in earning a bachelor’s degree within 6 years. The af...

8776 sym

CUNY MSDS DATA624 Non-Linear Regression Presentation

21.04.2020

DATA 624 - Non-Linear Regression Zach Herold, Anthony Pagan, Betsy Rosalen April 21, 2020 Linear Regression Review Linear Regression model equations can be written either directly or indirectly in the form: \[y_i = b_0 + b_1x_{i1} + b_2x_{i2} + ... + b_Px_{iP} + e_i\] Where: \(y_i\) is the outcome or response \(b_0\) is the Y-intercept \(P\) ...

13032 sym R (4498 sym/15 pcs) 23 img
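
As a rough illustration of the linear model form quoted in the presentation excerpt above, the following base R sketch simulates data from \(y_i = b_0 + b_1x_{i1} + b_2x_{i2} + e_i\) and recovers the coefficients with lm(). The coefficient values, sample size, noise level, and seed are all hypothetical choices made for this example; none of them come from the presentation.

```r
# Minimal sketch: simulate from a two-predictor linear model and refit it.
# All numeric values below are illustrative assumptions.
set.seed(624)

n  <- 100
x1 <- runif(n)
x2 <- runif(n)

b0 <- 2    # hypothetical intercept
b1 <- 3    # hypothetical slope for x1
b2 <- -1   # hypothetical slope for x2
e  <- rnorm(n, mean = 0, sd = 0.1)  # random error term e_i

y <- b0 + b1 * x1 + b2 * x2 + e

# Ordinary least squares fit; estimates should land near b0, b1, b2
fit <- lm(y ~ x1 + x2)
est <- unname(coef(fit))
print(round(est, 2))
```

With a small error standard deviation, the printed estimates should sit close to the assumed values, which is a quick sanity check that the model form is written correctly.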

CUNY MSDS DATA624 HW8

27.04.2020

Exercise 7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[ y = 10 \sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2) \] where the \(x\) values are random variables uniformly distributed between [0, 1] (there are also...

3664 sym R (29232 sym/51 pcs) 2 img 10 tbl
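
The Friedman (1991) equation quoted in the HW8 excerpt above can be simulated directly in base R. This is a minimal sketch, not the code used in the homework: the function name simulate_friedman, the defaults for n and sigma, and the seed are assumptions for illustration, and only the five predictors that appear in the equation are generated.

```r
# Sketch of the Friedman benchmark simulation using base R only.
# simulate_friedman(), n, sigma, and the seed are hypothetical choices.
simulate_friedman <- function(n = 200, sigma = 1) {
  # Five predictors, each uniform on [0, 1]
  x <- matrix(runif(n * 5), nrow = n, ncol = 5)
  colnames(x) <- paste0("x", 1:5)

  # Deterministic part of the equation
  signal <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5]

  # Add N(0, sigma^2) noise
  y <- signal + rnorm(n, mean = 0, sd = sigma)

  data.frame(x, y = y)
}

set.seed(624)
sim <- simulate_friedman()
str(sim)
```

Setting sigma = 0 returns the noise-free response, which can be handy when checking how well a fitted model recovers the underlying function.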

CUNY MSDS DATA609 HW8

14.12.2021

Ex. 1 Use the nnet package to analyze the iris data set. Use 80% of the 150 samples as the training data and the rest for validation. Discuss the results. i <- iris i samp <- c(sample(1:50,40), sample(51:100,40), sample(101:150,40)) train <- i[samp,] test <- i[-samp,] iris_nnet <- nnet(Species~., train, size = 2, rang = 0.1, deca...

4097 sym R (11361 sym/32 pcs) 9 img
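
The 80/20 split in the HW8 excerpt above samples 40 rows from each species by hard-coded row ranges. As a hedged alternative sketch, the same stratified split can be written in base R without relying on row order, by grouping row indices by Species first. The variable names and the seed here are assumptions for illustration and are not taken from the homework.

```r
# Sketch: stratified 80/20 train/test split of iris in base R.
# Names (idx_by_class, train_idx) and the seed are hypothetical.
set.seed(609)

# Row indices grouped by class label
idx_by_class <- split(seq_len(nrow(iris)), iris$Species)

# Draw 80% of the rows within each class
train_idx <- unlist(
  lapply(idx_by_class, function(ix) sample(ix, size = round(0.8 * length(ix)))),
  use.names = FALSE
)

train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Class counts in each partition
print(table(train$Species))
print(table(test$Species))
```

Grouping by the class column keeps the split balanced even if the rows have been shuffled or filtered beforehand.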

CUNY MSDS DATA609 HW7

06.12.2021

Ex. 1 Use the svm() algorithm of the e1071 package to carry out the support vector machine for the PlantGrowth data set. Then discuss the number of support vectors/samples. [Install the e1071 package in R if needed.] p <- PlantGrowth cbind(p[1:10,],p[11:20,],p[21:30,]) p_svm <- svm(group ~ weight, data = p) summary(p_svm) ## ## Call: ## svm(fo...

1399 sym R (7259 sym/25 pcs)