Publications by Mary Anna Kivenson
Data 624 Homework 9
Do problems 8.1, 8.2, 8.3, and 8.7 in Kuhn and Johnson. Question 8.1 Recreate the simulated data from Exercise 7.2: set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" Fit a random forest model to all of t...
EDA/Preprocessing
Data Exploration Read Data Here, we read the training dataset into a dataframe. df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytics-Data-Mining/master/Insurance%20Model/insurance_training_data.csv")[-1] head(df) ## TARGET_FLAG TARGET_AMT KIDSDRIV AGE HOMEKIDS YOJ INCOME PARENT1 HOME_VAL ## 1 0 ...
Data 624 Homework 7
library(caret) library(pls) library(tidyverse) library(AppliedPredictiveModeling) library(corrplot) Assignment 7 - Linear Regression Question 6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a suff...
Classification Data Cleanup
library(ggplot2) require(gridExtra) library(car) library(factoextra) library(dplyr) library(DT) library(knitr) Data Exploration df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytics-Data-Mining/master/Classification%20Project/crime-training-data_modified.csv") datatable(df) Summary First, we take a look at a summ...
Classification Metrics
Download Data df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytics-Data-Mining/master/Classification%20Metrics/classification-output-data.csv") datatable(df) Confusion Matrix cm <- as.matrix.data.frame(table(df$scored.class, df$class)) rownames(cm) <- c('predicted negative', 'predicted positive') colnames(cm) <- c('a...
Data 624 Homework 5
Assignment 5 library(fpp2) library(mlbench) library(corrplot) library(ggplot2) require(gridExtra) library(car) library(caret) library(tidyverse) library(DT) library(plotly) Question 7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of α and ...
Modeling
Assignment 4 library(mlbench) library(corrplot) library(ggplot2) require(gridExtra) library(car) library(caret) library(tidyverse) library(DT) Question 3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There ...
Moneyball Model
library(corrplot) library(psych) library(ggplot2) require(gridExtra) library(car) library(mice) library(VIM) library(caret) library(dplyr) library(MASS) Read Data Here, we read the dataset and shorten the feature names for better readability in visualizations. df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytic...
Time Series
Time Series Decomposition library(fpp2) library(seasonal) library(zoo) library(plotly) Question 6.2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? There are sea...
Data 624 Homework 2: Time Series
Time Series library(fpp2) library(zoo) library(plotly) require(gridExtra) Question 3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. usnetelec usgdp mcopper enplanements usnetelec data(usnetelec) invisible(lambda <- round(BoxCox.lambda(usnetelec),3)) grid.arrange(autoplot(usnetelec...