Publications by Mary Anna Kivenson + Elina v2 + Charls

Data621_final_project

17.05.2020

Data This dataset is from the UCI Machine Learning Repository and consists of student performance information. The data contains the following features: school - student’s school (binary: ‘GP’ - Gabriel Pereira or ‘MS’ - Mousinho da Silveira) sex - student’s sex (binary: ‘F’ - female or ‘M’ - male) age - student’s age (n...

7366 sym R (15187 sym/30 pcs) 8 img 2 tbl

Multi-Level Classifier

17.05.2020

Here, we will see how a multi-level classifier can be compared with other classifiers in terms of its performance. In binary classifiers, we library("e1071") data(iris) attach(iris) head(iris) ## Sepal.Length Sepal.Width Petal.Length Petal.Width Species ## 1 5.1 3.5 1.4 0.2 setosa ## 2 4.9 3....

1762 sym R (7642 sym/32 pcs) 1 img

Binary Classifier Performance

26.04.2020

In this blog, let's review some of the performance metrics used for evaluating binary classifiers and ways to compare multiple classifier models. Readers are expected to know at least how binary classifiers work and some of the algorithms used. Regression and classification algorithms are categorised into two types: Parametric and Non-P...

7785 sym R (3716 sym/16 pcs) 12 img

data621_assignment4_build_classifer

25.04.2020

Building Models training <- read.csv( "https://raw.githubusercontent.com/charlsjoseph/Data621/master/Data621-Assignment4/insurance_tf_train.csv")[-1] test_set <- read.csv("https://raw.githubusercontent.com/charlsjoseph/Data621/master/Data621-Assignment4/insurance_tf_test.csv")[-1] df_eval <- read.csv( "https://raw.githubusercontent.com/charlsj...

1333 sym R (6607 sym/29 pcs) 4 img 1 tbl

Data621_assignment4

25.04.2020

Data Exploration Read Data Here, we read the training dataset into a dataframe. df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytics-Data-Mining/master/Insurance%20Model/insurance_training_data.csv")[-1] head(df) ## TARGET_FLAG TARGET_AMT KIDSDRIV AGE HOMEKIDS YOJ INCOME PARENT1 HOME_VAL ## 1 0 ...

2525 sym R (17753 sym/19 pcs) 4 img

Data 622 - Assignment 2

21.04.2020

Load the data Load the data and set the categorical data as factors require(stringr) ## Loading required package: stringr df <- read.csv("C:\\Users\\Charls\\Documents\\CunyMSDS\\Data622\\Assignments\\HW2\\dataset.csv") #df$Y_1 <- as.integer(as.factor(df$Y)) df$Y_n <- sapply(df$Y, function(x) {switch(as.character(x), "a" = 1, "b" = 2, "c" = 3,...

154 sym R (10268 sym/132 pcs) 6 img 1 tbl

Data621_assignment3

02.04.2020

library(ggplot2) require(gridExtra) library(car) library(factoextra) library(dplyr) library(DT) library(knitr) Data Exploration df <- read.csv("https://raw.githubusercontent.com/mkivenson/Business-Analytics-Data-Mining/master/Classification%20Project/crime-training-data_modified.csv") datatable(df) Summary First, we take a look at a summ...

3897 sym R (11205 sym/32 pcs) 6 img 1 tbl

Data622_HW1Qn2

30.03.2020

Load the ‘junk1.txt’ file. There are 100 observations with 3 columns. library(VIM) ## Loading required package: colorspace ## Loading required package: grid ## Loading required package: data.table ## VIM is ready to use. ## Since version 4.0.0 the GUI is in its own package VIMGUI. ## ## Please use the package to use the new (an...

1463 sym R (2301 sym/26 pcs) 4 img

Data622_HWQn3

30.03.2020

euclideanDist <- function(a, b){ d = 0 for(i in c(1:(length(a)) )) { d = d + (a[[i]]-b[[i]])^2 } d = sqrt(d) return(d) } knn_predict2 <- function(test_data, train_data, k_value, labelcol){ pred <- c() #empty pred vector #LOOP-1 for(i in c(1:nrow(test_data))){ #looping over each record of test data eu_di...

6 sym R (3827 sym/10 pcs) 1 img 1 tbl

Data622_HW1qn1

26.03.2020

R Markdown df <- read.csv("C:\\Users\\Charls\\Documents\\CunyMSDS\\Data622\\Assignments\\HW1\\Qn1\\data.csv") kable(df) age.group networth status credit_rating classprospect youth high employed fair no youth high employed excellent no middle high employed fair yes senior medium employed fair yes ...

212 sym R (5332 sym/83 pcs) 1 tbl