Publications by Don Padmaperuma
Tidyverse - Blog1
library(tidyverse) library(kableExtra) Tidyverse and its function in Data Visualization Tidyverse is a package I always use in my R data projects. Tidyverse is a...
10988 sym R (6483 sym/24 pcs) 1 img
DATA 608 - Final Project
Libraries library(tidyr) library(dplyr) ...
2352 sym R (21656 sym/51 pcs) 2 img
DATA621_HW4
Homework4 Matthew Baker, Don Padmaperuma, Subhalaxmi Rout, Erinda Budo 11/22/2020 Overview In this homework assignment, we will explore, analyze and model a data set containing approximately 8000 records, each representing a customer at an auto insurance company. Each record has two response variables. The first response variable, TARGET_FLAG, is a 1 ...
15919 sym R (36315 sym/58 pcs) 18 img
DATA 621 - Blog2
Machine Learning and Regression Extract Data Using the Boston data set, which is part of the MASS library, to create the training and testing samples. The problem statement is to predict \(medv\) based on the set of input features. library(MASS) library(ggplot2) ...
3048 sym R (11557 sym/50 pcs) 7 img
Robust Regression - Blog Post
library(ggplot2) library(car) library(olsrr) ...
1764 sym R (3005 sym/45 pcs) 3 img
DATA 621_HW5
DATA 621: Homework 5 Matthew Baker, Don Padmaperuma, Subhalaxmi Rout, Erinda Budo 12/10/2020 Overview In this homework assignment, you will explore, analyze and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The resp...
10772 sym R (22624 sym/42 pcs) 59 img
DATA 621 _ Blog4
Poisson Regression - Blog Post Don Padmaperuma Libraries library(faraway) Introduction As chapter 5 of “Extending the Linear Model with R” by Julian J. Faraway explains, Poisson regression can be a really useful tool if we know how and when to use it. For this demonstration I wil...
3240 sym R (4252 sym/16 pcs) 2 img
DATA 621 - Blog 5: Panel Regression
Panel Data Regression Panel data (also known as longitudinal or cross-sectional time-series data) is a dataset in which the behavior of entities is observed across time. These entities could be states, companies, individuals, countries, etc. Fixed and Random Effect The terms fixed and random are used frequently in multilevel modeling. They are u...
2613 sym R (2793 sym/10 pcs)
DATA 621 Final Project
Final Project Matthew Baker, Don Padmaperuma, Subhalaxmi Rout, Erinda Budo 2020-12-15 Abstract HR Analytics finds out the people-related trends in the data and helps the HR Department take the appropriate steps to keep the organization running smoothly and profitably. Attrition in a corporate setup is one of the complex challenges that the peopl...
9535 sym R (7710 sym/6 pcs) 15 img 6 tbl
DATA 622 Homework 1
DATA622-Assignment 1 Logistic Regression with Penguin dataset Don Padmaperuma Penguin Data Set This dataset contains size measurements for three different penguin species observed on three islands in the Palmer Archipelago, Antarctica. These data were collected from 2007 - 2009 by Dr. Kristen Gorman with the Palmer Station Long Term Ecologic...
6001 sym R (28945 sym/95 pcs) 4 img