Publications by Haley Koprivsek
Internal Consistency & Principal Component Analysis of Student Satisfaction Survey
1 Introduction The data in this analysis come from a survey conducted at two regional universities in the U.S. to investigate which factors may affect undergraduate business students' satisfaction with their institution. The study population was defined as all undergraduate business students at these two universities. A total of...
8297 sym R (4926 sym/10 pcs) 5 img 3 tbl
Internal Consistency Analysis of Student Satisfaction Survey
1 Introduction The data in this analysis come from a survey conducted at two regional universities in the U.S. to investigate which factors may affect undergraduate business students' satisfaction with their institution. The study population was defined as all undergraduate business students at these two universities. A total of...
5971 sym R (1063 sym/5 pcs) 1 img 2 tbl
Comparing Sampling Methods with Bank Loan Data
1 Introduction The data set used for this analysis consists of almost 900,000 observations, each corresponding to a loan application submitted to a bank by a small business with a partial guarantee from the Small Business Administration (SBA). It is publicly available as a free download. From this data set four different samples ...
7200 sym R (5044 sym/17 pcs) 1 img 8 tbl
Sample HTML Presentation
Table of Contents: Introduction; Exploratory Data Analysis; Performing Multiple Linear Regression; Bootstrap Multiple Linear Regression; Final Model; Conclusion & Discussion. Introduction: Data set related to homes on the Melbourne housing market; 34,857 total observations and 21 different variables; 8 categorical variables (e.g. Suburb, Address, Metho...
812 sym 2 img 1 tbl
Modeling Monthly New House Sales Using Exponential Smoothing
1 Introduction 1.1 Data Set This analysis will be performed using a time series of the number of new houses sold in the U.S. in each month from January 2010 to January 2022. The data was retrieved from the website of the U.S. Census Bureau, where it is publicly available for free download. 1.2 Research Question The purpose of this analysis is...
4865 sym R (4459 sym/8 pcs) 2 img 3 tbl
Forecasting Natural Gas Prices With Decomposition
1 Introduction 1.1 Data Set The data set which will be used for this analysis is a time series of the monthly price of natural gas from January 1997 to August 2020, collected by the U.S. Energy Information Administration. It is publicly available for free download on the datahub.io website. For the purposes of this analysis, the full data set ...
2947 sym R (2681 sym/8 pcs) 4 img 1 tbl
Comparing Accuracy of Baseline Forecasting Methods Using New Car Sales Data
1 Introduction 1.1 Data Set This analysis will be performed using a time series data set consisting of the total reported sales (in U.S. dollars) by new car dealers in the U.S. for each month from January of 2000 to August of 2023. This data is collected by the U.S. Census Bureau and is publicly available on their website. 1.2 Research Quest...
5920 sym R (3055 sym/6 pcs) 2 img 2 tbl
Poisson & Quasi-Poisson Regression Modeling of Brooklyn Bridge Cyclists Counts
1 Introduction 1.1 Data Set & Variables
bike.counts <- read.csv("C:\\Users\\eh738\\OneDrive\\Documents\\STA321\\BikeCounts.csv")
# removing commas from count variable values and converting to numeric
bike.counts$BrooklynBridge <- as.numeric(gsub(",", "", bike.counts$BrooklynBridge))
bike.counts$Total <- as.numeric(gsub(",", "", bike.counts$Tota...
11307 sym R (6008 sym/11 pcs) 3 img 2 tbl
Poisson Regression Modeling of Brooklyn Bridge Bike Counts
1 Introduction 1.1 Data Set & Variables
bike.counts <- read.csv("C:\\Users\\eh738\\OneDrive\\Documents\\STA321\\BikeCounts.csv")
# removing commas from count variable values and converting to numeric
bike.counts$BrooklynBridge <- as.numeric(gsub(",", "", bike.counts$BrooklynBridge))
bike.counts$Total <- as.numeric(gsub(",", "", bike.counts$Tota...
9751 sym Python (897 sym/3 pcs) 2 tbl
Predicting Customer Churn Via Logistic Regression Modeling
1 Introduction The data set used in this analysis is a subset of a larger pool of customer data collected by a telecommunications company to investigate what factors may contribute to the retention or churn (i.e., loss) of customers. This data set consists of 1000 observations and 14 variables. There are no missing values. It is available for f...
7827 sym R (5414 sym/11 pcs) 3 img 1 tbl