Publications by Christian Thieme

Car Insurance Analysis and Logistic Regression


Authorship: Critical Thinking Group 1 (Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, Leo Yi). Abstract: We will explore, analyze, and model a data set containing approximately 8,000 records. Each row represents a customer at an auto insurance company. Each record has two response variables. The first response variable, ...

Influenza and Pneumonia Mortality during the Global COVID-19 Pandemic and the Impact of Local Government Restrictions


Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, Leo Yi. 2021-05-23. Communicable Disease during COVID-19: The global COVID-19 pandemic has greatly impacted disease beyond direct cases of COVID-19. R...

Project 2 - Non-linear Regression Models


Group Members: Subhalaxmi Rout, Kenan Sooklall, Devin Teran, Christian Thieme, Leo Yi. Introduction: We have been given a dataset from a beverage manufacturing company that consists of 2,571 rows of data and 33 columns. The dataset contains information on different beverages and their chemical composition. The goal of this analysis is to u...

Nonlinear Regression Models, Regression Trees and Rules-Based Models


Homework 5: Applied Predictive Modeling 7.2 set.seed(200) trainingData <- mlbench.friedman1(200, sd = 1) trainingData$x <- data.frame(trainingData$x) testData <- mlbench.friedman1(5000, sd = 1) testData$x <- data.frame(testData$x) knn_fit<- train(x=trainingData$x,y=trainingData$y, method = "knn", preProcess = c("center", "scale"), tuneLe...

Arima Models


Forecasting: Principles and Practice 8.1 The main difference between these figures is that the scale of both the ACF values and the 95% limit lines decreases from left to right. Additionally, some of the patterns appear to change from chart to chart, with both positive and negative ACF values. It does appear that these charts indicate th...

Time Series and Decomposition


Homework 1: Time Series and Decomposition Forecasting: Principles and Practice 2.1 Use the help function to explore what the series gold, woolyrnq and gas represent. #?gold #?woolyrnq #?gas (a) Use autoplot() to plot each of these in separate plots. autoplot(gold) autoplot(woolyrnq) autoplot(gas) (b) What is the frequency of each ...

Data Pre-processing & Exponential Smoothing


Applied Predictive Modeling Chapter 3 - Data Pre-processing 3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si,...

Time Series Analysis Stock Data


Project 1: S05 - Forecast Var02, Var03 Load Libraries and Data library(readxl) # read excel library(dplyr) library(PerformanceAnalytics) # correlation and histogram library(ggplot2) # ggplot library(tidyverse) library(tidyr) # drop_na() library(forecast) # autoplot library(fpp3) library(zoo) library(seasonal) library(tidymodels) data ...

Project 1 - Time Series Analysis and Forecasting Using ARIMA and ETS


Group Members: Subhalaxmi Rout, Kenan Sooklall, Devin Teran, Christian Thieme, Leo Yi. Introduction: The dataset for this project was provided as a de-identified Excel spreadsheet that included six different groups. Each group had two variables to be forecasted 140 periods into the future using 1,622 historical periods. This data had been provided as ...

Final Project - DATA622


Group 1 Members: David Moste, Vinayak Kamath, Kenan Sooklall, Christian Thieme, Lidiia Tronina. Introduction: Data scientists are in extremely high demand around the world. Companies are constantly fighting to acquire and keep talented professionals. However, as we’ve seen in the last year, many professionals are leaving their jobs and looking for ...

