Publications by Christian Thieme

Inference, Prediction, and Explanation with Linear Regression

23.02.2021

Simple Linear Regression: Inference, Prediction, Explanation. Christian Thieme, 2/17/2021. Sections: A Modern Approach to Regression with R; Linear Models with R. A Modern Approach to Regression with R, Exercise 2.3: The manager of the purchasing department of a large company would like to develo...

4617 sym R (4405 sym/30 pcs) 1 img

Knowledge and Visual Analytics Final Project Proposal

26.03.2021

Knowledge and Visual Analytics Final Project Proposal. Christian Thieme, 3/26/2021. Sections: The Purpose 🐔; Simulation ⚙️; The Data 💻; Why Streamlit ❓. The Purpose 🐔: Over the past decade, the desire for all-natural, fresh, and organic food has grown exponentially. Significantly more offerings at grocery stores as well as the abundance of fa...

4327 sym 7 img

Understanding Classification Metrics

20.03.2021

Authorship: Critical Thinking Group 1 (Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, and Leo Yi). Background: In the following exercises we will be working with a version of the well-known “Pima Indians Diabetes Database”. The dataset was gathered by the National Institute of Diabetes and Dig...

12651 sym R (7564 sym/33 pcs) 2 img

Understanding Linear Regression Output in R

11.03.2021

Understanding Linear Regression Output in R. Christian Thieme, 3/10/2021. Regression is an incredibly common form of analysis used by amateurs and professionals alike. Why is that? Because it is a robust tool for understanding relationships between variables. It also allows us to make predictions on previously unseen d...

11721 sym R (185 sym/3 pcs) 12 img

Moneyball Multiple Regression Analysis

07.03.2021

Authorship: Critical Thinking Group 1 (Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, and Leo Yi). Background: On Sabermetrics. Statistics have played a role in quantifying baseball since Henry Chadwick introduced the box score in 1858. The box score was adopted from cricket scorecards and introduced metrics ...

20663 sym R (23280 sym/25 pcs) 15 img 1 tbl

Understanding Common Classification Metrics - Titanic Style

03.04.2021

Understanding Common Classification Metrics - Titanic Style. Christian Thieme, 3/31/2021. Sections: Introduction; Evaluation Metrics; The Model; Evaluating our Model; Accuracy; Classification Error Rate; Precision; Sensitivity (Recall); Specificity; Conclusion. Introduction: After successfully generating ...

9482 sym R (2178 sym/15 pcs) 7 img

Logistic Regression - Identifying High Risk Neighborhoods

16.04.2021

Authorship: Critical Thinking Group 1 (Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, and Leo Yi). Background: In the following exercises we will be working with the Boston Housing Dataset. The dataset was gathered by the US Census Bureau regarding housing in the Boston, Massachusetts area and can be obtained from the StatLi...

24316 sym R (33394 sym/41 pcs) 5 img 1 tbl

CODE APPENDIX - COVID-19 Effect on Pneumonia & Influenza

24.05.2021

Methods Dependencies library( dplyr ) library( ggplot2 ) ## Warning: package 'ggplot2' was built under R version 4.0.5 library( gridExtra ) library( rvest ) library( tidyverse ) library( multcompView ) ## Warning: package 'multcompView' was built under R version 4.0.5 library( cdcfluview ) ## Warning: package 'cdcfluview' was built under R v...

5285 sym R (80456 sym/130 pcs) 27 img

Identifying Outliers in Linear Regression - Cook's Distance

15.05.2021

Removing Outliers with Cook’s Distance. Christian Thieme, 5/14/2021. Understanding How to Use Cook’s Distance to Remove Outliers: There are many techniques for removing outliers from a dataset. One method often used in regression settings is Cook’s distance. Cook’s distance is an estimate of the influence of a data point. It t...

2967 sym R (6259 sym/10 pcs) 2 img

Poisson, quasi-Poisson, Negative Binomial, and Zero Inflated Regression

14.05.2021

Authorship: Critical Thinking Group 1 (Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, and Leo Yi). Abstract: We will explore, analyze, and model a data set containing approximately 12,000 records representing various commercially available wines. The variables are primarily related to the chemical properties of the wine bein...

19914 sym R (48044 sym/20 pcs) 8 img