Publications by Magnus Skonberg
DATA 605 HW 11
Background: The purpose of the assignment was to explore linear regression. More specifically, it was to use the “cars” dataset in R to build a linear model of stopping distance as a function of speed and then replicate the analysis of Chapter 3 of the course text: visualization, model quality evaluation, and residual analysis...
4497 sym R (1765 sym/18 pcs) 4 img
DATA 605 Wk 11 Disc
Background: The purpose of this week’s discussion topic is to build out a simple linear regression model and test the assumptions using any data set of interest. Data Source: The source of data is cited APA-style below: Marcos Pesotto. (2018). Happiness and Alcohol Consumption [Data file]. Retrieved from https://www.kaggle.com/marcospessotto/ha...
4432 sym R (3182 sym/20 pcs) 2 img
DATA 607 Week10 Assignment
DATA 607 Wk 10 Assignment. Contents: BACKGROUND, FUNCTIONAL EXAMPLE CODE, CODE EXTENSION, DOWNLOAD & TIDY THE TEXT, POSITIVE VS. NEGATIVE SENTIMENT WORD COUNTS, SENTIMENT FLOW, CONCLUSION. Magnus Skonberg, 2020-10-30. BACKGROUND: The purpose of this assignment is to familiarize ourselves with text mining and sentiment analysis. FUNCTIO...
7957 sym R (9776 sym/80 pcs) 9 img
DATA 607 Discussion / Assignment 11
BACKGROUND: The purpose of this assignment is to analyze an existing, interesting recommender system. I elected to analyze one of the world’s largest, most efficient, and most innovative tech companies, one that had to create its own recommendation algorithm because none at the time could scale … SCENARIO DESIGN: Perform a Scenario Design analys...
5821 sym 2 img
DATA 605 Wk 12 Disc
Background: The purpose of this week’s discussion topic is to build out a regression model and conduct residual analysis for any data set that interests us. Dataset: The source of data is cited APA-style below: UCI Machine Learning. (2018). Red Wine Quality [Data file]. Retrieved from https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2...
3675 sym R (5768 sym/17 pcs) 3 img
DATA 607 Final Project
DATA 607 - Final Project. Contents: BACKGROUND, APPROACH, DATA SOURCE(S), ACQUIRE DATA, TIDY & TRANSFORM, VISUALIZE & ANALYZE, CONCLUDE. Magnus Skonberg, 2020-12-09. BACKGROUND: We’ve explored the most valuable skills as a class and discussed a number of companies over the course of the semester (Amazon, Netflix, Spotify, etc.) … I...
9918 sym R (7366 sym/17 pcs) 1 img
DATA 608 HW1
Background: The purpose of the assignment was to explore principles of data visualization with ggplot2. Preliminary EDA: # Read in data on the fastest growing companies in the US, as compiled by Inc. magazine: inc <- read.csv("https://raw.githubusercontent.com/charleyf...
2676 sym R (7965 sym/24 pcs) 4 img
DATA 622 HW1
Background: The purpose of the assignment was to explore logistic and multinomial logistic regression. 1 - Logistic Regression with Binary Outcome: The penguin dataset has a ‘species’ column. Please check how many categories you have in the species column. Conduct w...
12610 sym R (7617 sym/28 pcs) 2 img
DATA 622 HW2
Background: The purpose of the assignment was to explore linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and naive Bayes as applied to the Palmer penguin dataset. Data Preprocessing: Before we explore LDA, QDA, and naive Bayes, as applied to...
9714 sym R (7254 sym/40 pcs) 3 img
DATA 621 HW#1
AUTHORSHIP: Critical Thinking Group 1: Angel Claudio, Bonnie Cooper, Manolis Manoli, Magnus Skonberg, Christian Thieme, and Leo Yi. BACKGROUND: On Sabermetrics. Statistics have played a role in quantifying baseball since Henry Chadwick introduced the box score in 1858. The box score was adapted from cricket scorecards and introduced metrics ...
20658 sym R (22794 sym/20 pcs) 15 img 1 tbl