Publications by Philip Tanofsky
DATA 621 Blog Post 2
Linear Regression: Diagnostics Introduction After building a linear regression model and evaluating its estimates and goodness of fit, the data scientist should use regression diagnostics to ascertain the value of the model before using it for future predictions. The regression diagnostics include checking to see if the...
3386 sym R (6737 sym/23 pcs) 6 img
DATA 621 Blog Post 4
Robust Regression Introduction In trying to better understand how to handle real-world data sets, I dug into robust regression to explore the scenario in which errors are not normally distributed. Robust regression “is designed to estimate the mean relationship between the predictors and response” according to the LMR textbook. Robust ...
2158 sym R (12012 sym/13 pcs)
DATA 621 Blog Post 1
Simple Linear Regression Introduction Linear regression is, I suppose, the starting point of this major and the focus of this course. Estimation is the first step in building linear models. Estimation is the process of identifying the coefficient for each independent variable so as to explain as much of the response (dependent) variable as pos...
2912 sym R (8145 sym/27 pcs)
DATA 621 Blog Post 3
Ridge regression Introduction Given that big data has been a buzzword and concept for years now, it is possible for the independent variables (predictors) to outnumber the observations. When this situation arises, shrinkage methods are appropriate to counter overfitting and also identify important predictors over less...
2896 sym R (8121 sym/17 pcs)
DATA 621 Blog Post 5
Generalized Linear Models Introduction Generalized linear models allow linear regression to be applied to data sets whose response variable has an error distribution that is not normal. In this approach, the response variable is related to the linear model through a link function. This link function allows...
2830 sym R (20994 sym/24 pcs)
DATA 605 Discussion Week 12
Prompt Using R, build a regression model for data that interests you. Conduct residual analysis. Was the linear model appropriate? Why or why not? Regression Model I chose to use some NBA team data easily accessed using the NBAloveR package. The goal is to determine the relationship between the Offensive Rating, Defensive Rating, and Rating ...
2633 sym R (4363 sym/15 pcs) 3 img
DATA 622 Assignment 3 Gradient Boosting
Introduction This document discusses analyses of two datasets, the Palmer Penguin dataset and a Loan Approvals dataset prepared by Group 6. We divide the document into five parts and adopt two key principles in undertaking this analysis: First, the group has developed a system of checks and balances in preparing each model’s output. A primary and...
21425 sym R (21265 sym/6 pcs) 4 img 8 tbl
DATA 605 Discussion Week 15
Prompt Pick any exercise in Chapter 12 of the calculus textbook. Post the solution or your attempt. Discuss any issues you might have had. What were the most valuable elements you took away from this course? Exercise 12.3.6 Evaluate \(f_x(x, y)\) and \(f_y(x, y)\) at the indicated point. \[f(x,y) = x^3 - 3x + y^2 - 6y \text{ at } (-1, 3)\] Partia...
546 sym R (161 sym/8 pcs)
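For reference, Exercise 12.3.6 as stated in the excerpt above can be worked directly from the given function. This is a sketch based only on the problem statement and is not taken from the post's own solution: differentiate with respect to one variable while holding the other constant, then substitute the point \((-1, 3)\).

\[
\begin{aligned}
f_x(x, y) &= 3x^2 - 3, & f_x(-1, 3) &= 3(-1)^2 - 3 = 0,\\
f_y(x, y) &= 2y - 6, & f_y(-1, 3) &= 2(3) - 6 = 0.
\end{aligned}
\]

Both partial derivatives vanish at \((-1, 3)\), so this point is a critical point of \(f\).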
DATA 605 Discussion Week 14
Prompt Pick any exercise in 8.8 of the calculus textbook. Solve and post your solution. If you have issues doing so, discuss them. Exercise 8.8.3 Key Idea 8.8.1 gives the \(n^{th}\) term of the Taylor series of common functions. In Exercises 3–6, verify the formula given in the Key Idea by finding the first few terms of the Taylor series of th...
1963 sym R (176 sym/4 pcs)
DATA 622 Assignment 4 SVM
1 Prompt For this assignment, we will be working with a very interesting mental health data set from a real-life research project. All identifying information, of course, has been removed. The attached spreadsheet has the data (the tab name “Data”). The data dictionary is given in the second tab. You can get as creative as you want. The assig...
10305 sym R (6514 sym/36 pcs) 8 img 4 tbl