Publications by Avraham

Delaporte package: The SPARCmonster is sated

31.03.2017

Finally, finally after months of pulling out my hair, the Delaporte project on CRAN passes all of its checks on Solaris SPARC. The last time it did that, it was still using serial C++. Now it uses OpenMP-based parallel Fortran 2003. What a relief! One of these days I should write up what I did and why, but for now, I’ll be glad to put the proje...

787 sym 2 img

CUNY MSDS-Fall 2019-DT 605-Week 14-Calculus

25.11.2019

Question Find a formula for the \(n^{\textrm{th}}\) term of the Taylor series of \(\ln(1 + x)\), centered at 0, by finding the coefficients of the first few powers of \(x\) and looking for a pattern. (The formulas for several of these are found in Key Idea 8.8.1; show work verifying these formulas.) Answer Using existing knowledge of \(\ln{x}\):...

1957 sym

CUNY MSDS-Fall 2019-DT 605-Week 15-Calculus

28.11.2019

Question 1 Find \(f_x, f_y, f_{xx}, f_{yy}, f_{xy},\) and \(f_{yx}\) for \(f(x, y) = \ln{\left(x ^ 2 + y\right)}\). Solution All the examples (except \(f_y\)) will need to make use of the chain rule in that if \(h(\mathbf{\theta}) = f(g(\mathbf{\theta}))\) then \(h'(\mathbf{\theta}) = f'(g(\mathbf{\theta}))\cdot g'(\mathbf{\theta})\). \[ \begin...

1432 sym

Blog Post 3: Training & Testing: Other Models

21.09.2020

Blog Post 3: Training & Testing: Other Models DT 621—Fall 2020 Avraham Adler 9/21/2020 Training and Testing with Hyperparameters We saw in the previous blog post that splitting data into training and testing sets in and of itself can be valuable to prevent overfitting. In this post, we will demonstrate how it can help with hyperparameter tunin...

1790 sym R (1876 sym/18 pcs) 4 img

DT621_F2020_Blog_1

16.09.2020

Blog Post 1: Orthogonal Regression DT 621 - Fall 2020 Avraham Adler 9/15/2020 What is Orthogonal Regression Orthogonal regression is when the error between the observations, \(y_i\), and the regression line is not measured solely along the y-axis (vertically) but perpendicularly (orthogonally) to the regression line. The “error” is split ...

2030 sym R (1790 sym/6 pcs) 2 img

DT 622 - Fall 2020 - L02 - Logistic Regression

13.09.2020

library(MASS) library(caret) library(data.table) Introduction This document represents an attempt to review the heart dataset and perform a logistic regression on it. As this is an explanatory document, the code will be interspersed with the findings. In a formal report, code would be relegated to a code appendix. Exploratory Data Analysis L...

5320 sym R (10745 sym/40 pcs) 2 img

Blog Post 2: Training & Testing: Linear Regression

17.09.2020

Blog Post 2: Training & Testing: Linear Regression DT 621—Fall 2020 Avraham Adler 9/17/2020 Training & Testing Sets When originally taught, most students perform their regression analysis on the totality of their data. If one is solely interested in past behavior, this is proper. However, if one is interested in predicting future behavior, it ...

1993 sym R (2841 sym/10 pcs)

DT622 - Fall 2020 - L03 - Naïve Bayes

22.09.2020

L03—Naïve Bayes DT 622—Fall 2020 Avraham Adler 9/22/2020 Introduction In this document I am going to attempt a Naïve Bayesian analysis on made-up data in order to try and deepen my understanding of the process. I will restrict the features to categorical variables for simplicity. I will sanity test my findings against an existing software ...

5475 sym R (22644 sym/40 pcs)

The Value of Cross Validation & Hyperparameter Tuning

08.10.2020

Blog Post 5: The Value of Cross Validation & Hyperparameter Tuning DT 621—Fall 2020 Avraham Adler 10/8/2020 What is Cross-Validation? We saw in previous blog posts that splitting data into training and testing sets helps to prevent overfitting. In this post, we will demonstrate another method that can be used both separately and in conjunction...

3873 sym R (2308 sym/5 pcs)