Publications by William Aiken

DATA 609 HW5

03.04.2023

Exercise 1 Carry out the logistic regression (Example 22, page 94) in R.
Formula: \(y(x) = \frac{1}{1 + \exp(-(a+bx))}\)
x <- c(0.1, 0.5, 1.0, 1.5, 2.0, 2.5)
y <- c(0, 0, 1, 1, 1, 0)
lr <- glm(y ~ x, family = "binomial")
summary(lr)
##
## Call:
## glm(formula = y ~ x, family = "binomial")
##
## Deviance Residuals:
## 1 2 3 ...

1523 sym R (7226 sym/34 pcs) 2 img
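
For the HW5 entry above, the stated logistic formula can be rearranged to show why a binomial `glm` amounts to a linear fit on the log-odds scale. This is a standard rearrangement of the formula in the preview, not a step taken from the truncated homework; \(a\) and \(b\) are read here as the intercept and slope estimated by `glm(y ~ x, family = "binomial")`, whose default link is the logit.

```latex
\begin{align*}
y(x) &= \frac{1}{1 + \exp(-(a + bx))} \\
1 - y(x) &= \frac{\exp(-(a + bx))}{1 + \exp(-(a + bx))} \\
\frac{y(x)}{1 - y(x)} &= \exp(a + bx) \\
\ln\frac{y(x)}{1 - y(x)} &= a + bx
\end{align*}
```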

DATA 609 HW4

20.03.2023

Problem 1
x <- c(-0.98, 1, 2.02, 3.03, 4.00)
y <- c(2.44, -1.51, -0.47, 2.54, 7.52)
df <- data.frame(x,y)
lm <- lm(y ~ x, data = df)
summary(lm)
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
## 1 2 3 4 5
## 2.9547 -2.8511 -2.7671 -0.7037 3.3671
##
## Coefficients:
## Estimate Std....

45 sym

DATA 609 HW3

06.03.2023

Problem 1:
\(f(x) = \frac{3x^4 - 4x^3}{12}\)
\(f'(x) = x^3 - x^2\)
\(f''(x) = 3x^2 - 2x\)
\(x_{k + 1} = x_k - \frac{x_k^3 - x_k^2}{3x_k^2 - 2x_k}\)
step <- -10:10
newton <- function(x){
  x - (x^3 - x^2)/(3*x^2 - 2*x)
}
for (n in step) {
  result <- round(newton(n),3)
  print(result)
}
## [1] -6.562
## [1] -5.897
## [1] -5.231
## [1] -4.565
## [1] -3.9
## ...

241 sym R (1564 sym/9 pcs)
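
As a check on the HW3 entry above, one Newton step can be worked by hand from the first starting value in the loop, \(x_k = -10\). This is only an illustration of the update rule shown in the preview. The code applies a single step to each starting value rather than iterating to convergence.

```latex
\begin{align*}
f'(-10) &= (-10)^3 - (-10)^2 = -1000 - 100 = -1100 \\
f''(-10) &= 3(-10)^2 - 2(-10) = 300 + 20 = 320 \\
x_{k+1} &= -10 - \frac{-1100}{320} = -10 + 3.4375 = -6.5625
\end{align*}
```

The listing prints this value as -6.562 after `round(..., 3)`.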

DATA 609 HW2

21.02.2023

Problem 1 Show \(x^2+\exp(x)+2x^4+1\) is convex.
\(f(αx+βy)≤αf(x)+βf(y)\)
\((αx+βy)^2+\exp(αx+βy)+2(αx+βy)^4+1≤α(x^2+\exp(x)+2x^4+1)+β(y^2+\exp(y)+2y^4+1)\)
Using \(α+β=1\) this can be simplified
\(2αx^4+αx^2+α\exp(x)+2βy^4+βy^2+β\exp(y)+1−((αx+βy)^2+\exp(αx+βy)+2(αx+βy)^4+1)≥0\)
\(2αx^4+αx^2+α\exp(x)+2βy^4+βy^2+β\exp(y)−(...

1705 sym 1 img
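
The HW2 entry above argues from the definition of convexity. A shorter alternative check, offered here as a sketch and not as the author's argument, uses the fact that a twice-differentiable function of one real variable is convex when its second derivative is nonnegative everywhere. It assumes the function is considered on all of \(\mathbb{R}\).

```latex
\begin{align*}
f(x) &= x^2 + \exp(x) + 2x^4 + 1 \\
f'(x) &= 2x + \exp(x) + 8x^3 \\
f''(x) &= 2 + \exp(x) + 24x^2 \;\ge\; 2 \;>\; 0 \quad \text{for all } x \in \mathbb{R}
\end{align*}
```

Since \(\exp(x) > 0\) and \(24x^2 \ge 0\), the second derivative is strictly positive, so the function is strictly convex.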

DATA608 Final Project Writeup

11.12.2022

Abstract: Diabetes is a disease with a high health cost to the individual and a high monetary cost to our communities. New York State tracks diabetes rates at the county level along with other health and economic data. I leveraged this publicly available data to explore the heterogeneity in diabetes rates in New York State. I wanted to know if ther...

3808 sym

Initial Review of 2 Kaggle Glassdoor DS datasets

10.11.2021

library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(stringr)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is mask...

392 sym R (2471 sym/26 pcs) 4 img 5 tbl

Data606 Final Presentation

09.12.2021

DATA 606 Data Project Presentation William Aiken Abstract: Diabetes is a disease with a high health cost to the individual and a high monetary cost to our communities. New York State tracks diabetes rates at the county level along with other health and economic data. I leveraged this publicly available data to explore the heterogeneity in diabeti...

4362 sym 10 img 4 tbl

DATA607 Project4 WilliamAiken

15.11.2021

DATA607 Project4 William Aiken William Aiken 11/14/2021 Introduction: It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. In this project I used emails cla...

4442 sym R (11356 sym/19 pcs) 3 img

DATA607 Assignment11 Recommender William Aiken

07.11.2021

DATA607 Recommender Assignment William Aiken 11/7/2021 Introduction: My task was to analyze an existing recommender system that I find interesting. I analyzed the Tinder recommender system. Tinder is a dating app that lets users create an online profile and meet people Tinder considers compatible with them. Method: I did the followin...

4652 sym

Data607 HW10 William Aiken

01.11.2021

Introduction This is an exploration of sentiment analysis using R packages. The code is taken from the book Text Mining with R: A Tidy Approach 1. Method 2.1 The sentiments datasets Get the 3 different sentiment lexicons
library(tidytext)
get_sentiments("afinn")
## # A tibble: 2,477 × 2
## word value
## <chr> <dbl>
## 1 aba...

7900 sym R (18173 sym/108 pcs) 10 img