Publications by Ken Wood

Normal Distribution Assessment

10.07.2023

A normal distribution, or Gaussian distribution, is a type of continuous probability distribution for a real-valued random variable. Likelihood: \(X_i \stackrel{iid}{\sim} N(\mu,\sigma_0^2)\); prior: \(\mu \sim N(m_0,s_0^2)\).
temps = c(94.6,95.4,96.2,94.9,95.9)
mean(temps)
## [1] 95.4
qnorm(.975,95.41,.042)
## [1] 95.49232
pnorm(100,95.41,.042)
## [1] ...

3617 sym
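The normal-normal model in the excerpt above has a closed-form posterior, which can be sketched as follows. This is a minimal illustration, not the author's code: the function name `normal_posterior` and the values chosen for `sigma0_sq`, `m0`, and `s0_sq` are assumptions made only so the example runs.

```r
# Conjugate update for a normal mean when the data variance is known.
# Posterior precision = n / sigma0_sq + 1 / s0_sq.
normal_posterior <- function(y, sigma0_sq, m0, s0_sq) {
  n <- length(y)
  post_var <- 1 / (n / sigma0_sq + 1 / s0_sq)
  post_mean <- post_var * (sum(y) / sigma0_sq + m0 / s0_sq)
  list(mean = post_mean, var = post_var)
}

temps <- c(94.6, 95.4, 96.2, 94.9, 95.9)

# Hypothetical variance and prior settings (assumptions, not from the source).
post <- normal_posterior(temps, sigma0_sq = 0.25, m0 = 100, s0_sq = 0.25)

# qnorm() and pnorm() expect a standard deviation, hence sqrt() of the variance.
upper_95 <- qnorm(0.975, mean = post$mean, sd = sqrt(post$var))
p_below_100 <- pnorm(100, mean = post$mean, sd = sqrt(post$var))
```

The posterior mean is a precision-weighted average of the prior mean and the data, so it always lies between `m0` and `mean(temps)`.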

Exponential Distribution Analysis

07.07.2023

The exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. \(Y \sim Exp(\lambda)\) with a conjugate gamma prior. prior: \(\lambda \sim Gamma...

2590 sym 2 img
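For an exponential likelihood with a gamma prior on the rate, the posterior is again a gamma distribution. A minimal sketch of that update follows; the helper name `exp_gamma_posterior`, the waiting times, and the prior values `a` and `b` are all hypothetical and chosen only for illustration.

```r
# Gamma(a, b) prior on the rate lambda with exponential data y gives a
# Gamma(a + n, b + sum(y)) posterior (shape, rate parameterization).
exp_gamma_posterior <- function(y, a, b) {
  c(shape = a + length(y), rate = b + sum(y))
}

# Hypothetical waiting times and prior (assumptions, not from the source).
wait_times <- c(12, 15, 8, 13.5, 25)
post <- exp_gamma_posterior(wait_times, a = 1, b = 20)

# Posterior mean of the rate and a 95% equal-tailed credible interval.
post_mean <- post[["shape"]] / post[["rate"]]
interval <- qgamma(c(0.025, 0.975), shape = post[["shape"]], rate = post[["rate"]])
```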

Poisson Distribution Assessment

06.07.2023

The Poisson likelihood is often used to model count data since Poisson random variables are integer-valued, starting at 0. An example scenario that we could appropriately model with a Poisson likelihood: predicting the number of goals scored in a hockey match. Each of the following gamma distributions is being considered as a prior for a Poisson mean...

2106 sym 1 img
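A gamma prior on a Poisson mean updates in closed form, which a short sketch can make concrete. The goal counts, the prior parameters, and the helper name `poisson_gamma_posterior` below are hypothetical; the excerpt does not show which gamma priors the author compares.

```r
# Gamma(a, b) prior on the Poisson mean with counts y gives a
# Gamma(a + sum(y), b + n) posterior (shape, rate parameterization).
poisson_gamma_posterior <- function(y, a, b) {
  c(shape = a + sum(y), rate = b + length(y))
}

# Hypothetical goals per match and prior (assumptions, not from the source).
goals <- c(2, 3, 1, 4, 0, 2)
post <- poisson_gamma_posterior(goals, a = 2, b = 1)

# Posterior mean and a 95% equal-tailed credible interval for the mean.
post_mean <- post[["shape"]] / post[["rate"]]
interval <- qgamma(c(0.025, 0.975), shape = post[["shape"]], rate = post[["rate"]])
```

In this parameterization `b` acts like a prior sample size and `a / b` like a prior mean, which is one way to compare candidate priors.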

Bernoulli-Binomial Distribution Analysis

05.07.2023

Suppose we are giving two students a multiple-choice exam with 40 questions, where each question has four choices. We don’t know how much the students have studied for this exam, but we think that they will do better than just guessing randomly. 1. What are the parameters of interest? Parameters of interest are \(\theta_1\)=true probability the ...

2402 sym 7 img
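The exam scenario above can be sketched as a beta-binomial update, with the belief that students beat random guessing expressed as prior mass above 1/4. The Beta(8, 4) prior, the two scores, and the helper name `beta_binomial_posterior` are assumptions for illustration only.

```r
# Beta(a, b) prior with `correct` successes out of n trials gives a
# Beta(a + correct, b + n - correct) posterior.
beta_binomial_posterior <- function(correct, n, a, b) {
  c(shape1 = a + correct, shape2 = b + n - correct)
}

n_questions <- 40
guess_rate <- 1 / 4                         # four choices per question

# Hypothetical prior and scores (assumptions, not from the source).
prior <- c(a = 8, b = 4)
scores <- c(student1 = 33, student2 = 24)

# Prior probability that a student does better than random guessing.
prior_above_guess <- 1 - pbeta(guess_rate, prior[["a"]], prior[["b"]])

post1 <- beta_binomial_posterior(scores[["student1"]], n_questions,
                                 prior[["a"]], prior[["b"]])
post2 <- beta_binomial_posterior(scores[["student2"]], n_questions,
                                 prior[["a"]], prior[["b"]])

# Posterior probability that student 1 does better than random guessing.
post1_above_guess <- 1 - pbeta(guess_rate, post1[["shape1"]], post1[["shape2"]])
```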

Bayesian Statistics Lesson 7 Assessment

05.07.2023

Flipping a coin with unknown probability of heads (\(\theta\)): Suppose we use a Bernoulli likelihood for each coin flip, i.e., \(f(y_i|\theta) = \theta^{y_i}(1-\theta)^{1-y_i}I_{(0\le\theta\le1)}\) for \(y_i=0\) or \(y_i=1\), and a uniform prior for \(\theta\). What is the posterior distribution for \(\theta\) if we observe the following sequence ...

674 sym 1 img
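With a uniform prior, the posterior is proportional to the Bernoulli likelihood itself, which is a Beta(1 + heads, 1 + tails) density. The sketch below checks this numerically on a grid. The flip sequence is hypothetical, since the excerpt does not show the observed sequence.

```r
# Hypothetical sequence of flips, 1 = heads, 0 = tails (an assumption).
flips <- c(1, 0, 1, 1, 0)

# Evaluate the Bernoulli likelihood on a grid of theta values.
theta <- seq(0.001, 0.999, length.out = 999)
lik <- sapply(theta, function(t) prod(t^flips * (1 - t)^(1 - flips)))

# A uniform prior is constant, so normalizing the likelihood gives the posterior.
step <- theta[2] - theta[1]
post_grid <- lik / sum(lik * step)

# Closed-form answer: Beta(1 + number of heads, 1 + number of tails).
heads <- sum(flips)
tails <- length(flips) - heads
post_exact <- dbeta(theta, 1 + heads, 1 + tails)

max_abs_diff <- max(abs(post_grid - post_exact))
```

A small `max_abs_diff` indicates that the grid approximation and the beta density agree up to discretization error.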

Binomial Distribution Analysis

05.07.2023

Suppose we are giving two students a multiple-choice exam with 40 questions, where each question has four choices. We don’t know how much the students have studied for this exam, but we think that they will do better than just guessing randomly. 1. What are the parameters of interest? Parameters of interest are \(\theta_1\)=true probability the ...

2101 sym 5 img

Data Science Capstone in R - Week 2 Milestone Report Using Quanteda

18.10.2020

Instructions: The goal of this project is to display that we’ve become familiar with the data and that we are on track to create our prediction algorithm. This report (to be submitted on R Pubs (http://rpubs.com/)) explains our exploratory analysis and our goals for the eventual app and algorithm. This document should be concise and explain only...

3221 sym R (9635 sym/39 pcs) 5 img

Data Science Capstone in R - Week 3 Quiz

18.10.2020

rm(list = ls())
library(quanteda)
## Package version: 2.1.2
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
##
## Attaching package: 'quanteda'
## The following object is masked from 'package:utils':
##
## View
library(data.table)
library(dplyr)
##
## Attaching package: 'dplyr'
## The foll...

2162 sym R (18436 sym/46 pcs)

Katz Backoff Prediction Model - Week 3

17.10.2020

Applying the Katz Backoff Algorithm: As noted earlier, a corpus is a body of text from which we build and test LMs.
rm(list = ls())
library(quanteda)
## Package version: 2.1.2
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
##
## Attaching package: 'quanteda'
## The following object is masked fr...

5764 sym R (19407 sym/54 pcs)
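One core step of a back-off model is to discount the counts of observed trigrams and reserve the freed probability mass for unobserved continuations. The sketch below uses a fixed absolute discount, which is a simplifying assumption and may differ from the discounting used in the publication. The counts, the discount `gamma`, and all variable names are hypothetical.

```r
# Hypothetical counts of observed trigrams that start with the bigram "buy the".
tri_counts <- c(buy_the_book = 1, buy_the_house = 1)

# Hypothetical count of the bigram prefix "buy the".
bigram_count <- 2

# Fixed absolute discount applied to each observed trigram count (an assumption).
gamma <- 0.5

# Discounted probabilities of the observed trigrams given the prefix.
q_observed <- (tri_counts - gamma) / bigram_count

# Leftover mass to be redistributed over words never seen after "buy the".
alpha <- 1 - sum(q_observed)
```

The observed probabilities and the leftover mass sum to 1, so backing off to bigram estimates for the unseen words keeps the distribution properly normalized.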

Katz Backoff Example

12.10.2020

Example of Applying the Algorithm: The Little Corpus That Could. As noted earlier, a corpus is a body of text from which we build and test LMs. To illustrate how the mathematical formulation of the KBO Trigram model works, it’s helpful to look at a simple corpus that is small enough to easily keep track of the n-gram counts, but large enough to ...

6585 sym R (18310 sym/51 pcs)
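Keeping track of n-gram counts on a tiny corpus can be sketched in base R. The three sentences, the `sos` and `eos` boundary markers, and the helper name `ngram_counts` below are hypothetical and are not the corpus used in the publication.

```r
# Count n-grams within each sentence; words are joined with "_".
ngram_counts <- function(sentences, n) {
  grams <- unlist(lapply(strsplit(tolower(sentences), "\\s+"), function(w) {
    if (length(w) < n) return(character(0))
    sapply(seq_len(length(w) - n + 1),
           function(i) paste(w[i:(i + n - 1)], collapse = "_"))
  }))
  table(grams)
}

# Hypothetical corpus with explicit sentence boundary markers (an assumption).
corpus <- c("sos buy the book eos",
            "sos buy the house eos",
            "sos sell the book eos")

unigrams <- ngram_counts(corpus, 1)
bigrams <- ngram_counts(corpus, 2)
trigrams <- ngram_counts(corpus, 3)

# Example lookup: how often does "buy the" occur?
buy_the_count <- as.vector(bigrams["buy_the"])
```

Counting within sentences, rather than across the concatenated text, avoids n-grams that span a sentence boundary.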