Publications by Ken Wood
Normal Distribution Assessment
A normal distribution, or Gaussian distribution, is a type of continuous probability distribution for a real-valued random variable.
\(X_i \stackrel{iid}{\sim} N(\mu,\sigma_0^2)\)
prior: \(\mu \sim N(m_0,s_0^2)\)
temps = c(94.6,95.4,96.2,94.9,95.9)
mean(temps)
## [1] 95.4
qnorm(.975,95.41,.042)
## [1] 95.49232
pnorm(100,95.41,.042)
## [1] ...
3617 sym
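The normal model with known variance and a normal prior on the mean has a closed-form posterior, which can be sketched in a few lines of R. The hyperparameter values below (`sigma0_sq`, `m0`, `s0_sq`) are assumptions chosen for illustration, since the excerpt does not state them; only `temps` comes from the excerpt. Note that `qnorm()` and `pnorm()` expect a standard deviation as their third argument, not a variance.

```r
# Observed temperatures, as given in the excerpt.
temps <- c(94.6, 95.4, 96.2, 94.9, 95.9)

# Hypothetical values: the excerpt does not state these.
sigma0_sq <- 0.25   # known data variance (assumed)
m0        <- 95.5   # prior mean (assumed)
s0_sq     <- 0.25   # prior variance (assumed)

# Conjugate update: precisions add, and the posterior mean is a
# precision-weighted average of the data and the prior mean.
n <- length(temps)
post_precision <- n / sigma0_sq + 1 / s0_sq
post_var  <- 1 / post_precision
post_mean <- (sum(temps) / sigma0_sq + m0 / s0_sq) / post_precision

# qnorm() and pnorm() take a standard deviation, so pass sqrt(post_var).
upper_95    <- qnorm(0.975, mean = post_mean, sd = sqrt(post_var))
p_below_100 <- pnorm(100, mean = post_mean, sd = sqrt(post_var))
```

With more observations the data term `n / sigma0_sq` dominates the precision, so the posterior mean moves toward `mean(temps)`.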
Exponential Distribution Analysis
The exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. \(Y \sim Exp(\lambda)\), for which the gamma distribution is the conjugate prior. prior: \(\lambda \sim Gamma...
2590 sym 2 img
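As a minimal sketch of the exponential-gamma conjugate update: with a Gamma(shape, rate) prior on \(\lambda\), the posterior shape gains the number of observations and the posterior rate gains their sum. The prior hyperparameters and the waiting times below are hypothetical, since the excerpt truncates the original values.

```r
# Hypothetical prior hyperparameters and data (not from the original post).
alpha_prior <- 2                 # prior shape (assumed)
beta_prior  <- 1                 # prior rate (assumed)
y <- c(0.8, 1.5, 0.3, 2.1)       # hypothetical waiting times

# Conjugate update for an exponential likelihood.
alpha_post <- alpha_prior + length(y)
beta_post  <- beta_prior + sum(y)

# Posterior mean and an equal-tailed 95% credible interval for lambda.
post_mean <- alpha_post / beta_post
cred_int  <- qgamma(c(0.025, 0.975), shape = alpha_post, rate = beta_post)
```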
Poisson Distribution Assessment
The Poisson likelihood is often used to model count data since Poisson random variables are integer-valued, starting at 0. An example scenario that we could appropriately model with a Poisson likelihood: predicting the number of goals scored in a hockey match. Each of the following gamma distributions is being considered as a prior for a Poisson mean...
2106 sym 1 img
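One way to compare candidate gamma priors for a Poisson mean is to tabulate their means and standard deviations, then apply the conjugate update. The candidate priors and the goal counts below are hypothetical; the excerpt truncates the actual list.

```r
# Hypothetical candidate priors (shape a, rate b); not the ones from the post.
priors <- data.frame(a = c(1, 4, 16), b = c(1, 2, 4))
priors$mean <- priors$a / priors$b          # prior mean = a / b
priors$sd   <- sqrt(priors$a) / priors$b    # prior sd = sqrt(a) / b

# Hypothetical goal counts for five matches.
goals <- c(3, 1, 4, 2, 5)

# Conjugate update: shape gains the total count, rate gains the number of matches.
priors$a_post    <- priors$a + sum(goals)
priors$b_post    <- priors$b + length(goals)
priors$post_mean <- priors$a_post / priors$b_post
```

The rate `b` acts like a prior sample size, so priors with a larger `b` pull the posterior mean less toward the data.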
Bernoulli-Binomial Distribution Analysis
Suppose we are giving two students a multiple-choice exam with 40 questions, where each question has four choices. We don’t know how much the students have studied for this exam, but we think that they will do better than just guessing randomly. 1. What are the parameters of interest? Parameters of interest are \(\theta_1\)=true probability the ...
2402 sym 7 img
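A minimal sketch of the beta-binomial update for one student, assuming a Beta prior that favours doing better than the 0.25 guessing rate. The prior parameters and the score are hypothetical; the excerpt does not state them.

```r
n_questions <- 40        # from the excerpt
guess_rate  <- 1 / 4     # four choices per question, from the excerpt

# Hypothetical prior and score (not from the original post).
a_prior <- 8             # assumed Beta prior shape 1
b_prior <- 4             # assumed Beta prior shape 2
correct <- 33            # assumed number of correct answers

# Conjugate update: successes add to a, failures add to b.
a_post <- a_prior + correct
b_post <- b_prior + (n_questions - correct)

# Posterior probability that the student does better than guessing.
p_better_than_guessing <- 1 - pbeta(guess_rate, a_post, b_post)
```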
Bayesian Statistics Lesson 7 Assessment
Flipping a coin with unknown probability of heads (\(\theta\)): Suppose we use a Bernoulli likelihood for each coin flip, i.e., \(f(y_i|\theta) = \theta^{y_i}(1-\theta)^{1-y_i}I_{(0\le\theta\le1)}\) for \(y_i=0\) or \(y_i=1\), and a uniform prior for \(\theta\). What is the posterior distribution for \(\theta\) if we observe the following sequence ...
674 sym 1 img
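With a uniform prior, which is Beta(1, 1), the posterior after Bernoulli flips is Beta(1 + heads, 1 + tails). The sketch below uses a hypothetical flip sequence, since the excerpt truncates the real one, and checks the normalising constant numerically.

```r
# Hypothetical sequence of flips (1 = heads, 0 = tails); not from the post.
flips <- c(1, 1, 0, 1)

heads <- sum(flips)
tails <- length(flips) - heads

# Uniform prior is Beta(1, 1), so the posterior is Beta(1 + heads, 1 + tails).
a_post <- 1 + heads
b_post <- 1 + tails
post_mean <- a_post / (a_post + b_post)

# Numerical check: integrating likelihood * prior over [0, 1] should give
# the beta function B(a_post, b_post), the posterior's normalising constant.
norm_const <- integrate(function(theta) theta^heads * (1 - theta)^tails,
                        lower = 0, upper = 1)$value
```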
Binomial Distribution Analysis
Suppose we are giving two students a multiple-choice exam with 40 questions, where each question has four choices. We don’t know how much the students have studied for this exam, but we think that they will do better than just guessing randomly. 1. What are the parameters of interest? Parameters of interest are \(\theta_1\)=true probability the ...
2101 sym 5 img
Data Science Capstone in R - Week 2 Milestone Report Using Quanteda
Instructions: The goal of this project is to show that we’ve become familiar with the data and that we are on track to create our prediction algorithm. This report (to be submitted on R Pubs (http://rpubs.com/)) explains our exploratory analysis and our goals for the eventual app and algorithm. This document should be concise and explain only...
3221 sym R (9635 sym/39 pcs) 5 img
Data Science Capstone in R - Week 3 Quiz
rm(list = ls())
library(quanteda)
## Package version: 2.1.2
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
##
## Attaching package: 'quanteda'
## The following object is masked from 'package:utils':
##
##     View
library(data.table)
library(dplyr)
##
## Attaching package: 'dplyr'
## The foll...
2162 sym R (18436 sym/46 pcs)
Katz Backoff Prediction Model - Week 3
Applying the Katz Backoff Algorithm: As noted earlier, a corpus is a body of text from which we build and test LMs.
rm(list = ls())
library(quanteda)
## Package version: 2.1.2
## Parallel computing: 2 of 4 threads used.
## See https://quanteda.io for tutorials and examples.
##
## Attaching package: 'quanteda'
## The following object is masked fr...
5764 sym R (19407 sym/54 pcs)
Katz Backoff Example
Example of Applying the Algorithm: The Little Corpus That Could. As noted earlier, a corpus is a body of text from which we build and test language models (LMs). To illustrate how the mathematical formulation of the Katz Back-Off (KBO) trigram model works, it’s helpful to look at a simple corpus that is small enough to easily keep track of the n-gram counts, but large enough to ...
6585 sym R (18310 sym/51 pcs)
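The counting and discounting step behind Katz back-off can be sketched in base R on a toy corpus. This is a simplified bigram illustration, not the trigram model from the post: the corpus, the helper names `discounted_prob` and `leftover_mass`, and the absolute discount `d = 0.5` are all assumptions made for this example.

```r
# Hypothetical toy corpus (not the corpus from the original post).
corpus <- c("buy the book", "buy the book", "sell the book",
            "buy the house", "paint the house")

# Tokenise and collect each adjacent (history, next word) pair.
tokens  <- strsplit(corpus, " ", fixed = TRUE)
firsts  <- unlist(lapply(tokens, function(w) head(w, -1)))
seconds <- unlist(lapply(tokens, function(w) tail(w, -1)))
keys    <- paste(firsts, seconds)

# Named count vectors for bigrams and for their one-word histories.
bigram_counts  <- lengths(split(keys, keys))
history_counts <- lengths(split(firsts, firsts))

# Discounted probability of an observed bigram: (count - d) / count(history).
# Unobserved bigrams get 0 here; back-off would redistribute mass to them.
discounted_prob <- function(w1, w2, d = 0.5) {
  key <- paste(w1, w2)
  if (!(key %in% names(bigram_counts))) return(0)
  (bigram_counts[[key]] - d) / history_counts[[w1]]
}

# Probability mass left over after discounting, available to unseen words.
leftover_mass <- function(w1, d = 0.5) {
  observed <- unique(seconds[firsts == w1])
  1 - sum(vapply(observed, function(w2) discounted_prob(w1, w2, d),
                 numeric(1)))
}
```

For the history "the", the two observed continuations are "book" (3 times) and "house" (2 times), so `leftover_mass("the")` is the share of probability that a back-off model would spread over words never seen after "the".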