Publications by Ken Wood
Exploring the CO2 Dataset
Objective: In this assignment, you will apply basic statistical analysis techniques in the R programming language to the CO2 dataset (be sure to use the CO2 dataset and not the co2 dataset; they are different). This dataset details CO2 uptake in grass plants under different environmental conditions. Your tasks will include data exploration, v...
1318 sym R (2000 sym/10 pcs) 2 img
Coding Temple R-Programming Challenge Exercise
1. Setting up the Environment
Import necessary libraries and load the dataset:
library(tidyverse)
library(lubridate)
library(skimr)
library(dplyr)
whr <- read.csv("data/WHR2023.csv")
2. Initial Exploration of the Dataset
Explore the structure and summary of the dataset:
str(whr) # View the structure of the dataset
## 'data.frame': 137 obs. of ...
1245 sym R (8001 sym/14 pcs) 2 img 3 tbl
Coding Temple - R Programming - Guided Demo
1. Setting up the Environment
Import necessary libraries and load the dataset:
library(tidyverse)
library(lubridate)
library(skimr)
library(dplyr)
life_exp <- read.csv("data/Life Expectancy Data.csv")
2. Initial Exploration of the Dataset
Explore the structure and summary of the dataset:
str(life_exp) # View the structure of the dataset
## 'data....
964 sym R (10993 sym/26 pcs) 4 img 3 tbl
Hierarchical Mixture Models & Likelihoods
The mixture model is written as \[f(x) = \sum_{k=1}^{K}w_kg_k(x)\] We introduce an indicator \(C\), a discrete random variable with \(C \in \{1,2,\ldots,K\}\). Then \(X|C \sim g_c(x)\) and \(Pr(C=k) = w_k\), so that \[f(x)=\sum_{k=1}^{K}f(x|C=k)Pr(C=k)=\sum_{k=1}^{K}w_kg_k(x)\] Setting up the hierarchical problem: \(X|C \sim g_c(x...
6848 sym Python (603 sym/1 pcs) 1 img
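The hierarchical formulation in the entry above (draw the indicator \(C\) with \(Pr(C=k) = w_k\), then draw \(X|C\) from component \(g_C\)) can be sketched in a few lines. This is a minimal illustration rather than the post's own code: the Gaussian components, the weights 0.6 and 0.4, the means 0 and 5, the standard deviations 1 and 2, and the function names are all assumptions chosen for the example.

```python
import math
import random

# Illustrative two-component Gaussian mixture (assumed values, not from the post).
WEIGHTS = [0.6, 0.4]
MEANS = [0.0, 5.0]
SDS = [1.0, 2.0]


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights=WEIGHTS, means=MEANS, sds=SDS):
    """Marginal density f(x) = sum_k w_k g_k(x)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def sample_mixture(n, weights=WEIGHTS, means=MEANS, sds=SDS, seed=0):
    """Draw n values hierarchically: first C ~ Pr(C=k) = w_k, then X | C ~ g_C."""
    rng = random.Random(seed)
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[c], sds[c]) for c in labels]


if __name__ == "__main__":
    draws = sample_mixture(10_000)
    print("sample mean:", sum(draws) / len(draws))
    print("f(0):", mixture_pdf(0.0))
```

Marginalising over the indicator recovers the mixture density, so a histogram of the draws should track `mixture_pdf` under these assumptions.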
Hierarchical Mixture Models & Likelihoods
The mixture model is written as \[f(x) = \sum_{k=1}^{K}\omega_kg_k(x)\] We introduce an indicator \(C\), a discrete random variable with \(C \in \{1,2,\ldots,K\}\). Then \(X|C \sim g_c(x)\) and \(Pr(C=k) = \omega_k\), so that \[f(x)=\sum_{k=1}^{K}f(x|C=k)Pr(C=k)=\sum_{k=1}^{K}\omega_kg_k(x)\] Setting up the hierarchical problem: \...
1087 sym Python (603 sym/1 pcs) 1 img
Bayesian Zero-Inflated Mixture Models
Zero-inflated negative binomial distribution
x = seq(0, 15)
y = dnbinom(x, 8, 0.6)
z = 0.2*c(1,rep(0,length(x)-1)) + (1-0.2)*y
par(mfrow=c(2,1))
par(mar=c(4,4,2,2)+0.1)
barplot(y, names.arg=x, las=1, xlab = "x", ylab="Probability", border=NA, main="Negative Binomial")
par(mar=c(4,4,1,1)+0.1)
barplot(z, names.arg=x, las=1, xlab = "x", ylab=...
825 sym 4 img
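The zero-inflated construction in the entry above mixes a point mass at zero (weight 0.2) with a negative binomial (size 8, probability 0.6). As a rough cross-check outside R, the same probabilities can be computed with the Python standard library. The function names are hypothetical, and the sketch assumes an integer size parameter so that the binomial coefficient can be used in place of the gamma function.

```python
import math


def nbinom_pmf(x, size, prob):
    """Negative binomial pmf: probability of x failures before the size-th success.

    Assumes an integer `size`, which matches the parameterisation
    of R's dnbinom(x, size, prob) in that case.
    """
    return math.comb(x + size - 1, x) * prob**size * (1.0 - prob) ** x


def zinb_pmf(x, size, prob, zero_weight):
    """Zero-inflated pmf: zero_weight * 1{x = 0} + (1 - zero_weight) * NB(x)."""
    point_mass = 1.0 if x == 0 else 0.0
    return zero_weight * point_mass + (1.0 - zero_weight) * nbinom_pmf(x, size, prob)


if __name__ == "__main__":
    for x in range(0, 16):
        print(x, round(zinb_pmf(x, size=8, prob=0.6, zero_weight=0.2), 5))
```

The extra mass at zero is the only difference from the plain negative binomial; every other probability is scaled by 0.8.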
Bayesian Mixture Model Examples
Mixture of univariate Gaussians, bimodal
x = seq(-5, 12, length=100)
y = 0.6*dnorm(x, 0, 1) + 0.4*dnorm(x, 5, 2)
par(mar=c(4,4,1,1)+0.1)
plot(x, y, type="l", ylab="Density", las=1, lwd=2)
Mixture of univariate Gaussians, unimodal skewed
x = seq(-5, 12, length=100)
y = 0.55*dnorm(x, 0, sqrt(2)) + 0.45*dnorm(x, 3, 4)
par(mar=c(4,4,1,1)+0.1)
plot(x, ...
158 sym 3 img
Data Analysis Project - Bayesian Statistics V2
Executive Summary The purpose of this project was to fit a series of regression models to a dataset containing housing features and a corresponding sale price as the response variable. Three models were constructed using both R and JAGS. One of the JAGS models used 3 features to predict sale prices while the final iteration used 4 features. A numbe...
3790 sym R (6308 sym/20 pcs)
Bayesian Mixture Models - Basic Definitions
Suppose we have a variable \(X\) with probability density function \(f(x)\) (continuous) or probability mass function (discrete). Then \(F(x)\) is the cumulative distribution function, where \(F(x) = P(X \le x)\). The mixture model takes the form: \[F(x) = \sum_{k=1}^{K}w_kG_k(x),\] where \(w_k\) are the weights and \(G_k(x)\) are the components such that \(...
2855 sym
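The mixture form \(F(x) = \sum_{k=1}^{K}w_kG_k(x)\) from the entry above can be evaluated directly once the components are fixed. The sketch below assumes Gaussian components and weights that sum to one; the function names and the parameter values in the usage lines are illustrative choices, not taken from the post.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a Normal(mean, sd) distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_cdf(x, weights, means, sds):
    """Mixture CDF F(x) = sum_k w_k G_k(x) with Gaussian components G_k."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_cdf(x, m, s) for w, m, s in zip(weights, means, sds))


if __name__ == "__main__":
    # Assumed example: weights 0.6 / 0.4, means 0 / 5, standard deviations 1 / 2.
    for x in (-5.0, 0.0, 5.0, 12.0):
        print(x, mixture_cdf(x, [0.6, 0.4], [0.0, 5.0], [1.0, 2.0]))
```

Because each component is itself a CDF and the weights are non-negative and sum to one, the result is non-decreasing and runs from 0 to 1.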
Data Analysis Project - Bayesian Statistics
Introduction The Ames Housing dataset, which is available on Kaggle.com, was compiled by Dean De Cock for use in data science education. It’s an incredible resource for data scientists and statisticians looking for a modernized and expanded version of the often-cited Boston Housing dataset. The subject dataset contains 79 explanatory vari...
4048 sym 1 img