Publications by Ken Wood

Exploring the CO2 Dataset

25.02.2025

Objective: In this assignment, you will apply basic statistical analysis techniques using the R programming language to the CO2 dataset (make sure you use the CO2 dataset and not the co2 dataset; they are different). This dataset details CO2 uptake in grass plants under different environmental conditions. Your tasks will include data exploration, v...

1318 sym R (2000 sym/10 pcs) 2 img

Coding Temple R-Programming Challenge Exercise

17.02.2025

1. Setting up the Environment
Import necessary libraries and load the dataset:
library(tidyverse)
library(lubridate)
library(skimr)
library(dplyr)
whr <- read.csv("data/WHR2023.csv")
2. Initial Exploration of the Dataset
Explore the structure and summary of the dataset:
str(whr) # View the structure of the dataset
## 'data.frame': 137 obs. of ...

1245 sym R (8001 sym/14 pcs) 2 img 3 tbl

Coding Temple - R Programming - Guided Demo

16.02.2025

1. Setting up the Environment
Import necessary libraries and load the dataset:
library(tidyverse)
library(lubridate)
library(skimr)
library(dplyr)
life_exp <- read.csv("data/Life Expectancy Data.csv")
2. Initial Exploration of the Dataset
Explore the structure and summary of the dataset:
str(life_exp) # View the structure of the dataset
## 'data....

964 sym R (10993 sym/26 pcs) 4 img 3 tbl

Hierarchical Mixture Models & Likelihoods

08.08.2023

We have the mixture model written as \[f(x) = \sum_{k=1}^{K}w_kg_k(x)\] We will introduce an indicator \(C\), which is a (discrete) random variable where \(C \in \{1,2,\ldots,K\}\). Then, \(X|C \sim g_c(x)\) and \(C\sim Pr(C=k) = w_k\), and \[Pr(X)=\sum_{k=1}^{K}f(x|C=k)Pr(C=k)=\sum_{k=1}^{K}w_kg_k(x)\] Setting up the hierarchical problem: \(X|C \sim g_c(x...

6848 sym Python (603 sym/1 pcs) 1 img
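
The two-stage construction in this entry (first draw the indicator \(C\), then draw \(X\) given \(C\)) can be sketched as a small sampler. This is a minimal illustration using only the Python standard library, not the code from the publication itself; the function name `sample_mixture`, the Gaussian components, and the weights are all assumptions chosen for the example.

```python
import random

def sample_mixture(weights, components, n, seed=0):
    """Draw n samples from a finite Gaussian mixture via the indicator C.

    weights    -- mixture weights w_k (assumed to sum to 1)
    components -- list of (mean, standard deviation) pairs, one per g_k
    """
    rng = random.Random(seed)
    labels, samples = [], []
    for _ in range(n):
        # Step 1: draw the label C with Pr(C = k) = w_k
        # (labels are 0-based here, 1-based in the formulas).
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw X | C = k from the k-th component g_k.
        mu, sigma = components[k]
        labels.append(k)
        samples.append(rng.gauss(mu, sigma))
    return labels, samples

# Hypothetical two-component mixture, values chosen for illustration only.
weights = [0.6, 0.4]
components = [(0.0, 1.0), (5.0, 2.0)]
labels, samples = sample_mixture(weights, components, n=20000)

# The share of draws from the first component should be near its weight,
# and the sample mean near sum_k w_k * mu_k = 0.6*0 + 0.4*5 = 2.0.
frac_first = labels.count(0) / len(labels)
sample_mean = sum(samples) / len(samples)
print(round(frac_first, 2), round(sample_mean, 2))
```

Marginalising over the label in this way reproduces draws from \(f(x)\) without ever evaluating the mixture density directly, which is the point of the hierarchical form.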

Hierarchical Mixture Models & Likelihoods

07.08.2023

We have the mixture model written as \[f(x) = \sum_{k=1}^{K}\omega_kg_k(x)\] We will introduce an indicator \(C\), which is a (discrete) random variable where \(C \in \{1,2,\ldots,K\}\). Then, \(X|C \sim g_c(x)\) and \(C\sim Pr(C=k) = \omega_k\), and \[Pr(X)=\sum_{k=1}^{K}f(x|C=k)Pr(C=k)=\sum_{k=1}^{K}\omega_kg_k(x)\] Setting up the hierarchical problem: \...

1087 sym Python (603 sym/1 pcs) 1 img

Bayesian Zero-Inflated Mixture Models

07.08.2023

Zero inflated negative binomial distribution
x = seq(0, 15)
y = dnbinom(x, 8, 0.6)
z = 0.2*c(1,rep(0,length(x)-1)) + (1-0.2)*y
par(mfrow=c(2,1))
par(mar=c(4,4,2,2)+0.1)
barplot(y, names.arg=x, las=1, xlab = "x", ylab="Probability", border=NA, main="Negative Binomial")
par(mar=c(4,4,1,1)+0.1)
barplot(z, names.arg=x, las=1, xlab = "x", ylab=...

825 sym 4 img
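
Reading the parameters off the R code in this entry (a point mass at zero with weight 0.2, mixed with `dnbinom(x, 8, 0.6)`), the plotted zero-inflated probability mass function can be written out explicitly. The symbols \(w\), \(r\) and \(p\) are labels introduced here for illustration and do not appear in the original: \[P(X = x) = w\,\mathbf{1}\{x = 0\} + (1-w)\binom{x+r-1}{x}p^{r}(1-p)^{x}, \qquad x = 0, 1, 2, \ldots\] with \(w = 0.2\), \(r = 8\) and \(p = 0.6\). The second term is the negative binomial mass function in the size/probability parameterisation that `dnbinom` uses, so the extra mass at zero is exactly \(w\) times the indicator.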

Bayesian Mixture Model Examples

07.08.2023

Mixture of univariate Gaussians, bimodal
x = seq(-5, 12, length=100)
y = 0.6*dnorm(x, 0, 1) + 0.4*dnorm(x, 5, 2)
par(mar=c(4,4,1,1)+0.1)
plot(x, y, type="l", ylab="Density", las=1, lwd=2)
Mixture of univariate Gaussians, unimodal skewed
x = seq(-5, 12, length=100)
y = 0.55*dnorm(x, 0, sqrt(2)) + 0.45*dnorm(x, 3, 4)
par(mar=c(4,4,1,1)+0.1)
plot(x, ...

158 sym 3 img

Data Analysis Project - Bayesian Statistics V2

05.08.2023

Executive Summary The purpose of this project was to fit a series of regression models to a dataset containing housing features and a corresponding sale price as the response variable. Three models were constructed using both R and JAGS. One of the JAGS models used 3 features to predict sale prices while the final iteration used 4 features. A numbe...

3790 sym R (6308 sym/20 pcs)

Bayesian Mixture Models - Basic Definitions

04.08.2023

Suppose we have a random variable \(X\) with probability density function \(f(x)\) (continuous) or probability mass function (discrete). Then \(F(x)\) is the cumulative distribution function, where \(F(x) = P(X \le x)\). The mixture model takes the form: \[F(x) = \sum_{k=1}^{K}w_kG_k(x),\] where \(w_k\) are the weights and \(G_k(x)\) are the components such that \(...

2855 sym

Data Analysis Project - Bayesian Statistics

02.08.2023

Introduction The Ames Housing dataset, which is available on Kaggle.com, was compiled by Dean De Cock for use in data science education. It’s an incredible dataset resource for data scientists and statisticians looking for a modernized and expanded version of the often-cited Boston Housing dataset. The subject dataset contains 79 explanatory vari...

4048 sym 1 img