Publications by David Timewell
Intro to Skimr
Introduction When beginning to use R, one of the first things to do after getting a view of the dataset you are dealing with, using glimpse() and str(), is to start looking at basic descriptive statistics. In the past I used summary() extensively to do this. I was going through a tutorial on EDA and linear regression and the author used the skimr package wh...
4151 sym R (4509 sym/14 pcs) 17 tbl
Basic EDA on Categorical data
Intro to EDA with Categorical variables Below is an introduction to EDA with categorical variables. This should give a decent starting point when you want to start your analysis. If your knowledge of ggplot needs some work, then this book is really good: https://r-graphics.org/ library(tidyverse) ## -- Attaching packages --------------------------...
3012 sym R (8369 sym/35 pcs) 6 img
Using R in PowerBI
Introduction: This serves as a quick guide to getting R scripts running in PowerBI Step 1 - Install R R can be installed from the following link: https://cran.r-project.org/bin/windows/base/ Remember where you installed it. Step 2 - Set up R in PowerBI Go to File -> Options and settings -> Options Select R scripting from the side menu and then in...
3084 sym 9 img
Sqldf Introduction
library(sqldf) ## Loading required package: gsubfn ## Loading required package: proto ## Loading required package: RSQLite library(readr) library(ggplot2) library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ...
1237 sym R (4986 sym/26 pcs) 1 img
Using ROC, Confusion matrix and AUC with logistic regression
Introduction On a recent project using logistic regression, I was testing my model accuracy by adjusting the classification threshold and creating many confusion matrices. I later found that using a ROC curve was a better approach to finding the optimal threshold. library(caTools) library(caret) ## Loading required package: lattice ## Loading requ...
3415 sym R (5426 sym/21 pcs) 3 img
AT3 Further Explorations - Predicting Accidents with Time Series Analysis
Introduction The Queensland government estimates the cost of fatalities and hospitalisations for 2014 alone at over $5 billion (Safer Roads, Safer Queensland: Queensland’s Road Safety Strategy 2015-21, 2019). In a recent report the estimated cost for 2018 was also put at $5 billion. With 15 percent of hospital admissions attrib...
14582 sym R (4268 sym/12 pcs) 13 img