Publications by Joey Campbell

R Lab: Clustering

16.04.2020

K-Means Clustering The function kmeans() performs K-means clustering in R. We begin with a simple simulated example in which there truly are two clusters in the data: the first 25 observations have a mean shift relative to the next 25 observations. set.seed(2); x=matrix(rnorm(50*2), ncol=2); x[1:25,1]=x[1:25,1]+3; x[1:25,2]=x[1:25,2]-4 We now pe...

10265 sym R (2332 sym/25 pcs) 4 img
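
The K-means excerpt above is cut off before the clustering call itself. As a rough sketch of how the simulated data might be clustered with kmeans(), assuming K = 2 and nstart = 20 (both values are illustrative choices, as are the names km_out and true_group, none of which appear in the excerpt):

```r
# Simulate 50 two-dimensional points; the first 25 are shifted
# so that the data contain two groups.
set.seed(2)
x <- matrix(rnorm(50 * 2), ncol = 2)
x[1:25, 1] <- x[1:25, 1] + 3
x[1:25, 2] <- x[1:25, 2] - 4

# Assumption: centers = 2 and nstart = 20 are illustrative settings.
# nstart runs the algorithm from several random starts and keeps the best fit.
km_out <- kmeans(x, centers = 2, nstart = 20)

# Cluster label assigned to each of the 50 observations.
km_out$cluster

# Cross-tabulate the assigned clusters against the simulated groups.
true_group <- rep(c(1, 2), each = 25)
table(km_out$cluster, true_group)
```

The cluster numbers returned by kmeans() are arbitrary, so the table should be read for how cleanly the rows and columns pair up, not for matching labels.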

Linear Regression R Lab

14.04.2020

Libraries The library() function is used to load libraries, or groups of functions and data sets that are not included in the base R distribution. Basic functions that perform least squares linear regression and other simple analyses come standard with the base distribution, but more exotic functions require additional libraries. Here we load the...

31271 sym R (12488 sym/68 pcs) 11 img

Introduction to R Lab

14.04.2020

In this lab, we will introduce some simple R commands. The best way to learn a new language is to try out the commands. R can be downloaded from http://cran.r-project.org/. Basic Commands R uses functions to perform operations. To run a function called funcname, we type funcname(input1, input2), where the inputs (or arguments) input1 and input2 ...

33556 sym R (7107 sym/106 pcs) 23 img
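
The funcname(input1, input2) pattern described in the excerpt above can be illustrated with a few base R calls. This is only a small sketch; the variable names x, n, and s are placeholders chosen for illustration:

```r
# c() concatenates its inputs into a vector.
x <- c(1, 3, 2, 5)

# length() takes a single input and returns the number of elements.
n <- length(x)
n  # 4

# seq() takes several inputs; naming them makes the call easier to read.
s <- seq(from = 1, to = 10, by = 3)
s  # 1 4 7 10
```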

Case Study WHO Example Questions

07.04.2020

This code is repeated from the Tidy Data Case Study because it is needed by the exercises. library(tidyverse) who1 <- who %>% gather(new_sp_m014:newrel_f65, key = "key", value = "cases", na.rm = TRUE) glimpse(who1) Observations: 76,046 Variables: 6 $ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist...

11717 sym R (2463 sym/18 pcs) 1 img

Tidy Data Case Study

07.04.2020

Let’s pull together everything you’ve learned to tackle a realistic data tidying problem. The tidyr::who dataset contains tuberculosis (TB) cases broken down by year, country, age, gender, and diagnosis method. The data comes from the 2014 World Health Organization Global Tuberculosis Report, available at http://www.who.int/tb/country/data/do...

11067 sym R (1048 sym/11 pcs)

googleVis

04.04.2020

R Interface to Google Charts The googleVis package provides an interface between R and Google’s chart tools. It allows users to create web pages with interactive charts based on R data frames. Charts are displayed locally via the R HTTP help server. A modern browser with an Internet connection is required, and some charts also need a Flash player. T...

2073 sym R (2727 sym/16 pcs) 1 tbl

Plotly

04.04.2020

Getting Started with Plotly for R Plotly is a free and open-source graphing library for R. Plotly’s R graphing library makes interactive, publication-quality graphs. ...

4828 sym R (3609 sym/22 pcs) 1 img

SVM with CARET

02.04.2020

The Support Vector Machine (or SVM) is a useful classification technique. Support vector machine methods can handle both linear and non-linear class boundaries. They can be used for both two-class and multi-class classification problems. In real-life data, the separation boundary is generally nonlinear. Technically, the SVM algorithm performs a non-...

11331 sym R (8128 sym/22 pcs) 1 img

Lab 8 Trees Based Method

01.04.2020

Fitting Classification Trees Recursive partitioning is a fundamental tool in data mining. It helps us explore the structure of a set of data, while developing easy-to-visualize decision rules for predicting a categorical outcome (classification tree). Classification trees (as described by Breiman, Friedman, Olshen, and Stone) can be generated through the rpart ...

25498 sym R (20193 sym/44 pcs) 8 img

STA4143_mt_caret

31.03.2020

library('tidyverse') library('caret') library('modelr') set.seed(303) Part 2: Regression (40 Points) The table below displays catalog-spending data for the first few of 200 randomly selected individuals from a very large database (over 20,000 households). The variable of particular interest is catalog spending as measured by the Spending Ra...

21679 sym R (23729 sym/48 pcs) 4 img 2 tbl