Publications by Datacamp - Jo Hardin

Foundations of Inference

06.01.2021

Welcome to the course! Working with the NHANES data: Throughout this chapter, you will use the NHANES dataset from the NHANES R package. The data are collected by the Centers for Disease Control and Prevention (CDC, the national public health institute in the United States) and can be thought of as a random sample of US residents. Before moving on to investigate...

31307 sym R (29255 sym/164 pcs) 77 img

Modeling with Data in the Tidyverse

04.01.2021

Background on modeling for explanation. Course overview: introduction to modeling (theory and terminology); regression (simple linear regression, multiple regression); model assessment. General modeling framework formula: \(y = f(\vec{x}) + \epsilon\), where \(y\): outcome variable of interest; \(\vec{x}\): explanatory/predictor variables; \(f()\): fun...

18656 sym R (18405 sym/105 pcs) 53 img

Case Studies: Network Analysis in R

31.12.2020

Exploring your data set. Dyads. Triads. 3-digit code: count of pairs of vertices connected by a bidirectional (symmetric) edge; count of pairs of vertices connected by an asymmetric edge; count of pairs of unconnected vertices. Letter code: C. Cyclic; D. Single edges go Down; U. Single edges come Up; T. Transitive: if any two vertices in a triad are connected to ...

31964 sym R (39342 sym/169 pcs) 90 img

Supervised Learning in R: Classification

20.12.2020

Classification with Nearest Neighbors. Recognizing a road sign with kNN: After several trips with a human behind the wheel, it is time for the self-driving car to attempt the test course alone. As it begins to drive away, its camera captures the following image: Can you apply a kNN classifier to help the car recognize this sign? The dataset signs...

20348 sym R (28773 sym/128 pcs) 53 img

Exploratory Data Analysis in R

17.12.2020

Exploring categorical data. Contingency table review: In this chapter you’ll continue working with the comics dataset introduced in the video. This is a collection of characteristics on all of the superheroes created by Marvel and DC comics in the last 80 years. Let’s start by creating a contingency table, which is a useful way to represent th...

13038 sym R (15690 sym/99 pcs) 75 img

Introduction to Writing Functions in R

16.12.2020

Why you should use functions. The arguments to mean(): mean() has 3 arguments. x: a numeric or date-time vector; trim: the proportion of outliers from each end to remove before calculating; na.rm: whether to remove missing values before calculating. Calling mean(). Pass arguments by position: mean(numbers, 0.1, TRUE). Pass arguments by name: mean(na.rm = TRUE, trim = 0.1, x = numbe...
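The two calling styles named in this excerpt can be tried directly in base R. This is a minimal sketch: the vector `numbers` below is made-up illustration data (an assumption; the course's own data is not shown in the excerpt).

```r
# Hypothetical data for illustration only; not taken from the course.
numbers <- c(1, 2, 3, 4, 100, NA)

# Pass arguments by position: x, then trim, then na.rm.
by_position <- mean(numbers, 0.1, TRUE)

# Pass arguments by name: the order no longer matters.
by_name <- mean(na.rm = TRUE, trim = 0.1, x = numbers)

# Both calls describe the same computation, so the results match.
identical(by_position, by_name)
```

Naming the less common arguments (trim, na.rm) tends to make a call easier to read than relying on position.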

22482 sym R (3018751 sym/143 pcs) 66 img

Working with Dates and Times in R

15.12.2020

Introduction to dates. Dates: 27th Feb 2013 is written 27/2/2013 in NZ and 2/27/2013 in the USA. ISO 8601 (YYYY-MM-DD): values are ordered from the largest to the smallest unit of time; each has a fixed number of digits and must be padded with leading zeros; separators are either omitted (for computers) or - in dates. Example: 1st of January 2011 -> 2011-01-01. Specifying dates: As you saw in the video, ...
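The date formats listed in this excerpt can be checked with base R's as.Date(). This is a small sketch rather than the course's own code: the variable names are hypothetical, and the format strings are assumptions chosen to match the NZ and USA examples above.

```r
# Parse an ISO 8601 string; as.Date() reads YYYY-MM-DD without a format.
new_year <- as.Date("2011-01-01")

# Regional formats are ambiguous, so each needs an explicit format string.
nz  <- as.Date("27/2/2013", format = "%d/%m/%Y")  # day/month/year
usa <- as.Date("2/27/2013", format = "%m/%d/%Y")  # month/day/year

# Both regional strings refer to the same calendar day.
nz == usa

# Formatting back to text gives the zero-padded ISO 8601 form.
format(nz, "%Y-%m-%d")
```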

25662 sym R (32366 sym/207 pcs) 54 img

Intermediate Importing Data in R

11.12.2020

R Markdown. Relational Databases: What is a relational database? How to connect? How to read a table? Database Management System (DBMS). Open source: MySQL, PostgreSQL, SQLite. Proprietary: Oracle Database, Microsoft SQL Server. SQL = Structured Query Language. Databases in R: different R packages. MySQL: RMySQL; PostgreSQL: RPostgreSQL; Oracle Data...

21859 sym R (52873 sym/123 pcs) 20 img

Introduction to Importing Data in R

11.12.2020

Introduction & read.csv. read.csv: The utils package, which is automatically loaded in your R session on startup, can import CSV files with the read.csv() function. In this exercise, you’ll be working with swimming_pools.csv (http://s3.amazonaws.com/assets.datacamp.com/production/course_1477/datasets/swimming_pools.csv); it contains data on swimm...

18478 sym R (90394 sym/105 pcs) 26 img

Intermediate Data Visualization with ggplot2

10.12.2020

Stats with geoms. ggplot2, course 2: statistics, coordinates, facets, data visualization best practices. Statistics layer: two categories of functions, those called from within a geom and those called independently (stat_). Non-parametric model is default: loess. Smoothing: To practice on the remaining layers (statistics, coordinates and facets), we’ll continue w...

25741 sym R (27222 sym/166 pcs) 176 img