Publications by Rsquared Academy Blog

Import Data into R – Part 1

29.07.2018

Introduction: In this post, we will learn to read data from flat or delimited files, handle column names/headers, skip text/info present before the data, specify column/variable types, and read specific columns/variables. Libraries, Data & Code: We will use the readr package. The data sets can be downloaded from here and the code from here. library(readr) T...

6913 sym R (9918 sym/11 pcs) 20 img

Import Data into R – Part 2

10.08.2018

Introduction: This is the second post in the series Importing Data into R. In the previous post, we explored reading data from flat/delimited files. In this post, we will list sheets in an Excel file, read data from an Excel sheet, read specific cells from an Excel sheet, read specific rows, read specific columns, and read data from SAS, SPSS, ...

4104 sym R (7184 sym/20 pcs) 10 img

Data Wrangling with dplyr – Part 1

22.08.2018

Introduction: According to a survey by CrowdFlower, data scientists spend most of their time cleaning and manipulating data rather than mining or modeling it for insights. As such, it becomes important to have tools that make data manipulation faster and easier. In today’s post, we introduce you to dplyr, a grammar of data manipulation. Libra...

9664 sym R (17851 sym/30 pcs) 18 img

Data Wrangling with dplyr – Part 2

03.09.2018

Introduction: In the previous post, we learnt about dplyr verbs and used them to compute the average order value from an online retail company's data. In this post, we will learn to combine tables using the different *_join functions provided in dplyr. Libraries, Code & Data: We will use the following packages: dplyr and readr. The data sets can be downloaded fr...

4390 sym R (4999 sym/9 pcs) 16 img

Data Wrangling with dplyr – Part 3

15.09.2018

Introduction: In the previous post, we learnt to combine tables using dplyr. In this post, we will explore a set of helper functions in order to extract unique rows, rename columns, sample data, extract columns, slice rows, arrange rows, compare tables, extract/mutate data using predicate functions, count observations for different levels of a variable, ...

3768 sym R (8608 sym/20 pcs) 12 img

Introduction to tibbles

27.09.2018

Introduction: A tibble, or tbl_df, is a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not. Tibbles are data.frames that are lazy and surly: they do less (i.e. they don’t change variable names or types, and don’t do partial matching) and complain more (e.g. when a variable does n...

4441 sym R (8873 sym/18 pcs)

Readable Code with Pipes

09.10.2018

Introduction: R code contains a lot of parentheses when it performs a sequence of multiple operations. When you are dealing with complex code, this results in nested function calls that are hard to read and maintain. The magrittr package by Stefan Milton Bache provides pipes, enabling us to write R code that is readable. Pipes allow us to clearly express a...

5913 sym R (8184 sym/28 pcs) 12 img

Hacking strings with stringr

21.10.2018

Introduction: In this post, we will learn to work with string data in R using stringr. As we did in the other posts, we will use a case study to explore the various features of the stringr package. Let us begin by installing and loading stringr and a set of other packages we will be using. Libraries, Code & Data: We will use the following librarie...

10578 sym R (15263 sym/41 pcs) 30 img

Working with Date and Time in R

02.11.2018

Introduction: In this post, we will learn to work with date/time data in R using lubridate, an R package that makes it easy to work with dates and times. Let us begin by installing and loading the package. Libraries, Code & Data: We will use the following packages: lubridate, dplyr, magrittr, and readr. The data sets can be downloaded from here and the c...

5767 sym R (9895 sym/25 pcs) 22 img

Categorical Data Analysis using forcats

14.11.2018

Introduction: In this post, we will learn to work with categorical/qualitative data in R using forcats. Let us begin by installing and loading forcats and a set of other packages we will be using. Libraries & Code: We will use the following packages: forcats, dplyr, magrittr, ggplot2, tibble, purrr, and readr. The code can be downloaded from here. library(forcats) libra...

6762 sym R (7833 sym/37 pcs) 36 img