Publications by Andrew Treadway

Dplython…dplyr for Python!

05.09.2018

If you’re an avid R user, you probably use the famous dplyr package. Python has a package meant to be similar to dplyr, called dplython. This article introduces how to use dplython. For the examples below, we’ll use a sample dataset that comes with R giving attributes about the US states, including population, area, and i...

4323 sym Python (2407 sym/13 pcs) 2 img

How to build a logistic regression model from scratch in R

02.10.2018

In a previous post, we showed how using vectorization in R can vastly speed up fuzzy matching. Here, we will show you how to use R’s vectorization functionality to efficiently build a logistic regression model. Now we could just use the caret or stats packages to create a model, but building algorithms from scratch is a great way to develop a...

7493 sym R (3303 sym/9 pcs) 18 img

How to run R from the Task Scheduler

31.10.2018

In a prior post, we covered how to run Python from the Task Scheduler on Windows. This article is similar, but it’ll show how to run R from the Task Scheduler instead. As before, let’s first cover how to run R from the command line, as knowing this is useful for running it from the Task Scheduler. Running R from the Command Line To ope...

4472 sym 12 img

Those “other” apply functions…

13.11.2018

So you know lapply, sapply, and apply…but…what about rapply, vapply, or eapply? These are among the lesser-known members of the apply family of functions in R, so this post will explore how they work. rapply Let’s start with rapply. This function has a couple of different purposes. One is to recursively apply a function to a lis...

4991 sym R (1253 sym/13 pcs) 14 img

10 R functions for Linux commands and vice-versa

10.12.2018

This post will go through 10 different Linux commands and their R alternatives. If you’re interested in learning more R functions for working with files like some of those below, also check out this post.

How to list all the files in a directory

Linux | R | What does it do?
ls | list.files() | Lists all the files in a directory
ls -R | list.files(rec...

2012 sym R (1914 sym/22 pcs) 2 img 10 tbl

So you want to play a pRank in R…?

18.12.2018

So…you want to play a pRank with R? This short post will give you a fun function you can use in R to help you out! How to change a file’s modified time with R Let’s say we have a file, test.txt. What if we want to change the last modified date of the file (let’s suppose the file’s not that important)? Let’s say, for instance, we wan...

1965 sym R (358 sym/4 pcs) 6 img

Creating a word cloud on R-bloggers posts

29.01.2019

This post will go through how to create a word cloud of article titles scraped from the awesome R-bloggers. Our goal will be to use R’s rvest package to search through 50 successive pages on the site for article titles. The stringr and tm packages will be used for string cleaning, and tm will also be used to create a term-document frequency matrix. W...

3262 sym R (2501 sym/9 pcs) 4 img

Speed Test: Sapply vs. Vectorization

13.03.2019

The apply functions in R are awesome (see this post for some lesser known apply functions). However, if you can use pure vectorization, then you’ll probably end up making your code run a lot faster than just depending upon functions like sapply and lapply. This is because apply functions like these still rely on looping through elements in a ...

2757 sym R (912 sym/8 pcs) 14 img

Don’t forget the “utils” package in R

03.04.2019

With thousands of powerful packages, it’s easy to gloss over the libraries that come preinstalled with R. Thus, this post will talk about some of the cool functions in the utils package, which comes with a standard installation of R. While utils comes with several familiar functions, like read.csv, write.csv, and help, it also contains over 2...

4228 sym R (658 sym/10 pcs) 14 img

Four ways to reverse a string in R

16.05.2019

R offers several ways to reverse a string, including some base R options. We go through a few of those in this post. We’ll also compare the computational time for each method. Reversing a string can be especially useful in bioinformatics (e.g. finding the reverse complement of a DNA strand). To get started, let’s generate a random string of ...

2362 sym R (711 sym/6 pcs) 10 img