Publications by Andrew Treadway

Running R Code in Parallel

14.10.2017

Background: Running R code in parallel can be very useful in speeding up performance. Basically, parallelization allows you to run multiple processes in your code simultaneously, rather than iterating over a list one element at a time, or running a single process at a time. Thankfully, running R code in parallel is relatively simple using t...

5405 sym R (2184 sym/6 pcs) 4 img

Vectorize Fuzzy Matching

11.12.2017

One of the best things about R is its ability to vectorize code. This allows you to run code much faster than you would if you were using a for or while loop. In this post, we’re going to show you how to use vectorization to speed up fuzzy matching. First, a little bit of background will be covered. If you’re familiar with vectorization a...

5177 sym R (1339 sym/7 pcs) 2 img

Underrated R Functions

30.12.2017

I wanted to write a post about a couple of handy functions in R that don’t always get the recognition they deserve. This article will talk about a few functions that form part of R’s core functional programming capabilities. R has thousands of functions, so this is just a short list, and I’ll probably write other articles like this in the...

6298 sym R (1558 sym/13 pcs) 2 img

Timing Python Processes

14.01.2018

Python processes can be timed with several different packages. One of the most common ways is using the standard library package, time, which we’ll demonstrate with an example. However, another package that is very useful for timing a process, and particularly for telling you how far along a process has come, is tqdm. As we’ll...

3603 sym Python (1380 sym/4 pcs) 6 img

Coding with the Yahoo_fin Package

24.01.2018

The yahoo_fin package contains functions to scrape stock-related data from Yahoo Finance and NASDAQ. You can view the official documentation by clicking this link, but the below post will provide a few more in-depth examples. All of the functions in yahoo_fin are containe...

4334 sym Python (1525 sym/13 pcs) 2 img

ICA on Images with Python

23.06.2018

What is Independent Component Analysis (ICA)? If you’re already familiar with ICA, feel free to skip below to how we implement it in Python. ICA is a type of dimensionality reduction algorithm that transforms a set of variables to a new set of components; it does so such that the statistical ...

3874 sym Python (475 sym/5 pcs) 16 img

R: How to create, delete, move, and more with files

11.07.2018

Though Python is usually preferred over R for system administration tasks, R is actually quite useful in this regard. In this post we’re going to talk about using R to create, delete, move, and obtain information on files. How to get and change the current working directory: Before working with files, it’s usually a good idea to first ...

6431 sym R (2275 sym/24 pcs) 2 img

How to download image files with RoboBrowser

16.07.2018

In a previous post, we showed how RoboBrowser can be used to fill out online forms for getting historical weather data from Wunderground. This article will talk about how to use RoboBrowser to batch download collections of image files from Pexels, a site which offers free downloads. If you’re looking to work with images, or want to build a tr...

4359 sym Python (4912 sym/15 pcs) 4 img

How to get live stock prices with Python

31.07.2018

In a previous post, I gave an introduction to the yahoo_fin package. The latest version of the package includes new functionality allowing you to scrape live (real-time) stock prices from Yahoo Finance. In this article, we’ll go through a couple of ways of getting real-time data from Yahoo Finance for stocks, as well as how to pull cryptocu...

2567 sym Python (594 sym/5 pcs) 8 img

Getting data from PDFs the easy way with R

24.08.2018

Earlier this year, a new package called tabulizer was released for R, which allows you to automatically pull out tables and text from PDFs. Note that this package only works if the PDF’s text is highlightable (that is, typed); it won’t work for scanned-in PDFs, or image files converted to PDFs. If you don’t have tabulizer installed, ju...

3013 sym R (781 sym/8 pcs) 2 img