Publications by Benjamin Smith

RObservations #42: Using the jinjar and tidyRSS packages to make a simple newsletter template

20.11.2022

Introduction Jinja is a powerful templating engine that is useful in a variety of contexts. Recently, I discovered how it's possible to use the power of Jinja syntax in R with the jinjar package written by David C Hall. With jinjar and the tidyRSS package by Robert Myles it is possible to make an email template that can provide short and informati...

3312 sym R (5312 sym/5 pcs) 12 img

RObservations #43: Control Individual Label Positions In mapBliss With `_flex()` Functions

23.11.2022

Introduction After introducing the mapBliss package to the world, I was pleased to see that people started using it and were experimenting with making their own map art! On GitHub, the package got a few stars, some issues were opened and closed, and some improvements have been made since my last blog on the topic. If you haven’t followed the journey of m...

2386 sym R (391 sym/1 pcs) 6 img

RObservations #36: Opinions on RStudio’s name change. A Bayesian approach with Stan

07.08.2022

Introduction Recently, RStudio announced its name change to Posit. For many this name change was accepted with open arms, but for some, not so. Being the statistician that I am, I decided to post a poll on LinkedIn to see the sentiment of my network. After running the poll for a week, the results were in: Most of the respondents to the poll voted t...

2864 sym R (7160 sym/6 pcs) 12 img

RObservations #37: Demystifying the tapply() function and comparing it to the “tidy” approach.

19.08.2022

Introduction Many seasoned base R users rely on the tapply() function in many contexts and talk about how powerful it is. However, many new R users have either never seen tapply() or have seen it and are unsure how it works. The documentation is not very helpful in explaining it either: [tapply() applies] a function to each cell of a ...

4085 sym R (2446 sym/7 pcs) 6 img

RObservations #38: Visualizing Average Delay Times On TTC Subway Stations

05.09.2022

Introduction Any Torontonian who has commuted regularly on the TTC has probably experienced their fair share of delays on the subway. Having experienced a few recently, I was inspired to visualize the average delay times across all stops on the subway. Which stations had the longest delays on average this past year? Could we make a nice vi...

4685 sym R (8280 sym/7 pcs) 14 img

RObservations #39: Uncovering A Stranger Side Of The Collatz Conjecture

02.10.2022

Introduction The Collatz Conjecture is one of the most famous unsolved problems in mathematics, yet it requires only 4th-grade math to understand. This blog was initially intended to show how to code the Collatz conjecture as a function and visualize stopping times as well as the hailstone sequences for some positive integers. Howeve...

4522 sym R (5113 sym/11 pcs) 26 img

RObservations #28 Canada’s Political Leadership and Inflation (Another Kaggle Contribution)

06.04.2022

Introduction In my last blog I shared a basic dataset listing the Prime Ministers of Canada, the start and end of their terms and the political party they associated themselves with during their tenure. In this blog I share my second dataset contribution that complements this: Canadian inflation rate data. Note: This blog is based on my Kaggle...

2757 sym R (1439 sym/2 pcs) 6 img 2 tbl

RObservations #29 – Classifying and Filtering Coordinates By Using the sf Library

10.04.2022

Introduction Geospatial analysis and visualization are powerful tools for providing insight and presenting an idea or a result in a more tangible manner. Oftentimes, we are only interested in specific points, or we want to classify the data we have by the larger location it belongs to. In this blog I share my discovery of the sf library’s st_intersec...

2816 sym R (2468 sym/5 pcs) 10 img

RObservations #30: Fixing R’s “messy string concatenation” with a special function

21.04.2022

Introduction Recently I discovered stackshare.io’s stackups, which offer comparisons of different programming languages as well as their pros and cons. While looking at the all too classic comparison between R and Python, I noticed that one of the cons listed was: “Messy syntax for string concatenation.” While it is possible to use the ...

2390 sym R (717 sym/4 pcs) 6 img

RObservations #31: Using the magick and tesseract packages to examine asterisks within the Noam Elimelech

24.05.2022

Introduction Since my last blog on Tesseract-OCR I have been playing around casually with it to see what it is capable of doing. Tesseract supports optical character recognition for over 100 languages. That, together with how straightforward it is to use in R, inspired me to try using it for Hebrew text. The last time I publicly explored...

3846 sym R (2331 sym/3 pcs) 4 img