Publications by Martin Chan
Vignette: Write & Read Multiple Excel files with purrr
Introduction This post will show you how to write and read a list of data tables to and from Excel with purrr, the functional programming package from tidyverse. In this example I will also use the packages readxl and writexl for reading and writing in Excel files, and cover methods for both XLSX and CSV (not strictly Excel, but might as wel...
10345 sym R (6390 sym/15 pcs) 10 img
A Short Essay on Duplicated R Artefacts
Organic Development of R Artefacts In a previous post, I alluded to the point that one of the great strengths (but also one of the challenges) of R is the organic way in which R ‘artefacts’ are developed. One characteristic of this “organic development” process is as follows. Given enough familiarity with R, anyone can create their own ...
6900 sym 2 img
Data Chats: An Interview with Avision Ho
Introduction Why do an interview? On this occasion, I’ve decided to have a conversation with a data scientist for a change (atypical of the content on this blog), as opposed to creating a vignette or reviewing a package. I’ve always enjoyed interviews as talking to people is a great way to understand and imagine what it really is like to do ...
10815 sym 4 img
LondonR: Hadley Wickham & tidyverse’s greatest hits
Meeting Hadley! Last Monday, I had the pleasure of attending a talk given by Hadley Wickham at LondonR, which was held at one of their usual venues, the UCL Darwin Lecture Theatre. For most readers of this blog, Hadley needs no introduction: it is a running joke amongst R users that if tidyverse hadn’t been rebranded, it would’ve been know...
8073 sym R (201 sym/1 pcs) 12 img
Data Chats: From Physics student to Data Science Consultant
Introduction How do you begin a career in analytics and data science? What’s the best way of learning R? Should I still bother with Excel? Arguably, these are some questions that you can gain more insights on by speaking to people than running models. This week, I have the pleasure of speaking with Abhishek Modi, to find out about his journey f...
8439 sym 6 img
First World Problems: Very long RMarkdown documents
RMarkdown is awesome! When I first started using RMarkdown, it felt very much like a blessing. Not only does the format encourage reproducible analysis by enabling you to interweave code, text, images, and plots, it also allows you to knit() the document into so many different formats, including static HTML, MS Word, PowerPoint, PDF – everythin...
5063 sym 4 img
Vignette: Google Trends with the gtrendsR package
Background Google Trends is a well-known, free tool provided by Google that allows you to analyse the popularity of top search queries on its Google search engine. In market exploration work, we often use Google Trends to get a very quick view of what behaviours, language, and general things are trending in a market. And of course, if you can do...
6249 sym R (1860 sym/6 pcs) 8 img
Vignette: Downloadable tables in RMarkdown with the DT package
Background In an earlier post in April this year, I discussed using flexdashboard (with RMarkdown) as an appealing and practical R alternative to Excel-based reporting dashboards. Since it’s possible to (i) export these ‘flexdashboards’ as static HTML files that can be opened on practically any computer (virtually no dependencies), (ii) shared...
6662 sym R (680 sym/2 pcs) 4 img
RStudio Projects and Working Directories: A Beginner’s Guide
Introduction This post provides a basic introduction on how to use RStudio Projects and structure your working directories – which is well worth a read if you are still using setwd() to set your directories! Although the R working directory is quite a basic and reasonably well-covered subject, I felt that it would still be worth sh...
10852 sym 2 img
Data cleaning with Kamehamehas in R
Background Given present circumstances in the world, I thought it might be nice to write a post on a lighter subject. Recently, I came across an interesting Kaggle dataset that features the power levels of Dragon Ball characters at different points in the franchise. Whilst the dataset itself is quite simple with only four columns (Character, P...
8234 sym R (8024 sym/10 pcs) 10 img