Publications by Anindya Mozumdar

Analysing IPL matches using Cricsheet data – Part 2

31.12.2017

This is the second in a series of articles analysing IPL cricket matches using data from Cricsheet. The first article in the series can be found here. It showed how to load the match-level information into five main tables. In this article, our focus is on loading the ball-by-ball information. Since we are restricted...

2691 sym R (4975 sym/3 pcs)

Analysing IPL matches using Cricsheet data – Part 1

31.12.2017

In a series of articles, I will be analysing Indian Premier League (IPL) cricket matches using data from Cricsheet and the R programming language. Cricsheet is an excellent website which provides ball-by-ball data for a large number of cricket matches. The IPL is a professional Twenty20 cricket league in India. I chose the IPL because the c...

2823 sym R (3945 sym/3 pcs)

Exploring R Packages – plyr

04.12.2018

In this post, we explore the functionality provided by the plyr package. The ideas behind this package are described in this paper by Hadley Wickham. However, rather than trying to understand the theoretical underpinnings of the package, we look at some of the useful functions provided by this package and how they work. Anyone using R seriously w...

6712 sym R (11065 sym/20 pcs)

R Vocabulary – Part 1

22.12.2018

To be a proficient R user, you need to read and understand the material in the book Advanced R by Hadley Wickham. The second chapter in this book is on vocabulary – a list of functions from the base, stats and utils packages which all R users should be familiar with. In a series of posts, we will attempt to learn most of the functions mentioned...

6357 sym R (6371 sym/26 pcs) 2 img

R Vocabulary – Part 2

25.01.2019

This is the second part of the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first part of the series can be read here. The keyword function is used to define what is technically a closure in R. It has three components – its formals (arguments), the bod...

5014 sym R (5691 sym/18 pcs)

R Vocabulary – Part 3

11.02.2019

This is the third part of the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first part of the series can be read here and the second part of the series can be read here. We start this article by looking at some functions which work on dates. The function str...

9047 sym R (7204 sym/25 pcs)

R Vocabulary – Part 4

06.03.2019

This is the fourth and final part in the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first, second and third parts of the series can be read here, here and here. In this article, we explore most of the functions mentioned under the heading Statistics in the...

6884 sym R (8172 sym/16 pcs) 2 img

Analysing Strike Rates in the IPL using the tidyverse

05.04.2019

In this article, we analyse the strike rates of the top batsmen in the Indian Premier League. We will use the tidyverse packages for the analysis, primarily dplyr and ggplot2. The code for all the data processing and analysis can be found in this GitHub repo. We will be using data from Cricsheet. Unfortunately, the data is only available till 201...

8632 sym R (5652 sym/15 pcs) 10 img

Easily explore your data using the summarytools package

10.05.2019

Whenever we start working with data with which we are not familiar, our first step is usually some kind of exploratory data analysis. We may look at the structure of the data using the str function, or use a tool like the RStudio Viewer to examine the data. We might also use the base R function summary or the describe function from the Hmisc pack...

3688 sym R (2778 sym/5 pcs) 18 img 8 tbl

Fun with Statistics – Is Usain Bolt really the fastest man on earth?

17.05.2019

If you search for the phrase “fastest man on earth” in Google, chances are that it will return the answer “Usain Bolt”. It certainly does so for me, although the results might differ if Google decides to personalize them for you. This is because he currently holds the world record for being the quickest (9.58 s) to run a 10...

6800 sym R (8527 sym/21 pcs) 16 img