Publications by Method Matters
Multilevel Modeling Solves the Multiple Comparison Problem: An Example with R
Comparing multiple group-level means is a tricky problem in statistical inference. A standard practice is to adjust the threshold for statistical significance according to the number of pairwise tests performed. For example, according to the widely known Bonferroni method, if we have 3 different groups for which we want to compare the means ...
17465 sym R (5196 sym/9 pcs) 4 img 8 tbl
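As a rough illustration of the adjustment the excerpt above describes, the sketch below computes a Bonferroni-corrected significance threshold for all pairwise comparisons among 3 groups. The function name `bonferroni_threshold`, the starting level of 0.05, and the choice to test every pair are assumptions made for this example; the excerpt is cut off before it gives its own numbers.

```python
from math import comb


def bonferroni_threshold(n_groups, alpha=0.05):
    """Return (number of pairwise tests, per-test significance threshold).

    Assumes every pair of groups is compared once, so the number of
    tests is "n_groups choose 2". alpha=0.05 is an assumed default.
    """
    n_tests = comb(n_groups, 2)
    return n_tests, alpha / n_tests


# 3 groups -> 3 pairwise tests, each judged against 0.05 / 3
n_tests, threshold = bonferroni_threshold(3)
print(n_tests, round(threshold, 4))  # prints: 3 0.0167
```

The threshold shrinks quickly as groups are added: 4 groups already imply 6 pairwise tests.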
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda
In this post we will return to the Pitchfork music review data, parts of which I’ve analyzed in previous posts. Our goal here will be to use text mining and natural language processing (NLP) to understand linguistic signals of album quality. This type of analysis helps us understand what Pitchfork reviewers appreciate or dislike, and gives us a...
18746 sym R (8694 sym/12 pcs) 4 img 3 tbl
Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda
In this post we will return to the Pitchfork music review data, parts of which I’ve analyzed in previous posts. Our goal here will be to use text mining and natural language processing (NLP) to understand linguistic signals of album quality. This type of analysis helps us understand what Pitchfork reviewers appreciate or dislike, and gives us a...
18862 sym R (9218 sym/11 pcs) 4 img 3 tbl
A Tale of Two (Small Belgian) Cities with Open Data: Official Crime Statistics and Self-Reported Feelings of Safety in Leuven and Vilvoorde
In this post, we will analyze government data from the Flemish region in Belgium on A) official crime statistics and B) self-reported feelings of safety among residents of Flanders. We will focus our analysis on two cities in the province of Flemish Brabant: Leuven and Vilvoorde. A key question of this analysis is: do the residents of the safer c...
12713 sym R (10329 sym/5 pcs) 6 img 2 tbl
A Tale of Two (Small Belgian) Cities with Open Data: Official Crime Statistics and Self-Reported Feelings of Safety in Leuven and Vilvoorde
In this post, we will analyze government data from the Flemish region in Belgium on A) official crime statistics and B) self-reported feelings of safety among residents of Flanders. We will focus our analysis on two cities in the province of Flemish Brabant: Leuven and Vilvoorde. A key question of this analysis is: do the residents of the safer c...
12647 sym R (10020 sym/5 pcs) 6 img 2 tbl
FizzBuzz in R and Python
In this post, we will solve a simple problem (called “FizzBuzz”) that is asked by some employers in data scientist job interviews. The question seeks to ascertain the applicant’s familiarity with basic programming concepts. We will see 2 different ways to solve the problem in 2 different statistical programming languages: R and Python. The ...
10828 sym R (1279 sym/4 pcs)
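The excerpt above is truncated before it states the problem, so the sketch below uses the conventional FizzBuzz rules as an assumption: multiples of 3 become "Fizz", multiples of 5 become "Buzz", and multiples of both become "FizzBuzz". It is one possible Python solution, not necessarily either of the approaches taken in the post.

```python
def fizzbuzz(n):
    """Return the FizzBuzz label for a positive integer n (conventional rules assumed)."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


# Print the labels for 1 through 15 as a quick check
for i in range(1, 16):
    print(fizzbuzz(i))
```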
FizzBuzz in R and Python
In this post, we will solve a simple problem (called “FizzBuzz”) that is asked by some employers in data scientist job interviews. The question seeks to ascertain the applicant’s familiarity with basic programming concepts. We will see 2 different ways to solve the problem in 2 different statistical programming languages: R and Python. The F...
11030 sym R (1256 sym/4 pcs)
Downloading Fitbit Data Histories with R
In this post, we will see how to download personal Fitbit data histories for step counts, heart rate, and sleep via the Fitbit API. We will use a combination of existing R packages and custom calls to the Fitbit API to get all of the data we are interested in. This post won’t focus on data analysis per se, but rather data collection. As I was g...
13912 sym R (8352 sym/10 pcs) 4 tbl
Downloading Fitbit Data Histories with R
In this post, we will see how to download personal Fitbit data histories for step counts, heart rate, and sleep via the Fitbit API. We will use a combination of existing R packages and custom calls to the Fitbit API to get all of the data we are interested in. This post won’t focus on data analysis per se, but rather data collection. As I was go...
13835 sym R (7939 sym/10 pcs) 4 tbl