Publications by diffuseprior
Web-Scraping in R
Web-scraping, or web-crawling, sounds like a seedy activity worthy of an Interpol investigative department. The reality, however, is far less nefarious. Web-scraping is any procedure by which someone extracts data from the internet. Given that it’s possible to get the internet on computers these days, web-scraping opens an array of interesting...
2562 sym R (1496 sym/3 pcs) 18 img
Temperature Change in Ireland
Has Ireland gotten any warmer? Ask any punter on the street and they will happily inform you of wild swings, trends and dips. “Back when I was a child”, “when I was younger”, or “years ago” are the usual refrains. What’s the evidence? To answer this, I will use the temperature data from my previous post alongside the R package bcp. ...
1703 sym R (454 sym/1 pcs) 20 img
Instrumental Variables without Traditional Instruments
Typically, regression models in empirical economic research suffer from at least one form of endogeneity bias. The classic example is economic returns to schooling, where researchers want to know how much increased levels of education affect income. Estimation using a simple linear model, regressing income on schooling, alongside a bunch of contr...
3130 sym R (2091 sym/2 pcs) 18 img
Dummies for Dummies
Most R functions used in econometrics convert factor variables into a set of dummy/binary variables automatically. This is useful when estimating a linear model, saving the user from the laborious activity of manually including the dummy variables as regressors. However, what if you want to reshape your dataframe so that it contains such dummy va...
993 sym R (640 sym/1 pcs) 16 img
Probit/Logit Marginal Effects in R
The common approach to estimating a binary dependent variable regression model is to use either the logit or probit model. Both are forms of generalized linear models (GLMs), which can be seen as modified linear regressions that allow the dependent variable to originate from non-normal distributions. The coefficients in a linear regression model ...
2420 sym R (2147 sym/2 pcs) 18 img
An ivreg2 function for R
The ivreg2 command is one of the most popular routines in Stata. The reason for this popularity is its simplicity. A one-line ivreg2 command generates not only the instrumental variable regression coefficients and their standard errors, but also a number of other statistics of interest. I have come across a number of functions in R that calculate...
2563 sym R (4566 sym/3 pcs) 16 img
Simple Spatial Correlograms for Cross-Country Analysis in R
Accounting for temporal dependence in econometric analysis is important, as the presence of temporal dependence violates the assumption that observations are independent units. Historically, much less attention has been paid to correcting for spatial dependence, which, if present, also violates this independence assumption. The comparability of ...
4540 sym R (1886 sym/4 pcs) 18 img
Time-Series Policy Evaluation in R
Quantifying the success of government policies is clearly important. Randomized controlled trials, like those conducted by drug companies, are often described as the ‘gold standard’ for policy evaluation. Under these, a policy is applied to one area or group (treatment), but not to another (control). The difference in outcomes between the...
4499 sym R (1537 sym/2 pcs) 18 img
Optim, you’re doing it wrong?
Call me uncouth, but I like my TV loud, my beer cold and my optimization functions as simple as possible. Therefore, what I write in this blog post is very much from a layman’s perspective, and I am happy to be corrected on any fundamental errors. I have recently become interested in writing my own maximum likelihood estimators. However, before...
2850 sym R (2914 sym/3 pcs) 16 img
Let’s Party!
Exploring whether regression coefficients differ between groups is an important part of applied econometric research, particularly for research with a policy-based objective. For example, a government in a developing country may decide to introduce free school lunches in an effort to improve childhood health. However, if this treatment is kno...
2974 sym R (379 sym/1 pcs) 18 img