Publications by Peter's stats stuff - R
X13-SEATS-ARIMA as an automated forecasting tool
The M3 forecasting competition: The M3 forecasting competition in 2000, organized by Spyros Makridakis and Michele Hibon, tested a variety of methods against 3,003 time series, with forecasts compared to held-out test sets. The data are conveniently available for R users in the Mcomp package, and Rob Hyndman has published example code benchmarking...
Network charts of commuting in New Zealand with R and D3
Commuting between districts and cities in New Zealand: At this year’s New Zealand Statisticians Association conference I gave a talk on Modelled Territorial Authority Gross Domestic Product. One thing I’d talked about was the impact on the estimates of people residing in one Territorial Authority (district or city) but working in another one...
A (not too talkative) twitterbot is born
Twitter bots: One of my holiday projects was to get my head around how Twitter bots work, and the best (only?) way, of course, is to make one. Apparently about one in seven Twitter accounts is a computer program rather than a human user (although of course there are humans behind, and responsible for, all the robots). That’s 23 million bots (and...
Filling in the gaps – highly granular estimates of income and population for New Zealand from survey data
Individual-level estimates from survey data: I was motivated by web apps like the British Office for National Statistics’ “How well do you know your area?” and “How well does your job pay?” to see if I could turn the New Zealand Income Survey into an individual-oriented estimate of income given age group, qualification, occupation, ethnicity, region ...
Better prediction intervals for time series forecasts
Forecast Combination: I’ve referred several times to this blog post by Rob Hyndman in which he shows that a simple average of the forecasts from the ets() and auto.arima() functions in his {forecast} R package not only outperforms ets() and auto.arima() individually (in the long run, not every time), but also outperforms nearly every method that was entered in the M...
Data from the World Health Organization API
Yesterday Eric Persson released a new WHO R package, which allows easy access to the World Health Organization’s data API. He’s also done a nice vignette introducing its use. I had a play and found it gave easy access to some interesting data. Some time down the track I might do a comparison of this with other sources, the most obvious being ...
ggseas package for seasonal adjustment on the fly with ggplot2
In a post a few months ago I built a new ggplot2 statistical transformation (“stat”) to provide X13-SEATS-ARIMA seasonal adjustment on the fly. With certain exploratory workflows this can be useful, saving on a step of seasonally adjusting many different slices and dices of a multi-dimensional time series during data reshaping, and just draw...
Dynamic Stochastic General Equilibrium models made (relatively) easy with R
General Equilibrium economic models: To expand my economics toolkit I’ve been trying to get my head around Computable General Equilibrium (CGE) and Dynamic Stochastic General Equilibrium (DSGE) models. Both classes of model are used in theoretical and policy settings to understand the impact of changes to an economic system on its equilibrium s...
New Zealand Tourism Dashboard pushes Shiny to the max
Just a short note that in my day job we’ve released the New Zealand Tourism Dashboard, launched by the Associate Minister for Tourism earlier today. It’s built with R and Shiny, and I won’t say more about it than that to avoid mixing up my work and outside-work hats. Except that it’s really, really awesome, and there’s an enormous amoun...
Explore with Shiny the impact of sample size on “p-charts”
Control charts: I wanted to help explore the implications of changing sample size for a quality control process aimed at determining the defect rate in multiple sites. Defect in this particular case is binary, i.e. the products are either good or not. Much of the advice on this in the quality control literature strikes me as rather abstract and tec...