Publications by Luis
Statistics unplugged
How much does statistical software help, and how much does it interfere, when teaching statistical concepts? Software used in the practice of statistics (say R, SAS, Stata, etc.) brings to the party a mental model that is often alien to students, while being highly optimized for practitioners. It is possible to introduce a minimum of distraction whi...
5973 sym R (244 sym/2 pcs) 8 img 2 tbl
Teaching linear models
I teach several courses every year and the most difficult to pull off is FORE224/STAT202: regression modeling. The academic promotion application form in my university includes a section on one’s ‘teaching philosophy’. I struggle with that part because I suspect I lack anything as grandiose as a philosophy when teaching: as most university...
5635 sym 2 img
R as a second language
Imagine that you are studying English as a second language; you learn the basic rules, some vocabulary and start writing sentences. After a little while, it is very likely that you’ll write grammatically correct sentences that no native speaker would use. You’d be following the formalisms but ignoring culture, idioms, slang and patterns of ef...
4457 sym R (1434 sym/1 pcs) 2 img 1 tbl
Less wordy R
The Swarm Lab presents a nice comparison of R and Python code for a simple (read ‘one could do it in Excel’) problem. The example works, but I was surprised by how wordy the R code was and decided to check if one could easily produce a shorter version. The beginning is pretty much the same, although I’ll use ggplot2 rather than lattice, bec...
2612 sym Python (2197 sym/6 pcs) 4 img 6 tbl
Sometimes I feel (some) need for speed
I’m the first to acknowledge that most of my code could run faster. The truth of the matter is that, in essence, I write ‘quickies’: code that will run once or twice, so there is no incentive to spend hours or days shaving seconds off a computation. Most analyses of research data fall into this approach: read data-clean data-fit model-ch...
6156 sym R (1286 sym/7 pcs) 2 img 7 tbl
Comment on Sustainability and innovation in staple crop production in the US Midwest
After writing a blog post about the paper “Sustainability and innovation in staple crop production in the US Midwest”, I decided to submit a formal comment to the International Journal of Agricultural Sustainability in July 2013, which was published today. As far as I know, Heinemann et al. provided a rebuttal to my comments, which I have not ...
9270 sym 4 img
Mucking around with maps, schools and ethnicity in NZ
I’ve been having a conversation for a while with @kamal_hothi and @aschiff on maps, schools, census, making NZ data available, etc. This post documents some basic steps I used for creating a map on ethnic diversity in schools at the census-area-unit level. This “el quicko” version requires 3 ingredients: Census area units shape files (avai...
1612 sym R (1302 sym/2 pcs) 4 img 2 tbl
Cute Gibbs sampling for rounded observations
I was attending a course on Bayesian statistics where this problem showed up: There are a number of individuals, say 12, who take a pass/fail test 15 times. For each individual we have recorded the number of passes, which can go from 0 to 15. Because of confidentiality issues, we are presented with rounded-to-the-closest-multiple-of-3 data (\(\mat...
1739 sym R (2231 sym/5 pcs) 4 img
CRAWL with Covariates
This is a simplified version of the workflow used in Bedrinana-Romano, L., Hucke-Gaete, R., Viddi, F.A., Johnson, D., Zerbini, A.N., Morales, J., Mate, B., Palacios, D.M., 2021. Defining priority areas for blue whale conservation and investigating overlap with vessel traffic in Chilean Patagonia, using a fast-fitting movement mode...
17 sym R (12179 sym/9 pcs) 2 img