Publications by Francis Smart

More Explorations with catR

01.12.2013

For the purposes of simulating computerized adaptive tests, the R package catR is unparalleled. catR is an excellent tool for students who are curious about how a computerized adaptive test might work. It is also useful for testing companies that are interested in seeing how their choices of number of items, or model, stopping rule, ...

Unobserved Effects With Panel Data

04.12.2013

It is common for researchers to be concerned about unobserved effects being correlated with observed explanatory variables. For instance, if we were curious about the effect of meditation on emotional stability, we may be concerned that some unobserved factor, such as personal genetics, might predict both likelihood to meditate ...

Incidental Parameters Problem with Binary Response Data and Unobserved Individual Effects

05.12.2013

It is a well-known problem that in some models, even as the number of observations becomes large, econometric estimators fail to be consistent. The leading case of this is when estimating a binary response model with panel data with potential “fixed effects” correlated with the explanatory variables appearing in the population...

Stick Figure Function Fun – R

31.01.2014

I have created a stick figure generating function for the purpose of adding a human figure as a demonstration of scale to some of my graphs, and potentially as emoticons in my shiny/concerto applications. You can change basic graphing parameters like scale, line color, and line width as well as more entertaining options such as arm position, ...

Using MongoHQ to build a Shiny Hit Counter

06.02.2014

In several previous posts I have shared shiny applications which temporarily store data on shiny servers, such as hit counters or the survey tool which I created. These do not work in the long term since shiny will restart its servers without warning when needed. In addition, saving data to a shiny server is not an ideal method since special ...

Easily generate correlated variables from any distribution

27.02.2014

In this post I will demonstrate in R how to draw correlated random variables from any distribution. The idea is simple. 1. Draw any number of variables from a joint normal distribution. 2. Apply the univariate normal CDF to each variable to derive probabilities for each variable. 3. Finally apply the inverse CDF of any distribution to simulate draws...
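
The three steps above can be sketched in a few lines of base R. This is a minimal illustration rather than the code from the post itself: the sample size, the correlation `rho`, and the choice of exponential and Poisson target distributions are all assumptions made for the example.

```r
# Minimal sketch of the three steps; n, rho, and the target
# distributions below are illustrative assumptions.
set.seed(1)
n   <- 10000
rho <- 0.7

# Correlation matrix of the underlying joint normal distribution
Sigma <- matrix(c(1, rho,
                  rho, 1), nrow = 2)

# Step 1: draw from a joint normal. Multiplying independent standard
# normals by the Cholesky factor of Sigma gives draws with covariance Sigma.
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(Sigma)

# Step 2: apply the univariate normal CDF to turn each column into
# probabilities that are uniform on (0, 1) but still dependent.
u <- pnorm(z)

# Step 3: apply the inverse CDF (quantile function) of each target distribution.
x <- qexp(u[, 1], rate = 2)      # exponential draws
y <- qpois(u[, 2], lambda = 3)   # Poisson draws

# Rank correlation shows the dependence carried over from the normals
cor(x, y, method = "spearman")
```

Because the transformations in steps 2 and 3 are monotone but not linear, the Pearson correlation of `x` and `y` will generally differ from `rho`; the rank correlation is the more natural check.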

The Star Puzzle

04.03.2014

The Star Puzzle is a puzzle presented on The Math Forum. I became aware of this problem by noticing the article and solution posted in the Quantitative Decisions article section. It asks the question, “How many triangles, quadrilaterals, and irregular hexagons can we form from a Star of David?” In this post I will solve these questions using R ...

Ever wonder how popular your favorite R functions are?

07.03.2014

How’s that fried pickle sandwich treating you? Perhaps your taste in R functions is less bizarre than your taste in R commands? Now you can easily find out using this new shiny app! In this post I use the R function frequency table compiled by John Myles White in 2009 in which he counts the occurrences of words in the source files of all ...

Text Mining Gun Deaths Data

13.03.2014

In this post I will explore public data being collected by Slate. I previously released code using a much earlier set of this data, demonstrating how to turn it into an animated map. Collection of this data began as a public response to the horrifying shooting at Sandy Hook Elementary in December of 2012 in an attempt to create a database of al...

It is time for RData files to become the standard for Data Transfer

20.03.2014

It is time RData files become the primary means of disseminating publicly available data online. 1. R is the most efficient statistical software at compressing data. I was recently attempting to download weather data from the US government and found myself stymied because the dataset I wanted was considered too large (+5 gigs). The problem I reali...
