Publications by Wesley

Spearman’s Rho

30.08.2012

Spearman’s Rho Rank Correlation: There are generally three types of correlation that a researcher may encounter: Pearson’s r, Kendall’s Tau, and Spearman’s Rho. They each have their own uses and applications depending on the data and what you’re trying to achieve. This example shows how Spearman’s Rho rank correlation is calculated...

742 sym

Kendall’s Tau

05.09.2012

Kendall’s Tau: This is an example of Kendall’s Tau rank correlation. It is similar to Spearman’s Rho in that it is a non-parametric measure of correlation on ranks. It is an appropriate measure for ordinal data and is fairly straightforward when there are no ties in the ranks. When ties do exist, variations of Kendall’s Tau can...

963 sym

Using R to connect to a SQL Server and MySQL Database using MS Windows

08.09.2012

Connecting to MySQL and Microsoft SQL Server: Connecting to a MySQL database or MS SQL Server from the R environment can be extremely useful. It gives a researcher direct access to the data without having to first export it from a database and then import it from a csv file or enter it directly into R. This example shows the process to set up ...

1023 sym

One-Way ANOVA

11.09.2012

One-Way ANOVA: Analysis of variance is a tool used for a variety of purposes. Applications range from a common one-way ANOVA, to experimental blocking, to more complex nested designs. This first ANOVA example provides the necessary tools to analyze data using this technique. This example will show a basic one-way ANOVA. I will save the theory and ...

840 sym

N-Way ANOVA

15.09.2012

N-Way ANOVA example: Two-way analysis of variance is where the rubber hits the road, so to speak. It extends ANOVA from one factor to two. With two factors, there can be an interaction between them that should be tested. As one might expect, this concept can be extended beyond just ...

1258 sym

Power Analysis and the Probability of Errors

22.09.2012

Power analysis is a very useful tool for estimating the statistical power of a study. It effectively allows a researcher to determine the sample size needed to obtain the required statistical power. Clients often ask (and rightfully so) what the sample size should be for a proposed project. Sample sizes end up being a delicate balance b...

1406 sym 18 img

Data Frames and Transactions

24.09.2012

Transactions are a very useful tool when dealing with data mining. They provide a way to mine itemsets or rules on datasets. In R the data must be in transactions form. If the data is only available in a data.frame, then to create (or coerce) the data frame to transactions the researcher may use the following code. This example shows the “A...

1733 sym R (248 sym/2 pcs)

Association Rule Learning and the Apriori Algorithm

26.09.2012

Association Rule Learning (also called Association Rule Mining) is a common technique used to find associations between many variables. It is often used by grocery stores, retailers, and anyone with a large transactional database. It’s the same way that Target knows you’re pregnant, or that when you’re buying an item on Amazon.com they know what e...

3567 sym R (2170 sym/8 pcs) 8 img

Text Mining

15.10.2012

When it comes down to it, R does a really good job handling structured data like matrices and data frames. However, its ability to work with unstructured data is still a work in progress. It can and does handle text mining, but the documentation is incomplete and the capabilities still don’t compare to other programs such as MALLET or Mahout. ...

2313 sym R (2442 sym/4 pcs) 2 img 1 tbl

Mapping Capabilities in R

02.11.2012

From time to time, creating a basic map of the United States or other parts of the world to complement some statistical analysis is useful to emphasize a point. The maps package in R provides a good way to produce these maps. The map axes are based on latitude and longitude, so overlaying other information on these maps is quite simple. ...

1578 sym R (1646 sym/1 pcs) 2 img