Publications by Eric Cai - The Chemical Statistician

Displaying Isotopic Abundance Percentages with Bar Charts and Pie Charts

17.02.2013

The Structure of an Atom: An atom consists of a nucleus at the centre and electrons moving around it. The nucleus contains a mixture of protons and neutrons. For most purposes in chemistry, the two most important properties of these 3 types of particles are their masses and charges. In terms of charge, protons are positive, electrons are ...

2614 sym R (1003 sym/1 pcs) 10 img

Getting Help with R Programming: Useful Survival Skills

23.02.2013

Useful Resources to Learn about R on the Internet: When I program in R and struggle with something, the first thing that I usually turn to is Google. I search for the relevant function or the desired outcome, and I often find the solutions within the first few hits. They likely show up in the documentation, online discussion forums like Nabble and...

4071 sym R (1069 sym/6 pcs) 6 img

Adding Labels to Points in a Scatter Plot in R

02.03.2013

What’s the Scatter? A scatter plot displays the values of 2 variables for a set of data, and it is a very useful way to visualize data during exploratory data analysis, especially (though not exclusively) when you are interested in the relationship between a predictor variable and a target variable.  Sometimes, such data come with categorical...

3878 sym R (1937 sym/5 pcs) 8 img

Discovering Argon with the 2-Sample t-Test

10.03.2013

I learned about Lord Rayleigh’s discovery of argon in my 2nd-year analytical chemistry class while reading “Quantitative Chemical Analysis” by Daniel Harris.  (William Ramsay was also responsible for this discovery.)  This is one of my favourite stories in chemistry; it illustrates how diligence in measurement can lead to an elegant and s...

8659 sym Python (1504 sym/3 pcs) 12 img

My Own R Function and Script for Simple Linear Regression – An Illustration with Exponential Decay of DDT in Trout

24.03.2013

Here is the function that I wrote for doing simple linear regression, as alluded to in my blog post about simple linear regression on log-transformed data on the decay of DDT concentration in trout in Lake Michigan.  My goal was to replicate the 4 columns of the output from applying summary() to the output of lm(). To use this file and this scr...

1541 sym Python (3632 sym/2 pcs) 4 img

Estimating the Decay Rate and the Half-Life of DDT in Trout – Applying Simple Linear Regression with Logarithmic Transformation

24.03.2013

This blog post uses a function and a script written in R that were displayed in an earlier blog post. Introduction: This is the second of a series of blog posts about simple linear regression; the first was written recently on some conceptual nuances and subtleties about this model. In this blog post, I will use simple linear regression to ana...

6406 sym R (1131 sym/2 pcs) 66 img

Checking for Normality with Quantile Ranges and the Standard Deviation

31.03.2013

Introduction: I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before. This turns out to be a good way to check for normality in a data set. In...

5360 sym R (304 sym/1 pcs) 34 img

How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure

31.03.2013

In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees. I also sometimes see fog – water droplets or ice crystals that are suspended in air and often block visibility at great distances. Have you ever ...

6652 sym R (1467 sym/3 pcs) 24 img

Checking the Goodness of Fit of the Poisson Distribution in R for Alpha Decay by Americium-241

14.04.2013

Introduction: Today, I will discuss the alpha decay of americium-241 and use R to model the number of emissions from a real data set with the Poisson distribution. I was especially intrigued to learn about the use of Am-241 in smoke detectors, and I will elaborate on this clever application. I will then use the Pearson chi-squared test to...

7673 sym R (4511 sym/3 pcs) 14 img

The Golden Section Search Method: Modifying the Bisection Method with the Golden Ratio for Numerical Optimization

22.04.2013

Introduction: The first algorithm that I learned for root-finding in my undergraduate numerical analysis class (MACM 316 at Simon Fraser University) was the bisection method. It’s very intuitive and easy to implement in any programming language (I was using MATLAB at the time). The bisection method can be easily adapted for optimizing 1-dime...

6760 sym R (467 sym/2 pcs) 102 img
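
For readers who want a feel for the idea behind the golden section search entry above, here is a minimal sketch of the general technique. It is not the code from the linked post (which is listed as R); it is written in Python, and the names `golden_section_min`, `INV_PHI`, `tol`, and `max_iter` are hypothetical, chosen only for illustration. It assumes the function is unimodal on the starting interval.

```python
import math

# Inverse golden ratio, (sqrt(5) - 1) / 2, roughly 0.618
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_min(f, lower, upper, tol=1e-8, max_iter=200):
    """Approximate the minimizer of a unimodal function f on [lower, upper].

    Illustrative sketch only; all names and defaults are assumptions.
    """
    # Two interior probe points, placed symmetrically using the golden ratio
    x1 = upper - INV_PHI * (upper - lower)
    x2 = lower + INV_PHI * (upper - lower)
    f1, f2 = f(x1), f(x2)

    for _ in range(max_iter):
        if upper - lower < tol:
            break
        if f1 < f2:
            # Minimum lies in [lower, x2]; the old x1 is reused as the new x2
            upper, x2, f2 = x2, x1, f1
            x1 = upper - INV_PHI * (upper - lower)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, upper]; the old x2 is reused as the new x1
            lower, x1, f1 = x1, x2, f2
            x2 = lower + INV_PHI * (upper - lower)
            f2 = f(x2)

    # Midpoint of the final bracket as the estimate
    return (lower + upper) / 2


# Usage: minimize (x - 2)^2 on [0, 5]; the result should be close to 2
x_min = golden_section_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(x_min)
```

Because the probe points are spaced by the golden ratio, one of them can be reused at every step, so each iteration costs only one new function evaluation while the bracket shrinks by a factor of about 0.618.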