Publications by Eric Cai - The Chemical Statistician
Displaying Isotopic Abundance Percentages with Bar Charts and Pie Charts
The Structure of an Atom: An atom consists of a nucleus at the centre and electrons moving around it. The nucleus contains a mixture of protons and neutrons. For most purposes in chemistry, the two most important properties of these 3 types of particles are their masses and charges. In terms of charge, protons are positive, electrons are ...
2614 sym R (1003 sym/1 pcs) 10 img
Getting Help with R Programming: Useful Survival Skills
Useful Resources to Learn about R on the Internet: When I program in R and struggle with something, the first thing that I usually turn to is Google. I search for the relevant function or the desired outcome, and I often find the solutions within the first few hits. They likely show up in the documentation, online discussion forums like Nabble and...
4071 sym R (1069 sym/6 pcs) 6 img
Adding Labels to Points in a Scatter Plot in R
What’s the Scatter? A scatter plot displays the values of 2 variables for a set of data, and it is a very useful way to visualize data during exploratory data analysis, especially (though not exclusively) when you are interested in the relationship between a predictor variable and a target variable. Sometimes, such data come with categorical...
3878 sym R (1937 sym/5 pcs) 8 img
Discovering Argon with the 2-Sample t-Test
I learned about Lord Rayleigh’s discovery of argon in my 2nd-year analytical chemistry class while reading “Quantitative Chemical Analysis” by Daniel Harris. (William Ramsay was also responsible for this discovery.) This is one of my favourite stories in chemistry; it illustrates how diligence in measurement can lead to an elegant and s...
8659 sym Python (1504 sym/3 pcs) 12 img
My Own R Function and Script for Simple Linear Regression – An Illustration with Exponential Decay of DDT in Trout
Here is the function that I wrote for doing simple linear regression, as alluded to in my blog post about simple linear regression on log-transformed data on the decay of DDT concentration in trout in Lake Michigan. My goal was to replicate the 4 columns of the output from applying summary() to the output of lm(). To use this file and this scr...
1541 sym R (3632 sym/2 pcs) 4 img
Estimating the Decay Rate and the Half-Life of DDT in Trout – Applying Simple Linear Regression with Logarithmic Transformation
This blog post uses a function and a script written in R that were displayed in an earlier blog post. Introduction: This is the second in a series of blog posts about simple linear regression; the first, written recently, covered some conceptual nuances and subtleties of this model. In this blog post, I will use simple linear regression to ana...
6406 sym R (1131 sym/2 pcs) 66 img
Checking for Normality with Quantile Ranges and the Standard Deviation
Introduction: I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before. This turns out to be a good way to check for normality in a data set. In...
5360 sym R (304 sym/1 pcs) 34 img
How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure
In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees. I also sometimes see fog – water droplets or ice crystals that are suspended in air and often block visibility at great distances. Have you ever ...
6652 sym R (1467 sym/3 pcs) 24 img
Checking the Goodness of Fit of the Poisson Distribution in R for Alpha Decay by Americium-241
Introduction: Today, I will discuss the alpha decay of americium-241 and use R to model the number of emissions from a real data set with the Poisson distribution. I was especially intrigued to learn about the use of Am-241 in smoke detectors, and I will elaborate on this clever application. I will then use the Pearson chi-squared test to...
7673 sym R (4511 sym/3 pcs) 14 img
The Golden Section Search Method: Modifying the Bisection Method with the Golden Ratio for Numerical Optimization
Introduction: The first algorithm that I learned for root-finding in my undergraduate numerical analysis class (MACM 316 at Simon Fraser University) was the bisection method. It’s very intuitive and easy to implement in any programming language (I was using MATLAB at the time). The bisection method can be easily adapted for optimizing 1-dime...
6760 sym R (467 sym/2 pcs) 102 img