Publications by Eric Cai - The Chemical Statistician
Scripts and Functions: Using R to Implement the Golden Section Search Method for Numerical Optimization
In an earlier post, I introduced the golden section search method – a modification of the bisection method for numerical optimization that saves computation time by using the golden ratio to set its test points. This post contains the R function that implements this method, the R functions that contain the 3 functions that were minimized by t...
1779 sym R (3149 sym/3 pcs) 4 img
Using the Golden Section Search Method to Minimize the Sum of Absolute Deviations
Introduction Recently, I introduced the golden section search method – a special way to save computation time by modifying the bisection method with the golden ratio, and I illustrated how to minimize a cusped function with this script. I also wrote an R function to implement this method and an R script on how to apply this method with this example....
4967 sym R (2075 sym/8 pcs) 42 img
How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid
Introduction Today, I will talk about the math behind calculating partial correlation and illustrate the computation in R with an example involving the oxidation of ammonia to make nitric acid using a built-in data set in R called stackloss. In a separate post, I will also share an R function that I wrote to estimate partial correlation. In...
8369 sym R (2577 sym/1 pcs) 26 img
Exploratory Data Analysis – Computing Descriptive Statistics in R for Data on Ozone Pollution in New York City
Introduction This is the first of a series of posts on exploratory data analysis (EDA). This post will calculate the common summary statistics of a univariate continuous data set – the data on ozone pollution in New York City that is part of the built-in “airquality” data set in R. This is a particularly good data set to work with, si...
2794 sym R (975 sym/8 pcs) 4 img
When Does the Kinetic Theory of Gases Fail? Examining its Postulates with Assistance from Simple Linear Regression in R
Introduction The Ideal Gas Law, PV = nRT, is a very simple yet useful relationship that describes the behaviours of many gases pretty well in many situations. It is “Ideal” because it makes some assumptions about gas particles that make the math and the physics easy to work with; in fact, the simplicity that arises from these assumptions allows the...
5080 sym R (1669 sym/3 pcs) 16 img
Exploratory Data Analysis: Variations of Box Plots in R for Ozone Concentrations in New York City and Ozonopolis
Introduction Last week, I wrote the first post in a series on exploratory data analysis (EDA). I began by calculating summary statistics on a univariate data set of ozone concentration in New York City in the built-in data set “airquality” in R. In particular, I talked about how to calculate those statistics when the data set has missing ...
5010 sym R (1085 sym/7 pcs) 12 img
Exploratory Data Analysis: Kernel Density Estimation in R on Ozone Pollution Data in New York and Ozonopolis
Introduction Recently, I began a series on exploratory data analysis; so far, I have written about computing descriptive statistics and creating box plots in R for a univariate data set with missing values. Today, I will continue this series by analyzing the same data set with kernel density estimation, a useful non-parametric technique for vis...
7725 sym R (2828 sym/3 pcs) 34 img
Exploratory Data Analysis: Combining Box Plots and Kernel Density Plots into Violin Plots for Ozone Pollution Data
Introduction Recently, I began a series on exploratory data analysis (EDA), and I have written about descriptive statistics, box plots, and kernel density plots so far. As previously mentioned in my post on box plots, there is a way to combine box plots and kernel density plots. This combination results in violin plots, and I will show how to...
3982 sym R (2185 sym/4 pcs) 8 img
Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions
Introduction Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin...
3978 sym Python (899 sym/1 pcs) 66 img
Exploratory Data Analysis: 2 Ways of Plotting Empirical Cumulative Distribution Functions in R
Introduction Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R. (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density estimation, and...
4746 sym R (1913 sym/4 pcs) 12 img