Publications by Ian Johnson
Graphing California Electricity Supply using ggplot2
Graphing California electricity supply using ggplot2 during record temperatures, 09/05/2022 to 09/09/2022. Raw data is from CA ISO and is available in 5-minute increments for each 24-hour period. To leave a comment for the author, please follow the link and comment on their blog: Data Science, Machine Learning and Predictive Analytics. ...
627 sym 4 img
R LightGBM Regression
In previous posts, I used popular machine learning algorithms to fit models to predict MPG using the cars_19 dataset, which I created from publicly available data from the Environmental Protection Agency. The support vector machine (SVM) was clearly the winner in predicting MPG, and SVM produces models with the lowe...
2863 sym R (3289 sym/9 pcs) 4 img
R: K-Means Clustering MLB Data
K-means clustering is a useful unsupervised learning tool for assigning n observations to k groups, which allows a practitioner to segment a dataset. I play in a fantasy baseball league, and using five offensive variables (R, AVG, HR, RBI, SB) I am going to use k-means clustering to: 1) Determine how many coherent groups there are in...
2110 sym 10 img
R: Birthday Problem
An interesting and classic probability question is the birthday problem. The birthday problem asks how many individuals must be in one location so that there is at least a 50% probability that two or more individuals in the group share a birthday. To solve: If there are just 23 people in one location there is a 50.7% probability ...
836 sym 4 img
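As a brief worked check of the 23-person figure quoted in the teaser above, here is a sketch of the standard calculation. It assumes 365 equally likely birthdays and ignores leap years; the symbols n (group size) and P(n) (probability of at least one shared birthday) are notation introduced here, not taken from the original post.

```latex
% Probability that at least two of n people share a birthday,
% assuming 365 equally likely birthdays (an assumption of this sketch).
\[
P(n) \;=\; 1 - \prod_{k=0}^{n-1} \frac{365 - k}{365}
      \;=\; 1 - \frac{365!}{365^{\,n}\,(365 - n)!}
\]
% The product is the probability that all n birthdays are distinct.
\[
P(22) \approx 0.476, \qquad P(23) \approx 0.507
\]
```

Under these assumptions, 23 is the smallest group size for which the probability exceeds 50%.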
Predicting MPG for 2019 Vehicles using R
I am going to use regression, decision trees, and the random forest algorithm to predict combined miles per gallon for all 2019 motor vehicles. The raw data is located on the EPA government site. After preliminary diagnostics, exploration, and cleaning, I am going to start with a multiple linear regression model. The variables/features I...
2083 sym R (6761 sym/4 pcs) 10 img
R: Gradient Boosted Machine to Predict MPG for 2019 Vehicles
Continuing from the post below, I am going to use a gradient boosted machine model to predict combined miles per gallon for all 2019 motor vehicles. Part 1: Using Decision Trees and Random Forest to Predict MPG for 2019 Vehicles. The raw data is located on the EPA government site. The variables/features I am using for the models are: Engin...
2113 sym R (5464 sym/9 pcs) 8 img
Using Gradient Boosted Machine to Predict MPG for 2019 Vehicles
Continuing from the post below, I am going to use a gradient boosted machine model to predict combined miles per gallon for all 2019 motor vehicles. Part 1: Using Decision Trees and Random Forest to Predict MPG for 2019 Vehicles. The raw data is located on the EPA government site. The variables/features I am using for the models are: Engin...
2108 sym R (5464 sym/9 pcs) 8 img
R: SVM to Predict MPG for 2019 Vehicles
Continuing from the posts below, I am going to use a support vector machine (SVM) to predict combined miles per gallon for all 2019 motor vehicles. Part 1: Using Decision Trees and Random Forest to Predict MPG for 2019 Vehicles. Part 2: Using Gradient Boosted Machine to Predict MPG for 2019 Vehicles. The raw data is located on the EPA gove...
1864 sym R (4156 sym/9 pcs) 4 img
Using SVM to Predict MPG for 2019 Vehicles
Continuing from the posts below, I am going to use a support vector machine (SVM) to predict combined miles per gallon for all 2019 motor vehicles. Part 1: Using Decision Trees and Random Forest to Predict MPG for 2019 Vehicles. Part 2: Using Gradient Boosted Machine to Predict MPG for 2019 Vehicles. The raw data is located on the EPA gove...
1859 sym R (4156 sym/9 pcs) 4 img
R Tensorflow Multiple Linear Regression
In the previous three posts, I used multiple linear regression, decision trees, gradient boosting, and a support vector machine to predict miles per gallon for 2019 vehicles. The SVM produced the best model. In this post, I am going to run TensorFlow through R and fit a multiple linear regression model using the sa...
1574 sym R (2288 sym/5 pcs) 2 img