Publications by Nirmal Ghimire, K-16 Literacy Center
Part 2: Basic Inferential Statistics
This second portion of the project analyzes the ToothGrowth data in the R datasets package and performs the following activities: basic exploratory data analysis; a basic summary of the data; confidence intervals (CIs) and/or hypothesis tests to compare tooth growth by supp and dose; and a statement of the conclusion and the a...
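The comparison described above can be sketched in a few lines of base R. This is an illustrative sketch, not the publication's own code (which is truncated here): a Welch two-sample t-test of tooth length by supplement type, and one by dose.

```r
# Compare tooth growth by supplement (supp) and by dose using t-tests,
# as described in the summary above. Built-in ToothGrowth data.
data(ToothGrowth)

# Welch two-sample t-test: tooth length by supplement type (OJ vs VC)
t_supp <- t.test(len ~ supp, data = ToothGrowth)
t_supp$conf.int   # 95% CI for the difference in means

# Compare two dose levels, e.g. 0.5 vs 2.0 mg/day
sub <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t_dose <- t.test(len ~ dose, data = sub)
t_dose$p.value    # very small: dose clearly affects tooth length
```

The CI for supp straddling zero (and the dose comparison not) is the usual shape of the conclusion in this exercise, though the original's full conclusion is truncated above.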
Regression Analysis Using R: A Sample Based on Dr. Caffo's Lecture
Linear Regression for Prediction library(UsingR) ## Loading required package: MASS (further package startup messages for HistData, Hmisc, lattice, and survival omitted) ...
Linear Regression_Residuals
Residuals Residuals represent the variation left unexplained by our model. We emphasize the difference between residuals and errors. The errors are the unobservable true deviations from the known coefficients, while the residuals are the observable deviations from the estimated coefficients. In a sense, the residuals are estimates of the errors. Let’s start by talking...
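The distinction above can be made concrete with a quick base-R sketch (the dataset and model here are illustrative, not necessarily the ones in the original post): residuals are simply observed minus fitted values, and for a model with an intercept they sum to zero.

```r
# Residuals are the observable estimates of the unobservable errors.
fit <- lm(mpg ~ wt, data = mtcars)

e <- resid(fit)   # residuals: observed outcome minus fitted value
all.equal(e, mtcars$mpg - fitted(fit), check.attributes = FALSE)

sum(e)            # ~0: residuals from a model with an intercept sum to zero
```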
Combining Predictors, Forecasting, & Unsupervised Prediction
Combining Predictors This section is about combining predictors, sometimes called ensembling methods in machine learning (ML). The key idea is to combine classifiers by averaging or voting, even if the classifiers are very different. For example, we can combine a boosting classifier with a random forest and a linear regression model. In general, ...
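A minimal sketch of the averaging idea, using two deliberately different linear models on illustrative data (not the original post's models):

```r
# Combine two different predictors of a numeric outcome by averaging.
set.seed(1)
idx   <- sample(nrow(mtcars), 22)
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

m1 <- lm(mpg ~ wt,        data = train)   # predictor 1
m2 <- lm(mpg ~ hp + disp, data = train)   # predictor 2 (different covariates)

p1    <- predict(m1, test)
p2    <- predict(m2, test)
p_avg <- (p1 + p2) / 2                    # ensemble by simple averaging

sqrt(mean((test$mpg - p_avg)^2))          # RMSE of the combined predictor
```

With classifiers one would use majority voting instead of averaging; the blending weights can also themselves be fit on a validation set (stacking).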
Boosting and Model-Based Prediction
Boosting Boosting, along with random forests, is one of the most accurate out-of-the-box classifiers that we can use. In general, in boosting, we take a large number of possibly weak predictors, weight them in a way that takes advantage of their strengths, and add them up. When we weight them and add them up, we’re sort of doing the same kind of idea that ...
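The "weight weak predictors and add them up" idea can be illustrated with a toy AdaBoost loop over one-variable threshold stumps in base R. This is a sketch of the general technique, not the lecture's code (which would typically use a package such as gbm or caret); the data are simulated.

```r
# Toy boosting sketch: weight weak threshold stumps and add them up.
set.seed(2)
x <- runif(100)
y <- ifelse(x > 0.5, 1, -1)
flip <- sample(100, 10)
y[flip] <- -y[flip]                       # add some label noise

w      <- rep(1 / 100, 100)               # start with uniform example weights
cand   <- seq(0.05, 0.95, by = 0.05)      # candidate stump thresholds
alphas <- numeric(0)
thrs   <- numeric(0)

for (m in 1:10) {
  # pick the stump sign(x - t) with the smallest weighted error
  errs  <- sapply(cand, function(t) sum(w * (sign(x - t) != y)))
  best  <- which.min(errs)
  err   <- errs[best]
  alpha <- 0.5 * log((1 - err) / err)     # stronger stumps get bigger weights
  # up-weight the examples this stump got wrong
  w <- w * exp(-alpha * y * sign(x - cand[best]))
  w <- w / sum(w)
  alphas <- c(alphas, alpha)
  thrs   <- c(thrs, cand[best])
}

# final predictor: weighted sum of the weak stumps
F <- rowSums(sapply(seq_along(alphas),
                    function(m) alphas[m] * sign(x - thrs[m])))
mean(sign(F) == y)                        # training accuracy of the ensemble
```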
Predicting with Regression
Predicting with Regression (Linear Model) With One Predictor This section is about one of the most direct and simple ways to perform machine learning: regression modeling. Knowledge of regression modeling helps a lot with the material in this section; we’re just using it in the service of performing prediction. So the key idea here is ...
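A minimal sketch of one-predictor prediction with a linear model. The faithful dataset is a common choice in Caffo's course, but since the original code is truncated above, treat the data and values here as illustrative:

```r
# Fit a one-predictor linear model and use it to predict a new observation.
data(faithful)
fit <- lm(eruptions ~ waiting, data = faithful)

# Predict eruption duration for an 80-minute waiting time,
# with a prediction interval for a new observation.
predict(fit, data.frame(waiting = 80), interval = "prediction")
```

The prediction interval is wider than the confidence interval for the mean (`interval = "confidence"`) because it accounts for the variability of an individual new observation, not just uncertainty in the fitted line.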
Preprocessing with PCA
This section is about preprocessing covariates with principal components analysis (PCA). Often we have multiple quantitative variables, and sometimes they’ll be highly correlated with each other. In other words, they can be almost the exact same variable. In this case, it’s not necessarily useful to include every variable in ...
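A short base-R sketch of the idea: replace several correlated quantitative variables with a few principal components. The iris data here is illustrative, not from the original post:

```r
# Reduce four correlated numeric predictors to two principal components.
num <- iris[, 1:4]                            # quantitative variables only
pc  <- prcomp(num, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by the first two PCs
summary(pc)$importance["Cumulative Proportion", 1:2]

reduced <- pc$x[, 1:2]   # use these 2 components as predictors instead of all 4
```

Centering and scaling before PCA matters when the variables are on different scales; otherwise the components are dominated by whichever variable has the largest variance.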
Covariate Creation
This unit is about covariate creation. Covariates are sometimes called predictors and sometimes called features. They’re the variables that we include in our model to predict our outcome. There are two levels of covariate creation, or feature creation. The first level is taking the raw data that we have and turning it into a predictor, so the r...
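One common example of turning raw variables into model-ready covariates is creating dummy (indicator) variables from a factor, which in base R is done with model.matrix. The tiny data frame below is hypothetical, chosen only to illustrate the mechanics:

```r
# Turn a categorical variable into indicator covariates for modeling.
df <- data.frame(
  wage     = c(50, 60, 55),
  jobclass = factor(c("Industrial", "Information", "Industrial"))
)

dummies <- model.matrix(wage ~ jobclass, data = df)
dummies  # intercept column plus an indicator for jobclass == "Information"
```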
Preprocessing for Prediction Variables
Why Preprocess? We need to plot the variables up front so we can see whether there’s any sort of weird behavior in them. Sometimes predictors, or their distributions, look very strange, and we might need to transform them in order to make them more useful for prediction algorithms. This is particularly true when we’re using model-based al...
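Two of the most common transformations in this setting are standardizing (center and scale) and log-transforming a skewed predictor. A minimal sketch on simulated data, since the original's example is truncated above:

```r
# Standardize and log-transform a right-skewed predictor.
set.seed(3)
x <- rexp(1000)               # right-skewed predictor (illustrative)

z <- (x - mean(x)) / sd(x)    # standardize: mean 0, sd 1
c(mean(z), sd(z))

log_x <- log(x + 1)           # log1p-style transform reduces the skew
```

When standardizing a test set, apply the training set's mean and standard deviation, not the test set's own, so the transformation is the same one the model was trained with.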