Publications by Joey Campbell
Lab 6b Ridge Regression and the Lasso
We will use the glmnet package in order to perform ridge regression and the lasso. The main function in this package is glmnet(), which can be used to fit ridge regression models, lasso models, and more. This function has slightly different syntax from other model-fitting functions that we have encountered thus far in this book. In particular, we...
20365 sym R (6278 sym/51 pcs) 3 img
Lab 6a Subset Selection Methods
Best Subset Selection Here we apply the best subset selection approach to the Hitters data. We wish to predict a baseball player’s Salary on the basis of various statistics associated with performance in the previous year. First of all, we note that the Salary variable is missing for some of the players. The is.na() function can be used to iden...
24429 sym R (16174 sym/65 pcs) 3 img
Lab 4: Logistic Regression, LDA, QDA, and KNN
The Stock Market Data We will begin by examining some numerical and graphical summaries of the Smarket data, which is part of the ISLR library. This data set consists of percentage returns for the S&P 500 stock index over 1,250 days, from the beginning of 2001 until the end of 2005. For each date, we have recorded the percentage returns for each ...
52124 sym R (9246 sym/122 pcs) 3 img
Web Scraping IMDB
Introduction Data and information on the web are growing exponentially. Most of us now use Google as our first source of knowledge, whether to find reviews of a place or to understand a new term. There is a lot of information available on the web. With the amount of data available over the web, it opens new horizons of possibility for a D...
9716 sym R (13014 sym/62 pcs) 8 img
Carvana
Welcome to the Carvana Analytics experiment evaluation assignment! Introduction This project includes fictional data from an outbound call initiative Carvana has been making to users in order to attempt to increase sales. In this exercise you will be asked to describe the results of how the campaign has been performing, explore how you might imp...
29935 sym R (16013 sym/23 pcs) 6 img
Lab 3 Reproducible Research
Problem 10 This question should be answered using the Carseats data set. library(ISLR) attach(Carseats) (a) Fit a multiple regression model to predict Sales using Price, Urban, and US. fit<-lm(Sales~Price+Urban+US) summary(fit) ## ## Call: ## lm(formula = Sales ~ Price + Urban + US) ## ## Residuals: ## Min 1Q Median 3Q ...
2273 sym R (4275 sym/12 pcs) 1 img
Rgeom04_demo
1. What geom would you use to draw a line chart? A boxplot? A histogram? An area chart? line chart: geom_line() boxplot: geom_boxplot() histogram: geom_histogram() area chart: geom_area() 2. Run this code in your head and predict what the output will look like. Then, run the code in R and check your predictions. ggplot(data = mpg, mapping = aes...
265977 sym R (1499 sym/14 pcs) 15 img
edamiss17_demo
suppressPackageStartupMessages(library("tidyverse")) package ‘tidyverse’ was built under R version 3.6.3 1. What happens to missing values in a histogram? What happens to missing values in a bar chart? Why is there a difference? Missing values are removed when the number of observations in each bin is calculated. See the warning messa...
2828 sym R (450 sym/8 pcs) 2 img
tidydatacasestudy28
suppressPackageStartupMessages(library("tidyverse")) package ‘tidyverse’ was built under R version 3.6.3 Case Study To finish off this section, let’s pull together everything you’ve learned to tackle a realistic data tidying problem. The tidyr::who dataset contains tuberculosis (TB) cases broken down by year, country, age, gender, ...
21733 sym R (2334 sym/26 pcs) 1 img
charClasses36_demo
suppressPackageStartupMessages(library("tidyverse")) package ‘tidyverse’ was built under R version 3.6.3 1. Create regular expressions to find all words that: 1. Words starting with vowels str_subset(stringr::words, "^[aeiou]") [1] "a" "able" "about" "absolute" "accept" "account" [7] "achieve" ...
7123 sym R (3968 sym/21 pcs)