Publications by John Cruz
Time Series Analysis
Required Libraries library(fpp3) Problem 2.1 Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec. Use ? (or help()) to find out about the data in each series. What is the time interval of each series? Use autoplot() to produce a time plot of each series. For the last plot...
4761 sym R (2836 sym/29 pcs) 17 img
Final Kaggle
Problem 1 Using R, set a random seed equal to 1234 (i.e., set.seed(1234)). Generate a random variable X that has 10,000 continuous random uniform values between 5 and 15. Then generate a random variable Y that has 10,000 random normal values with a mean of 10 and a standard deviation of 2.89. set.seed(1234) X <- runif(10000, min=5, max=15) Y <- rnor...
7588 sym Python (16123 sym/79 pcs) 14 img 4 tbl
Final Project
Problem 1 Using R, set a random seed equal to 1234 (i.e., set.seed(1234)). Generate a random variable X that has 10,000 continuous random uniform values between 5 and 15. Then generate a random variable Y that has 10,000 random normal values with a mean of 10 and a standard deviation of 2.89. set.seed(1234) X <- runif(10000, min=5, max=15) Y <- rnor...
7585 sym Python (15852 sym/77 pcs) 13 img 4 tbl
puddle
Water flows onto a flat surface at a rate of 5 \(cm^3/s\) forming a circular puddle 10 mm deep. How fast is the radius growing when the radius is: (a) 1 cm? (b) 10 cm? (c) 100 cm? We know that the volume of a cylinder is \(V = \pi r^2h\). Using the product rule, \(\frac {d(uv)}{dt} = u\frac{dv}{dt} + v\frac{du}{dt}\), we can differentiate time (t) ...
833 sym
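For the puddle problem above, the related-rates calculation can be sketched as follows. This is one way to finish the computation, not necessarily the steps of the original write-up, and it assumes the depth stays constant at \(h = 1\) cm (10 mm), so that \(\frac{dh}{dt} = 0\), with \(\frac{dV}{dt} = 5\) \(cm^3/s\).

\[
\frac{dV}{dt} = \pi\left(2rh\,\frac{dr}{dt} + r^2\,\frac{dh}{dt}\right) = 2\pi r h\,\frac{dr}{dt}
\quad\Longrightarrow\quad
\frac{dr}{dt} = \frac{1}{2\pi r h}\,\frac{dV}{dt} = \frac{5}{2\pi r}\ \text{cm/s}
\]

\[
r = 1:\ \frac{dr}{dt} \approx 0.796\ \text{cm/s}, \qquad
r = 10:\ \frac{dr}{dt} \approx 0.0796\ \text{cm/s}, \qquad
r = 100:\ \frac{dr}{dt} \approx 0.00796\ \text{cm/s}
\]

The radius grows more slowly as the puddle widens, because the same inflow is spread over a longer rim.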
Solving a System of Linear Equations
Problem C50 A three-digit number has two properties. The tens-digit and the ones-digit add up to 5. If the number is written with the digits in the reverse order, and then subtracted from the original number, the result is 792. Use a system of equations to find all of the three-digit numbers with these properties. Let’s assume that a three-digit ...
1601 sym 3 tbl
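For Problem C50 above, the system can be set up and solved as in the sketch below. The digit names \(a\), \(b\), \(c\) (hundreds, tens, ones) are labels chosen here for illustration and may differ from the notation in the original write-up.

\[
\begin{aligned}
b + c &= 5 \\
(100a + 10b + c) - (100c + 10b + a) &= 792 \;\Longrightarrow\; 99(a - c) = 792 \;\Longrightarrow\; a - c = 8
\end{aligned}
\]

\[
a = c + 8,\quad 1 \le a \le 9,\quad 0 \le c \le 9
\;\Longrightarrow\; c \in \{0, 1\}
\;\Longrightarrow\; (a, b, c) \in \{(8, 5, 0),\ (9, 4, 1)\}
\]

Under this setup the candidate numbers are 850 and 941. Both check out: \(850 - 058 = 792\) and \(941 - 149 = 792\).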
Spam Email Classifier
Introduction Working with test document data from ‘Spam Assassin’, this project will classify each document as spam or non-spam email. A training set and a validation set will be created and fed into a decision tree model and a logistic regression model. Required Libraries library(tidyverse) library(tidytext) library(purrr) library(rpart) library(...
1486 sym R (10369 sym/19 pcs) 1 img 6 tbl
Tidyverse - Fuzzyjoin
library(tidyverse) library(fuzzyjoin) Overview This vignette will introduce the fuzzyjoin package, which enables joining of two datasets based on imperfect matches. This package is very helpful for combining data without unique keys. We will use data related to candidates running in the 2022 election for the House of Representatives. Specifically,...
7769 sym R (13167 sym/41 pcs)
Recommender Systems
Introduction Perform a Scenario Design analysis that helps make sure to take UX (user experience) into account. Consider whether it makes sense for your selected recommender system to perform scenario design twice, once for the organization (e.g. Amazon.com) and once for the organization’s customers. Attempt to reverse engineer what you can abo...
4247 sym 1 img
Tidyverse - Lubridate
Required Libraries library(tidyverse) library(lubridate) library(httr) library(jsonlite) Tidyverse Packages Tidyverse contains many packages that allow users to work with strings, mutate and rearrange dataframes, and access data through APIs or websites. We can see a few of these packages listed below. tidyverse_packages() ## [1] "broom...
2234 sym R (1041 sym/7 pcs) 2 img 4 tbl
Sentiment Analysis - NY Times
Introduction Text Mining with R, Chapter 2, looks at sentiment analysis. The authors provide an example using the text of Jane Austen’s six completed, published novels from the janeaustenr library. All the code is originally credited to the authors, unless otherwise noted. Required Libraries library(tidyverse) library(tidytext) library(janeauste...
3347 sym R (4138 sym/16 pcs) 4 img 8 tbl