Publications by Julius Sullivan
Quiz 3
The data set is from a case-control study of smoking and Alzheimer’s disease. It has two variables of main interest: smoking, a factor with four levels (“None”, “<10”, “10-20”, and “>20” cigarettes per day), and disease, a factor with three levels (“Alzheimer”, “Other dementias”, and “Other diagnoses”). ## ── ...
2386 sym R (419 sym/4 pcs) 2 img
Term Project
This is an extension of the tidytuesday assignment you have already done. Complete the questions below, using the screencast you chose for the tidytuesday assignment. Import data library(tidyverse) theme_set(theme_light()) simpsons <- readr::read_delim("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-08-27/sim...
889 sym R (445 sym/2 pcs) 1 img
Quiz 5
Replicate a case study of marketing analytics: https://www.linkedin.com/learning/the-data-science-of-marketing/cluster-analysis-with-r?u=2232593 Q1 Import data myClusterData <- read.csv("/cloud/project/cluster-r.csv") myClusterData ## Email Behavior.3 ## 1 nisl@adipiscin...
669 sym R (53182 sym/6 pcs) 1 img
Tidytuesday
Choose one of David Robinson’s tidytuesday screencasts, watch the video, and summarise it. https://www.youtube.com/channel/UCeiiqmVK07qhY-wvg3IZiZQ Instructions You must follow the instructions below to get credit for this assignment. Read the document posted in Moodle before answering the following questions. Write in your own words. Multiple ...
3023 sym
Quiz 4
Make sure to include the unit of the values whenever appropriate. Q1 Build a regression model to predict life expectancy using GDP per capita. Hint: The variables are available in the gapminder data set from the gapminder package. Note that the data set and the package have the same name, gapminder. library(tidyverse) options(scipen=999) data(...
2245 sym R (1811 sym/2 pcs)
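The regression task in Q1 can be sketched in base R as follows. This is a minimal illustration, not the quiz's solution: the data frame `toy` and its values are made up so the block runs without extra packages, and only the column names `lifeExp` and `gdpPercap` mirror the gapminder data set. With the gapminder package installed, the same formula could be fitted with `data = gapminder` instead.

```r
# Toy data for illustration only: these values are made up, NOT taken from gapminder.
toy <- data.frame(
  gdpPercap = c(500, 1000, 2000, 4000, 8000, 16000),  # US dollars per person
  lifeExp   = c(45, 50, 56, 62, 69, 75)                # years
)

# Fit a simple linear regression: life expectancy explained by GDP per capita.
fit <- lm(lifeExp ~ gdpPercap, data = toy)

# Slope: change in life expectancy (years) per additional dollar of GDP per capita.
slope <- coef(fit)[["gdpPercap"]]

# Predicted life expectancy (years) for a hypothetical country at $3,000 per capita.
pred <- predict(fit, newdata = data.frame(gdpPercap = 3000))

print(coef(fit))
print(pred)
```

Reading the slope together with its units (years per dollar) is what the instruction about including units asks for.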
Reading on Regression
Instructions You must follow the instructions below to get credit for this assignment. Read the document (example of regression analysis) posted in Moodle before answering the following questions. Write in your own words. Multiple identical answers will get zero. Elaborate on your answer. One- or two-sentence answers won’t get credit. Make sure t...
3468 sym
Quiz 3
The data set is from a case-control study of smoking and Alzheimer’s disease. It has two variables of main interest: smoking, a factor with four levels (“None”, “<10”, “10-20”, and “>20” cigarettes per day), and disease, a factor with three levels (“Alzheimer”, “Other dementias”, and “Other diagnoses”). library(ti...
1614 sym R (1047 sym/9 pcs) 1 img
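The two factors described above can be represented and cross-tabulated in base R along these lines. This is only a sketch: the level names come from the description, while the six records in `toy` are made up for illustration and are not the study's data.

```r
# Level orderings taken from the description of the data set.
smoking_levels <- c("None", "<10", "10-20", ">20")
disease_levels <- c("Alzheimer", "Other dementias", "Other diagnoses")

# Made-up records for illustration only; these are NOT the study's data.
toy <- data.frame(
  smoking = factor(c("None", "<10", ">20", "None", "10-20", "<10"),
                   levels = smoking_levels),
  disease = factor(c("Alzheimer", "Other diagnoses", "Other dementias",
                     "Other diagnoses", "Alzheimer", "Alzheimer"),
                   levels = disease_levels)
)

# Cross-tabulate smoking (rows) against diagnosis (columns).
tab <- table(toy$smoking, toy$disease)
print(tab)
```

Declaring the levels explicitly keeps the smoking categories in their natural order rather than alphabetical order, which matters for any table or plot built from them.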
Correlation
In this exercise you will learn to visualize the pairwise relationships between a set of quantitative variables. To this end, you will make your own notes on 8.1 Correlation plots from Data Visualization with R. Q1 What factors have a strong positive correlation with home price? Living area has a strong positive relationship with price. Q2 Continue...
1468 sym 2 img
Quiz 2
# Load packages library(tidyquant) library(tidyverse) # Import stock prices stock_prices <- tq_get(c("WMT", "TGT", "AMZN"), get = "stock.prices", from = "2020-01-01") # Calculate daily returns stock_returns <- stock_prices %>% group_by(symbol) %>% tq_mutate(select = adjusted, mutate_fun = periodReturn, period = "daily") stock_retur...
1968 sym R (2508 sym/9 pcs) 3 img
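The per-symbol daily return calculation in the snippet above can be approximated in base R as shown here. This is a hedged sketch, not tidyquant's implementation: the tickers `AAA` and `BBB` and their prices are hypothetical, the helper `daily_return` is introduced only for illustration, and the handling of the first row (set to `NA` here) may differ from what `tq_mutate` with `periodReturn` produces.

```r
# Hypothetical adjusted closing prices (made up for illustration, not real quotes).
prices <- data.frame(
  symbol   = rep(c("AAA", "BBB"), each = 4),
  adjusted = c(100, 102, 101, 105, 50, 49, 51, 51)
)

# Arithmetic daily return: today's price divided by yesterday's, minus 1.
# The first day has no previous price, so its return is NA in this sketch.
daily_return <- function(p) c(NA, p[-1] / p[-length(p)] - 1)

# Apply the calculation within each symbol so returns never span two stocks.
prices$daily.returns <- ave(prices$adjusted, prices$symbol, FUN = daily_return)

print(prices)
```

Grouping before computing returns plays the same role as `group_by(symbol)` in the original pipeline: without it, the first return of the second stock would be computed against the last price of the first.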
Bivariate Graphs
In this exercise you will learn to plot data using the ggplot2 package. To answer the questions below, use Chapter 4.3, Categorical vs. Quantitative, of Data Visualization with R. Q1 Plot the distribution of daily returns by stock using kernel density plots. Hint: See the code in 4.3.2 Grouped kernel density plots. Q2 Plot the distribution of daily...
1507 sym 5 img