Publications by Peter
DATA621-Homework-05
Overview. In this homework assignment, you will explore, analyze, and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The response variable is the number of sample cases of wine that were purchased by wine distribution c...
3561 sym 4 img 1 tbl
Lab1-606
library(tidyverse)
library(openintro)
Exercise 1
arbuthnot$girls
## [1] 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 4784 5332 5200 4910 4617
## [16] 3997 3919 3395 3536 3181 2746 2722 2840 2908 2959 3179 3349 3382 3289 3013
## [31] 2781 3247 4107 4803 4881 5681 4858 4319 5322 5560 5829 5719 6061 6120 5822
## [46] 5738 5717 5847 6203 6...
3247 sym R (2518 sym/33 pcs) 4 img
Data wrangling
library(RCurl)
# Read the file from GitHub
x <- getURL("https://raw.githubusercontent.com/petferns/csvfile/master/datasets.csv")
y <- read.csv(text = x)
# Create the data frame
df <- data.frame(y$Package, y$Item, y$Title, y$Rows, y$Cols)
# Rename the columns
colnames(df) <- c("Division","Type", "Description", "Source", "Destination")
#CREATE A SUBSET...
5 sym R (622 sym/1 pcs)
Data exploration
library(RCurl)
x <- getURL("https://raw.githubusercontent.com/petferns/csvfile/master/datasets.csv")
y <- read.csv(text = x)
print('SUMMARY of dataset is as below : ')
## [1] "SUMMARY of dataset is as below : "
summary(y)
## Package Item Title Rows
## Length:1303 Length:1303 ...
5 sym R (2374 sym/16 pcs)
Add new column
elements <- read.csv(file.path("D:","datasets.csv"))
df <- data.frame(elements$Rows, elements$Cols)
colnames(df) <- c("Source","Destination")
ndf <- df[1:4,1:2]
NewColumn <- c(TRUE, TRUE, TRUE, TRUE)
ndf$NewColumn <- NewColumn
print(ndf)
## Source Destination NewColumn
## 1 60 3 TRUE
## 2 570 6 TRUE ...
5 sym R (416 sym/2 pcs)
Summary of new data frame
elements <- read.csv(file.path("D:","datasets.csv"))
df <- data.frame(elements$Rows, elements$Cols)
colnames(df) <- c("Source","Destination")
ndf <- df[1:4,1:2]
NewColumn <- c(TRUE, TRUE, TRUE, TRUE)
ndf$NewColumn <- NewColumn
print('SUMMARY of new data frame is below :')
## [1] "SUMMARY of new data frame is below :"
summary(ndf)
## So...
5 sym R (1031 sym/12 pcs)
Graphics
library(RCurl)
require(ggplot2)
## Loading required package: ggplot2
x <- getURL("https://raw.githubusercontent.com/petferns/csvfile/master/datasets.csv")
y <- read.csv(text = x)
# Plotting boxplot graph from CSV data
boxplot(y$Cols)
# Plotting histogram graph from CSV data
hist(y$Cols)
# Graph in ggplot2 with single numeric value
qplot(Co...
9 sym R (499 sym/7 pcs) 4 img
Homework2
Stats scores. (2.33, p. 78) Below are the final exam scores of twenty introductory statistics students. 57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94 Create a box plot of the distribution of these scores. The five number summary provided below may be useful. Answer Creating a boxplot one with whole scores Crea...
5176 sym R (1731 sym/7 pcs) 6 img
Lab2
Exercise 1 Look carefully at these three histograms. How do they compare? Are features revealed in one that are obscured in another? ANSWER (1) There are many flights leaving before the departure time (2) Departure delay peaks are clearly visible with more bins Exercise 2 Create a new data frame that includes flights headed to SFO in February,...
2216 sym R (1047 sym/6 pcs) 5 img