Publications by Keeno Glanville

DATA606 Lab2

19.09.2022

Some define statistics as the field that focuses on turning information into knowledge. The first step in that process is to summarize and describe the raw information – the data. In this lab we explore flights, specifically a random sample of domestic flights that departed from the three major New York City airports in 2013. We will gener...

DATA607 Project 1

25.09.2022

Read data into R: urldata="https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/Project%201/tournamentinfo.txt" chessdata<-read.csv(url(urldata)) view(chessdata) Clean Data: Examining the data, we see two things. Rows 1 and 2 hold variables that we know but do not need, so we remove those rows. Upon removing these co...

DATA606 - Lab3- Probability

26.09.2022

The Hot Hand: Basketball players who make several baskets in succession are described as having a hot hand. Fans and players have long believed in the hot hand phenomenon, which refutes the assumption that each shot is independent of the next. However, a 1985 paper by Gilovich, Vallone, and Tversky collected evidence that contradicted this belief ...

DATA607 Working with Tidy Data

01.10.2022

Load CSV file and clean data: url <- "https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/WorkingWithTidyData/WorkingwithTidyData.csv" rawdata <-read.csv(url) rawdata <- t(rawdata) rawdata<- data.frame(rawdata) rownames(rawdata)<- c() rawdata ## X1 X2 X3 X4 X5 ## 1 ALASKA AM WEST ## 2 on t...

DATA606 Lab4

02.10.2022

In this lab, you’ll investigate the probability distribution that is most central to statistics: the normal distribution. If you are confident that your data are nearly normal, that opens the door to many powerful statistical methods. Here we’ll use the graphical tools of R to assess the normality of our data and also learn how to generat...

GunViolence

09.10.2022

Prompt: What factors most influence crime, and which state has the largest crime rate? Load data: set.seed(1324) urldata <- "https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/Data%20Transformation/GunViolence/guns.csv" gundata.r <- read_csv(url(urldata)) ## New names: ## Rows: 1173 Columns: 14 ## -- Column specification ## ----------...

Zillow Home Prices

09.10.2022

Determine whether the average price of a house in NY is decreasing or increasing compared to one year ago. This will involve converting the month/year columns to long format. Load data: urldat <- "https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/Data%20Transformation/ZillowHomePrices/ZillowHomeprices.csv" zill<- read_csv(url(urldat)) ## Rows: 27319 ...

Cyber Threats

10.10.2022

Analysis: Compare the frequency of cyber crime in each year. Load Data urldata<- "https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/Data%20Transformation/Cyber%20Threats/cyberthreats.csv" nbad<- read_csv(url(urldata)) ## Rows: 8 Columns: 5 ## -- Column specification -------------------------------------------------------- ## Delimiter...

Working with XML, HTML, and JSON files

15.10.2022

Read XML xml.url<-"https://raw.githubusercontent.com/kglan/MSDS/main/DATA607/Working%20with%20XML%20and%20JSON%20files/favbooks.xml" xData <- getURL(xml.url) XML_df <- xmlToDataFrame(xData) XML_df ## title Year Author Publisher ## 1 R for Data Science 2017 Garrett Grolemund, Hadley Wickha...

Inference for categorical data

17.10.2022

Getting Started: Load packages. In this lab, we will explore and visualize the data using the tidyverse suite of packages, and perform statistical inference using infer. The data can be found in the companion package for OpenIntro resources, openintro. Let’s load the packages. library(tidyverse) library(openintro) library(infer) set.seed(1234...
