Publications by Tom Buonora
Week1Quiz
Create a vector that contains 20 numbers. (You may choose whatever numbers you like, but make sure there are some duplicates.)
# Jet win total for the past 20 years
vector_num1 = c(2,7,4,5,5,10,4,8,6,8,11,9,9,4,10,4,10,6,9,10)
length(vector_num1) # 20
## [1] 20
Use R to convert the vector from question 1 into a character vector.
ve...
3135 sym R (3597 sym/36 pcs)
CUNY MSDS R Bridge Course Final Project
Has quality within movie genres changed since 1930? The Internet Movie Database provides an opportunity to review how the internet generation views movies from throughout the last century. We can explore trends in genres, popularity (votes), and acclaim (ratings). This project will explore 3 genres: Comedy, Drama, and Romance. Let's start by produ...
1764 sym R (3554 sym/9 pcs) 3 img 3 tbl
TomBuonora_Week2
Exercises in Data Frames. The data set is a survey of 4700 students who commute to college. Attributes include the distance they commute, their high school GPA, and whether their father or mother is a college graduate. The results indicate some correlation between parents' education and shorter commuting distances, as well as higher high school GPA. The final ...
1041 sym R (4039 sym/12 pcs) 6 tbl
Data607_Final_Project_Proposal
Exoplanet Archive: A Statistical Analysis. The final project will download data from the Caltech IPAC Exoplanet Archive using the TAP query service. The report will provide some background in astronomy in order to explain the data, but it will also assume the audience already has a basic knowledge of astronomy and exoplanet research. Key Aims ...
1010 sym
Data607_Week9
New York Times APIs: Best Selling Books. Overview: We will connect to the NY Times API and download data from their bestseller lists. The data will then be parsed into an R data frame and displayed. The NYT Developer Network is here. Retrieve the API key and set the list type. # to set global and permanently, use "setx" nyt_api <- Sys.getenv("NYT_API") ny...
983 sym R (3465 sym/9 pcs) 4 tbl
Data607_Project3
Data Overview
The data comes in 3 extracts:
Data Science Datasets
Type          Sample   Observations
Job Seekers   All      23859
Job Seekers   Subset   2325
Job Openings  All      7556
Glimpse the 3 Datasets
qry<-"select gender, location, education, title, skill from staging_db.JobSeekersStage limit 4;"
rs = dbSendQuery(conn, qry)
qry_df = fetch(rs, n=-1) ...
753 sym R (8360 sym/15 pcs) 8 img 8 tbl
Data607_Week7
Data Scraping and Parsing: My Favorite Books. Overview: This assignment takes a set of book data in 3 common formats: HTML, XML, and JSON. Each format is then parsed into an R data frame and displayed. Each return set is different, so standardizing it into a data frame requires some work. Format Function Package Returns HTML readHTMLTa...
632 sym R (2848 sym/6 pcs) 4 tbl
Data607_Week5_Corrected
Tidy data using dplyr and tidyr. The R Code: Import readxl and tidyverse library(readxl) library(tidyverse) # ggplot2, dplyr, tidyr, readr, tibble, stringr and more library(knitr) Download the Excel file. Note the mode is “wb” to preserve the binary elements of the Excel file. sourcefile<-"https://github.com/acatlin/data/raw/mast...
2150 sym R (3334 sym/14 pcs) 1 tbl
Data607_Week5
Tidy data using dplyr and tidyr. The R Code: Import readxl and tidyverse library(readxl) library(tidyverse) # ggplot2, dplyr, tidyr, readr, tibble, stringr and more library(knitr) Download the Excel file. Note the mode is “wb” to preserve the binary elements of the Excel file. sourcefile<-"https://github.com/acatlin/data/raw/mast...
1834 sym R (3638 sym/14 pcs) 1 tbl
Data_607_Week1_Lab
Pittsburgh’s Rivers. This lab uses the dplyr pipe construct to display the average length of the bridges that cross each of Pittsburgh’s 3 big rivers. library(dplyr) Get Data bridges <- read.table("https://archive.ics.uci.edu/ml/machine-learning-databases/bridges/bridges.data.version1", h = F, sep = ",") names(bridges)<-c("ident","river","locati...
227 sym R (880 sym/4 pcs) 1 tbl