Publications by Karol Orozco
Crime in Portland
crime$Year <- format(as.Date(crime$OccurDate, format="%m/%d/%Y"),"%Y")
crime$Month <- format(as.Date(crime$OccurDate, format="%m/%d/%Y"),"%m")
crime$weekday <- weekdays(as.Date(crime$OccurDate, format="%m/%d/%Y"))
crime$weekdaynum <- recode(crime$weekday, "Sunday"="0", "Monday"= "1", "T...
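The date handling in this snippet can be sketched more compactly by parsing the date once and reusing it. This is a minimal illustration, not the author's code: the two sample dates are hypothetical placeholders, and only the `OccurDate` column name and the `%m/%d/%Y` format come from the snippet. It also swaps the `recode()` step for the `%w` format code, which returns the weekday number directly (Sunday is 0).

```r
# Hypothetical stand-in for the crime data; only the OccurDate column
# name and the month/day/year format are taken from the snippet.
crime <- data.frame(
  OccurDate = c("01/05/2020", "12/25/2021"),
  stringsAsFactors = FALSE
)

# Parse once, then derive every field from the same Date vector.
occur <- as.Date(crime$OccurDate, format = "%m/%d/%Y")

crime$Year       <- format(occur, "%Y")
crime$Month      <- format(occur, "%m")
crime$weekday    <- weekdays(occur)      # locale-dependent names
crime$weekdaynum <- format(occur, "%w")  # "0" = Sunday ... "6" = Saturday

print(crime)
```

Parsing once avoids the risk of one call omitting the `format` argument and silently reading the date in a different layout.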
Bank Data- Machine Learning
Create a model that predicts churn of bank customers using only 5 features.
Setup
bank <- readRDS(gzcon(url("https://github.com/karolo89/Raw_Data/raw/main/BankChurners.rds")))
bank = bank %>% rename_all(funs(tolower(.)))
## Warning: `funs()` was deprecated in dplyr 0.8.0.
## ℹ Please use a list of either functions or lambdas:
##
## # Si...
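The deprecation warning in this snippet comes from `funs()`. One way to lowercase the column names without triggering it is sketched below in base R. The data frame and its column names are hypothetical placeholders, not the contents of the actual `BankChurners.rds` file.

```r
# Hypothetical stand-in for the bank data; the column names are
# placeholders chosen only to show mixed-case input.
bank <- data.frame(
  CUSTOMER_ID    = 1:2,
  Attrition_Flag = c("Existing", "Attrited")
)

# Base R: lowercase every column name in place, no dplyr needed.
names(bank) <- tolower(names(bank))

# With dplyr >= 1.0.0 loaded, an equivalent call would be:
#   bank <- bank %>% rename_with(tolower)

print(names(bank))
```

Either form replaces `rename_all(funs(tolower(.)))` without changing the resulting column names.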
Census
The easy way. The tradeoff is that it’s harder to save your output.
data = read.csv(file.choose())
The harder way. Retrieve and set your file path to wherever your files are. This is known as your ‘working directory.’
getwd()
## [1] "C:/Users/karol/Desktop/census-master"
setwd("C:/Users/karol/Desktop/census-master")
data = read.csv("oecd.c...
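A third option, sketched here as an assumption rather than something the post itself does, is to leave the working directory alone and build full paths with `file.path()`. The folder name, file names, and values below are hypothetical placeholders written to a temporary directory so the example is self-contained.

```r
# Hypothetical folder under the session's temp directory.
data_dir <- file.path(tempdir(), "census-example")
dir.create(data_dir, showWarnings = FALSE)

# Write a tiny placeholder CSV so there is something to read.
csv_path <- file.path(data_dir, "example.csv")
write.csv(data.frame(id = 1:3, value = c(10, 20, 30)),
          csv_path, row.names = FALSE)

# Read by full path; the working directory is never changed.
data <- read.csv(csv_path)

# Saving output next to the input is just another file.path() call.
out_path <- file.path(data_dir, "example_summary.csv")
write.csv(data.frame(mean_value = mean(data$value)),
          out_path, row.names = FALSE)
```

This keeps the output easy to save, like the working-directory approach, while avoiding a hard-coded `setwd()` that only works on one machine.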
Time Series- Decomposition
All of the data for today, including computations, can be acquired using
load(url("https://github.com/robertwwalker/xaringan/raw/master/CMF-Week-9/data/FullWorkspace.RData"))
Inflation Expectations
library(tidyverse)
library(lubridate); library(tsibble)
library(readxl)
url <- "https://www.newyorkfed.org/medialibrary/interactives/sce/sce/...
PDX Prices
metrofour <- read.csv("C:/Users/karol/Downloads/Metro_zhvi_bdrmcnt_4_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv")
str(metrofour[,c(1:11)])
## 'data.frame': 892 obs. of 11 variables:
## $ RegionID : int 102001 394913 753899 394463 394514 394692 395209 394856 394974 394347 ...
## $ SizeRank : int 0 1 2 3 4 5 6 7 8 9 ...
## $ Regio...
time series
The Data
metrofour <- read.csv("C:/Users/karol/Downloads/Metro_zhvi_bdrmcnt_4_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv")
str(metrofour[,c(1:11)])
## 'data.frame': 892 obs. of 11 variables:
## $ RegionID : int 102001 394913 753899 394463 394514 394692 395209 394856 394974 394347 ...
## $ SizeRank : int 0 1 2 3 4 5 6 7 8 9 ...
##...
Portland Prices- Regression
Get the Data
raw_pdx <- read.csv("C:/Users/karol/Desktop/PORTLAND HOUSE.csv", stringsAsFactors=TRUE)
Prepare the Data
This data has 25731 obs. of 32 variables
## raw_pdx <- raw_pdx%>%select(-id)
head(raw_pdx)
##   id yearBuilt     City latitude longitude zipcode bathrooms bedrooms
## 1  1      2007 Fairview 45.54357 -122.4418   97024 ...
wine
State of the Market Insights Summary Statistics
52 obs. of 36 variables
summary(wine)
##     Region            Country          Vine.Area...000.ha.
##  Length:52          Length:52          Min.   : 0.000
##  Class :character   Class :character   1st Qu.: 7.652
##  Mode  :character   Mode  :character   Median : 43.904
## ...
Central Park Squirrel Census, 2018
Central Park Squirrel Census, 2018 Karol Orozco The Task Using the dataset from below, present three ggplot2 plots that attempt to answer questions about the data that you think are interesting. The focus of this assignment is exploration and experimentation. Concentrate on answering each question in a variety of ways and exploring the funct...
Hurricane
x <- getURL("https://raw.githubusercontent.com/karolo89/Raw_Data/main/Hurricane.csv")
hurricane <- read.csv(text = x)
head(hurricane)
##   year num_hurricanes num_major_hurricanes          type average
## 1 1851              3                    1  avg_h_15year      NA
## 2 1851              3                    1 avg_mh_15year      NA
## 3 1...
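The `read.csv(text = ...)` step used here parses CSV content that is already held in a character string, which is what `getURL()` returns. The sketch below shows that step in isolation, using only the two complete rows visible in the output above as inline text, so it runs without a network connection or the RCurl package.

```r
# The two complete rows shown in the snippet's head() output,
# written out as CSV text for illustration.
csv_text <- "year,num_hurricanes,num_major_hurricanes,type,average
1851,3,1,avg_h_15year,NA
1851,3,1,avg_mh_15year,NA"

# Parse the string exactly as read.csv() would parse a file.
hurricane <- read.csv(text = csv_text)

str(hurricane)
```

In recent R versions `read.csv()` can usually be given an `https://` URL directly, which may make the separate download step unnecessary; that depends on the local R build, so treat it as an assumption to verify.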