Publications by Arash Hatamirad
Correspondence Analysis
Correspondence Analysis for 2 categorical variables (Race/Restaurants) Analyzing the relationship between race and pizza for women for a market research firm library(tidyr) library(ca) Read data # import data data <- read.csv(file = 'pizzafem truncated.csv') head(data) ## myid RESP_RACE pizza ## 1 2354858 1 NA ## 2 2354859 ...
550 sym R (8212 sym/30 pcs) 2 img
MYOPIA Study
This dataset is a subset of data from the Orinda Longitudinal Study of Myopia (OLSM), a cohort study of ocular component development and risk factors for the onset of myopia in children. Data collection began in the 1989–1990 school year and continued annually through the 2000–2001 school year. All data about the parts that ...
4524 sym Python (23144 sym/126 pcs) 8 img
R Coding Samples
Initializing # Read data d1 <- read_dta("tech_co_cstat_dta.zip") #psych::describe(d1) #glimpse(d1) #names(d1) #head(d1) #attributes(d1$tic) # put data for Sale>0 into d2 d2 <- filter(d1,sale>0) Q1 Print a data frame with the medians of cogs, emp, and xrd. Answer: d2 %>% select(cogs,emp,xrd) %>% summarize(cogs=median(cogs,na....
3169 sym R (5573 sym/17 pcs)
Data Visualization with R
Initializing # Read data d1 <- read_dta("tech_co_cstat_dta.zip") #psych::describe(d1) #glimpse(d1) #names(d1) #head(d1) #attributes(d1$tic) # put data for Sale>0 into d2 d2 <- filter(d1,sale>0) Q1 Print a data frame with the medians of cogs, emp, and xrd. Answer: d2 %>% select(cogs,emp,xrd) %>% summarize(cogs=median(cogs,na.rm = ...
3165 sym R (5573 sym/17 pcs)
ANOVA
# Read the CSV file data.bweight <- read.csv("birthweight.csv",header = TRUE) # Check data format str(data.bweight) ## 'data.frame': 295 obs. of 6 variables: ## $ Weight : int 2891 3572 3827 4593 3940 2778 3544 3402 3147 3997 ... ## $ Black : int 0 0 0 0 0 0 0 0 0 0 ... ## $ Married : int 1 1 1 1 0 1 1 1 0 1 ... ## $ Boy ...
4895 sym R (8036 sym/43 pcs) 10 img
Logistic Regression, Stepwise Model Selection with AIC
The liver data set is a subset of the ILPD (Indian Liver Patient Dataset) data set. It contains the first 10 variables described on the UCI Machine Learning Repository and a LiverPatient variable (indicating whether or not the individual is a liver patient. People with active liver disease ...
8409 sym R (8759 sym/36 pcs)
Linear Regression, Best Subset Selection
Best Subset Model Selection We will use the heart.csv dataset. Below is a brief summary of the variables in heart.csv. Weight: subject’s weight Systolic: top number in a blood pressure reading, indicating the blood pressure level when the heart contracts Diastolic: bottom number in a blood pressure reading, indicating the blood pressure level when the...
4323 sym R (7262 sym/39 pcs) 7 img
Linear Regression
We will use the heart.csv dataset. Below is a brief summary of the variables in heart.csv. Weight: subject’s weight Systolic: top number in a blood pressure reading, indicating the blood pressure level when the heart contracts Diastolic: bottom number in a blood pressure reading, indicating the blood pressure level when the heart is ...
2938 sym R (4896 sym/33 pcs) 7 img
Linear Regression
We will use the heart.csv dataset. Below is a brief summary of the variables in heart.csv. Weight: subject’s weight Systolic: top number in a blood pressure reading, indicating the blood pressure level when the heart contracts Diastolic: bottom number in a blood pressure reading, indicating the blood pressure level when the heart is ...
3657 sym R (2376 sym/17 pcs) 6 img
Two-Way ANOVA
Analysis of Variance Use cars_new.csv. See HW1 for detailed information on the variables. # Read data q4.data <- read.csv("cars_new.csv") # Check the structure str(q4.data) ## 'data.frame': 180 obs. of 4 variables: ## $ type : chr "Sedan" "Sedan" "Sedan" "Sedan" ... ## $ origin : chr "Asia" "Asia" "Asia" "Asia" ... ## $...
3844 sym R (5299 sym/30 pcs) 1 img