Publications by Alice Xiang
STA490 - Week 2 Assignment: RMarkdown Presentation
Week 2 Assignment: Presentation. MLR of CO2 Emissions for Vehicles. Alice Xiang, 2024-02-13. Introduction: I chose [this dataset](https://www.kaggle.com/datasets/bhuviranga/co2-emissions) on CO2 emissions of different cars to do ...
5517 sym
STA 321 Final Project: Time Series Models with Champagne Sales from 1964 to 1972
1 Description of the Dataset This dataset contains the monthly sales of Perrin Freres Champagne, in millions of dollars, from January 1964 to September 1972. champagne = na.omit(read.csv("C:/Users/qinfa/Desktop/school/STA 321/champagne.csv")) 2 Create the Time Series Object We will proceed by holding back the last 12 data points for testing ...
3284 sym Python (7709 sym/13 pcs) 6 img 4 tbl
STA 321 Assignment 13: Exponential Smoothing Methods with Champagne Sales from 1964 to 1972
1 Description of the Dataset This dataset contains the monthly sales of Perrin Freres Champagne, in millions of dollars, from January 1964 to September 1972. champagne = na.omit(read.csv("C:/Users/qinfa/Desktop/school/STA 321/champagne.csv")) 2 Create the Time Series Object We will proceed by holding back the last 12 data points for testing ...
1603 sym Python (2890 sym/6 pcs) 2 img 3 tbl
STA 321 Assignment 12: Deconstructed Time Series of Natural Gas Prices from 2004 to 2020
1 Description of the Dataset This dataset contains the monthly prices of natural gas in nominal dollars from 2004 to 2020. data = read.csv("C:/Users/qinfa/Desktop/school/STA 321/monthly_csv.csv") ## get last 200 observations natural_gas <- data %>% slice_tail(n=200) 2 Create the Time Series Object gas.ts = ts(natural_gas[,2], frequency = 12,...
1170 sym Python (2711 sym/7 pcs) 4 img 1 tbl
STA 321 Assignment 11: Time Series on New Houses Sold between 1963 and 2023
1 Description of the Dataset This dataset includes the counts of new single-family houses sold per month, in thousands, in the United States from 1963 to 2023. There are 729 observations in the dataset. new.houses = na.omit(read.csv("C:/Users/qinfa/Desktop/school/STA 321/newhousessold.csv")[,-1]) 1.1 Training and Testing Data We will begin by ...
1258 sym 1 img 2 tbl
STA 321 Assignment 10: Dispersed Poisson Regression on NYC Cycling Data for the Brooklyn Bridge
1 Description of the Dataset biking <- read.csv("C:/Users/qinfa/Desktop/school/STA 321/biking.csv") biking$BrooklynBridge <- decomma(biking$BrooklynBridge) biking$Total <- decomma(biking$Total) biking$AvgTemp <- (biking$HighTemp + biking$LowTemp)/2 biking <- biking %>% mutate( NewPrecip = case_when( Precipitation > 0 ~ 1, Precipi...
2712 sym Python (2536 sym/5 pcs) 1 img 4 tbl
STA 321 Assignment 9: Poisson Regression on NYC Cycling Data for the Brooklyn Bridge
1 Description of the Dataset biking <- read.csv("C:/Users/qinfa/Desktop/school/STA 321/biking.csv") biking$BrooklynBridge <- decomma(biking$BrooklynBridge) biking$Total <- decomma(biking$Total) kable(head(biking), caption = "First few records in the data set") First few records in the data set Date Day HighTemp LowTemp Precipitation Brookly...
4462 sym 4 tbl
STA 321 Assignment 7: Multiple Logistic Regression on Stroke Risk Factors
1 Description of the Dataset I chose a dataset from https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset/ that includes identifying information about hospital patients and whether or not they had strokes. The author chose not to disclose the source of the data. The dataset includes the following vari...
4326 sym 2 img 5 tbl
STA 321 Assignment 8: Stroke Prediction with Logistic Regression using Cross Validation
1 Description of the Dataset stroke <- read.csv("C:/Users/qinfa/Desktop/school/STA 321/stroke.csv") # turn the 'N/A'-s in the dataset into NA make.true.NA <- function(x) if(is.character(x)||is.factor(x)){ is.na(x) <- x=="N/A"; x} else { x} stroke[] <- lapply(stroke, make.tr...
3101 sym 2 img 2 tbl
STA 321 Assignment 6: Simple Logistic Regression on Stroke Probability and Average Blood Glucose Levels
1 Description of the Dataset I chose a dataset from https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset/ that includes identifying information about hospital patients and whether or not they had strokes. The author chose not to disclose the source of the data. The dataset includes the following vari...
3359 sym 2 img 3 tbl