Publications by Xiao Ling

STAT3000 1009

09.10.2024

Normal distribution: https://mathlets.org/mathlets/probability-distributions/
Confidence Interval: https://mathlets.org/mathlets/confidence-intervals/
Example: Suppose you give a test to a history class. After grading the tests, you compute the mean and standard deviation of the distribution of grades to be \(\mu = 80\) and \(\sigma = 12\). T...

5274 sym

STAT3000 100724

07.10.2024

What is a Random Variable? Let’s start by breaking it down: The word “random” suggests something that happens by chance. The word “variable” is something we use in math or science to represent a number that can change. So, a random variable is a quantity that can take on different values depending on the outcome of some random event. ...

4056 sym 9 img

STAT3000 1002

02.10.2024

Setting the random seed
set.seed(2)
sample(1:100)
## [1] 85 79 70 6 32 8 17 93 81 76 41 50 75 65 3 80 89 55
## [19] 63 95 33 54 43 38 40 16 45 97 9 74 73 2 67 86 1 48
## [37] 90 71 15 100 18 92 36 82 30 37 34 21 13 7 84 77 44 20
## [55] 28 4 72 42 98 96 62 52 35 59 64 ...

3444 sym
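The preview above sets a seed before calling sample(1:100). As a minimal sketch of why that matters (the variable names first_draw and second_draw are illustrative and do not come from the original post), resetting the same seed should reproduce the same permutation:

```r
# Fixing the seed makes the "random" permutation repeatable.
set.seed(2)
first_draw <- sample(1:100)   # a random ordering of 1..100

set.seed(2)                   # reset to the same seed
second_draw <- sample(1:100)  # draws the same ordering again

identical(first_draw, second_draw)  # TRUE: same seed, same permutation
```

Without the second set.seed(2) call, the two draws would almost certainly differ, which is why the seed is usually set once at the top of a script that needs reproducible output.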

STAT3000 093024

30.09.2024

Common Terms and Statistics We introduce concepts briefly and then provide detailed case studies demonstrating how statistics is used in data analysis along with R code implementing these ideas. It is important for a data analyst to have an in-depth understanding of statistics. Distributions: a quick way to summarize a list with lots of values....

3989 sym 13 img

STAT3000 092524

25.09.2024

Background The Instacart dataset consists of multiple CSV files that provide detailed information on customer orders, products, aisles, and departments. It is commonly used for analyzing e-commerce customer behaviors and developing recommendation systems. This report provides a brief overview of the data and sets up the analysis for further ex...

4303 sym R (11598 sym/44 pcs) 9 img

STAT3000 092324

23.09.2024

Pipes
Pipes are a tool for expressing a sequence of multiple operations: “object %>% function1() %>% function2()”. The point of the pipe is to help you write code in a way that is easier to read and understand.
lapply(c("ggplot2","tidyverse"), library, character.only = TRUE)
## [[1]]
## [1] "ggplot2" "stats" "graphics" "grDevices" ...

6682 sym R (6678 sym/42 pcs) 19 img
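To illustrate the “object %>% function1() %>% function2()” pattern from the preview above, here is a small sketch comparing a nested call with its piped equivalent. The vector values and the choice of sqrt(), mean(), and round() are assumptions made for illustration; %>% is loaded here from magrittr, which the tidyverse also attaches.

```r
library(magrittr)  # provides the %>% pipe operator

values <- c(4, 9, 16)  # illustrative data, not from the original post

# Nested form: must be read from the inside out.
nested <- round(mean(sqrt(values)), 1)

# Piped form: reads left to right, one step at a time.
piped <- values %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested, piped)  # TRUE: both forms compute the same value
```

Both forms do the same work; the piped version only changes the order in which the steps appear on the page.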

STAT3000 091824

18.09.2024

In this section, we will learn how to make presentation-ready plots using ggplot2 and base R.
lapply(c("ggplot2","readr","tidyverse","RColorBrewer"), library, character.only = TRUE) # load multiple packages in one line
## [[1]]
## [1] "ggplot2" "stats" "graphics" "grDevices" "utils" "data...

5045 sym R (9422 sym/41 pcs) 22 img

STAT3000 091624

16.09.2024

More on correlation
You may want to present a pairwise scatterplot matrix of multiple variables. Every column of the input is plotted against every other column of the input.
pairs(mtcars)
crabs <- read.csv("./crabs.csv")
cor(mtcars)
## mpg cyl disp hp drat wt
## mpg 1.0000000 -0.8521620 -0.8475...

2929 sym R (6234 sym/29 pcs) 19 img

STAT3000 091124

11.09.2024

Mutate Function
mutate() is used to create new variables or modify existing ones within a data frame. It is often used together with other dplyr functions, such as filter() and select().
Arguments:
- data: The data frame you’re working with.
- new_variable: The name of the new or modified variable.
- expression: A mathematical or logical expression to create the ...

2644 sym 11 img
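The preview above describes mutate() being used alongside filter() and select(). A minimal sketch of that combination follows, assuming the built-in mtcars data set; the new column name kpl, the result name cars_kpl, and the unit conversion are illustrative choices, not taken from the original post.

```r
library(dplyr)  # provides mutate(), filter(), select(), and %>%

# Keep four-cylinder cars, keep two columns, then add a new variable.
cars_kpl <- mtcars %>%
  filter(cyl == 4) %>%             # rows: four-cylinder cars only
  select(mpg, wt) %>%              # columns: miles per gallon and weight
  mutate(kpl = mpg * 0.425144)     # new variable: kilometres per litre

head(cars_kpl)
```

Here mtcars plays the role of data, kpl is the new_variable, and mpg * 0.425144 is the expression. Reusing an existing column name on the left-hand side would overwrite that column instead of adding a new one.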

STAT3000 090924

09.09.2024

Data Summarization
We will learn the basics of exploratory data analysis in R. We will learn how to summarize one categorical variable (a character vector in R), one quantitative variable (a numeric vector in R), and bivariate data. We will cover both numeric summaries and the basics of graphical summaries. Let’s explore some of these numer...

4107 sym R (11075 sym/52 pcs) 9 img