Publications by Arvind Sharma

LLN

14.02.2023

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed. 1 Create ...

1527 sym Python (2516 sym/15 pcs) 2 img
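The convergence described in the LLN entry above can be illustrated with a short simulation. This is a minimal sketch and not the code from the linked publication; the fair six-sided die, the function name `sample_mean_of_die_rolls`, and the seed are assumptions chosen for illustration.

```python
import random

EXPECTED_VALUE = 3.5  # (1 + 2 + 3 + 4 + 5 + 6) / 6 for a fair die (assumed experiment)


def sample_mean_of_die_rolls(n_rolls, seed=0):
    """Return the average of n_rolls simulated fair six-sided die rolls."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)  # inclusive on both ends
    return total / n_rolls


if __name__ == "__main__":
    # The gap to the expected value typically shrinks as n grows,
    # although it need not shrink at every single step.
    for n in (10, 100, 10_000, 100_000):
        avg = sample_mean_of_die_rolls(n)
        print(f"n={n:>7}  mean={avg:.4f}  gap={abs(avg - EXPECTED_VALUE):.4f}")
```

A single short run can land far from 3.5; the law only says that large deviations become increasingly unlikely as the number of trials grows.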

CLT

14.02.2023

Central Limit Theorem: The Central Limit Theorem (CLT) is one of the most important theorems in statistics and data science. The CLT states that the mean of a sample drawn from a probability distribution is itself a random variable, with mean equal to the population mean and standard deviation equal to the population standard deviation divided by the square root of ...

4161 sym R (10299 sym/65 pcs) 5 img
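The claim in the CLT entry above can be checked numerically. The sketch below is in Python, not the R code of the linked publication, and it assumes a Uniform(0, 1) population, a sample size of 30, and 5000 replicates; all names are hypothetical. It relies on the standard result that the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

POP_MEAN = 0.5               # mean of Uniform(0, 1), the assumed population
POP_SD = math.sqrt(1 / 12)   # standard deviation of Uniform(0, 1)


def simulate_sample_means(sample_size, replicates, seed=0):
    """Draw `replicates` samples of `sample_size` uniforms; return their means."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    means = []
    for _ in range(replicates):
        sample = [rng.random() for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    n = 30
    means = simulate_sample_means(n, 5000)
    print("mean of sample means:", round(statistics.fmean(means), 4))
    print("sd of sample means:  ", round(statistics.stdev(means), 4))
    print("theoretical sd:      ", round(POP_SD / math.sqrt(n), 4))
```

A histogram of `means` would also look approximately bell-shaped even though the underlying population is flat, which is the part of the theorem the excerpt does not reach before it is cut off.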

Midterm Solution

13.02.2023

Question 1: Basic Data Analysis in R (Assignment+Discussion 1). In 1986, the Challenger space shuttle exploded during “throttle up” due to catastrophic failure of the O-rings (seals) around the rocket booster. The (real) data on all space shuttle launches prior to the Challenger disaster are in the file challenger.csv. Load the data into R or Python...

9617 sym R (11462 sym/97 pcs) 7 img

Discussion3

07.02.2023

Often, we can model processes using several different probability distributions (skim). For example, we might use the Poisson instead of the binomial (\(n>20\) and \(np<10\), i.e., large n and small p), the binomial instead of the geometric (both are repetitions of independent Bernoulli trials), or the normal approximation instead of the binomial (if \...

4732 sym 2 img
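The Poisson-for-binomial substitution mentioned in the Discussion3 entry above can be checked directly. The following is a sketch under assumed values (n = 100 and p = 0.05, which satisfy n > 20 and np < 10); the helper names are hypothetical and the code is not taken from the linked discussion.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n, p):
    """Largest pointwise gap between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p  # match the means of the two distributions
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))


if __name__ == "__main__":
    # Assumed illustration values: large n, small p.
    print("n=100, p=0.05:", round(max_pmf_gap(100, 0.05), 5))
    # With a large p the approximation is visibly worse.
    print("n=100, p=0.50:", round(max_pmf_gap(100, 0.50), 5))
```

Comparing the two printed gaps shows why the rule of thumb asks for a small p and not only a large n.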

HW3

07.02.2023

It would be very helpful if you could plot the distributions before calculating the probabilities. Begin by reading up on the plot() function. These questions will help you build an understanding of the Normal, Binomial, Hypergeometric and Poisson distributions. You will be using the probability density function, cumulative distribution function and quanti...

13813 sym Python (15566 sym/101 pcs) 8 img

HW3_Q2

03.02.2023

Q2. A quality control inspector has drawn a sample of 13 light bulbs from a recent production lot. Suppose 20% of the bulbs in the lot are defective. What is the probability that fewer than 6 but more than 3 bulbs from the sample are defective? Round your answer to four decimal places. i. Identify the distribution. This is a binomial distribution ...

1267 sym 1 img
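The HW3_Q2 entry above is cut off before its solution, so the following is not the author's worked answer. It is a minimal sketch of one way to evaluate the stated probability, assuming X ~ Binomial(13, 0.20) as the excerpt identifies and reading "fewer than 6 but more than 3" as X in {4, 5}. The function names are hypothetical.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_strictly_between(n, p, low, high):
    """P(low < X < high) for X ~ Binomial(n, p), both bounds exclusive."""
    return sum(binomial_pmf(k, n, p) for k in range(low + 1, high))


if __name__ == "__main__":
    # 13 bulbs sampled, 20% defective: P(3 < X < 6) = P(X = 4) + P(X = 5)
    prob = prob_strictly_between(13, 0.20, 3, 6)
    print(round(prob, 4))
```

The same value can be obtained as a difference of cumulative probabilities, P(X <= 5) - P(X <= 3), which is how the question would typically be answered with a cumulative distribution function.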

Discussion1

26.01.2023

1 Discussion on Iris Dataset. ?read.csv. In OpenStats Chapter 1, Exercises, Problem 9, there is a reference to Fisher’s iris data. Discuss the solutions to this problem, and then conduct a descriptive analysis of the data, which are conveniently available in R. To access the data in R, simply type “iris”. Investigate any additional R libraries...

3192 sym R (12659 sym/77 pcs) 7 img

HW1

27.01.2023

1 Instructions. Go to Kaggle.com (owned by Google). Create a free account. Sign up for the Titanic: Machine Learning from Disaster competition located here: https://www.kaggle.com/c/titanic/data?select=train.csv Download the train.csv data. Open the train.csv file in R. To do so, use something like mydata <- read.csv('D:/train.csv') but re...

6236 sym R (18445 sym/83 pcs) 3 img

R Markdown

27.01.2023

1 Official R Markdown Guide. The link above is what you should explore to understand R Markdown. You can manually replace ‘html_document’ with ‘pdf_document’ in the .Rmd (R Markdown) file above to generate the output in your preferred format. However, I would strongly suggest using HTML format initially as the setup is likely to reduce math sym...

3848 sym 1 img

Week 1: Bivariate Regression

02.11.2022

Setting up the working directory, clearing all data and memory
    # Clear the workspace
    rm(list = ls()) # Clear environment
    gc()            # Clear unused memory
    ##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
    ## Ncells 539941 28.9    1208047 64.6         NA   669282 35.8
    ## Vcells 990762  7.6    8388608 64.0      32768  1840247 14.1
    cat(...

724 sym R (5859 sym/39 pcs) 6 img