Publications by Mikhilesh Dehane
Linear Regression - Basics
setwd("E:\\mikhilesh\\Horizon 2020\\ApCoTe Yogesh Sky Analytics DA DS Learning")
data <- read.csv("waist circumfirance - adipose tissue july28.csv")
#data <- read.csv(file.choose()) #another way of selecting the data file from browser
#View(data)
dim(data) #dim() for dimension of dataset
## [1] 109 2
class(data) #lets you know if the datase...
Probability of Normal Distribution and Transformation
#probability of normal distribution - pnorm()
#step 1: cumulative probability P(X <= 75); the z score is (75 - 72) / 6 = 0.5
pnorm(75, 72, 6) #(q, mean, sd)
## [1] 0.6914625
#step 2: probability of a person living > 75 years, i.e. the upper tail
1 - pnorm(75, 72, 6)
## [1] 0.3085375
mtcars #selecting a default dataset
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4...
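The two steps in the excerpt above can be checked by standardising first. The following is a minimal sketch in base R, using the values from the excerpt (threshold 75, mean 72, sd 6); the variable names `x`, `mu`, and `sigma` are illustrative and do not come from the original post.

```r
# Values from the excerpt: threshold 75, mean 72, standard deviation 6
x <- 75
mu <- 72
sigma <- 6

# z score of the threshold: how many standard deviations 75 lies above the mean
z <- (x - mu) / sigma

# P(X <= 75) on the standard normal; equal to pnorm(x, mu, sigma)
p_below <- pnorm(z)

# P(X > 75) taken directly from the upper tail instead of 1 - pnorm(...)
p_above <- pnorm(x, mu, sigma, lower.tail = FALSE)

print(c(z = z, below = p_below, above = p_above))
```

Using `lower.tail = FALSE` avoids the subtraction and is numerically safer for thresholds far out in the tail.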
Confirmatory Factor Analysis - CFA
Confirmatory Factor Analysis - CFA
library(readxl)
## Warning: package 'readxl' was built under R version 3.6.3
setwd("E:\\mikhilesh\\HU Sem VI ANLY 510 and 506\\ANLY 510 Kao Principals and Applications\\Lecture and other materials")
data <- read_xlsx("lecture 14 CFAexample.xlsx")
data
## # A tibble: 800 x 60
## Gender Age Country Current...
Paired Samples T Test
Paired Samples t-test: compares two measurements taken on the same subjects, such as before and after a treatment, or with and without it.
Suppose we are interested in whether a drug significantly reduces the level of bad cholesterol. We measure cholesterol before and after treatment.
#dataset
before <- c(155, 151, 80, 112, 179, 170, 131, 175, 153, 175)
after <- c(139, 93, 133,...
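Because the `after` vector is cut off in the excerpt, the sketch below uses R's built-in `sleep` dataset as a stand-in (an assumption, not the post's data): the same 10 subjects are measured under two drugs. It illustrates that a paired t-test is the same as a one-sample t-test on the within-subject differences.

```r
# Stand-in data (not the cholesterol data): built-in `sleep`,
# 10 subjects each measured under drug 1 and drug 2, ordered by subject ID
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]

# Paired test on the two measurement vectors
res_paired <- t.test(x, y, paired = TRUE)

# Equivalent one-sample test on the per-subject differences
res_diff <- t.test(x - y, mu = 0)

print(res_paired)
print(c(paired_t = unname(res_paired$statistic),
        diff_t = unname(res_diff$statistic)))
```

For a directional question such as whether a drug reduces a measurement, `t.test()` also accepts `alternative = "greater"` or `alternative = "less"`.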
Independent or Two Sample T Test
Independent or Two Sample T Test: compares two different groups of data.
Suppose we have data on cholesterol levels for males and females and we wish to see whether one sex has higher or lower LDL cholesterol (i.e., the bad kind) than the other. Is there any difference in cholesterol level between males and females?
#dataset
male <- c(169, 175, ...
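The cholesterol vectors are truncated in the excerpt, so this sketch substitutes the built-in `mtcars` data (an assumption for illustration only), comparing `mpg` between automatic and manual cars. It contrasts R's default Welch test with the pooled-variance version.

```r
# Stand-in data (not the cholesterol data): mpg for am = 0 vs am = 1 in mtcars

# Welch two-sample t-test: the default, does not assume equal variances
welch <- t.test(mpg ~ am, data = mtcars)

# Classic Student test: assumes both groups share one variance
pooled <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

print(welch)
print(c(welch_df = unname(welch$parameter),
        pooled_df = unname(pooled$parameter)))
```

The pooled test uses n1 + n2 - 2 degrees of freedom, while the Welch test adjusts the degrees of freedom downward when the group variances differ.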
3. Factorial (2 or more Way) ANOVA
Factorial (2 or more Way) ANOVA - A factorial ANOVA is an Analysis of Variance test with more than one independent variable. The data set from the first class contains another column that records whether the valuation was made for a product or a gamble, so we have data that we can analyze using a factorial ANOVA. For example, if we are interested i...
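The valuation data set mentioned above is not included in the excerpt, so the following sketch uses the built-in `ToothGrowth` data as a stand-in (an assumption). It fits a two-way factorial ANOVA with both main effects and their interaction.

```r
# Stand-in data: tooth length by supplement type (2 levels) and dose (3 levels)
tg <- ToothGrowth
tg$dose <- factor(tg$dose)  # treat dose as a categorical factor, not a number

# `supp * dose` expands to supp + dose + supp:dose (main effects + interaction)
fit <- aov(len ~ supp * dose, data = tg)

# ANOVA table: one row per effect plus the residuals
tab <- summary(fit)[[1]]
print(tab)
```

A significant interaction row would indicate that the effect of one factor depends on the level of the other, which is the question a factorial design is built to answer.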
Simple Linear Regression - Fat~WaistCircumference
Linear Regression
Predict fat (adipose tissue - AT) in the body based on waist circumference.
setwd("E:\\mikhilesh\\Horizon 2020\\ApCoTe Yogesh Sky Analytics DA DS Learning")
data <- read.csv("waist circumfirance - adipose tissue july28.csv")
#data <- read.csv(file.choose()) #another way of selecting the data file from browser
#View(data)
dim(dat...
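The CSV file is not available here, so this sketch fits the same kind of one-predictor model on the built-in `cars` data (an assumed stand-in, with `speed` playing the role of waist circumference and `dist` the role of adipose tissue).

```r
# Stand-in data: stopping distance (dist) as a function of speed
fit <- lm(dist ~ speed, data = cars)

# Intercept and slope of the fitted line
print(coef(fit))

# Proportion of variance in the response explained by the predictor
r2 <- summary(fit)$r.squared
print(r2)

# Predictions for new predictor values, with 95% prediction intervals
new_obs <- data.frame(speed = c(10, 20))
pred <- predict(fit, newdata = new_obs, interval = "prediction")
print(pred)
```

The column name in `new_obs` must match the predictor name used in the formula, otherwise `predict()` cannot find the variable.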
Multiple Linear Regression
Multiple Linear Regression
Predicts mpg for a car from hp, vol, sp, and wt
Cars dataset
- IV: x1 hp, x2 vol, x3 sp, x4 wt (input variables)
- DV: y mpg (output variable)
Exploratory Data Analysis (60% of time)
1. Measures of Central Tendency
2. Measures of Dispersion
3. Third Moment Business decision
4. Fourth Moment Business decision
5. Probabilit...
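The third and fourth moments in the list above are skewness and kurtosis. The sketch below computes both by hand and then fits a multiple regression. It uses the built-in `mtcars` data as a stand-in, since the post's Cars dataset (with `vol` and `sp`) is not available here; the helper functions `skewness` and `kurtosis` are written for this illustration.

```r
# Third standardized moment: asymmetry of the distribution (0 if symmetric)
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Fourth standardized moment: tail weight (about 3 for a normal distribution)
kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

# Stand-in data: mtcars instead of the post's Cars dataset
moments <- sapply(mtcars[, c("mpg", "hp", "wt")],
                  function(v) c(skewness = skewness(v), kurtosis = kurtosis(v)))
print(round(moments, 3))

# Multiple linear regression: mpg explained by horsepower and weight
fit <- lm(mpg ~ hp + wt, data = mtcars)
print(summary(fit)$coefficients)
r2 <- summary(fit)$r.squared
print(r2)
```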
Naive Bayes/K-Nearest Neighbor Implementation
Predicting outcome y - diagnosis benign or malignant based on input variables x -
[1] “radius_mean”
[4] “texture_mean” “perimeter_mean” “area_mean”
[7] “smoothness_mean” “compactness_mean” “concavity_mean”
[10] “points_mean” “symmetry_mean” “dimension_mean”
[13] “radius_se” “texture_se” “perime...
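As a rough illustration of the k-nearest-neighbor half of this post, the sketch below classifies the built-in `iris` data with `knn()` from the `class` package. Both the dataset and the choice of k = 5 are assumptions made for this example; the breast cancer data from the post is not available here, and the naive Bayes half is not shown because it needs an add-on package.

```r
library(class)  # recommended package, normally installed alongside R

set.seed(42)
# Stratified split: 35 rows per class for training, the remaining 15 for testing
train_idx <- unlist(lapply(split(seq_len(nrow(iris)), iris$Species),
                           function(i) sample(i, 35)))

# kNN is distance based, so scale the features; the test set reuses the
# training means and standard deviations to avoid leaking test information
train_x <- scale(iris[train_idx, 1:4])
test_x <- scale(iris[-train_idx, 1:4],
                center = attr(train_x, "scaled:center"),
                scale = attr(train_x, "scaled:scale"))

# Each test row gets the majority class among its 5 nearest training rows
pred <- knn(train = train_x, test = test_x,
            cl = iris$Species[train_idx], k = 5)

# Confusion matrix and overall accuracy on the held-out rows
conf_mat <- table(predicted = pred, actual = iris$Species[-train_idx])
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```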
Decision Tree Implementation - creating a model and checking accuracy
Implementing a decision tree to classify a flower's species (output variable: setosa, versicolor, or virginica) from the input variables sepal length, sepal width, petal length, and petal width.
# using iris data from default R datasets
#data() gives you list of default datasets available in R
data("iris")
iris #...
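The excerpt stops before the model is built, so the following is one possible sketch of the workflow in the title (create a model, then check accuracy). It assumes the `rpart` package; the original post may well use a different tree library.

```r
library(rpart)  # assumed tree implementation; recommended package shipped with R

data("iris")
set.seed(42)
# Stratified split: 35 flowers per species for training, 15 per species for testing
train_idx <- unlist(lapply(split(seq_len(nrow(iris)), iris$Species),
                           function(i) sample(i, 35)))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# Classification tree: Species predicted from all four measurements
fit <- rpart(Species ~ ., data = train, method = "class")
print(fit)

# Predicted class labels for the held-out flowers
pred <- predict(fit, newdata = test, type = "class")

# Confusion matrix and overall accuracy
conf_mat <- table(predicted = pred, actual = test$Species)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

Checking accuracy on rows the tree never saw gives a fairer estimate than scoring the training data, where a tree can simply memorize the labels.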