Publications by Khyati Naik

blog4_ts_analysis

11.12.2023

Introduction: In data analysis, understanding trends, patterns, and behaviors over time is crucial. Time Series Analysis, a powerful statistical method, gives us the tools to dissect temporal data and extract meaningful insights. In this blog, we’ll walk through the fundamental concepts, techniques...

3350 sym R (2914 sym/17 pcs) 4 img

blog5_cst

11.12.2023

Introduction: The Chi-Square test, also known as the χ² test, is a statistical method developed by Karl Pearson in the early 20th century and a cornerstone of statistical analysis. It is used to assess association or independence between categorical variables. As a non-parametric test, it transcend...

4871 sym R (1361 sym/9 pcs)
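
As a rough illustration of the independence test described in the entry above, here is a minimal sketch using base R's `chisq.test`. The contingency table is made up for this example and does not come from the post; the row and column labels are hypothetical.

```r
# Hypothetical 2x2 contingency table (made-up counts, not from the post)
counts <- matrix(
  c(30, 20,
    10, 40),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("A", "B"), outcome = c("yes", "no"))
)

# Pearson's chi-square test of independence, without continuity correction
res <- chisq.test(counts, correct = FALSE)

print(res$expected)   # expected counts under independence
print(res$statistic)  # chi-square statistic
print(res$parameter)  # degrees of freedom
print(res$p.value)    # p-value
```

A small p-value would suggest that the two categorical variables are not independent in this toy table.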

blog1_mva

24.11.2023

Introduction: Multivariate analysis is a statistical approach for exploring relationships and patterns among multiple variables simultaneously. In this blog post, we’ll focus on Principal Component Analysis (PCA) as a tool to uncover hidden structures within the Student Pe...

3352 sym R (1262 sym/5 pcs)
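
To make the PCA idea in the entry above concrete, here is a minimal sketch with base R's `prcomp`. The post's own dataset is not shown in the excerpt, so the built-in `iris` data stands in purely as an assumption for illustration.

```r
# Stand-in data: the four numeric columns of the built-in iris dataset
# (an assumption for illustration; the post uses a different dataset)
num_vars <- iris[, 1:4]

# Centre and scale each variable, then compute principal components
pca <- prcomp(num_vars, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Scores of the first observations on the first two components
print(head(pca$x[, 1:2]))
```

Scaling matters when variables are measured in different units; without it, the variable with the largest variance tends to dominate the first component.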

blog2_da

24.11.2023

Introduction: The Student Performance dataset encompasses various factors that may influence a student’s academic performance. This report aims to provide a comprehensive descriptive analysis of the dataset, shedding light on the key features and distributions of the variables.
1. Loading and Exploring the Dataset: Let’s begin by loading th...

3399 sym R (3288 sym/15 pcs) 5 img

blog3_isa

24.11.2023

1. Hypothesis Test: Test Preparation and Math Scores
Research Question: Is there a significant difference in the mean math scores between students who completed the test preparation course and those who did not?
Hypotheses:
Null Hypothesis (H0): The mean math scores of students who completed the test preparation course are equal to those who did...

3311 sym R (2161 sym/7 pcs)
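
The excerpt above does not show which test the post uses. One common choice for comparing two group means is Welch's two-sample t-test, sketched below on simulated scores. The data frame `scores`, its column names, and all numbers are hypothetical.

```r
# Simulated data (hypothetical; not the post's dataset)
set.seed(1)
scores <- data.frame(
  prep = rep(c("completed", "none"), each = 50),
  math = c(rnorm(50, mean = 70, sd = 14), rnorm(50, mean = 64, sd = 15))
)

# Welch two-sample t-test (t.test's default: unequal variances)
# H0: the two group means are equal
res <- t.test(math ~ prep, data = scores)

print(res$estimate)  # sample mean of each group
print(res$p.value)   # reject H0 at the 5% level if this is below 0.05
```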

SQL connection in R

11.09.2022

Load required packages
library(tidyverse)
library(kableExtra)
library(RMySQL)
library(RODBC)
Connect to MySQL movies schema
sql_conn <- dbConnect(MySQL(), user = usr, password = pwd, dbname = 'movies', host = 'localhost')
Please click here to access the SQL script to create the two tables in the database. Read the movie rating csv and upload it as a tab...

2029 sym R (2020 sym/20 pcs) 1 img

regex

17.09.2022

library(tidyverse)
Exercise 1
Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS”
#read the dataset from github link
col_majors_df<-read.cs...

2756 sym R (1443 sym/13 pcs)
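
The pattern match asked for in Exercise 1 above can be sketched with base R's `grepl` and a regular-expression alternation. The vector `majors` below is a small hand-typed sample for illustration, not the full dataset, and the post's own solution is not shown in the excerpt.

```r
# Small illustrative sample of major names (hand-typed, not the full dataset)
majors <- c(
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "STATISTICS AND DECISION SCIENCE",
  "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
  "ACCOUNTING",
  "BIOLOGY"
)

# "DATA|STATISTICS" matches any string containing either word
matched <- majors[grepl("DATA|STATISTICS", majors)]
print(matched)
```

With the real data, the same expression would be applied to the column of major names in the data frame.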

chess tournament

25.09.2022

In this project, you’re given a text file with chess tournament results where the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (that could for example be imported into a SQL database) with the following information for all of the players: Player’s Name, Player’s State, Total Number...

2038 sym R (14880 sym/29 pcs)

tidy_transform

02.10.2022

Read libraries
library(tidyverse)
Read CSV file from github
ip_fl <- "https://raw.githubusercontent.com/Naik-Khyati/tidy_transform_data/main/data/arr_delays.csv"
raw_dt <- read.csv(ip_fl, header=FALSE, sep=",", stringsAsFactors=FALSE)
head(raw_dt)
## V1 V2 V3 V4 V5 V6 V7
## 1 Lo...

2581 sym R (3680 sym/19 pcs) 3 img

lab5b

09.10.2022

Load packages
library(tidyverse)
library(openintro)
library(infer)
set seed
var_seed <- 3421
Read data
us_adults <- tibble(
  climate_change_affects = c(rep("Yes", 62000), rep("No", 38000))
)
Exercise 1
What percent of the adults in your sample think climate change affects their local community? Hint: Just like we did with the population...

6421 sym R (1627 sym/14 pcs) 5 img
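
Exercise 1 in the entry above asks for a sample proportion. A minimal base R sketch follows; it rebuilds the population as a plain vector and draws a simple random sample. The sample size of 60 is an assumption for illustration, since the excerpt does not state it, and the post itself works with tidyverse and infer rather than base R.

```r
# Population from the excerpt: 62,000 "Yes" and 38,000 "No"
population <- c(rep("Yes", 62000), rep("No", 38000))

# Seed value taken from the excerpt
set.seed(3421)

# Simple random sample without replacement; n = 60 is an assumed size
n <- 60
samp <- sample(population, size = n)

# Proportion of "Yes" in the population and in the sample
pop_prop <- mean(population == "Yes")
samp_prop <- mean(samp == "Yes")
print(pop_prop)
print(samp_prop)
```

The sample proportion will usually differ somewhat from the population value of 0.62, which is the sampling variability the lab goes on to explore.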