Publications by Jered Ataky

DATA606-LAB5a

06.10.2020

In this lab, you will investigate the ways in which the statistics from a random sample of data can serve as point estimates for population parameters. We’re interested in formulating a sampling distribution of our estimate in order to learn about the properties of the estimate, such as its distribution. Setting a seed: We will take some rando...

45598 sym R (4760 sym/28 pcs) 5 img
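
The idea in the DATA606-LAB5a excerpt above, a sample statistic serving as a point estimate whose sampling distribution can be studied, can be sketched in base R. This is only an illustration: the population, its true proportion of 0.2, the sample size of 50, and the seed value are all assumptions made for the example, not values taken from the lab.

```r
# Hypothetical population: 100,000 yes/no responses with a true "yes" rate of 0.2.
# All numbers here are illustrative assumptions, not the lab's data.
set.seed(606)  # fix the random number generator so the run is reproducible
population <- sample(c("yes", "no"), size = 100000, replace = TRUE,
                     prob = c(0.2, 0.8))

# One point estimate: the proportion of "yes" in a single random sample of 50.
one_sample <- sample(population, size = 50)
p_hat <- mean(one_sample == "yes")

# Approximate the sampling distribution by repeating the sampling 5,000 times.
p_hats <- replicate(5000, mean(sample(population, size = 50) == "yes"))

mean(p_hats)  # centre of the sampling distribution, near the population proportion
sd(p_hats)    # spread of the estimates, i.e. an empirical standard error
```

Setting the seed only makes the random draws repeatable; a different seed gives different individual samples but a sampling distribution with much the same centre and spread.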

DATA606-LAB7

20.10.2020

Getting Started Load packages In this lab, we will explore and visualize the data using the tidyverse suite of packages, and perform statistical inference using infer. The data can be found in the companion package for OpenIntro resources, openintro. Let’s load the packages. library(tidyverse) library(openintro) library(infer) The data Ever...

30716 sym R (6911 sym/62 pcs) 5 img

DATA607_tidyverse_recipe

25.10.2020

Vignette: Getting to know dplyr & stringr Jered Ataky 2020-10-25 1. Introduction “dplyr” is one of the tidyverse packages, and it is used for data manipulation. In other words, it is a grammar of data manipulation providing verbs that help to solve many common data manipulation problems. “stringr”, on the other hand, is also one of a tidyvers...

4399 sym R (5665 sym/17 pcs)

DATA606-LAB8

21.10.2020

The Human Freedom Index is a report that attempts to summarize the idea of “freedom” through a bunch of different variables for many countries around the globe. It serves as a rough objective measure for the relationships between the different types of freedom - whether it’s political, religious, economic or personal freedom - and other s...

40239 sym R (7695 sym/46 pcs) 13 img

DATA607-ASS9

24.10.2020

Overview Problem The New York Times website provides a rich set of APIs, as described here: https://developer.nytimes.com/apis I need to start by signing up for an API key. My task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R data frame. Approach The New York Times ...

6453 sym R (3746 sym/14 pcs) 1 img

DATA606_Project_proposal

26.10.2020

1. Data Preparation
# Libraries
library(tidyverse)
library(statsr)
library(infer)
library(psych)
# Load data from Github repository
data <- read.csv("https://raw.githubusercontent.com/jnataky/DATA-607/master/A2_Various_dataset_transformation/students_performance.csv")
# Take a look at its structure
glimpse(data)
# Since there are m...

6593 sym R (3133 sym/10 pcs) 3 img

DATA606_Presentation

27.10.2020

Chapter 7 - Inference for Numerical Data 7.21 Global warming, Part II. We considered the change in the number of days exceeding 90°F from 1948 to 2018 at 197 randomly sampled locations from the NOAA database in Exercise 7.19. The mean and standard deviation of the reported differences are 2.9 days and 17.2 days. (a) Calculate a 90% confidence ...

1083 sym R (331 sym/2 pcs)
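
For part (a) of Exercise 7.21 above, a worked sketch follows. It assumes the truncated prompt asks for a 90% confidence interval for the mean change in days, and it treats the 197 differences as one sample. The critical value $t^{*}_{196} \approx 1.653$ is a looked-up value, not something stated in the excerpt.

```latex
\begin{aligned}
SE &= \frac{s}{\sqrt{n}} = \frac{17.2}{\sqrt{197}} \approx 1.23 \\
\bar{x}_{\text{diff}} \pm t^{*}_{196}\, SE &\approx 2.9 \pm 1.653 \times 1.23 \approx 2.9 \pm 2.03 \\
\text{90\% CI} &\approx (0.87,\ 4.93) \text{ days}
\end{aligned}
```

Under these assumptions the interval lies entirely above zero, which is what a positive average change in the number of hot days would look like.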

DATA606_lab9

12.12.2020

knitr::opts_chunk$set(eval = TRUE, message = FALSE, warning = FALSE) Grading the professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because t...

42709 sym R (12482 sym/54 pcs) 30 img

DATA607_Final_project

09.12.2020

# Load libraries
library(tidyverse)
library(plyr)
library(kableExtra)
library(plotly)
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.0.3
library(PerformanceAnalytics)
## Warning: package 'PerformanceAnalytics' was built under R version 4.0.3
## Warning: package 'xts' was built under R version 4.0.3
## Warning: p...

25215 sym R (21498 sym/72 pcs) 3 img

DATA606_final_project

04.12.2020

December 3, 2020 Students Performance in Exams Overview The data come from Kaggle, a web-based data science environment for exploring data and building models. There are 1000 observations with 8 variables in this data set, and each case represents a student in the United States. It is an observational study in which: The response variable is mean tests s...

2469 sym R (2698 sym/16 pcs) 4 img