Publications by Amit Kapoor

Data607 - Project2B

09.03.2020

About the data: The data comes from the R package fueleconomy, which sources its data from the EPA (Environmental Protection Agency). In this package, the data is stored in the vehicles dataset. The fuel economy data covers all cars sold in the US from 1984 to 2015. The package fueleconomy has 33,442 rows and 12 va...

3013 sym R (4851 sym/25 pcs) 4 img 3 tbl

Data605 - Assignment6

07.03.2020

Problem 1 A box contains 54 red marbles, 9 white marbles, and 75 blue marbles. If a marble is randomly selected from the box, what is the probability that it is red or blue? Express your answer as a fraction or a decimal number rounded to four decimal places. Solution # function to calculate probability of given marble Prob <- function(m, total) { r...

4237 sym R (3669 sym/31 pcs)
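
The marble problem quoted in the Assignment6 excerpt can be worked directly. The following is a minimal sketch, not the author's `Prob` function (which is truncated in the excerpt); the variable names are illustrative assumptions. It relies only on the fact that "red" and "blue" are mutually exclusive outcomes, so their probabilities add.

```r
# Marble counts as stated in the problem
red   <- 54
white <- 9
blue  <- 75
total <- red + white + blue          # 138 marbles in the box

# Red and blue cannot occur together, so P(red or blue) = P(red) + P(blue)
p_red_or_blue <- red / total + blue / total

round(p_red_or_blue, 4)              # 129/138, i.e. 0.9348
```

As a fraction the answer is 129/138, which reduces to 43/46.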

Data605 - Assignment5

02.03.2020

Choose independently two numbers B and C at random from the interval [0, 1] with uniform density. Prove that B and C are proper probability distributions. Note that the point (B,C) is then chosen at random in the unit square. set.seed(67) n <- 10000 B <- runif(n, min = 0, max = 1) C <- runif(n, min = 0, max = 1) cat("min(B)->", format(min(B), sci...

456 sym R (1375 sym/17 pcs) 2 img
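
For the Assignment5 excerpt, a simulation can only support, not prove, that B and C behave like proper Uniform(0, 1) variables. The sketch below reuses the seed and sample size from the excerpt and adds two checks that are assumptions of this illustration rather than the author's code: every draw lies in [0, 1], and the empirical CDF is close to the theoretical CDF F(x) = x.

```r
set.seed(67)                          # same seed as in the excerpt
n <- 10000
B <- runif(n, min = 0, max = 1)
C <- runif(n, min = 0, max = 1)

# Check 1: all draws fall inside the interval [0, 1]
in_range <- all(B >= 0 & B <= 1) && all(C >= 0 & C <= 1)

# Check 2: for Uniform(0, 1) the CDF is F(x) = x, so the empirical CDF
# evaluated at x should be close to x itself
x <- c(0.25, 0.50, 0.75)
ecdf_table <- rbind(x = x, B = ecdf(B)(x), C = ecdf(C)(x))

in_range
ecdf_table
```

The analytic argument is that the density f(x) = 1 on [0, 1] is non-negative and integrates to 1 over the interval.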

Data 607 - Assignment 5

02.03.2020

Introduction Data manipulation is one of the most important parts of Data Science. The purpose of this assignment is to perform data manipulation using the R packages tidyr and dplyr. Data manipulation involves rearranging, transforming and analyzing data to make it ready for the applicable model. Problem Statement We have been provided the data for...

3379 sym R (8050 sym/23 pcs) 4 img

Data 607 - Project 1

24.02.2020

Project Overview This project processes a text file of chess tournament results as shown below. The tournament results follow a fixed structure, and below is a glimpse of the data from the given file. The goal of this project is to generate a .CSV file (which could for example be imported into a SQL database) with the following information of the ...

3535 sym R (19904 sym/45 pcs) 3 img 1 tbl

Data 605 - Assignment 4

22.02.2020

Problem set 1 1. In this problem, we’ll verify using R that SVD and Eigenvalues are related as worked out in the weekly module. Given a 2x3 matrix A \(\mathbf{A} = \left( \begin{matrix} 1&2&3 \\ -1&0&4 \end{matrix} \right)\) write code in R to compute \(X = \mathbf{A}\mathbf{A}^\intercal\) and \(Y = \mathbf{A}^\intercal\mathbf{A}\). Then, compute the eigenvalues and ei...

2430 sym R (4326 sym/41 pcs)
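
The relationship named in the Assignment 4 excerpt can be checked numerically with base R. This is a hedged sketch, not the author's solution: it builds the matrix as displayed in the excerpt (two rows, three columns) and compares the squared singular values from `svd()` with the eigenvalues from `eigen()`. The variable names are illustrative.

```r
# Matrix as displayed in the excerpt: rows (1, 2, 3) and (-1, 0, 4)
A <- matrix(c( 1, 2, 3,
              -1, 0, 4), nrow = 2, byrow = TRUE)

X <- A %*% t(A)      # 2 x 2
Y <- t(A) %*% A      # 3 x 3

eig_X <- eigen(X)    # eigenvalues returned in decreasing order
eig_Y <- eigen(Y)
sv    <- svd(A)      # singular values in sv$d, also decreasing

# Squared singular values of A should match the eigenvalues of X
# and the two non-zero eigenvalues of Y
all.equal(sv$d^2, eig_X$values)
all.equal(sv$d^2, eig_Y$values[1:2])

# The third eigenvalue of Y is zero up to floating-point error
eig_Y$values[3]
```

The singular vectors in `sv$u` and `sv$v` correspond to the eigenvectors of X and Y respectively, though individual columns may differ by a sign.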

Data 607 - Assignment 3

16.02.2020

library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union library(stringr) theLink <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-l...

1794 sym R (2936 sym/21 pcs)

Data 605 - HW2

10.02.2020

1. Problem set 1 (1) Show that \(\mathbf{A}\mathbf{A}^\intercal \neq \mathbf{A}^\intercal\mathbf{A}\) in general. (Proof and demonstration.) Proof:- Let A be an m x n matrix. By definition its transpose \(\mathbf{A}^\intercal\) will be an n x m matrix. Given the definition of matrix multiplication, multiplying A by \(\mathbf{A}^\intercal\) will ret...

2798 sym R (875 sym/5 pcs)
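
The claim in the HW2 excerpt can be demonstrated with two small examples. The matrices below are illustrative assumptions, not the ones used in the original write-up: a non-square matrix shows that the two products do not even share dimensions, and a square matrix shows that the entries differ even when the dimensions agree.

```r
# Non-square case: a 2 x 3 matrix gives a 2 x 2 and a 3 x 3 product
M <- matrix(1:6, nrow = 2)
dim_MMt <- dim(M %*% t(M))   # 2 2
dim_MtM <- dim(t(M) %*% M)   # 3 3

# Square case: same dimensions, different entries
A <- matrix(c(1, 2,
              3, 4), nrow = 2, byrow = TRUE)
AAt <- A %*% t(A)            # rows (5, 11) and (11, 25)
AtA <- t(A) %*% A            # rows (10, 14) and (14, 20)

identical(AAt, AtA)          # FALSE
```

Both products are symmetric, so the inequality is about the two products differing from each other, not about either one failing to be symmetric.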

Data 607 - Assignment 1

03.02.2020

Overview This data originates from “Where People Go To Check The Weather”. The source of the data is a Survey Monkey Audience poll commissioned by FiveThirtyEight and conducted from April 6 to April 10, 2015. Data Source https://fivethirtyeight.com/features/weather-forecast-news-app-habits/ Data Description RespondentID Do you typically ch...

2177 sym R (10071 sym/18 pcs) 2 img

Data 605 - HW1

03.02.2020

1. Problem set 1 You can think of vectors representing many dimensions of related information. For instance, Netflix might store all the ratings a user gives to movies in a vector. This is clearly a vector of very large dimensions (in the millions) and very sparse as the user might have rated only a few movies. Similarly, Amazon might store the i...

2196 sym R (1138 sym/15 pcs)