Publications by William Jasmine
Data 605 - HW2 - Transposes and LU Decomposition
Problem Set 1 Part A Given a matrix \(A\), show that \(AA^T \neq A^TA\), in general. Proof Consider a matrix \(A\) with \(i\) rows and \(j\) columns, with \(i \neq j\). The transpose of matrix \(A\), \(A^T\), will then have \(j\) rows and \(i\) columns. The matrix product \(AA^T\) will then be a square matrix with \(i\) rows and \(i\) columns, whil...
5316 sym 1 img
Data 605 - HW1 - Image Manipulation
1 Assignment Description One of the most useful applications for linear algebra in data science is image manipulation. We often need to compress, expand, warp, or skew images. To do so, we left-multiply each point vector by a transformation matrix. For this assignment, build the first letters for both your first and last name using poi...
4561 sym 3 img
Data 607 - Assignment 2 - SQL and R
Imports The chunk below includes all the pertinent R packages for running this .Rmd file. library("RMySQL") ## Loading required package: DBI library("dplyr") ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## interse...
8499 sym R (12157 sym/44 pcs) 3 tbl
Data 606 - Lab 2 - Introduction to Data
Some define statistics as the field that focuses on turning information into knowledge. The first step in that process is to summarize and describe the raw information – the data. In this lab we explore flights, specifically a random sample of domestic flights that departed from the three major New York City airports in 2013. We will generate s...
14994 sym 8 img
Data 607 - Extra Credit Window Functions
The chunk below downloads the time series data from the “OpenIntro” package. More specifically, the data imported below is from the sp500_1950_2018 dataset and includes daily financial metrics for the S&P 500 market for all trading days from 1950-2018. data(sp500_1950_2018) The query below demonstrates how to calculate both the year-to-da...
856 sym
Data 607 - Assignment 3 - R String Manipulation
Problem 1 Description Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset, provide code that identifies the majors that contain either “DATA” or “STATISTICS”. Solution First, the majors data is imported from the majors-list.csv file and stored as an R dataframe called df. df <- read.csv("majors-list.csv", heade...
7349 sym
Data 606 - Lab 3 - Probability
The Hot Hand Basketball players who make several baskets in succession are described as having a hot hand. Fans and players have long believed in the hot hand phenomenon, which refutes the assumption that each shot is independent of the next. However, a 1985 paper by Gilovich, Vallone, and Tversky collected evidence that contradicted this belief ...
14953 sym 2 img
Data 606 - Lab 4 - The Normal Distribution
In this lab, you’ll investigate the probability distribution that is most central to statistics: the normal distribution. If you are confident that your data are nearly normal, that opens the door to many powerful statistical methods. Here we’ll use the graphical tools of R to assess the normality of our data and also learn how to generate ra...
17036 sym 15 img
Data 607 - Project 1 - Data Cleaning
Introduction The goal of this project is to turn a sample of messy chess tournament data into a readable, ingestible .csv format. The .txt file containing the raw data can be found at the following link. The desired output is a .csv file with the following variables: player name, player state, total number of points won, pla...
4420 sym 1 img
Data 607 - Assignment 4 - Tidying and Transforming Data
Introduction The goal of this assignment is to take some messy (untidy) airport arrival data, clean it, and then analyze it. The hope is that doing so will yield insights into the arrival patterns of two different airlines. Import Data The data is stored in a csv file here, and is imported as an R data frame in the chunk belo...
4519 sym Python (6827 sym/20 pcs)