Publications by Emmanuel Hayble-Gomes
Data 607-Week 12 Assignment
Week 12 Assignment - NoSQL Migration For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing. For the relational database, you might use the flights database, the tb database, the “data skills” database your team created for Project 3, or another database of your own ...
2955 sym R (4437 sym/13 pcs)
Data 607-Tidyverse Project
Tidyverse Assignment Task Create an Example. Using one or more TidyVerse packages, and any dataset from fivethirtyeight.com or Kaggle, create a programming sample “vignette” that demonstrates how to use one or more of the capabilities of the selected TidyVerse package with your selected dataset. (25 points) Extend an Existing Example. Using o...
3351 sym R (19486 sym/33 pcs) 4 img
Data 607-Final Project
Project Title The Impact of Twitter Sentiment on Airline Reputation Final Project Description The Airline Quality Rating (AQR) is the most comprehensive study of performance and quality of the largest airlines in the United States. The rating is a multifactor examination of the airlines based on mishandled baggage, consumer complaints, on-time ...
5320 sym R (45176 sym/27 pcs) 6 img
Data 605-Homework2
Problem 1 1. Show that \(A^T A \neq AA^T\) in general. (Proof and demonstration.) Indirect proof: suppose \(A^T A = A A^T\) held for every matrix. If \(A = \begin{bmatrix} 2 & 3 \\2 & 4\end{bmatrix}\), then \(A^T = \begin{bmatrix} 2 & 2 \\3 & 4\end{bmatrix}\) and \(A A^T = \begin{bmatrix} 2 & 3 \\2 & 4\end{bmatrix}\begin{bmatrix} 2 & 2 \\3 & 4\end{bmatrix} = \begin{bm...
2017 sym R (1660 sym/19 pcs)
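The counterexample in the excerpt above can be checked numerically. The following is a minimal base R sketch that uses the same 2x2 matrix; the variable names `AtA`, `AAt`, and `products_equal` are illustrative and do not come from the original homework.

```r
# Same matrix A as in the excerpt (rows are (2, 3) and (2, 4)).
A <- matrix(c(2, 3,
              2, 4), nrow = 2, byrow = TRUE)

AtA <- t(A) %*% A   # A^T A
AAt <- A %*% t(A)   # A A^T

print(AtA)
print(AAt)

# A single matrix for which the products differ is enough to show
# that A^T A = A A^T does not hold in general.
products_equal <- isTRUE(all.equal(AtA, AAt))
print(products_equal)
```

Both products are symmetric, but they are different matrices, which is all the indirect argument needs.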
Data 608-Homework 1
Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header= TRUE) library(dplyr) ## ...
1392 sym R (6015 sym/23 pcs) 3 img
Data 605-Homework5
Choose independently two numbers B and C at random from the interval [0, 1] with uniform density. Prove that B and C are proper probability distributions. Note that the point (B,C) is then chosen at random in the unit square. Simulate B and C using the Uniform Distribution # Using a sample of 5000 random numbers from uniform distribution from zero (0...
437 sym R (466 sym/11 pcs)
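A sketch of the simulation described in the excerpt, assuming 5000 draws per variable as the excerpt states. The seed and the names `in_range` and `total_mass` are assumptions made for illustration. A density is proper if it is non-negative and integrates to 1, so the sketch checks both conditions for the uniform density on [0, 1].

```r
set.seed(605)  # arbitrary seed, only for reproducibility

n <- 5000
B <- runif(n, min = 0, max = 1)
C <- runif(n, min = 0, max = 1)

# Every simulated point (B, C) should fall inside the unit square.
in_range <- all(B >= 0 & B <= 1 & C >= 0 & C <= 1)

# The uniform density on [0, 1] is non-negative and integrates to 1.
total_mass <- integrate(dunif, lower = 0, upper = 1)$value

print(in_range)
print(total_mass)
```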
Data 605-Homework7
Problem 1 Let X1, X2, . . . , Xn be n mutually independent random variables, each of which is uniformly distributed on the integers from 1 to k. Let Y denote the minimum of the Xi’s. Find the distribution of Y. dist <- function(a,b) { Y = c() for (i in 1:b){ X <- runif(a) Y[i] = min(X) } return(Y) } Y <- dist(20, 500) ...
1190 sym R (1462 sym/33 pcs) 1 img
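The problem statement above asks for integers from 1 to k, whereas `runif` in the excerpt draws continuous values on [0, 1]. The following is a hedged sketch of the discrete case, not the author's solution. Since \(P(Y \ge j) = ((k-j+1)/k)^n\), the exact distribution is \(P(Y = j) = \frac{(k-j+1)^n - (k-j)^n}{k^n}\). The helper names `min_pmf` and `simulate_min`, the seed, and the values `n = 3`, `k = 6` are assumptions chosen for illustration.

```r
set.seed(607)  # arbitrary seed, only for reproducibility

# Exact pmf of Y = min(X_1, ..., X_n), each X_i uniform on the integers 1..k.
min_pmf <- function(n, k) {
  j <- 1:k
  ((k - j + 1)^n - (k - j)^n) / k^n
}

# Simulated minima: `reps` experiments of n independent draws from 1..k.
simulate_min <- function(n, k, reps) {
  replicate(reps, min(sample.int(k, size = n, replace = TRUE)))
}

n <- 3
k <- 6
exact <- min_pmf(n, k)
draws <- simulate_min(n, k, reps = 20000)
empirical <- as.numeric(table(factor(draws, levels = 1:k))) / length(draws)

# Rows: exact probabilities and simulated frequencies for Y = 1, ..., k.
print(round(rbind(exact, empirical), 3))
```

With enough repetitions, the two rows should agree closely.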
Data 605-Discussion9
Problem 5 in Chapter 9 Write a program to choose independently 25 numbers at random from [0, 20], compute their sum \(S_{25}\), and repeat this experiment 1000 times. Make a bargraph for the density of \(S_{25}\) and compare it with the normal approximation of Exercise 4. How good is the fit? Now do the same for the standardized sum \(S_{25}\) a...
761 sym R (978 sym/9 pcs) 3 img
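One way to set up the experiment described above, as a sketch in base R. A uniform variable on [0, 20] has mean 10 and variance \(20^2/12\), so \(S_{25}\) has mean 250 and standard deviation \(\sqrt{25 \cdot 400 / 12} \approx 28.87\). The seed, the number of histogram bins, and the variable names are assumptions made for illustration.

```r
set.seed(605)  # arbitrary seed, only for reproducibility

n <- 25
reps <- 1000

# Each experiment: the sum of 25 independent draws from [0, 20].
sums <- replicate(reps, sum(runif(n, min = 0, max = 20)))

mu <- n * 10                   # mean of S_25
sigma <- sqrt(n * 20^2 / 12)   # standard deviation of S_25

# Density-scaled histogram of the sums with the normal approximation overlaid.
hist(sums, breaks = 30, freq = FALSE,
     main = "Simulated S_25 vs. normal approximation", xlab = "S_25")
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, lwd = 2)

# Standardized sums, to be compared with the standard normal density.
z <- (sums - mu) / sigma
print(c(mean = mean(z), sd = sd(z)))
```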
Data 612-Project3
0.1 Introduction The goal of this assignment is to provide practice working with Matrix Factorization techniques. The task is to implement a matrix factorization method—such as singular value decomposition (SVD) or Alternating Least Squares (ALS)—in the context of a recommender system. You may approach this assignment in a number of ways. You...
20375 sym R (4174 sym/40 pcs) 3 img
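To illustrate the SVD idea mentioned in the introduction, here is a minimal sketch with base R's `svd` on a small, fully observed toy matrix. The `ratings` values and the choice `k = 2` are invented for illustration and are not taken from the project. Real rating data has missing entries, which plain SVD does not handle without imputation or a method such as ALS.

```r
# Hypothetical 4 users x 4 items rating matrix (toy data, no missing values).
ratings <- matrix(c(5, 4, 1, 1,
                    4, 5, 1, 2,
                    1, 1, 5, 4,
                    2, 1, 4, 5), nrow = 4, byrow = TRUE)

s <- svd(ratings)   # ratings = U diag(d) V^T

# Keep the k largest singular values for a rank-k approximation.
k <- 2
approx_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# Root mean squared reconstruction error over all entries.
rmse <- sqrt(mean((ratings - approx_k)^2))

print(round(approx_k, 2))
print(rmse)
```

The reconstruction error depends only on the discarded singular values, which is one way to choose `k`.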
Data 612-Project2
0.1 Introduction The goal of this assignment is for you to try out different ways of implementing and configuring a recommender, and to evaluate your different approaches. For assignment 2, start with an existing dataset of user-item ratings, such as our toy books dataset, MovieLens, Jester [http://eigentaste.berkeley.edu/dataset/] or another dat...
17722 sym R (5530 sym/39 pcs) 5 img 2 tbl