Publications by Albina Gallyavova, Michael Gankhuyag, Joby John
DATA624Final
Objective Following new regulations, understand the manufacturing process at ABC Beverage and produce a predictive model for pH. Deliverables - Business report (non-technical) - Predictions in an Excel-readable format - Technical report outlining tested models and selection process Exploratory Data Analysis # Read data for train/test, as well as n...
7704 sym R (41473 sym/101 pcs) 29 img
Project 5
Introduction The goal of this assignment is to adapt one of my recommendation systems to work with Apache Spark and compare its performance with my previous iteration. We will use the MovieLense recommendation system from Project 2 and compare it with the model built on Apache Spark. load libraries ## -- Attaching packages -------------------------...
2415 sym R (3624 sym/26 pcs) 2 tbl
DATA624 HW8
7.2. We will create four models (KNN, SVM, NeuralNet, and MARS) and evaluate the accuracies of these models on the training and test data to see which model fits the data best. library(mlbench) library(kableExtra) library(caret) ## Warning: package 'caret' was built under R version 3.6.3 ## Loading required package: lattice ## Loading required...
1960 sym R (17186 sym/54 pcs) 4 img 2 tbl
Project 4
The dataset contains product ratings for beauty products sold on Amazon. • The original set contains 2,023,070 observations and 4 variables. It covers 1,210,271 users and 249,274 products. We will work with a subset of the data, because running the project on the full set takes a long time. ratings <- read.csv("Beauty.csv") Let's look at the first few records a...
3347 sym R (18638 sym/41 pcs) 5 img 2 tbl
Data624_HW7
6.2 (a) ## Loading required package: lattice ## Loading required package: ggplot2 ## Loading required package: Rcpp ## ## ## ## Amelia II: Multiple Imputation ## ## (Version 1.7.6, built: 2019-11-24) ## ## Copyright (C) 2005-2020 James Honaker, Gary King and Matthew Blackwell ## ## Refer to http://gking.harvard.edu/amelia/ for more informat...
1582 sym R (35162 sym/47 pcs) 7 img
DATA624_Project1
library(readxl, quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(fpp2,quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(ggplot2) library(gridExtra) library(mlbench) library(psych) library(dplyr ) library("readxl") library("dplyr") library("tidyr") library("forecast") library("fpp2") library("lubridate") Pa...
6958 sym R (18990 sym/138 pcs) 40 img
Data 624 HW 6
library(readxl, quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(fpp2,quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(ggplot2) library(gridExtra) library(mlbench) library(caret) library(corrplot) library(dplyr ) library(kableExtra) library(e1071) library(urca) 8.1, 8.2, 8.3, 8.5, 8.6, 8.7 Exercise 8.1 a Th...
2572 sym R (3293 sym/35 pcs) 26 img 1 tbl
DATA612 Project 3
In this project we will use the Ratings dataset. We will use the Singular Value Decomposition (SVD) matrix factorization method to estimate similarity and to create a content-based recommender system. Factorizing the matrix allows us to discover the most descriptive dimensions for predicting movie preferences. We can identify the first few most importa...
2482 sym R (3846 sym/15 pcs) 4 tbl
DATA624_HW5
library(readxl, quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(fpp2,quietly = TRUE, warn.conflicts = FALSE, verbose = F) library(ggplot2) library(gridExtra) library(mlbench) library(caret) library(corrplot) library(dplyr ) library(kableExtra) library(e1071) Exercise 7.1 a l = 77260.0561 and alpha = 0.2971 ses_pig...
2062 sym R (8692 sym/47 pcs) 16 img 8 tbl
DATA612_Discussion2
It was nice to hear about a real-world implementation of a recommender system and the challenges faced in implementing it. He discussed different ways of implementing recommender systems and their pros and cons. For example, Pandora manually tagging 200+ attributes and the scalability issues related to the manual process. Manual Curation Manually Tag Attribute...
1542 sym 5 img