Publications by Erinda Budo

Implementing a Recommender System on Spark

09.07.2020

Introduction: Adapt one of your recommendation systems to work with Apache Spark and compare the performance with your previous iteration. Consider the efficiency of the system and the added complexity of using Spark. You may complete the assignment using PySpark (Python), SparkR (R), sparklyr (R), or Scala. #Required libraries library(recommen...

1904 sym R (3376 sym/10 pcs) 2 tbl

Project 1

11.06.2020

This system will recommend books to readers. Libraries: library(kableExtra) 1. Reading in the data. get_url <- read.csv(file="https://raw.githubusercontent.com/ErindaB/Data-612/master/BookRatings.csv", header=TRUE, sep=",") Book_R <- get_url[1:5,] Book_R %>% kable(caption = "Book Ratings") %>% kable_styling("striped", full_width = TRUE) Book ...

4489 sym R (2872 sym/17 pcs)

Data 612-Project2

18.06.2020

Introduction: Start with an existing dataset of user-item ratings, such as our toy books dataset, MovieLens, Jester, or another dataset of your choosing. Implement at least two of these recommendation algorithms: Content-Based Filtering, User-User Collaborative Filtering, Item-Item Collaborative Filtering. # Loading libraries library(recommende...

7053 sym R (4061 sym/24 pcs) 4 img

Data 612-Project 4

05.07.2020

Instructions: The goal of this assignment is to give you practice working with accuracy and other recommender system metrics. As in your previous assignments, compare the accuracy of at least two recommender system algorithms against your offline data. Implement support for at least one business or user experience goal such as increased serendipity,...

5633 sym R (7559 sym/34 pcs) 6 img

Blog 2

20.12.2020

Leaps: Regression Subset Selection. The regsubsets() function in the R package leaps. The basic idea of the all-possible-subsets approach is to fit every possible combination of the predictors and find the subset that best meets a predefined objective criterion such as Cp or adjusted R^2. It is hoped that one ends up with a reasonable and useful...

3149 sym R (1923 sym/9 pcs) 1 img
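
The all-possible-subsets idea described in the Blog 2 excerpt can be sketched in base R. This is only an illustration, not the post's own code: it assumes the built-in mtcars data as a stand-in, with mpg as the response and four arbitrarily chosen predictors, and it scores models by adjusted R^2 only (Cp is not computed). regsubsets() from leaps performs the same kind of search far more efficiently.

```r
# Exhaustive best-subset search in base R (illustrative sketch only).
# Assumption: mtcars stands in for the data; mpg is the response.
response   <- "mpg"
predictors <- c("wt", "hp", "qsec", "drat")

# Enumerate every non-empty subset of the predictors (2^4 - 1 = 15 subsets).
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit one linear model per subset and record its adjusted R^2.
adj_r2 <- vapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = response), data = mtcars)
  summary(fit)$adj.r.squared
}, numeric(1))

results <- data.frame(
  model  = vapply(subsets, paste, character(1), collapse = " + "),
  size   = lengths(subsets),
  adj_r2 = adj_r2
)

# Best model overall by adjusted R^2.
best <- results[which.max(results$adj_r2), ]
print(best)
```

The number of models doubles with each added predictor, so this brute-force loop is only practical for a handful of candidates.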

Data 608 -HW1

17.02.2021

Libraries Used: require(dplyr) require(tidyr) require(knitr) require(kableExtra) require(kable) require(ggplot2) Principles of Data Visualization and Introduction to ggplot2. I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubus...

2535 sym R (4329 sym/14 pcs) 3 img 3 tbl