Publications by Jorge Bueno Perez

Classification methods applied to an imbalanced big dataset

15.06.2020

Classification methods applied to an imbalanced big dataset 1) Project description: 2) Data description: 2.1) Features description: 2.2) Numeric variables: 2.3) Categorical variables: 3) Cleaning the data: 3.1) Var. transformation: 3.2) Missing values: 3.3) Unique variables: 4) Data partitioning: 4.1) For models that can manage ordinal variable...

29765 sym R (10363 sym/48 pcs) 14 img 43 tbl

Applying Association rules on 2000 supermarket baskets

20.07.2020

Applying Association rules on 2000 supermarket baskets 1) Project description: 2) Manipulation of the data: 3) Generating Rules: 4) Conclusions: 5) Data set bibliography: Jorge Bueno Perez 2020-07-20 1) Project description: The goal of this paper is to learn how to apply the apriori association rules algorithm in R. In this paper we will analy...

12412 sym R (14932 sym/34 pcs) 13 img

Dimensionality Reduction applied to the EU trade data, 2018

19.07.2020

Dimensionality Reduction applied to the EU trade data, 2018 1) Project description: 2) Dataset description 3) Manipulation of the data: 4) Correlation between variables: 5) MDS - Multidimensional Scaling: 6) Non-metric MDS: 7) PCA - Principal Component Analysis: 8) Conclusions: 9) Extensions - Image compressing: 9) Summary and conclusions - Ima...

13538 sym R (21028 sym/23 pcs) 20 img

Clustering applied to the EU trade data, 2018

17.07.2020

Clustering applied to the EU trade data, 2018 1) Project description: 2) Dataset description 3) Manipulation of the data: 4) Pre-diagnosis: 5) Identification of the number of clusters: 5.1) d index: 5.2) Hubert: 5.3) K-means (silhouette and wss): 5.4) PAM (silhouette and wss): 6) Clustering: 6.1) K-means: 6.2) PAM: 7) Post-diagnosis: 8) Result...

7513 sym R (3075 sym/11 pcs) 25 img

Dashboard shiny page on COVID-19

03.07.2020

1) Project description: As the world is in the midst of a once-in-a-century pandemic, it is natural for Data Science students to look closely at the data being generated on a daily basis. The aim of our project is to build an interactive web application that visualizes different variables such as new cases, new deat...

5031 sym R (10491 sym/13 pcs) 13 img 1 tbl

Ordered Choice Logit Model in R

18.09.2020

library(dplyr) library(knitr) library(psych) library(MASS) library(lmtest) library(oglmx) library(generalhoslem) library(jtools) library(brant) library(DescTools) library(pscl) library(stargazer) 1) Abstract: This paper uses an ordered choice logit model to model the career satisfaction of programmers in Eastern European countries. The dat...

33951 sym R (12373 sym/56 pcs) 5 img