Publications by Pablo Casas
How to self-publish a book: Customizing Bookdown
tl;dr: This post is related to "How to self-publish a book: A handy list of resources". It’s centered around Bookdown and some non-standard customizations I found useful to create the Data Science Live Book. The first steps into Bookdown; Amazon Kindle format; Building the book; Be mindful of the line width; At the beginning of every ‘.Rmd’; Some...
7692 sym R (469 sym/4 pcs) 22 img
Sample size and class balance on model performance
tl;dr: This post shows the relationship between the sample size and the accuracy in a classification model. I hope this little research I did may help you in your classification problems. An LSTM model created in Keras was used to produce the results. The metric we are tracking is categorical_accuracy (equivalent to accuracy for multi-class), whi...
5372 sym R (1262 sym/5 pcs) 8 img
How to create a sequential model in Keras for R
tl;dr: This tutorial will introduce the Deep Learning classification task with Keras. We will particularly focus on the shape of the arrays, which is one of the most common pitfalls. The topics we’ll cover are: How to do one-hot encoding; Choosing the input and output shape/dimensions in the layers; How to train the model; How to evaluate the mod...
5398 sym R (3113 sym/12 pcs) 10 img
How to apply a function to a matrix/tibble
Scenario: we have an id-value table and a matrix/tibble that contains the ids, and we need the labels. This may be useful when predicting the keys (or ids) in a classification model (like in Keras) and we need the labels as the final output. There are two interesting things: The usage of apply based on columns and rows at the same time. The cre...
1153 sym R (1228 sym/3 pcs) 2 img
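The id-to-label scenario in the entry above can be sketched in base R. This is only an illustrative sketch, not the code from the post: the `lookup` table, the `ids` matrix, and their values are hypothetical. It relies on `apply` with `MARGIN = c(1, 2)`, which visits every cell (rows and columns at the same time).

```r
# Hypothetical id -> label table (names and values are illustrative only)
lookup <- data.frame(
  id = c(1, 2, 3),
  label = c("cat", "dog", "bird"),
  stringsAsFactors = FALSE
)

# Hypothetical matrix of ids, e.g. predicted class keys
ids <- matrix(c(1, 3, 2,
                2, 1, 3), nrow = 2, byrow = TRUE)

# MARGIN = c(1, 2) applies the function to each cell,
# so the result keeps the shape of the input matrix
labels <- apply(ids, c(1, 2), function(x) lookup$label[match(x, lookup$id)])

print(labels)
```

Because the function returns a single value per cell, `labels` is a character matrix with the same dimensions as `ids`.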
Integrating R and Telegram
Hi there! tl;dr: Some models (deep learning) take a long time to finish, as do some data preparation scripts. We can be notified that the process has ended by sending Telegram messages from R. Get notified by Telegram bot: This section is entirely based on the documentation of the telegram.bot package, by Ernest Benedito. Please visit the site to get used of...
1854 sym R (1623 sym/4 pcs) 8 img
Feature Selection using Genetic Algorithms in R
This is a post about feature selection using genetic algorithms in R, in which we will do a quick review of: What are genetic algorithms?; GA in ML?; What does a solution look like?; GA process and its operators; The fitness function; Genetic Algorithms in R!; Try it yourself; Relating concepts. Animation source: “Flexible Muscle-Based Locomotion...
7707 sym R (2438 sym/2 pcs) 10 img
New discretization method: Recursive information gain ratio maximization
Hello everyone, I’m happy to share a new method to discretize variables that I have been working on for the last few months: Recursive discretization using gain ratio for multi-class variables. tl;dr: funModeling::discretize_rgr(input, target). The problem: we need to convert a numeric variable into a categorical one, considering the relationship with the target ...
3734 sym R (691 sym/4 pcs) 6 img
A gentle introduction to SHAP values in R
Hi there! During the first meetup of argentinaR.org (an R user group), Daniel Quelali introduced us to a new model validation technique called SHAP values. This novel approach allows us to dig a little deeper into the complexity of the predictive model results, while also letting us explore the relationships between variables for predicted cases. ...
6514 sym 16 img
How to use `recipes` package from `tidymodels` for one hot encoding
Since one of the best ways to learn is to explain, I want to share with you this quick introduction to the recipes package, from the tidymodels family. It can help us automate some data preparation tasks. The overview is: How to create a recipe; How to add a step; How to do the prep; Getting the data with juice!; Apply the prep to new data; What is...
4884 sym R (4578 sym/12 pcs) 4 img