Publications by Pabloc
Introduction to automatic machine learning
Introduction “I want to develop a model that automatically learns over time”, a really challenging objective. In this post we’ll develop a procedure that loads data, builds a model, makes predictions and, if something changes over time, creates a new model, all with R. Picture credit: S.H Hori...
5257 sym R (2332 sym/11 pcs) 22 img
Package funModeling: data cleaning, importance variable analysis and model performance
Hi there 🙂 This new package (install.packages("funModeling")) tries to cover common data science tasks with simple concepts. Written like a short tutorial, its focus is on data interpretation and analysis. Below, you’ll find a copy-paste from the package vignette (so you can drink a good coffee while you read it…) Introduction...
7828 sym R (1936 sym/15 pcs) 32 img
Package funModeling: data cleaning, importance variable analysis and model performance
Hi there 🙂 This new package (install.packages("funModeling")) tries to cover common data science tasks with simple concepts. Written like a short tutorial, its focus is on data interpretation and analysis. Below, you’ll find a copy-paste from the package vignette (so you can drink a good coffee while you read it…) Introduction...
7862 sym R (2213 sym/17 pcs) 30 img
Time Series Analysis Using Max/Min… and some Neuroscience.
Introduction Time series have maximum and minimum points as general patterns. Sometimes the noise present in them makes it hard to spot the general behavior. In this post, we will smooth time series (reducing noise) to maximize the story that the data has to tell us. Then an easy formula will be applied to find and plot max/min points, thus character...
3101 sym R (1188 sym/7 pcs) 30 img
Time Series Analysis Using Max/Min… and some Neuroscience.
Introduction Time series have maximum and minimum points as general patterns. Sometimes the noise present in them makes it hard to spot the general behavior. In this post, we will smooth time series (reducing noise) to maximize the story that the data has to tell us. Then an easy formula will be applied to find and plot max/min points, thus characteri...
3097 sym R (1188 sym/7 pcs) 28 img
Data Science Live Book (open source)
Hi! Well, finally there is the first release of this project: an open source book which will hopefully contain some useful resources for those who want to learn some data analysis/machine learning. This release covers a little data preparation, data profiling, selecting best variables (dataviz), assessing model performance, and coming soon a c...
955 sym 12 img
Data Science Live Book – Scoring, Model Performance & profiling – Update!
This update contains a new chapter, scoring, which is related to model performance and model deployment, used when predicting a binary outcome. Link to the scoring chapter. Important: To use the following updates, please update the funModeling package 🙂 install.packages("funModeling") Also related to predictive modelling for binary outcome, ther...
1408 sym 18 img
Model Performance in Data Science Live Book
Hi there! I decided to almost completely rewrite the model validation section since it didn’t reflect real-case scenarios. Hopefully, in the two new chapters you will gain deeper knowledge of methodological aspects of model validation through classical cross-validation, bootstrapping, and going further into the nature of the error. And also take advanta...
1272 sym 14 img
Playing with dimensions: from Clustering, PCA, t-SNE… to Carl Sagan!
Hi there! This post is an experiment combining the result of t-SNE with two well-known clustering techniques: k-means and hierarchical. This will be the practical section, in R. But this post will also explore the intersection of concepts like dimension reduction, clustering analysis, data preparation, PCA, HDBSCAN...
7733 sym R (2378 sym/4 pcs) 26 img
Data Science Live Book (open source) ~ new big release! 200-pages
Well, after some time and 300+ commits, this is the biggest release of the Data Science Live Book (open source), more than 1 year after the first publication 🙂 tl;dr: Hi there! I invite you to read the book online and/or download here. Thanks and have a nice day 🙂 !(tl;dr): An overview… It’s a book to learn data science, machine ...
3804 sym 10 img