Publications by rstats | Julia Silge

Estimate change in #TidyTuesday CEO departures with bootstrap resampling

27.04.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. Today’s screencast walks through how to use bootstrap resampling, with this week’s #TidyTuesday dataset on CEO departures. Here is the code I used in the video, ...

1929 sym R (5962 sym/9 pcs) 4 img

Predict availability in #TidyTuesday water sources with random forest models

05.05.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. Today’s screencast walks through how to train and evaluate a random forest model, with this week’s #TidyTuesday dataset on water sources. Here is the code I used ...

3141 sym R (4492 sym/15 pcs) 16 img

Partial dependence plots with tidymodels and DALEX for #TidyTuesday Mario Kart world records

27.05.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. Today’s screencast walks through how to train and evaluate a random forest model, with this week’s #TidyTuesday dataset on Mario Kart world records. Here is the c...

3190 sym R (6502 sym/12 pcs) 8 img

Class imbalance and classification metrics with aircraft wildlife strikes

20.06.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. I recently participated in SLICED, a competitive data science prediction challenge. I did not necessarily cover myself in glory but in today’s screencast, I walk thro...

3761 sym R (5773 sym/13 pcs) 4 img 3 tbl

Create a custom metric with tidymodels and NYC Airbnb prices

29.06.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. This week’s episode of SLICED, a competitive data science prediction challenge, introduced a challenge for predicting the prices of Airbnb listings in NYC. In today’s screencast, I walk t...

3378 sym R (5137 sym/11 pcs) 8 img

Predict which #TidyTuesday Scooby Doo monsters are REAL with a tuned decision tree model

12.07.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. Today’s screencast walks through how to train and evaluate a random forest model, with this week’s #TidyTuesday dataset on Scooby Doo episodes. Here is the code I used in the video, for...

2901 sym R (5863 sym/14 pcs) 8 img

Use racing methods to tune xgboost models and predict home runs

28.07.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. This week’s episode of SLICED, a competitive data science streaming show, had contestants compete to predict home runs in recent baseball games. Honestly I don’t know much about baseball ...

3539 sym R (4475 sym/13 pcs) 10 img

Tune xgboost models with early stopping to predict shelter animal status

06.08.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. I participated in this week’s episode of the SLICED playoffs, a competitive data science streaming show, where we competed to predict the status of shelter animals. I used xgboost’s ear...

3867 sym R (5249 sym/14 pcs) 12 img

Supervised Machine Learning for Text Analysis in R is now complete

12.08.2021

Last summer, Emil Hvitfeldt and I announced that we had started work on a new book project, to be published in the Chapman & Hall/CRC Data Science Series, and we are now happy to say that Supervised Machine Learning for Text Analysis in R (or SMLTAR, as we call it for short) is complete, in production, and available for preorder! You should be ...

4189 sym 6 img

Predict housing prices in Austin TX with tidymodels and xgboost

14.08.2021

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. My screencasts lately have focused on xgboost as I have participated in SLICED, a competitive data science streaming show. This past week was the semifinals, where we competed to predict pri...

6230 sym R (11322 sym/20 pcs) 16 img