Publications by rstats | Julia Silge
Upcoming changes to tidytext: threat of COLLAPSE
The tidytext package passed one million downloads from CRAN this year! It has been truly amazing to see this project grow from an rOpenSci unconference several years ago into a piece of software useful for people’s real-world work. Some of the package’s infrastructure is still around from its very early days, and as more pe...
3845 sym R (6080 sym/10 pcs)
Predicting injuries for Chicago traffic crashes
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Instead of Tidy Tuesday data, this screencast uses some “wild caught” data from Chicago’s open data portal and is planned to be the first in a series walking through ho...
3299 sym R (7360 sym/16 pcs) 14 img
Explore art media over time in the #TidyTuesday Tate collection dataset
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast walks through how to train a regularized regression model with text features and then check model diagnostics like residuals, using this week’s #TidyT...
6493 sym R (10660 sym/18 pcs) 10 img
Learn tidytext with my new learnr course
Today I am happy to announce that a new free, online, open source, interactive tutorial, Text Mining with Tidy Data Principles, has been published! I previously developed an interactive course on text mining for an online learning company, but that course is no longer available. I’ve been wanting to revisit the ideas behind that course, ...
3880 sym 2 img
Understand your models with #TidyTuesday inequality in student debt
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast is a short one! It walks through how we can use tidyverse and tidymodels functions to explore a model after we have trained it, using this week’s #Tid...
2663 sym R (2627 sym/7 pcs) 4 img
Getting started with k-means and #TidyTuesday employment status
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast uses the broom package to tidy output from k-means clustering, with this week’s #TidyTuesday dataset on employment and demographics. Here is the code I ...
3072 sym R (3618 sym/9 pcs) 4 img
Bootstrap confidence intervals for #TidyTuesday Super Bowl commercials
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast uses a relatively new function from rsample for quickly finding bootstrap confidence intervals, with this week’s #TidyTuesday dataset on Super Bowl co...
2537 sym R (4161 sym/7 pcs) 6 img
Dimensionality reduction of #TidyTuesday United Nations voting patterns
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. One change I have recently made on my blog is to remove Disqus comments. I want to say a huge THANK YOU to everyone who ever commented on my blog before and express how muc...
3100 sym R (2879 sym/7 pcs) 6 img
Which #TidyTuesday post offices are in Hawaii?
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast walks through how to use text information at the subword level in predictive modeling, with this week’s #TidyTuesday dataset on United States post off...
3586 sym R (14198 sym/15 pcs) 2 img
Which #TidyTuesday Netflix titles are movies and which are TV shows?
This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. Today’s screencast walks through how to build features for modeling from text, with this week’s #TidyTuesday dataset on Netflix titles. Here is the code I used i...
3306 sym R (7273 sym/12 pcs) 6 img