Publications by rstats on Bryan Shalloway's Blog

Influencing Distributions with Tiered Incentives

01.11.2020

Simple Example Applying Incentives Takeaways of Resulting Distribution Think Carefully About Assumptions How to Set Assumptions Appendix Simple Assumptions Trade-offs In this post I will use incentives for sales representatives in pricing to provide examples of factors to consider when attempting to influence an existing distribution. For in...

9722 sym R (4012 sym/5 pcs) 8 img 1 tbl

Undersampling Will Change the Base Rates of Your Model’s Predictions

22.11.2020

Create Data Association of ‘feature’ and ‘target’ Resample Build Models Rescale Predictions to Predicted Probabilities Appendix Density Plots Lift Plot Comparing Scaling Methods TLDR: In classification problems, under and over sampling techniques shift the distribution of predicted probabilities towards the minority class. If your prob...

6688 sym R (3006 sym/16 pcs) 16 img 2 tbl

Undersampling Will Change the Base Rates of Your Model’s Predictions

22.11.2020

Create Data Association of ‘feature’ and ‘target’ Resample Build Models Rescale Predictions to Predicted Probabilities Appendix Density Plots Lift Plot TLDR: In classification problems, under and over sampling techniques shift the distribution of predicted probabilities towards the minority class. If your problem requires accurate prob...

6242 sym R (2894 sym/15 pcs) 14 img 1 tbl

Weighting Confusion Matrices by Outcomes and Observations

07.12.2020

Model Performance Metrics Lending Data Example Starter Code Weighting by Classification Outcomes Metrics Across Decision Thresholds Weighting by Observations Closing note Appendix Weights of Observations During and Prior to Modeling Notes on Cost Sensitive Classification Weighted Classification Metrics Questions on Cost Sensitive Classification...

23962 sym R (5475 sym/19 pcs) 14 img 1 tbl

Understanding Prediction Intervals

17.03.2021

Providing More Than Point Estimates Considering Uncertainty Observation Specific Intervals A Few Things to Know About Prediction Intervals Prediction Intervals and Confidence Intervals Analytic Method of Calculating Prediction Intervals Visual Comparison of Prediction Intervals and Confidence Intervals Inference or Prediction? Cautions With Ov...

37796 sym R (7404 sym/17 pcs) 16 img 8 tbl

Simulating Prediction Intervals

04.04.2021

Rough Idea Inspiration Procedure Example Simulate Prediction Interval Review Interval Width Coverage Closing Notes Appendix Conformal Inference Other Examples Using Simulation Confusion With Confidence Intervals Adjusting Procedure Alternative Procedure With CV Part 1 of my series of posts on building prediction intervals used data held-out...

19914 sym R (6851 sym/13 pcs) 6 img 5 tbl

Quantile Regression Forests for Prediction Intervals

20.04.2021

Quantile Regression Example Quantile Regression Forest Review Performance Coverage Interval Width Closing Notes Appendix Residual Plots Other Charts In this post I will build prediction intervals using quantile regression, more specifically, quantile regression forests. This is my third post on prediction intervals. Prior posts: Understand...

15212 sym R (11370 sym/20 pcs) 16 img 5 tbl

Macros in the Shell: Integrating That Spreadsheet From Finance Into a Data Pipeline

09.05.2021

Macro in the Shell Example Setting-up Guard Rails Closing Appendix Related Alternative Other Resources There is many a data science meme degrading Excel: (Google Sheets seems to have escaped most of the memes here.) While I no longer use it regularly for the purposes of analysis, I will always have a soft spot in my heart for Excel. Furthe...

8908 sym 6 img

Predicting NBA Playoff Berths: FiveThirtyEight vs Betting Markets

16.12.2021

NBA Playoffs and the Lakers Data Prep Scraping Betting Markets Steps Joining with FiveThirtyEight data Analysis How much does FiveThirtyEight differ from markets? Closing Thought Appendix Potential Reasons for the Difference Calculating percentiles of diff TLDR: FiveThirtyEight’s forecasts of NBA playoff berths seem to hold up OK agains...

11468 sym R (5355 sym/12 pcs) 6 img 6 tbl