Publications by R on Stats and R
Paper: ‘Right to be forgotten for mortgage insurance issued to cancer survivors: critical assessment and new proposal’
I am happy to announce that our paper entitled “Right to be forgotten for mortgage insurance issued to cancer survivors: critical assessment and new proposal” has been accepted for publication in the European Actuarial Journal. In this paper, we propose an alternative method to determine the waiting period that opens the right to be forgotten in insur...
1330 sym 2 img
Binary logistic regression in R
Introduction Linear versus logistic regression Univariate versus multivariate logistic regression Data Binary logistic regression in R Univariate binary logistic regression Quantitative independent variable Qualitative independent variable Multivariate binary logistic regression Interaction Model selection Quality of a model Validity of the p...
54932 sym R (15444 sym/50 pcs) 28 img 6 tbl
What is the probability that two persons have the same initials?
Introduction How likely is it? For our team For teams of different sizes Conclusion Introduction Last week, I joined a team to work on a collaborative project. The team had already been established for a few months, with several scientists working together on the project. For simplicity, they used to sign documents, mention colleagues in emails, etc....
9817 sym R (5929 sym/12 pcs) 6 img
Introduction to data manipulation in R with {dplyr}
Introduction Data {dplyr} package Filter observations The pipe operator Extract observations Based on their positions Based on their values Sample observations Sort observations Select variables Rename variables Create or modify variables Summarize observations Identify distinct values Connected operations Group by Number of observations Numb...
19436 sym R (23358 sym/37 pcs) 4 img
Pearson, Spearman and Kendall correlation coefficients by hand
Introduction Data With ties Without ties Correlation coefficients by hand Pearson With and without ties Spearman With ties Without ties Kendall Without ties With ties Verification in R Conclusion Introduction In statistics, a correlation is used to evaluate the relationship between two variables. In a previous post, we showed how to compu...
18547 sym R (319 sym/5 pcs) 6 img
How to: one-way ANOVA by hand
Introduction Data and hypotheses ANOVA by hand Overall and group means SSR and SSE ANOVA table Conclusion of the test Conclusion Introduction An ANOVA is a statistical test used to compare a quantitative variable between groups, to determine if there is a statistically significant difference between several population means. In practice, it is u...
6023 sym R (307 sym/3 pcs) 4 img
Scrape Yahoo search engine results with R
Introduction Scraping Yahoo search engine results with R Conclusion Note: This is a guest post by Manthan Koolwal, founder of Scrapingdog. Introduction Web scraping is the process of extracting data from websites. It is usually done in an automated manner to obtain large amounts of data from various websites, without the need to gather data ...
3645 sym R (6507 sym/6 pcs) 4 img
Two-way ANOVA in R
Introduction The two-way ANOVA (analysis of variance) is a statistical method used to evaluate the simultaneous effect of two categorical variables on a quantitative continuous variable. The two-way ANOVA is an extension of the one-way ANOVA since it evaluates the effects on a numerical response of two categorical variables instead ...
24557 sym R (10743 sym/31 pcs) 32 img
10 potential career options with a degree in statistics
Introduction What types of jobs are available? Statistician Data scientist, data/business analyst, data engineer or machine learning engineer Actuary or actuarial analyst Financial (risk) analyst, investment analyst, financial trader, financial manager or quantitative analyst Business intelligence analyst Operational researcher or quality control ...
21837 sym 2 img
Top 10 errors in R and how to fix them
Introduction 1. Unmatched parentheses, curly braces, square brackets or quotes 2. Using a function that is not installed or loaded 3. Typos in function, variable, dataset, object or package names 4. Missing, incorrect or misspelled arguments in functions 5. Wrong, inappropriate or inconsistent data types 6. Forgetting the + sign in ggplot2 7. Misun...
20586 sym R (7082 sym/41 pcs) 22 img