Publications by Françoisn -
ggnetwork: Network geometries for ggplot2
This note is a demo of (and a shameless plug for) the ggnetwork package, which provides several geoms to plot network objects with ggplot2, and which just got published on CRAN. See the package vignette for a more detailed guide to its functionalities. Our example data is the most recent version of the Icelandic legal code, which is available as a ZIP archiv...
6241 sym R (329 sym/2 pcs) 6 img
An awesome list of network analysis resources
Inspired by the awesome R list that I mentioned a few months ago, I have started the awesome-network-analysis list, which features a large section on R packages. Building a list specifically dedicated to network analysis presents the opportunity to cite more R packages that focus on that task, such as the rapidly expanding list of packages to est...
2215 sym
Turning keywords into a co-occurrence network
This post is addressed to the GLM Fall 2016 students who are currently taking my Statistical Reasoning and Quantitative Methods course at Sciences Po in Paris. Dear students: Since you are going to learn a lot of statistical computing/programming this semester, I thought it would be a good idea to show you a quick example of what you can achieve w...
12271 sym R (1124 sym/4 pcs) 4 img
Collapsing a bipartite co-occurrence network
This note is a follow-up to the previous one. It shows how to use student-submitted keywords to find clusters of shared interests between the students. Dear students: If you enjoyed my previous note, this one might also entertain you. And since your real first names are used in the data, you should be able to tell me later if this note makes sense...
10881 sym 6 img
One year of R / Notes
My collection of R notes is now slightly over one year old. This note reflects on how useful the exercise of blogging about R has been so far, and answers some of the questions that I have received about it. Blogging about R: I created my collection of R notes with the intention to keep track of technical notes that I often need to refer to when I...
2861 sym
Compiling the ggplot2 book on Mac OS
This note explains how to compile Hadley Wickham’s ggplot2 book on Mac OS. This guide has 8 steps. If you have already installed R and RStudio, you should be able to get through Steps 1-4 very quickly. Similarly, if you use Git, Steps 5-6 should also be very straightforward. The longest steps are Step 7 (package dependencies) and Step 8 (book c...
3704 sym 2 img
Remember to use the RDS format
Note to self – Remember to serialize R objects as RDS files when it makes sense. Importing Stata data into R: The European Social Survey recently announced that it had added Round 7 of its survey to its cumulative dataset, which can be downloaded in CSV, SPSS or Stata format. While my instinctive preference for storing data is to use CSV, in th...
3110 sym
Scraping Web sources: Two illustrations
At the request of a couple of students in a course on open data that I contribute to, here’s a short guide to the “why” and “how” questions about (Web) scraping, with links to examples to illustrate the usefulness of the technique. What is scraping? (Web) scraping consists of writing computer code to automate the download and/or parsing ...
5632 sym 4 img
Turning KML into tidy data frames
This note briefly introduces the tidykml package, which turns basic KML geometries into tidy data frames that can be visualized with ggplot2. Summary: The tidykml package provides a quick way to import data from Google My Maps into R, in a format that makes it easy to manipulate the data and visualize it with ggplot2. Below is an example that uses...
5472 sym 4 img
Technologies worth learning for data science
As a complement to my note on R as a data science language, this note lists ten other technologies that you might want to learn to use, or at least monitor, if you are interested in learning data science. Communication: Git is a distributed version control system that is easy to use through platforms like GitHub or GitLab. It is the best tool that I kn...
4915 sym