Publications by Bob Rudis (@hrbrmstr)

R vs Spreadsheets

08.01.2014

One of the myriad reasons we created the Data Driven Security blog was to provide pointers to data analysis and visualization resources for security domain experts who may not have been exposed to these types of tools. I’d venture to posit that most folks jump into some type of spreadsheet software whenever they get their hands on a managea...

2092 sym 4 img

Firewall-busting ASN-lookups – Part 1

23.04.2014

This is a short post on one way to bust through your corporate firewall when trying to use the Team CYMRU ASN lookup facility that we presented in our book. Part 2 will show how to create a vectorized version of this code. Most corporate networks aren’t going to allow port 43 (WHOIS) access directly, which will make the bulk lookup routines t...

1728 sym R (852 sym/1 pcs)

Making Better DNS TXT Record Lookups With Rcpp

25.04.2014

Technically this is Part 2 of Firewall-busting ASN-lookups. However, I said (in Part 1) that Part 2 would be about making a vectorized version and this is absolutely not about that. Rather than fib, I merely misdirect. Moving on… As you can see in Part 1, we have to resort to a system() call to do the TXT record lookup with dig. Frankly, I rea...

6947 sym R (4390 sym/9 pcs)

Scraping SSL Labs Server Test Results With R

29.04.2014

NOTE: Qualys allows automated access to their SSL Server Test site in their T&Cs, and the R function/script provided here does its best to adhere to their guidelines. However, if you launch multiple scripts at one time and catch their attention you will, no doubt, be banned. This post will show you how to do some basic web page data scraping...

5265 sym R (2965 sym/10 pcs)

Speeding Up IPv4 Address Conversion in R

14.05.2014

In our book we provide examples of how to convert IPv4 addresses to integer format (and back). We held ourselves to using only basic R functionality since the book had to be at an introductory level. On a fairly modern box, the ip2long function takes (roughly) 0.1s to convert 4,000 IPv4 addresses to integers (I just happened to have a file with 4K...

3351 sym R (872 sym/2 pcs)

Vectorizing IPv4 Address Conversions – Part 1

16.05.2014

Our previous post showed how to speed up the conversion of IPv4 addresses to/from integer format by taking advantage of a simple Rcpp wrapper to “boosted” native functions. However, to convert more than one IP address, you need to stick those functions into one of the R *apply functions, which does the job, but is not an optimal solution. Ide...

2957 sym R (1199 sym/4 pcs)

Vectorizing IPv4 Address Conversions – Part 2

17.05.2014

The previous post looked at using the Vectorize() function to, well, vectorize our Rcpp IPv4 functions. While this is a completely acceptable practice, we can perform the vectorization 100% in Rcpp/C++. We’ve included both the original Rcpp IPv4 functions and the new Rcpp-vectorized functions together to show the minimal differences between t...

2485 sym R (2339 sym/3 pcs) 4 img 3 tbl

Can You Track Me Now? (Visualizing Xfinity Wi-Fi Hotspot Coverage) [Part 1]

06.06.2014

This is the first of a two-part series. Part 1 sets up the story and goes into how to discover, digest & reformat the necessary data. Part 2 will show how to perform some basic visualizations and then how to build beautiful & informative density maps from the data and offer some suggestions as to how to prevent potential tracking. Xfinity has a...

6068 sym R (7097 sym/9 pcs) 6 img

Can You Track Me Now? (Visualizing Xfinity Wi-Fi Hotspot Coverage) [Part 2]

13.06.2014

This is the second of a two-part series. Part 1 set up the story and went into how to discover, digest & reformat the necessary data. This concluding segment will show how to perform some basic visualizations and then how to build beautiful & informative density maps from the data and offer some suggestions as to how to prevent potential trackin...

4953 sym R (3428 sym/3 pcs) 26 img

Controlling RStudio Python Child Processes

23.06.2014

I’ve been using RStudio’s new ability to run Python scripts since I often need to analyze/process data in R but then run web services with said data in Python (usually via Flask). I’d rather live with the foibles of the RStudio editor than use a separate one and run code on the command line. Everything below is for OS X, but I suspect hold...

2556 sym R (164 sym/3 pcs)