Publications by Art Steinmetz
Predicting Water Quality in New York Harbor
Motivation This is an exercise in using machine learning to predict the level of harmful bacteria in New York Harbor based on environmental factors like tidal conditions, rainfall and location. Among the reasons this is useful is understanding how to rebuild a marine life ecosystem in the harbor, where oysters were a keystone species. We use the R ...
15846 sym R (22823 sym/17 pcs) 30 img 1 tbl
The Truth About Tidy Wrappers
These are the packages we will need for this analysis. library(tidyverse) library(data.table) library(dtplyr) library(duckdb) library(duckplyr) library(polars) library(tidypolars) library(arrow) library(tictoc) library(microbenchmark) library(gt) The Tidyverse I love the Tidyverse from Posit.co. The biggest evolution of the R language ecosystem si...
15078 sym R (12944 sym/24 pcs) 6 img 5 tbl
The Truth About Tidy Wrappers
These are the packages we will need for this analysis. library(tidyverse) library(data.table) library(dtplyr) library(duckdb) library(duckplyr) library(polars) library(tidypolars) library(arrow) library(tictoc) library(microbenchmark) library(gt) The Tidyverse I love the Tidyverse from Posit.co. The biggest evolution of the R language ecosystem si...
14420 sym R (10486 sym/16 pcs) 8 img 6 tbl
Kakhovka Dam Disaster
Some History The war in Ukraine has spawned yet another disaster, the destruction of the dam across the Dnipro river, upstream from Kherson City. This is an ecological and humanitarian disaster as vast acres of settlements, farmlands and wetlands have been destroyed. This marks the third time a dam in this region has been destroyed. First in 1941, t...
10048 sym R (7762 sym/15 pcs) 14 img
Sentiment Analysis Using Google Translate (Pt. 4 – A Painful Sight)
Introduction Many of us in the U.S. have been surprised by the indifference many in less developed countries have shown toward the Ukraine war. There has been much talk lately about how the “Global South” is feeling slighted by the rich countries in the northern hemisphere. The Afrisenti data set (Muhammad et al. 2023) is a collection of over 11...
4603 sym R (4863 sym/7 pcs) 4 img
Sentiment Analysis Using Google Translate (Pt. 1)
Inspired by TidyTuesday Some of the R data science community participate in a weekly challenge called “Tidy Tuesday,” where an interesting data set is presented for analysis but mostly visualization. There are some tremendous examples of beautiful work posted on Twitter with the hashtag #tidytuesday. African Tweets and Sentiment Recently, ...
6193 sym R (10119 sym/21 pcs) 2 img