Publications by Tiangeng Lu
Two-mode Network Visualization
df_two_mode_90 <- read.csv('two_mode_90.csv', row.names = 1)
df_two_mode_90 <- df_two_mode_90[,!colnames(df_two_mode_90) %in% "pol"]
print(paste("There are", nrow(df_two_mode_90), "items."))
## [1] "There are 133 items."
df_two_mode_90_simple <- df_two_mode_90[rowSums(df_two_mode_90) > 0,]
print(paste(nrow(df_two_mode_90_simple), "are kept for netw...
Skilled Migration Review
1 Setup
Sys.time()
## [1] "2023-04-30 01:22:08 EDT"
getwd()
## [1] "/Users/tiangeng/Library/CloudStorage/OneDrive-ThePennsylvaniaStateUniversity/R files"
# Update here for each folder
directory <- "wos_mig"
folder <- paste(getwd(), directory, sep = "/")
library(readxl)
library(writexl)
library(DT)
library(tidyr)
library(ggplot2)
library(dplyr)
## ...
World Location List
library(DT)
unloc <- list.files(path = getwd(), pattern = 'UNLOCODE', full.names = F)
unloc
## [1] "2022-2 UNLOCODE CodeListPart1.csv" "2022-2 UNLOCODE CodeListPart2.csv"
## [3] "2022-2 UNLOCODE CodeListPart3.csv"
unloc_1 <- read.csv(unloc[1], header = F)
unloc_2 <- read.csv(unloc[2], header = F)
unloc_3 <- read.csv(unloc[3], header = F)
unloc <- d...
Twitter (Retweet) Network Gini Coefficient Calculation
Last compiled at 2023-01-15 23:08:44.
Importing libraries
library(dplyr) # data cleaning and processing
library(rtweet) # importing and processing Twitter data structure
library(igraph) # network visualization
Importing Datasets
myfiles <- list.files(getwd(), pattern = "", full.names = F)
## [1] "2023-01-13 23:27:42 EST"
## [1] "2023-01-15 22:51...
academictwitteR
NOTE: This is a demonstration of downloading and cleaning non-retweet Twitter data using the academic research API. I’m glad to share more best practices and lessons learned in Twitter data collection and analysis. I can be reached here.
Academic Research API
ATTENTION: The resulting dataframe has the following columns.
[1] “author_id”...
academictwitteR User Download
library(academictwitteR) # get_all_tweets()
library(rtweet) # save_as_csv() that prepends numerical ids as characters
library(dplyr) # %>% convenient data cleaning
Input bearer token and file pattern
Read in Tweets Data
List all files that meet the search patterns
Read in all the files as a list
Use do.call(what = "rbind", args = lapply(nlist, a...
AcademictwitteR Referenced Tweets Download
Last compiled on 2023-01-17 22:45:51
NOTES:
Download tweets by their ids using hydrate_tweets(ids = , bearer_token = , bind_tweets = TRUE)
Include illustrations of interim data tables
Document how to flatten nested hashtag and mention information associated with each tweet
Scenario: ids of referenced (source) tweets (e.g., replied_to or quote) are...
Twitter Data Construction and Analyses
Compiled on 2023-01-24 00:55:26
Join the following three tables: (1) tweets, (2) tweet/referenced tweet authors, and (3) referenced tweets
NOTE: The users table will be joined twice because this table includes user information for both the original tweets and the referenced tweets.
time1 = Sys.time()
library(rtweet) # save_as_csv() that prepends ...
PDF Administrative Records Data Cleaning
This is a demonstration of cleaning Administrative Records (AR) downloaded from the United States Department of State, Bureau of Consular Affairs. The raw data were stored in 67 .txt files. Each .txt file contains the monthly non-immigrant visa issuances by nationality and visa class. The raw data tables are available here. This is how ...
Data Cleaning .pdf(.txt) to .csv
This is the second tutorial of my R administrative records data cleaning series. The first tutorial is available here. The raw data were downloaded from the United States Department of State, Bureau of Consular Affairs.
Figure 1
I’ll be working with monthly immigrant statistics between March 2017 and September 2022.
Figure 2
Except for be...