Publications by Tiangeng Lu

Two-mode Network Visualization

08.08.2023

df_two_mode_90 <- read.csv('two_mode_90.csv', row.names = 1)
df_two_mode_90 <- df_two_mode_90[,!colnames(df_two_mode_90) %in% "pol"]
print(paste("There are", nrow(df_two_mode_90), "items."))
## [1] "There are 133 items."
df_two_mode_90_simple <- df_two_mode_90[rowSums(df_two_mode_90) > 0,]
print(paste(nrow(df_two_mode_90_simple), "are kept for netw...

96 sym 3 img
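The excerpt in the entry above drops a column and filters out empty rows of a two-mode (item-by-group) table. As a minimal sketch of how such a table could then be turned into an edge list for plotting, here is a base R example. The matrix, the row and column names other than "pol", and the `edges` object are hypothetical; the excerpt does not show how the original post builds its network.

```r
# Hypothetical incidence matrix: rows are items, columns are groups.
m <- matrix(c(1, 0, 0,
              0, 0, 0,
              1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("item1", "item2", "item3"),
                            c("g1", "g2", "pol")))
df <- as.data.frame(m)

# Same two cleaning steps as in the excerpt:
df <- df[, !colnames(df) %in% "pol"]  # drop the "pol" column
df <- df[rowSums(df) > 0, ]           # keep rows with at least one tie

# Assumed next step: list every non-zero cell as an item-group edge.
idx <- which(as.matrix(df) > 0, arr.ind = TRUE)
edges <- data.frame(item  = rownames(df)[idx[, "row"]],
                    group = colnames(df)[idx[, "col"]])
print(edges)
```

An edge list of this shape is one common input for network plotting packages; whether the original post takes this route is not visible from the excerpt.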

Skilled Migration Review

30.04.2023

1 Setup
Sys.time()
## [1] "2023-04-30 01:22:08 EDT"
getwd()
## [1] "/Users/tiangeng/Library/CloudStorage/OneDrive-ThePennsylvaniaStateUniversity/R files"
# Update here for each folder
directory <- "wos_mig"
folder <- paste(getwd(), directory, sep = "/")
library(readxl)
library(writexl)
library(DT)
library(tidyr)
library(ggplot2)
library(dplyr)
## ...

3290 sym R (23143 sym/169 pcs) 2 img

World Location List

22.02.2023

library(DT)
unloc <- list.files(path = getwd(), pattern = 'UNLOCODE', full.names = F)
unloc
## [1] "2022-2 UNLOCODE CodeListPart1.csv" "2022-2 UNLOCODE CodeListPart2.csv"
## [3] "2022-2 UNLOCODE CodeListPart3.csv"
unloc_1 <- read.csv(unloc[1], header = F)
unloc_2 <- read.csv(unloc[2], header = F)
unloc_3 <- read.csv(unloc[3], header = F)
unloc <- d...

369 sym R (11923 sym/52 pcs)

Twitter (Retweet) Network Gini Coefficient Calculation

15.01.2023

Last compiled at 2023-01-15 23:08:44.
Importing libraries
library(dplyr) # data cleaning and processing
library(rtweet) # importing and processing Twitter data structure
library(igraph) # network visualization
Importing Datasets
myfiles <- list.files(getwd(), pattern = "", full.names = F)
## [1] "2023-01-13 23:27:42 EST"
## [1] "2023-01-15 22:51...

2788 sym R (9774 sym/43 pcs) 5 img
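For readers unfamiliar with the measure named in the entry above, here is a minimal base R sketch of a Gini coefficient. It assumes the input is a vector of non-negative counts per account (for example, retweets received). The excerpt does not show which quantity or function the original post uses, so the `gini()` helper and the sample vectors are hypothetical.

```r
# Hypothetical helper: Gini coefficient of a non-negative numeric vector,
# using the sorted-values form G = sum((2i - n - 1) * x_i) / (n * sum(x)).
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  if (n == 0 || sum(x) == 0) return(NA_real_)  # undefined without any mass
  i <- seq_len(n)
  sum((2 * i - n - 1) * x) / (n * sum(x))
}

# Assumed example data: retweets received by four accounts.
concentrated <- c(0, 0, 0, 10)  # one account receives everything
even         <- c(5, 5, 5)      # every account receives the same

gini(concentrated)
gini(even)
```

With this formula a perfectly even distribution gives 0, and the value approaches 1 as the counts concentrate on a single account.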

academictwitteR

17.01.2023

NOTE: This is a demonstration of downloading and cleaning non-retweet Twitter data using the academic research API. I’m glad to share more best practices and lessons learned in Twitter data collection and analysis. I can be reached here. Academic Research API ATTENTION: The resultant dataframe has the following columns. [1] “author_id”...

2382 sym R (6418 sym/35 pcs) 1 img

academictwitteR User Download

17.01.2023

library(academictwitteR) # get_all_tweets()
library(rtweet) # save_as_csv() that prepends numerical ids as characters
library(dplyr) # %>% convenient data cleaning
Input bearer token and filepattern
Read in Tweets Data
List all files that meet the search patterns
Read in all the files as a list
Use do.call(what = "rbind", args = lapply(nlist, a...

820 sym R (7295 sym/26 pcs)
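The list, read, and bind steps named in the entry above can be sketched in base R as follows. This is an illustration under stated assumptions, not the original post's code: the temporary folder, the file names, the column names, and the search pattern are all hypothetical, and only `nlist` and the `do.call(what = "rbind", ...)` call come from the excerpt.

```r
# Hypothetical demo data: write two small CSV files to a temporary folder.
tmp_dir <- file.path(tempdir(), "tweets_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(data.frame(id = c("1", "2"), text = c("a", "b")),
          file.path(tmp_dir, "tweets_part1.csv"), row.names = FALSE)
write.csv(data.frame(id = "3", text = "c"),
          file.path(tmp_dir, "tweets_part2.csv"), row.names = FALSE)

# List all files that meet the search pattern.
nlist <- list.files(tmp_dir, pattern = "^tweets_part.*\\.csv$",
                    full.names = TRUE)

# Read in all the files as a list, keeping every column as character
# so that long numeric ids are not converted to doubles.
tables <- lapply(nlist, read.csv, colClasses = "character")

# Stack the list of data frames into one data frame.
tweets <- do.call(what = "rbind", args = tables)
nrow(tweets)
```

`rbind` requires every file to have the same columns; files with differing columns would need a different binding step.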

AcademictwitteR Referenced Tweets Download

18.01.2023

Last compiled on 2023-01-17 22:45:51
NOTES:
Download tweets by their ids using hydrate_tweets(ids = , bearer_token = , bind_tweets = TRUE)
Include illustrations of interim data tables
Document how to flatten nested hashtag and mention information associated with each tweet
Scenario: ids of referenced (source) tweets (e.g., replied_to or quote) are...

2503 sym R (10542 sym/54 pcs)

Twitter Data Construction and Analyses

24.01.2023

Compiled on 2023-01-24 00:55:26
Join the following three tables: (1) tweets, (2) tweet/referenced tweet authors, and (3) referenced tweets
NOTE: The users table will be joined twice because this table includes user information for both the original tweets and the referenced tweets.
time1 = Sys.time()
library(rtweet) # save_as_csv() that prepends ...

7312 sym R (29592 sym/93 pcs) 4 img
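The entry above describes joining three tables, with the users table joined twice. A minimal base R sketch of that join pattern is shown here. All table contents and column names (`tweet_id`, `author_id`, `referenced_id`, `ref_author_id`, `username`, `ref_username`) are hypothetical, and `merge()` is used only to keep the example dependency-free; the original post may well use different functions and keys.

```r
# Hypothetical tables standing in for (1) tweets, (2) users, (3) referenced tweets.
tweets <- data.frame(tweet_id      = c("t1", "t2"),
                     author_id     = c("u1", "u2"),
                     referenced_id = c("r1", "r2"))
users <- data.frame(user_id  = c("u1", "u2", "u3"),
                    username = c("alice", "bob", "carol"))
referenced <- data.frame(referenced_id = c("r1", "r2"),
                         ref_author_id = c("u3", "u1"))

# First use of the users table: attach the author of each tweet.
step1 <- merge(tweets, users, by.x = "author_id", by.y = "user_id",
               all.x = TRUE)

# Attach the referenced tweet, which carries its own author id.
step2 <- merge(step1, referenced, by = "referenced_id", all.x = TRUE)

# Second use of the users table: rename its columns so the referenced
# tweet's author does not collide with the original author's columns.
ref_users <- setNames(users, c("ref_author_id", "ref_username"))
full <- merge(step2, ref_users, by = "ref_author_id", all.x = TRUE)

full[, c("tweet_id", "username", "ref_username")]
```

Renaming the users columns before the second join is one way to keep both authors' information side by side in a single row per tweet.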

PDF Administrative Records Data Cleaning

07.11.2022

This is a demonstration of cleaning Administrative Records (AR) downloaded from the United States Department of State–Bureau of Consular Affairs. The raw data were stored in 67 .txt files. Each of the .txt files contains the monthly non-immigrant visa issuances by nationality and visa class. The raw data tables are available here. This is how ...

2230 sym Python (7139 sym/48 pcs) 3 img

Data Cleaning .pdf(.txt) to .csv

12.11.2022

This is the second tutorial of my R administrative records data cleaning series. The first tutorial is available here. The raw data were downloaded from the United States Department of State–Bureau of Consular Affairs. Figure 1 I’ll be working with monthly immigrant statistics between March 2017 and September 2022. Figure 2 Except for be...

2139 sym Python (6357 sym/44 pcs) 5 img