Publications by Shin Lee
HMI: Week 14
Learning Objectives: Describe a theoretical foundation of topic modeling methods. Understand word co-occurrences as a way of analyzing topics. Analyzing Topics: Texts can be analyzed in terms of topics, that is, in terms of what they are written about in the first place. For example, consider how the topics covered in ...
HMI: Second Major Assignment
Second Major Assignment: In this second graded major assignment, you will run a text analysis of tweets collected around the keyword "social distancing". To do so, you are provided with a tweet data set, social_distancing_HMI.RData, which contains only English-language tweets with geolocation information. Analyzing the tweets, you are required to set...
BDM-Week7
AJAX. HTML: a markup language for structuring content on the Web; static display of content in a page layout; unable to create more dynamic displays of information. Once an HTML document is downloaded, its visual appearance will not change. Three elements to provide content dynamically: a mechanism to register user behavior in the browser; a scri...
BDJ20-Week5
Introduction to Web Scraping. What is web scraping? Web scraping is the process of extracting a structured representation of data from a website. Example: to collect user comments on an online news article. Steps: targeting the markup of the web page; parsing the web page into a tree representation; running an R script automatically. I. Data on the intern...
html-1
I am your first HTML file! 5 < 6 but 7 > 3 5 < 6 but 7 > 3 Writing code is poetry Writing code is poetry Writing code is poetry Link with absolute path some text set in bold some text set in italics some text so important to be emphasized This text is going to be a paragraph one day and separated from ...
BDJ20-Week4-2
Before we start… Please note that you will work with R Markdown documents. An R Markdown document consists of three parts: 1) content; 2) code; 3) output (results). First, the content parts describe what you are learning and what you are asked to work on. Second, the code parts appear in grey boxes and are what you can enter in the source window of RStudio. T...
BDJ-W4-1
Data Storytelling Scenarios. Questions to be considered: What type of data is best suited to answer your question? Is the quality of the data sufficiently high to answer your question? Is the information systematically flawed? Web data quality: origin of online data. What are the primary sources of secondary data? There may be situations wh...
BDM-Week3
How to use R: Foundational notions of the R language. Vectors - Numeric - Logical / Boolean - Character “object-oriented programming language” apple <- c(2020123456, 20197890123, 2020456789, 201812345) sale <- c("apple", "banana", "kiwi", "melon") sale ## [1] "apple" "banana" "kiwi" "melon" sales <- c(sale, apple) sales #...
BDJ-Week3
Case study: World Heritage Sites in Danger - 1,121 heritage sites like the Pyramids in Egypt - Which sites are threatened and where are they located? - Are there regions in the world where sites are more endangered than in others? - What are the reasons that put a site at risk? https://en.wikipedia.org/wiki/List_of_World_Heritage_in_Danger Loa...