Publications by Shin Lee



Write text and code here. Research purpose and background: What is (are) your main question(s)? What is your story? What does the final graphic show? Overview of the data: Explain where the data came from, what agency or company made it, how it is structured, what it shows, etc. Data cleaning and processing: Describe and ...

679 sym



Final Project For your final project, you will take a dataset, explore it, tinker with it, and tell a nuanced story about it using any method of automated text analysis covered in this class. I want this project to be as useful for you and your future career as possible - you’ll hopefully want to show off your final project in a portfolio or duri...

6776 sym

Publish Document


Final Project For your final project, you will take a dataset, explore it, tinker with it, and tell a nuanced story about it using any method of automated text analysis covered in this class. I want this project to be as useful for you and your future career as possible - you’ll hopefully want to show off your final project in a portfolio or ...

6860 sym

HMI Week 11 Pre-class


Ch10. Word and text relatedness Learning Objectives Understand the goals and applications of the task of word relatedness Learn about corpus-based measures of word relatedness Word Relatedness Semantic relatedness involves identification and quantification of the strength of relationships in meaning that exist between textual units such as wor...

11199 sym R (23422 sym/76 pcs) 9 img

HMI Week 10 Pre-class


Ch11. Text Classification Learning Objectives Understand the task of text classification and learn about its applications Learn about a basic automated way of text classification Practice the lexicon-based analysis for sentiment classification of COVID-19 Tweets The Logic of Text Classification Text classification refers to the task of assigni...

10321 sym R (82238 sym/110 pcs) 17 img



The logic of text classification is simple Document clustering < Unsupervised machine learning Document classification < Supervised machine learning When we know what to find or predict from text The defined outcome for each document is predicted, not clustered The premise of classification is simple: given a categorical target variable, learn...

6398 sym R (555486 sym/106 pcs) 10 img

HMI Week 9 Pre-class


Ch14. Sentiment Analysis Learning Objectives Understand the tasks of subjectivity and sentiment analysis Learn about resources for subjectivity and sentiment analysis, specifically addressing lexicon-based sentiment analysis Learn about the tidy text approach to lexicon-based sentiment analysis What is Sentiment Analysis? Sentiment analysis is the...

9339 sym R (12602 sym/69 pcs) 10 img

HMI: Week9


Ch14. Sentiment Analysis Learning Objectives Understand the tasks of subjectivity and sentiment analysis Learn about resources for subjectivity and sentiment analysis, specifically addressing lexicon-based sentiment analysis Learn about the tidy text approach to lexicon-based sentiment analysis What is Sentiment Analysis? Sentiment analysis is the...

9227 sym R (11062 sym/65 pcs) 10 img

HMI: First Major Assignment


First Major Assignment In this first major assignment to be graded, you will 1) pre-process and tokenize text retrieved from 10,000 randomly sampled tweets about Covid-19 in covid19_tweets_df, 2) create three tables to show the top 50 hashtags among the tweets posted on March 26th, 27th, and 28th, respectively, and 3) generate three word clouds tha...

2183 sym R (472 sym/2 pcs)

HMI First Major Assignment


First Major Assignment In this first major assignment to be graded, you will 1) pre-process and tokenize text retrieved from 10,000 randomly sampled tweets about Covid-19 in covid19_tweets_df, 2) create three tables to show the top 20 hashtags among the tweets posted on March 26th, 27th, and 28th, respectively, and 3) generate three word clouds tha...

2183 sym R (468 sym/2 pcs)