Publications by William Jasmine

Data 606 - Lab 8 - Introduction to Linear Regression


The Human Freedom Index is a report that attempts to summarize the idea of “freedom” through a bunch of different variables for many countries around the globe. It serves as a rough objective measure for the relationships between the different types of freedom - whether it’s political, religious, economic or personal freedom - and other s...

14598 sym 8 img

Data 607 - Assignment 7 - Intro to Sentiment Analysis


Introduction The code below is cited from Chapter 2 of Text Mining with R: A Tidy Approach by Julia Silge and David Robinson, which shows how the tidytext library can be used to evaluate the sentiment of text. The example I have chosen shows how the nrc lexicon can be used to get the most commonly used words in Jane Austen no...

4492 sym Python (4401 sym/25 pcs)

Data 607 - Project 4 - Document Classification


Introduction This document outlines the process by which a Naive Bayes classifier can be used to categorize documents as spam or ham (not spam). The data used in this case comes from Kaggle and consists of a set of 5,574 SMS text messages. The Kaggle dataset page lists the sources of the text messages, all of which are given below: A c...

9521 sym Python (11038 sym/35 pcs) 4 img

Data 606 - Lab 9 - Multiple Regression


Grading the professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related charac...

16449 sym 15 img

Using Logistic Regression to Identify b-Jets In Particle Collision Data


1 Introduction 1.1 Context In recent history, particle collision experiments have provided humanity with some of its greatest advancements in scientific knowledge. Probably the most famous of these was in 2012, when scientists at the Large Hadron Collider (LHC) at the CERN laboratory in Geneva, Switzerland, announced that they had discovered de...

19847 sym Python (18067 sym/48 pcs) 13 img 1 tbl

Influence of Campaign Finance on the Outcomes of the 2018 Midterm Elections


1 Introduction 1.1 Context Compared to many other democracies in the world, elections in the United States tend to last a very long time. For presidential elections, nominees typically announce their candidacy over a year (and sometimes multiple years) in advance, and US citizens are exposed to political advertising and showmanship o...

15964 sym Python (14057 sym/56 pcs) 4 img 1 tbl