Publications by yina qiao

Data 605 HW1


library(dplyr) library(gifski) HW Outline Part 1: Draw Initials ‘Y’ and ‘Q’ Creation of ‘Y’ Creation of ‘Q’ Par...

606 final exams


Part I Please put the answers for Part I next to the question number (please enter only the letter options; 4 points each): 1.B 2.A 3.D 4.B 5.B 6.E 7.D 8.E 9.B 10.C Part II Consider the three datasets, each with two columns (x and y), provided below. Be sure to replace the NA with your answer for each part (e.g. assign the mean of x f...

Data 606 Final Project


Library library(tidyverse) library(caTools) library(ROCR) library(rpart) library(rmdformats) library(randomForest) library(psych) Introduction Research question: Which variables affect the loan approval rate? Given a list of applicant characteristics, can we build a model to predict the loan approval outcome? Problem: Dream Housing Company w...

Data 606 lab 9


Grading the professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characte...

Data 607 Final Project


Introduction Research Question: What are the top tech companies to work for? What characteristics make them so? Overview: Needless to say, it’s been a turbulent few years. Companies have either risen to the top or fallen off thanks to competing pressures from all sides. Over the course of this class, we discussed a number of com...

Data Science in context


Data 607 Project 4


Introduction Wouldn't it be nice to classify text messages as spam or ham (legitimate messages)? This project aims to do exactly that. We will use the SMS Spam Collection dataset to build a document classifier (Naive Bayes algorithm) that classifies text messages, predict the class of new text (the testing dataset), and then use a confusion matrix to evaluate the mod...

Data 607 Lab 10


Introduction In this assignment, you should start by getting the primary example code from chapter 2 working in an R Markdown document. You should provide a citation to this base code. You’re then asked to extend the code in two ways: 1. Work with a different corpus of your choosing, and 2. Incorporate at least one additional sentiment lexicon C...

Data 606 Lab 8


In this lab, you’ll be analyzing data from Human Freedom Index reports from 2008-2016. Your aim will be to summarize a few of the relationships within the data both graphically and numerically in order to find which variables can help tell a story about freedom. Load packages In this lab, you will explore and visualize the data using the tidyver...

Data 606 final project proposal


Data Preparation Install Packages library(tidyverse) library(caTools) library(ROCR) library(rpart) library(rmdformats) library(randomForest) library(psych) ## load data my_loan_data <- read.csv("") head(my_loan_data) ## Loan_ID Gender Married Dependent...

