Publications by Yina Qiao
Data 605 HW1
library(dplyr) library(gifski) HW Outline Part 1: Draw Initials ‘Y’ and ‘Q’ Creation of ‘Y’ Creation of ‘Q’ Par...
472 sym R (2820 sym/15 pcs) 3 img
606 final exams
Part I Please put the answers for Part I next to the question number (please enter only the letter options; 4 points each): 1.B 2.A 3.D 4.B 5.B 6.E 7.D 8.E 9.B 10.C Part II Consider the three datasets, each with two columns (x and y), provided below. Be sure to replace the NA with your answer for each part (e.g. assign the mean of x f...
1953 sym 9 img 1 tbl
Data 606 Final Project
Library library(tidyverse) library(caTools) library(ROCR) library(rpart) library(rmdformats) library(randomForest) library(psych) Introduction Research question: Which variables affect the loan approval rate? Given a list of applicant characteristics, can we build a model to predict the loan approval outcome? Problem: Dream Housing Company w...
15664 sym 12 img
Data 606 lab 9
Grading the professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these student evaluations as an indicator of course quality and teaching effectiveness is often criticized because these measures may reflect the influence of non-teaching related characte...
12843 sym R (15158 sym/35 pcs) 11 img
Data 607 Final Project
Introduction Research Question: What are the Top Tech companies to work for? What are the characteristics that make them so? Overview: Needless to say, it’s been a turbulent few years. Companies have either risen to the top or fallen off thanks to competing pressures from all sides. Over the course of this class, we discussed a number of com...
21025 sym Python (8542 sym/26 pcs) 2 img 1 tbl
Data Science in context
Tableau Public: hit "open in browser" for a full view of the dashboard.
4722 sym 2 img
Data 607 Project 4
Introduction Wouldn't it be nice to classify text messages as spam or ham (legitimate messages)? This project aims to do exactly that. We will use the SMS Spam Collection dataset to build a document classifier (Naive Bayes algorithm) that classifies text messages, predict the class of new text (the testing dataset), and then use a confusion matrix to evaluate the mod...
12407 sym Python (10450 sym/54 pcs) 1 img
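The workflow named in the Data 607 Project 4 excerpt (train a Naive Bayes classifier, predict on held-out messages, evaluate with a confusion matrix) can be illustrated with a minimal sketch. This is not the project's code: it assumes a multinomial Naive Bayes with add-one smoothing written in plain Python, a whitespace tokenizer, and a handful of made-up messages standing in for the SMS Spam Collection dataset. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(messages, labels):
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training messages in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # Ignore words never seen in training.
            # Log likelihood with Laplace (add-one) smoothing.
            numerator = word_counts[label][token] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def confusion_matrix(actual, predicted, labels=("ham", "spam")):
    """Nested dict indexed as matrix[actual_label][predicted_label]."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Hypothetical toy data; the real project uses the SMS Spam Collection dataset.
train_messages = [
    "win a free prize now",
    "free cash offer claim now",
    "are we still meeting for lunch",
    "see you at the library tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_messages = ["claim your free prize", "lunch at the library"]
test_labels = ["spam", "ham"]

model = train_naive_bayes(train_messages, train_labels)
predictions = [predict(model, message) for message in test_messages]
matrix = confusion_matrix(test_labels, predictions)
print(predictions)
print(matrix)
```

The diagonal entries of `matrix` count correct predictions and the off-diagonal entries count misclassifications, which is the basis for accuracy, precision, and recall. A library implementation would normally replace the hand-written training and prediction steps on a dataset of realistic size.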
Data 607 Lab 10
Introduction In this assignment, you should start by getting the primary example code from chapter 2 working in an R Markdown document. You should provide a citation to this base code. You’re then asked to extend the code in two ways: 1. Work with a different corpus of your choosing, and 2. Incorporate at least one additional sentiment lexicon C...
1998 sym R (18080 sym/133 pcs) 7 img 3 tbl
Data 606 Lab 8
In this lab, you’ll be analyzing data from Human Freedom Index reports from 2008-2016. Your aim will be to summarize a few of the relationships within the data both graphically and numerically in order to find which variables can help tell a story about freedom. Load packages In this lab, you will explore and visualize the data using the tidyver...
10208 sym R (7458 sym/36 pcs) 8 img
Data 606 final project proposal
Data Preparation Install Packages library(tidyverse) library(caTools) library(ROCR) library(rpart) library(rmdformats) library(randomForest) library(psych) ## load data my_loan_data<- read.csv("https://raw.githubusercontent.com/yinaS1234/data-606/main/606%20final%20project/loan_data.csv") head(my_loan_data) ## Loan_ID Gender Married Dependent...
7608 sym 5 img