Publications by Alice Ding
Discussion 13
Let’s represent the two numbers as \(x\) and \(y\), where \(x+y=100\). We want to maximize the product \(P=xy\). We can express one variable in terms of the other using the constraint \(x+y=100\), say \(y=100-x\). Now we have \(P=x(100-x)\). To find the maximum of \(P\), we’ll take its derivative with respect to \(x\), set it equal to zero,...
438 sym
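A worked version of the Discussion 13 setup above, offered as a sketch because the excerpt is cut off before the solution. The second-derivative check is an added step and is not taken from the original text:

\[
\begin{aligned}
P(x) &= x(100-x) = 100x - x^2,\\
P'(x) &= 100 - 2x = 0 \;\Rightarrow\; x = 50,\quad y = 100 - x = 50,\\
P''(x) &= -2 < 0 \;\Rightarrow\; P(50) = 50 \cdot 50 = 2500 \text{ is a maximum.}
\end{aligned}
\]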
Discussion 12
First, we will load the mtcars dataset and create the model. We will be using these fields: Quadratic term: I(hp^2) represents the quadratic term for horsepower (hp). It captures the non-linear relationship between horsepower and miles per gallon (mpg). Dichotomous term: am represents whether the car has an automatic transmission. It’s dichotomo...
3969 sym 2 img
Discussion 11
First, we will load the dataset and check the structure of it.
data(mtcars)
# check the structure of the dataset
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 1...
2145 sym 2 img
Data 607 Final Project
Overview South Korea has its own genre of music known as K-pop. Artists such as BTS and BLACKPINK have become more prominent in Western media over the past several years, but the music scene is much more diverse than these artists alone. Korean national broadcasting companies have weekly music shows where artists come to promote new albums and s...
14026 sym R (25244 sym/53 pcs) 15 img
Project 4
Overview It can be useful to be able to classify new "test" documents using already classified "training" documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, I’ll be taking a list of 2,551 ham (non-spam) messages and 1,397 spam messages to se...
2113 sym R (3732 sym/8 pcs)
TidyVerse Create
Introduction For this assignment, I’ll be creating a “vignette” for the tidyverse package ggplot2. A description of the package is as follows: ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and i...
4166 sym R (4315 sym/18 pcs) 8 img
Assignment 10
Overview To start with, I’ll be copying over Text Mining with R, Chapter 2’s code base in order to perform sentiment analysis on something of my choice.
Text Mining with R, Chapter 2
library(janeaustenr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The foll...
3127 sym R (8920 sym/54 pcs) 11 img
Assignment 9
Overview Given the list of APIs found here, I’ve chosen to look at the most popular one and specifically picked the one that shows the most viewed articles for the last seven days.
api_link <- "https://api.nytimes.com/svc/mostpopular/v2/viewed/7.json?api-key="
data <- GET(paste(api_link, Sys.getenv("key"), sep=""))
Note: I’ve stored my api-key...
1340 sym R (663 sym/7 pcs)
Assignment 7
Overview I’ve picked 3 fiction books, each written by people of color and telling stories from their own personal lens. These books present different cultures through the authors’ own personal experiences, but each story is inherently American in nature as well. I’ve put the following information in html, xml, and json formats: Title Author(s) Publ...
2733 sym Python (15967 sym/15 pcs)
Israeli Extra Credit
Overview
israeli_data <- read.csv("https://github.com/addsding/data607/raw/main/extra/ec3/israeli_vaccination_data.csv")
head(israeli_data)
This dataset describes August 2021 Israeli hospitalization (“Severe Cases”) rates for people under 50 (assume “50 and under”) and over 50, for both unvaccinated and fully vaccinated populations....
4874 sym