Publications by Alice Ding

Discussion 13

17.04.2024

Let’s represent the two numbers as $x$ and $y$, where $x+y=100$. We want to maximize the product $P=xy$. We can express one variable in terms of the other using the constraint $x+y=100$, say $y=100-x$. Now we have $P=x(100-x)$. To find the maximum of $P$, we’ll take its derivative with respect to $x$, set it equal to zero,...
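
The excerpt is cut off at this point. As a sketch of how the standard derivative argument would typically finish (assuming $x$ and $y$ are unrestricted real numbers, which the excerpt does not state explicitly), the remaining steps can be written out as:

```latex
\begin{align*}
P(x) &= x(100 - x) = 100x - x^2 \\
\frac{dP}{dx} &= 100 - 2x = 0 \quad\Longrightarrow\quad x = 50 \\
y &= 100 - x = 50 \\
\frac{d^2P}{dx^2} &= -2 < 0 \quad \text{(the critical point is a maximum)} \\
P_{\max} &= 50 \cdot 50 = 2500
\end{align*}
```

The second-derivative check is included here only to confirm that the critical point is a maximum rather than a minimum; the full post may justify this differently.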

438 sym

Discussion 12

08.04.2024

First, we will load the mtcars dataset and create the model. We will be using these fields: Quadratic term: I(hp^2) represents the quadratic term for horsepower (hp). It captures the non-linear relationship between horsepower and miles per gallon (mpg). Dichotomous term: am represents whether the car has an automatic transmission. It’s dichotomo...

3969 sym 2 img

Discussion 11

02.04.2024

First, we will load the dataset and check the structure of it.

data(mtcars)
# check the structure of the dataset
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 1...

2145 sym 2 img

Data 607 Final Project

12.05.2023

Overview South Korea has its own genre of music known as K-pop. Artists such as BTS and BLACKPINK have become more prominent in Western media over the past several years, but there is a much more diverse music scene than just these artists. Korean national broadcasting companies have weekly music shows where artists come to promote new albums and s...

14026 sym R (25244 sym/53 pcs) 15 img

Project 4

28.04.2023

Overview It can be useful to be able to classify new "test" documents using already classified "training" documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, I’ll be taking a list of 2,551 ham (non-spam) messages and 1,397 spam messages to se...

2113 sym R (3732 sym/8 pcs)

TidyVerse Create

05.04.2023

Introduction For this assignment, I’ll be creating a “vignette” for the tidyverse package ggplot2. A description of the package is as follows: ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and i...

4166 sym R (4315 sym/18 pcs) 8 img

Assignment 10

28.03.2023

Overview To start with, I’ll be copying over Text Mining with R, Chapter 2’s code base in order to perform sentiment analysis on something of my choice.

Text Mining with R, Chapter 2

library(janeaustenr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The foll...

3127 sym R (8920 sym/54 pcs) 11 img

Assignment 9

21.03.2023

Overview Given the list of APIs found here, I’ve chosen to look at the most popular one and specifically picked the one that shows the most viewed articles for the last seven days.

api_link <- "https://api.nytimes.com/svc/mostpopular/v2/viewed/7.json?api-key="
data <- GET(paste(api_link, Sys.getenv("key"), sep=""))

Note: I’ve stored my api-key...

1340 sym R (663 sym/7 pcs)

Assignment 7

07.03.2023

Overview I’ve picked 3 fiction books, each written by people of color and telling stories from their own personal lens. These bring in different cultures through the authors’ own personal experiences, but each story is inherently American in nature as well. I’ve put the following information in HTML, XML, and JSON formats: Title Author(s) Publ...

2733 sym Python (15967 sym/15 pcs)

Israeli Extra Credit

02.03.2023

Overview

israeli_data <- read.csv("https://github.com/addsding/data607/raw/main/extra/ec3/israeli_vaccination_data.csv")
head(israeli_data)

This data describes August 2021 data for Israeli hospitalization (“Severe Cases”) rates for people under 50 (assume “50 and under”) and over 50, for both un-vaccinated and fully vaccinated populations....

4874 sym