Publications by Diana Plunkett

HW4

22.09.2024

Helpful links: http://rismyhammer.com/ml/Pre-Processing.html#pre-processing https://www.rdocumentation.org/packages/caret/versions/6.0-92/topics/preProcess Exercise 3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. Th...

4997 sym R (4299 sym/15 pcs) 10 img

HW3

18.09.2024

Exercise 5.1 Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy) First a quick look at the data. AUSpop <-global_economy |> filter (Code=='AUS') |> select (Population) autoplot(AUSpop) + scale_y_continuous(labels =...

6919 sym Python (4704 sym/32 pcs) 23 img

624 hw2

10.09.2024

Exercise 3.1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time? ge <-global_economy |> mutate(gdpByPop = GDP/Population) |> mutate( color = case_when( Code=='MCO' ~ 'red', Code=='LIE' ~ 'blue', Code=...

5331 sym 25 img

624 wk2 HW

06.09.2024

Exercise 2.1 Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec. -Use ? (or help()) to find out about the data in each series. -What is the time interval of each series? -Use autoplot() to produce a time plot of each series. -For the last plot, modify the axis labels ...

2437 sym Python (2804 sym/17 pcs) 9 img

Document4

17.03.2024

Data Sources: Salary Data: https://www.zippia.com/ 3/2/2024 Cost of Living Index: https://www.datapandas.org/ranking/cost-of-living-by-state Inspiration for dumbbell viz: Stalder, T., Holtz, Y. (2021): Extended Dumbbell Plot in R with ggplot2. R graph gallery. Access: r-graph-gallery.com/web-extended-dumbbell-plot-ggplot2.html. Date: 03-03-2024...

611 sym R (7553 sym/17 pcs) 7 img

Document3.1

02.03.2024

Python code sample: import requests url = "https://wonder.cdc.gov/datarequest/D158" response = requests.post(url, data={"request_xml": xml_request, "accept_datause_restrictions": "true"}) if response.status_code == 200: data = response.text else: print("something went wrong") library(httr) # in the response from the manual use o...

305 sym R (737 sym/3 pcs)

Document3

02.03.2024

Data Sources: Gun law strength: https://everytownresearch.org/rankings/methodology/ Deaths by Firearms: https://www.cdc.gov/nchs/pressroom/sosmap/firearm_mortality/firearm.htm First get the hexmap for US States # Load required libraries library(tidyverse) library(sf) library(ggplot2) # Read the file into an sf object us_states_hexgrid <- st...

454 sym R (4468 sym/12 pcs) 5 img

Story2

17.02.2024

Data sources: https://fred.stlouisfed.org/ CPI: https://fred.stlouisfed.org/series/CORESTICKM159SFRBATL Unemployment rate: https://fred.stlouisfed.org/series/UNRATE Fed Funds: https://fred.stlouisfed.org/series/FEDFUNDS library(tidyverse) library(dplyr) #Bring in the Data ff <- as.data.frame(read_csv ("https://raw.githubusercontent.com/dianaplunk...

705 sym R (5655 sym/12 pcs) 5 img

Document

12.11.2023

Introduction Our goal is to use job salary data (retrieved from the Ask A Manager Salary Survey here: https://docs.google.com/spreadsheets/d/1IPS5dBSGtwYVbjsfbaMCYIWnOuRmJcbequohNxCyGVw/edit?resourcekey#gid=1625408792) and demographic data to predict whether the salaries in the survey end up above or below per capita personal income for the state. We origin...

16798 sym R (41134 sym/25 pcs) 8 img 6 tbl

Publish Document

26.10.2023

Introduction We will explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has 12 predictor variables and a response variable indicating whether the crime rate is above the median crime rate (1) or not (0). The model will be a binary logistic regression model on the train...

25256 sym R (23534 sym/26 pcs) 15 img 3 tbl