Publications by Shoshana Farber

DATA 624 - Homework 4

25.02.2024

Exercise 3.1 The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. The data can be accessed via: ...

3059 sym R (4644 sym/19 pcs) 42 img

DATA 624 - Homework 3

18.02.2024

Exercise 5.1 Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy) For this, let’s predict the Australian population for the next ten years. The population is steadily increasing without seasonal trends so we can use the d...

5013 sym 20 img

DATA 624 - Homework 2

12.02.2024

Exercise 3.1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time? data("global_economy") global_economy <- global_economy |> mutate(GDP_cap = GDP/Population) global_economy |> autoplot(GDP_cap, show.legend=F) + ...

4937 sym 26 img

DATA 624 Homework 1

05.02.2024

library(fpp3) Exercise 2.1 Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec. data(aus_production, pelt, gafa_stock, vic_elec) Use ? (or help()) to find out about the data in each series. ?aus_production ?pelt ?gafa_stock ?vic_elec What is the time interval of e...

4338 sym R (6039 sym/53 pcs) 26 img 2 tbl

DATA 621 Homework 3

13.11.2023

Homework 3 - Logistic Regression Overview: In this homework assignment, you will explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether or not the crime rate is above the median crime rate (1) or not (0). Your objective is to build a...

16258 sym R (25340 sym/15 pcs) 16 img 9 tbl

DATA 608 - Story 1

11.09.2023

Assignment Details This assignment is based on data on the present allocation of the Infrastructure Investment and Jobs Act (IIJA) funding by State and Territory. The goal of the assignment is to use data visualizations to address the following questions: Is the allocation equitable based on the population of each of the States and Territories...

5405 sym 5 img

DATA 605 - Final Project

18.05.2023

Problem 1 Generate Distributions Probability Density 1: X~Gamma. Using R, generate a random variable \(X\) that has 10,000 random Gamma pdf values. A Gamma pdf is completely described by \(n\) (a size parameter) and \(\lambda\) (a shape parameter). Choose any \(n\) greater than 3 and an expected value (\(\lambda\)) between 2 and 10. set.seed(...

12313 sym Python (55810 sym/146 pcs) 32 img

Final Project - Motor Vehicle Collisions

15.05.2023

Abstract This study investigates factors associated with motor vehicle collisions and their relationships with collision severity. The analysis was conducted using car crash data obtained from the New York City Police Department, which contains approximately 1.98 million collision records. The highest number of recorded collisions occurred betw...

20831 sym Python (15060 sym/62 pcs) 15 img 2 tbl

Tidyverse EXTEND

26.04.2023

Required Libraries library(tidyverse) ## Warning: package 'tidyverse' was built under R version 4.2.2 ## Warning: package 'ggplot2' was built under R version 4.2.2 library(lubridate) library(httr) library(jsonlite) Tidyverse Packages Tidyverse contains many packages within it that allow users to work with strings, mutate and rearrange datafram...

3326 sym R (2005 sym/19 pcs) 2 img 4 tbl

DATA 607 - EC 5

26.04.2023

The goal of this project is to implement a Global Baseline Estimate recommendation system in R based on movie ratings. This is an extension of assignment 2, where movie ratings were collected from different individuals. For this, we will use the ratings collected in assignment 2 by connecting to the SQL database in which they are stored. Conne...

1887 sym 1 tbl