Publications by Julia Ferris

Lab 8 DATA 606

05.11.2023

library(tidyverse) library(openintro) library(ggplot2) data('hfi', package='openintro') Exercise 1 What are the dimensions of the dataset? The dataset has 1458 rows and 123 columns. glimpse(hfi) ## Rows: 1,458 ## Columns: 123 ## $ year <dbl> 2016, 2016, 2016, 2016, 2016, 2016,… ## $ ISO_code ...

17174 sym 7 img

Assignment 9 - API HW

31.10.2023

Load Libraries library(rjson) library(RJSONIO) ## ## Attaching package: 'RJSONIO' ## The following objects are masked from 'package:rjson': ## ## fromJSON, toJSON library(gt) library(ggplot2) Format Data and Create Data Frame The data was retrieved from the NY Times API I created. It shows data for the top stories in the U.S. usa_data <...

691 sym R (840 sym/6 pcs) 1 img 1 tbl

Project Proposal DATA 606

30.10.2023

Data Preparation library(readr) traffickingVictims <- read_csv("https://raw.githubusercontent.com/juliaDataScience-22/cuny-fall-23/stats-and-probability/data_glotip.csv") library(dplyr) library(plyr) library(ggplot2) # Less than 5 can include 0, 1, 2, 3, or 4. # On average, we can expect 2 to be the value, # so I changed every value tha...

3438 sym R (8387 sym/11 pcs) 7 img

Project 3 DATA 607

29.10.2023

Finding Value Within the Data Science Industry Author Team Krijudato (Kristin L, Julia F, David G, Tony F) Published October 29, 2023 Introduction Beginning in 2011, Stack Overflow has conducted an annual survey for developers to participate in. This survey provides insights on a wide variety of topics, such as employment status, annual salar...

7733 sym Python (12700 sym/15 pcs) 14 img 1 tbl

Lab 7

22.10.2023

knitr::opts_chunk$set(eval = TRUE, message = FALSE, warning = FALSE) library(tidyverse) library(openintro) library(infer) library(ggplot2) data('yrbss', package='openintro') Question 1 What are the cases in this data set? How many cases are there in our sample? The cases are individual high schoolers who participated in the survey. 13,583...

21266 sym 4 img

Lab 6: Inference for Categorical Data

15.10.2023

library(tidyverse) library(openintro) library(infer) Exercise 1 What are the counts within each category for the number of days these students have texted while driving within the past 30 days? 0: 4792; 1-2: 925; 3-5: 493; 6-9: 311; 10-19: 373; 20-29: 298; 30: 827; did not drive: 4646; NA: 918. library(dplyr) yrbss |> co...

14475 sym Python (4036 sym/21 pcs) 2 img

Assignment 7 - Working with Different Data Sources

16.10.2023

Introduction In this file, three data frames (tables) are created. Each one shows the same data, but each is built from a different type of source: the first from a JSON file, the second from an HTML file, and the third from an XML file. All the tables are at the end of this document. library(rvest) library(rjson) library(json...

328 sym R (1482 sym/8 pcs) 3 tbl

Extra Credit - Vaccine Data

17.10.2023

Import the Data I used the read.csv() function to read the data from a csv file. vaccineData <- read.csv("https://raw.githubusercontent.com/juliaDataScience-22/cuny-fall-23/manage-acquire-data/vaccine_data.csv") Format the Data To format the data, I made the types of data the names of the columns. I renamed the first and last columns, and then I ...

3103 sym Python (365 sym/7 pcs) 1 tbl

Lab 5a

05.10.2023

knitr::opts_chunk$set(eval = TRUE, message = FALSE, warning = FALSE) set.seed(1234) library(tidyverse) library(openintro) library(infer) Exercise 1 Describe the distribution of responses in this sample. How does it compare to the distribution of responses in the population? Hint: Although the sample_n function takes a random sample of obser...

27644 sym 11 img

Lab 5b

05.10.2023

knitr::opts_chunk$set(eval = TRUE, message = FALSE, warning = FALSE) library(tidyverse) library(openintro) library(infer) Exercise 1 What percent of the adults in your sample think climate change affects their local community? Hint: Just like we did with the population, we can calculate the proportion of those in this sample who think clima...

17914 sym