Publications by Carol Campbell
Data 607 Project 4 "Spam Ham"
Assignment: We are to create a program that can classify a text document using training documents that are already classified. This program will classify email as ‘spam’, i.e., unwanted email, or ‘ham’, i.e., wanted email. Install necessary packages (only need to do this once): # install.packages("tm") # install.packages("caTools") # ...
4257 sym Python (8384 sym/34 pcs) 1 img
Data 607 Sentiment Analysis, Part 2
Part 2 of 2: Sentiment Analysis. My corpus is from The Great Gatsby by F. Scott Fitzgerald. About the new corpus and lexicon: In Part 1, the janeaustenr package was used to explore tidying text from Jane Austen's novels. For this assignment, I used the gutenbergr package (Robinson 2016) to locate the book “The Great Gatsby” by F. Scott Fitzgerald. The...
1274 sym R (5964 sym/38 pcs) 4 img
Data 607 Sentiment Analysis Part 1
Part 1 of 2 - recreate the code from Chapter 2 of the book Text Mining with R by Julia Silge and David Robinson. https://www.tidytextmining.com/sentiment.html CHAPTER 2 SENTIMENT ANALYSIS WITH TIDY DATA Packages used for this assignment #install.packages("tidyverse") #install.packages("textdata") #install.packages("gutenbergr") #install.pack...
8768 sym R (10175 sym/55 pcs) 5 img
Document
Introduction: The goal of this assignment is to practice collaborating around a code project with GitHub. You could consider our collective work as building out a book of examples on how to use TidyVerse functions. What is Tidyverse? Tidyverse is a collection of R packages which contain tools for transforming an...
4944 sym Python (3286 sym/5 pcs) 1 img 3 tbl
Data 607 HW 9 - Web APIs
Assignment Overview: The New York Times web site provides a rich set of APIs, as described here. You’ll need to start by signing up for an API key. Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame. Packages: library(httr) library(jsonlite) librar...
815 sym Python (194578 sym/30 pcs)
Data 606 - Lab 8 Intro to Linear Regression
Title: “Data 606 Lab 8 - Intro to Linear Regression”. Author: Carol Campbell. Date: 2023-11-05. The Human Freedom Index is a report...
11178 sym R (4718 sym/24 pcs) 6 img
Group Project 3 - Most Valued Data Science Skills final version
Loading libraries. Introduction: The goal of this project is to find and use data to answer the question “Which are the most valued data science skills?” Our group members are Kossi Akplaka, Carol Campbell, Saloua Daouki, and Souleymane Doumbia. We began by searching various websites for suitable data before finally settling on a Kaggle ...
4456 sym Python (26913 sym/44 pcs) 10 img
HW 7 - Working with XML and JSON in R
Assignment: Working with XML and JSON in R. Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, ...
1183 sym R (6669 sym/31 pcs)
Data 607-Homework-5
Assignment – Tidying and Transforming Data: We are given a .csv file containing disjointed data for two airlines, “Alaska” and “AM WEST”, the five airports that they operate out of, and their respective arrival and departure delays. Use ‘tidyr’ and ‘dplyr’ to tidy and transform the data. Perform an analysis to compare the arri...
1319 sym Python (2672 sym/12 pcs) 2 img 5 tbl
Data 606 Lab 5a
Title: “Foundations for statistical inference - Sampling distributions”. Author: Carol Campbell. In this lab, you will investigate the w...
13360 sym R (6161 sym/34 pcs) 5 img