Publications by Shane Hylton

DATA 607 Project 3

17.10.2021

Project: What are the most important data science skills? Goals: In this project, we seek to answer this question by collecting all of the words from a large number of job descriptions. We will then create word cloud plots to show which words occur most frequently in those descriptions. This should provide us with sufficient in...

2663 sym R (7861 sym/13 pcs) 5 img
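
The word-counting idea described in the DATA 607 Project 3 entry can be sketched in base R. This is only an illustration under stated assumptions: the `descriptions` vector and the short `stop_words` list are hypothetical stand-ins, not the project's data or its actual cleaning steps.

```r
# Hypothetical job-description snippets; the project's real data is not shown here.
descriptions <- c(
  "Experience with Python, SQL, and machine learning",
  "Strong SQL skills and experience with data visualization",
  "Python and SQL required; machine learning a plus"
)

# Lowercase, split on anything that is not a letter, and drop empty tokens.
words <- unlist(strsplit(tolower(descriptions), "[^a-z]+"))
words <- words[nchar(words) > 0]

# Drop a small illustrative stop-word list (an assumption, not the project's list).
stop_words <- c("and", "with", "a", "the")
words <- words[!words %in% stop_words]

# Count occurrences and sort from most to least frequent.
word_counts <- sort(table(words), decreasing = TRUE)
print(head(word_counts, 5))
```

A frequency table like `word_counts` is the usual input for a word cloud: the names supply the words and the counts supply the sizes.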

DATA 607 Homework Week 9 -- APIs

24.10.2021

NYT Bestselling Books API: I chose to work with the New York Times Bestseller List. I loaded the data into R using the API key I requested, then extracted the raw data from the JSON response. I then selected the results subsection of the raw data and created a data frame from books, where the data is stored. From there, I performed some minor...

832 sym R (2806 sym/2 pcs)

DATA 607 Homework Week 10 -- Sentiment Analysis

31.10.2021

Example Code Citations: Example code was downloaded from: Silge, Julia, and David Robinson. “2 Sentiment Analysis with Tidy Data.” Text Mining with R, https://www.tidytextmining.com/sentiment.html. Data Sets: Saif M. Mohammad and Peter Turney. (2013), “Crowdsourcing a Word-Emotion...

4204 sym R (18925 sym/93 pcs) 15 img

DATA 607 Final Project Proposal

15.11.2021

Sloan Digital Sky Survey Exploration Shane Hylton 11/14/2021 Proposal I have always found astronomy to be very inspiring. I began my college career as an astronomy major. Over the past few months, I have grown increasingly attracted to the idea of studying astronomy again. After searching for interesting astronomy datasets, I found the Sloan Dig...

1982 sym

Data Science In Context Notes

21.11.2021

Data Science In Context: Automated Machine Learning. Shane Hylton, 11/21/2021. What is Automated Machine Learning? Machine Learning: Improving algorithms and outputs through explicit instruction and experience. Supervised Machine Learning: Using labeled data to train the computer to predict l...

3798 sym

Data Science In Context Slides

21.11.2021

Automated Machine Learning. Shane Hylton, 11/21/2021. Three Key Types of Machine Learning: Supervised Machine Learning; Unsupervised Machine Learning; Automated Machine Learning. Supervised Machine Learning: User provides labeled data; computer analyzes the provided data to predict labels; hands on. Unsupervised Machine Learning: Raw, unlabeled data ...

1895 sym

607 Final Project Presentation

08.12.2021

Exploring the Universe. Shane Hylton, 12/8/2021. Motivation and Data Source: Big Data and astronomy go hand in hand; Sloan Digital Sky Survey; SQL-based search. Goals: Map the universe; visualize the relationship between temperature and magnitude (brightness); create a custom classification system. Initial Steps: Extensive tidying and trimming; Custo...

2105 sym R (1067 sym/6 pcs) 9 img

606 Final Presentation

09.12.2021

MLB Batting Analysis for 2021. Shane Hylton, 12/9/2021. Goals: Provide relevant summary statistics; construct a regression model for the relationship between age and batting average; visualize the differences in batting efficiency for each position; construct a simulation to show which position is most likely to successfully record a hit; Demonstrate...

2372 sym R (2765 sym/30 pcs) 11 img

DATA 605 Homework 2

07.02.2022

Problem Set 1, Question 1: Show that \(A^TA \neq AA^T\) in general. Let \(A\) be a square \(2 \times 2\) matrix with entries \(x_1, x_2, x_3, x_4\), so that \(A = \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}\). Then \(A^T = \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix}\). From \(A\) and \(A^T\), a general example can be found. \(A^TA ...

3597 sym R (3193 sym/10 pcs)
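
The claim in the DATA 605 Homework 2 entry, that \(A^TA \neq AA^T\) in general, can be checked numerically with a single counterexample. The matrix below is chosen purely for illustration and is not taken from the homework.

```r
# A concrete 2x2 matrix chosen for illustration (not taken from the homework).
A <- matrix(c(1, 2,
              3, 4), nrow = 2, byrow = TRUE)

AtA <- t(A) %*% A   # A^T A
AAt <- A %*% t(A)   # A A^T

print(AtA)
print(AAt)

# TRUE when the two products match entry by entry; FALSE for this A.
products_equal <- isTRUE(all.equal(AtA, AAt))
print(products_equal)
```

Both products are symmetric, but their entries differ, so one counterexample is enough to show the two are not equal in general. Equality can still hold for particular matrices, for example when \(A\) is symmetric.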

DATA 605 Homework 1

05.02.2022

Initials Creation: Using rep for the x-coordinates and seq for the y-coordinates creates a vertical line segment. Each individual chunk of the x vector has a complementary chunk in the y vector. x <- c(rep(-1.5, 500), seq(-1.5,-0.5, length.out = 500), rep(-0.5, 500), seq(-1.5,-0.5, length.out = 500), seq(-1.5,-0.5, length.out = 500), seq(-1.5, -0.5, length.out = 500)...

514 sym R (1693 sym/8 pcs) 5 img
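
The rep/seq pairing described in the DATA 605 Homework 1 entry can be illustrated with one vertical stroke. This is a minimal sketch: the y-range of -1 to 1 and the plot limits are assumptions, since the original y vector is not shown in the listing.

```r
n <- 500

# rep() holds x constant while seq() sweeps y, so the points form a vertical segment.
x <- rep(-1.5, n)
y <- seq(-1, 1, length.out = n)   # assumed y-range, for illustration only

# Each chunk of x needs a chunk of y with the same length.
stopifnot(length(x) == length(y))

plot(x, y, xlim = c(-2, 2), ylim = c(-2, 2), pch = 20,
     main = "Vertical segment from rep() and seq()")
```

Swapping the roles, with seq() for x and rep() for y, gives a horizontal stroke. Concatenating such chunks with c() builds up a letter one segment at a time.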