Publications by Jay Ralyea
Data Science in Finance - Lab 4
Article published on January 10, 2021. Summary of Article: The financial services industry finds itself among the ever-growing list of fields looking to benefit from data science. Rohit Sharma's article identifies the “top seven use cases” of data science in finance as “risk analytics, real-time analytics, consumer analytics, customer data man...
2520 sym 2 img 1 tbl
Commercial or Not? K-Nearest Neighbors
Base Rate: In the commercials dataset, about 64 percent of the total observations are true commercials. Any model generated needs to classify commercials better than a rule that assigns every observation the label “commercial,” which would result in an accuracy of about 64 percent. ...
3708 sym R (2420 sym/3 pcs) 1 img 1 tbl
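The base-rate argument in the "Commercial or Not?" summary can be made concrete with a small sketch. The original lab is written in R; this Python version is only an illustration, and the label list is hypothetical, sized to match the roughly 64 percent share quoted in the summary rather than taken from the actual dataset.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / len(labels)


# Hypothetical labels (assumption): 64 commercials and 36 non-commercials,
# chosen only to mirror the approximate share stated in the summary.
labels = ["commercial"] * 64 + ["non-commercial"] * 36

baseline_label, baseline_accuracy = majority_baseline(labels)
print(baseline_label, baseline_accuracy)
```

A classifier is only useful here if its accuracy on held-out data exceeds `baseline_accuracy`.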
Saving the Wizards
Business Problem: Over the past few years, the Wizards have been mediocre at best and abysmal at worst. Thankfully, machine learning can be used to save the team from irrelevance. The data set used in this process is available on basketball-reference.com and has statistics ranging from games played to effective field goal percentage. In terms of as...
4975 sym R (1587 sym/4 pcs) 4 img
DS 3002 Lab 5: Corruption and human development
Lab 5, Jay Ralyea, 3/10/2021. Panels: Final Chart, Datatable, Bar Chart ...
141 sym 10 img
Sentiment Analysis of "Text Mining"
Introduction: We decided to perform sentiment analysis on the term “Text Mining.” We felt a more specific term would yield results distinct from those for the “Data Science” term, and it was a term we were interested in learning more about. Narrowing Down Source Material: Part of our approach to gathering an appropriate corpus to analyze ca...
6460 sym R (675 sym/5 pcs) 11 img
Evaluation of Two K-NN Models
Heart Attack: Heart attacks are life-threatening events that afflict many people each year. From data gathered from Kaggle, it may be possible to determine the attributes that increase the likelihood of a heart attack. Available in the data are the patient’s age, resting blood pressure (in mm Hg), cholesterol in mg/dl, and maximum heart rat...
8104 sym R (4911 sym/19 pcs) 1 img
Decision Tree Models
Progesterone Receptors: The variable of interest, “PR.Status,” must be a binary response variable. “PR.Status” indicates whether or not progesterone receptors are present in the tumor. The table of values (Var1, Freq) shows that “PR.Status” values are binary: pr_absent, 51; pr_present, 54. Base Rate: The base rate...
6903 sym R (5104 sym/12 pcs) 6 img 3 tbl
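The frequency table in the "Decision Tree Models" summary is enough to work out a base rate by hand. The sketch below does that arithmetic in Python (the lab itself uses R); it assumes the base rate means the share of the larger class, which the truncated summary does not confirm.

```python
# Counts copied from the summary's frequency table (Var1, Freq).
counts = {"pr_absent": 51, "pr_present": 54}

total = sum(counts.values())
shares = {status: n / total for status, n in counts.items()}

# Assumption: "base rate" is taken as the share of the more common class.
majority_status = max(shares, key=shares.get)
base_rate = shares[majority_status]

print(total, majority_status, round(base_rate, 3))
```

With classes this close to balanced, always predicting the majority class is right only slightly more than half the time.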
Group 2: Exploring Plotly
Due Date: 11:59pm, Oct 25. Part 1: Instruction. Use the EuStockMarkets data, which contains the daily closing prices of major European stock indices: Germany DAX (Ibis), Switzerland SMI, France CAC, and UK FTSE. Then, create multiple lines that show changes in each index’s daily closing prices over time. Please use function gather from pac...
1423 sym 1 img
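The gather step named in the "Exploring Plotly" instruction reshapes a wide table (one column per index) into a long one (one row per day and index), which is the shape a multi-line plot usually needs. The following is a minimal Python sketch of that reshaping idea, not the R function the assignment asks for; the helper `gather_long`, the `day` column, and all prices are hypothetical.

```python
def gather_long(rows, id_key, value_keys, key_name="index", value_name="price"):
    """Turn wide rows into long rows: one output row per (id, column) pair."""
    long_rows = []
    for row in rows:
        for key in value_keys:
            long_rows.append(
                {id_key: row[id_key], key_name: key, value_name: row[key]}
            )
    return long_rows


# Hypothetical prices (assumption); only the column names come from the summary.
wide = [
    {"day": 1, "DAX": 100.0, "SMI": 200.0, "CAC": 300.0, "FTSE": 400.0},
    {"day": 2, "DAX": 101.0, "SMI": 199.0, "CAC": 302.0, "FTSE": 401.0},
]

long_rows = gather_long(wide, "day", ["DAX", "SMI", "CAC", "FTSE"])
for row in long_rows:
    print(row)
```

Each index then becomes one line in the plot by grouping the long rows on the `index` field.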