Publications by KoohPi

DATA606 - Breast Cancer Survival Rate Estimate

13.05.2024

Data Preparation In this project, I have chosen to work on breast cancer. There are various resources available on this topic, with the Surveillance, Epidemiology, and End Results (SEER) [1] program being the most reliable one. The SEER Program of the National Cancer Institute (NCI) collects and publishes cancer data through a coordinated syst...

19434 sym R (119398 sym/125 pcs) 47 img 25 tbl

DATA606 Final Project

12.05.2024

Data Preparation In this project, I have chosen to work on breast cancer. There are various resources available on this topic, with the Surveillance, Epidemiology, and End Results (SEER) [1] program being the most reliable one. The SEER Program of the National Cancer Institute (NCI) collects and publishes cancer data through a coordinated syst...

19428 sym R (118828 sym/120 pcs) 46 img 25 tbl

Breast Cancer Survival Rate With SEER

12.05.2024

Data Preparation In this project, I have chosen to work on breast cancer. There are various resources available on this topic, with the Surveillance, Epidemiology, and End Results (SEER) [1] program being the most reliable one. The SEER Program of the National Cancer Institute (NCI) collects and publishes cancer data through a coordinated syst...

21644 sym R (124383 sym/150 pcs) 49 img 24 tbl

DATA 607 - SPAM/HAM email classification

29.04.2024

Intro The goal of this project is to work with a database to identify spam emails. Being able to classify new “test” documents using already classified “training” documents is crucial. A common scenario involves using a corpus of labeled spam and ham (non-spam) emails to predict whether a new document is spam or not. For this project, ...

11739 sym Python (25870 sym/66 pcs) 9 img 7 tbl

DATA607 11th Week

07.04.2024

Intro This assignment is to find an interesting recommender system and analyze it. I have chosen to work on Goodreads. What is Goodreads (WIKI): Goodreads is the world’s largest site for readers and book recommendations. It was launched in January 2007 and later acquired by Amazon in 2013. The platform is designed to help people find and sha...

10973 sym 6 img

DATA606 Project Intro

06.04.2024

Data Preparation In this project, I have chosen to work on breast cancer. There are various resources available on this topic, with SEER being the most reliable one. The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI) collects and publishes cancer data through a coordinated...

7934 sym 4 img 8 tbl

DATA607_3rd Project_Final

20.03.2024

Teamwork By: Team <- c(Anthony C., James N., Koohyar P., Victor T.) Introduction (AC) While the presidential election season is in full swing, we decided to explore polling data sources that exist online. There are several individual sources that could be found; however, the website RealClear Politics is a location that gathers, summarizes, and...

8532 sym Python (5155 sym/18 pcs) 5 img 6 tbl

DATA607_3rd Project

18.03.2024

Introduction While the presidential election season is in full swing, we decided to explore polling data sources that exist online. There are several individual sources; however, the website RealClear Politics gathers, summarizes, and presents the results of the various polls in one location. It sho...

5866 sym Python (5574 sym/28 pcs) 5 img 6 tbl

DATA607 9th Week Assignment

17.03.2024

Introduction The goal of this week’s assignment is to work with APIs. We will work with the New York Times website’s rich set of APIs, as described here: New York Times APIs. I first need to establish a secure way of working by signing up for an API key. My next task is to choose one of the New York Times APIs, construct an interface ...

1885 sym Python (4335 sym/25 pcs) 16 img 2 tbl

DATA607 7th Week Assignment

10.03.2024

Introduction The goal of this week’s assignment is to work with HTML, XML, and JSON files. In this process, I have selected three of my favorite books from Amazon and created different files for each. For each book, I have included the title, authors, language, version, publisher, links, and summary. I have created these files myself while le...

3512 sym R (9645 sym/36 pcs) 5 tbl