Publications by Joel Predix

Predicting Pittsburgh Housing Prices

02.05.2023

Stat 1361 Code for Final Project. 1 Objective: The objective of this project is to predict housing prices in the Pittsburgh area based on historical data. Various data science models were used to get accurate results. Models used include: Linear Regression, Ridge Regression, Lasso Regression, Bagged Tree, and Random Forest. Mean Squared Error ...

4463 sym R (21122 sym/55 pcs) 16 img

Predicting Pittsburgh Housing Prices

02.05.2023

1 Objective: The objective of this project is to predict housing prices in the Pittsburgh area based on historical data. Various data science models were used to get accurate results. Models used include: Linear Regression, Ridge Regression, Lasso Regression, Bagged Tree, and Random Forest. Mean Squared Error (MSE) was then ultimately used to ...

4428 sym R (21122 sym/55 pcs) 16 img

temp

03.11.2022

R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within t...

591 sym 1 img

Under the hood of PCA

11.11.2022

This portfolio will walk you through using PCA for what I call pseudo-cluster analysis. You will also examine the scores generated by PCA, how they correlate with each other, and how they correlate with the original data features. This will help illustrate how the vectors in a biplot relate to the original data and the layout of the points with...

8829 sym R (8580 sym/36 pcs) 16 img

The vegan package for PCA

14.11.2022

Preliminaries. Download the vegan package: only do this once, then comment it out of the script. You may have already done this for a previous assignment. #install.packages("vegan") Load the libraries: library(ggplot2) library(vegan) ## Loading required package: permute ## Loading required package: lattice ## This is vegan 2.6-4 Load the msleep...

2190 sym R (2207 sym/21 pcs) 5 img

Code Check Point: Testing vcfR

21.11.2022

This code checkpoint will make sure that you can load .vcf files on your computer. You will also review vocab and concepts related to .vcf files and SNPs. You will need to load and analyze a .vcf file on the final exam. Make notes on all of this material and include it on your notes sheet. Learning objectives: This material will appear on the fina...

5486 sym R (3991 sym/25 pcs)

Setting a working directory in R

21.11.2022

Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will know how to: set a working directory in RStudio; confirm the location of the working directory with getwd(); confirm a file is present with list.files(pattern = ...); load ty...

4460 sym 2 img

Preparing VCF Data for Analysis: Transposition

01.12.2022

NOTE: before you begin, make sure your WORKING DIRECTORY is set to the location of the .vcf file being used. Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Set a working directory and confirm a file is present with getwd() and list.files(pattern = ...). Know what it means t...

4792 sym R (8533 sym/41 pcs)

Loading VCF File

02.12.2022

library(vcfR) ## ## ***** *** vcfR *** ***** ## This is vcfR 1.13.0 ## browseVignettes('vcfR') # Documentation ## citation('vcfR') # Citation ## ***** ***** ***** ***** snps <- vcfR::read.vcfR("10.29000-269000.ALL.chr10_GRCh38.genotypes.20170504.vcf.gz", convertNA ...

13 sym R (39239 sym/10 pcs)

Removing Fixed Alleles from SNP data

02.12.2022

Learning objectives: This lesson introduces the concept of invariant columns and why they should be removed. It also provides a function to remove them. All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will understand what can lead to a column of SNP...

7498 sym R (7187 sym/62 pcs)