Publications by Joel Predix
Predicting Pittsburgh Housing Prices
Stat 1361 Code for Final Project. 1 Objective: The objective of this project is to predict housing prices in the Pittsburgh area from historical data. Several models were fit to get accurate predictions: Linear Regression, Ridge Regression, Lasso Regression, Bagged Tree, and Random Forest. Mean Squared Error ...
4463 sym R (21122 sym/55 pcs) 16 img
Predicting Pittsburgh Housing Prices
1 Objective: The objective of this project is to predict housing prices in the Pittsburgh area from historical data. Several models were fit to get accurate predictions: Linear Regression, Ridge Regression, Lasso Regression, Bagged Tree, and Random Forest. Mean Squared Error (MSE) was then ultimately used to ...
4428 sym R (21122 sym/55 pcs) 16 img
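The excerpt is cut off before it says how MSE was used, so the following is only a minimal sketch of computing MSE on held-out data. It assumes the built-in `mtcars` data as a stand-in for the housing data, a simple odd/even row split, and a hypothetical helper named `mse`; none of these come from the project itself.

```r
# Stand-in data: built-in mtcars (an assumption; the project's housing
# data is not shown in this excerpt).
train <- mtcars[seq(1, nrow(mtcars), by = 2), ]  # odd rows
test  <- mtcars[seq(2, nrow(mtcars), by = 2), ]  # even rows

# Hypothetical helper: mean of the squared prediction errors
mse <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# Fit a linear regression on the training rows only
fit <- lm(mpg ~ wt + hp, data = train)

# Evaluate on the held-out rows
test_mse <- mse(test$mpg, predict(fit, newdata = test))

# Reference point: always predicting the training mean
baseline_mse <- mse(test$mpg, mean(train$mpg))

print(c(model = test_mse, baseline = baseline_mse))
```

The same `mse` helper could be applied to predictions from any of the listed model types, which makes their held-out errors directly comparable.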
temp
R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within t...
591 sym 1 img
Under the hood of PCA
This portfolio will walk you through using PCA for what I call pseudo-cluster analysis. You will also examine the scores generated by PCA, how they correlate with each other, and how they correlate with the original data features. This will help illustrate how the vectors in a biplot relate to the original data and the layout of the points with...
8829 sym R (8580 sym/36 pcs) 16 img
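A minimal sketch of the score correlations described in this entry, assuming base R's `prcomp()` and the numeric columns of the built-in `iris` data as a stand-in (the portfolio's own data and PCA function are not shown in the excerpt):

```r
# Stand-in data: the four numeric columns of the built-in iris dataset
# (an assumption; the portfolio's own data is not shown in this excerpt).
features <- iris[, 1:4]

# PCA on centered, scaled features using base R's prcomp()
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# PCA scores: one row per observation, one column per principal component
scores <- pca$x

# Correlation of the scores with each other: the components are
# uncorrelated, so the off-diagonal entries are essentially zero
score_cor <- round(cor(scores), 3)

# Correlation of each original feature with each component's scores
feature_cor <- round(cor(features, scores), 3)

print(score_cor)
print(feature_cor)
```

Features that correlate strongly with the first two components are the ones whose vectors tend to appear long in a PC1/PC2 biplot.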
The vegan package for PCA
Preliminaries Download the vegan package Only do this once, then comment it out of the script. You may have already done this for a previous assignment. #install.packages("vegan") Load the libraries library(ggplot2) library(vegan) ## Loading required package: permute ## Loading required package: lattice ## This is vegan 2.6-4 Load the msleep...
2190 sym R (2207 sym/21 pcs) 5 img
Code Check Point: Testing vcfR
This code checkpoint will make sure that you can load .vcf files on your computer. You will also review vocab and concepts related to .vcf files and SNPs. You will need to load and analyze a .vcf file on the final exam. Make notes on all of this material and include it on your notes sheet. Learning objectives This material will appear on the fina...
5486 sym R (3991 sym/25 pcs)
Setting a working directory in R
Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will know how to: set a working directory in RStudio; confirm the location of the working directory with getwd(); confirm a file is present with list.files(pattern = ...); load ty...
4460 sym 2 img
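The workflow in this entry can be sketched in code, assuming `setwd()` as the scripted equivalent of setting the directory through RStudio. The folder `wd_demo` and the file `example_data.csv` are hypothetical names created only for this illustration.

```r
# Hypothetical setup: a scratch folder holding one example file
# (both names are made up for illustration).
work_dir <- file.path(tempdir(), "wd_demo")
dir.create(work_dir, showWarnings = FALSE)
writeLines("placeholder", file.path(work_dir, "example_data.csv"))

# Set the working directory; setwd() invisibly returns the previous one
old_wd <- setwd(work_dir)

# Confirm the location of the working directory
current_wd <- getwd()
print(current_wd)

# Confirm the file is present; pattern is a regular expression
found <- list.files(pattern = "\\.csv$")
print(found)

# Restore the original working directory
setwd(old_wd)
```

Saving the value returned by `setwd()` makes it easy to switch back afterwards, so the sketch does not leave the session pointing at a temporary folder.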
Preparing VCF Data for Analysis: Transposition
NOTE: before you begin, make sure your WORKING DIRECTORY is set to the location of the .vcf file being used. Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Set a working directory and confirm a file is present with getwd() and list.files(pattern = ...). Know what it means t...
4792 sym R (8533 sym/41 pcs)
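As a rough illustration of the transposition named in this entry's title, here is a sketch using base R's `t()` on a toy matrix. The matrix, its values, and its row and column names are invented; the lesson's own data and steps are not shown in the excerpt.

```r
# Toy genotype matrix laid out like VCF data: rows are SNPs,
# columns are samples (values are invented for illustration).
geno <- matrix(
  c(0, 1, 2,
    1, 1, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("snp1", "snp2"), c("sampleA", "sampleB", "sampleC"))
)

# Transpose so each row is a sample and each column is a SNP
geno_t <- t(geno)

print(dim(geno))    # SNPs by samples
print(dim(geno_t))  # samples by SNPs
print(geno_t)
```

Many analysis functions in R expect one row per observation, which is why a samples-by-SNPs layout is often the more convenient one.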
Loading VCF File
library(vcfR) ## ## ***** *** vcfR *** ***** ## This is vcfR 1.13.0 ## browseVignettes('vcfR') # Documentation ## citation('vcfR') # Citation ## ***** ***** ***** ***** snps <- vcfR::read.vcfR("10.29000-269000.ALL.chr10_GRCh38.genotypes.20170504.vcf.gz", convertNA ...
13 sym R (39239 sym/10 pcs)
Removing Fixed Alleles from SNP data
Learning objectives: This lesson introduces the concept of invariant columns and why they should be removed. It also provides a function to remove them. All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will understand what can lead to a column of SNP...
7498 sym R (7187 sym/62 pcs)
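The lesson's own function is not shown in this excerpt. The following is a separate, minimal sketch of one way to drop invariant columns in base R; the helper name `drop_invariant_cols` and the toy data are hypothetical.

```r
# Toy genotype-like matrix: rows are samples, columns are SNPs
# (values are invented for illustration).
snp_scores <- cbind(
  snp1 = c(0, 1, 2, 1),
  snp2 = c(2, 2, 2, 2),  # invariant: every sample has the same value
  snp3 = c(0, 0, 1, 0),
  snp4 = c(1, 1, 1, 1)   # invariant
)

# Hypothetical helper: keep only columns with more than one distinct value
drop_invariant_cols <- function(x) {
  n_distinct <- apply(x, 2, function(col) length(unique(col)))
  x[, n_distinct > 1, drop = FALSE]
}

snp_variable <- drop_invariant_cols(snp_scores)
print(colnames(snp_variable))
```

One common reason to do this is that an invariant column has zero variance, so it cannot be rescaled to unit variance before a method such as PCA.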