Predicting Pittsburgh Housing Prices
Stat 1361 Code for Final Project. Objective: The objective of this project is to predict housing prices in the Pittsburgh area based on historical data. Several models were fit and compared: linear regression, ridge regression, lasso regression, bagged trees, and random forest. Mean Squared Error (MSE) was then ultimately used to ...
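
As a rough illustration of scoring a model by MSE, here is a minimal sketch in base R. The housing data is not shown in this excerpt, so the built-in mtcars data set stands in for it, and the mse() helper and the 70/30 split are assumptions made for the example, not the project's actual code.

```r
# Sketch only: mtcars is a stand-in for the housing data.
set.seed(1)
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.7 * n))  # assumed 70/30 split
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Hypothetical helper: mean of the squared prediction errors.
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Linear regression fit on the training rows only.
fit_lm <- lm(mpg ~ wt + hp, data = train)
mse_lm <- mse(test$mpg, predict(fit_lm, newdata = test))

# Baseline for comparison: always predict the training mean.
mse_baseline <- mse(test$mpg, mean(train$mpg))

c(linear = mse_lm, baseline = mse_baseline)
```

The same mse() call could be applied to predictions from any of the other models, which makes the test-set errors directly comparable.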

Under the hood of PCA
This portfolio will walk you through using PCA for what I call pseudo-cluster analysis. You will also examine the scores generated by PCA, how they correlate with one another, and how they correlate with the original data features. This will help illustrate how the vectors in a biplot relate to the original data and the layout of the points with...
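
The two correlation checks described above can be sketched with base R's prcomp(). This is only an illustration: the portfolio's own data is not shown in this excerpt, so the built-in iris measurements are used as a stand-in, and the object names are assumptions.

```r
# Sketch only: iris stands in for the portfolio's data.
X <- iris[, 1:4]

# Centre and scale so each feature contributes equally.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pca$x  # one row per observation, one column per PC

# Scores on different components are uncorrelated with one another,
# so this matrix is (numerically) the identity.
score_cor <- cor(scores)
round(score_cor, 3)

# Correlation of each original feature with each component.
# Features with large values on PC1/PC2 have long biplot arrows.
feature_cor <- cor(X, scores)
round(feature_cor, 2)

# biplot(pca)  # uncomment to draw the biplot
```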
The vegan package for PCA
Preliminaries. Download the vegan package: only do this once, then comment it out of the script. You may have already done this for a previous assignment. #install.packages("vegan") Load the libraries: library(ggplot2) library(vegan) ## Loading required package: permute ## Loading required package: lattice ## This is vegan 2.6-4 Load the msleep...
Code Check Point: Testing vcfR
This code checkpoint will make sure that you can load .vcf files on your computer. You will also review vocabulary and concepts related to .vcf files and SNPs. You will need to load and analyze a .vcf file on the final exam. Make notes on all of this material and include it on your notes sheet. Learning objectives: This material will appear on the fina...
Setting a working directory in R
Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will know how to set a working directory in RStudio; confirm the location of the working directory with getwd(); confirm a file is present with list.files(pattern = ...); load ty...
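
The workflow listed above can be tried out safely with a throwaway folder. This is a minimal sketch: the temporary directory and the empty placeholder file named example.vcf are made up for the demonstration and are not part of the lesson's materials.

```r
# Remember where we started so we can go back afterwards.
old_wd <- getwd()

# Hypothetical demo folder containing an empty placeholder file.
demo_dir <- tempfile("wd_demo_")
dir.create(demo_dir)
file.create(file.path(demo_dir, "example.vcf"))

# Set the working directory, then confirm its location.
setwd(demo_dir)
getwd()

# Confirm the file is present; the pattern is a regular expression.
found <- list.files(pattern = "\\.vcf$")
found

# Restore the original working directory.
setwd(old_wd)
```

In RStudio the same change can be made from the Session menu, but calling getwd() and list.files() afterwards is still a useful check.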
Preparing VCF Data for Analysis: Transposition
NOTE: before you begin, make sure your WORKING DIRECTORY is set to the location of the .vcf file being used. Learning objectives: All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Set a working directory and confirm a file is present with getwd() and list.files(pattern = ...). Know what it means t...
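
Going by the lesson title, the key step is transposing the genotype matrix. A small sketch with base R's t() shows the idea. The toy matrix below is invented for illustration; it assumes variants are in rows and samples are in columns before transposition, and that most analysis functions expect one row per sample.

```r
# Toy genotype matrix (made up): rows are SNPs, columns are samples.
geno <- matrix(c(0, 1, 2,
                 1, 1, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("snp1", "snp2"),
                               c("sampleA", "sampleB", "sampleC")))

# t() swaps rows and columns, giving one row per sample.
geno_t <- t(geno)

dim(geno)    # 2 rows (SNPs) by 3 columns (samples)
dim(geno_t)  # 3 rows (samples) by 2 columns (SNPs)
geno_t
```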
Loading VCF File
library(vcfR) ## ## ***** *** vcfR *** ***** ## This is vcfR 1.13.0 ## browseVignettes('vcfR') # Documentation ## citation('vcfR') # Citation ## ***** ***** ***** ***** snps <- vcfR::read.vcfR("10.29000-269000.ALL.chr10_GRCh38.genotypes.20170504.vcf.gz", convertNA ...
Removing Fixed Alleles from SNP data
Learning objectives: This lesson introduces the concept of invariant columns and why they should be removed. It also provides a function to remove them. All of this material will appear on the exam. Take notes on the workflow, functions, and concepts. Main objectives: By the end of this lesson you will understand what can lead to a column of SNP...
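
To make the idea of an invariant column concrete, here is a minimal sketch. The remove_invariant() helper and the toy matrix are hypothetical and written for this illustration; they are not the function provided in the lesson. A column counts as invariant here if every non-missing value in it is the same, which is a problem for methods that scale each column by its standard deviation, because that standard deviation is zero.

```r
# Hypothetical helper: keep only columns with more than one
# distinct non-missing value.
remove_invariant <- function(x) {
  keep <- apply(x, 2, function(col) length(unique(col[!is.na(col)])) > 1)
  x[, keep, drop = FALSE]
}

# Toy SNP matrix (made up): rows are samples, columns are SNPs.
# snp2 is fixed: every sample has the same genotype.
snp <- matrix(c(0, 1, 2,
                1, 1, 1,
                2, 0, 2),
              nrow = 3,
              dimnames = list(paste0("sample", 1:3),
                              c("snp1", "snp2", "snp3")))

cleaned <- remove_invariant(snp)
colnames(cleaned)
```

Using drop = FALSE keeps the result a matrix even if only one column survives.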
