Publications by Vennila Ramasubramanian
Testing vcfR
This code checkpoint will make sure that you can load .vcf files on your computer. You will also review vocab and concepts related to .vcf files and SNPs. You will need to load and analyze a .vcf file on the final exam. Make notes on all of this material and include it on your notes sheet. Learning objectives This material will appear on the fina...
5487 sym R (4060 sym/26 pcs)
PCA on SNPs data from a vcf file Part 1 - Data Preparation
Introduction In this worked example you will replicate a PCA on a published dataset. The example is split into 2 Parts: Part 1: Data Preparation (this file) Part 2: Data analysis with PCA In this Data Preparation phase, you will do the following things: Load the SNP genotypes in .vcf format (vcfR::read.vcfR()) Extract the genotypes into an R-c...
3418 sym R (8263 sym/34 pcs) 1 img
PCA on SNPs data from a vcf file Part 2 - Data Analysis
Introduction The example is split into 2 Parts: Part 1: Data Preparation Part 2: Data analysis with PCA (this file) Part 1 must be completed first to create a file, SNPs_cleaned.csv, that has been completely prepared for analysis. Now in Part 2, you will analyze the data with PCA. The steps here will be: Center the data (scale()) Run a PCA ana...
2615 sym R (2430 sym/26 pcs) 4 img
Pairwise Alignment for gene BHLHE41 on mice and humans
Global protein alignments in R By: Avril Coghlan. Adapted, edited and expanded: Nathan Brouwer under the Creative Commons 3.0 Attribution License (CC BY 3.0). Preliminaries library(compbio4all) library(Biostrings) Download sequences Human and mouse FASTA files for the gene BHLHE41 are downloaded # Download ## sequence 1: NP_110389.1 human_fasta...
11091 sym R (8343 sym/62 pcs)
BHLHE41
Introduction BHLHE41 (basic helix-loop-helix family member e41) is a gene that may be involved in cell differentiation and control of the circadian rhythm. # Resources https://www.ncbi.nlm.nih.gov/gene/79365 https://www.ncbi.nlm.nih.gov/gene/?Term=ortholog_gene_79365[group] https://www.uniprot.org/uniprot/?query=BHLHE41&sort=score https://alphafold...
717 sym R (21995 sym/55 pcs) 4 img 7 tbl
Predicting amino acid chemistry using regression models
Key vocab proteinogenic amino acids regression model / line of best fit pI confidence intervals (CI) confidence ellipse correlation coefficient Selenocysteine and Pyrrolysine re-coding stop codons y = m*x + b slope intercept Key functions / packages ggpubr pander lm() coef() cor() round() Predict pI for Selenocysteine and Pyrrolysine Amino...
5654 sym R (4813 sym/27 pcs) 1 img 4 tbl
ggplot2 and ggpubr test
ggpubr - allometric data Allometric data - classic case of regression, using logs, using non-linear model too library(compbio4all) Vocab wrapper ggplot2 ggpubr $ operator smoother continuous data categorical data Learning objectives Know what a wrapper is Know the relationship between ggplot2 and ggpubr Be able to run code that makes graphs wit...
4282 sym R (2474 sym/35 pcs) 9 img
downloading, cleaning, and aligning data
The goal of this exercise is to make you familiar with how to download data from Google Sheets and to briefly review some key R functions and coding concepts. We’ll do the following things Packages ## Google sheets download package # comment this out when you are done # install.packages("googlesheets4") library(googlesheets4) # comp ...
4353 sym R (17580 sym/122 pcs) 1 img
Using dotplots in R to investigate sequence repeats
In this exercise we’ll look at a sequence with known tandem repeats. We’ll load the data, explore it in R, then use the dotPlot() function to make various dotplots to see how changing the settings for dotPlot() helps make repeat patterns stand out. Add the necessary code to make this script functional. Preliminaries Load packages library(seqinr...
1545 sym R (7583 sym/68 pcs) 12 img
Introduction to dotplots in R
Sequence dotplots in R By: Avril Coghlan. Adapted, edited and expanded: Nathan Brouwer under the Creative Commons 3.0 Attribution License (CC BY 3.0). NOTE: I’ve added some new material that is rather terse and lacks explication. Good sources of more info: https://omicstutorials.com/interpreting-dot-plot-bioinformatics-with-an-example/ http://r...
4539 sym R (1574 sym/13 pcs) 8 img