Publications by dgrapov
Dynamic Data Visualizations in the Browser Using Shiny
After being busy the last two weeks teaching and attending academic conferences, I finally found some time to do what I love: programming data visualizations using R. After being interested in Shiny for a while, I finally decided to pull the trigger and build my first Shiny app! I wanted to make a proof-of-concept app which contained the following dy...
Principal Components Analysis Shiny App
I’ve recently started experimenting with making Shiny apps, and today I wanted to make a basic app for calculating and visualizing principal components analysis (PCA). Here is the basic interface I came up with. Test drive the app for yourself using the code below or check out the R code HERE. library(shiny) runGist("5846650") Above i...
Interactive Heatmaps (and Dendrograms) – A Shiny App
Heatmaps are a great way to visualize data matrices. Heatmap color and organization can be used to encode information about the data and metadata to help learn about the data at hand. An example of this could be looking at the raw data or hierarchically clustering samples and variables based on their similarity or differences. There are a var...
Orthogonal Partial Least Squares (OPLS) in R
I often need to analyze and model very wide data (variables >>> samples), and because of this I gravitate to robust yet relatively simple methods. In my opinion, partial least squares (PLS) is a particularly useful algorithm. Simply put, PLS is an extension of principal components analysis (PCA), a non-supervised method for maximizing variance ex...
Classification with O-PLS-DA
Partial least squares (PLS) is a versatile algorithm which can be used to predict either continuous or discrete/categorical variables. Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis. The PLS-DA algorithm has many favorable properties for dealing with multivariate data; one of the most important of which ...
Tutorials- Statistical and Multivariate Analysis for Metabolomics
I recently had the pleasure of participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using 8 hands-on tutorials which focused on: data quality ov...
High Dimensional Biological Data Analysis and Visualization
High dimensional biological data shares many qualities with other forms of data. Typically it is wide (samples << variables), complicated by experimental design and made up of complex relationships driven by both biological and analytical sources of variance. Luckily, the powerful combination of R, Cytoscape (< v3) and the R package RCytoscape can...
Choose Your Own Data Adventure
The question is: can we automate scientific discovery, and what might an interface to such a tool look like? I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data, mostly using R and LaTeX. You can see some of my progress and challenges encountered in the presentation below....
Enrichment Network
Enrichment is occurrence within a category beyond what random chance would produce. Networks can represent relationships among variables. Enrichment networks display relationships among variables which are overrepresented compared to random chance. Next is a tutorial for making enrichment networks for biological (metabolomic) data in R using the KEGG database. Rela...
Using Repeated Measures to Remove Artifacts from Longitudinal Data
Recently I was tasked with evaluating and, most importantly, removing analytical variance from a longitudinal metabolomic analysis carried out over a few years and including >2,5000 measurements for >5,000 patients. Even using state-of-the-art analytical instruments and techniques, long-term biological studies are plagued with unwanted trends whic...