Publications by dgrapov

Dynamic Data Visualizations in the Browser Using Shiny

16.06.2013

After two busy weeks of teaching and attending academic conferences, I finally found some time to do what I love: program data visualizations using R. Having been interested in Shiny for a while, I decided to pull the trigger and build my first Shiny app! I wanted to make a proof-of-concept app which contained the following dy...

2539 sym R (34 sym/1 pcs) 12 img

Principal Components Analysis Shiny App

23.06.2013

I’ve recently started experimenting with making Shiny apps, and today I wanted to make a basic app for calculating and visualizing principal components analysis (PCA). Here is the basic interface I came up with. Test drive the app for yourself using the code below or check out the R code HERE. library(shiny) runGist("5846650") Above i...

1455 sym R (34 sym/1 pcs) 10 img
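
As a rough illustration of the kind of calculation a PCA app like the one above performs, here is a minimal sketch in base R. The dataset (iris), the decision to center and scale, and all variable names are assumptions made for illustration; none of this is taken from the app's gist.

```r
# Minimal PCA sketch (assumed example data: the built-in iris measurements)
data(iris)
x <- iris[, 1:4]                      # numeric columns only

# Center and scale each variable, then compute principal components
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Sample scores on the first two components
scores <- pca$x[, 1:2]

# Scores plot colored by species (species is metadata, not used in the PCA)
plot(scores,
     col  = as.integer(iris$Species),
     pch  = 19,
     xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]))
```

Scaling is a choice rather than a requirement: it keeps variables measured on larger numeric ranges from dominating the leading components.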

Interactive Heatmaps (and Dendrograms) – A Shiny App

07.07.2013

Heatmaps are a great way to visualize data matrices. Heatmap color and organization can be used to encode information about the data and metadata to help learn about the data at hand. An example of this could be looking at the raw data or hierarchically clustering samples and variables based on their similarities or differences. There are a var...

3590 sym R (533 sym/1 pcs) 14 img
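
The hierarchical clustering mentioned in the heatmap entry above can be sketched in base R as follows. The dataset (mtcars), the Euclidean distance, and the average linkage are assumptions chosen for illustration and are not necessarily what the app uses.

```r
# Minimal clustered heatmap sketch (assumed example data: built-in mtcars)
data(mtcars)
m <- scale(as.matrix(mtcars))         # autoscale columns so units are comparable

# Cluster samples (rows) and variables (columns) separately
row_hc <- hclust(dist(m),    method = "average")
col_hc <- hclust(dist(t(m)), method = "average")

# Draw the heatmap using the precomputed dendrograms;
# scale = "none" because the matrix was already scaled above
heatmap(m,
        Rowv  = as.dendrogram(row_hc),
        Colv  = as.dendrogram(col_hc),
        scale = "none")
```

Computing the dendrograms outside of heatmap() makes the distance and linkage choices explicit, so they are easy to swap when exploring how the organization of the plot changes.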

Orthogonal Partial Least Squares (OPLS) in R

28.07.2013

I often need to analyze and model very wide data (variables >>> samples), and because of this I gravitate to robust yet relatively simple methods. In my opinion, partial least squares (PLS) is a particularly useful algorithm. Simply put, PLS is an extension of principal components analysis (PCA), an unsupervised method for maximizing variance ex...

1704 sym 4 img

Classification with O-PLS-DA

29.09.2013

Partial least squares (PLS) is a versatile algorithm which can be used to predict either continuous or discrete/categorical variables. Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis. The PLS-DA algorithm has many favorable properties for dealing with multivariate data; one of the most important of which ...

2483 sym 4 img

Tutorials- Statistical and Multivariate Analysis for Metabolomics

17.02.2014

I recently had the pleasure of participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using 8 hands-on tutorials which focused on: data quality ov...

1554 sym 6 img

High Dimensional Biological Data Analysis and Visualization

22.02.2014

High dimensional biological data shares many qualities with other forms of data. Typically it is wide (samples << variables), complicated by experimental design, and made up of complex relationships driven by both biological and analytical sources of variance. Luckily, the powerful combination of R, Cytoscape (< v3) and the R package RCytoscape can...

1655 sym 14 img

Choose Your Own Data Adventure

05.04.2014

The question is: can we automate scientific discovery, and what might an interface to such a tool look like? I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data, mostly using R and LaTeX. You can see some of my progress and challenges encountered in the presentation below....

2689 sym 12 img

Enrichment Network

10.05.2014

Enrichment is occurrence within a category beyond what random chance would produce. Networks can represent relationships among variables. Enrichment networks display relationships among variables which are overrepresented compared to random chance. What follows is a tutorial for making enrichment networks for biological (metabolomic) data in R using the KEGG database. Rela...

762 sym 4 img

Using Repeated Measures to Remove Artifacts from Longitudinal Data

04.06.2014

Recently I was tasked with evaluating and, most importantly, removing analytical variance from a longitudinal metabolomic analysis carried out over a few years and including >2,5000 measurements for >5,000 patients. Even using state-of-the-art analytical instruments and techniques, long-term biological studies are plagued with unwanted trends whic...

3933 sym 14 img