Publications by dgrapov

Multivariate Data Analysis Work Flow

02.08.2012

Here is an example of a data analysis work flow supported in imDEV. This network visualization was made using CmapTools.

515 sym 4 img

Discriminating Between Iris Species

04.08.2012

The Iris data set is famous for its use in comparing unsupervised classifiers. The goal is to use information about flower characteristics to accurately classify the 3 species of Iris. We can look at scatter plots of the 4 variables in the data set and see that no single variable or bivariate combination can achieve this. One approach t...

2045 sym 10 img

Excel + Cytoscape + R = ExCytR

16.11.2012

My new project is coming along nicely and should be released in early 2013. It builds on the structures developed in imDEV to link Excel, Cytoscape and R using RExcel, RCytoscape, and CytoscapeRPC. This trio can be used to rapidly generate beautiful and informative network representations of data. Here is an example of an undirected Gaussian...

1327 sym 4 img

ExCytR Concept

01.12.2012

The concept is to make a GUI that provides both static and dynamic links between data and its network representations. Static access will involve making networks based on data and metadata stored in some table or spreadsheet. Dynamic control will provide interactive access to network construction and annotation properties. Together, these will pr...

2154 sym 6 img

Anaerobic Stress in Seeds – A Chemical Similarity Network Story

31.12.2012

The chemical similarity network, or CSN, is a great tool for organizing biological data based on known biochemistry or chemical structural similarity. Here is an example CSN for visualizing metabolomic changes (measured via GC/TOF) due to anaerobic stress in germinating seeds. In this network, edges are formed for chemical similarity scores > 75...

2040 sym R (635 sym/1 pcs) 8 img
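
The edge rule described in this excerpt (connect two compounds when their similarity score exceeds a cutoff) can be sketched in a few lines. This is a minimal Python illustration and not the author's R code: the Tanimoto score over sets of fingerprint bits, the function names, the made-up fingerprints, and the cutoff of 0.5 are all assumptions chosen for the example. The excerpt's own cutoff is truncated, so it is not reproduced here.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_edges(fingerprints, threshold):
    """Return (node_a, node_b, score) for every pair scoring above threshold."""
    edges = []
    pairs = combinations(sorted(fingerprints.items()), 2)
    for (name_a, fp_a), (name_b, fp_b) in pairs:
        score = tanimoto(fp_a, fp_b)
        if score > threshold:
            edges.append((name_a, name_b, score))
    return edges


# Hypothetical fingerprints: made-up bit positions, not real structure data.
fingerprints = {
    "metabolite_A": {1, 2, 3, 4},
    "metabolite_B": {1, 2, 3, 5},
    "metabolite_C": {7, 8, 9},
}

# The cutoff of 0.5 is arbitrary and only serves this example.
edges = similarity_edges(fingerprints, threshold=0.5)
print(edges)
```

The resulting edge list is the kind of table a network tool can import directly, with the score available as an edge attribute.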

Power Calculations – relationship between test power, effect size and sample size

17.01.2013

I was interested in modeling the relationship between power and sample size, while holding the significance level constant (α = 0.05), for the common two-sample t-test. Luckily, R has great support for power analysis, and I found the function I was looking for in the package pwr. To calculate the power for the two-sample t-test at different...

2170 sym R (2205 sym/3 pcs) 4 img
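
To make the power versus sample size relationship concrete, here is a minimal Python sketch. It is not the post's pwr code: it uses the normal approximation to the two-sided, two-sample test with equal group sizes, so its values run slightly higher than an exact noncentral-t calculation at small n. The function name and the effect size of 0.5 are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with equal group sizes.

    Normal approximation: the test statistic is treated as normal with mean
    effect_size * sqrt(n_per_group / 2) and unit variance.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power at a fixed (assumed) effect size for increasing group sizes.
for n in (10, 20, 40, 80):
    print(n, round(approx_power_two_sample(0.5, n), 3))
```

With the significance level held fixed, power rises monotonically with the per-group sample size, and with an effect size of zero it reduces to the significance level itself.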

Data analysis approaches to modeling changes in primary metabolism

31.01.2013

View this document on Scribd

423 sym 2 img

PCA to PLS modeling analysis strategy for WIDE DATA

02.03.2013

Working with wide data is already hard enough; add row outliers to this and things can get murky fast. Here is an example of an analysis of a wide data set, 24 rows x 84 columns, using imDEV, written in R, to calculate and visualize a principal components analysis (PCA). We find that 7 components capture >80% of the variance i...

4589 sym 14 img
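
A statement such as "k components capture more than 80% of the variance" comes from the cumulative share of the PCA eigenvalues. The Python sketch below shows that bookkeeping only; it is not imDEV's implementation, and the function name and eigenvalues are hypothetical.

```python
from itertools import accumulate


def components_needed(eigenvalues, target=0.80):
    """Smallest number of leading components whose cumulative share of the
    total variance exceeds target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total > target:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only (not from the post's data).
eigenvalues = [5.0, 2.0, 1.5, 1.0, 0.5]
print(components_needed(eigenvalues))
```

Here the cumulative shares are 0.50, 0.70, and 0.85, so three components are the first to pass the 80% mark.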

Evaluation of Orthogonal Signal Correction for PLS modeling (OSC-PLS and OPLS)

15.03.2013

Partial least squares projection to latent structures, or PLS, is one of my favorite modeling algorithms. PLS is an optimal algorithm for predictive modeling using wide data, or data with rows << variables. While there is a wealth of literature regarding the application of PLS to various tasks, I find it especially useful for biological data w...

5197 sym 14 img

Tutorial- Building Biological Networks

04.04.2013

I love networks! Nothing is better for visualizing complex multivariate relationships, be they social, virtual or biological. I recently gave a hands-on network building tutorial using R and Cytoscape to build large biological networks. In these networks, nodes represent metabolites and edges can be many things, but I specifically focused on biochemi...

1320 sym R (1560 sym/2 pcs) 8 img