Publications by Xianjun Dong
Best way to draw heatmap for publication
Here are two tips if you are also working with a big dataset and want a high-quality heatmap: 1. Don’t generate a PDF using pheatmap() or heatmap.2(), because (i) the file is unnecessarily large when the heatmap has many data points, large enough to crash Illustrator; (ii) annoying grey boxes are added to the grid (...
Download all KEGG pathway KGML files for SPIA analysis
Most people know KEGG pathways, but not everyone knows that a subscription to the database costs at least $2000. To save on that cost, you can manually download the KEGG pathway KGML files and install them in SPIA. Here is a workaround that downloads all KEGG pathway files using the KEGG REST API. ## Claim: this is my personal tric...
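A minimal R sketch of the REST-based download described above. The helper names (kegg_list_url, kegg_kgml_url, download_kgml), the default organism code "hsa", and the output directory are illustrative assumptions, not the author's code. The two URL patterns follow the public KEGG REST interface at rest.kegg.jp.

```r
# Base URL of the public KEGG REST interface.
kegg_base <- "https://rest.kegg.jp"

# URL listing all pathways of one organism (e.g. "hsa" for human).
kegg_list_url <- function(organism = "hsa") {
  sprintf("%s/list/pathway/%s", kegg_base, organism)
}

# URL returning the KGML (XML) file of a single pathway.
kegg_kgml_url <- function(pathway_id) {
  sprintf("%s/get/%s/kgml", kegg_base, pathway_id)
}

# Hypothetical helper: download every KGML file into dest_dir.
# It needs network access, so it is only defined here, not called.
download_kgml <- function(organism = "hsa", dest_dir = "kgml") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  listing <- read.delim(kegg_list_url(organism), header = FALSE,
                        quote = "", stringsAsFactors = FALSE)
  # Older responses prefix the IDs with "path:"; strip it if present.
  ids <- sub("^path:", "", listing[[1]])
  for (id in ids) {
    download.file(kegg_kgml_url(id),
                  file.path(dest_dir, paste0(id, ".xml")),
                  quiet = TRUE)
    Sys.sleep(0.5)  # pause between requests to be gentle on the server
  }
  invisible(ids)
}
```

Calling download_kgml("hsa") would then fill the kgml/ directory with one XML file per pathway. Pointing SPIA at those files is a separate step not shown here.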
PCA plot with fill, color, and shape all together
When I plotted the PCA results (e.g. a scatter plot of PC1 and PC2) and wanted to annotate the dataset with several covariates (e.g. gender, diagnosis, and ethnic group), I noticed that it’s not straightforward to annotate more than two covariates at the same time using ggplot. Here is what works for me in ggplot: pcaData percentVar ggplot(p...
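One way to show three covariates at once, sketched under assumptions: point shapes 21 to 25 have a separate border (color) and interior (fill), so fill, color, and shape can each carry one covariate. The toy pcaData below, its column names, and the covariate labels are made up for illustration. The author's own code is truncated above and may differ.

```r
library(ggplot2)

# Toy PCA scores with three made-up covariates.
set.seed(42)
pcaData <- data.frame(
  PC1       = rnorm(12),
  PC2       = rnorm(12),
  gender    = rep(c("F", "M"), times = 6),
  diagnosis = rep(c("HC", "AD", "PD"), each = 4),
  ethnicity = rep(c("A", "B"), each = 6)
)

p <- ggplot(pcaData,
            aes(PC1, PC2, fill = diagnosis, color = gender, shape = ethnicity)) +
  geom_point(size = 3, stroke = 1.2) +
  # Only shapes 21-25 honor both fill and color.
  scale_shape_manual(values = c(A = 21, B = 24)) +
  # The fill legend needs a fillable shape, otherwise its keys do not show the fill.
  guides(fill = guide_legend(override.aes = list(shape = 21)))
```

Printing p draws the plot. The override.aes step matters because the default legend key shape ignores the fill aesthetic.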
Making Art in R
Amazing artwork people have made in R. See the source code and more art at: http://www.r-graph-gallery.com/286-antonio-sanchez-dataart/
Note for DESeq2 time course analysis
In many cases, we need to test for differential expression across time course data, e.g. finding genes that react in a condition-specific manner over time, compared to a set of baseline samples. DESeq2 has an implementation for such time-course experiments: There are a number of ways to analyze time-series experiments, depending on the biolog...
Note (2) for DESeq2 time series data analysis
More notes on using the LRT to test time-series data. Thanks to Jie for the discussion. Swapping the levels of the time factor won’t change the LRT results: if the time variable is a factor, the LRT does not treat it as a trajectory analysis but rather as a factor analysis (e.g. a condition-specific difference at ANY time point). Subsetting only two time po...
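The point about factor levels can be checked with a small base R analogue. This sketch uses a Poisson glm() on simulated counts rather than DESeq2 itself, so it only illustrates the principle: reordering the levels of a factor reparameterizes the model but leaves the likelihood ratio statistic unchanged. The data, level names, and the lrt() helper are assumptions for illustration.

```r
set.seed(1)

# Simulated counts: 2 conditions x 3 time points x 4 replicates.
d <- expand.grid(rep = 1:4,
                 time = c("0h", "2h", "6h"),
                 condition = c("ctrl", "trt"))
d$y <- rpois(nrow(d), lambda = 20)

# LRT statistic for the condition:time interaction (full vs reduced model).
lrt <- function(dat) {
  full    <- glm(y ~ condition + time + condition:time, family = poisson, data = dat)
  reduced <- glm(y ~ condition + time, family = poisson, data = dat)
  anova(reduced, full, test = "Chisq")$Deviance[2]
}

stat_original <- lrt(d)

# Same data, but with the time levels in a different order.
d2 <- d
d2$time <- factor(d2$time, levels = c("6h", "0h", "2h"))
stat_swapped <- lrt(d2)

# TRUE: the statistic does not depend on the level order.
same_stat <- isTRUE(all.equal(stat_original, stat_swapped))
```

Individual coefficients do change with the reference level. Only the overall full-versus-reduced comparison is invariant.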
A bug related to R factor
I noticed a bug in my code today. Sometimes you need to put a certain level (e.g. healthy control) in the first position for your covariate. Here is my old code: dds[[variable]] = factor(dds[[variable]]); levels(dds[[variable]]) = union(variable_REF, levels(dds[[variable]])). Note that this can cause a problem. For example, say you have two levels, HC and AD, in ...
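A small reproduction of this pitfall on a plain factor, as a sketch. Assigning to levels() renames the existing levels in place instead of reordering them, so the sample labels get swapped. The relevel() fix shown is the standard base R way to move a reference level first. It is an assumption here, since the author's own fix is cut off in the excerpt above.

```r
# A toy factor; levels are sorted alphabetically: "AD", "HC".
x <- factor(c("HC", "AD", "AD", "HC"))

# Old approach: assigning new level names RENAMES them position by position,
# so "AD" becomes "HC" and "HC" becomes "AD".
buggy <- x
levels(buggy) <- union("HC", levels(buggy))
as.character(buggy)   # "AD" "HC" "HC" "AD"  (labels swapped)

# Safer: relevel() moves the reference level first and keeps the data intact.
fixed <- relevel(x, ref = "HC")
levels(fixed)         # "HC" "AD"
as.character(fixed)   # "HC" "AD" "AD" "HC"
```

An equivalent alternative is factor(x, levels = union("HC", levels(x))), which sets the level order when the factor is built rather than renaming afterwards.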
An easy way to convert a list to a long table
Say you have a list of vectors of different lengths, e.g. > head(genesets_list) $KEGG_GLYCOLYSIS_GLUCONEOGENESIS [1] "ACSS2" "GCK" "PGK2" "PGK1" "PDHB" "PDHA1" "PDHA2" "PGM2" "TPI1" "ACSS1" "FBP1" "ADH1B" "HK2" "ADH1C" "HK1" ...
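One base R way to do this conversion, sketched under assumptions: stack() turns a named list of vectors into a two-column data frame with one row per element. The author's own approach is cut off above and may be different. The second gene set and its members are made up, and the first set is shortened to three genes.

```r
# Toy list of gene sets with different lengths.
genesets_list <- list(
  KEGG_GLYCOLYSIS_GLUCONEOGENESIS = c("ACSS2", "GCK", "PGK2"),
  HYPOTHETICAL_SET_B              = c("GENE1", "GENE2")   # made-up set
)

# stack() returns columns "values" (elements) and "ind" (list names).
long <- stack(genesets_list)
names(long) <- c("gene", "geneset")
long$geneset <- as.character(long$geneset)  # ind is a factor by default

long
```

The result has five rows, one per gene, each paired with the name of the set it came from.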