Publications by Xianjun Dong

Best way to draw heatmap for publication

08.07.2016

Here are two tips I can share if you are also working with a big dataset and want a high-quality heatmap: 1. Don’t generate a PDF using pheatmap() or heatmap.2(), because (i) the file is unnecessarily large if the heatmap has a lot of data points, which can crash Illustrator; (ii) annoying grey boxes are added to the grid (...


Download all KEGG pathway KGML files for SPIA analysis

08.06.2018

Most people know the KEGG pathway database, but not everyone knows that a subscription costs at least $2000. If you want to save on that cost, you can manually download the KEGG pathway KGML files and install them in SPIA. Here is a workaround to download all KEGG pathway files using the KEGG REST API. ## Claim: this is my personal tric...


PCA plot with fill, color, and shape all together

25.09.2018

When I plotted PCA results (e.g. a scatter plot of PC1 and PC2) and wanted to annotate the dataset with different covariates (e.g. gender, diagnosis, and ethnic group), I noticed that it’s not straightforward to annotate more than two covariates at the same time using ggplot. Here is what works for me in ggplot: pcaData percentVar ggplot(p...
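
The code in the excerpt above is cut off, so the following is only a minimal sketch of one way to map three covariates at once, not necessarily the approach in the original post. It relies on the fact that point shapes 21 to 25 have both an outline (`colour`) and an interior (`fill`). The data frame `pca_df` and its values are hypothetical, and it assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Hypothetical PCA scores with three covariates (made-up values)
set.seed(1)
pca_df <- data.frame(
  PC1       = rnorm(12),
  PC2       = rnorm(12),
  gender    = rep(c("F", "M"), times = 6),
  diagnosis = rep(c("HC", "AD", "PD"), each = 4),
  ethnicity = rep(c("A", "B"), each = 6)
)

p <- ggplot(pca_df,
            aes(x = PC1, y = PC2,
                fill = diagnosis, colour = gender, shape = ethnicity)) +
  geom_point(size = 3, stroke = 1) +
  # Shapes 21-25 are the only point shapes that honour both fill and colour
  scale_shape_manual(values = c(21, 24)) +
  # The fill legend defaults to a solid shape that ignores fill, so override it
  guides(fill = guide_legend(override.aes = list(shape = 21)))
```

Printing `p` draws the plot. Without the `guides()` override, the fill legend keys would all appear in the same colour.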


Making Art in R

15.10.2018

Amazing artworks people made in R. See their source code and more art at: http://www.r-graph-gallery.com/286-antonio-sanchez-dataart/


Note for DESeq2 time course analysis

19.04.2021

In many cases, we need to perform differential expression analysis across time course data, e.g. finding genes that react in a condition-specific manner over time, compared to a set of baseline samples. DESeq2 has such an implementation for time-course experiments: There are a number of ways to analyze time-series experiments, depending on the biolog...


Note (2) for DESeq2 time series data analysis

02.07.2021

More notes on using LRT to test time-series data. Thanks to Jie for the discussion. Swapping the levels of the time factor won’t change the LRT results: if the time variable is a factor, LRT treats it not as a trajectory analysis but as a factor analysis (e.g. condition-specific difference at ANY time point). Subsetting only two time po...


A bug related to R factor

07.10.2021

I found a bug in my code today. Sometimes you need to put a certain level (e.g. healthy control) in the first position for your covariate. Here is my old code: dds[[variable]] = factor(dds[[variable]]); levels(dds[[variable]]) = union(variable_REF, levels(dds[[variable]])) Note that this can cause a problem. For example, you have two levels: HC and AD in ...
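
The excerpt is cut off before the fix, so here is a small base-R sketch of why assigning to `levels()` is risky and two safer alternatives. The vector `x` is hypothetical toy data. Assigning to `levels()` renames the existing levels in place, whereas `relevel()` or rebuilding the factor with an explicit `levels =` argument only changes their order.

```r
# Hypothetical toy data with two levels: HC (healthy control) and AD
x <- factor(c("AD", "HC", "AD", "HC"))
levels(x)                      # default order is alphabetical: AD, then HC

# Buggy pattern: assigning to levels() renames the levels position by position,
# so every AD sample becomes HC and every HC sample becomes AD
bad <- x
levels(bad) <- union("HC", levels(bad))
as.character(bad)

# Safer: relevel() moves the reference level to the front, labels untouched
good <- relevel(x, ref = "HC")

# Equivalent: rebuild the factor with an explicit level order
good2 <- factor(x, levels = union("HC", levels(x)))
```

Both `good` and `good2` keep each sample's original label and only put `HC` first, which is what a reference level for a design formula needs.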


An easy way to convert a list to a long table

11.12.2021

Say you have a list with vectors of different lengths, e.g.

> head(genesets_list)
$KEGG_GLYCOLYSIS_GLUCONEOGENESIS
 [1] "ACSS2" "GCK" "PGK2" "PGK1" "PDHB" "PDHA1" "PDHA2" "PGM2" "TPI1" "ACSS1" "FBP1" "ADH1B" "HK2" "ADH1C" "HK1"...
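
The rest of the post is cut off here, so the following is just one base-R way to do this conversion, not necessarily the one the post describes. It uses `stack()`, which turns a named list of vectors into a two-column data frame. The list contents and the column names `gene` and `geneset` are hypothetical.

```r
# Hypothetical toy list: named vectors of different lengths
genesets_list <- list(
  SET_A = c("gene1", "gene2", "gene3"),
  SET_B = c("gene4", "gene5")
)

# stack() returns one row per element, with columns `values` and `ind`
long <- stack(genesets_list)

# Rename the columns to something more descriptive (names are my own choice)
names(long) <- c("gene", "geneset")
long
```

Note that the `geneset` column comes back as a factor; wrap it in `as.character()` if plain strings are needed.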
