Publications by Di Cook

Plotting individual values within multiple groups together with their means

23.07.2024

In this post I show how groupScatterPlot(), a function of the rnatoolbox R package, can be used to plot the individual values in several groups together with their mean (or other statistics). I think this is a useful function for plotting grouped data when some (or all) groups have few data points! You may be wondering why you would include such...

1946 sym 2 img

Inferring the gender of the subjects from RNAseq BAM files

17.07.2024

In this post I show how the getGender() function can be used to infer the gender of the studied subjects from their binary alignment (BAM) files. Gender can be a source of unwanted variation within the data, for which you may want to adjust your differential gene expression or splicing analysis. However, complete metadata are unfortunately ...

5832 sym 2 img

Assessing the number of mapped reads in several BAM files

16.07.2024

Recently I have started to organize my commonly used functions for quality assessment and analysis of RNAseq data into an R package. It is called rnatoolbox and it is available here. In this post I introduce getMappedReadsCount(), a function that can be used to check the number of aligned/mapped fragments in several BAM files and dete...

4203 sym 2 img

adhan package: retrieving and aligning the prayer times in R

01.04.2024

The adhan package is available here! Prayer times cannot always be estimated accurately in some places, such as countries located at higher latitudes (e.g. the Nordic countries). For instance, around midsummer the Fajr may be impossible to estimate; in other words, it may simply not exist! Some Muslim residents of those countries fo...

3086 sym 2 img

These drinking glasses are too short!

25.02.2023

These drinking glasses are too short! Some of my reinsurance and math teacher friends may remember that when I am out of town and having an adult beverage with friends, I have been known to stare at the drinking glass and say something like, “I don’t mean to be rude, but the glasses are certainly short here. They are much shorter than what ...

4821 sym 14 img

Some different graph types in R

12.02.2023

I don’t know about you, but I get tired of seeing column charts and pie charts. It’s not difficult to create a few more interesting chart types once in a while. Whether these are relevant for a particular audience and truly display your message is a different question. I wanted a really small dataset to experiment in R, so I used numbers of d...

4296 sym 5 img

How do I count thee? Let me count the ways?

17.09.2022

by Jerry Tuttle. In Major League Baseball, a player who hits 50 home runs in a single season has hit a lot of home runs. Suppose I want to count the number of 50-homer seasons by team, and also the number of 50-homer seasons by the New York Yankees. (I will count Maris and Mantle in 1961 as two.) Here is the data including Aar...

6838 sym 4 img

Find the next number in the sequence

29.10.2022

Between ages two and four, most children can count up to at least ten. If you ask your child, “What number comes next after 1, 2, 3, 4, 5?” they will probably say “6.” But to math nerds, any number can be the next number in a finite sequence. I like -14. Given a sequence of n real numbers f(x1), f(x2), f(x3), … , f(xn), t...

2485 sym 2 img
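
The claim in the teaser above, that any number can follow 1, 2, 3, 4, 5, can be checked with a small sketch. It is written in Python and uses Lagrange interpolation, which is one standard way to build a polynomial through a given set of points. The teaser is cut off before it names a method, so this is an assumption and not necessarily the approach the post takes. The function name `lagrange_value` is made up for this illustration.

```python
from fractions import Fraction


def lagrange_value(xs, ys, x):
    """Evaluate at x the polynomial of degree < len(xs) passing through (xs[i], ys[i])."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis term: equals yi at xi and 0 at every other xj.
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


# Positions 1..6; the sixth value is the -14 mentioned in the teaser.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 2, 3, 4, 5, -14]

# The same polynomial reproduces all six values exactly.
fitted = [lagrange_value(xs, ys, x) for x in xs]
print(fitted)
```

Exact fractions are used instead of floating point so that the fitted values match the sequence exactly. Replacing -14 with any other number yields a different polynomial that fits just as well.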

Fixed effects vs. random effects for web browsing data – a simulation

23.06.2022

 When you work with trace data — data that emerge when people interact with technology — you will notice that such data often have properties that open up questions about statistical modelling. I currently work with browsing records, obtained at several times from the same users (i.e., a panel data set). A first typical characteristi...

4516 sym R (2776 sym/7 pcs) 2 img

Fixed vs. random effects for browsing data – a simulation

07.07.2022

 When you work with trace data — data that emerge when people interact with technology — you will notice that such data often have properties that open up questions about statistical modelling. I currently work with browsing records, obtained at several times from the same users (i.e., a panel data set). A first typical character...

4572 sym R (2776 sym/7 pcs) 2 img