Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis

  • Ben Li, Yunxiao Li, Zhaohui S. Qin
  • Statistics in Biosciences, July 2016, Springer Science + Business Media
  • DOI: 10.1007/s12561-016-9156-x

Use historical data to improve detection of differentially expressed genes.

Photo by Markus Spiske on Unsplash

What is it about?

This is a follow-up to the Li et al. 2015 Bioinformatics paper describing IPBT. In this paper, instead of using historical data to construct a gene-specific informative prior as in IPBT, we operate within the classical Bayesian hierarchical model framework. However, rather than borrowing strength across all genes in the genome, we borrow strength only from a small neighborhood of genes that are believed to have similar mean and variance. The neighborhood is defined using historical data. We propose two ways to define a neighborhood: (1) binning, in which genes are ranked by their variance estimates and divided into disjoint bins of equal size; and (2) a sliding window, in which the window size is fixed so that the same number of neighboring genes is selected for each gene. Simulation and real data analyses suggest that these two methods perform similarly to IPBT and much better than other state-of-the-art methods such as LIMMA or SAM, while maintaining high flexibility.
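
The two neighborhood definitions can be illustrated with a minimal sketch. This is not the code from the paper: the function names (`equal_size_bins`, `sliding_windows`), the input `hist_var` (one historical variance estimate per gene), and the handling of edge cases (a smaller final bin, and windows shifted inward at the two ends of the ranking) are all assumptions made for illustration.

```python
import numpy as np


def equal_size_bins(hist_var, bin_size):
    """Hypothetical helper: rank genes by historical variance and cut the
    ranking into disjoint bins. Returns, for each gene, the indices of the
    genes that share its bin."""
    order = np.argsort(hist_var, kind="stable")  # gene indices, low to high variance
    neighborhoods = [None] * len(order)
    for start in range(0, len(order), bin_size):
        members = order[start:start + bin_size]
        # Assumption: if the gene count is not a multiple of bin_size,
        # the last bin is simply smaller.
        for gene in members:
            neighborhoods[gene] = members
    return neighborhoods


def sliding_windows(hist_var, window_size):
    """Hypothetical helper: give every gene its own window of window_size
    genes that are adjacent to it in the variance ranking."""
    order = np.argsort(hist_var, kind="stable")
    n = len(order)
    window_size = min(window_size, n)
    half = window_size // 2
    neighborhoods = [None] * n
    for rank, gene in enumerate(order):
        # Assumption: near either end of the ranking the window is shifted
        # inward so that every gene still gets exactly window_size neighbors.
        start = min(max(rank - half, 0), n - window_size)
        neighborhoods[gene] = order[start:start + window_size]
    return neighborhoods


# Toy data: six genes with made-up historical variance estimates.
hist_var = np.array([0.5, 0.1, 0.9, 0.3, 0.7, 0.2])
bins = equal_size_bins(hist_var, bin_size=3)
windows = sliding_windows(hist_var, window_size=3)
```

In this sketch, binning gives every gene in a bin the same neighborhood, whereas the sliding window gives each gene its own neighborhood roughly centered on its rank. In either case the hierarchical model would then be fitted using only the genes in that neighborhood.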

Read Publication

http://dx.doi.org/10.1007/s12561-016-9156-x

The following have contributed to this page: Dr Zhaohui S. Qin