What is it about?

In this work, we develop a statistical pipeline that infers fitness values by tracking the relative abundances of microbial strains via high-throughput sequencing. Experimentally, we can follow the dynamics of a growing culture containing many uniquely identified strains as they evolve. From these abundance trajectories, we want to learn the relative fitness of each strain. We use Bayesian statistics not only to extract these fitness values but also to report how uncertain we are about the inferred parameters.
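
To make the idea concrete, here is a minimal sketch in Python. It is not the pipeline from the paper. It assumes a deliberately simple toy model in which the log ratio of a strain's frequency to that of a neutral reference strain grows by the relative fitness s at each time step, with independent Gaussian noise of known size and a Gaussian prior on s. The function name, the noise and prior settings, and the simulated frequencies are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fitness_posterior(freq_strain, freq_neutral, noise_sd=0.1, prior_sd=1.0):
    """Posterior mean and standard deviation of relative fitness s (per time step).

    Assumed toy model (not the model used in the paper):
        ln(f_strain[t+1] / f_neutral[t+1]) - ln(f_strain[t] / f_neutral[t]) = s + noise
        noise ~ Normal(0, noise_sd**2), independent across time steps
        s     ~ Normal(0, prior_sd**2)  (prior)
    """
    log_ratio = np.log(np.asarray(freq_strain)) - np.log(np.asarray(freq_neutral))
    increments = np.diff(log_ratio)  # one noisy observation of s per time step
    n = increments.size

    # Conjugate Normal-Normal update with known noise variance.
    post_precision = 1.0 / prior_sd**2 + n / noise_sd**2
    post_mean = (increments.sum() / noise_sd**2) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical data: one strain and one neutral reference taken from a larger pool.
rng = np.random.default_rng(0)
time = np.arange(6)
true_s = 0.3
freq_neutral = np.full(time.size, 0.5)
freq_strain = 0.01 * np.exp(true_s * time + rng.normal(0.0, 0.05, time.size))

mean, sd = fitness_posterior(freq_strain, freq_neutral)
print(f"relative fitness: {mean:.3f} +/- {sd:.3f}")
```

The point of the sketch is the output format: a fitness estimate together with a standard deviation that says how much to trust it. A realistic analysis would also have to model sequencing noise and treat all strains jointly, which this toy version ignores.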


Why is it important?

When extracting quantitative information from experimental assays by fitting a model, there are two main things to consider: 1. The mathematical model we use to describe how we think the data were generated. 2. The statistical analysis we deploy to fit the model's parameters. On the second point, it is not enough to extract the values of the parameters we are interested in; we must also quantify the confidence we should assign to them. In essence, we need to communicate the level of uncertainty in our estimated parameters, given the inherent noise in our measurements. Bayesian statistics provides a natural mathematical formalism for doing so. In this work, we provide a computational tool that lets researchers working on experimental evolution run a Bayesian inference pipeline on their own data, helping the field improve its data analysis workflow.


As scientists, it's crucial that we present not only our results but also the level of certainty we have in them. Bayesian statistics, in my view, is the ideal paradigm for this. It lets us be transparent about our certainty using the language of probability theory, so that we can communicate the robustness of our findings. Yet Bayesian statistics is not without its challenges. Working with models that have thousands or tens of thousands of parameters can be a daunting task. While working on this project, I learned about approximate Bayesian methods. These methods scale efficiently to such large models, allowing us to apply Bayesian thinking even in areas with massive datasets. I hope that this article will encourage more scientists to adopt Bayesian methods as a principled way to quantify the confidence they should assign to the quantitative information they can extract from a noisy world.

Manuel Razo Mejia

Read the Original

This page is a summary of: Bayesian inference of relative fitness on high-throughput pooled competition assays, PLoS Computational Biology, March 2024, PLOS,
DOI: 10.1371/journal.pcbi.1011937.

