What is it about?
Abstract: Interest in analyzing biological data on a large scale has grown in recent years, and bioinformatic applications play an important role in analyzing these large amounts of data. Because of the volume of biological data and/or the size of the problem spaces, a considerable amount of computing resources is required to answer the research questions raised. To estimate which underlying hardware is most suitable for the bioinformatic tools applied, a well-defined benchmark suite is required. Such a benchmark suite is useful when purchasing hardware and, beyond that, for larger projects that aim to establish a bioinformatics compute infrastructure. In this paper we present BOOTABLE, our bioinformatic benchmark suite. BOOTABLE currently contains six popular and widely used bioinformatic applications representing a broad spectrum of usage characteristics. It further includes an automated installation procedure and all required datasets. BOOTABLE is available from our GitHub repository (https://github.com/MaximilianHanussek/BOOTABLE) in various formats.
Why is it important?
Only a few published works on such benchmark suites can be found in the literature [2], [3], so there seems to be a lack of them, especially for multithreaded applications. To fill this gap we present BOOTABLE (BiOinfOrmatics ThreAded Benchmark tooLsuitE), a benchmark suite covering a variety of bioinformatics applications ranging from sequence analysis to machine learning.
Perspectives
In this initial release of BOOTABLE we have focused on convenient installation and execution of our tool, using a rather small number of applications and datasets and supporting only the CentOS operating system. We are aware of these limitations and will add more applications from further bioinformatic categories; we also plan to extend the installation to Ubuntu. We also want to extend the provided installation formats with a Singularity [15] container to allow more comparisons between different technologies. The report generation tool needs further work as well: we want to add more meaningful graphs and more data about usage, for example. To get more information about the resources consumed (RAM, CPU, disk reads/writes, network traffic) by every single tool, we are looking into different monitoring systems that can present this data. An emerging area of interest, in combination with machine learning, is the use of GPUs (graphics processing units); we want to provide benchmark datasets and applications for them, together with monitoring of GPU usage. A larger but very important task is the implementation of a scaling mode that lets application developers benchmark how their application scales across multiple CPU cores or reveal specific undesired resource behaviors. Implementing such a scaling mode would require modularizing BOOTABLE even further and creating a mechanism for adding other applications in a simple way. We expect all of these steps to be worth the effort.
Maximilian Hanussek
Eberhard Karls Universität Tübingen
Read the Original
This page is a summary of: BOOTABLE: Bioinformatics Benchmark Tool Suite, May 2019, Institute of Electrical & Electronics Engineers (IEEE), DOI: 10.1109/ccgrid.2019.00027.