The Institute has a long history of research into text algorithms for pattern matching, indexing, and data compression. These lead naturally to specific applications, such as DNA sequence analysis in bioinformatics. Bioinformatics topics currently researched at the Institute include:
- efficiently finding maximal exact matches (MEMs) in pairs of genomes,
- bioinformatics data compression,
- pangenome search,
- classification of virus subtypes.
The algorithmic tools used include text processing and data compression techniques as well as deep learning for virus classification.
The problems we tackle are of significant practical importance and are actively researched around the world. One of our achievements is copMEM, an algorithm that finds maximal exact matches (MEMs) between two genomes by sampling both of them with coprime step sizes, hence its name.
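Why coprime sampling steps help can be illustrated with a small number-theoretic sketch. It is not the copMEM implementation; the function names and the parameters `k1`, `k2`, `K`, and `L` are assumptions introduced here for illustration. If one genome is sampled every `k1` positions and the other every `k2` positions, with `k1` and `k2` coprime, then any match starting at positions `i` and `j` has some offset `t < k1 * k2` at which both positions are sampled at once. A seed of length `K` is therefore guaranteed to lie inside every match of length at least `L` whenever `K <= L - k1 * k2 + 1`.

```python
from math import gcd


def aligned_offset(i, j, k1, k2):
    """Smallest t >= 0 such that i + t is a multiple of k1 and j + t is a multiple of k2.

    i, j   : start positions of a match in the two sequences (hypothetical names)
    k1, k2 : sampling steps for the two sequences; must be coprime
    """
    if gcd(k1, k2) != 1:
        raise ValueError("k1 and k2 must be coprime")
    # For coprime steps, exactly one offset in [0, k1 * k2) satisfies both conditions.
    for t in range(k1 * k2):
        if (i + t) % k1 == 0 and (j + t) % k2 == 0:
            return t
    raise AssertionError("unreachable for coprime k1 and k2")


def seed_guaranteed(L, K, k1, k2):
    """True if every match of length >= L must contain a doubly sampled seed of length K."""
    return gcd(k1, k2) == 1 and K <= L - k1 * k2 + 1


if __name__ == "__main__":
    k1, k2 = 5, 4  # illustrative values only
    worst = max(aligned_offset(i, j, k1, k2) for i in range(k1) for j in range(k2))
    print("largest offset needed:", worst)
    print("seed guaranteed:", seed_guaranteed(L=100, K=44, k1=8, k2=7))
```

The brute-force loop is chosen for readability; the same offset could be computed directly with modular arithmetic.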
T. Kowalski, S. Grabowski. "PgRC: Pseudogenome based Read Compressor", Bioinformatics, 2020. 36: 2082-2089.
A. Fabijańska, S. Grabowski. "Viral Genome Deep Classifier", IEEE Access, 2019. 7: 81297-81307.
S. Deorowicz, A. Debudaj-Grabysz, A. Gudyś, S. Grabowski. "Whisper: read sorting allows robust mapping of DNA sequencing data", Bioinformatics, 2019. 35: 2043-2050.
S. Grabowski, W. Bieniecki. "copMEM: Finding maximal exact matches via sampling both genomes", Bioinformatics, 2019. 35: 677-678.
A. Cisłak, S. Grabowski, J. Holub. "SOPanG: online text searching over a pan-genome", Bioinformatics, 2018. 34: 4290-4292.
T. Kowalski, S. Grabowski. "Faster range minimum queries", Software Pract. Exper., 2018. 48: 2043-2060.
Algorithm engineering for full-text indexes, NCN grant, S. Grabowski (coordinator), T. Kowalski, 2014-2017.
Memory efficient algorithms for processing and analysis of genome sequencing data, NCN grant, S. Grabowski (coordinator), 2013-2015.