Computational Biology for Infection Research

The Department of “Computational Biology for Infection Research” studies the human microbiome, viral and bacterial pathogens, and human cell lineages within individual patients by analyzing large-scale biological and epidemiological data sets with computational techniques. Focusing on high-throughput meta-omics, population genomic, and single-cell sequencing data, we produce testable hypotheses, such as sets of key sites or relevant genes associated with the presence of a disease, with antibiotic resistance, or with pathogen evasion of the immune defense. We work with experimental collaborators to verify our findings and to promote their translation into medical treatments or diagnostic procedures. To achieve its research goals, the department also develops novel algorithms and software.


Seminar: Sequence search and analysis

The seminar "Sequence search and analysis" is hosted by the HZI department "Computational Biology for Infection Research", which is headed by Prof. Alice McHardy.

Kick-off meeting time: Wednesday, 7.10.2020, 10 am
Kick-off meeting room: BRICS, room 207
Date: one-day block during semester break (tba)
Room: tba
Format: 30-minute presentation and discussion; written summary of 3-5 pages
Language: English
Maximum of 10 participants
Intended for Bachelor and Master students in computer science and related fields

If you have any questions about the seminar or related topics, feel free to contact Andreas Klötgen.


Sequence analysis is a broad problem that is not restricted to bioinformatics but is common to many fields of computer science. Google’s success stems from its search engine, developed in the late 1990s, which makes information from a vast space of websites accessible via sophisticated sequence search algorithms. In bioinformatics, sequence analysis is a key field concerned with the comparison of two or more (genome or gene) sequences. Most importantly, the breakthrough of next-generation sequencing techniques, which make it possible to sequence an individual’s genome and uncover all potentially disease-relevant mutations with unprecedented speed and in a potentially personalized manner, was enabled by bioinformatics advances in large-scale sequence analysis algorithms.

This seminar will cover the basics of sequence analysis, from comparing two strings for (dis-)similarities to building complex algorithms that analyze millions of sequences within minutes. It will also discuss data structures, such as suffix trees, that are suited to handling a large search space of sequences.
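To give a flavor of what "comparing two strings for (dis-)similarities" can mean in practice, here is a minimal Python sketch of the Levenshtein edit distance computed by dynamic programming. It is only an illustration, not seminar material: the function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit costs, an assumption here)."""
    # prev[j] is the distance between the previous prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

The same table-filling idea underlies classical pairwise alignment algorithms; only the scoring scheme and the traceback differ.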


  • Sequence alignment (how to compare strings)
  • De-novo genome assembly
  • Alignment of millions of sequencing reads (e.g. using suffix trees)
  • Protein sequence alignments (e.g. in the Pfam database, using hidden Markov models)
  • Taxonomic binning / profiling of metagenomes
  • Gene prediction in novel genome sequences
  • Google’s Search Engine
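
As a rough illustration of the index-based search idea behind the read alignment topic, the following Python sketch finds all exact occurrences of a short read in a reference string. Note the assumptions: it uses a suffix array (a simpler relative of the suffix tree named above) with a naive construction, it handles exact matches only, and the names `build_suffix_array` and `find_occurrences` are made up for this example.

```python
def build_suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text, in lexicographic order."""
    # Naive construction: fine for a demo, far too slow for real genomes.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """All start positions at which pattern occurs exactly in text."""
    m = len(pattern)
    # Binary search for the first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    # Matching suffixes are contiguous in the suffix array; collect them.
    hits = []
    while lo < len(sa) and text[sa[lo]:sa[lo] + m] == pattern:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)


if __name__ == "__main__":
    reference = "GATTACAGATTACA"
    sa = build_suffix_array(reference)
    print(find_occurrences(reference, sa, "TTA"))
```

Once the index is built, each lookup needs only a logarithmic number of prefix comparisons, which is why index structures of this kind scale to very large numbers of reads.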