Computational Biology of Infection Research
The seminar "Sequence search and analysis" is hosted by the department "Computational Biology of Infection Research" at the HZI headed by Prof. Alice McHardy.
Kick-off meeting time: Wednesday, 7.10.2020, 10 am
Kick-off meeting room: BRICS, room 207
Date: one-day block during semester break (tba)
Format: 30-minute presentation with discussion; 3-5 page written summary
Maximum of 10 participants
Intended for Bachelor's and Master's students in computer science and related fields
If you have any questions about the seminar or related topics, feel free to contact Andreas Klötgen.
Sequence analysis is a broad problem that is not restricted to bioinformatics but is common to many fields of computer science. Google’s success stems from its search engine, developed in the late 1990s, which makes information from a giant space of websites available via complex sequence search algorithms. In bioinformatics, sequence analysis is a key field concerned with the comparison of two or more (genome or gene) sequences. Most importantly, the breakthrough of next-generation sequencing techniques, which make it possible to sequence an individual’s genome and uncover all potentially disease-relevant mutations with unprecedented speed and in a potentially personalized manner, was enabled by bioinformatics advances in large-scale sequence analysis algorithms.
This seminar will cover the basics of sequence analysis, from comparing two strings for (dis)similarities to building complex algorithms that analyze millions of sequences within minutes. It will also discuss data structures, such as suffix trees, that are suited to handling a large search space of sequences.
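To give a flavour of what "comparing two strings" means in practice, here is a minimal sketch of the classic edit (Levenshtein) distance computed by dynamic programming. It is only an illustration and not part of the seminar material: the function name `edit_distance` and the unit costs for insertion, deletion and substitution are assumptions chosen for simplicity.

```python
def edit_distance(a: str, b: str) -> int:
    """Return the Levenshtein distance between a and b.

    Assumption: every insertion, deletion and substitution costs 1.
    Only one row of the dynamic-programming table is kept at a time.
    """
    # prev[j] = distance between the empty prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # curr[0] = distance between a[:i] and the empty string
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

The running time is proportional to the product of the two string lengths, which is why comparing millions of sequences calls for the more specialised index structures mentioned above.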
- Sequence alignment (how to compare strings)
- De novo genome assembly
- Alignment of millions of sequencing reads (e.g. using suffix trees)
- Protein sequence alignments (e.g. in the Pfam database using hidden Markov models)
- Taxonomic binning / profiling of metagenomes
- Gene prediction in novel genome sequences
- Google’s search engine
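As an illustration of the index-based search behind the read alignment topic, the following sketch looks up all exact occurrences of a short read in a reference string. Note that it uses a suffix array, a simpler relative of the suffix tree, and a naive construction that is only suitable for small inputs. The function names `build_suffix_array` and `find_occurrences` and the example sequence are hypothetical and chosen for this sketch only.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction by sorting the suffixes directly; real tools use
    far more efficient algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern in text, in ascending order."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in [start, lo) begin with the pattern.
    return sorted(suffix_array[start:lo])


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # toy reference, not real data
    index = build_suffix_array(reference)
    print(find_occurrences(reference, index, "TTA"))
```

Once the index is built, each lookup needs only a logarithmic number of string comparisons in the length of the reference, which is the property that makes indexed search attractive when many reads are queried against the same genome.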