Prot-SpaM: Fast alignment-free phylogeny reconstruction based on whole-proteome sequences


Created on 23rd April 2018

Paper submitted to journal GigaScience

Last updated on 19th May 2018

Chris-Andre Leimeister; Jendrik Schellhorn; Svenja Schoebel; Michael Gerth; Christoph Bleidorn; Burkhard Morgenstern


Word-based or "alignment-free" sequence comparison has become an active area of research in bioinformatics. Recently, fast word-based algorithms have been proposed that can accurately estimate phylogenetic distances between genomic DNA sequences without calculating full sequence alignments. One of these approaches is Filtered Spaced Word Matches. Herein, we extend this approach to estimate evolutionary distances between species based on their complete or incomplete proteomes; our implementation is called Prot-SpaM. We show that Prot-SpaM can accurately estimate phylogenetic distances, and that our program can be used to calculate phylogenetic trees from whole proteomes in a matter of seconds. For various groups of taxa, we show that trees calculated with Prot-SpaM are of high quality. The source code of our software is available through GitHub: https://github.com/jschellh/ProtSpaM
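
To make the idea of a spaced-word match concrete, the following is a minimal Python sketch, not the Prot-SpaM implementation. It assumes a binary pattern in which "1" marks positions that must match and "0" marks don't-care positions, and it reports the fraction of mismatches observed at the don't-care positions of all matches. The function names, the pattern, and the toy sequences are hypothetical. The filtering step that gives the method its name and the conversion of mismatch frequencies into evolutionary distances are not described in the abstract and are left out here.

```python
from collections import defaultdict


def spaced_words(seq, pattern):
    """Index a sequence by its spaced words.

    A spaced word is the string of residues found at the '1' positions
    of the pattern; the result maps each spaced word to its start positions.
    """
    match_pos = [i for i, c in enumerate(pattern) if c == "1"]
    index = defaultdict(list)
    for start in range(len(seq) - len(pattern) + 1):
        word = "".join(seq[start + i] for i in match_pos)
        index[word].append(start)
    return index


def mismatch_fraction(seq_a, seq_b, pattern):
    """Fraction of mismatches at don't-care positions over all spaced-word matches.

    Returns None when the two sequences share no spaced word.
    Note: no filtering of spurious matches is done in this sketch.
    """
    dont_care = [i for i, c in enumerate(pattern) if c == "0"]
    index_b = spaced_words(seq_b, pattern)
    compared = 0
    mismatches = 0
    for word, starts_a in spaced_words(seq_a, pattern).items():
        for sa in starts_a:
            for sb in index_b.get(word, []):
                for i in dont_care:
                    compared += 1
                    if seq_a[sa + i] != seq_b[sb + i]:
                        mismatches += 1
    return mismatches / compared if compared else None


if __name__ == "__main__":
    # Toy protein fragments (hypothetical), differing at one residue.
    print(mismatch_fraction("MKVLAT", "MKALAT", "1101"))
```

In this toy case only one spaced word is shared between the two fragments, and the single don't-care position under that match differs. On real proteomes, many matches would be pooled, and unrelated (random) matches would need to be discarded before the mismatch fraction is meaningful.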


Review Summary

Reviewer                Status      Date
Sharma Thankachan       Completed   31 May 2018
Se-Ran Jun              Completed   18 Jun 2018
Sriram Chockalinga...   Completed   20 Jun 2018
Alexey Kozlov           Completed   21 Jun 2018