Protein Bioinformatics

Vikram Alva-Kullanja

It is estimated that more than a trillion different proteins exist today. Although this may seem a vast number, the actual diversity of proteins in nature is somewhat limited. Many proteins share detectable similarities in sequence and structure, as they arose by amplification, recombination, and divergence from a basic complement of autonomously folding modules, referred to as domains.

On the diversity of modern proteins

Sequence-based comparison of modern proteins shows that they fall into only about ten thousand domain families, which, based on structural similarity, can be grouped further into about a thousand folds. Many of these folds were already established at the time of the Last Universal Common Ancestor, a theoretical primordial organism from which all life on Earth descended. We are broadly interested in understanding the events that led to the emergence of these first folds and their diversification into the many functional protein families we recognize today.

The MPI Bioinformatics Toolkit

To track evolutionary relationships between proteins, we use bioinformatics tools that establish correlations between sequence and structure similarity. Many of the tools we use, such as the state-of-the-art sequence comparison methods HHblits and HHpred, are integrated into the MPI Bioinformatics Toolkit, a one-stop, integrative resource for protein bioinformatic analysis, which we develop and maintain. The Toolkit currently includes 36 interconnected in-house and external tools, whose functionality covers the detection of remote homologs, the calculation of multiple alignments, and the annotation of sequence features.

Prokaryotic cell surface proteins

All cells use sophisticated molecular machinery to interact with their environment, but the molecular basis of such interactions remains poorly understood in prokaryotes. To help close this gap, in an HFSP-funded project, we collaborate with Tanmay Bharat (MRC LMB) and Alex Bisson (Brandeis University) to explore the evolution and molecular basis of mechanosensing in archaea. Additionally, we collaborate with Tanmay Bharat on deciphering the evolution and structure of prokaryotic S-layers (two-dimensional paracrystalline sheets that encapsulate many prokaryotic cells).

Selected Publications

A multidomain connector links the outer membrane and cell wall in phylogenetically deep-branching bacteria.
von Kügelgen A, van Dorst S, Alva V#, Bharat TAM#. Proc Natl Acad Sci U S A. 2022;119(33):e2203156119.

Complete atomic structure of a native archaeal cell surface.
von Kügelgen A, Alva V, Bharat TAM. Cell Rep. 2021;37(8):110052.

An astonishing wealth of new proteasome homologs
Fuchs ACD, Alva V, Lupas AN. Bioinformatics. 2021 Jul 29;btab558.

Protein Sequence Analysis Using the MPI Bioinformatics Toolkit
Gabler F, Nam SZ, Till S, Mirdita M, Steinegger M, Söding J, Lupas AN, Alva V. Curr Protoc Bioinformatics. 2020 Dec;72(1):e108.

Molecular Logic of Prokaryotic Surface Layer Structures
Bharat TAM#, von Kügelgen A, Alva V#. Trends Microbiol. 2020 Oct 26:S0966-842X(20)30258-4.

Histones Predate the Split Between Bacteria and Archaea
Alva V#, Lupas AN#. Bioinformatics. 2019 Jul 15;35(14):2349-2353.

From ancestral peptides to designed proteins
Alva V, Lupas AN. Curr Opin Struct Biol. 2018 Feb;48:103-109.

Ribosomal proteins as documents of the transition from unstructured (poly)peptides to folded proteins
Lupas AN, Alva V. J Struct Biol. 2017 May;198(2):74-81.

A vocabulary of ancient peptides at the origin of folded proteins
Alva V, Söding J, Lupas AN. Elife. 2015 Dec 14;4:e09410.

(see full publication list on Google Scholar)
