J. Mol. Biol. (1999) 289, 729–745
Eukaryotic Signalling Domain Homologues in Archaea and Bacteria. Ancient Ancestry and Horizontal Gene Transfer
C. P. Ponting1*, L. Aravind1, J. Schultz2, P. Bork2 and E. V. Koonin1
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bldg. 38A, Bethesda, MD 20894, USA; EMBL, Meyerhofstr. 1, 69012 Heidelberg, Germany
Phyletic distributions of eukaryotic signalling domains were studied using recently developed sensitive methods for protein sequence analysis, with an emphasis on the detection and accurate enumeration of homologues in bacteria and archaea. A major difference was found between the distributions of enzyme families, which are typically found in all three divisions of cellular life, and non-enzymatic domain families, which are usually eukaryote-specific. Previously undetected bacterial homologues were identified for plant pathogenesis-related proteins, Pad1, von Willebrand factor type A, src homology 3 and YWTD repeat-containing domains. Comparisons of the domain distributions in eukaryotes and prokaryotes enabled distinctions to be made between the domains originating prior to the last common ancestor of all known life forms and those apparently originating as consequences of horizontal gene transfer events. A number of transfers of signalling domains from eukaryotes to bacteria were confidently identified, in contrast to only a single case of apparent transfer from eukaryotes to archaea.
© 1999 Academic Press
Keywords: horizontal gene transfer; signalling domains; homology; genome comparison; sequence profiles
Recent genome sequencing of organisms representing each of the three divisions of cellular life (archaea, bacteria and eukaryota) provides opportunities to infer the genetic heritage of the entire set of their genes and of the portions of genes encoding individual protein domains. Domains are spatially compact units of three-dimensional protein structure. Those domains that show significant sequence similarity to each other, or that are similar in fold with common functions and active or binding sites, are considered homologous, that is, they are thought to have evolved from a common ancestor. Homologous proteins and domains may arise either through speciation, in which case they are products of orthologous genes, or else as a consequence of an intra-genome gene duplication, resulting in paralogous gene products (Fitch, 1970, 1995;
Abbreviations used: AP-ATPase, APAF-1-like ATPase; BRCT, breast cancer C-terminal domain; C1, protein kinase C constant region 1 domain; C2, protein kinase C constant region 2 domain; CBS, domain present twice in cystathionine β-synthase; cNMP, cyclic nucleotide monophosphate binding domain; EGF, epidermal growth factor-like domain; FHA, forkhead associated domain; GAP, GTPase activator protein; GEF, guanine nucleotide exchange factor; HECT, domain homologous to E6-AP carboxyl terminus; IG, immunoglobulin domain; LRR, leucine-rich repeat; LysM, lysin-like motif; MATH, Meprin and TRAF homology domain; MPN, Mpr1p and Pad1p N-terminal domain; NHL, NCL-1, HT2A and LIN-41 repeats; NIb, Na+-Ca2+ exchanger/integrin subunit β4 domain; Pad1, domain homologous to Schizosaccharomyces pombe Pad1p (also called MPN or JAB domains); PDZ, PSD-95, Dlg, ZO1/2 domain; PH, pleckstrin homology domain; PKD, domain in polycystic kidney disease 1 protein; PR-1, plant pathogenesis-related proteins of group 1-like domains; PTB/PI, phosphotyrosine binding/interaction domain; PX, phox homology domain; SAM, sterile alpha motif domain; SET, Suvar3-9, enhancer-of-zeste, trithorax domain; SH2, src homology 2 domain; SH3, src homology 3 domain; SWIB, SWI complex, BAF60b domain; TGFβ, transforming growth factor β-like domain; TIR, toll-interleukin-1-resistance domain; TPR, tetratricopeptide repeat; vWFA, von Willebrand factor A domain;...