Session 1
New Chemistry in the Expanding Protein Universe
Biosynthesis of nisin A, a lantibiotic that has been used for nearly 50 years in the food industry to combat food-borne pathogens. See Figure 2 contributed by Wilfred van der Donk on page 8.
NOVEL CHEMISTRY STILL TO BE FOUND IN NATURE
CHRISTOPHER T. WALSH
Biochemistry & Molecular Pharmacology Department, Harvard Medical School, Boston, MA 02115, USA
My view of the present state of research on new chemistry in the expanding protein universe
New chemical transformations catalyzed by proteins continue to be discovered. Most of the novel scaffolds in small-molecule frameworks emerge from studies on microbes (prokaryotes and single-cell eukaryotes) from underexplored niches, with both anaerobic and aerobic (oxidative) metabolic transformations prominent in bond-breaking and bond-making steps. Bioinformatic analysis of protein superfamilies may highlight particular family members for novel catalytic activities. Post-translational modification of ribosomally generated proteins has also been implicated in the morphing of peptide frameworks into complex architectures. De novo design and protein evolution can also create novel chemical transformations, including reactions not previously seen in Nature.
My recent contributions to new chemistry in the expanding protein universe
Research from my group has focused on the morphing and maturation of peptide scaffolds into rigidified, compact scaffolds by the two major biosynthetic strategies for peptide bond formation: ribosomal and non-ribosomal assembly lines. A hallmark of the RNA-independent nonribosomal peptide synthetases has been the use of nonproteinogenic amino acid building blocks in place of the 20 canonical proteinogenic amino acids. As an example, we have examined how a subset of mononuclear nonheme iron oxygenases act instead as halogenating catalysts to provide γ-chloro amino acid and cyclopropyl amino acid building blocks to NRPS assembly lines. The nonproteinogenic β-amino acid anthranilate is a building block for a series of fungal alkaloids ranging from bicyclic to octacyclic scaffolds, put together by bi- and trimodular NRPS assembly lines, followed by action of dedicated tailoring enzymes.
Complexity generation in peptide scaffolds can also be achieved from ribosomally generated nascent proteins by a series of post-translational modifications (PTMs). A remarkable cascade of more than a dozen PTMs occurs as a 14-residue C-terminal peptide region of a 52mer is morphed into the trithiazolylpyridine core of thiazolyl peptide antibiotics in the thiocillin and nosiheptide class. These PTMs involve conversion of cysteines to thiazoles, conversion of threonines to methyloxazoles, and partitioning of serine residues down two distinct PTM pathways. One route leads to the corresponding oxazole by cyclization, dehydration, and aromatization; the other route is net dehydration to dehydroalanines. A pair of dehydroalanines can undergo condensation and dehydrative aromatization to yield the core pyridine ring of the 2,4,6-trithiazolyl pyridine core at the center of these mature antibiotic scaffolds.
Fig. 1. Complexity generation in peptide scaffolds: two strategies.
Outlook to future developments of research in the chemistry of the expanding protein universe
As bioinformatics, structural genomics and proteomics continue to define the existing protein universe and guide protein engineers in search of new kinds of catalysts, the future is bright for novel chemistry to continue to emerge. At one end of the catalytic spectrum, the large superfamily of radical S-adenosyl methionine (SAM) enzymes holds particular promise for catalytic diversity and novelty of chemical transformations: the reactions catalyzed by such family members as ribonucleotide reductase, lipoyl and biotin synthases, ThiC, and the tRNA and mRNA C-methyltransferases are likely to be the tip of the iceberg in the chemical capacity of this superfamily. In a distinct superfamily, the hemeprotein cytochrome P450 oxygenases, protein engineering has recently led to the evolution of synthetically useful carbene chemistry. De novo protein design should enable completely abiotic chemistry to move into the realm of protein catalysis.
References
1. C. T. Walsh, R. O’Brien, C. Khosla, Angew. Chem. Int. Ed. 52, 7098 (2013).
2. C. T. Walsh, S. Haynes, B. Ames, X. Gao, Y. Tang, ACS Chem. Biol. 8, 1366 (2013).
3. C. T. Walsh, S. Malcolmson, T. Young, ACS Chem. Biol. 7, 429 (2012).
4. P. S. Coelho, J. Wang, M. E. Ener, S. A. Baril, A. Kannan et al., Nat. Chem. Biol. 9, 485 (2013).
NATURAL PRODUCT BIOSYNTHESIS IN THE GENOMIC AGE
WILFRED A. VAN DER DONK
Department of Chemistry, University of Illinois at Urbana-Champaign and the Howard Hughes Medical Institute, 600 South Mathews Ave, Urbana, IL 61801, USA
My view of the present state of research on new chemistry in the expanding protein universe
Natural products (NPs) have featured prominently in the development of pharmaceuticals and as tools to study biology. However, since the turn of the century, many natural product discovery platforms in industry have been dismantled. Several explanations have been given for the withdrawal of pharmaceutical companies from NP discovery. For antibiotics, small projected profits have been a major driving force. For other areas, such as antitumor agents, the move away from NPs has been motivated by other factors, including high rediscovery rates of known compounds and the difficulty of performing medicinal chemistry on complex structures. At the same time, the available sequenced genomes have demonstrated that the number of NP biosynthetic gene clusters in a typical microorganism far exceeds the number of compounds it produces under laboratory conditions. Based on their sequences, the overwhelming majority of these “silent” gene clusters are expected to encode new NPs. As a result, genome mining has been widely touted as a potentially efficient route to new NPs. An important bonus of investigating NP biosynthetic pathways, with respect to the theme of this panel, is their richness in novel biochemical transformations.
My recent research contributions to new chemistry in the expanding protein universe
The genome sequencing efforts have revealed that ribosomally synthesized and post-translationally modified peptides (RiPPs) form a much larger class of NPs than anticipated [1]. The extensive post-translational modifications endow these peptides with greatly expanded chemical structures, restricted conformational flexibility for target recognition, and increased metabolic stability. In retrospect, it may not be surprising that RiPP biosynthetic pathways are so widespread because they offer a unique advantage to their producing organisms: high evolvability.
Nearly all RiPPs are initially synthesized as a longer precursor peptide, with a leader peptide appended to the N-terminus of the core peptide, which will be converted to the final NP (Fig. 1(a)). The leader peptide is important for recognition by many of the post-translational modification enzymes [2]. This leader peptide-guided strategy results in highly evolvable pathways because the post-translational processing enzymes can be intrinsically permissive with respect to mutations in the core peptide. The leader peptide-guided biosynthesis also renders RiPPs particularly well-suited to both genome mining and synthetic biology approaches because their substrates are DNA encoded and the biosynthetic enzymes have intrinsically relaxed substrate specificity [3].
Fig. 1. Biosynthesis of RiPPs. (a) General biosynthetic pathway of RiPPs. (b) General biosynthetic strategy that results in the formation of thioether crosslinks named lanthionine and methyllanthionine.
Based on the available genomes, lanthionine-containing peptides (lanthipeptides) are the most abundant class of RiPPs. Currently known lanthipeptides have a range of activities including antimicrobial (called lantibiotics), morphogenetic, antiviral, and antiallodynic [4]. Lanthionine (Lan) consists of two alanine residues crosslinked via a thioether linkage that connects their β-carbons; 3-methyllanthionine (MeLan) contains one additional methyl group (Fig. 1(b)). These structures are installed by a dehydratase that eliminates water from Ser and Thr residues to generate dehydroalanine (Dha) and dehydrobutyrine (Dhb), respectively, and a cyclase that catalyzes the subsequent addition of thiols of Cys residues to the dehydro amino acids to generate Lan and MeLan, respectively. How the enzymes coordinate these complex chemical transformations in which the substrate peptide structure is continuously changing is unknown (e.g., Fig. 2). Furthermore, most lanthipeptide biosynthetic enzymes do not have homology with non-lanthipeptide proteins, and hence many questions remain about their evolutionary origin.
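The two-step logic described above (dehydration of Ser/Thr, then addition of Cys thiols) can be sketched as simple bookkeeping on a core peptide sequence. This is a minimal illustration, not a model of any real enzyme: the function names and the example sequence are hypothetical, and the sketch assumes that every Ser and Thr is dehydrated, which need not hold for a given lanthipeptide. The only chemistry it encodes is that each dehydration removes one water (about 18.011 Da) and that the subsequent thiol addition does not change the mass.

```python
# Illustrative sketch only: helper names and the example peptide are hypothetical.

WATER_MONOISOTOPIC_DA = 18.010565  # mass of one H2O lost per dehydration

# Dehydratase step: Ser -> dehydroalanine (Dha), Thr -> dehydrobutyrine (Dhb).
DEHYDRATION_PRODUCT = {"S": "Dha", "T": "Dhb"}

# Cyclase step: a Cys thiol adds to Dha to give Lan, or to Dhb to give MeLan.
RING_PRODUCT = {"Dha": "Lan", "Dhb": "MeLan"}


def modification_sites(core_peptide: str):
    """Return (dehydratable residues, Cys positions), using 1-based positions."""
    dehydratable = [
        (pos, DEHYDRATION_PRODUCT[aa])
        for pos, aa in enumerate(core_peptide, start=1)
        if aa in DEHYDRATION_PRODUCT
    ]
    cysteines = [pos for pos, aa in enumerate(core_peptide, start=1) if aa == "C"]
    return dehydratable, cysteines


def mass_loss_if_fully_dehydrated(core_peptide: str) -> float:
    """Upper bound on the mass loss (Da), assuming every Ser/Thr is dehydrated."""
    dehydratable, _ = modification_sites(core_peptide)
    return len(dehydratable) * WATER_MONOISOTOPIC_DA


if __name__ == "__main__":
    core = "ASCTAC"  # hypothetical core peptide, not a real lanthipeptide
    sites, cys = modification_sites(core)
    for pos, dehydro in sites:
        print(f"position {pos}: {dehydro}, could form {RING_PRODUCT[dehydro]}")
    print("Cys nucleophiles at positions:", cys)
    print(f"mass loss if fully dehydrated: {mass_loss_if_fully_dehydrated(core):.3f} Da")
```

The sketch deliberately says nothing about which Cys pairs with which dehydro amino acid; that regiochemical question is the open problem raised in the text.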
At least four different pathways to these polycyclic natural products have evolved [4], reflecting the high efficiency and evolvability of a post-translational modification route to generate conformationally constrained peptides. Recent studies have shown that three of the four pathways involve Ser/Thr phosphorylation and subsequent phosphate elimination to generate the dehydro amino acids [6–8]. The fourth route unexpectedly involves Ser/Thr glutamylation [9]. As can be seen in Fig. 2, the dehydratase and cyclase enzymes involved in lanthipeptide biosynthesis act on residues that are located in diverse sequence contexts. These lanthionine synthetases appear to have retained the low substrate specificity of their primitive progenitors but acquired a dependence on a leader peptide. The exact role of the leader peptide has been widely debated, and no definitive answer has been provided thus far. Our studies suggest that the leader peptide plays an allosteric role [5], but other roles cannot be ruled out, and it is possible that the leader peptides have different functions in different classes of RiPPs.
Fig. 2. Biosynthesis of nisin A, a lantibiotic that has been used for nearly 50 years in the food industry to combat food-borne pathogens. Note the challenge for the cyclase to regioselectively connect five Cys nucleophiles with the correct dehydro amino acids, a process that can generate more than 6,500 different constitutional isomers [5]. In this process, five rings that differ greatly in size and amino acid sequence are formed by one enzyme.
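One plausible way to arrive at a number of the magnitude quoted in the caption is a simple counting argument, sketched below. It rests on two assumptions that are not stated in the text: that the fully dehydrated nisin core peptide offers eight dehydro amino acids as possible acceptors, and that each of the five Cys residues adds to a different one. Under those assumptions the count is the number of ordered selections of 5 acceptors out of 8, that is 8 × 7 × 6 × 5 × 4 = 6,720. The analysis in ref. [5] may count differently.

```python
import math
from itertools import permutations

N_CYS = 5      # from the caption: five Cys nucleophiles
N_DEHYDRO = 8  # ASSUMPTION (not stated in the text): eight dehydro amino acids


def count_cys_assignments(n_dehydro: int, n_cys: int) -> int:
    """Ways to give each Cys its own distinct dehydro amino acid partner."""
    return math.perm(n_dehydro, n_cys)  # requires Python 3.8+


def count_by_enumeration(n_dehydro: int, n_cys: int) -> int:
    """Same count by brute force: one tuple of acceptor indices per assignment."""
    return sum(1 for _ in permutations(range(n_dehydro), n_cys))


if __name__ == "__main__":
    print(count_cys_assignments(N_DEHYDRO, N_CYS))  # 6720
```

The brute-force variant is included only as a cross-check on the closed-form count; both ignore whether a given connectivity is sterically or chemically feasible.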
A recent genome mining exercise revealed a stunning example of natural combinatorial biosynthesis, illustrating the considerable potential of RiPP biosynthesis for synthetic biology. The genome of a strain of Prochlorococcus, a planktonic marine photosynthetic cyanobacterium, encodes a single class II lanthionine synthetase (ProcM) but no fewer than 29 different putative substrate peptides. These substrates have highly conserved leader peptides but hypervariable core peptides (not a single homologous pair). We cloned the enzyme and a subset of its putative substrates and showed that all 17 peptides tested were indeed substrates for ProcM [10]. In a follow-up study, we determined the structures of a subset of the resulting compounds, termed prochlorosins (Pcn), and demonstrated that their ring topology is highly diverse [11]. These findings raise a large number of intriguing questions, including (1) how can one enzyme make 29 very different polycyclic structures, (2) what is the function of these cyclic peptides, and (3) can ProcM activity be harnessed for synthetic purposes?
Outlook to future developments of research on new chemistry in the expanding protein universe
Genome mining efforts are highly likely to uncover many new NPs. However, genome mining approaches have their own challenges. Most importantly, unlike phenotypic screens, which provide compounds with a desired activity, genome mining lacks tools to predict the type of bioactivity of a putative new compound. Hence, for genome mining to be a viable approach to new compounds with desirable activities, new predictive tools need to be developed and/or high-throughput production platforms must be engineered such that the odds of finding molecules with desired activities are increased. Efficient production methods could be based either on production in heterologous hosts or on new strategies to elicit production from the original organisms [12]. RiPPs lend themselves particularly well to heterologous expression because the 20 amino acids are common building blocks that are present in any heterologous host and because their substrates are gene encoded. In addition, synthetic biology approaches may take advantage of the high substrate tolerance of RiPP biosynthetic enzymes to generate cyclic peptide libraries that can be selected for specific properties.
Acknowledgments
This research was supported by the National Institutes of Health (RO1 GM 58822) and the Howard Hughes Medical Institute.
References
1.P. G. A...