Key words
Papaver somniferum - Papaveraceae - Catharanthus roseus - Apocynaceae - alkaloids - caffeine - cannabinoids - ginseng - artemisinin - taxol
Introduction
The earliest archeological evidence of the use of herbal remedies dates back to prehistory.
Neanderthals, for example, who were long considered mainly meat-eaters, already had
a good knowledge of the surrounding vegetation and adopted sophisticated diets:
their dental plaque contained residues of several herbs, indicating the early consumption
of plants, perhaps already for self-medication purposes [1].
Written records of the use of medicinal plants, including recipes for preparing decoctions
and extracts, were also common in Ancient Egypt, Greece, Rome, China, and the Middle
East.
During the modern era, a more rational approach began to be applied to the study of
herbal medicines, as was the case with the discovery of the properties of foxglove
(Digitalis purpurea L., family Plantaginaceae) in treating edema and heart failure [2], [3].
Since the beginning of the 19th century, in parallel with the development of the pharmaceutical
industry, there has been a strong impetus toward the isolation of new compounds possessing a therapeutic
or commercial potential. In 1805, morphine was isolated from the latex of opium poppies
and went immediately into commercial production in Europe and the United States, where
it soon reached widespread popularity as a pain relief medication [4]. After the discovery of morphine, many other compounds with therapeutic effects
were isolated and purified from plants.
The botanical drugs we use today, like the ancient herbal remedies, are all examples
of complex mixtures enriched in plant secondary metabolites. Over the course of evolution,
plants have in fact developed a vast array of chemicals to defend themselves against their
enemies (herbivores, fungi), to attract pollinators, or to disseminate various chemical
signals in their surrounding environment. Secondary metabolites are present in all
higher plants but display a large structural diversity: different taxa usually accumulate
different classes of secondary metabolites, reflecting the adaptations to the various
ecological niches plants colonized on Earth [5]. This is in contrast to the current knowledge about the role and distribution of
primary metabolites (amino acids, organic acids, carbohydrates, etc.). Primary metabolites
represent the intermediates of those metabolic pathways related to the basic processes
of plant growth and development (e.g., glycolysis, TCA cycle, ATP (adenosine triphosphate)
synthesis, Calvin-Benson cycle, etc.); as such, their presence is not confined to
specific taxa, and the key metabolic steps for their biosynthesis and degradation
are mostly conserved across the green lineage. So although most of the primary metabolic
pathways have been well described in plants, both at the genetic and biochemical level,
the elucidation of the pathways of secondary metabolites has lagged behind, due to
their confined taxonomic distribution and inherent difficulties in purifying them
from natural sources (due both to their low amounts and chemical complexity). The
study of plant secondary metabolism is thus of interest not only for answering basic
research questions–such as the evolution of metabolic pathways, the extent of natural
metabolic diversity, and pathway regulation in relation to the environmental conditions–but
also from an applied perspective, given that most of the natural products of medicinal
importance are actually secondary metabolites.
Thus, although nearly 400,000 flowering plants have been classified so far, only a
fraction of these, around 20,000, has been used since ancient times for medicinal
purposes [5], and only a minority of these has been studied in detail with regard to the metabolic
composition and biological effects of their crude extracts [6]. The Dictionary of Natural Products, for example, which is a curated database of
chemical entities isolated from plants and microbes, contains around 160,000
entries; this number is, however, considered a conservative estimate of the extant
diversity of secondary metabolites in higher plants [7].
Today, almost 30% of the new chemical entities approved by the FDA (Food and Drug
Administration) are either (entirely) natural products, botanical drugs, or semisynthetic
derivatives of a natural product [8]. The pharmaceutical industry has been, however, rather reluctant to invest in
large-scale screening of natural products for drug discovery [6]. One factor limiting the screening of small molecules in plants has been
the inherent difficulty of purifying known compounds in adequate yields;
another, as mentioned above, is the incomplete knowledge of many of the biosynthetic
pathways of secondary metabolites [9], [10], [11]. Full knowledge of the pathways of plant secondary metabolites is of course
essential to develop alternative strategies of production in heterologous hosts (yeast,
bacteria) for pharmaceutical applications [12].
Advances in a number of systems-biology disciplines (genomics, transcriptomics,
metabolomics, and computational biology), fueled by the decreasing costs of
generating large-scale molecular data, are however revolutionizing research approaches
in the field of medicinal plants as well.
In the present review, we will present examples where the application of traditional
biochemical and omic-based approaches contributed to new discoveries in the pathways
of some secondary metabolites of medicinal importance. We will not cover in detail
the knowledge acquired so far on the chemistry of natural products (but we refer the
reader to recent excellent reviews on the subject: [13] for benzylisoquinoline alkaloids, [14] and [15] for monoterpenoid indole alkaloids (MIAs), [16] for cannabinoids, [17] for xanthine alkaloids, [18] for ginsenosides, [19] for withanolides, [20] for artemisinin, and [21] for taxol), and we instead focus on the historical developments and the advances
made recently in completing the missing parts of the puzzle in the biosynthesis of
some important natural products. In the first part of this review, we chose to focus
on the cases of benzylisoquinoline alkaloids (BIAs), MIAs, cannabinoids, and caffeine,
as they represent exemplary cases of how the application of several approaches, based
on the integration of genomics and metabolomics, has helped clarify specific biochemical
steps or entire pathway branches that had remained elusive. In the second part of
this review, we will summarize the knowledge acquired so far on the biosynthesis of
specific compounds (ginsenosides, withanolides, artemisinin, and taxol) from other
important medicinal plants where we believe integrative approaches could help further
the elucidation of their secondary metabolism with a view on the discovery of novel
metabolites of medicinal importance. For each presented case study, we survey the
health-related benefits and current medicinal use of these compounds and how traditional
“reductionist” and integrative approaches are accelerating the development of metabolic
engineering strategies (in heterologous and native hosts) for the production of secondary
metabolites of pharmaceutical interest.
Approaches for Pathway Discovery
Traditionally, the first steps in the elucidation of plant metabolic pathways were
based on the identification of a rather limited number of primary metabolites and
on the use of radioactive labels to follow their fate. These were essentially the
approaches that led to the discovery of the reactions of the path of carbon in
photosynthesis: the strategy was based on exposing a green alga to a stream of 14C-labeled CO2, followed by extraction, separation, and identification of metabolites by paper
chromatography. Gradually decreasing the exposure time to labeled CO2 allowed, for example, the identification of the product immediately downstream of
the CO2 fixation reaction (phosphoglyceric acid [22], [23]). Similarly, the remaining intermediates of the various reactions were identified by
increasing the exposure time to labeled CO2 [24]. With the advent of recombinant DNA technology, these initial labeling approaches
were combined with the isolation of the respective genes and with the synthesis and
purification of candidate enzymes. The advent of these new technologies was also accompanied
by an increasing interest toward secondary metabolites, which were initially considered
only as “waste” products of primary metabolites, with no physiological or ecological
role [25]. The use of molecular biology techniques (i.e., molecular cloning and heterologous
expression systems) along with classical protein biochemistry made it possible, for example,
to assess in vitro the catalytic properties, substrate specificities, and identity of the products for
a large number of enzymes involved in secondary metabolism (several examples from
these early, targeted approaches for pathway discovery of medicinally important phytochemicals
are reported in this review). In recent years, the leap in genomic technologies, with
the relative ease of collecting large-scale sequence data, has breathed new life into
metabolism research [26]. The increasing number of available genome sequences is now frequently integrated
with high-resolution/deep-coverage metabolomics approaches [27] not only to uncover structural and regulatory aspects of pathways of secondary metabolism,
but also to go deeper into the evolution of metabolism across the diversification
of land plants (and landmark examples in this area are the recent reconstructions
of the synthesis of nicotine and caffeine [28], [29]). The case studies presented here thus represent successful examples of how targeted
molecular approaches or, more recently, the combination of next-generation genomics
with metabolic profiling are revolutionizing the field of medicinal plants with new
knowledge concerning the synthesis of natural products.
Benzylisoquinoline Alkaloids
BIAs represent perhaps the oldest medicines humans have used to treat pain. These
alkaloids belong to a large family with over 2500 known structures; they are mostly
restricted to members of the order Ranunculales (in particular, they are present in
the families Papaveraceae and Berberidaceae), Magnoliales, and Laurales. Among the
Papaveraceae, opium poppy (Papaver somniferum L.) has emerged as the model species to study metabolism of important BIAs, as this
plant accumulates large amounts of different subgroups of these alkaloids [30]. The most abundant BIAs in roots of opium poppy are those of the benzophenanthridine-type
(e.g., sanguinarine, a potent anti-inflammatory agent that has also shown antitumor
properties [31], [32]), while the latex preferentially accumulates varying amounts of morphine and codeine
(“morphinans”). Although the increasing use of opioid drugs (natural morphinans and
their semisynthetic derivatives like oxycodone) in clinical practice is now raising
concerns given their history of abuse, there is no doubt that morphine and codeine
represent effective analgesics in the treatment of severe pain, at least in the short-term
following an acute trauma [33]. The initial isolation of morphine from the latex of opium poppy stimulated further
research into the elucidation of its biosynthesis in plants. The first studies were
based on radiolabel incorporation of a few candidate substrates and established tyrosine
and its derivatives as the precursors of morphine [34], [35]. We now know, after decades of research that have seen the application of more detailed
tracer studies and biochemical characterization of the related enzymes, that the biosynthesis
of BIAs involves a highly branched network of chemical transformations starting from
two tyrosine derivatives, dopamine and 4-hydroxyphenylacetaldehyde (4-HPAA) [13]. These two metabolites condense to give rise to (S)-norcoclaurine, which is in turn modified by a number of O-, N-methyltransferases and oxidoreductases to produce (S)-reticuline, the precursor of almost all subgroups of BIAs. From (S)-reticuline, the pathway diverges into different branches, which may be active only
in some species or tissues, resulting in the wide structural diversity of BIAs subgroups
that has been observed in plants ([Fig. 1]).
Fig. 1 BIAs biosynthetic pathways of P. somniferum (opium poppy) discussed in the text. All BIAs derive from (S)-norcoclaurine, the product of the condensation of two tyrosine derivatives, dopamine
and 4-HPAA. After a series of O-, N-methyltransferase and hydroxylation reactions, (S)-norcoclaurine is converted into (S)-reticuline, the central precursor of all BIAs biosynthetic branches. NCS: norcoclaurine
synthase; NMCH: (S)-N-methylcoclaurine 3′-hydroxylase; 4′-OMT: 3′-hydroxy-N-methylcoclaurine 4′-O-methyltransferase; STORR: (S)-to-(R) reticuline (aka REPI, reticuline epimerase); P6H: protopine 6-hydroxylase; DBOX:
dihydrobenzophenanthridine oxidase; SalSyn: salutaridine synthase; SalR: salutaridine
reductase; SalAT: salutaridinol 7-O-acetyltransferase; T6ODM: thebaine 6-O-demethylase; CODM: codeine O-demethylase; COR: codeinone reductase; SOMT1: scoulerine 9-O-methyltransferase; CAS: canadine synthase; TNMT: tetrahydroprotoberberine N-methyltransferase; NOS: noscapine synthase. Dashed arrows indicate multiple steps.
The early efforts in the elucidation of BIA biosynthesis were based on the purification
of the putative enzymes and on the screening of cDNA libraries for the isolation of
the corresponding genes; these initial studies allowed, for example, the characterization
of norcoclaurine synthase, the enzyme responsible for the condensation of dopamine
and 4-HPAA, producing (S)-norcoclaurine [36], [37]. Similar approaches have been followed in the elucidation of the remaining early
steps of the BIA pathway: the synthesis of (S)-coclaurine, for example, by the action of a 6-O-methyltransferase (norcoclaurine
6-O-methyltransferase, 6OMT) [38], [39] or, analogously, the synthesis of (S)-N-methylcoclaurine by an N-methyltransferase (coclaurine N-methyltransferase, CNMT [40]). The late steps of morphinan biosynthesis instead remained uncharacterized until
the development of global gene expression resources for opium poppy. After screening
a number of varieties and mutants differing in their accumulation of morphine, two
candidate genes were eventually proposed on the basis of the correlation of their
expression with the accumulation profiles of morphinans. The discovery of these two
genes, thebaine-6-O-demethylase (DIOX1) and codeine-O-demethylase (DIOX3), was thus made possible by the development of cDNA microarrays from an opium poppy
EST (expressed sequence tag) database [41].
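The candidate-selection logic described above (genes whose expression tracks morphinan accumulation across varieties) can be illustrated with a minimal Python sketch. It ranks genes by the Pearson correlation between their expression profile and a metabolite profile. The gene names, the values, and the helper `rank_candidates` are hypothetical and serve only as an illustration; this is not the analysis reported in [41].

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)

def rank_candidates(expression, metabolite):
    """Rank genes by how well their expression tracks the metabolite level."""
    scores = {gene: pearson(profile, metabolite)
              for gene, profile in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one value per variety (five varieties).
morphine = [0.1, 0.5, 1.2, 2.0, 2.4]
expression = {
    "gene_A": [1.0, 2.1, 4.8, 8.2, 9.5],  # rises with morphine content
    "gene_B": [5.0, 5.1, 4.9, 5.0, 5.2],  # roughly constant
    "gene_C": [9.0, 7.5, 4.0, 2.0, 1.0],  # falls as morphine rises
}
ranking = rank_candidates(expression, morphine)
for gene, r in ranking:
    print(f"{gene}\t{r:+.2f}")
```

In a real dataset, the top-ranked genes would only be candidates: correlation across a handful of varieties does not establish function, which is why functional validation follows in the studies cited here.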
The advent of these “early” global gene expression resources in P. somniferum (EST collections, microarrays) heralded a new era in the study of BIA metabolism.
Additional gene expression resources based on next-generation sequencing were developed
and integrated with metabolomics and proteomics studies in order to identify novel
gene candidates [42]. As an example of this approach, known cytochrome P450 genes of the CYP80B3 and CYP82N3
subfamilies, responsible for hydroxylating (S)-N-methylcoclaurine and protopine, respectively, were used as queries in a co-expression
analysis to discover additional BIA biosynthetic genes in several accessions of opium
poppy [43].
More recently, integrative approaches based on the combination of gene expression
analyses and metabolic profiling were also fundamental in unveiling the nature of
a biochemical step in BIA biosynthesis that had remained elusive for a long time.
The first step of the morphinan branch is the conversion of (S)-reticuline into its R stereoisomer; the reaction is a two-step process involving the oxidation of (S)-reticuline to 1,2-dehydroreticuline and the subsequent reduction to (R)-reticuline. Although the reaction was expected to be catalyzed by two separate
enzymes, in agreement with its being a two-step process, screening of transcriptome
libraries from opium poppy identified instead a single fused gene composed of two
domains. This gene, named STORR (from S- to R-reticuline), encodes a unique bifunctional protein containing a P450 monooxygenase
at the N-terminus and an oxidoreductase at the C-terminus [44], [45]. The genetic analysis of opium poppy mutants with impaired synthesis of morphine and high
accumulation of reticuline confirmed STORR as the causal locus for the epimerization of S- to R-reticuline. Bifunctional genes like STORR, including monooxygenases fused with various additional domains (hydrolase, dioxygenase),
have also been found in secondary metabolic pathways of other organisms (e.g., fungi
[46]); it is thus possible that the occurrence of these genes represents a form
of metabolic channeling of higher efficiency, in which highly unstable intermediates (like
those formed during an epimerization reaction) are converted into final products by
the action of a single protein rather than by a multienzymatic assembly [44].
Another example of the application of integrative approaches to the metabolism of
BIAs lies in the elucidation of the biosynthesis of noscapine. This alkaloid belongs
to the phthalideisoquinoline subgroup of BIAs; it was already widely used for its antitussive
properties but has more recently been shown to possess antitumor activity, given
its ability to bind tubulin and arrest cell division in a number of cancer cell lines
[47]. It was later shown that noscapine specifically targets the NF-κB signaling pathway in tumor cells, repressing proteins involved in cell invasion
and tumor proliferation [48]. Early radiolabeling experiments in the 1960s traced the origin of noscapine
to (S)-scoulerine [49], which is produced starting from (S)-reticuline by the action of a FAD-linked (FAD: flavin adenine dinucleotide) oxidoreductase
(BBE, berberine-bridge enzyme). From (S)-scoulerine, the synthesis of noscapine requires at least six additional biosynthetic
steps, including O- and N-methylations and several oxidations, but only recently could the complete pathway
to noscapine be elucidated in detail. The clarification of the pathway was made possible
thanks to the availability of opium poppy varieties accumulating different amounts
of noscapine and morphinans. Stems and capsules of these varieties were subjected
to RNA sequencing and metabolic profiling to identify genes specifically expressed
by the high-noscapine variety (HN1). A number of O- and N-methyltransferases, along with several cytochrome P450s, were found to be highly
expressed only in the high-noscapine variety. Genomic analysis showed that these genes
were actually exclusive to HN1. An mQTL (metabolic quantitative trait locus) analysis
for noscapine content in an F2 population identified a single locus that was
strongly linked to the high-noscapine phenotype in the segregating generation.
The locus contained a cluster of 10 genes spanning 220 kbp; the clustered genes corresponded
to those previously identified as being exclusively present in HN1 [50]. The reconstruction of the pathway was also supported by virus-induced silencing
of the cluster genes, which made it possible to confirm the role of each gene by measuring
the accumulation of the various intermediates [50]. The occurrence of the high-noscapine cluster is not a feature unique to BIA metabolism:
cluster organization is in fact a recurrent feature in the genomic organization of
pathway genes of secondary metabolism [51].
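The idea behind the mQTL analysis mentioned above can be sketched as a single-marker scan. For each marker, the metabolite content of the F2 plants is regressed on the genotype, and the marker explaining the largest share of variance points to the linked locus. The sketch below assumes an additive genotype coding (0, 1, or 2 copies of the HN1 allele) and uses invented marker names and values; it is a simplified illustration, not the procedure used in [50].

```python
def r_squared(genotype, phenotype):
    """Fraction of phenotypic variance explained by a linear fit on genotype."""
    n = len(genotype)
    mg, mp = sum(genotype) / n, sum(phenotype) / n
    cov = sum((g - mg) * (p - mp) for g, p in zip(genotype, phenotype))
    var_g = sum((g - mg) ** 2 for g in genotype)
    var_p = sum((p - mp) ** 2 for p in phenotype)
    if var_g == 0 or var_p == 0:
        return 0.0  # monomorphic marker or constant trait
    return cov * cov / (var_g * var_p)

def scan_markers(marker_genotypes, phenotype):
    """Score every marker independently (single-marker regression)."""
    return {marker: r_squared(genotype, phenotype)
            for marker, genotype in marker_genotypes.items()}

# Hypothetical F2 plants: noscapine content (arbitrary units) and genotypes
# coded as the number of HN1-derived alleles at each marker (assumption).
noscapine = [0.0, 0.1, 1.1, 0.9, 2.1, 1.9, 1.0, 0.0]
markers = {
    "m1": [0, 0, 1, 1, 2, 2, 1, 0],  # co-segregates with noscapine content
    "m2": [1, 0, 2, 1, 0, 2, 1, 1],  # unlinked marker
}
scores = scan_markers(markers, noscapine)
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Dedicated QTL software additionally models recombination between markers and assesses significance (for example, by permutation); neither is attempted here.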
The knowledge acquired so far on the biosynthesis of medicinally important BIAs has
of course allowed the transfer of partial or entire pathways into non-plant hosts.
Strategies for chemical synthesis of morphinans (e.g., morphine, codeine, aka “opiates”)
have in fact proved not to be economically feasible; therefore, the licit
cultivation of opium poppy remains the only source of opiates, from which several
derivatives (“opioids”) can also be obtained through semisynthesis (e.g., hydrocodone
[52]). Synthesis of thebaine and hydrocodone, for example, has been achieved in yeast
starting from common precursors of primary metabolism. This required the (over)expression
of over 20 genes from yeast itself, plants (P. somniferum and Papaver bracteatum Lindl.), bacteria, and mammals. Many of the genes transformed into yeast were specifically
engineered to increase the activity and stability of the encoded enzymes (e.g., through site-specific mutagenesis
to make the enzymes less sensitive to feedback inhibition or to modify their glycosylation
patterns). Although the fermentation titers for the production of thebaine and hydrocodone
remained low, especially when compared with the yields obtained with
direct purification from opium or semisynthesis, the results obtained so far represent
a starting point for further optimization of an alternative strategy for opioid production
[53], [54]. Similar strategies have also been followed in Escherichia coli
[55] and in yeast for the synthesis of dihydrosanguinarine, a BIA showing antitumor activity
[56].
Monoterpenoid Indole Alkaloids
MIAs represent another class of important alkaloids whose biosynthesis has been studied
in detail due to their diverse pharmacological effects. Vinblastine and vincristine,
for example, two MIAs of the bisindole type isolated from the plant Catharanthus roseus (L.) G. Don (Madagascar periwinkle, family Apocynaceae), show toxicity to white blood
cells of mammals and are used today as effective medications to treat tumors like
lymphoma and myeloma [57]. Other important MIAs include camptothecin, an inhibitor of DNA topoisomerase I
isolated from the tree Camptotheca acuminata Decne (Nyssaceae) (irinotecan, a semisynthetic derivative of camptothecin, is one
of the most widely used chemotherapeutics in the treatment of colon cancer), and quinine,
an antimalarial isolated from the bark of the Cinchona trees (Cinchona spp.). Quinine is still in use today, although it has been replaced by artemisinin
as the recommended first-line treatment for malaria.
MIAs constitute a large family, with over 3000 structures identified to date. They
are mostly confined to plants of the order Gentianales, in the families Apocynaceae,
Loganiaceae, and Rubiaceae. The species C. roseus, which synthesizes over 150 different MIAs, has emerged in this case as the model
plant for studying the biosynthesis and regulation of this important class of alkaloids
[30].
The biosynthetic pathway of MIAs is complex. As an example, the complete biosynthesis
of vinblastine in C. roseus proceeds through at least 30 enzymatic steps, which take place in several different
tissues (phloem-associated parenchyma, epidermis, mesophyll, laticifer) and subcellular
compartments (plastid, nucleus, ER [endoplasmic reticulum], and vacuole) [58], [59], [60], [61]. The chemical complexity of most of the active MIAs hampered developments in chemical
synthesis; this factor, combined with the generally low amounts of MIAs recovered from
plant sources, drove efforts toward the elucidation of biochemical pathways as a necessary
step to develop metabolic engineering strategies. We will thus first summarize here
the main branches of the MIA biosynthetic pathway to later focus on the recent discoveries
made in the elucidation of the steps that were previously poorly characterized.
As the name suggests, all MIAs contain a terpenoid and an indole moiety. The terpenoid
moiety derives from secologanin, a cyclic monoterpene formed from geraniol. The indole
moiety of MIAs derives instead from tryptamine, the product of the decarboxylation
of tryptophan. Tryptamine and secologanin then condense to give rise to strictosidine,
the precursor of all MIAs. The whole pathway thus consists of four main parts:
- The first part is the synthesis of geraniol through the plastid MEP (methylerythritol
4-phosphate) pathway. Although two different routes exist in plants for the synthesis
of terpenoid precursors (the cytosolic mevalonate and the plastidic MEP pathway [62]), early labeling studies supported the origin of the terpene moiety of MIAs from
the MEP pathway [63].
- The second part is the conversion of geraniol into secologanin in a series of eight
steps that have been elucidated recently (iridoid pathway [64], [65], [66], [67], [68]) ([Fig. 2]).
- The “mid-pathway” then involves the formation of strictosidine starting from secologanin
and tryptamine [69], its deglycosylation [70], and a series of downstream transformations whose steps have been clarified, in
part, only recently [71], [72] ([Fig. 3]).
- Finally, the “late-pathway” converts tabersonine, a downstream product of strictosidine,
into vindoline, an immediate precursor of vinblastine [73], [74], [75] ([Fig. 3]).
Fig. 2 MIA biosynthesis (iridoid pathway) in C. roseus. The entire pathway is composed of eight steps converting geraniol into secologanin.
Geraniol is mainly derived from the plastidial MEP pathway. The early steps in the
pathway, up to the synthesis of loganic acid, take place in the phloem-associated
parenchyma (vascular cells), while the last two steps in the pathway have been localized
to the epidermal cells. The transporter responsible for moving loganic acid between
the two cell types has not been identified yet. 10HGO: 10-hydroxygeraniol oxidoreductase;
IS: iridoid synthase; IO: iridoid oxidase; 7-DLGT: 7-deoxyloganetic acid glucosyltransferase;
7-DLH: 7-deoxyloganic acid hydroxylase; LAMT: loganic acid methyltransferase; SLS:
secologanin synthase.
Fig. 3 “Mid” and “late” pathway steps in the biosynthesis of MIAs in C. roseus. The first step is the condensation between secologanin (end product of iridoid
biosynthesis) and tryptamine to form strictosidine in the vacuole of epidermal cells.
Strictosidine is then exported from the vacuole into the cytosol through a transporter
of the nitrate/peptide family (CrNPF2.9). The deglycosylated form of strictosidine
(strictosidine aglycone) is the central biosynthetic intermediate of many MIA types.
Vindoline, for example, derives from tabersonine and accumulates in laticifers; preakuammicine
is instead the precursor of catharanthine, which is then exported to the leaf surface
via another transporter, CrTPT2. Leaf damage or herbivory can cause cell disruption,
allowing catharanthine and vindoline to react together and form the dimeric MIA vinblastine.
TDC: tryptophan decarboxylase; STR: strictosidine synthase; SGD: strictosidine beta-glucosidase;
D4H: desacetoxyvindoline 4-hydroxylase; DAT: deacetylvindoline 4-O-acetyltransferase. Dashed arrows indicate multiple steps.
The first approaches in the elucidation of the steps of MIA biosynthesis were mostly
based on conventional strategies starting from the purification of the individual enzymes,
analysis of their amino acid (AA) sequences, and isolation of full-length clones from
cDNA libraries using degenerate primers. This was the approach followed, for example,
for the identification of geraniol-10-hydroxylase (G10H), the enzyme responsible for the hydroxylation of geraniol, the first step of the
iridoid pathway [76], [77] and for the purification and cloning of strictosidine beta-glucosidase [70], [78]. More recently, several transcriptome resources and databases have been developed
in C. roseus, and these have been used for initial selection of candidate genes of MIA biosynthesis
[79], [80], [81], [82], [83], [84], [85].
As an example of this approach, transcriptome datasets from several tissues of a C. roseus plant [68], [86] have been screened to identify the gene responsible for an elusive step in iridoid
biosynthesis, the cyclization reaction of 10-oxogeranial into iridodial (iridoid synthase).
Since the reaction was known to occur in the presence of NADH (nicotinamide adenine
dinucleotide [reduced])/NADPH (nicotinamide adenine dinucleotide phosphate [reduced]),
the genes using these two cofactors were first selected from the entire transcriptome
dataset; then only the transcripts showing a similar expression profile to that of
G10H (an upstream gene in the same pathway) were retained and considered as candidates
for iridoid synthase. The transcript showing the highest correlation to G10H was selected for functional validation. The expression of the enzyme in E. coli showed that it was able to convert 10-oxogeranial into cis-trans nepetalactol (which is in equilibrium with cis-trans iridodial), and VIGS (virus-induced gene silencing) of the candidate gene in C. roseus confirmed downregulation of the transcript and the lower accumulation of several
MIAs downstream of iridoid synthase (e.g., vindoline and catharanthine) [68]. Mining the expression databases from C. roseus and analyzing coregulation with additional known genes proved useful also
for the discovery of other genes involved in the remaining steps of iridoid biosynthesis
[64].
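The two-stage candidate selection described for iridoid synthase (first filter transcripts by predicted cofactor use, then keep those co-expressed with a known pathway gene) can be sketched as follows. The transcript names, the boolean `uses_nadph` annotation, the expression values, and the correlation threshold of 0.8 are assumptions made for illustration only and are not taken from [68].

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def select_candidates(transcripts, bait_profile, min_r=0.8):
    """Keep NAD(P)H-annotated transcripts co-expressed with the bait gene."""
    hits = []
    for name, info in transcripts.items():
        if not info["uses_nadph"]:      # stage 1: cofactor annotation filter
            continue
        r = pearson(info["profile"], bait_profile)
        if r >= min_r:                  # stage 2: co-expression with the bait
            hits.append((name, r))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical expression of the bait (G10H) across five tissues.
g10h = [2.0, 8.0, 1.0, 6.0, 3.0]
transcripts = {
    "t1": {"uses_nadph": True,  "profile": [1.9, 8.5, 1.2, 5.5, 3.1]},
    "t2": {"uses_nadph": False, "profile": [2.0, 8.0, 1.0, 6.0, 3.0]},
    "t3": {"uses_nadph": True,  "profile": [5.0, 1.0, 6.0, 2.0, 4.0]},
}
candidates = select_candidates(transcripts, g10h)
print(candidates)
```

Here only `t1` survives both filters: `t2` is co-expressed but lacks the cofactor annotation, and `t3` has the annotation but an opposite expression pattern.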
One of the most interesting features of MIA biosynthesis is the spatial distribution
of its enzymes. The various parts of the pathway operate in fact in different cell
types: (i) the MEP reactions and the early reactions of iridoid biosynthesis occur
in the phloem-associated parenchyma; (ii) the remaining steps of the iridoid pathway
and the “mid” reactions take place in the epidermis, while (iii) the reactions of
the late pathway occur in laticifers [75]. Adding to this complexity, the reactions taking place in the leaf epidermis are
also compartmentalized at the subcellular level: the condensation of tryptamine and
secologanin to form strictosidine occurs in fact in the vacuole, while the downstream
transformations of strictosidine occur in the nucleus and in the cytosol [87]. In particular, the physical separation between the synthesis of strictosidine (vacuole)
and its immediate successive step, deglycosylation (nucleus), implies the existence
of an export system from the vacuole. Transporter genes have long remained elusive
in MIA biosynthesis, with only two systems characterized to date: the export of catharanthine
(an immediate precursor of vinblastine) to the leaf surface [88] and the sequestration of vindoline inside the vacuole of mesophyll cells [89]. Also in this case, however, the recent developments of transcriptome resources,
combined with functional studies in planta, allowed the elucidation of a transporter gene responsible for the export of strictosidine
from the vacuole to the cytosol [90]. In order to identify transporter genes, self-organizing maps (SOMs) have been used
to cluster all transcript contigs according to the similarity of their expression
profiles across a wide range of tissues and developmental stages. The high-quality
nodes of the SOMs that contained known MIA biosynthetic genes were then retained and
inspected for the presence of putative transporter genes. This led to the identification
of a candidate transporter of the NPF (nitrate/peptide family), CrNPF2.9. Further analysis confirmed the role of this gene in the export of strictosidine
from the vacuole. For example, transient silencing of CrNPF2.9 in leaves of C. roseus led to a necrotic phenotype, probably as a result of the increase in the vacuolar
accumulation of strictosidine [90].
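The SOM-based search described above can be illustrated with a deliberately small, dependency-free sketch: expression profiles are z-scored, a one-dimensional SOM is trained, and only the nodes that contain known pathway genes are inspected for additional members. The gene names, expression values, grid size, learning schedule, and the deterministic initialization (interpolating between the first and last profile) are all assumptions of this sketch; the study in [90] worked on full transcriptome data and this is not its implementation.

```python
from math import exp, sqrt

def zscore(profile):
    """Scale a profile to zero mean and unit variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = sqrt(sum((v - mean) ** 2 for v in profile) / n)
    return [(v - mean) / sd for v in profile] if sd > 0 else [0.0] * n

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, vector):
    """Index of the SOM node whose weight vector is closest to the input."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], vector))

def train_som(data, n_nodes=3, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D SOM (n_nodes >= 2) with a Gaussian neighborhood."""
    first, last = data[0], data[-1]
    # Deterministic start: nodes evenly spaced between two input profiles.
    weights = [[f + (l - f) * j / (n_nodes - 1) for f, l in zip(first, last)]
               for j in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr, sigma = lr0 * decay, max(sigma0 * decay, 0.1)
        for vector in data:
            bmu = best_matching_unit(weights, vector)
            for j, w in enumerate(weights):
                h = exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                weights[j] = [wi + lr * h * (vi - wi) for wi, vi in zip(w, vector)]
    return weights

def co_clustered_genes(profiles, known_genes):
    """Return unknown genes that share a SOM node with any known gene."""
    names = list(profiles)
    data = [zscore(profiles[name]) for name in names]
    weights = train_som(data)
    node_of = {name: best_matching_unit(weights, vec)
               for name, vec in zip(names, data)}
    keep = {node_of[gene] for gene in known_genes}
    return {name: node for name, node in node_of.items()
            if node in keep and name not in known_genes}

# Hypothetical expression across five tissues (arbitrary units).
profiles = {
    "STR": [1, 5, 9, 5, 1],                    # known MIA gene
    "TDC": [5, 13, 21, 13, 5],                 # known MIA gene, same shape
    "transporter_X": [0.5, 2.5, 4.5, 2.5, 0.5],
    "housekeeping_1": [2, 4, 6, 8, 10],
    "transporter_Y": [1, 2, 3, 4, 5],
}
candidates = co_clustered_genes(profiles, {"STR", "TDC"})
print(candidates)
```

Z-scoring makes the clustering depend on the shape of a profile rather than on its absolute level, so `transporter_X` ends up in the same node as the known genes even though it is expressed at a lower level. On real data a two-dimensional grid, random or PCA-based initialization, and a node-quality criterion would normally be used.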
As in the case of BIAs, several strategies have been attempted also for production
of MIAs in microbial hosts. The commercial production of vincristine and vinblastine,
for example, which are powerful therapeutic agents for the treatment of several forms
of blood cancer, relies entirely on extraction from plant sources. Most of the active
MIAs, however, including vincristine and vinblastine, are produced in extremely low
amounts, so their extraction from plant tissues is uneconomical and laborious for
commercial production. The first attempt to produce MIAs in microbial hosts focused
on the production of strictosidine in yeast. Strictosidine represents in fact the
central precursor for a number of MIAs of medical importance (vincristine, vinblastine,
quinine, strychnine, ajmalicine). Reconstitution of the pathway in Saccharomyces cerevisiae required the integration of a total of 21 genes; of these, 15 represented the entire
known plant MIA pathway, while the remaining six were either duplications of endogenous
yeast genes or animal-derived sequences. The transformed yeast strain also contained
targeted deletions of endogenous genes to decrease the flux into competing routes.
As already reported for opiate production in yeast, the final yield
of strictosidine (around 0.5 mg/L) nevertheless remained too low for commercial production
[91]; this yeast strain can in any case serve as the basis for further
optimization of the flux toward strictosidine or as a starting point for the synthesis
of non-natural products [92].
Cannabinoids
Cannabinoids constitute a group of terpenic alkylresorcinols found in Cannabis sativa L., a dioecious plant of the Cannabaceae family. They accumulate in the glandular
cavity of specific types of trichomes (capitate sessile or stalked trichomes), which
are particularly abundant in female flowers and, to a lesser extent, in other parts
of the plant (e.g., leaves, shoots). More than 120 different cannabinoids have been
isolated to date [16], although the study of their medical and pharmacological effects focused on the
most abundant ones, tetrahydrocannabinol (THC) and cannabidiol (CBD) [93], [94]. Scientific studies on the medical effects of cannabinoids were stimulated by anecdotes
reported by people who used to smoke cannabis to relieve pain or to treat a number
of conditions (loss of appetite, insomnia). Cannabis in fact represents one of the
first plants used for medicinal purposes. The first reports of
its medical use date back to 2700 BC, when teas and other infusions were already prepared
in China to relieve the symptoms of rheumatism and arthritis. Also, archeological evidence
from a burial cave near Jerusalem, dating back to 390 BC, documents the use of smoked
cannabis to relieve pain. In addition to its use as a medicine, cannabis has long
been used as a source of textile fibers (“bast” fibers) and as a recreational psychoactive
drug to achieve a mental high. Zoroastrian priests and shamans (~ 500 BC),
for example, used cannabis to reach ecstasy during their religious ceremonies [95]. Today, fiber-type cannabis plants continue to be used as a source of fiber in the textile
and bioplastic industries [96], while marijuana-type cannabis represents one of the most highly consumed recreational
drugs in the world. Despite the strict regulations around cannabis research, several
cannabinoid preparations have been tested in controlled trials for relieving symptoms
associated with cancer or HIV [97].
The isolation and structural elucidation of cannabinoids began in the 1940s with the
isolation of cannabinol and cannabidiol [98], [99], but it was not until 1964 that the structure of Δ9-THC–the main psychoactive component–was reported [100]. In a series of papers from the 1990s, it was found that THC exerts its effects
through binding to two different receptors in the human body: CB1, which is present
in the brain [101], [102], and CB2, which is instead mainly located in the immune system [103]. The characterization of these receptors led to the discovery of additional substances
produced by the human body that also target the cannabinoid receptors [104]. These endogenous ligands were named endocannabinoids to distinguish them from the
phytocannabinoids produced in the trichomes of the cannabis plant. We now know that
the interaction between endocannabinoids and CB1/CB2 constitutes the “endocannabinoid
system”, a central regulator of homeostasis in the human body. Typical responses mediated
by this system include pain perception, memory, appetite, immunity, and, of course,
the neurological responses induced by the psychoactive Δ9-THC [105].
Although more than 120 phytocannabinoids have been reported in the literature, their
biosynthesis has been fully described only for the most abundant components, THCA
(tetrahydrocannabinolic acid) and CBDA (cannabidiolic acid) ([Fig. 4]). THCA is the most abundant cannabinoid in marijuana-type plants, while CBDA, which
does not possess psychoactive properties, is instead the most abundant in hemp (fiber-type
plants). We will present here some examples to show the advances made in the elucidation
of the steps in the core cannabinoid pathway. While the first steps to be defined,
historically, were based on classical enzyme purification approaches and homology-based
cloning of the corresponding genes, more recently the development of genomic and
transcriptomic resources in cannabis has helped to clarify additional biosynthetic
steps [106], [107], [108]. Also, at least initially, the elucidation of the cannabinoid pathway was made difficult
by the low incorporation of labeled precursors [109] and by the fact that cannabinoids occur in vivo as carboxylic acids but are then decarboxylated to neutral (active) forms during
heating or smoking.
Fig. 4 Biosynthetic pathways of the major phytocannabinoids, Δ9-THC and CBD. The alkylresorcinol
(phenolic lipid) moiety of cannabinoids derives from the polyketide pathway, in which
hexanoyl-CoA is first condensed with three molecules of malonyl-CoA by the action
of tetraketide synthase (TKS) and then cyclized to form olivetolic acid (OA) in a reaction catalyzed by OA cyclase (OAC). The
addition of geranyl pyrophosphate (GPP), from the plastidial MEP pathway, then generates cannabigerolic acid (CBGA), the immediate
precursor of Δ9-THCA and CBDA. Δ9-THCA (and its decarboxylated form, Δ9-THC) represent
the psychoactive compounds of marijuana-type plants. The most abundant cannabinoid
in hemp (fiber-type cannabis) is instead the non-psychoactive CBDA.
All phytocannabinoids are formed by an alkylresorcinol (phenolic) moiety coupled to
a monoterpene ([Fig. 4]). Labeling studies using 13C-glucose showed that the monoterpene moiety derived from the plastidial MEP pathway,
while the alkylresorcinol was produced through the polyketide pathway [110]. The first step in the synthesis of THCA and CBDA is the condensation of olivetolic
acid (OA, an alkylresorcinol) with geranyl pyrophosphate (GPP), leading to cannabigerolic
acid (CBGA), the immediate precursor of THCA and CBDA. The reaction is catalyzed by
an aromatic prenyltransferase (geranyl pyrophosphate: olivetolate geranyltransferase,
GOT), which was isolated in 1998 [111]. The gene (CsPT) was later cloned and shown to be expressed in leaves, flowers, and trichomes [112], [113].
CBGA is then the substrate of two different FAD-dependent oxidases: tetrahydrocannabinolic
acid synthase (THCA synthase) and cannabidiolic acid synthase (CBDA synthase), which produce THCA and CBDA, respectively. The two genes, which share 84% similarity,
are located at different loci [114]. Both THCA and CBDA synthase were purified from crude extracts, guided by enzymatic assays,
and their respective genes were cloned using degenerate PCR primers (THCA synthase: [115]; CBDA synthase: [116], [117]).
The steps leading to the synthesis of the alkylresorcinol precursor of cannabinoids,
OA, long remained elusive, however, and were clarified only recently.
OA was long thought to be synthesized from hexanoyl-CoA
through successive condensations with three molecules of malonyl-CoA, in a series
of steps catalyzed by a type III polyketide synthase (PKS, [118], [119]). A type III PKS cloned from cannabis leaves (named tetraketide synthase, TKS), however, did not produce OA and was instead shown to form, among other byproducts,
α-pyrones [120]. These metabolites are typical products of polyketide pathways in bacteria
lacking polyketide cyclase activity [121]. On this basis, candidates with structural similarity to polyketide cyclases
were selected from an EST library of cannabis trichomes, leading to the identification
of a member of the dimeric α+β barrel protein superfamily (DABB superfamily). The encoded protein, which is distantly related
to type II polyketide cyclases of bacteria (Streptomyces), was able to convert, in the presence of TKS, hexanoyl-CoA and malonyl-CoA into
OA, acting effectively as a noncanonical polyketide cyclase [107]. A similar approach, based on mining the same EST database from cannabis trichomes,
was also used to identify the acyl-activating enzyme responsible for the synthesis
of hexanoyl-CoA, the first step of the polyketide pathway in cannabinoid biosynthesis
[108].
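The stoichiometry of the core pathway described above can be checked with a simple carbon count. The sketch below is only bookkeeping under standard chemical assumptions that are not spelled out in the text: the hexanoyl starter contributes six carbons, each malonyl-CoA extender contributes two (the third leaves as CO2), GPP contributes ten, and the oxidative cyclization of CBGA by THCA or CBDA synthase neither adds nor removes carbon. Function and constant names are hypothetical.

```python
# Carbon contributed by each building block (standard chemistry, stated as assumptions).
HEXANOYL_CARBONS = 6          # starter unit derived from hexanoyl-CoA
MALONYL_EXTENDER_CARBONS = 2  # per malonyl-CoA; one carbon is lost as CO2 on condensation
GPP_CARBONS = 10              # geranyl pyrophosphate, the C10 monoterpene donor

def olivetolic_acid_carbons(n_malonyl=3):
    """Carbons in OA after TKS condensation and cyclization by OAC."""
    return HEXANOYL_CARBONS + n_malonyl * MALONYL_EXTENDER_CARBONS

def cbga_carbons():
    """Carbons in CBGA after prenylation of OA with GPP."""
    return olivetolic_acid_carbons() + GPP_CARBONS

def decarboxylate(carbons):
    """Heating or smoking removes the carboxyl group as CO2 (one carbon)."""
    return carbons - 1

oa = olivetolic_acid_carbons()
cbga = cbga_carbons()
thca = cbga                   # oxidative cyclization: carbon count unchanged
thc = decarboxylate(thca)
print(f"OA: C{oa}, CBGA: C{cbga}, THCA/CBDA: C{thca}, THC/CBD: C{thc}")
```

The totals (C12 for OA, C22 for CBGA, THCA, and CBDA, C21 for THC and CBD) agree with the molecular formulas of these compounds, which is a quick way to confirm that three malonyl-CoA units and one GPP account for the whole cannabinoid skeleton.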
The elucidation of the steps in the biosynthesis of the main phytocannabinoids opened
the possibility to transfer the pathway to heterologous hosts for commercial production
of THCA/THC and CBDA/CBD. These two cannabinoids have in fact several pharmacological
effects. THC, the neutral psychoactive form of THCA, targets mainly the CB1 receptor
in the central nervous system and has analgesic and antispastic activities. Its consumption
is, however, associated with well-known side effects (memory loss, decreased coordination,
and, in some individuals, anxiety [122]). CBD, on the other hand, may reduce the side effects of THC and has shown pharmacological
potential to reduce inflammation and symptoms of epilepsy [123]. Sativex, which is the only cannabinoid-based drug approved so far in 27 countries,
is a mouth spray of THC and CBD. This drug is used today to treat the spasticity associated
with multiple sclerosis [94]. Given the potential shown by THC and CBD, various strategies have been attempted
in metabolic engineering of cannabinoids. Cell cultures of cannabis, even in the presence
of elicitors, have resulted in limited yields, probably due to the lack of compartmentalization
required by the high toxicity of cannabinoids [124], [125]. A more promising approach might be the production of THCA synthase
in Pichia pastoris and its use in a cell-free, two-liquid-phase reactor to drive the synthesis of THCA.
This system too, however, achieved relatively low productivity (0.121 g · L−1 · h−1 of THCA), probably because THCA synthase is inhibited
by its substrate [126], [127].
Today, the regulations around the use of cannabis, and the research around it, are
becoming less strict. Several European countries and U.S. states have exemptions
for the medical use of marijuana; other U.S. states have legalized cannabis consumption,
in moderate amounts, for personal use. Canada and Israel have funding bodies and programs
specific for cannabis research. As the regulations on cannabis research ease,
we anticipate the development of additional genomic and metabolomic resources in
cannabis. The integration of these resources will aid the elucidation of the full
biosynthetic pathways of cannabinoids, opening the way to the discovery of novel compounds
of potential medicinal importance.
Caffeine
Caffeine (1,3,7-trimethylxanthine) is a xanthine (purine) alkaloid found in coffee, guarana,
yerba maté, cacao, and several species used to make tea. Traditionally, it is called
guaranine when it comes from the guarana plant (Paullinia cupana Kunth, family Sapindaceae), theine when it comes from the tea plant (Camellia sinensis (L.) Kuntze, family Theaceae), and mateine in mate infusions; however, these are all
the same compound. In addition, cacao, which accumulates only trace amounts of caffeine,
contains the related compound theobromine, which has similar, albeit less potent,
bioactivities. Of the species listed above, genome sequences for coffee
[128], tea [129], and cacao [130] have been published, indicating that at least three metabolic pathways for caffeine
biosynthesis evolved independently, co-opting genes from different gene families. The
appearance of at least three pathways for caffeine biosynthesis in higher plants is
thus an example of recurrent convergent evolution: the presence of caffeine per se in species from multiple plant orders (Malvales, Sapindales, Ericales, and Gentianales)
does not always imply the recruitment of homologous genes [29] ([Fig. 5]). Intriguingly, this study, which relied on sequence information from five flowering
species, revealed that caffeine biosynthesis was characterized by an even greater
degree of convergent evolution than was previously thought, with citrus, cacao,
and guarana plants containing two previously unknown pathways of caffeine synthesis
using either caffeine synthase or xanthine methyltransferase-like enzymes. Moreover,
ancestral sequence reconstruction revealed that these pathways would have arisen rapidly,
since the ancestral enzymes were co-opted from their previous biochemical roles to
those of caffeine biosynthesis. As such, this seminal paper provides an excellent blueprint
for studies into the evolution of natural product biosynthesis.
Fig. 5 Pathways of caffeine biosynthesis. The synthesis of caffeine evolved
independently in several orders of eudicots. Two different gene families have been
recruited to synthesize caffeine: (i) caffeine synthases (CS), which sequentially
methylate xanthine (in cacao and guarana) or xanthosine (in Camellia sinensis) to eventually produce caffeine; (ii) xanthine methyltransferases (XMTs), which are instead active in the flowers
of Citrus sinensis and in coffee (Coffea arabica). Different substrate specificities of CS and XMT enzymes gave rise to at least three
main pathways in caffeine-accumulating plants. The first pathway represents the CS
lineage and is the route present in cacao and guarana (red); the second pathway is
the synthesis of caffeine operated by the XMT genes (Citrus sinensis and Coffea arabica, blue); Camellia sinensis has instead recruited the genes in the CS lineage but synthesizes caffeine through
the same sequence of intermediates detected in Coffea arabica (green). Guarana and Citrus sinensis, although both members of the Sapindales, have converged on caffeine synthesis co-opting different genes. CS: caffeine synthase; XMT: xanthine methyltransferase.
Caffeineʼs exact function in planta is unclear, and two main roles, which are by no means mutually exclusive, have been
proposed. In the first of these, sometimes called the chemical defense theory, caffeine
is believed to protect young leaves and fruit from predators [131], [132]. In keeping with this, Uefuji et al. [133] demonstrated that leaves of transgenic tobacco (Nicotiana tabacum L. [Solanaceae]) plants, engineered to produce caffeine, were less susceptible to
insect feeding than leaves that did not contain caffeine. In the second, sometimes
known as the allelopathic theory, caffeine is believed to be released by the seed
coat to prevent germination of other seeds [134]. Evaluation of the cacao genome, the first of the three caffeine-containing species
to be sequenced, suggested that cacao harbors a rich repertoire of homologs of secondary
metabolism-associated genes, including pathways for oils, storage lipids, flavonoids,
and terpenes as well as the alkaloid class to which caffeine belongs. The analysis
of multiple metabolomics studies of this species suggests that functional prediction
of the gene repertoire mentioned above was indeed largely correct [135]. The evolution of the biosynthesis of caffeine, and indeed of its metabolic precursor theobromine, was, however,
examined in more detail following publication of the coffee and tea genomes [128], [129]. Intriguingly, the coffee genome was found to contain several species-specific gene
family expansions, including that of the xanthine N-methyltransferases (XMTs) involved in caffeine production, and these
genes were shown to have expanded through sequential tandem duplications independently of genes from
cacao and tea. As for cacao, a large number of metabolomics studies have been performed
on coffee and tea identifying high contents of caffeine, quinate, and chlorogenic
acid in the former [136], [137], [138] and catechins, terpenes, and caffeine in the latter [139], [140], [141]. Since there is also an increasing amount of transcriptomics data available for
these species [142], [143], [144], [145], [146], [147], [148], it would appear likely that evaluating the dynamic behavior of transcripts related
to caffeine biosynthesis in comparison to other unknown genes (and to the levels of
the metabolites themselves) will greatly enhance our understanding as to how these
pathways are controlled. One study of particular interest is the long-read sequencing
of the coffee bean transcriptome, since this provided more and longer transcript variants,
specifically allowing the identification of a further 10 transcripts likely to encode
key enzyme isoforms of caffeine biosynthesis [142]. This information thus greatly extends the number of candidate genes that are potentially
important determinants of the final caffeine level within plant cells, and their study
will thus prove instrumental in allowing rational design of metabolic engineering
strategies aimed at modifying caffeine content. In addition, two other studies, this
time in tea, have been highly informative in analyzing the regulation of caffeine
biosynthesis. The first of these built gene regulatory networks for secondary metabolism
of a wide range of tea tissues implicating a large number of transcription factors
in the regulation of caffeine biosynthesis [149]. The second article used a comparative transcriptomic and metabolomic analysis
of tea and oil tea, which does not produce caffeine, and found higher expression of
the key phenylpropanoid enzymes flavanone-3-hydroxylase, dihydroflavonol reductase,
and anthocyanidin reductase in tea but lower levels of phenylalanine ammonia-lyase
and chalcone isomerase; however, the exact link between this and the levels of caffeine
is not apparent from this study [150]. Thus, these studies offer hints about the regulation of caffeine biosynthesis; however, because these species are
recalcitrant to genetic manipulation, it will likely be several years before these hints can be
confirmed at the molecular level.
Caffeine is a compound whose medicinal properties are at least in part offset by its
addictive properties [151], [152], and as such, how healthy it actually is remains very much debated. That
said, much of the notion that coffee is dangerous springs from work in the 1970s and
1980s in which its consumption was linked to a higher incidence of cancer and heart
disease [153], [154]; however, much of this early research should be disregarded since it did not take
into account other health-detrimental habits in the cohorts, such as cigarette smoking.
More recent analyses evaluating health and diet data of a cohort of 400,000 adults
over a period of 13 years revealed no evidence that coffee consumption increased mortality
from these or any other diseases; if anything, there was a minor drop in
mortality rate among regular coffee drinkers [155]. Coffee has additionally been linked to lower rates of type 2 diabetes [155], reduced risk for some cancers [156], and protection against Parkinsonʼs disease [157], as well as inhibition of the propagation of hepatitis C virus [158]; however, as we detail below, at least some of these proposed functions remain very
much under debate. By contrast, caffeine has been suggested to inhibit lipid anabolism
and thereby have a contributory role in metabolic syndrome [159]. In addition, coffee consumption has been linked to the diversity of gut bacteria, and
caffeine is often added to painkillers in the belief that it improves analgesic efficacy [160]. Largely on the basis of its properties as a stimulant, overconsumption of caffeine
has a number of (short-term) health-negative effects including paranoia, restlessness,
anxiety, high blood pressure, very fast and abnormal heart rate, vomiting, and confusion
[161].
However, given the richness in terms of metabolic diversity of all species accumulating
caffeine and the specific medicinal implications of any one of their constituents,
it is clearly very hard to disentangle, as is the case of all food-based bioactives,
the health-positive effects of one from another.
That said, interestingly, several studies have shown that decaffeinated coffee has
the same health properties, suggesting–although by no means proving due to the small
amounts of residual caffeine in such beverages–that caffeine itself is not the bioactive
ingredient in such instances. This fact aside, the current consensus appears to be
that there are relatively few health-negative effects of caffeine (with the exception
of those following extreme consumption). Although the purported health-positive effects
remain somewhat contentious, it is likely that in the coming years they will be subjected
to close scrutiny, and only then will we be in a position to state categorically
whether caffeine is effective against any particular ailment.
Ginsenosides
Ginsenosides constitute a group of triterpenoid saponins that are exclusively produced
in plants of the Panax genus (family Araliaceae). The name “Panax” comes from Greek, meaning “all-healing,”
and refers to the medicinal properties of these plants. Of the nine existing Panax species, three in particular have been studied in relation to their pharmacological
activities: Panax ginseng C. A. Mey. (Chinese ginseng), Panax quinquefolium L. (American ginseng), and Panax notoginseng (Burkill) F. H. Chen [162]. These species have been, and still are, widely used in Chinese traditional medicine
to treat a number of ailments, including fatigue, anemia, rheumatism, and cardiac
disorders. The use of ginseng as a herbal remedy dates back to about 100 AD, when
it was believed that the dry root powder of this plant possessed miraculous healing
effects [163].
Ginsenosides accumulate during the normal development of the ginseng plant. The total
amount of ginsenosides has been shown to be higher in leaves of one-year-old seedlings
and mature roots [18]. The accumulation and composition of ginsenosides are regulated during growth, but
the exact mechanism remains unclear [164]. At least 150 naturally occurring ginsenosides have been described so far [165], and multiple benefits for human health have been reported, such as strong
antioxidant, antitumor, and anti-inflammatory activities.
Ginsenosides have been classified according to their chemical skeleton into two
types: dammarane- and oleanane-type ginsenosides. Based on the glycosides attached,
the dammarane ginsenosides are further divided into three different subgroups: PPD-type
(protopanaxadiol), PPT-type (protopanaxatriol), and ocotillol-type ([Fig. 6]).
Fig. 6 Ginsenoside biosynthesis. The crucial step in the generation of ginsenoside diversity
is the cyclization of 2,3-epoxysqualene. One of the cyclization reactions leads to
the production of β-amyrin, which is the precursor of the oleanane-type ginsenosides. An alternative cyclization
of 2,3-epoxysqualene, catalyzed by DDS, leads to the formation of dammarenediol, which
is then the precursor of ocotillol-, PPT-, and PPD-type ginsenosides. Compound K is
a dammarenediol-type ginsenoside isolated from human blood after oral administration
of P. ginseng and has not been detected so far in Panax plants. Many of the enzymatic steps in ginsenoside biosynthesis have not been
well characterized, but two gene families play key roles in generating ginsenoside
diversity: the CYPs and the UGTs. SE: squalene epoxidase; DDS: dammarenediol synthase; CYP: cytochrome P450; β-AS: β-amyrin synthase; OAS: oleanane acid synthase; GT: glycosyltransferase; UGT: UDP-glycosyltransferase.
Reactions with genes marked in red indicate hypothetical steps. Dashed arrows indicate
multiple steps.
Recent studies showed that the molecular structure of the ginsenosides is important
in defining their medical properties. The anticancer activities of these saponins
depend on the number of sugar molecules and on their attachment position [162]. Protopanaxadiol- and protopanaxatriol-type ginsenosides carrying no sugar residues,
or up to three sugar residues, inhibited different types
of cancer, while those carrying a higher number of sugar residues showed no or only
very weak antiproliferative effects [166], [167], [168]. Furthermore, it has been shown that the biological response to different types
of ginsenosides is also related to the number and positions of the hydroxyl groups,
which determine the polarity of these molecules and thus influence their interaction
with the cell membrane [169], [170], [171], [172]. Also, differences in stereochemistry were demonstrated to produce different pharmacological
effects [173].
The biosynthetic pathway of ginsenosides is not entirely characterized and many steps
still need to be elucidated. The studies so far show that the main precursor used
for the triterpene ginsenosides is squalene, which is formed from the condensation
of two farnesyl pyrophosphate (FPP) molecules. The synthesis of each FPP requires the
condensation of one dimethylallyl pyrophosphate (DMAPP) with two molecules of isopentenyl
pyrophosphate (IPP). IPP can be produced in the cytosol through the mevalonic acid
(MVA) pathway or in the plastid through the methylerythritol phosphate (MEP) pathway [62]. The role of the plastidial IPP is still unclear since ginsenoside biosynthesis
mainly relies on the pool of cytosolic IPP [174], although a certain degree of compensation was observed when
either the MVA or the MEP pathway was inhibited [175].
The crucial steps in the generation of ginsenoside diversity are the cyclization of
2,3-oxidosqualene by oxidosqualene cyclases (OSCs) and the subsequent hydroxylations
and glycosylations [176], [177] ([Fig. 6]). Dammarenediol synthase (DDS) is a member of the OSC family that is
found only in Panax species [18]. Its encoding gene has been characterized, and the reaction it catalyzes represents the very first step in ginsenoside
biosynthesis [178], [179].
The product of this enzymatic conversion is dammarenediol, which is the precursor
of three of the four types of ginsenosides: PPD-, PPT-, and ocotillol-type. In the
subsequent reactions, dammarenediol is hydroxylated in two consecutive steps,
first to protopanaxadiol and then to protopanaxatriol, by protopanaxadiol and protopanaxatriol synthases
(PPDS and PPTS, members of the cytochrome P450 family). Both protopanaxadiol and protopanaxatriol
are further glycosylated by uridine diphosphate (UDP)-dependent glycosyltransferases
(UGTs), most of whose genes remain to be identified. Extensive additional glycosyl decorations
give rise to the diversity of all detected ginsenosides [180]. Recent studies provided a better understanding of part of the PPT-type biosynthetic
pathway through the characterization of four P. ginseng UGTs catalyzing protopanaxatriol glycosylations [181].
The biosynthesis of the oleanane-type ginsenosides also starts from 2,3-oxidosqualene,
which is then cyclized to β-amyrin by β-amyrin synthase and converted to oleanolic acid by the action of oleanane acid synthase,
a member of the cytochrome P450 family. The remaining reactions, leading to glycosylated
oleanane-type ginsenosides, are catalyzed by additional UGTs whose genes have not been
identified so far ([Fig. 6]).
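The branching of the pathway described in the preceding paragraphs can be written down as a small directed graph. This is a minimal sketch that encodes only the steps named in the text (the ocotillol branch and the individual glycosylation steps are left out because they are not resolved here); the dictionary layout and the function name are illustrative choices, not an established resource.

```python
from collections import deque

# substrate -> list of (product, enzyme); only steps named in the text are included.
STEPS = {
    "2,3-oxidosqualene": [("dammarenediol", "DDS"), ("beta-amyrin", "beta-AS")],
    "dammarenediol": [("protopanaxadiol", "PPDS")],
    "protopanaxadiol": [("protopanaxatriol", "PPTS"),
                        ("PPD-type ginsenosides", "UGTs")],
    "protopanaxatriol": [("PPT-type ginsenosides", "UGTs")],
    "beta-amyrin": [("oleanolic acid", "OAS")],
    "oleanolic acid": [("oleanane-type ginsenosides", "UGTs")],
}

def enzyme_route(start, target):
    """Breadth-first search: enzymes needed to go from start to target, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, enzymes = queue.popleft()
        if compound == target:
            return enzymes
        for product, enzyme in STEPS.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, enzymes + [enzyme]))
    return None

print(enzyme_route("2,3-oxidosqualene", "protopanaxatriol"))
print(enzyme_route("2,3-oxidosqualene", "oleanolic acid"))
```

Writing the pathway this way makes the single branch point explicit: everything downstream of DDS is dammarane-type, everything downstream of β-amyrin synthase is oleanane-type, and no route connects the two branches once 2,3-oxidosqualene has been cyclized.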
In recent years, a novel dammarenediol-type ginsenoside (compound K) has been isolated
from human blood after oral administration of ginseng [182]. Interestingly, compound K has never been detected in Panax plants. The authors suggested that this novel ginsenoside could nevertheless represent
a minor component whose biosynthesis occurs in Panax plants, since the transcripts encoding two of the fundamental enzymes (CYP716A47
and UGTPg1) responsible for its formation are present in P. ginseng tissues. Compound K could have a number of beneficial effects on human health,
given the anticancer, antidiabetic, and anti-inflammatory properties observed in vitro
[183], [184]. Currently, compound K is produced by deglycosylation of PPD-type ginsenosides
[185].
Given the medicinal importance of ginsenosides, a number of bioengineering strategies
have been developed in order to increase their production and to compensate for the long time
required for field cultivation, which generally takes four to six years. Four
main strategies have been pursued to produce ginsenosides in native and heterologous
hosts: (i) cell and tissue cultures [186]; (ii) adventitious root cultures [187]; (iii) transgenic plants [188]; and (iv) engineered yeast systems [189].
The first tissue culture of ginseng was reported in 1964 [190], and many other successful studies followed afterward [191], [192]. The effects of different medium components on final product formation have
been evaluated, including sucrose (the most common carbon source in ginseng
cultures), phosphate, copper, and nitrate. These investigations showed that the rate
of biomass growth and the respective ginsenoside content correlated directly with
the medium sugar concentration (up to 60 g L−1). Higher sugar concentrations inhibited cell growth and had a negative impact on
ginsenoside production [193]. Phosphate, copper, and nitrate, at different concentrations, improved the ginsenoside
yield in cell cultures [194], [195].
An example of the tissue culture approach is the use of adventitious roots as high-biomass
producers to study the effect of different treatments or chemical elicitors [189], [196], [197]. As the major physiological role of the ginsenosides is related to plant defense
[198], [199], stress-inducing factors have been used in order to improve their production. Treatments
with methyl jasmonate and salicylic acid generally induced oxidative stress and increased
ginsenoside content, as did gamma-irradiation, which enhanced the final product
up to 16-fold [200], [201].
In addition to the cell and tissue culture methods, genetic engineering methods have
been used successfully to up- and downregulate key genes involved in ginsenoside biosynthesis,
such as those encoding 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR), squalene synthase (SS), cytochrome
P450s (CYPs), and DDS. Transgenic plants overexpressing these genes showed an increased
amount of ginsenosides [188], [202], [203], [204].
PPD, PPT, oleanolic acid, and compound K have
also been produced successfully in engineered yeast strains [185], [203], [205].
All these studies provide insight into the complex mechanisms of ginsenoside biosynthesis
and explore new methods for large-scale production of these important pharmacological
compounds. Nevertheless, much work remains to be done in order to further elucidate
the biochemical pathways leading to ginsenoside formation, as well as to clarify the
events responsible for their diversification in Panax species. Further studies are needed to improve the currently available platforms and
resources, as well as to advance knowledge of the clinical applications of ginsenosides.
Withanolides
Withanolides are a group of naturally occurring C-28 oxygenated steroidal lactone
triterpenoids that have been found in at least 15 genera of Solanaceae (e.g., Withania, Tubocapsicum, Lycium, and Datura, to mention a few). Their presence has also been reported in Fabaceae (legumes) and
Lamiaceae (the family to which most aromatic plants belong) [206]. Within Solanaceae, the shrub Withania somnifera (L.) Dunal (“Indian ginseng” or “Ashwagandha”) has been the focus of several pharmacological
studies, given its wide use in Ayurveda (the major system of Indian traditional medicine)
as a general tonic to increase vigor and memory and lessen the symptoms associated
with rheumatism, fatigue, and dehydration [207]. On the basis of the anecdotal reports from Ayurvedic practice, W. somnifera extracts were subjected to intense pharmacological scrutiny and were shown to possess
promising antitumor and anti-inflammatory properties [208], [209], [210].
Despite the growing relevance of withanolides in medical research (which we will cover
in detail further below), information about their biosynthetic routes and pathway
regulation in planta remains scarce. Over the past years, more than 200 different withanolides have been
isolated from roots, berries, and leaves of W. somnifera [19]; most pharmacological research has, however, focused almost exclusively
on Withaferin A ([Fig. 7]), the first withanolide to be isolated from W. somnifera
[211]. In general, we now know that the C28-steroidal lactones are biosynthesized from
the C5-terpenoid precursors IPP and DMAPP. As in the case of ginsenosides, the key
step in the synthesis of withanolides is the cyclization of 2,3-oxidosqualene. In
the biosynthesis of withanolides, the product of this reaction is cycloartenol, which
is then converted to 24-methylenecholesterol, the precursor of all withanolides ([Fig. 7]). Methylenecholesterol is then subjected to a series of hydroxylations, elongations,
glycosylations of the carbocyclic skeleton, and further cyclization of its side chain,
resulting in compounds with complex structural features [212], [213], [214], [215], [216]. According to the substituents on the C-17 side chain, withanolides
can be divided into two types: type A, with a δ-lactone or δ-lactol side chain, and type B, with a γ-lactone or γ-lactol side chain [217]. Some recent investigations have identified putative regulatory and structural genes
involved in withanolide biosynthesis [218], [219], [220], [221].
Fig. 7 Overview of withanolide biosynthesis. The precursor of all withanolides is 24-methylenecholesterol,
which undergoes hydroxylations and further modifications of the side chain
in a series of steps not yet completely elucidated. Methylenecholesterol is a downstream
product of cycloartenol, which is in turn derived from the cyclization of 2,3-epoxysqualene.
Withaferin A (red) was the first withanolide to be isolated from W. somnifera and is today the best characterized in terms of pharmacological effects. Abbreviations:
SE: squalene epoxidase; CAS: cycloartenol synthase. Dashed arrows indicate multiple
steps.
As we have already mentioned, in the past few decades, withanolides have attracted considerable
research attention, and several studies were carried out to investigate the pharmacological
and biological activities of this class of metabolites and their role in human medicine.
Withanolide extracts from W. somnifera were shown to possess anti-inflammatory, cytotoxic, and antitumor activities [222]; there are also indications that the administration of Withania extracts improved memory retention in rats [223] and cognitive functions in humans [224], [225]. Withanolide A, withanolide B, withaferin A, and withanone, in particular, showed
a protective effect on the neuronal tissues of the frontal cortex and corpus striatum in
rats and prevented an increase in lipid peroxidation [226], [227]. These early investigations on the effects of Withania extracts in attenuating cerebral functional deficits led to more targeted studies
on the potential beneficial effects of withanolides in neurodegenerative diseases.
Recent studies showed, for example, that a root extract of W. somnifera was effective in decreasing the accumulation of β-amyloid peptides in the brains of rats affected by Alzheimerʼs disease [228]. Also, a crude Withania extract significantly relieved the symptoms of drug-induced parkinsonism (tremor,
rigidity) in model rats [229].
Withanolides have also shown promising antitumor activities. Withanolide A and Withaferin
A are two of the best studied withanolides for their capacity to significantly reduce
the survival of various cancer cell lines and decrease the size of breast tumors implanted
in rats [230], [231], [232]. The effect of Withaferin A, in particular, seems related to its capacity to interfere
with the pathways of protein degradation and recycling (which are highly active in cancer
cells), through inhibition of tubulin polymerization: this inhibition would prevent
the formation of autophagy-related structures, which are essential for protein recycling
[233].
Also, other withanolides (e.g., withanolide D, 17α-hydroxywithanolide D, physagulines) were extracted from stems, roots, and leaves
of Tubocapsicum anomalum (Franch. & Sav.) Makino (Solanaceae) and Physalis angulata L. (Solanaceae), and all exhibited high and significant cytotoxicity against several
human cancer cell lines [234], [235], [236].
Despite the increasing evidence concerning the beneficial effects of these compounds,
there are still many areas that remain to be investigated, especially regarding the
biosynthesis and regulation of the withanolide pathway. W. somnifera is an important and highly valued plant in traditional medicine and has shown promising
effects in small-scale clinical trials [237], [238]. In the future, the full elucidation of withanolide biosynthesis will help to transfer
the pathway to heterologous hosts for cost-effective biosynthesis of the active components;
on the other hand, the development of biotechnology protocols for Withania spp. will guide future efforts for functional studies in this important genus and
will provide the genetic materials for targeted breeding and commercial exploitation.
Artemisinin
Artemisinin (qinghaosu) is a sesquiterpene lactone isolated from the Chinese herb Artemisia annua L. (Asteraceae), known as qinghao (sweet wormwood) in traditional medicine, and
mainly used for its antimalarial effect. In addition, recent studies have shown
promising anticancer, antiviral, and anti-inflammatory activities [239].
The first report on the healing properties of A. annua extracts dates back to 340 AD, when Ge Hong described them in his book Zhou Hou Bei Ji Fang (A Handbook of Prescriptions for Emergencies). It was only in 1971, however, that the active compound was isolated and characterized,
thanks to the work of the Chinese chemist Youyou Tu [240], [241], who was awarded the 2015 Nobel Prize in Physiology or Medicine for her discovery of
artemisinin.
Artemisinin became essential in the treatment of uncomplicated malaria caused by the
parasite Plasmodium falciparum and has established itself as the most potent of all antimalarial drugs [242]. Although the mechanism of action is still not completely understood, the use of
artemisinin and its derivatives in combined therapies contributed significantly to
the reduction in malaria mortality [243]. Artemisinin is currently the first-line treatment against malaria [244], [245], despite the emergence in recent years of cases of resistance in Southeast Asia.
Recent studies showed that the resistance is mainly due to mutations in the K13 gene of P. falciparum parasites [246], [247].
Given the complex structure of artemisinin, the main commercial source of
this compound so far is the plant itself. Artemisinin is produced by the glandular
trichomes of A. annua, but its accumulation in planta is low (0.01 – 1.4% dry weight) and highly dependent on the plant variety [248]. As a result, the extraction of artemisinin is relatively expensive and its production
cannot meet the global demand.
To address these problems, many attempts have been made to increase artemisinin
production. Significant results in this direction were obtained
in the fields of molecular biology, synthetic biology, and genetic and metabolic engineering.
All these achievements would not have been possible without the characterization of
the genes and enzymes related to artemisinin biosynthesis. In early studies, radioactive-isotope
labeling was used to show that artemisinin derives from IPP and DMAPP, which
are synthesized both by the cytosolic mevalonate (MVA) pathway and by the plastidial 2-C-methyl-D-erythritol
4-phosphate (MEP) pathway [249], [250], [251], [252]. The condensation of two molecules of IPP with one molecule of DMAPP forms FPP,
which is then converted to amorpha-4,11-diene by amorphadiene synthase (ADS) [253]. Amorphadiene is subsequently oxidized, first to artemisinic alcohol and then to
artemisinic aldehyde, by CYP71AV1 and its redox partner cytochrome P450 reductase
(CPR) [254], [255]. Artemisinic aldehyde is then converted to dihydroartemisinic aldehyde by the enzyme
DBR2 (artemisinic aldehyde Δ11(13) reductase) and oxidized to dihydroartemisinic acid
(DHAA) by aldehyde dehydrogenase (ALDH1) [256], [257]. The export of DHAA to the trichome and its photo-oxidation then yield artemisinin
([Fig. 8]).
Fig. 8 Metabolic pathway of artemisinin biosynthesis. The first step of artemisinin synthesis
is the condensation of IPP/DMAPP into farnesyl pyrophosphate (FPP). FPP is then cyclized
to amorphadiene by ADS and further oxidized to artemisinic alcohol and artemisinic
aldehyde by CYP71AV1 and its redox partner CPR. Artemisinic aldehyde is converted
to dihydroartemisinic aldehyde by DBR2, and then to DHAA by ALDH1. Artemisinin is
produced by spontaneous photo-oxidation of DHAA.
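The linear sequence of steps summarized in Fig. 8 lends itself to a simple machine-readable representation. The following Python sketch is only an illustration: the tuple layout, the name `PATHWAY`, the two helper functions, and the label "non-enzymatic" for the final photo-oxidation are assumptions made here, whereas substrates, enzymes, and products are those named in the text (FPS being farnesyl pyrophosphate synthase).

```python
# Ordered steps of artemisinin biosynthesis as described in the text.
# Each entry is (substrate, enzyme, product). The layout is illustrative only.
PATHWAY = [
    ("IPP + DMAPP", "FPS", "FPP"),
    ("FPP", "ADS", "amorpha-4,11-diene"),
    ("amorpha-4,11-diene", "CYP71AV1/CPR", "artemisinic alcohol"),
    ("artemisinic alcohol", "CYP71AV1/CPR", "artemisinic aldehyde"),
    ("artemisinic aldehyde", "DBR2", "dihydroartemisinic aldehyde"),
    ("dihydroartemisinic aldehyde", "ALDH1", "DHAA"),
    ("DHAA", "non-enzymatic", "artemisinin"),  # spontaneous photo-oxidation
]


def is_linear(pathway):
    """Check that each step consumes the product of the previous step."""
    return all(prev[2] == nxt[0] for prev, nxt in zip(pathway, pathway[1:]))


def enzymes_upstream_of(pathway, metabolite):
    """Return the distinct enzymes, in order, needed to reach `metabolite`."""
    enzymes = []
    for _substrate, enzyme, product in pathway:
        if enzyme != "non-enzymatic" and enzyme not in enzymes:
            enzymes.append(enzyme)
        if product == metabolite:
            return enzymes
    raise ValueError(f"{metabolite!r} is not a product in this pathway")


if __name__ == "__main__":
    print(is_linear(PATHWAY))
    print(enzymes_upstream_of(PATHWAY, "artemisinic aldehyde"))
```

A table of this kind makes it easy to list, for instance, which enzymes a heterologous host would have to express in order to accumulate a given intermediate.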
The elucidation of the artemisinin biosynthetic pathway has been a fundamental step
in exploring and developing the bioengineering tools used to enhance its production.
Different approaches have been pursued to improve artemisinin biosynthesis,
either in A. annua itself or in different host organisms.
Germplasm selection and breeding have been used for creating superior cultivars [258]. The studies reported so far describe a number of cultivars with an increased artemisinin
content ranging from 1 to 2.4% (DW), but due to unstable artemisinin production, these lines
have not been considered as a valuable commercial source [259], [260].
Transgenic A. annua plants have also been produced with the aim of increasing the amount of artemisinin.
In general, two main strategies have been used: the first one based on the overexpression
of structural or regulatory genes [261], [262], [263], and the second one based on the inhibition of competing pathways, such as, for
example, the squalene pathway [264].
Overexpression of several genes responsible for key steps of artemisinin biosynthesis,
such as farnesyl pyrophosphate synthase (FPS), ADS, CYP71AV1, CPR, and DBR2, led to
an approximately two-fold increase in artemisinin production [265], [266], [267].
Based on these results, many research groups focused on co-overexpressing
two or more genes in A. annua to further increase the amount of artemisinin [262], [263]. For example, co-overexpression of the FPS, CYP71AV1, and CPR genes increased the artemisinin
content 3.6-fold (2.9 mg/g FW) in comparison with control plants [267], and the simultaneous overexpression of ADS, CYP71AV1, and CPR resulted in a 2.4-fold
increase in artemisinin (15.1 mg/g DW) compared to control plants [268].
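Reported fold increases and absolute contents can be combined to estimate the artemisinin content of the corresponding control plants, which helps when comparing studies. The sketch below assumes that an "n-fold increase" means that the transgenic content equals n times the control content, an interpretation that is not stated explicitly in the cited figures; the function name is hypothetical.

```python
def implied_control(transgenic_mg_per_g, fold):
    """Control content implied by a transgenic content and a fold increase.

    Assumes transgenic = fold * control (an interpretation, not a reported value).
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return transgenic_mg_per_g / fold


# Values quoted in the text: (genes, content in mg/g, fold increase, weight basis)
reports = [
    ("FPS + CYP71AV1 + CPR", 2.9, 3.6, "FW"),
    ("ADS + CYP71AV1 + CPR", 15.1, 2.4, "DW"),
]

for genes, content, fold, basis in reports:
    control = implied_control(content, fold)
    print(f"{genes}: implied control ~ {control:.2f} mg/g {basis}")
```

Under this assumption, the implied control contents are roughly 0.8 mg/g FW and 6.3 mg/g DW. Because one value is expressed per fresh weight and the other per dry weight, the two studies cannot be compared directly without knowing the water content of the tissue.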
Recently, several transcription factors of different families, including WRKY, bHLH,
NAC, and MYC have been isolated and characterized in A. annua. The overexpression of these genes also increased the final amount of total artemisinin
[261], [266], [269], [270], [271].
The other approach used to increase the artemisinin content is to block key enzymatic
steps in competing pathways, so as to divert the metabolic flux predominantly into artemisinin biosynthesis
[262]. Inhibition of the expression of the SS gene, whose product uses farnesyl pyrophosphate
as a substrate and catalyzes the first step of the sterol pathway, increased the
artemisinin content up to 31.4 mg/g (a three-fold increase with respect to control
plants) [264].
In order to explore the metabolic engineering approaches for alternative artemisinin
production, several heterologous hosts have been tested. The steps leading to the
synthesis of amorphadiene have been engineered in E. coli by introducing the MVA pathway from yeast (S. cerevisiae) and a synthetic ADS gene [272]. Titers of 300 mg/L of amorphadiene were reached [273].
Artemisinin production has also been attempted in plant hosts. Nicotiana species have been selected as potentially the most suitable ones because of their
favorable characteristics (rapid growth and high biomass) [263]. An innovative approach consisted of inserting biosynthetic genes in both
the nuclear and chloroplast genomes, leading to a final yield of 120 µg/g artemisinic
acid [274]. Despite these efforts, however, the production levels in Nicotiana remained low and were therefore not suitable for commercial production.
To date, the most prominent achievement in the field of metabolic engineering is the
production of artemisinic acid in yeast. In this case, the endogenous MVA pathway was engineered
in S. cerevisiae and combined with ADS and CYP71AV1, allowing the conversion of amorphadiene to artemisinic
acid in three oxidation steps. As a result, around 100 mg/L of artemisinic acid were
obtained [254]. The system was further improved by the introduction of two additional enzymes,
a plant dehydrogenase (ADH1) and a second cytochrome (CYB5), which were both positive
regulators of artemisinin biosynthesis. The process reached titers of up to 25 g/L of
artemisinic acid, which is the maximum amount achieved so far [275]; this improved yeast system has, however, had only a modest market impact due to the
lower costs associated with the direct extraction of artemisinin from plants [276].
Taxol
Taxol (paclitaxel) is a complex diterpenoid extracted from the bark of the Pacific
yew (Taxus brevifolia Nutt., family Taxaceae), a tree native to the west coastal region of North America.
In 1960, taxol was discovered during a large phytochemical screening aimed at the
identification of cytotoxic natural products from plants. This effort was jointly
conducted by the National Cancer Institute and the U. S. Department of Agriculture
[277], [278]. Taxol belongs to a large family of taxoids (taxane diterpenoids) that accumulate
in Taxus species, where they play an important role in plant defense. Taxoids deter the feeding
activities of mammals and insects and protect the plants from fungi colonization [279].
Taxol is formed by a tetracyclic oxaheptadecane skeleton decorated with eight functional
oxygen groups, two acyl groups, and a benzyl group [280]. After the elucidation of its structure in 1971 [277], several clinical trials led to its approval by the FDA as an anticancer drug for
the treatment of a wide range of cancers (ovarian, breast, lung, Kaposiʼs sarcoma,
cervical, and pancreatic) [281]. Since then, taxol has become a leading anticancer drug, whose total sales exceed
several billion U. S. dollars per year [282]. The mechanism of action of taxol is based on its capacity to interfere with the
function of microtubules during cell division, causing their polymerization even at
low temperatures. This property renders taxol highly cytotoxic to cancer cells [281].
The amount of taxol that can be extracted from the bark of the adult trees of T. brevifolia is, however, extremely low. Around 12 kg of bark material yield only 0.5 g of purified
taxol [278]; therefore, alternative sources or methods for taxol production must be developed
to avoid the need to rely on destructive bark harvesting [283].
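The scale of the supply problem becomes clearer when the figures above are converted into a mass yield. The following Python sketch uses only the two numbers quoted in the text (12 kg of bark, 0.5 g of taxol); the function names are hypothetical, and the extrapolation to larger amounts assumes that the yield scales linearly with the amount of bark.

```python
BARK_KG = 12.0  # bark processed, as quoted in the text
TAXOL_G = 0.5   # purified taxol obtained, as quoted in the text


def yield_percent(product_g, biomass_kg):
    """Mass yield of product relative to biomass, in percent (w/w)."""
    return product_g / (biomass_kg * 1000.0) * 100.0


def bark_required_kg(target_taxol_g, product_g=TAXOL_G, biomass_kg=BARK_KG):
    """Bark needed for a target amount of taxol, assuming linear scaling."""
    return target_taxol_g * biomass_kg / product_g


if __name__ == "__main__":
    print(f"yield: {yield_percent(TAXOL_G, BARK_KG):.4f} % (w/w)")
    print(f"bark for 1 g of taxol: {bark_required_kg(1.0):.0f} kg")
    print(f"bark for 1 kg of taxol: {bark_required_kg(1000.0):.0f} kg")
```

This corresponds to a yield of about 0.004% (w/w), that is, 24 kg of bark per gram of taxol and, under the linear-scaling assumption, about 24 tonnes of bark per kilogram.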
In addition to that, the knowledge of the pathway of taxol biosynthesis remains incomplete.
Of the 20 hypothesized enzymatic steps, only 14 have been well characterized [280], [284], [285] ([Fig. 9]). The current understanding of the taxol biosynthetic pathway includes at least
eight oxidation steps, five acetyl/aroyl transferase steps, a C4β,C20-epoxidation reaction, a phenylalanine aminomutase step, N-benzoylation, and two CoA esterifications [282]. The presence of several putative enzymes in the pathway was recently suggested
by analyzing the transcripts of Taxus baccata L. cells elicited with methyl jasmonate [285].
Fig. 9 Overview of taxol biosynthesis. The pathway leading to taxol is composed of at least
20 enzymatic steps; of these, only 14 have been characterized (enzymes in red indicate
hypothetical steps). TXS: taxadiene synthase; T5αOH: taxane 5α-hydroxylase; TAT: taxadiene-5α-ol-O-acetyl transferase; T10βOH: taxane 10β-hydroxylase; T13αOH: taxane 13α-hydroxylase; T2αOH: taxane 2α-hydroxylase; T9αOH: taxane 9α-hydroxylase; T7βOH: taxane 7β-hydroxylase; T1βOH: taxane 1β-hydroxylase; TBT: taxane-2α-O-benzoyltransferase; DBAT: 10-deacetyl baccatin III-10-O-acetyltransferase; T2′OH: taxane 2′α-hydroxylase; PAM: phenylalanine aminomutase;
TBPCCL: β-phenylalanine coenzyme A ligase. Figure modified from [280].
The precursors of taxol are IPP and DMAPP from the plastidial MEP pathway. Geranylgeranyl
pyrophosphate synthase catalyzes the condensation of three molecules of IPP and one
of DMAPP into geranylgeranyl pyrophosphate (GGPP), which is then cyclized by taxadiene
synthase into taxa-4(5),11(12)-diene (taxadiene). Taxadiene is then the central precursor
from which all taxane diterpenoids originate. In the branch leading to taxol biosynthesis,
taxadiene is hydroxylated by different P450 hydroxylases. The order of the reactions
and some of the genes responsible for these subsequent catalytic steps are, however,
not clear yet: from the isolation of the putative intermediates, several hydroxylations
should occur at positions C1, C2, C4, C7, and C9, as well as a further oxidation at
C9 and a C4β,C20 epoxidation. The product of this series of poorly characterized steps is baccatin
III, a key intermediate that can be also extracted from the needles of T. brevifolia and constitutes the starting substrate for semisynthesis of taxol and other taxane
diterpenoids [280]. Baccatin III is then esterified on C13 with a β-phenylalanoyl moiety yielding 3′-N-debenzoyl-2′-deoxy-taxol, in a reaction catalyzed by baccatin III: 3-amino,13-phenylpropanoyltransferase
([Fig. 9]).
From 3′-N-debenzoyl-2′-deoxy-taxol, the last two steps of the biosynthesis leading to taxol
require the hydroxylation and terminal N-benzoylation of the β-phenylalanine side chain by an as yet uncharacterized taxane-2′α-hydroxylase and an N-benzoyl transferase (DBTNBT) [21], [285].
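Going back to the first steps of the pathway, the precursor stoichiometry given above for GGPP (three IPP and one DMAPP), and earlier for FPP (two IPP and one DMAPP), accounts for the carbon skeletons of the two terpenoid classes discussed here: C15 for sesquiterpenes such as artemisinin and C20 for diterpenoids such as taxol. A minimal Python check, with a hypothetical function name and assuming only that IPP and DMAPP are C5 units:

```python
C5_UNIT_CARBONS = 5  # IPP and DMAPP each carry five carbon atoms


def prenyl_chain_carbons(n_ipp, n_dmapp=1):
    """Carbon count of a prenyl pyrophosphate built from C5 units."""
    if n_ipp < 0 or n_dmapp < 0:
        raise ValueError("unit counts must be non-negative")
    return C5_UNIT_CARBONS * (n_ipp + n_dmapp)


fpp_carbons = prenyl_chain_carbons(2)   # 2 IPP + 1 DMAPP -> FPP
ggpp_carbons = prenyl_chain_carbons(3)  # 3 IPP + 1 DMAPP -> GGPP

print(f"FPP: C{fpp_carbons}, GGPP: C{ggpp_carbons}")
```

The count covers only the prenyl precursors; the later oxidations and acylations that decorate the taxane skeleton add further atoms and are not modeled here.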
Today, the demand for taxol for medical use cannot be met from natural sources.
As a consequence of the initial overharvesting of the bark for taxol extraction, T. brevifolia is now in a near threatened state [286]. On the other hand, total chemical synthesis of taxol, which was achieved in 1994
[287], has never been considered as an economically feasible alternative, due to the high
complexity of the process. The current standard for taxol production is now semisynthesis,
starting from the isolation of the intermediates baccatin III or 10-deacetylbaccatin
III from Taxus cell cultures. Taxol can also be produced entirely from Taxus cell suspension cultures. The whole process, after decades of optimization based
on the use of chemical elicitors (e.g., methyl jasmonate) and improvement of growth
conditions, has now reached yields in the range of several hundred mg per liter of
culture [282], [288].
A partial alternative to Taxus cell culture is the transfer of the known part of the pathway, up
to taxadiene, to E. coli. Bacteria (and yeast) in fact offer a higher growth rate than plant cell
cultures and are generally easier to manipulate. The insertion of two pathway modules
into E. coli (the MEP pathway and the GGPP synthase/taxadiene synthase pathway) resulted in a final
yield of around 1 g/L of taxadiene. Although taxadiene is a distant precursor of baccatin
III (and thus several steps, some of which are still unknown, separate taxadiene from taxol),
the metabolic engineering of E. coli was an important achievement toward the future full transfer of this important pathway
to a microbial host [289].
Bioinformatic Resources for Medicinal Plants
In recent years, the decreasing costs associated with sequencing and assembly of genomic
data led to the release of a large number of whole-plant genome sequences, including
several from medicinal plants [290]. In some cases, as we detail below, this was accompanied by the development of several
communal bioinformatics resources that integrated various types of omics datasets.
Clearly, given the complexity of secondary metabolism of medicinal plants with respect
to crops and model plant species, these resources offer the opportunity to mine specifically
the metabolic pathways of medicinal plants and correlate, for example, the number
of specific metabolites with the genomic data (e.g., gene expression, sequence polymorphisms).
We provide below a survey of the main genomic databases that have been recently developed
for some of the most studied medicinal plants.
The Medicinal Plant Genomics Resource [291] is the product of a large collaborative effort between several research institutions
and contains genome and metabolome data of 14 taxonomically diverse medicinal species,
including Atropa belladonna L. (family Solanaceae), C. sativa, C. roseus, and Panax quinquefolius L. The website offers an easy-to-use interface for a BLAST (basic local alignment
search tool) search against the sequenced species and provides access to the various
genome browsers of medicinal plants. The files related to the genome and transcript
assemblies are also available for download. C. acuminata (the “happy tree” of Chinese traditional medicine, [292]), Calotropis gigantea (L.) W. T. Aiton (a shrub of the Apocynaceae family growing in Southeast Asia, which
is known for producing cardiac glycosides [293]), and a new variety of C. roseus are the latest medicinal plants whose genomic and transcriptomic data have been added
to the database. The database also contains metabolic profiling data (mainly acquired
through LC-MS), collected from several tissues of medicinal plants.
Another example of a resource offering a range of tools for visualization and analysis
of metabolic networks and ʼomicsʼ data is CathaCyc, a metabolic pathway database built
from metabolic and RNA-seq data of the plant C. roseus
[82]. CathaCyc is a repository for genes, enzymes, reactions, and pathways of primary
and secondary metabolism; it contains 390 pathways with more than 1300 enzymes. The
database also integrates the draft genome data of C. roseus
[74]. The enzymes in CathaCyc have also been linked to ORCAE [294], a genome annotation resource, allowing the users to validate and edit gene annotations
[295].
In 2011, a consortium of U. S. research organizations, funded by NIH, launched the
project Transcriptome Characterization, Sequencing, and Assembly of Medicinal Plants
Relevant to Human Health [296]. Currently, the database contains transcriptome data related to 31 species of medicinal
importance, including, among others, Cinchona pubescens Vahl (the quinine tree, family Rubiaceae), Colchicum autumnale L. (family Colchicaceae, the source of colchicine), Datura stramonium L. (family Solanaceae), and Podophyllum peltatum L. (family Berberidaceae) (mayapple; the roots of Podophyllum accumulate podophyllotoxin,
the precursor of the chemotherapeutic etoposide [297]).
Recently, another database has been established within the Phytometasyn project (www.phytometasyn.ca). It contains de novo transcript assemblies of around 20 medicinal plants including the plant Eschscholzia californica Cham. (California poppy, a member of Papaveraceae accumulating several active BIAs,
mainly those of the pavine-type, e.g., eschscholtzidine [83]).
Future Prospects
For centuries, plants have been used as remedies to treat a great number of
symptoms. Even today, a large part of the world population relies on herbal medicines
as a major source of health care, especially in Asia, Africa, and Latin America. In
some rural areas, traditional medicines based on herbal drugs are the only source
of health care. Almost 30% of the modern drugs we use today are actually derived from
natural products; an ever-increasing number of these, coming from plants, are now
in the process of being approved for market either as main active ingredients or as
supplements. Several clinical trials of herbal medicines are now underway in the United
States for the treatment of food allergies, asthma, and gastric inflammation [298].
We are now at the beginning of a new phase in which integrative approaches of genomics
and metabolomics are applied to the study of the metabolism of medicinal plants. These
approaches have begun to revolutionize our understanding of at least two main aspects
of herbal medicines: (i) the biosynthesis, and pathway regulation, of many plant secondary
metabolites of medicinal importance [290]; (ii) the mechanism of action of many of these plant herbal components on human
metabolism and health [299], [300]. We see in this avalanche of knowledge both challenges and avenues for further research.
We think there is an urgent need to develop faster, more informative and comprehensive
analytical approaches for profiling and characterizing a larger number of metabolites;
these challenges can be overcome also with the development of computational metabolomics
strategies for metabolite annotation [301], [302], de novo pathway reconstruction [303], and analysis of natural variation [304]. We clearly recognize the long history and the potential of traditional medicines
as a source of well-being, but we also reason that a more intense scrutiny should
be conducted on herbal drugs, including rigorous studies on their chemical composition
and clinical trials, before claims are made in relation to their therapeutic efficacy.
This new knowledge could then be used–as we have seen in the case studies presented
here (especially in the case of artemisinin)–to set up platforms for metabolic engineering
and enable sustainable production of medicinal phytochemicals. Finding alternative
ways for production of these compounds–outside of their respective native plant hosts–is
also relevant to preserve natural resources in their native habitats, as the case
of taxol has shown during the initial overharvesting of T. brevifolia. Scientists and policy makers need to find a better balance to promote a sustainable
use of genetic resources, especially from the hot spots of world biodiversity (e.g.,
the Amazonian forest). A new equilibrium needs to be established between ecological
conservation and bioprospecting for novel drug discoveries from plants [11].