Login

Join for Free!
16750 members
table of contents table of contents

Analyses of complete genomes and large collections of gene transcripts have shown …


Biology Articles » Mycology » An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens » Results and discussion

Results and discussion
- An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens

Physcomitrella patens is a paleopolyploid

Based upon a dataset of 24,845 P. patens unigenes [22-24], prediction of open reading frames using species-specific models yielded a dataset of 22,237 coding sequences. From those, a total of 2,907 paralogs were determined by all-against-all BLAST searches using previously described parameters [25]. After average linkage clustering, the KS-values representing ancient duplication events were calculated. In total, 1,971 genes were placed in 854 clusters. A cutoff distance of KS 5.0 was used and the KS-range was divided into bins of size 0.1. The KS age distribution plot exhibits a clearly distinguishable peak at KS ~0.85, providing evidence for an ancient large-scale or even genome-wide duplication event (Figure 1A). While KS values >1.0 should be used with caution because multiple mutations may cause inaccuracies in the estimation [26,27], the KS estimate for the peak in the P. patens distribution thus probably is trustworthy. Also, while e.g. in Drosophila KS is lower in genes with strong codon bias, this is not the case for P. patens [28] and thus needs not to be dealed with.

In order to obtain additional evidence for a large-scale or genome-wide duplication in P. patens, as well as to date this duplication event, we constructed linearized trees (see Methods). We constructed neighbor-joining trees for 487 gene families which contained two to ten P. patens genes, one Chlamydomonas reinhardtii or Ostreococcus tauri gene as an outgroup sequence, and genes from at least two different seed plants (Arabidopsis thaliana, poplar, or rice) as reference points. Sequences that were evolving too fast or too slow were removed, after which linearized trees, in which branch lengths are directly proportional to time, were constructed for each gene family [29,30]. This left 330 trees, and after removing nodes with bootstrap values P. patens in 159 trees that could be used for dating the gene duplications in P. patens by comparing the time of duplication with the time of speciation between A. thaliana and poplar, assumed 100 MYA [31] and A. thaliana or poplar and rice, assumed 150 MYA [32] (see Figure 1B).

Next, we plotted the estimated dates of the 179 P. patens duplication events, which are shown in Figure 1C. As can be clearly observed, a majority of the gene duplicates seem to have been created between 30–60 MYA (average 45 MYA), indicating that a large-scale gene duplication or a whole-genome duplication is indeed likely to have occurred around this time. Although using two different calibration dates (100 MYA for the Arabidopsis-poplar split, and 150 MYA for the monocot-eudicot split) may affect the age distribution if one of the two calibration dates is unrealistic compared to the other, age distributions obtained for each calibration point separately were very similar (data not shown), suggesting that the dates of 100 MYA and 150 MYA [31,32] are in good agreement with inferred dates from tree topologies.

It should be noted, however, that a number of alternative and more sophisticated methods exist to estimate divergence or duplication dates based on tree inference, even if rate heterogeneity between lineages is present. Yet, as has been shown in several studies (e.g. [33] and references therein), caution should be taken when using such rate-smoothing methods. It is also much harder to process a large amount of data in a high-throughput manner using rate-smoothing methods because of the different parameters that ideally have to be estimated or used for different genes or proteins, while this is not the case when a correction for unequal rates is not required and faster/slower evolving genes are simply removed, as in the linearized trees method applied here. The major disadvantage of the linearized tree method is that a substantial amount of data is omitted from the analysis because trees showing unequal branch lengths for the species under investigation are not considered. However, in the current study there are plenty of data left to provide convincing evidence that a large-scale gene duplication event has occurred in the evolutionary past of P. patens. This was the main aim of our study, rather than to come up with the most accurate dating of that event, which will be very difficult anyway given the fact that the calibration points are debateable in the first place.

However, the approximate range of the duplication event is probably trustworthy. If we assume that the genome duplication indeed took place about 45 MYA (average of the peak in Figure 1), and we assume that genes duplicated at that time have an average KS value of 0.85 (see Figure 1), we can infer the rate of synonymous substitutions by simply dividing 0.85 by 45 MY. This gives us a rate of 1.9 synonymous substitutions per synonymous site per year, which is very close to the value presented by Koch et al. [34] based on the calibration of molecular clocks for eudicots. However, these substitution rates have to be interpreted with caution, since there are many theoretical and empirical concerns about the accuracy of molecular clocks and the rate of substitutions in different lineages. Some of the major issues are rate heterogeneity in and between lineages caused by evolutionary factors (e.g. generation time), difficulties in interpreting the fossil data used to calibrate the clock, and rate variation among genes, even at synonymous sites.

An enigma unveiled: ploidy levels and chromosome counts among mosses

Besides the long-term effect of increasing the genetic complexity, there might be several possible short-term advantages of polyploidization events for diploid seed plants (reviewed by [13]), such as heterosis effect, sympatric and allopatric speciation, decreased inbreeding depression and genetic load (allowing selfing populations that can be monoecious and better dispersers). Genome duplication, due to its effects on gene regulation and developmental processes, might also be a foundation for speciation and adaptation through genetic divergence in plants [35]. In the case of the haploid “bryophytes”, however, other effects appear more relevant. The allopolyploidization of dioecious gametophytes might yield a monoecious plant (thus rendering the dispersal of breeding populations easier). A second advantage might be that the duplication of the genome would free the formerly haploid plant from the necessity to preserve the function of crucial single copy genes under all circumstances, thus enhancing the potential for development of new functions. The moss P. patens belongs to the Funariaceae, is haploid, monoecious and self-fertile. Polyploidization occurs rather frequently during transfection of P. patens protoplasts [36]. Among transformants, diploid plants cannot be distinguished from haploid plants using morphological traits alone [37]. The P. patens wild type, however, is clearly functionally haploid, as can be seen from segregation ratios [38]. Taken together, our data suggest that the ancestor of P. patens underwent polyploidization during the Eocene, potentially becoming hermaphroditic through this process. However, subsequently the plant became functionally haploid again (haploidization) while keeping the duplicated chromosomes. Analogous states are known from seed plants as well, where duplicated chromosomes often remain after allopolyploidization and subsequent diploidization [39,40].

Most liverworts have 9 chromosomes and hornworts usually have 8, 9 or 10; there are few polyploids in both groups. The mosses, in contrast, display chromosome numbers between 4 and 72 [41]. These are probably due to both different base numbers in the different orders and the existence of many aneuploids and polyploids [5]. Within the Funariaceae, chromosome counts between 4 and 72 have been reported [41]. While Funaria hygrometrica seems to be representative for the majority of Funariaceae, Physcomitrium pyriforme usually exhibits a higher chromosome count with several samples each being described to contain 18, 26, 36, 52 and 54 haploid chromosomes, with the highest number being 72 chromosomes [41]. Among the analysed F. hygrometrica accessions, 51% contain 28 chromosomes. While single accessions were described to contain 4, 21 and 42 chromosomes, the remainder contains either 14 (34%) or 56 (9%) chromosomes. In the case of P. patens, chromosome counts of 14 and 28 have been reported for two different isolates [42]. These data make frequent, recent and independent polyploidization events among individual species or genera evident. The haploid chromosome count of the P. patens ecotype analysed here is 27 [43], which, given the data presented in this work, would make it a putative paleopolyploid and paleoaneuploid. In a recent phylogenetic analysis, the age of the Funariales was determined at ~172 MYA [9]. Therefore, the whole genome duplication analysed here most probably represents a duplication event that occurred after speciation among the Funariaceae. Consequently, the different chromosome counts found in extant species like P. patens, F. hygrometrica and P. pyriforme have most probably occurred and been fixed several times independently during evolution. This is further supported by the fact that moss genera or species seem to have become hermaphroditic several times independently during evolution, given the dissipated pattern of mon- and dioecious species within the taxonomic groups [6].

Gene retention following the whole genome duplication in P. patens

It has been demonstrated that retention of functional gene classes after large-scale duplication events is biased in angiosperms. For example, genes involved in signal transduction and transcriptional regulation were preferentially retained after the three whole genome duplication events within the ancestor of A. thaliana, while there was selection against retention of these genes after small-scale duplication events [16,19,20]. In order to analyse potential bias among the genes that were retained following the genome duplication in Physcomitrella patens, we compared the fractions of genes associated with Gene Ontology (GO) terms [44]. All paralogs with a KS of 0.6–1.1 (765 genes) were mapped to GO Slim [45] and compared to the associations of an equally sized random sample. The biological process categories "biosynthesis" and "generation of precursor metabolites and energy" are significantly over-represented (q 1). In total, 199 genes, (26%) belong to these categories. Genes of biological process categories that are under-represented within the genome duplication peak are "protein biosynthesis", "organelle organization and biogenesis", "cytoskeleton organization and biogenesis" and "cytoplasm organization and biogenesis". Genes involved in signal transduction and transcriptional regulation, which were preferentially retained after genome duplications in angiosperms [19,20,46], seem not to be retained in excess following the duplication event in P. patens. Also, the PFAM domains that were reported to be enriched in plant (A. thaliana and rice) duplicate genes [47] were compared with those assigned to genes present in the P. patens KS peak and were found not to be enriched.

The genes present in the peak (Figure 1A) representing the whole genome duplication were also mapped against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database in order to analyse in which pathways they are involved. Among the genes that belong to the under-represented GO categories, the lack of ribosomal proteins (48% of those present in the reference set) is noteworthy. The genes belonging to the enriched GO categories all belong to the KEGG ontology (KO) class "metabolism". Among GO:0006091 "generation of precursor metabolites and energy", the major KO pathways are energy and carbohydrate metabolism, among GO:0009058 "biosynthesis", the major KO pathways are carbohydrate, amino acid and lipid metabolism. Thus, both GO and KEGG analyses demonstrate an abundance of genes involved in metabolism that were specifically retained following the whole genome duplication. This is in concordance with the high abundance of metabolic genes in P. patens as compared to seed plants which has been described previously [22]. Based on GO mappings of large-scale genome or transcriptome datasets, metabolism-associated transcripts account for 10–44% of seed plant transcriptomes, while their abundance is significantly higher (70–80%) in mosses [22]. Apparently, metabolic genes have been maintained in excess following the large-scale duplication event, which might explain their previously observed abundance in the P. patens genome.

Peculiarities of moss ecology and metabolism

There are species of mosses that can survive long times of dryness (up to 14 years), extreme cold (Antarctica) and heat (70 to 110 degrees Celsius) and are able to prosper in only 0.1% of sun light (while seed plants need at least 2%). In general, mosses are adapted to capture of low light intensities, having low light compensation and saturation points, making use of their one-cell-thick leaflets and lack of waxy cuticles [48]. The structural simplicity of the tissues permits mosses to react immediately to water, CO2 and light availability in terms of photosynthesis [48]. Mosses in general are able to grow over a wider temperature range than seed plants, particularly at low temperatures. Many mosses are able to have photosynthetic gain at temperatures as low as -10°C [48]. They typically become dormant in summer heat and drought but are able to immediately photosynthesize upon rehydration. Mosses are able to receive nutrients from the substrate as well as from precipitation and dust [48]. In a multitude of studies, alternative and/or redundant metabolic pathways have been described in mosses. As an example, the reduction of adenosine 5'-phosphosulfate (APS) to sulfite by adenosine 5'-phosphosulfate reductase is considered the key step of sulfate assimilation in seed plants. While APS-reductase is present in P. patens as well, this moss also harbors a phosphoadenosine 5'-phosphosulfate (PAPS) reductase, which was previously known e.g. from enteric bacteria [49]. Thus, P. patens is able to employ an alternative pathway of sulfate reduction that is not accessible to seed plants. Interestingly, sulfate adenylyltransferase, the enzyme that catalyzes the formation of PAPS from ATP and inorganic sulfate, is encoded by a gene present in the KS peak. In Ceratodon purpureus. a bifunctional delta-fatty acyl acetylenase/desaturase has been characterized which displays a redundant functionality, being able to introduce a Delta6cis-double bond into 9,12,(15)-C18-polyenoic acids as well as converting a Delta6cis-double bond to a Delta6-triple bond [50]. P. patens contains a homolog of the yeast ELO-genes unknown from seed plants, encoding a component of the Delta6-elongase, which is involved in the biosynthesis of C20 polyunsaturated fatty acids [51]. Among the proteins encoded by the KS peak genes, 12 are involved in fatty acid metabolism (e.g., fatty acid desaturase, long-chain fatty-acid-CoA ligase), of which eight are involved in fatty acid biosynthesis.

Lipid metabolism and volatiles

The tetracyclic diterpene 16alpha-hydroxykaurane is a major lipid compound in P. patens, which has been shown to be released into the air [52]. P. patens contains high levels of arachidonic acid and lesser amounts of eicosapentaenoic acid, which is due to delta5- and delta6-desaturases that are associated with the synthesis of these fatty acids [53,54]. A complex mixture of fatty acid-derived aldehydes, ketones, and alcohols is released upon wounding of P. patens. In contrast to other lipoxygenases cloned so far, the P. patens enzyme exhibits an unusually high hydroperoxidase and fatty acid chain-cleaving lyase activity, leading to the formation of unusual oxylipins based on arachidonic acid as substrate [55]. Thus, a highly diverse product spectrum is formed by a single enzyme accounting for most of the observed oxylipins produced by P. patens. Also, a P. patens gene was cloned and classified to encode an unspecific hydroperoxide lyase having a substrate preference for 9-hydroperoxides of C18-fatty acids [56]. The knockout lines failed to emit (2E)-nonenal while formation of C8-volatiles was not affected, indicating that in contrast to flowering plants the P. patens enzyme is involved in formation of a specific subset of volatiles. Ent-kaurene is a precursor for gibberellins (GAs) in plants and fungi. The fungal CPS/KS enzyme catalyzes a two-step reaction corresponding to ent-copalyl diphosphate synthase (CPS) and ent-kaurene synthase (KS) activity in plants. Overexpression of fungal CPS/KS in A. thaliana has been shown to rescue GA-deficient phenotypes [57]. Interestingly, the over-expressing plants emitted ent-kaurene as a volatile, inducing airborne action on nearby plants. Recently, an ent-kaurene synthase from P. patens was cloned and characterized. The enzyme is a bifunctional cyclase which, like fungal CPS/KS, directly synthesizes ent-kaurene from geranylgeranyl diphosphate [58].

Tolerance to abiotic stresses

In comparison with seed plants, Physcomitrella patens exhibits a much greater tolerance to abiotic stresses, being able to survive e.g. NaCl concentrations up to 350 mM and sorbitol up to 500 mM [59]. P. patens is dehydration tolerant, plants that had lost 92% water on a fresh-weight basis were able to recover successfully [59]. Other mosses, like Tortula ruralis, are even desiccation tolerant, the rehydrating gametophytes displaying an abundance of transcripts that code for e.g. enzymes involved in oxidative stress metabolism [60]. The transcript levels of novel putative membrane transporters similar to mammalian inward rectifier potassium channels were shown to be upregulated in P. patens upon cold and osmotic stress [61]. The widespread calcifuge moss Pleurozium schreberi is moderately tolerant to dissolved SO2 (bisulfite). The tolerance mechanism involves extracellular oxidation using metabolic (photo-oxidative) energy, passive oxidation by adsorbed Fe3+ and probably also internal metabolic detoxification [62]. In a comparative classification of alpine mosses, lichens and seed plants, strong illumination caused photodamage in dried leaves, but not in dry moss (Grimmia alpestris) and dry lichens [63]. In hydrated mosses, but not in leaves of seed plants, protein protonation and zeaxanthin availability are fully sufficient for effective energy dissipation even when photosystem II reaction centers are open [64]. During desiccation, quenchers accumulate in the poikilohydric moss Rhytidiadelphus squarrosus which are stable in the absence of water but revert to non-quenching molecular species on hydration [65]. Together with zeaxanthin-dependent energy dissipation, desiccation-induced thermal energy dissipation protects desiccated poikilohydric mosses against photo-oxidation, ensuring survival during drought periods [65]. There are several proteins encoded by KS peak genes that might be related to these phenomena, such as those involved in carotenoid synthesis (1-deoxy-D-xylulose-5-phosphate synthase, zeta-carotene desaturase, two phytoene synthases) and electron transfer (ten light-harvesting complex II chlorophyll a/b binding proteins, two plastocyanins and two cytochrome b6-f complex iron-sulfur subunits).


rating: 2.00 from 2 votes | updated on: 3 Sep 2007 | views: 651 |

Rate article:







excellent!bad…