
The authors reviewed the current thinking and progress on how to get …



Technology platforms
- From genome to epigenome


Central to successfully resolving any epigenome is the availability of robust technologies and assays that generate quantifiable and reliable data which can be integrated with existing genome annotation. Figure 1 suggests how the key technology platforms of DNA methylation and chromatin profiling could be integrated with expression profiling platforms, existing genome data and bioinformatics to produce a multidimensional epigenome database. Direct comparison of the various technology platforms to find the most sensitive, accurate and highest resolution platform is not meaningful because the endpoints differ (e.g. global methylation profiles across the whole genome versus detailed methylation maps across preselected regions of the genome). Table 1 summarizes the major technology platforms and their individual methodologies. These methodologies generate large amounts of DNA methylation, chromatin and expression profiling data, which can be processed through bioinformatics and superimposed upon the genome to ultimately create an epigenome database.

DNA methylation profiling platforms
Gene regulation is influenced by interactions between histone modifications and DNA methylation. DNA methylation is proposed to recruit methyl-DNA binding proteins that associate with histone deacetylases (9,10). However, DNA methyltransferases can also target histone deacetylases, leading to histone modifications that are independent of methyl-DNA binding proteins (11). Regardless of where DNA methylation occurs in the cascade of histone modification, it remains the most accessible epigenomic feature because of its stability. Whether we look for ‘epigenetic signatures’ associated with various diseases on the basis of genome-wide chromatin profiling or merely concentrate on specific loci or groups of genes, knowledge of where CpG methylation occurs within the genome (or epigenome) will be invaluable. The baseline normal methylation profile will also be helpful when we try to elucidate complex processes such as genomic imprinting, X-chromosome inactivation, gene regulation, chromatin structure and genome stability, and complex multifactorial diseases such as immune disorders and cancer.

For whole genome methylation analyses, the key parameters are whether we can analyse multiple CpGs in several genes at once and whether methylation levels can be quantified. Further considerations are whether we intend to cover the whole genome without preselecting specific regions and at what resolution we intend to analyse the results.

Technologies for DNA methylation analyses remain either PCR based (after bisulphite conversion of unmethylated cytosines to uracil) or methylation sensitive restriction enzyme based (reviewed in 12–14). Microarray technology and comparative genomic hybridization have further opened the field for high-throughput methylation analyses, and the various advantages and disadvantages have been extensively reviewed in the last 18 months (12,13,15,16). Modifications of the contemporary methodology include using matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) for quantitative detection of methylation after primer extension, which discriminates between methylated and unmethylated CpGs on bisulphite treated DNA (17). MALDI-MS offers a high degree of automation and integration and allows discrimination of methylation levels that differ by ≥5%. This sensitivity is almost matched by pyrosequencing protocols for bisulphite analyses, which currently enable only a few CpGs to be analysed at a time. Pyrosequencing assays that enable high-throughput analyses of up to 10 CpGs in an amplicon size of 300 bp are presently being developed (18) and promise to be quantitative, but expensive. Real-time PCR approaches may quantitatively detect methylation specific amplicons (19). At present, bisulphite conversion of DNA followed by PCR and sequencing remains the gold standard of methylation analysis. Bioinformatics programs that analyse raw sequence data for quantitative differences at individual CpGs, and that can align multiple sequences to identify variable methylation, are already in place for managing the large amount of data generated when bisulphite sequencing is done on a large scale (20).
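The quantification step of bisulphite sequencing can be illustrated with a minimal sketch. It assumes that reads have already been aligned, without gaps, to the top strand of a reference sequence and have the same length as that reference; at each CpG a retained C is counted as methylated and a T as unmethylated. The function names, the toy sequences and the reuse of the 5% figure as a comparison threshold are illustrative assumptions, not part of any published pipeline.

```python
def cpg_positions(reference):
    """Return the 0-based position of the C in every CpG of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Estimate the methylated fraction at each CpG from bisulphite reads.

    Assumption: every read is gap-free, aligned to the top strand and as long
    as the reference. After bisulphite conversion and PCR, a methylated C
    still reads as C, whereas an unmethylated C reads as T.
    """
    counts = {pos: [0, 0] for pos in cpg_positions(reference)}  # [methylated, informative]
    for read in reads:
        for pos, tally in counts.items():
            base = read[pos]
            if base == "C":
                tally[0] += 1
                tally[1] += 1
            elif base == "T":
                tally[1] += 1
            # Any other base (e.g. a sequencing error) is ignored.
    return {pos: (m / n if n else None) for pos, (m, n) in counts.items()}


def differing_cpgs(levels_a, levels_b, min_difference=0.05):
    """List CpGs whose levels differ by at least min_difference between two samples."""
    shared = sorted(set(levels_a) & set(levels_b))
    return [
        pos for pos in shared
        if levels_a[pos] is not None
        and levels_b[pos] is not None
        and abs(levels_a[pos] - levels_b[pos]) >= min_difference
    ]


# Toy data (invented for illustration): two CpGs, four reads.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
levels = methylation_levels(reference, reads)
```

With real data, incomplete conversion and sequencing errors would also need to be modelled; the sketch ignores both.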

As an alternative to high-throughput bisulphite analyses, microarray based methods have the advantage of being faster and do not necessarily rely on pretreatment of DNA with bisulphite. However, the limited availability of well-defined microarrays and the low resolution of methylation profiles obtained from microarray based methods presently hamper the widespread use of DNA microarray technology for global methylation profiling. The resolution obtained with genomic microarrays depends upon the size and content of the clones spotted, with BAC and cosmid clones giving 100 and 40 kb resolution, respectively, and also containing large amounts of repetitive DNA sequence. The most viable options for methylation analyses are custom arrays of oligos or amplicons covering regions-of-interest. These arrays increase resolution but entail sequence preselection.

Methylation specific oligonucleotide arrays containing oligonucleotides that can distinguish between bisulphite converted TpG dinucleotides and methylated CpG dinucleotides have been described (16,21), and custom microarray panels using clones of CpG islands (CGI library clones) have been generated (22). The initial CGI library was isolated using the affinity purified methyl-CpG binding domain of MeCP2 (0.2–2 kb sized clones), and the initial experiments were performed prior to sequencing of the clones. Alternative methods of generating methylated DNA libraries rely on methylation sensitive restriction digestion followed by amplification protocols that enrich for methylated CGIs (23,24). Most of the currently available DNA microarrays represent only a small fraction of the genome, and the size of arrayed fragments affects resolution. Thus, an array of 200 kb BAC clones will give relatively low resolution but may cover a larger region of the genome compared with PCR amplicon probes or oligo tilepaths across specific regions-of-interest.
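The trade-off between resolution and coverage can be made concrete with a small calculation of how many arrayed fragments are needed to tile a region end to end. This is only a back-of-the-envelope sketch: the 1 Mb region and the 500 bp amplicon size are assumed values chosen for illustration, and the function name is hypothetical.

```python
def probes_to_tile(region_bp, probe_bp, overlap_bp=0):
    """Minimum number of equally sized probes needed to cover a region.

    Neighbouring probes share overlap_bp bases; all lengths are in base pairs.
    """
    if probe_bp <= 0 or overlap_bp < 0 or overlap_bp >= probe_bp:
        raise ValueError("need probe_bp > overlap_bp >= 0")
    if region_bp <= probe_bp:
        return 1
    step = probe_bp - overlap_bp
    remaining = region_bp - probe_bp
    return 1 + -(-remaining // step)  # integer ceiling division


# Assumed example: tiling a 1 Mb region with fragments of different sizes.
region = 1_000_000
bac_clones = probes_to_tile(region, 200_000)   # 200 kb BAC clones
cosmid_clones = probes_to_tile(region, 40_000)  # 40 kb cosmid clones
amplicons = probes_to_tile(region, 500)         # hypothetical 500 bp amplicons
```

Under these assumptions, a handful of BAC clones covers the region, whereas thousands of amplicons are needed to cover it at amplicon-level resolution.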

Enzyme based global methylation methods are still being used as an alternative to bisulphite treatments. A number of restriction enzymes are methylation sensitive and do not cut recognition sites that contain methylated cytosines. McrBC, a GTP-requiring, modification-dependent endonuclease of Escherichia coli K-12 that specifically recognizes DNA sites of the form 5' R(m)C 3', has been employed to deplete methylation rich sequences while constructing plant libraries (25,26) and has recently been used to comprehensively analyse CGIs in chromosome 21q (27). Enzyme based isolation of methylated or unmethylated DNA has the advantage that the output material can more easily be cloned into libraries, enabling global analyses and avoiding biases introduced through preselection of sequences to be analysed. Methylation sensitive representational difference analysis (Me-RDA) was one of the first techniques used to specifically screen the whole genome for imprinted genes on the basis of differential methylation (28–30). Restriction landmark genomic scanning (RLGS) is a type of two-dimensional electrophoresis that relies on restriction digestion with the rare cutting, methylation sensitive enzyme NotI, radioactive labelling and separation of the fragments in one dimension, followed by in-gel restriction with an enzyme of choice to obtain profiles for analysis (31–33). Downstream analyses include spot cloning and identification by sequencing, comparison to arrayed libraries of NotI–EcoRV fragments or computer based approaches (discussed subsequently). This technique and its applications for methylation detection have been extensively reviewed elsewhere (32).
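The principle behind methylation sensitive digestion can be sketched as an in silico digest. The example below uses the NotI recognition sequence (GCGGCCGC, cut after the second base) and makes the simplifying assumption that a site is left uncut whenever any cytosine within it is methylated. The function names, the toy sequence and the methylated position are invented for illustration; real enzymes differ in which methylated positions block cleavage.

```python
NOTI_SITE = "GCGGCCGC"
NOTI_CUT_OFFSET = 2  # NotI cuts GC^GGCCGC on the top strand


def find_sites(sequence, site):
    """Return the 0-based start of every occurrence of site (overlaps allowed)."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site, cut_offset, methylated_cytosines=frozenset()):
    """Return fragment lengths after a methylation sensitive digest.

    Simplifying assumption: a site is protected if any position inside it is
    listed in methylated_cytosines (0-based coordinates on the top strand).
    """
    cuts = []
    for start in find_sites(sequence, site):
        protected = any(
            pos in methylated_cytosines
            for pos in range(start, start + len(site))
        )
        if not protected:
            cuts.append(start + cut_offset)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence (invented) with two NotI sites, starting at positions 4 and 18.
sequence = "AAAA" + NOTI_SITE + "TTTTTT" + NOTI_SITE + "AA"
unmethylated_fragments = digest(sequence, NOTI_SITE, NOTI_CUT_OFFSET)
# Methylating the CpG cytosine at position 19 protects the second site.
methylated_fragments = digest(sequence, NOTI_SITE, NOTI_CUT_OFFSET, {19})
```

Comparing the two fragment lists shows how a change in methylation at a single site alters the digestion profile, which is the kind of difference that RLGS detects as a changed spot pattern.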

Chromatin profiling platforms
Chromatin structure is an integral part of the epigenome and has started to be unravelled at the genomic level. Genome-wide maps of DNaseI hypersensitive sites have been described after sequencing libraries either enriched for or depleted of hypersensitive DNA (34,35). These maps are useful in the genome-wide identification of regions containing regulatory elements. The first chromatin structure map of the whole human genome, mapping the distribution of compact and open chromatin fibres to the genome and correlating compaction status with gene density and expression status in lymphoblastoid cells, has recently been described (36). In this study, hybridization of density fractionated chromatin to genomic DNA microarrays enabled high resolution maps showing that compact chromatin fibres are not only composed of heterochromatin but also contain some active genes, whereas open chromatin fibres correlate with regions of high gene density rather than gene expression. Further functional readouts of chromatin structure include high resolution maps of replication timing on chromosomes 22 and 6 (37–39). It would be interesting to combine the methylation data obtained in the HEP with the replication timing data.

Microarray technology combined with chromatin immunoprecipitation (ChIP) procedures has been applied to study chromatin structure (ChIP-chip) (40). DNA methylation analyses can be followed by ChIP for histone modifications, methyl-binding proteins, transcription factors, chromatin modifiers and secondary chromatin structure (41). One limitation of this technique is that DNA extracted after immunoprecipitation needs to be linearly amplified prior to hybridization. Protocols which do not rely on linear amplification are more likely to yield quantifiable results. The best methodology for ChIP-chip has not been established, and the majority of primary papers describe using this technology successfully in yeast because of the compact and non-repetitive nature of yeast genomes (reviewed in 42). In Drosophila, global patterns of histone acetylation and methylation have been mapped and correlated to gene expression using ChIP-chip technology (43). Bisulphite sequencing can also be done after chromatin immunoprecipitation (ChIP-BS) (44), which could be useful for examining DNA methylation status in combination with histone modification on a relatively large scale. Preparative ChIP to create libraries enriched for specific transcription factors or chromatin features (e.g., to look for genes regulated by the boundary element CTCF) (45) has been undertaken, but these libraries have not yet been fully sequenced and incorporated into a chromatin map.
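One common way to turn ChIP-chip hybridization signals into a quantitative measure is to compute, for each probe, the log2 ratio of the immunoprecipitated signal to the input signal. The sketch below illustrates this under stated assumptions: the probe names and intensities are invented, and the pseudocount, median centring and twofold threshold are illustrative choices rather than an established standard.

```python
import math
from statistics import median


def log2_ratios(ip, input_, pseudocount=1.0):
    """Per-probe log2(IP / input); the pseudocount avoids division by zero."""
    return {
        probe: math.log2((ip[probe] + pseudocount) / (input_[probe] + pseudocount))
        for probe in ip
        if probe in input_
    }


def median_centre(ratios):
    """Subtract the median ratio so that a typical probe sits at zero."""
    centre = median(ratios.values())
    return {probe: value - centre for probe, value in ratios.items()}


def enriched_probes(ratios, threshold=1.0):
    """Probes whose log2 ratio is at least threshold (1.0 means twofold)."""
    return sorted(probe for probe, value in ratios.items() if value >= threshold)


# Invented intensities for four hypothetical probes.
ip_signal = {"probe1": 799, "probe2": 99, "probe3": 99, "probe4": 199}
input_signal = {"probe1": 99, "probe2": 99, "probe3": 99, "probe4": 99}

raw = log2_ratios(ip_signal, input_signal)
centred = median_centre(raw)
hits = enriched_probes(centred)
```

Because the ratio is taken against input DNA from the same sample, any bias introduced during amplification that affects both channels equally cancels out; bias that differs between channels does not, which is one reason amplification-free protocols are easier to quantify.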

The epigenome is not a linear system of neatly aligned nucleosomes subjected to histone modifications and DNA methylation changes. We also need to think about gene expression and genome usage in a multidimensional way, taking into consideration long range interactions of regulatory regions and secondary chromatin structure. It has now been shown that long range interactions between gene regulatory regions occur through chromatin loops and that the DNA methylation status may either depend upon the loop or influence the loop structure (46,47). Additionally, we should not only be trying to integrate different aspects of epigenetic regulation but also be interpreting this information within the context of the whole nucleus. In recent years, it has been shown that the genome is organized within dynamic chromosome territories, which affects gene expression (48–52). Indeed, it is foreseeable that the next echelon of epigenome maps will be three-dimensional spatial maps of human chromosomes in the nucleus.

Expression profiling platforms
To correlate the epigenome with gene expression, quantitative measurements of expression are required. Currently, profiling of whole genome gene expression patterns is widely performed in both basic and applied research, using techniques such as high-throughput microarrays and real-time PCR methods. Other techniques such as parallel signature sequencing on microbeads (53) and serial analysis of gene expression (reviewed in 54) also provide powerful quantitative approaches for determining expression levels. As RNA and protein levels are subject to post-transcriptional and translational regulation, accurate correlations between epigenomic and expression profiles may be difficult to establish. Additionally, heritable variation in gene expression exists that may be due to sequence variation. An optimal approach would combine allelic gene expression data with a catalogue of candidate regulatory polymorphisms. ChIP technology using antibodies to RNA polymerase II (RNAII pol-ChIP) (55) can be used to establish nascent transcription profiles and chromatin profiles. If the epigenetic code holds true, then chromatin and DNA methylation profiles could eventually predict gene expression patterns. This will become evident as more DNA methylation and histone modification patterns correlated with particular states of gene activity emerge.
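As a minimal illustration of what correlating an epigenomic measurement with expression involves, the sketch below computes a Pearson correlation coefficient between promoter methylation and expression across a set of genes. The values are invented and the function name is hypothetical; the choice of Pearson correlation is an assumption made for simplicity, and a rank based measure may suit real data better.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Invented values for five hypothetical genes:
# fraction of methylated CpGs in the promoter, and log2 expression level.
promoter_methylation = [0.9, 0.7, 0.5, 0.2, 0.1]
log2_expression = [1.0, 2.5, 4.0, 6.5, 7.0]

r = pearson(promoter_methylation, log2_expression)
```

In this toy data set the coefficient is strongly negative, consistent with heavily methylated promoters being weakly expressed. A correlation of this kind says nothing about post-transcriptional regulation or sequence variation, which are the confounders noted above.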

Bioinformatics
Bioinformatic approaches to genome-wide prediction of CpG methylation have been limited to in silico simulation analyses, such as comparing virtual image restriction landmark genomic scanning with real RLGS in Arabidopsis and mice (31), and no complete CpG maps have yet been produced. Further computational search algorithms for epigenetic features have been applied to search for imprinting signatures (56,57) and PcG elements (58). As part of the Human Genome Project, bioinformatics was invaluable for integrating curated and computationally predicted genomic data into flexible, public databases such as ENSEMBL and the UCSC Genome Browser. Apart from the DNA methylation databases (www.epigenome.org and www.methdb.de), there is as yet no general epigenome database that integrates all epigenetic data derived from the DNA, RNA, chromatin and protein levels. One of the main problems hampering the development of such an epigenome database is the lack of primary databases (such as EMBL/GenBank/DDBJ) to which epigenetic data can be submitted.
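The idea of superimposing different epigenetic data layers on genome annotation can be sketched with a simple coordinate based data structure. In the example below, every record is an interval on a chromosome tagged with the layer it came from, and features are grouped under the genes they overlap. The class, field names, coordinates and labels are all hypothetical; a real database would use indexed interval queries rather than the pairwise comparison shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    layer: str  # data layer, e.g. "gene" or "dna_methylation"
    label: str  # free-text name or measurement


def overlaps(a, b):
    """True if two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def superimpose(genes, features):
    """Group features under each overlapping gene: {gene: {layer: [labels]}}."""
    result = {}
    for gene in genes:
        layers = {}
        for feature in features:
            if overlaps(gene, feature):
                layers.setdefault(feature.layer, []).append(feature.label)
        result[gene.label] = layers
    return result


# Invented annotation and measurements.
genes = [
    Feature("chr6", 1000, 5000, "gene", "geneA"),
    Feature("chr6", 8000, 9000, "gene", "geneB"),
]
features = [
    Feature("chr6", 900, 1200, "dna_methylation", "CGI, 85% methylated"),
    Feature("chr6", 4000, 4500, "histone_acetylation", "ChIP-chip peak"),
    Feature("chr22", 1000, 2000, "dna_methylation", "CGI, 10% methylated"),
]

epigenome_view = superimpose(genes, features)
```

Keying every layer to the same genome coordinates is what allows methylation, chromatin and expression data from different platforms to be queried together.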




Updated on: 31 Oct 2006
