Global alignment of the 5'-flanking region of mammalian HL genes revealed three highly conserved elements (P ≤ 10⁻⁵) that lie far upstream of the HL promoter (Fig. 1). Two of these elements, at -14 kb and -22 kb, show moderate enhancer activity in HepG2 cells. What discriminates the conserved -14 kb and -22 kb elements from the non-functional, non-conserved -10 kb sequence is unclear, as all three sequences contain a similar repertoire of TFBS for liver-expressed transcription factors (data not shown). Further studies are required to clarify the mechanism responsible for the enhancer activity of the two highly conserved elements in the HL gene. The finding of two hitherto unknown enhancers supports the hypothesis that conserved non-coding sequences may identify functional regulatory elements. Experimentally, we also found a positive and a negative regulatory sequence between -2.2 and -0.4 kb of the rat HL gene that coincided with homology peaks but were not recognized by the RankVISTA analysis of the sequence comparison. Rubin's group recently demonstrated strong in vivo enhancer activity for almost half of the elements that are ultra-conserved among human, mouse and rat [8,27]. Our study further illustrates the power of the approach, and suggests that gene regulatory functions may also reside in elements that are somewhat less conserved among mammalian genomes.
We also tested whether global genome comparisons can aid in the identification of functional regulatory elements within highly conserved sequences, using the proximal HL promoter region as a model. Within this proximal promoter region, three modules with conserved clusters of TFBS motifs were identified. These modules A, B and C correspond to the previously identified regulatory elements DR1 [25], HNF1 [22-24] and Inr [22-24], respectively. However, we missed an additional module (-295 to -265) that has recently been identified as a functional DR4 site [25]. The cluster of TFBS within this module appeared to be conserved between human and mouse, but not between human and rat. Despite the relatively high homology between mouse and rat over the proximal 5'-flanking region of the HL gene (Table 1), the outcome of the genomic sequence analysis differed depending on whether the rat or the mouse sequence was used. Hence, although searching genomic sequences for conserved clusters of TFBS is a valuable tool for predicting functionally important regulatory elements, this approach is suboptimal.
For two of the modules that are conserved among the four species, a significant contribution to basal transcription was confirmed by promoter assays in HepG2 cells. For module C (-25 to +5), this is not surprising, since it contains the transcriptional start site itself as well as a pyrimidine-rich stretch that may serve as an initiator region (Inr). Module B (-80 to -40) overlaps with a protected region in DNase footprinting in rat liver [22] as well as in HepG2 cells [23], and contains an HNF1 binding site that has been implicated in liver-specific expression of the human HL gene by other groups [22-24]. Experimentally, we could not confirm a major role for module A (-240 to -200) in determining basal transcription activity in HepG2 cells. This is surprising, since it corresponds to a functional DR1 site [25] and perfectly matches a protected region in DNase footprinting in rat liver and human HepG2 nuclear extracts [22,23], suggesting that this part of the HL promoter is occupied by transcription factors under basal conditions. Similarly, we could not confirm a role for the DR4 module (-295 to -265), which is conserved between human and mouse, in basal transcriptional activity in HepG2 cells. We propose, therefore, that this part of the HL promoter region is involved in modulation of gene transcription under different hormonal or nutritional conditions.
We show here that the conserved module B (-80 to -40) plays a dual role in mediating liver-restricted transcription of the HL gene. On the one hand, the module mediates moderate stimulation of minimal promoter activity in liver-derived HepG2 cells; on the other hand, it mediates inhibition of minimal promoter activity in non-hepatic HeLa cells. Of the potential TFBS identified in module B, the site for the liver-enriched HNF1 is a likely candidate for effecting the liver-specific activation of the HL promoter. Other groups have already suggested an important role for the HNF1 binding site [22-24], and in vitro HNF1 binding to this sequence has been demonstrated by gel-shift assays [24]. Furthermore, HNF1α knockout mice have 3.4-fold lower HL mRNA levels than control mice [28]. In primary hepatocytes, HL secretion increases with HNF1α gene dosage [28]. However, HL mRNA and HL secretion are not completely lost upon HNF1α knockout, indicating that HNF1α is not the only transcription factor determining HL expression in liver. HL secretion was only observed with hepatoma cell lines that express HNF1α or HNF1β mRNA [24], but not all cell lines with detectable HNF1α or HNF1β expression also secrete HL. In fact, HL secretion correlated with expression of HNF4 rather than HNF1 mRNA [24]. The HNF4α gene itself is a target of HNF1α [29]. Since potential HNF4α binding sites were detected in the conserved module A (as well as in the -295 to -265 module), the liver-specific stimulation of HL promoter activity may well be mediated by HNF4α. In fact, HNF4α is bound to the promoter regions of almost half of the actively transcribed genes in human liver [29] and therefore contributes to a large fraction of liver-specific gene expression. Sequence modules that contain both HNF1 and HNF4 binding sites are among the strongest predictors of liver-specific transcription [10]. Rufibach et al. [25] proposed that HNF1α and HNF4α independently and additively activate HL promoter activity.
Which transcription factor(s) mediate the inhibition of minimal promoter activity in cells of non-hepatic origin remains unknown at present.