Neuronal mRNA-tagging yields reproducible microarray expression profiles
To profile gene expression throughout the nervous system, we generated a stable, chromosomally integrated transgenic line expressing an epitope-tagged poly-A binding protein (FLAG::PAB-1) throughout the nervous system. Pan-neuronal expression was confirmed by immunostaining with a FLAG-specific antibody (Figure 1). We selected the second larval stage (L2) to test the application of the mRNA-tagging method. At this stage, the nervous system is largely in place and should, therefore, express a broad array of transcripts that define the development and function of most neurons. Sub-microgram quantities of mRNA isolated by the mRNA-tagging method were amplified and labeled for application to an Affymetrix chip representing approximately 90% of predicted C. elegans genes. Neuron-enriched transcripts in these samples were detected by comparison to a reference profile of all larval cells (see Materials and methods). We reasoned that this approach should detect a significant fraction of known neuronal transcripts and thus provide an initial test of the specificity of this strategy.
Comparisons of independently derived datasets for both the experimental (larval pan-neural) and reference samples showed that individual replicates for each condition are highly reproducible (Figure 2a,b). For example, an average coefficient of determination (R2) of approximately 0.96 was calculated from pairwise combinations of each individual reference dataset (Figure 2d). The pan-neural datasets were similarly reproducible (R2 of approximately 0.96; Figure 2e). The overall concurrence of these data is graphically illustrated in the scatter plots shown in Figure 2a,b.
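The reproducibility measure described above can be illustrated with a short Python sketch. It is not the analysis pipeline used in this study: the function names and the toy intensity values are hypothetical, and R2 is assumed here to be the squared Pearson correlation between two replicate intensity vectors, averaged over all pairwise combinations of replicates.

```python
from itertools import combinations
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equally long intensity vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def mean_pairwise_r_squared(replicates):
    """Average R^2 over every pair of replicate datasets."""
    return mean(r_squared(a, b) for a, b in combinations(replicates, 2))


# Hypothetical log-scale intensities for three replicates (not data from the study).
toy_replicates = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 1.9, 3.2, 3.9],
    [0.9, 2.1, 2.9, 4.2],
]
print(round(mean_pairwise_r_squared(toy_replicates), 3))
```

Values close to 1 indicate that replicate hybridizations rank and scale transcripts almost identically, which is the sense in which the reference and pan-neural datasets are described as reproducible.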
Transcripts detected by neuronal mRNA-tagging are expressed in neurons
Scatter plots comparing larval pan-neural versus reference data revealed a substantial number of transcripts with significant differences in hybridization intensities (Figure 2c). Statistical analysis detected 1,562 transcripts with elevated expression (≥ 1.5-fold, ≤ 1% false discovery rate (FDR)) in the larval pan-neural sample (Additional data file 1). Strikingly, we found that 92% of the 443 genes with known expression patterns included in the larval pan-neural enriched dataset (409/443) are listed in WormBase [15] as neuronally expressed (Figure 3a; Additional data file 1). By contrast, only 57% of all genes (1,612/2,837) with defined expression patterns in WormBase are annotated as expressed in neurons (see Materials and methods; Figure 3a; Additional data files 2 and 3). Moreover, genes with key roles in neuronal function are highly represented in this list. For example, 55 transcripts encoding ion channels, receptors or membrane proteins with known expression in the C. elegans nervous system are enriched (Figure 3b; Additional data file 7). The enrichment of transcripts known to be expressed in neurons demonstrates that the larval pan-neural profile is largely derived from neural tissue. This conclusion is also substantiated by the finding that mRNAs highly expressed in other cell types are preferentially excluded from this dataset (Figure 2c). For example, microarray profiling experiments identified a total of 1,926 transcripts enriched in either larval germline, muscle or intestinal cells (GMI; Additional data file 5) [13]. This set of genes is significantly under-represented (97/1,562) in the larval pan-neural dataset (representation factor 0.6, p -9; a representation factor below 1 indicates under-representation; Figure 4a; Additional data file 6). Independent results have confirmed that at least one of these, the acetylcholine receptor subunit acr-16, is expressed in both muscle and neurons [16,17].
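The representation factor quoted above can be illustrated with a short calculation. The sketch below assumes the common definition: the observed overlap between two gene lists divided by the overlap expected by chance if both lists were drawn independently from the same gene universe. The universe size is not stated in this section, so the value used here (ASSUMED_UNIVERSE) is a hypothetical placeholder chosen only for illustration; the function name is likewise an assumption.

```python
def representation_factor(overlap, size_a, size_b, universe):
    """Observed overlap divided by the overlap expected by chance."""
    expected = size_a * size_b / universe
    return overlap / expected


# Hypothetical number of genes on the array; NOT a value taken from the study.
ASSUMED_UNIVERSE = 18_600

# Counts from the text: 97 shared genes, 1,562 pan-neural enriched, 1,926 GMI enriched.
rf = representation_factor(97, 1_562, 1_926, ASSUMED_UNIVERSE)
print(f"{rf:.2f}")
```

With this placeholder universe the result is close to the reported factor of 0.6. Any value below 1 means that the two lists share fewer genes than expected by chance, whereas a value above 1 indicates over-representation.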
The apparent low frequency of false positives empirically defined by these comparisons is consistent with the estimated FDR of ≤ 1% for this dataset. The stringent exclusion of non-neuronal transcripts has been achieved, however, while retaining sensitivity to transcripts that may be expressed in limited numbers of neurons (Figure 5). For example, our methodology identifies genes that are expressed in only two neurons; daf-7 (transforming growth factor (TGF)-beta-like peptide expressed in ASIL and ASIR) [18] and gcy-8 (guanylate cyclase expressed in AFDL and AFDR) [19] (Figure 5).
The strong enrichment of known neuronal genes in the larval pan-neural dataset indicates that other previously uncharacterized transcripts in this list are also likely to be expressed in the nervous system. To test this prediction, we evaluated GFP reporter genes for representative transcripts in this profile. As shown in Table 1 and Additional data file 17, all but one of the transgenic lines (24 of 25) derived from these promoter-GFP fusions show expression in neurons (Figure 6). Of the GFP reporters tested, 56% (14/25) are exclusively detected in neurons (Additional data file 17). For example, the stomatin gene sto-4 is highly expressed in ventral cord motor neurons, touch neurons and in head and tail ganglia (Table 1; Figure 6d,h). Our GFP reporter analysis demonstrates that the remaining 11 genes tested are expressed in other tissues in addition to neurons. For instance, the GFP reporter for C04E12.7 (phospholipid scramblase), which is expressed widely throughout the nervous system, is also expressed in muscle cells (Table 1; Figure 6c). Thus, these results indicate that the genes identified in the larval pan-neural profile largely fall into two classes: those that are exclusively expressed in neurons, and those that are expressed in multiple tissues, including neurons. Our finding of neuronal GFP expression for transcripts exhibiting a wide range of enrichment (1.5- to 8.3-fold) predicts that most of the genes in this list that have not been directly tested are also likely to be expressed in neurons. Together, these results demonstrate that our pan-neural mRNA-tagging approach enriches for bona fide neuronally expressed transcripts and effectively excludes transcripts expressed exclusively in other tissues.
Gene families enriched in neurons of C. elegans larvae
Protein-encoding genes in the enriched larval pan-neural profile were organized into groups on the basis of KOGs and other descriptions that identify functional or structural categories (Table 2; Additional data file 4) [20]. Over half (880/1,562) are homologous to proteins in at least one other widely diverged eukaryotic species (that is, KOGs and TWOGs), 49 of which are classified as uncharacterized conserved proteins. Homologs for an additional 225 pan-neural enriched proteins are limited to other nematode species (that is, LSEs).
Transcripts encoding proteins with fundamental roles in neuronal activity or signaling are highly represented in this dataset (for a comprehensive list see Additional data file 4). For example, in addition to the 34 synaptic vesicle (SV) associated transcripts from Figure 3b (Additional data file 7), transcripts for 19 proteins with potential roles in synaptic vesicle function are identified (Figure 7). These include six members of the synaptotagmin family of calcium-dependent phospholipid binding proteins (snt-1, snt-4, snt-5, snt-6, DH11.4, T10B10.5), only one of which, snt-1, has been previously shown to function in neurons [21]. Expression of the additional synaptotagmin genes in the nervous system may account for the residual synaptic vesicle function of snt-1 mutants [21]. Three members of the copine family (B0495.10, tag-64, T28F3.1), a related group of calcium-binding proteins with potential roles in synaptic vesicle fusion (listed as part of endocytosis machinery in Figure 7), are also enriched [22].
In addition to genes with general functions in synaptic vesicle signaling, the larval pan-neural profile includes transcripts encoding proteins with roles specific to particular neurotransmitters. For example, the plasma membrane and vesicular transporters for choline and acetylcholine (cho-1 and unc-17), GABA (snf-11 and unc-46, unc-47), dopamine (dat-1 and cat-1), and glutamate (glt-3 and eat-4) are included (Figure 7) [23-27]. The corresponding families of neurotransmitter-specific ligand-gated ion channels are highly represented, including 22 members of the ionotropic nicotinic acetylcholine (ACh) receptor family (Additional data file 4). Other classes of ion channels with key neural functions are also abundant, such as potassium channels (24), voltage-gated calcium channels (10) and DEG/ENaC sodium channels (10) (Table 2).
The wide range of neurotransmitter-specific genes in the larval pan-neural dataset reflects the diverse array of neuron types in C. elegans (Figure 5). This point is underscored by the detection of a large number of transcription factors with established roles in neuronal specification (Table 3). These include UNC-86, the POU homeodomain protein that regulates the differentiation of a broad cross-section of neuron classes [28-30], as well as transcription factors that define specific neuronal subtypes, such as the canonical LIM homeodomain MEC-3 (mechanosensory neurons) [31-33] and the UNC-4 homeodomain (A-class ventral cord motor neurons, see below) [34-37]. Transcription factors with undefined roles in the nervous system are also identified. Of particular note are 15 members of the nuclear hormone receptor (NHR) family, only one of which, fax-1, has been previously shown to regulate neuronal differentiation [38].
A striking example of the power of this profiling approach is revealed by strong enrichment for genes involved in peptidergic signaling. Neuropeptides are potent modulators of synaptic transmission. A combination of genetic and pharmacological experiments has assigned specific neuromodulatory roles to FMRFamide and related peptides (FaRPs) encoded by members of the 'flp' (FMRFamide-like peptides) gene family [39]. Examples include flp-13 (cell excitability) [40], flp-1 (locomotion) [41] and flp-21 (feeding behavior) [42]. The enriched status of the majority of flp genes (20/23) in the larval pan-neural profile (Figure 4b) parallels immunostaining and GFP reporter results showing expression of this gene family in the C. elegans nervous system [43]. Transcripts encoding insulin-like peptides (ins) and neuropeptide-like genes (nlp) are among the most highly enriched mRNAs in the pan-neural dataset (Additional data file 4). Neuropeptide activating proteases such as the proprotein convertase egl-3 and the carboxypeptidase egl-21 are also elevated [44]. Finally, we detect 136 members of the G-protein coupled receptor (GPCR) family, including four GPCRs (npr-1, npr-2, npr-3 and T19F4.1) that have been either directly identified as neuropeptide receptors or implicated in neuropeptide-dependent behaviors [42,45,46] (E Siney, A Cook, N Kriek, L Holden Dye, personal communication). The strong representation of diverse neuropeptidergic components in the larval pan-neural profile suggests a nervous system that is richly endowed with complex signaling pathways for modulating function and behavior.
Embryonic and larval nervous systems express many common sets of genes
To complement the profile of the larval nervous system obtained by the mRNA-tagging method, a pan-neural GFP reporter gene [47] (J Culotti, personal communication) was used to mark embryonic neurons for MAPCeL analysis. GFP-labeled neurons were isolated by FACS to ≥ 90% purity from primary cultures of embryonic cells (see Materials and methods). Comparisons of independent replicates showed that these data are highly reproducible (Additional data file 8). We identified 1,637 enriched genes (≥ 1.5-fold, FDR ≤ 1%) versus a reference dataset obtained from all embryonic cells (Additional data file 1). The majority (82%) of transcripts in this list with known expression patterns are expressed in neurons (Figure 3a). All of the promoter-GFP fusions (10/10) created from previously uncharacterized genes in the enriched embryonic pan-neural dataset showed expression in neurons, further validating this MAPCeL profile (Table 1; Additional data file 17). A comparison of the embryonic (MAPCeL) and larval (mRNA-tagging) profiles reveals considerable overlap, with approximately 45% of transcripts (710/1,637; representation factor 5.2, p -325) enriched in the embryonic neurons also elevated in larval neurons (Figure 8a). The intersection of these two datasets is significantly enriched (96%) for known neuron-expressed genes. The high likelihood of neural expression for these transcripts is underscored by our finding that a set of approximately 240 candidate neural genes originally identified as including a presumptive pan-neural regulatory motif ('N1 box') are overrepresented (35%, representation factor 2.6, p -17) in this subset of pan-neural transcripts [48].
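The significance of an overlap such as the one between the embryonic and larval pan-neural lists can be illustrated with a tail probability. The sketch below assumes a hypergeometric model, which is a common choice for gene-list overlaps but is an assumption here; the test actually used is described in Materials and methods, not in this section. The function names and the universe size (ASSUMED_UNIVERSE) are hypothetical. The result is returned as log10(p) because probabilities for large overlaps underflow ordinary floating point.

```python
from math import exp, lgamma, log


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log10_upper_tail(overlap, size_a, size_b, universe):
    """log10 of P(X >= overlap) for a hypergeometric draw of size_b genes
    from a universe that contains size_a marked genes."""
    lo = max(overlap, size_a + size_b - universe, 0)
    hi = min(size_a, size_b)
    if lo > hi:
        return float("-inf")
    log_total = log_comb(universe, size_b)
    logs = [
        log_comb(size_a, k) + log_comb(universe - size_a, size_b - k) - log_total
        for k in range(lo, hi + 1)
    ]
    peak = max(logs)  # log-sum-exp keeps the sum numerically stable
    return (peak + log(sum(exp(v - peak) for v in logs))) / log(10)


# Hypothetical number of genes on the array; NOT a value taken from the study.
ASSUMED_UNIVERSE = 18_700

# Counts from the text: 710 shared, 1,637 embryonic and 1,562 larval enriched genes.
print(log10_upper_tail(710, 1_637, 1_562, ASSUMED_UNIVERSE))
```

Because the universe size is a placeholder, the printed value should be read only as an order-of-magnitude illustration of why so large an overlap is very unlikely to arise by chance.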
As an additional test of the similarities between these independent datasets, we examined the embryonic and larval pan-neural profiles for elevated expression of gene families with roles in synaptic vesicle function (Figure 7a). Both the embryonic and larval pan-neural datasets were enriched for many of these components. In contrast, the majority of these transcripts are not upregulated in a MAPCeL profile of embryonic muscles (RMF, DMM, unpublished data). Interestingly, the one exception to this correlation, the GABA transporter snf-11, is known to be expressed in body wall muscle in addition to neurons [26].
Examination of the embryonic and larval pan-neural datasets confirmed expression of genes that regulate the dauer pathway in C. elegans neurons. The dauer larva adopts an alternative developmental program to withstand stressful conditions (for instance, starvation, overcrowding, high temperature). The decision to adopt the dauer state is regulated by the nervous system and is triggered during the L1/L2 transition in response to environmental cues [49-54]. Figure 9 graphically represents the dauer pathway genes identified in the combined pan-neural datasets. Of particular note is a conserved insulin-dependent signaling pathway (for example, age-1/PI3Kinase) that also regulates lifespan in C. elegans and in other species [28].
Transcription factors constitute the largest gene family that is differentially enriched between the embryonic and larval pan-neural profiles (Table 3). For example, the combined pan-neural datasets detect a total of 30 NHRs. However, 16 NHRs are exclusively detected in embryonic neurons, whereas only six are enriched solely in larval neurons. Homeodomain transcription factors are also unequally distributed across the two datasets. Of 32 enriched homeoproteins, 24 are exclusive to the larval pan-neural profile, whereas only 4 are selectively elevated in the embryonic pan-neural dataset (Table 3). The relative lack of enrichment of homeodomain mRNAs in the embryonic pan-neural profile was initially surprising given strong genetic evidence for the widespread role of the members of this transcription factor class in embryonic neural development [31,47,55-57]. A likely explanation for this finding is that many homeobox transcripts are dynamically expressed in multiple cell types in the embryo but are increasingly restricted to neurons during larval development [56,58]. This view is consistent with our observation that a majority (22/28) of homeodomain genes that are enriched in the larval pan-neural dataset are in fact also detected as expressed genes in the embryonic pan-neural profile (see below).
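The counts in this paragraph can be reconciled with simple set arithmetic. The sketch below assumes that every enriched gene in a family falls into exactly one of three groups (embryonic only, larval only, or both); the function name is hypothetical and the inputs are the counts quoted above.

```python
def shared_count(total, embryonic_only, larval_only):
    """Genes enriched in both profiles, given the family total and the
    two exclusive counts (assumes the three groups are exhaustive)."""
    shared = total - embryonic_only - larval_only
    if shared < 0:
        raise ValueError("exclusive counts exceed the family total")
    return shared


nhr_shared = shared_count(total=30, embryonic_only=16, larval_only=6)
homeo_shared = shared_count(total=32, embryonic_only=4, larval_only=24)

# Homeodomain genes enriched in the larval profile = larval-only plus shared.
larval_homeo_enriched = 24 + homeo_shared

print(nhr_shared, homeo_shared, larval_homeo_enriched)  # prints: 8 4 28
```

Under this assumption, 8 NHRs and 4 homeodomain genes are enriched in both profiles, and the 28 homeodomain genes enriched in the larval dataset match the denominator of the 22/28 fraction cited at the end of the paragraph.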
Homologs of C. elegans neural genes are expressed in the mammalian brain
Over half of the enriched transcripts identified in the embryonic and larval pan-neural profiles have likely homologs in mammals (Additional data file 1). A substantial fraction of these transcripts encodes members of protein families with conserved roles in neural function or development (for instance, synaptic vesicle proteins; Figure 7b). We also identified neuron-enriched transcripts from C. elegans that are conserved but have largely undefined in vivo biochemical functions. For example, of the 711 transcripts that are enriched in both the embryonic and larval pan-neural datasets (Figure 8a), 27 encode uncharacterized conserved proteins (Additional data file 9). To determine if these transcripts are also detected in the mammalian brain, we queried the Allen Brain Atlas [59], which catalogs in situ hybridization results for 20,000 mouse transcripts (see Materials and methods). Of the 27 uncharacterized conserved genes from C. elegans, 26 have mouse homologs and 25 are included in the Allen Brain Atlas. We find that 76% (19/25) of these genes are detected in the mouse brain and, therefore, suggest that neural functions for these genes are likely conserved from nematodes to mammals. For instance, one member of this group of genes, osm-12, is the C. elegans homolog of a human disease gene, BBS7. Bardet-Biedl syndrome (BBS; OMIM 209900) is a rare, pleiotropic disorder with multiple pathologies (obesity, rod-cone dystrophy, cognitive impairment) [60]. At least 12 genes (BBS1-12) have been linked to this disease [61]. osm-12 and other BBS genes are highly expressed in ciliated neurons in C. elegans and genetic studies suggest key roles in intraflagellar transport [62]. These findings and additional work in other systems have led to the hypothesis that basal body dysfunction could be the root cause of BBS [63-66]. Thus, we propose that genetic studies in C. elegans of other uncharacterized conserved genes detected in the pan-neural enriched profile may be instructive.
The C. elegans interactome identifies neuronal genes potentially involved in synaptic function
The C. elegans interactome documents approximately 5,500 protein-protein interactions derived from yeast two-hybrid results, from interologs (that is, interactions between protein homologs in other species) and from functional interactions described in the literature [67]. To gain insight into the functional significance of prospective neural genes identified by these microarray datasets, we looked for evidence of interactions among proteins encoded by these genes in the Interactome database (see Materials and methods). The 711 transcripts enriched in both the embryonic and larval pan-neural datasets were uploaded for this analysis (Figure 8a). This search generated an interaction map with a single prominent cluster. Most of the transcripts in this group (30/34) are detected in at least one of the pan-neural datasets (Figure 10). Our finding that the majority of genes in this interactome group are expressed in the nervous system favors the idea that these networks reflect authentic interactions in neurons. We note that 13 of the proteins in this list (yellow circles in Figure 10) have not been previously assigned to the nervous system. Annotation of this interactome map with functional data for each corresponding protein revealed two distinct subclusters featuring roles in either synaptic transmission or nucleic acid binding. For example, the JIP3/JSAP1 JNK scaffolding protein, UNC-16, interacts with KLC-2 (kinesin light chain) to regulate vesicular transport in neurons [68]. Other members of this interacting complex, MKK-4 (MAP kinase kinase) and JNK-1 (Jun kinase) are also required for maintaining normal synaptic structure [69,70]. These findings suggest that additional proteins in this subcluster may function at the synapse. F43G6.8 (E3 ubiquitin ligase) and B0547.1 (COP-9 signalosome subunit) are attractive possibilities as synaptic development and function are regulated by ubiquitin-dependent protein degradation [71]. 
As more phenotypic data are compiled, this analysis can be extended to encompass data derived from RNA interference (RNAi) experiments, which may yield models for molecular machines that function in neurons [72].
An mRNA-tagging transcriptional profile of a small subset of neurons
Although our gene expression profiles of the embryonic and larval nervous systems provide a comprehensive list of transcripts that function in neurons, these data lack the spatial resolution to identify the specific neurons in which these transcripts are expressed. For instance, the dopamine transporter, dat-1, is highly enriched (15.9-fold) in the larval pan-neural dataset, but dat-1 expression is limited to eight dopaminergic neurons [73]. Other transcripts that are also restricted to a small number of neurons, however, might not be detected in a global profile of the entire nervous system. For example, the genes gcy-5 and gcy-6 (guanylate cyclase) are each expressed in single neurons, ASER and ASEL [74], respectively, and neither is enriched in the larval pan-neural dataset. The application of the mRNA-tagging strategy to individual classes of neurons should, therefore, correlate gene expression with specific neurons as well as detect low abundance transcripts with potential key functions in these cells. To test this idea, we used the unc-4 promoter to express FLAG-PAB-1 in only the subset of neurons in the ventral nerve cord that express the UNC-4 homeodomain protein. In the L2 larva, unc-4::GFP and unc-4::LacZ reporters show strong expression in a total of 18 neurons: VA motor neurons (12), SAB motor neurons (3), the I5 pharyngeal motor neuron (1) and AVF interneurons (2) [35,75]. Weaker, sporadic expression is observed in nine embryonically derived DA motor neurons at this stage. (unc-4 is strongly expressed in the DAs in the embryo and in L1 larvae.) To increase the sensitivity of the mRNA-tagging method for profiling these neurons, PAB-1 was labeled with three tandem repeats of the FLAG epitope (3XFLAG). Figure 11a,b show a mid-L2 larval animal (NC694) expressing the unc-4::3XFLAG::PAB-1 transgene in VA, SAB, and I5 motor neurons and in AVF interneurons; less intense expression is seen in the DA motor neurons.
Because most (24/27) of the neurons in this group are members of the 'A-class' of ventral cord excitatory motor neurons (VA, SAB, DA), we will refer to the mRNA-tagging data obtained from this transgene as the 'larval A-class motor neuron' profile (Figure 9).
As previously observed for the larval pan-neural data (Figure 2), independent hybridizations resulted in highly reproducible data for the larval A-class motor neuron profile (Additional data file 8). A comparison of the A-class hybridization data to the reference sample of mRNA from the average larval cell detected 412 enriched genes (see Materials and methods). Of the 114 genes in this list with known expression patterns, 102 (approximately 90%) are found in neurons (Figure 3a). Of these genes, 96 have detailed spatial information, and 76 (approximately 80%) of these show annotated expression in regions that also contain UNC-4-expressing neurons (Additional data file 1). Of particular note, the native unc-4 transcript, which is selectively expressed in these neurons in vivo, is the most highly enriched (eight-fold) mRNA in this dataset. Other known A-class motor neuron genes in this list include the vesicular ACh transporter (VAChT) unc-17 and the Olf/EBF transcription factor unc-3 (Figure 11c) [75,76]. In contrast, transcripts known to be restricted to other cell types, such as muscle (myo-2, unc-22) or GABAergic neurons (unc-25), are depleted from the A-class neuronal profile (Figures 4a and 11c). For instance, [13].
All of the GFP reporter lines (19/19) constructed for A-class enriched transcripts (Table 1; Additional data file 17) are expressed in UNC-4 neurons. For example, in the mid-L2 stage ventral nerve cord, mec-12::GFP is expressed in DA, VA, VB and VD motor neurons (Figure 6a,e) and syg-1::GFP (Ig domain) is detected in DA and VA motor neurons among others (Figure 6g). These results strongly suggest that most of the genes in the UNC-4 neuron enriched dataset are expressed in these cells in vivo. Thus, these data indicate that the mRNA-tagging method can produce a reliable profile of subsets of neurons in C. elegans.
A subset of pan-neural genes are expressed in larval A-class motor neurons
Nearly 70% of the larval A-class enriched transcripts (282/412) are also elevated in the larval pan-neural dataset (representation factor 8.2, p -209; Additional data file 10). As expected, genes with known functions in all neurons are highly represented in this group (Table 2). Synaptic vesicle associated transcripts that are widely expressed in the nervous system, such as rab-3 (G-protein), snt-1 (synaptotagmin) and snb-1 (synaptobrevin), are enriched in both datasets. Absences from the larval A-class profile are correlated with class-specific functions in neurons. For example, the 60 transcripts encoding proteins involved in synaptic transmission enriched in the larval pan-neural dataset include vesicular transporters for GABA (unc-47), glutamate (glt-3), dopamine/serotonin (cat-1) and acetylcholine (unc-17) (Figure 7b) [24]. The selective enrichment of the vesicular ACh transporter unc-17 in the larval A-class profile is consistent with the known cholinergic signaling capacity of A-class motor neurons [75]. In another striking example of neuron-specific gene expression, the 'mec' genes, which are required for normal differentiation or function of mechanosensory neurons, are highly represented in the larval pan-neural dataset but are not detected in the larval A-class profile (Table 4) [77]. The one exception is the alpha-tubulin encoding gene, mec-12, for which enriched expression in A-class neurons was confirmed with a GFP reporter gene (Figure 6a,e). As described above, most of the known flp genes are enriched in the pan-neural dataset [39]. A subset of five flp genes is found in the A-class dataset (flp-2, 4, 5, 12, 13), providing enhanced spatial resolution for the expression repertoire of this large family of neuropeptide transmitters (Figure 4b).
The A-class profile includes approximately 130 transcripts that are not detected in the larval pan-neural dataset (Additional data file 10). Interestingly, approximately 20% of these genes (23/127) encode collagen-like proteins for which neural functions are largely undefined. cle-1, which encodes a type XVIII collagen, the one member of this protein family that does have a documented role in the nervous system [78], is enriched in both the larval pan-neural and A-class datasets. We speculate that post-embryonic motor neurons may secrete collagens and other extracellular matrix components for assembly into the basement membrane that envelopes the ventral nerve cord [79]. Indeed, our data confirm that UNC-6 (netrin), a critical extracellular matrix signal that steers migrating cells and neuronal growth cones, is highly expressed in larval A-class motor neurons (Figure 12) [80].
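The split of the A-class list into transcripts shared with the pan-neural dataset and transcripts unique to A-class neurons amounts to a pair of set operations. The sketch below is illustrative only: the function name is hypothetical, the toy identifiers are placeholders rather than genes from the study, and only the counts at the end are taken from the text.

```python
def split_enriched(a_class, pan_neural):
    """Return (shared, a_class_only) for two sets of enriched gene identifiers."""
    return a_class & pan_neural, a_class - pan_neural


# Placeholder identifiers, NOT gene lists from the study.
toy_a_class = {"gene_1", "gene_2", "gene_3", "gene_4"}
toy_pan_neural = {"gene_1", "gene_2", "gene_5"}
shared, a_class_only = split_enriched(toy_a_class, toy_pan_neural)
print(sorted(shared), sorted(a_class_only))

# Counts reported in the text: 412 A-class enriched transcripts, 282 of them shared.
a_class_only_count = 412 - 282
# Collagen-like genes among the A-class-only transcripts, as reported (23/127).
collagen_fraction = 23 / 127
print(a_class_only_count, round(collagen_fraction, 2))
```

The difference of 130 corresponds to the "approximately 130" A-class-specific transcripts, and 23/127 is about 18%, consistent with the stated "approximately 20%".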
Comparison of transcripts enriched in embryonic versus larval A-class motor neurons
We have previously used the MAPCeL strategy to profile embryonic motor neurons marked with unc-4::GFP [5]. These include 12 embryonic A-class motor neurons (9 DA and 3 SAB) and a single pharyngeal neuron, I5 [5]. The embryonic A-class motor neurons are similar to the post-embryonic VAs in that they express unc-4, are cholinergic, extend anteriorly directed axons, and receive inputs from the command interneurons AVA, AVD, and AVE [79]. The strong overlap of these distinct morphological and functional traits as well as some residual larval expression of unc-4 in embryonic A-class motor neurons (Figure 11b) are consistent with the observation that approximately 40% of transcripts enriched in the larval A-class motor neuron dataset (162/412) are also elevated in the embryonic A-class motor neuron MAPCeL profile (representation factor 7.4, p -99; Figure 8b; Additional data file 10). Transcripts from the cholinergic locus, cha-1 (choline acetyl transferase) and unc-17 (vesicular ACh transporter), which are essential for the biosynthesis and packaging of ACh into synaptic vesicles, are enriched in both A-class motor neuron profiles [24]. In addition to these gene families, several others are enriched in both embryonic and larval A-class motor neurons (Additional data file 19). ACh signaling depends on the synaptic vesicle cycle and genes with key roles in this mechanism are elevated in both datasets: these include unc-18, snt-1 (syntaxin), snn-1 (synapsin), ric-4 (SNAP-25), sng-1 (synaptogyrin), unc-2 (calcium channel), rab-3, and unc-11 (clathrin component). In addition, genes with either established or likely roles in the G-protein coupled signaling pathways that modulate ACh release from these motor neurons (dop-1, pkc-1, kin-2, gar-2, rgs-1, rgs-6, gpc-2) are common to both enriched datasets [5,81].
The general role of A-class motor neurons in both releasing and responding to a broad range of neuroactive signals is underscored by the embryonic and larval enrichment of multiple neuropeptides (that is, flp-2, flp-4, flp-5, and flp-13) (Figure 4b). Shared ionotropic receptors include the nAChR subunits, acr-12, acr-14 and unc-38, which lead to excitatory responses, as well as the recently described ACh gated chloride subunit, acc-4 (T27E9.9), which should mediate acetylcholine-induced inhibition of motor neuron activity [82]. Together, these data support the proposal that C. elegans A-class motor neurons utilize complex mechanisms for integrating signals originating as either paracrine or autocrine stimuli [5].
Other transcripts that are highly enriched in both embryonic and larval A-class datasets with potential roles in specifying shared characteristics of this motor neuron class include: syg-1, which encodes an Ig-domain membrane protein that localizes the presynaptic apparatus of the HSN motor neuron in the egg laying circuit (Figure 6g) [83]; rig-6, which encodes the nematode homolog of contactin, a membrane protein with extracellular fibronectin and Ig domains that organizes ion channel assemblages [84,85]; and cdh-11, which encodes the homolog of calsyntenin, a novel cadherin-like molecule that is highly localized to postsynaptic sites [86]. Finally, we note that of the 25 genes that encode innexin gap junction components [87], only one, unc-9, is enriched in both of the A-class motor neuron datasets. This finding points to the UNC-9 protein as a likely component of gap junctions that couple A-class motor neurons with command interneurons that drive motor circuit activity in the ventral nerve cord [37].
In addition to genes that are enriched in both embryonic and larval A-class motor neurons, we also detected transcripts that are selectively elevated in one or the other dataset (Additional data file 10). Transcription factors comprise the largest group of differentially expressed genes. Of 24 transcription factor genes enriched in embryonic A-class motor neurons, only two, unc-3 and unc-4, are also included in the separate list of 10 transcription factors enriched in larval A-class motor neurons (Table 3). UNC-3 (O/E HLH protein) and UNC-4 (homeodomain protein) have been previously shown to specify shared characteristics of embryonic and larval A-class motor neurons [36,75,76]. Roles for the remaining transcription factors in the differentiation of these motor neuron subtypes are unknown. For example, members of the POU (ceh-6) and CUT (ceh-44) classes of homeodomain protein families, which are well-established determinants of neuronal fate [88,89], are selectively enriched in the larval A-class list. Conversely, five members of the nuclear hormone receptor family (nhr-3, nhr-95, nhr-104, nhr-116 and F41B5.9) are preferentially expressed in embryonic A-type motor neurons. The extent to which these different combinations of transcription factors account for characteristics that distinguish embryonic and larval A-class motor neurons can now be explored by genetic analysis.
A key morphological feature that distinguishes DA from VA motor neurons is clearly linked to differential levels of specific transcripts in embryonic versus larval A-class datasets. During embryonic development, DA motor neurons extend commissures that circumnavigate the body wall to innervate dorsal muscles. The dorsal trajectory of DA motor neuron outgrowth depends on the UNC-6/netrin receptor genes, unc-5 and unc-40, and the receptor protein tyrosine phosphatase (RPTP) clr-1 gene [90,91], all three of which are enriched in the embryonic A-class dataset (Figure 12). In contrast, unc-5, unc-40 and clr-1 are not elevated in larval VA motor neurons, which consequently innervate muscles on the ventral side. Guidance cues that govern the anteriorly directed outgrowth of DA and VA motor axons in the dorsal and ventral nerve cords, respectively, are not known. However, a likely candidate to direct axonal outgrowth along the C. elegans anterior-posterior axis is Wingless (Wnt) signaling [92-94]. In this regard, it is interesting that a comparison of the embryonic and larval A-class motor neuron transcripts identifies two different Wnt receptors that are selectively enriched in either the DA (lin-17) or VA (mig-1) motor neurons. In addition, the transcript for the Wnt ligand cwn-1 shows elevated expression in the embryonic A-class dataset.
Comparisons to microarray profiles of C. elegans sensory neurons identify differentially expressed transcripts
Colosimo et al. [8] used MAPCeL to profile the sensory neurons AFD and AWB. We found that only a small fraction of the AFD/AWB enriched transcripts are also enriched in the embryonic A-class motor neuron dataset (Figure 8f; Additional data file 11), a finding consistent with the distinct roles of these neuron classes in C. elegans. For example, the AFD-specific guanylate cyclase genes, gcy-8 and gcy-23, are excluded from the enriched embryonic A-class motor neuron dataset, whereas the A-class specific transcription factor, unc-4, is not found in the AFD/AWB profile (Additional data file 11). In contrast, a significantly larger fraction (approximately 43%) of AFD/AWB enriched transcripts, including gcy-8 and gcy-23, are elevated in the embryonic pan-neural profile (Figure 8e; Additional data file 11). Similar results were obtained when comparing the larval pan-neural and A-class datasets to a larval profile of chemosensory neurons [14] (data not shown). These findings confirm the reliability of these neuron-specific profiling methods for identifying differentially expressed transcripts and show that the pan-neural profiling approach is sufficiently sensitive to detect genes expressed in diverse cell types throughout the C. elegans nervous system.
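The dataset comparisons above reduce to a set-overlap calculation: the share of one enriched gene list that reappears in another. The following is a minimal sketch of that calculation, assuming each profile is available as a collection of gene names. The function name overlap_fraction and the toy gene lists are hypothetical (geneX and geneY are placeholders, and the lists do not reproduce the actual datasets in Additional data file 11).

```python
def overlap_fraction(query, target):
    """Return the fraction of genes in `query` that also appear in `target`."""
    query, target = set(query), set(target)
    if not query:
        raise ValueError("query gene list is empty")
    return len(query & target) / len(query)


# Toy gene lists for illustration only (hypothetical contents).
afd_awb_enriched = {"gcy-8", "gcy-23", "geneX", "geneY"}
a_class_enriched = {"unc-3", "unc-4", "geneY"}
pan_neural_enriched = {"gcy-8", "gcy-23", "unc-4", "geneY"}

frac_in_a_class = overlap_fraction(afd_awb_enriched, a_class_enriched)
frac_in_pan_neural = overlap_fraction(afd_awb_enriched, pan_neural_enriched)
print(f"AFD/AWB genes also in A-class list:    {frac_in_a_class:.0%}")
print(f"AFD/AWB genes also in pan-neural list: {frac_in_pan_neural:.0%}")
```

In this toy case the sensory neuron list overlaps more with the pan-neural list than with the A-class list, which is the qualitative pattern described in the text.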
Microarray profiles are consistent with gene expression topographic maps
We compared our data to a topographic map derived from 553 microarray experiments in which genes are assigned to specific 'mountains' based on similarities in gene expression [95]. In some instances, co-regulated genes were grouped into specific functional subsets, thereby defining the 'name' of the mountain. For example, mountain 6 contains many genes that are known to function in neurons. Neuronal transcripts identified in all four of our neuronal microarray experiments (embryonic and larval pan-neural, embryonic and larval A-class) are significantly over-represented in the neuromuscular mountain (mountain 1) and one of the neuronal mountains (mountain 6). In contrast, transcripts in the embryonic muscle dataset are significantly under-represented in mountains 1 and 6 but are over-represented in the muscle mountain (mountain 16) (RMF, DMM, unpublished data). These data provide additional validation for our neuronal expression profiles.
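Over- or under-representation of a gene list in a mountain can be quantified with a representation factor (observed overlap divided by the overlap expected by chance) together with a hypergeometric probability. The sketch below illustrates that general technique. It is an assumption that this matches the statistics applied here, since the test used for the mountain comparison is not stated in this section. The function names and the toy counts are hypothetical.

```python
from math import comb


def representation_factor(overlap, n_list, n_mountain, n_total):
    """Observed overlap divided by the overlap expected for two random gene sets."""
    expected = n_list * n_mountain / n_total
    return overlap / expected


def hypergeom_upper_tail(overlap, n_list, n_mountain, n_total):
    """P(X >= overlap) when n_list genes are drawn without replacement from
    n_total genes, of which n_mountain belong to the mountain."""
    max_overlap = min(n_list, n_mountain)
    favorable = sum(
        comb(n_mountain, k) * comb(n_total - n_mountain, n_list - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favorable / comb(n_total, n_list)


# Toy counts (hypothetical): 100 genes in total, 20 in the mountain,
# 10 in the enriched list, 6 of which fall in the mountain.
rf = representation_factor(6, 10, 20, 100)
p_over = hypergeom_upper_tail(6, 10, 20, 100)
print(f"representation factor = {rf:.1f}, upper-tail p = {p_over:.4f}")
```

A representation factor above 1 indicates over-representation and a value below 1 indicates under-representation. For under-representation the corresponding lower tail, P(X <= overlap), would be the relevant probability.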
Detection of expressed genes
We limited the analysis above to transcripts that show a statistically significant level of enrichment in neurons relative to other cell types in order to focus on genes that may function predominantly in the nervous system. Our microarray data, however, also include intensity values for a larger group of transcripts that may be broadly expressed in neurons as well as in other tissues. We define these transcripts as 'expressed genes' (EGs). We identified 7,953 EGs in the MAPCeL profile of embryonic neurons using criteria that exclude transcripts that are likely to originate from the small fraction (approximately 10%) of non-GFP cells in the FACS preparation [5] (Additional data file 12). For the larval pan-neural and larval A-class motor neuron datasets obtained with the mRNA-tagging method, EGs were defined using similar considerations, in this case to exclude transcripts that are likely to derive from background levels of RNA adhering nonspecifically to the sepharose beads used in the immunoprecipitation step (see Materials and methods). EGs in these experimental samples represent transcripts that may be enriched in neurons as well as genes that are expressed at comparable levels in neurons and in other tissues. This approach identified a total of 4,033 EGs in the larval pan-neural dataset and 3,320 EGs in the larval A-class profile (Additional data file 13). As expected, 'housekeeping' genes are prevalent in these datasets but excluded from the neuron-enriched profiles. For example, 20 ribosomal subunit genes (13 large, 7 small) are included in the dataset of larval pan-neural EGs but are not listed in the profile of transcripts enriched in larval neurons (Additional data files 1 and 13).
A comparison of all EGs in the larval and embryonic datasets described in this paper (that is, reference, pan-neural, A-class motor neurons), in addition to the previously described embryonic A-class dataset [5], reveals a total of approximately 12,000 unique transcripts or 63% of the predicted genes represented on the C. elegans Affymetrix Gene Chip (Additional data file 14). We note that approximately 1,600 of these EGs correspond to transcripts that have not been previously confirmed by expressed sequence tags (Additional data file 16); a subset of 336 transcripts from this group is enriched in at least one of the neuronal datasets, suggesting that they may have specific functions in C. elegans neurons.
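The combined count above is the union of the EG lists from the individual datasets, expressed as a fraction of the genes represented on the array. A minimal sketch of that bookkeeping follows, assuming each EG list is available as a set of transcript identifiers. The function name union_coverage, the dataset labels and the toy identifiers are hypothetical. The final line is only a back-of-envelope check: the chip size implied by the two rounded figures quoted in the text, which is not a value stated by the source.

```python
def union_coverage(eg_lists, n_genes_on_chip):
    """Count unique transcripts across all EG lists and the fraction of the chip they cover."""
    unique = set().union(*eg_lists.values())
    return len(unique), len(unique) / n_genes_on_chip


# Toy EG lists (hypothetical identifiers) on a hypothetical 10-gene chip.
toy_eg_lists = {
    "reference": {"t1", "t2", "t3"},
    "pan_neural": {"t2", "t3", "t4"},
    "a_class": {"t4", "t5"},
}
n_unique, fraction = union_coverage(toy_eg_lists, n_genes_on_chip=10)
print(f"{n_unique} unique transcripts, {fraction:.0%} of the chip")

# Back-of-envelope: approximately 12,000 EGs at 63% coverage implies
# roughly 19,000 predicted genes on the array (both inputs are rounded).
implied_chip_size = round(12_000 / 0.63)
print(f"implied number of genes on the chip: about {implied_chip_size}")
```

Because a transcript detected in several datasets is counted once, the union is smaller than the sum of the individual EG counts.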