
Results and Discussion
- MDQC: a new quality assessment method for microarrays based on quality control reports

 

We use two case studies to evaluate the performance of the MDQC method. The first dataset is part of an acute lymphoblastic leukemia study described by Ross et al. (2003) and contains 20 Affymetrix HG-U133B microarrays. Bolstad et al. (2005) and Brettschneider et al. (2007) examined the quality of these arrays using histograms of probe-level data, MA-plots and probe-level model (PLM) methods. According to their quality assessment, array 2 has a strong spatial artifact on the chip and array 14 presents other evidence of poor quality. Because such ‘ground truths’ exist, we first present the comparative analysis based on this small dataset. Our second dataset consists of 201 Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays run on RNA isolated from whole blood of patients who have undergone kidney, liver or heart transplants. Because we own this dataset, we were able to follow up on various aspects and to perform re-runs. The analysis of both datasets was performed in R. The corresponding code and QC reports are available upon request. In addition, the library to implement the MDQC method will soon be publicly available from Bioconductor.

3.1 Acute lymphoblastic leukemia study
Figure 1 illustrates the measures of the QC report generated by the R package simpleaffy for the 20 microarrays of the acute lymphoblastic leukemia study. A univariate analysis of these measures based on the thresholds recommended by Affymetrix (Affymetrix, 2004) flags some of the arrays as having potentially low quality; these arrays are identified using solid points. In particular, arrays 6, 8 and 10 have a ‘scale factor’ value above the usual threshold of 10, and arrays 6 and 8 also have a ‘percent present’ value below 20. Array 14 has all background measures (average, minimum and maximum background) above the usual threshold of 100. In addition, array 2 has a ‘maximum background’ value above this threshold. The values of the spiked hybridization controls (bioB, bioC, bioD and cre) are low for array 14, though they are always present with increasing signal values, as recommended by Affymetrix. Note that arrays 1 and 7 have problems similar to those of array 14, though to a lesser extent. Finally, array 8 shows high values for the last two quality measures, which correspond to RNA housekeeping genes. However, both values are below the recommended threshold of 2.
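The univariate screening described above amounts to comparing each quality measure with a fixed threshold. The following Python sketch illustrates the idea using the threshold values quoted in the text. The dictionary keys, the function name and the example values are hypothetical and are not taken from the simpleaffy report or from the study data.

```python
# Thresholds as quoted in the text; all names below are hypothetical.
MAX_SCALE_FACTOR = 10
MIN_PERCENT_PRESENT = 20
MAX_BACKGROUND = 100
MAX_HOUSEKEEPING = 2


def univariate_flags(qc):
    """Return the names of the QC measures that violate their threshold."""
    flags = []
    if qc["scale_factor"] > MAX_SCALE_FACTOR:
        flags.append("scale_factor")
    if qc["percent_present"] < MIN_PERCENT_PRESENT:
        flags.append("percent_present")
    for name in ("avg_bg", "min_bg", "max_bg"):
        if qc[name] > MAX_BACKGROUND:
            flags.append(name)
    for name in ("actin_ratio", "gapdh_ratio"):
        if qc[name] > MAX_HOUSEKEEPING:
            flags.append(name)
    return flags


# Made-up values for a single array, for illustration only.
example = {
    "scale_factor": 12.0, "percent_present": 18.0,
    "avg_bg": 60.0, "min_bg": 50.0, "max_bg": 110.0,
    "actin_ratio": 1.1, "gapdh_ratio": 0.9,
}
print(univariate_flags(example))
```

Each measure is checked in isolation, which is why an array whose measures are individually acceptable but jointly unusual can pass this kind of screening.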

 
We now compare the previous univariate analysis with a multivariate one using MDQC based on all the quality measures in the report. Figure 2 shows that, with this approach, array 14 is flagged as having potential quality problems and array 2 appears only as a borderline case. Thus, collapsing all the quality measures into a single MD downweights the quality problems of array 2 and masks other outlying observations in the QC report, such as those of arrays 1, 7 or 8. We therefore study the MDs on groups with a reduced number of variables, such as those created by the a priori grouping method and the group of the first principal components. As previously discussed, these alternative methods reduce the possibility of masking outliers and may give information about the potential source of the quality problem.
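For reference, the Mahalanobis distance of a vector of quality measures from a center, given a scatter matrix, can be sketched in plain Python as follows. This is only an illustration: it uses the classical sample mean and covariance for brevity, whereas a robust analysis would pass robust estimates of location and scatter to `mahalanobis` instead. All function names are hypothetical.

```python
import math


def mean_and_cov(rows):
    """Classical sample mean and covariance of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis(x, center, cov):
    """sqrt((x - center)^T cov^{-1} (x - center))."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    z = solve(cov, diff)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))
```

Computing one distance per group of measures only requires selecting the corresponding columns before calling `mean_and_cov` and `mahalanobis`.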

Using the a priori grouping method, MDQC examines three MDs, one for each of the following three groups:

  1. Scale Factor, % Present, Avg BG, Min BG, Max BG
  2. BioB, BioC, BioD, CreX
  3. AFFX-HSAC07/X00351.3'/5', GapDH.

Note that the quality measures of Group 2 in Section 2.2 are not available in the simpleaffy QC report, so there are only three groups to examine. These groups can be used to assess the quality of the chip/sample, the sample and the RNA, respectively. Each plot in Figure 3 shows the MD (y-axis) of each array (x-axis) within each a priori group. The solid, dashed and dotted lines correspond to the square root of the 90th, 95th and 99th percentile of the chi-squared distribution, respectively.
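The cutoff lines can be reproduced numerically. The sketch below computes the square root of a chi-squared percentile using only the Python standard library; it assumes that the degrees of freedom equal the number of quality measures in the group, which is the usual convention for MDs but is not stated explicitly in this section. The function names are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def md_cutoff(p, df):
    """Square root of the p-th quantile of the chi-squared distribution."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:  # bracket the quantile
        hi *= 2.0
    for _ in range(100):  # bisection
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt((lo + hi) / 2.0)


# Cutoffs for a group of five measures, such as Group 1 above.
for p in (0.90, 0.95, 0.99):
    print(p, round(md_cutoff(p, df=5), 3))
```

An array would then be flagged at a given level when its MD within the group exceeds the corresponding cutoff.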

In Group 1, arrays 2 and 14 are both flagged as potentially defective, and array 17 as a borderline case. In Group 2, array 1 has an MD exceeding the 99% cutoff and arrays 7 and 14 have MDs exceeding the 95% cutoff line. Finally, array 8 is the only one flagged in Group 3. Thus, the MDs based on groups of lower dimension flag both arrays 2 and 14, which is consistent with the results in Bolstad et al. (2005) and Brettschneider et al. (2007). In addition, arrays 1 and 8 are flagged as potentially low quality and arrays 7 and 17 as borderline quality. Moreover, based on the interpretability of the groups, the problems in array 2 are most likely due to defects in the chip as this array is only identified in Group 1. Similarly, since arrays 1 and 14 are flagged in Group 2, their low quality is most likely due to low quality of the sample. Note that although array 14 is also flagged in Group 1, this can still be due to quality problems in the sample. Finally, array 8 is flagged only in Group 3, suggesting potential problems in the RNA quality. In the Supplementary Material, we also analyze this dataset using MDQC based on the clustering grouping method and the loading PCA grouping method. The groups formed by these two data-driven methods almost validate the a priori grouping described above, and thus the results are similar to those reported in Figure 3.

Comparing the previous multivariate analysis with the univariate one, it is important to note that, apart from the ‘maximum background’, all other quality measures for array 2 are similar to those of the other arrays (see all plots in Fig. 1). Thus, without ‘maximum background’, the univariate analysis does not identify this array as having quality problems. However, the top-right plot in Figure 3 shows that MDQC using the a priori grouping method still flags this array even when the ‘maximum background’ is not included in the analysis. This example illustrates that a multivariate analysis can flag a problematic array that a univariate analysis cannot detect. In addition, note that array 14 is flagged by both the univariate and the multivariate analyses. However, although the univariate analysis suggests that array 14 is more problematic than array 2 (many of its quality measures are outlying), MDQC using the a priori grouping method ranks array 2 as having lower quality (the MD of array 2 is larger than that of array 14 in Group 1). Thus, our MDQC method not only flags unusual arrays but also ranks them in a way that is not evident from the univariate analysis.

Finally, we examine the performance of MDQC using the global PCA method to reduce the dimensionality of the data. Using the scree plot, we retain k = 4 principal components in this analysis (see Supplementary Material). Figure 4 shows the results of the MDQC when a single MD is calculated based on the first four PCs derived from a robust PCA based on robustly standardized data (see Section 2.2). We note that this approach still flags arrays 2, 8 and 14 as having potential quality problems. However, the first two appear only as borderline cases. In addition, arrays 1, 7 and 17 are still masked using this method.

 
In sum, all three grouping approaches of MDQC (i.e. all variables, the a priori grouping method and the global PCA method) identify the problematic arrays 2 and 14 that were previously detected by Bolstad et al. (2005) and Brettschneider et al. (2007). However, the a priori grouping method highlights the problem of array 2, unmasks other potentially low-quality arrays and provides possible explanations of the quality problems.

3.2 Transplantation study
We use this dataset to illustrate the performance of our method in a large study with the ability to re-run potentially low-quality arrays. The analysis is based on the 14 numerical quality measures contained in the GCOS QC report for each array (see Section 2.2). Taking into account the MDQC analysis of the 201 original arrays, as well as budget and sample material limitations, 22 arrays flagged as having potentially low quality were re-run. Although the diagnostic of the original set is based on an analysis that does not include the re-runs, we include the results of all the arrays in the same plot to simplify the exposition. Thus, Figure 5 shows the MDs (y-axis) of the 223 arrays within each of the four a priori groups defined in Section 2.2. We recall that Groups 1–4 provide information on the quality of the chip and/or sample, the chip, the sample and the RNA, respectively. Solid points are used to identify the 22 arrays that were re-run, solid triangles the re-runs, and open triangles the arrays that could not be re-run due to the lack of additional sample material. In addition, the array IDs contain two numbers: the first corresponds to the patient ID and the second to the number of months after transplant. The re-run arrays are labeled with an R after this numeric ID. For example, 21-4 is the ID for the array corresponding to patient 21 at 4 months after transplant, and 21-4R is its re-run. The solid, dashed and dotted lines correspond to the square root of the 90th, 95th and 99th percentile of the chi-squared distribution, respectively.
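The array ID convention described above is simple enough to parse mechanically, which is convenient when matching re-runs to their original arrays. The following Python sketch shows one way to do so; the `ArrayId` type and the `parse_array_id` function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class ArrayId(NamedTuple):
    patient: int   # patient ID
    months: int    # months after transplant
    rerun: bool    # True when the label ends with "R"


_ID_PATTERN = re.compile(r"(\d+)-(\d+)(R?)")


def parse_array_id(label):
    """Split a label such as '21-4' or '21-4R' into its components."""
    match = _ID_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized array ID: {label!r}")
    patient, months, suffix = match.groups()
    return ArrayId(int(patient), int(months), suffix == "R")


print(parse_array_id("21-4"))
print(parse_array_id("21-4R"))
```

Two labels refer to the same patient and time point when their `patient` and `months` fields agree, regardless of the `rerun` flag.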

 
Figure 5 shows the plots of the MDs in each of the four groups. Although our method identifies several outlying arrays, we focus our discussion on the subset of those arrays with the highest MDs that we were able to re-run: 21-4, 17-6, 25-5, 302-7, 36-6, 21-2, 21-3, 5 arrays of patient 13 and 10 arrays of patient 317.

Arrays 21-4, 17-6, 25-5 and 302-7 have outlying MDs in Groups 1–3, but not in Group 4. Thus, the quality problem may come from the chip, the sample, or both, but not from the RNA quality. To identify the source of the quality problem, we re-ran them using the same sample material but a new chip. As the MDs of the re-run for array 21-4 (21-4R) are below the thresholds in all four groups, we conclude that the original chip was damaged. In contrast, the re-runs of arrays 17-6, 25-5 and 302-7 continue to be flagged as outliers (data not shown), suggesting that the original chips were not defective. The re-runs of these arrays using new sample material give MDs that are below the thresholds in all four groups. Thus, we conclude that these arrays suffered from low-quality sample material. Further, array 36-6 has outlying MDs in Groups 1 and 2, while arrays 21-2 and 21-3 have outlying MDs in Groups 1 and 3. We re-ran these arrays using both new chips and new sample material, and our method no longer flagged them as low-quality arrays.

We additionally identify a set of arrays with unusual values of the RNA quality measures (see Group 4). These arrays correspond to patients 13 and 317, although those for the latter patient are borderline cases. As the arrays of both patients were originally run in the same batch, these unusual values can correspond to either a batch effect or a quality problem in the RNA. We re-ran each of these arrays in different batches when RNA was still available. The re-run arrays have MDs of the quality measures in Group 4 that are similar to those of the rest of the arrays.

We further use other quality assessment methods to assess the performance of MDQC. As some of these methods are computationally intensive or difficult to visualize in large studies, we select 12 potentially bad arrays and 10 good arrays based on the MDQC diagnostic and the inspection of the image files. We examine the histogram of probe-level data, the MA-plots and perform a PLM QC assessment (Bolstad et al., 2005), including the inspection of array pseudo-images, RNA-degradation plots, relative log expressions (RLE) and normalized unscaled standard errors (NUSE). Here, we briefly describe the last quality measure and present its results for our data. The conclusions are similar for the other measures (see Supplementary Material).

The box plots in Figure 6 show the NUSE for the selected arrays. These errors are the standard errors between probe intensities within a probe set for each array, normalized by dividing all values of a particular probe set by the median standard error for that probe set across arrays (Bolstad et al., 2005). The boxes are expected to be narrow and centered at one, reflecting a small variability within the probe sets of an array. The arrays identified by MDQC as having potential quality problems are also flagged by this quality measure (similar results are found using other PLM diagnostic plots, available in the Supplementary Material). Their boxes are larger and in most cases not centered at one, indicating that these arrays contain more outlying probes and have a larger variability within probe sets than the other arrays. In sum, the MDQC method is comparable in effectiveness to the PLM method. Its main advantage over the PLM method is that it is not computer memory intensive, which makes it much more suitable for assessing the quality of a large number of arrays.
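The normalization step behind NUSE, as described above, can be sketched in a few lines of Python. The sketch assumes that the probe-set standard errors have already been obtained from a fitted PLM and are stored as a nested list `se[g][a]` (probe set `g`, array `a`); this layout, the function names and the example values are hypothetical.

```python
from statistics import median


def nuse(se):
    """Divide each probe set's standard errors by their median across arrays."""
    normalized = []
    for row in se:  # one row per probe set, one column per array
        med = median(row)
        normalized.append([value / med for value in row])
    return normalized


def array_medians(nuse_values):
    """Median NUSE per array; values well above one suggest lower quality."""
    n_arrays = len(nuse_values[0])
    return [median(row[a] for row in nuse_values) for a in range(n_arrays)]


# Made-up standard errors for two probe sets on three arrays.
se = [[1.0, 2.0, 4.0],
      [0.5, 1.0, 3.0]]
print(array_medians(nuse(se)))
```

In this toy input the third array has consistently larger standard errors than the others, so its median NUSE lies well above one.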

