



Results and Discussion
- Conformational analysis of alternative protein structures

 

The method has been applied to three sets of alternative models, corresponding to different types of conformational variability. The method has also been applied to the models in the SCOP classification database and the results are summarized.

3.1 Backbone conformational analysis
Serum transferrin is responsible for the transport of iron along the bloodstream and into the cells. Transferrin binds Fe³⁺ through six coordination sites provided by four amino acids and by a synergistic anion (CO₃²⁻) (MacGillivray et al., 1998). In total, 19 models of human transferrin were collected from SCOP (sunid 53899) and aligned. The atom coordinate uncertainties were computed for each model. The original STRuster implementation has already been applied to clustering transferrin models (Domingues et al., 2004). Here, the new STRuster implementation is applied to the analysis of backbone conformations in transferrin. The results now additionally include the variation and comparison matrices, as well as the identification of invariant regions used for superposition.

3.1.1 Clustering
The transferrin backbone clustering results are given in Figure 1. Two clusters, A and B, are noticeable, with cluster B further subdivided into C and D, as described previously (Domingues et al., 2004). The best clustering according to average silhouette width values corresponds to clusters A, C and D (see Figure S2 in the Supplementary Material). All models in cluster A correspond to the apo form of transferrin, while the models in cluster B correspond to the iron-binding form. The two clusters C and D into which B is divided correspond to models obtained from two different crystal forms: P4₁2₁2 (cluster C) and P2₁2₁2₁ (cluster D).
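The silhouette criterion used above to choose between candidate clusterings can be illustrated with a minimal sketch. It assumes a precomputed matrix of pairwise distances between models; the function name and the toy distances are hypothetical and are not taken from STRuster or from the transferrin data.

```python
from statistics import mean

def average_silhouette_width(dist, labels):
    """Average silhouette width of a clustering.

    dist   -- symmetric matrix (list of lists) of pairwise model distances
    labels -- cluster label of each model, in the same order as dist
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the model's own cluster
        a = mean(dist[i][j] for j in own)
        # b: mean distance to the members of the nearest other cluster
        b = min(mean(dist[i][j] for j in members)
                for other, members in clusters.items() if other != label)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(widths)

# Toy distances between five hypothetical models (not transferrin data).
toy_dist = [
    [0.0, 0.2, 3.0, 3.1, 3.2],
    [0.2, 0.0, 3.1, 3.0, 3.3],
    [3.0, 3.1, 0.0, 0.3, 1.0],
    [3.1, 3.0, 0.3, 0.0, 1.1],
    [3.2, 3.3, 1.0, 1.1, 0.0],
]
two_clusters = average_silhouette_width(toy_dist, ["A", "A", "B", "B", "B"])
three_clusters = average_silhouette_width(toy_dist, ["A", "A", "C", "C", "D"])
```

The partition with the larger average width would be preferred. In this toy case the two-cluster partition scores higher, whereas for transferrin the source reports that the three-cluster partition (A, C, D) scores best.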

 
3.1.2 Variation matrices
The variation matrices make it possible to visualize and locate the parts of the protein that are structurally conserved and the parts with considerable conformational variability. The results obtained with the different matrices are similar but differ in detail. Matrix T and matrix S are displayed in Figure 1 and compared in Supplementary Figure S3. Both give similar results, but the total SD matrix (T) shows increased variance in some areas, reflecting the distance uncertainty contribution. In both matrices one can observe three large segments of low variance along the diagonal, separated at approximately positions 90 and 250. There is considerable variance between the first and the second segment, as well as between the second and the third segment, but not between the first and the third segment. There is also some variance around position 140 and at the C terminus.

The relative SD matrix R and the maximum relative difference matrix X are shown in Figure 1. Enlarged plots are available in Supplementary Figure S4. They are similar to the matrices S and T, but the boundaries of the three invariant segments are more clearly visible. X is sensitive to the largest conformational differences in the set, and shows larger relative variation than R around positions 140 and 300–334.
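To make the matrices more concrete, the following sketch computes three of them from a set of aligned models. The exact definitions are given in the Methods and are not repeated in this section, so the formulas here are assumptions inferred from the matrix names: S is taken as the standard deviation of each residue-pair distance across models, R as S divided by the mean distance, and X as the largest distance difference divided by the mean distance. The total SD matrix T is omitted because its uncertainty term is not defined here. All function names are hypothetical.

```python
from math import dist
from statistics import mean, pstdev

def distance_matrix(coords):
    """All pairwise distances for one model (list of (x, y, z) tuples)."""
    return [[dist(a, b) for b in coords] for a in coords]

def variation_matrices(models):
    """Return (S, R, X) for a list of aligned models.

    models -- list of coordinate lists, one (x, y, z) per aligned residue
    S, R and X follow the assumed definitions stated in the lead-in.
    """
    per_model = [distance_matrix(m) for m in models]
    n = len(models[0])
    S = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    X = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            values = [d[i][j] for d in per_model]
            avg = mean(values)
            S[i][j] = pstdev(values)
            if avg > 0:  # guard against coincident atoms
                R[i][j] = S[i][j] / avg
                X[i][j] = (max(values) - min(values)) / avg
    return S, R, X

# Toy example: a three-residue chain in which the last residue swings.
toy_models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)],
]
S, R, X = variation_matrices(toy_models)
```

In the toy chain only the distance between the first and the last residue varies, so only that entry is non-zero, which is the kind of off-diagonal signal described above for the transferrin segments.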

We focus our analysis on the results obtained with T. In this matrix, five invariant segments are detected, separated by hinge segments. The invariant segments span positions 1–59, 67–80, 85–125, 142–235 and 246–303. STRuster groups these segments into two invariant regions. One region includes the segments at the N terminus (up to 80) and at the C terminus (starting at 246). The second region includes the two middle segments between 85 and 235.
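The two-level description above (invariant segments along the sequence, then regions that group mutually invariant segments) can be sketched as follows. This is a simplified greedy illustration under an assumed fixed threshold, not the STRuster procedure; the function names, the threshold and the toy matrix are all hypothetical.

```python
def invariant_segments(matrix, threshold, min_length=3):
    """Greedy scan for runs of residues whose mutual variation stays low.

    Returns 1-based inclusive (start, end) tuples.
    """
    n = len(matrix)
    segments = []
    start = 0
    while start < n:
        end = start + 1
        # Extend while the next residue is invariant against every
        # residue already in the run.
        while end < n and all(matrix[i][end] <= threshold
                              for i in range(start, end)):
            end += 1
        if end - start >= min_length:
            segments.append((start + 1, end))
            start = end
        else:
            start += 1  # too short: treat the residue as hinge and move on
    return segments

def block_max(matrix, seg_a, seg_b):
    """Largest variation between any residue of seg_a and any of seg_b."""
    (a0, a1), (b0, b1) = seg_a, seg_b
    return max(matrix[i][j]
               for i in range(a0 - 1, a1)
               for j in range(b0 - 1, b1))

def group_into_regions(matrix, segments, threshold):
    """Put segments that are invariant relative to each other together."""
    regions = []
    for seg in segments:
        for region in regions:
            if all(block_max(matrix, seg, other) <= threshold
                   for other in region):
                region.append(seg)
                break
        else:
            regions.append([seg])
    return regions

# Toy matrix: blocks 0 and 2 move together, block 1 moves against both.
block = [0, 0, 0, 1, 1, 1, 2, 2, 2]

def toy_value(i, j):
    if i == j:
        return 0.0
    if block[i] == block[j] or {block[i], block[j]} == {0, 2}:
        return 0.1
    return 2.0

toy = [[toy_value(i, j) for j in range(9)] for i in range(9)]
segments = invariant_segments(toy, threshold=0.5)
regions = group_into_regions(toy, segments, threshold=0.5)
```

The toy matrix mimics the transferrin pattern in miniature: the first and last segments end up in one region and the middle segment in another.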

3.1.3 Comparison matrices
The variation matrices provide a description of the structural variability of the protein but they do not allow for differentiating between continuous structural variability and alternative conformational states. In order to identify distinct conformations, candidate subsets are first selected according to the clustering results. Then, the Cα distance distributions between pairs of aligned residues from the two subsets are compared. Segments with significantly different distributions indicate distinct conformational states.
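As a rough illustration of comparing distance distributions between two subsets, the sketch below computes a Welch-style statistic for every residue pair. The statistic actually used by STRuster is defined in the Methods and is not reproduced here; this stand-in, the function name and the `sigma` parameter (an assumed constant distance uncertainty added in quadrature, in the same units as the coordinates) are all hypothetical.

```python
from math import dist, sqrt
from statistics import mean, variance

def subset_comparison(models_a, models_b, sigma=0.1):
    """Welch-style score per residue pair between two model subsets.

    models_a, models_b -- lists of aligned models, each a list of (x, y, z);
                          each subset needs at least two models
    sigma              -- assumed distance uncertainty (hypothetical value)
    A positive score means the pair is farther apart in subset A.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    n = len(models_a[0])
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            da = [dist(m[i], m[j]) for m in models_a]
            db = [dist(m[i], m[j]) for m in models_b]
            spread = sqrt(variance(da) / len(da)
                          + variance(db) / len(db)
                          + sigma ** 2)
            scores[i][j] = scores[j][i] = (mean(da) - mean(db)) / spread
    return scores

# Toy subsets: an "open" pair of models and a "closed" pair.
open_models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.5, 0.5, 0.0)],
]
closed_models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (4.0, 3.7, 0.0)],
]
scores = subset_comparison(open_models, closed_models)
```

Including an uncertainty term in the denominator keeps small differences between precisely reproduced distances from producing spuriously large scores, which is the motivation for accounting for coordinate uncertainties in this kind of comparison.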

To investigate whether the different clusters correspond to distinct conformations, cluster A was compared to cluster B (matrix UAB) and cluster C was compared to cluster D (matrix UCD). The results are available in Figure 1. The matrix UAB gives a distribution of invariant and variable segments similar to the results obtained with the variation matrices. The two invariant regions are also identified with similar boundaries. These results clearly indicate that the models in cluster A have a distinct conformation relative to the models in cluster B. In particular, the middle region of the protein backbone in models from cluster A tends to have larger distances to both ends of the backbone than in models from cluster B. Matrix UAB is compared to matrix VAB in Figure S5. Unlike U, matrix V does not take into account coordinate uncertainties. An improved signal in U over V is noticeable.

Cluster B was further analysed by comparison of the two subsets C and D into which B is divided. The matrix UCD is shown in Figure 1. Three hinges are identified at the middle and at the C-terminal end of the backbone (positions 133–141, 304–305 and 318–334), with the remaining residues corresponding to a single invariant region. These results indicate that short segments at the middle of the backbone and at the end have different conformations in the subsets C and D, which reflects the differences between the two crystal forms. In particular, the differences in the loop around position 135 result from different inter-molecular contacts in the two crystal forms (MacGillivray et al., 1998).

3.1.4 Superposition
Figure 1 shows the 19 transferrin models with the two invariant regions optimally superimposed. The invariant regions were identified in the variation matrix T. The protein consists of two subdomains that move relative to each other when binding iron. The first subdomain consists of both the N-terminal and C-terminal parts of the protein backbone. The second subdomain corresponds to the middle part of the protein backbone. The first and second subdomains match the first and the second invariant regions identified in T. The second subdomain also includes the C-terminal helix. The results obtained with T and UAB, together with the superpositions, reflect the considerable conformational change that occurs when binding iron. In the iron-free form (A), the structure is in an open conformation with the two subdomains apart from each other (Jeffrey et al., 1998). In the iron-binding form (B), the two subdomains move closer towards each other (MacGillivray et al., 1998).
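Superimposing models on an invariant region only, so that the movement of the remaining residues becomes visible, can be sketched with a standard least-squares fit. The example below uses the Kabsch algorithm with NumPy as an assumed, generic choice; the source does not state which superposition algorithm STRuster uses, and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def superpose_on_region(mobile, reference, region):
    """Fit `mobile` onto `reference` using only the residues in `region`.

    mobile, reference -- (N, 3) arrays of aligned coordinates
    region            -- indices of the residues used for the fit
    Returns all mobile coordinates after the rigid-body transform.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mob_center = mobile[region].mean(axis=0)
    ref_center = reference[region].mean(axis=0)
    P = mobile[region] - mob_center
    Q = reference[region] - ref_center
    # Kabsch: the SVD of the covariance matrix gives the optimal rotation.
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    rotation = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return (mobile - mob_center) @ rotation.T + ref_center

# Toy model: residues 0-3 form a rigid region, residues 4-5 are displaced.
reference = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.0, 3.0, 1.0],
                      [2.0, 4.0, 3.0], [8.0, 5.0, 2.0], [9.0, 8.0, 4.0]])
angle = np.radians(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
mobile = reference @ rot_z.T + np.array([10.0, -5.0, 2.0])
mobile[4:] += np.array([0.0, 0.0, 6.0])  # second "subdomain" shifts by 6 units

fitted = superpose_on_region(mobile, reference, [0, 1, 2, 3])
```

After the fit, the residues of the chosen region coincide with the reference, while the displaced residues remain offset by the size of the shift. Repeating the fit with the other region as the anchor would show the same movement from the opposite point of view.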

3.2 Variable segments
In the previous example, two conserved backbone substructures are observed in alternative relative orientations. However, substructures are not always conserved and can show different internal structures, as in rabbit fructose 1,6-bisphosphate aldolase (SCOP sunid 51580). This glycolytic enzyme includes a C-terminal region that displays substantial conformational differences. It has been suggested that the backbone conformational mobility plays a functional role in the attachment and release of the reaction product (Blom and Sygusch, 1997). In total, 12 models were analysed, and the C-terminal region is identified as a variable segment in the T matrix (see Supplementary Figure S6).

3.3 Analysis of functional site
The STRuster method allows not only for the analysis of backbone conformations, as shown in the previous results, but also for detailed analysis of side-chain conformations, as demonstrated in the analysis of the ligand binding site of α-amylase. The α-amylases catalyze the hydrolysis of the α-(1,4) glycosidic bonds in different polysaccharides. The porcine α-amylase consists of two domains according to SCOP, an N-terminal catalytic domain and a C-terminal domain. STRuster was applied to the SCOP models for the porcine catalytic domain (SCOP sunid 51459). The results obtained for the backbone conformational analysis reflect the effect of interactions between different proteins and ligands with the ligand binding site. The analysis reveals two alternative conformations of the loop at position 305 associated with ligand binding. Details are provided in the Supplementary Material.

In order to investigate in more detail the conformational changes associated with ligand binding, a side-chain conformational analysis was performed in the ligand binding site region. Figure 2 shows the clustering results, the T matrix and the superposition of the ligand binding site residues. The clustering result reveals three major clusters (with best average silhouette width): B, E1 and E2. Cluster B corresponds to the ligand-bound form. Cluster E1 corresponds to a complex between the enzyme and an inhibitor antibody. Cluster E2 corresponds to an empty ligand binding site. The only exception is d1ua3a2, with a partially occupied binding site, as described in the Supplementary Material. The T matrix was computed for the binding site residues based on the side-chain centroid distances. Considerable conformational variability is observed for the side chains of residues 151, 163, 238, 240, 300 and 305–308, which reflects the alternative ligand-bound/unbound states at the functional site. The comparison matrix allows the identification of the residues that adopt distinct conformations in B, E1 and E2. In particular, models in clusters E1 and E2 show different side-chain conformations at residues 151 and 163 (see Supplementary Material).
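The side-chain centroid distances on which this analysis is based can be sketched as below. The sketch assumes that each residue is given as a list of heavy atoms with standard PDB atom names, that the backbone atoms are N, CA, C and O, and that glycine falls back to its CA position; these conventions and the function names are assumptions, not details confirmed by the source.

```python
from math import dist

BACKBONE = {"N", "CA", "C", "O"}

def side_chain_centroid(atoms):
    """Centroid of the non-backbone atoms of one residue.

    atoms -- list of (atom_name, (x, y, z)) for the residue's heavy atoms
    Residues without side-chain atoms (glycine) fall back to CA.
    """
    side = [xyz for name, xyz in atoms if name not in BACKBONE]
    if not side:
        side = [xyz for name, xyz in atoms if name == "CA"]
    if not side:
        raise ValueError("residue has neither side-chain atoms nor CA")
    return tuple(sum(axis) / len(side) for axis in zip(*side))

def centroid_distance_matrix(residues):
    """Pairwise distances between side-chain centroids of one model."""
    centroids = [side_chain_centroid(r) for r in residues]
    return [[dist(a, b) for b in centroids] for a in centroids]

# Toy binding site with a serine-like residue and a glycine.
toy_site = [
    [("N", (0.0, 0.0, 0.0)), ("CA", (1.0, 0.0, 0.0)), ("C", (2.0, 0.0, 0.0)),
     ("O", (2.0, -1.0, 0.0)), ("CB", (1.0, 1.0, 0.0)), ("OG", (1.0, 3.0, 0.0))],
    [("N", (3.0, 6.0, 0.0)), ("CA", (4.0, 6.0, 0.0)), ("C", (5.0, 6.0, 0.0)),
     ("O", (5.0, 7.0, 0.0))],
]
toy_matrix = centroid_distance_matrix(toy_site)
```

Computing such a matrix for every model and taking the spread of each entry across models gives a side-chain analogue of the backbone variation matrices.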

 
3.4 Comparison to other approaches
Many tools are available for protein structure comparison, and a few have been specifically implemented to compare alternative structures of the same protein. How does STRuster compare to these methods? As mentioned in the Introduction section, the comparison of the structures of different proteins is not the same problem as the comparison of alternative conformers of the same protein. In the comparison of alternative structures, the alignment is not determined by an optimization procedure using measures of structural similarity. Instead, the alignment is determined by matching the residues in the structural model to the protein sequence. Regions that are dissimilar in structure are not left out as gaps, as is usual in the comparison of different structures; instead, they are aligned and the extent of variation in these regions is measured. The structural differences can be rather small in the comparison of alternative structures, within the range of the atom coordinate uncertainties. Therefore, these uncertainties are taken into account in the comparison of alternative structures using STRuster and ESCET, but they are usually not considered by general structure comparison methods.

STRuster is most closely related to ESCET (Schneider, 2002), as both methods rely on the comparison of distance matrices for the identification of invariant regions. ESCET identifies invariant regions based on the X variation matrix using a genetic algorithm. Using X to identify the invariant regions with STRuster should give similar results to ESCET on the same set of proteins; the difference lies in the approach used to identify the invariant regions. The two methods have been compared on five sets of alternative structures that have been previously proposed (Schneider, 2002). Similar invariant regions are found by the two methods; the results are given in the Supplementary Material.

Nevertheless, there are notable differences between STRuster and ESCET. In contrast to ESCET, STRuster provides additional variation matrices that offer complementary information. STRuster also provides an approach for comparing subsets using the U and V matrices, for clustering the alternative models and for identifying subsets of similar structures based on the clustering results. Another distinctive feature of STRuster is that it is applicable not only to the comparison of backbone structures, but also to the comparison of side-chain conformations.

3.5 Application to SCOP
The method was applied to the sets of models available at each SCOP species level in order to assess the extent of structure variation among the sets of alternative models, and to demonstrate that STRuster can be applied to a large number of sets. The analysis was restricted to the PDB entries with computed DPI values and to sets with at least two alternative models. In total, 36 634 different models were analysed in 5837 different sets. The largest set includes 56 models. The average set size is 6 models. The sets were aligned and the variation matrices were computed (S, T and R). Hinge segments and invariant regions were identified based on the matrix T. A PDB coordinate file with the models superimposed according to the invariant regions was also computed. The results are available on the STRuster web site.

The number of hinges provides an indication of the conformational variability within a set. For most sets (65%), at least one hinge segment is identified when using the matrix T. For 42% of the sets, at least two hinges are identified. Only 35% of the sets have no hinge detected with T. The percentage of sets with no hinge is larger if either the S or the R matrix is used (43% and 46%, respectively). With the T matrix, only one invariant region is identified in most sets (95%), and only 4% of the sets have two invariant regions. Variable segments are identified in few sets (4.4%).


Updated on: 1 Dec 2007