2.1 Summary

Multi-dimensional (MD) analysis is a corpus-based methodological approach that was developed to investigate register variation. It overcomes a number of methodological difficulties in the study of register differences and enables linguists to compare registers, which are defined by linguistic co-occurrence patterns. These patterns reflect the actual distribution of linguistic features across registers and are therefore identified quantitatively: factor analysis is used to identify sets of linguistic features that frequently co-occur. Each of these sets is assumed to reflect shared social, situational or cognitive functions, which have to be established through an additional qualitative analysis.
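As a rough illustration of what it means to identify co-occurrence quantitatively, the following Python sketch computes a plain Pearson correlation between feature counts across texts. It is not the factor analysis itself, only the kind of pairwise relationship that a factor analysis summarises. The feature names and all counts are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical normalised counts (per 1,000 words) for five texts.
counts = {
    "private_verbs": [18.0, 15.0, 12.0, 4.0, 3.0],
    "contractions": [30.0, 26.0, 20.0, 5.0, 2.0],
    "nouns": [150.0, 160.0, 180.0, 240.0, 250.0],
}

# Features that rise and fall together across texts correlate positively.
r_same = pearson(counts["private_verbs"], counts["contractions"])
# Features in complementary distribution correlate negatively.
r_opposite = pearson(counts["private_verbs"], counts["nouns"])

print(round(r_same, 2), round(r_opposite, 2))
```

In this invented data set the first two features would end up on the same side of a factor, while the third would fall on the opposite side.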

Eight methodological steps have to be followed when conducting a multi-dimensional analysis. The first step is to compile an appropriate corpus that is representative of the (sub-)registers being studied. The second step is to identify the linguistic features that potentially have functional associations with these registers; the original MD study of 1988 investigated 67 features belonging to 16 major categories. Next, special computer programs are developed to analyse and tag the corpus automatically, so that the occurrences of each linguistic feature can be counted according to its grammatical category. The frequency counts are then normalised to a common basis to enable comparisons across texts of different lengths.

The central step is the factor analysis, which identifies the co-occurrence patterns underlying each dimension. This procedure analyses the frequency counts of the linguistic features in each text and identifies factors, or sets of co-occurring features, on the basis of their shared variance: features that vary in a similar way across texts are grouped into the same factor. The strength of the relationship between a feature and a factor is expressed by the factor loading, which is typically given in parentheses. Positive and negative loadings occur in complementary distribution. That is, when the features with positive loadings are frequent in a text, the features with negative loadings tend to be rare, and vice versa. Only the most important features, those with loadings above +.30 or below −.30, were considered in the study. To ensure that each factor includes only the most representative features, the factors were rotated using a Promax rotation.

The last two steps concern the interpretation of the factors. The aim is to account for the complementary distribution of the linguistic features and to identify their communicative functions. First, each factor is interpreted as a functional dimension by assessing the functions shared by its linguistic features. Second, dimension scores are computed to determine the distribution of the registers along each dimension and to analyse similarities and differences between the registers.
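The normalisation and dimension-score steps can be sketched in a few lines of Python. The summary above does not spell out how the scores are computed, so the sketch assumes a procedure commonly used in MD studies: counts are normalised per 1,000 words, each feature is standardised to z-scores across the corpus, and a text's score is the sum of its z-scores for the positive features minus the sum for the negative features. The .30 cut-off comes from the text; the feature names, counts and loadings are hypothetical.

```python
from statistics import mean, stdev


def normalise(raw_count, text_length, basis=1000):
    """Convert a raw count into a rate per `basis` words."""
    return raw_count / text_length * basis


def standardise(values):
    """Convert values to z-scores (assumes the values are not all equal)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def dimension_scores(rates, loadings, threshold=0.30):
    """Score every text on one dimension.

    rates:    {feature: [normalised rate for each text]}
    loadings: {feature: loading of that feature on this dimension}
    Features whose absolute loading does not exceed `threshold` are ignored.
    """
    n_texts = len(next(iter(rates.values())))
    scores = [0.0] * n_texts
    for feature, loading in loadings.items():
        if abs(loading) <= threshold:
            continue  # not salient on this dimension
        sign = 1 if loading > 0 else -1
        for i, z in enumerate(standardise(rates[feature])):
            scores[i] += sign * z
    return scores


# Hypothetical raw counts and text lengths (in words) for three texts.
lengths = [2000, 500, 1000]
raw = {
    "feature_a": [40, 5, 10],
    "feature_b": [30, 4, 12],
    "feature_c": [100, 60, 110],
    "feature_d": [7, 2, 3],
}
# Hypothetical loadings of the four features on one dimension.
loadings = {
    "feature_a": 0.82,
    "feature_b": 0.61,
    "feature_c": -0.55,
    "feature_d": 0.12,
}

rates = {
    feature: [normalise(c, n) for c, n in zip(feature_counts, lengths)]
    for feature, feature_counts in raw.items()
}
scores = dimension_scores(rates, loadings)
print([round(s, 2) for s in scores])
```

This is a simplification: it handles a single dimension and does not deal with features that load on more than one factor. Register-level comparisons would then be based on the mean score of the texts in each register.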

Following these steps, it is possible either to conduct a full multi-dimensional analysis or to apply the dimensions originally identified by Biber to new areas. The latter procedure is helpful for studying additional registers or specialised sub-registers. In that case, the factor analysis can be omitted, since the dimensions are already given and only the dimension scores of the new texts have to be computed. A full MD study should, however, be conducted when exploring a new domain (cf. Biber and Gray, Chapter 21).
