The influence of external factors on learner performance

Ylva Berglund (Uppsala, Sweden) and Oliver Mason (Birmingham, UK)

In this paper, we present a study that investigates the correlation between the performance of learners and a number of external factors relating to them. For this purpose we examine a collection of essays written by Swedish learners of English at Uppsala University: the USE Corpus. An important feature of the corpus is that each learner is represented by several essays. The essays are of different kinds or genres and were produced over a stretch of time, usually one term (20 weeks).

Learner performance is computed using an approach from computational stylistics, in which a number of parameters are measured for each text. At this stage we are primarily interested in automatic stylistic assessment of the texts, and the parameters used for this purpose include average word length, type/token ratio and several others. An analysis of correlations between these parameters extracts those which measure separate dimensions, which can then be manually related to judgements about the quality of the essays. It is important to note that ‘quality’ in this sense relates to the dimensions alone and is not to be interpreted as a value judgement identifying ‘good’ or ‘bad’ essays or students.
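As an illustration of the kind of measurement described above, the following minimal sketch computes the two parameters named in the text and then keeps only those parameters that are not strongly correlated with one already kept. The tokenisation rule, the function names and the 0.8 threshold are assumptions made for this example; they are not taken from the study.

```python
import math
import re


def stylistic_parameters(text):
    """Return average word length and type/token ratio for one text."""
    # Assumption: a token is a run of letters or apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"avg_word_length": 0.0, "type_token_ratio": 0.0}
    return {
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def select_separate_dimensions(columns, threshold=0.8):
    """Greedily keep parameters whose |r| with every kept one is below threshold.

    `columns` maps a parameter name to its values across all essays.
    The threshold is a hypothetical cut-off chosen for illustration.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


params = stylistic_parameters("The cat sat on the mat")
```

With one value per essay for each parameter, `select_separate_dimensions` returns the names of the parameters retained; relating the retained dimensions to judgements of quality remains a manual step, as in the text.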

The first result of the study is thus a list of computationally extractable parameters which reflect the quality of an essay. The set of values extracted for each essay is then used for two further processing steps: a cluster analysis and a factor analysis. The cluster analysis is used to identify groups of essays with similar values, that is, essays which are of comparable ‘stylistic quality.’ These clusters can be analysed to, for example, identify the stylistic variability of individual students, i.e. to see whether their writing style develops with time or is consistent across the different essays (with respect to the measured parameters). It would also be possible to compare the identified clusters with teaching groups to see to what extent the students in the same group produce stylistically similar essays.
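The clustering step can be sketched as follows. The abstract does not say which clustering algorithm is used, so this example substitutes a plain k-means with a deterministic initialisation; the essay values and student identifiers are invented for illustration. In practice the parameters would probably need to be standardised first, since they are measured on different scales.

```python
import math


def kmeans(points, k, iterations=20):
    """Assign each point to one of k clusters (plain k-means).

    Assumption: the first k points serve as initial centroids, which keeps
    the example deterministic but is not a recommended initialisation.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Hypothetical data: (student id, (average word length, type/token ratio)).
essays = [
    ("s1", (4.1, 0.42)),
    ("s1", (4.2, 0.44)),
    ("s2", (5.3, 0.61)),
    ("s2", (5.2, 0.60)),
]
assignment = kmeans([values for _, values in essays], k=2)

# Collect the clusters each student's essays fall into.
clusters_per_student = {}
for (student, _), cluster in zip(essays, assignment):
    clusters_per_student.setdefault(student, set()).add(cluster)
```

A student whose essays all fall into one cluster is stylistically consistent with respect to the measured parameters; essays spread over several clusters would indicate variability or development over time.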

In the second processing step, a factor analysis correlates the textual parameters with extra-linguistic factors. This analysis gives insights into which external factors correlate with certain ‘style scores.’ Which external factor seems to have the greatest influence on the stylistic quality of an essay or group of essays? Can any patterns be revealed that indicate that students with a similar background also write in a similar way? If so, which factors are most influential on the stylistic quality, as measured here?
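A full factor analysis is beyond a short example, so the sketch below uses a much simpler stand-in: it groups essays by one categorical external factor and compares the mean of one style parameter per group. The field names (`abroad`, `ttr`) and all values are hypothetical and serve only to show the shape of the comparison.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_factor(records, factor, score):
    """Return the mean of `score` for each level of the external `factor`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[factor]].append(record[score])
    return {level: mean(values) for level, values in groups.items()}


# Hypothetical records: one per essay, with an external factor and a style score.
records = [
    {"abroad": "yes", "ttr": 0.5},
    {"abroad": "yes", "ttr": 0.7},
    {"abroad": "no", "ttr": 0.4},
    {"abroad": "no", "ttr": 0.5},
]
by_abroad = mean_score_by_factor(records, factor="abroad", score="ttr")
```

A clear difference between group means would suggest that the factor is worth including in the fuller analysis; on its own it says nothing about significance or about interactions between factors.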

In this study we are concerned with the stylistic quality of the essays and have chosen to measure parameters related to stylistic features. A similar approach could be used with other parameters, which would then measure a different quality. With access to an error-tagged corpus, for example, this kind of analysis could be used to identify external factors which affect the error rate of students. Is it, for example, the case that students who have spent a long time abroad make different kinds of errors from those who have learned the language in school?