Examining structure and context - questioning the nature and purpose of summative assessment
Seminar presentation to
Cambridge International Examinations
University of Cambridge Local Examinations Syndicate
July 2003.
Keith S. Taber
Faculty of Education - University of Cambridge
Examination questions have changed over the years - with structured questions set in 'everyday' contexts becoming more popular. Do these changes help us assess what we think is important? For that matter, do they actually help the candidates show us what they know?
Abstract: It is now common practice in many examinations to use questions which are structured, and set in a context. This paper explores the consequences of these trends. Structure helps candidates to know what the examiner requires, and so helps them identify which specific knowledge they need to use to answer a question. It is clear, however, that the demands of a structured question are different from those of a more traditional examination question seeking to elicit the same knowledge. Context is also generally assumed to be for the benefit of the candidate, as it makes a question about an abstract topic more concrete and familiar. It is argued in this paper that although the aim of contextualising questions is well intentioned, there may be good reasons for being suspicious of many contextualised questions. Context-bound questions may actually reduce item validity, without necessarily helping candidates. Although it should not be concluded that structure and context are necessarily inappropriate in examinations, this paper sets out some of the issues which need to be considered when designing summative assessments.
Suggested keywords: assessment; examinations; structured questions; contextualised questions; cueing and recall.
Introduction
In recent years there has been a tendency to change the types of questions used in examinations. Open-ended (e.g. essay-type) and objective (e.g. multiple-choice) questions have tended to be replaced by structured questions (Barker, 2001). Additionally, there has been an increasing tendency to deliberately embed examination questions within an everyday or other 'relevant' context. This paper explores these trends in the light of what we know about human learning, and the purpose of summative assessment.
The move towards more structured questions can be defended from several perspectives, but it clearly changes the demands on candidates compared with more traditional question styles. At one level examination boards may claim that these different question types still test whether the candidate has access to the same knowledge, but this does not mean that they are equally difficult. This raises the issue of which cognitive skills we wish to teach and assess in science.
This paper also argues that the increase in contextualised questions in examinations should be viewed critically. This is not a simple issue, and it seems unlikely that context in examinations could be judged as simply 'good' or 'bad'. However, it is argued that there are a number of key issues which need to be considered when judging what makes a 'good' examination question, and there are strong reasons to be suspicious of an over-reliance on context-bound questions.
Before reaching any conclusions it is appropriate to review both the purposes of assessment and the processes that learners must engage in to answer an examination question. Only by keeping in mind what examination questions are for, and how candidates respond to them, can we judge the appropriateness of any particular type of question.
Types and purposes of assessment
During their school careers students will be assessed frequently, and by a wide range of means. If we take a simplistic view of schooling as designed to facilitate learning, we can see the purpose of assessment as being to judge what learning has taken place.
It is generally recognised that there are different purposes for assessment, and - in particular - assessment is often classed as diagnostic, formative or summative. Quite rightly, in recent years, teachers have been asked to focus on formative assessment, or 'assessment for learning' (e.g. Sorenson, 2000). As the role of the teacher is to bring about learning, it makes sense for teachers to see the primary purpose of their assessment procedures as being an integral part of a cycle of activities that facilitate that learning.
Traditionally, teachers may have judged learning primarily through formally assessed end-of-topic tests, but if assessment is to support learning then it needs to be continuous, so that it can inform the ongoing management of learning (i.e. teaching) in the classroom.
It is also appropriate, if we wish to judge the learning that has taken place (and so the effectiveness of teaching), to test at the start of a topic as well as at the end. Without a benchmark it is difficult to judge whether any learning displayed in an end-of-topic test took place during, rather than before, the teaching of the topic. Indeed, a pre-test could in principle reveal that whole sections of a scheme of work would repeat material that is already well known. In this case the assessment can inform a reorganisation of teaching, allowing time to be used more effectively to ensure genuine student progression, and avoiding the frustration of students who recognise that they are repeating work.
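To make the benchmarking arithmetic concrete, consider the following minimal sketch. The normalised gain formula used here is just one possible choice, and all of the scores are invented for illustration:

```python
# A minimal sketch of the benchmarking arithmetic. The normalised gain
# formula is one common choice (an assumption, not taken from this paper),
# and all scores are invented for illustration.

def learning_gain(pre_score: float, post_score: float, max_score: float = 100.0) -> float:
    """Fraction of the available 'headroom' actually gained during teaching."""
    if pre_score >= max_score:
        return 0.0  # pre-test already at ceiling: no measurable headroom
    return (post_score - pre_score) / (max_score - pre_score)

# The same end-of-topic score of 85 tells two very different stories,
# depending on the pre-test benchmark:
print(learning_gain(70, 85))  # 0.5   - half the available learning achieved
print(learning_gain(84, 85))  # ~0.06 - the topic was largely known already
```

The point of the sketch is simply that the end-of-topic score alone cannot distinguish these two cases; only the benchmark does.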
Of course pre-testing is more than bench-marking - and this is where the notion of diagnostic testing may become especially important. Research into the learning process, and particularly learning in science, provides two important justifications for diagnostic assessment. Firstly, research into children's ideas in science has revealed a wide range of common alternative conceptions and frameworks that students develop across a range of science topics and key stages. Diagnosing the extent to which known alternative conceptions are present in a particular teaching group can guide the teacher in the extent to which such ideas need to be challenged on a class or individual basis (Taber, 2002).
The second justification concerns the notion of 'pre-requisite' knowledge. Learning is constrained by the limitations of the students' 'cognitive apparatus', the features of the brain which bring about learning (Taber, 2000). New material to be learnt needs to be of limited complexity, and readily relatable to existing knowledge, if it is to be understood as intended and retained by learners. In effect this means that teachers need to undertake content analyses of topics to identify the conceptual foundations upon which new learning will logically build (Taber, 2002). If this pre-requisite learning is not present then there is little chance of building up the more advanced ideas. Pre-testing can diagnose deficits in the required prior learning, and so inform teachers that remedial work is needed (with the whole class, or some individuals) before setting out upon the intended new learning.
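As an illustration of what such a content analysis might yield, the sketch below represents pre-requisite knowledge as a simple dependency structure; the topic names are invented for the example:

```python
# An invented example of the outcome of a content analysis: pre-requisite
# knowledge as a dependency structure, so that a pre-test can diagnose
# which conceptual foundations are missing before new teaching begins.
prerequisites: dict[str, set[str]] = {
    "chemical bonding": {"atomic structure", "electrostatic attraction"},
    "atomic structure": {"particle model"},
}

def missing_foundations(topic: str, demonstrated: set[str]) -> set[str]:
    """Return all direct and indirect pre-requisites not confirmed by the pre-test."""
    gaps: set[str] = set()
    for prereq in prerequisites.get(topic, set()):
        if prereq not in demonstrated:
            gaps.add(prereq)
        gaps |= missing_foundations(prereq, demonstrated)
    return gaps

# Suppose a pre-test shows the class is secure only on the particle model:
print(missing_foundations("chemical bonding", {"particle model"}))
# -> {'atomic structure', 'electrostatic attraction'}: remedial work needed first
```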
Summative assessment - examinations, such as GCSE and A level examinations - is not primarily designed to be diagnostic or formative. The purpose of an examination is to be summative: to allow a judgement to be made about the candidates' learning in a subject at a certain level. In the UK system at the current time examination results are also used to judge the performance of schools and colleges (and their teachers). For this purpose a measure of value-added is often seen as appropriate - such as looking at A level grades in the light of the students' GCSE grades on entry to a course. The 'output' grades of one cohort may be compared against those of another with similar 'input' characteristics. However, this process is largely a statistical one, and unlike the use of pre-tests there is no attempt to evaluate the specific features of teaching in a way that could directly inform professional practice. (In this sense, even as an evaluation tool the process is summative rather than formative - the teacher is given an indication of how well they are teaching, but no indication of what they might be doing particularly well, or where they might specifically be going wrong!)
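The following sketch illustrates the purely statistical character of such comparisons. The least-squares approach and the figures are assumptions made for illustration, not any examination board's actual methodology:

```python
# A sketch of value-added as a statistical exercise (an assumption about the
# general approach, not any board's actual model): output grades are predicted
# from input grades by a least-squares line, and 'value added' is read off as
# the residual. All numbers are invented.
from statistics import mean

def value_added(inputs: list[float], outputs: list[float]) -> list[float]:
    """Residuals of output grades against a least-squares prediction from inputs."""
    mx, my = mean(inputs), mean(outputs)
    slope = (sum((x - mx) * (y - my) for x, y in zip(inputs, outputs))
             / sum((x - mx) ** 2 for x in inputs))
    intercept = my - slope * mx
    return [y - (slope * x + intercept) for x, y in zip(inputs, outputs)]

# Mean GCSE points on entry vs A level points achieved (invented figures):
gcse_points = [6.0, 5.5, 7.0, 6.5]
a_level_points = [80.0, 70.0, 100.0, 95.0]
print([round(r, 1) for r in value_added(gcse_points, a_level_points)])
# -> [-1.0, -0.5, -2.0, 3.5]: the last student outperformed the intake
# prediction, but nothing here says *what* the teaching did well or badly.
```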
Now if different school assessments are seen to have different purposes, then we might expect them to be designed to match those purposes optimally - so that diagnostic, formative and summative assessments may look very different. To some extent this is true, but this paper will raise some serious questions about the nature of many questions currently used for summative assessment in formal examinations. Before discussing this in any depth, it is helpful to briefly review the widely accepted features of 'learning theory' which inform teaching (and so should inform the assessment of learning).
Contingency: a principle of learning science
Learning is a natural process - all human beings learn a great deal effortlessly. However, as all teachers know, directing specific learning is a much more hit-and-miss affair. Even motivated and interested learners being taught by keen and well-prepared teachers often fail to learn what the teacher thought she was teaching! When we examine the nature of the learning process, this does not seem surprising.
Learning is in effect a change in behaviour (e.g. answering a question differently) brought about by experience (such as school lessons). This change is understood to result from permanent modifications in the brain, in terms of the strength of connections between nerve cells. The conjecture that our conceptual knowledge is somehow represented in the brain is widely accepted, but as we currently have little detailed understanding of how knowledge is coded in synapses, it is usually more helpful to talk in more abstract terms about 'cognitive structure', meaning the organisation of knowledge in the brain. It may be helpful to think of cognitive structure as a mental concept map of everything that is known (or believed). So for the purposes of useful discussion we can consider (conceptual) learning to involve changes in the student's mental concept map.
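Purely as a thinking aid, the concept map metaphor can be rendered in a toy model; all of the names here are invented:

```python
# A toy rendering of the 'mental concept map' metaphor (all names invented):
# concepts as nodes, propositions as labelled links, and conceptual learning
# as the addition or revision of links in a particular region of the map.
ConceptMap = dict[tuple[str, str], str]

concept_map: ConceptMap = {
    ("atom", "nucleus"): "contains a",
    ("current", "circuit"): "flows around a",
}

def learn(cmap: ConceptMap, source: str, target: str, link: str) -> None:
    """Learning as a change to cognitive structure: add or revise a proposition."""
    cmap[(source, target)] = link

# A learner may arrive holding an alternative conception...
learn(concept_map, "current", "bulb", "is used up by a")
# ...which teaching seeks to revise, i.e. to change that region of the map:
learn(concept_map, "current", "bulb", "transfers energy to a")
```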
Now what is known from a great deal of research is that learning is a highly contingent process. In other words, whether a student changes a particular region of their mental concept map, and how they change it, depend upon a range of influences (Taber, 2000). Even assuming that the student is paying full attention to the teacher, the clearest possible teacher explanation will not necessarily bring about the desired 'change in mind' on the part of the student.
In order for the intended learning to take place the learner has to make sense of the teacher's exposition, recognise how it relates to existing knowledge, and be convinced (though not necessarily consciously) that a change to the concept map is justified. If these conditions are met, there may be temporary changes in the brain (akin to self-sustaining cycles of electrical activity) that can later be 'fixed' as permanent modifications to patterns of synaptic connections.
However, these criteria may not be readily met. For the student to understand the teacher it is necessary both for them to share a common language (though research shows that students and teachers often attach different meanings to both technical and non-technical words used in lessons) and for the teacher's exposition to be simple enough for the student to 'take it in'. This latter point is quite significant, as research suggests that we all have quite limited 'working memories', and that information of even moderate complexity may readily exceed our processing capacity.
Even when the teacher's language is not a barrier to understanding (and experienced teachers become quite expert at using appropriate language for their students), the teacher's exposition will be aimed at specific target knowledge within the learners' cognitive structure. So the student has to recognise which 'bit' of the concept map is being addressed. As most students have a whole range of alternative conceptions, it is often the case that the learner's mental concept map is actually quite different from the idealised version envisaged by the teacher (hence the value of diagnostic testing), and so even a clear teacher explanation may not map onto the learner's existing cognitive structure as intended.
There is much research to show that learners are often quite resistant to some new ideas they meet. Sometimes these ideas seem quite 'counter-intuitive' (perhaps because they do not match prior learning moderated by previous experience). Ideas that seem convincing to the teacher (who is very familiar with the topic area) may seem arbitrary and even unlikely to the learner. Human learning has a logical feature in that the status of new ideas is judged according to how well they match existing ideas and our assumptions about the way the world works - and sometimes the science we teach does not score highly on these criteria when evaluated (usually subconsciously) by learners. Ideas that are considered simplifying when we (teachers) see the wider picture may just seem to make things more complicated when first met by the learner.
Even when the student has understood the teacher's language, has not found the exposition too complex, has related the teaching to the appropriate part of prior learning (where it matches the target knowledge sufficiently well), and when the new ideas have sufficient status, learning is still often a slow process. The permanent modifications in brain structure which change the mental concept map usually begin some hours after the initial learning (Taber, in preparation).