The Do’s and Don’ts of writing MCQs (multiple choice questions) version 4[1]

1. Strengths and limitations of MCQs (Zimmaro, 2004:11)

Strengths:

1. Achievement of learning outcomes from simple to complex can be assessed.

2. Highly structured and clear tasks are provided.

3. A broad sample of achievement can be assessed.

4. Incorrect alternatives provide diagnostic information.

5. Scores are less influenced by guessing than true-false items.

6. Scores are more reliable than subjectively scored items (e.g. essays).

7. Scoring is easy, objective, and reliable.

8. Item analysis can reveal how difficult each item was and how well it discriminated between the stronger and weaker students in the class.

9. Achievement can be compared from class to class and year to year.

10. Can cover a lot of material very efficiently (about one item per minute of testing time for straightforward questions).

11. Items can be written so that students must discriminate among options that vary in degree of correctness.

12. Avoids the absolute judgments found in True-False tests.

Limitations:

1. Constructing good items is time consuming.

2. It is frequently difficult to find plausible distractors.

3. Can be ineffective for assessing some types of problem solving and the ability to organize and express ideas.

4. Real-world problem solving differs – a different process is involved in proposing a solution versus selecting a solution from a set of alternatives.

5. Scores can be influenced by reading ability.

6. There is a lack of feedback on individual thought processes – it is difficult to determine why individual students selected incorrect responses.

7. Students can sometimes read more into the question than was intended.

8. Often focuses on testing factual information and fails to test higher levels of cognitive thinking.

9. Sometimes there is more than one defensible “correct” answer.

10. They place a high degree of dependence on the instructor’s writing ability.

11. Does not provide an assessment of writing ability.

12. May encourage guessing.

2. Parts of a multiple choice question (Bull & Mckenna, 2002)

A traditional multiple choice question (or item) is one in which a student chooses one answer from a number of choices supplied. A multiple choice question consists of:

·  stem - the text of the question

·  options - the choices provided after the stem (these include the key and the distractors)

·  the key - the correct answer in the list of options

·  distractors - the incorrect answers in the list of options
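
The four parts listed above can also be pictured as a small data structure. The Python sketch below is illustrative only: the class name MCQItem, its fields and the sample question are assumptions made for this example and do not come from the cited sources.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MCQItem:
    """One multiple choice item (hypothetical structure, for illustration)."""
    stem: str                   # the text of the question
    options: Tuple[str, ...]    # every choice shown to the student
    key_index: int              # position of the correct answer within options

    @property
    def key(self) -> str:
        """The correct answer in the list of options."""
        return self.options[self.key_index]

    @property
    def distractors(self) -> Tuple[str, ...]:
        """The incorrect answers: every option except the key."""
        return tuple(o for i, o in enumerate(self.options) if i != self.key_index)


# Sample item invented for this sketch.
item = MCQItem(
    stem="Which planet is closest to the Sun?",
    options=("Venus", "Mercury", "Earth", "Mars"),
    key_index=1,
)
print(item.key)          # Mercury
print(item.distractors)  # ('Venus', 'Earth', 'Mars')
```

Storing the key as a position rather than as a separate string means that the key is always one of the options, and the distractors are simply whatever remains.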

3. Some examples of do’s and don’ts (Bull & Mckenna, 2002, Kehoe, 1995, Zimmaro, 2004)

Begin writing items well ahead of the time when they will be used; this allows time for revision and peer review.

Before writing the stem, identify the single idea to be tested by that item. This should be an important aspect of the content area, not trivia. In general, the stem should not pose more than one problem, although the solution to that problem may require more than one step.

Be sure that each item is independent of all other items (i.e. a hint to an answer should not be unintentionally embedded in another item).

Design each item/question so that it can be answered by 60-65% of the student cohort (Zimmaro, 2004:15)
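
Whether an item met this target can only be checked after the test, from the students' responses. The Python sketch below shows one simple way to do so, together with a basic discrimination measure of the kind item analysis provides. The function names and the sample data are invented for illustration, and comparing the upper and lower halves of the class is only one convention among several; none of this is prescribed by the cited sources.

```python
def difficulty_index(responses):
    """Proportion of students who answered the item correctly (0.0 to 1.0).

    responses holds 1 for a correct answer and 0 for an incorrect one.
    """
    return sum(responses) / len(responses)


def discrimination_index(responses, total_scores):
    """Proportion correct in the upper half minus proportion correct in the lower half.

    Students are ranked by their total test score (assumed convention:
    upper/lower halves; with an odd class size the middle student is ignored).
    """
    ranked = [r for _, r in sorted(zip(total_scores, responses),
                                   key=lambda pair: pair[0], reverse=True)]
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    upper, lower = ranked[:half], ranked[-half:]
    return sum(upper) / half - sum(lower) / half


# Invented data: ten students, one item.
responses = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
total_scores = [18, 17, 16, 15, 14, 12, 10, 9, 8, 6]

print(f"difficulty: {difficulty_index(responses):.2f}")                        # 0.60
print(f"discrimination: {discrimination_index(responses, total_scores):.2f}")  # 0.40
```

In this invented data set 60% of the class answered correctly, which falls within the 60-65% range, and the positive discrimination value indicates that stronger students answered correctly more often than weaker ones.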

3.1 Writing Stems
(i) Present a single, definite statement or direct question to be completed or answered by one of the several given choices
A. original stem
Polysaccharides
a.  are made up of thousands of smaller units called monosaccharides
b.  are NOT found in the aloe vera leaf
c.  are created during photosynthesis
d.  can be described by the chemical formula: CHHOH
B. improved stem
Polysaccharides of the plant cell wall are synthesized mainly in the
a.  endoplasmic reticulum
b.  cytosol
c.  plasma membrane
d.  Golgi complex
In Example A, there is no sense from the stem what the question is asking. Example B more clearly identifies the question and offers the student a set of homogeneous choices.
(ii) Avoid unnecessary and irrelevant material in the stem. The stem should be clear and unambiguous.
A. original stem:
Paul Muldoon, an Irish postmodern poet who uses experimental and playful language, uses which poetic genre in "Why Brownlee Left"?
a.  sonnet
b.  elegy
c.  narrative poem
d.  dramatic monologue
e.  haiku
B. improved stem
Paul Muldoon uses which poetic genre in "Why Brownlee Left"?
a.  sonnet
b.  elegy
c.  narrative poem
d.  dramatic monologue
e.  haiku
Example A contains material irrelevant to the question. This sort of material should not be used to make the answer less obvious, because doing so places too much importance on reading comprehension as a determiner of the correct option.
(iii) Use clear, straightforward language in the stem of the item.
Questions that are constructed using complex or imprecise wording may become a test of reading comprehension rather than an assessment of whether the student knows the subject matter.
A. original stem
As the level of fertility approaches its nadir, what is the most likely ramification for the citizenry of a developing nation?
a.  a decrease in the workforce participation rate of women
b.  a dispersing effect on population concentration
c.  a downward trend in the youth dependency ratio
d.  a broader base in the population pyramid
e.  an increased infant mortality rate
B. improved stem
A major decline in fertility in a developing nation is likely to produce a
a.  decrease in the workforce participation rate of women
b.  dispersing effect on population concentration
c.  downward trend in the youth dependency ratio
d.  broader base in the population pyramid
e.  increased infant mortality rate
(iv) Use negatives sparingly in the stem. If negatives must be used, capitalize, underscore, embolden or otherwise highlight them. Negatives include words such as ‘except’ and ‘only’.
A. original stem
Which one of the following is not a symptom of osteoporosis?
a.  decreased bone density
b.  frequent bone fractures
c.  raised body temperature
d.  lower back pain
B. improved stem
Which one of the following is a symptom of osteoporosis?
a.  decreased bone density
b.  raised body temperature
c.  hair loss
d.  painful joints
Negatives in the stem usually require that the answer be a false statement. Because students are likely in the habit of searching for true statements, this may introduce an unwanted bias.
(v) Put as much of the question in the stem as possible, rather than duplicating material in each of the options.
A. original stem
Theorists of pluralism have asserted which of the following?
a.  The maintenance of democracy requires a large middle class.
b.  The maintenance of democracy requires autonomous centres of countervailing power.
c.  The maintenance of democracy requires the existence of a multiplicity of religious groups.
d.  The maintenance of democracy requires a predominantly urban population.
e.  The maintenance of democracy requires the separation of governmental powers.
B. improved stem
Theorists of pluralism have asserted that the maintenance of democracy requires
a.  a large middle class
b.  autonomous centres of countervailing power
c.  existence of a multiplicity of religious groups
d.  a predominantly urban population
e.  separation of governmental powers
Another example: If the point of an item is to associate a term with its definition, the preferred format would be to present the definition in the stem and several terms as options, rather than to present the term in the stem and several definitions as options.
(vi) Avoid irrelevant clues to the correct option in the stem.
Grammatical construction, for example, may lead students to reject options which are grammatically incorrect as the stem is stated. Perhaps more common and subtle, though, is the problem of common elements in the stem and in the answer.
Consider the following item:
What led to the formation of the States’ Rights Party?
a. The level of federal taxation
b. The demand of states for the right to make their own laws
c. The industrialization of the South
d. The corruption of federal legislators on the issue of state taxation
One does not need to know U.S. history in order to be attracted to the answer, b.

3.2 Writing distractors

(Zimmaro D. 2004, Bull & Mckenna, 2002, Kehoe, 1995, Nitko, 2001, Parkes)

Writing distractors is more difficult than writing stems. They are called distractors because they are strategically designed to attract examinees who haven’t completely mastered the content and skills. This isn’t tricky, deceptive or unfair: the goal of testing is to find out who has learned the content and can apply the skills and who has not, perhaps along a continuum between the two. Students who have mastered the material should recognize the key (correct answer) and those who haven’t should not (Parkes).

(i) Decide on how many distractors to write

According to Nitko (2001) there is no magic number that you should use. A 1987 study by Owen & Freeman suggests that three choices are sufficient. Clearly, the higher the number of distractors, the less likely it is for the correct answer to be chosen through guessing (provided all alternatives are of equal difficulty) (Bull & Mckenna, 2002). Be satisfied with three or four well-constructed options. Generally, the minimal improvement to the item due to that hard-to-come-by fifth option is not worth the effort to construct it (Kehoe, 1995).
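
The effect of the number of options on guessing can be made concrete with a little arithmetic. Assuming a student guesses blindly and every option is equally attractive (the same condition attached to the claim above), the chance of a correct guess on one item is 1 divided by the number of options. The Python sketch below works this out for three, four and five options; the function names and the 60-item test length are assumptions chosen for illustration.

```python
def chance_of_correct_guess(n_options):
    """Probability of a blind guess hitting the key, all options equally likely."""
    return 1 / n_options


def expected_guess_score(n_items, n_options):
    """Expected number of items answered correctly by blind guessing alone."""
    return n_items / n_options


for n_options in (3, 4, 5):
    p = chance_of_correct_guess(n_options)
    score = expected_guess_score(60, n_options)
    print(f"{n_options} options: {p:.0%} per item, "
          f"about {score:.0f} of 60 items by guessing alone")
# 3 options: 33% per item, about 20 of 60 items by guessing alone
# 4 options: 25% per item, about 15 of 60 items by guessing alone
# 5 options: 20% per item, about 12 of 60 items by guessing alone
```

Under these assumptions, adding a fifth option lowers the chance of a lucky guess only from 25% to 20%, which is in keeping with the advice that the fifth option is rarely worth the effort.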

(iii) Follow these hints to avoid test validity problems

1.  Try to write items in which there is one and only one correct or clearly best answer, and one on which experts would agree.

2.  Be sure wrong answer choices (distractors) are at least plausible.

For example, a distractor can be correct but not answer the question. However, the distractor must not be so close to the correct answer that it confuses students who really do know the answer.

3.  Incorporate common student misunderstandings or errors in distractors.

4.  The position of the correct answer should vary randomly from item to item.

After the options are written, vary the location of the answer on as random a basis as possible. A convenient method is to flip two (or three) coins at a time where each possible Head-Tail combination is associated with a particular location for the answer. Students should be informed that the locations are randomized. (Testwise students know that for some instructors the first option is rarely the answer.)

5.  Avoid overlapping alternatives.

For example, in the original form of this item, if either of the first two alternatives is correct, ‘C’ is also correct.

Original
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
A. Infancy
B. Preschool period
C. Before adolescence
D. During adolescence
E. After adolescence

Revised
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
A. From birth to 2 years old
B. From 2 years to 5 years old
C. From 5 years to 12 years old
D. From 12 years to 20 years old

6.  The length of the response options should be about the same within each item (preferably short).

Adherence to this rule avoids some of the more common sources of biased cueing. For example, we sometimes find ourselves increasing the length and specificity of the answer (relative to distractors) in order to ensure its truthfulness. This, however, becomes an easy-to-spot clue for the testwise student. The number of students choosing a distractor should depend only on deficits in the content area which the item targets and should not depend on cue biases or reading comprehension differences in ‘favour’ of the distractor.

7.  There should be no grammatical clues to the correct answer.

Original
1. Albert Einstein was a:
A. Anthropologist
B. Astronomer
C. Chemist
D. Mathematician

Revised
1. Who was Albert Einstein?
A. An anthropologist
B. An astronomer
C. A chemist
D. A mathematician

8.  Avoid excessive use of negatives and/or double negatives and words such as ‘always’, ‘never’, and ‘all’.

9.  Avoid the use of ‘All of the above’, ‘both a. and e. above,’ and ‘None of the above’ in the response alternatives, when students are asked to choose the best answer.

In the case of ‘All of the above’, students only need to have partial information in order to answer the question. In a question with four or more options, students only need to know that two of the options are correct to determine that ‘All of the above’ is the correct answer choice. Conversely, students only need to eliminate one answer choice as implausible in order to eliminate ‘All of the above’ as an answer choice. Similarly, with ‘None of the above’, when used as the correct answer choice, information is gained about students’ ability to detect incorrect answers. However, the item does not reveal if students know the correct answer to the question.
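
Hint 4 in the list above (varying the position of the correct answer at random) can also be carried out in software rather than with coins. The Python sketch below is one possible stand-in for the coin-flip method, not a procedure taken from the cited sources; the function name shuffle_options and the sample item are assumptions made for this example.

```python
import random


def shuffle_options(options, key, rng=random):
    """Return the options in random order plus the letter now carrying the key.

    The input list is left unchanged; rng can be any object with a
    shuffle() method, such as random.Random(seed).
    """
    shuffled = list(options)
    rng.shuffle(shuffled)
    letter = "abcdefghij"[shuffled.index(key)]
    return shuffled, letter


# A fixed seed is used here only so that the example is repeatable;
# leave it out when preparing a real test.
rng = random.Random(2024)
options = ["Venus", "Mercury", "Earth", "Mars"]
shuffled, letter = shuffle_options(options, key="Mercury", rng=rng)

for label, text in zip("abcd", shuffled):
    print(f"{label}. {text}")
print("key:", letter)
```

Recording the returned letter for each item gives the marking key for the shuffled version of the test.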

4. Reviewing the MCQs: guidelines (Cohen & Wollack, 2000)

Cohen and Wollack recommend the following guidelines for reviewing individual questions/items before students sit the test.

1. Consider the item as a whole and whether

·  it measures knowledge or a skill component which is worthwhile and appropriate for the examinees who will be tested

·  there is a markedly better way to test what this item tests

·  it is of the appropriate level of difficulty for the examinees who will be tested.

2. Consider the stem and whether it

·  presents a clearly defined problem or task to the examinee

·  contains unnecessary information

·  could be worded more simply, clearly or concisely.

3. Consider the alternatives and whether

·  they are parallel in structure

·  they fit logically and grammatically with the stem

·  they could be worded more simply, clearly or concisely

·  any are so inclusive that they logically eliminate another more restricted option from being a possible answer.

4. Consider the key and whether it

·  is the best answer among the set of options for the item

·  actually answers the question posed in the stem

·  is too obvious relative to the other alternatives (i.e., should be shortened, lengthened, given greater numbers of details, made less concrete).