Principles of Questionnaire Construction

Addendum to Quantitative Measurement

The process of creating measurable concrete variables from the abstract concepts that characterize a research problem is known as operationalization. Concepts of interest are operationalized into an empirical format that can be used to ask people questions in order to get data for analysis and interpretation.

Measurement Steps:

identify concept of interest

develop conceptual definition

operationalize to create a variable

Considerations in Quantitative Measurement When Creating a Scale or Index:

1. Types of Variables to be created in Scale Construction:

a) Continuous vs. Discrete categories

b) Levels of measurement:

nominal

ordinal

interval/ratio

2. Scale or Index construction:

Any type of quantitative measurement that you do may include a scale or an index (a composite measure of a variable). Simply conceptualizing a variable and then creating empirical indicators does not guarantee that the variable of interest has been successfully operationalized. Often, a collection of items will give a more precise measurement of a concept. In this case, several measures or items are combined to arrive at a single scale score. The properties of validity, reliability, unidimensionality, and reproducibility, as they relate to scale construction, are the primary standards for evaluating the operationalization and measurement of empirical variables.

3. Types of Measures:

a) Composite Measures: Indexes and scales are composite measures of variables: Several empirical indicators of a variable are combined into a single measure.

b) Index vs. Scales

- Indexes and scales are not the same thing, though they share some common characteristics.

- Both scales and indexes are typically ordinal measures of variables.

- Scales and indexes are composite measures of variables, which means that measurement is based on more than one data item or indicator.

- The major distinction between indexes and scales is the manner in which scores are assigned.

- An index is constructed through simple accumulation of scores assigned to individual responses.

- A scale is constructed through the assignment of scores to patterns of attributes (responses). A scale measures the intensity structure that may exist among attributes. The most potent measure of a variable is scored the highest, followed by the rest in descending order. The total score thus reflects a pattern of answers, not just the sum of individual responses.
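The distinction between the two scoring approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three yes/no items are hypothetical, they are assumed to be listed from least to most intense, and the scale is assumed to be cumulative (a respondent who endorses an intense item is expected to endorse all milder ones). The function names are invented for this example.

```python
# Answers are coded 1 = yes, 0 = no for three hypothetical items,
# listed from least intense to most intense (an assumption of this sketch).

def index_score(answers):
    """Index: simple accumulation of the scores assigned to each response."""
    return sum(answers)


def scale_score(answers):
    """Scale: score the response pattern rather than the raw sum.

    Under the cumulative assumption, a respondent with k 'yes' answers
    should have said yes to the k least intense items. The score is k
    when the pattern fits, and None when it does not.
    """
    k = sum(answers)
    expected = [1] * k + [0] * (len(answers) - k)
    return k if list(answers) == expected else None


# Both respondents have the same index score of 2 ...
print(index_score([1, 1, 0]), index_score([1, 0, 1]))
# ... but only the first pattern fits the assumed intensity structure.
print(scale_score([1, 1, 0]), scale_score([1, 0, 1]))
```

The index treats every item as interchangeable, while the scale score depends on which items were endorsed, which is the sense in which a scale reflects a pattern of answers.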

4. Reliability and Validity

In creating a scale or an index, the following must be considered:

a. Reliability

Reliability refers to the consistency of a measure: whether a scale measures the same thing, in the same way, time after time. The more common social science definition of reliability refers to the consistency of measurement across items: do the various items of the scale, which are thought to measure the same thing, actually do so? Reliability can be statistically assessed. Coefficients of reliability and reproducibility, such as Cronbach's alpha and the Guttman coefficient of reproducibility, can be calculated using a statistical software program such as SPSS.
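For readers without access to SPSS, Cronbach's alpha can also be computed by hand. The sketch below uses plain Python rather than SPSS and applies the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    Each row holds one respondent's numeric scores on the k items.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four respondents, three items scored 1 to 5.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items vary together, which is what one expects when they measure the same thing.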

b. Validity

Validity refers to whether or not the questions or empirical indicators are actually measuring what they are supposed to be measuring. Face validity and content validity are matters of judgment and cannot be determined statistically. For face validity, you can look at the items and assess whether or not they seem to be measuring what they are intended to measure. For content validity, a judgment is made as to whether the items seem to adequately represent all aspects or dimensions of the concept being measured. Validity is fundamentally more important than reliability. A scale can be a reliable measure (e.g. shoe size as a measure of IQ) but not a valid one. Assess your measures carefully to ensure they are valid.

c. Unidimensionality

This refers to the property that the items making up a scale measure one and only one dimension or concept at a time. Typically, complex concepts such as political orientation, authoritarianism, feminism, and marital satisfaction are measured with scales and not by single questions or empirical indicators.

Note that single-item scales (i.e. single questions on a survey) must be unidimensional as well.

d. Reproducibility

A final issue in the construction of scales is that of reproducibility. The researcher should be able to predict, with knowledge of a respondent's scale score, those items with which the respondent most likely agreed and those with which the respondent was not in agreement. Reproducibility is difficult to achieve when scales lack reliability, validity, and unidimensionality.

5. General criteria for a good index or scale:

1. clear instructions

2. items are simple, free of jargon

3. scale or index should be neat and easy to read

4. should include response bias questions

5. scales should be unidimensional

6. should have face validity

7. response categories should be balanced

8. answers should fit questions

9. behavioural indicators are preferred over attitudinal

10. must be reliable

6. Commonly Used Indexes:

a) Likert:

- This is a summated scale consisting of a series of items to which the subject responds.

- The respondent indicates agreement or disagreement with each item on an intensity scale.

- The Likert technique produces an ordinal scale.

- The scale is highly reliable when it comes to a rough ordering of people with regard to a particular attitude or attitude complex.

- The score includes a measure of intensity as expressed on each statement.

- Because identical response categories are used for several items intended to measure a given variable, each item can be scored in a uniform manner.

- A response category for each item is provided, typically a 5-point response composed of (1) strongly agree, (2) agree, (3) undecided, (4) disagree, (5) strongly disagree.

- Analysis of the data is accomplished by scoring the various responses and summating them. A summated score is possible by assigning a numerical value to each response--usually a value of 1 to 5. Once the scoring procedure has been devised, a respondent's score is determined by adding the individual numbers for each item.

- Respondents can then be ranked according to the overall score obtained.
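The Likert scoring procedure described above can be sketched as follows. The numeric values follow the 1-to-5 coding given in the response categories; the respondent labels and their answers are hypothetical, and the function names are invented for this example.

```python
# Numeric value for each response category, following the coding above.
SCORES = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}


def likert_score(responses):
    """Summated score: add the numeric value of each item response."""
    return sum(SCORES[answer] for answer in responses)


def rank_respondents(data):
    """Order respondent labels by overall score, lowest total first."""
    return sorted(data, key=lambda name: likert_score(data[name]))


# Hypothetical answers from three respondents to a three-item scale.
data = {
    "R1": ["agree", "strongly agree", "agree"],
    "R2": ["disagree", "undecided", "strongly disagree"],
    "R3": ["undecided", "agree", "agree"],
}
print({name: likert_score(answers) for name, answers in data.items()})
print(rank_respondents(data))
```

With this coding a low total means strong overall agreement. Because the result is ordinal, the ranking of respondents is more meaningful than the size of the gaps between their totals.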

b) Semantic differential:

- This is not as well known or as widely used as summated or unidimensional scales.

- The semantic differential format is flexible and can be used to measure a variety of attitudes. A 100-item test can be administered in about 10 to 15 minutes; a 400-item test takes about an hour.

- This index attempts to measure attitudes toward some phenomenon by having respondents check a point along a continuum between two opposite positions. It uses a 7-point differential category between two opposite points.

- The basic rationale of the semantic differential format is to measure respondents' reactions to some property using opposite adjective ratings.

- The semantic differential seeks to understand behavior by studying language concepts and the meaning projected on the concepts. Because most sociologists agree that how a person behaves in a situation depends on that person's perception of the situation, the semantic differential is particularly useful in measuring this type of meaning.

- It is constructed by preparing a list of concepts appropriate to the theory guiding the variable to be measured. Pairs of polar adjectives, to which the respondent is asked to respond, are selected according to the theory.
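One simple way to summarize semantic differential data is to average the checked positions for each adjective pair, giving a profile of the concept being rated. The sketch below assumes the seven points are coded 1 to 7 from the first adjective to the second; the adjective pairs, the ratings, and the function name are all hypothetical, and averaging is only one possible way to summarize such ratings.

```python
from statistics import mean


def concept_profile(ratings):
    """Mean position (1 to 7) on each adjective pair across respondents.

    Each element of `ratings` is one respondent's dict mapping an
    adjective pair to the point they checked on the continuum.
    """
    pairs = ratings[0].keys()
    return {pair: mean(r[pair] for r in ratings) for pair in pairs}


# Hypothetical ratings of a single concept by three respondents.
ratings = [
    {"bad-good": 6, "weak-strong": 2, "passive-active": 5},
    {"bad-good": 5, "weak-strong": 3, "passive-active": 4},
    {"bad-good": 7, "weak-strong": 1, "passive-active": 6},
]
for pair, value in concept_profile(ratings).items():
    print(pair, value)
```

A mean near 4 suggests a neutral reaction on that pair, while means near 1 or 7 suggest the concept is perceived as close to one of the two poles.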

7. Commonly Used Scales:

a) Bogardus social distance scale:

- Measures the "distance" that respondents perceive between themselves and members of different social categories (nationalities, racial groups, deviants, etc.).

- The Bogardus social distance scale is weighted according to the type of interaction that the subject is willing to engage in with members of a group or of different groups.

- Theoretically, an individual who would readily accept a member of another ethnic group as a relative would have no objection to working alongside that person or to that person's becoming a neighbour or in-law.

b) Guttman scale:

- In Guttman scaling, both respondents and index items are ranked according to the actual answers given.

- Guttman scaling is effective at determining the unidimensionality of a scale.

- If a scale is unidimensional, then a person who has a more favorable attitude than another person should respond to each statement with equal or greater favorableness than that other person.
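How closely a set of answers fits this cumulative pattern is usually summarized with the coefficient of reproducibility, computed as 1 - (number of errors / total number of responses). The sketch below is one way to compute it, under stated assumptions: answers are coded 1 (agree) or 0 (disagree), items are ranked by how many respondents endorsed them, and an error is any response that differs from the ideal pattern implied by the respondent's total score. Other error-counting conventions exist and give somewhat different values. The function name and data are hypothetical.

```python
def coefficient_of_reproducibility(rows):
    """1 - errors / (respondents * items) for 0/1 response rows."""
    n_items = len(rows[0])
    # Rank items from most endorsed to least endorsed.
    order = sorted(range(n_items), key=lambda j: -sum(row[j] for row in rows))
    errors = 0
    for row in rows:
        score = sum(row)
        observed = [row[j] for j in order]
        # Ideal pattern: agree with the `score` most endorsed items only.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1 - errors / (len(rows) * n_items)


# Hypothetical answers; the last respondent breaks the cumulative pattern.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
print(round(coefficient_of_reproducibility(answers), 3))
```

A value of 1.0 means every respondent's answers can be reproduced exactly from the total score; lower values mean more answers depart from the cumulative pattern.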