Ph.D. COURSE WORK (2010-11)

PAPER-I

ADVANCED RESEARCH METHODOLOGY

ASSIGNMENT

ON

RELIABILITY: MEANING, CHARACTERISTICS, METHODS OF DETERMINING RELIABILITY AND FACTORS INFLUENCING RELIABILITY

Submitted to: Dr. Kulwinder Singh

Submitted by: Harsukhjinder Singh (Roll No. 9040)

DEPARTMENT OF EDUCATION AND COMMUNITY SERVICE
PUNJABI UNIVERSITY
PATIALA


RELIABILITY

The literal meaning of reliability is consistency, dependability or trust. Reliability is the degree of consistency that an instrument or procedure demonstrates: whatever it measures, it measures consistently. A data collection test is considered reliable if it yields consistent results on successive administrations. So by the reliability of a test we mean how dependable, trustworthy or faithful the test is.

DEFINITIONS OF RELIABILITY

Gronlund and Linn: Reliability refers to the consistency of measurement, that is, how consistent test scores or other evaluation results are from one measurement to another.

Anastasi: Reliability refers to the consistency of scores obtained by the same individuals when re-examined with the same test on different occasions or with different sets of equivalent items or under variable examining conditions.

CHARACTERISTICS OF RELIABILITY

1)  It is the consistency of test scores.

2)  It refers to the accuracy or precision of a measuring instrument.

3)  It refers to the test results not the test itself.

4)  It is the coefficient of internal consistency.

5)  It is the measure of variable error.

6)  It does not ensure the validity or truthfulness or purposiveness of a test.

7)  It is the self-correlation of the test.

8)  It is a matter of degree.

METHODS OF DETERMINING RELIABILITY

There are four procedures generally used for computing the reliability coefficient (sometimes called the self-correlation) of a test. These are:

1)  Test-retest (repetition)

2)  Alternate or parallel forms.

3)  Split-half technique

4)  Rational Equivalence

All these methods furnish estimates of the reproducibility of test scores; sometimes one method and sometimes another will provide the better measure.

1) Test-retest Method

This method involves (i) repetition of a test on the same group immediately or after a lapse of time, and (ii) computation of the correlation between the first and the second set of scores. The correlation coefficient thus obtained indicates the extent or magnitude of agreement between the two sets of scores and is often called the coefficient of stability. The estimate of reliability in this case varies according to the length of the time interval allowed between the two administrations. The product-moment method of correlation is the usual method for estimating reliability from the two sets of scores. A high correlation between the two sets of scores indicates that the test is reliable; in other words, it shows that the scores obtained in the first administration closely resemble those obtained in the second administration of the same test.

In this method the time interval plays an important role. Immediate repetition of a test may involve (i) immediate memory effects, (ii) practice effects, and (iii) confidence effects induced by familiarity with the content. Intervals of six months or longer may show a ‘maturity effect’. The factors of intervening learning and unlearning may lower the self-correlation. Owing to the difficulty of controlling the conditions which influence scores on retest, the test-retest method is generally less useful than the other methods.
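A minimal sketch in Python (hypothetical scores, using numpy) of how the coefficient of stability is obtained under this method: the two sets of scores from the first and second administrations are correlated by the product-moment method.

import numpy as np

# Hypothetical scores of ten pupils on the same test,
# administered twice with a suitable time interval.
first_administration = np.array([42, 55, 38, 61, 47, 50, 58, 44, 53, 49])
second_administration = np.array([45, 53, 40, 63, 46, 52, 57, 41, 55, 48])

# Product-moment correlation between the two administrations
# = coefficient of stability (test-retest reliability).
r_stability = np.corrcoef(first_administration, second_administration)[0, 1]
print(f"Coefficient of stability: {r_stability:.2f}")

A coefficient near 1.00 would indicate that the two administrations rank the pupils in almost the same order.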

Advantages

1.  It is commonly used for estimating the reliability coefficient.

2.  It can be used conveniently in a variety of situations.

3.  A test of adequate length can be used after an interval of many days between successive testings.

Limitations

1.  If the test is repeated immediately or after a short time gap, carry-over, transfer, memory, practice and confidence effects induced by familiarity with the material will almost certainly affect the scores when the test is administered a second time.

2.  The index of reliability so obtained is less accurate.

3.  If the interval between the tests is rather long (more than six months), growth and maturation affect the scores and tend to lower the reliability index.

4.  Repeating the same test on the same group a second time makes the students disinterested, so they do not take part wholeheartedly.

2) Alternate or Parallel Forms Method

This method involves the administration of two equivalent or parallel forms of the test instead of repetition of a single test. The two equivalent forms are constructed so as to make them similar (but not identical) in content, mental processes involved, number of items, difficulty level and other aspects. Parallel tests have equal mean scores, variances and inter-correlations among items; that is, the two parallel forms must be homogeneous or similar in all respects, but not a duplication of test items. The subjects take one form of the test and then, as soon as possible, the other form. The reliability coefficient may be looked upon as the coefficient of correlation between the scores on the two equivalent forms of the test.
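The same kind of computation can be sketched for parallel forms (hypothetical data in Python; the check of means and variances is only illustrative): scores on Form A and Form B are first compared for equivalence of means and variances, and their correlation is then taken as the coefficient of equivalence.

import numpy as np

# Hypothetical scores of the same group on two parallel forms.
form_a = np.array([34, 47, 29, 52, 40, 45, 50, 37, 43, 39])
form_b = np.array([36, 45, 31, 50, 42, 44, 51, 35, 44, 40])

# Parallel forms should have nearly equal means and variances.
print(f"Form A: mean = {form_a.mean():.1f}, variance = {form_a.var(ddof=1):.1f}")
print(f"Form B: mean = {form_b.mean():.1f}, variance = {form_b.var(ddof=1):.1f}")

# Correlation between the two forms = coefficient of equivalence.
r_equivalence = np.corrcoef(form_a, form_b)[0, 1]
print(f"Parallel-forms reliability: {r_equivalence:.2f}")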

Advantages

1.  Memory, practice and carry-over effects are minimised and do not affect the scores.

2.  The reliability coefficient obtained by this method is a measure of both temporal stability and consistency of response to different item samples or test forms.

3.  It is useful for the reliability of achievement tests.

Limitations

1.  Practice and carry-over factors cannot be completely controlled.

2.  When the two forms are not exactly equivalent, the comparison between the two sets of scores obtained from them may lead to erroneous decisions.

3.  Administration of two forms simultaneously creates boredom.

4.  The testing conditions while administering Form B may not be the same.

5.  Scores on the second form of the test are generally higher.

3) Split-half Method

In this method the test is administered only once to the sample, and it is the most appropriate method for homogeneous tests. It provides a measure of the internal consistency of the test scores. All the items of the test are generally arranged in increasing order of difficulty and administered once to the sample. After administration the test is divided into two comparable or equal halves; the division is made only for the purpose of scoring and not for administration. Two sets of scores are obtained from the odd-numbered and the even-numbered items separately: the odd-numbered items 1, 3, 5, 7, etc. and the even-numbered items 2, 4, 6, 8, etc. form two different sets of items for scoring. After obtaining the two scores on the odd and even halves, the coefficient of correlation between them is calculated; it is really a correlation between two equivalent halves of scores obtained in one sitting. To estimate the reliability of the whole test, the Spearman-Brown prophecy formula is used:

r_tt = 2r_hh / (1 + r_hh)

Where r_tt = reliability coefficient of the whole test.

r_hh = reliability coefficient of the half test, found experimentally.
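The computation may be sketched in Python as follows (hypothetical 0/1 item responses): the odd- and even-numbered items are scored separately, the half-test correlation r_hh is found, and the Spearman-Brown formula steps it up to the reliability of the whole test.

import numpy as np

# Hypothetical 0/1 item responses: rows = examinees, columns = items.
responses = np.array([
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
])

# Score the odd-numbered and even-numbered items separately.
odd_scores = responses[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
even_scores = responses[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...

# Correlation between the two halves (r_hh).
r_hh = np.corrcoef(odd_scores, even_scores)[0, 1]

# Spearman-Brown prophecy formula for the whole test.
r_tt = 2 * r_hh / (1 + r_hh)
print(f"Half-test correlation r_hh = {r_hh:.2f}; whole-test reliability r_tt = {r_tt:.2f}")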

Advantages

1.  There is no carry-over or practice effect, as the testee is not tested twice.

2.  The fluctuations of an individual’s ability due to environmental or physical conditions are minimised.

3.  The difficulty of constructing parallel forms of the test is eliminated.

Limitations

1.  A test can be divided into two equal halves in a number of ways and the coefficient of correlation in each case may be different.

2.  As the test is administered only once, chance errors may affect the scores on the two halves in the same way, thus tending to make the reliability coefficient too high.

3.  This method cannot be used with heterogeneous tests.

4.  This method cannot be used for estimating the reliability of speed tests.

4) Rational Equivalence Method

This method is based on the consistency of responses to all items. It makes it possible to compute the inter-correlations of the items of the test and the correlation of each item with all the other items. The method assumes that all items have the same difficulty value, that the correlations between the items are equal, that all the items measure essentially the same ability, and that the test is homogeneous in nature. Like the split-half method, this method also provides a measure of internal consistency. The most popular formula is the Kuder-Richardson formula (KR-20):

r_tt = (n / (n − 1)) × (1 − Σpq / σ_t²)

Where r_tt = reliability coefficient of the whole test.

n = number of items in the test

σ_t = the SD of the test scores

p = the proportion of the group answering a test item correctly

q = (1 − p) = the proportion of the group answering a test item incorrectly.
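A brief sketch of the KR-20 computation in Python (hypothetical 0/1 item responses), applying the formula above:

import numpy as np

# Hypothetical 0/1 item responses: rows = examinees, columns = items.
responses = np.array([
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
])

n = responses.shape[1]                    # number of items
total_scores = responses.sum(axis=1)      # total score of each examinee
variance_t = total_scores.var(ddof=1)     # variance of total scores (SD squared)

p = responses.mean(axis=0)                # proportion answering each item correctly
q = 1 - p                                 # proportion answering each item incorrectly

# KR-20: r_tt = (n / (n - 1)) * (1 - sum(p*q) / variance_t)
r_tt = (n / (n - 1)) * (1 - (p * q).sum() / variance_t)
print(f"KR-20 reliability estimate: {r_tt:.2f}")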

Advantages

1.  This coefficient provides some indication of how internally consistent or homogeneous the items of the test are.

2.  The split-half method measures only equivalence, whereas the rational equivalence method measures both equivalence and homogeneity.

3.  It requires neither the administration of two equivalent forms of the test nor the splitting of the test into two equal halves.

Limitations

1.  The coefficient obtained by this method is generally somewhat lower than the coefficients obtained by the other methods.

2.  If the items of the test are not highly homogeneous, this method will yield a lower reliability coefficient.

3.  The Kuder-Richardson and split-half methods are not appropriate for speed tests.

FACTORS INFLUENCING THE RELIABILITY

There are some intrinsic and extrinsic factors which affect the reliability of test scores:

(A) Intrinsic Factors

The intrinsic factors are those factors which lie within the test itself. The major intrinsic factors which affect the reliability are:

(i)  Length of the test: Other things being equal, the reliability of a test is a function of its length. Longer tests tend to be more reliable than shorter ones: the more items the test contains, the greater its reliability will be, and vice versa. Logically, the larger the sample of items we take from a given area of knowledge, skill and the like, the more reliable the test will be. However, it is difficult to fix the length of a test that ensures an appropriate value of reliability (see the sketch after this list).

(ii)  Homogeneity of items: Homogeneity of items has two aspects: item reliability and the homogeneity of the trait measured from one item to another. If the items measure different functions and the inter-correlations among items are zero or near zero, then the reliability is zero or very low, and vice versa.

(iii)  Difficulty level of items: Test items that are too easy or too difficult tend to produce scores of low reliability, because in both cases the spread of scores is restricted.

(iv)  Test Instruction: Clear and concise instructions increase reliability. Complicated and ambiguous directions give rise to difficulties in understanding the questions and the nature of the response expected from the testee ultimately leading to low reliability.

(v)  Item selection: If there are too many interdependent items in a test, the reliability is found to be low.

(vi)  Reliability of the Scorer: If the scorer is moody or of a fluctuating type, the scores will vary from one situation to another. Thus the reliability of the scorer also influences the reliability of the test.
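As a rough illustration of the length effect noted in point (i), the general form of the Spearman-Brown prophecy formula, r_kk = k·r / (1 + (k − 1)·r), predicts the reliability of a test made k times as long, assuming the added items are comparable to the original ones. A minimal sketch in Python with hypothetical values:

def lengthened_reliability(r, k):
    # General Spearman-Brown prophecy formula: predicted reliability of a
    # test made k times as long as one with reliability r.
    return k * r / (1 + (k - 1) * r)

# Hypothetical example: a test with reliability 0.60 is doubled and tripled in length.
for k in (2, 3):
    print(f"{k}x length: predicted reliability = {lengthened_reliability(0.60, k):.2f}")

Doubling such a test raises the predicted reliability from 0.60 to 0.75, and tripling it raises it to about 0.82, which illustrates why longer tests tend to be more reliable.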

B) Extrinsic Factors

Extrinsic factors are those factors which remain outside the test itself. The important extrinsic factors influencing the reliability are:

(i)  Group Variability: The greater the variability of the group tested, the larger the spread of scores and the higher the reliability, and vice versa.

(ii)  Guessing and Chance Errors: Guessing in a test gives rise to increased error variance and as such reduces reliability.

(iii)  Testing Conditions: The conditions under which the test is administered and scored may affect reliability either way. As far as practicable, the testing environment should be uniform.

(iv)  Momentary Fluctuations: These may raise or lower the reliability of the test scores.


REFERENCES

Best, John W., & Kahn, James V. (2006). Research in Education (10th ed.). New Delhi: PHI Learning Private Limited.

Garrett, H.E. (2005). Statistics in Psychology and Education. New Delhi: Paragon International Publishers.

Koul, Lokesh (2009). Methodology of Educational Research (4th ed.). New Delhi: Vikas Publishing House Pvt. Ltd.

Sahu, Binod K. (2004). Statistics in Psychology and Education. Ludhiana: Kalyani Publishers.

Sharma, R.A. (2000). Advanced Statistics in Education and Psychology. Meerut: R. Laal Book Depot.
