Hayward, Stewart, Phillips, Norris, & Lovell

At-a-Glance Test Review: Woodcock Reading Mastery Tests-Revised (NU Normative Update) (WRMT-R)

Name of Test: Woodcock Reading Mastery Tests-Revised (NU Normative Update) WRMT-R
Author(s): Richard W. Woodcock
Publisher/Year: 1973, 1987, 1998 (norms only)
Forms: G and H (parallel forms allowing for test-retest)
Age Range: 5 years through adulthood to “75+”; Grades K to 6
Norming Sample: The 1998 edition differs from the 1987 edition in that the norms have been updated. There are also new Instructional Level Profiles. WRMT-R is co-normed with K-TEA/NU and PIAT-R/NU.
Total Number: 3 184, Number and Age: 3 184 students from kindergarten to grade 12 as well as 245 young adults ages 18 to 22 years were tested. Location: 129 sites in 40 states, Demographics: “A stratified multistage sampling procedure” was used and the sample was compared to the March, 1994 Current Population Survey (U.S. Bureau of Census). Sampling targets guided selection and were stratified by gender, race, parent education, and geographic region. Rural/Urban: no information, SES: based on parent education, Other: Each child was randomly assigned one of the test batteries locally available. Special education students were also included in small numbers, as reflected in the census. Comments: From reading the Buros review, I have the impression that there were problems with the sample: although the authors were ambitious, they did not fulfill their intentions (Crocker & Murray Ward, 2001).
Summary Prepared By (Name and Date): Eleanor Stewart 31 May and June 2007
Test Description/Overview: The test remains unchanged from the 1987 revision in terms of test items, score forms, procedures for recording and analyzing errors, profiles, and computerized scoring. The test consists of six individual tests that, when grouped, form “clusters”, each addressing a composite of skills necessary for an aspect of reading. WRMT-R provides an assessment of reading readiness, basic reading skills, and reading comprehension. The six tests are: Visual-Auditory Learning, Letter Identification, Word Identification, Word Attack, Word Comprehension, and Passage Comprehension. Tests one through three are found in Form G only.
Purpose of Test: The purpose of this test is to identify areas of strength and weakness, to diagnose reading problems, to measure gains, to plan programs, and to support research.
Areas Tested: Print Knowledge: Alphabet; Phonological Awareness: Word Attack; Reading: Single Word Reading/Decoding, Comprehension
Who can Administer: Examiners must be Level B and have completed assessment/testing and statistics courses. Therefore, they are likely to be teachers, special educators, psychologists, or speech pathologists.
Administration Time: Administration time varies from 10 to 30 minutes depending on which cluster of subtests is administered.
Test Administration (General and Subtests): This test is individually administered. For younger children, practice items are provided and training procedures are outlined. All instructions are provided in the examiner’s manual and on the test easel.
Test Interpretation:
Scoring is facilitated by the computer program ASSIST on CD-ROM (available for Macintosh or Windows), which allows the examiner to enter raw scores that are then converted to a statistical profile for the student. In addition to Standard Scores, NCEs, age and grade equivalents, and percentile ranks, ASSIST provides Relative Performance Indexes, Confidence Bands at the 68% and 90% levels, Grade Equivalent and Standard Score/Percentile Rank profiles, Aptitude-Achievement Discrepancy Analysis, and a Narrative Report.
Standardization: Age equivalent scores, Grade equivalent scores, Percentiles, Standard scores
Other: The available clusters are the Readiness Cluster (Form G only; consists of Visual-Auditory Learning and Letter Identification), Basic Skills Cluster (Word Identification and Word Attack), Reading Comprehension Cluster (Word Comprehension and Passage Comprehension), Total Reading Full Scale (Word Identification, Word Comprehension, Passage Comprehension), and Total Reading Short Scale (Word Identification and Passage Comprehension). NCEs, Relative Performance Indexes, and instructional ranges are also provided.
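The relationships among several of the score types listed above can be illustrated with a short sketch. It is not the ASSIST program's method, which relies on the published norm tables. It assumes a standard-score metric with a mean of 100 and a standard deviation of 15, normally distributed scores, and a standard error of measurement derived from a reliability coefficient. All function names are hypothetical, and the reliability value in the example is simply the median split-half coefficient reported in this review.

```python
from math import sqrt
from statistics import NormalDist

# Assumption: standard scores are scaled to mean 100, SD 15.
MEAN, SD = 100.0, 15.0
STD_NORMAL = NormalDist()  # mean 0, SD 1


def percentile_rank(standard_score: float) -> float:
    """Percentile rank under a normal-curve assumption (not the norm tables)."""
    z = (standard_score - MEAN) / SD
    return 100.0 * STD_NORMAL.cdf(z)


def nce(standard_score: float) -> float:
    """Normal curve equivalent: mean 50, SD 21.06."""
    z = (standard_score - MEAN) / SD
    return 50.0 + 21.06 * z


def confidence_band(standard_score: float, reliability: float, level: float):
    """Symmetric band: score +/- z * SEM, where SEM = SD * sqrt(1 - reliability)."""
    sem = SD * sqrt(1.0 - reliability)
    z = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return standard_score - z * sem, standard_score + z * sem


if __name__ == "__main__":
    score = 85.0  # hypothetical obtained standard score
    print("percentile rank:", round(percentile_rank(score), 1))
    print("NCE:", round(nce(score), 1))
    for level in (0.68, 0.90):
        low, high = confidence_band(score, reliability=0.91, level=level)
        print(f"{int(level * 100)}% band: {low:.1f} to {high:.1f}")
```

In practice an examiner would read these values from the manual or from ASSIST output; the sketch only shows why a lower reliability coefficient widens the confidence band around an obtained score.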
Reliability: Important note: the reliability reported in the manual refers only to the 1987 revision. No updated reliability data based on the new norm sample are provided.
Internal consistency of items: Split-half median was .91 (range .68-.98) and was also reported for Clusters: median = .95 (range .87-.98) and Total: median = .97 (range .86-.99)
Test-retest: No information found. Technical Information from Pearson site reported “no”.
Inter-rater: No information found. Pearson reports “no”.
Other: none
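For readers unfamiliar with split-half coefficients such as those reported above, the following is a minimal sketch of one common way to compute them: an odd-even item split followed by the Spearman-Brown correction. This review does not describe the manual's exact procedure, so the odd-even split, the function names, and the tiny data set are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 entry per item.

    Assumption: items are split into odd- and even-numbered halves.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(odd_totals, even_totals))


if __name__ == "__main__":
    # Hypothetical responses: 5 examinees, 6 items (1 = correct, 0 = incorrect).
    responses = [
        [1, 1, 1, 1, 1, 0],
        [1, 1, 1, 0, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0, 0],
    ]
    print("split-half reliability:", round(split_half_reliability(responses), 2))
```

Under this correction, a correlation of about .835 between the two half-tests corresponds to a full-test coefficient of about .91, the median value reported for the individual tests.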
Validity: Important note: The validity information provided in the manual is from the previous 1987 revision.
Content: With respect to content validity, the manual states that WRMT-R was “developed with contributions from outside experts, including experienced teachers and curriculum specialists” (Woodcock, 1998, p. 97). However, unlike the authors of other manuals reviewed for the TELL Project, Woodcock does not provide references to support this statement.
Criterion Prediction (concurrent) Validity: Correlations between WRMT-R and WJ reading tests were reported for children in Grades 1, 3, 5, and 8 across subtests and total reading scores. Correlations range from a low of .39 (Passage Comprehension) to a high of .91 (Full Scale Total Reading). A 1978 study reported correlations of the 1973 WRMT with the Iowa Test of Basic Skills, Iowa Tests of Educational Development (total reading), PIAT Reading, WJ Reading Achievement, and WRAT Reading, with values from .79 to .92. The author justifies including these results: “Although these results are based on the 1973 WRMT, they are reported in this revision because the psychometric characteristics of the original WRMT (1973) and the WRMT-R are so similar that many generalizations from one to the other can be validly made” (Woodcock, 1998, p. 100).
Construct Identification Validity: Test and Cluster Intercorrelations: Since tests were clustered to target readiness and skill areas, correlations are reported for subtests within clusters as well as for clusters overall. Subtests and clusters presented predictable correlations, ranging from low (.35 at Grade 3 for Letter Identification/Visual-Auditory Learning) to high (.98 for Total Reading Short Scale/Total Reading Full Scale).
Differential Item Functioning: Classical and Rasch models were used in item development and selection, though how each was applied is unclear from the statement that “both contributed to the stringent statistical criteria employed during the process of item selection in the WRMT-R” (Woodcock, 1998, p. 97).
Other: No studies investigating predictive validity with special education populations were reported, though children with special needs did participate in norming.
Summary/Conclusions/Observations: The Buros reviewers make important comments:
“…three interpretation issues arise. First, the use of the norms generated from the smaller norm sample means that interpretations are limited…Second, scores for special education students should be used cautiously…[third]. The author states that comparisons of old and new norm data clearly show a pattern of lower performances and higher standard and percentile scores of lower achievers. This effect could result in overestimation of students’ reading levels. Thus, students might not receive appropriate services or services may be terminated prematurely. Interestingly, there are no cautions to examiners to readjust score referents to account for these changes” (Crocker & Murray Ward, 2001, p. 1372).
“It should also be remembered that no changes have been made in test skills or items, and there is no stipulation that the other measures clarify the meaning of the WRMT-R NU scores. In conclusion, the WRMT-R/NU is a limited norms update. The test still contains many test items and scores, but does not address problems identified by previous reviewers. Furthermore, the renorming has narrowed the utility of the test. Therefore, the WRMT-R/NU should be used in conjunction with other measures of reading. Results should not be overinterpreted. The examiner should also be very cautious in using the test with a wide range of age groups. If these cautions are observed, the test may be useful in helping estimate reading achievement” (Crocker & Murray Ward, 2001, p. 1372).
Clinical/Diagnostic Usefulness: Based on the critiques available, I think that this test has limited clinical utility and, if used, should serve only as an adjunct to more rigorous and contemporary reading tests. I would be very cautious about using this test’s results to make important decisions about eligibility and intervention, though the author intends for the test to be used in this way.

References

Crocker, L., & Murray Ward, M. (2001). Test review of Woodcock Reading Mastery Test-Revised 1998 Normative Update. In B. S. Plake & J. C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 1369-1373). Lincoln, NE: Buros Institute of Mental Measurements.

Current Population Survey, March, 1994 [Machine readable data file]. (1994). Washington, DC: Bureau of the Census (Producer and Distributor).

Pearson Assessments (2007). Speech and language forum. Retrieved May 31, 2008 from

Woodcock, R. W. (1998). Woodcock reading mastery tests – Revised NU: Examiner’s manual. Circle Pines, MN: American Guidance Service.

To cite this document:

Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). At-a-glance test review: Woodcock reading mastery tests-revised (NU normative update) (WRMT-R). Language, Phonological Awareness, and Reading Test Directory (pp. 1-4). Edmonton, AB: Canadian Centre for Research on Literacy. Retrieved [insert date] from