Constructing Achievement & Ability Tests (PSYED 2073)

Spring Term 2014

Instructor: Suzanne Lane
Office: 5916 Posvar Hall
Phone: 648-7095
Email:
Office Hours: By appointment

Description of Course

A basic course in the construction of measures of educational achievement. Topics include test planning, item writing, test tryout, item analysis, reliability, validity, criterion-referencing, and norm-referencing. Students write items, critique items written by others, construct tests, try out and revise tests, and develop test manuals documenting the development process and the quality of their tests. Prerequisites: PSYED 2072 and PSYED 2018.

Required Reading

Readings will be assigned from texts and material on reserve in the 5900 area.

Written Assignments and Exercises

a.  Develop and critique performance targets and corresponding test items that assess various levels of thinking.

b.  Develop a plan for an achievement test consisting of response-choice items.

c.  Develop an achievement test consisting of multiple-choice items for an instructional unit.

d.  Perform analyses of test data and write a test manual (an illustrative sketch of typical analyses follows this list).

e.  Written assignments on selected articles.

f.  Develop a plan for a test consisting of constructed-response or performance items.

g.  Develop a set of constructed-response and performance items.

h.  Develop scoring rubrics for constructed-response and performance items.

i.  Evaluate other students’ test items.
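
For item (d), the course covers classical item and test analyses (see the 2/25 session on the syllabus). As a minimal illustration only, the following Python sketch computes item difficulty, point-biserial discrimination, and KR-20 reliability for a small, hypothetical set of dichotomously scored responses; it is not a required procedure or format.

    # Illustrative only: classical item analysis for a dichotomously scored test.
    # The response matrix is hypothetical; rows are examinees, columns are items.
    import numpy as np

    scores = np.array([
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [0, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
    ])

    k = scores.shape[1]                # number of items
    total = scores.sum(axis=1)         # each examinee's total score

    # Item difficulty: proportion of examinees answering each item correctly.
    p = scores.mean(axis=0)

    # Item discrimination: correlation of each item with the total score
    # (point-biserial; the item is left in the total here for simplicity).
    disc = np.array([np.corrcoef(scores[:, j], total)[0, 1] for j in range(k)])

    # KR-20 reliability: (k / (k - 1)) * (1 - sum(p * q) / variance of total).
    q = 1.0 - p
    kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total.var(ddof=1))

    print("difficulty:", p.round(2))
    print("discrimination:", disc.round(2))
    print("KR-20:", round(kr20, 2))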

Individual appointments will be made with the instructor to review the quality of the work turned in and the plans for data collection and analysis.

Grading:

The final grade will be based on written assignments and class participation. Late assignments will not be accepted.

Test with response-choice items        40%
Test with constructed-response items   30%
Analyses of Test                       15%
Exercises                              15%

The rating system is as follows:

Rating    Grade
97-100    A+
93-96     A
90-92     A-
87-89     B+
83-86     B
80-82     B-
77-79     C+
73-76     C
70-72     C-
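
As an illustration of the arithmetic only, the following Python sketch combines hypothetical component scores with the weights above and maps the weighted sum onto the rating bands.

    # Illustrative only: combining the component weights with hypothetical scores.
    weights = {
        "Test with response-choice items": 0.40,
        "Test with constructed-response items": 0.30,
        "Analyses of Test": 0.15,
        "Exercises": 0.15,
    }
    scores = {  # hypothetical scores on a 0-100 scale
        "Test with response-choice items": 92,
        "Test with constructed-response items": 88,
        "Analyses of Test": 95,
        "Exercises": 90,
    }

    final = sum(weights[k] * scores[k] for k in weights)
    # 0.40*92 + 0.30*88 + 0.15*95 + 0.15*90 = 90.95

    # Lower bound of each rating band, from the table above.
    bands = [(97, "A+"), (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
             (80, "B-"), (77, "C+"), (73, "C"), (70, "C-")]
    letter = next((grade for cutoff, grade in bands if final >= cutoff), "below C-")
    print(f"{final:.2f} -> {letter}")  # 90.95 -> A-
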
Required Texts

Nitko, A. J. (2004). Educational assessment of students (4th ed.). Upper Saddle River, NJ: Pearson.

Marzano, R. J., Pickering, D., & McTighe, J. (1993). Assessing student outcomes: Performance assessment using the Dimensions of Learning model. Alexandria, VA: Association for Supervision and Curriculum Development.

Two websites of interest:

http://www.parcconline.org/parcc-model-content-frameworks

http://www.corestandards.org/the-standards

Additional Readings for the 1st half of the class (subject to change)

Abedi, J. (2006). Language issues in item development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.

AERA, APA, & NCME. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. (Selected chapters: Validity; Reliability; Test Development; Educational Tests and Assessments.) The Standards can be purchased from APA (www.apa.org/science/standards.html).

Bejar, I.I. (2010). Application of evidence-centered assessment design to the advanced placement redesign. Applied Measurement in Education, 23(4), 378-391.

Bloom, B. S. (1987). Taxonomy of educational objectives. New York: Longman. (Resource)

Bloom, B. S., Madaus, G. F., & Hastings, J. T. (1981). Evaluation to improve learning. New York: McGraw-Hill. (Resource)

Downing, S. M., & Haladyna, T. M. (1997). Test item development: Validity evidence from quality assurance procedures. Applied Measurement in Education, 10, 61-82.

Downing, S. M. (2006). Twelve steps for effective test development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.

Ewing, M., Packman, S., Hamen, C., & Thurber, A.C. (2010). Representing targets of measurement within evidence-centered design. Applied Measurement in Education, 23(4), 325-341.

Haladyna, T. M., & Downing, S. M. (1989). A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2, 37-50.

Hambleton, R.K., & Rodgers, H. J. (1995). Developing an item bias review form. ERIC/AE How-to Series.

Hendrickson, A., Huff, K., & Luecht, R. (2010). Claims, evidence and achievement level descriptors as a foundation for item design and test specifications. Applied Measurement in Education, 23(4), 358-377.

Huff, K., Steinberg, L., & Matts, T. (2010). The promise and challenges of implementing evidence-centered design in large-scale assessment. Applied Measurement in Education, 23(4), 310-324.

Joint Committee on Testing Practices. (1988). Code of fair testing practices in education. Washington, DC: National Council on Measurement in Education. (Resource: Appendix in Nitko).

Krathwohl, D. R. (2002). A revision of Bloom's taxonomy: An overview. Theory Into Practice, 41(4), 212-218.

Linn, R. L. (2006). The Standards for Educational and Psychological Testing. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.

Livingston, S. (2006). Item analysis. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.

Marzano, R.J., Brandt, R.S., Hughes, C.S., Jones, B.F., Presseisen, B. Z., Rankin, S.C., & Suhor, C. (1988). Dimensions of thinking: A framework for curriculum and instruction. Alexandria, VA: Association for Supervision and Curriculum Development.

Mislevy, R. J., & Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educational Measurement: Issues and Practice, 25(4), 6-20.

Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.

Schmeiser, C. B., & Welch, C. J. (2006). Test development. In R. L. Brennan (Ed.), Educational measurement (pp. 307-354). Westport, CT: American Council on Education/Praeger.

Zieky, M. (2006). Fairness reviews in assessment. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum.


Additional Readings for the 2nd half of the class (subject to change)

Baron, J. B. (1991). Strategies for the development of effective performance exercises. Applied Measurement in Education, 4, 305-318.

Crocker, L. (1997). Assessing content representativeness of performance assessment exercises. Applied Measurement in Education, 10, 83-95.

Dunbar, S. B., Koretz, D. M., & Hoover, H. D. (1991). Quality control in the development and use of performance assessments. Applied Measurement in Education, 4(4), 289-304.

Lane, S. (1993). The conceptual framework for the development of a performance assessment. Educational Measurement: Issues and Practice, 12(2), 16-23.

Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measurement. Westport, CT: American Council on Education/Praeger.

Linn, R. L. (1994). Performance assessment: Policy, promises, and technical measurement standards. Educational Researcher, 23, 4-14.

Linn, R. L., Baker, E. L., & Dunbar, S. B. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher, 20, 15-21.

Quellmalz, E. S. (1991). Developing criteria for performance assessments: The missing link. Applied Measurement in Education, 4, 319-331.

Shavelson, R. J., & Baxter, G. P. (1991). Performance assessment in science. Applied Measurement in Education, 4, 347-362.


SYLLABUS

(Readings subject to change)

1/7   Introduction
      Readings: Downing (2006); Linn (2006)

1/14  Test Content and Performance Targets; Evidence-Centered Design (ECD)
      Readings: Schmeiser & Welch (2006), pp. 307-314; Nitko (2004), Ch. 2; Mislevy et al. (2006); Huff et al. (2010)

1/21  Taxonomies of Performance Targets; ECD
      Readings: Nitko (2004), Ch. 2, App. D and E; Bloom (1987); Bloom et al. (1981); Krathwohl (2002); Ewing et al. (2010); Hendrickson et al. (2010)

1/28  Test Development, Test Specifications, ECD
      Readings: Nitko (2004), Ch. 6, App. F; Schmeiser & Welch (2006), pp. 313-323; AERA/APA/NCME Test Standards, Ch. 1 and 3; Bejar (2010); Brennan et al. (2010)

2/4   Multiple-Choice and Matching Items; Fairness and Content Review
      Readings: Nitko (2004), Ch. 8 and 10; Haladyna & Downing (1989); Hambleton & Rodgers (1995); Schmeiser & Welch (2006), pp. 324-337; Zieky (2006); Abedi (2006)

2/11  Multiple-Choice and Matching Items (cont.)

2/4-2/18  Individual meetings with the instructor regarding the test

2/25  Multiple-Choice Test Due; Item and Test Analysis
      Readings: Nitko (2004), Ch. 14; Livingston (2006); Schmeiser & Welch (2006), pp. 338-351; Downing & Haladyna (1997); AERA/APA/NCME Test Standards, Ch. 2

3/4   Extended Constructed-Response Items, Performance Tasks, Scoring Rubrics; ECD
      Readings: Nitko (2004), Ch. 9, 11, and 12; Lane (1993); Marzano et al. (1988); Marzano et al. (1993); Quellmalz (1991); Shavelson & Baxter (1991); Baron (1991); Linn (1994)

3/11  Multiple-Choice Test Analyses Due

      Spring Recess

3/18  Extended Constructed-Response Items, Performance Tasks, Scoring Rubrics (cont.)
      Readings: Lane & Stone (2006)

3/25  Validation Criteria
      Readings: Linn et al. (1991); Dunbar et al. (1991); Mehrens (1992); Crocker (1997); AERA/APA/NCME Test Standards, Ch. 1

4/1   Student Presentations

4/8   AERA/NCME Annual Meeting

4/15  Student Presentations

4/22  Performance Assessment Project Due


Multiple Choice Test Assignment – Due Tuesday, February 25

Purpose. The purpose of this project is to assess your ability to apply the knowledge and skills gained from the lectures and readings to the development of a multiple-choice examination whose quality is as close as possible to professional standards. Because the project must be completed within approximately two months, you will only be able to simulate the professional test development process. (Professionally developed standardized tests often require one to three years to develop from conception to final product.)

Product expectations. This project requires you to (1) develop a 30-item multiple-choice achievement test and (2) write a technical manual describing the test’s purpose and intended use, process of development, administration procedures, and validity. The final project should be typed, double-spaced, in APA format, and assembled in a three-ring binder.

Test content and coverage. The test content may assess any subject matter you choose as long as the subject is part of a formal curriculum in an academic program. For this project, you may not assess attitudes, personality characteristics, or aptitudes. You should develop the test to assess what students would learn in the equivalent of about one unit of instruction. (That is, material that is taught over 4 to 8 weeks.) This will allow you to examine some learning targets in relative depth. You are expected to develop at least one item for every level of the cognitive taxonomy you choose to use in your test plan.

Students to test. The students to whom the test will be administered should be of junior high school age or older. (You may develop a test for adult learners.) Multiple-choice items may not work well with younger students, and you may not be able to develop some higher-order thinking items for them.

Test plan. Nitko's (2004) figure, Example of a blueprint for summative assessment, should be used as a guide for the format and development of the test plan. However, you may use any taxonomy for organizing the plan; the taxonomy you choose should be the one best suited to the test you are developing and must be approved by the instructor.
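
For illustration only (this is a hypothetical layout, not Nitko's actual figure), a blueprint crosses the content areas of the unit with the levels of the chosen taxonomy and records how many of the 30 items fall in each cell. The Python sketch below shows the idea with made-up content areas and a three-level taxonomy.

    # Hypothetical blueprint: content areas x taxonomy levels -> item counts.
    # The content areas and levels below are made up for illustration.
    blueprint = {
        "Fractions":          {"Remember": 3, "Apply": 3, "Analyze": 2},
        "Decimals":           {"Remember": 3, "Apply": 3, "Analyze": 2},
        "Percent":            {"Remember": 2, "Apply": 3, "Analyze": 2},
        "Ratio & proportion": {"Remember": 2, "Apply": 3, "Analyze": 2},
    }

    # The cell counts must account for every item on the 30-item test.
    total = sum(sum(cells.values()) for cells in blueprint.values())
    assert total == 30, f"blueprint specifies {total} items, not 30"

    # Row totals (emphasis per content area) and column totals (per level).
    for area, cells in blueprint.items():
        print(f"{area:<20} {sum(cells.values()):>2} items")
    for level in ("Remember", "Apply", "Analyze"):
        count = sum(cells[level] for cells in blueprint.values())
        print(f"{level:<20} {count:>2} items")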

Teacher participation. You are expected to work cooperatively with a teacher (unless you are the teacher teaching the students) to help you review the learning targets you plan to assess, the test plan itself, and the test items.

Format and structure of the manual. Attached to this handout is an outline of the topics that should be included in the technical manual you develop for the test. You may use a different outline if you think it will improve the manual; however, all of the points in the attached outline must be included in the technical manual. Unlike expressive writing, technical writing conforms to formal, structured expectations. You should be familiar with these expectations from the journal articles and technical manuals you have read in this course and its prerequisites, and you are expected to follow these conventions, including the APA reference citation style.

Time schedule and project management. The specific assignments should coincide with the following schedule. Experience has shown that if you do not follow a schedule like this one, the project will overwhelm you; it cannot be managed like a term paper. One suggestion is to draft sections of the test manual as you proceed with each step of the process rather than waiting until every step has been completed.

1/7 Begin developing plans for the test today. Identify appropriate source material and write the test plan.

2/4-2/18 Individual appointments with instructor to review the planned test and a few sample items.

2/18-2/24 Conduct content and fairness reviews of the preliminary items, revise the items, write the content and fairness review section of the technical manual, and prepare the test.

2/25 The test plan and a preliminary version of the 30-item test should be completed.

Meeting with the instructor. You are encouraged to meet with the instructor at any time to obtain advice on how to proceed with the project or to clarify the task. If you cannot come to campus, you may contact the instructor by phone or e-mail to discuss the project. One meeting is mandatory and is scheduled between 2/4 and 2/18. At that time, you should be prepared to show your progress to date and to ask any questions you have. The purpose of this meeting is to make sure that you are on the right path to success and to offer suggestions for improvement.

Evaluation. The grade you receive on this project will count for 40% of your final grade for the course. Your project will be assessed on the following learning targets.

CONTENT LEARNING TARGET

1.   Your ability to apply the principles of test development, including item writing, item revision, and validation argument.

COMPLEX REASONING, CREATIVITY, AND CRITICAL THINKING LEARNING TARGETS

1.  The quality of the reasoning and evidence you present concerning each item's appropriateness and the actions taken to revise it.

2.  Your continued use of appropriate procedures and processes to improve your items and the total test until they meet the high standards described in the readings, lectures, or other appropriate test development resources.

INFORMATION PROCESSING LEARNING TARGET

1.  Your effective use of a variety of information-gathering techniques and resources when developing and improving your items and the test.

HABITS OF MIND LEARNING TARGET

1.  The extent to which the items you developed are original, assessing the learning targets in ways beyond the standard conventions.

EFFECTIVE COMMUNICATION LEARNING TARGET

1. The extent to which your technical manual and test materials are well organized and express your ideas clearly.


Suggested Table of Contents for Multiple-Choice Test Project

The final project should be typed, double-spaced, and assembled in a three-ring binder.

Test Plan and General Considerations

Name of test.

Statement of purpose and intended use(s) of test, including claims about student performance.

Description of the characteristics of the persons for whom the test is intended.

Content description including the domains being measured.

Description of the learning conditions (e.g., instruction, prerequisite knowledge, ...) that the proposed examinees should have experienced.