Assessing Results That Matter: Quality Criteria for Alternative Assessments in the Adult

DRAFT: FOR DISCUSSION PURPOSES ONLY

Assessing Results that Matter:

Equipped for the Future’s Approach to

Assessment for Adult Basic Education

Accountability and Improvement

Regie Stites

EFF Assessment Consortium

April 2003

Preface

[Maybe inside front cover]

Over the past three years, the Equipped for the Future (EFF) adult basic education system reform initiative has researched and developed an approach to testing and assessment for the purpose of improving practices and results in the adult basic education and literacy system. The EFF approach to assessment encompasses the variety of purposes for which testing and assessment are used within the system. The Equipped for the Future Assessment Report, authored by Sri Ananda (2000), lays out guidelines for quality in low-stakes, classroom uses of assessment within the EFF Framework. Another EFF publication, Results That Matter (Bingman and Stein, 2001), describes quality criteria for uses of assessment within a model of program improvement using EFF. In our ongoing research and development work on the EFF Assessment Framework, we have identified and are applying quality criteria for the development of tests and assessments for high-stakes accountability and systemic improvement (see Stein, 2000 and inset, Guiding Principles). In each case, a common model of standards-based educational reform and improvement guides us. Good assessment tools are critical to the effective operation of this educational improvement process. Achieving educational improvement also requires instructors and educational policy makers to make good use of assessments and assessment results. This policy brief explains the EFF approach to developing good assessment tools and to ensuring that these tools are put to good use in improving the adult basic education and literacy system.

[insert text box here]

[Guiding Principles: What We Want Our Assessment Framework to Include

The EFF Assessment Framework must address multiple purposes for assessment. The framework must provide for:

Information on learner achievements and mastery that is useful to the learner as well as the teacher throughout the instructional process;
Information about what learners can do that is credible to employers, educational institutions, and policymakers, as well as to learners themselves; and
Information that is useful for program and system improvement and accountability.

To address these multiple purposes, the EFF Assessment Framework must support a multidimensional, flexible, and systemic approach to assessment. Teachers and programs will be able to choose from a range of tools – to be identified or developed – that enable them to accurately measure performance against EFF standards and that are linked to one another, so that multiple assessments can provide a rich portrait of learner competence.
The EFF Assessment Framework must address learning over a lifetime. Strategies for assessment and credentialing must take into account the fact that adults build skills over time (rather than all at once), in response to changes in their life situations. Certificates and other credentials must be modular, designed to define competence or mastery at a particular point, and within a framework that assumes continuing development of competence as skills, knowledge, and understanding are further developed over time.
Since EFF Standards define skills all adults need in order to carry out their roles as workers and members of families and communities, the EFF Assessment Framework must address a single continuum of performance for all adults – including those with only minimal formal education and those with many years of formal education, including advanced degrees.
Each level defined in the EFF Assessment Framework must communicate clearly what an adult at that level can do. Numerical levels don’t communicate meaning to external audiences. Grade levels seem to communicate a common picture of performance, but in fact the meaning behind the label varies widely from community to community and state to state. Grade levels are particularly misleading when applied to adult performance, since they focus on developmental skill levels that don’t match the ways in which adults, with their broader background and range of experience, can combine skills and knowledge to perform effectively in daily life.
The levels defined in the EFF Assessment Framework must be explicitly linked to key external measures of competence (e.g., certificates of mastery, NAAL/IAL survey levels, diplomas, and other credentials) and key pathways (e.g., entry to higher education and entry to employment as defined by occupational skill standards) so that adults and systems can rely on them as accurate predictors of real-world performance.
The levels defined in the EFF Assessment Framework must be the products of a national consensus-building process that assures portability of certificates and credentials.
Work on the development of this framework must maintain the strong customer focus that has distinguished the EFF Standards development process to date. It must be based on a broad, inclusive definition of maximizing accountability for all activities to all customers – starting with the adult learner.]

Assessing Results that Matter

Background

In any educational system, the quality and content of instruction are strongly influenced by the quality and content of the tests (or assessment and reporting systems) used for accountability. The reason is simple. Using any test for educational accountability will inevitably lead to some degree of teaching to that test. Teaching to the test is not itself a problem. It becomes a problem only when the test we teach to does not measure (or measures only a very small part of) the content and skills that are important for teachers to teach and for learners to learn. Within a standards-based educational system, it is possible to align tests (for accountability and for a variety of other purposes) with important knowledge, skills, and abilities that have been clearly defined, broadly agreed upon, and widely communicated in the form of content standards. In such a system, we can measure and hold adult education programs accountable for achieving results that matter.

Assessment and accountability policies for adult basic education are now at a crossroads. The test instruments and methods for assessing and reporting adult student learning outcomes required for federal adult education accountability (as detailed in the National Reporting System Guidelines, see DAEL, 2001) are poorly aligned with adult learner and adult education program goals and learning objectives. Though some states have adopted the Equipped for the Future (EFF) Standards or have developed their own content standards for adult basic education, the alignment between standards (where they exist) and accountability assessments has been incomplete at best. This paper describes the EFF approach to aligning curriculum, assessment, accountability, teacher professional development, and improved teaching practices within a standards-based educational improvement system.

Equipped for the Future (EFF) is a standards-based reform initiative aimed at improving the quality of the adult basic education and literacy system and building the capacity of that system to more effectively assist adults in accomplishing their goals in life. The EFF approach to facilitating program change and improvement includes three main system reform tools: the EFF Content Framework, the EFF Assessment Framework, and supports for implementation of EFF. The EFF Content Framework defines the content domain for the adult learning system in terms of Purposes for Learning, Role Maps, Common Activities, and the EFF Content Standards. The EFF Assessment Framework defines levels of performance and measures of performance on the EFF Standards for a variety of purposes. Available resources for supporting implementation of EFF include a national network of certified trainers, materials and products to support EFF adoption and use, and customized training and technical assistance.

Because the EFF Content Standards focus on what adults need to know and be able to do to accomplish complex purposes in their lives, assessing performance on the standards requires that we measure not just what adults know but how well they can use what they know to accomplish tasks related to their real life purposes as workers, parents and family members, and citizens and community members. Drawing on cognitive science research on the development of expertise, we have worked with practitioners over the past three years to refine an approach to performance assessment that enables us to document and assess learner performance for purposes of teaching and learning as well as for reporting and credentialing. This policy brief is designed to share with policymakers and practitioners what we have learned through this assessment development work in Maine, Ohio, Oregon, Tennessee and Washington about the role of assessment and, more particularly, performance assessments, in a comprehensive assessment system focused on defining and measuring results that matter. This brief focuses on two main topics:

why assessments aligned with content standards are critical within a comprehensive assessment system, not only for monitoring but also for improving the quality of the adult basic education and literacy system, and
what we have learned about quality criteria for assessment through our work on developing the EFF Assessment Framework.

How assessments aligned with standards can improve the quality of adult education

[insert text below as a sidebar]

[It became clear to us, as we explored the practical implications of Title I assessment and accountability, that the construction of assessment and accountability systems cannot be isolated from their purposes, which are to improve the quality of instruction and ultimately the learning of students. So we were inevitably drawn into the relationship between assessment and accountability issues and issues of large-scale improvement in teaching and learning.

- from Testing, Teaching, and Learning: A Guide for States and School Districts (National Research Council, 1999, p. vii) ]

Much recent educational policy is based on the assumption that high standards coupled with student testing are an effective way to monitor and improve the quality of educational programs. However, the conditions under which a system of standards linked to high-stakes student testing actually works to enhance educational practice and results have not always been well understood or clearly described. In 1997, the National Research Council's Board on Testing and Assessment created a Committee on Title I Testing and Assessment.[1] The Committee's purpose was to review available research on how the theory of standards-based educational reform had played out in practice and under what conditions such reforms had been effective in leading to improved teaching and learning. As a result of its review, the Committee concluded that an expanded model of standards-based reform was needed to make more explicit the links between standards, assessments, accountability, instruction, and learning in an “education improvement system” (see Figure 1 below).

Figure 1: Expanded model of the theory of action of standards-based reform: An educational improvement system (National Research Council, 1999, p. 20)

This model of educational improvement starts with alignment of standards, assessment, and accountability requirements. For the model to work in practice, both information and responsibility must be distributed throughout the system. Information about what students are expected to know and be able to do (standards), information about how this knowledge and ability will be measured (assessments), and information about how the results of such measures will be used (for accountability and to guide improvement) must be available to everyone – students, teachers, policymakers, and the public. Likewise, responsibility for making use of information to improve educational quality must be shared throughout the system. This model does not assume that the alignment of standards, assessment, and accountability will necessarily lead to higher levels of learning. It is not enough to clarify expectations for achievement – in the form of standards and assessments aligned with standards – and to motivate teachers to work harder by coupling rewards and punishments to test results. A comprehensive educational improvement system must also provide educators (and everyone else) with both quality information about the kinds of educational practices that result in higher levels of student learning and opportunities for teachers to acquire the knowledge, skills, and abilities they need to implement such practices (NRC, 1999, pp. 20-21).

The EFF approach to educational improvement for the adult education and literacy system is consistent with this model of an educational improvement system. It is based upon the construction of a common framework that links standards, assessment, accountability, teacher professional development, and improved adult learning in the ways illustrated above.

Assessments aligned with the skills defined in the EFF Standards play an important role in the Equipped for the Future (EFF) standards-based educational reform and improvement model. In the adult education and literacy system, the results of learning that matter most are applications of knowledge and skills in real-life situations (at home, in the community, and on the job). Therefore, in evaluating the quality of adult education programs and in providing feedback that will help those programs improve, a critical question to ask is: how well does learning in adult education programs transfer to improved performance in key adult roles (parenting, citizenship, and work)? To answer this question, we need a comprehensive assessment system that includes a variety of tests and assessments.

Alternative assessments will be an important part of the mix in this kind of comprehensive assessment system because performance by students on alternative assessments closely mirrors performance in the life contexts to which we hope learning will transfer. For this reason, they can provide the information that will support continuous improvement of the adult education and literacy system.

To understand more about what alternative assessments have to offer and how they can increase the quality of information on student and program performance within an educational improvement system, we need to look at some of the features that set alternative assessments apart from traditional testing. First, we should be clear about what we mean by alternative assessments [see sidebar]. It is commonly understood that what makes an assessment alternative is the fact that it is something other than a standardized multiple-choice test. The terms performance assessment, performance-based assessment, authentic assessment, and portfolio assessment are also used to describe testing practices that differ in important ways from traditional tests. In fact, all types of tests (whether formatted as a set of multiple-choice items or as a written or oral performance) are ways of observing and evaluating student performance (see sidebar).

[insert text below as sidebar]

[“Every test, regardless of its format, measures test-taker performance in a specified domain. Performance assessments, however, attempt to emulate the context or conditions in which the intended knowledge or skills are actually applied. … The execution of the tasks posed in these tests often involves relatively extended time periods, ranging from a few minutes to a class period or more to several hours or days. Examples of such performances might include solving problems using manipulable materials, making complex inferences after collecting information, or explaining orally or in writing the rationale for a particular course of government action under given economic conditions.”

Standards for Educational and Psychological Testing (AERA/APA/NCME, 1999, pp. 137-138)]

Among the features of alternative assessments that set them apart from traditional testing and make them attractive in a comprehensive assessment system are the following:

Alternative assessment tasks often take place over an extended period of time, are challenging for students, and can involve creativity, strategic thinking, and problem-solving.

Alternative assessment tasks may take place in a real-world context or simulate such a context and thus allow students to see the connections between what they are learning and real-world applications of learning.

Alternative assessment tasks are closely integrated with instructional activities, and students learn from the assessment experience.

Students know in advance how their performance on the task will be evaluated and can self-assess and monitor their own performance both within and beyond the task.[2]

These features of alternative assessments make them particularly well suited for use in adult education classrooms for purposes of guiding teaching and learning. Because these assessments provide opportunities to measure learning results that are directly connected to improvements in students’ lives, students can see the transfer between what they are learning in class and what they need to do in real life. Recent studies of adult learner persistence suggest that adult students in the basic education and literacy system are more likely to be motivated to achieve to the full extent of their ability on a test they perceive to be closely related to their learning and life goals than on a test that is isolated from instruction and seems disconnected from the situations and challenges they face in their daily lives.

EFF quality criteria for assessments

A standards-based educational improvement system cannot function in the absence of high quality student testing. In the Equipped for the Future Assessment Report: How Instructors Can Support Adult Learners through Performance-Based Assessment, Sri Ananda (2000) identifies a number of key characteristics of performance-based assessments used for instructional purposes: