

From Harvard Magazine


Norman Stahl <[log in to unmask]>


Open Forum for Learning Assistance Professionals <[log in to unmask]>


Fri, 13 Sep 2002 08:48:26 -0400






Testing Trap
The single largest, and possibly most destructive, federal intrusion into
America's public schools

by Richard F. Elmore

Supporters of the reauthorization, last January, of the Elementary and
Secondary Education Act hail it for tightening school accountability
substantially, for granting more flexibility to states and school districts
in the use of federal funds, and for applying sanctions to and providing
aid for failing schools. Opponents argue that the bill doesn't go far
enough, because congressional supporters of school choice failed to
persuade their colleagues and the president's advisers to include vouchers
in the bill.

Sadly, from an educational perspective, both sides miss the major issues.
This is an "accountability bill" that utterly fails to understand the
institutional realities of accountability in states, districts, and
schools. And its provisions are considerably at odds with the technical
realities of test-based accountability. In the history of federal education
policy, the disconnect between policy and practice has never been so
evident, nor so dangerous. Ironically, the conservative Republicans who
control the White House and the House of Representatives are sponsoring the
single largest, and the single most damaging, expansion of federal power over
the nation's education system.

Under the new law, the federal government mandates a single test-based
accountability system for all states, a system currently operating in fewer
than half the states. It requires annual testing at every grade level, and
states must disaggregate their test scores by students' racial and
socioeconomic backgrounds, a system currently operating in only a handful of
states, and one fraught with technical difficulties. The federal government
further mandates a single definition of adequate yearly progress, the
amount by which schools must increase their test scores in order to avoid
some sort of sanction, an issue that in the past has been decided jointly by
states and Washington. Finally, the law sets a single target date by which
all students must exceed a state-defined proficiency level, an issue that in
the past has been left almost entirely to states and localities.

Thus the federal government is now accelerating the worst trend of the
current accountability movement: that performance-based accountability has
come to mean testing alone. In the early stages of the current movement,
reformers had an expansive view of performance that included, in addition
to tests, portfolios and formal exhibitions of students' work,
student-initiated projects, and teachers' evaluations of their students.
The comparative appeal of standardized tests is easy to see: they are
relatively inexpensive to administer; can be mandated simply; can be
rapidly implemented; and deliver clear, visible results. But relying only
on standardized tests dodges the complicated questions of what tests
actually measure and of how schools and students react when tests are the
sole yardstick of performance.

If this shift in federal policy were based on the accumulated wisdom gained
from experiences with accountability in states, districts, and schools, or
if it were based on clear design principles that had some basis in
practice, it might be worth the risk. In fact, however, it is based on
little more than talk among people who know hardly anything about the
institutional realities of accountability, and even less about the problems
of improving instruction in schools.

The idea of performance-based accountability was introduced in the
mid-1980s by the National Governors Association, led by Bill Clinton, then
governor of Arkansas. It took the form of what was then called the "horse
trade": states would grant schools and districts more flexibility in making
decisions about what and how to teach, in return for more accountability
for academic performance. This idea became the central theory of today's
accountability reforms. It was appealing in principle: governors and state
legislators could take credit for improving schools without committing
themselves to serious increases in funding. From the beginning,
performance-based accountability was an explicitly political idea, designed
to bring a broad coalition together behind a single vision of reform. As
with most such ideas, it was weak on practical details, most of which were
left to state and local policymakers and educators.

The movement got a major boost in 1994, when Title I, the flagship federal
compensatory education program, was amended to require states to create
performance-based accountability systems for schools. The vision behind the
1994 amendments was that Title I would complement and accelerate the trend
that began at the state level; the amendments required states to develop
academic standards, assessments based on the standards, and progress goals
for schools and school districts, all within ambitious timetables. The
merger of state and federal accountability policies ("alignment," as it was
called) was supposed to occur by 2000. By the end of the decade, it was
difficult to find more than one or two states lacking some form of testing
program and public release of the results. In all but a few states,
however, the basic architecture of accountability remained relatively crude
and underdeveloped. In those few states where the idea had been developed
most extensively (Texas and Kentucky, for example), the systems worked well
enough, according to the testimonials of their sponsors, to legitimate the
idea that they were successful in general. But even in these states, there
were legitimate criticisms of the accountability system's actual effect on
academic performance and drop-out rates.

By the late 1990s, it was abundantly clear that the states had fallen well
short of what the crafters of the 1994 Title I amendments had envisioned.
It was also clear that the federal government possessed very little
leverage with which to force them along. States varied vastly in their
administrative capacities to implement performance-based accountability
systems. More important, creating accountability systems at the state level
is essentially a political act, and Washington's harmless knuckle-rapping
was hardly going to overcome the intransigence of a state legislature or
governor. The U.S. Department of Education's ability to monitor and enforce
compliance was limited; budget cuts whittled away at the Department's Title
I staff just as their responsibilities were increasing; and its senior
political appointees were reluctant to make life too difficult for
governors and chief state school officers, who are among their key
political constituencies. So by the target date for full compliance, fewer
than half the states had met the requirements. It came as no surprise to
learn that by the year 2000, many schools with Title I-eligible students
were simply unaware of the program's major policy shift in 1994.

This experience should have signaled to the Bush administration and
Congress that complex issues of state and local capacity could not be
brushed aside just by tightening the existing law's requirements. If more
than half the states were unable or unwilling to comply with the
requirements of the previous, less-stringent, more forgiving law, why would
one expect all the states to comply with a much more stringent and exacting
law?

Even though virtually all the states have joined the accountability
bandwagon, doing so was, for many, largely a symbolic act. The designs of
the systems are still primitive; state education officials' authority to
oversee school districts is still limited in many cases; and the political
consequences of imposing large-scale, statewide testing in areas with
strong traditions of local control are risky. Moreover, mounting a
statewide testing system is beyond the capacity of most state departments
of education. Those that have embarked on large-scale testing are stretched
to their limits just managing test-development work or monitoring testing
contractors. Finally, there are technical issues. Standardized tests
inevitably become highly politicized and, in the course of the debate, the
limits of testing are subjected to public scrutiny. Many policymakers enter
the accountability debate not knowing much about testing, and they often
discover, much to their chagrin, that off-the-shelf tests may not validly
measure the content specified in state-mandated standards and that
norm-referenced tests (tests that deliberately create a normal distribution
around a mean) may not be effective in measuring changes in performance.

The working theory behind test-based accountability seems simple, perhaps
fatally so. Students take tests that measure their academic performance in
various subject areas. The results trigger certain consequences for
students and schools: rewards, in the case of high performance, and
sanctions for poor performance. Attaching stakes to test scores is supposed
to create incentives for students and teachers to work harder and for
school and district administrators to do a better job of monitoring their
performance. If students, teachers, or schools are chronically low
performing, presumably something more must be done: students must be denied
diplomas or held back a grade; teachers or principals must be sanctioned or
dismissed; and failing schools must be fixed or simply closed. The threat
of such measures is supposed to motivate students and schools to
ever-higher levels of achievement.

In fact, this is a naïve view of what it takes to improve student learning.
Fundamentally, internal accountability must precede external
accountability. That is, school personnel must share a coherent, explicit
set of norms and expectations about what a good school looks like before
they can use signals from the outside to improve student learning. Giving
test results to an incoherent, atomized, badly run school doesn't
automatically make it a better school. A school's ability to make
improvements has to do with the beliefs and practices that people in the
organization share, not with the kind of information they receive about
their performance. Low-performing schools aren't coherent enough to respond
to external demands for accountability.

The work of turning a school around entails improving "capacity" (the
knowledge and skills of teachers), changing their command of content and how
to teach it, and helping them to understand where their students are in
their academic development. Low-performing schools, and the people who work
in them, don't know what to do. If they did, they would be doing it
already. You can't improve a school's performance, or that of any teacher
or student in it, without increasing the investment in teachers' knowledge,
pedagogical skills, and understanding of students. Test scores don't tell
us much of anything about these important domains; they provide a
composite, undifferentiated signal about students' responses to a problem.

Test-based accountability without substantial investments in internal
accountability and instructional improvement is unlikely to elicit better
performance from low-performing students and schools. Furthermore, the
increased pressure of test-based accountability alone is likely to
aggravate the existing inequalities between low-performing and
high-performing schools and students. Most high-performing schools simply
reflect the social capital of their students (they are primarily schools
with students of high socioeconomic status), rather than the internal
capacity of the schools themselves. Most low-performing schools cannot rely
on the social capital of students and families and instead must rely on
their organizational capacity. With little or no investment in capacity,
low-performing schools get worse relative to high-performing schools.

Some changes in the new law provide unrestricted money that states can use
to enhance capacity in schools, if they choose to. But neither state nor
federal policy addresses the capacity issue with anything like the
intensity applied to test-based accountability. The result is an enormous
distortion in the relationship between accountability and capacity, a
distortion that is being amplified rather than dampened by federal policy.

In today's environment, critics who suggest that there might be problems
with the ways tests are used for accountability purposes are branded
apologists for a broken system. That the performance of students and
schools can be accurately and reliably measured by test scores is almost an
article of faith. As a result, tests are being misused in ways that will
eventually undermine the credibility of performance-based accountability.

The most serious problem lies in the use of test scores to make decisions
about whether students can advance to the next grade or graduate from high
school. The American Psychological Association's guidelines for test use
(and the consensus of professional judgment in the field of educational
testing and measurement) specifically prohibit basing any consequential
judgment about an individual student on a single test score. Why? Because
test scores are associated with a significant margin of error. That margin
of error increases as the number of cases decreases; individual scores are
typically much less reliable than aggregates of many individual scores.
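
The point about sample size can be made concrete with a small calculation.
What follows is only a minimal sketch, assuming that each score carries
independent error with the same standard deviation; the function name and
the 15-point figure are hypothetical and are not taken from the article or
from any particular test.

```python
import math


def standard_error(score_sd: float, n_scores: int) -> float:
    """Standard error of the mean of n independent scores sharing one SD."""
    if n_scores < 1:
        raise ValueError("n_scores must be at least 1")
    return score_sd / math.sqrt(n_scores)


# Hypothetical score SD of 15 points, chosen only for illustration.
for n in (1, 25, 400):
    print(n, standard_error(15.0, n))
# prints:
# 1 15.0
# 25 3.0
# 400 0.75
```

Under these assumptions a single score is five times noisier than the
average of 25 scores, which is the sense in which an individual result is
less reliable than an aggregate.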

The solution is to use multiple measures of a student's performance when
making consequential decisions. But this solution is more expensive and it
introduces a new level of complexity into the system. Were high-school
graduation to be contingent on a composite of grades, test scores, and
portfolios of students' work, developing such a composite would be a
challenging technical feat. It would also introduce a certain amount of
judgment into the system, and policymakers tend to distrust the
professionals who make such judgments.

A similar problem arises at the lower-school level. Under Title I, schools
are expected to meet their adequate yearly progress goals, measured by a
school's annual gain in test scores. Title I also requires disaggregating
these scores by students' ethnic and economic backgrounds. But such
measures are highly unreliable for populations the size of a typical
elementary school, and they are particularly unreliable for even smaller
sub-groups of students. Schools are often misclassified as low- or
high-performing purely because of random variation in their test scores,
unrelated to any educational factor.
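
A rough simulation illustrates how chance alone can produce such
misclassification. This is a sketch under stated assumptions, not a model
of any actual state system: every simulated school has the same true
performance in both years, student scores are drawn from a normal
distribution, and the 50-point mean, 15-point spread, and 2-point "decline"
threshold are all hypothetical values chosen for illustration.

```python
import random


def cohort_mean(rng, n_students, true_mean=50.0, student_sd=15.0):
    """Average score of one cohort; every cohort shares the same true mean."""
    total = sum(rng.gauss(true_mean, student_sd) for _ in range(n_students))
    return total / n_students


def false_flag_rate(n_students, trials=500, threshold=-2.0, seed=0):
    """Share of simulated schools flagged as declining purely by chance.

    Each trial compares two cohorts drawn from the identical distribution,
    so any measured year-to-year change is random variation.
    """
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        year_one = cohort_mean(rng, n_students)
        year_two = cohort_mean(rng, n_students)
        if year_two - year_one < threshold:
            flagged += 1
    return flagged / trials


if __name__ == "__main__":
    for n in (20, 60, 500):
        print(n, false_flag_rate(n))
```

With these assumed numbers, a 20-student subgroup is flagged far more often
than a 500-student group, even though nothing about the simulated schools
differs.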

The standards and accountability movement is in danger of being transformed
into the testing and accountability movement. States without the human and
financial resources to select, administer, and monitor tests are now being
forced to begin testing at all grade levels. Instead of creating academic
standards that drive the design of an appropriate assessment, low-capacity
states will simply select a test based on its expense and ease of
administration, making charges of "teaching to the test" increasingly
accurate. A test with no external anchor in standards or expectations about
student learning becomes a curriculum in itself, trivializing the whole
idea of accountability.

The enthusiasm for performance-based accountability plays to the worst
weaknesses of the American education system. After World War II, most
industrialized countries nationalized their education systems, but not the
United States. Because decisions about content and performance were left to
states and localities for so long, they never developed the capacity to
monitor the quality of teaching and learning in schools, to support the
development of teachers' and administrators' knowledge and skill, or to
evolve measures of performance that are useful to educators and the public.

The difficult, uneven, and protracted slog toward clearer expectations and
supports for learning has barely begun in most states and localities. The
history of federal involvement in that long effort is mixed at best. The
current law repeats all of the strategic errors of the previous law, but
with greater federal intervention. The prognosis is not good.

The best we can hope for is that the capacity problems of states and
localities will become more visible as a political issue, triggering
responses that will help schools overcome the real obstacles they face in
improving the quality and intensity of teaching and learning. Similarly, we
can hope that the technical failures of testing will trigger a response
that focuses more on broad assessments of student learning.

The worst that can happen is that test-based accountability will widen the
gap between schools serving the well-off and those serving the poor, thus
confirming the public's suspicion that expecting high levels of learning
from all children is unrealistic. Performance-based accountability in
education is mutating into a caricature of itself.

Richard F. Elmore, Ed.D. '76, Anrig professor of educational leadership at
the Harvard Graduate School of Education, is completing a study of school
accountability. Recent publications include "Building a New Structure for
School Leadership" and "Bridging the Gap between Standards and
Achievement," both available from This article is
adapted with permission from an earlier version, titled "Unwarranted
Intrusion," which appeared in the Spring 2002 issue of Education Next,
published by the Hoover Institution, Stanford University.

Norman A. Stahl
Professor and Chair
Literacy Education
GH 223
Northern Illinois University
DeKalb, IL 60115

Phone: (815) 753-9032
FAX:   (815) 753-8563

Universities are institutions run by amateurs to train professionals.
Derek Bok----Harvard University
In examinations, the man who succeeds is not the man who can write well
about something that he knows, but the man who can write brilliantly about
something of which he knows nothing.  D.B. Jackson----the Royal Air Force
