By: Stuart Kahl
With a few exceptions, states have not taken advantage of the opportunities afforded by ESSA to implement innovative assessment programs. I see two primary, seemingly conflicting reasons to innovate: to lessen the burden state accountability assessments place on educators and to measure deeper learning, a construct sorely neglected by current programs.
Recent efforts have not approached the scope of the innovative programs of the 1990s in states such as Maryland, Vermont, and Kentucky. At that time, we had neither the tools nor the experience we have today to make a go of truly “authentic” assessments, as they were called then. Now there are new barriers to their implementation – the demands of 21st-century legislation, beginning with No Child Left Behind (NCLB). Because of the requirements for a significantly increased amount of testing and for quick turnaround of results, efficiency ruled the day, and the multiple-choice format, with its negative impacts on instruction and local assessment, dominated again.
President Obama’s Race to the Top program was intended to remedy this situation through innovative, next-generation accountability assessments developed by the Smarter Balanced and PARCC state consortia. Unfortunately, both consortia backed off their initial plans for significant performance components, producing tests instead that were much like some previous state tests but twice as long. The only thing “next-generation” about them was that they were administered online for most students. Online testing, and the quick turnaround of results associated with it, has led to continued emphasis on efficiency in scoring and to tests that do not do justice to the measurement of deeper learning.
Also, the higher stakes that the test results carried for local educators likely drove the much-increased use of interim assessments in preparation for the end-of-year accountability tests. This, in turn, led to widespread concern about overtesting and to more negative perceptions of standardized testing than ever.
The Every Student Succeeds Act (ESSA), the long overdue reauthorization of ESEA, represented another attempt to improve state accountability assessments. In addition to returning some control to the states regarding how to use the accountability results, ESSA also encouraged innovation in two ways. First, the law included a provision for innovative assessment pilots. States could apply for “demonstration authority” to conduct pilot studies of new approaches. However, there was no additional funding for participating states, and the requirements for proving technical quality and for reporting were onerous. The pilot option did offer the benefit that students participating in the pilots could be exempted from the regular state testing, but the states had to demonstrate that the pilot results were comparable to results obtained from the regular testing. Only two states applied for demonstration authority initially, and one of them had already been conducting its pilot under the previous law, NCLB. As of this writing, four more states are considering applying in the second round of this ESSA program.
A state does not have to apply for permission from the USDOE to conduct a pilot study; thus, it could try something innovative on its own and avoid the additional burdens associated with ESSA demonstration authority. Furthermore, a second way ESSA supported innovation was by clarifying what is allowable in other sections of the law. For example, the new law does a better job of defining multiple measures than NCLB did, indicating that states could include performance and portfolio assessments. Also, ESSA allows interim assessments to be conducted during the course of the school year, with their results combined to yield total scores that meet accountability requirements.
The option of using interim assessments, in my opinion, came about as a result of faulty logic. Concerned about the loss of instructional time to preparation for and administration of long end-of-year state tests, many local educators suggested for years that spreading the testing out over the school year would be an improvement. In my mind, the opposite is true. Using the same type of tests, but shorter ones, three times a year rather than just once would triple the burden on schools of handling secure printed material (if paper testing were still done in some schools) and of managing the student registration process for online testing. And then there is the disruption of instructional programs three times a year, and perhaps the hassle of coordinating scopes and sequences with the administration schedules for benchmark testing.
There are other arguments against different approaches to interim assessments for accountability. If educators assumed that these assessments would involve local measures, such measures would not yield the required level of comparability of results across schools in a state. If common measures across schools were used, again of the same “efficient” types as the end-of-year tests, there would still be problems. Administering several general achievement measures (like the traditional state tests) during the course of the year would produce results in which growth is overshadowed by measurement error, because of the much shorter period of instruction between the tests. The results for some individual students, either on one of these tests or on composite scores combining the interim results, would show negative growth. (The growth that matters for accountability purposes is growth over the entire school year, and negative growth for an individual student based on end-of-year testing would be highly unlikely.)
If benchmark tests covering recently taught material were used, teachers would be happiest. A common complaint of teachers about state end-of-year tests is that they cover material that was taught many months before the testing. I’m not particularly sympathetic to that concern. For purposes of program evaluation, I like to think that retention of knowledge and skills deemed important enough to include in the state content standards should matter.
I do believe ESSA has removed barriers to innovative assessment programs by allowing interim testing. However, I have not yet seen or heard of a program, or a plan for one, that takes full advantage of ESSA flexibility, benefits from the lessons of the many past attempts at innovative assessment, and yields (or would yield) an assessment of high technical quality. The use of traditional, tightly controlled, on-demand, “efficient” tests on an interim basis just doesn’t cut it, both because of the problems described above and because such tests don’t address deeper learning adequately. Why do three times a year what can be done effectively once at the end?
The ideal solution, in my mind, is a two-component program: 1) an abbreviated end-of-year efficient test and 2) curriculum-embedded performance assessments (not just short tasks) implemented a few times during the school year. The idea is that the latter, interim component addresses deeper learning – the higher-order skills that traditional efficient tests do not assess well.
Higher Order Thinking
Unfortunately, while a few states have made good attempts at innovative assessments, they have found it challenging to avoid overburdening local educators and to ensure the necessary comparability of results. Many district consortia across the country have made good efforts at innovative approaches but have not designed them to meet the federal requirements regarding comparability. For more on a two-component model that could work see: