The Unintended Impact of Evaluation


Numbers and rubrics can turn the focus away from instructional improvement.
By Simon Rodberg
Principal, January/February 2020. Volume 99, Number 3.

Before I became an assistant principal in Washington, D.C., public schools, I worked in the system’s central office. I was a bureaucrat, but I was also at ground zero of education reformers’ overhaul of teacher evaluation.

This was the height of District of Columbia Public Schools (DCPS) Chancellor Michelle Rhee’s national fame. She appeared on the cover of TIME magazine with a broom and the implication that she was going to clean house—bad teachers, beware! I took part in the implementation of a new teacher evaluation system, IMPACT, and helped lead its redesign after the first year.

We used multiple, rigorous observations and value-added calculations to give every teacher in the district a numerical score that decided whether they kept their job, got a bonus, or got fired. The first year under IMPACT, we fired more than 200 teachers. We were going to change the education world through evaluation.

After a year and a half as a bureaucrat, I missed the daily energy of schools and became an assistant principal at a DCPS middle school. No longer working behind the scenes on teacher evaluation, I was instead in an actual school, observing and scoring actual teachers. And what I found was that evaluation wasn’t going to change education. Inside the school, evaluation was a distraction from the real work of teaching and learning.

Keeping Score

This wasn’t the way it was supposed to be. Evaluation was supposed to help principals become instructional leaders. “We hoped that improving evaluation systems would lead to a cascade of positive organizational changes inside school agencies,” Thomas Kane, a Harvard professor who was involved with IMPACT in its early days, said. “For example, new formal observation rubrics could provide teachers and supervisors with a common vocabulary for discussing instruction, which is a necessary ingredient for collective improvement.”

But the discussions following observations were usually about scores, not instruction. The numbers we used to rate teachers seemed impossible to ignore. The simplicity of those single-digit scores (a 3 or a 4 for checking for understanding, a 2 or a 3 for building classroom community) made the scores themselves, rather than the improvement of student learning, the focal point of teacher-supervisor discussion.

Big numbers can be hard to comprehend, and long sequences of numbers make people’s eyes glaze over. But simple numbers carry huge weight, and putting a number on a person, or on a teacher’s practice, became definitive. These numbers had implications for people’s job security, but that wasn’t why they held our attention; even when the overall score was a 3.2 rather than a 3.3, the conversation focused on the difference. It was the rating itself that mattered: the fact of it. We couldn’t talk about pedagogy. The number took up all of the conversational room.

We needed to give teachers numbers because the school system required them. And the school system required them because it was, after all, a system demanding comparability—more than 100 schools, with 5,000 teachers. But that’s not what the schools themselves needed.

Freedom of Evaluation

When I became the principal of a startup charter school, I went in the opposite direction: no numbers, no ratings, no multiple formal observations. Just a once-a-year written summary of performance, with a key message: “You’re great, and work on these things,” or “Improve in these ways if you don’t want to be fired.”

As a charter school, we had the freedom to do what we felt was right with teacher evaluation, starting from questions of core purpose. In almost every case, the purpose of formal evaluation is to tell employees their status. It’s the center of a key decision: to retain the teacher or not. I didn’t need a complex numerical system to decide that; these weren’t easy decisions, but complicated math or an in-depth formal rubric wasn’t going to help!

We also didn’t need a rubric to talk about instruction. In fact, tying a rubric to a numerical evaluation means that conversations aren’t focused on instruction; they are about the numbers. The midyear evaluation didn’t replace conversations about instruction; those happened throughout the year. Replacing a more complex evaluation with a simple one made room for valuable instructional conversation and feedback.

The more intense the rubric and evaluation process, the greater the focus on that process. Teachers end up concentrating on their own numbers, and that’s not the same as focusing on teaching and learning.

Not everyone is in a startup charter school, of course, so if you have a district-mandated, formal evaluation process, watch out for its unintended consequences. The main power of such a process is to fire people who shouldn’t be teaching, but it also has the power to undermine improvement conversations for those who should.

Simon Rodberg is a writer and consultant on public education in Washington, D.C. He was the founding principal of DC International and an assistant principal in DC Public Schools.


Copyright © National Association of Elementary School Principals. No part of the articles in NAESP magazines, newsletters, or website may be reproduced in any medium without the permission of the National Association of Elementary School Principals.
