By: Steve Patterson
Submitted: 2010-09-09 14:19:48 | Word Count: 942
Baseball is known as the national pastime of the United States, but teacher evaluation beats it hands down. Everybody does it–some with a vengeance, others with the casual disregard that physical and emotional distance afford. Most enthusiasts grow up with the game, playing a sandlot version as they go through school. Indeed, familiarity with the job of teaching and the widespread practice of judging teachers have shaped the history of teacher evaluation.
History of Teacher Evaluation
Donald Medley, Homer Coker, and Robert Soar (1984) describe succinctly the modern history of formal teacher evaluation–that period from the turn of the twentieth century to about 1980. This history might be divided into three overlapping periods: (1) The Search for Great Teachers; (2) Inferring Teacher Quality from Student Learning; and (3) Examining Teaching Performance. At the beginning of the twenty-first century, teacher evaluation appears to be entering a new phase of disequilibrium; that is, a transition to a period of Evaluating Teaching as Professional Behavior.
The Search for Great Teachers began in earnest in 1896 with the report of a study conducted by H.E. Kratz. Kratz asked 2,411 students from the second through the eighth grades in Sioux City, Iowa, to describe the characteristics of their best teachers. Kratz thought that by making desirable characteristics explicit he could establish a benchmark against which all teachers might be judged. Some 87 percent of those young Iowans mentioned "helpfulness" as the most important teacher characteristic. But a stunning 58 percent mentioned "personal appearance" as the next most influential factor.
Arvil Barr's 1948 compendium of research on teaching competence noted that supervisors' ratings of teachers were the metric of choice. A few researchers, however, examined average gains in student achievement for the purpose of Inferring Teacher Quality from Student Learning. They assumed, for good reason, that supervisors' opinions of teachers revealed little or nothing about student learning. Indeed, according to Medley and his colleagues, these early findings were "most discouraging." The average correlation between teacher characteristics and student learning, as measured most often by achievement tests, was zero. Some characteristics related positively to student achievement gains in one study and negatively in another study. Most showed no relation at all. Simeon J. Domas and David Tiedeman (1950) reviewed more than 1,000 studies of teacher characteristics, defined in nearly every way imaginable, and found no clear direction for evaluators. Jacob Getzels and Philip Jackson (1963) called once and for all for an end to research and evaluation aimed at linking teacher characteristics to student learning, arguing it was an idea without merit.
Medley and his colleagues note several reasons for the failure of early efforts to judge teachers by student outcomes. First, student achievement varied, and relying on average measures of achievement masked differences. Second, researchers failed to control for the regression effect in student achievement–extreme high and low scores tend to regress toward the mean when a test is administered a second time. Third, achievement tests were, for a variety of reasons, poor measures of student success. Perhaps most important, as the researchers who ushered in the period of Examining Teaching Performance were to suggest, these early approaches were conceptually inadequate, and even misleading. Student learning as measured by standardized achievement tests simply did not depend on a teacher's education, intelligence, gender, age, personality, attitudes, or any other personal attribute. What mattered was how teachers behaved when they were in classrooms.
The period of Examining Teaching Performance abandoned efforts to identify desirable teacher characteristics and concentrated instead on identifying effective teaching behaviors; that is, those behaviors that were linked to student learning. The tack was to describe teaching behaviors clearly and precisely and relate them to student learning–as measured most often by standardized achievement test scores. In rare instances, researchers conducted experiments for the purpose of arguing that certain teaching behaviors actually caused student learning. Like Kratz a century earlier, these investigators assumed that "principles of effective teaching" would serve as new and improved benchmarks for guiding both the evaluation and education of teachers. Jere Brophy and Thomas Good produced the most conceptually elaborate and useful description of this work in 1986, while Marjorie Powell and Joseph Beard's extensive 1984 bibliography of research conducted from 1965 to 1980 is a useful reference.
Goals of Teacher Evaluation
Although there are multiple goals of teacher evaluation, they are perhaps most often described as either formative or summative in nature. Formative evaluation consists of evaluation practices meant to shape, form, or improve teachers' performances. Clinical supervisors observe teachers, collect data on teaching behavior, organize these data, and share the results in conferences with the teachers observed. The supervisors' intent is to help teachers improve their practice. In contrast, summative evaluation, as the term implies, has as its aim the development and use of data to inform summary judgments of teachers. A principal observes teachers in action, works with them on committees, examines their students' work, talks with parents, and the like. These actions, aimed at least in part at obtaining evaluative information about teachers' work, inform the principal's decision to recommend either continuing a teacher's contract or terminating the teacher's employment. Decisions about initial licensure, hiring, promoting, rewarding, and terminating are examples of the class of summative evaluation decisions.
The goals of summative and formative evaluation may not be so different as they appear at first glance. If an evaluator is examining teachers collectively in a school system, some summary judgments of individuals might be considered formative in terms of improving the teaching staff as a whole. For instance, the summative decision to add a single strong teacher to a group of other strong teachers results in improving the capacity and value of the whole staff.