How Can We Prepare Teachers to Embed a Virtuous Cycle of Self-Evaluation into Their Practice?

By: Steven Mumford, Ph.D., Assistant Professor, University of New Orleans
Kathryn Newcomer, Ph.D., Professor, Trachtenberg School of Public Policy and Public Administration, George Washington University

Teacher helps students with physics assignment. Photo by Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action

Teacher Self-Improvement

“Data-driven schools” have long served as a vision for educational reformers, with teachers positioned at the forefront of student data collection and use (Noyce, Perda, & Traver, 2000). However, in the United States, the reality of instruction in public schools does not match the rhetoric.

Recent educational reforms across levels of government have emphasized accountability in teacher evaluation systems, potentially at the expense of teacher improvement (Donaldson, Woulfin, LeChasseur, & Cobb, 2016; Isoré, 2009). Many teachers may instead rely on anecdotal information, experience, and intuition as guides to improve their practice (Ingram, Louis, & Schroeder, 2004). Teachers would be better served by data they personally perceive as credible and relevant (Smagorinsky, 2014; Donaldson, 2012).

Teacher training is critical to the success of such a self-evaluation approach (Ozogul & Sullivan, 2009). By “self-evaluation,” we are referring to teachers’ collection, analysis, reflection on, and use of their student data to improve their own practice. Unfortunately, teachers do not all receive adequate guidance or support for these efforts from principals and specialized instructional coaches or team leaders (Donaldson et al., 2016).

The Opportunity

We were contracted to evaluate a new, NSF-funded teacher scholarship program designed to recruit and develop middle and high school science and math teachers in Washington, DC. The program included a total of 30 teaching Fellows, roughly evenly divided between math and science fields, who participated over a six-year period, from 2012 to 2018. The Fellows completed an intensive one-year, 30-credit master’s degree in teaching, followed by full-time teaching in high-need school districts.

We were expected to secure student performance data to assess the effectiveness of the program. However, we were concerned by the questionable credibility of these data as evidence of teachers’ abilities, not to mention resource constraints and other practical considerations. Therefore, we developed an evaluation capacity-building approach in which we explicitly targeted the program participants’ motivation and ability to engage in critical reflection on their own student data (Brookfield, 1995).

The Challenge

As part of the program evaluation, we felt it was essential that Fellows take ownership over their student data collection and use. Complicating these efforts, Fellows taught a diversity of subjects, grade levels, and curricula within a variety of school contexts and cultures (including both public schools and public charter schools), and the standardized tests in use changed over the course of the program.

When interviewed about their school contexts and assessment strategies, Fellows expressed general skepticism around teacher “value-added” analyses implemented in most schools’ teacher evaluation systems, which attempt to statistically isolate the teacher’s contribution to students’ improvement in test scores over the school year. At the same time, most Fellows did not already administer pre-tests with their students in order to gauge student growth themselves. Fellows also identified a desire for more focused training and support in basic statistical analysis.

Despite these challenges, we found that Fellows had already begun using formative data to assess and anticipate student needs, revisit content students struggled with, and refine exams. Ongoing feedback from informal checks for understanding with students, such as “exit tickets,” warm-ups, and quizzes, was deemed especially useful. We hoped to build on Fellows’ inherent interest in real-time, continuous feedback, while also helping them collect high-quality data useful for deeper reflection on their teaching practice and its impact on student outcomes.

Method

As part of our approach to teacher self-evaluation, Fellows were asked to:

develop their own individualized evaluation plans for collecting credible data on student content mastery and other outcomes (e.g., student engagement);
implement their evaluation plans during the school year to collect student outcomes data; and
report results to us.

Our role as evaluators was to act as partners and coaches, gathering feedback on every step of the process, providing templates and guidance, offering workshops and ongoing technical assistance, and aggregating results thematically across the program. The three steps in this self-evaluation process are described below. We completed the full cycle twice with Fellows, over two years.

Step 1: Fellows Developed Evaluation Plans

Our first step in the process was to guide Fellows in the development of individualized evaluation plans, which would help them collect customized evidence of teacher effectiveness, reflection, and learning. We developed an open-ended evaluation plan template (see Table 1 in Mumford & Newcomer, 2019), shared it with Fellows for feedback and approval at an in-person meeting, and guided Fellows in completing it during the summer after their first year of teaching. We then reviewed Fellows’ tailored plans and suggested improvements before the next school year began.

Fellows were asked to include the following elements in their plans:

a pre/post measure of student learning, focused on content mastery;
additional quantitative indicators, such as measures of effort or participation; and
additional qualitative evidence of student learning, such as samples of student work and survey responses.

In addition, Fellows were asked to discuss at length how they would store and statistically analyze quantitative data, often through an online gradebook, as well as reflect on the results, adjust their teaching approach, and share findings with external parties such as school administration. For this particular Fellowship program, Fellows were selected based on their STEM backgrounds and experience, and thus may have been more prepared for data collection and analysis than many of their peers.

Nonetheless, the process was iterative, and Fellows enhanced their plans’ rigor over time based on experience and our continued guidance. During the summer after their first full year of self-evaluation, Fellows updated their evaluation plans, such as by adapting methods to a new school, courses, or grade level, incorporating a midterm assessment, and replacing teacher-designed exams with validated, off-the-shelf instruments when available.

Step 2: Fellows Implemented Evaluation Plans

The Fellows were then responsible for implementing their evaluation plans over the course of the school year, with support from evaluators. Each year, the evaluators administered a mid-school-year survey around the holiday break to remind Fellows to collect and analyze pre-test data, check in on their progress, and get feedback on any challenges they faced. Fellows sometimes had to adjust their evaluation plan mid-year, in part because of last-minute changes in school policies and assessment procedures.

Two of the most prevalent challenges reported by the Fellows were missing data for a large number of students and the fact that many students entered the school year already below grade level and unprepared to master new material. Fellows reported that pre-tests suffered from low student completion and effort because they covered unfamiliar material, were not graded or otherwise incentivized, and offered no accommodations for students with special needs (unlike most post-tests). Further, class rosters did not stabilize until as late as mid-semester for some Fellows, often after pre-test data were collected.

To better incentivize student effort on the pre-assessment, several Fellows began to share individual growth scores with students and celebrate their accomplishments. This modification often accompanied the addition of a midterm exam, allowing Fellows to calculate change scores in the middle of the school year, rather than waiting for the end when they were busy with end-of-year summative testing.
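The change-score calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the Fellows' actual procedure: the student IDs, the scores, and the function name `change_scores` are all hypothetical. It also assumes one simple way of coping with the missing-data problem, namely computing growth only for students who completed both assessments and reporting how many were left out.

```python
from statistics import mean

def change_scores(pre, post):
    """Return per-student growth (post minus pre) for students with both scores.

    pre and post map a student ID to a percent score, or to None when the
    student did not complete that assessment (hypothetical data layout).
    """
    growth = {}
    for student, pre_score in pre.items():
        post_score = post.get(student)
        # Skip students missing either score instead of treating a gap as zero.
        if pre_score is None or post_score is None:
            continue
        growth[student] = post_score - pre_score
    return growth

# Made-up example: s3 skipped the pre-test, s4 left the roster before the midterm.
pre = {"s1": 40, "s2": 55, "s3": None, "s4": 30}
midterm = {"s1": 65, "s2": 70, "s3": 80}

growth = change_scores(pre, midterm)
dropped = len(pre) - len(growth)
average_growth = mean(list(growth.values()))

print("Growth by student:", growth)
print("Average growth:", average_growth, "| students without paired data:", dropped)
```

Reporting the number of students excluded alongside the average keeps the limits of the data visible, which matters when rosters shift after the pre-test.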

The evaluators convened Fellows in late spring to gather reflections on implementation of their evaluation plans, share and refine an additional template for reporting results (see Table 2 and Supplementary File in Mumford & Newcomer, 2019), and provide brief introductory training in statistical analysis (e.g., reporting descriptive statistics, calculating tests of significance in Excel). The evaluators later revised the reporting template for the second year of implementation, replacing open-ended questions with checkboxes where possible based on the first year’s overall responses, in an effort to streamline reports.
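For readers who want to see what this kind of introductory analysis involves, here is a minimal Python sketch of descriptive statistics and a paired test of significance. It is an assumption that a paired t-test is the relevant test; the source says only that tests of significance were calculated in Excel. The scores and the function name `paired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    std_err = stdev(diffs) / sqrt(n)
    return mean_diff, mean_diff / std_err, n - 1

# Made-up scores for five students, listed in the same order in both lists.
pre = [40, 55, 35, 60, 50]
post = [65, 70, 55, 75, 70]

print("Pre-test mean:", mean(pre), "| post-test mean:", mean(post))
mean_diff, t_stat, df = paired_t(pre, post)
print("Mean change:", mean_diff, "| t:", round(t_stat, 2), "| df:", df)
```

The t statistic is then compared with a critical value for the given degrees of freedom (about 2.776 for a two-tailed test at the 0.05 level with 4 degrees of freedom). In Excel, the built-in `T.TEST` function with its type argument set to 1 (paired) returns the corresponding p-value directly.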

Step 3: Fellows Reported Results

The third and last phase in the evaluation cycle required Fellows to report their results to the program evaluators. Each year, the majority of Fellows provided evidence of statistically and practically significant growth in their students’ average change scores from pre-test to post-test, using a variety of assessment strategies. Additional data analysis allowed Fellows to explore questions of personal interest to their pedagogy and conduct self-designed action research projects exploring benefits of their teaching strategies, such as a “flipped classroom” approach.
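The source does not say how practical significance was judged. As a hedged illustration only, two common ways to express the size of pre-to-post growth are a standardized mean change (the mean of the paired differences divided by their standard deviation) and a class-average normalized gain (the share of the possible improvement that was achieved). The scores, function names, and the 100-point maximum below are assumptions made for the sketch.

```python
from statistics import mean, stdev

def standardized_change(pre, post):
    """Mean of the paired differences divided by their sample standard deviation."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs) / stdev(diffs)

def normalized_gain(pre, post, max_score=100):
    """Class-average gain as a fraction of the improvement that was possible.

    Assumes the class pre-test mean is below max_score.
    """
    pre_mean, post_mean = mean(pre), mean(post)
    return (post_mean - pre_mean) / (max_score - pre_mean)

# Made-up percent scores for five students, in the same order in both lists.
pre = [40, 55, 35, 60, 50]
post = [65, 70, 55, 75, 70]

effect = standardized_change(pre, post)
gain = normalized_gain(pre, post)
print("Standardized change:", round(effect, 2))
print("Normalized gain:", round(gain, 3))
```

Unlike a significance test, neither measure depends directly on class size, so both help a teacher judge whether the growth is large enough to matter in the classroom.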

Fellows utilized data to reflect on their teaching practice and introduce formative improvements where appropriate, helping them tailor strategies to students in the following ways:

identifying student knowledge gaps for re-teaching and tutoring;
grouping and differentiating students within lessons;
recommending students for interventions; and
sharing scores with parents to get them engaged in students’ success.

Fellows also shared data with collaborating teachers to strategize about lessons. And they shared growth on the interim or final exam (from the pre-test) with students to promote a sense of accomplishment.

By and large, despite the significant challenge to their time and resources, Fellows were actively engaged in the self-evaluation process and diligent in their data collection, analysis, and reporting. Throughout the evaluation, Fellows self-reported increases in their teaching confidence and classroom management skills. In their final reports at the end of the second full year of self-evaluation, all Fellows signaled their intention to continue implementing self-evaluation beyond the program.

Lessons Learned

We distilled this six-year experience down to four lessons learned about promoting teacher improvement through self-evaluation:

Teachers need an impetus and consistent support to engage in meaningful self-evaluation. Research shows that teachers benefit from external motivation to participate in evaluation and self-reflection, at least initially (Ross & Bruce, 2007; Noyce et al., 2000). However, impetus is not enough. Teachers also needed training, technical assistance, and guidance through every step.
Teachers should look beyond standardized assessments for more useful formative data, while being mindful of additional testing burdens placed on students and themselves. Existing school procedures often prioritize high-stakes, summative assessment of teacher effectiveness at the expense of formative improvement. Indeed, Fellows who relied on standardized test data for self-evaluation (mostly math teachers) reported finding the process less useful overall. We instead guided Fellows in adapting extant but more sensitive student assessments to their formative needs, such as comprehensive and unit exams.
Real-time and interim opportunities for data analysis, reporting, and reflection help teachers apply findings to their practice. We initially assumed summer was the best time for teachers to compile and reflect on results; however, due in part to changes in Fellows’ schools and courses, and limited availability of the past year’s data, Fellows disagreed. Rather, they desired real-time, ongoing feedback to inform instruction immediately.
Evaluators must meet teachers where they are to adapt approaches to their needs. Each teacher comes with unique aptitude and interest in learning through evaluation. Some Fellows had great comfort and familiarity with basic assessment and statistical procedures; others struggled to meet our requirements despite their STEM training. However, we found that our individualized approach to Fellow self-evaluation promoted buy-in and intrinsic motivation and produced results that were deemed to be personally meaningful and useful by the teachers themselves.

Conclusion

Our process of teacher self-evaluation touched on the ubiquitous tension between accountability and learning inherent to our public education system. Ultimately, we navigated this tension by attempting to build the self-evaluation capacity of the participating teaching Fellows, in hopes they would internalize these skills and become critically reflective teachers capable of ongoing self-reflection and self-improvement (Brookfield, 1995). Our preliminary results suggest our approach, though resource intensive, has great potential for success.

References

Brookfield, S. (1995). Becoming a critically reflective teacher. San Francisco: Jossey-Bass.

Donaldson, M. L. (2012). Teachers’ perspectives on evaluation reform. Washington, DC: Center for American Progress.

Donaldson, M. L., Woulfin, S., LeChasseur, K., & Cobb, C. D. (2016). The structure and substance of teachers’ opportunities to learn about teacher evaluation reform: Promise or pitfall for equity? Equity & Excellence in Education, 49(2), 183-201.

Ingram, D., Louis, K. S., & Schroeder, R. G. (2004). Accountability policies and teacher decision making: Barriers to the use of data to improve practice. Teachers College Record, 106(6), 1258-1287.

Isoré, M. (2009). Teacher evaluation: Current practices in OECD countries and a literature review. OECD Education Working Papers, No. 23. OECD Publishing.

Mumford, S., & Newcomer, K. (2019). Promoting STEM teacher reflection through self-evaluation. Political Science Faculty Publications, Paper 8. New Orleans: UNO ScholarWorks.

Noyce, P., Perda, D., & Traver, R. (2000). Creating data-driven schools. Educational Leadership, 57(5), 52-56.

Ozogul, G., & Sullivan, H. (2009). Student performance and attitudes under formative evaluation by teacher, self and peer evaluators. Educational Technology Research and Development, 57(3), 393-410.

Ross, J. A., & Bruce, C. D. (2007). Teacher self-assessment: A mechanism for facilitating professional growth. Teaching and Teacher Education, 23(2), 146-159.

Smagorinsky, P. (2014). Authentic teacher evaluation: A two-tiered proposal for formative and summative assessment. English Education, 46(2), 165-185.

Steve Mumford, Ph.D., Assistant Professor, University of New Orleans
swmumfor@uno.edu

Steve Mumford is an Assistant Professor at the University of New Orleans, where he helps lead the Master of Public Administration (MPA) program and teaches courses in program evaluation, public management, and a nine-credit Nonprofit Leadership concentration. Dr. Mumford received a PhD in Public Policy & Administration, concentrating in program evaluation, from George Washington University; a Master of Public Administration (MPA) from the University of Washington; and a BA in Psychology from Columbia University.

Dr. Mumford has over a decade of experience conducting evaluations and related trainings to enhance the effectiveness of foundations and nonprofits throughout the United States, including with past clients like the Bill & Melinda Gates Foundation. He currently works with the Greater New Orleans Foundation to provide training and coaching in program evaluation to nonprofits throughout the region, and serves as the founding Membership Chair for the Gulf Coast Evaluation Network, an association of evaluation professionals and regional affiliate of the American Evaluation Association.

Kathryn Newcomer, Ph.D., Professor, Trachtenberg School of Public Policy and Public Administration, George Washington University

Kathryn Newcomer is a professor in the Trachtenberg School of Public Policy and Public Administration at the George Washington University, where she teaches graduate-level courses on public and nonprofit program evaluation and research methods. She served as the Trachtenberg School director for over 12 years, until August 2019. She is a Fellow of the National Academy of Public Administration, and currently serves on the Comptroller General’s Educators’ Advisory Panel. She served as an elected member of the Board of Directors of the American Evaluation Association (AEA) (2013-2015 and 2016-2018), and as AEA president for 2017. She served as President of the Network of Schools of Public Policy, Affairs, and Administration (NASPAA) for 2006-2007. Dr. Newcomer routinely conducts research and training for federal and local government agencies and nonprofit organizations on performance measurement and program evaluation, and has designed and conducted evaluations for many U.S. federal agencies and dozens of nonprofit organizations.

Dr. Newcomer has published six books, including Federal Inspectors General: Truth Tellers in Turbulent Times (2020) and The Handbook of Practical Program Evaluation (4th edition, 2015), as well as over 60 articles in journals including the Public Administration Review and the American Journal of Evaluation. She has received two Fulbright awards, one for Taiwan (1993) and one for Egypt (2001-04). She has lectured on performance measurement and public sector evaluation in Ukraine, Honduras, Canada, Australia, China, Brazil, Panama, Italy, Israel, the United Arab Emirates, Poland, Costa Rica, Egypt, Taiwan, Colombia, Nicaragua, and the UK.