By:
Heather C. Hill, Ph.D., Jerome T. Murphy Professor in Education, Harvard Graduate School of Education
Kathleen Lynch, Ph.D., Postdoctoral Research Associate, Annenberg Institute at Brown University

Our question: How can policymakers support standards-based reform?
Standards-based reforms in science and mathematics often require significant teacher learning, particularly of subject matter content and new instructional practices. Thus, since the first calls for standards-based reforms, circa 1990, reformers have explored several avenues for supporting teachers’ growth. New curriculum materials, such as those produced to align to the Framework for Science Education and the Next Generation Science Standards, aim to provide concrete support for disciplinary practices, core ideas, and cross-cutting content. And since the inception of standards-based reforms, new forms of professional development have proliferated; according to a recent national report, STEM teachers engage in activities that include coaching, teacher study groups, rehearsals of practice and online courses (Banilower, Smith, Malzahn, Plumley, Gordon, & Hayes, 2018). In those settings, teachers revise instructional practice, study new curriculum materials and interpret student assessment data.
But this raises an important question for policymakers: to what extent do these new curriculum materials and professional learning experiences work to improve student outcomes?
Our method: Meta-analyses
We addressed this question, together with colleagues, in a recent meta-analysis of STEM curriculum and professional development programs. Meta-analyses seek to gather all relevant studies in an area, build a dataset recording their characteristics, and then test hypotheses about a) overall program outcomes and b) whether particular program features lead to greater or lesser success in improving student outcomes. Such analyses are useful in fields, like teacher professional learning, that contain many conflicting findings; within the span of just a few months, for instance, different teams of scholars may announce that professional development works, or that it doesn’t work, or that a particular approach to professional development is promising. Meta-analyses, by contrast, pool results across studies to address whether and when such programs work, on average.
In our case, the meta-analysis began with a search for all quantitative studies featuring new curriculum or professional development programs. We then narrowed this list to include only studies that used an experimental or near-experimental design to compare the student outcomes of teachers participating in the program versus teachers who did not. We located 95 studies in total that met these criteria. Next, we coded all studies for characteristics of their design (e.g., whether they used a researcher-designed assessment or a state standardized assessment), characteristics of the program itself (e.g., same-school teacher collaboration, summer workshops), and the size of impacts on student outcomes. Then, as described above, we set out to calculate an overall program impact estimate, and to link specific research design and program characteristics to program outcomes.
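To make the idea of an overall program impact estimate concrete, here is a minimal sketch of one common pooling approach: an inverse-variance weighted average, in which more precise studies count for more. This is an illustration only. The text above does not say which statistical model the analysis used, published meta-analyses often use random-effects or related models instead, and the function name and the three effect sizes below are hypothetical values, not results from the 95 studies.

```python
from math import sqrt


def pool_fixed_effect(effects, std_errors):
    """Return the inverse-variance weighted mean effect and its standard error.

    effects: per-study effect sizes (e.g., in student-level SD units)
    std_errors: per-study standard errors of those effect sizes
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("effects and std_errors must be non-empty and equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical studies for illustration only (not data from the meta-analysis).
example_effects = [0.10, 0.30, 0.20]
example_std_errors = [0.05, 0.10, 0.10]
pooled_effect, pooled_se = pool_fixed_effect(example_effects, example_std_errors)
print(f"pooled effect = {pooled_effect:.3f} SD (SE = {pooled_se:.3f})")
```

Linking program characteristics to outcomes follows the same logic: studies are grouped or modeled by a coded feature, such as whether the program included summer workshops, and the pooled estimates are compared across groups.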
Findings: Positive impacts, with larger gains for curriculum-focused professional development
Our analysis uncovered several patterns. First, across all 95 studies, new curriculum materials and professional development programs produced a positive, 8-point difference in percentile rank between the average student in participating and non-participating classrooms. This effect is much larger for researcher-designed assessments (+14 percentiles) than for state standardized assessments and other standardized assessments (+2 and +3 percentiles, respectively). But for all assessment types, our analysis shows that the difference between students in participating and non-participating classrooms is unlikely to be zero. Based on work by Chetty and colleagues (Chetty, Friedman, & Rockoff, 2014), who linked state test score performance to future outcomes, the estimated average test score impact of STEM PD and curriculum interventions would be expected to yield approximately $3,500 in future earnings per student, as measured in current dollars.
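Percentile-rank differences like these are typically a translation of effect sizes measured in standard deviation (SD) units. The sketch below shows that conversion under the assumption that test scores are normally distributed and that the comparison student starts at the 50th percentile. The function names are illustrative, and the conversion is a standard statistical convention rather than a description of how the figures above were computed.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_gain_to_sd(gain_points, baseline_percentile=50.0):
    """SD-unit effect implied by moving from the baseline percentile up by gain_points."""
    z_start = _STANDARD_NORMAL.inv_cdf(baseline_percentile / 100.0)
    z_end = _STANDARD_NORMAL.inv_cdf((baseline_percentile + gain_points) / 100.0)
    return z_end - z_start


def sd_to_percentile(effect_sd, baseline_percentile=50.0):
    """Percentile reached by a baseline student whose score shifts by effect_sd."""
    z_start = _STANDARD_NORMAL.inv_cdf(baseline_percentile / 100.0)
    return 100.0 * _STANDARD_NORMAL.cdf(z_start + effect_sd)


# Percentile gains reported in the text: overall, researcher-designed,
# state standardized, and other standardized assessments.
for gain in (8, 14, 2, 3):
    print(f"+{gain} percentile points is about {percentile_gain_to_sd(gain):.2f} SD")
```

Under these assumptions, an 8-point gain from the 50th percentile corresponds to roughly 0.20 standard deviations.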
Next, we found larger impacts when programs combined professional development with the study of new curriculum materials, as opposed to programs in which teachers experienced either element alone. Figure 1 shows that for curriculum-only or professional development-only programs, the average control-group student scored in the 50th percentile and the average treatment-group student scored in the 56th percentile. For combined programs, the difference was 10 percentile points (50th to 60th) for the average student. Reading across studies that featured both curriculum and professional development, we saw that these learning experiences often differed from the short “how to use the textbook” workshops offered by publishers after a district adopts a new text. In many of the combined programs we read about, teachers worked through the curriculum materials with one another: solving problems, conducting scientific investigations, learning design principles behind the curriculum, diving deeply into representations of content, adapting lessons to meet students’ needs, and so forth.
Figure 1
The impact of curriculum-focused professional development makes sense. Stand-alone professional development may cover instructional principles, subject matter content, and student learning, which are many of the ingredients that also appear in curriculum-focused professional development. But teachers then face the work of implementing these ideas in their classroom, often by adapting materials or piecing together lessons from the internet. Professional development focused on curriculum materials, by contrast, gives teachers something to “bring back” to the classroom to support implementation. Curriculum-focused professional development also typically covers the exact STEM content teachers will need in order to teach. It develops expertise in something in particular, like the model the curriculum uses to teach energy transfer in cells, rather than general principles or content that may not appear in the curriculum.
Our analysis also found other program characteristics that provide a boost to student outcomes. Three formats – same-school collaboration, implementation meetings, and summer workshops – yielded stronger gains on student assessments than programs without these formats. Same-school collaboration occurred when teachers participated in the professional development session alongside other teachers in their school. Summer workshops, although critiqued as divorced from practice, perhaps gave teachers breathing room to carefully focus their thinking about content, to try new instructional techniques, and to plan in detail for the upcoming year. Implementation meetings allowed teachers to re-convene briefly with other program participants during the course of the program, discussing obstacles and aids to putting the program into practice. By contrast, our analysis found that professional learning with an online component yielded lower impacts on student learning than programs that were entirely face-to-face.
One surprising finding was that programs containing coaching yielded impacts similar to programs without coaching. However, few programs in our analysis featured extended 1:1 coaching; instead, coaching appeared more as a brief add-on to summer workshops or curriculum implementation efforts. In a separate paper from ours, Kraft, Blazar, and Hogan (2018) conducted a meta-analysis of programs in which coaching was a major feature – that is, programs with a strong 1:1 coaching component. Across 60 studies, the authors found that these experiences raised teachers’ instructional quality by an average of 20 percentile points, and their students’ outcomes by an average of 5 percentile points. Further, the efficacy of coaching was enhanced when the practice was combined with the study of curriculum materials.
Finally, while the programs we examined often took place in moderate- to high-poverty settings, these programs failed to produce more equitable outcomes by improving high-poverty students’ gains at a faster rate. In fact, our analysis suggests a slight trend toward smaller program impacts in high-poverty settings.
Lessons for practice: What does this mean for how teachers spend their professional learning time?
We can think of three lessons.
- Professional Development Administrators – First, teachers’ schedules are already full with existing professional learning opportunities; locating and reducing the footprint of ineffective opportunities must occur before new forms of professional learning can take root. We nominate for reduction the practice of having teachers study student assessment data; we base this recommendation on the fact that nine studies gauging the efficacy of this practice found few positive effects (for a summary, see Hill, in press), and on qualitative evidence that data team meeting discussions rarely focus in depth on instruction.
- Professional Development Designers – Second, that newly freed professional learning time should focus on developing teachers’ expertise with specific curriculum materials. Pivoting to such a focus will be no small feat, especially considering the patchwork of materials (e.g., from the internet or from supplemental sources) teachers use in science and mathematics instruction, and considering also the lack of well-established protocols and routines for teachers’ study of materials together. The same pivot is also required for district-based coaching to maximize its effectiveness. While most schools now have instructional coaches, in many places their roles tend toward the administrative: as curriculum designers, as testing directors, and as professional development facilitators. Yet Kraft et al. (2018) indicated 1:1 observation and feedback remains the bedrock of effective coaching programs.
- District and School Administrators – Finally, districts and schools must create supportive environments for teacher learning of this kind. Carving away ineffective professional development is one step; another is to align the curriculum, coaching and professional development teachers experience with local instructional guidance and leadership priorities or, even better, to align local instructional guidance and leadership priorities to teachers’ curriculum, professional learning, and coaching. Without such support, instruction will not change.
These recommendations rely on evidence collected by different research teams across different contexts; pooling evidence in this way increases the likelihood that results approximate true program impacts. Acting on these results will require both policymakers and professional developers to create new avenues for teacher growth.