By:
Meltem Alemdar, Ph.D., Associate Director, Principal Research Scientist, Center for Education Integrating Science, Mathematics, and Computing, Georgia Institute of Technology
Christopher Cappelli, MPH, Senior Research Associate, Center for Education Integrating Science, Mathematics, and Computing, Georgia Institute of Technology

The role of evaluation in National Science Foundation (NSF) projects has become critically important. Evaluation produces information that can be used to improve a project: knowing how different aspects of the project are working, and the extent to which its goals and objectives are being met, is essential to a continuous improvement process. Evaluation also documents what the project has achieved.
Evaluators should always work closely with principal investigators (PIs) during the proposal stage to ensure that the evaluation aligns well with project goals; in practice, however, the degree to which this happens before and after funding is received depends on the PI’s approach to collaboration and view of the value of evaluation. Some PIs perceive evaluation as a formality required to get the proposal funded, and perhaps for accountability purposes; others see it as the most important part of the proposal. Whether or not the PI treats evaluation as a critical component of the project has profound consequences for the quality of the evaluation, the clarity of its stated focus, the selection of methodology and design, and the PI’s use of the evaluation results. Addressing this array of considerations requires a robust evaluation plan, which makes the choice of evaluation framework critical.
Evaluation frameworks provide guidance for program developers and evaluators to ensure that the evaluation’s overall design reflects and incorporates the originating motivations, principles, and context of the program being examined. While the term “evaluation framework” is common in the discipline of program evaluation, it has been interpreted in various ways. A recent article by Arbour (2020), “Frameworks for Program Evaluation: Considerations on Research, Practice, and Institutions,” analyzes the approaches taken by different evaluation associations and organizations.
Arbour’s article focuses specifically on program evaluation, rather than the more general domain of evaluation, and provides examples of how frameworks are defined within the field and by organizations and associations. For example, the Organization for Economic Co-operation and Development (OECD) (2014) created its Framework for Regulatory Policy Evaluation, an extensive guide to assist “countries in systematically evaluating the design and implementation of regulatory policy” (p. 13), whereas the United Nations Office for Disaster Risk Reduction (2015) describes its Monitoring and Evaluation Framework as a way to “provide a consistent approach to the monitoring and evaluation” of its programs (p. 2). The paper also describes the well-known Chen (2001) and Cooksy (1999) frameworks, which focus mostly on program theories and logic models, and it highlights the context-dependent dimensions of choosing an evaluation framework, such as the practice of program evaluators and the type of intervention and program evaluation functions. Arbour (2020) concludes by emphasizing that “a framework has an impact because someone decides to adopt, adapt, or develop that framework in a given evaluation context” (p. 13). In many cases this leads to locally developed logic models, evaluation plans, evaluation policies, and many other products associated with the term “evaluation framework.” This observation is borne out in our experience as evaluators: we have found that different fields of study or practice govern the choice and implementation of evaluation frameworks. For example, participatory evaluation (King, 2005) is most commonly used in community-based interventions, while developmental evaluation (Patton, 2010) tends to be used for innovation, radical program redesign, and addressing complex issues and crises.
In Alemdar, Cappelli, Criswell, and Rushton (2018), we provide a template for evaluating teacher leadership training programs funded through the NSF Robert Noyce Teacher Scholarship (Noyce) program. These programs are particularly challenging to evaluate for several reasons. First, program-specific characteristics may evolve over the years, which can make it difficult to design an effective evaluation. Noyce programs are also hard to evaluate because of the small number of individuals admitted into each yearly cohort. Most evaluations focus on yearly data for primarily formative purposes, and the summative data usually address program-level rather than teacher-level outcomes. To provide useful evaluation data and analysis to key stakeholders, teacher leadership professional development programs need to be evaluated longitudinally, using proven methodologies and frameworks that can account for the small sample sizes common in these programs. It takes years for a teacher to develop into a leader who moves her colleagues toward positive change, so it is important to capture that development over time.
Some evaluation frameworks require substantial time commitments from the project PIs, management, and others involved in the project at every step of the evaluation. Given the limited knowledge of evaluation methodologies suited to teacher leadership programs with small sample sizes, as well as the relationships we had built with the PIs, we chose to use multiple complementary evaluation frameworks to determine the program’s overall impact on the development of teacher leadership skills.
One approach was utilization-focused evaluation, described as “evaluation done for and with specific intended primary users for specific, intended uses” (Patton, 2008, p. 37). An essential component of utilization-focused evaluation is identifying the program stakeholders, or the primary intended users of the evaluation, and understanding their perspectives on the intended use of the evaluation. Patton (2008) describes the importance of the “personal factor” when identifying the intended users, defined as “the presence of an identifiable individual or group of people who personally care about the evaluation and the findings it generates” (p. 44). These people have a personal interest in the success of the program and in enhancing their own ability, as consumers or decision-makers, to predict and guide its outcomes. Through this framework, we built close relationships with both program leadership and participants, developing a high level of trust that proved to be a cornerstone of the evaluation’s success. By understanding the “personal factor” and its importance for utilization, we involved key stakeholders to better understand their perspectives on the intended uses of the evaluation. This approach ensured that, throughout the program period, evaluation data were presented in a way that placed utilization at the forefront.
Furthermore, teacher leadership training programs often draw on theories of leadership development. In our early conversations, the PIs discussed multiple teacher leadership theories that guided their design of the program, such as those of Dempsey (1992) and Snell and Swanson (2000). Since these theories formed the program’s theoretical foundation, we also adopted Chen’s (1990) theory-driven evaluation framework, which is designed to use a validated theory to guide the evaluation.
While utilization-focused evaluation provides timely, useful information to program leadership for decision making, theory-driven evaluation can be “…analytically and empirically powerful and lead to better evaluation questions, better evaluation answers and better programs” (Rogers, 2000, p. 209). Moreover, with theory-driven evaluation guiding the process, the evaluation can not only assess whether a program is working, but also illuminate why or how the program is having an impact on its participants (Chen, 2012). This is particularly important in the context of teacher leadership programs, because it allows a program to build a theory-driven model that can be readily adapted by others. This should be the goal of any NSF-related project evaluation: to effectively assess the merit of the program. Given the complementary data provided through these frameworks, the PIs used the evaluation results extensively to improve the program and achieve its goals. For example, in the early stages of the program, formative data showed that teachers were struggling to reflect on their teaching practice, an important domain for developing teacher leadership. The program addressed this challenge by adding more professional development and discussion around the topic.
In our paper, we also showed how the program theory guided the development of interview and focus group protocols to track the development of leadership longitudinally across Snell and Swanson’s four dimensions of teacher leadership: Empowerment, Expertise, Reflection, and Collaboration. Documenting the development of teacher leadership over time is particularly difficult with small sample sizes and limited evaluation resources. Using multiple frameworks substantially assisted the program in documenting change over time in the four dimensions of teacher leadership, and therefore in the development of teacher leaders. Because of the collaborative nature of these evaluation frameworks, a conceptual framework for the program was also constructed in collaboration with the PIs. From the perspective of a utilization-focused evaluation, involving key stakeholders in developing a conceptual framework ensures a common understanding of the relationship between program components and desired outcomes, resulting in agreement on the intended use of evaluation results. Similarly, from a theory-driven perspective, the conceptual framework systematically organizes stakeholders’ perceptions of both the process expected to produce change and the activities needed to create the desired change as a result of participation in the program (Chen, 2012).
Implications
The choice and implementation of evaluation frameworks to determine the merit of a program will vary with that program’s specific context. Based on our experiences, we developed several recommendations that evaluators and Noyce programs should consider when developing an evaluation plan:
Conclusion
Given the continuously evolving nature of teacher leadership programs, their often small sample sizes, and the historic lack of literature offering a clear concept of teacher leadership, we, as evaluators, found that the concurrent use of utilization-focused and theory-driven evaluation frameworks provided a firm foundation on which the evaluation could develop and evolve in tandem with the program. Further, the use of evaluation frameworks significantly improves documentation of program impact, which in turn facilitates replication of the program in new and different settings.