Standard 9.1: Program evaluation

Program leadership, program partners and all stakeholders develop and implement an ongoing process for program evaluation based on multiple internal and external sources with formal and informal measures to ensure ongoing program improvement.

From the beginning, program leaders should view the development of an evaluation system as an ongoing, formative process. The initial steps of evaluation should serve immediate and obvious needs. As the induction/mentoring program grows, so will the evaluation system.

Ideally, program evaluation should be done by a team. For a small district this could be as simple as two people; in larger districts the group can be a more formal committee composed of individuals who represent different stakeholder groups and aspects of the induction/mentoring program. In either case, the group should solicit input from all participants in the program and share results with those same participants.

Identify the Problem

The evaluation team should consider why it needs to conduct research. A good place for the team to begin is determining the needs, strengths, context, culture, and concerns of the district. It is important to identify the district’s particular point of view on what an effective teacher looks like and what effective teaching is. Such a discussion allows everyone to agree on a shared vocabulary, values, and vision, and prepares the team to discuss state and national standards.

For new/developing programs, program evaluation might first focus on the extent to which the elements of the program are in place. For example:

  • How often do mentors and beginning teachers interact?
  • What are the mentors’ perceptions of the value of the material covered in their professional development workshops?
  • How many of the beginning teachers asked for or received help from individuals other than their assigned mentor?

Later on, the programs could delve more into the quality of the induction experiences; the program’s impact on various groups, including beginning teachers and students; and other outcomes.

The Illinois New Teacher Collaborative (INTC) Research and Evaluation Coordinating Committee described seven areas of induction/mentoring program impact, each of which could be useful for program evaluation:

  • Impact on student achievement
  • Impact on beginning teachers
  • Impact on mentors
  • Impact on administrators
  • Impact on schools
  • Impact on teacher education programs
  • Impact on the profession

Collect Data

The evaluation team might first consider what data is currently available. It is likely that the district or program already collects some information that would be useful in addressing one or more purposes of the induction/mentoring program evaluation.

Some data is quantitative: it involves numbers and things that can be counted (e.g. district retention data or a survey of teacher satisfaction on a 1-5 scale). Quantitative data can be easy to compare and can be persuasive for administrators and school board members.

Other data is qualitative: it involves things that cannot be counted, including feelings, personal stories, and contexts (e.g. interviews of beginning teachers or open-ended survey questions). Qualitative research can be much richer and more detailed than quantitative research, but it can be more time-consuming to conduct and analyze.

Sometimes, programs start by gathering quantitative data and use it to generate ideas for qualitative research. For example, they might start by asking beginning teachers to rank the quality and usefulness of different professional development workshops. Then, programs might conduct focus-group interviews on why beginning teachers rated some workshops more highly than others.
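As a minimal sketch of the first step in that example, the snippet below averages workshop ratings and sorts them so the highest- and lowest-rated workshops stand out as focus-group topics. The workshop names, the ratings, and the function name rank_workshops are all hypothetical and serve only as an illustration.

```python
from statistics import mean

# Hypothetical ratings (1 = not useful, 5 = very useful) from beginning teachers.
ratings = {
    "Classroom management": [5, 4, 5, 4],
    "Parent communication": [3, 2, 3, 4],
    "Lesson planning": [4, 4, 3, 5],
}


def rank_workshops(ratings):
    """Return (workshop, mean rating) pairs, highest-rated first."""
    averages = {name: mean(scores) for name, scores in ratings.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranked = rank_workshops(ratings)

# The highest- and lowest-rated workshops are candidate focus-group topics.
highest, lowest = ranked[0][0], ranked[-1][0]
```

The averages only show which workshops were rated higher or lower; the focus-group interviews are still needed to learn why.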

Types of Data

The four major types of data are surveys, interviews, classroom observations, and documents/artifacts. Different methods are appropriate for different programs and for different research questions.

  • Surveys: They can be open-ended or multiple choice, pen-and-paper or online, and anonymous or not. Even if a survey is anonymous, it can still ask for demographic information, such as gender, school name, or grade level. Many surveys use rating scales, asking respondents to rate something numerically (e.g. “on a scale of 1-5…”) or on a Likert-type scale of labeled options (e.g. “strongly disagree, disagree, agree, strongly agree”). Surveys are often relatively easy to administer, especially to large groups, but they do not provide the same depth of response as interviews.
  • Interviews: These can be one-on-one to provide individual insights, or they can be used in focus groups to provide insights into a group’s thinking. Interview data is qualitative, although it can be summarized in quantitative ways (e.g. “14 of the 20 teachers mentioned…”).
  • Classroom observations: Observations are an excellent way to understand a teacher’s classroom performance. Observers should be trained so they look for the same items in each teacher’s classroom; most observations use a standardized tool, like a checklist or chart to fill in. Many researchers conduct multiple observations to understand the teachers’ growth over time.
  • Documents and artifacts: These include beginning teacher performance evaluations, student standardized test scores, district retention data, etc.
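Two of the summaries mentioned in the list above can be sketched in a few lines: counting responses to a Likert-type survey item, and summarizing interview data quantitatively ("14 of the 20 teachers mentioned…"). The responses, the interview themes, and the function names are hypothetical; the sketch assumes the interviews have already been coded by hand into themes.

```python
from collections import Counter

# Hypothetical responses to one Likert-type survey item.
LIKERT_OPTIONS = ["strongly disagree", "disagree", "agree", "strongly agree"]
responses = ["agree", "strongly agree", "agree", "disagree", "agree"]


def likert_distribution(responses):
    """Count responses per option, keeping options nobody chose (count 0)."""
    counts = Counter(responses)
    return {option: counts.get(option, 0) for option in LIKERT_OPTIONS}


# Hypothetical coded interviews: each set holds the themes one teacher mentioned.
interviews = [
    {"time management", "mentor support"},
    {"mentor support"},
    {"classroom management", "mentor support"},
    {"time management"},
]


def theme_mentions(interviews):
    """Count how many teachers mentioned each theme."""
    return Counter(theme for themes in interviews for theme in themes)


distribution = likert_distribution(responses)
mentions = theme_mentions(interviews)
summary = (
    f"{mentions['mentor support']} of the {len(interviews)} "
    "teachers mentioned mentor support"
)
```

Because each teacher's themes are stored as a set, a teacher who raises the same theme several times is still counted once.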

Using only one type of data at one point in time provides an incomplete picture of an induction/mentoring program. Ideally, program leaders would gather different types of data to examine the same impact. For example, to understand the impact of an induction/mentoring program on beginning teacher performance, program leaders could interview mentors and administrators, observe the beginning teachers in their classrooms at multiple points during the year, and look at teacher evaluation scores. They could then triangulate, or compare, all of these data sources to gain a fuller picture of each beginning teacher’s growth and development.
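One very simplified way to picture triangulation is to record what each source suggests about a teacher's growth and check whether the sources agree. The sketch below is an illustration only: the TeacherEvidence record, its field names, and the rule that "growth" means the last observation score is higher than the first are all assumptions, not a method described by the standard.

```python
from dataclasses import dataclass


@dataclass
class TeacherEvidence:
    """Hypothetical record combining three data sources for one beginning teacher."""
    name: str
    mentor_reports_growth: bool  # from mentor and administrator interviews
    observation_scores: list     # scores from an observation tool, in date order
    evaluation_improved: bool    # from teacher evaluation scores


def triangulate(evidence):
    """Label whether the three sources agree about the teacher's growth."""
    # Assumed rule: growth means the last observation scored above the first.
    observed_growth = (
        evidence.observation_scores[-1] > evidence.observation_scores[0]
    )
    signals = [
        evidence.mentor_reports_growth,
        observed_growth,
        evidence.evaluation_improved,
    ]
    if all(signals):
        return "sources agree: growth"
    if not any(signals):
        return "sources agree: no growth"
    return "sources disagree: follow up"


teacher_a = TeacherEvidence("Teacher A", True, [2, 3, 4], True)
teacher_b = TeacherEvidence("Teacher B", True, [3, 3, 2], False)
result_a = triangulate(teacher_a)
result_b = triangulate(teacher_b)
```

A disagreement between sources is not an error; it marks a case where the evaluation team may want to look more closely, for example by reviewing observation notes or conducting a follow-up interview.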

Other Considerations

It is important that induction/mentoring programs not wait until the end of the year to collect data. Information collected during the year can provide feedback that enables mid-course corrections. It is also better for an induction/mentoring program to start collecting a small amount of data immediately than to wait several years until it has the time and personnel to begin a major research effort. The evaluation team can then scale up each of the tasks: create a larger steering committee that involves appropriate positions and expertise; examine how the data already collected is organized, analyzed, and disseminated; and find out whether any of that data has focused on beginning teachers in the past.
