New Approaches to Evaluating Community Initiatives

Volume 2: Theory, Measurement, and Analysis

The Virtue of Specificity in Theory of Change Evaluation:
Practitioner Reflections
Susan Philliber

In the provision of human services, thinking about outcomes is still cutting-edge stuff. Rooted in the tradition of "doing good," human service agencies have traditionally focused on exactly that—the "doing." Providers are often rewarded for high client counts and units of service delivered, rather than for outcomes produced. This tendency produces a mentality in human services that is very different from that of the business world. Given this tradition, the introduction of a theory of change approach evokes responses from delight to terror, from relief to bafflement.

Still, even for human service providers, producing a clear statement of the underlying theory of change for any project is a necessary prelude to evaluation. These few pages describe how my colleagues and I introduce this approach with every evaluation we do, and what happens when we introduce it, drawing on concrete illustrations from real programs. We have used the approach with programs addressing teen pregnancy, HIV, school-linked services, homelessness, domestic violence, and juvenile crime, as well as in other settings.

Although an evaluation must begin with a clear theory of change—and with a process of consensus building to construct that theory—a draft theory is only a starting point: it does not make measurement decisions, select samples, or ensure good research practice. It does not guarantee faithful data collection or intelligent analysis. Similarly, when used to guide programs, a good theory of change can provide a measure of clarity and singularity of purpose, but it cannot ensure that staff will deliver as promised or that the program will produce the anticipated outcomes. The theory does not ensure that the right clients will be targeted, or even that the planned intervention will occur at all. In short, the theory of change approach is necessary to good programs and good evaluation, but it is not sufficient.

A good theory of change is also not a substitute for secure causal analysis, complete with randomly assigned control groups. Ascribing such power to theories of change has been attractive to researchers working in areas where traditional experimental designs are not easily applied. And indeed, when program strategies occur as outlined in the theory of change and the posited outcomes follow, causation is certainly suggested. But while seeing a theory "come true" is persuasive, a theory alone still fails to meet the strict criteria that scientists have always imposed upon themselves to ascertain cause. The possibility remains that the chain of events occurred because of factors not included in the theory of change and that any number of intervening, unmeasured variables could account for the achieved outcomes. This caution does not diminish the value of the causal reasoning that must occur to produce a theory of change. That process in and of itself is very valuable to a program.

Charting a Concrete Theory of Change

Words like "theory" or "paradigm" or "logic model" often sound academic and even a little frightening to those seeking evaluation help. Avoiding those labels, we begin evaluation planning with two questions for program, agency, or project staff: What are you doing or planning to do? And what is supposed to happen because you do these things?

These questions yield what evaluators call program "process" and program "outcomes." As the questions are answered, the resultant theory of change is written down in a horizontal format in causal order: processes on the left, short-term outcomes in the center, and longer-term outcomes on the right. The horizontal format is important because it enables those working on the model to see immediately the causal sequence they are suggesting.

The Trucking Company

To get program staff to understand the utility of creating a theory of change, we often assert that the average trucking company in America does better evaluation than the average human services program. We add that this is a sad state of affairs, owing to the obviously greater importance of human service work.

Imagine, we say, that Joe and Eddie own a small trucking company and plan to deliver a load of apples from upstate New York to Oshkosh. What would they do before they left to accomplish this task? Members of the group soon suggest that Joe and Eddie would load the apples, check the truck, get money for expenses, and buy a map. We agree, translating these tasks into checking on resources and making a plan to accomplish the ultimate goal—reaching Oshkosh.

Now, we ask them to imagine that Joe takes the apples and returns. Eddie asks Joe if he got the apples to Oshkosh. But instead of talking like a trucker, Joe responds like a human service provider. "Oh, I drove really hard day and night," replies Joe. Eddie looks puzzled.

"Yes, but did you get the apples to Oshkosh?" he asks.

"And," says Joe, "the apples really enjoyed the ride."

We point out that this kind of talk among human service providers seems appropriate but among truckers seems silly. Why can’t we ask a program if they got their apples to Oshkosh? Of course, that would mean that the program had indeed defined the equivalent of Oshkosh—their desired longer-term outcomes—and that they had the equivalent of a map—the theory of change.

Groups are quick to say that people are not apples, an observation with which we readily agree. We hasten to add that they have volunteered, even eagerly applied, to be in the "people moving" business. Most agree that asking them about getting their apples to Oshkosh is fair and that they do need a "map."

This analogy can be put to further use as the theory of change is developed. Process measures and short-term outcomes can be talked about as the equivalents of signposts along the highway—ways to know that one is still on the way to Oshkosh and not off in a ditch. We suggest that Joe would not drive around with his head down for two days and then look up to ask if this is Oshkosh.

We find that program and service staff remember this analogy. They often talk to us about their "map" and how close they are to "Oshkosh" long after they have forgotten anything technical we tried to say about evaluation.

As these processes and outcomes are being described, we disallow "vague-speak." Program staff cannot tell us they intend to "develop youth to their fullest potential" without being asked how we would know a fully developed kid if we ran into him. They cannot tell us they intend to "encourage higher comfort levels" without defining exactly how they intend to do that. We always remind clients that most evaluators are arrested at the concrete stage of thinking (a principle that we actually believe is true, at least during working hours), and so they will need to use concrete language to talk to us.

In order to make things as clear as possible, we employ some simple guidelines in getting the theory of change on paper.

At this brainstorming session, it is generally most fruitful to have line staff, executive staff, board members, funders, and other important stakeholders involved. A particularly interesting group to include is clients of the proposed program or project. Each group has a different perspective and set of needs, and each brings unique comments to the table.

Sometimes it is difficult to get anything into the model but process. Staff members want to assert that "holding 15 workshops" is their ultimate outcome. In order to shift their thinking to outcomes, we remind group participants that anything done, offered, created, or held by them (as opposed to clients or members of the target population) is by definition process, and we ask them why they are doing these things. Some agencies or programs are almost defined by their processes or strategies, and for them this can be a particularly difficult hurdle.

Getting Stuck on Process

A frantic call from a program director revealed that she was being "forced" to do an evaluation. She needed an evaluator immediately. We began our conversation in the usual way, trying to construct a theory of change. I asked what she was doing that needed to be evaluated.

"I have a contract to do short-term counseling with adjudicated juvenile delinquents," she replied.

"To what end?" I asked.

"So they won’t repeat their criminal behavior," she said.

I began talking about the very real uses of short-term counseling as an intervention but expressed grave doubts about its utility for this purpose. I explained the high-risk nature of the young people she was going to deal with, the many approaches to juvenile delinquency prevention that had been tried over time, how difficult this work was. Did she really think counseling would work?

"We’re a counseling agency," she asserted impatiently. "Counseling is what we do!"

Here is a strategy in search of an outcome, not an outcome in search of a strategy. The agency is defined by what it does as an intervention, not by the outcomes it works toward.

Who said that if all you have is a hammer, every problem looks like a nail?

As a theory of change develops, the participants often produce a long list of outcomes that they believe will occur. Some of these are more important or compelling than others. For example, a group may say that, as a result of some planned intervention, young people in their program will "develop more comfort with their teachers," and thus "will attend school more." The second outcome, improved school attendance, is the more compelling outcome in many ways. While both variables can be measured, a program that increases school attendance is more important and more fundable than one that increases student comfort levels. Moreover, the more compelling outcome in this particular model is also easier to measure.

In a situation like this, an evaluator can look ahead to the measurement implications of each suggested process and help programs sort through the outcomes to be included in the model. Although important logical steps should not be excluded, it is not essential to measure every potential interim outcome. Thus, although improving student comfort may well be an important interim outcome, especially if it gives participants encouragement that they are making progress toward their ultimate goal, the emphasis needs to be on measuring progress toward the program’s compelling, long-term aim.

Once the theory of change becomes visible, the working group may recognize that the theory is flawed. Generally, the flaw can be traced to an intervention plan that is not strong enough, intense enough, or well enough targeted to produce the hoped-for outcomes. When there is a very wide gap between the interventions and their desired results, we call these "Grand Canyon models" and encourage the group to talk about the problem.

But Can This Work?

An agency had accepted a grant to "reduce the county’s teen pregnancy rate by 10 percent in three years." Those familiar with the history of teen pregnancy prevention programs know that this is an ambitious goal, one that has eluded the most dedicated efforts for three decades.

In order to create a theory of change for the project, we asked, "How do you plan to do that?" The program director offered four interventions:

  • parent-child communication workshops for some 40 parents and their children
  • a media campaign, including radio spots and posters
  • a contraceptive educator at the local health clinic to talk to all girls who come into the clinic for service
  • a series of speakers in the community for groups like the PTA and Kiwanis

Clearly the likelihood that this would work was slim. The parent-child communication workshops and the contraceptive educator would reach too few people in a fairly large county. Neither the media campaign nor the speakers' program was a sufficiently intensive intervention to produce the goal. The plan had coverage, targeting, and intensity problems.

When these interventions and their desired outcomes were written into a theory of change, the program director realized that her model was flawed. In her original proposal, the exact interventions and their desired outcome had not been this clear. The program director had inadvertently but successfully hidden this problem from both herself and the funder.

What were her choices at this point? She could revise the intervention so that it might be able to produce the desired outcome, or she could revise the outcome to match what the interventions were likely to produce. She chose the latter option, and the funder, although disappointed, continued to support her.

Draft theories of change sometimes show confusion about targets, revealing a plan to work with one group while expecting outcomes in another. For example, a plan may include a program to offer health services to a school population and an expectation that community rates of some health problem will decrease. This is not an impossible outcome, but it is an unlikely one in most communities, since not all school families will use the health services and many community families have no members at the school. In other words, it is not at all clear that serving a portion of the school population, even very successfully, can produce community-level improvements. By carefully specifying expected target groups in every process and outcome statement, an evaluator can help reveal these potential pitfalls.

Because theories of change are only that—theories—they need to be created with an eye to change. Replaceable, disposable paper, not stone tablets, is the appropriate medium on which to record them. We counsel programs to reexamine their theories regularly, and certainly to reexamine them every time they have data in hand against which to check. Thus, creation of a theory of change does not end with the first draft, particularly for initiatives that have the luxury of interactive evaluations, which ideally function more like smoke detectors than like autopsies.

Promoting Good Management with a Theory of Change

Administrators and funders are often the first to see the management potential of a theory of change. They recognize the value of a concise statement of what an organization does and what it expects to produce. They see that such a statement could be used to orient new staff members, educate potential funders, and maintain clarity day to day about what is important. As this potential becomes evident, groups sometimes want to refine and embellish their models, adding specific processes and outcomes for various departments or divisions. The resultant theories of change can come to read like job descriptions, with expected outcomes attached. This is all to the good, allowing staff to see the utility of a theory of change quite apart from its function in guiding an evaluation.

This process becomes especially interesting when staff members who have been working together for some time find themselves in disagreement about what the theory of change should include. Although people rarely disagree about process or what the program is doing, they often disagree about what those activities are supposed to produce. When this occurs, we suggest that they may not have been coordinating their work in an ideal way, and indeed all assembled generally see the implications of their failure to agree.

When Staff Disagree

In the midst of a spirited discussion to create a theory of change, the agency health educator began naming the outcomes of her work. "Knowledge should increase among those people I reach," she suggested.

"Well, maybe," said the agency executive director, "but the real outcome of your work in the community should be how many clients you recruit for this agency."

The educator was obviously stunned. "Well, no," she began to stammer. "That’s not my job."

"It certainly is," asserted the director.

"But I have just spent two weeks in a juvenile detention facility doing education work," protested the educator. "They can’t come here for service."

This exchange made it readily apparent to this group that they needed a clear theory of change so they could all pull the sled in the same direction.

Here is another example:

A group working on support for families who have been the victims of fires or other disasters was trying to reach agreement on their desired long-term outcomes for clients. One staff member suggested that clients "should improve their previous level of living" as a result of their work. Others objected vehemently, arguing that getting clients back to their pre-disaster standard of living was enough. Still other staff members asserted that they could not be held accountable for any of these outcomes. Instead, they argued that they could only work with a family through the point of creating a plan to recover from the disaster. Others backed off even further, wanting only to measure whether they could provide families with two days of emergency assistance in housing, food, and clothing.

As this argument continued, it became clear to staff that they were handling their workloads very differently, carrying their clients to different endpoints, and in general deciding individually what the program was about. This point was not lost on managers, who saw that the discussion around outcome had revealed some very real work issues, quite apart from evaluation.

Making the theory of change this clear also makes it plain that measurement is possible and about to occur. While some greet this step with positive emotions, others who are less accustomed to being accountable for outcomes become fearful. Since the process of letting participants create the theory of change has taken away their defense that the evaluation is about to measure things that are irrelevant, their fear may be expressed through other objections.

Some may argue that it is not fair to hold them accountable for client outcomes because there are so many other influences in a client’s life. This is tantamount to arguing that their intervention is not strong enough to produce the planned outcomes, so more discussion is sometimes needed to deal with this issue. "You can’t measure what I do" is another argument that can arise, usually generated by fear that an evaluator might indeed be able to measure exactly what is being done and what it produces. This objection is most easily countered by refining the theory of change to be more specific about "what I do." Then measurement alternatives can be discussed.

Sometimes project staff want to adjust the theory of change when they see that the theory will be the basis for evaluation. They may back away from outcomes, scale down their hopes, and otherwise react to what they see as an approaching trap. All of this must be dealt with as gently as possible, understanding that outcome and causal thinking may be new and legitimately frightening. But these kinds of reactions make it particularly clear how valuable the approach is. The resultant clarity, even if the project does not progress to evaluation, has distinct benefits in its own right.

Promoting "Buy-In" by Program Staff

It is a luxury when this process can take place before a new program begins. More often, however, programs decide (or are forced) to evaluate well after a program has begun. Even so, an evaluator is needed to help "surface" the theory of change. What to measure, and how and when to measure it, becomes much clearer once the theory is on paper. Moreover, the process of surfacing the theory encourages "buy-in" by the program staff. They are the ones committed to the outcomes to be measured, rather than having those outcomes suggested to them or, worse, imposed by outsiders.

Sometimes the theory of change can be made explicit very quickly, taking no more than two hours or so of facilitated dialogue. To accomplish this, it is most useful to convene a broad-based group and ask them to begin by defining what their "ultimate" or "longer-term" outcomes are supposed to be. The discussion can then move to interim outcomes, or "signs that you are getting there," and then to a description of process, or "services or techniques you use to reach these outcomes." When the discussion proceeds in this order, program staff often pick up the project’s logic, or lack thereof, rather quickly.

It is also often helpful to have everyone in the room write out the theory of change on a worksheet before group discussion begins. This process allows individuals to assess their own clarity about process and outcome and makes underlying disagreements among staff clearer. The facilitator can collect these worksheets or ask participants to contribute from them verbally when the overall discussion begins.

The role of the evaluator is not to impose, direct, or do anything in isolation from program partners. Instead, the evaluator becomes the facilitator, listener, educator, and partner in using data to improve the program. These roles need not mean that the evaluator becomes less "objective." The rules of evidence for program success do not change. Rather, the evaluator comes to recognize that the knowledge and perspectives of both evaluator and program personnel are necessary to create a good evaluation.

This more cooperative role is also helpful at the stage of evaluating results, when the tendency to "kill the messenger" becomes a distinct occupational hazard. This danger is heightened in evaluations that are imposed or created by outsiders, without participation and buy-in from within the program itself. Charting the theory of change together dispels the argument that the evaluator measured the wrong things or did not understand the program in the first place.

Sometimes flaws in a theory of change are not as apparent to program staff as they are to evaluators. As researchers, evaluators may have direct experience with similar programs or know the issues facing the program from the literature. If so, the evaluator can bring information to the group to use in creating and assessing the proposed theory of change. For example, we would not move ahead to evaluate a model that depended on short-term, information-giving activities to produce changes in contraceptive behavior and reduce teen pregnancy rates. Instead, we would attempt to educate program staff about strategies that have proven effective or ineffective in producing those outcomes, asking them to reconsider their theory of change. In such situations, the evaluator becomes an educator and partner in program design.

Some human service programs and projects approach evaluation with dread, but almost all come to the process with confusion about how to capture what they are doing. As the theory of change emerges, they usually see a line of attack, where before they saw only a complex and vague problem. The next step, then, is for the evaluation team to suggest alternative measurement strategies for each outcome, from which the program and evaluation teams can together choose the most appropriate measures.

By bringing program staff into the design process, evaluators can enhance staff members’ comfort and demystify what they may have believed would be a complex, statistical, impersonal endeavor. This involvement puts program staff in control of how the evaluation will be done and nicely avoids many of the complaints made about evaluations and evaluators, including the imposition of outcomes and measures by outsiders.

Copyright © 1999 by The Aspen Institute