2. Develop the evaluation brief

The evaluation brief is a document used to gain agreement on an evaluation and to develop either a Request for Tender (RFT) to commission an external evaluation or agreements for an internal evaluation.

It is the same concept as the program evaluation plan referred to in the NSW Government Evaluation Guidelines (PDF, 543 KB), which may be developed during the program design phase. The details and scope will differ from evaluation to evaluation. It may also be called the terms of reference for the evaluation. The brief is the basis for developing the evaluation design.

A brief for a program evaluation sets out:

  • Purpose of evaluation — formative or summative
  • Type of evaluation needed — process, outcome and/or economic
  • Scope and focus of the evaluation
  • Key stakeholders
  • Key evaluation questions
  • What is already known about the program?
  • Reporting and communication
  • Decide on balance of internal and/or external evaluation
  • Develop an evaluation strategy (for large programs)
  • The investment in the evaluation
  • Governance mechanisms and stakeholder engagement strategy

Considerations for the evaluation brief include:

  • A program with external funding may have specific requirements in terms of the evaluation focus, methods, timing and scale.
  • High-profile programs or those with significant risks may need more extensive evaluation that will provide early warning of any problems.
  • Pilot initiatives are likely to need more extensive evaluation to provide information not just on whether they work, but how they work, so they can be replicated or scaled up.
  • A program with multiple stakeholders may need more resourcing to support their involvement in negotiating the evaluation focus and methods and communicating findings.

For larger evaluations, you may also progress the development of the evaluation design at this stage, so that an outline of the evaluation design can be included in the brief. This is particularly important when the brief is developed during the program design phase.

Purpose of evaluation – formative or summative

The starting question for planning a program evaluation is "Why do this evaluation?"

The two main evaluation purposes

  1. Formative evaluation for program improvement, learning and decisions about incremental changes.
  2. Summative evaluation for accountability and decisions about whether or not to continue or expand a program.

Formative and summative evaluations may use some of the same evaluation methods.

The classic comparison, by Professor Robert Stake, is "When the cook tastes the soup, that's formative; when the customer tastes it, that's summative".

Formative evaluation refers to evaluation conducted to inform decisions about improvement.  It can provide information on how the program might be developed (for new programs) or improved (for both new and existing programs). It is often done during program implementation to inform ongoing improvement, usually for an internal audience. Formative evaluations use process evaluation but can also include outcome evaluation, particularly to assess interim outcomes.

Summative evaluation refers to evaluation to inform decisions about continuing, terminating or expanding a program.  It is often conducted after a program is completed (or well underway) to present an assessment to an external audience.  Although summative evaluation generally reports when the program has been running long enough to produce results, it should be initiated during the program design phase. Summative evaluations often use outcome evaluation and economic evaluation but could use process evaluation, especially where there are concerns or risks around program processes.

The purpose of a program evaluation will inform (and be informed by) audience needs, reporting requirements and intended users and uses. It will also be shaped by program characteristics including:

  • significance to government, size of investment, risks, sensitivities and decision-making needs
  • the stage and maturity of program implementation
  • the readiness of the program for evaluation including the extent and quality of administrative data.

In some cases, evaluations are required by legislation or policy. Each cluster within NSW Government will have a rolling 12-month evaluation schedule, which must be prepared and submitted to the Expenditure Review Committee (ERC) for approval, beginning in the 2013-2014 financial year. Schedules should include:

  • A list of programs planned for evaluation and review and their expected completion date
  • Who will evaluate or review listed programs
  • The governance processes for the schedule, including internal monitoring and reporting
  • When the schedule will be reviewed and updated.

Type of evaluation needed – process, outcome and/or economic

The most common types of program evaluation within government are process evaluation, outcome evaluation and economic evaluation. Process evaluation is mainly, but not solely, used for formative purposes, while outcome evaluation and economic evaluation are used mainly for summative purposes.

Other evaluation tools (needs assessment, program logic, evaluability assessment) may be used in preparing a program evaluation brief (see below) or to inform program planning.

Types of evaluation

  • Process evaluation: Investigates how the program is delivered, including efficiency, quality and customer satisfaction. May consider alternative delivery procedures. It can help to differentiate ineffective programs from failures of implementation. As an ongoing evaluative strategy, it can be used to continually improve programs by informing adjustments to delivery.
  • Outcome evaluation (or impact evaluation): Determines whether the program caused demonstrable effects on specifically defined target outcomes. Identifies for whom, in what ways and in what circumstances the outcomes were achieved. Identifies unintended impacts (positive and negative). Examines the ways the program contributed to the outcomes, and the influence of other factors.
  • Economic evaluation: Addresses questions of efficiency by standardising outcomes in terms of their dollar value to answer questions of value for money, cost-effectiveness and cost-benefit. These types of analyses can also be used in formative stages to compare different options.
  • Needs assessment: As part of program planning, assesses the level of need in the community and what might work to meet that need. For an existing program, assesses who needs the program and how great the need is.
  • Program logic: Used for program planning and for framing a program evaluation to ensure there is a clear picture of how and why the program will produce the expected outcomes.
  • Evaluability assessment: Used in developing a program evaluation brief to determine whether a program evaluation is feasible and how stakeholders can help shape its usefulness. This is useful if implementation has commenced without an evaluation plan.

Scope and focus of the evaluation

All program evaluations should be as rigorous as possible, aim to produce valid and reliable findings, and reach sound conclusions.

The evaluation brief needs to consider an evaluation design that addresses rigour, utility, feasibility and ethical safeguards.

Whenever feasible and appropriate, program evaluation should aim to measure program outcomes. Planning for rigorous outcome evaluations should begin as early as possible to allow for a strong evaluation design that can include comparison groups for quasi-experimental or experimental evaluation approaches, and arrangements for collecting the required program data.
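One common quasi-experimental approach that relies on a comparison group is a difference-in-differences estimate. The sketch below is a purely hypothetical illustration of why such a comparison strengthens an outcome evaluation; the groups and outcome scores are invented and do not come from any actual program.

```python
# Hypothetical illustration: outcome scores before and after a program, for a
# participant group and a comparison group that did not receive the program.
# All numbers are invented for this sketch.

program_before, program_after = 52.0, 61.0
comparison_before, comparison_after = 50.0, 55.0

# Naive estimate: attributes the whole change in participants to the program.
naive_effect = program_after - program_before

# Difference-in-differences: subtracts the change that occurred anyway
# (as observed in the comparison group) from the change in participants.
did_effect = (program_after - program_before) - (comparison_after - comparison_before)

print(f"Naive before/after change: {naive_effect:.1f}")          # 9.0
print(f"Difference-in-differences estimate: {did_effect:.1f}")   # 4.0
```

In this invented example, a naive before-and-after reading would credit the program with the full 9-point change, while the comparison group suggests about 5 points of that change would have happened anyway.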

It is never feasible or appropriate to try to evaluate every aspect of a program. Any evaluation project needs boundaries in its scope and a focus on key issues. For example, a program evaluation might look at how a program has been implemented in the past 3 years, rather than since it began, or could look at its performance in particular regions or sites rather than across the whole state.  An outcome evaluation may focus on outcomes at particular levels of the program logic or for particular components of the program. A process evaluation may focus on the activities of particular stakeholders, such as frontline staff, or interagency coordination.

Key stakeholders

Key stakeholders are likely to include senior management in the agency, the Strategic Centre, program managers, program partners, service providers, and peak interest groups (representing industries, program beneficiaries and so on).

In developing the evaluation brief you should consider the questions that significant stakeholders will have of the program, when they need answers, and how they will use this information. One method is to map significant stakeholders and their actual or likely questions.

Stakeholders will also have expectations about the most credible evidence to answer those questions. They will have different degrees of knowledge of the program, of the extent to which it can be evaluated, and of the suitability of different evaluation designs and methods. You need to be clear on their interests and understanding of the program, decide how these should be reflected in the evaluation, and consider how their expectations can be managed throughout the evaluation.

Key evaluation questions

A program evaluation should focus on only a small set of key questions. These are not questions asked in an interview or questionnaire, but high-level research questions that will be answered by combining data from several sources.

Key evaluation questions for the three main types of evaluation

Process evaluation
  • How is the program being implemented?
  • How appropriate are the processes compared with quality standards?
  • Is the program being implemented correctly?
  • Are participants being reached as intended?
  • How satisfied are program clients? For which clients?
  • What has been done in an innovative way?

Outcome evaluation (or impact evaluation)
  • How well did the program work?
  • Did the program produce the intended outcomes in the short, medium and long term?
  • For whom, in what ways and in what circumstances?
  • What unintended outcomes (positive and negative) were produced?
  • To what extent can changes be attributed to the program?
  • What were the particular features of the program and context that made a difference?
  • What was the influence of other factors?

Economic evaluation (cost-effectiveness analysis and cost-benefit analysis)
  • What is the most cost-effective option?
  • Has the intervention been cost-effective (compared to alternatives)?
  • Is the program the best use of resources?
  • What has been the ratio of costs to benefits?
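To make the efficiency questions above concrete, the sketch below shows one common way a cost-benefit analysis is summarised: yearly costs and benefits are discounted to present values, and a net present value and benefit-cost ratio are reported. All dollar figures, the four-year horizon and the 7% discount rate are invented for illustration only and are not drawn from any NSW program or guideline.

```python
# Hypothetical illustration of a benefit-cost summary. All figures are invented.

yearly_costs = [1_200_000, 400_000, 400_000, 400_000]    # $ per year, year 0 first
yearly_benefits = [0, 600_000, 900_000, 1_100_000]       # $ per year
discount_rate = 0.07                                      # illustrative rate only

def present_value(cash_flows, rate):
    """Discount a series of yearly cash flows back to year 0."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))

pv_costs = present_value(yearly_costs, discount_rate)
pv_benefits = present_value(yearly_benefits, discount_rate)

net_present_value = pv_benefits - pv_costs
benefit_cost_ratio = pv_benefits / pv_costs

print(f"Present value of costs:    ${pv_costs:,.0f}")
print(f"Present value of benefits: ${pv_benefits:,.0f}")
print(f"Net present value:         ${net_present_value:,.0f}")
print(f"Benefit-cost ratio:        {benefit_cost_ratio:.2f}")
```

A ratio above 1 (or a positive net present value) indicates the discounted benefits exceed the discounted costs under the stated assumptions; a full economic evaluation would also test how sensitive that result is to the discount rate and to how benefits are valued.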

Appropriateness, effectiveness and efficiency

In this Toolkit, we use three broad categories of key evaluation questions to assess whether the program is appropriate, effective and efficient.

Organising key evaluation questions under these categories allows an assessment of the degree to which a particular program in particular circumstances is appropriate, effective and efficient. Suitable questions under these categories will vary with the different types of evaluation (process, outcome or economic).

Typical key evaluation questions

Appropriateness
  • To what extent does the program address an identified need?
  • How well does the program align with government and agency priorities?
  • Does the program represent a legitimate role for government?

Effectiveness
  • To what extent is the program achieving the intended outcomes, in the short, medium and long term?
  • To what extent is the program producing worthwhile results (outputs, outcomes) and/or meeting each of its objectives?

Efficiency
  • Do the outcomes of the program represent value for money?
  • To what extent is the relationship between inputs and outputs timely, cost-effective and to expected standards?

While you can use different processes to develop evaluation questions, these should emerge as you consider the different activities associated with this step (the purpose of the evaluation, the type of evaluation, stakeholder interests, and preliminary assessments). Where evaluations are mandated in legislation or in arrangements such as National Partnership Agreements, there may be formal, and at times general, evaluation questions that must be addressed.

To clarify the purpose and objectives of an evaluation, there should be a limited number of higher-order evaluation questions (roughly 3 to 5) with sub-questions underneath each higher-order question. The higher-order questions can be grouped under the categories of appropriateness, effectiveness and efficiency.
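As a hypothetical illustration of this structure (not a template), the sketch below arranges three invented higher-order questions under the appropriateness, effectiveness and efficiency categories, each with its own sub-questions.

```python
# Hypothetical sketch of a small key-evaluation-question hierarchy.
# The questions are invented examples, loosely echoing those listed above.

key_evaluation_questions = {
    "appropriateness": {
        "To what extent does the program address the identified need?": [
            "Has the level or nature of the need changed since the program was designed?",
        ],
    },
    "effectiveness": {
        "To what extent is the program achieving its intended outcomes?": [
            "For which participant groups are outcomes strongest and weakest?",
            "What unintended outcomes (positive or negative) have emerged?",
        ],
    },
    "efficiency": {
        "Do the outcomes of the program represent value for money?": [
            "How do delivery costs compare with those of similar programs?",
        ],
    },
}

# Quick check against the 'roughly 3 to 5' guidance for higher-order questions.
total = sum(len(questions) for questions in key_evaluation_questions.values())
print(f"{total} higher-order questions defined")
```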

A way to test the validity and scope of evaluation questions is to ask: when the evaluation has answered these questions, have we met the full purpose of the evaluation?

What is already known about the program?

You can prepare for a program evaluation by conducting preliminary investigations into the program and the scope for evaluation. In some cases an evaluation is required irrespective of the state of the program; in other cases work can be done to make the program more able to be evaluated, or alternatively to demonstrate that it is not worth evaluating. Three methods to prepare for an evaluation and inform an evaluation brief are:

  • Review program logic
  • Use evaluability assessment to assess readiness for evaluation
  • Identify what is already known about the program

Review the program logic

Reviewing or developing the program logic is an important prelude to an evaluation. It should provide a useful description of the program and its intended outcomes that will help shape the evaluation questions and data collection methods.

Key evaluation questions for program logic analysis include:

  • What is the problem the program is trying to solve or outcomes it is trying to achieve?
  • How plausible is it that the program activities will achieve the intended outcomes?
  • How appropriate is the program in relation to government policy?

Program logic can also be used to assess whether the program is still appropriate, and if not, provide a basis for discontinuation without the need for further evaluation. For example, program logic analysis can show whether the intended outcomes are still appropriate and link to government priorities. Program logic can also determine whether the program activities and immediate outcomes can plausibly be linked to the intended outcomes, either logically or using evidence from the research literature.

Use evaluability assessment to assess readiness for evaluation

Evaluability assessment is used to determine whether a program evaluation is feasible and, if so, in what form. It will also help identify what will make a program more able to be evaluated, such as refining the program logic or improving the collection of monitoring data.

An evaluability assessment is particularly important if implementation has commenced without an evaluation plan. For example, an evaluability assessment may find that no data on program outcomes is being collected, pointing to data collection design work that is needed prior to conducting an outcome evaluation. The findings from an evaluability assessment should inform the design and feasibility of the program evaluation (see Step 4. Manage development of the evaluation design).

Questions for evaluability assessment include:

  • Does the program have a plausible program logic?
  • Is there a clear purpose and objectives for the evaluation?
  • Can you clearly identify an audience for the evaluation and how the findings will be used?
  • Are there sufficient resources to conduct an evaluation?
  • Is there suitable data from program implementation and/or monitoring, or is it possible to collect data?
  • Can a comparison group be identified to better determine program impacts and outcomes?

Identify what is already known that is relevant to answering key evaluation questions

It is a waste of effort to conduct an evaluation when answers can be extracted from existing data. Before considering program evaluation, analyse performance monitoring data, and scan for evidence about comparable programs.

An analysis of available program monitoring data should reveal trends, patterns and issues with program implementation, and in some cases program outcomes. This analysis can answer some questions about the program, and point to other questions that the program evaluation should address.
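As a minimal sketch of this kind of preliminary analysis, the example below assumes a hypothetical monitoring extract (a CSV with quarter, region, referrals and completions columns; the file and column names are invented for illustration) and summarises the trend in completion rates by quarter and the variation across regions.

```python
# Minimal sketch of summarising hypothetical program monitoring data.
# Assumes an extract "monitoring.csv" with columns: quarter, region,
# referrals, completions. File and column names are illustrative only.
import pandas as pd

df = pd.read_csv("monitoring.csv")

# Trend over time: completion rate by quarter across all regions.
by_quarter = df.groupby("quarter", as_index=False).agg(
    referrals=("referrals", "sum"),
    completions=("completions", "sum"),
)
by_quarter["completion_rate"] = by_quarter["completions"] / by_quarter["referrals"]

# Variation across regions: flag regions well below the overall completion rate.
overall_rate = df["completions"].sum() / df["referrals"].sum()
by_region = df.groupby("region", as_index=False).agg(
    referrals=("referrals", "sum"),
    completions=("completions", "sum"),
)
by_region["completion_rate"] = by_region["completions"] / by_region["referrals"]
low_regions = by_region[by_region["completion_rate"] < 0.8 * overall_rate]

print(by_quarter)
print(low_regions)
```

A summary like this can answer some process questions directly (for example, whether participation is trending up or down) and highlight where the evaluation itself should dig deeper.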

A scan for existing evidence about effectiveness of comparable programs in other jurisdictions or internationally can point to expected outcomes, standards and issues. These can also inform the development of evaluation questions, the evaluation design, methods of data collection, and standards for assessing performance.

Reporting and communication

Evaluation reports are usually the most significant product of a program evaluation project. The final report, either in full or summary form, needs to reach the intended audiences through formats and channels that are meaningful to them. You need to consider which stakeholders will be the audience for the evaluation report or reports, and how they might use those reports. Evaluations may be designed to inform decisions in the budget and policy cycle, meaning that reports are required at specific times.

Paying attention to reporting needs when developing an evaluation brief can help clarify expectations about when information from a program evaluation is needed, and the time it will take for the evaluation to produce reliable and robust findings. In some cases, interim evaluation reports can be timetabled to provide preliminary findings to decision makers.

The planning stages of a program evaluation should consider the practice principle "Evaluation processes should be transparent and open to scrutiny" in the NSW Government Evaluation Guidelines. You should consider how the evaluation findings, methods and data might be shared within government. An evaluation report on any program delivering services to the public should be publicly released in a timely manner, except where there is an overriding public interest against disclosure.

During this step, you should set out the reporting requirements, which will be further developed in the workplan:

  • Identify suitable reporting for key audiences.
  • Consider issues of length, structure, style, and whether to publish – particularly if tendering services from an external agency.
  • Develop a timeframe for reporting to meet evaluation purposes.

Reporting and communication about the evaluation can have an important influence on how the findings will be used.

Decide on balance of internal and/or external evaluation

One of the principles in the NSW Government Evaluation Guidelines is that evaluations should be conducted with the right mix of expertise and independence. In deciding who conducts the evaluation, issues to consider are knowledge of the program or policy, the evaluators' expertise in program evaluation, their perceived and actual independence from the program, and their credibility in the eyes of the intended audience.

For many evaluation projects, a partnership between the internal managers and the external evaluators may be effective and provide good value for money.  The degree of partnership will vary depending upon capacity, logistics and the need for independence.  The internal team is often best suited to manage the overall process and governance arrangements, and also to provide data from administrative systems and coordinate intra-government arrangements such as organising stakeholder interviews. A well-managed partnership approach can bring flexibility to the evaluation, reduce delays, be cost-effective and promote learning about evaluation within program management.

Some possible scenarios for internal and/or external evaluation:

  • Evaluation all done internally: An evaluation can be designed and managed internally where the program is a small to moderate investment and a low risk (tier 1 or tier 2 in the NSW Government Evaluation Guidelines), the evaluation is limited in scale, and internal staff have skills and resources for systematic data collection and analysis.
  • Hybrid — combination of internal and external: External service providers can contribute to hybrid evaluations in different ways:
      • Supporting internal staff to conduct an evaluation through facilitation and/or coaching.
      • Undertaking one or more components (e.g. specialist data collection or analysis, or reporting).
      • Providing an external review of a process or product (e.g. evaluation design, data collection instruments, evaluation report).
  • External — smaller scale evaluation project, designed internally: For some small evaluations, or where an evaluation is repeating a previous design, the evaluation can be designed internally and an RFT then used to engage an external group to implement it.
  • External — larger scale evaluation project, designed by external evaluators: In many cases external expertise will be useful to design the evaluation. This can either be done as part of proposals for the evaluation, or as a separate project (if the evaluation is very large and complicated).

Develop an evaluation strategy (for large programs)

The scale of the evaluation should be proportionate to the size or significance of a program, as set out in the NSW Government Evaluation Guidelines. For large programs this may involve a series of evaluation projects and related activities. This can be the case for programs that are large-scale (significant investment, extended reach), run for three or more years, and/or are complicated (multiple sub-programs, across agencies or whole of government).

Such programs may warrant an evaluation framework and strategy that sets out a series of evaluation projects and activities for data development and evaluation capacity building over the period of the program. This will allow you to build in process, outcome and economic evaluations at key times that match the developing maturity of the program and meet the needs for information for formative and summative purposes.

The evaluation framework and strategy can be developed at the time of the program design and reviewed at milestones, such as after the delivery of each evaluation report.

The investment in the evaluation

Like any project, the evaluation requires an investment of financial and staffing resources commensurate with the scale of the program and the evaluation. In the program design stage, a proportion of budget (and/or internal staff time) should be allocated to cover evaluation activities.

The cost of an evaluation project will be shaped by the scope of the evaluation activities, whether they are to be carried out internally or by external consultants, and the extensiveness of additional data collection, analysis and report writing.

While the detailed tasks and scope of an evaluation project will not be clear until the design step, the budget allocated when commissioning an evaluation will indicate the extent and depth of work that can be undertaken within it.

Governance mechanisms and stakeholder engagement strategy

Ideally, a governance mechanism such as a steering committee or advisory group will be established to provide direction or advice at various stages of the evaluation. The benefits include a greater range of perspectives and expertise, as well as greater ownership of the evaluation process by key stakeholders.

You should consider a governance group that matches the purpose and the scale of the evaluation. The group may be entirely within government, or include government and external stakeholders.

In most cases, membership should extend beyond the program itself to include relevant people from elsewhere in the agency or from partner agencies. For significant evaluations (tier 3 or 4), you should consider including a representative from the Centre for Program Evaluation. External stakeholders can include key academics who research the program area, representatives of peak groups for program clients, industry bodies, and program service providers.

Product
Evaluation brief with purpose, scope, key evaluation questions, governance arrangements, budget and timelines.