Improving Evaluative Activity in the New Zealand … Evaluative Activity in the New ... what part government interventions have played in their achievement, ... Pathfinder project

Doing the Right Things and Doing Them Right

Improving Evaluative Activity in the New Zealand

State sector

��

� !�"#$%"�##�%"&�

2

Table of Contents

EXECUTIVE SUMMARY.............................................................................................4

BACKGROUND – WHY THIS PROJECT? ...............................................................6

WE WERE ASKED TO LOOK AT EVALUATION ...................................................................6 WHAT THE PROJECT HAS DONE.......................................................................................6 WHAT THE PROJECT HAS NOT DONE................................................................................7

WHY IS EVALUATIVE ACTIVITY IMPORTANT?................................................8

OUR SCOPE HAS BROADENED FROM EVALUATION TO EVALUATIVE ACTIVITY.................8 EVALUATIVE ACTIVITY REDUCES UNCERTAINTY AND IMPROVES DECISION-MAKING......8 DIFFERENT TYPES OF EVALUATIVE ACTIVITY SUPPORT DIFFERENT DECISIONS ...............9 FORMAL EVALUATIONS ARE NOT ALWAYS THE BEST OPTION..........................................9 ONE SIZE DOES NOT FIT ALL..........................................................................................10

WHAT DID WE HOPE TO SEE?...............................................................................11

WHAT DID WE SEE?..................................................................................................12

THE LITERATURE SAYS THAT EVALUATIVE ACTIVITY IS NOT OPTIMAL .........................12 RECENT INITIATIVES ARE IMPROVING THIS SITUATION .................................................13 OUR DISCUSSIONS WITH STAKEHOLDERS REVEALED A MIXED PICTURE ........................14

HOW WELL DOES THE STATE SECTOR DECIDE WHAT EVALUATIVE ACTIVITY TO UNDERTAKE?..................................................................................15

THE PICTURE IS MIXED, WITH MANY EXAMPLES OF GOOD PRACTICE . . . .......................15 BUT SOME SCOPE FOR IMPROVEMENT REMAINS............................................................17 WHY ISN’T THE GOOD PRACTICE MORE WIDESPREAD?..................................................17 SOME FURTHER MEASURES ARE NEEDED TO ADDRESS THESE FACTORS ........................20

HOW WELL DOES THE STATE SECTOR USE EVALUATIVE ACTIVITY? .21

THE PICTURE IS MIXED, WITH MANY EXAMPLES OF GOOD PRACTICE . . . .......................21 BUT SOME SCOPE FOR IMPROVEMENT REMAINS............................................................23 WHY ISN’T THE GOOD PRACTICE MORE WIDESPREAD?..................................................23 SOME FURTHER MEASURES ARE NEEDED TO ADDRESS THESE FACTORS ........................27

EVALUATIVE ACTIVITY TO ACHIEVE OUTCOMES �� ..............28

WHAT SHOULD GOVERNMENT DO TO IMPROVE EVALUATIVE ACTIVITY?...................................................................................................................30

THE BACKGROUND TO OUR RECOMMENDATIONS..........................................................30 RECOMMENDATIONS TO GROW A WIDESPREAD CULTURE OF INQUIRY ..........................31 RECOMMENDATIONS TO IMPROVE COORDINATION AND PRIORITISATION......................34 RECOMMENDATIONS TO DEVELOP CAPABILITY ............................................................35

ANNEX A – TERMS OF REFERENCE ....................................................................38

ANNEX B – ORGANISATIONS THAT CONTRIBUTED TO THIS PROJECT.41

3

ANNEX C – RECENT INITIATIVES IMPROVING EVALUATIVE ACTIVITY.........................................................................................................................................43

ANNEX D – OPTIONS TO IMPROVE THE STATE OF EVALUATIVE ACTIVITY.....................................................................................................................46

OPTIONS TO GROW A WIDESPREAD CULTURE OF INQUIRY.............................................46 OPTIONS TO IMPROVE COORDINATION AND PRIORITISATION ........................................49 OPTIONS TO DEVELOP CAPABILITY ...............................................................................51

GLOSSARY...................................................................................................................53

REFERENCES ..............................................................................................................55

4

EXECUTIVE SUMMARY

Why we did the project

1 This project investigated whether any initiatives were required to enhance the evaluation environment and encourage more effective evaluation in the State sector, taking into account the recently established Social Policy Evaluation and Research Committee (SPEaR).

What we did in the course of the project

2 The project team gathered and analysed information from many sources, including literature, recent reviews and interviews with a wide range of stakeholders. We released a discussion paper in November 2002 and received considerable feedback on this. An advisory group comprising evaluators, policy managers and a statistician assisted us throughout the project.

Scope of the project

3 Although our project was established to look at “evaluation”, we expanded the scope to consider the broader concept of “evaluative activity”, which we define as: activity that seeks evidence of actual performance of policies or programmes once they are being, or have been, implemented. We consider that the purpose of evaluative activity is to reduce uncertainty about the efficiency and effectiveness of policies and programmes, helping to improve existing and future policies and programmes.

4 Evaluative activity includes formal evaluations of policies and programmes as well as monitoring and other forms of inquiry. The appropriate type of evaluative activity will differ depending on the questions to be answered.

What we discovered during the project

5 We found many examples of good practice in the State sector. Overall, we received feedback that targeting, conduct and use of evaluative activity are improving. Recent initiatives, such as Managing for Outcomes, are increasing the demand for evaluative activity and the evidence it can provide. Groups such as SPEaR are raising the profile of research and evaluation in the social sector by sharing good practice and building capability.

6 However, scope for improvement remains. The evaluative resource is currently poorly targeted, with repeated investments in areas where there is a high degree of certainty and a lack of consideration of agency priorities across the whole spectrum of business or wider government priorities. Use of evaluative findings to inform policy, service delivery or broad government strategy and budget decision-making decisions is patchy.

7 The main causes of these problems are:

• a variable culture of inquiry: demand for high quality evaluative activity from Ministers, Parliament, central agencies and State sector managers and commitment to using the findings of evaluative activity to inform decisions are variable;

• limited capability: limited understanding by policy and programme managers of when different types of evaluative activity offer value, and how to interpret and

5

use the findings; and limited skills within and outside the State sector to conduct high quality evaluative activity; and

• poorly coordinated and prioritised evaluative effort: evaluative effort is often not well coordinated or prioritised in or between agencies; sharing and consistency of data is limited and evaluative findings are often not shared widely within and between agencies. There is also little attention to evaluating major policies that cross agency boundaries.

Our recommendations address these three areas

8 Some existing initiatives, like Managing for Outcomes and SPEaR, will go some way to addressing these problems. However, to embed a widespread culture of inquiry throughout all sectors and to coordinate the evaluative effort more systematically, we are proposing initiatives to:

• emphasise the importance of evaluative activity by building on existing initiatives and strengthening the incentives in the budget process and chief executive performance review process for departments to produce and use evaluative activity;

• enhance capability to target, conduct and use evaluative activity through a network of practice and training of central agency staff; and

• advise the Minister of Finance, Minister of State Services and Prime Minister on gaps in evaluative activity in major policy areas where the Government intends to consider its policy direction. In many cases, these gaps will be where major policies cross agency boundaries.

9 None of the initiatives will deliver immediate improvements because cultural change and capability development takes time. Leadership from Ministers and senior officials is critical in developing consistent interest in evaluative activity and a commitment to use such information in policy and programme development.

10 The projects undertaken by SPEaR are likely to improve capability and coordination of research and evaluation in the social sector. Following the review of SPEaR in mid 2004, we recommend that the Treasury and the State Services Commission explore whether similar mechanisms could be valuable in other sectors.

11 The current review of Statistics New Zealand is likely to recommend some changes to the way that administrative data on government programmes is collected, and on the role of official statistics. Such changes have the potential to improve the quality of information collected about policies and programmes and coordination between agencies.

12 Our recommendations will not improve the state of evaluative activity immediately. Changing culture, developing capability and improving information systems take time and resources. Therefore we recommend that The Treasury and the State Services Commission conduct a review in December 2005 to assess whether there has been an improvement in evaluative activity in the State sector and whether further work is necessary.

6

BACKGROUND – WHY THIS PROJECT?

We were asked to look at evaluation

13 This project is part of a work programme arising from the Government’s decisions on the report of the Review of the Centre (RoC) in December 2001. The RoC report expressed concern that evaluation is not conducted and used effectively enough in the New Zealand State sector.

14 RoC was primarily interested in impact evaluation – evaluation to determine the outcomes of government interventions. This focus reflected concern expressed in a number of earlier reviews of the subject that New Zealand did relatively little impact evaluation compared to other jurisdictions, especially Australia.

15 It also reflected other developments within the Public Service that are placing greater stress on the achievement of outcomes. The increased focus on achieving outcomes has emphasised the need for information about what outcomes are being achieved, what part government interventions have played in their achievement, as well as how interventions could be changed or improved so as to increase their impact. RoC tended to the view that evaluation was not done or used as well as it should be in making improvements in what government does and the way that it does it.

16 RoC noted that there were developments under way – including SPEaR, the Pathfinder project and the greater outcomes focus in Statements of Intent – that might improve the evaluation situation. The RoC report recommended, and Government agreed, that there should be a project to determine the likely impact of these developments on the conduct and use of evaluation, and whether additional steps might be necessary to improve matters still further. Hence this project.

17 Our terms of reference are attached as Annex A.

What the project has done

18 A small team of staff from the State Services Commission (SSC) and the Treasury has worked on this project. This paper expresses the views of this project team, not of the government, and is informed by consultation within and outside the State sector and the literature. We also convened an advisory panel of evaluators, policy advisors, and a statistician to assist us.

19 We gathered information from the following sources:

• interviews with a number of participants and stakeholders in State sector evaluative activity, including public and private sector evaluators, specialists in �� Minister’s advisor;

• responses to a discussion paper on the actual state of evaluative activity, and the nature of some current problems, which we circulated to all departments, some Crown entities, and some academics and private sector consultants;

• earlier reviews of the state of evaluation, such as the SSC’s occasional paper Looping the Loop – Evaluating Outcomes and Other Risky Feats;

• relevant literature, including reports of developments in overseas jurisdictions; and

• extensive consultation with departments on a draft version of this report.

7

20 Annex B lists the organisations that contributed to this project through interviews, discussions or written comments.

What the project has not done

21 We have not undertaken a full survey of evaluative activity and compared the current level with past activity in order to ascertain the change taking place. Nor have we quantified the amount of resource spent on evaluations within the State sector. We were advised from the outset that such an exercise would not be feasible or accurate enough to bring adequate benefit.

22 We have focused on what more needs to be done to build on existing good practice around the State sector rather than an in-depth analysis of what has driven developments to date. Good practice has primarily emerged because of the support of individual chief executives or senior managers who have directed resources to develop evaluative capability.

We wish to acknowledge the contribution of Mark Robinson who managed this project prior to his death in December 2002.

8

WHY IS EVALUATIVE ACTIVITY IMPORTANT?

Our scope has broadened from evaluation to evaluative activity

23 Although our project was established to look at “evaluation”, we have expanded the scope to consider the broader concept of “evaluative activity”.

24 The word “evaluation” conjures up different things for different people. Some use it to refer only to impact evaluation. Others distinguish between evaluation and other methods of inquiry (such as monitoring or performance audit) but acknowledge different forms of evaluation (such as impact evaluation, process evaluation, summative evaluation and formative evaluation – see Glossary for details).

25 Still others take an even broader view, using “evaluation” to mean any activity that enables organisations to consider the effectiveness and efficiency of their programmes and policies, to learn from that consideration, and to apply their learning to improve their work.

26 Our project uses the concept “evaluative activity”, which we define as activity that seeks evidence of actual performance of policies or programmes once they are being, or have been, implemented. In this sense, evaluative activity includes formal evaluations of policies and programmes as well as monitoring and performance audit.

27 We do not regard policy analysis on its own as an evaluative activity. However, we recognise that good policy analysis often draws on the findings of evaluative activity and also improves the quality of future evaluative activity. In general we would argue that input from evaluative activity is relevant at each stage of the policy cycle. We appreciate that the boundaries of evaluative activity, research and other analytical work are not rigid, nor should they be.

28 Analysis of organisational capability, although very important for organisations, is not within the scope of this paper.

Evaluative activity reduces uncertainty and improves decision-making

29 A central theme of the RoC report was the need to focus the public management system increasingly upon achieving outcomes for citizens. Consequently, our interest in evaluative activity is about how it can help improve decision-making, so that government interventions achieve better outcomes for citizens.

30 Evaluative activity can help organisations to learn from the past and make good decisions about how to improve existing and future policies and programmes to achieve the best results for New Zealanders.

31 Decisions about what government should do and how best to do it are often made with considerable uncertainty. Evaluative activity can help reduce this uncertainty if it provides evidence of the effectiveness and efficiency of policies and programmes as well as information about unintended consequences.

32 However, there are limits to the extent to which evaluative activity can assist in substantially reducing uncertainty. In particular, it must be directed to the right things, undertaken well and provide timely, reliable information. Even when these conditions are met, evaluative activity will rarely provide conclusive results and some judgement about subsequent decisions will remain.

9

33 The value of evaluative activity is that it adds evidence to help inform decisions, so its value is likely to be greater where the existing evidence base is small.

34 We do not see accountability or knowledge development as the ultimate outcome of evaluative activity. However, we do recognise these as intermediate outcomes towards the end outcome of improved decision-making.

35 We appreciate that evaluative activity is only one input into decision-making and is often limited in the information or evidence that it can provide. Its impact on decision-making depends on the integration of results into the policy process.

Different types of evaluative activity support different decisions

36 Good outcomes can only be achieved if high quality evaluative activity takes place throughout the policy/programme cycle. This means that evaluative activity is a fundamental part of good management.

37 Different types of evaluative activity support different decisions. The purpose of evaluative activity shapes the appropriate approach and design, and influences decisions about the type of evidence and methodologies that are credible for different audiences. Frequently, a mix of evaluative methods may be necessary to provide the evidence required.

38 Different decisions need to be made at different stages in the policy development and programme implementation cycle. Formative evaluation ensures that programmes are implemented in a way that maximises their chances of meeting their objectives successfully. Therefore, it can provide useful information early on abut how to improve implementation or programme operation (i.e. programme efficiency). Impact or outcome evaluation measures the outcomes of a programme (i.e. the programme’s effectiveness). Hard strategic decisions need to be made regarding allocation of resources to these different types of evaluation for particular policies and programmes. The key is to use the right type of evaluative activity to answer the questions posed by decision-makers, so that it adds the most benefit in terms of improved policies and programmes for the least cost.

Formal evaluations are not always the best option

39 We want to stress that we are not proposing that every policy and programme should be formally evaluated.

40 We believe that State organisations should make a habit of learning and improving – of regularly asking themselves whether the things they are doing are still necessary, whether they are the most effective things to do, and whether they could be done more efficiently. But this does not mean that every programme requires a full-scale, formal evaluation. New Zealand does not have the resources – financial or human – to do this. Other forms of evaluative activity, including longitudinal studies, action research and monitoring, can provide useful information.

41 Evaluative activity only represents good value for money where the cost of undertaking the activity is less than the benefits as a result of undertaking it.

42 In many cases, monitoring the implementation and impact of policies and programmes against the expectations at the outset is likely to provide sufficient evidence. Monitoring can also highlight where a more in-depth evaluation is required.

10

One size does not fit all

43 Although we believe that all State sector agencies need to look at how their policies and programmes are actually performing, we are not setting out to prescribe a single approach for all agencies. Nor are we implying that every agency needs an evaluation unit.

44 Different types of agencies or business units within agencies may have different roles in relation to evaluative activity. Evaluative activity takes place at many levels, for example departmental programme implementation, interagency action and impact on outcomes in Statements of Intent, to name but a few. Different parts of the State sector may experience different decision challenges and may consequently make different choices about the focus, appropriate methods, and use of any results or lessons. None of the following would exclude other interests for individual agencies, and many agencies would use the whole range of evaluative methods. But, as examples, the following might be the focus:

• delivery agencies might tend to focus on how efficiently they are implementing their programmes;

• policy ministries might tend to focus on the effectiveness of the interventions being delivered within a sector, including their relative effectiveness;

• population-based ministries might operate at one remove from specific policies, seeking to influence the analytical processes by which policies are designed and evaluated, and to encourage the collection of data in forms that allow analysis of the impacts for their particular target group; and

• central agencies might be interested both in the efficiency and effectiveness of government interventions, as well as how well agencies are managing their evaluative activity.

11

WHAT DID WE HOPE TO SEE?

45 The features that we think would characterise a State sector that is planning and using evaluative activity effectively are:

• evaluative effort is directed at those areas where increased certainty about efficiency and effectiveness offers the greatest gains, and the type of evaluative activity undertaken provides the level of certainty required at the least cost; and

• ethical, robust, relevant findings of evaluative activity inform government decision-making and prioritisation in the budget and policy processes and programme management within State agencies and other delivery organisations.

46 The conditions that we think are likely to give rise to such a State sector are:

• demand from State sector managers, Ministers, Parliament and the public to know what works and their willingness to use evaluative findings to assist their decision-making;

• a State sector culture that is receptive to experimentation, learning and improvement;

• leadership and support from central agencies;

• a realistic understanding by all stakeholders of what different types of evaluative activity offer and what it takes to do them well;

• policies and programmes designed with clear objectives, intervention logics and plans for future evaluation and/or monitoring;

• close links between policy, implementation, programme delivery and evaluative activity – within and between agencies – so that evaluative activity provides useful information to inform decision-making and results are shared;

• use of a variety of research and evaluative methods that fit the purpose of evaluative activity and the needs of different organisations;

• parts of the Public Sector management system that mutually reinforce the value of evaluative activity and provide incentives to plan, conduct and use evaluative activity;

• an adequate supply of personnel with the knowledge and skills to plan, conduct and use evaluative activity;

• adequate funding to conduct quality evaluative activity; and

• good quality information and systems to provide and make use of such information, within and between agencies.

12

WHAT DID WE SEE?

47 In order to ascertain more about how well the State sector is planning and using evaluative activity, we:

• reviewed the literature, although we focused more on evaluation than on evaluative activity and more on impact than on formative or process evaluation. The focus of this paper is upon evaluative activity and not solely upon outcome evaluation, so the conclusions of this review should be read in this light;

• looked at a range of initiatives that have taken place over recent years that have contributed to improving the state of evaluative activity; and

• sought input from a wide range of people involved in all types of evaluative activity, policy analysis, programme management and information management in order to get a picture of what is happening in the State sector today.

48 We considered getting evidence of how much money is spent on different types of evaluative activities and how the results have been reported and used. However, we were consistently informed that it would not be feasible or useful to collect this information. Instead, our findings are based on the input that we have received from a wide range of people and some previous reports that included some limited empirical evidence on the state of evaluative activity.

The literature says that evaluative activity is not optimal

A range of evaluative activity takes place 49 Reports from the SSC and the Office of the Auditor-General (OAG) have noted that evaluative activity in the State sector has increased in recent years. Most of the evaluative activity that is conducted assesses process, implementation and efficiency for internal management purposes.i In general, evaluation in New Zealand is relatively limited in scope, usage and impact.ii

50 In the absence of a government policy on evaluation, individual departments have gradually developed their own capacity to conduct evaluations by developing evaluation and research units and their own evaluation strategies.

A lack of focus on and use of outcome evaluation 51 Various commentators have noted that the New Zealand public management system’s focus primarily on efficiency, has led to limited information available about the effectiveness of interventions.iii An SSC assessment of the use of evaluation in a sample of Cabinet papers from June 1998 found that only seven percent of the papers in the sample included any proposed ex-post review of results.iv A second assessment in 2000 found some improvements, particularly more proposals to evaluate proposed policies or programmes. However, ex ante evaluation criteria were often not specified in Cabinet papers and programme logic remained relatively vague.

52 The Department of Labour’s Labour Market Policy Group, the Environment 2010 work, the Ministry of Transport and the former Department of Social Welfare were the few examples of units producing good quality outcome evaluation according to SSC.v OAG’s 2001 survey of the use of impact evaluation in 31 departments produced similar results.

13

Lack of integration of evaluation results into management and policy processes 53 Another problem identified in the literature was the failure to systematically integrate evaluation results into the policy process.vi The only requirement for evaluation of policies or programmes is the requirement to propose an evaluation approach when submitting budget bids for new initiatives. As a result, evaluation activity tends to focus on less than five percent of total expenditure.vii

Some of the essential prerequisites are not in place 54 In his comparative survey of evaluation histories, Mackay identified three general factors that contribute to successful institutionalisation of evaluation.viii These key factors are:

• the presence of a strong departmental champion;

• Government support; and

• a whole of government approach.

These factors have been largely absent in New Zealand, including perceived discouragement of evaluative activity by the Treasury in New Zealand.ix This contrasts with the roles of the Canadian, Australian and United Kingdom Treasuries as champions of evaluative activity.

Recent increase in evaluative activity 55 There is an increasing interest in evaluative activity worldwide. As Geoff Mulgan points out “governments have become ravenous for information and evidence”.x He argues that this is partly driven by the increasing sophistication of the general public and more extensive public access to information.

56 Evaluation activity in New Zealand has also increased recently.xi Evaluation is being used in select cases to enhance learning about policy effectiveness. For example, recent changes to the Domestic Violence Act 1995 occurred in response to evaluations of the effectiveness of the Act’s programmes for women and children. Policy makers increasingly use domestic and international literature in the early stages of policy development. Another promising development is the commencement of meta-reviews of evaluation results at the Department of Labour and Ministry of Education.xii The Ministry of Economic Development is at the forefront of international efforts to develop evaluation parameters for growth and enterprise development (particularly in relation to Small and Medium-Sized Enterprises and Entrepreneurship policies and programmes).xiii

57 However, evaluation findings are often only used to back up a case for a given policy and not used when they provide evidence contrary to the policy.xiv

Recent initiatives are improving this situation

58 There are some recent initiatives that are beginning to improve the situation in New Zealand. Some examples include:1

• Increased focus upon outcomes and evaluative thinking: For example, the Managing for Outcomes initiative and the Pathfinder project are supporting departments to focus more upon outcomes and to measure progress towards outcomes and adapt their intervention mix in light of the findings. The budget

1 A more detailed list of these and other initiatives are presented in Annex C.

14

requirement to consider evaluation as part of all new initiative proposals and value for money reviews also signal the importance of evaluative inquiry;

• Greater coordination and information sharing: For example, there is a range of initiatives improving sharing of information, including the mapping project by SPEaR that collects the research and evaluation programme activities of some 20 agencies in one place, and Statistics New Zealand’s administrative data integration project that will help synthesise data across the government sector. Te Puni Kokiri (TPK) is leading an inter-agency forum on the State sector Evaluation of Capacity Building. This forum aims to encourage agencies to actively undertake collective planning for evaluations and sharing of best practice. The agencies involved include the Ministry of Social Development, Department of Child, Youth and Family Services, and the Community Employment Group at the Department of Labour; and

• More opportunities to develop evaluative capability: For example, SPEaR is developing best practice guidelines and facilitating joint training and development of personnel in relation to evaluative activity. The newly established Australia and New Zealand School of Government is building policy, research and management capability, and other tertiary institutions are offering more courses relevant to evaluative activity (e.g. Massey University now offers a postgraduate diploma in evaluation). The Executive Leadership Programme (as part of the Senior Leadership Management Development Programme) will also provide Public Service leaders with an appreciation of the importance and value of evaluative thinking and activity.

Our discussions with stakeholders revealed a mixed picture

59 The input we received from a wide range of stakeholders (see Annex B) revealed considerable variation in the extent and quality of planning and use of evaluative activity across the New Zealand State sector.

60 Overall, the state of evaluative activity seems to have improved over the past few years. However, this improvement is not consistently displayed across the State sector.

61 The next three chapters set out in more detail the picture that has emerged from our reading and discussions of the current state of evaluative activity in the New Zealand State sector and the probable contributing factors.

62 We present our findings as they relate to two stages of evaluative activity:

• how the State sector decides what evaluative activity to undertake; and

• how the State sector uses the findings of evaluative activity.

63 In each chapter, we include some positive examples that we believe demonstrate that, despite the overall picture of shortfalls in evaluative activity, there is good practice in many departments. This is not a comprehensive list of good practice and we do not intend to infer that the agencies mentioned are the only ones undertaking good practice.

15

HOW WELL DOES THE STATE SECTOR DECIDE WHAT EVALUATIVE ACTIVITY TO UNDERTAKE?

64 In paragraph 44, we noted that agencies making good decisions about what evaluative activity to undertake would direct their evaluative effort to areas where increased certainty about efficiency and effectiveness offers the greatest gains, and the type of evaluative activity undertaken provides the level of certainty required at the least cost.

The picture is mixed, with many examples of good practice . . .

65 Through our conversations with stakeholders and reading of the literature, we have come across the following examples of good practice in deciding what evaluative activity to undertake. This is not a comprehensive list of good practice and we do not intend to infer that the agencies mentioned are the only ones undertaking good practice.

. . . of what questions to ask and what type of evaluative activity to undertake 66 Agencies who target evaluative activity well have processes to both determine which questions are the most significant to answer, looking across the whole spectrum of their activity, and the most cost-effective ways to answer these questions.

67 Looking across the State sector, some agencies have developed an overarching framework to target evaluative activity to the most significant areas of their work. For example:

• the Ministry of Research, Science and Technology has developed a programme of evaluations that spans its major output classes on a rolling 5-year cycle in conjunction with ongoing evaluative activity across the whole Vote. These evaluations are for the purpose of reviewing the continued policy relevance of the output classes and whether any modifications are necessary.

• the Ministry of Education is currently developing an approach for planning evaluations within a clear policy framework that will allow it to look at the comparative value of different programmes rather than look at individual programmes in isolation; and

• the Inland Revenue Department’s evaluation programme is based on the department’s strategic priorities and a regular process of consultation. A range of external and internal stakeholders decides upon the evaluative priorities for the department.

68 Some agencies have developed criteria to decide when it is appropriate to undertake evaluative activity. For example, the National Library has criteria in its guidelines including whether a programme is a pilot that could become a major initiative, whether the programme is costly, whether much is known about the area and the feasibility of evaluative activity, including costs.

69 Some agencies review areas with existing evaluative activity before undertaking new evaluative activity. In particular, the Department of Labour and Ministry of Education have conducted meta-reviews of evaluation results to learn from the existing evaluative activity that they have undertaken.xv

70 We have found some examples of evaluative activity that assesses the impact of a mix of interventions in support of strategic government objectives. For example, some evaluation was linked to the Government’s “Reducing Inequalities” strategy for addressing social and economic disparities.

16

71 Agency processes should also provide some way of determining when to use which form of evaluative activity. In some instances, formal evaluations will be the best way to provide the necessary information. However, often cheaper and more routine forms of evaluative activity, like monitoring, can provide information during the implementation of a policy or programme in order to check progress and identify problems in delivery and impact, so that adjustments can be made. For example:

• high quality comprehensive administrative data and existing survey data can provide an important source of information to monitor policies and programmes. For example, the Land Transport Safety Authority uses crash data, as well as other information like behavioural measures, to monitor the effectiveness of road safety measures;

• Education Review Office (ERO) uses indicators to explore the relationship between different areas of school performance. Discussions between ERO and schools under review help determine the indicators that are the most appropriate to use during the course of a review, assist developmental thinking and lead to a shared understanding about the basis of evaluative judgements; and

• The Department of Labour uses a wide range of evaluative activity to provide information of the effectiveness of its policies and programmes, including longitudinal studies. For example, the Department of Labour, working with Statistics New Zealand, has completed a two-year pilot to design a survey instrument that tracks how migrants are settling. This work is intended to inform, amongst other things, how best to support migrant settlement.

. . . of who is involved in making decisions 72 Strong links between policy and evaluation staff can help target evaluative activity to areas where the findings are likely to be used. For example, evaluators and policy analysts at Child, Youth and Family work together to determine when policies require evaluative activity, the appropriate form of evaluative activity and how to translate results into policy and practice.

73 Some efforts to undertake multi-agency collaborative evaluation projects also exist. For example, the Department of Labour has joint evaluation strategies with both the Ministry of Social Development and the Accident Compensation Corporation. Also, Ministry of Social Development and Child, Youth and Family plan evaluations together.

74 Involving operational staff can also be important in deciding how and what to evaluate. Early engagement with operational staff can improve programme design and operation. The Ministry of Social Development, when implementing a budget pilot initiative for domestic purposes benefit clients, ensured that the evaluators worked closely with operational policy staff to determine their programme and outcome logic. This helped the operation of the programme as well as the planning of the evaluation.

. . . of when evaluative activity is undertaken 75 Evaluative activity should be planned early enough in the policy and implementation processes to ensure that appropriate data collection systems are put in place to allow effective evaluative activity later.

76 Some agencies make specific efforts to plan evaluative activity early. For example, in the Labour Market Policy Group of the Department of Labour, policy staff are encouraged to consult their evaluation colleagues early in the development of policy proposals, especially where the intention is to evaluate an initial pilot, so that the scale and nature of the pilot can be designed to allow a valid and useful evaluation.

17

77 Similarly, the Ministry of Justice started planning an evaluation to test the impact of the new sentencing and parole legislation before the legislation was enacted. The team developing the evaluation identified the likely information requirements, gaps in current data and began a process of filling those gaps using the recently established inter-agency data warehouse.

78 Evaluation of pilots can provide useful information on the likely success and problems of new policies or programmes. The Ministry of Economic Development and Industry New Zealand have run pilots in the industry and regional development area. Pilots of the Business Clusters programme and Fast Forward New Zealand highlighted implementation and delivery issues to be addressed and improved in the roll-out of the programmes.

79 Similarly, the Ministry of Social Development’s formative evaluation of the pilot for domestic purposes benefit clients quickly provided evidence that the key assumptions underlying the pilot were not holding up. Based on this information, the pilot was discontinued, despite having had funding approved for three years.

But some scope for improvement remains

80 However, these examples of good practice are not widespread. Instead, we have seen:

• repeated investments in areas where there is a high degree of certainty and existing evaluation results;

• poor coordination between evaluation and policy staff so that the necessary prerequisites for obtaining robust results are not in place;

• few attempts to synthesise existing findings;

• much evaluative activity still focusing on discrete, relatively insignificant initiatives without considering agency priorities across the whole spectrum of business;

• few instances of evaluative activity or joint evaluative strategies to assess the effectiveness of major policies and programmes, which often cross agency boundaries; and

• few instances of evaluative activity asking questions about the comparative effectiveness of policies and programmes aimed at the same broad outcome.

81 The problem can be characterised most clearly by saying that there is too much evaluative resource applied to policies and programmes of lower significance and not enough evaluative resource invested in larger, more significant policies and programmes where most benefit can be gained.

Why isn’t the good practice more widespread?

82 Some agencies are targeting their evaluation resources effectively to answer the most significant or uncertain areas of their business. However, there is still room for improvement. This section outlines some of the main reasons why agencies are not targeting their evaluative activity as effectively as they could.

18

Variable culture of inquiry 83 Some State sector agencies still treat evaluative activity as a compliance exercise, undertaking only the minimum required to meet Cabinet or legislative requirements, rather than asking the highest priority questions to help improve their business.

84 We think that this compliance approach stems from the limited demand for evidence from Ministers, Parliament and the central agencies, as well as limited internal demand from managers.

85 Ministers are interested in what works, but Ministers and Members of Parliament do not consistently signal a strong demand for information about the efficiency and effectiveness of interventions. Some of the reasons for this include:

• public pressure to preserve programmes and policies that suit individual interest groups rather than seeking a set of interventions that deliver the best overall benefit;

• a mismatch between the time pressures which Governments face and the timeframes for major evaluative inquiry; and

• the cost of undertaking evaluative activity. Evaluators often recommend allocating 12 percent of programme budget to evaluation. With limited resources, public pressure for the delivery of service can be more compelling.

86 The public management system, designed and implemented by the central agencies, has not reinforced evaluative activity as an integral part of the policy/programme cycle. Neither the budget process nor the chief executive performance management process has signalled strong demand for evaluative activity.

87 The majority of funding for new initiatives is provided as part of departments’ ongoing baseline funding rather than provided for a limited period of time with ongoing funding contingent on evidence of cost-effectiveness.

88 Although the budget process has required departments to signal their evaluation intentions when submitting new initiative proposals since 2000, in practice, this requirement has little impact in promoting evaluative activity:

• departments often have not considered what evaluative activity to undertake when they submit budget bids or they plan to undertake evaluations where findings can be reported back to Cabinet within a short timeframe, which limits the type of evaluative activity that can be undertaken;

• there is no systematic follow up to ensure that the intended evaluations of approved new initiatives are actually conducted and the findings fed into future decisions; and

• the focus is on new initiatives, creating a tendency to skew attention to new, often discrete, initiatives with little attention to evaluating major existing policies and programmes. Value for money reviews were introduced in 2000 as a mechanism to move beyond the margins and help reprioritise spending across a Vote or group of Votes. However, the value of these reviews depends on the extent of ministerial involvement, which has been variable.

89 The chief executive performance management process still does not adequately recognise and reward genuine inquiry, despite a new focus on Managing for Outcomes. The Deputy Commissioners at SSC have increased the focus upon Managing for Outcomes. However, the chief executive performance management approach is increasingly tailored to individual chief executives, focusing on specific issues with less

19

overall prescription. It may not be the case, therefore, that all chief executives are given clear expectations and rewards for good evaluative inquiry.

Poor coordination and prioritisation across the State sector 90 Beyond the limited Cabinet and legislative requirements, most decisions about what evaluative activity to undertake are at the discretion of individual agencies, and often with individual managers within those agencies. A decentralised system has advantages, but coordination and prioritisation can be a challenge.

91 Often a large number of government interventions contribute to a single desired outcome, which means that sometimes a coordinated approach is needed to investigate the overall and comparative effectiveness of these interventions rather than simply focusing on individual interventions.

92 However, the New Zealand State sector is generally not good at coordinating or prioritising policy or evaluative effort within or between agencies. Many agencies are not aware of what evaluative activities are being undertaken within their own agencies, let alone in related agencies.

93 In some sectors, lead agencies drive coordination and prioritisation between related agencies. For example, the Ministry of Social Development and the Ministry of Justice play this role to some extent. However, lead agency mandate and scope is unclear, so this approach is not yet widespread and not as complete as it could be.

94 There are few mechanisms in place to keep track of what evaluative activity has already been undertaken and to coordinate and prioritise future activity. As noted above, some of this is beginning. For example, in the social sector, SPEaR’s mapping project has collected the research and evaluation programme activities of some 20 agencies in one place.

95 Other existing coordination mechanisms are also not being used much. In particular, the central agencies should be in the position to look across the whole of the government’s activities and help determine strategic evaluative priorities and facilitate coordination. However, central agencies currently do little cross-agency brokering and facilitating, although this is improving partly as a result of the RoC Central Agencies project.

96 Incentives to coordinate or prioritise are also relatively weak. Whilst the SSC is moving towards encouraging collaboration around outcomes, there is still a focus upon the chief executive’s own departmental performance.

97 We think that the limited mechanisms and incentives for coordination and prioritisation stem from a range of factors including:

• limited demand from Ministers for evaluative evidence of effectiveness and efficiency of major policies and programmes;

• some lack of precision about the priority policies or outcome areas which would enable departments to shift evaluative resources to the most highly valued areas;

• inadequate information about what evaluative activity is going on – if you do not know what policies are or have been evaluated, it is hard to coordinate and prioritise effort;

• transaction costs associated with coordination; and

20

• the varying significance of a particular outcome for contributing agencies, and therefore how much effort they are willing to put into coordination around that outcome.

Limited understanding of when different types of evaluative activity offer value 98 In order to make good decisions about what evaluative activity to undertake, advisors and decision-makers need to understand what different types of evaluative activity offer at what cost and within what timeframe. However, people without this understanding often plan evaluative activity, resulting in poor decisions about what to evaluate. For example, decision makers, without evaluative skills, may commission impact evaluations before it is methodologically feasible to do so. There is a shortfall in the ability to commission evaluative activity within the State sector.

99 Many respondents commented that analysts often lack strong evaluative skills and rarely involve people with those skills when designing evaluative activity. Analysts are generally recruited without strong evaluative or quantitative skills and are rarely trained in these areas once employed. In addition, analysts generally lack knowledge about available statistical information and how it can be used.

100 Evaluators often feel divorced from decisions about what evaluative activity to undertake, resulting in commissioning of work that is not feasible in the time allowed, unrealistic expectations of the results, and lack of clarity about both the questions to be answered and the audiences.

101 Expectations of what can be achieved with evaluation in New Zealand are often unrealistic. Limited evaluation resources, a lack of natural comparison groups in the absence of multiple states, and our small size make it difficult to conduct large impact evaluations or randomised control trials.

Some further measures are needed to address these factors

102 A number of current initiatives should help to address the contributing factors outlined above and thereby improve how the State sector decides what evaluative activity to undertake. In particular:

• the Managing for Outcomes initiative and the Pathfinder project should help to grow a culture of inquiry;

• central agencies taking a greater leadership role should help to grow a culture of inquiry and to improve coordination and prioritisation; and

• SPEaR should help to improve coordination in the social sector because of their sharing of information about what evaluations have been conducted or are in process.

103 We consider that more needs to be done to ensure that these expected gains are widespread and long lasting. Our recommendations for doing this are presented in the chapter entitled “What should government do to improve evaluative activity”. More detailed options are outlined in Annex D, along with their associated advantages and disadvantages.

21

HOW WELL DOES THE STATE SECTOR USE EVALUATIVE ACTIVITY?

104 In paragraph 45, we noted that agencies using evaluative activity effectively draw on results to inform decisions about prioritisation in the budget and policy processes and programme management within State agencies and other delivery organisations. This requires robust, accessible, real-time information that is useful to decision-makers.

The picture is mixed, with many examples of good practice . . .

105 Through our conversations with stakeholders, we have come across the following examples of good practice in using evaluative activity. This is not a comprehensive list of good practice and we do not intend to infer that the agencies mentioned are the only ones undertaking good practice.

. . . of how agencies ensure that evaluative activity produces robust findings 106 Peer review can ensure that the findings of evaluative activity are methodologically sound. Both the Ministry of Education and Department of Labour’s Labour Market Policy Group have recently drawn on international evaluators to peer review the design and findings of evaluations. Agencies can also provide guidance on quality standards for evaluative activities. For example, the Department of Labour released evaluation guidelines in 1995, which it is in the process of updating currently.

107 Culturally appropriate methodologies are another element of good practice to ensure that the conclusions drawn from evaluative activity are based on all the effects of a programme or policy. The Ministry of Pacific Island Affairs has developed an approach for evaluating the impacts of programmes on Pacific Island communities as part of its capacity building programme. In its evaluations of Pacific Islanders, the Department of Labour involves the community reference groups (as representatives of their communities), in accordance with the Ministry of Pacific Island Affairs’ approach.

108 � � � �� cultural issues not addressed by traditional methods. Increasingly, evaluation of � �� examp�� standard evaluation techniques. The approaches to analysing information included the use of Mäori concepts to describe the programme development, including Whakawhanauangatanga (to describe the collaborative approach of building relationships) and Tuakana/Teina (to describe the mentorship relationship between "quit coaches"). The evaluation project team was able to combine methodologies because of the key role of Mäori in the policy and project teams as well as within the evaluation team.2

109 Involving key stakeholders can add robustness to evaluative activity. The Ministry of Health used a pilot specifically to test some assumptions about a HelpLine service and ensured that major stakeholders had a chance to have their own assumptions and questions tested during the pilot. This was achieved by regularly communicating with stakeholders. The careful involvement of all stakeholders provided the Minister

2 !"�� #� � �� $� %�� &�� '��

Research and Evaluation Conference, April 2003, www.brc.co.nz/reports_conference.htm; a full report is in publication by the Ministry of Health.

http://www.brc.co.nz/reports_conference.htm

22

with confidence that all effects were considered in the evaluation. Based on the pilot results, the Minister decided to expand the programme.

. . . of how agencies ensure that evaluative activity produces useful findings 110 Integrating evaluative activity well into policy development processes or programme operations can increase the usefulness of evaluative findings. Some agencies, like the Department of Labour’s Labour Market Policy Group, merge policy and evaluation teams to foster integration and collaborative planning. Others such as Child, Youth and Family have a “relay” model for handing over evaluation findings to the policy team – evaluators and policy analysts work together in the later stages of the evaluation so that the development of a policy response can begin early.

111 Involving critical stakeholders can increase the chance that findings will be used. For example, decision makers were involved in the Steering Group for the Domestic Violence Act 1995 evaluation. This increased their buy-in to the findings and support of their use to inform future policy and programme design.

112 Timely delivery of results can enhance the usefulness of evaluative activity to Ministers. Some agencies report greater receptiveness if they present proposals for addressing any problems identified in evaluations at the same time as presenting the evaluation report to their Ministers.

113 Predicting the need for information can also increase the usefulness of evaluative findings. For example, the Department of Labour conducted some surveys and other research that raised a number of issues about the operation of the private market in accident insurance in its first year. The findings provided real-time information for the new Labour Government’s changes to the delivery of public accident compensation. The Ministry of Justice also predicted that information on new sentencing and parole legislation would be useful to their Minister and worked with the Department of Corrections in developing an evaluation that would provide useful information.

114 Better monitoring and ongoing evaluative activity can provide an early indication of how well initiatives are working and identify any problems when policies and programmes are not working as intended. Evaluative activity can also provide some information about why problems are occurring and how to improve delivery and programme efficiency and effectiveness. For example, programme evaluations of the Domestic Violence Act 1995 provided information on how the underlying programmes could be improved to enhance their effectiveness.

. . . of how agencies ensure that evaluative activity produces accessible findings 115 Reporting of results to the person commissioning the evaluative activity, and the individuals or groups who participated in the evaluative activity are important elements of good conduct. In most cases, results should also be released publicly to improve accessibility. Some departments, including the Ministry of Social Development and the Department of Labour, publish some of their evaluation findings on their websites.

116 Wide dissemination of results can increase the likelihood of their use. For example, the results of the Domestic Violence Act 1995 evaluation were widely disseminated in different forms. The findings informed a review of the regulations and subsequent changes to reduce compliance cost and barriers for clients, and informed programme expansion.

117 Results also need to be reported in an accessible and useful way for Ministers. For example, the Ministry of Social Development discusses findings of evaluative activity as they emerge with the Minister regularly. This provides the Government with timely

23

advice about findings so that it can make adjustments to programmes and polices as evaluative information indicates initial findings.

118 Providing findings to stakeholders in ways that they understand is another element of good practice that makes results more accessible. For example, the Community Evaluation group at the Departme� �� (�� Marae at the start of an evaluation, involved them throughout and presented them with a book on the “journey”, including findings, at the end of the evaluation.

But some scope for improvement remains

119 Some agencies take great effort to ensure that their evaluative activity is robust, useful and accessible to stakeholders. However, some problems remain. In particular, we have seen:

• patchy use of evaluative findings to inform policy, service delivery or broad government strategy and budget decision-making decisions;

• a disconnect between evaluative activity and the policy process at times, especially when results are not delivered in a timely fashion;

• instances where officials do not adequately draw on evaluative activity that has already been done in their own agencies, other agencies or overseas when developing policy or programmes or when deciding to undertake evaluative activity; and

• evaluative activity undertaken in ways that do not engage and build trust with key participants so that they understand and respond to the findings, e.g. by not using culturally appropriate methodologies.

Why isn’t the good practice more widespread?

120 Use of evaluative findings in policy, programme design and operation and budget decision varies. This section outlines some of the main reasons why agencies and Ministers are not using evaluative findings as well as they could.

Limited access to evaluative findings 121 To be able to use evaluative findings, advisors and decision-makers need to have access to those findings. Indeed, some commentators suggest that a requirement to make publicly available the results of evaluative inquiry would increase access and improve quality.

122 The findings of evaluative activity are often not shared as widely as would be beneficial. In many sectors there is little sharing of findings within and between agencies, which limits the opportunities to learn from synthesising findings of related evaluations. Initiatives such as SPEaR’s mapping project, which has collected the research and evaluation programme activities of some 20 agencies in one place, are improving this situation in the social sector. Some findings are routinely reported in public documents - for example, information about output production is published in departments’ annual report statements of service performance. However, most decisions about what evaluative activity findings to report to what audiences are left to the discretion of individual agencies and are often not publicly available or easily accessible.

123 In paragraphs 90-97 we noted that there are not adequate coordination mechanisms and incentives for collaboration to ensure that evaluative effort is

24

prioritised and coordinated across agency boundaries. The same applies to sharing findings of evaluative activity.

Poor understanding of evaluative findings 124 Advisors and decision-makers need to be able to understand evaluative findings to be able to use them well.

125 Many respondents expressed concern that policy staff generally have relatively poor evaluative and statistical analysis skills. This limits their ability to understand and use evaluative findings effectively to inform policy and programme design. In particular, many policy staff reportedly expect too much certainty from evaluative findings and discredit evaluations without conclusive results.

126 The technical language in which evaluative findings are often presented also contributes to poor understanding among actual and potential users of these findings. Findings are often not communicated clearly and simply so that they can be understood by people without technical evaluation expertise.

127 We think that one of the causes of unclear reporting of evaluative findings is that many evaluators do not have the skills to communicate complex technical findings to a range of non-technical audiences. The low demand for quality evaluative activity provides little incentive to develop these skills and report clearly.

Evaluative findings not always useful 128 Evaluative findings need to provide useful and relevant information if advisors and decision-makers are to use them.

129 The quality of evaluative activity varies considerably due to a range of reasons, primarily capability shortfalls and information and methodological constraints. If evaluative findings are not sufficiently robust or conclusive, they can, at best, only be weakly useful for informing decisions. Even if evaluative activity is conducted well, the findings may not be useful if it has been badly commissioned (as noted in paragraphs 98-101).

Shortage of evaluative skills

130 Evaluative skills within both the State sector and the private sector to conduct high quality evaluative activity are relatively limited compared to existing demand. The fragmented nature of the State sector may also contribute to evaluative capability shortages by spreading it thinly. Contract managers commented that they often have to invest a good deal of time to raise contractors’ capability to produce quality work. Finding peers to review work, given the limited community of practice, can also be hard.

131 The fact that there is a level of evaluative expertise in the State sector, however, reflects deliberate efforts to build capability by two groups: first, a number of chief executives have sought to assemble evaluation and research teams within their organisations over a period of several years; and second, the Australasian Evaluation Society and local evaluation societies, often led by public servants, have created various conferences, training and development programmes, and networking initiatives. Central agencies have provided limited technical support for evaluative activity, although the Pathfinder project has taken steps in this direction.

132 Many respondents expressed particular concerns about the level of capability �� – essentially

25

�� Staff who do have some skills in this area are difficult to retain in the State sector.

133 A major obstacle to improving evaluative capability in the State sector is the lack of specific evaluation training and development opportunities. Many evaluators must train overseas, receive on-the-job development where it can be fitted in (placing further pressure on the small number of more experienced evaluators), or simply fend for themselves. Tertiary institutions have recently started to increase their focus on evaluative training.

134 We think that if demand for evaluative activity increases, supply will eventually follow. But we think that it will take some time to build up evaluative skills to meet the demand and we see few steps under way to allow this to happen.

Information and methodological constraints

135 Many agencies report poor availability and comparability of data, and the inability to exchange data with other agencies, as obstacles to providing high quality evaluative information.

136 Often data are not available because evaluation and monitoring plans were not in place from a policy or programme’s inception, resulting in the necessary data:

• not being collected in the first place;

• being available but not captured; or

• being captured but at the wrong level.

137 Some agencies do make specific efforts to address this challenge, for example, encouraging policy staff to consult evaluators early in the development of policy proposals. Regulatory Impact Statements have improved this situation for those initiatives that result in Government bills or statutory regulations.

138 Since different agencies use different geographical and administrative boundaries to group their data, it is sometimes difficult to compare data on a geographical basis. The availability of population and business census data at mesh-block level enables Statistics New Zealand to provide statistics from those collections for virtually any geographical area of reasonable size. However, information collected by sample surveys can seldom be analysed in such detail, making it difficult to produce statistics from surveys for the variety of geographical and administrative areas that agencies use.

139 Although Statistics New Zealand has standardised classifications for many of the items in datasets that agencies might wish to compare, many agencies do not use them. Even such seemingly fundamental items as gender, age and marital status - not to mention more difficult items such as ethnicity - are defined in numerous ways by different agencies, and then stored in different formats within their information systems. Merging data or synthesising findings from more than one agency – and sometimes even from different studies by the same agency – can therefore be difficult or impossible. The review of official statistics is addressing this issue.

140 Methodological limits also constrain agencies’ ability to provide high quality evaluative information, particularly about the impacts of programmes. One constraint is the difficulty in attributing changes to a specific intervention, particularly when the objectives of a policy or programme are unclearly or broadly defined. Another constraint is that agencies are often expected to report on impacts when a programme has only been operating for a short period of time. In some cases, measuring impact is not possible, particularly where policy settings are changing rapidly.

26

Untimely or guarded reporting

141 To be useful, evaluative findings also need to be available in time to inform decisions. Some evaluations, particularly impact evaluations, cannot be conducted until a policy or programme has been underway for several years. But, decisions often need to be made before there is time to conduct or complete rigorous evaluative activity. Some agencies try to alleviate these timing misfits by having evaluators and policy analysts working together in the later stages of evaluations so that policy response can start to be developed early, or by using intermediate indicators of success so that inferences can be drawn in the short-term.

142 A further factor affecting reliability of findings is whether they have been reported openly. We think that low media and public tolerance for unfavourable results creates a disincentive to report openly if results show that policies/programmes are ineffective. One respondent referred to this intolerance as a ‘predilection in New Zealand for witch hunts’. In part, this low tolerance emerges from overly optimistic policy, which has created unrealistically high expectations. Moreover, evaluation reports tend to focus on the negative findings rather than the positives.

143 This intolerance also stems from confusion between personal performance and the performance of a portfolio of interventions. Evidence of programmes achieving lower or different results than those originally intended is often seen as “failure” and a criticism of personal performance. Instead, such results should be seen as a natural consequence of experimentation and thus, provide an opportunity to learn and improve. Managers may be penalised rather than rewarded for assessing effectiveness and efficiency, openly reporting the findings (including when they find problems), and then acting on them.

Variable commitment to using evaluative findings 144 Even when evaluative findings are available, understandable and high quality, advisors and decision-makers still need to be willing to use them.

145 Ministers make decisions in a political environment, which means that more factors come into play than just evidence of efficiency or effectiveness. As a result, they will sometimes make decisions to continue certain approaches even when there is evidence that they are not working.

146 However, policy advice from officials should be “free and frank” and therefore based on the best available evidence, rather than tailored to political ideology. Some respondents mentioned a declining understanding of the objective role of the public servant. We think that this limits the willingness of officials to seek and use evidence of efficiency or effectiveness in their advice, particularly if it contradicts the Minister’s views.

147 The public management system does not strongly reinforce the need to base advice on the best available evidence. The actual performance of programmes and policies is subject to limited scrutiny from central agencies. In particular, Treasury vote teams do not systematically follow up evaluations agreed to when new initiatives are approved and departments sometimes refuse to share their findings. Once a programme or policy is underway, there is little ex post scrutiny and it simply becomes an on-going part of a department’s baseline funding.

148 Despite this, some agencies and Ministers do place a great emphasis on basing advice and decisions on evidence. For example, one department told us that their

27

Minister is more receptive to results of evaluations showing problems if they present proposals for addressing any problems identified in evaluations at the same time.

Some further measures are needed to address these factors

149 A number of current initiatives should help to address the contributing factors outlined above and thereby improve how the State sector uses the findings of evaluative activity. In particular:

• the Managing for Outcomes initiative and the Pathfinder project should help to grow a culture of inquiry, including commitment to using evaluative findings to inform decisions;

• central agencies taking a greater leadership role should help to grow a culture of inquiry, and to improve coordination and prioritisation, including more sharing of evaluative findings;

• several initiatives should help to improve coordination and sharing of information, including SPEaR, work on state indicator development and reporting and Statistics New Zealand’s work on administrative data integration; and

• the increased evaluative training opportunities being offered by tertiary institutions, and guidance and training being developed by SPEaR should help to grow evaluative capability within the social sector.

150 However, we consider that more needs to be done to ensure that these expected gains are widespread and long lasting. Our recommendations for doing this are presented in the chapter entitled “What should government do to improve evaluative activity”. More detailed options are outlined in Annex D, along with their associated advantages and disadvantages.

28

EVALUATIVE ACTIVITY TO ACHIEVE OUTCO��

151 The Government has indicated a particular focus on improving outcomes for � � �� )��*�� *�� of evaluative activity to assess the effectiveness of these initiatives in reducing inequality. Well-targeted, high quality evaluative activity will provide important �� programmes.

152 "�� f it addresses community concerns as well as those of the Government.

153 However, many of the problems in the state of evaluative activity and underlying �� evaluative activity is not optimal. Many communities and providers often undergo multiple evaluations of their services from different government agencies. Equally, �� evaluators is not enough to meet the demand for their services.

What are the elements of a State sector that is planning and using ��

154 Paragraph 46 outlines the features that we think would characterise a State sector that is planning and using evaluative activity effectively. In addition to these features, �� +

• using culturally appropriate methodology;

• �� implementation;

• providing new information that is useful to decision-makers in determining the �� ,

• �� groups involved in evaluations, where appropriate; and

• ��

How well does the State sector plan and use evaluative activity of programmes and polici��

155 The current situation is quite poor. Although substantial evaluative activity is taking place, significant knowledge deficiencies remain around whether State sector ��

156 A lack of detailed evaluative information and analysis has hampered efforts to �� evidence-��

157 $� �� r-researched, evaluated and �� -�� improvements to their programmes or communities as a result of their participation in evaluative activity.

29

158 Te Puni Kokiri reviews of departments show some good practice in evaluation, including some development of guidance specific to evaluation of programmes for � � ��.�� rs in evaluative activity and the use of � � � ��

159 Departments vary in the extent to which they target and use evaluative activity �� .� '�� /� � � �� systematic, formalised processes for prioritising spending on evaluative activity. Few �� 0�� with iwi and hapu vary by agency and by evaluation projects. Similarly, some departments use the information that they get through evaluation and monitoring to inform future policy and planning, but this is not always the case. Reporting practices also vary. For example, the former Social Policy Agency used to present evaluation results to ind�� xvi

Contributing factors

160 The report has already outlined the concerns about the limited supply of well-trained experienced evaluators. The issues of capability shortfalls are compounded in �� .��

161 Officials commissioning and using evaluative activity to inform policy often do �� and evaluation methodology. Te Puni Kokiri (TPK) developed evaluation guidelines for departments in 1999. TPK advises departments to draw on these guidelines when developing their own ones.

30

WHAT SHOULD GOVERNMENT DO TO IMPROVE EVALUATIVE ACTIVITY?

162 The previous four chapters outlined a number of factors that contribute to the less than optimal targeting, conduct and use of evaluative activity in the State sector. These contributing factors fall into three broad categories:

• variable culture of inquiry: variable demand for high quality evaluative activity from Ministers, Parliament, central agencies and State sector managers and variable commitment to using the findings of evaluative activity to inform decisions;

• poor coordination and prioritisation: evaluative effort is often not well coordinated and prioritised within or between agencies; data sharing and consistency is limited; evaluative findings are often not shared within and between agencies; and little attention to evaluating major policies that span managerial boundaries; and

• limited capability: limited understanding by policy and programme managers of when different types of evaluative activity offer value, and how to interpret and use the findings; and limited skills within and outside the State sector to conduct high quality evaluative activity.

163 Our recommendations focus on these three causes, although we consider information and data constraints to be a further contributing factor, which could be considered as a separate key factor, even though it is incorporated into both the problems with the culture of inquiry and poor coordination. We expect that the review of official statistics that is currently underway will recommend some changes to the way that administrative data is collected and the role of official statistics. Therefore, we are not prejudging the outcomes of this review by making specific recommendations in relation to information and data constraints.

The background to our recommendations

We focus on the public management system 164 There are a number of ways to approach enhancing the State sector evaluation environment, as requested by Cabinet. To be consistent with the Review of the Centre, we have specifically looked at how the public management system could be adjusted to enhance the evaluation environment. However, we recognise that the public management system only provides levers for change. Actual change requires the commitment of Ministers and officials to use those levers effectively.

We are realistic about the political environment 165 In the public sector, we are responsible for meeting the needs of citizens through democratically elected Ministers and Parliament. It is inevitable that Governments will make decisions about policies and programmes based on a wider range of factors than evidence of effectiveness and efficiency. Decision-making in a political environment does not lessen the value of providing good evidence to reduce uncertainty, but at times this evidence may not be the primary influencer of the decision.

31

We recognise that it is not always appropriate for government to intervene 166 Some of the ways to improve evaluative activity lie outside the direct control of the government. For example, several respondents told us that they hoped that this project would result in an increase in evaluation training offered by New Zealand tertiary institutions. This is not something that the government can change directly as decisions to offer courses are the preserve of universities. However, government can signal demand. Universities will respond to increased demand for academic programmes by providing relevant courses.

We take an evolutionary approach 167 Our general approach is evolutionary, recognising and building on existing initiatives, where possible, to address the problems outlined in this paper. We have also tried to find solutions that will become embedded in the public management system rather than stand-alone mechanisms. We also appreciate that evaluative activity is a means to an ends rather than an end in itself. Its purpose it to reduce uncertainty to improve decision-making. We recognise that evaluative activity is only one means to achieving this purpose.

We do not wish to over promise what evaluative activity can deliver 168 Evaluative activity can significantly enhance decision-making and adds benefit by providing better evidence about what is working and therefore how best to design and deliver policies and programmes to achieve desired outcomes. There are many analytical processes and research that can assist good decision-making. But we are not wishing to imply that evaluative activity will always give definitive answers – it makes a very important contribution but not always “the answer”.

We recognise that improvement will take time 169 Our recommendations will not improve the state of evaluative activity immediately. Changing culture, developing capability and improving information systems take time and resources. Therefore we recommend that the Treasury and SSC conduct a review in December 2005 to assess whether there has been an improvement in evaluative activity in the State sector and whether further work is necessary.

Recommendations to grow a widespread culture of inquiry

170 Ministers, chief executives and other senior managers have a crucial leadership role in growing a widespread culture of inquiry. If they do not demand high quality evaluative evidence to inform their decisions, there is unlikely to be much improvement in its supply.

171 The project team considers that the best way to grow ministerial demand for high quality evaluative evidence is to do so indirectly through increasing demand for this within the State sector. In particular, the project team considers that chief executives and other senior managers can stimulate ministerial demand for high quality evaluative evidence by demonstrating their own commitment to using high quality evaluative evidence to support their own decisions and advice.

172 The strong commitment and leadership of some chief executives and Ministers is already apparent. However, this commitment and leadership is not widespread.

173 One of the ways to grow this culture of inquiry is to use the levers in the public management system to help influence the behaviour of State sector officials. The public

32

management system needs to support and encourage chief executives and other senior managers to undertake high quality, appropriate evaluative activity and to use the findings to inform their advice to Ministers and the decisions they make themselves.

174 There is already work underway to ensure that parts of the public management system, such as the budget, reporting and chief executive performance management processes support such a widespread culture of inquiry. Recent and proposed initiatives include:

• changes to reporting requirements for departments and Crown entities that encourage a greater focus on outcomes and using evidence to support their decisions on intervention;

• changes to the chief executive performance management process so that the emphasis is on debate about the evidence and underlying decisions, the quality of thinking underlying management decisions, and learning from the past to improve future decisions; and

• changes to the budget process so that it is more focused on overall government expenditure rather than just new initiatives (e.g. value for money reviews), and new initiative proposals are more likely to be informed by evaluative findings and outline future evaluation and monitoring intentions.

175 Many of these public management system changes are still in their early design or implementation stages and it will be several years before some of the changes are expected to generate widespread improvements in State sector culture.

176 Building on these existing changes would help to grow further a widespread culture of inquiry throughout the State sector. However, changes to the public management system can only provide incentives and support for chief executives and other senior managers – the real enthusiasm and commitment to using high quality evidence to inform decisions and advice must come from State sector leaders themselves.

177 We propose three recommendations to encourage chief executives and senior managers to target, conduct and use evaluative activity well.

Recommendation 1 178 The Managing for Outcomes guidance material should:

• emphasise the importance of undertaking prioritised evaluative activity and explain the value of different types of evaluative activity at different stages of the policy/programme cycle (for example, when monitoring and/or research is likely to be sufficient and when an evaluation involving randomised controlled trials might be better);

• expect departments to develop an evaluation strategy that prioritises their evaluative activities within the context of their overall policy direction and that of related agencies;

• expect departments that are engaged in joint policy development or programme delivery to also develop joint evaluation plans;

• expect departments to report planned evaluative activities in their Statements of Intent, as well as how any major findings of previous evaluative activity have shaped their intervention decisions for the coming period;

33

• expect departments to report major evaluative activities undertaken and the major findings in their annual reports and on their websites; and

• emphasise the importance of using existing data effectively, gathering new data where necessary, and incorporating data analysis as a routine part of business planning and performance monitoring.

Recommendation 2 179 The Treasury should report to the Minister of Finance by 31 October 2003 on proposals to adapt policy and budget processes so that:

• there is increasing emphasis on the effectiveness and efficiency of wider government expenditure, and on testing new initiatives within the wider picture of government expenditure, for example through:

- greater use of ministerial groups in the budget strategy phase to collectively decide on policy priorities; and

- greater use of sectoral funding allocations, whereby ministerial groups collectively decide priorities for particular outcome areas.

• when seeking new funding, departments signal evaluation intentions in the context of a broader evaluation strategy that prioritises all the department’s evaluative activity in the context of its overall policy direction (see Recommendation 1);

• where there is substantial uncertainty about the effectiveness of new initiatives they are established, where possible, as pilots and then reviewed after an appropriate time (e.g. 3 years) to determine whether the initiative should continue unchanged, be adapted (expanded or modified), or cease; and

• The Treasury consider as part of its second opinion advice role whether new policy proposals have been informed by all relevant evaluative findings, contain clear outcomes and intervention logics, criteria for measuring success, and show how effectiveness and efficiency will be evaluated or monitored.

Recommendation 3 180 While still being tailored to individual circumstances, the chief executive performance management process should use formal and informal measures to support and encourage chief executives to demonstrate how they are using evaluative activity to improve the effectiveness and efficiency of their agency’s interventions. For example, whether the department:

• has an overall evaluation strategy that prioritises all the department’s evaluative activity in the context of policy priorities;

• collects and makes available good administrative data for evaluative purposes;

• makes use of evidence (qualitative and quantitative) to inform advice to Ministers and drive policy and delivery decisions; and

• develops appropriate capability to undertake, conduct and use high quality ��

34

Recommendations to improve coordination and prioritisation

181 A number of recent initiatives are underway to improve sharing and consistency of data, and sharing evaluative findings, including:

• a number of initiatives promoted by SPEaR, including the SPEaR website, its plan for electronic sharing of research, and the mapping process that enables SPEaR to measure the strengths and weaknesses in research and evaluation activities to support government priorities in the social sector;

• Statistics New Zealand’s work on integrating administrative data from various agencies will provide new information on significant policy questions, and should lead to agencies benefiting by using standard definitions and classifications for common items in their statistical databases; and

• increasing numbers of indicator development and reporting projects across the social, environmental, economic and cultural domains, and at central and local government levels.

182 However, these initiatives are relatively recent, so it is too early to determine the full extent of their impact. A review of SPEaR is planned for mid 2004, which will assess progress. This review will help clarify the value of this model to deliver ongoing social sector evaluation outcomes and may also inform assessment of the likely utility of this model for capability development, prioritisation and coordination in other sectors.

183 To build on the findings of this review, the team recommends that The Treasury and the SSC consider whether there would be value in applying a mechanism similar to SPEaR in sectors beyond the social sector (Recommendation 4).

184 The team are not recommending any changes to the custodianship and sharing of administrative data because recommendations from the review of official statistics, combined with current initiatives, are likely to improve the use of official statistics. The project team strongly supports an emphasis on better use and sharing of official statistics, particularly administrative data.

185 The project team does, however, consider that greater coordination is needed to ensure that major policies, which often span agency boundaries, are evaluated.

186 Many substantial, expensive policies have not been adequately tested for effectiveness and efficiency. This applies to policies and programmes that fall within the scope of individual agencies but is exacerbated for those that rely on multiple agencies. Many of the Government’s priority outcomes rely on interventions from multiple agencies (e.g. the Growth and Innovation Framework). Some agencies do work together to evaluate policies that span agency boundaries (e.g. the Department of Labour and the Accident Compensation Corporation). However, there are limited incentives for individual contributing agencies to coordinate evaluation around policies to which multiple agencies contribute.

187 There are times when the government intends to review particular policy directions but there is insufficient evaluative evidence available to help inform the extent to which they are working and why.

188 Consequently, we propose a recommendation (Recommendation 5) to ensure that Ministers are provided with advice about where evaluative activity could be undertaken to help inform their decisions about the future of major policy areas, which often cross agency boundaries.

35

189 We suggest that the central agencies undertake a coordination role, but we recognise that much of the advice would need to come from line agencies themselves.

Recommendation 4

190 The Treasury and SSC should consider whether there would be value in applying a mechanism similar to the Social Policy Evaluation and Research Committee (SPEaR) in sectors beyond the social sector, after the review of SPEaR in 2004;

Recommendation 5

191 Central agencies should undertake a greater coordination role with respect to major policy areas where the Government intends to consider its policy direction and there is considerable uncertainty about the effectiveness and/or efficiency of existing interventions. This should include:

• consulting with departments and relevant organisations on:

- what relevant evaluative activity has already been undertaken, is underway, or is planned; and

- what additional evaluative activity would be valuable, including the scope and type(s) of evaluative activity;

• advising the Minister of Finance, the Minister of State Services and the Prime Minister on where there could be value in undertaking additional evaluative activity around major policy areas where the government is considering its policy direction. Advice should include:

- the likely cost of these evaluative activities and how they should be funded (through existing baseline allocations, new funding or a combination of both);

- who should oversee and commission these evaluative activities;

- when these evaluative activities should be undertaken.

192 The Treasury, the State Services Commission and the Department of Prime Minister and Cabinet should report to the Prime Minister, the Minister of Finance and the Minister of State Services by 31 October 2003 on the process for providing this advice.

Recommendations to develop capability

193 It is not possible to quantify the extent of the capability shortfall to commission, conduct and use evaluative activity in New Zealand, nor to specify the ideal numbers of trained evaluators that are needed.

194 We do not therefore support a deterministic or general intervention from government to directly lift the overall supply of evaluation specialists. We are supporting the Tertiary Education Advisory Commission principle underpinning tertiary education priorities of “balancing the government's priorities with the autonomy of tertiary education organisations to interpret and apply the TE [Tertiary Education] strategy”.

195 Furthermore, we are not implying that every department should have an evaluation unit.

196 Work already underway to further build evaluative capability includes:

36

• increased evaluation training through tertiary establishments

• recent initiatives promoted by SPEaR; and

• guidelines to improve capability.

197 It is important that there is adequate capability to meet increased interest in evaluative activity.

198 Our focus is to take some initial steps in building capability of State sector staff. This is because we have heard that public sector managers do not consistently have the expertise to commission or conduct evaluative activity wisely. Accordingly, the first step for capability development will be to improve how evaluative activity is scoped and commissioned.

199 Given our desire to build upon, and not duplicate, current initiatives, we wish to use a community of practice to build capability and capacity but wish to do so by consulting and working with individuals and bodies already undertaking tasks usually associated with a community of practice.

200 The SSC can use the Executive Leadership Programme (the primary initiative in the Senior Leadership Management Development Strategy) as a mechanism for ensuring that future leaders are aware of the need to effectively target and use evaluative activity.

201 We propose two recommendations to support agencies as they develop capability to commission, conduct and use evaluative activity:

Recommendation 6 202 The central agencies and Te Puni Kokiri should support agencies to develop evaluative capability by offering training for central agency and departmental policy staff, as part of the Managing for Outcomes initiative, about:

• the value of different types of evaluative activity at different stages of the policy/programme cycle;

• good information management practices (including administrative data);

• how to understand and use evaluative findings; and

• �� .'/1�� "�� SPEaR guidelines (in progress).

Recommendation 7 203 The State Services Commission should build on existing networks to support an evaluation and monitoring community of practice, including evaluation specialists (both within the State sector and in the private sector) and policy and programme managers, to

• promote and share good evaluative practice throughout the State sector and particularly including the environmental and economic sectors along with the social sector;

• share information about training opportunities and/or deliver seminars or training workshops and promote access to international expertise for knowledge sharing and peer review; and

37

• engage in discussions with a range of universities on State sector evaluative activity, including, but not confined to, the Australia and New Zealand School of Government and liase with Mäori evaluators.

38

ANNEX A – TERMS OF REFERENCE

REVIEW OF THE CENTRE – EVALUATION IN THE STATE SECTOR Terms of Reference Why undertake this project? The Advisory Panel that conducted the review of the centre in 2001 reinforced the importance of evaluation as a means of determining whether government interventions are achieving their intended outcomes for citizens, through improvements to policy advice, resource allocation and service delivery. It suggested that there was room for improvement in the way that evaluation is conducted and used in the New Zealand State sector. The Panel noted that a number of initiatives already under way were likely to have a positive effect on the way that evaluation is conducted and used in the New Zealand State sector, especially the creation of the Social Policy Evaluation and Research Committee (SPEaR). Rather than making specific recommendations about evaluation, therefore, the Panel suggested (and Cabinet subsequently agreed) that there be: • an investigation of whether further mechanisms are needed to enhance the

evaluation environment, taking into account the recently established SPEaR; and

• an assessment of whether any further initiatives are required to encourage more effective use of evaluation in the State sector.

What will this project seek to achieve? This project therefore needs to: • describe the features that would characterise a State sector targeting and

conducting evaluation effectively, and making effective use of its results, and the conditions likely to give rise to such a State sector;

- This will include consideration of: the need to prioritise within limited resources and the basis for targeting evaluation; the relationship between evaluation and other forms of organisational learning for improvement; the limits to what evaluation or other forms of organisational learning can tell us; the constraints of the public management system as the context within which evaluation occurs; the needs of the various parties to that system; the need to tailor evaluation to differing cultural contexts, including working �� '��*�� ,�nd the tension between the inherently political nature of the budget process and the inherently technocratic process of evaluation.

• assess the extent to which these features and conditions exist now in the New Zealand State sector, and the reasons why this is so;

39

- This will include consideration of: both demand and supply factors; the extent to which evaluation and other forms of organisational learning are incorporated into the policy advice, resource allocation, policy implementation, service delivery and accountability processes; the availability of resources, personnel and data necessary for effective ��,�� '��Island, or other cultural contexts; the level of understanding of evaluation among stakeholders; the extent to which structural considerations within and between organisations affect the conduct and use of evaluation.

• consider the likely impact on these things of a range of initiatives now under way,

including but not limited to SPEaR;

- This will include consideration of the origins, objectives, participants and work programmes of a range of projects or organisations including: SPEaR, Pathfinder, the Managing for Outcomes/Statement of Intent programme; the Social Report; work on state indicators; Review of the Centre work on accountability and reporting; the TEAC review; work under way at Statistics New Zealand on data integration and the social survey work programme.

• determine whether additional measures are required to move towards the

effectively evaluating State sector described; and

- This will include drawing together the conclusions of the work described above, consideration of options for further improvement, assessment of their feasibility and affordability and impact, and planning for their possible implementation.

• recommend appropriate steps to implement these additional measures.

- This will involved distilling the project’s findings into a compact and coherent set of practical recommendations for transmission to Ministers.

In considering all these matters, the project needs to take a broad view of what constitutes evaluation, embracing a continuum of evaluative activities that includes, but is not limited to, formal, impact evaluation. The project should seek to identify practical steps to assist State sector organisations to improve the way they target, resource, conduct and use this wide range of evaluative activities. What products will this project deliver? The project will need to deliver: • a Cabinet paper summarising the project’s findings and recommending any

measures necessary to improve the conduct and use of evaluation in the New Zealand State sector; and

40

• a more detailed project report detailing the project’s findings, its analysis, sources consulted, etc.

The project should also canvass the views of State sector organisations on whether there are additional products that would be of assistance to them in their evaluative activities. It is possible that the project will be able to generate some of these additional materials, or their development may be a recommendation of this project. Where will the project seek information? In pursuing these objectives, the project will need to draw on the views of a wide range of interested parties in New Zealand, and on current thinking and good practice from abroad. The project will need to seek the views of those who conduct evaluation and those who consume its results, and of academic and professional experts in evaluation. It will need to review the relevant literature, the existing guidance materials produced by State sector and other organisations, the evaluation arrangements in other jurisdictions, and the work programmes of a range of projects or organisations likely to impact on the conduct and use of evaluation in the New Zealand State sector, including SPEaR. When will the project report? The project will report to Cabinet not later than the end of April 20033. ***** NOTE This project is designed to respond to the two Cabinet decisions listed in paragraph 2 of this annex. There was a third evaluation-related decision that called for the various projects in the integrated service delivery stream of the review of the centre work � �� !�� .” We have consulted with Te Puni Kokiri (TPK) about the best way to give effect to this decision and have agreed to draw on their assistance in building this dimension into the various projects and into evaluation of their eventual results. In relation to the “evaluation in the State sector” project, TPK will provide some evaluative comment to the Chief Executive Reference Group on the process, products and impact of this project at its conclusion.

3 This deadline has been extended.

41

ANNEX B – ORGANISATIONS THAT CONTRIBUTED TO THIS PROJECT

Accident Compensation Corporation

Archives New Zealand

ARTD, Sydney

BRC Research

Centre for Research, Evaluation and Social Assessment (CRESA)

Department of Child, Youth and Family Services

Department of Corrections

Department of Internal Affairs

Department of Labour

Department of the Prime Minister & Cabinet

Education Review Office

Foundation for Research, Science and Technology

Inland Revenue Department

KPMG Legal

Land Transport Safety Authority

Ministry of Agriculture and Forestry

Ministry of Foreign Affairs and Trade

Ministry of Economic Development

Ministry of Education

Ministry of Defence

Ministry of Health

Ministry of Justice

Ministry of Pacific Island Affairs

Ministry of Research, Science and Technology

Ministry of Social Development

Ministry of Women’s Affairs

Ministry of Youth Affairs

National Library of New Zealand

New Zealand Institute of Economic Research

Office of the Auditor-General

Parker Duignan

42

Royal Melbourne Institute of Technology, Melbourne, Collaborative Institute for

Research, Consulting and Learning in Evaluation

Social Policy Evaluation and Research Committee

State Services Commission

Statistics New Zealand

Te Puni Kokiri

The Knowledge Institute

The Treasury

Office of the Retirement Commissioner

Transfund New Zealand

Transit New Zealand

University of Melbourne, School of Education

Victoria University of Wellington, School of Government

43

ANNEX C – RECENT INITIATIVES IMPROVING EVALUATIVE ACTIVITY

A number of recent or current initiatives should help address the problems and causes that we have identified in this paper. This section outlines these initiatives and identifies how we think each initiative will help improve the state of evaluative activity.

Managing for Outcomes – the new approach to the planning, managing and reporting of the work of Public Service departments, and Pathfinder – the project developing methodological guidance on these matters, place an emphasis on clear articulation of outcomes, the intervention logic that links outcomes and outputs, and how departments will measure progress towards outcomes. In other words, these initiatives encourage evaluative thinking and are likely to increase demand for information about impacts and therefore to increase the need for evaluative activities to generate such information. However, Managing for Outcomes has so far focused only on the planning phase of the management cycle and has not provided guidance on good review practices. It has not communicated the value of different types of evaluative activity at different points in the planning and policy/programme cycle (e.g. when monitoring is likely to be sufficient and when a formal evaluation might be better). Neither has it emphasised the quality dimension - for example, the new outcomes information is not subject to the same degree of audit scrutiny as the financial and output performance information. As yet central agencies have little direct contact with Crown Entities, although this is envisaged over time.

The Social Policy Evaluation and Research Committee (SPEaR) was established in 2001 to oversee the government’s purchase of social policy research to ensure the spending is aligned with the Government’s social policy priorities. It includes representation from a wide range of social sector agencies provides a focal point for social policy researchers and acts as a vehicle for communicating with the social research sector. A number of SPEaR initiatives help coordinate evaluative effort, including its mapping project that collects the research and evaluation activities of some 20 agencies in one place to identify strengths and weaknesses in relation to government priorities. SPEaR is also developing best practice guidelines (research activity for � � �� '�� 2 and facilitating joint training and development of personnel. However, SPEaR’s interest is primarily focused on research and is focused only on the social sector. The newly launched Linkages Programme operated by SPEaR is a good example of capacity and capability building in research and evaluation because it rewards good practice. Since SPEaR is still in the early stages of development it is too early to determine whether or not SPEaR’s activities are improving coordination and capability. A review of SPEaR will be undertaken in 2004.

A number of initiatives will help to improve the quality of data inputs into evaluation and policy analysis more generally. These include:

• The current review of the future role of Statistics New Zealand and its contribution to official statistics, which may have a bearing on the future quality of data inputs, statistical analysis, evaluation and policy analysis across government.

• Statistics New Zealand’s work on administrative data integration. This will extract additional value from existing data, providing new information on significant policy questions. For example, by integrating administrative data from multiple agencies, policy makers will be able to analyse the impact of the student

44

loan scheme on education, employment and income variables over time, allowing policy makers to relate interventions to outcomes in ways often not possible using survey tools, and adding to the possibilities for evaluation. It may also lead to agencies working more closely with each other to use common definitions and classifications, and to collect data, enabling data collection to be coordinated more effectively across government.

• Increasing numbers of indicator development and reporting projects across the social, environmental, economic and cultural domains, and at central and local government levels. These include, but are not limited to: The Social Report, reporting on the Growth and Innovation Framework, and environmental performance indicators.

As a result of the Review of the Centre’s Central Agencies project, central agencies have committed to taking a more effective leadership role across the State sector. Central agencies will be undertaking a greater role in facilitating brokerage and best practice, and ensuring cross-agency outcomes and risks are managed. They will be quicker at identifying issues that affect multiple agencies and for which a common approach would help, will take a more proactive in tackling issues across agencies that are not getting resolved, and will identify and share best practice. This should help address some of the coordination problems that limit the effective targeting, conduct and use of State sector evaluative activity.

The creation of the Deputy Commissioner Teams at the SSC and their “review framework” brought a fresh emphasis upon clarity of thinking and evaluative activity. The work of the DC Teams encourages departments to adopt a range of evaluative activities so that they can plan better and target resources more effectively. However, there is no systematic assessment of the use of evaluative inquiry. It may be necessary to build the capability of SSC staff to support departments to develop, and then to assist them assess the quality of, evaluative inquiry.

Since 2000 there has been a budget requirement to consider evaluation as part of all new initiative proposals. Although this requirement only applies to the margins of government expenditure, value for money reviews were also introduced in 2000 as a mechanism to move beyond the margins and help reprioritise expenditure within a vote or group of votes. Both of these changes signal the importance of evaluative inquiry. However, commitment to value for money reviews and the evaluation Budget requirement is variable, although reports from the recent 2003 Budget indicated increasing ministerial interest in whether proposals have been informed by evaluative findings.

Te Puni Kokiri’s role in reviewing the performance of government departments for � � � �� areas, and to ensure that their data collections allow analysis of their services to, and �� .�� .�'�ni Kokiri has also been assessing the extent to which agencies that received funding in relation to the “reducing disparities” programme have developed and progressed an evaluation framework for their “reducing disparities” initiatives. However, this coordinated approach is not widespread. Te Puni Kokiri has prepared guidelines to assist �� evaluative inquiry. Some departments have used this advice and developed their own guidance for field workers.

The Australia and New Zealand School of Government is a consortium of Australian and New Zealand Governments, Universities and Business Schools who provide

45

training in public administration to emerging public sector leaders to build policy, research and management capability, and to develop a research agenda and expertise that strengthens governments’ ability to develop and access policy relevant knowledge. It may be possible to raise the profile of evaluative activity as at present, it does not appear that evaluation per se is of high priority in the curriculum nor in the development of core competencies. Other tertiary institutions are offering more courses relevant to evaluative activity (e.g. Massey University now offers a postgraduate diploma in evaluation).

Development of a shared electronic workspace was announced in the 2003 Budget. This initiative will give the opportunity for sharing of documents, emails, and information across and outside the public sector on a range of topics. Universities could also be part of it. At present the infrastructure is being established for this, with a number of voluntary groupings developing a system for working together. In the long run it could be a valuable mechanism for sharing information, and indexing evaluative activity, possibly linked to a community of practice, and certainly working along with SPEaR who is focused on a similar goal.

46

ANNEX D – OPTIONS TO IMPROVE THE STATE OF EVALUATIVE ACTIVITY

Options to grow a widespread culture of inquiry

Option Advantages Disadvantages 1.1 Greater focus on wider government expenditure

The budget process could place greater emphasis on the effectiveness and efficiency of wider government expenditure and on testing new initiatives within this wider picture. This could include more use of ministerial groups in the budget strategy phase to decide on policy priorities (and priority evaluation questions) and more use of sectoral funding allocations whereby ministerial groups collectively decide priorities for particular outcome areas.

� Increases focus on major, existing, cross-cutting policies, which is a current weakness.

� Provides greater clarity about the Government’s strategic priorities.

� Requires genuine commitment from Ministers and chief executives to expose major areas of government spending to scrutiny.

� Focussing on wider government expenditure will not necessarily build a culture of inquiry per se, unless strong collaboration at all levels around priority outcomes exists.

1.2 More time-limited funding

Currently, the vast majority of approved new initiatives receive baseline funding rather than time-limited funding. This means that policies and programmes are rarely run as pilots. More new initiatives where there is significant uncertainty about likely cost-effectiveness, could be established as pilots and then reviewed after a defined time period (e.g. 3 years) to determine whether the initiative should continue unchanged, be adapted (expanded or modified) or cease.

� Increases consideration of evaluation and monitoring in the policy process.

� Provides an incentive to use evaluative findings to support future decisions.

� Evaluation and monitoring methods more likely to be determined and data collection put in place from the outset.

� Greater uncertainty about future funding.

� Threat of losing funding may drive dishonest evaluations unless decisions to cease funding are made carefully (e.g. departments/providers given the opportunity to work through and address problems before decision to stop funding).

� Focuses on individual initiative in isolation (and detracts from prioritisation across entire spectrum of activity).

� Timeframe for pilot may not be sufficient to adequately evaluate impact.

1.3 Mandatory consideration for all new initiatives

All new initiatives going to Cabinet could be required to include a section in the Cabinet paper outlining that they had considered what evaluation or monitoring of the initiative was justified. Reasons would have to be given if no evaluation or monitoring was proposed. If it were decided that evaluation or monitoring was warranted, departments would need to set out the approach that they intend to use.

� Some evidence overseas that compulsory requirements (that are enforced) are a good way to kick-start an evaluation culture.


� Provides an incentive to start building evaluative capability.


� Tendency for any mandatory requirement to be treated as a compliance exercise rather than a meaningful component of the policy process.

� Requires greater effort from central agencies in scrutinising new initiatives.

� Cumulative impact of the number of compulsory headings in Cabinet papers on policy process.


47

Option Advantages Disadvantages 1.4 Improved second opinion advice on

new policy proposals

Central agencies could improve their second opinion advice role to consider whether new policy proposals have been informed by all relevant evaluative findings and contain clear outcomes and intervention logics, criteria for measuring success, and show how effectiveness and efficiency will be evaluated or monitored.



� Provides an incentive to use evaluative findings to support future decisions and to build evaluative capability.

� Requires greater effort from central agencies in assessing new initiatives, increasing lead time for policy development.

� Central agency staff may not have adequate skills to do this.

� Relies on central agency staff being aware of and having access to evaluative findings (Options 1.8 and 2.2 would help this).


1.5 Follow up evaluation commitments

Central agencies could systematically follow up evaluation commitments made during the budget process to ensure that they have been conducted as planned.

� Provides an incentive for agencies to actually honour evaluation and monitoring commitments, so more likely to put in place data collection from the outset.

� Requires greater monitoring effort from central agencies.

� Risks evaluation being treated as a compliance exercise rather than a meaningful component of the policy process.

1.6 Reward evaluative inquiry Chief executive performance management process could support and encourage chief executives to demonstrate how they are using evaluative activity to improve the effectiveness and efficiency of their agency’s interventions. This could include whether: evaluative effort is being directed to priority questions, evaluation commitments are being honoured, conduct is ethical and provides high quality information, findings are reported clearly and honestly, and findings are used to inform decisions.

� Conveys central agencies’ expectations that evaluative activity is fundamental to good management.

� Creates incentives for better prioritisation of evaluative effort.

� Creates incentives for higher standards of conduct of evaluative activity.

� Creates incentives for chief executives to report findings clearly and honestly, use evaluative findings, and manage their capability to do so.

� Does not apply to Crown entities, where the bulk of government expenditure goes.

� Central agency staff may not have adequate skills to assess evaluative activity.

� Hard to apply consistently across departments since chief executive performance management is highly tailored.

� Resistance to settling expectations in chief executive performance agreements for central agencies and departments.

1.7 Convey expectations about evaluative activity

Central agencies’ Managing for Outcomes guidance could convey expectations that departments should: undertake evaluative activity and data gathering as part of their core business and prioritise their evaluative effort within their own agency and with related agencies, and outline or refer to ethical standards that departments should apply when conducting evaluative activity.

� Conveys central agencies’ expectations that evaluative activity is fundamental to good management.

� Reinforces agencies’ freedom to prioritise their evaluative efforts – not a compliance exercise.

� Builds on an existing initiative without significant change.

� Does not currently apply to Crown entities, where the bulk of government expenditure goes.

� Guidelines can be easily ignored without adequate incentives.

1.8 Requirements to report in formal accountability documents

Departments and Crown entities could be required to outline planned major evaluative activities in statements of intent and report on those actually undertaken in their annual reports, including the major findings (except where there are sensitivity considerations).

� Builds on the Managing for Outcomes initiative without significant change.

� SOIs and annual reports are public documents, so findings are widely accessible.

� May prompt greater Parliamentary interest in evaluative findings.

� Wide audience may prompt agencies and Ministers to limit their evaluative activity to “safe” or easy initiatives rather than high priority initiatives.

48

Option Advantages Disadvantages 1.9 Appoint senior staff who value this

culture

An active, learning culture is shaped and encouraged by leadership at the most senior levels, so appointing Chief Executives and senior managers who value and understand evaluation will assist.

� Embeds an evaluative culture.

� Ensures that there are informal and formal modelling and rewards.

� Can rely excessively on few key individuals and not be systemic.

� Needs evaluative skills in managers to be valued.

� Takes time to reach a critical mass of senior staff who value an evaluative culture.

1.10 Separate evaluation output class

Rather than tying costs of evaluative activity into funding for specific policies/programmes, evaluation funding could be provided to agencies as a separate output class.

� Removes the focus on evaluating new initiatives and gives agencies greater freedom to prioritise their evaluative efforts.

� If cross-agency votes were established, this approach would help prioritise evaluative effort across agency boundaries.

� Makes the expectation of an evaluative culture more explicit.

� Risks departments and Ministers choosing to evaluate “safe” or easy initiatives rather than high priority initiatives.

� Could lead to undertaking evaluations for evaluation’s sake.

� Reinforces disconnect between policy and evaluation.

49

Options to improve coordination and prioritisation

Option Advantages Disadvantages 2.1 Departments to have an evaluation

strategy

Departments are currently required to signal their evaluation intentions for each new budget initiative proposals. These requirements could be extended so that evaluation intentions must be presented in the context of a wider evaluation strategy that considers relative priorities of all the department’s evaluative activity in the context of their policy priorities.

� Removes the focus on evaluating new initiatives and ensures agencies prioritise their evaluative efforts.

� Allows for more long-term evaluative activity rather than driving quick and easy answers.

� Compatible with Managing for Outcomes initiative.

� Could be treated as a compliance exercise unless genuinely used by central agencies when scrutinising new initiatives (see Option 1.4).

� Agencies may not have adequate skills to do this.

� An overall evaluation strategy is still only useful if it is planned to improve management decisions and is actively used.

2.2 Centralised research and evaluation database

A centralised research and evaluation database could promote sharing of evaluation results and foster greater coordination and learning based on previously conducted evaluative activity.

One option would be to resume work on the research clearing house proposal that emerged from the Boomrock discussions among Public Service chief executives. These discussions suggested a central site in which all current, completed or proposed research projects by departments would be recorded, so that all agencies could draw on each other's research, agree to undertake research jointly, share findings and methods, and link data for synthesis.

� One place for recording all research and evaluative activity in State sector.

� Would facilitate synthesising the results of previous work to generate new insights.

� Could promulgate poor quality, as well as high quality, evaluative findings unless carefully checked.

� Requires constant updating and quality assurance.

� Requires clear ownership and accountability to both maintain and contribute accurately.

� Experience in the past suggests that such an initiative may not work.

2.3 Reward collaborative work

Chief executive performance management processes could set expectations and assess whether departments are working collaboratively, and reward or sanction as relevant. This could include whether evaluative effort is being coordinated and prioritised with related agencies, and whether relevant data is being collected and shared with related agencies.

� Creates incentives for better coordination and prioritisation of evaluative effort and sharing of data and findings.

� Aligns with other changes in the public management system.

� Central agency staff may not have adequate skills to assess.

� Hard to apply consistently across departments since chief executive performance management is highly tailored.

� Could be ineffective if not supported by Ministers.

� Resistance to settling expectations in chief executive performance agreements for central agencies and departments.

2.4 Clear cross-agency accountability arrangements

Many of the outcomes Government is seeking involve contributions from several departments. Establishing lead agencies for specific sectors or outcome areas could provide a mandate for certain agencies to play a lead role in coordinating and prioritising policy and evaluative effort within those sectors or outcome areas.

� Provides clear expectations about cross-agency coordination.

� Significant change from current accountability arrangements.

� Risk of policy development and evaluative activity being driven by accountability concerns.

50

Option Advantages Disadvantages 2.5 Coordinated program of major

policy evaluations

There could be a more coordinated effort to ensure that priority outcome areas were there is substantial uncertainty about effectiveness and efficiency are evaluated to help inform decisions about the future of those policies.

Central agencies in consultation with departments, could recommend to Ministers a limited program of evaluations of effectiveness and efficiency of major policies. This would involve checking what evaluative activity is underway or planned in relation to priority outcome areas and identifying any gaps.

� Increases evaluation of major, existing, cross-cutting policies, which is a current weakness.

� Provides greater clarity about the Government’s strategic priorities.

� Key Ministers’ commitment to these reviews would increase likelihood of findings being used to inform future decisions.

� Would result in investment of scarce evaluative resource into significant areas.

� Centralised approach could be viewed negatively by agencies unless they are adequately engaged in process.

� Agencies may regard these as the only evaluative activity that needs to be undertaken rather than just one component.

� Requires genuine commitment from Ministers to expose major areas of government spending to scrutiny.

2.6 Establish a competitive evaluation fund

The cross-departmental research pool (CDRP) is a competitive funding process for joint research efforts. A similar approach could be taken to funding major evaluative effort.

� Would facilitate prioritisation of evaluative effort.

� Would encourage cross-agency evaluative effort.

� Justification for additional funds not yet strong.

� Some current criticisms of CDRP.

� CDRP does already fund some joint evaluative work (although its focus is on research) so another vehicle may not be needed.

2.6 Extend the SPEaR mechanism to other sectors

SPEaR was established in 2001 to oversee the Government’s purchase of social policy research to ensure the spending is aligned with the Government’s social policy priorities. It includes representation from a wide range of social sector agencies providing a focal point for social policy researchers and acts as a vehicle for communicating with the social research sector. A similar mechanism could be established in other sectors.

� Should help coordinate and prioritise evaluative effort.

� Too early to tell whether the SPEaR approach has been successful in coordinating and prioritising social sector evaluative effort.

� Requires sector agencies to be willing to participate.

51

Options to develop capability

Option Advantages Disadvantages 3.1 Education for senior executives

The new set of competencies and training for senior executives as part of the Senior Leadership Management Development Strategy could be strengthened to have a greater focus on outcomes, performance reporting and evaluative thinking.

� Signals demand from central agencies.

� Increases managers’ understanding of value of evaluative activity, which should improve commissioning and use.

� Training can be easily ignored.

� Will not reach all senior managers.

� May take time before impacts of training are apparent.

3.2 Guidance about integrating evaluative activity into the policy cycle

Central agencies could develop and issue guidance that explains the value of different types of evaluative activity at different stages of the policy/programme cycle. For example, under what conditions would there be (or not be) value in undertaking a full impact evaluation, and when might monitoring be sufficient.

These could be issued as part of the Managing for Outcomes initiative.

� Meets demand from some departments for this guidance.

� Decisions about what evaluative activity to undertake when remain with departments, allowing innovation and flexibility.

� Existing international and local guidance could be drawn on to develop this with limited effort.

� Guidance can be easily ignored.

3.3 Good practice guidance on how to conduct evaluative activity

Central agencies could consult with organisations such as SPEaR and develop and issue good practice guidance on how to conduct evaluative activity.

These could be issued as part of the Managing for Outcomes initiative.

� Meets demand from some departments for this guidance, including with respect to evaluative activity with Maori which requires specific models and approaches.

� Existing international and local guidance could be drawn on to develop this with limited effort.

� Reading guidance will not equip people with sufficient technical skills to undertake complex evaluations.

� There is no one-size-fits-all approach to conducting high quality evaluative activity.

� Guidance can be easily ignored.

� Many agencies already have their own guidance for conducting evaluative activity.

3.4 Establish a network or community of practice

The SSC could help establish an evaluation community of practice. This could include resourcing a team of people with evaluative expertise to work with agencies to help develop their evaluative capability through either providing or facilitating training, networks and peer review. For example, the network could provide information about the location of best practice and expertise within and outside the State sector.

� Consistent with best practice and brokerage role recommended by RoC Central Agencies project.

� Fits well with the SSC role of assisting and building capability and of facilitating whole-of-government thinking.

� Evidence that communities of practice are effective in raising performance.

� Could involve a range of external evaluators, universities and others, to good advantage.

� Evaluators/policy staff may choose not to use it.

� Hard to maintain quality of people involved.

� Communities of practice already exist, could duplicate unless it is developed carefully.

52

Option Advantages Disadvantages 3.5 Encourage ANZSOG and other

tertiary institutions to provide evaluative training

The SSC, which provides some funding for the Australia and New Zealand School of Government, could seek to influence the curriculum priorities to raise the profile of evaluative thinking. It could also develop closer relationships with other universities.

� Builds on current efforts to improve collaboration between universities and the Public Service.

� Training can be easily ignored.

� Will reach only a limited number of people.

� May take time before impacts of training are apparent.

3.6 Build Maori evaluative capability

Make a resource for building Maori evaluative capability available centrally, e.g. scholarships, internships through SSC, rather than expecting departments to develop capability themselves as they determine.

� Would be a major assistance for building Maori evaluative capability and without such an impetus unclear that it would happen.

� Signals value placed on Maori evaluative expertise and the importance of Maori evaluative models.

� May result in departments not contributing if centrally run

� SSC does not tend to fund specific public sector capability development – this is the accountability of CEs

� It is not clear that in terms of building Maori capability, that evaluative skills are the top priority.

53

GLOSSARY

Central agencies Treasury, State Services Commission and Department of Prime Minister and Cabinet Effectiveness The impact of the goods and services produced for the community on outcomes. Efficiency The amount of goods and services produced from a given level of human, financial or other input. Evaluation In this paper evaluation refers to the more formal, discrete and technical type of evaluative activity. Different types of evaluation are valuable at different stages in the life cycle of a policy or programme. For example, formative evaluations add value early on by providing information to support and guide the implementation process. An impact or outcome evaluation adds most value once a programme has been operating for long enough to be stable and typically assesses the intended and unintended outcomes achieved. Evaluative activity Activity that seeks evidence of actual performance of a policy or programme once it is being, or has been, implemented. Formative evaluation Evaluation directed at improving programme development and implementation. Impact evaluation Evaluation directed at determining immediate, intermediate or longer-term outcomes of a programme. A distinction is often made between impact and outcome evaluation. In the former, a causal attribution is established between the intervention and its outcomes. With the latter this causal link is not fully established. Meta-analysis The systematic analysis of a set of existing evaluations of similar programs in order to draw general conclusions, develop support for hypotheses, and/or produce an estimate of overall program effects. Meta evaluation The evaluation of an evaluation usually carried out against a set of accepted standards. Monitoring Monitoring involves the collection of data and information about a policy or programme in order to check progress. It can be used to identify problems in the delivery and/or effectiveness of a policy, programme or service so that adjustments can be made. It is often an important complementary activity to an evaluation. Sometimes monitoring information provides enough information on its own so that a more formal or in-depth evaluation is not necessary.

54

Outcome evaluation Evaluation directed at determining the immediate, intermediate or longer-term outcomes related to a policy, programme or service. Performance audit A study designed to determine whether an agency is achieving the objectives established by legislation or policy and managing its resources in an effective, economical, and efficient manner. Performance audits can include compliance audits, management audits, and input, output, programme or impact evaluations Process evaluation Evaluation directed at describing or documenting what actually happened in the course of delivery of a policy, programme or service. Research The systematic skilled investigation and study of the existence and nature of things in order to establish conclusions and create new knowledge. Summative evaluation Evaluation carried out after a policy or programme is established, often at the end of its life cycle. Value for money Measures the relationship between outcomes achieved (results) and the underlying resources used to achieve these outcomes.

55

REFERENCES

i State Services Commission, (June 1999a). Looping the Loop: outcome evaluation and other risky feats, SSC

Occasional Paper No.7. http://www.ssc.govt.nz/Op7; State Services Commission (June 1999b). Section II Essential Ingredients - Improving the Quality of Policy Advice, Occasional Paper No. 9, http://www.ssc.govt.nz/Op9.

ii Boston, J. 1999, The New Zealand model of Public Management: A Brief Assessment, Paper prepared for

summit on “Processes of Governance and the Role and Place of the Individual, Community and Markets, Wellington, 31 May 1999. Aitken, J (1997) The Way we Carry out the Queen’s business: Big Issues to be Dealt With over the Medium Term: Information and Evaluation, Paper presented at Future Issues in Public management Conference, 26/3/1997; Aitken, J, (1998). Government Policy Evaluation: An international Challenge. Address to the 5th Annual New Zealand Public Policy Conference, 24/3/1998; Turner, David and Sally Washington (2002). “Evaluation in the New Zealand Public Management System”, in Jan-Eric Furubo, Ray Rist and Rolf Sandahl (Eds), International Atlas of Evaluation, Transaction Publishers: New Brunswick, New Jersey, 357-373.

iii State Services Commission (June 1999b) Section II, Essential Ingredients - Improving the Quality of Policy

Advice, Occasional Paper No. 9, http://www.ssc.govt.nz/Op9. iv Robinson, Marijke (1998) Review of sample of Cabinet papers to determine the level of evaluation proposed,

A report to the Strategic Development Branch, State Services Commission. v State Services Commission, (June 1999a). Looping the Loop: outcome evaluation and other risky feats, SSC

Occasional Paper No. 7. http://www.ssc.govt.nz/Op7; State Services Commission (June 1999b). Section II, Essential Ingredients - Improving the Quality of Policy Advice, Occasional Paper No. 9, http://www.ssc.govt.nz/Op9.

vi Turner, David and Sally Washington (2002). “Evaluation in the New Zealand Public Management System”,

in Jan-Eric Furubo, Ray Rist and Rolf Sandahl (Eds), International Atlas of Evaluation, Transaction Publishers: New Brunswick, New Jersey, 357-373; State Services Commission, (June 1999a). Looping the Loop: outcome evaluation and other risky feats, SSC Occasional Paper No. 7. http://www.ssc.govt.nz/Op7; State Services Commission (June 1999b). Section II, Essential Ingredients - Improving the Quality of Policy Advice, Occasional Paper No. 9, http://www.ssc.govt.nz/Op9.

vii Baehler, Karen, (2003) “Evaluation and the Policy cycle”, in Kate McKegg, Carl Davidson and Neil Lunt,

Pearson (Ed), Evaluating Policy and Practice: A New Zealand Reader, Pearson Education; Rae, David, (2002), Next steps for public spending in New Zealand: the pursuit of effectiveness, OECD Economics Department Working Papers No. 337, www.oecd.org/eco.

viii MacKay, Keith (1998). “The Development of Australia’s Evaluation System”, in Keith MacKay (ed), Public

Sector Performance – The Critical Role of Evaluation, Selected Proceedings from a World Bank Seminar, The World Bank: Washington D.C.

ix Boston, J. 1999, The New Zealand model of Public Management: A Brief Assessment, Paper prepared for

summit on “Processes of Governance and the Role and Place of the Individual, Community and Markets, Wellington, 31 May 1999.

x Mulgan, Geoff, “Government, Knowledge and the business of policy-making”, Talk delivered on 23 April

2003, Australia. xi Turner, David and Sally Washington (2002). “Evaluation in the New Zealand Public Management System”,

in Jan-Eric Furubo, Ray Rist and Rolf Sandahl (Eds), International Atlas of Evaluation, Transaction Publishers: New Brunswick, New Jersey, 357-373.

xii Baehler, Karen, (2003) “Evaluation and the Policy cycle”, in Kate McKegg, Carl Davidson and Neil Lunt,

Pearson (Ed), Evaluating Policy and Practice: A New Zealand Reader, Pearson Education.

http://www.ssc.govt.nz/Op7







http://www.oecd.org/eco

56

xiii See Pirich, A and A Tullett, (2003) Policies for Growth and Enterprise Development- Defining Evaluation

Parameters, 16th APEC SMEs Working Group Meeting, Malaysia; OECD (2002) The Bologna Process, Cross-Cutting Theme: Evaluation of SME Policies and Programmes, Country Presentation: “Growth Strategy and Enterprise Development - Defining Evaluation Parameters in New Zealand.”, Paris.

xiv Baehler, Karen, (2003) “Evaluation and the Policy cycle”, in Kate McKegg, Carl Davidson and Neil Lunt,

Pearson (Ed), Evaluating Policy and Practice: A New Zealand Reader, Pearson Education. xv Baehler, Karen, (2003) “Evaluation and the Policy cycle”, in Kate McKegg, Carl Davidson and Neil Lunt,

Pearson (Ed), Evaluating Policy and Practice: A New Zealand Reader, Pearson Education. xvi Te Puni Kokiri, Review of the Social Policy Agency, Te Ropu Here Kaupapa, Aroturuki me te Arotakenga,

Monitoring and Evaluation Branch, August 1999, 53.

Documents

Improving Evaluative Activity in the New Zealand … Evaluative Activity in the New ... what part government interventions have played in their achievement, ... Pathfinder project