Integrated Performance Information for Workforce Development
A Blueprint for States

DRAFT
February 2005

Washington State Workforce Training and Education Coordinating Board
On behalf of the Integrated Performance Information Project
State Teams from Florida, Michigan, Montana, Oregon, Texas, and Washington

Principal author: Bryan Wilson, Ph.D.
bwilson@wtb.wa.gov

This project was funded through Grant # AN 13397-03-60 from the U.S. Department of Labor, Employment and Training Administration, in the amount of one million dollars. The project is funded 100% by this federal grant. This interim report on the project reflects the opinions of the authors and does not represent official opinion or policy of the U.S. Department of Labor.

State of Washington
Workforce Training and Education Coordinating Board
128 10th Ave., S.W.
P.O. Box 43105
Olympia, Washington 98504-3105
Telephone: 360-753-5662 • Fax: 360-586-5862
http://www.wtb.wa.gov
Email: wtecb@wtb.wa.gov
Ellen O'Brien Saunders, Executive Director

Executive Summary  i
Introduction  1
Section One: Challenges and Responses  5
    Introduction  5
    Authority  5
    Culture of Shared Accountability  7
    Trust  9
    Capacity  10
    Funding  10
    Privacy  12
    Unemployment Insurance Wage Records  14
    Reaching Consensus on Goals and Measures  15
Section Two: IPI Performance Measures  17
    Introduction  17
    What Do Policy Makers Want to Know About Performance?  18
    What Makes for a Good Performance Measure?  20
    IPI Performance Measures  21
    Labor Market Results for Program Participants  23
    Skill Gains  26
    Results for Employers and the Economy  27
    Return on Investment  28
    Data Sources and Supplemental Data  31
    Performance Targets and Consequences  31
    Adjustments  34
    Definitions and Discussion of Key Terms  36
Section Three: Shared Information Systems  41
    Introduction  41
    Authorization and Governance  42
    Purpose of the Data Warehouse  42
    Leadership and Oversight  42
    Funding  43
    Management and Operation  44
    Participating Agencies/Programs  45
    Scope of DataLinks  45
    Data Sharing Agreements  45
    "Ownership" of the Merged Database  45
    Data Access  45
    Confidentiality  46
    Information Reporting  47
    Editorial Review Policies  48
    System Modifications  48
    Managing Information Flows  49
    Data Documentation  50
    Quality Assurance  50
    Data Transmittal, Storage and Archiving  51
    The Sequence for Addressing Issues  52
Conclusion  53
Appendix A: Participants  A-1
Appendix B: Additional Resources  B-1
References  93

Executive Summary

Introduction

This is a guide for states interested in creating or further developing integrated performance information for workforce development programs. Integrated performance information reports performance results consistently across programs, across levels (from institutions to local areas to states), or for programs as a system. It responds to the longstanding challenge and frustration caused by multiple, inconsistent performance measures across workforce development programs, a multiplicity that impedes collaboration in both planning and service delivery and befuddles policy makers. It also responds to shortcomings in programs' management information systems that cannot follow participants over time or report performance in a consistent manner.

Integrated performance information, however, is more than a shared information system and a set of consistent measures.
It also requires institutions and practices to support shared accountability for results. This Blueprint discusses each of the steps involved: establishing authority, building a culture of shared accountability and trust, generating capacity, crafting performance measures, setting and using targets, and creating and maintaining a shared information system.

Some states are at the initial stage of considering whether they want integrated performance information; others may have been at it for a long time, but are interested in improving their work. In either case, this Blueprint is intended to be of assistance. States may want to consider bits and pieces, or the whole thing, as best suits their needs.

The Benefits

There are many advantages to states having integrated performance information. They include increased accountability, improved strategic planning, better research, more efficient use of resources, and a sense of shared responsibility among workforce development programs. These advantages can improve the credibility of workforce programs and, in turn, enhance the support they receive and, ultimately, their ability to serve customers.

What is meant by workforce development?

The phrase "workforce development" encompasses programs that prepare people for employment and career advancement throughout their lives, and includes, but is not limited to:

• Secondary Career and Technical Education
• Postsecondary Career and Technical Education
• The Employment Service, Workforce Investment Act (WIA) Title III
• Workforce Investment Act Title I-B
• Trade Adjustment Assistance Act
• Adult Education and Family Literacy, WIA Title II
• Vocational Rehabilitation, WIA Title IV
• Temporary Assistance for Needy Families Work Program
• Apprenticeship

It may surprise some to learn that, when these programs are viewed as a system, most of their money comes from the states.
The largest programs, in terms of funding, are the education programs for which states typically supply at least 90 percent of the funds. Given this funding arrangement, it is logical that states exercise leadership in devising integrated performance information across workforce development programs.

The United States Office of Management and Budget (OMB) has taken an important initial step toward integrated performance information by issuing "common measures" for federal workforce development programs. OMB has drawn the attention of federal agencies, particularly the Department of Labor (DOL), to the need for consistent measures across programs. DOL has also taken the step of designing a new reporting system, the ETA Management Information and Longitudinal Evaluation System (EMILE), that is to be consistent across most Department programs. This Blueprint builds on these initial steps by recommending performance measures and an information system that would support consistency across state as well as federal workforce programs. Using the Blueprint does not, however, require implementation of a system such as EMILE.

The Blueprint was produced through the joint efforts of six states (Florida, Michigan, Montana, Oregon, Texas, and Washington), with the financial support of DOL. Washington State's Workforce Training and Education Coordinating Board (Washington Workforce Board), with the assistance of the National Governors Association's (NGA) Center for Best Practices, convened policy and technical teams from each of the six states. Each state team included representation from a cross-section of workforce development programs. This diversity of representation was a necessary ingredient for the success of the project. The team members endeavored to listen closely to the perspectives of each program and to arrive at solutions that were acceptable to all. (Appendix A lists participants.)
The state teams met several times during 2004 to share experiences and lessons learned, review technical papers, think through key questions, and arrive at consensus on key aspects of integrated performance information. In addition to the NGA's Center for Best Practices, the states received assistance from the Ray Marshall Center at the University of Texas and The Center for Governmental Studies at Northern Illinois University. The states benefited greatly from the research conducted on behalf of the project as well as from the general expertise and experience of these entities. They and the state teams provided much of the material for the Blueprint and reviewed and commented on drafts. The Blueprint is very much the shared product of the six states and their partners, although the Washington Workforce Board remains ultimately responsible.

The Blueprint consists of the following sections:

Part I: Challenges and Responses

States face serious challenges as they embark down the road of creating integrated performance information, and the challenges do not end once the work has begun. Most of the challenges are ongoing and require constant attention. This section of the Blueprint examines some of the major challenges and the choices that some states have made to address them. The section discusses: (1) establishing authority for integrated performance information, (2) creating a culture of shared accountability, (3) building capacity, including funding and addressing privacy issues, and (4) reaching consensus on goals and measures. For each challenge, the Blueprint presents examples of actions states have taken; the variety of actions reflects the institutional structures and political conditions in states. States reading the Blueprint may want to pick and choose from these examples and implement the steps that best fit their situation.
Part II: IPI Performance Measures

What are the best performance measures for workforce development if the same measures are applied horizontally and vertically within the system, including programs that are funded mostly by the states and programs that are funded mostly by Congress? This was a central question considered by the six states and their partners at NGA's Center for Best Practices and the Ray Marshall Center at the University of Texas. The states began by considering, "What do policy leaders want to know about performance results?" They then suggested the selection criteria for judging measures, and analyzed the advantages and disadvantages of a long list of possible measures. In the end, the states agreed on a relatively short list of measures that best respond to the performance questions commonly posed by policy leaders. This section of the Blueprint summarizes the discussion and recommendations of the six states' teams. Other states may wish to follow suit, either by following this type of process within their state, or by adopting some or all of the recommended measures.

The following table summarizes the performance measures recommended by the teams from the six states. The measures are separated into two groups: accountability measures, for which there could be targets and consequences, and performance indicators, which show how well the workforce development system is doing but do not satisfy the criteria for good performance measures well enough to be used for targets and consequences.

TABLE: IPI Performance Measures

Accountability Measures

Category: Labor Market Results for Program Participants
• Do people get jobs?
• What are they paid?
Measures:
1. Short-term Employment Rate: The percentage of participants who are employed during the second quarter after exit. (For youth, enrollment in education counts as well as employment.)
2. Long-term Employment Rate: The percentage of participants who are employed during the fourth quarter after exit. (For youth, enrollment in education counts as well as employment.)
3. Earnings Level: Median earnings during the second quarter after exit among all exiters with earnings.

Category: Skill Gains
• To what extent do education levels increase?
Measures:
4. Credential Completion Rate: The percentage of exiters who have completed a certificate, degree, diploma, licensure, or industry-recognized credential during participation or within one year of exit.

Category: Results for Employers and the Economy
• Are we meeting the needs of employers?
Measures:
5. Repeat Employer Customers: The percentage of employers who are served who return to the same program for service within one year.

Performance Indicators

Category: Results for Employers and the Economy
• Are we meeting the needs of employers?
Measures:
6. Employer Market Penetration: The percentage of all employers who are served during one year.

Category: Return on Investment
• What is the return on the investment?
Measures:
7. Taxpayer Return on Investment: The net impact on tax revenue and social welfare payments compared to the cost of the services.
8. Participant Return on Investment: The net impact on participant earnings and employer-provided benefits compared to the cost of the services.

END TABLE

The measures do not attempt to measure everything that is important to each program. These measures concentrate on outcomes important across workforce development programs. An individual program may have other goals and measures related to its particular mission. For example, adult education (WIA Title II) has a goal of improving literacy skill levels and measures linked to that outcome (see note 1). Each program may want to have an additional measure or measures related to its unique mission.

The state teams considered many other measures besides these eight. This section discusses some of these measures and why they did not rise to the top of the list.
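The participant measures in the table (measures 1 through 4) reduce to simple arithmetic once each exiter's outcomes are known. The following is a minimal sketch, not the project's own method. It assumes a hypothetical record layout (the `Exiter` class and its field names are invented for illustration), takes all exiters in the group as the denominator for each rate, treats "exiters with earnings" as those with earnings above zero, and leaves out the rule that counts enrollment in education for youth. The sample data are made up.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Exiter:
    """One program exiter. All field names are hypothetical."""
    employed_q2: bool             # employed in the 2nd quarter after exit
    employed_q4: bool             # employed in the 4th quarter after exit
    earnings_q2: float            # earnings in the 2nd quarter after exit
    credential_within_year: bool  # credential during participation or within 1 year of exit


def percent(count: int, total: int) -> Optional[float]:
    """Return count/total as a percentage, or None for an empty group."""
    return 100.0 * count / total if total else None


def short_term_employment_rate(exiters: List[Exiter]) -> Optional[float]:
    """Measure 1: share of exiters employed in the 2nd quarter after exit."""
    return percent(sum(e.employed_q2 for e in exiters), len(exiters))


def long_term_employment_rate(exiters: List[Exiter]) -> Optional[float]:
    """Measure 2: share of exiters employed in the 4th quarter after exit."""
    return percent(sum(e.employed_q4 for e in exiters), len(exiters))


def earnings_level(exiters: List[Exiter]) -> Optional[float]:
    """Measure 3: median 2nd-quarter earnings among exiters with earnings."""
    with_earnings = [e.earnings_q2 for e in exiters if e.earnings_q2 > 0]
    return median(with_earnings) if with_earnings else None


def credential_completion_rate(exiters: List[Exiter]) -> Optional[float]:
    """Measure 4: share of exiters who completed a credential."""
    return percent(sum(e.credential_within_year for e in exiters), len(exiters))


# Invented sample data: four exiters from one hypothetical program.
sample = [
    Exiter(True, True, 5000.0, True),
    Exiter(True, False, 3000.0, False),
    Exiter(False, False, 0.0, True),
    Exiter(True, True, 4000.0, False),
]

print(short_term_employment_rate(sample))  # 75.0
print(long_term_employment_rate(sample))   # 50.0
print(earnings_level(sample))              # 4000.0
print(credential_completion_rate(sample))  # 50.0
```

Because the same functions can be applied to any group of exiters (one institution, one local area, or a whole state), a sketch like this also illustrates what applying a measure consistently across programs and levels would look like in practice.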
This section also discusses data sources, setting and using performance targets, adjusting targets or results for economic conditions and participant characteristics, and how national research complements state performance measurement.

Note 1: Participants in Adult Education and Family Literacy who are in the program for a reason that is not work-related may be excluded from the performance measures presented here.

Part III: Shared Information Systems

A major barrier to integrated performance information is the absence of a shared or integrated information system for workforce development in most states. Participant information remains scattered across various program-level Management Information Systems (MISs), and there is usually no system for integrating the information from the multiple MISs to support cross-program performance measurement and reporting.

This section of the Blueprint discusses the steps and decisions involved in establishing a "data warehouse" that links administrative records from multiple programs with other data sets containing outcome information, such as unemployment insurance wage records. A data warehouse is built on top of existing MISs and does not replace them. The MISs are still necessary for program management. The warehouse is a longer-term repository where data are cleaned and matched in order to analyze and report performance outcomes and to conduct research. Creating a data warehouse requires states to make decisions regarding: authorization; leadership; funding; scope; data ownership, confidentiality, and access; information flows; reporting; and quality assurance, among other issues. This section of the Blueprint walks through each of these issues and the major options.

Conclusion

States will want to consider how to use this Blueprint given federal initiatives in this area, including the OMB's common measures, the DOL's EMILE System, and the pending reauthorization of the Workforce Investment Act (WIA), the Carl D.
Perkins Vocational and Technical Education Act, and other federal acts related to workforce development. States may choose to proceed in a number of different ways.

The development of a shared information system is something that should be useful regardless of the outcomes of federal initiatives. Electronically linking records from multiple programs with files containing outcome data will facilitate the implementation of common measures and the reporting of performance information likely to be required by the reauthorized acts, as well as meeting state-identified needs.

If states find some or all of the performance measures recommended here to be useful, they can implement them in addition to the measures necessary to satisfy federal or other state requirements. Experience has shown, moreover, that federal performance measurement requirements evolve over time. If states find the IPI measures to be useful in responding to policy makers' needs, the measures may be reflected in future generations of federal acts and guidelines. Federal performance requirements have a powerful effect on program implementation and results; if states find the IPI measures to be useful, it would be very helpful if future federal requirements were aligned with them.

Finally, whatever the specific course of events in Congress or the federal agencies may be, the basic issues of building the capacity for and a culture of shared accountability are likely to be challenges that remain with states. The state teams believe the experiences and lessons shared here will help workforce development leaders as they continue to face these challenges.

Introduction

This is a guide for states interested in creating or further developing integrated performance information for workforce development programs.
It responds to the longstanding challenge and frustration caused by multiple, inconsistent performance measures across workforce development programs, a multiplicity that impedes collaboration in both planning and service delivery and befuddles policy makers. It also responds to shortcomings in programs' management information systems that cannot follow participants over time or report performance in a consistent manner.

Integrated performance information reports performance results consistently across programs, across levels (from institutions to local areas to states), or for programs as a system. Examples include: using the same measure to report both employment rates for a radiology program at a community college and employment rates for the state's two-year colleges as a whole; One-Stop center measures whereby the results for One-Stop participants are summed up without regard to program silo; and measures that indicate the results for taxpayers or employers that are produced jointly by multiple programs.

Integrated performance information, however, is more than a shared information system and a set of consistent measures. It also requires institutions and practices to support shared accountability for results. The Blueprint discusses each of the steps involved: establishing authority, building a culture of shared accountability and trust, generating capacity, crafting performance measures, setting and using targets, and creating and maintaining a shared information system.

There are many advantages to states having integrated performance information. The advantages include increased accountability, improved strategic planning, better research, administrative efficiency, better collaboration among service providers, and a sense of shared responsibility among workforce development programs. These advantages can improve the credibility of workforce programs, enhancing their political and budget support and, thus, their ability to serve customers.
Accountability for performance is a banner raised by virtually every elected official. Elected officials want programs to report their results in understandable language so the officials can judge whether or not the programs are good investments. Elected officials are also very busy people. They do not have time for someone to explain dozens of different measures, each with a different denominator, and multiple definitions for seemingly straightforward terms such as employment and earnings. A core of relatively simple performance measures with consistent definitions and methodology is a great tool for convincing elected officials that programs track how they are performing and are accountable for results. Without such evidence, programs cannot expect legislative or budgetary support, particularly during tight fiscal times.

Workforce development programs complement one another in serving many of the same customers. The same individual may attend secondary and postsecondary career and technical education, use labor market information at a One-Stop, and find a job through the Employment Service. The results are the shared effect of the services of all these programs. The results will be better and more efficient when states plan the services of multiple programs strategically in an integrated fashion. Having consistent performance measures across programs, and shared information to support them, greatly facilitates such planning.

Providing good service to customers depends upon knowing what works and what needs to be improved. Integrated performance information enables analysis of the relationship between services and results. For example, a comprehensive system can perform a longitudinal analysis of what happens to customers over time, regardless of funding source, and provide breakdowns by demographic characteristics and geographic location. One can detect the patterns of service and program participation that work best for different sets of customers.
Such an analysis is impractical without consistent performance measures and a shared information system.

Integrated performance information is also more efficient: redundant efforts can be eliminated by integrating aspects of separate reporting systems. Electronic linking of program records with outcome data can replace more expensive methods such as surveys.

The advantages of integrated performance information for public programs are perhaps nowhere greater than in the area of workforce development. For many years, observers have commented on the system's complexity: there are many programs with insufficient coordination among them, which inconveniences customers and confuses planners and policy makers (USGAO 1994, 2003). Integrated performance information can improve coordination, enabling programs to act more as a system even though they may remain administered by separate agencies with different sources of funds and different governing authority (Trott and Baj 1996).

Workforce development also has a reputation, some of it deserved and some not, for questionable results. Reports of weak performance for certain programs have, in the minds of some, seeped over to other programs, unfairly tainting the whole enterprise. (This is another example of how the programs share responsibility for results.) Integrated performance information is a tool that can demonstrate the performance of the system, showing where performance is strong, and informing improvements where performance is relatively weak.

SIDEBAR: Questions IPI can answer:

• What kinds of jobs are program participants getting?
• What industries are employing them?
• How much are they earning?
• Are they staying on the job?
  For example: To what extent do individuals who obtain employment in hospitality or retail sales achieve wage progression over time?
• What strategies, combinations of strategies, and approaches yield the best results for employment? ... for reducing social-welfare payments? ... for reducing incarceration?
  For example: Do postsecondary career and technical education students have better results if they first complete career and technical education in high school?
• How do gender, race, prior education or employment, or disabilities affect results?
  For example: Does the effectiveness of job search assistance or training depend on previous employment?
• What is the difference in results for program participants who complete and those who do not?
  For example: Does obtaining a certificate or diploma matter?
• Are there geographic differences in what works and what doesn't?
  For example: Do some areas have more success in serving people with disabilities and might be offered as examples to others?

END SIDEBAR

What is meant by workforce development?

Workforce development programs prepare people for employment and career advancement throughout their lives, and include, but are not limited to:

• Secondary Career and Technical Education
• Postsecondary Career and Technical Education
• The Employment Service, Workforce Investment Act (WIA) Title III
• WIA Title I-B
• Trade Adjustment Assistance Act
• Adult Education and Family Literacy, WIA Title II
• Vocational Rehabilitation, WIA Title IV
• Temporary Assistance for Needy Families Work Program
• Apprenticeship

When these programs are viewed as a system, most of their money comes from the states (Center for Public Policy and Priorities 2003 and Michigan League for Human Services 2003). The largest programs, in terms of funding, are the education programs for which states typically supply at least 90 percent of the funds. Given this funding arrangement, it is logical that states exercise leadership in devising integrated performance information across workforce development programs. Even for the programs that are funded mostly by Congress, such as the Workforce Investment Act (WIA) Title I-B, the actual skills training is provided mostly by state-supported community and technical colleges.
The federal funds usually pay for the cost of tuition, fees, and books, while the states pick up the larger share of the colleges' costs through their general funds.

The United States Office of Management and Budget (OMB) has taken an important initial step toward integrated performance information by issuing "common measures" for federal workforce development programs. OMB has drawn the attention of federal agencies, particularly the Department of Labor (DOL), to the need for consistent measures across programs (ETA 2003). DOL has also taken the step of designing a new reporting system, the ETA Management Information and Longitudinal Evaluation System (EMILE), that provides for consistency across most department programs (ETA 2004). This Blueprint builds on these initial steps by recommending performance measures and an information system that would support consistency across state as well as federal workforce programs. Using the Blueprint does not, however, require implementation of a system such as EMILE.

Most states do not have integrated performance information for workforce development programs. There is no consistent information across programs, across levels, or for workforce development as a system. Governors and legislators, their staff, and other policy leaders cannot easily make heads or tails of all the different federal and state performance measures, or of how the programs use information on results to improve performance. Most states do not have shared information systems that can produce consistent information on results, provide longitudinal data, or slice and dice data in different ways as needed for policy analysis.

Some states are at the initial stage of considering whether they want integrated performance information; others may have been at it for a long time, but are interested in improving their work. In either case, this Blueprint is intended to be of assistance.
States may want to consider bits and pieces, or the whole thing, as best suits their needs. The Blueprint was produced through the leadership of six states (Florida, Michigan, Montana, Oregon, Texas, and Washington), with the financial support of DOL's Employment and Training Administration. Washington State's Workforce Training and Education Coordinating Board (Washington Workforce Board) with the assistance of the National Governors Association's (NGA) Center for Best Practices convened policy and technical teams from each of the six states. Each state team included representation from a cross-section of workforce development programs. (See Appendix A for members of the state teams and others involved throughout the project.) This diversity of representation was a necessary ingredient for success. The team members endeavored to listen closely to the perspectives of each program and to arrive at solutions that were acceptable to all. The state teams met several times during 2004 to share experiences and lessons learned, review technical papers, think through key questions, and arrive at a consensus on key aspects of integrated performance information. In addition to NGA's Center for Best Practices, the states received assistance from the Ray Marshall Center at the University of Texas and The Center for Governmental Studies at Northern Illinois University. The states benefited greatly from the research conducted on behalf of the project as well as earlier research and from the general expertise and experience of these entities. They and the state teams provided much of the material for the Blueprint and reviewed and commented on drafts. During the fall of 2004, the IPI project conducted two Institutes attended by teams from ten other states. Each state's team included representatives from a cross section of programs and both policy and technical staff (Appendix A). 
Consultations were also held with national experts in research and evaluation, advocacy organizations, and representatives of business and labor (see Appendix A). These Institutes and consultations provided valuable feedback and contributed to the final product. The Blueprint is very much the shared product of the original six states and their partners, although the Washington Workforce Board remains ultimately responsible.

The Blueprint covers the challenges states face in creating integrated performance information, and how leading-edge states have responded to these challenges. It talks about unclear authority, the lack of a culture of accountability, insufficient capacity, and a lack of consensus on goals and measures. The Blueprint offers experiences in handling these challenges as options that other states may want to consider, rather than as the right answers or the only paths to success.

The Blueprint discusses what policy leaders want to know about performance results, and selection criteria for choosing the best measures to answer their questions. The Blueprint presents the performance measures that these six states came to agree are the best measures for integrated performance information. It also covers issues of setting performance targets, using regression models to adjust targets when conditions change, establishing consequences for results, and how state performance measurement and national research complement one another.

Finally, the Blueprint talks about shared information systems. The Blueprint assumes that states will produce integrated performance information by linking electronic administrative records maintained in multiple management information systems (MISs), as opposed to building a single huge MIS. The prime example of such linking is matching participant records with unemployment insurance wage records in order to create data on participant employment and earnings.
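The matching of participant records with wage records described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a description of any state's system: the field names (`person_id`, `exit_year`, `exit_quarter`, `wages`), the helper names, and the sample records are all hypothetical, and "employed" is assumed here to mean having any reported wages in the target quarter.

```python
from collections import defaultdict


def quarter_index(year, quarter):
    """Map (year, quarter 1-4) to a single integer so quarters can be added."""
    return year * 4 + (quarter - 1)


def link_outcomes(participants, wage_records, quarters_after_exit=2):
    """Attach earnings in the Nth quarter after exit to each participant record.

    Records from several program MISs can be passed in one list; the match
    does not depend on which program a participant came from.
    """
    # Sum wages per person and quarter, since one person may have
    # several employers (and so several wage records) in a quarter.
    wages = defaultdict(float)
    for w in wage_records:
        key = (w["person_id"], quarter_index(w["year"], w["quarter"]))
        wages[key] += w["wages"]

    linked = []
    for p in participants:
        target = quarter_index(p["exit_year"], p["exit_quarter"]) + quarters_after_exit
        earnings = wages.get((p["person_id"], target), 0.0)
        # Assumption: any reported wages in the quarter counts as employed.
        linked.append({**p, "earnings": earnings, "employed": earnings > 0})
    return linked


# Invented sample records from two hypothetical program MISs.
participants = [
    {"person_id": "A1", "program": "WIA Title I-B", "exit_year": 2003, "exit_quarter": 3},
    {"person_id": "B2", "program": "Adult Education", "exit_year": 2003, "exit_quarter": 4},
]
wage_records = [
    {"person_id": "A1", "year": 2004, "quarter": 1, "wages": 4200.0},
    {"person_id": "A1", "year": 2004, "quarter": 1, "wages": 800.0},
    {"person_id": "B2", "year": 2004, "quarter": 1, "wages": 3000.0},
]

linked = link_outcomes(participants, wage_records)
for row in linked:
    print(row["person_id"], row["program"], row["earnings"], row["employed"])
```

In this sample, A1 exited in the third quarter of 2003, so the second quarter after exit is the first quarter of 2004, where two wage records sum to 5000.0. B2 exited one quarter later and has no wage record for the second quarter of 2004, so the sketch reports no earnings for that quarter.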
An information system that performs such linking for multiple programs is what is meant by a shared information system. In order to perform such linking, the Blueprint recommends a data warehouse approach and walks through the basic steps, including governance, funding, data ownership, reporting, and data dissemination. One last note before beginning: the Blueprint focuses on policy level information—the type of information needed by elected officials and their staff as well as state directors of programs and their policy staff. It is not designed for those seeking day-to-day management information. As discussed later in this document, pp. 17-18, these are two different audiences with very different sets of needs that require different solutions. While this gap in the Blueprint may be disappointing to hardworking staff at the ground level, without the support of policy leaders, their good work would not be possible. Challenges and Responses Introduction States face serious challenges as they embark down the road of creating integrated performance information. And the challenges don't end with the beginning. Most of the challenges are ongoing and require constant attention. This section of the Blueprint examines some of the major challenges and some choices that states have made to address them. The section discusses: (1) establishing authority for integrated performance information, (2) creating a culture of shared accountability, (3) building capacity—including funding and addressing privacy issues, and (4) reaching consensus on goals and measures. For each challenge, the Blueprint presents some examples of actions states have taken, the variety of actions reflecting the institutional structures and political conditions in states. States reading the Blueprint may want to pick and choose from these examples and implement those steps that best fit their situation2. I. 
Authority

Given the fragmentation of workforce development programs, there is generally no clear authority for any one entity to lead integrated performance information for the workforce system. The multiplicity of silos among federal programs is well documented (Blalock and Barnow 2001, Barnow and King 1996, King 1988). Three different federal agencies (Labor, Education, and Health and Human Services) are responsible for administering the largest programs, and there is no standing interagency body to coordinate efforts. The fragmentation begins in Congress where different committees deal with labor and education programs on the one hand, and social welfare programs on the other.

At least in Congress, the same committees in the House and Senate deal with both education and labor programs. This is usually not the case in states, where frequently there are separate legislative committees for K-12 education, higher education, labor, human services, and economic development, each with a piece of the workforce development puzzle. Programs tend to bear the stamp of one committee or another, and policy areas and issues that span committee jurisdictions are not always well managed.

Governors have often taken the lead in addressing fragmentation among workforce development programs (Barnow and King 2000, Grubb et al. 1999). They have followed one of three paths: (1) consolidating workforce development programs into a single agency, (2) establishing a coordinating body, or (3) attempting coordination through interagency agreements. Whichever path is followed, there needs to be a champion of integrated performance information in order to make it happen, and most states have found that the clearer and stronger the established authority, the more likely the effort is to last and the state to succeed.

In Washington, the Governor requested legislation to create the Workforce Training and Education Coordinating Board (Washington Workforce Board).
Established in 1991, the Washington Workforce Board is an independent agency that does not operate any of the major programs. Instead, it focuses on policy planning and performance accountability. One of the primary motivations behind its creation was to establish a consistent performance accountability system for workforce development programs. The intent section of the Board's authorizing statutes states:

"The workforce training and education system's data and evaluation methods are inconsistent and unable to provide adequate information for determining how well the system is performing on a regular basis so that the system may be held accountable for the outcomes it produces." (Revised Code of Washington 28C.18.005)

[2] More information on these examples and examples from other states may be found in O'Shea, Looney, and King, 2003.

The statutes give the Washington Workforce Board the authority to establish common definitions, common standards for data collection and maintenance, common standards for program evaluations, and the assignment to conduct evaluations of the workforce development system, including net impact and cost benefit evaluations. This authority was of keen interest to the business and labor communities who strongly felt, and correctly so, that the state did not have a good handle on how well its workforce programs were performing. They championed these provisions before the legislature.

The legislation also gained the support of the administrative agencies, in part, because the Washington Workforce Board would not itself administer programs and the largest administrative agencies (and only the largest) would each have one vote on the Board (with business and labor having a voting majority). The administrative agencies believed this structure would enable the Board to neutrally establish standards that would create a level playing field among the major programs.
They also believed that independence from program operation would give the Board credibility when reporting performance results to the Governor and Legislature. (See Appendix B for more information on this and other statutes cited in the Blueprint.)

In Texas, two paths have been followed. Texas has both merged agencies and created a coordinating body with clear responsibility for establishing integrated performance information. In 1993, Texas established in statute the Texas Council on Workforce and Economic Competitiveness with responsibility for strategic planning and oversight for the state's workforce development programs. In 1995, again by legislation, Texas merged two dozen workforce programs into a single new agency, the Texas Workforce Commission. The legislation also maintained the Council on Workforce and Economic Competitiveness, moved the Council into the Governor's Office, and charged the Council with establishing comprehensive, system-wide performance measures.

In 2001, Texas enacted legislation providing additional guidance to the Council, now named the Texas Workforce Investment Council, regarding the development of performance measures. This legislation suggests in general terms some possible measures, including job placement rates, job retention rates, and wage rates. The legislation reinforces the authority of the Workforce Investment Council by clearly stating:

"The council shall include in the strategic plan goals, objectives, and performance measures for the workforce development system that involve programs of all state agencies that administer workforce programs."

The Governor has supported the work of the Council, and approved the performance measures developed by the Council, with the Governor's Office of Budget, Planning and Policy playing a key role.

Oregon's Governor established a Governor's Workforce Policy Cabinet.
The Cabinet includes the agency directors for workforce development, economic development, and education, in addition to the Governor's own workforce and education staff. In a sense, the cabinet is a virtual alternative to consolidating agencies. The cabinet discusses policy issues, including systemwide indicators, to form a consensus and agree on action. Florida has followed another path. The Governor and Legislature created Florida's integrated performance information system prior to establishing a broader coordinating body for workforce development. In 1982 the Governor's staff prompted an initial interest in student follow-up analysis using unemployment insurance wage records through a study of vocational education. The study concluded that the available information was inadequate, especially information supporting analysis of labor market outcomes. As a result, the legislature created the Occupational Identifier Project in 1984. The Occupational Identifier Project used unemployment insurance wage records to measure post-program employment. At the same time, the legislature enacted a performance requirement for secondary and postsecondary vocational education that 70 percent of any program's completers had to be positively placed as a condition of continued funding. Initially, placement could be collected through local mail or telephone surveys or through the Occupational Identifier Project. In 1988 as a result of the Project's progress, the legislature enacted statutes creating the Florida Education and Training Placement Information Program (FETPIP) as a special unit within Florida's education agency. FETPIP then became the primary resource for documenting compliance with the vocational education performance requirement (Pfeiffer 2004). Since 1988, FETPIP has expanded its follow-up data coverage and its services to education and training organizations, including program evaluation, performance-based funding, and consumer information. 
FETPIP follow-up data includes employment and earnings, postsecondary education, military enlistment, incarceration, and welfare participation. FETPIP produces information for:

• Reviews of programs of study at four-year colleges and universities.
• Florida's performance-based budgeting system.
• Workforce Investment Act performance indicators.
• Carl D. Perkins performance indicators.
• Adult Education and Family Literacy performance indicators.
• K-12 school reports.
• The Legislative Office of Program Policy Analysis and Government Accountability.
• Florida's K-20 Education Performance Accountability System.
• Consumer tools such as college program catalogues.
• Research, including return on investment studies, the costs of dropping out of high school, and studies of the education pipeline.

The scope of the FETPIP statute has been broadened over the years to reflect these additional functions.

II. A Culture of Shared Accountability

Legal authority by itself does not necessarily make things happen. Statutes are full of provisions that do not mean a thing, either because of resistance or a lack of interest. Building a culture of accountability (patterns of behavior and beliefs that are truly supportive of accountability) takes a great deal of time and effort. This is especially true when creating a culture of shared accountability that crosses many programs.

Workforce development programs have long been held accountable for their own performance, dating from the introduction of the first federal performance standards over two decades ago. Getting individual programs to pay serious attention to the performance of other programs and of the system as a whole, and to recognize the benefits of an integrated system, is another matter. Performance measurement requirements that span multiple programs involve giving a third party authority to set standards for others.
A program may be part of that entity if it is a board or council, but so will others, and their views are not going to be identical. There is inherent resistance to another entity having authority and responsibility for establishing performance accountability requirements that apply to one's own program. There will be fear that another entity will misuse its authority to one's disadvantage. This may be due not to any deliberate intent to harm, but simply to insufficient understanding of the purpose of the program and the customers it serves. To succeed, trust that authority will not be misused must be built over time. Finally, in order to obtain active participation, programs need to see something in it for themselves, a benefit from participating, rather than viewing it as just another compliance activity.

States have employed a number of strategies to create a sense of shared accountability for performance results. These efforts have included both large-scale state initiatives and efforts that appeal more narrowly to programs' self-interest.

In the case of Oregon, the effort began with a statewide strategic planning effort championed by the Governor: "Oregon Shines" (Oregon Progress Board, 1989). Oregon launched this planning process in 1988 in response to a severe recession. The focus was, and continues to be, on economic vitality, and workforce development is a major component. The first of the three key goals of the original plan was: "A superior work force: Invest in Oregonians to build a work force that is measurably the most competent in America by the year 2000, and equal to any in the world by 2010."

The process began with 16 committees composed of 180 business, labor, education, and government leaders. Input was obtained from literally thousands of people throughout the state. Among the committees' recommendations was the creation of the Oregon Progress Board, chaired by the Governor, to carry on their work.
The Progress Board was created with the responsibility "to remind us of our shared vision, monitor our progress in achieving measurable goals, and bring choices to our attention." As part of this strategic planning process, Oregon developed the "Oregon Benchmarks," initially 259, now 100, measures of success. The benchmarks include goals that are not the responsibility of any one program or agency, and results that go beyond outcomes for program participants. According to the Oregon Progress Board, the benchmarks have been "an excellent tool for encouraging collaboration among different interests, engendering long term thinking and developing results oriented management systems." In 1997, the Progress Board issued "Oregon Shines II," (Oregon Progress Board, 1997). The benchmarks related to workforce development include: • The percentage of high school students that have completed a structured work experience: The goal for 2010 is 100 percent. • The percentage of Oregon adults (age 25 and older) who have completed an associate degree in professional-technical education: The goal for 2010 is 10 percent. • The percentage of Oregonians in the labor force who received at least 20 hours of skills training in the past year: The goal for 2010 is 100 percent. The benchmarks are the broadest level of three tiers of performance measures. The most specific of the three tiers is program performance measures, including federally required program measures and other measures that are reported to the state legislature. Since 1993, the legislature has required agencies to establish program measures linked to the benchmarks, and to include the measures as part of the agency biennial budget submittal. The middle of the three tiers are system-wide measures for workforce development, looking at results of the primary One-Stop partners. This three-tiered system of performance measures has engendered a pervasive culture of performance accountability in Oregon. 
It has helped to create a focus on ultimate results and a sense that "we are all in this together." There are other paths to creating a culture of accountability that do not depend upon such a big, statewide effort. Florida's FETPIP system has created a culture of accountability built upon the individual interest of programs and organizations (not that the state level initiative has been unimportant). FETPIP can calculate many measures providing information of unique value to particular programs. The direct value of the information to particular customers was especially crucial in the beginning stages. After a few years of initial development, FETPIP was approved to proceed with a pilot demonstration project. As a part of the pilot, FETPIP staff worked with three large (and influential) school districts, with postsecondary programs, and one large community college. Staff worked directly with their data and program managers to understand the particulars regarding their data as well as the reasons for collecting the data. As initial matches were made using student records and unemployment insurance wage reports, staff carefully reviewed results with, and incorporated suggestions from, the pilot agencies. As FETPIP staff and local users became more comfortable working with the data, they generated detailed reports that carved the data in as many ways as they could to pique interest and explored new ways of using data to improve programs. The payoff was that when FETPIP staff needed support with legislative committees, they did not need to lobby for that support on their own. They had created "champions" who could speak well of the effort. Since then, FETPIP has continued this strategy. It is constantly on the lookout for opportunities to apply its services, and thereby create more champions. Another means of engendering support for shared accountability is more direct financial incentives. 
While the provision remains controversial among states, the Washington Workforce Board finds the inter-program incentive fund authorized by Section 503 of WIA useful for reinforcing shared accountability. That section authorizes the allocation of incentive funds to states that meet performance targets for each of WIA Title I, Adult Education and Family Literacy, and Carl D. Perkins Vocational and Technical Education. If a state fails to meet the targets for any one of the three, it is ineligible for an incentive award. When Washington receives a Section 503 award, the Workforce Board allocates the funds to local areas where all three programs achieve the targets, and local areas may expend the funds only on activities that benefit all three programs. This has helped local programs appreciate the importance of their colleagues' results and enhanced the sense of shared accountability.

Trust

A culture of shared accountability is built upon trust: trust that performance information will not be used against one's interests. Trust is built incrementally, step by step.

Washington's Workforce Board began by simply reporting performance. As the Board proceeded to fulfill its statutory responsibilities, its initial strategy was to have programs get used to having another entity report their performance. Other agencies experienced the advantages of having their performance reported to the Governor and Legislature by a third party, an agency that could have greater credibility since it was not directly responsible for program performance. In some cases, executive and legislative budget leaders indicated they supported the budget enhancement requests of other agencies due to the positive findings of evaluations conducted by the Workforce Board.
The Workforce Board established performance targets for other agencies' programs only after four years of performance reporting, and only after the targets were in place for a year were there any financial consequences for performance (through the allocation of relatively small pots of incentive funds).

The Workforce Board staff discusses draft evaluation reports with the technical and policy staff of relevant agencies before they are published. This gives the agencies the opportunity to correct any mistakes or misinterpretations of the data and to suggest changes in the tone or substance of the narrative. The reports avoid any language that smacks of "gotcha." When numbers are low, programs are not labeled as "bad" (as some legislators have desired). Instead, the reports speak of "areas for improvement." And care is taken that the decision criteria used for identifying areas needing improvement are the same across programs.

Regression modeling, another feature of the Workforce Board's accountability system, can be a useful tool for building trust. Programs are usually concerned that their results will look bad if their customers are especially hard to serve or if the economy goes downhill. Programs are worried that they will be "punished" for factors outside their control or for doing the right thing. Regression models and other techniques that adjust either performance targets or results for changes in participant demographic characteristics or economic conditions can help alleviate these fears.

Florida emphasizes that an important precondition of trust is confidence that the data are good. When data are collected by linking several databases from different agencies, one is dependent on controls that may or may not be applied by others. For example, the accuracy of employment data requires that employers accurately record and report employee identification and payroll information.
It also requires that the unemployment insurance agency enters the employers' data accurately and assigns the North American Industry Classification System (NAICS) codes correctly. Similarly, the accuracy of participant data requires that social security numbers be collected, recorded, verified, and transmitted accurately by frontline staff. Other participant information, such as demographics, socio-economic characteristics, and program characteristics, must also be faithfully represented.

For the sake of quality control, those producing integrated performance information by linking records from different sets of administrative data should have a clear understanding of the data sets involved. They should know how the data are originally collected and recorded, how they are processed, how they are defined, and what they represent. As the record linking activity proceeds, any anomalies or problems should be brought to the attention of the source agency for clarification or resolution.

Use is often the best means of maintaining data quality. By actually using data and alerting appropriate staff when anomalies are found, data are kept clean and their quality improved over time. This process requires expert staff at both the source agencies and the agency responsible for data linking.

While formal agreements in Florida stipulate many of the circumstances regarding data handling and processing, there is by now a sense of trust that goes beyond formal agreements. Maintaining this trust involves working with agencies to assist them in recognizing and resolving problems that are uncovered in the data matching process. It also means continually working with them to ease any processing burdens associated with the matching effort. Where there are questions regarding the release of certain data or analysis, even if the release appears to fall within the agreement's parameters, the cooperating agency is consulted.

III.
Capacity In addition to authority and will, designing and operating a system of integrated performance information requires the capacity to do so. States must have the financial resources and staff with the required skills to make it happen. They must also have appropriate access to the necessary data in order to consistently measure performance across programs and levels. Funding Some have expressed concern that integrated performance information would require large investments they cannot make; states' experiences do not support this concern. In Texas, workforce development agencies share in the costs of integrated performance information as set forth in state statute enacted in 2001. The statute directs the Texas Workforce Investment Council to establish, with the approval of the Governor, a formula "to determine the level of support each agency administering a workforce program must provide to operate the automated follow-up and evaluation system." The formula developed by the Council is based on the number of seed records evaluated for each workforce training or education program. In Florida, FETPIP is supported by federal and state general funds, as well as fees for service. FETPIP had rather modest beginnings. In 1984, the legislature appropriated $60,000 for the initial study and design, enough to support one staff person and some consulting. Over the course of the next several years, support grew, including funds from Wagner-Peyser and the Job Training Partnership Act, supporting a staff of four. In 1988, the legislature created FETPIP in statute and provided for its staffing, including a director, two senior data base analysts, two research assistants, and clerical support. Its annual budget was about $350,000. Beginning in 1994, FETPIP began collecting some data on a quarterly basis (up to then, data collection was annual). FETPIP's budget and staff increased accordingly. 
At the current time, FETPIP has nine professional level permanent positions and one full-time clerical staff. The professional positions include a director, two senior data analysts, two systems analysts, three junior level computer programmers and one research assistant. The annual budget includes approximately $400,000 in state general revenue resources, $200,000 from the Agency for Workforce Innovation (from a combination of federal WIA, Wagner-Peyser, and TANF resources), and $60,000 from the Perkins Act. FETPIP also receives funding through fees for specialized services. The Workforce Board relies on three funding sources to support its integrated performance information system: state general funds, the Governor's 10 percent funds for WIA Title I state-wide activities, and Carl D. Perkins Administrative funds. Serving as Washington's sole state agency for Perkins as well as the state workforce investment board for WIA, places the Workforce Board in a unique position to access these three funding sources, providing a total of approximately one-and-a-half million dollars per year to support six FTEs and contracted services. These resources provide a variety of services in addition to data matching and reporting, including maintaining the state eligible training provider list, extensive surveys of participants and employers, net impact and cost benefit evaluations, economic and labor force research, and other periodic research projects for a state workforce development system expending close to $900,000,000 per year. The table below shows estimated costs for performance measurement based upon the Washington State experience. The table shows that it costs relatively little to go beyond federal requirements for the Workforce Investment Act and Perkins and produce consistent performance information for the workforce development system based on administrative records (shown in the last two rows of the table). 
The estimated annual cost of calculating and reporting the IPI performance measures (including nesting measures) beyond costs required anyway to meet federal requirements for WIA Title I and Perkins is approximately $310,000, of which $150,000 is for the measures that are not based on net impacts, and $160,000 is for measures based on net impacts. This includes the costs of data cleaning, matching, and analysis, and reporting the results. It includes staff time and computer costs.

If a state wants to go beyond federal and IPI measure requirements and conduct additional survey-based research, substantial additional costs would be entailed. While surveys can provide much valuable information, they are very costly and not required to implement the IPI performance measures.

Privacy

Integrated performance information relies upon access to administrative records in order to have consistent and affordable data (Stevens et al., 2003). A paramount issue in gaining access to administrative records is the protection of individual privacy, and certain precautions are essential to securing that access. Without appropriate precautions to protect privacy, a state will quickly find that it has no capacity to produce integrated performance information, not because of misuse of data, but because of fear that misuse may occur.

The issue of privacy is heightened because the only data element that uniquely identifies unit records and is common to most data sets, including the very important Unemployment Insurance (UI) wage records, is the social security number. With identity theft a growing problem, individuals are understandably concerned about sharing their social security number. States have employed several methods in order to protect individuals from the state or its agents improperly divulging their number (see p.13).

Laws and regulations restrict the conditions under which government may share individual information with others.
Many programs have their own unique restrictions that must be respected when establishing integrated performance information. Some of the most important restrictions, although by no means the only ones, are those associated with student and unemployment insurance records. FERPA In the use of student records, states must satisfy the requirements of the Family Educational Rights and Privacy Act (FERPA). FERPA protects the privacy interests of students and/or the parents of students who are minors with respect to their personal education records. FERPA allows students, or their parents/guardians, if they are minors, the right to review and inspect their educational records, the right to amend them, and the right to have some control over their disclosure. In general, personal educational records cannot be disclosed without the prior written consent of the student (or his or her parent or guardian if a minor). Exceptions are provided if: • The disclosure involves directory information. This is information in a personal education record which would not be considered harmful if it were released. Directory information includes such items as student name, address, telephone number, date and place of birth, major field of study, degrees or certificates awarded, and others that may be specified in state law. • The disclosure is to school officials, including teachers, who have been determined to have legitimate educational interests. • The disclosure is to organizations conducting studies on behalf of educational agencies or institutions to validate tests, administer student aid programs, or improve instruction. The use of student records to create performance information for program reviews and accountability comes under the exception of "disclosure to improve instruction". 
The organization receiving the student records must conduct the process in a manner that precludes the use of personally identifiable information about students or their parents by individuals other than employees that represent the receiving organization. The receiving organization must also destroy the personal education records when the purposes for which they were provided are completed. On January 18, 2001, the U.S. Department of Education (DOE) issued a program memorandum concerning the applicability of FERPA to certain provisions in the Carl D. Perkins Act (Perkins) and WIA. The letter recognized that only Social Security numbers can be used to link educational records to UI wage records and outlined options for how this could be accomplished. On January 30, 2003, DOE issued a new letter outlining three ways that education agencies can exchange student records with state unemployment insurance agencies in order to meet Perkins and WIA performance measurement requirements. 1. The custodial organization of the education records may receive the employment records and link the two sets of records to determine the employment status of former students. 2. Students (or the parents of minors) may grant prior consent for the release of their social security numbers for the purpose of matching their records with employment records for statistical research purposes. 3. The education agency may share the student records with the agency holding the employment records, and that agency may match the two sets of records if the education agency provides in-person oversight and direction. States have used a variety of approaches to meet FERPA requirements. In Florida, data matching is conducted by FETPIP, a unit within the Florida Department of Education (Pfeiffer 2004). The Florida Department of Education is the custodian of student records for all public education entities in the state. 
FETPIP negotiated an arrangement with the Department of Labor and Employment Security that established a UI wage record archive within FETPIP. FETPIP can thereby match student records with wage records without releasing the student records to an organization outside of education. In exchange for receiving the wage records, FETPIP handles any requests for statistical matches with the UI wage records.

In Texas, when the recent, more restrictive interpretation of FERPA was handed down by DOE, the Texas education agencies (Texas Higher Education Coordinating Board and Texas Education Agency) were no longer able to share their student records with the Texas Workforce Commission that holds the UI wage records. This forced Texas leaders to rethink how they conducted their follow-up and evaluation effort. Currently, the Texas Workforce Commission and the Texas Higher Education Coordinating Board jointly administer the follow-up and evaluation system through a Memorandum of Understanding. Under this agreement, the Commission matches participant records from non-education programs with employment wage records (and federal employment databases) and forwards the results to the Texas Higher Education Coordinating Board, which then matches the records with college master enrollment files. The Texas Higher Education Coordinating Board matches all education records that fall under FERPA with the employment wage records and federal employment databases. The resulting program data is analyzed and returned to the Texas Workforce Commission.

In Oregon, the Attorney General's office was asked for a formal opinion regarding state and federal privacy requirements. The AG's opinion was cautious and indicated that in order to share personally identifiable information, agencies must obtain the prior written consent of program participants. Oregon has subsequently asked participants for such consent. (See B-12 for the consent language.)
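
The record matching described in these state examples can be pictured with a small sketch. The Python below is an illustration only and is not drawn from FETPIP or the Texas follow-up system: the field names, the sample records, the key, and the functions pseudonymize and link_outcomes are all hypothetical. It assumes, as one possible precaution, that each social security number is replaced with a keyed hash before matching. It then links program exiters to wage records for a single follow-up quarter and reports an employment rate and median quarterly earnings for each program. These two figures are simplified stand-ins, not the IPI measure definitions.

```python
import hashlib
import hmac
from statistics import median

# Hypothetical secret held only by the unit that performs the match.
LINKAGE_KEY = b"hypothetical-linkage-key"


def pseudonymize(ssn, key=LINKAGE_KEY):
    """Return a keyed hash of the SSN so raw numbers are not carried forward."""
    digits = "".join(ch for ch in ssn if ch.isdigit())  # normalize formatting
    return hmac.new(key, digits.encode(), hashlib.sha256).hexdigest()


def link_outcomes(participants, wage_records, quarter):
    """Match exiters to wage records for one quarter and summarize by program."""
    # Sum wages per person for the quarter (a person may have several employers).
    earnings_by_id = {}
    for rec in wage_records:
        if rec["quarter"] == quarter:
            pid = pseudonymize(rec["ssn"])
            earnings_by_id[pid] = earnings_by_id.get(pid, 0.0) + rec["wages"]

    # Count exiters per program and collect earnings for those who matched.
    by_program = {}
    for person in participants:
        stats = by_program.setdefault(person["program"], {"exiters": 0, "earnings": []})
        stats["exiters"] += 1
        pid = pseudonymize(person["ssn"])
        if pid in earnings_by_id:
            stats["earnings"].append(earnings_by_id[pid])

    summary = {}
    for program, stats in by_program.items():
        matched = stats["earnings"]
        summary[program] = {
            "exiters": stats["exiters"],
            "employment_rate": len(matched) / stats["exiters"],
            "median_earnings": median(matched) if matched else None,
        }
    return summary


# Made-up sample records for illustration.
participants = [
    {"ssn": "000-00-0001", "program": "A"},
    {"ssn": "000-00-0002", "program": "A"},
    {"ssn": "000-00-0003", "program": "B"},
]
wage_records = [
    {"ssn": "000000001", "quarter": "2004Q3", "wages": 4000.0},
    {"ssn": "000000001", "quarter": "2004Q3", "wages": 1500.0},  # second employer
    {"ssn": "000000003", "quarter": "2004Q3", "wages": 6200.0},
    {"ssn": "000000002", "quarter": "2004Q2", "wages": 3000.0},  # different quarter
]
summary = link_outcomes(participants, wage_records, "2004Q3")
```

In this sketch a participant with wages from two employers in the quarter is counted once, with earnings summed, and a participant with no wage record in the quarter is treated as not employed. An actual system would also need to account for employment that the state's own UI wage records do not capture, such as out-of-state or federal employment.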
Vocational Rehabilitation Services

Federal statutes also restrict access to information about applicants and participants in vocational rehabilitation services (34 CFR 361.38). The statutes permit the sharing of individual information for evaluation and research purposes under certain conditions with adequate safeguards:

Personal information may be released to an organization, agency, or individual engaged in audit, evaluation, or research only for purposes directly connected with the administration of the vocational rehabilitation program or for purposes that would significantly improve the quality of life for applicants and eligible individuals and only if the organization, agency, or individual assures that:

1. The information will be used only for the purposes for which it is being provided;
2. The information will be released only to persons officially connected with the audit, evaluation, or research;
3. The information will not be released to the involved individual;
4. The information will be managed in a manner to safeguard confidentiality; and
5. The final product will not reveal any personal identifying information without the informed written consent of the involved individual or the individual's representative.

The state must inform individuals who apply for rehabilitation services as to the policies governing personal information, the purposes for which the information will be used, and the other agencies to which information is routinely released.

Unemployment Insurance (UI) Wage Records

Another data access challenge is access to UI wage records (Crosslin and Stevens 1989, Trott and Baj 1996). State unemployment insurance statutes typically limit access to the records provided by employers and job seekers. Sometimes these limits impinge on the state's capacity to use the records to document labor market outcomes for any programs other than unemployment insurance.
In Washington, the Legislature enacted legislation requested by the Employment Security Department to clarify when Employment Security may share administrative records with other government agencies and the steps required for safeguarding the confidentiality of the records. The legislation specifies that Employment Security may share information provided by employers with other agencies for the purpose of conducting statistical analysis, research, and evaluation studies. More broadly, the statute authorizes Employment Security to enter into data sharing contracts with other agencies for the purpose of evaluating and improving the operation of state programs.

The same statute also specifies the conditions under which job seeker data may be shared with other agencies. Employment Security may enter into contracts to share job seeker data with other agencies when the data is necessary for the "efficient provisions of workforce programs, including but not limited to public labor exchange, unemployment insurance, worker training and retraining, vocational rehabilitation, vocational education, adult transition from public assistance, and support services" (Revised Code of Washington 50.13.060). The statute provides for a civil penalty of $5,000 for the misuse or unauthorized release of records or information.

Given the mobility of the labor force, states must also have access to wage records of other states. For purposes authorized under WIA, the Wage Record Interchange System (WRIS) enables states to determine the employment and earnings of their program participants in other states.4 WRIS, however, is currently limited to matches for WIA. This restriction is a serious impediment to the implementation of integrated performance information; it precludes consistent reporting of performance across programs. A state may, for example, include in consumer reports authorized by WIA out-of-state employment and earnings information for a college training program.
The state, however, is not permitted to use this same information when reporting to DOE the performance of the state's postsecondary system.

Access to WRIS is also important for consistency across states. There is a great deal of variation among states in the extent of out-of-state employment of former program participants. Without access to WRIS, this variation precludes valid interstate comparisons of employment and earnings results.

Finally, states must also have access to federal employment records held by the Department of Defense and the Federal Office of Personnel Management. Individual states have successfully established agreements with these agencies for the purpose of tracking their former program participants. But it is cumbersome and costly for each of the 50 states to separately approach the two agencies. Currently, DOL's Federal Employment Data Exchange System (FEDES) project is developing a national system for accessing Defense and federal personnel records. It will be important for this effort to include access for programs beyond WIA.

IV. Reaching Consensus on Goals and Measures

Another challenge faced by states is reaching consensus on goals and measures. While goals and measures are sometimes established in statute, more often than not the specific measures are left to administrative action. In either event, it is paramount that the stakeholders reach agreement on the common goals they are trying to achieve and the best measures for them to apply across the workforce development system. This process may take years, but it is worth it in order to develop true buy-in.

Oregon's systemwide indicators had their beginnings in the Oregon Workforce Option, a partnership among federal, state, and local workforce programs in the early 1990s to reduce regulatory barriers in exchange for increased performance.
Oregon Workforce Option partners met with federal representatives (DOL, HHS, and DOE) with the aim of replacing the myriad federal measures with a single set of performance measures. At the time, Oregon workforce partners revealed they collectively had 144 individual program measures. A workgroup was formed to craft a single set of measures by "rolling up" the federal and state measures into one set. For example, most of the partners had an outcome measure that was equivalent to "entered employment," even if it was called something else.

The Oregon Workforce Policy Cabinet formed a workgroup called the Performance Accountability Policy group to fully implement the indicators. This group comprised both policy staff and technical experts from each of the affected agencies, plus local representatives. The policy group developed 13 performance indicators that were outcome focused, measurable, based on available data, and inclusive of the work of all the partners. The policy group recognized that for the agencies to be involved, they needed to have their work reflected in the indicators.

A parallel track was developed with the Oregon Workforce Investment Board. The co-chairs of the policy group staffed the Workforce Investment Board's Performance Accountability Committee. As soon as the indicators were fully developed, they were discussed and approved by the Workforce Investment Board. This made the state board the "owner" of the systemwide indicators.

In Washington, the Workforce Board's statutes do not specify the goals or measures for the workforce development system; the statutes give the Workforce Board the assignment to establish the goals and measures "in coordination with the operating agencies." To do this, the Workforce Board conducted a long and deliberative process that included participating in a National Governors Association academy in the mid-1990s. The Board established a technical workgroup to develop a proposal for its consideration.
The technical workgroup consisted of state and local staff from each of the affected programs, mostly research managers and some local program administrators. The technical workgroup began by identifying the basic goals that the programs were all trying to achieve. After the Workforce Board agreed to the goals, the workgroup proceeded to identify performance measures for each goal. The workgroup identified a long list of about two dozen measures in order to paint a fairly complete picture of performance. Nearly two years after beginning this effort, the Workforce Board adopted the performance accountability system, including the performance measures.

After four years of reporting performance using the measures, the Workforce Board proceeded to set performance targets. This action was spurred by provisions in WIA and Perkins, and by the interest of the Governor in adding incentives and sanctions to the accountability system. The list of measures, however, was too long for target setting, so the technical workgroup went about identifying a small subset of core measures.

To assist in this process, the workgroup conducted a series of focus groups around the state with representatives of a cross section of programs. Local staff from high school career and technical education programs, community and technical colleges, WIA Title I, and other programs sat down together and discussed the best measures to use if the measures were to be applied to all of their programs. Compromises were reached and measures were proposed by each focus group. The state technical workgroup discussed the results of the focus groups and developed its recommendations. The Workforce Board adopted the recommended state "core" indicators and included them in the state's unified plan.
IPI Performance Measures

Introduction

What are the best performance measures for workforce development if the same measures are applied horizontally and vertically within the system, including programs that are funded mostly by the states and programs that are funded mostly by Congress? This was a central question considered by the six states (Florida, Michigan, Montana, Oregon, Texas, and Washington) that developed this Blueprint. The state teams, with representation from a cross-section of workforce development programs and the assistance of the NGA's Center for Best Practices and the Ray Marshall Center at the University of Texas, discussed the answer over the course of four national meetings.

The states began by considering what policy leaders want to know about performance results. They then suggested the selection criteria for judging measures, and analyzed the advantages and disadvantages of a long list of possible measures. In the end, the states agreed on a relatively short list of measures that best respond to the performance questions commonly posed by policy leaders. This section of the Blueprint summarizes the discussion and recommendations of the six states' teams. Other states may wish to follow suit, either by following this type of process within their state, or by adopting some or all of the recommended measures.5

Before beginning, it is important to note what the performance measures do not provide. First, the measures do not attempt to measure everything that is important to each program. These measures concentrate on outcomes important across workforce development programs. Individual programs may have goals and measures related to their particular mission. For example, adult education (WIA Title II) has a goal of improving literacy skill levels and measures linked to that outcome.6 Each program may want to have an additional measure or measures related to its unique mission.
Second, the focus of the Blueprint is on performance information for policy leaders (elected officials and their staff, cabinet officers, and program heads), not the needs of day-to-day program managers. The information needs of policy leaders are often different from those of program managers. Program managers and their staff need up-to-date, preferably real-time information that enables them to make service delivery changes and other program decisions and turn things around quickly if they are not going well. They do not need such immediate information to be consistent with the information used by the managers of other programs.

Policy leaders, on the other hand, tend to make changes more deliberately. Legislators make statutory changes only once a year, or in many cases, only every other year. Congress provides oversight on an even longer schedule. Education leaders have little need for the latest quarterly report on program placement rates. Policy leaders involved in systemwide policy or planning, however, do need information that is consistent across programs.

There are also methodological reasons for differences in performance information at the policy and management levels. The best sources of data for consistent information are administrative records. Not only are these sources inexpensive, they enable one to apply consistent methodology across programs. One can use the same data source and methods, for example, to calculate employment and earnings for community college students and for WIA Title I participants. Other possible data sources, namely case manager notes or surveys, are either unavailable (community colleges do not employ case managers), too expensive (participant surveys for every program of study at every institution would be extraordinarily expensive), or not consistently reliable and valid (for example, self-reported earnings in response to surveys). Administrative records, on the other hand, have their own weaknesses.
Most notably, UI wage records, the administrative data source for employment and earnings, have a time lag of about six months in data availability. A six-month lag is not a problem for those writing a bill for the legislature; it is for those managing WIA Title I contractors.

Making the lag worse is the follow-up period required for policy level measures. Policy leaders want to know whether or not a program is working. One cannot judge whether or not a program is working based on what happens the day after participants walk out the door. There must be a follow-up period of sufficient length to give some assurance that observed immediate results will last. Program managers do not have this luxury.

States and local entities have sometimes dealt with this temporal difference by identifying a "dashboard" of "leading indicators" drawn from data available at exit or shortly thereafter. The hope is that the leading indicators will be positively and strongly correlated with longer term measures of lasting results. Florida has instituted such a "dashboard" of "leading indicators" for WIA Title I, referred to as the "Red and Green Report" (Switzer 2004). The "Red and Green Report" reports outcomes no later than 45 days after the end of each quarter that can be used for case management, performance based contracting, and project management. Florida has tested the connection of the "Red and Green" measures with the longer term WIA core indicators and found that they are positively correlated. The "Red and Green" name refers to the colors in the report that designate workforce development areas that are performing in either the bottom or top quartile on a measure. This graphic display provides a quick, intuitive picture of how areas of the state are performing.

At the other end of the spectrum are measures of the general state of a state's workforce (Sheets 2002). These are not performance measures per se.
They do not show the results of workforce development programs or even the results of the system as a whole. They are indicators of how well a state's entire workforce is doing. They count individuals who never participated in a workforce program as well as those who have. Oregon Shines provides examples of such indicators: the "percentage of Oregon workers employed in a job that pays wages of 150 percent or more of poverty (for a family of 4)" and the "average annual payroll per covered worker." Such indicators provide policy leaders with a general picture of how well their state is doing.

The performance measures discussed in the remainder of this section fall in between these two levels. (A state may want to implement a tiered system, as in Florida and Oregon, with measures at each of the three levels.) The measures discussed here are neither day-to-day management measures nor general indicators of the state's well being; instead, they are designed to provide policy leaders with information on how well the workforce development system is performing and whether or not the results are getting better over time.

I. What Do Policy Makers Want to Know About Performance?

As mentioned above, the six states' teams began their discussion of performance measures by asking themselves, "What do policy makers want to know about performance?" Drawing from their experience with governors, legislators, and boards, the teams identified the following six types of performance information that policy makers want to know and a series of questions that leaders commonly ask:

1. Labor Market Results for Program Participants
   Do people get jobs? What are they paid? Do the jobs last?
2. Skill Gains
   To what extent do education levels increase? Do participants continue in further education?
3. Participant Satisfaction
   To what extent are program participants satisfied?
4. Social Welfare Results for Program Participants
   What are the changes in the receipt of social welfare payments? What are the changes in the receipt of unemployment insurance payments? What are the changes in poverty rates? Are we making taxpayers out of tax users?
5. Results for Employers and the Economy
   What are the impacts of workforce development programs on the economy? Are the programs meeting the needs of employers? Does the supply of newly prepared workers match labor market demand? Are program participants better prepared for work? Do people go out-of-state after they are served? And policy leaders want to know the relevant information by economic sector.
6. Return on Investment
   What do programs cost? What is the return on investment?

In addition to answering these questions for the workforce development system as a whole, policy leaders want the participant information available by subgroup, such as by gender, race/ethnicity, and disability status. They also want information to make comparisons with other states, to judge what service strategies are most effective (for example, job training vs. job search assistance), and to compare programs.

Answering all of these questions is not possible, and in some cases, not even desirable, at least not without a lot of caveats. For example, it is very difficult to determine the impacts of workforce development programs on a state's economy. A state's economy is greatly affected by national economic conditions such as consumer spending, interest rates, international trade, the federal budget, and where the economy is in the business cycle. These conditions, more than the performance of the workforce development system, let alone an individual program, determine the health of a state's economy at any given point in time. Over the long haul, the level of education and training of the state's workforce is certainly a major factor affecting a state's economy, but short-term effects are difficult to isolate.
With regard to participant satisfaction, while all agreed that knowing the degree of participant satisfaction is important, after much discussion the teams concluded that measuring participant satisfaction is best done at the point of service as a quality improvement tool, rather than as a systemwide measure of performance.

Policy makers are often interested in making comparisons, but one must be very careful in comparing the results of different programs. Different programs serve different populations. Given these differences, their results should not be the same. If one program's results are better than another's, that in itself should not lead to the conclusion that resources should be shifted from one to the other. The program with weaker results may be serving a population with many barriers to employment that nearly everyone would agree a state should serve, and its weaker results may reflect the need for additional rather than fewer resources.

The ability to understand the outcomes of federally funded programs for the nation as a whole as well as the ability to make comparisons among states would be useful to policy leaders. However, to do this would require standardizing both measurement methodology and the techniques used to adjust measures for variations in economic conditions and participant demographics. This standardization does not currently exist. In fact, the current system makes it difficult for Congress to know what it is getting. States vary greatly in exactly how they apply the same measures, and small variations in detail can lead to rather large differences in results. While some states have developed regression models to adjust results for economic conditions and demographic characteristics, most have not, and those that have use different techniques.
As a consequence, one cannot tell if differences in performance results among states are due to methodological differences, economic or demographic differences, or real differences in how well states actually perform.

Policy makers want to know the "difference" that programs make. To actually measure the "difference," however, requires net impact analyses that compare outcomes for participants receiving services (the treatment group) with outcomes for comparison groups of individuals similar to the participants but who did not receive the services. It is not economical or feasible to conduct net impact analyses on a frequent basis vertically and horizontally across the workforce development system, and further, many programs do not lend themselves to experimental, random assignment evaluations.

There are steps that states can take to address these issues, but we should recognize at the outset that despite our best efforts we cannot answer all of the questions that policy leaders ask. In some cases, we have to answer the questions we wish they would ask.7

II. What Makes for a Good Performance Measure?

Given the goal of integrated performance information across the workforce development system, the six states' teams developed criteria to judge the quality of possible performance measures. Other things being equal, performance measures are better to the extent that they:

Are outcome measures: Performance measures should be measures of the results for customers, as opposed to process measures or measures of program outputs. As a part of this criterion, performance measures should measure things that programs can substantially affect, not variables determined by external factors unrelated to program performance.

Promote desired results: Because you get what you measure, measures should be carefully designed to promote behavior and results that are consistent with policy goals. As a corollary to this, measures should be chosen to minimize unintended consequences.

Are easily explainable to a lay audience: Policy leaders are busy people and have to digest a tremendous amount of varied information. Measures are more useful to policy leaders to the extent that they are understood quickly and easily; the fewer the measures the better.

Create a level playing field among programs and service strategies: Measures should be designed so that they do not create a bias toward one program or strategy or another. For example, a follow-up period linked to the point of registration is biased against lengthier programs and strategies.

Meaningful for each program: The performance measures should provide meaningful information for each program in the system. They should provide programs with information they care about.

Scalable and Divisible: Measures should be meaningful vertically across the workforce development system; they should be applicable, to the extent possible, to local institutions, workforce development or other regional areas, states, and the nation. Measures should also be divisible so that results can be understood for subpopulations and service strategies. States should be able to slice and dice the measures to meet a variety of research, analytical, and policy needs.

Timely: There should be a minimal gap between the time of service and the time the performance information is reportable.

Methodologically Sound: Measures should satisfy standards for validity and reliability.

Are not easily "gamed": While there may be no measure that is completely impervious to manipulation, some measures are more susceptible than others. Measures should be constructed to minimize the opportunities to improve results by manipulating the measurement rather than by improving service.

Inexpensive: Performance measures are very important for ensuring taxpayer dollars are wisely used, but states very reasonably want to minimize the amount of money spent on activities other than direct service to customers, and those include performance measurement. For the sake of efficiency, measures are better when they rely on data that are already maintained, i.e., administrative records.

Comprehensive and complementary as a whole: The measures should be comprehensive in that, while still a relatively short list, they address all the types of performance information that leaders want. They should be complementary in that if a measure creates a bias in one direction it should be accompanied by another measure that points in the other direction. For example, if a state uses a measure of earnings that promotes serving people with high pre-program earnings, the state should also use a second earnings measure that promotes serving people with low pre-program earnings.

Finally, it should be understood that there is no perfect set of performance measures. Measures are not capable of perfectly fitting each and every one of these criteria. There are tradeoffs that have to be made. In order to satisfy some criteria, there has to be some give on others.

III. IPI Performance Measures

The six states' teams identified eight performance measures that best meet the above criteria and provide information that policy leaders want to know about the performance of the workforce development system. The following section presents the eight measures. The measures are grouped into broad categories that correspond with the identified needs of policy makers. The section also analyzes the advantages and disadvantages of the measures and others that were considered. Suggested definitions for key terms follow this section.

A critical element of the IPI measures is the concept of "nesting measures."
While the states wanted a short list of measures, they also wanted the capacity to understand outcomes for subpopulations of participants, for particular industries, and for service strategies. Given the use of administrative records, the IPI measures can be broken down in order to provide such information. This section includes some important "nesting" measures that states or others could use.

The measures are divided into two broad categories: Accountability Measures and Performance Indicators. The Accountability Measures are appropriate for use in setting targets and applying consequences, while the Performance Indicators are not recommended for these purposes since they do not sufficiently satisfy the above criteria to be used as accountability measures. For example, the employer market penetration measure is not an outcome measure and can have unintended consequences. The return on investment measures have a long lag between the time of service and the time the results are available, are expensive to measure on a frequent basis, and the methodology is not sufficiently rigorous to accurately measure small changes over time. Despite these and other shortcomings, these measures are still useful as indicators of the general performance of the workforce development system.

Finally, it should be noted that the measures are not drawn from a list of measures from any of the six states. They are the product of wide-ranging discussions and compromises among individuals from a variety of states representing a variety of workforce development programs. The following figure summarizes the measures.

TABLE: IPI Performance Measures

Accountability Measures

Category: Labor Market Results for Program Participants
• Do people get jobs?
• What are they paid?
Measures:
1. Short-term Employment Rate: The percentage of participants who are employed during the second quarter after exit. (For youth, enrollment in education counts as well as employment.)
2. Long-term Employment Rate: The percentage of participants who are employed during the fourth quarter after exit. (For youth, enrollment in education counts as well as employment.)
3. Earnings Level: Median earnings during the second quarter after exit among all exiters with earnings.

Category: Skill Gains
• To what extent do education levels increase?
Measures:
4. Credential Completion Rate: The percentage of exiters who have completed a certificate, degree, diploma, licensure, or industry-recognized credential during participation or within one year of exit.

Category: Results for Employers and the Economy
• Are we meeting the needs of employers?
Measures:
5. Repeat Employer Customers: The percentage of employers who are served who return to the same program for service within one year.

Performance Indicators

Category: Results for Employers and the Economy
• Are we meeting the needs of employers?
Measures:
6. Employer Market Penetration: The percentage of all employers who are served during one year.

Category: Return on Investment
• What is the return on the investment?
Measures:
7. Taxpayer Return on Investment: The net impact on tax revenue and social welfare payments compared to the cost of the services.
8. Participant Return on Investment: The net impact on participant earnings and employer-provided benefits compared to the cost of the services.

END TABLE

Labor Market Results for Program Participants: Three Measures

1. Short-term Employment Rate: The percentage of participants who are employed during the second quarter after exit. (For programs serving youth, the measure should include enrollment in education as well as employment in the numerator, so the measure captures the positive outcome of continued education.)

   The number of exiters with any earnings during the second quarter after exit
   divided by
   The total number of exiters

2. Long-term Employment Rate: The percentage of participants who are employed during the fourth quarter after exit. (For programs serving youth, the measure should include enrollment in education as well as employment in the numerator.)

   The number of exiters with any earnings during the fourth quarter after exit
   divided by
   The total number of exiters

3. Earnings Level: Median earnings during the second quarter after exit among all exiters with any earnings. The measure should be presented as an annualized number (quarterly earnings times four). (For programs serving youth, individuals enrolled in education during the second quarter after exit should be excluded from the measure, in order to prevent individuals who are employed only part time because they are in school, a positive outcome, from depressing the result.)

Nesting Measures: Measures number 1, 2, and 3 can be measured for subpopulations of participants, such as by gender, race/ethnicity, disability status, and public assistance receipt, and for subsets of services. The two employment rate measures can be measured for the subpopulation of participants who were unemployed prior to participation. The employment rate measures can be disaggregated by economic sector. The long-term employment rate measure can be supplemented by a measure of the percentage of exiters with employment during both the third and fourth quarters after exit. The earnings data can be disaggregated to show distributional outcomes; for example, the median income among the bottom quintile of exiters, the second quintile, and so forth.

Discussion of Labor Market Results Measures

The recommended measures provide simple and straightforward answers to common questions asked by policy leaders: How many people get jobs and what are they paid? In their simplicity, they avoid many of the disadvantages of more complicated measures sometimes used.

Minimum Threshold for Employment: After serious consideration the six states decided not to recommend a minimum threshold for employment greater than "any earnings" in a quarter.
Adding a threshold would complicate the measures. For example, if a threshold greater than zero were added to the employment rate measures, would participants below the threshold be excluded from the earnings measure? Also, there is no other threshold beyond zero that is not arbitrary. Some advocates voice concern that counting any earnings may lead to counting many as employed who have only very short-lived, low-paid jobs. Test data runs show, however, that counting any earnings as employment adds only a couple of percentage points to the IPI employment rate measures compared to establishing a minimum threshold of a few hundred dollars in a quarter. Why median earnings rather than the mean? For the earnings measure, we recommend the median rather than the average (mean) because for many workforce programs only about a third of participants earn at or above the average. The average, therefore, creates a misleading impression about overall performance and the earnings participants can expect. What about Measures of Pre-Post Change? Other possible measures include looking at employment only among those unemployed at registration or "pre/post" changes in earnings. Both of those approaches can be misleading. They imply that the services caused the observed change in employment or earnings, as though all individuals not served would have remained unemployed or had stagnant earnings. We know, however, this is not the case. In order to measure the change caused by program services, net impact measures are necessary. Limiting the employment rate measures to only those unemployed at registration also excludes a substantial share of workforce development participants and makes the measures largely meaningless for important programs. About 70 percent of community and technical college students are employed at registration. 
To have a key performance measure that excludes the majority of participants in the largest adult training program in the nation does not make sense for integrated performance information. Recognizing, however, that many policy makers are interested in knowing the employment rate among participants who were unemployed prior to participation we offer a nesting measure showing the employment rate among this subset of participants. Pre-post changes in earnings are mostly determined by the pre-program level of earnings. The higher the pre-program level of earnings the lower the gain, other things remaining equal. Unless results are adjusted for the level of pre-program earnings, differences in this measure from year to year are more likely to reflect changes in pre-program earnings than changes in program effectiveness. There are other challenges as well, for example picking the right pre-program time period to use. Participants typically have experienced a decline in earnings during the first quarter prior to registration (Ashenfelter 1978). Should the first pre-program quarter be used, perhaps artificially inflating the result, or should an earlier quarter be used, perhaps artificially understating the program's effectiveness? The number produced by a measure of pre-post changes in earnings is not something that is intuitively meaningful. While $30,000 per year is an excellent result for most workforce development services (other than the labor exchange), it is not so obvious whether a pre-post change of $10,000 is good or bad. The answer is: it depends on the finish line. Going from $0 to $10,000 is not good since it still does not provide sufficient earnings to live on. Going from $10,000 to $20,000 is generally a good outcome, but one cannot tell that by a pre-post measure alone. The intuitive meaning of the number is particularly important if the earnings measure is going to be reported as part of a consumer report system. 
An individual trying to decide which training program to enroll in is much more likely to want to know how much money students typically make after the program than to want to know the average pre-post change in earnings. Finally, while an earnings level measure does not in itself provide a point of comparison to a policy goal, a state (or an individual for that matter) could provide that point of comparison, such as the poverty threshold, a self-sufficiency standard, a "living wage," or some concept of a "family wage". Follow-up Period: The choice of which follow-up period to use is an important one. The follow-up period is based on calendar quarters since unemployment insurance wage records are quarterly. While the first quarter after exit provides the most timely information, it is subject to manipulation by timing the date of exit. It is also incomplete. Many participants do not become employed until some time during the first quarter after exit; for these, the first quarter will not capture a full quarter's worth of earnings. This is most obviously the case for secondary career and technical education. For adults, the longer and more costly the investment in training, the longer participants are willing to wait to find the right position in their field. Their earnings prior to finding that position do not capture the result of the training. There may also be a delay due to the timing of licensure exams. For these reasons, many argue that the third quarter after exit is the first quarter that will fully reflect the earnings outcome. This, however, is a long time to wait for information. Consideration must also be given to the criterion of a level playing field. Longitudinal analyses of earnings show that the longer the follow-up period, the better the relative results for training services compared to quicker job search services (Barnow 2000). 
On the other hand, one could make job search look relatively better by using the shortest follow-up period possible, such as the quarter of exit. Given all these considerations, we recommend the second quarter as the best follow-up period. While the timing of the first follow-up period for employment is not as sensitive to these issues as earnings, we recommend the second quarter for that as well in order to use the same follow-up period for as many measures as possible for the sake of simplicity and ease of explanation. The preferred measure for a longer term follow-up is the employment rate during the fourth quarter after exit. This is not a "traditional" retention measure. The concept of retention measures fits well with workforce development programs that have a strategy of placing participants and then helping them retain employment. For many programs, including community and technical college job training and secondary career and technical education, this is not part of what they do. Employment retention measures can also be misleading if they measure retention in employment only of those employed during the earlier follow-up period. One could observe an 80 percent employment rate in the first period and an 80 percent retention rate in the second, both of which sound fairly impressive. But it would be possible that only 64 percent of all exiters could be employed during the second period—a poor result that would be masked by a traditional retention measure. What About Wage Progression? Policy leaders also express interest in measures of earnings change during the post-program period. Put colloquially, many leaders want to know if individuals are stuck in dead-end jobs or if they experience wage progression. Research shows that increases in earnings during the post-program period are strongly correlated with continuous employment. 
Post-program earnings gains are also correlated with the level of initial earnings—those who land a good job tend to experience the largest subsequent gains. To some extent, therefore, measures of post-program earnings gains are redundant if one already has measures of earnings level and short and long-term employment. And, once programs have information on a cohort of participants, their earnings progression over time can be identified. While measures of earnings change and employment retention do not rise to the top of the list of our recommended measures for the reasons cited, they can provide useful information and states may wish to use them for analytical purposes. Possible measures include: Pre-post Earnings Change: The difference between the level of earnings during the third and fourth quarters after exit and the level of earnings during the third and fourth quarters prior to program participation. Numerator: the level of earnings during the third and fourth quarters after exit minus the level of earnings during the third and fourth quarters prior to program registration among those with employment during both of these pre- and post-program periods. Denominator: the number of exiters with employment during both of these pre- and post-program periods. (The restriction of the measure to those with employment during both periods makes this solely a measure of earnings, as opposed to a measure of the compound effect of earnings and employment rates.) Employment retention: The percentage of exiters who are employed during each of the second, third, fourth, and fifth quarters after exit. Numerator: the number of exiters with employment during each of the second, third, fourth, and fifth quarter after exit. Denominator: the number of exiters. Post-program earnings change: The difference in the level of earnings during the fifth quarter after exit and the level during the second quarter after exit. 
Numerator: earnings during the fifth quarter after exit minus earnings during the second quarter after exit among those with employment during both quarters. Denominator: the number of exiters with employment during both the second and fifth quarters after exit. Skill Gains 4. Credential completion rate: The percentage of exiters who have completed a certificate, degree, diploma, licensure, or industry-recognized credential during participation or within one year of exit. The number of exiters who have completed a credential during participation or within one year of exit divided by The total number of exiters Nesting Measures: Measure 4 can be measured for subpopulations of participants and for the subset of participants who received training. The measure can also be disaggregated to show the completion rate by type of credential, for example the percentage who completed an associate's degree. The measure also incorporates the data to report the number of exiters completing a credential, in addition to reporting the rate, including the number of credentials by field of study. Discussion This measure is designed to answer policy leaders' questions about participant skill gains. One of the key issues in the design of a skill gains measure is determining what to count as a gain. Experience with WIA Title I shows that unless there is a fairly tight definition, there will not be consistency in how gains are measured across geographic areas and programs. For the sake of consistency and in order to count gains that have true value in the labor market, our recommendation is to count credentials only. These should include standard credentials such as academic certificates, diplomas and degrees, and also industry-recognized credentials such as licensure and industry-based certifications. Ideally, credentials should be based upon assessments of students against industry standards. 
This, however, is not yet sufficiently common to make it a requirement in the definition of what constitutes a credential. One of the technical challenges in measuring credentials is dealing with the large number of individuals going in and out of short-term training during the course of their working lives. Many community colleges have responded to this type of demand by implementing modular training with multiple exit points. Reaching an exit point signifies the attainment of meaningful skills. It would be useful for the credential measure to positively count this type of completion, particularly when it satisfies industry-based standards. Why Consider All Participants? Another issue is whether to include all exiters in the denominator or only exiters who received training. The recommendation is to use all exiters because it will encourage programs to promote training and skill development for the whole population of participants. If only those exiters who received training were counted in the denominator it would be possible for a state to achieve a credential completion rate of 100 percent while training very few individuals. For those interested in knowing the credential completion rate among only those who receive training, that can be measured as a nesting measure. Completion Rates and Numbers: Besides the completion rate, another important point of information is the number of individuals who obtain new credentials. This is a measure of the extent to which workforce development programs and the system as a whole are increasing the supply of skilled workers. Since this number is the numerator in the credential completion rate measure, it can be readily provided as a nesting measure. Participant Satisfaction As mentioned earlier, we came to the conclusion that participant satisfaction is best measured at the point of service as part of quality improvement efforts, and not as a performance measure for the system. 
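Returning to the credential completion rate (measure 4), the effect of the denominator choice can be illustrated with a minimal sketch. The record fields `received_training` and `credential` are hypothetical names chosen for illustration, and the sketch assumes each record's credential already reflects the window of participation plus one year after exit.

```python
from collections import Counter

def credential_completion(exiters):
    """Summarize credential completion for a list of exiter records.

    Each record is a dict with two hypothetical keys:
      'received_training': bool
      'credential': credential type as a string, or None if none was completed
    """
    total = len(exiters)
    completers = [e for e in exiters if e["credential"] is not None]
    trained = [e for e in exiters if e["received_training"]]
    trained_completers = [e for e in trained if e["credential"] is not None]
    return {
        # Numerator of measure 4, also reportable on its own.
        "completers": len(completers),
        # Recommended measure: all exiters in the denominator.
        "rate_all_exiters": len(completers) / total if total else None,
        # Nesting measure: only exiters who received training.
        "rate_trained_only": (
            len(trained_completers) / len(trained) if trained else None
        ),
        # Nesting measure: completions by type of credential.
        "by_type": Counter(e["credential"] for e in completers),
    }

# Ten exiters, only two of whom were trained; both earned a credential.
sample = (
    [{"received_training": True, "credential": "certificate"}] * 2
    + [{"received_training": False, "credential": None}] * 8
)
summary = credential_completion(sample)
```

In this made-up sample the rate among trained exiters is 100 percent while the rate among all exiters is 20 percent, which is the situation the recommendation to use all exiters is meant to expose.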
Social Welfare Results for Program Participants The states' teams decided that social welfare results are best measured through measuring subpopulation results for earnings and through the return on investment measure (number 7, below). The primary policy question regarding social welfare (e.g., TANF) recipients is, "How much are they earning after participation?" This can be measured by looking at earnings measure (number 3) for the subpopulation of social welfare recipients one is interested in. As part of measure number 7, a state can look at the net impact of workforce development participation on social welfare payments. This is a better performance measure than the percentage of participants receiving social welfare payments which is more likely to be related to policy changes in eligibility than it is to program performance. A net impact analysis enables a state to assess how much difference program services make, since the comparison group will be subject to the same eligibility policies as the participant group. Results for Employers and the Economy 5. Repeat Employer Customers: The percentage of employers who are served who return to the same program for service within one year. The number of employers who return to the same program for service within one year of the initial service quarter Divided by The number of employers who receive a service during a quarter 6. Employer Market Penetration: The percentage of all employers who are served during the year. The number of employers served during the year Divided by The total number of employers Nesting Measures: Measures number 5 and 6 can be measured for particular sectors of the economy or other subsets of employers that are target markets, and for particular services. 
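As a rough illustration of measures 5 and 6, the sketch below computes both ratios from sets of employer identifiers. The function names and the use of a single identifier per employer are assumptions made for the example; in practice, identifying employers consistently is itself a technical challenge.

```python
def repeat_employer_rate(served_in_quarter, returned_within_year):
    """Measure 5: share of employers served in a quarter who return to the
    same program for service within one year of that initial quarter.

    Both arguments are sets of (hypothetical) employer identifiers.
    """
    if not served_in_quarter:
        return None
    # Only employers from the initial quarter can count as repeat customers.
    repeaters = served_in_quarter & returned_within_year
    return len(repeaters) / len(served_in_quarter)

def employer_market_penetration(served_during_year, all_employers):
    """Measure 6: share of all employers who were served during the year."""
    if not all_employers:
        return None
    return len(served_during_year & all_employers) / len(all_employers)

# Made-up identifiers for illustration only.
initial_quarter = {"A", "B", "C", "D"}
came_back = {"A", "C", "Z"}  # "Z" was not served in the initial quarter
repeat_rate = repeat_employer_rate(initial_quarter, came_back)

universe = set("ABCDEFGHIJKLMNOPQRST")  # 20 employers in the area
served = set("ABCDE")
penetration = employer_market_penetration(served, universe)
```

Restricting the sets to one economic sector or one service before calling the functions would give the corresponding nesting measures.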
Discussion It is difficult to answer policy leaders' questions about the outcomes of workforce development programs for employers and the economy, especially in cases where employers are not the direct customers but are indirectly served by hiring program participants. The recommended measures are far from perfect, but we believe they are the best possible measures that can be calculated efficiently. Up to now, there have been no such measures in federal programs even while the benefits to employers have become central to policy discussions. Other Measures Considered: Economists can argue that the benefit to employers can be measured in the additional earnings paid to program participants since employers would not pay the additional earnings if former participants did not produce value at least equal to that cost. From a political perspective, however, it is difficult to convince elected officials and business leaders that the result for employers is best measured by increased individual earnings. Another potential measure, productivity gains, is notoriously difficult to measure and is not very scalable. It is also highly subject to factors external to workforce development programs. Serious consideration was given to a measure of employment retention with the same employer. Retention with the same employer may be a sign of employer satisfaction with the quality of the participants' work. Participants, however, may leave an employer for good reasons (for a better job) as well as for bad reasons (they were fired), and employment retention with the same employer would count both as though they were bad. In fact, participants often do better when they move to a new employer especially if they are beginning their career at the entry level. Repeat Customers: The measure of repeat employer customers is recommended as the accountability measure for this category. 
The measure includes indirect as well as direct service in order to include employers who hire program exiters. An increase in repeat customers over time signifies that workforce development programs are providing services that employers want. The measure can be viewed as a proxy for employer satisfaction. To measure employer satisfaction directly would require surveys that would be expensive to administer vertically across the system. The measure requires the employer to return to the same program but not necessarily the same service, since some employers, particularly small employers, may have no need of the same service within one year. For programs that are part of a one-stop system whereby access to services is provided by common staff, sometimes referred to as an account manager, an employer returning to the account manager should count as a repeat customer. Market Penetration: Employer market penetration is more an output measure than a measure of the outcomes for employers. It is useful information for policy makers to know, although some might question whether higher is always better since it could involve competition with the private sector. The measure counts both direct and indirect service to employers. Including all employers in the denominator will generate a percentage that is relatively small. States may consider limiting the denominator to employers that hire employees. Job placement, however, is not the only service provided to employers, and limiting the measure to only employers that hire would create a disincentive to market services to all employers. The results of all services to employers are to be counted in the measure including, but not limited to, providing labor market information, training participants who become new hires, and upgrading the skills of incumbent workers. A technical challenge with both measures is that care must be taken to identify employers. 
Potential problems include handling employers with multiple locations and employers whose legal identity changes. The measure of repeat employer customers has the potential downside of creating a disincentive to serve small employers who are less likely than large employers to need to repeat the same service. This problem might be mitigated through a regression model or other adjustment methods. Both measures treat small and large employers alike, even though there can be a much larger economic impact from serving a large employer than a small employer. It would be easier to increase one's market share by targeting small employers. This potential unintended consequence is another reason for treating market penetration as a Performance Indicator rather than an Accountability Measure.

Return on Investment

7. Taxpayer Return on Investment: The net impact on tax revenue and social welfare payments compared to the cost of the services.
Estimated net impact on tax revenue and social welfare payments during the first five years after exit
divided by
Cost of the services

8. Participant Return on Investment: The net impact on participant earnings and employer-provided benefits compared to the cost of the services.
Estimated net impact on participant earnings and employer-provided benefits during the first five years after exit
divided by
Cost of the services

Nesting Measures: Measures number 7 and 8 can be measured for subpopulations of participants and for subsets of services.

Discussion
These measures are intended to respond to policy leaders who want to know: "How much bang for the buck?" Measure number 7 attempts to show how much taxpayers receive for their investment through increased tax revenue and reduced social welfare payments. Measure number 8 attempts to show whether, and to what degree, the impact on participant compensation is greater than the cost of the services. 
The two measures might more accurately be termed "cost effectiveness" measures, but they are referred to here as "return on investment" measures because that term resonates more with policy leaders. The underlying premise of both measures is that the return to taxpayers and participants is based on the net impact. To count the observed outcomes in a manner that suggests that nothing positive would have happened to individuals if they had not received services is misleading. Methodology: The recommended methodology for the return on investment measures is to statistically construct a comparison group (Hollenbeck 2004, King and O'Shea 2003, Workforce Training and Education Coordinating Board 2002). The most practical data source for that purpose, for most programs, is the database of Employment Service registrants. This database contains a rich array of information on each individual including age, race/ethnicity, gender, disability status, prior education, and location, and can be linked with unemployment insurance wage files and other data sets to obtain employment and earnings data, and history of unemployment benefits and public assistance receipt. Statistical matching can be used to find the Employment Service registrants who most closely match the workforce development participants of interest. This data source also has the advantage that most of the comparison group share with the participant group the characteristic of being job seekers or incumbent workers who want to improve their situation. While the database of Employment Service registrants is the most useful source for statistically constructing a comparison group, it is not the only source, and for some populations—particularly in-school youth and vocational rehabilitation participants—other sources are needed as well. 
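The idea of statistically constructing a comparison group can be sketched as follows. This is a deliberately simple greedy nearest-neighbor match, offered only as a stand-in; it is not the matching procedure used in the cited studies, and the field names (`id`, `gender`, `prior_education`, `prior_earnings`) are hypothetical.

```python
def match_comparison_group(participants, registrants,
                           exact_keys=("gender", "prior_education"),
                           numeric_key="prior_earnings"):
    """Pair each participant with one Employment Service registrant.

    A registrant qualifies if it agrees with the participant on every field
    in exact_keys; among qualifying registrants, the one closest on
    numeric_key is chosen. Each registrant is used at most once.
    Returns a dict mapping participant id -> matched registrant id.
    """
    available = list(registrants)
    matches = {}
    for person in participants:
        candidates = [
            r for r in available
            if all(r[k] == person[k] for k in exact_keys)
        ]
        if not candidates:
            continue  # no acceptable match; participant is left unmatched
        best = min(
            candidates,
            key=lambda r: abs(r[numeric_key] - person[numeric_key]),
        )
        matches[person["id"]] = best["id"]
        available.remove(best)  # match without replacement
    return matches

# Made-up records for illustration only.
participants = [
    {"id": "P1", "gender": "F", "prior_education": "HS", "prior_earnings": 10000},
    {"id": "P2", "gender": "M", "prior_education": "HS", "prior_earnings": 20000},
]
registrants = [
    {"id": "R1", "gender": "F", "prior_education": "HS", "prior_earnings": 9000},
    {"id": "R2", "gender": "F", "prior_education": "HS", "prior_earnings": 15000},
    {"id": "R3", "gender": "M", "prior_education": "HS", "prior_earnings": 21000},
    {"id": "R4", "gender": "M", "prior_education": "BA", "prior_earnings": 20000},
]
pairs = match_comparison_group(participants, registrants)
```

A production analysis would match on many more characteristics (age, race/ethnicity, disability status, location, employment history) and would likely use a more robust technique, but the basic logic of pairing each participant with a similar non-participant is the same.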
Net impacts can be assessed by comparing outcomes for individuals who received workforce development services (other than registration with the Employment Service) to their matched counterparts. Employment Service registrants who participated in any other workforce development service are removed from the comparison group pool. This method estimates the impact of workforce development services beyond the impact of simply registering with the Employment Service. Clearly, this method cannot be used to measure the impact of registration by itself. First, one identifies the earnings of the participants using the unemployment insurance wage files. It is recommended that there be at least two years of post-program data on earnings to enable the use of mathematical models to extrapolate the earnings out to five years after exit. The five-year period is methodologically defensible, short enough to appear credible to policy leaders, and long enough for almost all programs to demonstrate a positive net impact.[8] One must also identify participant earnings during the time period of program service in order to calculate foregone earnings. During the time period of participation individuals tend to forego earnings compared to similar individuals who are not in a workforce development program. This is a cost of participation that should be taken into account. The value of earnings should also be discounted in order to compare benefits and costs in terms of net present values. Currently, there is no source of administrative records on the value of employer-provided medical and pension benefits. The value must, therefore, instead be estimated based upon the literature on employee compensation. The literature shows how the value of benefits varies with the level of earnings and industry of employment, two variables captured in the unemployment insurance wage records. 
After identifying the earnings and benefits of the participants, one can similarly identify the earnings and benefits of the comparison group. The difference between the total compensation of the participants and the comparison group is the net impact. This should be stated as an average per participant. For the measure of the taxpayer return on investment, the impact on tax revenues can be derived by formula from the impact on earnings. State departments of revenue typically maintain formulas that estimate the amount of tax revenues generated by changes in earnings. The change in tax revenues can, therefore, be estimated without accessing individual tax records and regardless of the types of taxes in a state. One can use the formulas to identify the net impact on tax revenues resulting from the net difference in earnings between the participants and comparison group. Administrative records are available for social welfare payments, e.g., TANF, food stamps, and Medicaid. States can use the administrative records to identify the value of the post-program payments received by the participants and extrapolate this value out to five years after exit. One can then identify the similar value for the comparison group. The difference in the values between the two groups is the net impact on social welfare payments.[9] Calculating program costs is more difficult than it may first appear. One must be careful to include all costs, which may include federal, state, and local. Costs should be based on actual expenditures, not program appropriations. There is a considerable lag between the time that participants exit service and the time that results are available from a net impact analysis. This is one reason why the return on investment measures are not appropriate as performance accountability measures. By the time the results are available, much will have already changed, frequently including the program administrators. 
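The arithmetic behind measures 7 and 8 can be sketched as below. The sketch rests on several assumptions that the text does not fix: a single annual discount rate (3 percent here, purely illustrative), net impacts already expressed as per-participant annual averages for each of the five years after exit, and foregone earnings treated as a deduction from the participant's benefit rather than as an addition to cost.

```python
def present_value(annual_amounts, discount_rate):
    """Discount a series of annual amounts (year 1, year 2, ...) to today."""
    return sum(
        amount / (1 + discount_rate) ** year
        for year, amount in enumerate(annual_amounts, start=1)
    )

def participant_roi(net_compensation_impact_by_year, foregone_earnings,
                    cost_per_participant, discount_rate=0.03):
    """Measure 8 (sketch): discounted net impact on earnings and benefits,
    less earnings foregone during participation, per dollar of service cost."""
    benefit = present_value(net_compensation_impact_by_year, discount_rate)
    return (benefit - foregone_earnings) / cost_per_participant

def taxpayer_roi(net_tax_revenue_by_year, net_welfare_savings_by_year,
                 cost_per_participant, discount_rate=0.03):
    """Measure 7 (sketch): discounted net gain in tax revenue plus net
    reduction in social welfare payments, per dollar of service cost."""
    combined = [
        tax + savings
        for tax, savings in zip(net_tax_revenue_by_year,
                                net_welfare_savings_by_year)
    ]
    return present_value(combined, discount_rate) / cost_per_participant

# Made-up per-participant figures, with a zero discount rate so the
# arithmetic is easy to follow by hand.
example_participant = participant_roi([1000] * 5, foregone_earnings=1000,
                                      cost_per_participant=2000,
                                      discount_rate=0.0)
example_taxpayer = taxpayer_roi([200] * 5, [100] * 5,
                                cost_per_participant=3000,
                                discount_rate=0.0)
```

With the zero discount rate used in the example, the participant figure works out to (5,000 minus 1,000) divided by 2,000, or 2.0, and the taxpayer figure to 1,500 divided by 3,000, or 0.5. A positive discount rate lowers both.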
While this methodology does not withstand the scrutiny of those who believe that only an experimental design is sufficient, it does offer a practical and relatively inexpensive way of estimating net impacts. Moreover, it is much more rigorous than the methodology behind most so-called return on investment measures presented to state policy leaders. While the numbers created by the measures should be considered as only general estimates (another reason they are not suitable as accountability measures) they do give a sense of whether services produce positive net impacts that exceed costs, and the general magnitude of those impacts. Washington's Workforce Board has applied this methodology twice to measure the return on investment for workforce development, and has received good reviews from the Office of the Governor and the state Legislature, increasing their support for workforce development programs. The results have also influenced program direction. For example, the Workforce Board's analysis has shown that the net impact of adult basic skills education is much stronger when combined with vocational skills training. This finding has helped stimulate the blending of literacy and vocational skills training at the state's community and technical colleges. For programs for which random assignment is appropriate, state net impact measurements can be complemented by national experimental net impact evaluations. These research-based evaluations satisfy stricter requirements for scientific rigor and provide valuable information regarding the actual impact of programs and strategies, including their effectiveness for various sub-populations. As a package, then, the performance system would include a variety of measurements (output, outcome, and experimental and non-experimental net impact measures) that complement one another. 
Data Sources and Supplemental Data

The data sources for these performance measures are administrative records created in the process of delivering services. In addition, the following are sources for outcome data:
• Unemployment Insurance wage records
• Department of Defense personnel records
• Federal Office of Personnel Management records
• United States Postal Service personnel records
• College and school student records
• The National Student Loan Clearinghouse
• Public Assistance Records (TANF, Food Stamps, Medicaid)

The major gaps in these administrative records are out-of-state employment and data on self-employment (Stevens and Crosslin 1989, King 1989). WRIS provides out-of-state employment information for services authorized by WIA; however, the scope of WRIS must be expanded to other programs in order to fill the first gap for out-of-state employment wage records (see p. 14). DOL's FEDES project is developing a national system for accessing Defense and federal personnel records. States may now individually reach agreement with one another and the federal agencies for access to employment records. It is not practical, however, for 50 states to be doing this individually. These national systems need to be accessible for the full range of workfo