Segment 2 Of 2     Previous Hearing Segment(1)

 Page 9       PREV PAGE       TOP OF DOC    Segment 2 Of 2  
PROGRAM DATA QUALITY

Wednesday, March 22, 2000
House of Representatives, Committee on Transportation and Infrastructure, Subcommittee on Oversight, Investigations, and Emergency Management, Washington, D.C.

    The subcommittee met, pursuant to call, at 10:00 a.m. in room 2253, Rayburn House Office Building, Hon. Tillie K. Fowler [chairman of the subcommittee] presiding.

    Mrs. FOWLER. The subcommittee will come to order.
    Before I make my opening statement, I am going to defer to Mr. Traficant, who I believe has to go to the Floor for a while and then will return.
    Mr. TRAFICANT. Thank you, Madame Chairwoman. I commend you again for holding this hearing. I will be asking for a specific hearing in the near future on other problems.
    I want to welcome all the witnesses, and those of you in the first panel are probably very lucky that I will not be here. But I believe we have a tremendous chairwoman who has done a great job. I am sure she can function without me.
    I would ask that my entire opening statement be made a part of the record, and any questions that I have for this group be submitted in writing and answered in writing in a timely fashion. I ask unanimous consent for such.
    Mrs. FOWLER. Without objection, your prepared statement will appear in the record.
    Thank you, Mr. Traficant. We look forward to your being back. And it is hard without you here, but they need your statements on the Floor, too.
    This is the second hearing that is related to the quality of data that is collected and used by Federal agencies. Our first hearing focused on financial data. Today's hearing will examine the quality of program data.
    Now if I can make a brief comment regarding our first hearing. This panel expressed grave concerns regarding the lateness of EPA's audit of financial statements last year. I am pleased to say that despite what I understand was pressure from the Office of Management and Budget, the EPA Inspector General issued the audit on time this year. I commend the Inspector General for maintaining her professional credibility in the face of what I understand was some political pressure. That is an example that needs to be set for others, too. So thank you very much.
    As in our first hearing, we will be examining data collected and used by the three largest agencies in the committee's jurisdiction: the Department of Transportation, the General Services Administration, and the Environmental Protection Agency.
    Our society is now in the midst of an information revolution. And thanks to phenomenal improvements in computing power and communications technology, we are undergoing a vast change in how we do business and how we make decisions.
    The lifeblood of much of this new and rapidly expanding information technology is data. We have always considered libraries to be the main repositories of information. That is no longer true. Today, more people do research on the Internet than in libraries. And now more than ever before, the Federal Government is a primary source of that information. The top five most visited web sites are all run by Federal agencies.
    The question we are asking today is, 'How good is the data that the Federal Government is getting out?' Many of us assume that if data comes out of a Federal computer, then it must be right. But anecdotal evidence indicates we may make such an assumption at our peril.
    For instance, a recent sample of data in a GSA building security database indicated about 50 percent of the data was inaccurate. Last year, USA Today reported that over 88 percent of drinking water violations in this country were not being reported in EPA databases. And finally, The Wall Street Journal stated that data quality deficiencies at the FAA were compromising safety because many of the incidents reported in databases have missing or incorrect data.
    These are not just isolated examples. The subcommittee staff has reviewed a total of 104 General Accounting Office and Office of Inspectors General audit reports issued in the last few years. And 85 of these reports indicated that data quality was a problem.
    Many of these specific problems are addressed once they are identified, but I am concerned that there may be systemic and chronic problems with data quality. Only rarely does someone seem to be asking basic questions such as, 'How accurate is our data?' And 'how accurate does it need to be?'
    Just yesterday GAO issued a report regarding data in EPA's Water Program. Specifically, GAO evaluated data in EPA's 'National Water Quality Inventory.' EPA uses data from the 'inventory' as a basis for a number of important decisions and activities, including how to allocate over $2.5 billion in Federal Clean Water Act funds every year.
    GAO concluded that problems with the data 'make the national statistics unreliable and subject to misinterpretation and, therefore, of limited usefulness.' GAO's report further states that the poor quality of the data 'limit states' abilities to carry out several key management and regulatory activities on water quality.'
    In short, here we are, over 25 years after the Clean Water Act was written into law, and we still don't know how clean our water is. And if you're a user of Federal data, it doesn't matter how fast your modem is or how much RAM your CPU has, if there is garbage at the other end of the line, that is what you are going to get out. The bad thing is, you won't always know it. Worst of all, public decisionmakers, confident with the data they have, will make decisions that affect you that are just plain wrong.
    So I look forward to hearing today from GAO, from the Inspectors General, and from agency officials on how serious this problem may be and what we are doing about it.
    I would like to now call upon the witnesses who are here for our first panel.
    Representing GAO is Mr. Christopher Mihm. Representing the Inspectors General, we have first Mr. Raymond DeCarli from the Department of Transportation, next Mr. Gene Waszily from the General Services Administration, and finally Ms. Nikki Tinsley, the Inspector General from the Environmental Protection Agency.
    As you all know, it is the standard procedure of this subcommittee to swear in our witnesses. If you would please stand and raise your right hand.
    [Witnesses sworn.]
    Mrs. FOWLER. Thank you. Please be seated.
    I would like to recognize the vice chairman of the subcommittee, Mr. Lee Terry, from Nebraska.
    Do you have an opening statement you would like to make, Mr. Terry?
    Mr. TERRY. No. Let's proceed with the hearing.
    Mrs. FOWLER. Thank you for being with us today. I know you have another hearing today, as do I. All the hearings today seem to be occurring at the same time.
    We just ask, as you know, that you summarize your testimony in 5 minutes and without objection your full statements will be included in the record.
    Mr. Mihm, you may proceed.
TESTIMONY OF J. CHRISTOPHER MIHM, ASSOCIATE DIRECTOR, FEDERAL MANAGEMENT AND WORKFORCE ISSUES, U.S. GENERAL ACCOUNTING OFFICE; RAYMOND J. DECARLI, DEPUTY INSPECTOR GENERAL, U.S. DEPARTMENT OF TRANSPORTATION; EUGENE L. WASZILY, ASSISTANT INSPECTOR GENERAL FOR AUDITING, OFFICE OF INSPECTOR GENERAL, U.S. GENERAL SERVICES ADMINISTRATION; AND NIKKI L. TINSLEY, INSPECTOR GENERAL, U.S. ENVIRONMENTAL PROTECTION AGENCY

    Mr. MIHM. Thank you, Madame Chairwoman.
    Mrs. Fowler and Mr. Terry, it is an honor and pleasure to appear before you today to discuss the challenges that Federal agencies face in producing credible performance information, and the opportunities that the Government Performance and Results Act, GPRA, provides for generating information to help congressional decisionmaking.
    As you know, GPRA was passed in 1993 out of Congress' frustration that congressional decisionmaking, spending decisions, and oversight had been seriously hampered by a lack of clear goals and adequate program performance information from agencies. By the end of this month, under GPRA, agencies are to publish performance reports that for the first time will provide important information on the overall results of Federal programs.
    These performance reports offer Congress the opportunity to systematically assess agencies' actual results on a Government-wide basis and to consider the specific steps that can be taken to improve performance. But before that can take place, Madame Chairwoman, as you pointed out in your opening statement, we have to know how accurate the data is and how accurate it needs to be.
    My statement today will focus, therefore, on three topics. First, I will provide a Government-wide perspective on the credibility of agencies' performance information. Second, I will discuss some of the challenges agencies face in producing credible performance data. And third, I will highlight how agencies can use their performance reports that are coming later this month to address data issues.
    Turning to my first point, figure one in my written statement shows our analysis of agencies' fiscal year 2000 performance plans. That analysis found that most agencies, including EPA and GSA, provided only limited confidence that their performance information would be credible. DOT was one of four agencies that provided a greater degree of confidence about their performance information.
    Specifically, most agencies did not provide information on the procedures that they would use to verify and validate the performance information. Thus, Congress lacks information on how good the reported data will be. In addition, most agencies failed to discuss strategies to address their known data limitations.
    As a result of these limitations, we believe that it is unlikely that agencies will consistently have the reliable information needed to assess whether goals are being met, or specifically how performance can be improved.
    My second point is that three factors seem to be critical to the quality of performance data. These factors, which are detailed in my written statement, are that program design issues can make it difficult to collect timely and consistent national data, that the relatively limited level of agencies' program evaluation capabilities makes it difficult to know the results of Federal programs, and that longstanding weaknesses in agencies' financial management capabilities undermine performance and accountability.
    I will touch on each one of these in turn.
    In regards to design issues, in several Federal mission areas, the Federal Government has devolved program responsibility and accountability to State and local governments. However, collecting consistent data to provide an overall national picture of performance can be challenging when programs are implemented and results are achieved through networks of intergovernmental partnerships.
    A further challenge is the need to have data-derived understandings of the contributions that individual programs make to program results. That is, moving beyond how many times we did something to a real discussion of what difference it made in people's lives and in the quality of services delivered to the American people. Such an understanding is important to ensure that agencies have the best mix of programs in place. In this regard, program evaluations are an important tool for assessing the contributions that individual programs are making to results and determining the factors affecting performance.
    We in the General Accounting Office have been concerned for years, and have reported repeatedly, that many Federal agencies lack the capacity to undertake the needed program evaluation. In other words, a key piece of information—what difference we are making—is not available to Congress.
    Conclusions about what the Government is accomplishing with taxpayers' money cannot be drawn without adequate program cost information. Unfortunately, as this committee's hearing back in the fall pointed out, many agencies are not in a position to provide that information.
    The final point I will make this morning is to note that the forthcoming performance reports provide the agencies with an opportunity to show the progress they are making in addressing data credibility issues. Discussing data credibility and related issues in performance reports can provide important contextual information to Congress.
    To help in this regard, during the past year we have issued a number of reports from the General Accounting Office on practices and approaches agencies can use to address data weaknesses. We look forward to continuing to work with Congress and this subcommittee as you seek to instill a more results-oriented approach to management and accountability in the Federal Government.
    In the interest of time, let me end there because I am sensitive that you have other hearings. I would welcome any questions that you may have.
    Mrs. FOWLER. I need to learn to speak as fast as you can.
    [Laughter.]
    Mrs. FOWLER. Thank you, Mr. Mihm. Thank you very much.
    Mr. DeCarli, before you begin your statement, I understand that you plan on retiring from Government this June after over 30 years of service. I want to thank you for your distinguished career at the Departments of Defense and Transportation and your efforts of striving over the years to improve our Government operations. I thank you on behalf of the Government and you may proceed with your statement.
    Mr. DECARLI. Thank you, Madame Chairwoman. I appreciate your remarks.
    Good morning, Madame Chairwoman and Mr. Terry. Thank you for inviting the Inspector General to testify this morning on this important subject.
    Transportation decisionmaking relies on access to good data. Good data is the key to good decisionmaking that affects the safety of the travelling public. Virtually all data have some degree of error and rarely is the pursuit of perfect data necessary or cost-effective. The real key is to know the level of accuracy of data that you need and how well the data you have measures up against that standard.
    Our testimony this morning is going to address four issues. The first issue is that the completeness, accuracy, and timeliness of program data are problems in the Department of Transportation. I will give you a couple of examples.
    DOT collects and analyzes data that is used to identify transportation companies that should be subjected to safety compliance reviews. These data are used to target high-risk companies for Federal inspection.
    We recently performed an audit of the DOT motor carrier databases and found that there were 70,000 truck and bus companies that didn't have any information in these databases related to the numbers of trucks they had or the numbers of drivers. Basically, they had zeroes in the database. You're not going to go out and do a compliance review of a company that doesn't have any trucks or any drivers. So that is one type of data problem.
    The second point would be that DOT distributes about $25 billion each year in formula grants. The accuracy of the data used in those formulas is critical to make sure the right organizations receive the right amount of funds. In 1998, we evaluated the accuracy of the passenger data submitted to the Department by the airlines. That information is used to distribute the Airport Improvement Grant Funds.
    When we looked, we found that the Department wanted a 95 percent accuracy rate and only 69 percent of the information we received measured up to that standard. In fact, about 7 percent of the data had errors ranging between 31 and 40 percent. As a consequence, the Department managers who are responsible for distributing those funds had to use other information they had garnered to make the proper distributions.
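The kind of measurement Mr. DeCarli describes, checking a dataset against a required accuracy standard, can be sketched as a simple sampling exercise. The records, the matching rule, and the error mix below are invented for illustration; only the 95 percent standard and the roughly 69 percent result come from the testimony.

```python
import random

REQUIRED_ACCURACY = 0.95  # the standard the Department set for passenger data


def sample_accuracy(records, is_accurate, sample_size, seed=0):
    """Estimate a dataset's accuracy rate from a random sample.

    `is_accurate` is a caller-supplied check that compares a record
    against an authoritative source (here, a hypothetical stand-in).
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    accurate = sum(1 for record in sample if is_accurate(record))
    return accurate / len(sample)


# Hypothetical passenger-count records as (reported, verified) pairs;
# 69 of 100 match the authoritative source.
records = [(100, 100)] * 69 + [(100, 135)] * 31

rate = sample_accuracy(records, lambda r: r[0] == r[1], sample_size=100)
print(f"estimated accuracy: {rate:.0%}")  # → estimated accuracy: 69%
print("meets standard" if rate >= REQUIRED_ACCURACY else "below standard")
```

The key design point is that the accuracy check is separate from the sampling: the same routine can grade any dataset once someone decides what "accurate" means for it, which is exactly the question the witnesses say is rarely asked.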
    The third item under this issue is that DOT has extensive information on transportation programs. Here is just one example of the kind of data we have. This is the National Transportation Statistics for 1999 published by the Bureau of Transportation Statistics.
    If you took a look through the book, one of the things you would note is that we have a problem with timely data. You would find that even though these are 1999 statistics, much of the information is from 1997, with some of it going back as far as 1990. So that is another problem.
    Our second point this morning is that DOT's ability to collect good data is hindered by inconsistent definitions and the department's need to extensively rely on organizations outside of DOT's control. There are disincentives and barriers to third-party data collection and problems with self-reporting. One of those problems that I will give you an example of is an oil spill. If you were responsible for that spill, you might not want to report it, and you might even want to understate the extent of the damage caused by that spill. That is one of the reporting mechanisms that we have, self-reporting.
    Third-party reporting is also a problem. Third parties have the opportunity to bend the rules, and they do. The example I would cite here is again in the motor carrier area. The States are required to report convictions of drivers with commercial driver's licenses. But 26 States have programs where they mask those convictions, so we don't collect all of the convictions. We don't become aware of them because of the masking programs out there. Fortunately, last year's motor carrier law corrected that problem. They can no longer do that.
    Our third point this morning is that the absence of accurate and timely data ultimately hinders managers' ability to measure performance and make good decisions. Insufficient, inaccurate data could adversely impact DOT's ability to achieve its strategic goals, which relate to safety, mobility, economic growth, environmental quality, and national security. My written testimony has a number of examples. I will just give you one.
    In the motor carrier area, the Department established the goal to reduce fatalities by 50 percent over the next 10 years. Currently, about 5,300 people are being killed annually in crashes involving trucks. But we don't collect the kind of information that shows us the causes of the accidents. And without good data on causes it is difficult to know the right corrective actions that need to be taken.
    Our fourth point is that DOT recognizes that it has problems with data quality and has taken actions to fix them. Last year, DOT was the only agency to conduct a dry run to prepare for the first performance report. When they did that, they found problems with 40 percent of the data; they would have been able to report on only about 60 percent of the data elements. They took action, and they will now be able to provide at least preliminary data on 90 percent of those performance measures.
    Furthermore, DOT's report will specifically identify limitations on the quality of data. We have seen a draft of the Department's report. It clearly shows that there are limitations and that there are problems with self-reporting and what the limitations are of the data.
    To its credit, DOT did get a clean opinion on its financial statements this year. It took a monumental effort and we applaud the Department for that. It still needs to make some systemic fixes in order to sustain clean opinions in the future.
    And our last point would be that as DOT enters the new millennium, it has a major challenge. That challenge is not only does it have to have good program data that is complete and accurate, but it now has to link the program data with the performance measures and the cost data. Cost and performance need to go hand in hand to really assess whether or not the programs are working effectively and providing the results they should.
    On the next panel, you will have Jack Basso, DOT's CFO and Assistant Secretary for Budget. I am sure he will be glad to tell you all the actions the Department has taken to improve the data quality. We think the Department recognizes the problems and is on the right track to make the fixes. But these problems are not going to be resolved overnight.
    That concludes my statement.
    Mrs. FOWLER. Thank you, Mr. DeCarli.
    Mr. Waszily, you may proceed.
    Mr. WASZILY. Good morning, Madame Chairwoman and Mr. Terry.
    I am very pleased that your subcommittee has decided to raise publicly the issue of quality of data in Federal information systems. Although usually not the topic of everyday discussion, we all rely and must use data prepared by others in our daily lives to make decisions, both professional and personal. Clearly, the accuracy, timeliness, and completeness of data we use has a direct bearing on the quality of actions we take and the results we achieve.
    For the past 17 years I have been engaged in the evaluation of GSA programs and supporting systems. While it is fair to say that over that period I have witnessed great improvement in the agency's management and its overall performance, it is also true to say that there have been very few systems examined where the quality of supporting data was without deficiency. The net result of data problems cannot be easily quantified, nor can the underlying causes be easily overcome.
    At GSA, because we are basically a business-type organization, the preponderance of data weaknesses we find usually manifest themselves in terms of project delays, misbillings, improper time charges, and rework. If the errors are frequent and significant in nature, they tend to be identified and corrected before program-wide problems occur. Nonetheless, these errors do cause service delivery deficiencies and they unproductively consume resources.
    At the heart of data problems, we usually find three root causes: poorly designed and overly complex reporting and collection methods, antiquated often fragmented information systems, and generally the human factor, which includes data entry errors, misunderstandings, miscommunication, carelessness, poor judgment, and just general disinterest.
    While we can never completely eliminate the sources of these errors, the installation of modern information systems, particularly those that include data entry templates and built-in logic and control edits, can greatly enhance the quality of our data. GSA is currently working to implement these kinds of systems.
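The "built-in logic and control edits" described here amount to field-level validation rules applied before a record is accepted into a system. A minimal sketch, with field names invented for illustration and loosely echoing the zero-truck carrier problem from the earlier DOT testimony:

```python
def validate_record(record):
    """Apply simple control edits; return a list of error messages.

    A record passes only if every edit passes, so bad data is rejected
    at entry rather than discovered later in reports.
    """
    errors = []
    # Edit 1: required fields must be present and non-empty.
    for field in ("carrier_id", "truck_count", "driver_count"):
        if not str(record.get(field, "")).strip():
            errors.append(f"missing required field: {field}")
    # Edit 2: counts must be positive (a zero-truck, zero-driver
    # carrier is almost certainly a data problem, not a real company).
    for field in ("truck_count", "driver_count"):
        value = record.get(field)
        if isinstance(value, int) and value <= 0:
            errors.append(f"{field} must be greater than zero")
    return errors


good = {"carrier_id": "US123", "truck_count": 14, "driver_count": 20}
bad = {"carrier_id": "US456", "truck_count": 0, "driver_count": 0}
assert validate_record(good) == []
assert len(validate_record(bad)) == 2  # both zero counts are flagged
```

In practice such edits live in the data entry template itself, so the clerk is told about the problem while the source document is still in hand.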
    The further we venture into the information age, the greater the challenge is going to be for managers to stay on top of the ever growing complexities that affect our data systems.
    I will conclude there and I will be pleased to answer any questions.
    Mrs. FOWLER. Thank you, Mr. Waszily.
    Ms. Tinsley, you may proceed.
    Ms. TINSLEY. Good morning, Madame Chairwoman and Mr. Terry. I am pleased to have the opportunity to discuss our audits of data quality at EPA. Environmental data quality problems have been a concern of the Office of Inspector General and EPA management for some time now. In 1992, EPA declared environmental data quality a material weakness. Progress in correcting the problems has been slow. Our office has reported environmental data quality as a key management challenge for the last 3 years.
    EPA's challenge in ensuring complete and accurate environmental data is complicated by the fact that data in EPA's systems comes from State offices, from other Federal agencies, and from regulated facilities. Data originating from these different sources and feeding into EPA's data systems can result in a hodgepodge of data that cannot be easily compared and consolidated. EPA has been working with the Environmental Council of the States and others to address data problems. And it has made some positive changes in other areas, including strengthening policy, increasing program and regional office oversight, and reorganizing information and data quality functions.
    While we believe these are steps in the right direction, EPA may not have the environmental data it needs to monitor environmental activities and compare progress across the Nation. Reliance on data that is of unknown quality affects three components of EPA's business that I would like to talk about today: its ability to make sound decisions, its ability to take appropriate enforcement actions, and its ability to keep the public informed of progress in the protection of human health and the environment.
    EPA relies on environmental data to make important decisions on complex issues that have significant environmental, social, health, and economic consequences. Each year, EPA, States, and the regulated community spend about $5 billion collecting environmental data. This data must be accurate and relevant. We audited aspects of EPA's Superfund data quality and found that EPA did not consistently define the quality of data needed to support Superfund cleanups.
    For example, in one regional office, we looked at the planning process for five Superfund cleanups. Three were completed with inaccurate data for decisionmaking. Adequate planning provides two primary benefits: more cost-effective sampling and better decisions.
    We reviewed four States' Resource Conservation and Recovery Act enforcement programs and found that the data in State files did not agree with the data in EPA's systems. Consequently, EPA could not make informed decisions concerning management of hazardous waste at these facilities, and the public didn't have access to accurate data.
    Enforcement actions should be based on sound and defensible environmental data. Our audit work in EPA's Air, Water, and Hazardous Waste Enforcement Program showed that environmental data are not always accurate or complete. Without information on significant violators, EPA can neither assess the adequacy of States' enforcement programs, nor take action when a State does not enforce environmental laws.
    Unreliable information about significant violators also compromises the data EPA makes available to the public through its Internet databases, leaving a false impression that facilities are in compliance with environmental laws when in actuality they may not be.
    Recently one of EPA's regional laboratories reported that data quality and chain of custody were compromised when chemists circumvented the lab's standard operating procedures. As a result, the lab provided data that were of unknown quality to regional program offices. At EPA's request, we audited the laboratory's management controls and found that recommendations from a prior internal review of the laboratory had not been implemented. EPA had identified the problems but it had not corrected them. Now EPA is taking the costly step of validating the analyses and determining the impact of the suspect data on EPA decisionmaking.
    The Results Act focused Federal agencies' measurements on accomplishments specific to their mission. One EPA measure is pesticide reregistration. In our 1999 financial statements audit, we found that reregistrations were improperly counted and reported. Since EPA began tracking and reporting pesticide reregistration, its data system has been counting each of the multiple decisions made during the reregistration process as an additional reregistration, thus overstating progress in ensuring that pesticides are safe to use.
    Congress and the public do not get an accurate picture of EPA's progress in reregistering pesticides from the data EPA has reported. We have been reporting this problem since 1996.
    Environmental data quality at EPA is a continuing concern for agency management and the OIG. While EPA has taken a number of steps to address data quality, it has not developed an overall strategy that supports a comprehensive environmental data infrastructure that will meet the wide variety of needs demanded by the agency, its partners, the public, and the Congress. We are committed to working with the agency to reach that goal.
    That concludes my testimony.
    Mrs. FOWLER. Thank you. I appreciate your testimony.
    I have a question for all the witnesses.
    Are any of you aware of any ongoing effort in any Federal agency to systematically estimate the accuracy of the data that is maintained by that agency?
    Mr. MIHM. Mrs. Fowler, about the closest you will get is that agencies, as part of the Government Performance and Results Act, have to show how they are going to verify and validate their performance information. The best agencies are looking at their key performance measures and are attempting to verify and validate those measures. But the systematic or comprehensive examination of the data is not something we are seeing within agencies.
    Mrs. FOWLER. And if they are basing performance measures on inaccurate data——
    Mr. MIHM. Then they are of little value to agency management or the Congress.
    Mrs. FOWLER. So unless they are working in some systematic way to ensure they have accurate data, then all the things we have asked for in this law aren't going to do us any good when they are based on inaccurate data.
    Mr. MIHM. Yes, ma'am.
    Mrs. FOWLER. What recommendations would any of you have as to how we could go about getting agencies to focus on reliable data? Do we need to do something legislatively, or can we work within regulations? It was most frustrating as I was reading through some of the GAO and OIG reports to see the inaccurate data, and therefore inaccurate information, that we in the Congress get, as do regulated industries and governments around the country.
    So we are searching for some solutions to ensure that this doesn't continue to occur. What fixes can we put in to correct this?
    Mr. DECARLI. Madame Chairwoman, the Department of Transportation does have a Bureau of Transportation Statistics and in its performance report it will have a section that deals specifically with data verification and data validation. It will go through a disclosure on many of the data elements telling you exactly what the limitations are in terms of the reliability of the data. That will give the reader a clue as to how much trust or reliance they can place on that information.
    I think that has to be a first step. There is no way you can take the amount of data that the Department of Transportation has and expect to know the reliability on all that data. You basically have to focus on the important pieces of data and then let the reader know the limitations of what they see in front of them.
    Mrs. FOWLER. You made the point that DOT gives out $25 billion in formula grants, but when only 69 percent of the data is accurate, there is a lot of money going out there in grants that might be based on inaccurate information.
    Mr. DECARLI. Fortunately, that 69 percent related to one program, airport grants, which was about $1 billion. The heavy part of the formula grants in DOT are with the Highway Program, and we did look at that program. We didn't find any serious problems either with data or the method of calculations. So about $18 billion going out on the Highway Program was based on accurate data.
    Mrs. FOWLER. Would you suggest something like the new office DOT is setting up as a first step?
    Mr. DECARLI. Not necessarily a new office, but clearly exposure as to the reliability of the data.
    Mr. WASZILY. Basically, when we look at our systems, we firmly believe that one of the critical steps that the agencies need to take is the implementation of modern information systems. Unfortunately, Federal agencies got off to a slow start and our general assessment is that we are probably 8 to 10 years behind most of industry in its application of automated techniques.
    I think we all understand there are data problems there. The question is how to control them.
    When you see repetitive problems with data, you need to find some kind of edit or some kind of technique, sort, or program to flag errors before they cause major problems, or to prevent them from getting into the system in the first place. Two examples—when the retail industry saw there was an inordinate problem with cashiers making change, the solution was that the cash registers automatically subtract the bill from the amount tendered and the change is calculated for the cashier. The same thing happened with credit cards. When credit card numbers were written by hand, there were tremendous numbers of misbillings from transpositions. Fortunately, we came up with the magnetic strip, which puts the number in correctly right away.
    Those are the kinds of things I think are going to greatly improve our data collection problem.
    Mrs. FOWLER. Ms. Tinsley?
    Ms. TINSLEY. I think there are two things that agencies can do to address the problem.
    First of all, performing the kind of internal reviews that the regional lab at EPA had performed that identified problems with quality control procedures in its lab work. And the second thing is holding employees accountable for fixing problems when problems are identified.
    We have found in past audit work, where we have made recommendations and the agency has agreed to implement them, that when we go back later, the fixes were not made.
    Mr. MIHM. You asked me if there was a need for statutory action at this point.
    Congress has already done its part. You have implemented the Government Performance and Results Act for performance data and the Chief Financial Officer's Act for financial information. It is now up to the agencies to step up to the plate and effectively implement those statutory initiatives.
    I think one of the things that is certainly helpful is continuing congressional oversight. Hearings such as this certainly send unmistakable messages to agencies that Congress is concerned and interested in the quality of performance information.
    I would certainly endorse what Ms. Tinsley said about accountability for program managers. The Department of Education, in its performance plan and performance report, makes the responsible program official sign off as to the quality of the performance information, to make it clear that this is something that is not staffed out. It is not something that someone else does or that the auditors look at. It is a fundamental program responsibility.
    These types of things can really go a long way to making it clear that ensuring the collection of good data is not just other duties as assigned. It is inextricably linked to the success of programs.
    Mr. DECARLI. There are some agencies, like the Department of Transportation, that are very much dependent on outside organizations to give them information. And we have to find better ways to use what I refer to as 'sticks and carrots' to encourage them to make sure that we get the kind of data we need.
    Mrs. FOWLER. I worry about the Department of Education. They just recently awarded a lot of scholarships to people that weren't supposed to get the scholarships and now they have to pay for them. I don't know how accountable that program manager is going to be held, but there can always be problems at Education.
    I have a couple of other questions, but I would first like to go to Mr. Terry.
    Listening to Mr. Mihm, there is already a piece of legislation in place which addresses this kind of thing: the Federal Managers Financial Integrity Act, which for years has required agencies to review their programs and see whether or not they are working and, if they are not, to put together a plan to address the problem.
    I think the thing is how we put teeth in these because these things have been there, but they are just being ignored or just are not being complied with. That is a frustrating thing as a Member of Congress. You put these in place, expect them to be adhered to and followed, and then they aren't. Then as a result we get inaccurate information or we base decisions on inaccurate data. We need to make these agencies more accountable.
    The GAO helps us tremendously. The Inspectors General have helped us tremendously. Those jobs will be available forever because I think it is going to be an ongoing problem.
    Mr. Terry?
    Mr. TERRY. Thank you.
    Mr. Mihm, listening to the testimony here and reading through the statements, one general theme or impression arises, and that is that this is mostly a management issue through your investigation, oversight, and compilation for your report.
    Either strengthen or weaken that impression of mine. Do you see differences in the different agencies in the problem of credible information? And which agencies do you think have the least credible collection and reliable information? And which ones are the best?
    Mr. MIHM. Dealing with the first part of your question—the management versus programmatic problem—you have heard that the panel's discussions of data quality have been on two very closely related tracks. On the one hand there are problems with the quality of information in the basic management information systems within agencies, the types of data that managers need to routinely have in order to run their programs. That is subject to all the errors that Mr. Waszily pointed out: human errors, database errors, and all the rest.
    There is a second level of performance information, which is also very important, on which we spend most of our attention in the GAO, and that is, What difference does it make? Do agencies know what difference what they do on a day-to-day basis makes in terms of the lives and welfare of the American people? This was the stuff that Mr. DeCarli and Ms. Tinsley talked about when Mr. DeCarli was saying that DOT does an awful lot and doesn't have an understanding of how what they do influences, in one way or another, highway deaths. Ms. Tinsley was talking about EPA doing a lot but still having trouble figuring out a national picture of clean water and clean air. So that is performance information at the policy and programmatic level.
    Now in terms of the agencies that are doing the best, as I mentioned in my written statement, when we looked at the annual performance plans for fiscal year 2000, there were a number of agencies that seemed to be doing very well on the performance information, one of which was the Department of Transportation. They had a good verification and validation plan. In our view, they did a good job of revealing and discussing known data weaknesses. I look forward to seeing their performance report and seeing how issues of verification and validation are treated there.
    So there are some agencies that are really in the lead. But there is a very large group of agencies in which, we concluded, you could have only limited confidence in the quality of their programmatic information. This is also the case when you look at agency financial information—specifically unit cost information. I am not talking about audited financial statements, which are also very important. But rather, when you look at whether you have information routinely available to you and to managers on the cost of Federal programs, the subset of agencies that are in good shape there is actually very, very small. The agencies that are in some trouble are, unfortunately, some of our larger agencies—USDA, the Department of Defense—these are the agencies that also routinely get disclaimers on their financial opinions.
    Mr. TERRY. Do you make those specific findings of which agencies have poor performance in those areas in your report?
    Mr. MIHM. We have in other reports, and I will be happy to get those for you, sir, where the agencies are listed along with the reasons why.
    Mr. TERRY. Very good. I think the ultimate goal is to take it from a general learning of the general issues and problems and then to the specific problem-solving.
    But I am curious about one other thing. I am going to keep talking to you because you don't represent one specific agency.
    How much does politics and culture account for these types of problems? So you understand my direction, we had some folks come in my office yesterday talking about one specific education program and how much more we need to increase funding in that particular area.
    It is pretty apparent, when you talk about the number of students that they touch and their success, that it is probably—if you grade it—a complete failure of a program. And yet they wanted an incredible amount of new and additional funding.
    The point is that sometimes it just doesn't matter if you gather information and it is accurate, because of the politics. Who is going to say that when you have a program that deals with an inner city child to receive a college education or to be placed on the right path toward a college education—that for those who are actually successful, you could have put them in the most costly prep school and Harvard and probably paid for their next 10 years after that. It's politics.
    Is that coming in here? Do the middle managers and upper managers just say, Look, Congress says we are supposed to run this program and no one has ever said we are supposed to run it efficiently?
    Mr. MIHM. Yes, sir, you do see some of that. Although the ship is turning. Certainly, prior to the enactment of the Government Performance and Results Act, as Congress was frustrated by a lack of real information on results, so too were agencies. If you asked a typical program manager what their measures of success were, they would describe it in terms of activities or number of things done rather than if it made a difference to someone.
    I think one of the things that GPRA has done is begin to change that orientation in agencies. Now you hear—as you have heard across the panel—much more of a concern of not what we are doing but what difference it is making in people's lives. And that is a huge cultural change. It is a cultural change within agencies—as you alluded to—it is a cultural change up here on the Hill, as well, to begin to think in that way.
    The authors of the Results Act understood that performance information doesn't by itself cause decisions to happen, and it is not the trump card. It is just one factor that Congress and other decisionmakers need to have available to them as they are considering other things. But that has been the missing piece too often in the past.
    That is one of the reasons why it takes such a long time to implement a cultural change initiative like GPRA. It is getting down throughout organizations that this is a different way of doing business. Congress is asking different questions than have traditionally been asked in the past. And it takes time for that message to get out and new types of information to become available.
    Mrs. FOWLER. Thank you, Mr. Terry.
    I just have a couple more questions.
    Mr. Mihm, is there anything specific that you would recommend that any of these three agencies should be doing that they are not doing that would help? And also, how do their data quality control practices compare to those being exercised in private industry?
    Mr. MIHM. In terms of specifics, we have done reviews of selected data sets within each of these agencies. You mentioned in your opening statement some of the very recent work we have done over at the Environmental Protection Agency, which made recommendations. I am pleased to see that in the vast majority of cases, the agencies are taking action on those, either directly or through actions that are consistent with them.
    We are very fortunate that with each of these IG offices we work in close partnership on a number of issues. An example of this was the hearing that you had last fall on building security at GSA, where my colleague back at GAO and Mr. Waszily both testified on work that they had been doing on that issue.
    We have made a series of recommendations about the data problems the IGs have talked about in their agencies. We are happy to see that the agencies are responding to those.
    In terms of where they stand overall, a lot of it cuts on the types of challenges that they face. DOT and EPA are fundamentally agencies whose results are achieved through partnerships with States, local governments, and non-profits, and in the case of EPA also a large network of contractors. So they have a fundamentally more difficult set of challenges in getting good performance information out of those contractors and out of the States, and getting it into a consistent and usable form.
    On the other hand, the General Services Administration has—I won't say an easier case, but at least a more straightforward way to get performance information, because GSA is measuring its own activities that get it the results. So that is why, when we look across and discuss an awful lot of the challenges that DOT, EPA, and other agencies that implement programs intergovernmentally face, they are among the agencies that have real challenges.
    In both cases, as you heard, they are also agencies that are taking action to address these challenges.
    The key, though, will be following through on these. As you mentioned in your opening statement, Madame Chairman, we have had a lot of actions being taken in the past that don't seem to ever kind of come all the way to fruition. I know my colleagues on the panel and certainly we in the GAO will continue to watch these agencies to make sure that these things come to fruition this time.
    Mrs. FOWLER. This subcommittee will be doing the same. Oversight is a word we believe in and we will be exercising diligently this year and for the coming years also.
    Ms. Tinsley, when I was reading your testimony, it was striking because it went back to where in 1992 EPA said that their environmental data quality was weak. They admitted it and were going to correct it. Then you go through tracking the corrective actions, and it wasn't met in 1994, not in 1997, and in 1998 your report concluded that they still had not instituted a cohesive agency-wide quality assurance program.
    According to your testimony today, EPA has not developed an overall strategy to address the accuracy and completeness of its environmental data. Yet the regulated community and EPA are spending about $5 billion a year collecting this data.
    You talk about the new Office of Environmental Information being a step in the right direction. What does it need to do to correct this problem? This has been going on for years. Are they going to be able to get this straight and make sure this is going to be a success? So far, they don't seem to be getting it corrected very well.
    Ms. TINSLEY. We think that the idea of taking all these data quality responsibilities and putting them in one place is a very good idea. For the new office to be successful, it still has a very big job ahead of it.
    The way EPA is organized from the standpoint of a centralized headquarters and its 10 regional offices and many other parts of the organization gathering data makes it an overwhelming challenge. The idea of accountability in making sure that all the different things that feed into that centralized office are good, quality pieces of information will be the challenge. But the fact that it is not scattered across a number of media programs and that there is a centralized function is a step in the right direction.
    Mrs. FOWLER. And hopefully if States are not giving them good quality data, then they won't get the grants they would be getting. Hopefully there will be some carrot and stick that you have to have. Otherwise, they still won't worry about it.
    Just last night, I was talking to my daughter, who is doing some water quality research out west, and telling her about the GAO report. She asked me if I had read the report 'Murky Waters.' I gather there is a group, 'Public Employees for Environmental Responsibility,' that put out this report last year dealing with some of the problems with water issues and the EPA. I am going to get it and read it, since I wasn't familiar with it, nor was my staff. So it is great to be learning things from young people.
    I didn't know if that is something GAO has been looking at or what it has to do with it. But I thought I would just bring it up. We are certainly going to pursue it from here and pass it on to Mr. Boehlert and see what it has to do with water quality control with the EPA also. We will go from there.
    Are there any other questions?
    [No response.]
    Mrs. FOWLER. I want to thank each of you. You do such a good job for your agencies and for the Government. We really appreciate your work and we continue to look forward to working with you.
    Thank you very much.
    Mrs. FOWLER. I would now like to call panel two.
    The Chair now calls the witnesses for the second panel. First, Mr. Jack Basso, who is the Assistant Secretary for Budget and Programs and the Chief Financial Officer at the Department of Transportation.
    Mr. Basso, I understand that you are accompanied by Dr. Sen, who is the Director of the Bureau of Transportation Statistics.
    Next, Mr. William Piatt is the Chief Information Officer at the General Services Administration.
    And finally, Ms. Margaret Schneider is the Assistant Administrator for Environmental Information at the Environmental Protection Agency.
    As with the first panel, we will swear in the witnesses, so please stand and raise your right hand.
    [Witnesses sworn.]
    Mrs. FOWLER. Thank you. Please be seated.
    As before, we ask that you summarize your testimony in 5 minutes and without objection your full written statement will be included in the record.
    Mr. Basso, we will start with you.
TESTIMONY OF PETER J. BASSO, CHIEF FINANCIAL OFFICER, OFFICE OF THE SECRETARY, U.S. DEPARTMENT OF TRANSPORTATION; WILLIAM C. PIATT, CHIEF INFORMATION OFFICER, U.S. GENERAL SERVICES ADMINISTRATION; AND MARGARET N. SCHNEIDER, DEPUTY ASSISTANT ADMINISTRATOR FOR ENVIRONMENTAL INFORMATION, U.S. ENVIRONMENTAL PROTECTION AGENCY

    Mr. BASSO. Thank you very much, Madame Chairwoman.
    First of all, let me say that I appreciate the fact that the committee is holding this hearing and in fact focusing attention on this very important issue. The question of timeliness and accuracy of data goes to the fundamental integrity of Government. At the Department of Transportation, we take this very seriously.
    Let me begin by commending the committee. Last year when I appeared before you, I had the embarrassing task of saying that in over 7 years we had never attained any more than a disclaimer on our financial data. Certainly with the assistance and focus of this committee, for the first time the Department of Transportation has attained a clean opinion.
    Mrs. FOWLER. I want to commend you because I know that it was with herculean efforts on your part that caused that to occur. Thank you very much.
    Mr. BASSO. Thank you, Madame Chairwoman.
    And I must also very carefully commend the Inspector General's Office who did a superb job of assisting us in this, and also the Federal Aviation Administration, the Coast Guard—really it was one DOT effort.
    You mentioned legislation. I thought the one thing you might be able to do for me is legislate Mr. DeCarli's continued tenure.
    [Laughter.]
    Mr. BASSO. Let me just say that good financial data at program level and the departmental level is critical to make the best allocation decisions and to determine what things cost. DOT and the DOT Office of Inspector General have spent a considerable amount of time not only on financial data, but on program data trying to determine where our weaknesses are. Some examples of this where I think—you know, you have to air your laundry a little bit—is in the Coast Guard retirement system. We found inaccuracies, we made changes. We are able to estimate those costs better, we are able to provide a better budgetary resource.
    In property management in the FAA and the Maritime Administration, we didn't know what we had where. We have straightened that problem out and have reflected in our financial statements accurate records of that property.
    With regard to a few other areas that I think are important, we are revising and implementing a core financial system that will be state-of-the-art for the 21st century. The cost accounting system of the Federal Aviation Administration is coming on-line. It is crucial to data integrity in terms of our fundamental financial resources. I am reminded a little bit of Vince Lombardi's line about fundamentals needing to be practiced every day. Program data will not be fundamentally any better until our basic financial data is.
    With regard to the importance of all that, when you turn to the program perspective and the program data, I particularly mentioned the Motor Carrier Safety Program. Congress recently focused attention and passed legislation to considerably enhance the Motor Carrier Program. And I would be the first to acknowledge that we have work to do on that. That data—where lives are at stake—is critical. And we are taking the steps and requesting the resources to make those systems in fact show the kind of integrity they should and to move forward.
    Mrs. FOWLER. I want to interrupt you while I am thinking about it because I know Mr. DeCarli mentioned that you have statistics on numbers of accidents, but not what causes them. So is there a movement to get to what the causes of these accidents are?
    Mr. BASSO. Yes, there is. There clearly is. We are making considerable effort in that regard.
    The Bureau of Transportation Statistics—and one of the reasons I asked Dr. Sen to accompany me is that he is a critical player in assuring the Congress of the integrity of our data. The BTS is modelled after the Bureau of Labor Statistics and their critical introspective examination is crucial to us succeeding. We have engaged them substantially in that effort to try to come up here and be able to give you assurances that hold water.
    On the question of the Government Performance and Results Act, we did do a dry run last year, and much to my chagrin, I found that we had only about 60 percent of our data that would reflect accurate reporting.
    The good news is that having had a year and having done that kind of critical internal look, we are able to report, as the Attorney General suggested, confidence in 90 percent of the data that we are reporting in that report. And we are giving you a very clear and objective look at our weaknesses, what we need to do about it, and what we are doing about it.
    I think that is particularly important. I have been in this Government 36 years, and one of the things that I have found is that over time knives become dull if people don't in fact sharpen them periodically. And this kind of public introspection sharpens the knife and focuses the attention where it should be anyway. It helps considerably.
    The central purpose of collecting data is to do a better job of running programs. Program evaluation was an area that was mentioned by GAO as an area of concern. I have been concerned about this for many years.
    We have put together a vigorous program evaluation effort at the Department. In fact, we are just issuing now on our Hazardous Material Program a very hard-hitting report that included a coalition from the Inspector General's Office, the program offices, and my office. We are making real recommendations which the Secretary has ordered to be implemented for change that will make a difference in that regard.
    I would simply sum up by telling you, Madame Chairwoman, that I want to thank you, in particular, for putting the prod to us, because we needed it. I would like to commend you for holding this hearing this morning giving us an opportunity to show that while we are not perfect, we understand that, and we are looking for perfection. And lastly, we want to work with the Congress to deliver for the American people what they have a right to expect.
    Thank you.
    Mrs. FOWLER. Thank you, Mr. Basso. I appreciate all your hard work and what you have accomplished at the Department.
    Mr. Piatt, you may proceed.

    Mr. PIATT. Good morning, Madame Chairwoman.
    Today's hearing on program data quality provides GSA the welcome opportunity to testify on the high level of scrutiny and quality controls we demand for the data in our systems. In addition, we feel that our successful efforts over the last year to improve our data quality illustrate a broader effort to measure, evaluate, and improve key business functions throughout the agency.
    The data in GSA's systems are the key to almost every aspect of our business. Financial statements, customer billing, repairs and alterations, fleet management, building security, and many other important operations are directly impacted by the quality and integrity of our information. We understand that without complete and reliable information on all of our business lines it would be impossible for us to complete our mission.
    In GSA, we are constantly reminding ourselves that what you measure is what you get. Tracking and measurement are proven methods for improving performance, and GSA has brought this technique to our business lines with successful results.
    One important example of GSA's performance measure initiative is our data accuracy measurement for public buildings. Almost 3 years ago, the Public Buildings Service began implementing a new modern information technology system known as STAR, or System for Tracking and Administering Real Property. This modern system updated hardware and software and migrated legacy data onto the new platform. Initially, there were problems with the data quality of STAR, mostly coming from the fact that the quality of the data in the old system was not that good and as it migrated forward, it brought its problems with it.
    We focused a campaign on data accuracy that began in July 1999, and by September we had already seen a reduction in many of the errors by more than 95 percent. In October, we initiated another campaign to further improve the data.
    We understand, however, that measuring and analyzing our data accuracy is not the final solution, but that we have to continue working to ensure that the data maintains its quality over time as we move forward.
    We have begun mapping our processes, benchmarking with industry best practices, and instituting changes to further cleanse and ensure the quality of the data. For example, a contract for the support of the PBS Data Clean Up and Data Quality Assurance Initiative was awarded to Arthur Andersen last year, and a team of Andersen professionals has been deployed to all 11 PBS regions nationwide. Working with GSA personnel, this team will ensure that the accuracy of the data remains high by developing policies and procedures that will ensure that, once the data is cleaned and validated, it continues to be updated and maintained in a quality manner.
    While the STAR system is an important example of our work in this area, it is just one of many. Our critical Federal Supply Service systems, FSS–19 and the Federal Fleet Management system, have all data entered directly by users through a web-based front end and then verified through edits. This approach has provided reliability and accuracy in our Federal Supply data.
    Another critical system in GSA is the Building Security Committee Tracking System, which was referenced earlier. This system was implemented very rapidly after the bombing of the Alfred P. Murrah Building and was initially intended for scoping the budgets of the security improvements.
    In 1998, GSA became aware of inaccuracies in the system's data that prevented a full determination of the status of the security upgrades as they had been performed and of their final cost. In May 1999, we determined that a more user-friendly system was really the solution for this problem, and we began implementing one. It is now being tested in the Fort Worth office and we are having excellent results. We expect it to be implemented fully this summer. Already, by October of 1999, a GAO study found that an overwhelming majority of the system's inaccuracies had been corrected.
    In summary, GSA's data accuracy is a priority for our business, because we are a business, essentially. One indicator of our success in these efforts has been the 12 straight years of unqualified annual opinions from an outside accounting firm. This year we were one of only twelve Federal agencies to receive such an opinion.
    As we move from older legacy systems to new, modern, source-driven information systems, we also see more customers and internal personnel performing analyses and making decisions based on verified and timely information. We feel that Congress, our customers, and the taxpayer can rely on GSA's data and its systems.
    This concludes my opening statement and I will be glad to answer any questions.
    Mrs. FOWLER. Thank you.
    They have called a vote, and I gather our bells are not working up there, but they tell me I have about 3 minutes to run over. So I am going to vote and I will be right back.
    The subcommittee is recessed for a few minutes. Thank you.
    [Recess.]
    Mrs. FOWLER. The hearing will come back to order. I apologize. They say we won't have a vote for at least another hour, so maybe we can continue uninterrupted unless someone unexpectedly calls one.
    Ms. Schneider, I think you are up next.

    Ms. SCHNEIDER. Thank you, Madame Chairwoman.
    I want to thank the subcommittee for the opportunity to appear today and discuss data quality at EPA.
    Quality assurance and data management are a top priority for EPA. That is why just this past October, as Nikki mentioned, our agency finalized a major reorganization aimed at consolidating and enhancing EPA's management of environmental information. This reorganization brings together in one place various functions related to the collection, management, and use of information.
    To further support our commitment to enhancing quality information, we are in the process of directing more of our resources to provide better guidance and oversight of data integrity and quality issues. We regard the quality of our data as a vital public trust. EPA, as has been mentioned, manages an enormous amount of data and is committed to being as open as possible in its management and use. We want to ensure confidence in every number we use.
    The new Office of Environmental Information is building on a number of reinforcing systems which were already in place to ensure and maintain the quality of this data. This includes a quality staff which reports directly to me and the Assistant Administrator, as well as a newly formed cross-agency team of senior managers. This group is responsible for implementing and providing oversight of the Quality Management System. That system is the framework that helps ensure that data are of appropriate type, quantity, and quality to support our programs and decisions. Our guiding principles also include more peer review by independent scientific and technical experts in addition to the public notice and comment processes.
    As you have noted, everyone today faces the challenges of the information age, and EPA is addressing these new challenges by implementing a series of new activities that we believe will reduce the error rate in our data, increase the completeness of reporting, and place the necessary contextual data around our data.
    One of the most important of these is the Information Integration Initiative, which Carol Browner recently announced. This initiative will support an increased data exchange network with our State and tribal partners and other information stakeholders, allowing for the collection and management of more consistent and accurate data.
    As it has been noted, EPA and our State and tribal partners collect large amounts of data under various statutory and regulatory authorities, and integrating this data is one of our biggest challenges. But we are working jointly with the States to ensure that the data we jointly collect and use is accurate. In particular, we are working hard to develop data standards and better collection approaches.
    For example, this year we will begin building and operating a Facility Registry System, which is a single master file with verified identification information about each facility that we jointly regulate. This is a key component of the integrated system, and we are confident it will provide more accurate integration across the data systems.
    This summer we will also be testing a new central receiving facility. The central receiving facility will allow us to offer electronic updating, which we hope will greatly improve accuracy and reduce burden. It will also allow us to implement error detection and protection protocols as we receive data. We are also working to develop a web-based error correction tracking system, which we hope will be a single place where individuals and facilities can report errors in our national systems, and which will allow us to track those errors and get them corrected.
    All of these initiatives offer promise in reducing the errors in data transfer and in detecting errors in information provided to the agency. In addition, we are actively pursuing new technologies that will allow for shared access to data, rather than the current copying and transmission of data.
    In conclusion, EPA takes the issue and challenges of providing accurate, complete, and timely data very seriously. With the creation of the Office of Environmental Information, EPA staff has a new sense of commitment to quality and data stewardship. This is a continuing challenge, but we are completely committed to meeting it.
    Thank you.
    Mrs. FOWLER. Thank you, Ms. Schneider.
    I have three questions that I would like each one of the witnesses to answer and I will just cover all three and then you can include them in your answer.
    Approximately how many databases are maintained at each of your respective agencies?
    Do you have an estimate of the error rate in all, or at least the most critical, of these databases?
    And for each database, or at least the most critical ones, have you made a written assessment of how accurate the data needs to be?
    Mr. Basso, I will start off with you.
    Mr. BASSO. Thank you, Madame Chairwoman.
    We have about 427 systems in the Department, or databases that supply information on everything from the Air Traffic Control System to the financial data of the Department.
    With regard to the question of accuracy, we have clearly made some assessments. In fact, when you see our performance report that comes out at the end of the month, you will see reflected in there the assessment of the good, the bad, and the things that need to be fixed.
    As to a percentage, I can't give you a really accurate one, but I think it is in the lower order of percentages.
    I might ask Dr. Sen if he has anything to add to that.
    Mr. SEN. Madame Chairwoman, before I attempt to answer Jack's question, let me just compliment you on having this hearing. You have made my day and perhaps my year. This is an area of great professional interest to me.
    Mrs. FOWLER. Some people glaze over at the words ''data quality,'' but they are really important.
    Mr. SEN. Thank you, Madame Chairwoman.
    We have different levels of understanding of the different data sets. I couldn't give you an exact number, but the ones that feed into GPRA—I think we have reasonable confidence in those—and that is about 50 of them. There are perhaps another 50 of which we have some understanding.
    To make a short answer long, if you will permit me, the difficulty comes because most books on data quality assessment will say that you understand data quality by understanding the data collection process. And the process is so decentralized and fragmented that it becomes extremely expensive to really get your arms completely around it.
    So we create some global measures, compare year to year, and that sort of thing. But to get the kind of intricate understanding you're talking about, Madame Chairwoman, is something that we are just beginning to scratch the surface on.
    Mrs. FOWLER. Thank you. I hope you keep scratching it.
    Mr. BASSO. On written assessments, yes, we are making them—they really come in two forms. One is those assessments made and reduced to writing in the performance report; the Inspector General has also done a lot of work in that area. This big book I have in front of me is that work, which I must tell you I have read, and I actually see some areas where we need to sharpen up our focus.
    Mrs. FOWLER. Thank you.
    Mr. Piatt?
    Mr. PIATT. We have 30-some-odd major database systems. I categorize those by saying each is either an Oracle database or a Sybase kind of database. There are literally hundreds of desktop systems that people have thrown together on their own, either for personal productivity or for work groups. So usually when we are talking about quality of the data and the databases, we are talking about one of these major enterprise systems, even though there are literally hundreds of others.
    When we have inaccuracies in the databases, they usually manifest themselves pretty quickly in a billing error or project problem. So on one level, we haven't undertaken to validate data for data's sake, but rather we do have a series of programs underway to validate our systems to make sure that our billings are accurate, that we are neither overcharging nor undercharging, and that our projects are on time.
    So in summary, I would say that it's not part of an agency-wide effort to make sure that the data itself is correct. Rather, it is an agency-wide effort to make sure that our business operations are accurate.
    Mrs. FOWLER. So you are making written assessments on some of these critical systems as to how accurate the data needs to be?
    Mr. PIATT. We are reviewing them for their accuracy. On one level, the answer is yes. On another level, I wouldn't want you to think that we have a big pile of reports we are doing on data quality itself. But we are working system by system to make sure that what is in each system is accurate. Part of what we are doing also is that where we recognize inaccuracies are due to human factors, because the system was hard to use, we are correcting the data input process to make it easier for people to keep the information up to date.
    Mrs. FOWLER. I think the gist of the question was, if you put in writing how accurate you need the data to be, then you have a standard against which to judge it when you review it. Otherwise, it becomes somewhat subjective as you go from one to the other.
    Mr. PIATT. For us, the accuracy is manifest in the billing. When we see an inaccurate bill or a customer brings that back to us, or we catch that in an audit, it is clear that there is a data quality issue.
    And we actually have quite a few programs underway right now that go way beyond what I have mentioned.
    Mrs. FOWLER. Thank you.
    Ms. Schneider?
    Ms. SCHNEIDER. Thank you.
    I think I would say the same thing about systems. As part of our Y2K analysis we defined our mission-critical systems, of which 15 I would categorize as administrative systems, such as finance systems, and 35 are programmatic. But as at other agencies, there are thousands of other systems that people develop that are probably not mission-critical.
    I am not aware that we have gone through and assessed the accuracy rates in each of these systems. Obviously, as the IG noted, I have two notebooks sitting back there full of things that indicate that there are problems with some of our databases. Most of those IG reports have been responded to by the programs, which have measures in place to enhance the data quality.
    On the accuracy issue, all our systems and data collections should have quality management plans that define, for the primary intended use, the desired accuracy of the data. Those are in place and we do some level of assessment as to how well those plans are being followed.
    I think the bigger issue that arises at EPA, with the high demand for environmental information, is the broad use of the data. That is why I mentioned context, which we think is so important. We will be working with the program offices to ensure that data sets carry the important contextual information with them, so that it is not lost as people use the data.
    Mrs. FOWLER. Thank you.
    Ms. Schneider, if I could come back and focus on EPA for a minute, as I commented to Ms. Tinsley, there seems to be a long history—before you came—of data problems. We have a chart here that shows some of these problems.
    As you can see, this is an exhibit on the history of some of the EPA data problems. In most of these cases, if you will note through there, the agency responded by creating a new task force, steering committee, or some new initiative. But as you keep going down—and this goes from 1980 on forward—the problems never seem to go away.
    So the question is, How is the new office going to be successful? How is it different? How is it going to be successful when each one of these previous ones has failed? We want it to be successful, but we are concerned by looking at the history. What has been put in there? What safeguards? What procedures to make sure that this new office can be successful in assuring the accuracy and quality of the data?
    Ms. SCHNEIDER. Let me start by talking about the timing of the creation of the new office. And I think that is going to affect how successful we are in a lot of ways. I think all the EPA senior managers understand how critically important information is and how broadly it is now being disseminated. They recognize that change, and therefore the kinds of management issues—whether it is information security, information integrity, information availability—these are all now very important issues to all of the senior managers. And they are working very cooperatively with us.
    I think I can break things down very easily into three pieces. One piece deals with lab issues, which Nikki raised. An example of the kinds of things this new office will do is that we are going to accelerate the reviews that Nikki talked about in our regional laboratories, to make sure that both the management and technical audits are done on those laboratories. We should have an action plan for that within the next couple of months.
    The other fundamental area relates to the EPA-State partnership. I think that people have come to recognize, with the advances in technology, that we need to take a fundamentally new approach to this data exchange. As I mentioned, we have a number of things in place that I am really hoping will fundamentally improve the quality of this exchange: in particular, data standards, so we can talk a common language and fill in common data elements about the information, and the use of modern technology in the transmission of and access to that data, so that we don't have errors arise because of transmission or data entry.
    The States are poised to work closely with us. We spent 2 days with them just last week discussing this. You may be aware that the President's budget includes a request for $30 million, which will help support this effort.
    The last thing I would say is in the area of the data that supports our rulemaking and regulatory actions. In the last 5 years, we have done so much more with peer review, scientific review, and making sure we are using multiple data sources to support our decisionmaking. We feel comfortable with the data that supports the rulemaking.
    Mrs. FOWLER. Who at EPA decides what information is worth having? Is it your office? Who makes that decision?
    Ms. SCHNEIDER. It is the program offices. Increasingly, we are hoping that we will do this more on an agency basis. As you are aware, a number of the systems we currently have are what we call legacy systems, which have been developed over the last 20-plus years. We are really in a position—particularly with GPRA—of needing to assess what data we have coming in and what data we actually need to be able to talk about outcomes and to anticipate future environmental problems.
    We also believe this is an area that we should be getting third-party advice on. And I recently talked to one of our major advisory committees and asked them for help in this area to define data gaps and help us link our GPRA planning with our environmental and information planning.
    Mrs. FOWLER. We will be interested in following this. Looking at the chart, I notice some very well-meaning people there who really had good intentions. But it was never followed through. I guess my worry would be that I don't want to throw good money after bad. If we're talking about another $30 million following the millions that have already been spent but without results, I think we are going to have to look at how to get assurances that we will get the results that we need.
    I know a lot of this occurred before your time, but I hope you can be a miracle worker and bring this together. And I do think there is going to have to be something done with the States because you do rely a lot on information from the States. If you are not getting accurate data from certain States or certain programs in the States, there must be some sort of consequences. They must know that they will be held responsible and that they can't just keep providing unreliable information. If the information isn't accurate, then what will be the consequences?
    This is never going to work unless you can get a handle on all those who supply data to you.
    So I think that is part of your unique problem in that you deal not just with data originated at EPA but all around the country. But somehow there must be some incentives built in to achieve accurate data.
    Ms. SCHNEIDER. I think the States are increasingly also feeling the pressure of the public wanting to see their data. That is often a good motivator.
    Mrs. FOWLER. The sunshine always helps. Definitely.
    Mr. Basso, I have a similar question for you.
    I gather the Deputy Secretary has established this Committee on Transportation Statistics that is supposed to improve data quality.
    Can you give me some specifics as to what makes you confident this is going to succeed?
    Mr. BASSO. Madame Chairwoman, I think I can give you three points that we feel will assure its success.
    We have performance agreements between the senior leadership of the Department and the Secretary, and they are very explicit. Certainly program quality and data accuracy are part of that whole effort. We are cascading that down to the career leadership in the Department. So accountability has been and is being instilled.
    Secondly, you mentioned the creation of this committee. The Deputy Secretary, Mort Downey, is a unique individual. Believe me, he has a photographic memory and is a person who understands Federal, State, and local issues and who consistently reminds me what I am supposed to do about these things. So we have real leadership at the top. And this committee is comprised of people such as Dr. Sen and other experts in the Department who actually understand the nuances of these systems and how to make them better.
    And lastly, in our budgets past and present, we are requesting the necessary funding to be able to make changes. Legacy systems are a problem. The Federal Highway Administration, for example—the principal grant system we had coming in was created in the 1970's. It is based on 20-year-old technology. It is a system we are replacing and will have replaced in a year. That is crucial to gathering State data because that is how it comes in. So making those kinds of improvements I think will make a lot of difference.
    I would make this commitment as well. I think hearings like this one are very useful. Whether it is me or whoever succeeds me, or whatever—they are going to have to come back here and see if I made good on the promises.
    Mrs. FOWLER. So far you have made good on them.
    I think those are definitely steps in the right direction. I know our committee members couldn't come back so I know they will have some questions they would like to submit for the record and I have a few others, too, that I would like to submit.
    In closing, a few weeks ago a corrupted file in an FAA computer brought air traffic in the whole northeastern United States to a halt. Yesterday, as I mentioned earlier, GAO concluded that EPA water quality data that had been used to allocate billions of dollars in spending was not accurate.
    As we have shown today, it is time that we take a hard look at just how good our Federal information is and start cleaning it up, which is what you are already starting to try to do and which we all need to work together on before the next misplaced decimal or missing data field victimizes some of us again.
    So as a start, our subcommittee staff is going to select, at random, four databases at each of the agencies represented here today. I intend to ask each of the officials here today to examine the accuracy of those databases and report back to this subcommittee. I think that is a good way to take a random look and see what you come up with. We are all concerned and we are all working toward this. I understand you all have tremendous problems getting your arms around this.
    As you said, Mr. Basso, every now and then we need to sharpen up the knife. But my idea is to do that in a positive manner. As you all know, I don't try to sensationalize this at all. I think we are all here to make Government better, more responsible, and to make sure that we are serving the citizens of this country in the best way we can. I know that is what we all want.
    Working together, we can do that. So the subcommittee staff and I look forward to working with you on this and we will send some additional follow-up questions.
    I want to thank each of you for being here today and for your testimony and working with us.
    This subcommittee meeting is adjourned. Thank you.
    [Whereupon, at 11:45 a.m., the subcommittee was adjourned.]