Although not a true pause in operations, ONR’s data standdown made data quality and data consolidation the top priority for the entire organization. It aimed to establish an automated and repeatable solution to enable a more holistic view of ONR investments and activities, and to increase transparency and effectiveness throughout its mission support functions. In addition, it demonstrated that getting top-level buy-in from management to prioritize data can truly advance a more data-driven culture.
Bloomberg Government analysts put together a prototype through the Census Bureau’s Opportunity Project to better assess where volunteers should direct litter-clearing efforts. Using Census Bureau and Forest Service information, the team brought a data-driven approach to their work. Their experience reveals how individuals with data expertise can identify a real-world problem that data can help solve, navigate across agencies to find and obtain the most useful data, and work within resource constraints to provide a tool to help address the problem.
Purchase-to-Plate Crosswalk (PPC) links the more than 359,000 food products in a comercial company database to several thousand foods in a series of USDA nutrition databases. By linking existing data resources, USDA was able to enrich and expand the analysis capabilities of both datasets. Since there were no common identifiers between the two data structures, the team used probabilistic and semantic methods to reduce the manual effort required to link the data.
A recent collaboration between the Bureau of Economic Analysis (BEA) and the Bureau of Labor Statistics (BLS) helps shed light on the segment of the American workforce employed by foreign multinational companies. This case study shows the opportunities of cross-agency data collaboration, as well as some of the challenges of using big data and administrative data in the federal government.
Bureau of Economic Analysis / Bureau of Labor Statistics
NASA’s data scientists and research content managers recently built an automated tagging system using machine learning and natural language processing. This system serves as an example of how other agencies can use their own unstructured data to improve information accessibility and promote data reuse.
The National Institute of General Medical Sciences (NIGMS), one of the twenty-seven institutes and centers at the NIH, recently deployed Natural Language Processing (NLP) and Machine Learning (ML) to automate the process by which it receives and internally refers grant applications. This new approach ensures efficient and consistent grant application referral, and liberates Program Managers from the labor-intensive and monotonous referral process.
The Federal Trade Commission (FTC) releases data from millions of consumer complaints about unwanted calls to help fuel a myriad of private-sector solutions to tackle the problem. The FTC’s work serves as an example of how agencies can work with the private sector to encourage the innovative use of government data toward solutions that benefit the public.
Through its Enterprise Learning Agenda, Small Business Administration’s (SBA) staff identify essential research questions, a plan to answer them, and how data held outside the agency can help provide further insights. Other agencies can learn from the innovative ways SBA identifies data to answer agency strategic questions and adopt those aspects that work for their own needs.
The Census Bureau team produced a new interactive mapping tool in early 2018 called the Response Outreach Area Mapper (ROAM), an application that resulted in wider use of authoritative Census Bureau data, not only to improve the Census Bureau’s own operational efficiency, but also for use by tribal, state, and local governments, national and local partners, and other community groups. Other agency data practitioners can learn from the Census Bureau team’s experience communicating technical needs to non-technical executives, building analysis tools with widely-used software, and integrating efforts with stakeholders and users.
The Centers for Medicare & Medicaid Services’ Office of Minority Health (CMS OMH) Mapping Medicare Disparities Tool harnessed the power of millions of data records while protecting the privacy of individuals, creating an easy-to-use tool to better understand health disparities.
The Veterans Legacy Memorial (VLM) is a digital platform to help families, survivors, and fellow veterans to take a leading role in honoring their beloved veteran. Built on millions of existing National Cemetery Administration (NCA) records in a 25-year-old database, VLM is a powerful example of an agency harnessing the potential of a legacy system to provide a modernized service that better serves the public.
It is a well-kept secret that the U.S. Geological Survey and the U.S. Census Bureau were the original two federal agencies to build the first national digital database of roads and boundaries in the United States. The agencies joined forces to develop homegrown computer software and state of the art technologies to convert existing USGS topographic maps of the nation to the points, lines, and polygons that fueled early GIS. Today, the USGS and Census Bureau have a longstanding goal to leverage and use roads and authoritative boundary datasets.