Purchase-to-Plate Crosswalk (PPC) links the more than 359,000 food products in a comercial company database to several thousand foods in a series of USDA nutrition databases. By linking existing data resources, USDA was able to enrich and expand the analysis capabilities of both datasets. Since there were no common identifiers between the two data structures, the team used probabilistic and semantic methods to reduce the manual effort required to link the data.
A recent collaboration between the Bureau of Economic Analysis (BEA) and the Bureau of Labor Statistics (BLS) helps shed light on the segment of the American workforce employed by foreign multinational companies. This case study shows the opportunities of cross-agency data collaboration, as well as some of the challenges of using big data and administrative data in the federal government.
Bureau of Economic Analysis / Bureau of Labor Statistics
NASA’s data scientists and research content managers recently built an automated tagging system using machine learning and natural language processing. This system serves as an example of how other agencies can use their own unstructured data to improve information accessibility and promote data reuse.
The Department of Homeland Security (DHS) experience forming the Data Stewardship Tactical Working Group (DSTWG) provides meaningful insights for those who want to address data-related challenges collaboratively and successfully in their own agencies.
Through its Enterprise Learning Agenda, Small Business Administration’s (SBA) staff identify essential research questions, a plan to answer them, and how data held outside the agency can help provide further insights. Other agencies can learn from the innovative ways SBA identifies data to answer agency strategic questions and adopt those aspects that work for their own needs.
The Centers for Medicare & Medicaid Services’ Office of Minority Health (CMS OMH) Mapping Medicare Disparities Tool harnessed the power of millions of data records while protecting the privacy of individuals, creating an easy-to-use tool to better understand health disparities.