The Department of Energy’s National Nuclear Security Administration (NNSA) adopted a data-driven, risk-informed strategy to better assess risks, prioritize investments, and cost-effectively modernize its aging nuclear infrastructure. NNSA’s new strategy, and the lessons learned during its implementation, can help inform other federal data practitioners’ efforts to maintain facility-level information while enabling accurate and timely enterprise-wide infrastructure analysis.
Although not a true pause in operations, the Office of Naval Research’s (ONR) data standdown made data quality and data consolidation the top priority for the entire organization. It aimed to establish an automated and repeatable solution to enable a more holistic view of ONR investments and activities, and to increase transparency and effectiveness throughout its mission support functions. In addition, it demonstrated that getting top-level buy-in from management to prioritize data can truly advance a more data-driven culture.
The Purchase-to-Plate Crosswalk (PPC) links the more than 359,000 food products in a commercial company database to several thousand foods in a series of USDA nutrition databases. By linking existing data resources, USDA was able to enrich and expand the analysis capabilities of both datasets. Since there were no common identifiers between the two data structures, the team used probabilistic and semantic methods to reduce the manual effort required to link the data.
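As a rough illustration of linking records that share no common identifier, the sketch below matches product descriptions to reference food names by string similarity. It is not the USDA team's method, which the summary does not describe in detail: it substitutes a simple character-level similarity score from Python's standard library for the probabilistic and semantic techniques mentioned above. The function names, the sample descriptions, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    cleaned = "".join(c if c.isalnum() else " " for c in name.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_products(products, reference_foods, threshold=0.6):
    """Map each product to its best-scoring reference food.

    Products whose best score falls below the (hypothetical) threshold
    are mapped to None so they can be routed to manual review.
    """
    links = {}
    for product in products:
        best = max(reference_foods, key=lambda food: similarity(product, food))
        score = similarity(product, best)
        links[product] = (best if score >= threshold else None, score)
    return links


# Hypothetical sample data, invented for this sketch.
products = ["MILK, WHOLE 1 GAL", "Cheddar Cheese Shredded", "Motor Oil 5W30"]
reference_foods = ["Milk, whole", "Cheese, cheddar", "Bread, white"]

links = link_products(products, reference_foods)
for product, (match, score) in links.items():
    print(f"{product!r} -> {match!r} (score {score:.2f})")
```

In a sketch like this, the threshold controls how much work is left for people: confident matches are accepted automatically, and only the low-scoring remainder needs manual review.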
A recent collaboration between the Bureau of Economic Analysis (BEA) and the Bureau of Labor Statistics (BLS) helps shed light on the segment of the American workforce employed by foreign multinational companies. This case study shows the opportunities of cross-agency data collaboration, as well as some of the challenges of using big data and administrative data in the federal government.
Bureau of Economic Analysis / Bureau of Labor Statistics
NASA’s data scientists and research content managers recently built an automated tagging system using machine learning and natural language processing. This system serves as an example of how other agencies can use their own unstructured data to improve information accessibility and promote data reuse.
The National Institute of General Medical Sciences (NIGMS), one of the twenty-seven institutes and centers at the NIH, recently deployed Natural Language Processing (NLP) and Machine Learning (ML) to automate the process by which it receives and internally refers grant applications. This new approach ensures efficient and consistent grant application referral, and liberates Program Managers from the labor-intensive and monotonous referral process.
Through its Enterprise Learning Agenda, staff at the Small Business Administration (SBA) identify essential research questions, develop a plan to answer them, and determine how data held outside the agency can provide further insights. Other agencies can learn from the innovative ways SBA identifies data to answer agency strategic questions and adopt those aspects that work for their own needs.