Results for "data cleaning"

Case studies & examples

Data Standdown: Interrupting process to fix information

Although not a true pause in operations, ONR’s data standdown made data quality and data consolidation the top priority for the entire organization. It aimed to establish an automated and repeatable solution to enable a more holistic view of ONR investments and activities, and to increase transparency and effectiveness throughout its mission support functions. In addition, it demonstrated that getting top-level buy-in from management to prioritize data can truly advance a more data-driven culture.


Office of Naval Research


data governance, data cleaning, process redesign, Federal Data Strategy

Leveraging AI for Business Process Automation at NIH

The National Institute of General Medical Sciences (NIGMS), one of the twenty-seven institutes and centers at the NIH, recently deployed Natural Language Processing (NLP) and Machine Learning (ML) to automate the process by which it receives and internally refers grant applications. This new approach ensures efficient and consistent grant application referral, and liberates Program Managers from the labor-intensive and monotonous referral process.


National Institutes of Health


standards, data cleaning, process redesign, AI


FDS Proof Point

Pairing Government Data with Private-Sector Ingenuity to Take on Unwanted Calls

The Federal Trade Commission (FTC) releases data from millions of consumer complaints about unwanted calls to help fuel a myriad of private-sector solutions to tackle the problem. The FTC’s work serves as an example of how agencies can work with the private sector to encourage the innovative use of government data toward solutions that benefit the public.


Federal Trade Commission


data cleaning, Federal Data Strategy, open data, data sharing

Supercharging Data through Validation as a Service

USDA's Food and Nutrition Service restructured its approach to data validation at the state level using an open-source, API-based validation service managed at the federal level.


Department of Agriculture


data cleaning, data validation, API, data sharing, process redesign, Federal Data Strategy
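As a rough illustration of what a centrally managed, rule-based validation service might check, the sketch below validates a batch of records against a shared rule set and returns per-row errors. The field names (`state`, `household_size`), the rules, and the function names are hypothetical and are not taken from the USDA service.

```python
from typing import Any, Callable

# Hypothetical shared rules: field name -> (predicate, error message).
# A central team could maintain these once so every state submits
# against the same definitions.
RULES: dict[str, tuple[Callable[[Any], bool], str]] = {
    "state": (
        lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha() and v.isupper(),
        "must be a two-letter uppercase code",
    ),
    "household_size": (
        lambda v: isinstance(v, int) and not isinstance(v, bool) and v >= 1,
        "must be an integer of at least 1",
    ),
}


def validate_record(record: dict[str, Any]) -> list[dict[str, str]]:
    """Return a list of field-level errors for one record (empty if valid)."""
    errors = []
    for field, (is_valid, message) in RULES.items():
        if field not in record:
            errors.append({"field": field, "error": "is required"})
        elif not is_valid(record[field]):
            errors.append({"field": field, "error": message})
    return errors


def validate_batch(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Validate every record and report the result by row index."""
    results = []
    for index, record in enumerate(records):
        errors = validate_record(record)
        results.append({"row": index, "valid": not errors, "errors": errors})
    return results
```

Keeping the rules in one table means a rule change is made in a single place rather than in each submitter's own tooling.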


Data tools


Tabula is a tool for liberating data tables locked inside PDF files.




data cleaning


qu is an open-source data platform created to serve the public data sets of the Consumer Financial Protection Bureau. The goals of this platform are to import data in a Google-Dataset-inspired format, query data using a Socrata-Open-Data-API-inspired API, and export data in JSON or CSV format.


Consumer Financial Protection Bureau


open data, API, data cleaning, data analysis
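To make the "export data in JSON or CSV format" goal concrete, here is a minimal sketch of serializing the same query result either way using only the Python standard library. It is not qu's actual code or API; the function name `export_rows` and the sample row are assumptions for illustration.

```python
import csv
import io
import json


def export_rows(rows: list[dict], fmt: str) -> str:
    """Serialize a list of row dictionaries as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        # Column order follows the keys of the first row (assumes all
        # rows share the same keys).
        fieldnames = list(rows[0].keys()) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")


# Hypothetical sample row, not real CFPB data.
sample = [{"a": 1, "b": "x"}]
csv_text = export_rows(sample, "csv")
json_text = export_rows(sample, "json")
```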


Skills development

Data Scientist Titling Guidance

This memo provides titling guidance to agency Human Resources Offices for use in classifying data science positions within agencies.
