Data Federation

Overview and objectives

The U.S. Data Federation project promotes government-wide capacity-building to address distributed data management challenges and to support data interoperability and broader data standards activities. The project began as an initiative of the GSA Technology Transformation Services (TTS) 10x program, which funds technology-focused ideas from federal employees with the aim of improving the experience all people have with our government.

The goal of the U.S. Data Federation project is to document repeatable processes, develop reusable tooling, and curate resources to support data managers and practitioners across government. The best practices and resources are intended to include guides and repeatable processes around data governance, organizational coordination, and standards development in federated environments. The reusable tools are intended to include capabilities around data validation, automated aggregation, and the development and documentation of data specifications.

Background

The U.S. Data Federation project was originally conceived in 2016 to support coordination and collaboration across the disparate, often complex organizational boundaries inherent in our distributed system of government.

In particular, the project focused on supporting so-called federated data efforts: efforts in which a common type of data is collected or exchanged across complex, disparate organizational boundaries. This happens, for example, when data are collected, aggregated, and shared among federal, state, and local government entities. These data may be used to support policy or budget decisions or to increase operational efficiencies, or they may be published in aggregate form for other data users. Federated data efforts were and are increasingly seen as an engine for transparency, economic growth, and accountability, and are becoming ever more common. But although many of these federated data efforts face common requirements and common challenges, they largely lack common resources.

Therefore, the U.S. Data Federation was pitched and awarded funding to address this gap by beginning to catalog existing initiatives, documenting ways to participate, assessing their maturity and scale of implementation, and packaging reusable components for a successful data federation strategy.

Project history

10x projects proceed through a maximum of four phases of funding. At the end of each phase, the project is reevaluated based on the progress made and the recommendation of the project team. Goals and deliverables for each phase were defined based on the lessons learned in the previous phase.

In Phase 1, the team interviewed a variety of distributed data management projects and synthesized findings in a Data Federation Framework. Their findings suggested that reusable tooling and processes would benefit future federated data efforts.

In Phase 2, the team partnered with the USDA Food & Nutrition Service and built a functional prototype of a reusable data validation tool that allows users to submit data via a web interface or API to be validated against a set of customizable rules in real time.
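The idea of validating submitted data against a customizable set of rules can be illustrated with a minimal sketch. This is not the prototype's actual code or API: the `Rule` class, the `validate` function, and the field names `state` and `meals_served` are all hypothetical, chosen only to show how rules can be kept separate from the validation logic so that they can be swapped per dataset.

```python
from dataclasses import dataclass
from typing import Callable

# A row is one submitted record, e.g. one line of an uploaded CSV file.
Row = dict[str, str]


@dataclass
class Rule:
    """A single customizable rule (hypothetical structure, for illustration)."""
    code: str
    message: str
    check: Callable[[Row], bool]  # returns True when the row passes


def validate(rows: list[Row], rules: list[Rule]) -> list[dict]:
    """Apply every rule to every row and collect the failures."""
    errors = []
    for index, row in enumerate(rows, start=1):
        for rule in rules:
            if not rule.check(row):
                errors.append(
                    {"row": index, "code": rule.code, "message": rule.message}
                )
    return errors


# Rules live in data, not in the validator, so each program can define its own.
# The field names below are assumptions made up for this example.
RULES = [
    Rule(
        "missing_state",
        "state is required",
        lambda r: bool(r.get("state", "").strip()),
    ),
    Rule(
        "bad_count",
        "meals_served must be a non-negative integer",
        lambda r: r.get("meals_served", "").isdigit(),
    ),
]

rows = [
    {"state": "VT", "meals_served": "120"},
    {"state": "", "meals_served": "-5"},
]

for error in validate(rows, RULES):
    print(error)
```

In this sketch the first row passes both rules and the second row fails both. A web interface or API would wrap a function like `validate` and return the collected errors to the submitter.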

In Phase 3, the team continued to develop ReVal (Reusable Validation Library) with partners at USDA and validated its reusability with partners at the U.S. Census Bureau and U.S. Department of Transportation. The 2019 pilot of the FNS Data Validation Service (DVS), which uses ReVal to streamline data validation for the National School Lunch and Breakfast Program, yielded time savings, reduced stress, and greater efficiency for the pilot states.

In Phase 4, recognizing the need for a central, accessible platform to host reusable tools like ReVal, the team took advantage of a unique opportunity to unite government-wide work supporting open data and federated data efforts. Aligning its work with the mandates from the Evidence Act and Federal Data Strategy to build an online resource repository, the team supported Data.gov, OMB, and OGIS stakeholders by conducting user research, prototyping content curation and development processes, and implementing enhanced functionality for resources.data.gov.

In addition, the team continued to develop new resources, including a set of template charters for data governance bodies, curated a backlog of potential future content for resources.data.gov, and built relationships with the CDO Council leadership, data-related communities of practice, and data managers and practitioners across agencies.

A comprehensive project overview, complete with team status reports and artifacts created over the course of the four phases, is available in the project’s GitHub repository.

The future of the Data Federation

Currently housed within the TTS Data & Analytics Portfolio, the Data Federation continues to work toward the curation, creation, and dissemination of resources that address distributed data management challenges and support data interoperability and broader data standards activities. The team supports the continuous development of resources.data.gov as a platform for this content.