…is the title of a first-of-its-kind report detailing the vulnerabilities of our federal statistics system. I was hooked by the first sentence: “Federal statistics are essential U.S. infrastructure.” And like our bridges, power grids, and ports, governments have a responsibility to assess the strengths and weaknesses of their data systems, which are used every day to make critical decisions across the economy, public health, national defense, education, and other areas that affect quality of life.
“Americans are increasingly saturated with information from many sources, both credible and not. Federal statistical data can be an important tool in fighting disinformation and misuse of AI and other information dissemination technologies.” The Nation’s Data at Risk
How to assess federal data systems
The report measures risks to the U.S. statistics ecosystem from multiple angles by asking:
Does the agency consistently produce quality data? That is, does it produce relevant, timely, credible, accurate, and objective statistics (trusted, quality statistics)?
Is the agency trustworthy and accountable?
Does the agency have sufficient support in three key areas - professional autonomy, parent-agency support, and sufficient budget and skilled staff?
What are the challenges and threats the agency faces and their magnitude and potential consequences?
Is the agency agile? What are its innovation record and its opportunities to respond to future data needs?
Is the agency responsive to user needs and transparent about its data products and decisions that affect users?
As with many government services, there is often an assumption that once legislation passes and a new program is established, it will continue to run autonomously into the future. In practice, the needs and behaviors of those who use those services can change quickly, making the initial program design obsolete. The same is true for data services.
One well-documented example of this is the decades-long decline in responses to government surveys, which continue to be an essential way we collect information about the nation. Among many examples, the authors showed data on declining response rates to three federal household surveys, including the Current Population Survey (CPS), which provides timely, detailed data about the U.S. labor force.1
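One reason falling response rates matter is simple sampling arithmetic: even before considering nonresponse bias, the standard error of a survey estimate grows as the number of completed interviews shrinks. A minimal sketch under simple-random-sampling assumptions (the sample sizes and the 4% rate below are invented for illustration, not actual CPS figures):

```python
import math

def se_of_proportion(p, n):
    """Standard error of an estimated proportion p from n completed
    interviews, assuming simple random sampling (real surveys like the
    CPS use more complex designs, but the 1/sqrt(n) scaling still holds)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.04  # an unemployment-like rate of 4% (illustrative)
for n in (60000, 45000, 30000):  # invented counts of completed interviews
    print(f"n={n}: SE = {se_of_proportion(p, n) * 100:.3f} percentage points")
```

Halving the completed sample inflates the standard error by about 40%, which is why agencies treat response-rate declines as a direct threat to data quality, not just a logistical nuisance.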
Kevin Rinz described the importance of the Current Population Survey in providing timely data on the labor force, especially in moments that require rapid response from government.
While some labor market indicators can be produced from multiple data sources, several important measures can only be produced in a timely fashion using CPS data. Timeliness is especially important when labor market conditions are changing rapidly or highly uncertain, such as during the Covid-19 pandemic (production lags and lower frequency make even comprehensive data like that provided by the Quarterly Census of Employment and Wages less useful in this context). As the labor market recovered from its initial contraction, the CPS was the only reliable data source that could capture both economic and health-related shifts in workers’ behavior, and getting updates every month was invaluable. That reliability was and remains especially valuable given the challenges other labor market indicators face in providing dependable estimates - for example, unemployment insurance claims (fluctuations driven by administrative rather than economic causes) and data from the Job Openings and Labor Turnover Survey (a large decline in the response rate).
Another persistent challenge among data publishers is that what you measure shifts over time. For instance, the American Prospect recently profiled how companies are turning to algorithms to set prices on things like food, groceries, and other goods purchased online. If American consumers increasingly pay personalized prices, what does that mean for something like the Consumer Price Index (CPI) and its ability to measure inflation?
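The measurement problem can be illustrated with a toy fixed-basket (Laspeyres-style) index - the CPI's actual methodology is far more complex, and every item, quantity, and price below is invented:

```python
# Toy Laspeyres-style price index: value a fixed basket at base-period
# and current-period prices. All items, quantities, and prices invented.
base_prices = {"milk": 3.50, "bread": 2.00, "eggs": 2.75}
basket_qty = {"milk": 4, "bread": 6, "eggs": 2}

def laspeyres_index(current_prices):
    """Index = 100 * (cost of basket at current prices) / (cost at base prices)."""
    base_cost = sum(base_prices[i] * basket_qty[i] for i in basket_qty)
    curr_cost = sum(current_prices[i] * basket_qty[i] for i in basket_qty)
    return 100 * curr_cost / base_cost

# With one posted price per item, the index is well defined:
posted = {"milk": 3.85, "bread": 2.10, "eggs": 3.00}
print(round(laspeyres_index(posted), 1))  # 107.9

# Under personalized pricing, different shoppers see different prices,
# so "the" price of an item must itself be estimated somehow - here,
# naively, as a simple average across two hypothetical shoppers:
shopper_a = {"milk": 3.60, "bread": 2.05, "eggs": 2.90}
shopper_b = {"milk": 4.10, "bread": 2.30, "eggs": 3.20}
avg = {i: (shopper_a[i] + shopper_b[i]) / 2 for i in base_prices}
print(round(laspeyres_index(avg), 1))  # 109.7
```

The point of the sketch is the second half: once each consumer faces a different price, the statistical agency must decide how to collapse many observed prices into one, and that choice (simple average, transaction-weighted average, something else) changes the measured inflation rate.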
Engaging with data users
I found one theme of the report - how actively data agencies engage with and react to user needs - especially relevant for state and local government data professionals.
Proactive data-user engagement, including involving users up front when major changes are needed to data programs, and knowledge of users and uses are important to enable the statistical agencies to assess the relevance, responsiveness to users, transparency, and accessibility of their data. Yet these areas do not appear to get the priority they need for the agencies to fulfill their role as data stewards for the public good. (Title II of the Evidence Act and the proposed Trust Regulation emphasize user engagement.) Resources for user engagement, documentation, and research and development to continually improve statistical agency data programs are often not explicitly included in agency funding requests. Resources for these activities and those needed to collect, process, and disseminate data can be in competition, and the competition is increased when overall funding is not sufficient to meet core needs.
Many local governments rarely think of themselves as data providers, despite generating vast amounts of data that have a direct impact on residents' day-to-day lives. User engagement is one of the first things local governments can do to improve their data capacity, and this type of research can also help set priorities for publishing and disseminating data.
Footnotes
It’s worth noting that the Census Bureau and the Bureau of Labor Statistics - which jointly manage the CPS - are keenly aware of these challenges and recently announced strategies for modernizing their data collection efforts.↩︎