Initial findings & opportunities to collaborate
S-1088 Annual Meeting
Assistant Professor
Colorado State University
October 12, 2025
What do you value most from participating in a Multi-State Project (e.g., S-1088)?
What would members find most useful going forward?
Takeaway:
Members see strong value in connections but want better structures for collaborative data use.
Over the past several months, I’ve been working on a project that looks at
how USDA datasets are mentioned in research publications.
In the following slides, I’d like to share some of these initial findings and ask:
What would be useful for S-1088 members?
Questions to consider
USDA ERS and NASS were interested in tracking how their datasets are referenced in research papers.
Two dashboards that track mentions* of 11 key datasets in articles were built.
This helps:
These 11 datasets were selected based on relevance and input from USDA staff and researchers.
*A dataset mention refers to an instance in which a specific dataset is referenced, cited, or named within a research publication. This can occur in various parts of the text, such as the abstract, methods, data section, footnotes, or references, and typically indicates that the dataset was used, analyzed, or discussed in the study.
Datasets searched:
Searched articles published between 2015 and 2025.
Mentions of datasets can be found in publications by applying ML models and other methods.
Data source: Dimensions via Google BigQuery
References
Read more about this project here.
We searched a large publication corpus (Dimensions) to find mentions of 11 USDA datasets and found that 17,311 authors mentioned these datasets across 8,290 publications in 1,664 journals.
This work was funded by the US Department of Agriculture (Economic Research Service and National Agricultural Statistics Service), the National Center for Science and Engineering Statistics, and the National Center for Education Statistics.
Food insecurity is one of the most frequently occurring topics among publications mentioning selected USDA datasets across all journals.
The word cloud shows the frequency of associated topics — larger words indicate more publications on that topic.
Access the Democratizing Data FAR Data Dashboard here.
To better understand how USDA datasets are used, we examined one area in more detail: food security–related research.*
This exercise reveals interesting patterns about:
How often selected USDA datasets are mentioned in publications over time
The share of food security–related research that explicitly references these datasets
How USDA dataset usage has evolved relative to all research output in this area
*Food security-related research includes articles with topics such as: “food security,” “SNAP,” “WIC,” “food access,” “food availability,” “food insecurity prevalence,” and “Nutrition Assistance Program.”
Details about this figure
This figure plots an index of unique publications, authors, or institutions mentioning the selected USDA datasets from 2015 to 2024, where 2015 is the base year (index = 1). For example, there were 595 publications in 2024 and 134 in 2015, so the 2024 publication index is 595 / 134 = 4.44.
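A minimal sketch of the index calculation described above, assuming yearly counts are held in a plain dictionary. The function name `growth_index` is hypothetical, and only the two publication counts quoted in the text (134 in 2015 and 595 in 2024) are used.

```python
def growth_index(counts_by_year, base_year=2015):
    """Divide each year's count by the base-year count (base year = 1)."""
    base = counts_by_year[base_year]
    if base == 0:
        raise ValueError("base-year count must be nonzero")
    return {year: count / base for year, count in sorted(counts_by_year.items())}


# Only the two publication counts quoted in the text are used here.
publication_counts = {2015: 134, 2024: 595}
index = growth_index(publication_counts)
print(round(index[2024], 2))  # 4.44
```

The same function would apply to author or institution counts, since each series is indexed to its own 2015 value.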
Terms used to define food security research include: “food security”, “food insecurity”, “food security status”, “Supplemental Nutrition Assistance Program (SNAP)”, “Special Supplemental Nutrition Program for Women, Infants, and Children (WIC)”, “food access”, “food availability”, “prevalence of food insecurity”, “food pantries”, and “Nutrition Assistance Program”.
Data source: Dimensions via Google BigQuery
USDA datasets are mentioned widely in food security–related research across applied economics and the agricultural, veterinary, and food sciences.
Details about this figure
This figure compares:
Data source: Dimensions via Google BigQuery
The volume of food security publications continues to grow while USDA dataset mentions have plateaued.
Details about this figure
This figure shows two trends in food security–related research from 2015 to 2024:
Data source: Dimensions via Google BigQuery
Selected USDA datasets are important: on average, they account for ~21% of food security-related publications from 2015 to 2024.
But ~79% of food security-related research relies on other data sources (or doesn’t name the data at all).
Next question:
What other datasets are researchers using for food security-related research?
→ Answering this requires access to full-text data and LLMs trained to identify dataset usage.
Currently working on this.
July 2025 - Major climate change reports are removed from U.S. websites
September 2025 - USDA ends the Agricultural (Farm) Labor Survey, the U.S.’s only survey of agricultural employers
September 2025 - USDA Terminates Redundant Food Insecurity Survey
We can only track what we can see. If authors do not write about the datasets they use, we won’t see them. Most authors don’t cite their data.
As an author, I find myself wanting better tools to find comparable studies using the same dataset – being able to search within publications that share a dataset is useful.
Publication metadata contains rich, underused information (e.g., open access, funders, citations, article processing charges), available from the OpenAlex API or by partnering with Digital Science (Dimensions).
At this stage, there’s a lot more we could do — but we would need resources and interested collaborators.
Funding outlook: State and federal budgets are shifting, with greater uncertainty around traditional sources.
Challenge: When our usual funders are constrained, or there is greater competition, how do we spot credible alternative funders aligned with our research?
Goal: Visualize where S-1088 research sits within the broader funding ecosystem.
Start with the S-1088 member list: Identify current participants.
Collect DOIs from member ORCIDs: Build a publication dataset linked to individual researchers.
Search for topic codes across all published works: Map publications to relevant USDA or research themes.
Retrieve all works within those topic codes: Capture the broader research landscape connected to S-1088 activities.
Analyze funding outlets: Identify which agencies and programs are supporting work in these areas.
Analysis was done using the openalexR package in R.
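The last steps of this workflow (mapping works to topics, then looking at who funds them) can be sketched offline. This is an illustration in Python rather than the openalexR code used for the analysis, and it is not the author's implementation. It assumes each work is a dictionary shaped like an OpenAlex work record, with a `topics` list and a `grants` list whose entries carry `funder_display_name`. Those field names are my reading of the OpenAlex schema and should be checked against the current API. The sample records and all function names are hypothetical.

```python
from collections import Counter


def topic_ids(works):
    """Collect the distinct topic IDs attached to a set of works."""
    return {t["id"] for w in works for t in (w.get("topics") or [])}


def funder_counts(works):
    """Count how many works acknowledge each funder (once per work)."""
    counts = Counter()
    for work in works:
        names = {g.get("funder_display_name") for g in (work.get("grants") or [])}
        names.discard(None)  # skip grants with no funder name
        counts.update(names)
    return counts


def funded_share(works):
    """Share of works that list at least one grant."""
    if not works:
        return 0.0
    return sum(1 for w in works if w.get("grants")) / len(works)


# Hypothetical records for illustration only, not real S-1088 data.
sample_works = [
    {"id": "W1", "topics": [{"id": "T1"}],
     "grants": [{"funder_display_name": "Funder A"}]},
    {"id": "W2", "topics": [{"id": "T1"}, {"id": "T2"}], "grants": []},
    {"id": "W3", "topics": [{"id": "T2"}],
     "grants": [{"funder_display_name": "Funder A"},
                {"funder_display_name": "Funder B"}]},
    {"id": "W4", "topics": [], "grants": None},
]

print(sorted(topic_ids(sample_works)))
print(funder_counts(sample_works).most_common())
print(funded_share(sample_works))
```

Counting each funder once per work keeps a paper with several awards from the same agency from inflating that agency's total.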
Membership and coverage
Publications
article, resulted in 1,146 publications published between 1977 and 2025.
Common research topics:1
Common research concepts:2
Sample size: 1,138 articles
Common funders:1
Others: Robert Wood Johnson Foundation, Organic Farming Research Foundation, university and department seed funding, international funders
Sample size: 287 of 1,146 publications acknowledged grant funding.
Benchmarking: How do the themes connected to S-1088 activities compare with the broader research landscape?
Funding Strategy: Which agencies and programs are supporting work in these areas?
Questions to consider
For you:
Use the dashboard to find collaborators and review broad data usage statistics.
With access to the publication metadata, additional trends on topics of interest can be explored further.
What signals (topics, co-authorships, acknowledgments) could we use to surface new funders?
For our research community:
💬 Questions?
📩 Lauren.Chenarides@colostate.edu
🌐 democratizingdata.ai
Acknowledgements: