Project Documentation

Published

June 27, 2025

Project Files

All project files are available in our GitHub repository. This section provides additional reference material.

Data Schema

Metadata was gathered from three sources (Scopus, Dimensions, and OpenAlex) and used to construct the data schemas listed below, one or more per source.

  • Scopus Schema
  • Dimensions Schema
  • OpenAlex Schemas:
    • Full Text Search
    • Seed Corpus

Note: All schemas were built using DBDiagram.io.

File Inventory

  • Primary Table: publication
  • Supporting Tables:
    • agency_run
    • asjc
    • author
    • author_affiliation
    • dataset_alias
    • dyad
    • dyad_model
    • issn
    • journal
    • model
    • publication_affiliation
    • publication_asjc
    • publication_author
    • publication_topic
    • publication_ufc
    • publisher
    • topic

Refer to the schema for additional column-level details.
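Table names such as publication_author suggest a junction-table layout, in which a linking table connects publication to author. The sketch below illustrates that pattern with Python's built-in sqlite3 module. The column names (id, title, name, publication_id, author_id), the sample rows, and the helper functions are assumptions made for illustration only; the actual columns are defined in the schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical columns (not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        -- Junction table: one row per (publication, author) pair.
        CREATE TABLE publication_author (
            publication_id INTEGER REFERENCES publication(id),
            author_id INTEGER REFERENCES author(id)
        );
        INSERT INTO publication VALUES (1, 'Example paper');
        INSERT INTO author VALUES (10, 'A. Author'), (11, 'B. Author');
        INSERT INTO publication_author VALUES (1, 10), (1, 11);
        """
    )
    return conn


def authors_for_publication(conn, publication_id):
    """Return author names for one publication by joining through the junction table."""
    rows = conn.execute(
        """
        SELECT a.name
        FROM publication AS p
        JOIN publication_author AS pa ON pa.publication_id = p.id
        JOIN author AS a ON a.id = pa.author_id
        WHERE p.id = ?
        ORDER BY a.name
        """,
        (publication_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(authors_for_publication(demo, 1))
```

The same two-step join would presumably apply to the other publication_* linking tables, with the key columns swapped for those shown in the schema.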

Full Text

  • Primary Table: main
  • Supporting Tables:
    • _id
    • apc_list
    • apc_paid
    • authorships
    • best_oa_location
    • biblio
    • citation_normalized_percentile
    • cited_by_percentile_year
    • corresponding_author_ids
    • corresponding_institution_ids
    • counts_by_year
    • dataset
    • datasets
    • grants
    • ids
    • indexed_in
    • open_access
    • primary_location
    • primary_topic
    • topics

Seed Corpus

  • Primary Table: main
  • Supporting Tables:

The accompanying schema focuses on the primary linking fields between tables. Due to the large number of columns within each table, only key identifiers are included.
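To illustrate what a linking field does here, the sketch below splits one nested record into a main row and per-field child rows that carry the parent's identifier. It assumes that the supporting tables hold nested fields separated out from main, which this page does not confirm. The key name "id", the link column "parent_id", and the sample record are all hypothetical.

```python
def split_record(record, key="id"):
    """Split a nested record into a flat main row and child rows linked by parent_id.

    Hypothetical helper: 'key' and 'parent_id' are illustrative names, not
    columns taken from the project schema.
    """
    parent_id = record[key]
    main_row = {}
    child_rows = {}
    for field, value in record.items():
        if isinstance(value, dict):
            # A single nested object becomes one child row.
            child_rows[field] = [{"parent_id": parent_id, **value}]
        elif isinstance(value, list) and all(isinstance(v, dict) for v in value):
            # A list of nested objects becomes one child row per element.
            child_rows[field] = [{"parent_id": parent_id, **v} for v in value]
        else:
            # Scalars (and lists of scalars) stay on the main row.
            main_row[field] = value
    return main_row, child_rows


if __name__ == "__main__":
    sample = {
        "id": "W1",
        "title": "Example work",
        "biblio": {"volume": "3"},
        "topics": [{"name": "x"}, {"name": "y"}],
    }
    main, children = split_record(sample)
    print(main)
    print(children)
```

Each child row can then be joined back to main on the shared identifier, which is the role the key identifiers play in the schema diagrams.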