Research Article

Evaluating Multilingual Metadata Quality in Crossref

Authors
  • Dennis Donathan II (Public Knowledge Project & Trinity University)
  • Mike Nason (Public Knowledge Project & University of New Brunswick)
  • Marco Tullney (Public Knowledge Project & Technische Informationsbibliothek (TIB))
  • Julie Shi (Public Knowledge Project & Scholars Portal)
  • Juan Pablo Alperin (Public Knowledge Project & ScholCommLab, Simon Fraser University)

Abstract

Introduction: Scholarly research spans multiple languages, making multilingual metadata crucial for organizing and accessing knowledge across linguistic boundaries. Multilingual metadata already exist and are propagated throughout the scholarly publishing infrastructure, but the extent to which they are correctly recorded, and how they affect metadata quality more broadly, is poorly understood.

Methods: Our study quantifies the prevalence of multilingual records across a sample of publisher metadata and assesses their completeness, quality, and alignment with metadata standards. Using the Crossref API to generate a random sample of 519,665 journal article records, we assign each record to one of four language types: English monolingual, non-English monolingual, multilingual, or uncategorized. We then measure the prevalence of programmatically detectable errors and of multilingual records within the sample to determine whether multilingualism influences the quality of article metadata.
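As a rough illustration of the sampling and categorization steps described above, the following Python sketch builds a Crossref REST API request for a random sample of journal articles and maps a record's detected languages to one of the four types. It is not the authors' pipeline. The helper names `sample_url` and `categorize` are hypothetical, and the abstract does not say how languages are detected, so the sketch simply assumes a set of language codes per record is already available.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def sample_url(n=100):
    """Build a request URL for a random sample of journal-article records.

    Hypothetical helper: it only constructs the URL and sends no request.
    """
    # Assumption: the `sample` parameter accepts at most 100 records per
    # request, so a sample of several hundred thousand records would have
    # to be assembled from many repeated requests.
    if not 1 <= n <= 100:
        raise ValueError("n must be between 1 and 100")
    params = {"sample": n, "filter": "type:journal-article"}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


def categorize(languages):
    """Map the language codes detected in one record to a language type.

    Assumption: `languages` is an iterable of codes such as "en" or "pt",
    produced by some upstream detection step not shown here.
    """
    langs = {code.lower() for code in languages if code}
    if not langs:
        return "uncategorized"
    if len(langs) > 1:
        return "multilingual"
    if langs == {"en"}:
        return "English monolingual"
    return "non-English monolingual"
```

For example, `categorize(["en", "es"])` yields "multilingual", while a record with no detectable language falls into "uncategorized". Treating any record with two or more languages as multilingual is one plausible reading of the four-way scheme; the study itself may draw the boundaries differently.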

Results: We find that English-only records still make up the vast majority of metadata in Crossref. Although non-English and multilingual records present unique challenges, they are not a source of significant metadata quality issues and, in a few instances, are more complete or correct than English monolingual records.

Discussion & Conclusion: Our findings contribute to discussions surrounding multilingualism in scholarly communication, serving as a resource for researchers, publishers, and information professionals seeking to enhance the global dissemination of knowledge and foster inclusivity in the academic landscape.

Keywords: multilingualism, metadata, crossref

How to Cite:

Donathan II, D., Nason, M., Tullney, M., Shi, J. & Alperin, J.P. (2026). Evaluating Multilingual Metadata Quality in Crossref. Journal of Librarianship and Scholarly Communication, 14(1), eP19779. https://doi.org/10.31274/jlsc.19779

Rights:

© 2026 The Author(s). License: CC BY 4.0

Funding

Name: Crossref

Funding Statement: This research was supported by Crossref through an award granted to the ScholCommLab and the Public Knowledge Project. The funder had no role in the study's design, data collection and analysis, decision to publish, or preparation of the manuscript.

Published on
2026-03-18

Peer Reviewed