Practice Article

Free and Open-Source Automated Open Access Preprint Harvesting

Authors
  • Jack E. Peplinski (Western University)
  • Joanne Paterson (Western University)
  • Courtney Waugh (Western University)
  • Joshua M. Pearce (Western University)

Abstract

Because of funding mandates, universities are attempting to ensure that all of their research is publicly accessible. Many universities have established campus open access (OA) repositories but are struggling with how to upload millions of manuscripts under numerous license agreements while also linking metadata to make them discoverable. Doing this manually requires around 15 minutes per manuscript from an experienced librarian, so the time and cost of doing it campus-wide are prohibitive. To radically reduce the time and cost of this process and to harvest all past work, this article reports on the development and testing of a free and open source (FOSS) JavaScript-based application, aperta-accessum, which does the following: 1) harvests names and emails from a department’s faculty webpage; 2) identifies scholars’ Open Researcher and Contributor IDentifiers (ORCID iDs); 3) obtains digital object identifiers (DOIs) of publications for each scholar; 4) checks for existing copies in an institution’s OA repository; 5) identifies the legal opportunities to provide OA versions of all of the articles not already in the OA repository; 6) sends authors emails requesting a simple upload of author manuscripts; and 7) adds link-harvested metadata from DOIs with uploaded preprints into a bepress repository. The code can be modified for additional repositories. The results of this study show that, in the administrative time needed to make a single document OA manually, aperta-accessum can process approximately five entire departments’ worth of peer-reviewed articles. When the best practices discussed are followed, this open-source OA harvester enables institutional libraries’ stewardship of OA knowledge on a mass scale at radically reduced cost.
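
As a rough illustration of step 4 (checking which of a scholar’s DOIs are already held in the repository), the following is a minimal sketch and not the aperta-accessum code itself. The function names `normalizeDoi` and `findMissingDois`, and the sample DOI `10.1000/example.1`, are hypothetical and chosen only for this example. It assumes both DOI lists are already in memory and relies on the fact that DOIs are case-insensitive.

```javascript
// Hypothetical helper: reduce a DOI or DOI URL to a bare, lowercase DOI.
function normalizeDoi(raw) {
  return raw
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\/(dx\.)?doi\.org\//, "")
    .replace(/^doi:\s*/, "");
}

// Hypothetical helper: return the DOIs from a scholar's publication list
// that are not yet present in the repository's DOI list (duplicates dropped).
function findMissingDois(publicationDois, repositoryDois) {
  const held = new Set(repositoryDois.map(normalizeDoi));
  const seen = new Set();
  const missing = [];
  for (const doi of publicationDois.map(normalizeDoi)) {
    if (!held.has(doi) && !seen.has(doi)) {
      seen.add(doi);
      missing.push(doi);
    }
  }
  return missing;
}

// Illustrative data only; the second and third entries are the same made-up DOI.
const publications = [
  "https://doi.org/10.31274/jlsc.14421",
  "doi:10.1000/EXAMPLE.1",
  "10.1000/example.1",
];
const repository = ["10.31274/JLSC.14421"];

console.log(findMissingDois(publications, repository));
// prints [ '10.1000/example.1' ]
```

Normalizing before comparison matters because the same DOI can appear as a bare string, with a `doi:` prefix, or as a resolver URL, and a plain string comparison would treat those as different articles.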

Keywords: data management knowledge, data management practices, open access harvesting, open source software, data management, metadata

How to Cite:

Peplinski, J. E., Paterson, J., Waugh, C. & Pearce, J. M. (2022) “Free and Open-Source Automated Open Access Preprint Harvesting”, Journal of Librarianship and Scholarly Communication 10(1). doi: https://doi.org/10.31274/jlsc.14421

Funding

  • Thompson Endowment

Published on
02 Dec 2022
Peer Reviewed