Introduction

This study focused on analysing Knowledge Exchange (KE) countries’ agreements with major publishers, more specifically, agreements with open access (OA) elements, to identify what article-level metadata consortia and academic institutions request from publishers and whether publishers deliver this. The study is aimed at consortia and academic institutions that subscribe to agreements with OA elements. By collecting article-level metadata, consortia and academic institutions are able to monitor OA publications, OA costs and the value of agreements. Article-level metadata will also support consortia and academic institutions in monitoring compliance with funders’ OA policies.

The representatives from each KE country undertook this research as part of their work on the Monitoring OA group. They collected information from agreements with major publishers in each KE country and created an article-level metadata check-list based on the KE and the Efficiency and Standards for Article Charges (ESAC) recommendations. They then repurposed the check-list as a template that consortia and academic institutions can use when asking publishers to deliver article-level metadata.

KE has been undertaking work on Monitoring OA since 2015 when it hosted its first workshop on this topic. The workshop promoted debate on standards and best practices to monitor compliance with OA policies. Speakers and participants shared challenges and best practices, explored ways to support and enhance each other’s activities, and discussed approaches to assess quality and impact. Participants expressed the need for common standards, identifiers and data requirements as well as for data to be compared and aggregated at an international level. Recommendations on best practices and an agreement on minimum standards were identified as necessary next steps. The workshop concluded that ‘definitions, workflows and collaboration should be closely linked in order to keep monitoring’.

Following this initial discussion, a second KE workshop was held in 2016. This workshop focused on debating issues related to monitoring OA publications and monitoring cost data for OA publications. The workshop reinforced the need for common standards and definitions, for publishers to deliver data and for accounting systems to be interoperable. Discussions highlighted the need for data to be shared and made openly available, for publishers to provide DOIs and funder metadata, and for these requirements to be stated in agreements with publishers. Work groups discussed issues such as data, workflows, standards and policy for monitoring OA publications as well as for monitoring cost data of OA publications. A series of recommendations were issued for current research information systems (CRISs), publishers and libraries.

Around the same time, ESAC was discussing ‘the need to develop workflow efficiencies’ in the negotiation, drafting and management of offsetting agreements. Following two workshops hosted by ESAC, it became clear there was a need ‘for an improvement of the current workflows and processes between academic institutions […] and […] publishers in terms of author identification, metadata exchange and invoicing’. As a result, ESAC issued article workflow recommendations in three key areas: author and article identification and verification, funding acknowledgement and metadata, and invoicing and reporting. The recommendations emphasized what metadata publishers should provide. ESAC explained that the ‘recommendations should be seen as a minimum set of practical and formal requirements for offsetting agreements and are necessary to make any publication-based open access business model work’.

In September 2018 cOAlition S (an international group of funders) announced Plan S, which aims for a full transition to OA by 1 January 2021. Metadata is an important element of the Plan S guidelines. The ESAC recommendations are referred to in the guidance on the implementation of Plan S for transformative arrangements. The guidelines state that cOAlition S ‘will only financially support [transformative] agreements […] where they adhere to the ESAC Guidelines’. In addition, the technical guidance and requirements request that peer-reviewed articles include ‘high-quality article level metadata in standard interoperable non-proprietary format’, ‘complete and reliable information on funding provided by cOAlition S funders’, ‘machine-readable information on the Open Access status and the license embedded in the article’ and ‘PIDs for authors (e.g., ORCID), funders, funding programmes and grants, institutions’.

KE and ESAC work emphasized the need for common standards and identifiers, common definitions, automation of workflows, and collection of metadata. cOAlition S re-emphasized the need to collect metadata in all the routes leading to Plan S compliance: OA publishing venues, repository route and transformative arrangements.

As more agreements include OA elements, more studies discuss what metadata must be collected and more funders support its collection, it is becoming increasingly important for consortia and academic institutions to include the delivery of article-level metadata in their agreements with publishers and to implement mechanisms for monitoring publishers’ compliance with the terms of those agreements. Article-level metadata are required so that consortia and academic institutions can monitor how many articles are being published OA and non-OA under each agreement, particularly where there is a cap on the number of articles that can be published OA. Consortia and academic institutions also need to monitor how much is being spent on OA publishing and whether they have missed any publications and need to contact the author for further information. Moreover, they need article-level metadata to assess the value of the agreements, i.e. whether agreements with OA elements are delivering value for money. Ultimately, the academic institutions that pay the costs of the OA publishing element of the agreement, or the entities that pay the article processing charges (APCs), have the right to access information about the articles they fund. They have the right to know what research they are funding, whether the right licence has been applied to the articles published, and whether, for example, the funding acknowledgment statement has been included in the article.

As a result of the increasing need to collect article-level metadata, the KE Monitoring OA group undertook the following activities:

  • collected information on agreements with 12 major publishers for the six KE countries
  • classified agreements by type (subscription agreements and agreements with OA elements)
  • analysed agreements with OA elements against an article-level metadata check-list based on the KE and the ESAC recommendations
  • analysed article-level metadata criteria to assess what metadata was requested in consortia contracts and other relevant documentation and to assess whether publishers provided that metadata
  • developed a template for publishers to use to provide article-level metadata to consortia and academic institutions based on the check-list.

Methodology

This research was based on the analysis of publisher agreements and other relevant documentation. The data analysed included the agreements that the six KE countries had with all or some of the following publishers: American Chemical Society, Cambridge University Press, EDP Sciences, Elsevier, Oxford University Press, Royal Society of Chemistry, SAGE Publishing, Springer Nature, Taylor & Francis and Wiley. Because the agreements and other relevant documentation analysed in this study are confidential, the publishers’ names were anonymized in the data analysis.

The agreements analysed were classified into two major types: subscription agreements and agreements with OA elements. Agreements with OA elements are diverse because there are differences in how research outputs become openly available, differences in costs applied to make research outputs openly available and differences on how many research outputs can be made openly available.

The KE Monitoring OA group considers agreements with OA elements to be those that include distinct OA models such as subscription agreements with APC discounts, subscription agreements with X number of free OA articles, read and publish agreements, offsetting agreements and transformative agreements. This article does not define the different types of agreements with OA elements, but attempts to define these models have been made elsewhere.

Only the agreements with OA elements were considered valid for analysis because the main focus of the research was to look at the elements of agreements that comply with the gold route to OA (i.e. peer-reviewed articles become immediately available on OA and may be subject to an APC, or the APC may be included in the agreement model). Because Denmark has a national green OA policy, it complies with funders’ OA policies via the green route (i.e. deposit of research outputs in institutional, subject, and/or funder repositories which become available on OA following an embargo period). Because all of Denmark’s agreements were subscription only and did not include a gold route to open access, Denmark was not included in the data analysis.

Agreements with OA elements were analysed against the article-level metadata check-list based on the KE and ESAC recommendations to assess what article-level metadata was requested in the consortia contracts or in other relevant documentation and whether publishers provided the metadata to consortia. Table 1 lists the criteria used for the data analysis and shows if they are based on the KE or the ESAC recommendations. Three additional criteria were added by the KE Monitoring OA group. All the metadata listed in Table 1 are relevant and should be requested of publishers because they facilitate the search, discovery and sharing of information on invoicing and reporting, along with author and article identifiers and funding identifiers.

Table 1

Article-level metadata check-list based on KE and ESAC recommendations

Criteria #Article-level metadata criteriaKE report recommendationsESAC recommendations

#1DOIYesYes
#2Is the article open access?
#3Institution nameYesYes
#4Article titleYes
#5Article typeYes
#6Journal ID (publisher ID)Yes
#7Journal titleYes
#8Journal subject/discipline
#9Journal ISSNYes
#10Journal e-ISSNYes
#11Article licence (CC licence)YesYes
#12Article acceptance dateYes
#13Article approval dateYes
#14Article online date/date of publicationYes
#15Corresponding author nameYesYes
#16Co-author(s) name(s)YesYes
#17Corresponding author e-mailYes
#18ORCIDsYesYes
#19FundRef IDYesYes
#20Funder nameYesYes
#21Funding acknowledgment in articleYes
#22Grant numberYes
#23PublisherYes
#24Article APC (cost/price) (inc./ex. VAT)YesYes
#25Currency (e.g. €, $, £)Yes
#26Publishers standardize their APC invoice and the invoicing processYesYes
#27APC transparency
#28Publisher to flag funder-non-compliant articles at point of author licence acceptanceYes
#29Machine-readable metadataYes
#30Workflow (integration)Yes
#31Publishers include in Crossref a licence statement for each publication and indicate whether the publication is green, gold or hybrid OAYesYes

The agreements considered valid for the data analysis are dated between 1 January 2016 and 1 January 2019. In total, information was collected for 50 agreements with 12 major publishers (Annex 1). Of these, only ten publishers had agreements with OA elements. Of the agreements analysed, 46% were subscription only and 54% had OA elements (Table 2). As mentioned above, the data analysis encompasses different types of agreements with OA elements, and some of the agreements analysed predate the KE and the ESAC recommendations.

Table 2

Total number of agreements analysed by type and country

Country | Subscription agreements | Agreements with OA elements | Total
Denmark (DEFF) | 11 | 0 | 11
Finland (FinELib) | 1 | 5 | 6
France (Couperin) | 9 | 1 | 10
Germany | 0 | 4 | 4
Netherlands (VSNU/UKB/Surfmarket) | 0 | 8 | 8
UK (Jisc) | 2 | 9 | 11
Total | 23 | 27 | 50
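The shares quoted in the text follow directly from the totals in Table 2. As a quick check, the sketch below recomputes them; the counts are transcribed from the table, while the variable and function names are illustrative and not part of the study.

```python
# Agreement counts per country, transcribed from Table 2:
# (subscription agreements, agreements with OA elements)
AGREEMENTS = {
    "Denmark (DEFF)": (11, 0),
    "Finland (FinELib)": (1, 5),
    "France (Couperin)": (9, 1),
    "Germany": (0, 4),
    "Netherlands (VSNU/UKB/Surfmarket)": (0, 8),
    "UK (Jisc)": (2, 9),
}


def agreement_shares(agreements):
    """Return totals and percentage shares for the two agreement types."""
    subscription = sum(sub for sub, _ in agreements.values())
    with_oa = sum(oa for _, oa in agreements.values())
    total = subscription + with_oa
    return subscription, with_oa, 100 * subscription / total, 100 * with_oa / total


subscription, with_oa, sub_pct, oa_pct = agreement_shares(AGREEMENTS)
print(f"{subscription} subscription only ({sub_pct:.0f}%), "
      f"{with_oa} with OA elements ({oa_pct:.0f}%)")
```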

Data analysis

Consortia agreements requesting article-level metadata

The first part of the analysis involved assessing which consortia agreements required publishers to provide article-level metadata. The data analysis only applied to agreements with OA elements. Out of a total of 27 agreements with OA elements identified in five KE countries (see Annex 1), 24 agreements requested some article-level metadata. Table 3 shows the agreements for which metadata was requested.

Table 3

List of agreements analysed where consortia requested metadata and publishers provided it

PublisherFinland (FinELib)France (Couperin)GermanyNetherlands (VSNU/UKB/Surfmarket)UK (Jisc)

Requested by consortiaProvided by publisherRequested by consortiaProvided by publisherRequested by consortiaProvided by publisherRequested by consortiaProvided by publisherRequested by consortiaProvided by publisher

Publisher AYesYesYesYesYes
Publisher BYesYesYesYes
Publisher CYesYesYesYesYes
Publisher DYesYesYesYes
Publisher EYesYesYesYes
Publisher FYesYesYesYesYesYes
Publisher GYesYes
Publisher HYesYesYesYes
Publisher IYesYesYesYes
Publisher JYesYes

Figure 1 shows the article-level metadata criteria from the most to the least commonly asked for in contracts or in other relevant documentation. The DOI was the criterion most commonly requested of publishers by consortia.

Figure 1 

Metadata requested by consortia

Article-level metadata provided by publishers

The second part of the analysis involved assessing which publishers provided article-level metadata (Table 3). Although 24 agreements required article-level metadata, publishers provided some of that metadata under only 16 of them. At the time data was collected for this research, there was still no information available on what article-level metadata Publisher B was going to provide to the German consortia, on what article-level metadata Publisher E would provide to the Finnish consortia, nor on what OA metadata Publisher G would provide to the Dutch consortia.

Publishers provided fewer article-level metadata than consortia requested in contracts or in other relevant documentation. Figure 2 shows the article-level metadata that publishers most commonly provided to consortia. The DOI was also the criterion most commonly provided by publishers.

Figure 2 

Metadata provided by publisher

Comparison of results

By comparing the results between what article-level metadata consortia requested in contracts or in other relevant documentation versus what metadata publishers provided, it is possible to assess how far consortia and publishers are from being aligned with the KE and ESAC recommendations. This allows us to benchmark how consortia and publishers were performing until early 2019 and in a pre-Plan S scenario.

When comparing what article-level metadata consortia requested in contracts or in other relevant documentation (see Figure 1) versus what metadata publishers provided (see Figure 2), it was observed that none of the agreements requested all the metadata recommended by KE and ESAC. Nonetheless, consortia asked for more metadata than publishers provided.

The majority of publishers (seven out of ten, or 70%) provided less article-level metadata than consortia requested in the contracts or in other documentation (Table 4). However, three publishers outperformed their peers by providing more metadata than requested in contracts or other documentation: Publishers A, F and H.

Table 4

List of agreements analysed by metadata requested by consortia versus provided by publisher

Publisher | Requested by consortia | Provided by publisher
Publisher A | 63 | 70
Publisher B | 49 | 36
Publisher C | 25 | 18
Publisher D | 24 | 23
Publisher E | 23 | 1
Publisher F | 21 | 27
Publisher G | 20 | 0
Publisher H | 25 | 32
Publisher I | 10 | 4
Publisher J | 7 | 3
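The split between publishers that provided less metadata than requested and those that provided more can be reproduced from Table 4. The following is a minimal sketch of that comparison; the counts are transcribed from the table, and the function name is an assumption made for illustration.

```python
# Metadata counts per publisher, transcribed from Table 4:
# (requested by consortia, provided by publisher)
TABLE_4 = {
    "Publisher A": (63, 70),
    "Publisher B": (49, 36),
    "Publisher C": (25, 18),
    "Publisher D": (24, 23),
    "Publisher E": (23, 1),
    "Publisher F": (21, 27),
    "Publisher G": (20, 0),
    "Publisher H": (25, 32),
    "Publisher I": (10, 4),
    "Publisher J": (7, 3),
}


def split_by_delivery(counts):
    """Group publishers by whether they provided less or more than was requested."""
    less = [name for name, (req, prov) in counts.items() if prov < req]
    more = [name for name, (req, prov) in counts.items() if prov > req]
    return less, more


less, more = split_by_delivery(TABLE_4)
share_less = 100 * len(less) / len(TABLE_4)
print(f"{len(less)} of {len(TABLE_4)} publishers ({share_less:.0f}%) provided less than requested")
print("Provided more than requested:", ", ".join(more))
```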

A breakdown at the country level of the difference between article-level metadata requested by consortia and that provided by publishers showed that consortia did not all ask publishers for the same metadata, nor did publishers provide the same metadata across countries. This shows inconsistency in the practices of both consortia and publishers (Table 5).

Table 5

Difference between metadata requested by consortia versus provided by publisher and country

Publisher/agreement | Requested by consortia | Provided by publisher
Publisher A: Finland | 23 | 21
Publisher A: Netherlands | 14 | 23
Publisher A: UK | 26 | 26
Publisher B: Germany | 15 | 0
Publisher B: Netherlands | 14 | 16
Publisher B: UK | 20 | 20
Publisher C: Finland | 6 | 10
Publisher C: Netherlands | 15 | 8
Publisher C: UK | 4 | 0
Publisher D: Finland | 10 | 10
Publisher D: Netherlands | 14 | 13
Publisher E: Finland | 10 | 0
Publisher E: Germany | 1 | 1
Publisher E: UK | 12 | 0
Publisher F: Finland | 2 | 12
Publisher F: Netherlands | 15 | 15
Publisher F: UK | 4 | 0
Publisher G: Netherlands | 8 | 0
Publisher G: UK | 12 | 0
Publisher H: Germany | 11 | 18
Publisher H: Netherlands | 14 | 14
Publisher I: Netherlands | 6 | 4
Publisher I: UK | 4 | 0
Publisher J: France | 7 | 3
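The country-level rows in Table 5 add up to the publisher totals in Table 4, and they also show under how many agreements any metadata was delivered at all. The sketch below illustrates both calculations; the rows are transcribed from Table 5, and the function names are hypothetical.

```python
from collections import defaultdict

# Rows transcribed from Table 5:
# (publisher, country, requested by consortia, provided by publisher)
TABLE_5 = [
    ("A", "Finland", 23, 21), ("A", "Netherlands", 14, 23), ("A", "UK", 26, 26),
    ("B", "Germany", 15, 0), ("B", "Netherlands", 14, 16), ("B", "UK", 20, 20),
    ("C", "Finland", 6, 10), ("C", "Netherlands", 15, 8), ("C", "UK", 4, 0),
    ("D", "Finland", 10, 10), ("D", "Netherlands", 14, 13),
    ("E", "Finland", 10, 0), ("E", "Germany", 1, 1), ("E", "UK", 12, 0),
    ("F", "Finland", 2, 12), ("F", "Netherlands", 15, 15), ("F", "UK", 4, 0),
    ("G", "Netherlands", 8, 0), ("G", "UK", 12, 0),
    ("H", "Germany", 11, 18), ("H", "Netherlands", 14, 14),
    ("I", "Netherlands", 6, 4), ("I", "UK", 4, 0),
    ("J", "France", 7, 3),
]


def totals_by_publisher(rows):
    """Sum requested and provided counts across countries for each publisher."""
    totals = defaultdict(lambda: [0, 0])
    for publisher, _country, requested, provided in rows:
        totals[publisher][0] += requested
        totals[publisher][1] += provided
    return {publisher: tuple(pair) for publisher, pair in totals.items()}


def agreements_with_delivery(rows):
    """List the (publisher, country) agreements under which some metadata was provided."""
    return [(p, c) for p, c, _requested, provided in rows if provided > 0]


totals = totals_by_publisher(TABLE_5)
delivered = agreements_with_delivery(TABLE_5)
print("Publisher A totals:", totals["A"])
print(f"{len(delivered)} of {len(TABLE_5)} agreements had some metadata provided")
```

Summing by publisher in this way gives the same figures as Table 4, which is a useful consistency check when the two tables are maintained separately.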

Comparing what article-level metadata was requested by consortia with what was provided by publishers, grouped by the ESAC recommendation categories, made it possible to observe where greater efforts need to be made in terms of metadata delivery. For example, the results showed that invoicing and reporting metadata was most commonly provided by publishers, whereas funding metadata was the category where publishers scored lowest (Table 6).

Table 6

The difference between metadata requested by consortia versus metadata provided by publisher sorted by ESAC recommendations categories

ESAC recommendations category | Article-level metadata | Requested by consortia | Provided by publisher
Invoicing and reporting | #1: DOI | 23 | 16
 | #4: Article title | 22 | 11
 | #15: Corresponding author name | 22 | 11
 | #7: Journal title | 16 | 11
 | #9: Journal ISSN | 17 | 10
 | #14: Article online date/date of publication | 12 | 13
 | #11: Article licence (CC licence) | 16 | 11
 | #10: Journal e-ISSN | 14 | 8
 | #17: Corresponding author e-mail | 10 | 12
 | #24: Article APC (cost/price) (inc./ex. VAT) | 10 | 9
 | #25: Currency (e.g. €, $, £) | 8 | 8
 | #5: Article type | 7 | 9
 | #12: Article acceptance date | 7 | 8
 | #13: Article approval date | 3 | 7
 | #6: Journal ID (publisher ID) | 2 | 7
 | #29: Machine-readable metadata | 2 | 2
 | #23: Publisher | 1 | 1
 | #26: Publishers standardize their APC invoice and the invoicing process | 0 | 1
 | Sub-total | 192 | 155
Author & article identification & verification | #3: Institution name | 22 | 15
 | #18: ORCIDs | 7 | 4
 | #30: Workflow (integration) | 4 | 1
 | #16: Co-author(s) name(s) | 2 | 2
 | Sub-total | 35 | 22
Funding acknowledgement & metadata | #19: FundRef ID | 5 | 5
 | #22: Grant number | 3 | 6
 | #20: Funder name | 5 | 6
 | #31: Publishers include in Crossref a licence statement for each publication and indicate whether the publication is green, gold or hybrid OA | 6 | 0
 | #21: Funding acknowledgment in article | 6 | 0
 | #28: Publisher to flag funder-non-compliant articles at point of author licence acceptance | 1 | 0
 | Sub-total | 26 | 17
No category | #2: Is the article open access? | 8 | 12
 | #8: Journal subject/discipline | 4 | 5
 | #27: APC transparency | 2 | 3
 | Sub-total | 14 | 20
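The sub-totals in Table 6, and the criteria that were requested but never provided, can be derived mechanically from the per-criterion counts. The sketch below is one way to do so; the counts are transcribed from Table 6 and keyed by criterion number, and the function names are assumptions made for illustration.

```python
# Counts transcribed from Table 6, keyed by criterion number:
# (requested by consortia, provided by publisher)
TABLE_6 = {
    "Invoicing and reporting": {
        "#1": (23, 16), "#4": (22, 11), "#15": (22, 11), "#7": (16, 11),
        "#9": (17, 10), "#14": (12, 13), "#11": (16, 11), "#10": (14, 8),
        "#17": (10, 12), "#24": (10, 9), "#25": (8, 8), "#5": (7, 9),
        "#12": (7, 8), "#13": (3, 7), "#6": (2, 7), "#29": (2, 2),
        "#23": (1, 1), "#26": (0, 1),
    },
    "Author and article identification and verification": {
        "#3": (22, 15), "#18": (7, 4), "#30": (4, 1), "#16": (2, 2),
    },
    "Funding acknowledgement and metadata": {
        "#19": (5, 5), "#22": (3, 6), "#20": (5, 6),
        "#31": (6, 0), "#21": (6, 0), "#28": (1, 0),
    },
    "No category": {"#2": (8, 12), "#8": (4, 5), "#27": (2, 3)},
}


def subtotals(table):
    """Sum requested and provided counts within each category."""
    return {
        category: (sum(req for req, _ in rows.values()),
                   sum(prov for _, prov in rows.values()))
        for category, rows in table.items()
    }


def never_provided(table):
    """Per category, list criteria that were requested at least once but never provided."""
    return {
        category: [c for c, (req, prov) in rows.items() if req > 0 and prov == 0]
        for category, rows in table.items()
    }


totals = subtotals(TABLE_6)
gaps = never_provided(TABLE_6)
for category, (requested, provided) in totals.items():
    print(f"{category}: requested {requested}, provided {provided}, "
          f"never provided: {gaps[category] or 'none'}")
```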

The data also showed a gap between how much article-level metadata was requested by consortia versus what was provided by publishers (see Table 6). For example, article DOIs were most commonly requested by consortia in contracts or in other relevant documentation but not all the publishers provided them. It also showed that none of the consortia requested ‘#26: Publishers [to] standardize their APC invoice and the invoicing process’, nor did any of the publishers provide OA metadata on ‘#21: Funding acknowledgment in article’, ‘#28: Publisher to flag funder-non-compliant articles at point of author licence acceptance’, and ‘#31: Publishers include in Crossref a licence statement for each publication and indicate whether the publication is green, gold or hybrid OA’.

Results discussion

The data analysed showed that none of the agreements with OA elements requested all the metadata recommended by KE and ESAC. It also showed that, overall, consortia ask for more metadata than publishers provide. Importantly, none of the publishers provided all the metadata requested by consortia (or recommended by KE and ESAC). Publishers also did not deliver exactly the same metadata across countries. This may be a result of consortia being more aware than publishers of the need to collect metadata but still not having all the processes and workflows in place to request it. Publishers may not be aware of the need to provide metadata or may lack systems to deliver it automatically and systematically. Publishers’ inconsistent provision of metadata across countries may be due to them not having aligned international strategies. All these issues pose challenges both to monitoring publishers’ compliance with the terms of the licensing agreements and to monitoring compliance with research funders’ OA policies.

Funding metadata was the area where publishers provided the least article-level metadata. This is due to publishers not capturing this information (e.g. not collecting the FundRef ID at the article submission stage) but it may also be due to collecting poor funding metadata (e.g. offering a free text field where authors can add funders’ names instead of a pre-populated field where authors can choose from a list of funders). Previous analysis of the data provided to the UK consortium under the Springer Compact agreement showed that collecting funding metadata is difficult because the publisher’s data was neither sufficient nor robust enough to allow for any significant conclusions to be drawn. Importantly, it was observed that the funders’ metadata referred to the research funding source and not to the APC funding source and that not all funders had been identified and acknowledged by authors. As a result of these findings, and with Plan S becoming effective from 2021, publishers should strive to collect and report clearer funding metadata in order to demonstrate compliance with funders’ OA policies (e.g. they should use the Funder Registry from Crossref), to report on the APC funding source and to ensure that funders are correctly acknowledged in articles.

These findings show that there is scope for improvement. Consortia and academic institutions can request more of the metadata that KE and ESAC deem essential to improve workflow efficiencies, and publishers have the responsibility to deliver this information to their customers.

Template for article-level metadata collection

To promote the consistent delivery of article-level metadata by publishers to consortia and academic institutions, the KE Monitoring OA group repurposed the article-level metadata check-list as a template for publishers to use as a reporting tool. The template informs consortia and academic institutions about what metadata to request from publishers. It also enables them to monitor publishers’ compliance with the terms of consortia licensing agreements and to monitor compliance with funders’ OA policies. The metadata collected by consortia and academic institutions enables them to benchmark how publishers are performing across KE countries and beyond, as well as to promote cross-country data analysis and storage of article-level metadata in international databases (e.g. Open APC).
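To illustrate how a template of this kind could be used to check a publisher's report, the sketch below models one article as a record and lists the fields the publisher left empty. It is only an illustration: the class, the field names and the example values (including the DOI) are hypothetical, and the fields are a small subset of the check-list in Table 1 rather than the template itself.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ArticleRecord:
    """Hypothetical record holding a subset of the Table 1 criteria."""
    doi: Optional[str] = None                   # criterion 1
    is_open_access: Optional[bool] = None       # criterion 2
    institution_name: Optional[str] = None      # criterion 3
    article_title: Optional[str] = None         # criterion 4
    licence: Optional[str] = None               # criterion 11
    corresponding_author: Optional[str] = None  # criterion 15
    orcid: Optional[str] = None                 # criterion 18
    funder_name: Optional[str] = None           # criterion 20
    grant_number: Optional[str] = None          # criterion 22
    apc_amount: Optional[float] = None          # criterion 24
    currency: Optional[str] = None              # criterion 25


def missing_fields(record):
    """Return the names of fields that are unset (None) or empty strings."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


# Made-up example values, for illustration only.
example = ArticleRecord(
    doi="10.1234/example.2019.001",
    is_open_access=True,
    institution_name="Example University",
    article_title="An example article",
    licence="CC BY",
    corresponding_author="A. Author",
    apc_amount=0.0,
    currency="EUR",
)
print(missing_fields(example))
```

Treating a value of `False` or `0.0` as present, and only `None` or an empty string as missing, is a deliberate choice here: an article that is not OA, or an APC of zero, is still reported information.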

When publishers provide article-level metadata, consortia will be able to assess the impact of publishers’ agreements. For example, consortia will be able to understand which academic institutions are publishing the highest or lowest number of OA articles, when the highest or lowest number of publications occur and what the publication trends are from year to year. Consortia can also assess which titles are the most or least popular for OA publications, what the highest, lowest and average APC costs are (if applicable), which authors publish more articles (i.e. through ORCID IDs), how many articles’ APCs are paid by research funders, which disciplines are more popular, and so on. For example, as part of the Springer Compact agreement, Springer Nature provides monthly article-level metadata to consortia, and with this data it was possible to assess the value of the Compact agreement in the UK.
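Two of the assessments mentioned above, OA articles per institution and the lowest, highest and average APC, can be sketched in a few lines once article-level records are available. The records below are made up purely for illustration, and the dictionary keys and function names are assumptions rather than part of the template.

```python
from collections import Counter
from statistics import mean

# Made-up records for illustration only; a real report would hold one row per article.
RECORDS = [
    {"institution": "University X", "is_open_access": True, "apc": 1500.0},
    {"institution": "University X", "is_open_access": True, "apc": 2000.0},
    {"institution": "University Y", "is_open_access": True, "apc": 1000.0},
    {"institution": "University Y", "is_open_access": False, "apc": None},
]


def oa_articles_per_institution(records):
    """Count OA articles for each institution."""
    return Counter(r["institution"] for r in records if r["is_open_access"])


def apc_summary(records):
    """Return (lowest, highest, average) APC over records that report an APC."""
    apcs = [r["apc"] for r in records if r["apc"] is not None]
    return min(apcs), max(apcs), mean(apcs)


oa_counts = oa_articles_per_institution(RECORDS)
lowest, highest, average = apc_summary(RECORDS)
print(oa_counts.most_common())
print(f"APC lowest {lowest}, highest {highest}, average {average}")
```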

The KE Monitoring OA group plans to contact publishers individually to inform them about the purpose of the template and to ask them to deliver article-level metadata on a systematic basis.

Recommendations for further research

The authors recommend that further analysis is undertaken on agreement types because some agreements with OA elements seem to be more successful in delivering article-level metadata than others (e.g. read and publish agreements). It is also recommended that further analysis is undertaken on the agreement start dates, as more recently drafted agreements (i.e. read and publish agreements, offsetting agreements and transformative agreements) seem to be more successful in delivering article-level metadata than some of the older agreements (i.e. subscription agreements with APC discounts or with X number of free OA articles), possibly because some consortia and publishers did not have a clear idea of what kind of metadata would be needed when drawing up those older agreements.