Monitoring agreements with open access elements: why article-level metadata are important

Agreements with open access (OA) elements (e.g. agreements with APC discounts, offsetting agreements, read and publish agreements) have been increasing in number in the last few years. With more agreements including some form of OA, consortia and academic institutions need to monitor the number of OA publications, the costs and the value of these agreements. Publishers are therefore required to account for the articles published OA to consortia, academic institutions and research funders. One way publishers can do so is by providing regular reports with article-level metadata. This article uses the Knowledge Exchange (KE) and the Efficiency and Standards for Article Charges (ESAC) initiative recommendations as a check-list to assess what article-level metadata consortia request from publishers and what metadata publishers deliver to consortia. KE countries’ agreements with major publishers were analysed to assess how far consortia and publishers are from requesting and providing article-level metadata. The results from this research can be used as a benchmark to determine how major publishers were performing until early 2019 and prior to Plan S coming into effect in 2021. A recommendation is made that publishers use the article-level metadata check-list as a template to provide the metadata recommended by KE and ESAC.


Introduction
This study focused on analysing Knowledge Exchange (KE) countries' 1 agreements with major publishers, more specifically, agreements with open access (OA) elements, to identify what article-level metadata consortia and academic institutions request from publishers and whether publishers deliver this. The study is aimed at consortia and academic institutions that subscribe to agreements with OA elements. By collecting article-level metadata, consortia and academic institutions are able to monitor OA publications, OA costs and the value of agreements. Article-level metadata will also support consortia and academic institutions in monitoring compliance with funders' OA policies.
The representatives from each KE country undertook this research as part of their work on the Monitoring OA group. 2 They collected information from agreements with major publishers in each KE country and created an article-level metadata check-list based on the KE and the Efficiency and Standards for Article Charges (ESAC) recommendations. They have repurposed the check-list as a template for consortia and academic institutions to use to request publishers to deliver article-level metadata.
KE has been undertaking work on Monitoring OA since 2015 when it hosted its first workshop on this topic. The workshop promoted debate on standards and best practices to monitor compliance with OA policies. Speakers and participants shared challenges and best practices, explored ways to support and enhance each other's activities, and discussed approaches to assess quality and impact. Participants expressed the need for common standards, identifiers and data requirements as well as for data to be compared and aggregated at an international level. Recommendations on best practices and an agreement on minimum standards were identified as necessary next steps. The workshop concluded that 'definitions, workflows and collaboration should be closely linked in order to keep monitoring'. 3
Following this initial discussion, a second KE workshop was held in 2016. This workshop focused on debating issues related to monitoring OA publications and monitoring cost data for OA publications. The workshop reinforced the need for common standards and definitions, for publishers to deliver data and for accounting systems to be interoperable. Discussions highlighted the need for data to be shared and made openly available, for publishers to provide DOIs and funder metadata, and for these requirements to be stated in agreements with publishers. Work groups discussed issues such as data, workflows, standards and policy for monitoring OA publications as well as for monitoring cost data of OA publications. A series of recommendations was issued for current research information systems (CRISs), publishers and libraries. 4
Around the same time, ESAC was discussing 'the need to develop workflow efficiencies' in the negotiation, drafting and management of offsetting agreements. 5 Following two workshops hosted by ESAC, it became clear there was a need 'for an improvement of the current workflows and processes between academic institutions […] and […] publishers in terms of author identification, metadata exchange and invoicing'. 6 As a result, ESAC issued article workflow recommendations in three key areas: author and article identification and verification, funding acknowledgement and metadata, and invoicing and reporting. The recommendations emphasized what metadata publishers should provide. ESAC explained that the 'recommendations should be seen as a minimum set of practical and formal requirements for offsetting agreements and are necessary to make any publication-based open access business model work'. 7
In September 2018 cOAlition S (an international group of funders) announced Plan S, which aims for a full transition to OA by 1 January 2021. Metadata is an important element of the Plan S guidelines. The ESAC recommendations are referred to in the guidance on the implementation of Plan S for transformative arrangements. The guidelines state that cOAlition S 'will only financially support [transformative] agreements […] where they adhere to the ESAC Guidelines'. In addition, the technical guidance and requirements request that peer-reviewed articles include: 'high-quality article level metadata in standard interoperable non-proprietary format', 'complete and reliable information on funding provided by cOAlition S funders', 'machine-readable information on the Open Access status and the license embedded in the article', 'PIDs for authors (e.g., ORCID), funders, funding programmes and grants, institutions'. 8
KE and ESAC work emphasized the need for common standards and identifiers, common definitions, automation of workflows, and collection of metadata. cOAlition S reemphasized the need to collect metadata in all the routes leading to Plan S compliance: OA publishing venues, repository route and transformative arrangements.
With an increasing number of agreements including OA elements, studies discussing what metadata must be collected and funders supporting the collection of metadata, it is becoming increasingly relevant for consortia and academic institutions to include the delivery of article-level metadata in the agreements with publishers as well as to implement mechanisms to monitor publishers' compliance with the terms of the agreements. 9 Article-level metadata are required so that consortia and academic institutions can monitor how many articles are being published OA and non-OA under each agreement, particularly in the cases where there is a cap on the number of articles that can be published OA. Consortia and academic institutions also need to monitor how much is being spent on OA publishing as well as monitor whether they have missed any publications and need to contact the author for further information. Moreover, consortia and academic institutions need article-level metadata to assess the value of the agreements, i.e. whether agreements with OA elements are delivering value for money. Ultimately, the academic institutions that pay the costs of the OA publishing element of the agreement, or the entities that pay the article processing charges (APCs), have the right to access information about the articles they fund. They have the right to know what research they are funding, if the right licence has been applied to the articles published, and if, for example, the funding acknowledgment statement has been included in the article.
As a result of the increasing need to collect article-level metadata, the KE Monitoring OA group undertook the following activities:
• collected information on agreements with 12 major publishers for the six KE countries
• classified agreements by type (subscription agreements and agreements with OA elements)
• analysed agreements with OA elements against an article-level metadata check-list based on the KE and the ESAC recommendations
• analysed article-level metadata criteria to assess what metadata was requested in consortia contracts and other relevant documentation and to assess whether publishers provided that metadata
• developed a template for publishers to use to provide article-level metadata to consortia and academic institutions based on the check-list. 10

Methodology
This research was based on the analysis of publisher agreements and other relevant documentation. The data analysed included the agreements that the six KE countries had with all or some of the following publishers: American Chemical Society, Cambridge University Press, EDP Sciences, Elsevier, Oxford University Press, Royal Society of Chemistry, SAGE Publishing, Springer Nature, Taylor & Francis and Wiley. Because the agreements and other relevant documentation analysed in this study are confidential, the publishers' names were anonymized in the data analysis.
The agreements analysed were classified into two major types: subscription agreements and agreements with OA elements. Agreements with OA elements are diverse because there are differences in how research outputs become openly available, differences in the costs applied to make research outputs openly available and differences in how many research outputs can be made openly available. 11 The KE Monitoring OA group considers agreements with OA elements to be those that include distinct OA models such as subscription agreements with APC discounts, subscription agreements with X number of free OA articles, read and publish agreements, offsetting agreements and transformative agreements. A definition of the different types of agreements with OA elements is not provided in this article but attempts to define these models have been made elsewhere. 12
Only the agreements with OA elements were considered valid for analysis because the main focus of the research was to look at the elements of agreements that comply with the gold route to OA (i.e. peer-reviewed articles become immediately available on OA and may be subject to an APC, or the APC costs may be included in the agreement model). Because Denmark has a national green OA policy, it complies with funders' OA policies via the green route (i.e. deposit of research outputs in institutional, subject, and/or funder repositories which become available on OA following an embargo period). Because all of Denmark's agreements were subscription only and did not include a gold route to open access, Denmark was not included in the data analysis.
Agreements with OA elements were analysed against the article-level metadata check-list based on the KE and ESAC recommendations to assess what article-level metadata was requested in the consortia contracts or in other relevant documentation and whether publishers provided the metadata to consortia. Table 1 lists the criteria used for the data analysis and shows if they are based on the KE or the ESAC recommendations. Three additional criteria were added by the KE Monitoring OA group. All the metadata listed in Table 1 are relevant and should be requested of publishers because they facilitate the search, discovery and sharing of information on invoicing and reporting, along with author and article identifiers and funding identifiers.
The agreements considered valid for the data analysis are dated between 1 January 2016 and 1 January 2019. In total, information was collected for 50 agreements with 12 major publishers (Annex 1). Of these, only ten publishers had agreements with OA elements. Of the agreements analysed, 46% were subscription only and 54% had OA elements.

Consortia agreements requesting article-level metadata
The first part of the analysis involved assessing which consortia agreements required publishers to provide article-level metadata. The data analysis only applied to agreements with OA elements. Out of a total of 27 agreements with OA elements identified in five KE countries (see Annex 1), 24 agreements requested some article-level metadata. Table 3 shows the agreements for which metadata was requested. Figure 1 shows the article-level metadata criteria from the most to the least commonly asked for in contracts or in other relevant documentation. DOIs were the criterion most commonly requested of publishers by consortia.

Article-level metadata provided by publishers
The second part of the analysis involved assessing which publishers provided article-level metadata.
Table 3. List of agreements analysed where consortia requested metadata and publishers provided it
Fewer article-level metadata were provided by publishers than requested in the consortia contracts or in other relevant documentation. Figure 2 shows the article-level metadata that publishers most commonly provided to consortia. DOIs were also the criterion most commonly provided by publishers. When comparing what article-level metadata consortia requested in contracts or in other relevant documentation (see Figure 1) versus what metadata publishers provided (see Figure 2), it was observed that none of the agreements requested all the metadata recommended by KE and ESAC. Nonetheless, consortia asked for more metadata than publishers provided.
The majority of publishers (seven out of ten, or 70%) provided less article-level metadata than consortia requested in the contracts or in other documentation (Table 4). However, three publishers outperformed their peers by providing more metadata than requested in contracts or other documentation: Publishers A, F and H.
Publisher J 7 3
Table 4. List of agreements analysed by metadata requested by consortia versus provided by publisher
A breakdown of the difference between article-level metadata requested by consortia versus that provided by publishers at the country level showed that some consortia did not ask for the same metadata as others from publishers, nor did publishers provide the same metadata across countries. This shows inconsistency in consortia and publishers' practices (Table 5).
Publisher I: Netherlands 6 4
Publisher I: UK 4 0
Publisher J: France 7 3
Table 5. Difference between metadata requested by consortia versus provided by publisher and country
By comparing the difference between what article-level metadata was requested by consortia versus provided by publishers according to the ESAC recommendation categories, it was possible to observe where greater efforts need to be made in terms of metadata delivery. For example, the results showed that invoicing and reporting metadata was most commonly provided by publishers, whereas funding metadata was the category where publishers scored the lowest results.
Sub-total 14 20
Table 6. The difference between metadata requested by consortia versus metadata provided by publisher sorted by ESAC recommendations categories
The data also showed a gap between how much article-level metadata was requested by consortia versus what was provided by publishers (see Table 6). For example, article DOIs were most commonly requested by consortia in contracts or in other relevant documentation but not all the publishers provided them. It also showed that none of the consortia requested '#26: Publishers [to] standardize their APC invoice and the invoicing process', nor did any of the publishers provide OA metadata on '#21: Funding acknowledgment in article', '#28: Publisher to flag funder-non-compliant articles at point of author licence acceptance', and '#31: Publishers include in Crossref a licence statement for each publication and indicate whether the publication is green, gold or hybrid OA'.
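The requested-versus-provided comparison described above can be illustrated as a set difference per agreement. The following is only a minimal sketch, not the procedure the KE Monitoring OA group actually used: the `Agreement` class, the `metadata_gap` function, the publisher and country labels and the criterion names are all hypothetical placeholders, and no data from the confidential agreements is reproduced.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Agreement:
    """One agreement with OA elements (hypothetical structure)."""
    publisher: str
    country: str
    requested: FrozenSet[str]  # check-list criteria requested by the consortium
    provided: FrozenSet[str]   # check-list criteria delivered by the publisher


def metadata_gap(agreement: Agreement) -> Dict[str, List[str]]:
    """Return the criteria requested but not provided, and vice versa."""
    return {
        "requested_not_provided": sorted(agreement.requested - agreement.provided),
        "provided_not_requested": sorted(agreement.provided - agreement.requested),
    }


# Invented example values; they do not correspond to any agreement in the study.
example = Agreement(
    publisher="Publisher X",
    country="Country Y",
    requested=frozenset({"article DOI", "author ORCID", "funder ID", "licence"}),
    provided=frozenset({"article DOI", "licence", "APC amount"}),
)

gap = metadata_gap(example)
print(gap["requested_not_provided"])
print(gap["provided_not_requested"])
```

Applied to every agreement, a comparison of this kind would yield the per-publisher and per-country counts that tables of requested versus provided metadata summarize.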

Results discussion
The data analysed showed that none of the agreements with OA elements requested all the metadata recommended by KE and ESAC. It also showed that overall, consortia ask for more metadata than publishers provide. Importantly, none of the publishers provided all the metadata requested by consortia (or recommended by KE and ESAC). Publishers also did not deliver exactly the same metadata across countries. This may be a result of consortia being more aware of the need to collect metadata than publishers but still not having all the processes and workflows in place to request such metadata. Publishers may not be aware of the need to provide metadata or may not have systems to deliver it automatically and systematically. Publishers' inconsistent provision of metadata across countries may be due to them not having aligned international strategies. All these issues pose challenges both to monitoring publishers' compliance with the terms of the licensing agreements and to monitoring compliance with research funders' OA policies.
Funding metadata was the area where publishers provided the fewest article-level metadata. This is due to publishers not capturing this information (e.g. not collecting information about the FunderRef ID at the article submission stage) but it may also be due to collecting poor funding metadata (e.g. a free text field where authors can add funders' names instead of a prepopulated field where authors can choose from a list of funders). Previous analysis undertaken on the data provided by the Springer Compact agreement to the UK consortium showed that collecting funding metadata is difficult because the publisher's data was neither sufficient nor robust enough to allow for any significant conclusions to be drawn. 13 Importantly, it was observed that the funders' metadata referred to the research funding source and not to the APC funding source and that not all funders had been identified and acknowledged by authors. As a result of these findings, and with Plan S becoming effective from 2021, publishers should strive to collect and report clearer funding metadata in order to demonstrate compliance with funders' OA policies (e.g. they should use the Funder Registry from Crossref), to report on the APC funding source and to ensure that funders are correctly acknowledged in articles.
These findings show that there is scope for improvement. Consortia and academic institutions can request more of the metadata that KE and ESAC deem essential to improving workflow efficiencies, and publishers have the responsibility to deliver this information to their customers.

Template for article-level metadata collection
To promote the consistent delivery of article-level metadata by publishers to consortia and academic institutions, the KE Monitoring OA group repurposed the article-level metadata check-list as a template 14 for publishers to use as a reporting tool. The template informs consortia and academic institutions about what metadata to request from publishers. It also enables them to monitor publishers' compliance with the terms of consortia licensing agreements and to monitor compliance with funders' OA policies. The metadata collected by consortia and academic institutions enables them to benchmark how publishers are performing across KE countries and beyond, as well as to promote cross-country data analysis and storage of article-level metadata in international databases (e.g. Open APC 15 ).
When publishers provide article-level metadata, consortia will be able to assess the impact of publishers' agreements. For example, consortia will be able to understand which academic institutions are publishing the highest or lowest number of OA articles, when the highest or lowest number of publications occur and what the publication trends are from year to year. Consortia can also assess which titles are the most or least popular for OA publications, what the highest, the lowest and the average APC costs are (if applicable), which authors publish more articles (i.e. through ORCID IDs), how many articles' APCs are paid by research funders, what disciplines are more popular, and so on. For example, as part of the Springer Compact agreement, Springer Nature provides monthly article-level metadata to consortia, and with this data it was possible to assess the value of the Compact agreement in the UK. 16 The KE Monitoring OA group plans to contact publishers individually to inform them about the purpose of the template as well as to request them to deliver article-level metadata on a systematic basis.
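The kinds of assessment listed above could be derived from an article-level report with a few simple aggregations. The sketch below is illustrative only: the field names (`doi`, `institution`, `journal`, `apc`), the `summarise` function and every value in `report` are assumptions made for the example, and they do not reproduce the KE template or any publisher's actual report.

```python
from collections import Counter
from statistics import mean
from typing import Any, Dict, List, Optional

# Invented rows standing in for a publisher's article-level report.
# The DOIs are placeholders and do not resolve to real articles.
report: List[Dict[str, Any]] = [
    {"doi": "10.0000/example.1", "institution": "University A",
     "journal": "Journal X", "apc": 1500.0},
    {"doi": "10.0000/example.2", "institution": "University A",
     "journal": "Journal Y", "apc": 2000.0},
    {"doi": "10.0000/example.3", "institution": "University B",
     "journal": "Journal X", "apc": 1000.0},
    {"doi": "10.0000/example.4", "institution": "University B",
     "journal": "Journal X", "apc": None},  # APC not reported
]


def summarise(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count articles per institution and journal and summarise APC costs."""
    # Rows without a reported APC are excluded from the cost figures only.
    apcs: List[float] = [r["apc"] for r in rows if r["apc"] is not None]
    apc_min: Optional[float] = min(apcs) if apcs else None
    apc_max: Optional[float] = max(apcs) if apcs else None
    apc_mean: Optional[float] = mean(apcs) if apcs else None
    return {
        "articles_per_institution": dict(Counter(r["institution"] for r in rows)),
        "articles_per_journal": dict(Counter(r["journal"] for r in rows)),
        "apc_min": apc_min,
        "apc_max": apc_max,
        "apc_mean": apc_mean,
    }


summary = summarise(report)
print(summary)
```

A report that lacks a field such as the APC amount or the institution cannot support the corresponding aggregation, which is one practical reason for requesting the full set of metadata in the agreement itself.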

Recommendations for further research
The authors recommend that further analysis is undertaken on agreement types because some agreements with OA elements seem to be more successful in delivering article-level metadata than others (e.g. read and publish agreements). It is also recommended that further analysis is undertaken on the agreement start dates, as more recently drafted agreements (i.e. read and publish agreements, offsetting agreements and transformative agreements) seem to be more successful in delivering article-level metadata than some of the older agreements (i.e. subscription agreements with APC discounts or with X number of free OA articles), possibly because some consortia and publishers did not have a clear idea of what kind of metadata would be needed when drawing up those older agreements.

Data accessibility statement
The data used for this research article has not been made available because the publisher agreements and other relevant documentation analysed are confidential information and cannot be disclosed publicly.