The HHuLOA project

The Universities of Hull, Huddersfield and Lincoln are three medium-sized institutions in the north of England. Each University has a growing research portfolio and, like other universities, each has been active in supporting open access (OA) for many years. This has included playing an active role in the development of their local institutional repository, looking to exploit technology to further OA services.

All three institutions are seeking to develop their research capability and reputation further through a number of internal and external projects. One such project was the Jisc Open Access Good Practice Pathfinder programme,1 which ran from 2014 to 2016. The institutions successfully bid for funding in this programme as the HHuLOA (Hull, Huddersfield, Lincoln Open Access) project.2 The aim of the HHuLOA project was to identify how OA support mechanisms can be used to assist with the development of research, working towards a more effective and rewarding submission to the post-2014 Research Excellence Framework (REF). Working together, the three institutions have been able to bring a wealth of experience and innovative thinking to capturing existing and novel good practice. This has then been shared with the aim of supporting other institutions to develop their research capability and to use OA as a means of supporting this.

The HHuLOA project addressed a number of themes, each a component of the broader aim:

  • establishing a baseline of what institutions are doing to support OA, capturing information from a group of institutions and sharing this openly
  • developing OA life cycles from different stakeholder perspectives
  • developing local repository systems to meet policy requirements
  • exploring how OA can be managed across institutional stakeholders, including research support offices
  • working with Jisc to inform development of services that meet institutional requirements
  • understanding how OA might be embedded within e-resource management processes to aid local streamlining of workflows.

Alongside these, the key area of OA policies merited attention, as the landscape was becoming ever more complex. All three institutions were struggling to understand the various policies that existed and to develop ways of communicating these to local academic audiences. This paper describes the work undertaken in this area, with the aim of enabling academic audiences to navigate the policy environment they find themselves in, so that they can comply with it and better understand the rights they have when using OA.

The problem: a confusing OA policy landscape

In the last decade, and particularly since 2012, scholarly research publishing in the UK has been directed by a series of policies, mandates and statements intended to promote, influence, or restrain the overall move towards OA. Policies have been created by government bodies, funding agencies of all types, commercial publishers, scholarly societies and universities. These agencies have not attempted to co-ordinate policy terms, and the result is a somewhat confusing OA policy landscape.

Some academic researchers have responded by becoming experts in reading and understanding OA policies. However, the scope and terms of OA policies are often poorly understood outside small groups of OA enthusiasts in a given organization. Many academic staff have been left confused, frustrated and stressed by new obligations placed upon long-established publishing practices and by the way in which these changes have been communicated. The language used in these policies reflects different communities of interest with different obsessions. In addition, the pace of change has been rapid, and there has not been a reliable method of communicating changes to researchers. Many research communities, by contrast, are resistant to rapid, externally imposed change. Universities have been particularly slow to adapt central support mechanisms to deal with OA mandates. Consequently, there has been no single place where a researcher can navigate and compare all policies or specific policy statements, or understand where overlapping policies reinforce each other and where they are in opposition. OA terminology has never been fully standardized and jargon has to be interpreted. Support staff in universities have been called upon to explain the effect of new policies to an often sceptical academic body, which inevitably has led to some simplification and subjective interpretation. On occasion, special interest groups from both pro- and anti-OA camps have misinterpreted the meaning of policy statements.

Other projects

Our experience within the HHuLOA project has not been unique, nor has the desire to find a solution. A number of initiatives have been undertaken to address the problem from different perspectives.

Sherpa JULIET

The JULIET service3 emerged as a companion service to Sherpa RoMEO,4 the widely used service that provides information on journal policies on OA. JULIET, by contrast, lists information about OA policies from research funders. Researchers can use this to identify if their funder has an OA policy and see a brief breakdown of this. Links are also provided to policy web pages to allow for more detailed follow-up. The service is one of information provision, facilitating access to that information and interpreting it to aid understanding. But it stops short of being a decision-making tool in itself.

PASTEUR4OA

The EU-funded PASTEUR4OA (Open Access Policy Alignment Strategies for European Union Research) project5 is a Europe-wide OA advocacy project with the specific aim of standardizing OA policies from funders and organizations carrying out research. It was recognized that although many of these have issued OA policies, the policies are not consistent in their layout and terms. This makes comparison very difficult when determining what a researcher needs to do. The project has proposed a set of standard fields for structuring OA policies to make such comparison easier.

Jisc Monitor

In order to facilitate management of OA, particularly gold OA, the Jisc Monitor6 project is developing tools that help to capture information about OA publications. An added value part of the service is the ability to highlight whether an article is compliant with relevant policies. Focusing on the REF OA policy, the service draws on a variety of information sources to determine compliance and provides guidance back to the service manager (usually in the library) to act on.

The process: codifying and recording policy statements

Given this context, the project considered that finding a way to navigate through the various policies would be of benefit. This encompassed the potential to interpret multiple policies from different perspectives (for example, if a researcher has to comply with this funder policy, that institutional policy and yet another policy from the journal of choice).

Firstly, the HHuLOA project team tried to identify as many policies, mandates and statements from stakeholder organizations as possible, soliciting suggestions via mailing lists and blogs. The next step was to read the policies, systematically extract any meaningful individual statements or conditions, and codify the statements by recording them in a spreadsheet using an evolving, ad-hoc set of columns. At the time of writing, the spreadsheet consists of 25 columns (see Table 1 below), each recording a different policy statement. The columns were given pseudo-variable names as placeholders. Where possible, values in each column were kept to controlled lists of options. The spreadsheet containing the data gathered by HHuLOA was published to Google Drive and is publicly accessible to view.7

| Line no. | Short description | Long description |
| --- | --- | --- |
| 1 | HHuLOAPolicyID | An arbitrary internal numeric identifier for the record within this spreadsheet |
| 2 | policyName | The full name of the policy as it appears on the document or web page where the policy is found |
| 3 | policyBodyFullName | The name of the organization which owns or enacts the policy |
| 4 | policyBodyAbbreviatedName | The name of the organization which owns or enacts the policy in abbreviated form |
| 5 | policyBodyType | Takes the values: Funder (RCUK); Funder (non-RCUK); Government; HEI; Publisher |
| 6 | policyBodyGeoJurisdiction | The name of the country or area within which the policy applies |
| 7 | policyTakesEffectDate | The date that the policy takes or took effect |
| 8 | policyPersonScope | A description of who is bound by the policy |
| 9 | policyPublicationScope | A description of the types of research output which the policy covers |
| 10 | policyURL | The URL where the text of the policy can be found |
| 11 | goldAccepted | Takes the values Y/N for whether the policy allows gold OA |
| 12 | greenAccepted | Takes the values Y/N for whether the policy allows green OA |
| 13 | preferredMethod | Whether gold or green is the preferred method, if a preference is given |
| 14 | policyLicenceGold | The licence(s) which should be applied for outputs covered by the policy made OA under a gold route |
| 15 | policyLicenceGreen | The licence(s) which should be applied for outputs covered by the policy made OA under a green route |
| 16 | policyEmbargoGreenSTEMMonths | Green embargo periods for STEM subject disciplines, expressed as a number of months |
| 17 | policyEmbargoGreenA&Hmonths | Green embargo periods for arts & humanities subject disciplines, expressed as a number of months |
| 18 | versionGreen | A description of the permitted version(s) of a document which may be made OA through a green route |
| 19 | fundingAcknowledgementRequired | Takes the values Y/N for whether the funding body requires acknowledgement of funding in the published output |
| 20 | policyBodyHasDataPolicy | Takes the values Y/N for whether the policy body also has a research data policy |
| 21 | dataPolicyURL | The URL of the data policy if one exists |
| 22 | repositorySpecifiedName | The name of a specific repository if one is specified in the policy |
| 23 | repositorySpecifiedURL | The URL of a specific repository if one is specified in the policy |
| 24 | discoveryMandated | Whether discovery is mandated (i.e. metadata should be made available) immediately or after embargo of full text |
| 25 | notes | Additional, human-readable information about the terms of the policy that does not fit into any of the other fields |

Table 1. List of column headings used in the policy landscape spreadsheet
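The idea of keeping column values to controlled lists can be illustrated with a short check. The following is a minimal sketch only: the controlled lists are taken from the descriptions in Table 1, but the function name, the example row and the approach of validating rows in this way are assumptions made for illustration and are not part of the HHuLOA spreadsheet or project outputs.

```python
# Controlled lists taken from the column descriptions in Table 1.
# The dictionary and function names are hypothetical, for illustration only.
CONTROLLED_VALUES = {
    "policyBodyType": {
        "Funder (RCUK)", "Funder (non-RCUK)", "Government", "HEI", "Publisher",
    },
    "goldAccepted": {"Y", "N"},
    "greenAccepted": {"Y", "N"},
    "fundingAcknowledgementRequired": {"Y", "N"},
    "policyBodyHasDataPolicy": {"Y", "N"},
}


def find_uncontrolled_values(row):
    """Return {column: value} for cells that fall outside their controlled list.

    Empty or missing cells are ignored, since not every policy states
    every condition.
    """
    problems = {}
    for column, allowed in CONTROLLED_VALUES.items():
        value = (row.get(column) or "").strip()
        if value and value not in allowed:
            problems[column] = value
    return problems


# Hypothetical example row, not taken from the HHuLOA spreadsheet.
example_row = {
    "policyName": "Example Funder Open Access Policy",
    "policyBodyType": "Funder (non-RCUK)",
    "goldAccepted": "Y",
    "greenAccepted": "Yes, after embargo",
}
print(find_uncontrolled_values(example_row))
# prints {'greenAccepted': 'Yes, after embargo'}
```

A check of this kind would flag cells where a caveat or exception has been typed into a Y/N column, which is where the free-text notes column is likely to be the better home for the detail.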

As part of the process, the team attempted to simplify and standardize conditions across policies where the wording was different but the obligation placed on the researcher was the same. The coding is therefore based on the project team’s inevitably subjective understanding of each policy. In particular, the statements of greatest importance to those actively engaged with OA in universities (for example, researchers, research engagement support staff and repository managers) were drawn out from each policy. Because this recording of policy statements was done by a small group of people all working in academic libraries or repository support services, it is possible that some subjective bias crept into the analysis of policies. The original policy documents remain definitive, and any use of the codified policy statements in the HHuLOA spreadsheet is at the reader’s own risk.

Interpreting and codifying the policies was difficult and problematic at times. Not all policies were easy to find on the websites of the agencies to whom they belonged. On several occasions, deep links to PDF policy documents from press releases about the launch of the policy had broken, presumably when the organization’s website was updated. There was little consistency in the titles given to policies, or how they were presented.

The wording in policies varied from very clear, granular and transparent lists of individual terms, to rather more opaque narrative policies where discrete requirements had to be teased out from blocks of text. The fine detail of obligations placed on authors was sometimes buried deep within the text even when the headline conditions seemed simple. Because of exceptions, caveats and footnotes attached to initially simple seeming statements, it was not always possible to reduce values to a relatively small set of controlled values.

Some of the problems described above could be mitigated if policy authors themselves were to provide simplified, machine-readable versions of their policies, or if they would be willing to break existing policies down to fit a database similar to HHuLOA’s spreadsheet and write new policies with such an exercise in mind. The project believes that this would have the desirable side effect of standardizing policy terms and the presentation of policy documents.

Crowdsourced population of the data

The project team believed that many pairs of eyes would help to make the spreadsheet as complete as possible and to ensure that policy statements were correctly codified. To this end the spreadsheet is available for anyone to edit. Like other outputs of HHuLOA and related projects,8 this work package was therefore crowdsourced once the initial policies and fields were agreed by the team. E-mail, the project blog and a project workshop held in June 2015 were all used for ‘harnessing collective intelligence’.9 As a result, a number of policies were added from the UK and US, including some institutional policies.

Access to data

The HHuLOA project would like to encourage use of the data. A publicly accessible spreadsheet containing the data gathered by the HHuLOA team has been published to Google Drive.10 In addition, those wishing to reuse the data or ingest it into their own applications have access to a CSV version.11 Before reusing or reformatting the data, users should refer to guidance for their programming language of choice on how to read CSV data. The spreadsheet is likely to evolve, especially as standards emerge, so users should consider caching the data in a local database and setting up a scheduled task to update the cache periodically. Table 1 shows the columns currently in use. Users should ensure that their database mirrors the columns listed in Table 1 and that any scripts that ingest the data fail gracefully should the column headings change.
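The advice on caching and failing gracefully can be sketched as follows. This is an illustration under stated assumptions, not a project deliverable: it uses Python's standard csv and sqlite3 modules, takes the CSV text as a string rather than fetching it (the download step and its URL are left out), and relies on only four of the Table 1 column headings. The function names, the exception class, the table layout and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3

# A subset of the Table 1 column headings that this hypothetical script relies on.
REQUIRED_COLUMNS = ["HHuLOAPolicyID", "policyName", "policyBodyType", "policyURL"]


class UnexpectedHeadingsError(ValueError):
    """Raised when the CSV no longer contains the headings the script expects."""


def parse_policies(csv_text):
    """Parse CSV text into a list of dicts, checking the headings first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headings = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in headings]
    if missing:
        raise UnexpectedHeadingsError("Missing columns: " + ", ".join(missing))
    return list(reader)


def refresh_cache(connection, csv_text):
    """Replace the cached rows with fresh data.

    If the headings have changed, the existing cache is left untouched
    and False is returned instead of raising an error.
    """
    try:
        rows = parse_policies(csv_text)
    except UnexpectedHeadingsError as error:
        print("Keeping existing cache:", error)
        return False
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "CREATE TABLE IF NOT EXISTS policies ("
            "policy_id TEXT PRIMARY KEY, name TEXT, body_type TEXT, url TEXT)"
        )
        connection.execute("DELETE FROM policies")
        connection.executemany(
            "INSERT INTO policies VALUES (?, ?, ?, ?)",
            [tuple(row[column] for column in REQUIRED_COLUMNS) for row in rows],
        )
    return True


# Hypothetical sample data standing in for the downloaded CSV.
sample_csv = (
    "HHuLOAPolicyID,policyName,policyBodyType,policyURL\n"
    "1,Example Policy,HEI,https://example.org/policy\n"
)
changed_csv = "id,name\n1,Example Policy\n"

connection = sqlite3.connect(":memory:")
first_refresh = refresh_cache(connection, sample_csv)    # headings match: cache filled
second_refresh = refresh_cache(connection, changed_csv)  # headings changed: cache kept
cached_count = connection.execute("SELECT COUNT(*) FROM policies").fetchone()[0]
print(first_refresh, second_refresh, cached_count)
```

A scheduled task could call a function like refresh_cache with freshly downloaded CSV text. Because the headings are checked before anything is deleted, a renamed or removed column leaves the previous copy of the data in place rather than emptying the cache.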

Benefits

The HHuLOA project team believes that extracting and recording policy statements and sharing them publicly in an openly editable, data-extractable format will provide the following benefits to different OA stakeholders and their workflows:12

  • academic researchers: clearer advice about existing obligations under overlapping policies, and greater ease of comparison of the effects of institution, publisher and funder involvement at an earlier stage
  • repository and OA publishing support staff: all relevant policies available through one clear source for quick checking and comparison
  • research support departments: access to a tool that can be used within research management workflows
  • web developers: access to the data enabling interoperability and the building of custom implementations
  • policy owners: standardized policy terms and presentation of policy documents.

By simplifying the navigation around different policies, the project team believes that the focus of attention can be directed to the benefits of OA as a component part of research dissemination overall.

Next steps

The HHuLOA project ended in May 2016; the final outputs from the project were presented at a workshop on the topic of embedding OA in York on 12 May, for the Northern Collaboration13 group of 25 higher education libraries. In addition, all project outputs have been deposited in the OA repositories of the Universities of Hull, Huddersfield and Lincoln and are listed on the project blog.14 Extending slightly beyond the end of the Jisc Open Access Pathfinder programme, the HHuLOA project partners are continuing to work closely with the Jisc Monitor15 project to inform good practice in the development of their Monitor Local and UK services, and to consider how those services will inform good practice in OA at universities.

The University of Lincoln has been developing a pilot researcher ‘dashboard’ system to summarize and present research information held in disparate systems. The intention is to use the policy spreadsheet as a data source to filter information and guidance based on a researcher’s commitments to particular funders. This idea has been favourably received by academic staff and provides an incentive to get more granular information about policies. At Hull, what can and cannot be made OA based on the policies of publishers has been a longstanding barrier in engaging academic staff, who feel frustrated by having to understand different approaches to what is, essentially, the same issue. This has been compounded more recently by funder policies, driven by those from RCUK and HEFCE, that require a combined view to fully assess what staff need to do. The opportunity to work on identifying how to better navigate the different policies has thus been timely. For all three partners, the work carried out in the HHuLOA project as a whole has helped raise awareness of the changing policy landscape and has allowed focused discussions to be held between the Library and Research and Enterprise offices.
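As an illustration of the dashboard idea of filtering guidance by a researcher’s funders, the following minimal sketch selects rows by the policyBodyAbbreviatedName column and builds a one-line summary from a few other Table 1 columns. It is not the Lincoln pilot system: the function names, the example rows and the wording of the summary are assumptions made for illustration.

```python
def guidance_for_researcher(rows, funder_abbreviations):
    """Return the policy rows whose owning body matches one of the given funders.

    Matching is on policyBodyAbbreviatedName, ignoring case and
    surrounding whitespace.
    """
    wanted = {name.strip().lower() for name in funder_abbreviations}
    return [
        row for row in rows
        if (row.get("policyBodyAbbreviatedName") or "").strip().lower() in wanted
    ]


def headline(row):
    """Build a one-line summary from a few Table 1 columns."""
    routes = [
        label
        for column, label in (("goldAccepted", "gold"), ("greenAccepted", "green"))
        if row.get(column) == "Y"
    ]
    return "{}: accepts {}; see {}".format(
        row.get("policyName", "unnamed policy"),
        " and ".join(routes) or "no stated route",
        row.get("policyURL", "no URL recorded"),
    )


# Hypothetical rows, not taken from the HHuLOA spreadsheet.
rows = [
    {"policyName": "Funder A OA Policy", "policyBodyAbbreviatedName": "FA",
     "goldAccepted": "Y", "greenAccepted": "Y", "policyURL": "https://example.org/fa"},
    {"policyName": "Funder B OA Policy", "policyBodyAbbreviatedName": "FB",
     "goldAccepted": "N", "greenAccepted": "Y", "policyURL": "https://example.org/fb"},
]

for row in guidance_for_researcher(rows, ["fb"]):
    print(headline(row))
# prints: Funder B OA Policy: accepts green; see https://example.org/fb
```

Any summary produced this way should link back to the policy text, since the original policy documents remain definitive.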

The spreadsheet of codified policy statements described in this article will remain publicly available and openly editable via Google Drive. Anyone who works with OA policies, is a researcher or supports researchers or the scholarly communication process is invited to add new policies and to improve the existing data. The CSV data will remain available for developers to make use of under a Creative Commons CC0 ‘no rights reserved’ public domain waiver licence.16