A persistent identifier (PID) is defined by Wikipedia as ‘a long-lasting reference to a document, file, web page, or other object … usually used in the context of digital objects that are accessible over the Internet. Typically, such an identifier is not only persistent but actionable: you can plug it into a web browser and be taken to the identified source.’1
PIDs have been around for a long time, especially in scholarly communications. Think of the ISBN (International Standard Book Number), first introduced in 1966; or its journal equivalent, the ISSN (International Standard Serial Number), launched five years later in 1971. But they really started to take off when scholarly communications went digital in the late 1990s, and with the launch of Crossref as a provider of digital object identifiers (DOIs) for research articles and other works. Since then, the number of persistent identifiers has increased dramatically: nearly 150 million DOIs had been assigned at the time of writing (October 2017),2 along with close to four million ORCID iDs for researchers.3
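To make the ‘actionable’ property concrete, here is a minimal Python sketch of how a bare DOI can be turned into a URL that a browser can follow. It assumes the public resolver at https://doi.org/; the function name `doi_to_url` and the list of prefixes it strips are illustrative choices, not part of any official library.

```python
from urllib.parse import quote

# Public DOI resolver. Everything else in this sketch is hypothetical.
DOI_RESOLVER = "https://doi.org/"

# Prefixes that people commonly paste in front of a DOI (illustrative list).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return an actionable resolver URL for a DOI string."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Very light sanity check: DOIs start with "10." and contain a slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.5438/11.0002"))
```

The same DOI therefore works both as a citation string and as a link, which is what distinguishes it from an identifier such as an ISBN that needs a separate lookup service.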
PIDs enable authoritative, unambiguous, digital connections between people (researchers), places (their organizations), and ‘things’ (their research contributions and outputs). The research infrastructure – from more established tools like manuscript submission and grant application systems, to innovative new services such as Altmetric, Kudos and Publons – is increasingly reliant on these connections.
But, despite the ubiquity of PIDs in scholarly communications, until recently the PID community lacked a dedicated space in which to explore ideas of networked research and scholarly communications infrastructure. To fill this gap, in November 2016 a diverse group of experts from California Digital Library, Crossref, DataCite and ORCID organized the first PIDapalooza.4
Described as ‘the first open festival for scholarly research persistent identifiers’, PIDapalooza took its cue from the music festival after which it is named, Lollapalooza.5 The intention was to bring together PID enthusiasts – those who create and/or use persistent identifiers for scholarly communications – for two days of high-level but informal and interactive discussions.
What kinds of PIDs will we need in future? How should they be used? What are the best ways to get researchers to adopt and use PIDs? What are the theoretical and practical approaches to persistence and interoperability? These and many other questions were addressed at the first PIDapalooza, which was attended by 120 PID experts from around the world.
Much of the meeting was spent in short (half hour or less) parallel sessions, but there were also five plenaries. Together with the session on organization identifiers (which was so well attended it was virtually a plenary), these provide a good representation of the PIDapalooza experience.
First up was Jonathan Clark, Executive Director of the International DOI Foundation, whose talk, ‘PIDvasive – What’s possible when everything has a persistent identifier?’,6 looked at what we should expect from our persistent identifiers beyond persistence and uniqueness. The answer: provenance, metadata, machine readability, and policies/guarantees. Clark then went on to discuss the risk of having too many PIDs, the types of services that might be built on them, and the critical need for both interoperability and a social infrastructure for PIDs.
Day one ended with the second plenary, by Simon Porter, VP of Research Engagement and Information Architecture at Digital Science, entitled ‘Research Information Citizenship’.7 He called on each scholarly communications sector – universities, publishers, funders, service providers and researchers themselves – to play their part in making the digital research infrastructure work better. Porter also raised the need for collaboration to build shared infrastructure tools and services, especially among service providers.
Clifford Tatum, Project Manager at ACUMEN and researcher at Leiden University, kicked off day two. His talk, ‘Towards Governance of PID Portability for Research Evaluation’,8 looked at the use of PIDs in the collection of research information for the purpose of evaluation, and the challenges this creates – in a world where open science and interoperability are increasingly the norm – in terms of privacy, security and commercial concerns. Tatum’s proposed solution was to focus on improving the portability of PIDs through better standards and protocols.
The fourth plenary was by Herbert Van de Sompel, team leader of the Prototyping Team at the Research Library of the Los Alamos National Laboratory. His talk, ‘Signposting for Persistent Identifiers’,9 demonstrated that many papers cite uniform resource identifiers (URIs) other than the DOI URI, reducing the potential power of PIDs. His solution to this problem was to create a signposting pattern for PIDs to enable the automatic discovery and use of the DOI URI rather than other types of URI associated with the DOI-identified object.
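As a rough illustration of what such automatic discovery could look like, the Python sketch below scans an HTTP `Link` header for a link that marks the preferred URI for citation. It assumes the signpost is expressed with the `cite-as` link relation (later specified in RFC 8574); the talk itself may have used a different relation name, and the helper `find_cite_as` and the example.org URL are hypothetical. The parsing is deliberately simplified and is not a full implementation of the Web Linking syntax.

```python
import re

# One link-value: "<target>" followed by its parameters, up to the next "<".
LINK_RE = re.compile(r"<([^>]+)>([^<]*)")
# The rel parameter, quoted or unquoted; may hold several space-separated values.
REL_RE = re.compile(r';\s*rel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def find_cite_as(link_header: str):
    """Return the target of the first link with rel 'cite-as', or None."""
    for target, params in LINK_RE.findall(link_header):
        rel = REL_RE.search(params)
        if rel and "cite-as" in rel.group(1).lower().split():
            return target
    return None


if __name__ == "__main__":
    # Example header a landing page might send (values are assumptions).
    header = (
        '<https://doi.org/10.5438/11.0002>; rel="cite-as", '
        '<https://example.org/files/programme.pdf>; rel="item"; '
        'type="application/pdf"'
    )
    print(find_cite_as(header))
```

A reference manager that found the PDF or landing-page URL first could use a lookup like this to record the DOI URI instead, which is the behaviour the signposting pattern aims to encourage.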
The last official plenary speaker was Carly Strasser, Program Officer for the Data-Driven Discovery Initiative at the Gordon and Betty Moore Foundation. She had the (un?)enviable task of drawing together everything that went on at PIDapalooza and she did so with aplomb – and a little Lollapalooza inspiration. Strasser described her talk, ‘Reaching Nirvana: The Future of Persistent Identifiers’,10 as ‘a “Greatest Hits” of takeaways, lessons learned, points for discussion, and new directions’.
As mentioned, there was also an unofficial plenary – a very well-attended (and lively!) session on organization identifiers. Led by Patricia Cruse, Laure Haak and Ed Pentz (respectively Executive Directors of DataCite, ORCID and Crossref), it began with an update on the three organizations’ joint effort to review existing work on organization identifiers and define use cases. The rest of the time was spent on a wide-ranging discussion about next steps, in which a variety of (sometimes divergent) views were expressed. However, there was general agreement that none of the current providers of organization identifiers meets all scholarly communications use cases – especially in terms of researcher affiliations – and there was support for a community working group to seek a solution to this challenge.
The response to PIDapalooza 2016 was enthusiastic, so we are now planning the next one, to be held in Girona, Spain on 23–24 January 2018. As with its predecessor, the goal of PIDapalooza 2018 is to create an open, welcoming atmosphere in which to discuss persistent identifiers, and it’s open to anyone who creates or uses PIDs.
Content will fall into eight broad themes:
The programme for PIDapalooza 2018 (which of course has its own DOI!)11 is not finalized at the time of writing, since proposals are still being accepted, but I can guarantee that the content will be just as diverse and thought-provoking as the last one and that the level of audience participation and engagement will be just as high. You can find out more on the pidapalooza.org website, register at http://pidapalooza2018.eventbrite.com, and follow @pidapalooza for updates on speakers, sessions, and more. (See Figure 1 for the official logo and a reminder of the dates.)
A list of the abbreviations and acronyms used in this and other Insights articles can be accessed here – click on the URL below and then select the ‘Abbreviations and Acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa
The author helps organize PIDapalooza.
Wikipedia entry for ‘Persistent identifier’. https://en.wikipedia.org/wiki/Persistent_identifier (accessed 10 October 2017).
Factsheet: Key Facts on Digital Object Identifier System. http://www.doi.org/factsheets/DOIKeyFacts.html (accessed 10 October 2017).
ORCID statistics. http://orcid.org/statistics (accessed 10 October 2017).
PIDapalooza. https://pidapalooza.org/ (accessed 10 October 2017).
Lollapalooza. https://es.wikipedia.org/wiki/Lollapalooza (accessed 11 October 2017).
Clark, J (2016). PIDvasive – What’s possible when everything has a persistent identifier? PIDapalooza Keynote.pdf. figshare, DOI: https://doi.org/10.6084/m9.figshare.4233839.v1 (accessed 10 October 2017).
Porter, S (2016). Exploring the relationship between persistent identifiers and research information citizenship. figshare, DOI: https://doi.org/10.6084/m9.figshare.4220454.v1 (accessed 10 October 2017).
Tatum, C (2016). Towards governance of PID portability for research evaluation. figshare, DOI: https://doi.org/10.6084/m9.figshare.4212732.v1 (accessed 10 October 2017).
Van de Sompel, H (2016). A Signposting Pattern for PIDs. figshare, DOI: https://doi.org/10.6084/m9.figshare.4249739.v1 (accessed 10 October 2017).
Strasser, C (2016). Reaching Nirvana: The Future of Persistent Identifiers. figshare, DOI: https://doi.org/10.6084/m9.figshare.4220520.v1 (accessed 10 October 2017).
PIDapalooza’s DOI: https://doi.org/10.5438/11.0002 (accessed 11 October 2017).