Background

Researchers undertake a number of relatively standard tasks during the research process. For example, they apply for grants, submit articles for publication, and deposit their work in institutional or subject repositories. Much of the same information is required each time; to date, it has typically been re-entered manually, which is time-consuming and can introduce inconsistencies. These inconsistencies – for example, whether an author's full name or just initials are given, how the institutional affiliation is written, or whether the last name has changed – make it hard for interested organizations, such as institutions or funders, to access and assess the full record for a specific individual or institution. Researchers are evaluated on their record of scholarship and, under growing time and budget pressure, are beginning to question why they have to enter the same information over and over again each time they submit a paper or a grant application. Meanwhile, funders and institutions are asking:

  • Why do our databases have duplicate records?
  • Why do we have to ask researchers and scholars for their contributions when these are already in multiple databases … somewhere?
  • Why do we have to spend so much time on disambiguation?
  • How do we keep our repository up to date?
  • How can we track people? We have no idea what they do when they move on. Isn't there a better way than manually searching the web?

Across the community, the overall question is: how can we know what a person has contributed to the research and scholarly community, if we can't tell who's who?

The case for ORCID

At the heart of all of these questions is the need to affiliate and associate individuals with their research and scholarly works: journal articles, monographs, data sets, software, you name it. In the research community, we have been lacking the ability to link research and researchers, contributors and works.

This is where ORCID (Open Researcher and Contributor ID) fits in.

ORCID is a community-driven non-profit organization that provides an open registry of persistent identifiers for researchers and scholars. Researchers can register for free for a unique identifier, and associate this with existing works and identifiers using search and import tools, directly from the ORCID interface. The researcher controls what data are linked and what data are publicly viewable. Because ORCID is international, identifiers can be used throughout the researcher's career, wherever it takes him/her.

ORCID works with the research community to embed these identifiers in research workflows: grant applications, manuscript submissions, association membership renewal, meeting abstract submission – everywhere we can connect a person with a research or scholarly contribution. This means that, moving forward, new works will include an ORCID identifier, and it will become possible to automate the process of linking back to an individual's ORCID record.
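
Systems that collect identifiers in workflows like these will usually want to catch mistyped values at the point of entry. The following is a minimal sketch of such a check, assuming the identifier format ORCID publishes: 16 characters in four hyphenated groups, the last of which is a check character computed with ISO 7064 MOD 11-2. The function names are illustrative and not part of any ORCID library.

```python
import re

# Four groups of four characters; only the final character may be "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(identifier: str) -> bool:
    """Return True if the identifier is well formed and its check character matches."""
    if not ORCID_PATTERN.match(identifier):
        return False
    compact = identifier.replace("-", "")
    return check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this only shows that an identifier is well formed. It does not show that the identifier belongs to the person entering it, which is why integrations generally have the researcher sign in to ORCID rather than type the value by hand.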

Participants to date

ORCID launched its registry in October 2012. Since then, we have issued identifiers for over 250,000 individuals. Some of our organizational members have launched integrations and we are seeing ORCID identifiers becoming embedded in research information infrastructure:

  • Universities, including Harvard University, Boston University, New York University Langone School of Medicine, University of Cambridge, Chalmers University of Technology, University of Hong Kong and King Abdulaziz City for Science and Technology, are working to integrate ORCID identifiers into personnel, research administration, profile and IT systems.
  • Associations are starting to discuss integrating ORCID identifiers into membership and meeting workflows. The Alfred P Sloan Foundation has awarded ORCID a grant to assist associations and universities with their integration plans. Integration demonstrations from these projects will be showcased in May 2014, at the ORCID Outreach meeting to be held in Chicago, IL.
  • Publishers, including Nature, Hindawi, Copernicus, Elsevier, Wiley and Springer, are embedding ORCID identifiers in manuscript submission and production processes. ORCID identifiers started to flow into CrossRef and PubMed in April 2013. For existing publications, CrossRef, Europe PubMed Central, Scopus and ResearcherID have launched integrations to search and import publication metadata, available in the ORCID interface. Web of Science is capturing ORCID identifiers from ResearcherID imports and from publisher records, and adding them to existing and new indexed bibliographic records. For works not in these repositories, ORCID supports manual data entry, but the goal is to provide users with tools to minimize this manual entry.
  • Data centers are embedding ORCID identifiers in data repositories; over time, ORCID may be used to keep repositories up to date. The ODIN project has launched an ORCID-DataCite search and import tool, figshare is an ORCID early adopter, and the Australian National Data Service (ANDS) is working to integrate ORCID identifiers. Repository platforms including DSpace are working to support ORCID identifiers, and Dryad is piloting some approaches.
  • Funders are also working on ORCID integrations. The US Department of Energy has integrated ORCID identifiers into its application system, the Wellcome Trust has launched an integration with its e-Grant application system, and the US National Institutes of Health (NIH) has launched the first phase of its ORCID integration into the SciENcv platform; the NIH instance aims to automate and structure the production and submission of curriculum vitae information for research grant applications and annual progress reports across all US Federal agencies. The Japan Science and Technology Agency is planning an integration, and the Australian National Health and Medical Research Council has included a field in its grants system to collect ORCID identifiers.

Broadening awareness and functionality

To further support integration, ORCID is working with its community to develop recommended best practices. In addition to the display guidelines posted on the ORCID website, we will be hosting a webinar in September to review publishing integration demonstrations, and will be hosting best practice working groups for associations, publishers and funders at our Outreach meeting in October.

There is no question that ORCID is being adopted internationally, by researchers and the research community, but there is certainly more work to be done. For example, we have added steps to encourage verification of the e-mail addresses on ORCID records, but the verification rate still hovers at 60% of records created, with implications for the use of ORCID identifiers to support single sign-on and other verification workflows. Meanwhile, our Metadata Working Group is finalizing recommendations for a standard data model for importing and storing data. This will improve the efficiency of efforts such as the ORCID EU Labs Feed Webservice, which takes the data associated with an ORCID identifier from the public ORCID API and reformats it for export, with a focus on formats useful to individual researchers: RSS, BibTeX, Citeproc JSON and formatted citations.
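
To illustrate the kind of reformatting such a service performs, here is a minimal sketch that turns the metadata for one work into a BibTeX entry. It is not the Feed Webservice's implementation: the dictionary keys (`authors`, `title`, `journal`, `year`), the sample work and the citation-key scheme are all assumptions made for the example, and they do not reflect the ORCID API's actual schema.

```python
def to_bibtex(work: dict) -> str:
    """Format one work as a BibTeX @article entry.

    `work` uses hypothetical keys: authors (list of "Last, First"),
    title, journal and year.
    """
    # Citation key: first author's last name plus the year, e.g. "smith2013".
    first_author_last_name = work["authors"][0].split(",")[0].strip().lower()
    key = f"{first_author_last_name}{work['year']}"

    fields = {
        "author": " and ".join(work["authors"]),
        "title": work["title"],
        "journal": work["journal"],
        "year": str(work["year"]),
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{key},\n{body}\n}}"


# A made-up work, used only to show the output shape.
sample_work = {
    "authors": ["Smith, Jane", "Doe, John"],
    "title": "An Example Article",
    "journal": "Journal of Examples",
    "year": 2013,
}
print(to_bibtex(sample_work))
```

The sketch does no escaping of special characters and handles only journal articles. A standard data model for stored metadata would matter here because each export format could then be written once against the same field names.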

Elsewhere, we have been improving the user interface to support linkage with works and other identifiers. We have implemented features for premium members, for example, enabling them to be notified when records change. We have added Spanish, French and Chinese (traditional and simplified) interfaces. In collaboration with Ringgold and ISNI, we are working to link ORCID identifiers with institutional identifiers. We have just begun the process of defining high-level requirements for supporting validation of information in individual ORCID records by organizations, including affiliations and works.

We have been working to clarify how we are managing the detection of duplicate records. Our current workflow involves searching for matching name and address when a new record is created. We have added the ability to link multiple e-mail addresses with an ORCID record, a key component to prevent the creation of duplicate records. We have also developed a policy for managing verification and correction of data in ORCID records. This is a high-priority area for ORCID and we continue to re-evaluate our procedures and policies.
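
The matching step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not ORCID's actual procedure: it takes "address" to mean e-mail address, uses simple case-insensitive comparison, and the `Record` class, function names and sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A simplified registry record; the fields are illustrative only."""
    identifier: str
    name: str
    emails: set = field(default_factory=set)


def normalize_name(name: str) -> str:
    # Case-insensitive, with runs of whitespace collapsed.
    return " ".join(name.casefold().split())


def normalize_email(email: str) -> str:
    return email.strip().casefold()


def find_possible_duplicates(name: str, email: str, registry: list) -> list:
    """Return (identifier, reason) pairs for records that may be the same person."""
    wanted_name = normalize_name(name)
    wanted_email = normalize_email(email)
    matches = []
    for record in registry:
        known_emails = {normalize_email(e) for e in record.emails}
        if wanted_email in known_emails:
            matches.append((record.identifier, "email"))
        elif wanted_name == normalize_name(record.name):
            matches.append((record.identifier, "name"))
    return matches


# Made-up records: the first person has linked two e-mail addresses.
registry = [
    Record("id-1", "Jane Smith", {"jane@uni.example", "jsmith@lab.example"}),
    Record("id-2", "John Doe", {"jdoe@uni.example"}),
]

# A new registration under a different name form but a known second address.
print(find_possible_duplicates("J. Smith", "JSmith@lab.example", registry))
```

In the sample, the name "J. Smith" does not match "Jane Smith", so the possible duplicate is found only because the second e-mail address is already linked to the existing record. A name match alone is a weaker signal, since different people can share a name, and would normally be treated as a prompt for review.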

Related services

Throughout all of these improvements, ORCID remains focused on our core mission: providing a registry of unique and persistent identifiers for researchers and contributors, and working with the research community to embed these identifiers, including enabling methods to link to existing works. We also encourage our community to develop add-on applications to extend ORCID functionality. One example of this is the use of our Public API by ImpactStory to provide a range of metrics for works associated with an ORCID record. We have made the ORCID source code openly available under an open source license, and provided tools for the community to use to develop external services. At the May 2013 ORCID Outreach meeting, a joint effort with Dryad, we hosted a CodeFest focused on developing mashups with the ORCID APIs. Eleven projects were submitted during the course of the event, and two were awarded a prize in the form of a trip to the ODIN Codesprint this October.

Together with organizations including euroCRIS, which has developed the CERIF standard for research information exchange, and CASRAI, which has developed a dictionary of naming standards for terms used in that exchange, ORCID also has a key role in enabling seamless system-to-system interoperability across organizational and national boundaries within the research community.

Conclusion

What does this mean in practical terms? For a researcher, it means discoverability and reduced data entry: a quicker and more effective way to record works, fill in forms and report on activities, because scholarly activities are connected by a common identifier. This ultimately equates to more time for research and scholarship. For research organizations, it means much the same: increased discoverability of the contributions of researchers, scholars, grantees and authors; the ability to track contributions over time; less time querying databases and running manual searches; and more time and money to support research and scholarly pursuits.

From a technical perspective, ORCID is a simple concept that is relatively straightforward to implement (see our Integration Guide). The social implications are significant: by providing a means to link researchers and scholars, their various identifiers, their works and affiliations, ORCID can be the glue that solves the name ambiguity problem in scholarly communications, ensuring it is quick and easy to know who should be credited for what.