Introduction

Academic libraries have often been perceived as ‘warehouses of books’ or ‘boxes of books’, that is, as permanent collections of print books whose acquisition is the primary goal of collection management. Libraries have typically purchased print books on the expectation of reader demand rather than based on established demand, resulting in their building ‘just-in-case’ collections.

With the advent of e-books, though, this is no longer true. Now there are options to provide access to books without requiring ownership (e.g. subscriptions), or to allow discoverability and immediate patron access without prior ownership (e.g. ‘demand-driven acquisition’ and ‘evidence-based acquisition’). These options have offered librarians new choices that necessitate reconsideration of their core professional values and expectations.

This research, presented in two parts, examines changes over the last 30 years in the circulation of print monographs to see if patterns emerge that can help inform librarians’ future purchasing choices. It is itself the first half of a two-part study at the University of Prince Edward Island (UPEI). The second half is a similar analysis of e-book usage, with publication forthcoming.

Literature review

By far the most common question librarians have asked of academic print circulation data is what proportion of books ever circulated at all. Results have been broken down by type of selection (librarian selected, faculty selected, approval plans) as well as by broad subject categories (sciences, humanities, social sciences) or Library of Congress (LC) classification. The most cited examples of this research are the Pittsburgh study in the 1970s and Hardesty’s work in the 1980s. While there have been many more publications focusing on specific subject collections at various kinds of institutions, they generally look at only about five years of circulation history, and rarely at more than ten. Cheung had one of the longest data periods, 15 years, but as with the other studies looked only at the first year of use and total uses, and not the distribution of uses over the subsequent years (which the authors referred to as ‘obsolescence’, the flip side of ‘longevity’). In all of those studies, however, when a book is used just once, the authors seem to assume that its value and justification for purchase have been affirmed. One point of this research is to question that assumption.

Early work on monograph obsolescence focused on predicting which books would be obsolete for the purpose of weeding or storage to free up shelf space. Fussler and Simon looked at patterns of use obsolescence from the 1950s and found ‘decay in the use’ of books in the natural sciences to be much greater than in the humanities and social sciences. Burrell’s study in the 1980s called this ‘ageing’ and looked with statistical rigor at the aggregate collection, not each individual book. There has been some work done on the mathematics of predicting ‘diachronous obsolescence’ (which means approximately the same as what this study calls ‘use longevity’), but this work does not seem to have been applied by other researchers yet. Most work on library collection obsolescence studies journals, not books.

The largest study of collection use including monograph obsolescence was done on the OhioLINK consortium of over 100 academic libraries in 2014, involving over 28 million items. Its data included for each item both the ‘accession date’ (when the item was acquired) and the ‘date of last use’ (last circulation). The study used the ‘synchronous’ rather than ‘diachronous’ approach to obsolescence and found that ‘while age is a good predictor of use, it is not the best predictor – date of last use is a much better predictor of future use … the probability of an item circulating if it has been idle for ten years or more stabilizes at approximately 0.0075’.

Research into e-book usage has done more to consider what this article calls the longevity of a book’s use. Fry has compiled an excellent bibliography on this research.

Methods

The data for this two-part study was collected from the library at UPEI, a small (student enrollment under 5,000) public comprehensive university in Canada that does not have doctoral programs in the humanities but has master’s in professional programs and very small doctoral programs in science, technology, engineering and mathematics (STEM) fields, and, more recently, education. It has a particularly strong veterinary doctoral program, which skews the analysis in some of the classification categories.

Part 1

Part 1 of this research considered books that were acquired, and had a publication year, between 1991 and 1996 and that had a first circulation year between 1991 and 2000. The books were systematically examined for circulation using the ‘date due’ stamps in the books themselves as well as data from the online circulation system, which covers only 2008 to mid-2020. The circulation data used for this part of the analysis, drawn from the Evergreen online catalogue and circulation system, count each new checkout by a patron, but not renewals, original staff processing, or in-house or reshelving uses.

Digital records had been kept of the acquisition date of print volumes as far back as 1991, which is the reason for selecting 1991 as the starting date. Most monographs have their publication year encoded in their MARC record in the ‘date1’ field of the 008 fixed field. This study used a raw SQL query on the Library’s Evergreen system to pull a list of every book in the Library’s circulating ‘stacks’ collection which had an acquisition date between 1991 and 1996 and also a publication year within three years of the acquisition date. Books that had been acquired much later than they were published, such as large donations of older books from a retiring professor, were excluded. This yielded a total set of 14,988 individual volumes to consider across the entire A-Z LC classification tree.

As circulation activity was not available in the online system prior to 2008, a research assistant was hired in the summer of 2016 to physically examine each book’s date due sticker, record the first and last years in which it had been checked out (if at all), and note any anomalies that could render it inappropriate for the study. The research assistant recorded the relevant data at the bar-coded item level while in the stacks collection, then transcribed her notes into a spreadsheet. Anomalies included a spine annotation used by the Library to indicate books that had moved from the non-circulating ‘reference’ collection to the stacks in the mid-2000s, as such books would not have a meaningful circulation history back to the 1990s. Other issues triggering exclusion from the final dataset involved evidence that the date due stickers were not an accurate reflection of a book’s circulation history, such as glue and ink evidence of prior stickers that were no longer in the book, and hand-written dates that recorded just the month and day due but not the year.

After excluding all books for which there was evidence to believe that the circulation history was incomplete, the final set of books included in the analysis was 12,557, or 84% of the original list that was hand-checked. This list was then further narrowed to include only books whose first circulation activity was in the year range 1991–2000, which left 10,002 titles before exclusions were made within each category system for low-category data.
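The initial selection rule (acquired 1991–1996, publication year within three years of acquisition) can be sketched in Python. This is a minimal illustration, not the SQL actually run against Evergreen: the `Item` record and its field names are hypothetical, and ‘within three years’ is read here as an absolute difference of at most three years, which is an assumption about the original query. The one detail taken from the MARC 21 format itself is that Date 1 occupies character positions 07–10 of the 008 field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    barcode: str        # hypothetical item identifier
    acquired_year: int  # year the item record was created
    marc_008: str       # raw 008 fixed field from the MARC record


def date1_year(marc_008: str) -> Optional[int]:
    """Return Date 1 (008 positions 07-10) as an int, or None if it is not a plain year."""
    raw = marc_008[7:11]
    if len(raw) == 4 and raw.isascii() and raw.isdigit():
        return int(raw)
    return None  # e.g. "19uu" for a partly unknown date


def in_candidate_set(item: Item, window: int = 3) -> bool:
    """Acquired 1991-1996 with a publication year within `window` years of acquisition."""
    pub_year = date1_year(item.marc_008)
    if pub_year is None:
        return False
    acquired_in_range = 1991 <= item.acquired_year <= 1996
    return acquired_in_range and abs(item.acquired_year - pub_year) <= window
```

Under this reading, a 1970 imprint donated in 1995 is rejected because the gap is far wider than three years, which mirrors the exclusion of large donations of older books.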

The circulation data on these books were then updated to mid-2020 using the Evergreen system, which did have fully reliable circulation data starting in mid-2008. Thus, the maximum longevity a book in this part could have reported is 30 years, from 1991–2020.

Part 2

Part 2 of this research looked at books acquired between 2008 and 2011. These books were examined for circulation using the previously mentioned Evergreen system.

Out of a total print-items catalogue of about 383,000 titles, circulation information was pulled for all of the books that had both an item-create year (indicating year of acquisition) and a first-circulation-activity year between 2008 and 2011, for 4,060 books in all. These years were chosen because 2008 is the first year for which circulation data are available in the online catalogue system, and 2011 is the last year that would allow for a longevity history of up to ten years, as the circulation data available ended with mid-2020. Coincidentally, this analysis was begun within one month of patron access to print books being shut down due to the Covid-19 pandemic, a shutdown that certainly would have skewed any later circulation data.

Given the range of circulation data from 2008 to mid-2020, the maximum longevity a book in this part of the study could have is 13 years if the first circulation was in 2008 and last in 2020.

For comparison, a related ‘Part 2B’ dataset consists of all of the print books that were marked in the catalogue as being acquired as far back as 1991 and had any circulation recorded from 2008–2011. A great many of those were actually acquired throughout the decades preceding the Library’s first use of an online circulation system, as anything pre-1991 shows a 1991 acquisition year. This dataset would also have a maximum longevity of 13 years, although of course many of the books in it, including the ones already included in part 1, may well have circulation histories reaching much further back, for which data prior to 2008 are not available.

UPEI did not collect in-house use data during the periods being studied.

Categorization schemes used

This analysis applies three different categorization schemes, and then places each title into a category within each of those schemes. This process starts with the LC call number assigned to each book by the librarians at UPEI. They generally followed the assignments made by much larger library systems such as the Library of Congress, but with modifications when relevant to local needs, usually for Canadian-specific content. Books in part 1 were generally copy-catalogued using OCLC services, but by the time period covered in part 2, UPEI had stopped using OCLC services and instead relied on call numbers retrieved from other much larger Canadian libraries such as the University of Toronto, University of Alberta and the University of British Columbia using Z39.50 client software.

Each LC class range listed separately in the openly available Library of Congress Classification Outline was assigned to one category within each of the three schemes.

Each book was then mapped to the categories within those three categorization schemes that applied to its shelf call number. For instance, BL74-BL99.9, described in the LC classification outline as ‘Religions. Mythology. Rationalism – Religions of the world’, was assigned to Becher-Biglan: Soft Pure, Major Subject: Humanities, Department: Religion.
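The mapping step can be illustrated with a small lookup keyed on LC class ranges. This is a sketch under stated assumptions rather than the procedure used in the study: the `RANGES` table, the `Category` tuple and the call-number parsing are hypothetical, and only the BL74–BL99.9 row is taken from the text above.

```python
import re
from typing import NamedTuple, Optional


class Category(NamedTuple):
    becher_biglan: str
    major_subject: str
    department: str


# (class letters, low, high, categories). Only this BL row comes from the article;
# a full table would hold one row per range in the LC Classification Outline.
RANGES = [
    ("BL", 74.0, 99.9, Category("Soft Pure", "Humanities", "Religion")),
]

# Leading class letters followed by the class number, e.g. "BL80.2 .S65 1991".
CALL_NUMBER = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def categorize(call_number: str) -> Optional[Category]:
    """Map an LC call number to its three categories, or None if no range matches."""
    match = CALL_NUMBER.match(call_number.upper())
    if match is None:
        return None
    letters, number = match.group(1), float(match.group(2))
    for range_letters, low, high, category in RANGES:
        if letters == range_letters and low <= number <= high:
            return category
    return None
```

For example, `categorize("BL80.2 .S65 1991")` returns the Soft Pure, Humanities, Religion triple, while a call number outside every listed range returns `None`.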

Becher-Biglan typology

Rarely used by librarians, the Becher-Biglan (BB) typology was originally designed to analyze aspects of higher education academic departments, as a way of considering patterns of similarity in curriculum requirements, scholarly output, external grants and other characteristics of interest to higher education administrators.

Because of the enrollment dominance of professional programs in schools like UPEI, it was thought that using these four categories – hard pure, hard applied, soft pure and soft applied – would produce meaningful patterns to guide collection acquisitions and access decisions. This is particularly useful in teasing out of the LC classification outline the subranges within ‘soft pure’ fields like literature that are actually ‘soft applied’. Most commonly, this would be books with a focus on formal education techniques, such as books that were more likely to be of interest to School of Education students than students of English. Similarly, books on ‘study and teaching’ of STEM subjects would be categorized into ‘hard applied’ rather than be assumed to be ‘hard pure’.

The BB typology system for academic disciplines was developed by Becher (1989) using in part the earlier work by Biglan (1973).

Major subjects

More common than the BB typology in academic library studies of book collections and use is a set of broader subject categories. There are numerous variations in the librarianship literature, but the one used here is typical given the particular professional programs at UPEI: arts, business, education, health, humanities, law, social science, STEM and other. As UPEI has no legal or paralegal program, the numbers of titles and circulations categorized as ‘law’ (as well as ‘other’, which are books usually found in the ‘A’ and ‘Z’ classifications) were so small that those titles were left out of this analysis.

Academic departments

The selection of academic departments and the choices for mapping LC class ranges on to them are based on the programs offered at UPEI. A few that do not have a corresponding program at UPEI (e.g. ‘military sciences’) were also included where it seemed unreasonable to fold the data into an existing department but there were too many titles to simply leave out. Note that this does not in any way mean that books in a given department category were necessarily used by students and faculty within that department. For instance, nursing- and education-affiliated patrons also use psychology books.

Department categories with only a very small number of titles with circulation activity were left out of the analyzed data.

Definition of circulation longevity

If a book’s only circulation (regardless of how many checkouts that may be) occurred in a single year, that is considered in this study a longevity of one. If a book was used at least once in 1991 and its next use was in 2008, that would be a longevity of 18 years. Longevity may also be thought of as ‘active use life’. Activity was based on the initial checkout year regardless of when the book was returned. Renewals were ignored. This study does not analyze the number of circulations, nor look at the distribution of circulation activity within that range of years.
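The definition amounts to an inclusive count of years from first to last checkout. A minimal sketch, assuming each book’s checkout years are available as a list of integers (the function name and input shape are illustrative, not from the study):

```python
from typing import Iterable


def longevity(checkout_years: Iterable[int]) -> int:
    """Inclusive span, in years, between a book's first and last checkout year."""
    years = list(checkout_years)
    if not years:
        # Books that never circulated are outside the scope of the definition.
        raise ValueError("longevity is only defined for books that circulated")
    return max(years) - min(years) + 1
```

With this definition, checkouts confined to a single year give 1, a first use in 1991 and a next use in 2008 give 18, and the maximum values for the two parts follow directly: 1991 to 2020 gives 30 and 2008 to 2020 gives 13.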

Because this study’s purpose is to examine longevity for books that circulate at least once, the data do not include the books that were acquired in that period but did not circulate at all. So mean and median figures are not distorted by the large number of zero years of longevity. The data in this report cannot be used to infer what percentage of books had any circulation.

Results

Part 1 results are reported as median longevity as defined above, not ‘average’ (also known as ‘mean’), to prevent outliers from distorting the trends. The part 2 and part 2B tables are labelled as mean longevity.
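The sensitivity of the mean to outliers can be seen in a short example. The longevity values below are invented for illustration and are not data from this study.

```python
from statistics import mean, median

# Invented longevity values (years) with a single long-lived outlier.
longevities = [1, 1, 2, 3, 3, 4, 30]

middle = median(longevities)   # middle value of the sorted list
average = mean(longevities)    # pulled upward by the single 30

print(f"median = {middle}, mean = {average:.1f}")
```

Here the median stays at the typical value while the mean is roughly double it, because one book with a 30-year span dominates the sum.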

Part 1 – books acquired from 1991–1996

This dataset comprises books that were acquired from 1991–1996 and also had a first circulation from 1991–2000. The maximum possible longevity is 30 years. The median longevity overall is ten years.

Summary totals:

Table 1 shows that just under 50% of the titles had a longevity of more than ten years and almost a quarter had a longevity of less than five years.

Table 1

Longevity summary, part 1 data

Longevity (years) | Number of Titles | Per Cent of Titles
1 | 1,341 | 13.58%
2–4 | 974 | 9.87%
5–10 | 2,829 | 28.65%
11–30 (max) | 4,729 | 47.90%
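The longevity bands in Table 1 can be derived from per-title longevity values, and the percentages from the band counts. The sketch below is illustrative only: the function names are hypothetical, and the percentages are computed against the sum of the four printed counts (9,873).

```python
from collections import Counter
from typing import Dict, Iterable

# (label, low, high) bands matching the part 1 summary table.
BINS = [("1", 1, 1), ("2-4", 2, 4), ("5-10", 5, 10), ("11-30", 11, 30)]


def bin_label(longevity: int) -> str:
    """Return the band label that a longevity value (in years) falls into."""
    for label, low, high in BINS:
        if low <= longevity <= high:
            return label
    raise ValueError(f"longevity out of range: {longevity}")


def bin_counts(longevities: Iterable[int]) -> Dict[str, int]:
    """Count titles per band from a sequence of per-title longevity values."""
    return dict(Counter(bin_label(value) for value in longevities))


def percent_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Each band's share of all counted titles, as a percentage to 2 decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Title counts as printed in Table 1.
table1_counts = {"1": 1341, "2-4": 974, "5-10": 2829, "11-30": 4729}
shares = percent_shares(table1_counts)
```

Applied to the printed counts, `shares` gives the same percentages as the table, so the ‘almost a quarter’ figure is the sum of the first two bands.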

BB totals:

Table 2 shows that there is virtually no difference in median longevity among the BB categories (soft pure, soft applied, hard pure, hard applied), with all categories having a longevity of between nine and ten years. This may come as a surprise to librarians who assume the soft pure fields (like English, history, philosophy, etc.) would have greater longevity than the applied fields like education, business, nursing, etc.

Table 2

Becher-Biglan summary, part 1 data

BB | Number of Titles | Median Longevity
Hard Applied | 1,637 | 10.0
Hard Pure | 960 | 9.5
Soft Applied | 1,345 | 9.0
Soft Pure | 6,060 | 10.0
Grand Total | 10,002 | 10.0

Excluding veterinary titles reduced the number of titles in hard applied to 1,167 but had no impact on the median longevity.

Major subject totals:

Table 3, which uses the more typical categories found in this kind of research, shows that while some specific applied fields like business and education have somewhat shorter longevity (eight years), the arts are merely average (ten years) and the same as the STEM fields, and the humanities are no greater than the social sciences (11 years).

Table 3

Major subject summary, part 1 data

Major Subject | Number of Titles | Median Longevity
Arts | 267 | 10.0
Business | 253 | 8.0
Education | 478 | 8.0
Health | 755 | 10.0
Humanities | 3,523 | 11.0
Social Science | 2,769 | 11.0
STEM | 1,695 | 10.0
Grand Total | 9,740 | 10.0

Removing veterinary titles dropped the STEM count to 1,227 and dropped the median longevity to 9.0.

Table 4 shows the breakdown by academic department, including here only those departments where the data were for 100 titles or more, ordered by longevity:

Table 4

Academic department summary, part 1 data

Department | Number of Titles | Median Longevity
Religion | 300 | 12
Mathematics | 102 | 12
History | 1,448 | 12
Veterinary | 468 | 11
Sociology | 1,011 | 11
Psychology | 511 | 11
Nursing | 123 | 11
Music | 105 | 11
Law | 123 | 11
Anthropology | 171 | 11
Visual Arts | 162 | 10
Political Science | 371 | 10
Philosophy | 291 | 10
Medicine | 608 | 9
Foreign Languages and Literature | 105 | 9
English | 1,308 | 9
Biology | 673 | 9
Engineering | 155 | 8
Education | 478 | 8
Economics | 581 | 8
Business | 255 | 8
Grand Total (includes smaller departments excluded from list above) | 9,932 | 10

Table 4 shows that the overall range at the granularity of department is only from 8 to 12 years, with fields like history and mathematics sharing the highest rank and various specific humanities fields sharing ranks with various specific STEM fields.

Part 2 – books acquired from 2008–2011

This dataset comprises books that were acquired in 2008–2011 and also first circulated within that range of years. The maximum possible longevity in this dataset is 13 years.

The mean longevity overall in this set of titles is just 3.9 years.

Summary:

Table 5 shows that longevity has dropped considerably when compared with part 1 (Table 1), as the percentage of titles that had longevity under five years is now over 60% and longevity over ten years is now under 5%.

Table 5

Longevity summary, part 2 data

Longevity (years) | Number of Titles | Per Cent of Titles
1 | 1,518 | 37.3%
2–4 | 1,099 | 27.1%
5–10 | 1,262 | 31.1%
11–13 (max) | 181 | 4.4%

BB totals:

Table 6 shows that the longevity of the hard fields is actually greater than the soft, with hard applied being over 25% longer than soft pure.

Table 6

Becher-Biglan summary, part 2 data

BB | Number of Titles | Mean Longevity
Hard Applied | 916 | 4.7
Hard Pure | 345 | 4.2
Other | 17 | 3.0
Soft Applied | 454 | 3.5
Soft Pure | 2,328 | 3.7
Grand Total | 4,060 | 3.9

Major subjects:

Table 7 shows that only the health and STEM areas are above average longevity, with the social sciences edging out the arts and humanities as well.

Table 7

Major subject summary, part 2 data

Major Subject | Number of Titles | Mean Longevity
Arts | 202 | 3.7
Business | 55 | 3.7
Education | 180 | 3.7
Health | 455 | 4.1
Humanities | 1,354 | 3.6
Law | 50 | 2.8
Other | 69 | 2.7
Social Science | 938 | 3.8
STEM | 757 | 4.8
Grand Total | 4,060 | 3.9

Part 2 – results for academic departments

Table 8 is ordered by mean longevity. Given the much smaller dataset for part 2, departments with only small numbers of books are included.

Table 8

Academic department summary, part 2 data

Department | Number of Titles | Mean Longevity
Veterinary | 270 | 6.2
Chemistry | 4 | 5.8
Mathematics | 15 | 5.7
Home Economics | 13 | 5.2
Nursing | 153 | 5.0
Environmental Studies | 17 | 4.8
Physics | 39 | 4.8
General Works | 59 | 4.6
Psychology | 128 | 4.6
Geography | 52 | 4.3
Biology | 274 | 4.2
Classics | 6 | 4.2
Music | 123 | 4.0
Anthropology | 119 | 3.9
Economics | 144 | 3.9
History | 464 | 3.8
Business | 55 | 3.7
Education | 180 | 3.7
Philosophy | 103 | 3.7
English | 483 | 3.6
Political Science | 111 | 3.6
Computer Science | 12 | 3.6
Engineering | 69 | 3.6
Medicine | 291 | 3.6
Astronomy | 11 | 3.5
Geology | 6 | 3.5
Sociology | 316 | 3.5
Agriculture | 30 | 3.5
Religion | 236 | 3.4
Foreign Languages and Literature | 51 | 3.3
Visual Arts | 79 | 3.2
Law | 50 | 2.8
Library Science | 63 | 2.5
Naval Science | 3 | 2.3
Military Science | 31 | 1.8
Grand Total | 4,060 | 3.9

Table 8 provides further elucidation of the results from Table 7, by demonstrating that some humanities fields have longevity at or below that of many social science and STEM fields.

While it was expected from aforementioned internal studies that veterinary would come out on top, the presence of other STEM fields at the top of the list, with the first humanities and arts department not making an appearance until twelfth longest average longevity, further supports the surprising findings from the BB and major subject classifications.

Results from the larger ‘2B’ dataset

Some might argue that this methodology is unfairly biased against the strength of the humanities in much deeper use. So, the BB and major subject analyses were run again on the part 2B dataset described in the Methods section. In this dataset, the overall mean longevity is just 3.3 years.

Summary:

Table 9 shows that even with this expanded dataset, longevity is under five years for over 70% of the books, and over ten years for fewer than 5% of them.

Table 9

Longevity summary, part 2B data

Longevity (years) | Number of Titles | Per Cent of Titles
1 | 18,984 | 52.9%
2–4 | 7,197 | 20.1%
5–10 | 7,947 | 22.2%
11–13 (max) | 1,732 | 4.8%

BB totals:

Table 10 confirms the results we found in the smaller part 2 dataset, with hard applied books having about a 26% greater longevity than soft pure ones.

Table 10

Becher-Biglan summary, part 2B data

BB | Number of Titles | Mean Longevity
Hard Applied | 4,966 | 3.9
Hard Pure | 2,922 | 3.4
Other | 88 | 3.2
Soft Applied | 3,407 | 3.2
Soft Pure | 24,477 | 3.1
Grand Total | 35,860 | 3.3

Major subject totals:

Table 11 shows that most subject areas are actually below the overall mean longevity, with a much longer value in the STEM fields (15% above the mean) and a value slightly above the mean for the health fields.

Table 11

Major subject summary, part 2B data

Major Subject | Number of Titles | Mean Longevity
Arts | 1,584 | 3.0
Business | 408 | 3.0
Education | 1,216 | 3.2
Health | 2,091 | 3.4
Humanities | 15,948 | 3.2
Law | 332 | 2.8
Other | 232 | 2.5
Social Science | 8,576 | 3.2
STEM | 5,473 | 3.8
Grand Total | 35,860 | 3.3

Removing the veterinary department (LC classification SF) from the 2B data does drop the STEM mean longevity considerably (3,860 titles with mean longevity 3.3) but it is still slightly above the softer subjects.

Table 12 shows that removing the SFs from the BB analysis has an impact on the hard applied mean longevity (drops from 3.9 to 3.3), but it continues to hold above the softer fields.

Table 12

Becher-Biglan summary, part 2B data with veterinary removed

BB | Number of Titles | Mean Longevity
Hard Applied | 3,356 | 3.3
Hard Pure | 2,922 | 3.4
Other | 88 | 3.2
Soft Applied | 3,407 | 3.2
Soft Pure | 24,474 | 3.1
Grand Total | 34,247 | 3.2

Discussion

Two significant points arise from looking at both sets of data. The first is that regardless of the categorization method used, the print books in the applied fields and hard fields generally have more longevity than the pure fields and soft fields. Professional and STEM longevity are generally greater than humanities and arts and this carries through to the department breakdowns.

The results may be surprising for librarians used to the usual professional stereotype that humanities scholars find more value in older works than scholars in other fields, and those in professional fields least of all. This is even more unexpected when considering that this study did not attempt to combine multiple numbered editions of the same book, which are much more common in the STEM fields, and which would tend to drag down longevity as users tend to want the latest edition. This finding supports the work of Ladwig and Miller, who studied first-circulation data and found no difference between STEM and humanities monographs.

With the added data in part 2B, the humanities do show slightly greater longevity than fields like business but are still far behind the STEM fields.

The second point, seen by comparing part 1 and part 2, is that longevity appears to be dropping in recent years, from a median of ten years in part 1 to a mean of under four in part 2. It seems likely that this is due to the advent of e-books and their greater ease of access. Indeed, since about 2010, UPEI has shifted its monograph collection practices, favoring e-books over print books in most subject areas. Huge multidisciplinary subscription packages such as those offered by ProQuest and EBSCO (UPEI subscribes to both major ‘academic’ collections from these vendors) mean that the print books still being manually selected by subject librarians (primarily in the humanities and arts) are nevertheless swamped in OPAC (online public access catalogue) search results. This is even more the case if patrons choose to limit their searches to a recent range of publication years, as many undergraduate term paper assignments require.

Regardless of the reason, though, this calls into serious question the value of continuing to purchase print books without evidence of specific demand, such as being required reading for a known course. So-called just-in-case collecting of print materials carries with it not only the original purchase and processing cost, but also the long-term storage costs and the harder-to-quantify opportunity costs of not using the building space for other patrons’ needs.

It should be emphasized that these results arose from a small public university that does not have PhD programs in the humanities and arts. Doctoral ‘extensive’ institutions may not be able to draw practice-changing conclusions from this research. However, other institutions with a program and student body profile similar to UPEI may want to consider whether this data may be relevant to the use patterns in their own collections and subsequent implications for future collection policy.

And within that context, focusing on the recent part 2 data, librarians may also want to consider that if print books are averaging less than four years’ use life, they might move towards the new e-book collection models. Subscription and on-demand options that allow for more turnover of titles every few years and cost much less per book may prove to be the optimal use of modest monograph budgets.

One would only take that approach if one’s library is not the ‘library of record’ in a subject area, even regionally. For instance, UPEI is the library of record for veterinary practice, as the Atlantic Veterinary College at UPEI serves not just Prince Edward Island but the four Canadian Atlantic provinces and is funded by all of them. So UPEI will always purchase for perpetual acquisition every title it possibly can in this field and not rely on unstable e-book access options. It is somewhat ironic that this is the one field with by far the highest overall circulation as well as longevity.

Data Accessibility Statement

The raw data for both parts of this research are available as two tsv-formatted files within a single ZIP file at: https://doi.org/10.11571/upei-roblib-data/researchdata:667