
When I describe the behavior of corporate publishers with regard to scholarly communication and tight library budgets, I usually illustrate my point with the following graph:


Clearly, subscription costs have increased exorbitantly over the last few decades. In recent years, however, publishers have started to claim, in seemingly glaring contradiction to the available figures, that they have actually decreased prices rather than increased costs. These claims are based on an article by publishing consultant Pauly Gantz, which argues that library expenditures need to be divided by the content their money buys. Gantz cites increasing rates of publication, increasing numbers of scientists and increasing numbers of journals as justifications for the hyperinflating costs and provides her readers with the following graph:


If you look at the black line, you'll notice that the actual cost per journal has decreased since about 2005. On the face of it, this seems to invalidate the libraries' argument that corporate publishers are overcharging them. Several points need clarification, though. For instance, given the negotiation practice of bundling journals together at a discount, would libraries subscribe to that many journals if they had a choice? In other words, is the increase in journal acquisitions starting around 2000 just an arm-twisting measure by publishers to mask further price increases? Is the proliferation of journals largely a publishers' trick to milk yet more money from libraries, or are publishers just jumping on an already rolling bandwagon? The graph above is pretty conspicuous, not only because it comes from the organizations which profit from libraries, but also because it is counter-intuitive: corporate publishers have not only been posting steadily increasing profits, with no sign of any international crisis anywhere (e.g. Elsevier):

they have also been posting increasing profit margins, i.e., they have been able to reap a larger percentage of their revenues as profits:


This entails that publishers - if Elsevier is anything to go by - have been able to sell their journals at a lower cost to libraries while simultaneously not only increasing profits, but also profit margins. Obviously, this is not an impossible feat, but given the size of the margins (which conforms well with the overall approx. margin of 40% in the industry) this seems rather unlikely and requires some corroborating data from independent sources.

Of course, these data have raised suspicion not only with me. Karen Harker has done some calculations and also finds a range of questions that are left to be answered:

  • What is the best way to measure the costs of journals?  Per subscription? Per article? Per usage?
  • Where can we get this data?
  • How do library expenditures compare with listed prices?
  • How do we measure costs per title of journal packages?
  • What have been the trends in content growth per subscription?
  • What is the usage of the supplemental data?  How many journals provide it?  How many require a subscription to access it?  
  • Are we buying too much?  Is the growth in content truly worth our money?
  • What is the extent of content that is provided with each subscription?  How far back are the archives with the subscription?  How much are the archives used?
  • What are the true costs of maintaining ejournals?

Walt Crawford has done some "quick-and-dirty" calculations on two years of change in library expenditures across essentially all the academic libraries in the U.S., as reported in the NCES biennial survey. He found that electronic serials expenditures rose by almost a quarter in the two years between 2008 and 2010. There are no figures on the number of journals these expenditures provide access to, but the expenditure figures make it pretty clear how much that number must have risen to yield an overall price decrease per journal: faster than the expenditures themselves.

The only thing that seems undisputed among all parties involved at this point is that the overall costs of scholarly communication are skyrocketing at hyperinflation speeds, even though the scientific community is growing by only around 3% annually, slightly above average inflation rates. What is under dispute is whether the money is well spent.

Here, too, one can do some back-of-the-envelope calculations, under the assumption that libraries would try to take over the tasks that publishers currently perform: there are an estimated 50 million published papers. If each of them required an average of 1MB of storage space (in a simple PDF format, for example), we would need a measly 50 terabytes of space. At a going rate of about 0.04€ per gigabyte, just the storage would cost on the order of 2000€, plus, of course, some bandwidth requirements. With a worst-case scenario of about 0.1€ per gigabyte in bandwidth costs, every download of the entire archive would cost around 5000€. Given an estimate of around 1.5 billion downloaded papers per year (the equivalent of 30 downloads of the entire archive), total costs would be around 150k€ per year. This cost is also negligible, particularly as, at least in Europe, bandwidth is covered by the institutional network infrastructure, which, in Germany, is already covered by federal and state budgets. But even if these costs were not covered, they would be spread across the roughly 9000 university libraries, making them a minuscule 17€ per library and year on average. Thus, pure hardware and bandwidth costs are entirely negligible in the grand scheme of things, if we omit software and data archiving.
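For anyone who wants to check or vary these numbers, here is a minimal sketch of the same arithmetic in Python. Every input is one of the rough estimates quoted above; the function and variable names are mine, and I am assuming decimal units (1 GB = 1000 MB).

```python
# Back-of-the-envelope hosting costs, using the rough estimates from the text.
# Assumption: decimal units, i.e. 1 GB = 1000 MB and 1 TB = 1000 GB.
PAPERS = 50_000_000                 # estimated number of published papers
MB_PER_PAPER = 1                    # average size of a simple PDF
EUR_PER_GB_STORAGE = 0.04           # going rate for storage
EUR_PER_GB_BANDWIDTH = 0.10         # worst-case bandwidth price
DOWNLOADS_PER_YEAR = 1_500_000_000  # estimated paper downloads per year
LIBRARIES = 9_000                   # approximate number of university libraries


def hosting_estimate():
    """Return the storage and bandwidth figures as a dict of floats."""
    archive_gb = PAPERS * MB_PER_PAPER / 1000
    yearly_traffic_gb = DOWNLOADS_PER_YEAR * MB_PER_PAPER / 1000
    yearly_bandwidth_eur = yearly_traffic_gb * EUR_PER_GB_BANDWIDTH
    return {
        "archive_tb": archive_gb / 1000,
        "storage_eur": archive_gb * EUR_PER_GB_STORAGE,
        "one_full_download_eur": archive_gb * EUR_PER_GB_BANDWIDTH,
        "yearly_bandwidth_eur": yearly_bandwidth_eur,
        "per_library_eur": yearly_bandwidth_eur / LIBRARIES,
    }


if __name__ == "__main__":
    for name, value in hosting_estimate().items():
        print(f"{name}: {value:,.2f}")
```

Changing any single constant, say the average paper size to 5MB, shows how far the inputs would have to be off before these costs stopped being negligible.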

There remain software and personnel costs. Given that most of the work on scholarly papers is done by scientists whose salary is already paid, only the technical personnel for maintaining, developing and improving the scholarly communication system need to be factored in. Let's say that each library requires on average a staff of 5 full-time employees with a salary of 100k just for journal-related tasks (i.e., a workforce of 45k people working just on scholarly communication). This would bring personnel costs up to about 4.5b worldwide per year. Much of the software for hosting scholarly journals is Open Source or can be acquired at reasonable cost. Let's make the quite unlikely assumption that each library would have to spend about 50k per year on journal-related software to keep everything running. That would add another ~0.5b, for a total running cost of around 5b per year. Given conservative estimates of around 10b in annual revenues for the scholarly publishing economy, this means libraries are being overcharged by at least about 5b every year, if not more. Given the 40% profit margins of publishers, this actually works out quite well: corporate CEOs and their shareholders pocket about 5b worth of taxpayer money each year via library subscriptions.
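The same kind of sketch works for the personnel and software estimate. Again, all inputs are the assumptions stated above, the names are mine, and the text does not specify a currency for the "k" and "b" figures, so the code leaves it unspecified as well.

```python
# Rough yearly running costs if libraries took over journal hosting.
# All inputs are the assumptions from the text; currency is unspecified there.
LIBRARIES = 9_000                  # approximate number of university libraries
STAFF_PER_LIBRARY = 5              # full-time staff for journal-related tasks
SALARY = 100_000                   # assumed yearly salary per employee
SOFTWARE_PER_LIBRARY = 50_000      # deliberately pessimistic yearly software cost
INDUSTRY_REVENUE = 10_000_000_000  # conservative estimate of publisher revenues


def running_costs():
    """Return headcount, cost components and the gap to current revenues."""
    staff = LIBRARIES * STAFF_PER_LIBRARY
    personnel = staff * SALARY
    software = LIBRARIES * SOFTWARE_PER_LIBRARY
    total = personnel + software
    return {
        "staff": staff,
        "personnel": personnel,
        "software": software,
        "total": total,
        "gap_to_revenue": INDUSTRY_REVENUE - total,
    }


if __name__ == "__main__":
    for name, value in running_costs().items():
        print(f"{name}: {value:,}")
```

Without rounding, the software line comes to 0.45b and the total to 4.95b, so the "around 5b" figures in the text are rounded up slightly.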

I say it's time we spend these 5b per year in a more meaningful way.

Posted on Thursday 17 January 2013 - 15:00:07 comment: 2

Jörg Prante posted on 18 Jan 13: 13:39:

The IT infrastructure of the libraries was developed in the 1990s, when the German government planned to distribute all scientific literature electronically. So libraries are well equipped with bandwidth and computers, still today.

But they can't deliver documents, since publishers sued libraries for copyright infringement. All document delivery is subject to the publisher's right to sell the document first. Without a new fair-use copyright regulation by law, publishers will always have the last word in document delivery by libraries.

bjoern posted on 20 Jan 13: 14:21:

There indeed is a back-issue problem here, which is not trivial to solve. One way is to use some civil disobedience as described here, especially in the links.
