When I describe the behavior of corporate publishers with regard to scholarly communication and tight library budgets, I usually illustrate my point with the following graph:

arl_prices2.png

Clearly, subscription costs have increased exorbitantly over the last few decades. In recent years, however, publishers have - in a seemingly glaring contradiction to the available figures - started to claim that, far from increasing costs, they have actually decreased prices. These claims are based on an article by publishing consultant Pauly Gantz, which argues that library expenditures need to be divided by the content their money buys. Gantz cites increasing rates of publication, an increasing number of scientists and an increasing number of journals as justifications for the hyperinflating costs and provides her readers with the following graph:

prices_gantz_small.png

If you look at the black line, you'll notice that the actual cost per journal has decreased since about 2005. On the face of it, this seems to invalidate the libraries' argument that corporate publishers are overcharging them. Several points need clarification, though. For instance, given the negotiation practice of bundling journals together at a discount, would libraries subscribe to that many journals if they had a choice? In other words, is the increase in journal acquisitions starting around 2000 just an arm-twisting measure by publishers to mask further price increases? Is the proliferation of journals largely a publishers' trick to milk yet more money from libraries, or are publishers just jumping on an already rolling bandwagon? The graph above is pretty conspicuous, not only because it comes from the organizations which profit from libraries, but also because it is counter-intuitive: corporate publishers have not only been posting steadily increasing profits with no sign of any international crises anywhere (e.g. Elsevier):



they have also been posting increasing profit margins, i.e., they have been able to reap a larger percentage of their revenues as profits:

elsevier_margins.png

This entails that publishers - if Elsevier is anything to go by - have been able to sell their journals to libraries at a lower price while simultaneously increasing not only profits, but also profit margins. Obviously, this is not an impossible feat, but given the size of the margins (which conform well with the overall margin of approx. 40% in the industry) it seems rather unlikely and requires corroborating data from independent sources.

These data have, of course, raised suspicions beyond my own. Karen Harker has done some calculations and also finds a range of questions that are left to be answered:

  • What is the best way to measure the costs of journals?  Per subscription? Per article? Per usage?
  • Where can we get this data?
  • How do library expenditures compare with listed prices?
  • How do we measure costs per title of journal packages?
  • What have been the trends in content growth per subscription?
  • What is the usage of the supplemental data?  How many journals provide it?  How many require a subscription to access it?  
  • Are we buying too much?  Is the growth in content truly worth our money?
  • What is the extent of content that is provided with each subscription?  How far back are the archives with the subscription?  How much are the archives used?
  • What are the true costs of maintaining ejournals?

Walt Crawford has done some "quick-and-dirty" calculations on two years of change in library expenditures across essentially all the academic libraries in the U.S., as reported in the NCES biennial survey. He found that electronic serials expenditures rose by almost a quarter in the two years between 2008 and 2010. There are no figures on the number of journals these expenditures provide access to, but the figures make it pretty clear by how much that number must have risen to yield an overall price decrease per journal: by more than the expenditures themselves, i.e., by roughly a quarter or more within two years.
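
The break-even reasoning behind that last point can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, and 0.25 stands in for Crawford's "almost a quarter", so it is an upper bound rather than his exact figure.

```python
def per_journal_cost_ratio(expenditure_growth, journal_growth):
    """Ratio of the new to the old cost per journal.

    Both arguments are fractional growth rates over the same period
    (0.25 means +25%). A result below 1.0 means the cost per journal fell.
    """
    return (1 + expenditure_growth) / (1 + journal_growth)


# Assumption: "almost a quarter" is approximated by 0.25.
EXPENDITURE_GROWTH = 0.25

# Try a few hypothetical growth rates for the number of journals.
for journal_growth in (0.10, 0.25, 0.40):
    ratio = per_journal_cost_ratio(EXPENDITURE_GROWTH, journal_growth)
    print(f"journals +{journal_growth:.0%}: cost per journal x {ratio:.2f}")
```

The cost per journal only drops once the number of journals grows faster than the spending does, which is why the missing journal counts matter so much for judging the publishers' claim.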

The only thing that seems undisputed among all involved parties at this point is that the overall costs of scholarly communication are skyrocketing at hyperinflation speeds, even though the scientific community is growing at only around 3% annually, slightly above average inflation rates. What is under dispute is whether the money is well spent.

Here, too, one can do some back-of-the-envelope calculations, under the assumption that libraries would try to take over the tasks that publishers currently perform: there are an estimated 50 million published papers. If each of them required an average of 1MB of storage space (in a simple PDF format, for example), we would need a measly 50 terabytes. At a going rate of about 0.04€ per gigabyte, the storage alone would cost on the order of 2000€, plus, of course, some bandwidth. With a worst-case scenario of about 0.1€ per gigabyte in bandwidth costs, every download of the entire archive would cost around 5000€. Given an estimate of around 1.5 billion downloaded papers per year, total bandwidth costs would be around 150k€ per year. This cost is also negligible, particularly as, at least in Europe, bandwidth is covered by the institutional network infrastructure, which, in Germany, is already covered by federal and state budgets. But even if these costs were not covered, they would be spread across the about 9000 university libraries, making them a minuscule average of 17€ per library and year. Thus, pure hardware and bandwidth costs are entirely negligible in the grand scheme of things, if we omit software and data archiving.
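
For anyone who wants to check or vary these numbers, here is a minimal Python sketch of the same arithmetic. All figures are the estimates from the paragraph above; the variable names are just labels chosen for this example, and decimal units (1 GB = 1000 MB) are an assumption.

```python
# Estimates taken from the text above.
PAPERS = 50_000_000                  # published papers
MB_PER_PAPER = 1                     # average size of a simple PDF
STORAGE_EUR_PER_GB = 0.04            # going rate for storage
BANDWIDTH_EUR_PER_GB = 0.10          # worst-case bandwidth price
DOWNLOADS_PER_YEAR = 1_500_000_000   # downloaded papers per year
LIBRARIES = 9_000                    # university libraries

MB_PER_GB = 1_000                    # assumption: decimal units


def gigabytes(n_papers, mb_per_paper=MB_PER_PAPER):
    """Total size in GB of n_papers papers of the given average size."""
    return n_papers * mb_per_paper / MB_PER_GB


archive_gb = gigabytes(PAPERS)
storage_cost = archive_gb * STORAGE_EUR_PER_GB
full_download_cost = archive_gb * BANDWIDTH_EUR_PER_GB
yearly_bandwidth_cost = gigabytes(DOWNLOADS_PER_YEAR) * BANDWIDTH_EUR_PER_GB
per_library = yearly_bandwidth_cost / LIBRARIES

print(f"archive size:          {archive_gb / 1_000:,.0f} TB")
print(f"storage cost:          {storage_cost:,.0f} EUR")
print(f"one full download:     {full_download_cost:,.0f} EUR")
print(f"bandwidth per year:    {yearly_bandwidth_cost:,.0f} EUR")
print(f"per library and year:  {per_library:,.0f} EUR")
```

Even doubling or tripling any single input leaves the per-library figure far below anything that would register in a library budget.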

There remain software and personnel costs. Given that most of the work on scholarly papers is done by scientists whose salary is already paid, only the technical personnel for maintaining, developing and improving the scholarly communication system need to be factored in. Let's say that each library requires on average a staff of 5 full-time employees with a salary of 100k each just for journal-related tasks (i.e., a workforce of 45k people working just on scholarly communication). This would bring personnel costs up to about 4.5b worldwide per year. Much of the software for hosting scholarly journals is Open Source or can be acquired at reasonable cost. Let's make the quite unlikely assumption that each library would have to spend about 50k per year on journal-related software to keep everything running. That would add another ~0.5b, for a total running cost of around 5b per year. Given conservative estimates of around 10b in annual revenues for the scholarly publishing economy, this means libraries are being overcharged by at least about 5b every year, if not more. Given the 40% profit margins of publishers, this actually works out quite well: corporate CEOs and their shareholders pocket about 5b worth of taxpayer money each year via library subscriptions.
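
The same kind of sketch works for the personnel and software estimate. Again, the inputs are the guesses made in the paragraph above, the variable names are only illustrative, and the negligible hardware and bandwidth costs are left out.

```python
# Guesses taken from the text above (no currency is specified there).
LIBRARIES = 9_000
STAFF_PER_LIBRARY = 5               # full-time employees on journal tasks
SALARY = 100_000                    # per employee and year
SOFTWARE_PER_LIBRARY = 50_000       # per library and year
INDUSTRY_REVENUE = 10_000_000_000   # conservative annual revenue estimate

workforce = LIBRARIES * STAFF_PER_LIBRARY
personnel = workforce * SALARY
software = LIBRARIES * SOFTWARE_PER_LIBRARY
total = personnel + software
overcharge = INDUSTRY_REVENUE - total

print(f"workforce:        {workforce:,} people")
print(f"personnel costs:  {personnel / 1e9:.2f}b per year")
print(f"software costs:   {software / 1e9:.2f}b per year")
print(f"total running:    {total / 1e9:.2f}b per year")
print(f"overcharge:       {overcharge / 1e9:.2f}b per year")
```

Because the staffing and software assumptions are deliberately generous, the resulting gap of roughly 5b is better read as a lower bound than as a precise figure.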

I say it's time we spend these 5b per year in a more meaningful way.


Posted on Thursday 17 January 2013 - 15:00:07 comment: 2
open access   publishing   costs   profits   publishers   libraries   
