Scientists measure. Scientists count. Scientists quantify. It's what they do; it's their job. Yet, as Einstein pointed out long ago, "Not everything that can be counted counts. Not everything that counts can be counted." One such thing that counts but can't be counted is scientific progress, in particular the output a scientist produces. Why is this so difficult? Can't we just count the number of scientific publications? Of course, that's one way of doing it. However, then you miss the genius who has been working on a major scientific problem for many years and then writes that one landmark paper. Is their contribution to science less valuable than that of the scientist who operates an automated experiment and publishes all its variations in a steady stream of papers? So the sheer number of papers is definitely not a good measure or metric. Obviously, some metric of quality is required. Only, what is quality? It probably means something different to each person, and you have to actually read many papers to form an idea of what you think is good and what isn't. In a democracy, you ask as many people as possible whether something is 'good'. If the majority likes something, it must be good; that's the definition. If this sort of popularity contest sounds fishy to you, wait until you hear how scientists have been defining 'quality' for the last couple of decades.
Scientists don't just publish their results on some website or in books. There is a multi-billion dollar/euro private publishing industry (dominated today by three multinational corporations), which thrives on the taxpayer money spent to publish research in scientific journals. Beginning with a single journal in 1665, the number of journals has grown to about 24,000 today. Some of these journals publish research from any scientific field, others only research from particular fields. Obviously, the journals that are not picky about their topics receive more manuscripts, because every scientist can submit their work there, not just a certain group. By now you can see where this is going: back in the days when publishing was done on paper, there was only so much space in any given issue of a journal to go around. Clearly, some selection had to take place. Soon, there was prestige associated with being published in these journals, and you had a ranking system where scientists would submit their most exciting results to the most general journals, for the prestige and wide readership such a publication would entail. The kind of journal where you published became a measure of quality. Not surprisingly, when you do the scientific equivalent of the popularity contest (you look at how often scientists refer to the work of other scientists, i.e. citations), the more general journals end up with more citations per article than the more specialized journals. After all, there are only a few general journals (now why is that?) but many specialized journals for the many fields in science, so the most exciting work will be concentrated in very few of the ~24,000 journals. However, the selection process isn't perfect. The editors selecting the manuscripts are usually not practicing scientists (they get to be at the center of science without actually having to do it), but above all they're human and thus make mistakes, just as a scientist in their place would.
"You could write the entire history of science in the last 50 years in terms of papers rejected by Science or Nature", Paul C. Lauterbur once famously said. This error-prone selection process means that some breakthroughs are not recognized as such and get published in other journals. Conversely, it also means that some less important or outright fraudulent work does get published in the general journals, but is rarely cited. Consequently, every journal, general or specialized, has a few articles that are cited relatively often and many articles that are cited relatively little.
Basically, we ended up in a system where there is a pretty good chance a scientist will find something he or she would classify as 'good science' in one of the few general journals, compared to any single other journal. However, there's an even higher chance that any particular publication in one of the general journals will not be all that exciting: 80% of the citations in any given journal go to only 20% of its publications (Mark Patterson). This skewed distribution entails that while there is relatively more 'good' science published in the general journals, the reverse argument does not hold: just because an article is published in a specialized journal, you can't say it's of any lesser value to science as a whole. In Einstein's words, the number of citations to articles in a journal is something that can be counted, but it doesn't count.
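The 80/20 skew described above is easy to check mechanically if you have per-article citation counts for a journal. A minimal sketch (the function name `citation_concentration` and the toy numbers are my own, purely for illustration):

```python
def citation_concentration(citations, top_fraction=0.2):
    """Share of all citations received by the most-cited top_fraction of articles."""
    ranked = sorted(citations, reverse=True)   # most-cited articles first
    k = max(1, int(len(ranked) * top_fraction))  # size of the top slice
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total

# A made-up journal year: one heavily cited article, four barely cited ones
counts = [80, 5, 5, 5, 5]
print(citation_concentration(counts))  # 0.8, i.e. 20% of articles draw 80% of citations
```

With numbers matching the figure quoted above, the top 20% of articles account for 0.8 of all citations, which is exactly the kind of skew that makes journal-level averages misleading for individual articles.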
One interesting comparison that I haven't seen anywhere so far is to lump together all the journals which have no restriction on their topics and compare their citation numbers to all the remaining journals, which do have such restrictions. Given all that is wrong with current measures of scientific impact (yes, I'm looking at you, Thomson Reuters' Impact Factor), I'd like to see a mathematically sound evaluation. I'd be surprised if the general journals still had such a citation advantage once all the other 'good' research is aggregated.
But all this is quite a moot point today, isn't it? Today, publication space isn't limited by the amount of dead trees we can ship across the world. Today, there is but one journal, the scientific literature, at our fingertips. That's correct, of course, only that this today hasn't arrived in the scientific community yet. While scientific journals are the modern-day equivalent of dinosaurs, they're still roaming around freely. On the contrary, in recent years this outdated concept of where something is published being more important than what is published has been elevated to the single metric according to which scientists are evaluated. Why is that? There are a number of reasons. The most obvious one is the number of scientists trained today, compared to the number of tenure-track positions. Depending on specialty and the kind of position, there are anywhere from 50 to 300 applicants. It's absolutely impossible to read all the relevant publications, so what could be easier than to look at where someone has published? Another reason is that, until recently, there haven't been many alternatives. One of the few is the h-index, a metric attempting to assess both the productivity and the citation impact of an individual scientist, irrespective of where that scientist has published. As pointed out above, productivity and popularity are not necessarily the same as 'quality'. Quality has many distinct aspects, and only few articles will ever meet all of them. Thus, any halfway decent metric will have to be multivariate to account for all the different aspects of quality and the different possibilities of measuring them.
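For concreteness: the h-index mentioned above is defined as the largest number h such that the scientist has h papers with at least h citations each. A minimal sketch of computing it from a list of per-paper citation counts (the helper name is my own choice, not from any particular library):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:   # the rank-th paper still has >= rank citations
            h = rank
        else:
            break
    return h

# Papers cited 10, 8, 5, 4 and 3 times: four papers with >= 4 citations each
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Note how the single landmark paper scenario from the beginning of this post exposes the metric's blind spot: one paper with a thousand citations still yields h = 1.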
I see only two ways in which the metric problem can be solved: either we change our system radically, so that so few scientists remain that it's at least humanly possible to read all of their most important contributions when they're up for evaluation. Or we accept that some excellent scientists fall by the wayside and at least try to get our metrics to a level where we don't have to be ashamed of them any more. In an effort to lead the way down the second route, PLoS has recently announced that they'll "stop promoting journal impact factors on our sites all together. It’s time to move on, and focus efforts on more sophisticated, flexible and meaningful measures." Slowly but surely, today is arriving in the scientific community. Will I live to see tomorrow appear?
Posted on Thursday 23 July 2009 - 11:26:19
