I've described before the awful situation most scientists face when they try to follow the scientific literature. For me, it involves more than 10h of searching per week with at most 1-3h left to actually read what I found. In these posts, I've also sketched an outline of how this problem can be approached and at least ameliorated with currently available technology.
Quite obviously, this problem is major. In fact, it's one of the worst parts of being a scientist. Finding literature using search is tedious and cumbersome, but for the most part, you find what you need to find. A service that uses modern information technology to help scientists filter, sort and discover research papers as they are published simply doesn't exist. Thus, any company providing such a service would have an instant monopoly. In our day-to-day work, this service would be used every day, whereas actual search is only used rarely.

However, when I talked to Alex Wade from Microsoft Academic Search at Science Online London in September and explained to him what we need, he said that Microsoft is aware of the issue and is working on it, but that it is far down on their priority list.

For a few weeks now I have been trying to contact Anurag Acharya, the chief engineer behind Google Scholar. By the time I finally did get through to him, my hopes weren't high:
  1. Google Scholar develops at a glacial pace
  2. Google Scholar contains more errors than other scholarly search engines
  3. His account answered 'mailbox not found' when I emailed him from my non-university email
  4. When a colleague contacted him with me in CC, he did not 'reply to all', so my colleague had to forward me his answer
All of this seemed to indicate that Google Scholar is a tiny, low-priority side-project for low-priority customers with apparently little funding for innovation and little support from the main company.

So maybe not surprisingly, in his email which was forwarded to me, he said that the kind of service I was asking for couldn't be done.

Anyway, when I finally found out that my university email would not be blocked, I tried again, sending him examples of services where this technology does work - it's just that no single service combines these technologies for scholarly use. I did receive a personal reply this time:
If this is all you have in mind, I am disappointed I had expected something grander - something to facilitate serendipity. To find me things that I really want to read but of whose existence or area I am not aware. What "works" on the sites that you mentioned below is nowhere close to that

The problem with collaborative filtering is limited recall. Which is why systems like Amazon's don't really work well. Given a very large collection and a very sparse distribution of interests. it is hard to get enough coverage/signals for most articles. None of the rating/commenting/recommending systems have much density for scholarly articles (take facutyof1K, plosone, the feedback links many journals have and so on). To do things well, you would have to go way beyond this. Which is hard to do well...

Anyways, thanks for the suggestions and the thoughts!
Again, no surprise there: condescension and a lack of understanding of what the customer's needs are and which technological solutions are out there - a corroboration of the suspicion that Google Scholar is not the kind of project Google puts lots of effort into.

Clearly, collaborative filtering is only part of it (an important part, and one which works). Stored keyword searches, citation alerts, natural language technology, user input, etc. all help the service I envisage to circumscribe the relevant portion of newly published literature. There are of course multiple ways to define relevance, and the service I'm looking for uses many of them to iteratively approach an 'intelligent' system that assists me in the task of staying on top of my field. It really doesn't take an Einstein to see that this technology is both sorely needed and entirely feasible. And yet, neither Microsoft nor Google sees scientists as an important enough customer segment to invest in this technology. Instead, they try to compete against each other in a market where there isn't really all that much demand (relatively) and with resources that make both of them woefully uncompetitive with the existing market leaders.
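To make "multiple ways to define relevance" a bit more concrete, here is a minimal sketch of how a few of these signals could be combined into one ranking. Everything in it is an assumption made for illustration: the `Paper` and `Profile` classes, the `relevance` and `rank` functions and the weights are hypothetical and do not describe any existing service. It is also deliberately naive (plain substring matching, fixed weights), where a real system would learn the weights from user input.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """A newly published paper (hypothetical structure)."""
    title: str
    abstract: str
    references: set = field(default_factory=set)  # IDs of the papers it cites
    read_by: set = field(default_factory=set)     # users who saved or read it


@dataclass
class Profile:
    """What the service knows about one scientist (hypothetical structure)."""
    keywords: set    # stored keyword searches
    my_papers: set   # paper IDs to watch for new citations (citation alerts)
    colleagues: set  # people whose reading habits I trust (collaborative signal)


def relevance(paper, profile, weights=(1.0, 2.0, 1.5)):
    """Combine three simple signals into one score; the weights are arbitrary."""
    text = (paper.title + " " + paper.abstract).lower()
    keyword_hits = sum(1 for kw in profile.keywords if kw.lower() in text)
    citation_hits = len(paper.references & profile.my_papers)
    colleague_hits = len(paper.read_by & profile.colleagues)
    w_kw, w_cite, w_collab = weights
    return w_kw * keyword_hits + w_cite * citation_hits + w_collab * colleague_hits


def rank(papers, profile):
    """Return the papers sorted from most to least relevant."""
    return sorted(papers, key=lambda p: relevance(p, profile), reverse=True)
```

The point of the sketch is only that no single signal has to carry the whole load: a paper that nobody has rated yet can still surface through a keyword match or a citation of my own work, which is exactly where collaborative filtering alone falls short.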

Thus, my hope is that a smaller, more innovative and agile company will see the huge demand (basically every single researcher on the planet will use this tool daily once it works as described) and start providing this much-needed service sooner rather than later. I'm definitely willing to pay for the service and will try every single one I'm offered.
Posted on Tuesday 08 November 2011 - 11:15:47
