Cognition: Using natural language to query Medline semantically
Posted Apr 24 2009 11:09pm
You know by now that I love literature search tools (check out the small but growing “Links: Search” category in the right-hand column on the main page). I am strongly motivated by a desire to filter the huge and growing biological literature so that I can find the most relevant papers with the least amount of effort. Therefore, I’m always curious when I hear of a tool that purports to do an old task (searching Medline) in a new and unusual way.
A company called Cognition (“Giving technologies new meaning”) claims to enable the user to use semantic natural language processing to search the literature. Here’s their elevator pitch:
Cognition’s Semantic Natural Language Processing (NLP) technologies add word and phrase meaning and understanding to computer applications, providing a technology and/or end-user with actionable content based upon semantic knowledge. This understanding results in simultaneously much higher precision and recall of salient data within the universe of possible results. Cognition’s Semantic NLP™ makes technologies and applications more human-like in their understanding of language, thereby resulting in more robust applications, greater user satisfaction and new capabilities available for exploitation. On the Web in particular, powering applications with Cognition’s semantic understanding technology drives these applications ever closer to Web 3.0 (the semantic Web).
They have various commercial applications for sale, but their semantic MEDLINE product is freely available on the web.
I’m not going to lie to you: it’s pretty great. You can ask the interface a real English question, like “Which genes are expressed in senescent fibroblasts?” and get real answers. (OK, to be fair, it’s fine with just “genes expressed senescent fibroblasts”, but I enjoy being able to use my native language when I talk to a computer.) I encourage you to play around with it; it’s fun.
One feature that seemed promising at first didn’t work well at all. On the right-hand side of the search results screen is a series of dropdown menus; each menu contains several different meanings for a keyword within the query. The idea is that one could refine a search by choosing the specific meaning of an ambiguous term, rather than having to slog through a search result that allows all meanings of the term in question. Unfortunately, this feature doesn’t deliver. Allow me to illustrate.
In the query example mentioned above, the dropdowns allowed for six meanings for the word “express”. The results had initially come back with one of these meanings (“6. to make a protein in bacteria or cells in culture”) already selected. I assume this was because that meaning gave the largest number of hits: ~12 papers, a totally manageable number, all of which were good answers to the question.
That definition is OK, but I felt that another meaning (“5. to make a protein from a gene”) was slightly closer to the original intent, so I chose that definition and resubmitted. This culled the list down to only 1 paper, which wasn’t a very good match, and eliminated all the excellent answers from the earlier version of the search.
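Here’s a toy sketch of the kind of behavior described above. To be clear, this is purely illustrative: nothing here reflects how Cognition’s software actually works, and the `Paper` class, the `express_sense` field, the `search` function, and the tiny corpus are all made up. The assumption is simply that each indexed paper gets tagged with exactly one sense of “express”, and that picking a sense in the dropdown acts as a hard filter on that tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    title: str
    # Hypothetical: the single sense number an indexer assigned to the
    # word "express" in this paper (e.g. 5 or 6 from the dropdown).
    express_sense: int


def search(papers, allowed_senses=None):
    """Return papers whose tagged sense is allowed; None means any sense."""
    if allowed_senses is None:
        return list(papers)
    return [p for p in papers if p.express_sense in allowed_senses]


# Made-up corpus: most relevant papers happen to be tagged with sense 6.
papers = [
    Paper("Paper A", 6),
    Paper("Paper B", 6),
    Paper("Paper C", 5),
    Paper("Paper D", 1),
]

any_sense = search(papers)        # no sense filter at all
sense_6 = search(papers, {6})     # the pre-selected sense
sense_5 = search(papers, {5})     # the sense that felt closer in meaning

print(len(any_sense), len(sense_6), len(sense_5))
```

Under these assumptions, switching from sense 6 to sense 5 throws away every paper tagged with sense 6, even if a human reader would consider the two senses nearly interchangeable. A softer filter (say, one that accepts a set of closely related senses, as `search(papers, {5, 6})` would here) would be one way to avoid losing the good hits.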
I can’t even begin to guess how the “sense” of a word is determined algorithmically by the Cognition software, but I do know that the outcome of my twiddling didn’t conform to my intuitive understanding of the words involved — which, after all, is the whole point of natural language processing. So I have to list this under “room for improvement”.
Which is all just to say that this search engine isn’t perfect yet — but please don’t let that stop you from checking it out. I like a lot of things about Cognition semantic Medline, and I’m going to be using it a lot.
What do you think? I’d love to hear about other people’s experience with the software.
(Hat tip to Code-Itch. Yes, I’ve had that post bookmarked since September.)