wall street journal & disambiguation
i am a regular reader of The Wall Street Journal. the online version is cool, and it's much easier to read and search than the print version. but i think The Wall Street Journal could add more value for the reader if it hooked up with findory to personalize the layout and content. findory is very cool.
automated personalization used to be the holy grail of search. personalization is an automated way for a system to serve up relevant information based on the user's history: the implicit decisions or gestures the user has already made. the approaches i have in mind are based on neural nets; more specifically, on unsupervised learning methods.
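to make the idea concrete, here is a minimal sketch in python. it is not how findory or any real service works, and it is not a neural net; it just counts topics in a made-up click history and orders new articles by how well they match. the article dicts, the "topics" field, and the function names are all assumptions for illustration.

```python
from collections import Counter

def build_profile(clicked_articles):
    """count how often each topic shows up in the reader's click history."""
    profile = Counter()
    for article in clicked_articles:
        profile.update(article["topics"])
    return profile

def personalize(articles, profile):
    """order articles by how strongly their topics match the profile."""
    def score(article):
        # a Counter returns 0 for topics the reader has never clicked on
        return sum(profile[topic] for topic in article["topics"])
    return sorted(articles, key=score, reverse=True)

# hypothetical data: what the reader clicked on before
history = [
    {"title": "fed holds rates", "topics": ["economy", "markets"]},
    {"title": "bond yields slip", "topics": ["markets"]},
]

# hypothetical data: today's unpersonalized front page
front_page = [
    {"title": "new stadium deal", "topics": ["sports"]},
    {"title": "stocks rally", "topics": ["markets"]},
    {"title": "jobs report", "topics": ["economy"]},
]

ranked = personalize(front_page, build_profile(history))
for article in ranked:
    print(article["title"])
```

the reader never tells the system anything directly; the ordering falls out of the clicks alone, which is the "implicit" part.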
my thesis at chicago was on this topic. specifically, i was interested in something called analogical modeling, applied to a problem in language called the word-sense disambiguation problem. essentially, words and phrases have senses. "the center for the los angeles lakers" is a phrase that can point to several different people, depending on when it is uttered (it can point to abdul-jabbar or o'neal). similarly, if i said "i fell on the river bank" and "i deposited cash at the bank", these two statements use the word "bank" in two different senses. for humans, it is easy to distinguish between the senses; but how would a machine do that? how would a search engine disambiguate in an automated way? it's a tough problem and i certainly didn't solve it; but i had fun explicating it and testing the analogical neural-net model. i developed a neural net in python. it was fun.
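here is a toy version of the "bank" example in python. to be clear, this is not the analogical model from my thesis and it is not a neural net; it is a much simpler word-overlap trick (in the spirit of the lesk algorithm), and the cue words for each sense are hand-picked assumptions. the sense whose cue words overlap most with the sentence wins.

```python
# hand-picked cue words for each sense of "bank" (assumed, for illustration only)
SENSES = {
    "river": {"river", "water", "shore", "fell", "mud", "stream"},
    "financial": {"cash", "deposited", "money", "account", "loan", "teller"},
}

def disambiguate(sentence, senses=SENSES):
    """pick the sense whose cue words overlap most with the sentence."""
    words = set(sentence.lower().split())
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    # on a tie, max() keeps the first sense listed in the dict
    return max(scores, key=scores.get)

print(disambiguate("i fell on the river bank"))
print(disambiguate("i deposited cash at the bank"))
```

it works on these two sentences only because the surrounding words happen to be in the cue lists. take the context away ("i went to the bank") and the scores tie at zero, which is exactly why the problem is hard.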
i think tagging will help with the disambiguation problem by adding context; tagging is scalable and self-organizing; some call it a "folksonomy", as opposed to a "taxonomy". i really enjoy tagging and see it as a huge breakthrough in search, language, and meaning. services like del.icio.us, tagalag, digg, and flickr add intelligent context to the web and, as a consequence, add meaning for the users.
personalization is over-rated. tagging is the best blend of the intelligence of the human mind and the convenience of technology. meaning is the key: humans are, and will always be, much better at assigning meaning than a machine.