PredictGuru

Monday, February 20, 2006

 

Automatic news ranking

Ranking is one of the most important problems in machine learning these days. Search engines, including Google, use ranking to sort web pages by their relevance to the entered query. It has been shown that using machine learning for ranking gives satisfactory results. (I have no idea whether Google also uses machine learning to get the page rank.) The idea is simple: a user manually ranks a set of documents (movies, web pages, anything). Using this as a reference, we train a function which, given a new document, identifies its correct rank within a set of documents, based on features such as the movie name, actors and director, or the contents if it is a web page.
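To make the idea concrete, here is a minimal sketch of one way to train such a function: a pairwise perceptron that learns a weight per feature so that documents the user ranked higher get higher scores. This is only an illustration under my own assumptions, not what any search engine actually does. The function names (`train_pairwise_ranker`, `score`, `rank`) and the two made-up features are hypothetical.

```python
from itertools import combinations


def train_pairwise_ranker(docs, epochs=50, lr=0.1):
    """Learn one weight per feature from manually ranked documents.

    docs is a list of (feature_vector, user_rank) pairs, where rank 1 is best.
    For every pair of documents, the better one should score higher.
    """
    dim = len(docs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for (xa, ra), (xb, rb) in combinations(docs, 2):
            if ra == rb:
                continue  # ties carry no ordering information
            better, worse = (xa, xb) if ra < rb else (xb, xa)
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:
                # The pair is ordered wrongly (or tied): nudge the weights.
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Linear score of one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rank(w, feature_vectors):
    """Sort new documents from best to worst by learned score."""
    return sorted(feature_vectors, key=lambda x: score(w, x), reverse=True)


# Hypothetical features for movies: [has_favourite_actor, has_favourite_director]
training = [
    ([1.0, 1.0], 1),
    ([1.0, 0.0], 2),
    ([0.0, 1.0], 3),
    ([0.0, 0.0], 4),
]
weights = train_pairwise_ranker(training)
ordered = rank(weights, [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
```

A linear scorer is about the simplest choice possible. It only works well when the user's preferences can be expressed as a weighted sum of the features, which is an assumption of this sketch.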
Ranking can also be used in many other areas. Here is a list:
  1. Email ranking: Let's assume you are a person with lots of contacts, or you are very famous and get hundreds of legitimate emails every day. Some mails require your immediate attention and some are not so important. Machine learning can be used here to order the mails for you based on your past behaviour, so that the important mails are at the top and the not-so-important mails are at the bottom. If you do not have a spam filter, all spam will naturally fall to the bottom with this approach. An implementation would be a plugin for Thunderbird which puts each new email in its rightful place.
  2. News ranking: Consider news sites like Slashdot and Digg. The problem with Slashdot is that an editor has to read through thousands of postings, find good stories, edit them and post them, and this takes a few hours. The problem is similar for Digg: it takes a few hours for a story to get enough diggs to push it to the front page. A story may be stale by the time it makes it to the front page. Now consider an algorithm that learns from the previous stories that have made it to the front page and automatically decides whether a new story is front-page material. If the algorithm is good, stories appear on the front page in near real time. A simple implementation would be to get the RSS feed of stories from Digg, rank each story and, if its rank is good enough, post it on your site.
  3. Blog ranking on Blogspot, photo ranking on Flickr. If you think of something else, leave a comment.
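
The news ranking idea can be sketched in a few lines. The code below is a rough illustration, assuming a naive Bayes model over the words in a story title, which is my own choice of technique and not something Digg or Slashdot uses as far as I know. The class name `FrontPageClassifier`, its methods and the sample titles are all hypothetical, and fetching the RSS feed is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class FrontPageClassifier:
    """Naive Bayes over title words, trained on past stories."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.story_counts = {True: 0, False: 0}

    def train(self, title, made_front_page):
        self.story_counts[made_front_page] += 1
        self.word_counts[made_front_page].update(tokenize(title))

    def log_odds(self, title):
        """Log odds that a story is front-page material (higher is better)."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        size = len(vocab)
        total = {c: sum(self.word_counts[c].values()) for c in (True, False)}
        n = self.story_counts[True] + self.story_counts[False]
        # Add-one smoothed class priors.
        result = math.log((self.story_counts[True] + 1) / (n + 2))
        result -= math.log((self.story_counts[False] + 1) / (n + 2))
        for word in tokenize(title):
            if word not in vocab:
                continue  # unseen words tell us nothing either way
            p_front = (self.word_counts[True][word] + 1) / (total[True] + size)
            p_other = (self.word_counts[False][word] + 1) / (total[False] + size)
            result += math.log(p_front / p_other)
        return result

    def is_front_page(self, title, threshold=0.0):
        return self.log_odds(title) > threshold


# Made-up training data: titles and whether they reached the front page.
clf = FrontPageClassifier()
clf.train("linux kernel release", True)
clf.train("new linux exploit found", True)
clf.train("my cat photos", False)
clf.train("cat video compilation", False)

post_it = clf.is_front_page("linux kernel exploit")
skip_it = clf.is_front_page("cat photos")
```

The same log odds score could be used to order emails instead of news stories, by training on which mails you answered quickly. Raising `threshold` makes the front page more selective.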
