On Jan 2, 2021, the NewsRack service was shut down permanently.

It has been a nice long run, from the Sarai days in 2004 to being hosted on its own domain around 2006. Besides maintenance, there has been no real active development on the code or the features since early 2008. Since 2015, even that maintenance has been pretty bare-bones. A lot of news sources no longer provide reliable RSS feeds, and since mid-2018 there were growing issues with the service; I only kept it alive to assist a handful of users.

So, it was time to shut this down. The internet world in 2020 is vastly different from 2003, when I first conceptualized this service. Thanks for using it all these years.

This is an archive of previously crawled content that will be kept around for a few weeks.


NewsRack FAQ

  1. How is this different from Google News and Google News Alerts?
  2. Why have a separate entity called concepts? Why not specify keywords in filtering rules directly?
  3. What is an RSS feed?
  4. How do I know if a newspaper provides an RSS feed, and how do I get the feed?

Q. How is this different from Google News and Google News Alerts?

If the news that you are trying to monitor is fairly straightforward (for example, all tsunami-related news items), then you could use Google News Alerts to receive daily alerts. But you are still left with the task of filing these away and, if necessary, organizing the news.

However, if you are interested in monitoring tsunami news and also organizing it into various categories (india, sri-lanka, tamil nadu, thailand, warning systems, etc.), it is harder to do this with Google News. You would have to either do this manually or write a script to process Google News alerts.

Or, consider the case where different newspapers refer to Nagapattinam as Nagapatnam, Nagapattinam, Nagappatinam, ... If you had to capture all the spelling variations, things would get messy quickly. However, with NewsRack, you can put all the different variations into a concept called "nagapattinam" and you are set. If you discover a new spelling variation, simply add the new keyword to the concept and NewsRack will catch those news items too.

One way to understand this is to look at NewsRack filtering rules as predefined Google News searches, but far more complex ones. In addition, NewsRack lets you classify news into different categories.

Q. Why have a separate entity called concepts? Why not specify keywords in filtering rules directly?

Concepts make sense for several reasons.

  • They capture ideas in a way that keywords do not. For example, consider the following three concepts:
    1. <oil-exploration> = oil exploration, offshore oil, oil-field, oil hunt, oil find, oil rig, oil production
    2. <narmada> = narmada, sardar sarovar, maheshwar, omkareshwar
    3. <athirapally> = athirapally, athirapilly, athirapilli, athirappilli
    The first concept captures the idea of oil exploration, not just one particular keyword. The second concept captures the names of different Narmada dams. The third concept captures different spelling variations of a name. Thus, concepts are more versatile than plain keywords.
  • They simplify filtering rules. Compare the same rule written with concepts and with the keywords spelled out:
    oil-exploration OR (oil AND drilling)
    (oil exploration OR offshore oil OR oil-field OR ...) OR ((oil OR ...) AND (drilling OR ...))
  • Multilingualism comes naturally. Consider the concept:
    <oil> = oil, तेल, ಎಣ್ಣೆ
  • Maintenance is easy. If tomorrow you want to monitor news in Telugu, all you need to do is add the Telugu keyword for oil to the definition of the oil concept, and everything works fine!
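
The points above can be made concrete with a small Python sketch. This is not NewsRack's actual implementation or rule syntax; the function names are illustrative, and the keyword list for the "drilling" concept is a made-up placeholder. It only shows how a rule written in terms of concepts stays short while the keyword lists, including multilingual ones, live in one place.

```python
import re

# Concept definitions taken from the examples in this FAQ.
# The "drilling" entry is a hypothetical placeholder for illustration.
CONCEPTS = {
    "oil-exploration": ["oil exploration", "offshore oil", "oil-field",
                        "oil hunt", "oil find", "oil rig", "oil production"],
    "athirapally": ["athirapally", "athirapilly", "athirapilli", "athirappilli"],
    "oil": ["oil", "तेल", "ಎಣ್ಣೆ"],
    "drilling": ["drilling"],
}


def matches_concept(text, concept):
    """Return True if any keyword of the concept occurs in the text."""
    for keyword in CONCEPTS[concept]:
        # The lookarounds keep "oil" from matching inside "boiling".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


def oil_rule(text):
    """The rule 'oil-exploration OR (oil AND drilling)' from the FAQ."""
    return matches_concept(text, "oil-exploration") or (
        matches_concept(text, "oil") and matches_concept(text, "drilling")
    )


print(oil_rule("New offshore oil discovery announced"))
print(oil_rule("Drilling for oil begins next month"))
print(matches_concept("कच्चे तेल की कीमत", "oil"))
```

Adding a keyword in another language, or a newly discovered spelling variation, only changes the CONCEPTS table; the rule itself is untouched.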

Q. What is an RSS feed?

The simplest way to describe this is to contrast it with the usual way of browsing the web. In that model, you open up your browser, type in a URL (or keywords in a search engine) to visit the desired website, and access content from that site. You, as a user, actively visit the website -- i.e. you pull content from the website. If you want to keep up with the latest updates on those websites, for example newspapers, journals, blogs, discussion lists, etc., you have to remember to visit every one of them (whenever they are updated, and at whatever frequency) to make sure you don't miss anything.

RSS (Really Simple Syndication or RDF Site Summary) feeds flip this model completely and enable websites to push content to you -- website updates make their way to your desktop on their own once you subscribe to RSS feeds and add them to your RSS reader/aggregator. The aggregator now periodically downloads the feeds and displays the aggregated updates as links to the updated information (with brief descriptions if the feeds make that available).

There are a number of RSS readers/aggregators available today, and in the future, it is expected that most browsers will implement support for these in one way or the other. Mozilla Firefox already implements these as live bookmarks.

For the technically minded, RSS feeds are just simple XML files. However, unlike regular XML files, the semantics of an RSS file are what make syndication possible -- it can be considered a syndication protocol.
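
As a rough illustration of "just simple XML", the Python sketch below reads a tiny RSS 2.0 document with the standard library. The feed content and the example.com URLs are made up for the example, and `read_items` is an illustrative helper name; a real aggregator would download the feed from its URL and handle many more variations.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; real feeds carry more fields per item.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Daily</title>
    <link>https://example.com/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>Offshore oil find announced</title>
      <link>https://example.com/news/1</link>
      <description>A short summary of the story.</description>
    </item>
    <item>
      <title>Tsunami warning system tested</title>
      <link>https://example.com/news/2</link>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return a list of dicts with the title, link and description of each item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # The description is optional, so fall back to an empty string.
            "description": item.findtext("description", default=""),
        })
    return items


for entry in read_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```

The fixed element names (channel, item, title, link, description) are the shared semantics that let any aggregator read any site's feed.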

Q. How do I know if a newspaper provides an RSS feed, and how do I get the feed?

If a newspaper provides an RSS feed, it is usually pretty clearly advertised on the website. Some websites have well-placed icons or links labeled RSS, for example.

Since an RSS feed is just another file (with special semantics) being served by the website, it has a well-defined URL. To add the feed to your news aggregator, all you need to do is add this URL to your aggregator software. In the context of NewsRack, you have to copy this URL when defining the news source.
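
Besides visible icons, many websites also announce their feeds in the page's HTML through a link element with rel="alternate" and type="application/rss+xml". This is a common web convention, not something specific to NewsRack, and not every site follows it. The Python sketch below shows one way to extract such feed URLs from a page; the sample page, the example.com URLs, and the names `FeedLinkFinder` and `find_feed_urls` are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Collect the URLs of RSS feeds advertised via <link rel="alternate">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. bare attributes), so normalize them.
        attributes = {name: (value or "") for name, value in attrs}
        rel_values = attributes.get("rel", "").lower().split()
        is_rss = attributes.get("type", "").lower() == "application/rss+xml"
        if "alternate" in rel_values and is_rss and attributes.get("href"):
            # Feed links are often relative, so resolve them against the page URL.
            self.feeds.append(urljoin(self.base_url, attributes["href"]))


def find_feed_urls(page_html, page_url):
    """Return the feed URLs advertised in the given HTML page."""
    finder = FeedLinkFinder(page_url)
    finder.feed(page_html)
    return finder.feeds


# A made-up page for illustration.
SAMPLE_PAGE = """<html><head>
<title>Example Daily</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Front page" href="/rss/front.xml">
</head><body>News</body></html>"""

print(find_feed_urls(SAMPLE_PAGE, "https://example.com/"))
```

If a page advertises no feed this way, the returned list is empty, and looking for an RSS icon or link on the site remains the fallback.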