For those of you who can’t get enough of the latest scientific news, here’s a shiny new toy: Eureka Science News, an aggregator that harvests and categorizes news from “all major science news sources” (their words; note that they have made the wise decision to exclude blogs and thereby avoid a massive echo-chamber effect).

I find it fascinating that the categorization is completely automated:

It computes relationships between science articles and news found on the web using a vector space model and hierarchical clustering. It then automatically determines in which category each news item belongs using a Naive Bayes classifier. Finally, it examines multiple parameters (such as timeliness, rate of appearance on the web, number of sources reporting the news, etc) for each news group. The result is an e! score which represents the relative importance of a news item.
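To make the description above a little more concrete, here is a minimal sketch of two of the ingredients it names: a vector space model (bag-of-words term counts compared by cosine similarity) and a Naive Bayes classifier with add-one smoothing. This is not Eureka's code; the function names, the toy headlines, and the whitespace tokenization are all assumptions made for illustration, and the hierarchical clustering and e! score steps are left out.

```python
import math
from collections import Counter


def tf_vector(text):
    """Represent a document as a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_naive_bayes(labeled_docs):
    """Count documents and words per category from (text, category) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log-posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training headlines, invented for this sketch.
training = [
    ("new vaccine trial shows immune response", "Health"),
    ("hospital study links diet to heart disease", "Health"),
    ("mathematicians prove prime number conjecture", "Math"),
    ("new proof solves geometry theorem", "Math"),
]
model = train_naive_bayes(training)

print(classify(model, "vaccine study immune disease"))
print(classify(model, "prime number proof"))
print(cosine(tf_vector("mars rover finds water"), tf_vector("rover finds water on mars")))
```

A real aggregator would presumably weight terms (for example with TF-IDF) and train on far more than four headlines, but the overall shape would be similar: turn each article into a vector, compare vectors to group related stories, and let the classifier assign a category.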

The classification system is still fairly coarse: you can choose to read all new articles about, say, “Health” or “Math” but not about more narrowly defined subfields, so there is no easy way to throttle back the information overload.

I suspect that further refinement will come eventually, perhaps as more data are collected about how readers use the existing features. I hope that user configuration will become possible, perhaps even the ability to adjust the classifiers or navigate the vector space directly. In the meantime, there’s a good, simple search feature, which works well for one-off uses and (one hopes) might someday be adapted to allow creation of user-defined RSS feeds or other customized features.