News on the Semantic Web

Written by Adrian Holovaty on August 26, 2002

Ftrain's Paul Ford, author of the excellent and much-talked-about How Google beat Amazon and Ebay to the Semantic Web, has a few ideas today on how news sites would fit into the Semantic Web. (The Semantic Web is a theoretical next generation of the Internet in which information and conceptual relationships between data are organized in a way computers can understand. But my definition doesn't do it justice; I encourage you to explore the topic for yourself.)

Ford's idea is for news sites to include uniform metadata (in essence, a consistent method of describing content) in each story. A crawler would constantly traverse the Web's news sites, collect this metadata, and store its findings in a central repository. (Ford's theoretical example is "Newspurl.org".) Then readers could go to Newspurl.org, choose a topic, and get a list of all the world's news organizations' coverage of that topic -- ideally, sortable by date, publication, author, etc.
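
To make the idea concrete, here is a rough sketch of what the central index might do with the metadata once it has been collected. Everything in it is my own assumption for illustration: the StoryMeta fields, the function names, and the sample records are made up, and Ford's piece doesn't prescribe any particular format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StoryMeta:
    """Hypothetical uniform metadata a news site might attach to a story."""
    url: str
    topic: str
    publication: str
    author: str
    published: date


def build_topic_index(stories):
    """Group collected story metadata by topic."""
    index = defaultdict(list)
    for story in stories:
        index[story.topic].append(story)
    return index


def stories_for(index, topic, sort_by="published"):
    """Return one topic's stories, sorted by the given metadata field.

    Dates sort newest first; publication and author sort alphabetically.
    """
    newest_first = sort_by == "published"
    return sorted(
        index.get(topic, []),
        key=lambda story: getattr(story, sort_by),
        reverse=newest_first,
    )


# Made-up sample records standing in for what a crawler would collect.
crawled = [
    StoryMeta("http://a.example.com/1", "us-china-trade",
              "Example Times", "B. Writer", date(2002, 8, 25)),
    StoryMeta("http://b.example.com/2", "us-china-trade",
              "Sample Herald", "A. Reporter", date(2002, 8, 26)),
    StoryMeta("http://a.example.com/3", "space-exploration",
              "Example Times", "C. Author", date(2002, 8, 24)),
]

index = build_topic_index(crawled)
latest_trade_news = stories_for(index, "us-china-trade")
```

The hard part in practice is not this grouping step but getting every news site to agree on the topic labels and field names in the first place.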

One of the benefits of this system would be that news stories would get permanent URLs. In one of Ford's examples, you could always go to http://newspurl.org/us-china-trade for the very latest news about trade between the U.S. and China. No need for fumbling around messy news site home pages that, as Ford points out, feature "near-nuclear war" one day and "a baby tiger born at the zoo" the next. We're talking incredible usability gains here.

I have several reactions to this piece:

First, news organizations will hesitate to do this because, in general, they don't like to share. Witness the recent deep linking hullabaloo. Witness the trend toward user registration. Witness the sporadic attempts at charging for content.

Second, why aren't news sites doing this on a site-by-site basis -- or even chain-by-chain basis -- right now? Take Ford's idea and apply it to a single news site. Make easy-to-remember, permanent links to repositories of stories on a single topic, and promote the heck out of them. Channel site traffic away from the home page and into these topic-specific index pages. If you're a chain, take it a step further and share content amongst your properties; God knows you've spent enough money on your content-management system, so use it to your advantage.
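
For a single site, the permanent-link part can be very simple. The sketch below is one possible approach, not anything Ford or a particular content-management system specifies: it turns a topic name into a stable, easy-to-remember address. The function names and the news.example.com address are hypothetical.

```python
import re


def topic_slug(name):
    """Turn a topic name into a lowercase, hyphen-separated slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


def topic_url(base, name):
    """Build a permanent topic-index URL under the given base address."""
    return f"{base.rstrip('/')}/{topic_slug(name)}"


# A hypothetical site's topic pages.
space_page = topic_url("http://news.example.com/topics", "Space exploration")
biotech_page = topic_url("http://news.example.com/topics/", "Biotech food")
```

With these definitions, topic_url("http://newspurl.org", "US-China trade") yields the same address as Ford's example, http://newspurl.org/us-china-trade. The point is that the address depends only on the topic name, so it never has to change when the home page does.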

We're already seeing some of this. Washingtonpost.com does a decent job of sorting content into microtopics -- such as Espionage, Space exploration, even Biotech food. We're seeing it on blogs, too; publishing systems like Movable Type allow bloggers to assign categories to entries. (Here's an example of that.)

But these examples are the exception. Most sites plop news stories on their home page, grouping them with all the other unrelated events that happened to occur that day. Then, when the day is done, the stories are lost in the dark depths of the Archives. Every day there's a new home page; every day users must relearn the location of the stories they're interested in.

The biggest lesson of Paul Ford's piece is to organize content in an ultra-detailed manner and store it in a consistent place. The Semantic Web won't be here for a while, but news sites can lay the foundation for it by developing intelligent ways of tying stories together -- and convenient, user-friendly ways of presenting these associations.
