Phillip Smith

Six questions about semantic data and news innovation

If there is no clear guidance on essential building blocks of the open Web, like rich semantic data, and every organization is left to draw its own conclusions, I ask myself: is this the fertile ground where innovation takes root?

I've started interviewing our news partners in an effort to sketch the outlines of the first series of news-technology innovation challenges. A number of consistent themes are emerging in these conversations: mobile & new devices, large datasets & presenting data in useful ways, HTML5 video & audio, and so on. The theme that I'm most curious about today is semantic data, and the question of which standards are going to lead to the most new and interesting innovations: microformats, or microdata, or RDFa, or...?

Specifically, I'm interested in asking:

  • What would it take to see a move toward one standard for semantic data on the open Web?
  • Would a broad adoption of one standard make new innovations more likely?
  • Is the choice of a standard simply an issue of matching project needs to the features provided by the standard, thus validating the necessity of several different options?

This curiosity stems from my excitement for the HTML5 specification and the aspirations it sets for the future of the open Web: bendable, programmable, and accessible (in other words, awesomesauce). It's also exciting because people are actually working with what is available from the HTML5 specification today -- it's not only possible, but practical. However, there is very little guidance provided (currently) on how to implement semantic data in the HTML5 universe.

From my (admittedly cursory) investigation, the situation exists because the HTML5 community hasn't agreed on the "one true way" to implement semantic data (perhaps that's not a realistic possibility). There are at least three competing semantic data standards that seem to frame the debate: microformats, microdata, and RDFa.

Last week I read on the Microformats blog that Facebook added hCalendar and hCard microformats to millions of events. This type of scenario is a good example of what I was referring to last week when I wrote about the decisions that news organizations are making today, and what impact those decisions could have on the future of the Web, e.g.:

  • On one side of the Internet seesaw (also known as a teeter-totter) are companies like Facebook, Twitter, and Google that have the massive "weight" of large user communities and immense volumes of data;
  • On the other side are news organizations, which still have, I would argue, equivalent weight in terms of their reach, attention, and the trust they've earned over time.

So what happens if one side moves to the other? Or -- if that doesn't happen -- which side will be the first to convince a majority of developers to hop on their end and change the balance?

For example, the BBC has already made significant investments in RDF and actively advocates for other organizations to embrace Linked Data. Other news organizations are, no doubt, using Microdata and hoping to leverage Google's ability to turn that data into "rich snippets" that drive traffic. More still, and this is my point, are probably sitting on the fence waiting to see what happens.

So, as I rush to make my next connection en route to Raleigh, I'll finish off this post with these questions:

  • How can the Knight-Mozilla News Technology Partnership play a role here? (And should it?)
  • What are the opportunities to work with news organizations, and the broader news innovation community, to explore the far edges of possibility for a semantically-rich Web?
  • Would broad adoption of an open standard for semantic data by large news organizations create new opportunities for innovation that have not been explored thoroughly yet?
  • Would this be the type of challenge that would pique your curiosity, and -- possibly -- entice you to get involved?

If you have thoughts on the matter, speak up or drop me a line. :)

About

Hi, I'm Phillip Smith, a veteran digital publishing consultant, online advocacy specialist, and strategic convener. If you enjoyed reading this, find me on Twitter and I'll keep you updated.
