The Digital Enterprise Research Institute (DERI), National University of Ireland, Galway, is the largest semantic web research group in the world, with 100 researchers. Its self-stated mission is to exploit semantics for people, organizations and systems to collaborate and interoperate on a global scale.
In December, John Breslin, research leader of the social software group at DERI, noted that the group's tutorial proposal on SIOC (Semantically Interlinked Online Communities) had been accepted for the 17th International World Wide Web Conference, to be held in Beijing, China, in April. SIOC provides methods for interconnecting discussion methods such as blogs, forums and mailing lists; the tutorial is entitled "Interlinking Online Communities and Enriching Social Software with the Semantic Web."
SemanticWeb.com recently caught up with Breslin to learn more about SIOC. The project consists of three parts: the SIOC ontology, an open-standard, machine-readable format for expressing the information contained both explicitly and implicitly in Internet discussion methods; SIOC metadata producers for a number of popular blogging platforms and content management systems; and storage and browsing/searching systems for leveraging SIOC data.
SemanticWeb.com: Tell us a bit about the development of SIOC.
Breslin: SIOC [pronounced 'shock'] started off as an idea in my head three or three-and-a-half years ago. Because I had some experience in online communities (boards, etc.), I saw a need for providing methods to link these sites together. When you look for information on the Web to answer a question, you may get parts of your answer from different community sites. You have to trawl across a lot of these sites before you can get a complete answer. We wanted a method to be able to express the information from these communities in a standard form and then to allow this information to be linked together by adding methods for people to say, for example, that this information was written by the same person who wrote something else, or that it is related to something else on the same topic.
It started off with the development of the SIOC core ontology, which is used to describe the domain of online communities and what they consist of -- users and posts and descriptions of other simple terms that occur in online communities. There is a lot of structure in online communities and inherent connections, in that people tag content, make replies or create trackbacks between posts. This structure that is created in online communities is often hidden in some database behind the scenes, and SIOC is used to expose that structure via semantics.
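As a rough illustration of what "exposing that structure" can look like, the sketch below turns one forum-database row into subject-predicate-object triples using terms from the SIOC core ontology (sioc:Post, sioc:has_creator, sioc:has_container, sioc:reply_of). The row layout, the example.org URLs and the helper names are assumptions made for this example; it is not the code of any actual SIOC exporter.

```python
# Namespace of the SIOC core ontology and the standard rdf:type property.
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def post_to_triples(row):
    """Turn one hypothetical forum-database row into SIOC-style triples."""
    post = row["url"]
    triples = [
        (post, RDF_TYPE, SIOC + "Post"),
        (post, SIOC + "has_creator", row["author_url"]),
        (post, SIOC + "has_container", row["forum_url"]),
    ]
    # A reply link is only emitted when the row records a parent post.
    if row.get("reply_to_url"):
        triples.append((post, SIOC + "reply_of", row["reply_to_url"]))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical row, as it might sit in a forum's database.
reply_row = {
    "url": "http://forum.example.org/thread/9#2",
    "author_url": "http://forum.example.org/user/alice",
    "forum_url": "http://forum.example.org/board/help",
    "reply_to_url": "http://forum.example.org/thread/9#1",
}

reply_triples = post_to_triples(reply_row)
print(to_ntriples(reply_triples))
```

Once the reply link is published this way, it is no longer visible only to the forum's own software; any tool that reads the triples can follow it.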
First of all we just worked on SIOC internally and got feedback. Then we decided to get more feedback from the community through a W3C member submission process. We gathered partners in this space -- a combination of academic and industry partners -- and went through a year or so of getting this submission in place, which involved a lot of revisions. The vocabulary kind of evolved by community consensus. That was published at the end of July or beginning of August, and since then it has helped the initiative, as having a member submission makes it more visible and easier for us to get feedback.
We will also be presenting a tutorial on SIOC at the WWW2008 conference in Beijing. This is the biggest web conference, so having a tutorial at that is obviously brilliant for us. Combined with the W3C submission, we know that there is significant interest in SIOC, but many people don't know what it is exactly and what it can be used for. We'll be explaining in our tutorial what SIOC is, how you can use it, and where it is being used already.
SemanticWeb.com: In what ways is SIOC being used today?
Breslin: The initial approach was to provide the SIOC ontology and modules producing SIOC data [based on this ontology] for a lot of open source applications, as a lot of community sites are built on open source tools. So we wanted to provide SIOC functionality for these tools that people could then add to their own sites.
We started to do this with a couple of modules and applications developed at DERI, and then others began to produce SIOC data creators for their own systems. It's making its way into commercial applications from OpenLink, Talis and Seesmic. For example, OpenLink DataSpaces uses SIOC as a kind of intermediary layer between users making queries and a variety of underlying community systems. So if you have a lot of community applications, their system lets you access the aggregate view of them.
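The idea of an aggregate view can be sketched in a few lines: once several sites publish SIOC data, their triples can be pooled and queried as one set. The example below is a minimal sketch, assuming two hypothetical sites on example.org and a hypothetical helper, posts_by_person; it uses the SIOC terms sioc:has_creator and sioc:account_of, and it does not describe how OpenLink DataSpaces or any other product is implemented.

```python
SIOC = "http://rdfs.org/sioc/ns#"


def posts_by_person(triples, person):
    """Find every post created through any account that belongs to `person`."""
    accounts = {
        s for s, p, o in triples
        if p == SIOC + "account_of" and o == person
    }
    return sorted(
        s for s, p, o in triples
        if p == SIOC + "has_creator" and o in accounts
    )


# Hypothetical identifier for one person who is active on two sites.
ALICE = "http://example.org/people/alice#me"

# Triples as a blog and a forum might each export them.
blog_site = [
    ("http://blog.example.org/user/alice", SIOC + "account_of", ALICE),
    ("http://blog.example.org/post/1", SIOC + "has_creator",
     "http://blog.example.org/user/alice"),
]
forum_site = [
    ("http://forum.example.org/user/al", SIOC + "account_of", ALICE),
    ("http://forum.example.org/thread/9#2", SIOC + "has_creator",
     "http://forum.example.org/user/al"),
    ("http://forum.example.org/thread/9#3", SIOC + "has_creator",
     "http://forum.example.org/user/bob"),
]

# Pooling the two exports is just list concatenation here.
merged = blog_site + forum_site
alice_posts = posts_by_person(merged, ALICE)
print(alice_posts)
```

Because both sites describe their accounts with the same vocabulary, one query covers both, even though the account names differ.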
There are probably, in terms of open source modules and commercial applications, about 40 to 50 different systems using SIOC data at the moment.