Social Web Content Filtering and Semantic Web: ICRA Position Paper

ICRA’s starting point is that the best person to describe a resource is the person who created it. The best person to decide whether a child should have access to a particular resource is the child’s parent. Ideally, no one else is involved.

Self-labelling (self-description) is scalable, open and democratic. It meets the needs of parents whilst preserving free speech rights for all users. But ICRA, the web’s leading self-labelling organisation, faces a fundamental problem that keeps recurring: how can you trust what the label says? Social networks might hold the key.

Working with IA Japan and others, ICRA has applied Semantic Web technologies to self-labelling. A method has been devised for grouping resources together that share a common description – what we call a content label [RDF-CL]. Around the time of the workshop, ICRA expects to switch over entirely from PICS to RDF as the basis for its labels.

Tools that specifically support this system already exist or are close to release, whilst Semantic Web tools in general can access and make use of the data in their own way.

Co-funded by the EU’s Safer Internet Programme [SIP], the QUATRO project aims to demonstrate different methods by which trust can be placed in labels and therefore show the full potential of self-labelling.

The methods under consideration are content-centred. They include:

  • An online database of websites that have been reviewed and their labels found to be accurate
  • Automated content analysis that can give a probabilistic assessment of the likely accuracy of a label
  • Online tools that can validate that certain structural criteria have been met (such as the W3C’s validator tools).

But these methods all rely on a small number of people and/or machines “reviewing” a lot of resources. This immediately raises two issues:

  • long-term scalability
  • whether or not trust can be placed in the system that adds trust.

Linking self-labelling to social networks has the potential to overcome both of these issues.

The Semantic Web offers the prospect of trusted networks based on shared bookmarks and annotations. A user whose client is able to detect that four of their friends have a website in their bookmarks is likely to trust the content provider’s own label. Similarly, an online database of annotations or a recommender system that takes input from many users can itself add weight to a self-declared description, even if the recommenders are anonymous.
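
The bookmark example above can be illustrated with a minimal sketch. It assumes the client has already gathered each friend's bookmarks into a simple mapping. The function names, the data layout and the threshold of four friends are illustrative assumptions only, and are not part of any ICRA tool or of the RDF content label work.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Reduce a URL to its lower-cased host name, so that bookmarks
    pointing at different pages of one site are counted together."""
    return urlsplit(url).netloc.lower()


def count_endorsing_friends(site_url: str, friend_bookmarks: dict) -> int:
    """Count the friends who have at least one bookmark on the given site.

    friend_bookmarks is a hypothetical mapping from a friend's name
    to the list of URLs that friend has bookmarked.
    """
    target = site_of(site_url)
    return sum(
        1
        for bookmarks in friend_bookmarks.values()
        if any(site_of(b) == target for b in bookmarks)
    )


def label_is_trusted(site_url: str, friend_bookmarks: dict, threshold: int = 4) -> bool:
    """Treat the site's self-declared label as trusted once enough
    friends have bookmarked the site. The threshold is an assumption."""
    return count_endorsing_friends(site_url, friend_bookmarks) >= threshold


# Hypothetical data: four of five friends have bookmarked example.org.
friend_bookmarks = {
    "alice": ["http://example.org/", "http://example.net/news"],
    "bob": ["http://example.org/games"],
    "carol": ["http://EXAMPLE.org/about"],
    "dave": ["http://example.org/", "http://example.com/"],
    "erin": ["http://example.com/"],
}

trusted = label_is_trusted("http://example.org/kids", friend_bookmarks)
untrusted = label_is_trusted("http://example.com/", friend_bookmarks)
```

A real client would more likely weight each friend by how much the user trusts them rather than apply a simple count, and would still leave the final decision to the parent.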

For this reason, ICRA is keen to see the widespread development and deployment of Semantic Web tools.

It is hoped that the workshop will provide some details of how this might actually work in practice so that ICRA can be more effective in promoting self-labelling within the Semantic Web in general and social filtering in particular.