ICRA labels are held in a special file, usually called labels.rdf. This file is effectively broken down into sections to provide filters and other clients with the information they need. The best way to explain this is by examining the (fictitious) example below.
Example labels.rdf (the XML markup is not reproduced here; the values that remain are listed by section):
1. Encoding and namespace declarations: http://www.icra.org/rdfs/vocabularyv03#
2. Label authority: (no values shown)
3. Hosts: example.org example.com
4. Rules: photography guestbook messages
5. Labels:
   Label 1. Label for all/most of website: No nudity, no sexual content, no violence, no potentially offensive language, no potentially harmful activities, no user-generated content (property values: 1 1 1 1 1 1)
   Label 2. Label for photography section: Exposed breasts, Bare buttocks, No sexual content, no violence, no potentially offensive language, no potentially harmful activities, no user-generated content, This material appears in an artistic context (property values: 1 1 1 1 1 1 1)
   Label 3. Label for guestbook and message board: No nudity, no sexual content, no violence, no potentially offensive language, no potentially harmful activities, user-generated content (moderated) (property values: 1 1 1 1 1 1)
The first section declares information about how the data is encoded. The last item (xmlns:icra="http://www.icra.org/rdfs/vocabularyv03#"), for instance, declares that ICRA labels are present. The other declarations refer to web standards and methods that may be used by any labelling scheme.
Tech note: the first two XML namespaces used are the standard declarations for RDF and RDF Schema. The “label” namespace is a schema for using RDF for content labelling. Although hosted on w3.org, this is not currently part of any W3C Recommendation.
This short section declares that the labels were created by ICRA and that further information is available at www.icra.org.
This section declares the websites for which the data is valid. In this instance, we have declared that the labels can be applied to both example.org and example.com. It also declares that the default content label for material on those hosts is “label 1” (see section 5).
Tech note: we actually specify a host rather than a domain, since this is generally what is required. Any and all subdomains of the declared host are within scope and may be matched by the rules that follow.
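The scoping rule in the tech note above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code any particular filter uses: the function name host_in_scope is hypothetical, and the hosts are the two from the example.

```python
from urllib.parse import urlsplit

# Hosts declared in the example; the tuple and function names are hypothetical.
DECLARED_HOSTS = ("example.org", "example.com")

def host_in_scope(url, declared_hosts=DECLARED_HOSTS):
    """Return True if the URL's host is a declared host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    # Require an exact match or a dot boundary, so that "notexample.org"
    # is not mistaken for a subdomain of "example.org".
    return any(host == d or host.endswith("." + d) for d in declared_hosts)
```

Under this sketch, www.example.org and images.example.com are both in scope, while example.net and notexample.org are not.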
We now declare the rules that determine where the default label should be overridden by another label. In this example, everything in the photography section of both example.com and example.org will be associated with label 2, and everything with either the word guestbook or the word messages in the URL will be associated with label 3. Otherwise, the default applies.
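As a rough sketch of how a client might resolve these rules, the Python below checks each pattern against the URL and falls back to the default label when nothing matches. The identifiers (label_1, label_2, label_3, label_for) are hypothetical, and the "first matching rule wins" order is an assumption, since the example does not say what happens when two rules match the same URL.

```python
import re

# Hypothetical identifiers standing in for labels 1, 2 and 3 of the example.
DEFAULT_LABEL = "label_1"

# (pattern, label) pairs taken from the example rules.
# Assumption: rules are tried in order and the first match wins.
RULES = [
    (re.compile(r"photography"), "label_2"),
    (re.compile(r"guestbook"), "label_3"),
    (re.compile(r"messages"), "label_3"),
]

def label_for(url):
    """Return the label that applies to a URL on one of the declared hosts."""
    for pattern, label in RULES:
        if pattern.search(url):
            return label
    # No rule matched, so the default label applies.
    return DEFAULT_LABEL
```

For instance, a URL containing /photography/ resolves to label 2, a guestbook page resolves to label 3, and an ordinary page such as /about.html keeps the default label.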
If a website doesn’t have its own domain name but is part of a package provided by an ISP (something like www.isp.com/~username), then label 1 would only be associated with the user’s own area, not the whole of the ISP’s domain. This is why the first question asked in the label generator is “please enter the address of your homepage”: the label generator works out what it needs from this to make sure the label only covers what is intended.
Tech note: matching is done using Perl 5 regular expressions, so a rule that should apply to “all URLs ending in .jpg” would appear as \.jpg$. If it is necessary to restrict the labels to a path on the given hosts, then that path is given separately in a hasURI property of the rule set.
Finally, we declare the labels themselves. In the example, label 2 declares that there are exposed breasts, bare buttocks, and that the material appears in an artistic context. Label 3 declares that there is moderated user-generated content, and label 1 states “none of the above” in all categories of the ICRA vocabulary.
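To see how the \.jpg$ pattern behaves, here is a small Python sketch. Python's re module stands in for Perl 5 here (the two agree on a simple pattern like this one), and the path_prefix argument is only a guess at how a path restriction of the hasURI kind might be applied; the function name matches_rule is hypothetical.

```python
import re
from urllib.parse import urlsplit

# The pattern from the tech note: URLs ending in ".jpg".
JPG_RULE = re.compile(r"\.jpg$")

def matches_rule(url, rule=JPG_RULE, path_prefix="/"):
    """Return True if the URL lies under path_prefix and matches the rule.

    Assumption: the path restriction is treated as a simple prefix test.
    """
    path = urlsplit(url).path
    if not path.startswith(path_prefix):
        return False
    return rule.search(url) is not None
```

With path_prefix set to "/~username", an image in that user's area on www.isp.com matches, while the same file name under another user's area does not. A URL such as cat.jpg.html never matches, because the $ anchors the pattern to the end of the URL.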