The structure of a file containing ICRA labels

ICRA labels are held in a special file, usually called labels.rdf. This file is effectively broken down into sections to provide filters and other clients with the information they need. The best way to explain this is by examining the (fictitious) example below.

Section 1


Section 2

  
    
    http://www.icra.org/rdfs/vocabularyv03#
      
   

Section 3

  
    
      
        example.org
        example.com
      
    
    

Section 4

    
      
        photography
        
      
  
      
        guestbook
        messages
        
      

    
  

Section 5

  
  Label for all/most of website
  No nudity, no sexual content, no violence, no potentially offensive language, no potentially harmful activities, no user-generated content
    1
    1
    1
    1
    1
    1
  


  
  Label for photography section
  Exposed breasts, Bare buttocks, No sexual content, no violence, no potentially offensive language, no potentially harmful activities, no user-generated content, This material appears in an artistic context
    1
    1
    1
    1
    1
    1
    1
    
  

  
  Label for guestbook and message board
  No nudity, no sexual content, no violence, no potentially offensive language, no potentially harmful activities, user-generated content (moderated)
    1
    1
    1
    1
    1
    1
  


Section 1 declares how the data is encoded. The last item (xmlns:icra="http://www.icra.org/rdfs/vocabularyv03#"), for instance, declares that ICRA labels are present. The other declarations refer to web standards and methods that may be used by any labelling scheme.
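As a rough illustration of how a client might check for that declaration, the sketch below lists the namespace declarations in an RDF/XML document and looks for the ICRA vocabulary URI quoted above. The function names and the tiny SAMPLE document are assumptions made for this example; SAMPLE is not the real content of a labels.rdf file.

```python
import io
import xml.etree.ElementTree as ET

# ICRA vocabulary namespace, as quoted in the text above.
ICRA_NS = "http://www.icra.org/rdfs/vocabularyv03#"


def declared_namespaces(xml_text):
    """Return {prefix: uri} for every xmlns declaration in the document."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events yield (prefix, uri) pairs as declarations are seen.
    events = ET.iterparse(source, events=("start-ns",))
    return {prefix: uri for _event, (prefix, uri) in events}


def has_icra_labels(xml_text):
    """True if the document declares the ICRA vocabulary namespace."""
    return ICRA_NS in declared_namespaces(xml_text).values()


# Hypothetical, minimal document used only to exercise the functions.
SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:icra="http://www.icra.org/rdfs/vocabularyv03#"></rdf:RDF>'
)

if __name__ == "__main__":
    print(has_icra_labels(SAMPLE))
```

The check looks at the namespace URI rather than the prefix, because the prefix (icra here) is only a local abbreviation.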

Tech note: the first two XML namespaces used are the standard declarations for RDF and RDF Schema. The “label” namespace is a schema for using RDF for content labelling. Although it is hosted on w3.org, it is not currently part of any W3C Recommendation.

Section 2 is short: it declares that the labels were created by ICRA and that further information is available at www.icra.org.

Section 3 declares the websites for which the data is valid. In this instance, the labels can be applied to both example.org and example.com. It also declares that the default content label for material on those hosts is label 1 (see Section 5).

Tech note: we actually specify a host rather than a domain, since this is generally what is required. Any and all subdomains of the declared host are within scope and may be matched by the rules that follow.
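The host rule in the tech note can be sketched as follows. This is one plausible reading, not ICRA's implementation: the function name host_in_scope is hypothetical, and the sketch assumes that "subdomain" means the host name ends with a dot followed by the declared host.

```python
# Hosts declared in Section 3 of the example.
DECLARED_HOSTS = ["example.org", "example.com"]


def host_in_scope(host, declared_hosts=DECLARED_HOSTS):
    """True if host is a declared host or a subdomain of one.

    Assumption: a subdomain is any name ending in "." + declared host.
    """
    host = host.lower().rstrip(".")  # normalise case and a trailing dot
    for declared in declared_hosts:
        if host == declared or host.endswith("." + declared):
            return True
    return False


if __name__ == "__main__":
    for name in ["example.org", "www.example.com", "notexample.org"]:
        print(name, host_in_scope(name))
```

Requiring the leading dot keeps a look-alike host such as notexample.org out of scope while still admitting www.example.org.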

Section 4 declares the rules that determine where the default label is overridden by another label. In this example, everything in the photography section of both example.com and example.org is associated with label 2, and everything with either the word guestbook or the word messages in the URL is associated with label 3. Otherwise, the default applies.
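The override logic described above might look like the sketch below. The names RULES, DEFAULT_LABEL and label_for are hypothetical, and two details are assumptions rather than facts from the ICRA documentation: that rules are tried in the order listed, and that the first match wins.

```python
import re

DEFAULT_LABEL = "label 1"  # default for the declared hosts (Section 3)

# (pattern, label) pairs mirroring the two rules in Section 4.
# Assumption: rules are tried in order and the first match wins.
RULES = [
    (re.compile(r"photography"), "label 2"),
    (re.compile(r"guestbook|messages"), "label 3"),
]


def label_for(url):
    """Return the label that applies to url, falling back to the default."""
    for pattern, label in RULES:
        if pattern.search(url):
            return label
    return DEFAULT_LABEL


if __name__ == "__main__":
    for url in [
        "http://example.org/photography/index.html",
        "http://example.com/guestbook.html",
        "http://example.org/about.html",
    ]:
        print(url, "->", label_for(url))
```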

If a website doesn’t have its own domain name but is part of a package provided by an ISP (something like www.isp.com/~username), then label 1 would be associated only with the user’s own area, not the whole of the ISP’s domain. This is why the first question asked in the label generator is “please enter the address of your homepage”: from this address, the label generator works out what it needs to make sure the label covers only what is intended.

Tech note: matching is done using Perl 5 regular expressions, so a rule that should apply to “all URLs ending in .jpg” would appear as \.jpg$. If it is necessary to restrict the labels to a path on the given hosts, the path is given separately in a hasURI property of the rule set.
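To see what the \.jpg$ pattern does and does not match, here is a small sketch using Python's re module. Python's regular expressions are not identical to Perl 5's, but this particular pattern behaves the same way in both; the function name ends_in_jpg is made up for the example.

```python
import re

# The pattern from the tech note: a literal dot, then "jpg", at the end.
JPG_RULE = re.compile(r"\.jpg$")


def ends_in_jpg(url):
    """True if the rule pattern matches the URL."""
    return JPG_RULE.search(url) is not None


if __name__ == "__main__":
    for url in [
        "http://example.org/photography/sunset.jpg",
        "http://example.org/photography/sunset.jpg.html",
        "http://example.org/photography/sunsetjpg",
    ]:
        print(url, ends_in_jpg(url))
```

The backslash makes the dot literal, so a URL ending in "sunsetjpg" does not match, and the $ anchor ties the match to the end of the URL, so "sunset.jpg.html" does not match either.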

Finally, Section 5 declares the labels themselves. In the example, label 2 declares that there are exposed breasts and bare buttocks, and that the material appears in an artistic context. Label 3 declares that there is moderated user-generated content, and label 1 states “none of the above” in all categories of the ICRA vocabulary.
