The Family Online Safety Institute: Protecting your online future

In proposing a new structure for content labelling based on RDF, the labelling working group has proposed that resources can be linked to their labels by means of Link Rel tags for (X)HTML documents and HTTP Response Headers for all content types. We have carried out a series of tests on Apache servers (running on Linux platforms) to ensure that the latter approach was practical.

This document details the results of those tests.

Using mod_headers

Many default installations of Apache do not ship with mod_headers[1] enabled, but I wouldn’t say it’s a big problem for the majority of admins to install or “turn on” the module to get this working.

Having spent some time experimenting with this and editing httpd.conf, I don’t think a reasonable sysadmin would have many problems working this way. The only problem I can see is for companies that run large numbers of “virtual hosts” that are all configured differently.
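As a rough sketch of what “turning on” the module involves, the lines below load mod_headers from httpd.conf. The module paths are assumptions based on a typical layout and will differ between installations. Apache 1.3 configurations that use ClearModuleList also need the AddModule line; Apache 2.0 needs only LoadModule.

```apache
# Apache 1.3 (the path, relative to ServerRoot, is an assumption)
LoadModule headers_module modules/mod_headers.so
# Only required when the configuration rebuilds the module list with ClearModuleList
AddModule mod_headers.c

# Apache 2.0 (again, the path is an assumption; AddModule does not exist in 2.0)
# LoadModule headers_module modules/mod_headers.so
```

After restarting the server, the Header directives used in the tests below should be accepted without a syntax error.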

Test platforms

Tests were carried out on two different servers here at Kingston Communications.

  • Server one is a Red Hat 7.3 server running Apache 1.3.27.
  • Server two is a Fedora Core 2 server running Apache 2.0.50.

Test results

I’ve broken the tests down into various types and placed the syntax used for each below. I haven’t included the duplicate results from the second server, as the tests worked on both boxes in an identical fashion.

Where anything worked in a way that needed further explanation I’ve added some comments in. All of the tests were run through Sven Latham’s parser[2] and then sanity tested locally to make sure what we saw matched up with how the RDF parser saw it (and thankfully we/it agreed).

Case 1: <Directory> Directive

A plain <Directory> directive setting a base label on the web server root.

This returns the “r1” rating for everything inside this “directory structure” and sets a “default” rating for the site.

<Directory /var/www/html/>
Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'
</Directory>

(The document root in this configuration is /var/www/html/)

Metadata checker results of check for http://213.249.189.192/

Date = Thu, 26 Aug 2004 15:35:34 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Tue, 10 Aug 2004 15:25:50 GMT
ETag = "d0047-ab1-4118e8fe"
Accept-Ranges = bytes
Content-Length = 2737
Connection = close
Content-Type = text/html

Found RDF: /RDF/labels.rdf#r1

Case 2: <Files> nested within <Directory> Directive

A plain <Directory> directive setting a base label on the web server root.

Inside the <Directory> directive we have a “Nested File Type” match for certain “image” types. This nested directive shows that a different rating, “r3”, is returned for any image file with a matching extension of .gif, .jpeg, .jpg, or .png, whilst everything else inside the directory keeps the “r1” rating.

It should be noted you can change the “extensions” to match on anything (so .mp3, .mpg, .wav, .php, etc, etc).

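<Directory /var/www/html/>
Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'
<Files ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</Files>
</Directory>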
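As a rough illustration of matching on other extensions, the sketch below labels audio, video and PHP files instead of images. The regular expression, and the use of the “~” form of <Files>, are assumptions made for illustration; they are not taken from the tested configuration.

```apache
<Directory /var/www/html/>
    # Base label for everything under the document root
    Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'

    # Hypothetical pattern: any file name ending in .mp3, .mpg, .wav or .php
    <Files ~ "\.(mp3|mpg|wav|php)$">
        # Drop the inherited r1 header, then attach the r3 label instead
        Header unset Link
        Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
    </Files>
</Directory>
```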

Metadata checker results for http://213.249.189.192/

Date = Thu, 26 Aug 2004 15:36:49 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Tue, 10 Aug 2004 15:25:50 GMT
ETag = "d0047-ab1-4118e8fe"
Accept-Ranges = bytes
Content-Length = 2737
Connection = close
Content-Type = text/html

Found RDF: /RDF/labels.rdf#r1

And looking for the nested File types –

Metadata checker results for http://213.249.189.192/test.jpg

Date = Thu, 26 Aug 2004 15:37:15 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Thu, 26 Aug 2004 10:32:11 GMT
ETag = "d000b-0-412dbc2b"
Accept-Ranges = bytes
Content-Length = 0
Connection = close
Content-Type = image/jpeg

Found RDF: /RDF/labels.rdf#r3

Case 3: <FilesMatch> nested within <Directory> Directive

This is the same as case 2 but uses the <FilesMatch> directive rather than the <Files> directive. (Why Apache has two directives that do the same thing I don’t know, but we’ve tested both for completeness.)

Both of these work in exactly the same way. <Location> and <LocationMatch> (which I’ll get to in a while) cannot be nested inside “Directory Directives”.

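<Directory /var/www/html/>
Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'
<FilesMatch ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</FilesMatch>
</Directory>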

Testing the other way (Just for completeness).

Metadata checker results for http://213.249.189.192/

Date = Thu, 26 Aug 2004 15:41:04 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Tue, 10 Aug 2004 15:25:50 GMT
ETag = "d0047-ab1-4118e8fe"
Accept-Ranges = bytes
Content-Length = 2737
Connection = close
Content-Type = text/html

Found RDF: /RDF/labels.rdf#r1

Metadata checker results of check for http://213.249.189.192/test.jpg

Date = Thu, 26 Aug 2004 15:39:55 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Thu, 26 Aug 2004 10:32:11 GMT
ETag = "d000b-0-412dbc2b"
Accept-Ranges = bytes
Content-Length = 0
Connection = close
Content-Type = image/jpeg

Found RDF: /RDF/labels.rdf#r3

Files/FilesMatch/Location/LocationMatch Directives

You can also set “Global” directives for “File Types” using any of the following four directives (outside of a <Directory> directive).

I’ve put down examples for all four types, but in reality you would just use one (probably <Files> or <FilesMatch>, as they can be used both inside and outside “Directory Directives”).

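<Files ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</Files>

<FilesMatch ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</FilesMatch>

<Location ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</Location>

<LocationMatch ...>
Header unset Link
Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</LocationMatch>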

I didn’t include the test results for <Files> and <FilesMatch>, as we’ve already shown above that these directives work.

Case 4: <Location>

Metadata checker results for http://213.249.189.192/test.jpg

Date = Thu, 26 Aug 2004 15:52:45 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Thu, 26 Aug 2004 10:32:11 GMT
ETag = "d000b-0-412dbc2b"
Accept-Ranges = bytes
Content-Length = 0
Connection = close
Content-Type = image/jpeg

Found RDF: /RDF/labels.rdf#r3

Case 5: <LocationMatch>

Metadata checker results for http://213.249.189.192/test.jpg

Date = Thu, 26 Aug 2004 15:44:58 GMT
Server = Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link = </RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified = Thu, 26 Aug 2004 10:32:11 GMT
ETag = "d000b-0-412dbc2b"
Accept-Ranges = bytes
Content-Length = 0
Connection = close
Content-Type = image/jpeg

Found RDF: /RDF/labels.rdf#r3

Overlapping Directives

Below is a test of what happens when several directives apply to the same file type, and of how Apache deals with this. The example has three directives for various “images”: two are separate “Global” directives (just to see what happens) and one is nested inside a <Directory> directive.

You’ll probably have noticed my use of “unset” in all of the previous examples. I deliberately didn’t “unset” the Link header anywhere in this example, so that I could see how Apache behaves and how it outputs the headers.

From this test I would recommend that “Link” is “unset” in each directive as a matter of course, so that nothing unexpected happens in how the web server behaves. This should be included in any example documentation as the “cleaner” way to do it.
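A minimal sketch of that recommendation, assuming the image match is written as a <FilesMatch> regular expression (the pattern is illustrative, not the one used in the tests). Each block clears any Link header set by an enclosing or earlier directive before adding its own, so a matching file should be sent with a single label.

```apache
<Directory /var/www/html/>
    # Start from a clean slate, then set the base label
    Header unset Link
    Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'

    # Hypothetical image pattern
    <FilesMatch "\.(gif|jpe?g|png)$">
        # Remove the r1 header added above before attaching r3
        Header unset Link
        Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
    </FilesMatch>
</Directory>
```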

Header add Link '<...>; /="/"; rel="meta" type="application/rdf+xml";'
Header add Link '<...>; /="/"; rel="meta" type="application/rdf+xml";'
Header add Link '<...>; /="/"; rel="meta" type="application/rdf+xml";'
Header add Link '<...>; /="/"; rel="meta" type="application/rdf+xml";'

This is the “Header” output from the above example. You’ll notice I used a different “#r” rating for each directive for ease of viewing. In reality probably nobody would do this, but it made it easier to show exactly what was going on and in what order.

GET /test.jpg HTTP/1.1
host: localhost

HTTP/1.1 200 OK
Date: Thu, 26 Aug 2004 12:16:18 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Link: <...>; /="/"; rel="meta" type="application/rdf+xml";
Link: <...>; /="/"; rel="meta" type="application/rdf+xml";
Link: <...>; /="/"; rel="meta" type="application/rdf+xml";
Link: <...>; /="/"; rel="meta" type="application/rdf+xml";
Last-Modified: Thu, 26 Aug 2004 10:32:11 GMT
ETag: "d000b-0-412dbc2b"
Accept-Ranges: bytes
Content-Length: 0
Connection: close
Content-Type: image/jpeg

In this example the “#r2” rating was the one actually adopted. This shows that the ratings for the file “test.jpg” cascaded through from one “Global” directive to the other and then finally dropped through to the “Nested” directive.

You can also extrapolate from this that if two “Global” directives contradict each other, the last one to be reached takes precedence (look at how “#r3” and “#r4” were read through).
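To make the ordering concrete, here is a hedged sketch of two contradictory “Global” matches that both apply to .jpg files. The patterns are hypothetical, and the sketch assumes Apache applies the blocks in the order they appear in httpd.conf, as the test above suggests. Because the second block unsets Link before adding its own header, a .jpg file should end up carrying only the “#r4” label.

```apache
# First global match: all common image types get r3
<FilesMatch "\.(gif|jpe?g|png)$">
    Header unset Link
    Header add Link '</RDF/labels.rdf#r3>; /="/"; rel="meta" type="application/rdf+xml";'
</FilesMatch>

# Second global match, reached later: JPEG files only.
# Its unset removes the r3 header added above, leaving r4.
<FilesMatch "\.jpe?g$">
    Header unset Link
    Header add Link '</RDF/labels.rdf#r4>; /="/"; rel="meta" type="application/rdf+xml";'
</FilesMatch>
```

Without the unset lines, both headers would be sent, as in the test output above.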

Virtual hosts

There is some confusion over whether headers can be set in a Virtual Host. Some say they can’t, others say they can.

To the best of our understanding this can be done; however, we weren’t able to test it as we didn’t have a spare “domain name” to test with. That said, all the reading we’ve done, plus our own knowledge, suggests that this should work fine inside a <VirtualHost> block.
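For what it is worth, the mod_headers documentation lists “virtual host” among the contexts in which the Header directive is allowed. An untested sketch of how this might look follows; the host name, document root and label are placeholders chosen for illustration.

```apache
# Hypothetical name-based virtual host; www.example.org and the path are placeholders
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.example.org
    DocumentRoot /var/www/example

    # Default label for everything served by this virtual host
    Header unset Link
    Header add Link '</RDF/labels.rdf#r1>; /="/"; rel="meta" type="application/rdf+xml";'
</VirtualHost>
```

Sites with many differing virtual hosts would need a block like this, with its own label reference, for each host.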

[1] mod_headers documentation: http://httpd.apache.org/docs/mod/mod_headers.html
[2] Sven Latham’s Metadata checker: http://www.blogwise.com/icra/check/

Richard Sandy & Ian Bissett Kingston Communications

August 2004