Friday, January 14, 2005

Proposal for an Apache index.rdf: RDF-based httpd metadata

Dear All,

I have been looking into making my HTTP archive more usable.

Frustrated by the SourceForge file management system, I have moved to simply placing files on the HTTP server of my sf site. This is really easy to do via ssh, with no password prompt thanks to shared key files.

Recently I discovered something long known to most of you: you can place directives in your .htaccess that describe your files and name a readme.

The directives that I am thinking about describing in RDF come from mod_autoindex and mod_mime.
Here is a simple article on this topic.

So I have been thinking about how to make this operation simpler and prettier:
I propose a simple way to generate the data for the .htaccess file from an RDF file and, later, an Apache module that implements RDF support directly in Apache.

The idea is to create an index.rdf file for your website that contains DC and RSS information about the files in the directory. This information would then be converted into an .htaccess file, as described by your RDF.
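To make the conversion concrete, here is a minimal sketch in Python (not the Redland Perl tooling proposed in this post). It assumes a very simple index.rdf layout: one rdf:Description per file, with the file name in rdf:about and optional dc:description and dc:format properties. The sample data, the file names, and the function name rdf_to_htaccess are all hypothetical, and the standard-library XML parser used here only handles this flat form, not general RDF/XML.

```python
import os
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical index.rdf content; the layout is an assumption of this sketch.
SAMPLE_INDEX_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="example-0.1.tar.gz">
    <dc:description>Source release 0.1</dc:description>
    <dc:format>application/x-gzip</dc:format>
  </rdf:Description>
  <rdf:Description rdf:about="notes.txt">
    <dc:description>Release "notes"</dc:description>
  </rdf:Description>
</rdf:RDF>
"""


def rdf_to_htaccess(rdf_xml):
    """Return a list of .htaccess lines derived from a flat index.rdf."""
    root = ET.fromstring(rdf_xml)
    lines = []
    for node in root.findall(f"{{{RDF}}}Description"):
        name = node.get(f"{{{RDF}}}about")
        if not name:
            continue  # skip descriptions that do not name a file
        desc = node.findtext(f"{{{DC}}}description")
        if desc:
            # AddDescription (mod_autoindex) takes a quoted string, so
            # replace embedded double quotes to keep the line valid.
            safe = desc.strip().replace('"', "'")
            lines.append(f'AddDescription "{safe}" {name}')
        fmt = node.findtext(f"{{{DC}}}format")
        ext = os.path.splitext(name)[1]
        if fmt and ext:
            # AddType (mod_mime) maps a file extension to a media type.
            lines.append(f"AddType {fmt.strip()} {ext}")
    return lines


if __name__ == "__main__":
    print("\n".join(rdf_to_htaccess(SAMPLE_INDEX_RDF)))
```

The output can be written to .htaccess by whatever step runs on the server. Note that AddType applies to every file with that extension, so a per-file dc:format only maps cleanly when all files sharing an extension share a type.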

This whole thing can be implemented as a simple Redland Perl module that runs on the web server and is triggered by make.

The next step would be to create a site RSS feed for the files themselves, updated when they are added, and furthermore to create an HTML rendering of the RDF file as a special file in the directory.
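As a rough illustration of the feed step, the following Python sketch builds a feed from a list of files, newest first. It makes several assumptions that are not part of the proposal: it emits RSS 2.0 rather than an RDF-based RSS 1.0 feed, the function name build_feed is hypothetical, and the caller supplies each file's name, description, and modification time (for example from the index.rdf data and the file system).

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_feed(base_url, files):
    """Build an RSS 2.0 document as a string.

    files is a list of (name, description, mtime) tuples, where mtime is
    seconds since the epoch. base_url is assumed to end with a slash.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Files in " + base_url
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Recently added files"

    # Newest files first, so feed readers show recent uploads at the top.
    for name, desc, mtime in sorted(files, key=lambda f: f[2], reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = name
        ET.SubElement(item, "link").text = base_url + name
        ET.SubElement(item, "description").text = desc
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = formatdate(mtime, usegmt=True)

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_feed("http://example.org/files/",
                     [("a.txt", "First file", 1000),
                      ("b.txt", "Second file", 2000)]))
```

Regenerating the whole feed on each upload keeps the step stateless, which fits a job triggered by make.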

All of that could be handled by an RDF file that is uploaded alongside the original file, describes that file on the server, and is then processed by a Perl script executed via ssh.

In the end, an Apache module can be created to manage this entire process.

The advantage would be that Apache can collect metadata directly about each file and store it in a simple, standard way. The metadata can of course be augmented with much more information about the meaning of the files themselves once an RDF representation is available.

One useful type of information would be the set of Google references to each file, which could be used to determine the effect of moving a file.

More on this later.