Welcome to Dublin Core for Drupal

Welcome to the Dublin Core for Drupal project site - the place for project news and, eventually, the test-bed for a Drupal CMS module that allows Dublin Core metadata to be created for nodes.

If you are interested in this project, please sign up and tell us a bit about yourself and your interests in the project in your profile. There is a forum for registered users where you can discuss feature requests, bugs and other related topics.

You can keep track of progress by following the RSS feed for this page.

DCD Contact Form Disabled

Due to an enormous deluge of spam through the site contact form, I have had to remove it from the site menu. Anyone wishing to contact me should mail smiffy at this domain without the dcd. Sorry to be so obscure, but I have better things to do than spend all day deleting spam from what should be a trusted source.

Getting Moving - Soon

Ill health and three concurrent client projects started in June have meant that no progress has been made on Dublin Core for Drupal. The last of the client projects should hopefully be finished in about three weeks from now, when I hope to get working on this in earnest.

Spam Problems

I am spending so much time disabling accounts on this site created by spammers that I am forced to block user registrations. If you want to participate on this site (not to spam!), please get in touch via the contact form and I will set up a login for you. Apologies for the inconvenience.

Finding a Value for DC.creator

No Real Name Field?

Somewhat to my surprise, the Drupal users table does not have a field or fields for a real name. So that we can derive a default value for dc.creator (I don't think that a Drupal user name is really suitable here), I am going to work on the basis that the Profile module (part of Drupal core) should be enabled and the following items created:

  • profile_dc_creator
  • profile_dc_rights

The latter item will allow us to set up a per-user value for DC.rights.

I think that putting a bit of SQL to create these into the install script might be a good idea, under a category called Dublin Core Metadata (what else?) [EDIT: SQL has been added as attachment to this post.]

We can then pull the necessary data out of the profile_values table, using node.uid and profile_fields.fid, having found the latter from a query like this: SELECT fid FROM profile_fields WHERE category='Dublin Core Metadata' AND title='DC.creator'; Note that we can't rely on a hard-coded fid, as other user-defined fields will mean that fid values are arbitrary.
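The lookup described above can be sketched as a single join. This is only an illustration in Python, with an in-memory SQLite database standing in for the Drupal database; the real module would be PHP running against Drupal's own tables. The table definitions are trimmed to a few columns, and the function name default_profile_value and the sample fid values are assumptions made for the example.

```python
import sqlite3

def default_profile_value(conn, uid, title, category="Dublin Core Metadata"):
    """Return the user's profile value for the given field title, or None."""
    # Find the field by category and title, never by a hard-coded fid,
    # because fid values differ from one installation to the next.
    row = conn.execute(
        "SELECT v.value FROM profile_values v "
        "JOIN profile_fields f ON f.fid = v.fid "
        "WHERE v.uid = ? AND f.category = ? AND f.title = ?",
        (uid, category, title),
    ).fetchone()
    return row[0] if row else None

# Tiny in-memory stand-in for the two profile tables (columns trimmed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile_fields (fid INTEGER PRIMARY KEY, title TEXT,
                                 name TEXT, category TEXT);
    CREATE TABLE profile_values (fid INTEGER, uid INTEGER, value TEXT);
    INSERT INTO profile_fields VALUES
        (7, 'DC.creator', 'profile_dc_creator', 'Dublin Core Metadata'),
        (8, 'DC.rights',  'profile_dc_rights',  'Dublin Core Metadata');
    INSERT INTO profile_values VALUES (7, 1, 'Matthew Smith');
""")

print(default_profile_value(conn, 1, "DC.creator"))
print(default_profile_value(conn, 1, "DC.rights"))  # no value stored for this user
```

When no value is stored for a user, the function returns None, which leaves the caller free to fall back on some other default.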

If anyone can think of any other per-user default values that they would like to see used, please leave a comment against this post.

Finding a value for DC.title

Deriving a value for DC.title would appear to be fairly straightforward: we just need to look at node.title in the database:


SELECT title FROM node WHERE nid='${NID}';

Development Journal

Having found the Drupal API documentation less than accessible/usable, I am finding it easier to reverse-engineer other modules and the core code to find what I need to know. To keep my notes together, I am using the blog module of this site to create a Development Journal. This, if anywhere, is where I will also explain the logic behind what I am actually doing - it's probably as close as I'll ever get to documenting my code.

If anyone sees that I have drawn any incorrect conclusions from my reverse-engineering, please feel free to correct me by commenting on the appropriate blog post.

Note that items from the Development Journal will also appear on the front page, but non-Development Journal items will not appear in the blog.

Finding a value for DC.identifier

Preamble

One of my tasks is to look at metadata that can be sourced from existing Drupal code or the Drupal database. This will be used to provide default values for pages which may be over-written by the user, if required. To begin at the beginning, I believe that we need to find a value for DC.identifier, as the URI is the most fundamental piece of data that we have about a page. (Without a URI, there is no way to get to the page in the first place.)

URI Schemes

To work out the URI of a page, we first need to know what URI scheme we are using. These are the things that I think we need to check:

  1. Are we using human (and search-engine) friendly URIs such as http://drupal/node/2, or do we have a crippled installation only capable of working through the query string, like this: http://drupal/?q=node/2 ?
  2. Are we using the default http://drupal/node/[node number] scheme, or have we enabled URI aliases? We can determine this by doing: SELECT status FROM system WHERE filename='modules/path/path.module'; A value of 1 appears to mean that URI aliases are enabled.

The url_alias Table

If an alias has been set for a URI, it will appear in the url_alias table. The Drupal default fragment of the URI, such as node/1, appears in the src column and the alias in the dst column. So, if we call the Drupal default fragment $DURI, we could say (in pseudo-code):


$DURI_LOOKUP=("SELECT dst FROM url_alias WHERE src='${DURI}';");
if (${DURI_LOOKUP})
    $DC.identifier=${DURI_LOOKUP};
else
    $DC.identifier=${DURI};
endif

Having looked at the code a little closer, it appears that it is possible to add more than one URI alias (yuck!) per node. The above probably needs to be changed so that if ("SELECT count(*) from url_alias where src='${DURI}';") > 1, we just ignore the aliases and go along with ${DURI}. The user can then change the value to whichever alias they want, by hand.
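A minimal sketch of the revised rule (use the alias only when exactly one exists, otherwise fall back to the default fragment) might look like the following. It is written in Python against an in-memory SQLite table purely for illustration; the function name identifier_path and the sample aliases are hypothetical, and the url_alias table is trimmed to the columns discussed here plus a key.

```python
import sqlite3

def identifier_path(conn, duri):
    """Return the single alias for duri, or duri itself if there are none or several."""
    rows = conn.execute(
        "SELECT dst FROM url_alias WHERE src = ?", (duri,)
    ).fetchall()
    if len(rows) == 1:
        return rows[0][0]
    # Zero aliases, or more than one: keep the Drupal default fragment
    # and let the user pick an alias by hand if they want one.
    return duri

# In-memory stand-in for url_alias with made-up aliases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT);
    INSERT INTO url_alias (src, dst) VALUES ('node/1', 'about');
    INSERT INTO url_alias (src, dst) VALUES ('node/2', 'news');
    INSERT INTO url_alias (src, dst) VALUES ('node/2', 'latest');
""")

for path in ("node/1", "node/2", "node/3"):
    print(path, "->", identifier_path(conn, path))
```

Fetching the rows once and counting them in code avoids running a separate count(*) query before the lookup.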

The Rest of the URI

What we put into $DC.identifier, above, is actually missing the domain and possibly part of the path of the URI. This would appear to be available from the global variable $base_url, which is defined for us in our site's settings.php. This variable should lack a trailing slash, so we need to use $base_url, a slash and then the value we put into DC.identifier above, to come up with what we need.
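Putting the pieces together, the full URI could be assembled roughly as follows. This is a sketch under assumptions: the function name full_identifier and the clean_urls flag are invented for the example, and in a real module the base URL would come from Drupal's $base_url rather than being passed in as a string.

```python
def full_identifier(base_url, path, clean_urls=True):
    """Join the site base URL and a path fragment into a full URI."""
    # $base_url should have no trailing slash, but strip one defensively.
    base = base_url.rstrip("/")
    fragment = path.lstrip("/")
    if clean_urls:
        return f"{base}/{fragment}"
    # Fallback scheme that works through the query string.
    return f"{base}/?q={fragment}"

print(full_identifier("http://drupal", "node/2"))
print(full_identifier("http://drupal/", "node/2", clean_urls=False))
```

The two calls reproduce the two URI schemes mentioned under URI Schemes: http://drupal/node/2 and http://drupal/?q=node/2.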

Query Interface

Now that I am nearly back on track and looking at the SQL side of this project, I should mention that I will be implementing a query interface to the metadata. I have not received a sufficiently enthusiastic response to the Z39.50 poll, so this will be more in the nature of a simple Web Service. If anyone has a preferred method (like SOAP), leave a note in the forum. I will check there before I go and do my own thing.

There will be two types of queries that can be made:

  1. A query that takes no arguments and produces a list of all the predicates/terms defined in the database. Note that these will not be described - you will need to refer to appropriate documentation (like the Dublin Core site) for explanations.
  2. A query to retrieve a list of URIs from the database that match a given query string. This string will be URI encoded in a GET request; the string will then be run as a query on the fulltext index of the objects in the metadata table, with or without the constraint of the predicates/terms. In other words, I can search on dc.creator='Matthew Smith' or just 'Matthew Smith'. The former will only pull out records where I am attributed as the creator, the latter any record that mentions my name at all. This may even pull in partial matches as the unconstrained search is running on the fulltext index of the n-tuples table.

The results of queries will probably be returned as a simple XML file.
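To make the second query type more concrete, here is a rough sketch of how a constrained and an unconstrained search might behave. Everything in it is an assumption made for illustration: the in-memory TRIPLES list stands in for the metadata table, a plain substring match stands in for the fulltext index, and the function names and the XML element names (results, uri) are invented, since the actual format has not been decided.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import unquote_plus

# Hypothetical (uri, predicate, object) rows standing in for the metadata table.
TRIPLES = [
    ("http://drupal/node/1", "dc.creator", "Matthew Smith"),
    ("http://drupal/node/2", "dc.creator", "Someone Else"),
    ("http://drupal/node/2", "dc.description", "Notes from Matthew Smith"),
]

def parse_query(encoded):
    """Split a URI-encoded query into (predicate or None, search text)."""
    text = unquote_plus(encoded)
    match = re.fullmatch(r"\s*([\w.]+)\s*=\s*'(.*)'\s*", text)
    if match:
        return match.group(1).lower(), match.group(2)
    return None, text.strip().strip("'")

def search(encoded, triples=TRIPLES):
    """Return matching URIs, constrained to a predicate only if one was given."""
    predicate, text = parse_query(encoded)
    needle = text.lower()
    uris = []
    for uri, pred, obj in triples:
        if predicate is not None and pred != predicate:
            continue
        # Substring match as a crude stand-in for a fulltext index lookup.
        if needle in obj.lower() and uri not in uris:
            uris.append(uri)
    return uris

def to_xml(uris):
    """Serialise a list of URIs as a simple XML document."""
    root = ET.Element("results")
    for uri in uris:
        ET.SubElement(root, "uri").text = uri
    return ET.tostring(root, encoding="unicode")

print(to_xml(search("dc.creator%3D%27Matthew+Smith%27")))
print(to_xml(search("Matthew%20Smith")))
```

The constrained query returns only the node where the name appears as dc.creator, while the unconstrained one also picks up the node that merely mentions the name in another term.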

From testing that I have already done using my database schema for this project, this should make for a fast site search engine that operates totally independently of Drupal's human interface.

If this doesn't make sense but you are still interested, please leave a note in the forum and I will endeavour to elaborate.

Update - May 1

Apologies for the long (several weeks) silence. I have been having some health problems, so my projects have not progressed in this time. I have to go for a scan this week, and hope to be back on track with the project next week.

Thanks to all those who have expressed interest in the project and have contacted me. If anyone has suggestions or feature requests (like extensions to the vocabulary), please make a post on the forum, as this allows me to keep track of things.
