While designing a new CMS implementation, we wanted to really get it right. That meant outputting strict XHTML, using CSS for formatting, trying to adhere to accessibility guidelines, et cetera. Of course, the issue of metadata came up. What metadata would we render to the web pages? And in what format? That is what got me looking at Dublin Core.
For the uninitiated, Dublin Core (maintained by the DCMI, dublincore.org) is a set of elements that can be used to describe data (it is, therefore, metadata). It can be expressed in different formats; on the web, usually as META elements. Dublin Core metatags are easy to spot if you browse through the source of an HTML document: their names all start with “dc.”, e.g., “dc.creator”.
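To make that concrete, here is a minimal sketch in Python that picks the DC tags out of a page. The HTML snippet, its values, and the class name DCMetaParser are made up for illustration; only the html.parser module from the standard library is real.

```python
from html.parser import HTMLParser

# Hypothetical page head; the title, name and date are invented.
SAMPLE_HTML = """
<html>
<head>
  <title>Example page</title>
  <meta name="description" content="A plain, non-DC description." />
  <meta name="dc.title" content="Example page" />
  <meta name="dc.creator" content="Jane Example" />
  <meta name="dc.date" content="2006-01-15" />
</head>
<body></body>
</html>
"""


class DCMetaParser(HTMLParser):
    """Collects <meta> elements whose name starts with "dc."."""

    def __init__(self):
        super().__init__()
        self.dc_tags = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also end up here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name.startswith("dc."):
            self.dc_tags[name] = attributes.get("content")


parser = DCMetaParser()
parser.feed(SAMPLE_HTML)
print(parser.dc_tags)
```

The ordinary “description” tag is skipped; only the three “dc.” entries are collected.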
Now why would one want to use DC tags? According to the DCMI, “Dublin Core metadata is used to supplement existing methods for searching and indexing Web-based metadata”. Which doesn’t really say much but sounds vaguely promising. So how exactly would we benefit from using DC? Well, even the DCMI’s website only lists who could benefit, without exactly pinpointing how. There isn’t a whole lot of information about the ways DC could actually help us search and index web-based metadata.
Would DC help inform search engines about the contents of a webpage on our sites? Well, no. Actually, I haven’t found a public search engine that even reads DC; “spamming the index” has led most of them to ignore HTML metadata. Google, for example, apparently reads little beyond the “title” and the “description” of the non-DC variety. Of course, you could use the DC metadata in your own site search engine. But then again, why would you? It would probably take a lot more configuration than just using the metadata from your repository, be it SQL or XML.
So maybe it’s great for exchanging data between repositories? If you have lots of different collections (web documents, book abstracts, multimedia files, et cetera) in different systems, or different organisations, and have to somehow combine them, it would be great to have a “standard” set of metadata to read. Unfortunately, DC is as vague as it is restrictive. It is basically a set of only 15 different elements, yet the “dc.date” element, for example, contains “a date of an event in the lifecycle of the resource”. This leaves so much room for interpretation in different organisations or systems that you’ll end up having to set standards within DC for them. Which, to me, defeats the purpose of using DC (why not set your own standard straight away then, without the restrictions, and without the added muddle of a “dc.” prefix?).
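The “dc.date” problem is easy to sketch. Below, two hypothetical systems both expose a perfectly valid “dc.date”, but one means the creation date and the other the date of last modification. The source names, the records, and the DATE_MEANING table are all assumptions for illustration.

```python
from datetime import date

# Two hypothetical sources; both records use dc.date legitimately.
records = [
    {"source": "cms", "dc.title": "Annual report", "dc.date": "2005-03-01"},
    {"source": "library", "dc.title": "Annual report", "dc.date": "2006-01-15"},
]

# The extra, local agreement: what each source means by dc.date.
# Nothing in the dc.date value itself carries this information.
DATE_MEANING = {
    "cms": "created",
    "library": "modified",
}


def normalise(record):
    """Return a copy with dc.date replaced by an unambiguous local field."""
    meaning = DATE_MEANING[record["source"]]
    result = {key: value for key, value in record.items() if key != "dc.date"}
    result[meaning] = date.fromisoformat(record["dc.date"])
    return result


merged = [normalise(record) for record in records]
for row in merged:
    print(row)
```

Once you need a table like DATE_MEANING, you have in effect written your own standard on top of DC.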
Of course, I’m not trying to write off Dublin Core metadata in a short blog post. My question is sincere: what’s the use of Dublin Core? If anyone has answers, I’d love to hear them!
This post was originally published on the blog of Content Management Professionals Benelux. I have added the comments received on the original post as a single comment below.